Google: South Africa Market Share

July 31, 2008

MoneyWeb reported on July 31, 2008 about “Google’s Search Dominance.” You can read Rudolph Muller’s article here. The points about Google that I found interesting were:

  • Google’s South African office is headed up by a former Novell wizard, Stafford Masie
  • Google traffic dwarfs that of Ananzi and Aardvark. “Ananzi currently attracts 221,436 unique monthly visitors, down from 314,132”, reports Mr. Muller. Aardvark “received 88 774 unique monthly visitors, down from 106 102 during the same period in 2007.”
  • “Mobile remains the leading telecommunications medium in the country,” Mr. Muller reports. Google offers universal search for mobile in South Africa.
  • YouTube.com is popular in South Africa.

Africa is quickly becoming the next “big thing”. Google appears to be poised for growth.

Stephen Arnold, July 31, 2008

Stanford TAP: Google Cool that Trails Cuil

July 31, 2008

In the period from 2000 to 2002, Dr. Ramanathan Guha, with the help of various colleagues and students at Stanford, built a demonstration project called TAP. You can download a PowerPoint presentation here. I verified this link on July 30, 2008. Frankly I was surprised that this useful document was still available.

TAP was a multi-organization research effort. Participants included IBM, Stanford, and Carnegie Mellon University.

Why am I writing about information that is at least six years old? The ideas set forth in the Power Point were not feasible when Dr. Guha formulated them. Today, the computational power of multi core processors coupled with attractive price-performance ratios for storage makes the demos from 2002 possible in 2008.

TAP was a project set up to unify islands of XML from disparate Web services. TAP also brushed against automatic augmentation of human-generated Web content. Working with Dr. Guha was Rob McCool, one of the developers of the common gateway interface. Mr. McCool worked at Yahoo, and he may still be at that company. Were he to leave Yahoo, he may want to join some of his former colleagues at Google or a similar company.

Now back to 2002.

One of TAP’s ambitious goals was to “make the Web a giant distributed database.” The reason for this effort was to bring “the Internet to programs”. The Web, however, is messy. One problem is that “different sites have different names for the same thing.” TAP wanted to develop a system and method for “descriptions, not editors, to choreograph the integration.”

The payoff for this effort, according to Dr. Guha and Mr. McCool, is that “good infrastructures have waves of applications.” I think this is a very important point for two reasons:

  1. The infrastructure makes the semantic functions possible and then the infrastructure supports “waves of applications”.
  2. The outputs of the system described are new combinations of information, different ways to slice data, and new types of queries, particularly those related to time.

Here’s a screen shot of TAP augmenting a query run on Google.

augmented search results

The augmented results appear to the left of the results list. These are sometimes described as “facets” or “assisted navigation hot links”. I find this type of enhancement quite useful. I can and do scan result lists. I find overviews of the retrieved information and other information in the system helpful. When well executed, these augmentations are significant time savers.

Keep in mind that when this TAP work up was done, Dr. Guha did not work at Google. Mr. McCool was employed at Stanford. Yet the demo platform was Google. I also find it interesting that the presentation emphasizes this point: “We need [an] infrastructure layer for semantics.”

Let me conclude with three questions:

  1. Google was not directly mentioned as participating in this project, yet the augmented results were implemented using Google’s plumbing. Why is this?
  2. The notion of fueling waves of applications seems somewhat descriptive of Google’s current approach to enhancing its system. Are semantic functions one enabler of Google’s newer applications?
  3. When will Google implement these enhanced features of its interface? As recently as yesterday, the Cuil.com interface was described as more up to date than Google. Google had functionality in 2002 or shortly thereafter that moves beyond what Cuil.com showed today.

Let me close with a final question. What’s Google waiting for?

Stephen Arnold, July 31, 2008

Cluuz.com: Useful Interface Enhancements

July 31, 2008

Cluuz.com is one of the search companies tapping Yahoo’s search index. Cluuz.com has introduced some useful interface changes. I will be digging into this system in future write ups, but I want to call your attention to one of the innovations I found useful. (My first Cluuz.com write up is here.)

Navigate to Cluuz.com here. Enter your query. You will see a result screen that looks like my query for “fractal frameworks”.

fractalframeworkquery

The three major changes shown in this screenshot are:

  1. Entities appear in the tinted area above the graphic. My test queries suggested to me that Cluuz.com was identifying the most important entities in the result set.
  2. A top ranked link with selected images. Each image is a hot link. I could tell quickly that the top ranked document included the type of technical diagram that I typically want to review.
  3. A selected list of other entities and concepts.


Useful SharePoint Info, Useless Presentation

July 30, 2008

A happy quack to J. Peter Bruzzese for his “Desperately Seeking Enterprise Search” which appeared in the July 30, 2008, InfoWorld Web log. You can read the story here. For me the most useful part of the write up was this passage:

Although the MOSS and Search offerings are still available and current, Microsoft has moved on with offers like Search Server 2008 Express and Search Server 2008. From a feature comparison perspective, MOSS 2007 still wins out despite the lack of streamlined installation; it more than makes up for that with such features as People and Expertise Searching, Business Data Catalog, and SharePoint Productivity Infrastructure.

One useful part of the write up is the inclusion of links about SharePoint in its various incarnations. These comparisons and descriptions can be tough to find on the InfoWorld Web site. I recommend that you snag these links and tuck them away for future reference.

Now, to the presentation. Mr. Bruzzese just writes the articles; some other group sets up the InfoWorld Web log. Here’s what you will encounter if you try to print the page: partial printing and two blank pages. Pretty annoying.

There are some workarounds that involve browser extensions, but here’s a workaround that doesn’t require installing any:

  1. View the source
  2. Scroll to the beginning text for the story; that is, “There’s a new search player…”
  3. Copy the text of the story plus any tags to this point in the story: “I’d like to know your opinion.”
  4. Paste the text into an HTML editor or even a blank Word document
  5. Save the file.
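
If you would rather not do the copy and paste by hand, the steps above can be sketched in a few lines of Python. This is a minimal illustration, not anything InfoWorld or Mr. Bruzzese provides: the function names `extract_story` and `wrap_as_html` and the sample page source are my own assumptions, and the two marker strings are simply the opening and closing phrases quoted in steps 2 and 3.

```python
def extract_story(page_source: str, start_marker: str, end_marker: str) -> str:
    """Return the slice of page_source from start_marker through end_marker."""
    start = page_source.find(start_marker)
    if start == -1:
        raise ValueError("start marker not found in page source")
    # Search for the end marker only after the start marker.
    end = page_source.find(end_marker, start)
    if end == -1:
        raise ValueError("end marker not found after start marker")
    return page_source[start:end + len(end_marker)]


def wrap_as_html(fragment: str, title: str = "Saved story") -> str:
    """Wrap the copied fragment in a bare HTML page so it opens and prints cleanly."""
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{title}</title></head>\n'
        f"<body>\n{fragment}\n</body></html>\n"
    )


# Hypothetical page source standing in for what "View the source" shows.
sample_source = (
    "<html><body><div class='ad'>Buy now!</div>"
    "<p>There’s a new search player in town.</p>"
    "<p>I’d like to know your opinion.</p>"
    "<div class='ad'>More ads</div></body></html>"
)

story = extract_story(
    sample_source,
    "There’s a new search player",
    "I’d like to know your opinion.",
)
page = wrap_as_html(story)
# To save the result (step 5), write `page` to a file such as story.html.
```

Any tags that sit between the two markers come along for the ride, just as they do with the manual method, so the embedded ad calls inside the story still need to be edited out afterward.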

InfoWorld is so eager to sell that it uses a pop up before you see this story. This is called a “prestitial”, which I dismiss instantly. Then it dumps into the page with the useful information lots and lots of ad baloney, which I also ignore.

You can go back and edit out the embedded calls within Mr. Bruzzese’s quite useful write up. So only Mr. Bruzzese gets the happy quack. The Beyond Search addled goose is winging toward InfoWorld’s Web wizard’s automobile to deposit an avian memento on the vehicle’s waxed fender.

Too bad a good story was made hugely annoying to me by a presentation that is more confused than this addled goose.

Stephen Arnold, July 30, 2008

Cognition Rolls Out Semantic Medline

July 30, 2008

Resource Shelf reports that Cognition Technologies has indexed Medline content with its semantic search system. The new service is free, and you can try it yourself at http://www.semanticmedline.com/. Remember that you will be searching abstracts, not the full text of medical documents.

You can read the Resource Shelf story here. The point that jumped out at me was:

[This is] a new free service that enables complex health and life science material to be rapidly and efficiently discovered with greater precision and completeness using natural language processing (NLP) technology.

Cognition Technologies, like Hakia, develops semantic search and content processing systems. You can find out more about the company here. The company also offers a demonstration of its content processing applied to the Wikipedia. You can access that service here.

Stephen Arnold, July 30, 2008

Intel Chases the Cloud a Second Time

July 30, 2008

I wrote about Convera’s present business in vertical search here because I heard that Intel was going to chase clouds again. But before we look at the new deal with Hewlett Packard (the ink company), Yahoo (goodness knows what its business is now), and Intel, let’s go back in time.

Remember in late 2000 when Intel signed a deal with Excalibur? Probably not. Convera was the result of a fusion of Intel’s multimedia unit and Excalibur Technologies. When this deal took form, Intel had 10 data centers.

An Intel executive at the time was quoted in Tabor Communications DSstar saying:

We are creating a global network of Internet data centers with the goal of becoming a leader in world-class Internet application hosting and e-Commerce services, said Mike Aymar, president, Intel Online Services. The opening of a major Internet data center in Virginia is a key step toward this goal. We’ll bring our reliable and innovative approach to hosting customers running mission-critical Internet applications, both in the U.S. and around the world.

Part of the deal included the National Basketball Association. Intel and Convera would stream NBA games. These deals were complex and anticipated the online video boom that is now taking place. The problem was that Intel jumped into this game with Convera technology that was, shall we say, immature. In less than a year, the deal blew up. The NBA terminated its relationship with Convera. By the time the dust and lawsuits settled, the total price tag of this initiative was in the hundreds of millions of dollars.

Outside of a handful of Wall Street analysts and data center experts, few people know that Intel anticipated the cloud, made a play, muffed the bunny, and faded quietly into the background until today.

Intel is back again and demonstrating that it still doesn’t have a knack for picking the right partners. The big news is that Intel, HP, and Yahoo are going to tackle cloud computing. The approach is to allow academic researchers to collaborate with industry on projects. The companies will create an experimental network. In short, risk is reduced and the costs spread across the partners. You can read Thomson Reuters’ summary here.

Will the chip giant’s Cloud Two initiative work?

Sure, anything free will garner attention among academics and corporate researchers. Will the test spin money for the ink vendor and the confused online portal? Probably not.

keystone kops

Rounding up more cloud computing suspects.

But there’s another angle I want to discuss briefly.

Intel pumped money into Endeca, a well-regarded search and content processing company. You can refresh your memory about that $10 million investment here.

Is there a connection between this investment in Endeca and today’s cloud computing announcement from Intel? I believe there is. Intel is making chips with CPU cycles to spare. Few applications saturate the processors. With even more cores on a single die coming, software and applications are lagging far behind the chips’ capabilities.


Funnelback CTO Interview Now Available

July 29, 2008

Dr. David Hawking, the chief technical officer of Funnelback, has joined the search and content processing company full time. Dr. Hawking is well known among the information retrieval community. His students have joined Google and Microsoft Research. Dr. Hawking’s interview with ArnoldIT.com is now available as part of the Search Wizards Speak series at www.arnoldit.com/search-wizards-speak.

Dr. Hawking said that Funnelback, now in version 8, delivers search ranking quality and tunability, geospatial query processing, folksonomy tagging of search results, streamlined set up and configuration, customizable work flows, and a software as a service option. In short, Funnelback is a capable enterprise search solution.

Located in Canberra, Australia, the Funnelback system has a number of high profile clients in Australia and New Zealand. The company also has clients in the United Kingdom and Canada.

Dr. Hawking said,

Funnelback includes an intuitive Web based administration interface for configuration, user interface customization and viewing query reports. No programming skills are required for the majority of configuration tasks, but deeper integrations can be achieved by developing specific interfaces to work with various enterprise applications such as content management systems or portal applications.

The next release of Funnelback will appear in the first half of 2009. The company has plans to expand into other countries, but Dr. Hawking would not reveal specific plans for new offices. He hinted that Funnelback is working on solutions for vertical markets. The company already has a vertical implementation for one of Australia’s law enforcement agencies. That project has been well received by the users.

You can read the full text of the interview here. Information about the company is here.

Stephen Arnold, July 29, 2008

Google’s Publishing Baby Step

July 29, 2008

I have written about Knol, Google’s publishing technology, in Google Version 2.0. Outsell (a consulting firm) recycled some of my Google publishing research in the summer of 2007. I will have an update available from my UK publisher, Infonortics, Ltd., in Tetbury, Glou., in September 2008. If you want to read my take on Google’s publishing technology, you can snag a copy of Google Version 2.0 here. In my analysis, Knol is a publishing baby step, but it is an important one because it delivers two payoffs: [a] content to monetize and [b] inputs for Google’s smart software. I explain why Google wants to process quality content, not just Webby dogs and cats, in Google Version 2.0.

You may also want to read Andrew Lih’s “Google Know Wikipedia Comparison Faulty” analysis here. Mr. Lih does a good job of pointing out what Knol is and is not. Particularly useful to those confused about the competition Google faces, Mr. Lih’s identification of Google’s “real competition” is solid. The part of his essay I enjoyed was his “grading” of those who were covering the Knol story. He identifies who did poorly, those who were stuck in the mire of the bell curve, and the informed souls who received a gold star for excellence. I won’t spoil your fun, but you will find at the back of the class some names with which you will be familiar.

A happy quack to Mr. Lih.

Stephen Arnold, July 29, 2008

Financial Close Dance: Connotate and High Step Rumba

July 29, 2008

BobsGuide.com revealed on July 23, 2008, that High Step Capital (yep, a money outfit) is using Connotate’s agent technology in a clever new way. In the financial world, clever means finding a way to make money in today’s unsettled market.

BobsGuide.com reports:

By creating a group of agents that monitor real-time changes to information on multiple Web sites-for example, for monitoring the prices of electronic products from competing companies-and aggregating the results, a user can create a real-time feed of prices or other information… That feed is loaded into a database hosted by Connotate, which provides a Web portal for Jones to view the data online or to download the information into spreadsheets…
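
To make the agent-and-feed idea in that passage concrete, here is a rough sketch in Python. It is not Connotate’s technology or API, which I have not seen in code form. The names `PriceObservation` and `build_feed` are hypothetical, and an “agent” is assumed to be nothing more than a function that returns the current prices it sees on one Web site.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class PriceObservation:
    source: str    # which Web site the agent watched
    product: str   # what was being priced
    price: float   # the price the agent saw


# Assumption: an agent is a callable that reports what one site shows right now.
Agent = Callable[[], Iterable[PriceObservation]]


def build_feed(
    agents: Iterable[Agent],
    last_seen: Dict[Tuple[str, str], float],
) -> List[PriceObservation]:
    """Poll every agent once and return only the observations whose price changed.

    last_seen remembers the most recent price per (source, product) pair and is
    updated in place, so repeated calls produce a change feed.
    """
    feed: List[PriceObservation] = []
    for agent in agents:
        for obs in agent():
            key = (obs.source, obs.product)
            if last_seen.get(key) != obs.price:
                last_seen[key] = obs.price
                feed.append(obs)
    return feed
```

Each call to `build_feed` stands in for one monitoring pass. The returned list is what would be loaded into a database or downloaded into a spreadsheet.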

You can read the full story “High Step Adds Connotate Data to Models” here. For more information about Connotate, you can visit the company’s Web site here, or you can buy a copy of my April 2008 study Beyond Search here. Connotate competes with Relegence, a unit of America Online, which is owned by Time Warner. You can read about Relegence here.

Why is this important?

Services that merge internal and external data are one of the Web 2.0 technologies that work and deliver fungible payoffs. Some Web 2.0 functions are nifty but tough to tie to a financial benefit.

Stephen Arnold, July 29, 2008

Opinion: Cuil, Google, and Microsoft

July 28, 2008

Before I go out and feed the geese on my pond in Harrods Creek, I wanted to offer several unsolicited comments about Microsoft, Cuil, and search.

First, now that Microsoft has its own search technologies, Fast Search & Transfer’s search technologies for the enterprise and the Web, and Powerset’s search technologies, does Cuil look cool?

This is a tough question, and I don’t think that Microsoft had much knowledge of the Cuil team and its work in search. My research suggests that work on Cuil began for real in 2007. The work profiles of the Cuil team are decidedly non-Microsoft. My thought is that Microsoft did not have a competitive profile about this company. My working hypothesis is that this search system struck Microsoft like a bolt from the blue.

Second, will Microsoft buy Cuil? This is a question that will probably garner some discussion at Microsoft. The Linux “heads” at Microsoft will probably resonate with the idea. Cuil incorporates some of the “beyond” Google technology that one can find at Exalead and now at Cuil. The architecture of these “beyond” Google operations might be quite useful to Microsoft. On the other hand, Microsoft is charging forward with its own approach to massively parallel distributed systems, so the “beyond” Google engineering would be a tough pill to swallow.

Third, will Cuil get traction? The answer is yes. My hypothesis is that the folks who flock to Cuil will be Google users, but the real impact of Cuil may well be taking orphaned or disaffected users from Ask.com, Live.com, and Yahoo.com search.

The short term impact on Google may be significant for several reasons:

  1. Cuil has poked a finger in Google’s eye with its user tracking policy. Simply stated, Cuil won’t build user and usage profiles that tie to an individual in a stateful session or to an individual assigned to a fine grained group of clusters in a stateless session. See my July/August KMWorld feature for more about the data model of this type of tracking.
  2. Cuil hit Google with its larger index of 120 billion Web pages processed to Google’s 30 to 40 billion pages. Keep in mind that size doesn’t matter, but it is a public relations hook that could snare Googzilla around the ankles.
  3. Cuil includes bells and whistles that have not been released on the public Google system. For example, there are snazzier results displays, insets for suggested searches, and tabs to allow slicing results. Google has these features, but the GOOG keeps them under wraps. Right now, Cuil looks cooler (pun intended). The Cuil search page is black, which even says “green”. Clever.

Google now has to sit quietly and watch Xooglers implement features that Google has had in the can for years. Interesting day for both Microsoft (Should we buy Cuil too?) and Google (What’s the next step for the Xooglers’ service?).

Stephen Arnold, July 28, 2008
