Cogenta: Metasearch

Technology from Harrod's Creek
December 2002

The Dot commotion forced many startups to avoid trade shows. The behaviour was like that of geese in Kensington Park. Let a tiny, Crufts-winning Westie yap twice, and the geese flap wildly and waddle clumsily.

The absence of small startups at most information trade shows is unfortunate. With trepidation, I entered the Olympia Grand Hall in London in December 2002. The International Online Show, now in its third decade, traditionally offers the standard line up of marquee names in flat-sized porta-booths.

Such former show champions as Thomson, Reed Elsevier, and Factiva paraded, preened, and panted for attendees.

To the uninitiated, their products and services glistened at the show. To a frontiersman from Harrod's Creek, no ferocious surprises lurked in the drapery of the porta-booths. The products and services on offer emitted a trace of a quick spruce up with finishing gel.

This year's technology buzz was about metasearch. Verity, Plumtree, and others showcased enterprise products that take a single query and return results from different sources. A good example of basic metasearch is the public site operated by Scott Banister. A single query can be passed serially to Yahoo!, AltaVista, and Lycos. The user must review three lists of hits. Metasearch is well past its prime.

Ixquick, operated from lower Manhattan, bumps metasearch up a notch in usefulness. A single query is automatically passed against AltaVista, Kanoodle, Overture (heaven help us), WiseNut, and the Open Directory. The user can tick the search engines to include. The results are merged into a single list and ranked by relevance using Ixquick's own algorithms. These combine popularity plus other metrics derived, in part, from the type of maths used for calculating the financial return on index funds. Similar functionality is on offer at the oddly-named Dogpile, an InfoSpace property.
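For readers who want to see what merging several hit lists into one looks like in practice, here is a minimal sketch in Python. It uses reciprocal rank fusion, a well-known general-purpose technique, as a stand-in; it is not Ixquick's algorithm, which the company has not published. The function name, the constant k, and the example URLs are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_results(result_lists, k=60):
    """Merge several ranked lists of URLs into a single ranked list.

    Reciprocal rank fusion: each engine contributes 1 / (k + rank) for
    every URL it returns, so URLs ranked highly by several engines rise
    to the top. Illustrative only; not Ixquick's actual method.
    """
    scores = defaultdict(float)
    for hits in result_lists:
        for rank, url in enumerate(hits, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical hit lists returned by three engines for the same query.
engine_a = ["a.example/1", "b.example/2", "c.example/3"]
engine_b = ["b.example/2", "a.example/1", "d.example/4"]
engine_c = ["b.example/2", "c.example/3", "a.example/1"]

merged = merge_results([engine_a, engine_b, engine_c])
print(merged)
```

The page that two engines rank first and the third ranks second ends up on top, and the page only one engine found drops to the bottom. A production service would presumably fold in other signals as well.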

However, the metasearch technologies do not do an effective job of passing a query against branded, for-fee content, specific Web sites, and internal repositories in Lotus Notes databases. These firms have enterprise software focused on customers with hefty budgets for bespoke information, extensive system customisation, and sometimes precious little common sense.

I was half-listening to a pitch from a professional presenter when I overheard a serious looking man say to his companion, "Interesting, that Cogenta. I quite like their metasearch product."

"Cogenta," I thought. "Quite a good name, suggests clear thinking, on point."

Cogenta is a six-person operation in Farnham, Surrey. (Farnham does not connote technology; Farnham resonates with the Bear and Ragged Staff pub, the sluggish River Wey, or the sleepy Lion and Lamb Yard.)

Cogenta occupied a modest two metre by three metre stand near the rear fire exit. When I arrived, the stand was hard to see because of the clot of attendees peering at laptop monitors.

The 30-something president, David Phillips, explained Cogenta this way: "With our software you can build a current Web index of the sites that you care about and use them as jumping off points to the rest of the relevant Web."

Mr. Phillips continued: "Google and the other search engines try to index the entire Web. But most professionals care about a small number of Web sites. Some may be public and available without charge. Others may require the user to pay and enter a username and password to access the content. Public search engines and search engines bundled with Intranet portal software are not geared to the needs of an individual chemist, journalist, marketing manager, or solicitor."

The crowd of about 20 people nodded in unison like a group of bobble-head plastic dogs. Mr. Phillips rolled on: "How many of you wade through all the hits returned from a Google search?"

No bobbleheads raised their paws.

Mr. Phillips continued, "With Cogenta, you can use our prepackaged list of Web sites for specific business functions, add your own sites, or do both. You can include content from your own Intranet as well."

"You decide when to search these sites. Go off and have a latte [much head bobbing] and when you come back, you have a personal index of the Web documents and sites you want to look at, including pages from sites you never knew existed."

The clump of listeners crowded the monitors. Mr. Phillips lowered his voice an octave, "You can copy the Cogenta results to your laptop or office machine and review it at your leisure. Cogenta offers point-and-click data exploration tools that go far beyond those available from shareware authors or the low-cost metasearch tools aimed at students."

Cogenta, unlike most Web indexing operations, offers its software as a desktop application, as does Copernic, the Canadian maker of search and retrieval tools. Unlike Copernic, Cogenta handles Intranet and for-fee content. More importantly, the infrastructure of Plumtree and equivalent enterprise software is simply not needed with Cogenta's technology.

The idea for a personal indexing service came from Mr. Phillips' experiences in document imaging and knowledge management. After the exhibits closed, he said, "The Cogenta idea is obvious, really: give professionals control over their time and their content. In my corporate work, I saw first-hand that a great many professionals were searching Yahoo! and AltaVista plus a few narrowly-focused sites. What professionals needed was a way to index the sites of importance to them. Google and the other search engines were there for broad, get-oriented searches. The enterprise systems weren't set up to meet the needs of one person with special research requirements. We fill the gap."

Unlike products from many mainstream providers, Cogenta is built using Microsoft's Visual Studio .NET tools. The spider, the parser, the search engine, and the interface are Dot Net compliant. Mr. Phillips believes the Dot Net approach will reduce the time needed to develop new features. Cogenta says that it will release a Web browser version of the product early in 2003.

Here's what the software does. It takes a query and generates a page of results. The user can also copy specific Web pages of content to his computer. Cogenta provides drag-and-drop tools that plonk the document in a specific folder. Cogenta makes filing and sorting online content dead simple.

"The use of Windows Explorer-type functionality seems obvious, doesn't it?" asks Mr. Phillips. "No training is needed to use our software. Anyone familiar with the Windows interface can generate a personal Web index, file documents, and retrieve them without worrying about network access or if the source document is still on a Web server."

There is a staggering variety of free Web research tools. Why does the world need Cogenta? Mr. Phillips said, "The expert documentalist can do just about anything with Internet browsers and shareware. The average consumer relies on keying one or two words and hitting the enter key or pointing at a Yahoo! listing and taking the first hit that pops up."

"It seems we struck a chord with the individual professional who does not want to invest the time to become an Internet search wizard, nor does this professional want to do research by looking at for-fee listings that appear at the top of a Web search engine's result page. Our focus is on the busy professional who wants content from for-fee sites, his organization's Intranet, and more than Yahoo! and less than 300,000 hits on AltaVista."

The majority of business professionals spend too much time searching and not enough analysing. Cogenta has been engineered to flip this model so that research occupies less time, thus creating more time for the analysis of information.

Cogenta offers packages of Web sites ready to plug into the spider and indexing system. Added Mr. Phillips, "We are expanding our bundles of Web sites, working with subject specialists. The response to prepackaging important sites by topic is evidence that our blend of software, functions, and sites is on target."

One of the most important technologies in the Cogenta software is its original relevancy ranking algorithms. These are built using its expertise in computational linguistics. A patent has been filed.

"The results appear with the most relevant hits at the top of the list. However, with a mouse click, results can be displayed by url, date, or topic." Mr. Phillips noted, "We have just started the patent process. We want to nail down our approach to 'smart relevancy' quickly. Each time a user interacts with our list of hits, we are able to make adjustments to the algorithm's functions for a particular user. Over time, Cogenta delivers more relevant results ranking because it learns from the user's actions."
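Cogenta has not disclosed how its "smart relevancy" works, so the following Python sketch is only one plausible reading of "learns from the user's actions." It assumes each click nudges up a per-user boost for the site that was clicked, and that the boost multiplies the base relevance score. The class name, the step size, and the example URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


class ClickFeedbackRanker:
    """Re-rank hits for one user based on the sites that user clicks on."""

    def __init__(self, step=0.1):
        self.step = step                      # boost added per click (assumed)
        self.site_boost = defaultdict(float)  # host name -> accumulated boost

    def record_click(self, url):
        """Remember that the user opened a hit from this site."""
        self.site_boost[urlparse(url).netloc] += self.step

    def rank(self, hits):
        """Order (url, base_score) pairs by boosted score, best first."""
        def adjusted(hit):
            url, base_score = hit
            boost = self.site_boost.get(urlparse(url).netloc, 0.0)
            return base_score * (1.0 + boost)
        return [url for url, _ in sorted(hits, key=adjusted, reverse=True)]


# Hypothetical hits with base relevance scores from the search engine.
hits = [("http://news.example/a", 0.50), ("http://journal.example/b", 0.48)]

ranker = ClickFeedbackRanker()
before = ranker.rank(hits)
ranker.record_click("http://journal.example/b")
ranker.record_click("http://journal.example/b")
after = ranker.rank(hits)
print(before, after)
```

After two clicks on the journal site, its hit overtakes the news site's hit for this user. Sorting by url, date, or topic, as Mr. Phillips describes, would simply be a different sort key on the same list.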

Cogenta has other intelligence embedded in its software. For example, when a user enters a query, the Cogenta software performs a "more like this" and a "see also" search. User actions instruct the spider to look for additional links on Web pages that the user clicks on. Over time, new urls are added to user queries.

Mr. Phillips said, "We build into Cogenta round robin and queues for spidering. A user can set the type of timing function as well as how aggressively the spider visits sources. The result is that Cogenta, unlike the bandwidth-hogging pioneer in personal indexing, the oft-vilified PointCast, can mesh with typical office networks."

Because of the technology framework, Mr. Phillips sees numerous opportunities for tailoring Cogenta. The software slips into small- and mid-sized organizations where the Verities and OpenTexts of the world are wrong-sized. Cogenta's license fee is less than 1,000, compared to the five, six, and seven figures on offer from most search vendors.
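The round robin and queue-based spidering Mr. Phillips mentions can be sketched in a few lines of Python. This is an illustration of the general idea, not Cogenta's code: each site gets its own queue, the spider takes one page from each site in turn so no single server is hammered, and a hypothetical max_pages_per_site setting stands in for "how aggressively the spider visits sources." The site names and paths are invented.

```python
from collections import deque


def round_robin_schedule(site_queues, max_pages_per_site=None):
    """Yield (site, url) pairs, taking one URL from each site in turn.

    site_queues maps a site name to its list of pending URLs.
    max_pages_per_site caps the pages fetched per site (None = no cap).
    """
    active = deque()
    for site, urls in site_queues.items():
        limited = deque(urls[:max_pages_per_site])
        if limited:  # skip sites with nothing to fetch
            active.append((site, limited))

    while active:
        site, urls = active.popleft()
        yield site, urls.popleft()
        if urls:  # site still has pages: send it to the back of the line
            active.append((site, urls))


# Hypothetical pending pages for three sites.
queues = {
    "a.example": ["/1", "/2", "/3"],
    "b.example": ["/1"],
    "c.example": ["/1", "/2"],
}

order = list(round_robin_schedule(queues))
gentle = list(round_robin_schedule(queues, max_pages_per_site=1))
print(order)
print(gentle)
```

A real spider would also pause between requests; that timing function is left out here to keep the ordering logic visible.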

How can one license Cogenta? The six-month-old company relies on resellers of customer relationship management, document management, and portal software to vend Cogenta. The software's architecture offers resellers an opportunity to sell value-added services and fill a big hole in the mainstream enterprise software products.

Mr. Phillips said, "We refuse to compete with Autonomy or Verity. Our software meshes with these companies' products and in fact relies upon them to find internal documents. We add functionality, and we can use the indexes of internal content built by Autonomy and Verity systems."

Cogenta uses what it calls "adaptors." The company offers these software modules for Verity, Lotus Notes, and Microsoft Exchange. The idea is that Cogenta can make available information from content repositories built by any of these products and merge it into relevance-ranked results pages.
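The adaptor idea is a familiar software pattern: every repository is wrapped in a module that answers the same question in the same way, so the results can be pooled and ranked together. The Python sketch below shows the shape of it under stated assumptions. The Adaptor interface, the in-memory stand-in repositories, and the term-count scoring are all hypothetical; real adaptors would call the Verity, Lotus Notes, or Microsoft Exchange interfaces, which are not shown here.

```python
from abc import ABC, abstractmethod


class Adaptor(ABC):
    """Common interface every repository adaptor is assumed to implement."""

    @abstractmethod
    def search(self, query):
        """Return a list of (source, title, score) tuples."""


class InMemoryAdaptor(Adaptor):
    """Stand-in for a real repository: a dict of title -> document text."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents

    def search(self, query):
        terms = query.lower().split()
        hits = []
        for title, text in self.documents.items():
            words = text.lower().split()
            score = sum(words.count(term) for term in terms)  # crude relevance
            if score:
                hits.append((self.name, title, score))
        return hits


def federated_search(query, adaptors):
    """Query every adaptor and merge the hits, most relevant first."""
    hits = [hit for adaptor in adaptors for hit in adaptor.search(query)]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


notes = InMemoryAdaptor("notes", {"Q3 memo": "budget forecast budget"})
mail = InMemoryAdaptor("mail", {
    "Re: lunch": "sandwich order",
    "Forecast thread": "forecast for next quarter budget",
})

results = federated_search("budget forecast", [notes, mail])
print(results)
```

Adding a new repository means writing one more adaptor class; the merging code does not change.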

The secret sauce for Cogenta is, according to Mr. Phillips, that "Most of the enterprise products have no provision to allow a financial professional to get his must-see Web sites, for-fee financial data, and internal information in one indexed service under his control at all times. A user controls the Cogenta software, not a programmer."

One weakness in Cogenta is that it does not build a full-text index of Web sites or internal content. Traditional keyword searching is somewhat slower than using a search engine from DT Search, for example. However, this is also a Cogenta strength. With no full-text index of its own, Cogenta has nothing that can go stale. Cogenta also avoids the impossible goal of one universal index. (A beefed up search-and-retrieval tool is coming in a few months, according to Cogenta's engineers.)

"We have," said Mr. Phillips, "an aggressive development roadmap and an investor who wants us to expand rapidly. We now have more than 200 business cards and several dozen hard leads."

Today's business climate can be challenging to startups. The question is, "Will Cogenta's metasearch join Castle Hill as one of Farnham's top attractions in 2003?"

Web site:
Headway Ho
Crosby Way
Farnham Surrey


