Brainware’s Growth Hits 900 Percent

March 31, 2008

James Zubok, chief financial officer of Brainware, a search and content processing company in northern Virginia, revealed Brainware’s rapid growth in the last calendar year. In an exclusive interview, Mr. Zubok said, “In less than two years we’ve experienced remarkable growth. Our sales have grown by more than 900 percent and we’ve doubled our sales force. We’re in these larger Ashburn offices because we ran out of space in our previous facility.”

You can read the full interview at ArnoldIT.com’s Search Wizards Speak service: http://www.arnoldit.com/search-wizards-speak/brainware.html. Other “search wizards” participating in this series include executives from Endeca, ISYS Search Software, Vivisimo, and others.

Stephen Arnold, March 31, 2008

Brainware’s James Zubok Interviewed

March 31, 2008

Privately held Brainware, once a unit of the German high-tech content management vendor SER Systems AG, is expanding rapidly, the company told Stephen Arnold, managing partner of ArnoldIT.com. The company uses a patented system and method anchored in numerical processes.

James Zubok, an attorney and the company’s chief financial officer, said in an interview on March 30, 2008: “In less than two years we’ve experienced remarkable growth. Our sales have grown by more than 900 percent and we’ve doubled our sales force.”

The complete interview appears as part of the Search Wizards Speak series available on the ArnoldIT.com Web site.

Brainware has a patented method for processing text, in sharp contrast to the dozens of vendors who index by key word and then try to discover metadata. The technique involves trigrams, or three-letter sequences. Mr. Zubok described the system in this way:

When we index the word “BRAINWARE” we store a representation of the following trigrams: “BRA”; “RAI”; “AIN”; “INW”; etc. We create a similar trigram representation of all of the text in a search query. During a search, instead of trying to match up entire words, we match the trigrams, which allows our application to be incredibly fault tolerant. Even if some of the trigrams are not a match, our search yields relevant results without relying on any dictionaries or other pre-defined rules.
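The trigram idea in Mr. Zubok's description can be illustrated with a few lines of Python. This is a minimal sketch, not Brainware's patented method: the function names `trigrams` and `trigram_score` are hypothetical, and the scoring rule (the share of the query's trigrams found in the indexed term) is an assumption chosen only to show why partial trigram matches tolerate misspellings.

```python
def trigrams(text):
    """Return the set of three-character sequences in text, upper-cased."""
    s = text.upper()
    return {s[i:i + 3] for i in range(len(s) - 2)}


def trigram_score(query, candidate):
    """Fraction of the query's trigrams that also occur in the candidate.

    Assumed scoring rule for illustration only; returns 0.0 when the
    query is too short to yield any trigram.
    """
    q = trigrams(query)
    if not q:
        return 0.0
    return len(q & trigrams(candidate)) / len(q)


if __name__ == "__main__":
    # The indexed term and a misspelled query share BRA, RAI, AIN, and INW.
    print(sorted(trigrams("BRAINWARE")))
    print(trigram_score("BRAINWEAR", "BRAINWARE"))
```

A misspelled query such as "BRAINWEAR" still shares four of its seven trigrams with "BRAINWARE", so it scores well above an unrelated word even though the two strings do not match as whole words and no dictionary is consulted.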

The system lends itself to some high-value applications; for example, patent application and patent analysis, email discovery, and competitive intelligence activities.

One interesting aspect of the Brainware approach to content processing is its work flow functions. Mr. Zubok said:

We have workflow solutions for our intelligent data capture offerings (they have embedded search capabilities). We have two workflow applications: WF-distiller, which is our principal workflow component that is used for creating and managing workflows of all types of complexities; and A/P-WebDesk, a specialized workflow module built using WF-distiller but used specifically for Accounts Payable management. A/P-WebDesk (which includes A/P-WebDesk for SAP, a version built specifically for seamless integration with SAP) provides an easy-to-use interface to manage the entire invoice processing lifecycle.

The company’s system can be “tuned” using additional word lists and knowledge bases. You can read the complete interview with James Zubok here. More information about Brainware is available on the company’s Web site. You can download a trial version of the desktop build of Brainware’s search and content processing system from the Brainware.com Web site.

Stephen Arnold, March 30, 2008

A TV First for Google

March 30, 2008

At about 6 pm Eastern time, a Davidson student held up a sign that told basketball fans, “Davidson. Just Google it.” With US television ad rates chewing through some companies’ budgets, Google scored today. Google and basketball–an eyeball slam dunk.

When online companies run adverts, serious money changes hands. Google has reached something of a cult status, at least among students at Davidson College, a small, elite institution not far from Charlotte, North Carolina.

Microsoft, Yahoo, Autonomy, and Fast Search & Transfer will consider getting signs into the hands of basketball fans during next week’s collegiate basketball finals.

Stephen Arnold, March 30, 2008

Search: The Wheel Keeps on a Turnin’

March 30, 2008

In the late 1990s, I learned about a news aggregator. The company was Retrieval Technologies. The company’s founder had a great idea–aggregate news and make it available in real time. The product was News Machine. Among its features in 1995 was on-the-fly classification. In retrospect, News Machine was a proprietary version of today’s RSS (really simple syndication).

That company was acquired by an outfit called Sagemaker in 1999. Sagemaker was one of the first companies providing a dashboard, vertical business intelligence, and the News Machine’s real-time updates–on a Microsoft Windows platform.

The idea was that the Intranet was “a management tool”. Instead of search, Sagemaker provided users with personalization tools. The idea was that a “one size fits all” approach to search and retrieval was not what companies wanted. The Sagemaker system federated information from behind-the-firewall sources and external sources. The public Internet could be harvested. The system could also ingest analyst reports and make those available to Sagemaker users. Sagemaker called these types of for-fee, third-party materials “branded content”. On the back end, Sagemaker included a usage tracking system. At the time, I thought it was quite robust, and it offered the type of granularity that online Web search systems now have in place.

A Forward-Looking Approach to Search

In my files I located this overview of the Sagemaker architecture. The acronym EIP stands for Enterprise Integration Platform. The idea is that functions–what Sagemaker called “card slots”–were plugged into the EIP. XML was the lingua franca of the system. Java was used for the messaging service and the server was based on Java. Sagemaker, therefore, was a pioneer in merging Java servers with Windows. More intriguing was that parts of the Sagemaker service were hosted; that is, the functions ran from the cloud. Other functions–the graphical interface and the code that was installed on the licensee’s premises–were Windows.

[Figure: overview of the Sagemaker architecture]

I find that this approach was unable to generate sufficient traction to sweep the enterprise market. Sagemaker competed with Plumtree (now part of BEA, which is now part of Oracle) and Documentum, which is now part of EMC, the storage company turned tech conglomerate. Read more

Search Hoops: Exercising Technology to Meet User Needs

March 29, 2008

A “hoop” is a circular band that binds a barrel’s staves together. “Hoops” has a more informal meaning; the word is a synonym for basketball. In Kentucky, you say, “The Louisville Cardinals shoot serious hoops”. This sentence won’t make much sense in Santiago, Chile, but it does at the local gas station.

Search “hoops” are different. These are technical spaces that make it possible for a person to look for information. The figure below shows a series of search hoops. I want to take a few minutes to talk briefly about each of these with particular emphasis on their relationship to behind-the-firewall search. As you know, I think the term enterprise search is essentially valueless. It’s become an audible pause mouthed by vendors of many shapes and sizes. When I hear it, I’m baffled. Truth be told, most of the vendors who use the term enterprise search don’t know what it means. The job of explaining its meaning is left to the pundits and mavens who earn a living blowing smoke to explain fuzziness. Visibility and comprehension hit the two to four inch range.

This is a diagram from a report I wrote for a company silly enough to pay me for an analysis of the online search-and-retrieval trends in the period 1975 to 2003. I have an updated version, but that’s something I sell to buy Kibbles and Bits for my beloved boxer dog Tyson.

[Figure: the search hoops diagram]

© Stephen E. Arnold, 2002-2008

Please, click on the image so you can read the textual annotations to each of the rings. I’m not going to repeat the information in the diagram’s annotations. I will relate these “hoops” to the challenge of behind-the-firewall search.

Read more

Exalead Adds Content Connectors

March 29, 2008

Exalead–a provider of search and content processing systems–said that it has added software connectors for Alfresco, FileNet P8, Hummingbird DM, Interwoven TeamSite, Microsoft SharePoint, and IBM Lotus QuickPlace. These newly supported enterprise applications perform content and data management operations. Exalead’s system can now seamlessly access information in these systems’ content repositories.

These connectors supplement Exalead’s existing connectors for Microsoft Exchange, Lotus Notes, and common file types such as Word, PowerPoint, and Excel. Exalead also provides application programming interfaces that can be used to integrate the Exalead content processing system with enterprise applications, among other custom operations.

According to Exalead, the connectors are provided by EntropySoft, a firm focused on the integration of unstructured data. Exalead said, “Organizations today rely on a variety of data sources. Partnering with EntropySoft will allow us to build upon the enterprise connectors we have already developed.”

The deal allows Exalead to integrate EntropySoft bidirectional connectors into exalead one:search. More information is available here.

Stephen Arnold, March 28, 2008

A 12-Step Program for Behind-the-Firewall Search

March 28, 2008

In 2006, one of the young engineers working on a search system at a large company said to me, “I’m in a 12-step program for this !%$&^ search system–two six packs of beer.”

This clever and stressed young engineer was the “owner” of her employer’s blue-chip, high-profile, it-slices-it-dices search system. The young wizard was learning that high marks in computer science do not a smooth behind-the-firewall search system make.

I kept this “12-step” tag in my mind. In late 2006, I used this graphic to illustrate one way to deploy a behind-the-firewall search system with few hassles and certainly no recourse to alcohol.

[Figure: the 12 steps]

Let me run through the 12 steps and conclude with a reminder that short cuts can lead to some interesting challenges.

Step 1. You will need a team to assist you with your behind-the-firewall search project. Search has quite a few moving parts. Working alone is not a good idea.

Step 2. You need to know a great deal about the content you plan to index. You want to know how much content you must index; how much change occurs in the content; how much new content becomes available every day, week, month, and year; access constraints; file types; and special issues such as chemical structures that must be indexed, among other points.

Step 3. You need to know what problem your behind-the-firewall search system is to solve. Is it key word search relevancy, or are you deploying a business intelligence system?

Step 4. You need to have a clear idea about who can access what information. If your organization has a security officer who handles these details, bond with this person. If not, you will need to take steps to manage access to information processed by the system. Allowing colleagues to see health and salary data without authorization creates new challenges.

Step 5. You need to have a clear statement of system requirements. Keep in mind that you want to focus on the must-have features. The “nice to have” requirements should be winnowed from the “must have” requirements. Focus on the “must haves”. Read more

Northern Light: A New Business Information Search Service

March 27, 2008

Northern Light has made available a free business information search service. You can try it yourself at www.nlsearch.com. Search and browse are free, but you will have to pay to access certain content. A day pass is priced at about $5.00 and enterprise licenses are available.

Northern Light, in the mid-1990s, offered a somewhat similar service. The company received an infusion of capital from Reuters in 1999. By 2002, the company had become part of the now-defunct divine Interventures. Northern Light is once again a self-standing company. David Seuss, the former consultant who founded the firm, is once again running Northern Light.

Northern Light was one of the first search systems to enhance its results list with folders grouping similar results. More information is available from the Northern Light Web site. Information Today’s Paula Hane’s story has additional details about the service here.

Stephen Arnold, March 27, 2008

Search: The Three Curves of Despair

March 27, 2008

For my 2005 seminar series “Search: How to Deliver Useful Results within Budget”, I created a series of three line charts. One of the well-kept secrets about behind-the-firewall search is that costs are difficult, if not impossible, to control. That presentation is not available on my Web site archive, and I’m not sure I have a copy of the PowerPoint deck at hand. I did locate the Excel sheet for the chart which appears below. I thought it might be useful to discuss the data briefly and admittedly in an incomplete way. (I sell information for a living, so I instinctively hold some back to keep the wolves from my log cabin’s door here in rural Kentucky.)

Let me be direct: Well-dressed MBAs and sallow financial mavens simply don’t believe my search cost data.

At my age, I’m used to this type of uninformed skepticism or derisory denial. The information technology professionals attending my lectures usually smirk the way I once did as a callow nerd. Their reaction is understandable. And I support myself by my wits. When these superstars lose their jobs, my flabby self is unscathed. My children are grown. The domicile is safe from creditors. I’m offering information, not re-jigging inflated egos.

Now scan these three curves.

[Figure: the three search curves]

© Stephen E. Arnold, 2002-2008.

You see a gray line. That is the precision / recall curve. This refers to a specific method of determining if a query returns results germane to the user’s query and another method for figuring out how much germane information the search system missed. Search and a categorical affirmative such as “all” do not make happy bedfellows. Most folks don’t know what a search system does not include. Could that be one reason why the “curves of despair” evoke snickers of disbelief? Read more
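The two measures behind the gray line can be made concrete with a small calculation. This is a minimal sketch under stated assumptions: the document identifiers are invented for illustration, the function name `precision_recall` is hypothetical, and the sketch assumes someone has already judged which documents are germane, which is the hard part in a real behind-the-firewall system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision: share of returned results that are germane.
    recall: share of all germane documents that were returned.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical query: four results returned, five germane documents exist.
    returned = ["d1", "d2", "d3", "d4"]
    germane = ["d2", "d4", "d7", "d8", "d9"]
    print(precision_recall(returned, germane))
```

In this made-up case two of the four results are germane, and three germane documents never surface. The user sees the first number's effect on the results page but has no way to see the second, which is the point about not knowing what a search system leaves out.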

Autonomy: Leading the Search Herd with Its Positioning

March 26, 2008

Autonomy Corporation rolled out its Pan-Enterprise Search platform at a trade show in Baltimore, Maryland, on March 26, 2008.

The company has been able to stay one or two steps ahead of other behind-the-firewall search vendors since the company rolled out its “portal in a box” campaign in 1999. Autonomy was first out of the gate with its smart desktop search system Kenjin in 2000. Then Autonomy was one of the first search-and-retrieval vendors to redefine its system as a platform.

Today’s announcement gives IDOL a positioning that may force super platform vendors such as IBM, Microsoft, and Oracle to do a better job of explaining what their behind-the-firewall systems deliver to a customer.

The Sun Herald quoted Mike Lynch, founder and chief executive officer of Autonomy, as saying:

Despite standardization efforts, information is scattered across the enterprise among different vendors’ software, in different formats, and among numerous servers and laptops. Autonomy’s Pan-Enterprise Search platform is the only FRCP [Federal Rules of Civil Procedure]-compliant enterprise search platform available in the market, delivering a single unified and vendor-neutral platform for searching all [sic] file formats and media-types for legal and business purposes.

New IDOL features in the Pan-Enterprise Search platform include video indexing, enhanced geographic clustering, and an improved relevancy method. The new approach–intent-based ranking–uses algorithms to determine a user’s intent. Autonomy asserts that its new approach matches results to the user’s context. Autonomy said it made changes to enhance system performance. A new multi-dimensional index rounds out the information platform.

Additional information about the Pan-Enterprise Search platform is available from Autonomy.

Stephen Arnold, March 26, 2008
