IBM OmniFind Text Search Server White Paper
November 4, 2010
IBM has published “Exploring the IBM OmniFind Text Search Server.” As of November 3, 2010, you can download a copy without charge from this link. The page on the IBM site for this publication is on the PartnerWorld subsite. The white paper covers OmniFind (based on Lucene) as a system that performs “high speed linguistic text searches against DB2 text data and documents stored in rich-text formats.” This suggested to me that OmniFind can query supported files in structured and unstructured formats. Structured, as I understand the term, means content residing in relational databases like IBM’s own DB2. Unstructured, as I understand the term, means information like email, standard office worker file types like Microsoft Word, and Web pages. A list of supported file types appears in the white paper, and it includes a representative sample of what OmniFind can access; for example, XML, the ever popular Lotus Freelance, JustSystems Ichitaro, and Quattro Pro. The white paper dives into specific commands that are useful to an engineer installing the system. One interesting feature is that the system makes it easier to index “external data”. Manually checking that what you want indexed has in fact been indexed is a useful best practice. The XML search discussion makes clear that the user should be comfortable looking at XML markup. Bob and Betty in marketing are likely to find the tags off-putting. If you are interested in learning how open source search technology can be used by a proprietary company in a commercial product, the white paper is worth a read. I think it is better than the write-ups from some of the azurini, who have discovered revenue by selling marketing services to search vendors.
Stephen E Arnold, November 4, 2010
Freebie