Hlava on Machine Assisted Indexing
September 8, 2011
On September 7, 2011, I interviewed Margie Hlava, president and co-founder of Access Innovations. Access Innovations has been delivering professional taxonomy, indexing, and consulting services to organizations worldwide for more than 30 years. In our first interview, Ms. Hlava discussed the need for standards and the costs associated with flawed controlled term lists and some loosely formed indexing methods.
In this podcast, I spoke with her about her MAI, or machine assisted indexing, technology. The idea is that automated systems can tag high volume flows of data in a consistent manner. The “big data” challenge often creates significant performance problems for some content processing systems. MAI balances high speed processing with the ability to accommodate the inevitable “language drift” that is a natural part of human content generation.
In this interview, Ms. Hlava discusses:
- The value of a neutral format so that content and tags can be easily repurposed
- The importance of metadata enrichment, which allows an indexing process to capture the nuances of meaning and to apply the tagging required to let a user “zoom” to a specific location in a document, pinpoint the entities in a document, and generate automated summaries of documents
- The role of an inverted index versus the tagging of records with a controlled vocabulary.
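The contrast in the last point can be sketched in a few lines of Python. This is only an illustration, not a description of how MAI works: the sample records, the two-term vocabulary, and the function names `build_inverted_index` and `tag_with_vocabulary` are all hypothetical and invented for this example.

```python
import re
from collections import defaultdict

# Hypothetical sample records, invented for illustration only.
DOCS = {
    1: "The physician prescribed aspirin for the heart attack patient.",
    2: "Myocardial infarction risk falls with regular exercise.",
    3: "Search engines build an index of every word.",
}

# Hypothetical controlled vocabulary: preferred term -> accepted variants.
VOCAB = {
    "Myocardial infarction": ["myocardial infarction", "heart attack"],
    "Information retrieval": ["search engine", "search engines"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map every word that occurs in the text to the records containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def tag_with_vocabulary(docs, vocab):
    """Assign each record the preferred terms whose variants appear in it."""
    tags = {}
    for doc_id, text in docs.items():
        normalized = " ".join(tokenize(text))
        tags[doc_id] = sorted(
            term
            for term, variants in vocab.items()
            if any(
                re.search(r"\b" + re.escape(variant) + r"\b", normalized)
                for variant in variants
            )
        )
    return tags


if __name__ == "__main__":
    index = build_inverted_index(DOCS)
    tags = tag_with_vocabulary(DOCS, VOCAB)
    # The inverted index only knows the literal words in each record.
    print(sorted(index["infarction"]))
    # The controlled vocabulary groups records that use different wording.
    print([doc_id for doc_id, terms in tags.items() if "Myocardial infarction" in terms])
```

In this toy data, a word lookup for "infarction" reaches only the record that uses that word, while the vocabulary tag also reaches the record that says "heart attack". That is the kind of consistency a controlled term list is meant to provide on top of a plain word index.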
One of the key points is that flawed indexing contributes to user dissatisfaction with some search and retrieval systems. She said, “Search is like standing in line for a cold drink on a hot day. No matter how good the drink, there will be some dissatisfaction with the wait, the length of the line, and the process itself.”
You can listen to the second podcast, recorded on August 31, 2011, by pointing your browser to http://arnoldit.com/podcasts/. You can get additional information about Access Innovations at this link. The company publishes Taxodiary, a highly regarded Web log about indexing and taxonomy related topics.
Stephen E Arnold, September 8, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Protected: More SharePoint Myths Dispelled
September 8, 2011
Protected: SharePoint and the Future of Technology
September 7, 2011
Protected: SharePoint Social Tools Gets a Major Endorsement
September 6, 2011
Protected: Simplifying Email within SharePoint: An ABB Case Study
September 5, 2011
Oracle Text Release Notes Jackpot
September 2, 2011
Short honk: OraChat has posted a healthy list of useful release notes worth referencing under “Getting Hold with Oracle Database 11gR2 RAC”.
A few highlights:
- Complete Checklist for Manual Upgrades to 11gR2 (Doc ID: 837570.1)
- How to Download and Run Oracle’s Database Pre-Upgrade Utility (Doc ID: 884522.1)
- Release Schedule of Current Database Releases (Doc ID: 742060.1)
These are just the tip of the Oracle information iceberg; there are dozens of documents listed. When you find you need help installing, running, building, adding on, or performing any number of Oracle activities you may never have considered, I would recommend perusing this list. It is a great and comprehensive collection.
Go ahead and bookmark it. Oracle and its search solutions may not light up the webinar, blog, and social conference circuit. But the various Oracle search solutions have purchase in most major organizations.
Sarah Rogers, September 2, 2011
Sponsored by Pandia.com
Protected: Another SharePoint Start Guide
September 2, 2011
Protected: Indexing FileNet from SharePoint
September 1, 2011
Protected: Complex SharePoint Performance Management
August 31, 2011
Protected: Calculated Field Formulas for SharePoint Made Easy
August 30, 2011