Thunderstone Texis Version 6 Released

March 21, 2011

Thunderstone details important changes to Texis in Version 6 in “Texis Version 6 Features and Changes”:

“Texis version 6 introduced many new features and enhancements. Some existing features were also modified to have different behavior. The following is a discussion of changes from Texis version 5 to 6, starting with a general overview of important changes, loosely grouped by functionality. All changes are then discussed in more detail in the sections that follow.”

Check this list before you upgrade in order to ensure a smooth transition. Keep in mind that the name Thunderstone is shared with a band. You may encounter some false drops when you run a query for the word “Thunderstone” without additional search terms. We recommend using the phrase “Thunderstone Texis.”

Cynthia Murrell, March 21, 2011

Freebie

Consultant Asserts the Obvious

March 20, 2011

Years ago, I worked at the former blue chip consulting firm Booz, Allen & Hamilton. At that time, the firm was generating studies of world economic change, updates to the definitive discussion of new product development, and groundbreaking studies in technical innovation methods. Now we learn that executives are distracted. Okay.

I learned about this obvious statement in “Executives Say They’re Pulled in Too Many Directions, According to Booz & Co. Survey.” According to the write up:

“The survey results tell us that deciding on priorities is a huge issue for companies – and that actually linking priorities to decisions is a hurdle that few companies get past. We see this ‘incoherent’ operating environment across industries and geographies, among all types of companies. It’s draining – and forcing companies to pay a significant penalty. We call it the incoherence penalty,” said Paul Leinwand, co-author of the just-released book “The Essential Advantage: How to Win with a Capabilities-Driven Strategy” (Harvard Business Review Press, December 2010).

When I read this, I thought about the type of research and marketing that consulting firms are forced to do to maintain their revenues. Some firms have become more like boutique marketing shops. Others are emulating PageRank and looking for topics that generate clicks. Booz seems to be blazing a path by putting numbers behind what most business professionals know. In a meeting, no one pays much attention. Distractions are the name of the game. People come and go, and most don’t know anything about Michelangelo.

I relate almost everything I read to search and information access. I wonder how distracted executives can make good decisions. I thought about consulting firms trying to sell obvious generalizations to procurement teams more interested in fiddling with iPhones than figuring out whether the technical explanations were on point or even accurate.

The Booz study offers some evidence that we live in a PageRank world. No wonder it is hard to find valid, useful, substantive, actionable information.

Stephen E Arnold, March 20, 2011

Access Innovations and IEEE Team Up

March 20, 2011

Access Innovations has cultivated a solid relationship with the Institute of Electrical and Electronics Engineers, the foundation of which seems to be their Data Harmony software series.

Access Innovations is one of the leaders in indexing, controlled vocabulary development, and taxonomies. The company has a long, successful track record in helping organizations develop thesauri and controlled vocabularies. It also has proprietary software which can perform automatic content tagging.

IEEE, which is responsible for close to a third of the technical publications circulated around the globe, has now sought the firm’s help in revamping how its Xplore library catalogues the massive amounts of data stored within.

Access Innovations said:

“To complete the latest project, Access Innovations used an implementation of Data Harmony Metadata Extractor to determine the article’s content type and then built an improved rules base to identify content types in order for each type to be indexed in a specific way using the IEEE Thesaurus.”

Access Innovations’ system provides users the ability to outline and remove information from the source, compiling a fresh record in the process. This marks yet another lucrative venture for the 33-year-old company, which services a variety of academic institutions and government agencies.

Micheal Cory, March 20, 2011

Freebie

IBM OmniFind Fix Pack Failure

March 19, 2011

Shades of Microsoft’s Windows 7 phone update. Big companies seem to have some issues with details. The Apple method obviously does not have much of an impact on some big outfits.

If you are having problems with the OmniFind Enterprise Edition Fix Pack, here’s help.

IBM provides the solution: “Starting Stellent Session Fails with rc=251658477 after Fix Pack Is Applied.” The error message appears in the CCL’s log and the Stellent session does not start with the esadmin check command.

The following message occurs:

  • com.ibm.es.ccl.server.responders.sys.SessionAttachMessageHandler doMessage
  • SEVERE: Attaching session “col1.Stellent” failed because of there are no message handlers for that session.

IBM offers this additional information:

“Other sessions are getting started but Stellent session fails. Thus you can still crawl files and might be able to parse some file types (such as html, text, etc) but you can not parse/index binary files such as PDF, Microsoft Office related files that goes through Stellent session.”

The article does offer a workable solution. Haste creates work and, of course, generates consulting revenues. IBM professionals are very, very busy.

Whitney Grace, March 19, 2011

Freebie

IBM OmniFind Tip: Corrupt Index Ruining Your Day?

March 17, 2011

Short honk: IBM’s support page contains a little item titled “OmniFind Enterprise Edition Returns Extra Invalid Search Results when Index is Corrupted.” When using OmniFind 9.1, fixpack 1, a power outage during your crawl can corrupt results, causing invalid search data to be returned. Fortunately, the fix is not difficult: just re-crawl. Time-consuming, but easy. So this is open source. What happens with Watson? Interesting question.

Cynthia Murrell, March 17, 2011

Freebie unlike IBM’s on site service and the FRUs we know and love

From Jeopardy to the Hospital: Interesting Text Retrieval Route

March 16, 2011

Healthcare researchers now have a valuable tool at their disposal, asserts eWeek.com in “IBM Collaborates with BJC, WUSM on Health Care Data Analytics.”

Working with BJC Healthcare and the Washington University School of Medicine Center for Biometrics, IBM is using its content analytics for good, extracting medical data from a whopping 50 million documents, including clinical notes, electronic health records, and diagnostic reports:

By being able to extract key data from up to 50 million documents in medical records, BJC and WUSM will be able to increase the speed of research, and therefore boost patient care. ‘You can never read 50 million documents and understand what the trends and patterns were across 50 million documents; it’s impossible,’ Rhinehart explained. ‘You couldn’t even take 500 people to do it, because there is never an efficient way to consistently understand the behavior in those documents and then figure out all the trends and patterns’.

The assembled information can be used to draw conclusions or test a hypothesis, for example. It’s about time semantic technology was applied to medical research. What better field?

Now we have some observations. First, applying semantic or other next generation search methods to medical content is somewhat less onerous than trying to figure out colloquial blog posts in Farsi. Second, IBM sells Lucene as OmniFind 9. If the technology is up to medical snuff, IBM needs to apply this method to its Web site’s search and retrieval. We find the access to IBM content on IBM’s own Web site sufficiently frustrating to give us a headache. Third, IBM is sending mixed messages. Is it search, text mining, data mining, or game show winning?

We think it is public relations and eWeek is happy to disseminate the joy.

Stephen E Arnold, March 16, 2011

Freebie unlike open source search wrapped in an OmniFind package

Microsoft SharePoint Suggestions

March 16, 2011

Here’s a useful item for you SharePoint fans and consultants. The write up “Tools and Web Parts for SharePoint 2007 and 2010” explains Web parts. This is Microsoft speak for code gadgets. What makes the article useful is that it provides a succinct summary of how a programmer can set up SharePoint to make available more suggestions to a user for his/her query. The key point is:

The only way to add more words to the suggestion feature is using a PowerShell script to add them and run the job manually or with the script. This tool was created to handle all words of suggestions in each Search Services Applications created in SharePoint 2010.

The article has a link to download the tool, as well as an explanation and ten images to explain how to use it. Web parts or web widgets can add needed functionality. Our question: “Why isn’t a more robust suggestion tool included with SharePoint?” We think the answer is that Microsoft likes to leave third parties with opportunities to earn money from the millions of SharePoint licensees. The tactic, in my opinion, is intentional incompleteness.

Stephen E Arnold, March 15, 2011

Freebie

Are Precision and Recall Making a Comeback?

March 15, 2011

Microsoft-centric BA Insight explored these touch points of traditional information retrieval. Precision and recall have quite specific meanings to those who care about the details of figuring out which indexing method actually delivers useful results. The Web world and most organizations care not a whit about fooling around with this equation.

Precision = (relevant documents retrieved) / (all documents retrieved)

And recall. This is another numerical recipe that causes the procurement team’s eyes to glaze.

Recall = (relevant documents retrieved) / (all relevant documents in the collection)

I was interested to read The SharePoint and FAST Search Experts Blog’s “What is the Difference Between Precision and Recall?” This is a very basic question for determining the relevance of query search results.

Equations aside, precision is the percentage of retrieved documents that are relevant, and recall is the percentage of all relevant documents that are retrieved. In other words, when you have a search that’s high in precision, your results list will have a large percentage of items relevant to what you typed in, but you may also be missing a lot of the relevant items in the collection.
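
For readers who want to see the arithmetic, here is a minimal Python sketch of the two measures for a single query. The function name, the document IDs, and the relevance judgments are made up for illustration; nothing here comes from the BA Insight post or from SharePoint or FAST.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document ids the search system returned
    relevant:  document ids a human judged relevant for the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # relevant documents that were retrieved

    # Guard against empty sets so we never divide by zero.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query: four results come back, three of them relevant,
# while the collection holds six relevant documents in total.
retrieved = ["doc1", "doc2", "doc3", "doc7"]
relevant = ["doc1", "doc2", "doc3", "doc4", "doc5", "doc6"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.75 recall=0.50
```

Note that the recall figure depends on knowing every relevant document in the collection, which is why someone has to judge the corpus before the numbers mean anything.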

With a search that is high in recall, your results list will contain more of the items you’re searching for, but it may also contain a lot of irrelevant items. The post points out that determining the usefulness of search results is actually simpler than this sounds:

“The truth is, you don’t have to calculate relevance to determine how SharePoint or FAST search implementation is performing.  You can look at a much more telling KPI.  Are users actually finding what they are looking for?”

The problem, in my opinion, is that most enterprise search deployments lack a solid understanding of the corpus to be processed. As a result, test queries are difficult to run in a “lab-type” setting. A few random queries are close enough for horseshoes. Because of the cost and time required, benchmarking a system and then tuning it for optimal precision and recall is a step usually skipped.

Kudos to BA Insight for bringing up the subject of precision and recall. My view is that the present environment for enterprise search puts more emphasis on point and click interfaces and training wheels for users who lack the time, motivation, or expertise to formulate effective queries. Even worse, the content processed by the index is usually an unexplored continent. There are more questions like “Why can’t I find that PowerPoint?” than shouts of “Eureka!” Just my opinion.

Stephen E Arnold, March 15, 2011

Freebie
