IBM’s ICAwES Red Book Available
April 17, 2014
An article on IBM.com titled Building Enterprise Search Solutions Using IBM Content Analytics with Enterprise Search rolls out information about ICAwES. That excellent acronym stands for IBM® Content Analytics with Enterprise Search, as you may have guessed. The product supports customized synonym dictionaries for search, annotators, and the integration of diverse kinds of repositories. The abstract explains,
“With ICAwES enterprise search solutions, you can integrate fields from multiple content repositories to create a single, integrated user search experience. In addition, the enterprise search solutions can use fields and facets in various ways to create diverse views of your search result set, thus helping you identify the hidden meaning of your unstructured content. This IBM Redbooks® Solution Guide explains, from a high level, how to build enterprise search solutions using ICAwES.”
A red book is available through IBM Redbooks. It offers information on using the “text classification capability”, the “LanguageWare Resource Workbench” and “IBM Content Assessment”. It is aimed at IT architects and business users interested in expanding their usage and improving customer satisfaction and business operations, all interesting information. The reference to the “billion dollar baby Watson” appears in the footer, but not in the explanation of the ICAwES.
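The customized synonym dictionaries mentioned above can be pictured as query expansion: each search term is widened with its listed synonyms before matching content drawn from several repositories. The snippet below is a minimal sketch of that idea only. It is not the ICAwES API, and every name in it (`expand_query`, `search`, the sample dictionary, and the repository-style document names) is a hypothetical assumption made for illustration.

```python
def expand_query(terms, synonyms):
    """Return the query terms plus any listed synonyms, in order, without duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term] + synonyms.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def search(documents, terms, synonyms):
    """Return the sorted names of documents containing at least one expanded term."""
    wanted = set(expand_query([t.lower() for t in terms], synonyms))
    return sorted(
        name for name, text in documents.items()
        if wanted & set(text.lower().split())
    )


# Hypothetical synonym dictionary and documents from three made-up repositories.
synonyms = {"laptop": ["notebook"], "invoice": ["bill"]}
documents = {
    "crm/ticket-1": "customer reports notebook battery fault",
    "erp/order-7": "bill sent for laptop order",
    "wiki/page-3": "holiday schedule",
}

print(search(documents, ["laptop"], synonyms))
```

With the dictionary in place, a query for “laptop” also finds the ticket that only says “notebook”. With an empty dictionary, the same query would miss it.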
Chelsea Kerwin, April 17, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Litigation Software dtSearch Demo
April 16, 2014
The dtSearch Desktop Demonstration Video on nlsblog.org shows how to set up and search with dtSearch for Windows. The 12-minute video begins with an introduction to dtSearch, which is able to “recognize text in over 200 common file types.” By indexing the locations of words in different files, dtSearch is able to build an almost limitless index of documents. The demo walks through the setup of dtSearch. After naming the index,
“It is important to keep in mind that when we add items here, dtSearch is not creating copies… but links to those files. A good practice is to put the files and folder that we want to run searches on into a single centralized location, before we create the index… all we need to do is add this discovery folder, and the subfolders and files will be automatically included…dtSearch reads the text in the linked files and creates a searchable words list.”
Then you are able to select which index to search, and limit the search to one case or all cases. Each word appears with a number showing how often it appears in the index. You can then add the keyword to the search request to find the documents in which the word appears. You are able to preview a document, copy a file, and create a search report. The demo goes into great detail about all of the search options, and should certainly be viewed in full to learn the best methods, but it does not provide metrics for the time required to build the initial index or update it. Those metrics would be useful.
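The approach the demo describes, recording where each word occurs in each file and then reporting how often a word appears, is commonly called a positional inverted index. The snippet below is a minimal sketch of that general technique, not dtSearch’s actual implementation or file format. The function names and the sample “discovery folder” contents are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to {document name: [positions of the word in that document]}."""
    index = defaultdict(lambda: defaultdict(list))
    for name, text in documents.items():
        words = re.findall(r"[a-z0-9']+", text.lower())
        for position, word in enumerate(words):
            index[word][name].append(position)
    return index


def word_count(index, word):
    """Total number of times the word occurs across every indexed document."""
    return sum(len(positions) for positions in index.get(word.lower(), {}).values())


def search(index, word):
    """Sorted names of the documents in which the word appears."""
    return sorted(index.get(word.lower(), {}))


# Hypothetical stand-in for the files in a discovery folder.
documents = {
    "case1/deposition.txt": "The contract was signed. The contract was later amended.",
    "case1/email.txt": "Please review the amended contract before Friday.",
    "case2/memo.txt": "No agreement was reached.",
}
index = build_index(documents)

print(word_count(index, "contract"), search(index, "contract"))
```

The index stores only names and positions, not copies of the text, which mirrors the demo’s point that the indexed items are links to the original files.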
Chelsea Kerwin, April 16, 2014
ArnoldIT Video: Search Brands Video
April 15, 2014
Whatever happened to Convera and the other four companies comprising the Top Five in enterprise search: Autonomy, Endeca, Fast Search & Transfer, and Verity? The video also mentions Exalead and ISYS Search Software. The wrap-up to the video points to three open source enterprise search options. For those who want to be reminded of the Golden Age of enterprise search, check out the free, six-minute video from Stephen E Arnold, publisher of Beyond Search. Mr. Arnold is converting some of his research into brief, hopefully entertaining and useful free videos. You can access this short search history lesson at http://bit.ly/1etGExr. The next video in the series tackles the subject of buzzwords, argot, jargon, lingo, and verbal baloney. What vendor is the leader in the linguistic linguini competition? The video will be available before the end of April. In the meantime, take a walk down memory lane and learn how Cornelius Vanderbilt obtained needed information in the early 19th century.
Kenneth Toth, April 15, 2014
How-To Guide for Amazon Search
April 15, 2014
The article on Search Engine Journal titled The Power of Amazon Search lays out the five main components of Amazon search for Amazon authors. The first is content, but the other four are more strategic. SEO experts are exceptional information managers, and this article is built around the components of sales, keywords, category, and reviews. It compares Google search to Amazon search when it comes to keywords, and arrives at the following conclusion:
“The difference between doing a search on Google vs. Amazon is that with Amazon you do not want to rely on long tail keywords. Instead, you want to find the exact words people use when searching for a book. Aim for shorter phrases that reflect traditional book browsing. Think “Indian Cookbook” versus “Cookbook of traditional Indian dishes”. For example… when you type in the word Entrepreneur in Amazon there are 22,145 results? Comparatively, when you type in entrepreneurship there are 36,899 results.”
The category component builds on the keyword idea. Instead of opting for the broadest category, the article suggests narrowing your focus, and in turn your competition within a category. Similarly, the reviews component includes the advice to target the top reviewers, and aim for quality over quantity. It also links to Amazon’s Review Hall of Fame as a starting place.
Chelsea Kerwin, April 15, 2014
Hawk Search Runs the Bases
April 13, 2014
We noted that HawkSearch, an open source search solution from Thanx Media, has been licensed by Wrigleyville Sports. If you are not a baseball fan, you may not recognize the mythic stature of the Chicago Cubs’ baseball team and its relationship with all things Wrigley. According to the firm’s news release (written without too much of that old-style Endeca rhetoric), we learned:
Wrigleyville chose Hawk Search for the following features:
- User-friendly merchandising tools – a merchandising workbench that enables business users to generate dynamic landing pages, implement ranking and relevancy rules, and promote products.
- Easy to understand analytics – customizable dashboards can be used by merchandisers, buyers, and marketing managers to monitor activity on products and adjust strategies.
- Robust out-of-the-box functionality – Hawk Search includes faceted navigation, smart AutoComplete, dynamic landing pages, merchandising tools, and more without customization.
Thanx is an authorized Endeca integrator with some good, old-fashioned Endeca professionals on the Illinois company’s team.
More information is available at www.thankx.com. Play ball or navigate to the Cubbies’ store and buy a Wrigley Field cap, no chewing gum included. If you want the computationally challenging Endeca on demand system, Thanx can fix you up. Information about this late 1998 system is at http://www.thanxmedia.com/what-we-do/site-search/endeca/endeca-on-demand/.
Stephen E Arnold, April 13, 2014
Search and Big Data: Been There, Done That
April 12, 2014
Is the use of search to find information in large collections of content revolutionary? Er, no. What about using search to locate an Internet Protocol address in a repository of monitored email traffic? Er, no.
With the chatter on LinkedIn and the vacuous news releases from some floundering search companies, one would think that gathering up content and running a query was the equivalent of my ancestor stealing an ember and saying, “Look, I invented fire.”
Sorry.
Beyond the rather influential if specious IBM white paper published in 2010 (link is at http://bit.ly/1gckiPJ), a large number of companies continue to position something old as new again.
One interesting twist on the “search is better than SQL” theme is the useful solution brief from RainStor. In some circles, RainStor has a low profile. In others, the company has caught the attention of some recognized “names” in the Big Data world; for example, Cloudera and Dell. So think Hadoop friendly.
RainStor focuses on cost effective solutions for gathering, archiving, and querying content. Like the old CrossZ technology, RainStor queries the compressed files. There are benefits from this approach. Unlike CrossZ, no proprietary routines have to be run to extract a data cube. The person looking for information can use standard query syntax using SQL, MapReduce, or off the shelf business intelligence tools.
If you are confused by peas-in-a-pod desperate for a cannery with cash, you will want to check out RainStor. The company’s Web site is www.rainstor.com. I would have liked RainStor to publish the numbers of their patents that were granted by the USPTO in 2013. The general description here reminded me of several other firms’ systems and methods.
Stephen E Arnold, April 12, 2014
Perceptive Search 10.3 Now Available
April 11, 2014
According to the marketing, the system from the 1980s formerly known as ISYS Search is now up to date. Digital Journal shares, “Perceptive Software Launches Version 10.3 for Perceptive Enterprise and Workgroup Search.” New connector options and high-definition viewing are among the updated features for both the Enterprise and Workgroup platforms. The press release also tells us:
“Fidelity options for content rendering in Perceptive Search 10.3 allow administrators to set the appropriate level of fidelity for displaying search results. Options include several levels of standard text, standard XHTML and high-definition HTML5 that produce near-perfect paginated renditions.
“The addition of a document thumbnail preview provides Perceptive Enterprise Search 10.3 users additional confidence that they are selecting the right search results. With a glance at the first page of search results, users can often determine if the files meet the desired criteria. This instant visual confirmation of the search results further accelerates user productivity.”
That thumbnail view is a helpful touch. The search systems’ updated connectors can access content in Google Drive, Microsoft SharePoint 2013, Microsoft Exchange 2013, and Symantec Enterprise Vault 10.
Founded as Genesis Software in 1988, Perceptive Software offers a range of process- and content-management solutions. Perceptive serves clients in a wide range of industries, and was acquired by Lexmark in 2010. The company is headquartered in Shawnee, Kansas and, according to their About page, is currently hiring.
Cynthia Murrell, April 11, 2014
Hewlett Packard: Foreign Bribes and Search
April 10, 2014
I was disappointed with the news stories about Hewlett Packard’s recent hitch in its git-along. For example, I read “Hewlett Packard Agrees to $108 Million Fine for Foreign Bribes” and saw not one reference to information retrieval, search, and content processing technology. In my view, had HP used the Autonomy technology to process its internal information, IDOL and the Dynamic Reasoning Engine would have generated some outputs that pointed to anomalies like those the investigators found.
Apparently “findability” is more difficult than it appears even when the company in the spotlight owns one of the go-to search systems. I assumed that it would be trivial to run a few queries and produce documents and “big data” that would show Hewlett Packard what was cooking in its subsidiaries or with non-US deals.
Search apparently was not up to the task because allegations had to be “resolved by third parties.” Apparently it required attorneys and government folks to figure out that HP was taking some short cuts. Here’s a passage I noted:
“Hewlett-Packard subsidiaries created a slush fund for bribe payments, set up an intricate web of shell companies and bank accounts to launder money, employed two sets of books to track bribe recipients, and used anonymous email accounts and prepaid mobile telephones to arrange covert meetings to hand over bags of cash,” said Deputy Assistant Attorney General Bruce Swartz in the Justice Department statement.
Business actions like those mentioned in the Silicon Beat write up make it clear that HP management may not know what is going on or may not be paying attention to existing information about company activities.
Is this an anomaly?
I can’t answer the question, but when investigators from various countries are able to find useful factoids, it raises two questions:
What does HP’s much hyped information retrieval system do for company executives?
and
Was important management information not available to HP’s senior executives? If so, who filtered the digital content?
This $100 million fine comes on the heels of HP’s paying $57 million to settle a shareholder lawsuit accusing the “personal computer maker’s former management of defrauding shareholders by abandoning a business model it had long touted.” See http://reut.rs/1iUC0re
The persistent HP business model seems to be one that does not engender my confidence in the company.
I am not sure the IDOL search system is at fault. Does HP use Autonomy’s fraud detection components? Why not index content, run queries, and make decisions based on the heterogeneous types of information that Autonomy can process, usually with some effectiveness?
The jury’s still out on search at HP. Two big fines in a short period of time is unsettling to me because both are germane to the effective use of information retrieval technology.
Stephen E Arnold, April 10, 2014
Elasticsearch Appeals to Its Core Audience with New Move
April 9, 2014
The open source search wunderkind, Elasticsearch (www.elasticsearch.org), is in the news again. In the crowdsourcing spirit that has helped propel it to the top of many lists, it is sharing more insider information, as we learned in the post “Elasticsearch: The Definitive Guide” (http://www.elasticsearch.org/blog/elasticsearch-definitive-guide/). The book helps users of all stripes better understand the engine.
Our favorite part was how the guide is aimed at a particular audience:
“We expect you to have some programming background and, although not required, it would help to have used SQL and have some database experience. We explain concepts from first principles, helping novices to gain a sure footing in the complex world of search.”
There is no watering-down here to appeal to everyone. We like that attitude. The firm has never been one to pull punches regarding tough topics. Just recently they made more headlines by improving their ability to perform log analysis. http://betanews.com/2014/03/20/elasticsearch-makes-log-analysis-faster-and-simpler/ Most search engines would avoid this topic and leave it for programmers, but Elasticsearch understands its audience and gives them tricky tools to play with. We love the things they are doing and eagerly await their next move on the search engine chess board.
Patrick Roland, April 09, 2014
Apple Takes Action to Improve Map Search
April 8, 2014
I don’t have an iPhone. I do have an ageing Mac notebook. It is reasonably reliable, and I have learned to save my high value content to another storage device. When I need to locate a document, I use more robust and less flakey information retrieval tools to retrieve my information.
I have had to help a couple of people look for information using an iPhone. One notable example was locating Cuba Libre restaurant in Washington, DC. A colleague and I were standing in front of Cuba Libre and we wanted to figure out whether to turn left or right to reach a destination. No luck. The restaurant was not findable.
In my opinion, not only was the Apple map search system inadequate, the system did not acknowledge the fungible existence of the restaurant in which I had eaten a pretty good sandwich.
When I read “Amazon A9s VP of Search Heads to Apple to Fix Up Maps Search,” it dawned on me that Apple seems to have taken action to fix at least one of its search systems. The other thought I had is that Apple, like many other big names, is likely to take a look at open source search technology.
I wonder if Hewlett Packard, IBM, and Oracle will be able to convince Apple to go with Autonomy or Watson or Endeca technology. Landing Apple would be a plus for these three enterprise search vendors.
A question: What happens if Apple embraces a hot open source search solution from an outfit like Elasticsearch or Searchdaimon?
The one striction associated with this alleged personnel shift is that I don’t think that Amazon’s search systems are helpful to me when I run a query. I struggle to NOT out books that are not yet available, and I have a very tough time locating some of Amazon’s lists. But in today’s findability swamp, Apple has to begin its long journey with a single step.
Is it the right one?
Interesting to think about, I believe.
Stephen E Arnold, April 8, 2014