IBM Watson Crushes Humans. Is Google Next?

February 16, 2011

The Web is abuzz with IBM Watson’s victory over humans on the TV game show Jeopardy. In my father’s independent living facility, a number of ageing humans expressed concern that their medical diagnoses would be handled by the human-crushing Watson.

Here in the airport, the response to Watson’s victory over humans seems more subdued. I asked one college student sitting next to me if he were worried about IBM’s supercomputer taking over the world. He said, “Eh, what, dude?”

National US media were excited about IBM’s triumph on a TV game show. In “Beyond ‘Jeopardy’: Watson Wins,” MSNBC said:

Watson was built to serve up quiz-show knowledge, but those question-answering capabilities would probably be most valuable in specialized fields such as medicine and law. Watson’s kin could help us puny humans sift through millions of possibilities and come up with the five or six best medical diagnoses, or legal precedents, or chemical configurations, or … well, you name it.

Okay, I will name it: PR.

In a clever attempt to regain the technology champion award, IBM—a $100 billion enterprise—showed its information technology on a TV game show. How appropriate for a company that created STAIRS III, Juru, Web Fountain, and dozens of other search and content processing systems. Most recently, IBM was groping for a solution to clustering in its ageing document management system. Its “flagship” enterprise search system is essentially the open source Lucene system.

My question is, “If IBM’s information retrieval technology is so darned good, what’s up with the clustering issue in its records management system?” And “Why is the IBM enterprise search solution based on Lucene?” And, “What has happened to STAIRS, Juru, et al?”

My view is that IBM does not seem to have much traction in the commercial, enterprise search space with its own technology. The IBM demo approach is marketing, and I think it is great public relations.

But where it counts, IBM is far behind Autonomy, Endeca, Exalead, and dozens of other vendors with enterprise solutions that work and are affordable.

What about Google? Does Watson frighten Messrs. Brin and Page? Are the wizards at Microsoft Oslo shaking in their ski boots?

I don’t think so.

In my opinion, IBM is far behind in search and content processing technology. IBM resells other vendors’ information technology, acting, it seems to me, more like a consultant than an innovator in information retrieval. After buying Cognos and SPSS to bolster the firm’s data mining business, IBM is going to have to do more than beat meat on feet on a TV game show. IBM now has to win head-to-head procurements for search in the enterprise. Do you think that might be more difficult than winning a TV game show that is popular among those with whom my father hangs out, napping during commercials and shouting questions to the host’s answers?

I do.

Just my opinion. Honk.

Stephen E Arnold, February 16, 2011

Freebie, unlike the costs of producing a TV game show, editing the program, and pumping hyperbole into the info stream.

Protected: BA-Insight Longitude

February 15, 2011


OpenText Opens Advanced Content Analytics Market

February 14, 2011

Following in the footsteps of other vendors, OpenText has opened an advanced content analytics market.

“OpenText Licensing Agreement Brings Advanced Content Analytics to Market” reveals a tie-up between OpenText and the National Research Council (Canada). The idea is that new Content Analytics innovations will be added to the ECM Suite and made available by spring 2011. The content analytics added to the ECM Suite will improve data mining and analysis. The key point is:

“Content analytics is the key to extracting business value from social media and text-rich online and enterprise information sources, an essential technology for marketing, online commerce, customer service, and improved search and Web experience. Given the mind-boggling growth in information volumes, no wonder uptake is booming, powered by rapid technical advances from leading-edge vendors such as OpenText.”

Content Analytics will perform data mining that uncovers and shows relationships between businesses and other facts. It will be able to find information that a normal search engine wouldn’t find. This agreement marks the start of OpenText’s effort to apply Content Analytics to all of its ECM Suite products.

Whitney Grace, February 14, 2011

Freebie

Greenplum, Big Data, and an Open Source Card

February 13, 2011

Wichita Business Journal has a write-up called “EMC Greenplum Introduces Free Community Edition of ‘Big Data’ Tools for Developers and Data Scientists.” EMC Corporation is one of the world’s leaders in information infrastructure solutions and it is releasing a free edition of the EMC Greenplum Database. This database product offers massively parallel processing, analytic algorithms, and data mining tools. This was announced at the 2011 O’Reilly Strata Conference. The key point was:

“Building on earlier Greenplum “Big Data” breakthroughs, like the EMC Greenplum Data Computing Appliance, the new EMC Greenplum Community Edition removes the cost barrier to entry for big data power tools empowering large numbers of developers, data scientists, and other data professionals. This free set of tools enables the community to not only better understand their data, gain deeper insights and better visualize insights, but to also contribute and participate in the development of next-generation tools and solutions.”

The Community Edition is geared toward first-time users and current Greenplum customers. First-time users will benefit from the business analytics environment and from experimenting with its tools. Current customers can easily update their older versions. Our opinion: EMC and “free.” We are curious about commercial companies jumping on the “community” bandwagon.

Whitney Grace, February 12, 2011

Protected: FAST Read: Ontolica

February 11, 2011


Protected: The Defunct User Revolution

February 10, 2011


Nuxeo Embraces PostgreSQL

February 10, 2011

Nuxeo, a software manufacturer specializing in Enterprise Content Management, recently expressed criticism for one particular open source object-relational database system and in the process praised another.

Per the aptly titled post “Why Avoid MySQL?”, the case is made against MySQL in a joint effort by Nuxeo’s Founder and head of R&D.  This is how the post opens: “Nuxeo can work with many databases (PostgreSQL, MySQL, Oracle, Microsoft SQL Server, and others could be added). But MySQL should be avoided if at all possible, because it has major deficiencies that Nuxeo cannot really work around.”  What follows is a bullet list of fourteen points reinforcing that claim.  The list cites dropped connections, row size limits as well as poor full text configuration, among other issues.  They end saying, “All these problems lead us to recommend not using MySQL in production, and using PostgreSQL instead which is a much nicer database engine.”

A quick jump to the PostgreSQL site will provide ample information on the product for any reader to contrast. A few differences I found include MySQL’s row size, which peaks at 64 KB, whereas PostgreSQL’s extends to 1.6 TB. MySQL’s triggers are not activated by cascaded foreign key actions. PostgreSQL’s trigger functions, on the other hand, can be written in C and loaded as a library, which provides flexibility in extending capabilities. The information page also includes links to testimonials and a list of awards spanning over a decade.

Our take: Is Oracle’s approach to MySQL giving some folks an added incentive to look at PostgreSQL?

Stephen E Arnold, February 10, 2011

Freebie

Brainware and Its Work Flow Repositioning

February 3, 2011

I remember a briefing a couple of years back in which Brainware emphasized its search and retrieval system based on trigrams. The language of the content and source document was mostly irrelevant to this method, which broke an object into three-letter strings, or trigrams. The demonstration I recall was the use of the trigram method on patent claims. I was impressed, loaded the system on our test machine, and ran queries against my Google patent corpus. Pretty darned good, I thought.
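For readers who have not bumped into trigrams, here is a minimal sketch of the general idea in Python. It is not Brainware’s implementation, which I have not seen. The function names (`trigrams`, `similarity`, `search`) and the choice of the Dice coefficient for scoring are my assumptions, picked only to illustrate why three-letter strings do not care much about language or spelling.

```python
from collections import Counter

def trigrams(text: str) -> Counter:
    """Count the overlapping three-character strings in text."""
    # Lowercase and collapse whitespace so formatting differences do not matter.
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))

def similarity(a: str, b: str) -> float:
    """Dice coefficient over trigram counts: 0.0 (nothing shared) to 1.0 (identical)."""
    ta, tb = trigrams(a), trigrams(b)
    total = sum(ta.values()) + sum(tb.values())
    if total == 0:
        return 0.0
    shared = sum((ta & tb).values())  # multiset intersection
    return 2 * shared / total

def search(query: str, documents: list) -> list:
    """Rank documents by trigram overlap with the query (assumed scoring, see above)."""
    scored = [(similarity(query, doc), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Toy corpus, invented for illustration only.
ranked = search("patent claim", ["a patent claim about widgets", "quarterly sales report"])
```

Because the method only compares character strings, a misspelled or inflected query still shares most of its trigrams with the target text, and no dictionary or stemmer is needed.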

Now I read “Ovum Publishes Technology Audit on Brainware’s Data Capture Platform.” Ovum is one of those mid-tier consulting firms (what I call an azure chip consultant in opposition to an outfit like Bain, Booz, or McKinsey). What does the mid tier firm state? Brainware is now in a data capture platform mode. The passage that caught my attention was:

“Ovum recognizes that enterprises work with critical data from a multitude of document types in order to keep their business processes intact,” said Mike Davis, senior analyst at Ovum, in a press release. “Thus there is an associated need to capture, search and retrieve data from the plethora of structured, semi-structured and unstructured documents received. Given the increasing volumes of these document types, and the content contained, intelligent tools such as Brainware’s Distiller, which provide integration with a wide range of enterprise applications, are essential to undertake the automatic processing of documents across the enterprise and generate the value from the information contained.”

“Intelligent data capture technology offers users unparalleled capability for boosting productivity across the enterprise,” said Charles Kaplan, vice president of marketing at Brainware, in a statement. “Brainware’s Distiller platform provides increased efficiency and visibility for the world’s largest enterprises, enabling them to expand their output considerably without adding headcount, or even shift their existing manual data entry staff to other value-added activities within the organization.”

Fair enough. First, the trigram technology is notable for its absence from this repositioning. Second, the focus on work flow is very clear, almost like a marketing presentation, and the use of the term “platform” is interesting. A number of search vendors are looking for a hook. Platform, it seems, is the worm of choice for 2011. The emphasis on paper and conversion reminded us of the presentations that Fujitsu and Kofax gave us a decade ago. Paper appears to be a problem even in 2011.

You can get more information directly from Brainware at www.brainware.com.

Stephen E Arnold, February 3, 2011

Freebie

Protected: Making Microsoft Duet into a Trio

February 1, 2011


Whither SAP and Project Argo?

January 31, 2011

On Friday, I had a chat with a flashy New York investment type. The topic was SAP. Apparently the Harvard MBA saw an old write up I did for the now disappeared Information World Review. (If anyone knows about this publication, let me know, please. My last column fell into a black hole. Sigh. Publishers.)

The flashy MBA had heard about Project Argo. The question: “What’s up?” My answer was: “I have no idea.” I poked around the Overflight junk bin and realized that after 2007, Project Argo seems to have dropped off the open source radar. The disappearance of a search engine or a publication is nothing new, of course. I found it interesting.


Where is Project Argo?

I wanted to capture the few items of information I had about SAP’s Project Argo, so if I get a similar question in three or four years, I won’t have to dig through so much digital detritus.

Some “facts” from open sources:

  1. Argo, a search system from the same outfit with TREX, became available for download in 2006. There was some chatter on message boards about the system’s requirements, but I found no information indicating that it swept the SAP world with excitement. The preview version disappeared in the middle of 2007.
  2. According to “The State of SAP xApps”, Project Argo was part of xApps. What are/were xApps? I think these were “composite” applications. I think this means “federating” methods so users could look one place for information.
  3. The focus, according to “SAP to Add Enterprise Search with Project Argo”, was software that “extends SAP enterprise search to connect to Web services. A generic Web service that invokes search services such as Google is included in the beta version, according to information obtained from the SAP Developer Network Web site. Argo gives end users one central entry point to search company information from various data sources. With a single search query, users can use desktop widgets, browsers, e-mail and mobile devices to tap into company data from multiple sources.”
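The “one central entry point” idea in the items above is what I would call federated search. Here is a rough sketch in Python of how such a thing can work in principle. It has nothing to do with Argo’s actual code or any SAP API. The names `federated_search`, `make_source`, and the “crm” and “email” sources are hypothetical stand-ins I made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical source type: a callable that takes a query and returns hits.
Source = Callable[[str], List[str]]

def federated_search(query: str, sources: Dict[str, Source]) -> List[dict]:
    """Send one query to every source and merge the hits into a single list."""
    results = []
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in sources.items()}
        for name, future in futures.items():
            try:
                hits = future.result(timeout=5)
            except Exception:
                hits = []  # a failing or slow source should not sink the whole query
            results.extend({"source": name, "hit": hit} for hit in hits)
    return results

def make_source(records: List[str]) -> Source:
    """Build an illustrative in-memory source; a real one would call a Web service."""
    return lambda query: [r for r in records if query.lower() in r.lower()]

# Invented data, not real connectors.
sources = {
    "crm": make_source(["Acme contract renewal", "Beta Corp contact list"]),
    "email": make_source(["Re: Acme invoice", "Lunch on Friday"]),
}
hits = federated_search("acme", sources)
```

Each hit carries the name of the source it came from, so a front end (widget, browser, mobile device) can show the user where a result lives. Relevance ranking across sources is the hard part and is left out of this sketch.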
