Milward from Linguamatics Wins 2010 Evvie Award
April 28, 2010
The Search Engine Meeting, held this year in Boston, is one of the few events that focuses on the substance of information retrieval, not the marketing hyperbole of the sector. Entering its second decade, the conference speakers tackle challenging subjects. This year speakers addressed such topics as “Universal Composable Indexing” by Chris Biow, Mark Logic Corporation, “Innovations in Social Search” by Jeff Fried, Microsoft, and “From Structured to Unstructured and Back Again: Database Offloading”, by Gregory Grefenstette, Exalead, and a dozen other important topics.
From left to right: Sue Feldman, Vice President, IDC, Dr. David Milward, Liz Diamond, Stephen E. Arnold, and Eric Rogge, Exalead.
Each year, the best paper is recognized with the Evvie Award. The “Evvie” was created in honor of Ev Brenner, one of the pioneers in machine-readable content. After a distinguished career at the American Petroleum Institute, Ev served on the planning committee for the Search Engine Meeting and contributed his insights to many search and content processing companies. One of the questions I asked after each presentation was, “What did Ev think?” I valued Ev Brenner’s viewpoint, as did many others in the field.
The winner of this year’s Evvie award is David R. Milward, Linguamatics, for his paper “From Document Search to Knowledge Discovery: Changing the Paradigm.” Dr. Milward said:
Business success is often dependent on making timely decisions based on the best information available. Typically, for text information, this has meant using document search. However, the process can be accelerated by using agile text mining to provide decision-makers directly with answers rather than sets of documents. This presentation will review the challenges faced in bringing together diverse and extensive information resources to answer business-critical R&D questions in the pharmaceutical domain. In particular, it will outline how an agile NLP-based approach for discovering facts and relationships from free text can be used to leverage scientific knowledge and move beyond search to automated profiling and hypothesis generation from millions of documents in real time.
Dr. Milward has 20 years’ experience of product development, consultancy and research in natural language processing. He is a co-founder of Linguamatics, and designed the I2E text mining system which uses a novel interactive approach to information extraction. He has been involved in applying text mining to applications in the life sciences for the last 10 years, initially as a Senior Computer Scientist at SRI International. David has a PhD from the University of Cambridge, and was a researcher and lecturer at the University of Edinburgh. He is widely published in the areas of information extraction, spoken dialogue, parsing, syntax and semantics.
Presenting this year’s award were Eric Rogge, Exalead, and Liz Diamond, niece of Ev Brenner. The award winner received a recognition award and a check for $500. A special thanks to Exalead for sponsoring this year’s Evvie.
The judges for the 2010 Evvie were Dr. David Evans (Evans Research), Sue Feldman (IDC), and Jill O’Neill, NFAIS.
Congratulations, Dr. Milward.
Stuart Schram IV, April 28, 2010
Sponsored post.
Columba Global and Its Search System
April 27, 2010
I received an email about a company called Columba Global Systems associated with “naked objects”. A news story with the title “US Analysts Single Out Irish Firm’s Software Package” became available on April 25, 2010. The document reports a new search system from CGS. I did a quick check of my Overflight repository and located some information. You can get a PowerPoint that provides an overview of the company’s approach at http://www.columba.com/downloads/Columba%20Brief%20Overview.ppt. The company uses components from several vendors to provide its customers with access to the processed content. It is not clear to me if the analyst’s report is an objective summary of the open source intelligence community’s software systems or a sponsored “white paper” / report. There are a number of firms offering “data fusion” products or “mash up systems”. These include Exalead, Fetch Technologies, and Kapow Tech, among others. These systems deliver the type of “one view” of content objects that has been associated with the type of information access in favor among intelligence and law enforcement professionals. In my experience, the key is optimizing the performance of the system. Purpose built systems such as Exalead’s often have an advantage in terms of performance and scalability over systems constructed on toolkits and systems acquired from different vendors. The positioning of the company in the Irish Times’s news story is one more indicator that traditional search is losing ground in some sectors.
Stephen E Arnold, April 27, 2010
Translating a Business Intelligence Warning
April 21, 2010
An azure chip consultant, according to the prolific Intelligent Enterprise publication, has issued a business intelligence warning. The write up “Gartner’s BI Summit Warning: Buyer Beware” is one you will want to read to make sure I am translating the message correctly. The idea is folded within a number of buzzwords and hot sounding terms; for example, megavendor, stack centricity, BI, and similar code words.
The idea is that an azure chip consulting firm has alerted its customers to avoid the big outfits in the business intelligence sector. I think of Business Objects (now part of the challenged SAP), IBM’s bevy of business intelligence companies, SAS, and, if I am broad minded, Oracle. The consulting firm suggests that big outfits sell companies too much so money is wasted. In short, buy what you need. Save money. Live long. Prosper.
Okay.
First, I think that the consulting firm may be revealing unintentionally that some big outfits are not ponying up significant consulting contracts. The consulting firm’s advice is preparing the ground for smaller vendors to buy some of the consulting firm’s expertise.
Second, I think that business intelligence, like military intelligence, is often oxymoronic. In my opinion, companies need timely, operational information; that is, facts directly related to making a sale, solving a problem, or figuring out whether to zig or zag. This more prosaic view of information is too much steak and not enough sizzle, so we get glittering generalities. The need for some words that make sales is increasing. Rhetoric is in. Basics are out perhaps?
Third, presenting a fuzzy concept like business intelligence as a cure all is a very popular and facile marketing method. I know a West Coast consultant who overpromises and tries to over deliver. The clients are usually disappointed in my experience. The client remembers the hyperbole and forgets the difficulty of providing solid information in a fluid, unpredictable business environment.
My hunch is that the general advice of “buyer beware” is like one of those Chinese proverbs. Those proverbs sound so darned meaningful. How many situations exist where the buyer does not know enough to be aware? How many buying situations are rubber stamp deals where the old vendor gets the new job auto-magically?
In short, cautions in today’s financial climate are, in my opinion, not really needed. Examples range from Enron to Lehman Bros. My take is that silver bullets, whether shot from azure chip consultants’ laptops or vendors’ PowerPoints, have one goal: generate cash.
Caveat emptor! Absolutely. And the advice applies to consultants, vendors, and information disseminators as well.
Stephen E Arnold, April 21, 2010
Unsponsored post
A Surprising Spurt in Self Publishing
April 19, 2010
Short honk: I read “Self-Published Titles Topped 764,000 in 2009 as Traditional Output Dipped” and was surprised by this factoid:
A staggering 764,448 titles were produced in 2009 by self-publishers and micro-niche publishers, according to statistics released this morning by R.R. Bowker. The number of “nontraditional” titles dwarfed that of traditional books whose output slipped to 288,355 last year from 289,729 in 2008. Taken together, total book output rose 87% last year, to over 1 million books.
Quite a treasure trove of uncurated content. If I were younger, there might be some useful information tucked in these publications.
Stephen E Arnold, April 19, 2010
A freebie.
Operational Intelligence, the New Enterprise Search
April 14, 2010
Worlds are colliding. Business intelligence, search, analytics, and business process are hurtling toward one another. No collider is needed. The impetus comes from managers who are struggling to keep their firms above water. Make no mistake about it. The economic climate may be improving based on government data and the self serving reports from global financial powerhouses. But just look at the number of empty buildings, the fraying infrastructure, and the desperation in the eyes of most employees in North America.
For those lucky enough to be thriving in a world gone mad for sending ads to individuals, life may be good. For people who are in more traditional jobs, the notion of finding information is an everyday struggle. Without the right information at the moment it is needed, organizations can make costly mistakes. These are not errors of judgment like magazine publishers who see the iPad as the font of new revenue or the dew eyed MBA looking for a job with a third string consulting firm. Nope. These visages reflect the person who cannot explain to a customer why an order was lost or an automobile was delivered with a faulty electronic gizmo. In fact, I see the effects of downsizing, the need to squeeze extra money from every transaction, and crazy decisions made by committees everywhere I look, regardless of the country.
What’s the answer? According to a sponsored white paper from the consulting outfit IDC, Teradata has the fix. Now you may not think that even bigger piles of data will help your business. I admit that I don’t believe the premise either. You can get the story in “Real-Time Operational Intelligence Gains Momentum in Europe: Teradata-sponsored business survey shows adoption details for ‘Active Data Warehousing’” and make up your own mind. Big data means big costs in my experience.
What I liked about this write up was the phrase “real time operational intelligence”. True, the acronym RTOI is a bit clumsy, but I think the phrase points to an important shift in search and content processing. RTOI delivers what many of the people with whom I speak perceive enterprise search delivering. The idea is that the information in an organization is available when needed to help people answer questions and make decisions. Hopefully the decision makers did well in school and have a modicum of common sense.
After thinking about this phrase and the acronym RTOI, I had several thoughts:
- Vendors of enterprise search may want to make this phrase their own. It is a heck of a lot more compelling than “putting information at your fingertips” or “dashboard”.
- Search, in this phrase’s embrace, becomes an enabler. Search becomes like butter in a recipe. Without the ingredient the dish does not work. Many vendors of search see themselves as the fish, vegetables, and spices in the meal. RTOI makes search an essential but supporting ingredient.
- The conceptual outcome of RTOI may be consolidation of what now are marketed as separate systems. For RTOI to work, an organization needs an integrated approach. Data are not enough. The various features and functions of analytics, retrieval, report generation, and business processes must be woven together into one coherent, affordable system.
Is RTOI the future? I am willing to float a tentative, “Yes.” Fragmented information centric systems are now a cost and resource challenge for many organizations. The time is ripe for a new approach. Maybe it will be fueled by open source software like Lucene? Maybe it will be the use of a system like Google’s? Maybe it will be a roll up following the trajectory of Autonomy or OpenText?
The status quo is not delivering and change may be coming. Teradata may not be the winner, but it has contributed a useful catch phrase in my opinion. The phrase “enterprise search” could be put to rest which would be a step forward in my opinion.
Stephen E Arnold, April 14, 2010
Unsponsored post.
AIIM Report on Content Analytics
March 30, 2010
A happy quack to the reader who sent me a link available from the Allyis Web site for the report “Content Analytics – Research Tools for Unstructured Content and Rich Media”. If you are trying to figure out what about 600 AIIM members think about the changing nature of information analysis, you will find this report useful. I flipped through the 20 pages of data from what strikes me as a somewhat biased sample of enterprise professionals. Your mileage may vary, of course. One quick example. In Figure 4: How would you rate your ability to research across the following content types on page 7, the respondents rate themselves as pretty good at searching customer support logs. The respondents are also confident of their ability to search “case files” and “litigation and legal reports.” My research suggests that these three areas are real problems in most organizations. I am not sure how this sample interprets their organizations’ capabilities, but I think something is wacky. How can, for example, a general business employee assess the ease with which litigation content can be researched? Lawyers are the folks who have the expertise. At any rate, another flashing yellow light is the indication that the respondents have a tough time searching for press articles and news along with collateral, brochures, and publications. This is pretty common content, and an outfit that can search “case files” should be able to locate a brochure. Well, maybe not?
There were three findings that I found interesting, but I am not ready to bet my bread crust on the solidity of the data.
First, Figure 14: What are your spending plans for the following areas in the next 12 months?. The top dog is enterprise search – application. This should give some search vendors the idea to market to the AIIM membership.
Second, respondents, according to the Key Findings, can find information on the Web more easily than they can find information within their organization. This matches what Martin White and I reported in our 2009 study Successful Enterprise Search Management. It is clear that this finding underscores the wackiness in Figure 4, page 7.
Finally, the Conclusion, page 15 states:
The benefits of investment in Finance and ERP systems have only come to the fore with the increasing power of Business Intelligence (BI) reporting tools and the insight they provide for business managers. In the same way, the benefits of Content Management systems can be much more heavily leveraged by the use of Content Analytics tools.
I don’t really understand this paragraph. Finance has been stretched with the present economic climate. ERP is a clunker. Content management systems are often quite problematic. So what’s the analysis? How about cost overruns?
I tucked the study into my reference file. You may want to do the same. If the Allyis link goes dead, you can get the report directly from AIIM but you may have to join the association.
Stephen E Arnold, March 31, 2010
Like the report, a freebie.
IBM and Its Do Everything Strategy
March 24, 2010
I read an unusual interview with Steve Mills. The story was “Q&A: IBM’s Steve Mills on Strategy, Oracle, and SAP.” What jumped out at me was that there was no reference to Google that I noticed. Odd. Google seems to be ramping up in the enterprise sector and poised to compete with just about everyone in the enterprise software and services market. When I noticed this, I decided to work through the interview to see what the rationale was for describing companies that are struggling with many “push back” issues from customers, resellers, and partners. The hassles Oracle is now enduring with regard to open source and the SAP service pricing fluctuations are examples of companies struggling to deal with changing market needs.
Please, read the original interview because I am comfortable highlighting only three comments in a blog post.
First, Mr. Mills said:
Our technology delivers important elements of the solution, but there are often third-party application companies that add to that solution. No one vendor delivers everything required. The average large business, if you went into their compute centers around the world, runs 50,000 to 60,000 programs that are part of 2,000 to 4,000 unique applications.
Yes, and it is the cost and complexity of the IT infrastructure in those companies today that are creating pressures on the CFO, the users, and stakeholders. IBM’s engineers helped create the present situation and the company is now in a position where those customers are likely to look for lower cost, different types of options. If I have a broken auto, would I go to the mechanic who failed to make the repair on an earlier visit? I seek a new mechanic, but perhaps IBM’s cash rich customers don’t think the way I do.
Second, Mr. Mills offered this “fact”:
But in the enterprise, for every dollar invested in ERP, there will be five dollars of investment made around that ERP package to get it fully implemented, integrated, scaled and running effectively.
My view is that the time value of the dinosaur like applications is likely to be put under increasing pressure by new hires. The younger engineers are more comfortable with certain approaches to computing. Over time, the IBM “factoid” will be converted into a question like, “If we shift to Google Apps, perhaps we could save some money?” The answer would require verification, but if the savings are accurate, the implications for Oracle and SAP are significant. I think IBM will either have to buy its way into the cloud and “try to make up the revenue delta” on volume or find itself in the same boat as other “old style” enterprise software vendors.
Third, Mr. Mills stated:
It’s money. That’s the No. 1 motivator. And money is not a single-dimensional factor because there’s short-term money, long-term money and money described in broader value terms versus the cost of a product. The surrounding costs are far in excess of products. Every month, customers convert from Oracle to DB2. Why do they do that? Well, Oracle is expensive. Oracle tries to use pricing power to capture a customer and then get the customer to keep on paying. Oracle raises its prices constantly. Oracle does not provide a strong support infrastructure. There are many customers who have decided to move away from Oracle across a variety of products because of those characteristics.
I agree. The implication is that IBM is a low cost option. Well, maybe in some other dimension which the addled goose cannot perceive. My view is that time, value, and cost will conspire to create a gravity well into which the IBM-like companies will be sucked. IBM’s dalliance with open source, its adherence to its services model, and its reliance on acquisitions to generate revenue may lose traction in the future.
And finding stuff in IBM systems? Not mentioned. Also, interesting.
I don’t know when, but IBM’s $100 billion in revenue needs some oxygen going forward. The race is not a marathon. It’s more like a 200 or 440. Maybe Google will be in the race? Should be interesting.
Stephen E Arnold, March 24, 2010
No pay for this write up. I will report this to the GSA, which has tapped IBM to build its next generation computing infrastructure. I think IBM will be compensated for this necessary work.
ArnoldIT Expands Overflight
March 22, 2010
If you want one-click access to what’s new from leading vendors of search and content processing, navigate to ArnoldIT’s free Overflight service. Pick a company name, select a Google topic area, or run a query on Google’s own 70 plus Web logs. We have added three vendors to the watch service:
- Comperio, one of the Microsoft Fast support entities which has former FAST Search engineers on staff.
- Exorbyte, a vendor with a system that matches other eCommerce and databased content systems feature for feature.
- Funnelback, the Australian open source search system offered by Squiz, an open source content management company.
You will also find a list of three social network service providers: Facebook, Twitter, and LinkedIn. What’s interesting is to click through each of the autogenerated pages for the search and content processing vendors. You may be able to tell who is marketing with some savvy and who is clueless.
Stephen E Arnold, March 22, 2010
A shameless promotion of an ArnoldIT.com service. You now are reminded that Beyond Search is a marketing blog devoted to ArnoldIT.com and Stephen E Arnold.
Google Bombshell: Links to Intelligence Services Alleged
March 22, 2010
I was plonking along looking at ho hum headlines when I spotted “Chinese Media Hits Out at Google, Alleges Intelligence Links”. The addled goose does not know anything about this source nor about the subject of the article. But the addled goose is savvy enough to know that if this story is true, it is pretty darned important. The main point of the story in Economic Times / India Times is:
Xinhua said in an editorial: “Some Chinese Internet users who prefer to use Google still don’t realize perhaps that due to the links between Google and the American intelligence services, search histories on Google will be kept and used by the American intelligence agencies.”
Okay, that’s interesting. Several years ago, I heard a talk by a citizen in Washington, DC who made a similar comment. My recollection is that Google was pretty darned mad. I wondered if the citizen in Washington, DC was right or wrong. If another source comes up with more detail, the story becomes much more interesting.
Chinese intelligence agents are pretty savvy. And the Ministry of State Security is one of the best. I can’t remember whether Section 6 is the go-to bunch, but perhaps more information will surface.
Stephen E Arnold, March 22, 2010
A freebie. I will report non payment to DC Chief of Police who is really clued into Google’s activities in Washington.
Coveo and GEICO Host Webinar on March 23, 2010
March 21, 2010
Fierce Media has asked Beyond Search to facilitate a discussion about “how GEICO thinks about leveraging its data-rich enterprise systems to generate real-time business value and intelligence.” The participants are GEICO and Coveo as well as Stephen E Arnold.
Topics include how the Coveo system can:
- Enable improved business intelligence and decision making through dynamic dashboards and information mashups that provide actionable business information
- Access structured and unstructured data from across enterprise systems and repositories without complex integration or data migration, improving efficiency and cost effectiveness through a unified indexing layer
- Lower the cost of legacy system integrations and upgrades, and reduce time-consuming data migration
- Optimize social networks and incorporate the value of collaboration and just-in-time information exchange into the knowledge ecosystem
The audio program will be on Tuesday, March 23, 2010 beginning at 11:00am Eastern/8:00am Pacific. More information about Coveo may be found at http://www.coveo.com. You can register here.
Ben Kent, March 21, 2010, Beyond Search
This is a sponsored post.