2012: Enterprise Search Yields to Metadata?
October 30, 2011
Oh, my. The search dragon has been killed by metadata.
You might find yourself on an elevator ready to get off on a specific floor. The rest of your trip will start from that point and that point only. The same is true for learning, conversing, actually just about anything. We all have a particular place we want to enter the conversation. MSDN’s Microsoft Enterprise Content Management (ECM) Team Blog’s recent posting on “Taxonomy: Starting from Scratch” was a breath of fresh air in the way it addressed anyone–no matter what floor they needed.
For the novices to Managed Metadata Service, a service providing tools to foster a rich corporate taxonomy, the article recommends a starting point: “Introducing Enterprise Metadata Management.”
According to the article, the more seasoned users are reminded to point their browsers toward import capabilities. Of course, more specific needs, and links to go with them, are addressed too.
The article recommends the following for the clients who need a comprehensive understanding of both common and specific corporate terms. The author Ryan Duguid states:
“The General Business Taxonomy consists of around 500 terms describing common functional areas that exist in most businesses. The General Business Taxonomy can be imported in to the SharePoint 2010 term store within minutes and provides a great starting point for customers looking to build a corporate vocabulary and take advantage of the Managed Metadata Service.”
Overall, this article is worth keeping tucked away for a day when you might need information on WAND, SharePoint, or metadata and taxonomy in general because of its directness and the accessible next steps its variety of links offers.
Megan Feil, October 30, 2011
Sponsored by Pandia.com
Software and Smart Content
October 30, 2011
I was moving data from Point A to Point B yesterday, filtering junk that has marginal value. I scanned a news story from a Web site which covers information technology with a Canadian perspective. The story was “IBM, Yahoo turn to Montreal’s NStein to Test Search Tool.” In 2006, IBM was a pace-setter in search development cost control. The company was relying on the open source community’s Lucene technology, not the wild and crazy innovations from Almaden and other IBM research facilities. Web Fountain and jazzy XML methods were promising ways to make dumb content smart, but IBM needed a way to deliver the bread-and-butter findability at a sustainable, acceptable cost. The result was OmniFind. I had made a note to myself that we tested the Yahoo OmniFind edition when it became available and noted:
Installation was fine on the IBM server. Indexing seemed sluggish. Basic search functions generated a laundry list of documents. Ho hum.
Maybe this comment was unfair, but five years ago, there were arguably better search and retrieval systems. I was in the midst of the third edition of the Enterprise Search Report, long since bastardized by the azure chip crowd and the “real” experts. But we had a test corpus, lots of hardware, and an interest in seeing for ourselves how tough it was to get an enterprise search system up and running. Our impression was that most people would slam in the system, skip the fancy stuff, and move on to more interesting things such as playing Foosball.

Thanks to Adobe for making software that creates a need for Photoshop training. Source: http://www.practical-photoshop.com/PS2/pages/assign.html
Smart, Intelligent… Information?
In this blast from the past article, NStein’s product in 2006 was “an intelligent content management product used by media companies such as Time Magazine and the BBC, and a text mining tool called NServer.” The idea was to use search plus a value adding system to improve the enterprise user’s search experience.
The use of the word “intelligent” to describe a content processing system reaches back through the decades to computer aided logistics and forward to the Extensible Markup Language methods.
The idea of “intelligent” is a pregnant one, with a gestation period measured in decades.
Flash forward to the present. IBM markets OmniFind and a range of products which provide basic search as a utility function. NStein is a unit of OpenText, and it has been absorbed into a conglomerate with a number of search systems. The investment needed to update, enhance, and extend BASIS, BRS Search, NStein, and the other systems OpenText “sells” is a big number. “Intelligent content” has not been an OpenText buzzword for a couple of years.
The torch has been passed to conference organizers and a company called Thoora, which “combines aggregation, curation, and search for personalized news streams.” You can get some basic information in the TechCrunch article “Thoora Releases Intelligent Content Discovery Engine to the Public.”
In two separate teleconference calls last week (October 24 to 28, 2011), “intelligent content” came up. In one call, the firm was explaining that traditional indexing systems missed important nuances. By processing a wide range of content and querying a proprietary index of the content, the information derived from the content would be more findable. When a document was accessed, the content was “intelligent”; that is, the document contained value added indexing.
The second call focused on the importance of analytics. The content processing system would ingest a wide range of unstructured data, identify items of interest such as the name of a company, and use advanced analytics to make relationships and other important facets of the content visible. The documents were decomposed into components, and each of the components was “smart”. Again the idea is that the fact or component of information was related to the original document and to the processed corpus of information.
No problem.
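For readers who want to see what the vendors on those two calls were describing, here is a minimal sketch in Python. It is my illustration, not either firm's method: the company watch list, the `Component` record, and the `decompose` and `build_index` helpers are all hypothetical, and a real system would use a trained entity extractor rather than a lookup list. The point is only that each “smart” component keeps a pointer back to its source document and can be found through the processed corpus.

```python
import re
from dataclasses import dataclass

# Hypothetical watch list (an assumption for illustration only).
KNOWN_COMPANIES = ["IBM", "Yahoo", "OpenText", "NStein"]


@dataclass(frozen=True)
class Component:
    doc_id: str       # the original document this component came from
    sentence: str     # the decomposed unit of content
    companies: tuple  # company names spotted in the sentence


def decompose(doc_id, text):
    """Split a document into sentences and tag known company names in each."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    components = []
    for sentence in sentences:
        found = tuple(
            name for name in KNOWN_COMPANIES
            if re.search(r"\b" + re.escape(name) + r"\b", sentence)
        )
        components.append(Component(doc_id, sentence, found))
    return components


def build_index(components):
    """Map each company name to the components that mention it."""
    index = {}
    for component in components:
        for name in component.companies:
            index.setdefault(name, []).append(component)
    return index


# A toy two-document corpus, invented for this sketch.
corpus = {
    "doc-1": "IBM and Yahoo tested a search tool. The results were mixed.",
    "doc-2": "NStein is a unit of OpenText. IBM markets OmniFind.",
}
all_components = [c for doc_id, text in corpus.items() for c in decompose(doc_id, text)]
index = build_index(all_components)

for hit in index["IBM"]:
    print(hit.doc_id, "->", hit.sentence)
```

The lookup on `index["IBM"]` returns components from both toy documents, each still tied to its `doc_id`. That is the “value added indexing” pitch in miniature; the hard and expensive part, which this sketch skips entirely, is doing the extraction accurately at enterprise scale.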
Shift in Search
We are witnessing another one of those abrupt shifts in enterprise search. Here’s my working hypothesis. (If you harbor a life long love of marketing baloney, quit reading because I am gunning for this pressure point.)
First, let’s face it. Enterprise search is just not revving the engines of the people in information technology or the chief financial officer’s office. Money pumped into search typically generates a large number of user complaints, security issues, and cost spikes. As content volume goes up, so do costs. The enterprise is not Google-land, and money is limited. The content is quite complex, and who wants to try and crack 1990s technology against the nut of 21st century data flows? Not I. So something hotter is needed.
Second, the hottest trends in “search” have nothing to do with search whatsoever. One example is conflating the interface with precision and recall. Sorry. Does not compute for me. The other angle is “mobile.” Sure, search will work when everything is monitored and “smart” software provides what a statistically appropriate method suggests will work “most” of the time. There is also the baloney about apps, which is little more than the gamification of what in many cases might better be served with a system that makes the user confront actual data, not an abstraction of data. What this means is that people are looking for a way to provide information access without having to grunt around in the messy innards of editorial policies, precision, recall, and other tasks that are intellectually rigorous in a way that Angry Birds interfaces for business intelligence are not.

Third, companies engaged in content access are struggling for revenue. Sure, the best of the search vendors have been purchased by larger technology companies. These acquisitions guarantee three things.
- The Wild West spirit of the innovative content processing vendors is essentially going to be stamped out. Creativity will be herded into the corporate killing pens, and the “team” will be rendered as meat products for a technology McDonald’s.
- The cash sink holes that search vendors’ research programs were will be filled with procedure manuals and forms. There is no money for blue sky problem solving to crack the tough problems in information retrieval at a Fortune 1000 company. Cash can be better spent on things that may actually generate a return. After all, if the search vendors were so smart, why did most companies hit revenue ceilings and have to turn to acquisitions to generate growth? For firms unable to grow revenues, some just fiddled the books. Others had to get injections of cash like a senior citizen in the last six months of life in a care facility. So acquired companies are not likely to be hot beds of innovation.
- The pricing mechanisms which search vendors have so cleverly hidden, obfuscated, and complexified will be tossed out the window. When a technology is a utility, then giant corporations will incorporate some of the technology in other products to make a sale.
What we have, therefore, is a search marketplace where the most visible and arguably successful companies have been acquired. The companies still in the marketplace now have to market like the Dickens and figure out how to cope with free open source solutions and giant acquirers who will just give away search technology.
Will XML Save Your Job?
October 29, 2011
If you work on enterprise search, enterprise content repurposing, or high end business intelligence systems, you may want to consider this question.
Is the Extensible Markup Language the ticket to first class retirement at a giant multi national firm?
At least one gosling asked me this morning, “What’s with the interest in XML?” I told him:
XML is complicated and can be explained in such a way that a CFO will write a check to save money due to the benefits of “intelligent content.”
If you believe that, then you are going to answer the question, “Will XML save your job?” yourself and probably before the end of 2011.
You will want to take a look at data2type’s AntillesXML tool. The product will definitely help lock in your expertise, making you indispensable to your employer. The story “A Unique Combination of XML Tools” asserts:
“AntillesXML a perfectly equipped toolbox for dealing with XML documents. Thanks to the new graphical user interface which is easy and intuitively to handle, the numerous features are suitable for developers and users alike,” explains Manuel Montero, managing director of data2type GmbH.
If that does not bolster your confidence, you can follow the new White House Chief Information Officer, Steven VanRoekel. He is on board with XML. Navigate to “Federal CIO Unveils Initiatives to Push XML, Virtualization, Agile IT”. Imagine all government documents in XML.
Will this happen?
Well, US government initiatives seem to come and go. When was the last time you used USA.gov or Data.gov? Hmm.
Stephen E Arnold, October 29, 2011
Sponsored by Pandia.com
PolySpot Wins over OSEO with Enterprise Search
October 28, 2011
Paris-based PolySpot’s reliability in conjunction with their innovative technologies paid off. In the news release, “OSEO Opts for a new Search Engine with PolySpot” we got to hear about many of the specifics that made PolySpot stand out amongst the competition.
First, let’s look at the issues that prompted OSEO to make the switch. OSEO had a Java-based directory in addition to a search engine supplied with its open source content management system.
OSEO’s former service was characterized by the following:
Indexing of data was restricted to the intranet and the search engine picked up too much ‘noise’. The users, unable to locate required information quickly, were no longer satisfied with the existing search engine which offered basic functionality.
Frédéric Vincent, Information System and Quality Assurance Manager, champions their decision to use PolySpot Enterprise Search.
The functionalities that comprise an intuitive user interface make PolySpot’s Search stand out: users can now customize their internal search tool, see added-value tags related to their queries in a tag cloud, and access search without quitting any other applications.
We think it may be a prudent step to check out PolySpot’s solutions at www.polyspot.com.
Megan Feil, October 28, 2011
Sponsored by Pandia.com
Enterprise Search: The Floundering Fish!
October 27, 2011
I am thinking about another monograph on the topic of “enterprise search.” The subject seems to be a bit like the motion picture protagonist Jason. Every film ends with Jason apparently out of action. Then, six or nine months later, he’s back. Knives, chains, you name it.
The Landscape
The landscape of enterprise search is pretty much unchanged. I know that the folks who pulled off the billion dollar deals are different. These guys and gals have new Bimmers and maybe a private island or some other sign of wealth. But the technology of yesterday’s giants of enterprise search is pretty much unchanged. Whenever I say this, I get email from the chief technology officers at various “big name” vendors who tell me, “Our technology is constantly enhanced, refreshed, updated, revolutionized, reinvented, whatever.”

Source: http://www.goneclear.com/photos_2003.htm
The reality is that the original Big Five had and still have technology rooted in the mid to late 1990s. I provide some details in my various writings about enterprise search in the Enterprise Search Report, Beyond Search for the “old” Gilbane, Successful Enterprise Search Management, and my June 2011 The New Landscape of Search.
Former Stand Alone Champions of Search
For those of you who have forgotten, here’s a précis:
- Autonomy IDOL, Bayesian, mid 1990s via the 18th century
- Convera, shotgun marriage of “old” Excalibur and “less old” Conquest (which was a product of a former colleague of mine at Booz, Allen & Hamilton, back when it was a top tier consulting firm)
- Endeca, hybrid of Yahoo directory and Inktomi with some jazzy marketing, late 1990s
- Exalead. Early 2000s technology and arguably the best of this elite group of information retrieval technology firms. Exalead is now part of Dassault, the French engineering wizardry firm.
- Fast Search & Transfer, Norwegian university, late 1990s. Now part of Microsoft Corp.
- Fulcrum, now part of OpenText. Dates from the early 1990s and maybe retired. I have lost track.
- Google Search Appliance. Late 1990s technology in an appliance form. The product looks a bit like an orphan to me as Google chases the enterprise cloud. GSA was reworked because “voting” doesn’t help a person in a company find a document, but it seems to be a dead end of sorts.
- IBM Stairs III, recoded in Germany and then kept alive via the Search Manager product and the third-party BRS system, which is now part of the OpenText stable of search solutions. Dates from the mid 1970s. IBM now “loves” open source Lucene. Sort of.
- Oracle Text. Late 1980s via acquisition of Artificial Linguistics.
There are some other interesting and important systems, but these are of interest to dinosaurs like me, not the Gen X and Gen Y azure chip crowd or the “we don’t have any time” procurement teams. These systems are Inquire (supported forward and rearward truncation), Island Search (a useful on-the-fly summarizer from decades ago), and the much loved RECON and SDC Orbit engines. Ah, memories.
What’s important is that the big deals in the last couple of months have been for customers and opportunities to sell consulting and engineering services. The deals are not about search, information retrieval, findability, or information access. The purchasers will talk about the importance of these buzzwords, but in my opinion, the focus is on getting customers and selling them stuff.
Three points:
SharePoint 2010 Is Easy to Adopt for a Reason
October 26, 2011
Amidst the news of the results of Forrester Research’s SharePoint 2010 Adoption and Migration Trends survey, I thought it would be beneficial to take another look at the Search Technologies article, “Leading with Search: A SharePoint 2010 Implementation Strategy.”
First, a little bit about the survey: 510 IT decision makers involved with evaluating, specifying, or administering SharePoint 2010 were consulted about their experiences. The article reported:
On the IT side, 79 percent of respondents said that SharePoint is meeting their expectations, with 21 percent giving a negative reply. When asked if SharePoint had met business management expectations, 73 percent said “Yes,” while 27 percent said “No.”
According to the Search Technologies article, it’s no wonder that SharePoint has such positive feedback. They are all about engaging their users. Both transparent data migration and easy enterprise searching and browsing motivate users to adopt this platform.
Research shows that it takes 21 days to form a habit. With all the perks of SharePoint, I wonder if it takes even that long for users to feel at ease with the adoption. We know that if a SharePoint licensee relies on Search Technologies for engineering support, the speed of adoption accelerates as does user satisfaction. For more information about Search Technologies, navigate to www.searchtechnologies.com.
Iain Fletcher, October 26, 2011
Search Technologies
Enterprise Search Silliness
October 23, 2011
I am back in Kentucky and working through quite a stack of articles which have been sent to me to review. I don’t want to get phone calls from Gen X and Gen Y CEOs, chipper attorneys, and annoyed vulture capitalists, so I won’t do the gory details thing.
I do want to present my view of the enterprise search market. I finished the manuscript for “The New Landscape of Search”, published in June 2011, before the notable acquisitions which have taken place in the discombobulated enterprise search market. Readers of this blog know that I am not too fond of the words “information,” “search”, “enterprise”, “governance”, and a dozen or so buzzwords that Art History majors from Smith have invented in the sales job.
In this write up I want to comment on three topics:
- What’s the reason for the buy outs?
- The chase for the silver bullet which will allow a vendor to close a deal, shooting the competitors dead
- The vapidity of the analyses of the search market.
Same rules apply. Put your comments in the comments section of the blog. Please, do not call and want to “convince” me that a particular firm has the “one, true way”. Also, do not send me email with a friendly salutation like “Hi, Steve.” I am not in a “hi, Steve” mood due to my lousy vision and 67 year old stamina. What little I have is not going to be applied to emails from people who want to give me a demo, a briefing, or some other Talmudic type of input. Not much magic in search.

What’s the Reason for the Buy Outs?
The reasons will vary by acquirer, but here’s my take on the deals we have been tracking and commenting upon to our paying clients. I had a former client want to talk with me about one of these deals. Surprise. I talk for money. Chalk that up to my age and the experience of the “something for nothing” mentality of search marketers.
The HP Autonomy deal was designed to snag a company with an alleged 20,000 licensees and close to $1.0 billion in revenue, and not a PricewaterhouseCoopers or Deloitte type of consulting revenue stream. HP wants to become more of a services company, and Autonomy’s packagers presented a picture that whipped HP’s Board and management into a frenzy. With the deal, HP gets a shot at services revenue, but there will be a learning curve. I think Meg Whitman of eBay Skype fame will have her hands full with Autonomy’s senior management. My hunch is that Mike Lynch and Andrew Kanter could run HP better than Ms. Whitman, but that’s my opinion.
HP gets a shot at selling higher margin engineering and consulting services. A bonus is the upsell opportunity to Autonomy’s customer base. Is there overlap? Will HP muff the bunny? Will HP’s broader challenges kill this reasonably good opportunity? Those answers appear in my HP Autonomy briefing which, gentle reader, costs money. And Oracle bought Endeca as a “me too” play.

