Autonomy Hits Top Tech Ranking and Number 21

December 16, 2010

Short honk: My hunch is that enterprise search in 2011 is going to be a blend of supercharged innovations and marketing. The Memphis Business Journal made available a story with a hook that snagged me. The story is “Autonomy Named to Bloomberg Businessweek Hot Tech 50 Ranking.” Autonomy has the distinction of joining quite distinguished company; for example, Apple, Akamai, and Wipro. There are several companies on the list that offer findability components in their products; for example, Cisco and OpenText. Autonomy is the company most closely associated with search in this Businessweek list. The full list is at this link at this time (December 16, 2010, 10:30 am Eastern). A happy quack to team Autonomy.

Stephen E Arnold, December 16, 2010

Freebie

Lucid Announces Enterprise Search

December 16, 2010

Short honk: I learned via Marketwire that Lucid Imagination has announced the general availability of its enterprise search system, LucidWorks Enterprise. According to the report, the new search solution scales, features a cost-effective architecture, and delivers “enriched document handling”. The software is available without cost. Lucid Imagination offers a complete range of for-fee consulting and engineering services. For more information, navigate to www.lucidimagination.com.

Stephen E Arnold, December 16, 2010

Freebie

Repositioning 2011: The Mad Scramble

December 15, 2010

Yep, the new year fast approaches. Time to turn one’s thoughts to vendors of search, content processing, data fusion, text mining, and—who could forget?—knowledge management. In the last two weeks, I have done several live-and-in-person briefings about ArnoldIT.com’s views on enterprise search and related disciplines.

Today enterprise search has become what I call an elastic concept. It is stretched over a baker’s dozen of quite divergent information retrieval concepts. Examples range from the old bugaboo of many companies, customer support, to the effervescence of knowledge management. The span runs from the hard realities of the costs of supporting actual customers to the frothy topping of “knowledge”.

Several trends are pushing through the fractured landscape of information retrieval. Like earthquakes, the effects can vary significantly depending on one’s position at the time of the event.

Source: http://www.sportsnet.ca/gallery/2009/12/30/scramble_gal_640.jpg

Search can be looked at in different ways. One can focus on a particular problem; for example, content management system repositories. The challenge is to find information in these systems. One would think that after years of making Web pages, the problem would be solved. Apparently not. CMS with embedded search stubs trigger some grousing in most of the organizations with which I am familiar. Search works, just not exactly as the users expect. A vendor of search technology can position the search solution as one that makes it easy for users to locate information in a CMS. This is, of course, the pitch of numerous Microsoft Certified Gold resellers of various types of search solutions, utilities, and workarounds. This is an example of a search market defined by the type of enterprise system that creates a retrieval problem.

Other problems for search crop up when specific rules and regulations mandate a particular type of information processing. One example is the eDiscovery market. Anyone can be sued, and eDiscovery systems have to make content findable, but the users of an eDiscovery system have quite particular needs. One example is bookkeeping so that the time and search process can be documented and provided upon request under certain conditions.

Social media has created a new type of problem. One can take a specific industry sector such as the Madison Avenue crowd and apply information technology to the social media problem. The idea is for a search system to “harvest” data from social content sources like Facebook or Twitter, process the text which can be ambiguous, and generate information about how the people creating Facebook messages or tweets perceive a product, person, ad, or some other activity for the advertising team. The idea is that search unlocks hidden information. The Mad Ave crowd thinks in terms of nuggets of information that will allow the ad team to upsell the advertiser. Search is doing search work but the object of the exercise is to make sense out of content streams that are too voluminous for a single person to read. This type of search market—which may not be classic search and retrieval at all—is closer to what various intelligence agencies want software to do to transcribed phone calls, email, and general information from a range of sources.

Let’s stop with the examples of information access problems already. There are more information access problems than at any other time, and I want to move on to the impact of these quite diverse problems upon vendors in 2011.

Now let’s take a vendor that has a search system that can index Word documents, email, and content found in most office environments. Nothing tricky like product specifications, chemical structures, or the data in the R&D department’s lab notebooks. For mainstream search, here is the problem:

Commoditization

Right now (no pun on the vendor of customer support solutions by the way) anyone can download an open source search solution. It helps if the person downloading Lucene, Solr, or one of the other open source solutions has a technical bent. If not, a local university’s computer science department can provide a student to do the installation and get the system up and running. If the part time contracting approach won’t work, you can hire a company specializing in open source to do the work. There are dozens of these outfits bouncing around.

Exclusive Interview: Brian Pinkerton

December 15, 2010

Introduction

At a recent conference, there was much buzz about consulting firms’ opinions about enterprise search. I spoke with several people who expressed surprise at the “rankings”. For example, one high-profile firm pronounced Vivisimo as the top vendor in enterprise search. Vivisimo positions itself as an “information optimization” company. I am not sure what that means, but it is clear that “enterprise search” is not the company’s main focus. Nevertheless, Vivisimo is number one.

Okay, but Vivisimo started life as a company with on-the-fly clustering. Then Vivisimo morphed into a vendor of federated search. Next Vivisimo dabbled in government contracts. After an executive shake up and an infusion of venture capital, Vivisimo emerged as an “information optimization” company. The phrase is as confusing as Google’s “contextual discovery.”

What are these marketers talking about? The answer is making sales and no-calorie marketing jargon. The consulting firms know a sales opportunity exists when user dissatisfaction with enterprise search is chugging along in the 50 to 70 percent range. Yes, most users of an enterprise “findability” system are unhappy. Procurement teams are, therefore, busy because most companies are looking for a search silver bullet.

To cater to those looking for a quick, simple way to solve an enterprise information access problem, consultants and advisors offer impressionistic write ups. Madison Avenue works fine when selling toothpaste. Apply that method to the very tough problem of information retrieval, and you end up with confusion, rising costs, and unhappy users.

Let me give you another example that surfaced in my conversations with vendors in London at the December International Online Conference. I learned that one consulting firm named Endeca as the top dog in enterprise search. I am okay with that assertion as long as there are some data to back up the claim. When I hear the name “Endeca”, I think of eCommerce as the core strength. The system can be applied to other information problems, but when I recall Endeca’s patent applications, I think about eCommerce, not discovery and data fusion.

Perhaps some search firms are more adept at social engineering than software engineering? Are some search advisors doing Madison Avenue-type thinking, not engineering analyses?

I don’t have any quibble with consulting firms who peg Autonomy as Number One. The revenue alone makes the difference between Autonomy and other information access vendors evident. Last time I saw Andrew Kanter, the chief operating officer for the vendor of meaning-based computing solutions, I asked him, “When will Autonomy break the $1.0 billion in revenue barrier?” He told me and an audience of about 175 people that Autonomy “was only $900 million.” Yep, $900 million, which is orders of magnitude greater than most of the 300 vendors whose information retrieval technology I track. IBM, Google, Microsoft, and Oracle do not provide search revenue detail in their financial reports. So on revenue Autonomy has a valid claim to the Number One position in enterprise search.

Consulting Firms Want to Sell Work, Not Expose Warts

Consulting firms—particularly those confined to the mid-tier below the McKinseys, the Bains and the Booz Allens and above the independent experts—have to feed their firms’ revenue hunger. Consulting is an expensive business because full time employees have to be kept billable. Making sales, therefore, is more important than objectivity in my experience.

What mid tier consulting firm sales professional wants to irritate an IBM, Google, Microsoft, or Oracle? Big companies, therefore, are often graded on the curve. Is it not easier to rubber stamp search systems from these Big Four vendors? Get along, go along is perhaps the motto in certain situations.

One consequence of the pressure to make sales is that consulting firms have to back certain horses. The idea is to focus on commercial vendors who are likely to have an appetite for buying and paying for the services of the consulting firm.

Somewhat surprisingly, most of the consulting firms’ search analyses fumble the ball when it comes to open source search; namely, Lucene/Solr, FLAX, Tesuji, and others. The fact is that organizations like Cisco Systems, eHarmony, LinkedIn, MTV, and Twitter, among others, are relying on open source “findability” solutions, in particular Lucene/Solr. Open source search is now a viable option for many organizations, and the deprecation of Lucene/Solr is surprising to me.

The bottom line is that most search vendor league tables are suspect. Unfortunately, these league tables are viewed as fact.

On December 10, 2010, I wanted to get an open source technologist to talk about open source search and how that option is perceived by marketing organizations masquerading as independent analysts.

The Interview

I spoke with Dr. Brian Pinkerton, Lucid Imagination’s vice president of product development. Brian has a Ph.D. in Computer Science & Engineering and started his work career as a senior software engineer at NeXT. He then developed WebCrawler, the Web’s first comprehensive search engine.

Brian Pinkerton, VP Product Development, Lucid Imagination

Since then he has been Technical Architect at AOL (which acquired WebCrawler), VP of Engineering and Chief Scientist at Excite, Principal Architect at A9, Director of Search at Technorati, and co-founder and President of Minimal Loop, whose technology was acquired by Scout Labs, where Brian was VP of Engineering.

Today (December 15, 2010) Lucid Imagination is announcing the general availability of its LucidWorks Enterprise product, which is available for free download. The product is described as a search solution development platform built on open source Apache Lucene/Solr.

The full text of my interview with Brian appears below:

Several consulting firms have issued analyses of the enterprise search market. I noted that open source search in general and Lucid Imagination in particular were not highlighted as top candidates for the enterprise. Why is open source search put on the bench?

Economics, primarily.  Because customers spend huge amounts of money on commercial packages, a small industry has grown up to support and encourage such decisions.  This process is naturally set up to ignore disruptive technologies, especially ones that are price-disruptive.  The consulting firms don’t work for free: getting prominent placement in a report usually costs money.  Who’s paying that fee for open source?   Another important reason is the market: developers, not IT managers, are the main adopters of open source solutions, while IT execs are the main consumers of the fancy reports.

Large organizations rely on consultants’ reports. In your opinion are these reports accurate?

It’s hard to comment on these reports because the methods are not always transparent. These consultants spend a lot of time talking to vendors and customers, and draw some conclusions based on that. Many of them have been at it for a while, and they survive by providing useful insights.  One useful thing to note, though, is that their conclusions are biased by those they talk to and their target audience: the IT exec.  If you’re one of those, I’m sure you like the reports.  If you’re a developer, you might not.

How is Lucid Imagination productizing open source search?

We have released a product, LucidWorks Enterprise, that extends Lucene/Solr with features commonly needed by commercial customers.  Our focus is providing technology that will make open source Lucene/Solr more accessible to more people.  For instance, user interfaces that simplify getting started, or APIs that are specifically targeted to the way enterprises build and integrate applications today.

For example, we extend Solr with RESTful interfaces for configuration; that provides developers with the ability to integrate it more easily. We also simplify functions that could be built from open source, but are more convenient to take as ready-made features.  Finally, we add features that 99.9% of software developers probably can’t create easily from scratch, such as our Click Scoring framework, which boosts search results selected most often by users.
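To make the click-boosting idea concrete, here is a minimal sketch in Python. It is not Lucid Imagination's Click Scoring framework, whose internals the interview does not describe. The function name, the `weight` parameter, and the boost formula are all assumptions chosen for illustration: each result's base relevance score is nudged upward in proportion to its share of recorded clicks.

```python
from collections import Counter


def click_boosted_ranking(results, click_log, weight=0.5):
    """Re-rank (doc_id, base_score) pairs using per-document click counts.

    Hypothetical formula: final = base * (1 + weight * share_of_clicks).
    """
    clicks = Counter(click_log)
    # Only clicks on documents in this result list count toward the share.
    total = sum(clicks[doc_id] for doc_id, _ in results)
    boosted = []
    for doc_id, base_score in results:
        share = clicks[doc_id] / total if total else 0.0
        boosted.append((doc_id, base_score * (1 + weight * share)))
    # Highest boosted score first; ties keep their original order.
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)


# Illustrative data: doc-b scores slightly lower but is clicked more often.
results = [("doc-a", 1.00), ("doc-b", 0.95), ("doc-c", 0.60)]
click_log = ["doc-b", "doc-b", "doc-b", "doc-a"]
ranked = click_boosted_ranking(results, click_log)
```

In this made-up example the frequently clicked doc-b moves ahead of doc-a. With an empty click log the original order is preserved.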

Furthermore, open source projects are really good at broad innovation, transparency, and easy access.  But the communities around open source projects are not support organizations, so many vendors help companies adopting open source with timely expert support. That’s another one of the things we do at Lucid.

What steps have you taken to ensure the stability of the open source search product you offer?

We take the latest, most stable innovations from the open source development tree (known as ‘trunk’) and provide rigorous integration testing, as well as regular, stable releases driven by customer opportunities. We follow strict software engineering principles and use a quality-driven  release process to build LucidWorks Enterprise.  And we provide maintenance fixes and releases for our product in timely fashion to customers.

Proprietary search vendors emphasize that their approach ensures that licensees get timely bug fixes and updates. Is this a valid statement? What does Lucid Imagination provide a customer who wants timely bug fixes and updates?

I think both open-source vendors and commercial software suppliers provide timely bug fixes and updates.  On the open-source side, it’s an interesting challenge because some bugs are fixed nearly instantly by the open source community, but they are not packaged in a way that a production customer can easily consume.  Production customers want bug-fix-only branches of the software, not bug fixes accompanied by the latest feature innovations that happened to be committed at the same time.  We insulate our customers from the open-source volatility by releasing stable, bug-fix-only branches for our production customers.

Search technology has fragmented into a mind numbing number of implementations such as an appliance, cloud or hosted search, on premises search, and combinations of methods. How does Lucid Imagination’s search product fit into this fragmented solutions landscape?

LucidWorks Enterprise is a product that spans the range from software appliance to developer toolkit.  Customers new to search can deploy it in a turnkey fashion, while more sophisticated customers can dive under the hood and build a complex application around it.  A key secret to great search is how well it fits the business it is meant to serve — in fact, this is true of any application, particularly custom built apps. We believe that anyone who needs better than ‘adequate’ search results will want to build their search solution, and we created LucidWorks Enterprise to provide the best, lowest cost, most scalable platform for building that search solution.

Microsoft SharePoint provides a search solution. Microsoft offers the Fast technology for a more robust solution. What does Lucid Imagination provide to a SharePoint licensee wanting an enhanced search solution?

We will release a robust SharePoint solution in the first two quarters of 2011 and enable anyone to use LucidWorks Enterprise to search their SharePoint data alongside data from other common sources.  One of the open questions about the new SharePoint solution is how long Microsoft will support Fast’s integration with anything but SharePoint.

Many search vendors offer faceted search; that is, the system generates hot links to related or supporting content. What is Lucid Imagination’s approach to faceted search?

Both LucidWorks Enterprise and Solr provide faceting support on every query that enables users to refine their results.   Faceting is most obviously useful in eCommerce, though a wide variety of applications also take advantage of the feature.  LucidWorks Enterprise and Solr support efficient and scalable faceting on any field, providing human-readable labels and accurate facet counts for the top facets.  One of the important considerations for large collections is the degree to which faceting works in a distributed configuration.  In LucidWorks Enterprise and Solr, faceting is supported seamlessly in distributed situations, offering the full performance at scale.
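For readers new to faceting, the following Python sketch shows the basic idea of counting facet values and merging the counts from several index shards. It is an illustration only and does not reproduce how Solr or LucidWorks Enterprise implement distributed faceting. The field name, the sample documents, and both helper functions are hypothetical.

```python
from collections import Counter


def facet_counts(docs, field):
    """Count how many documents on one shard carry each value of `field`."""
    return Counter(doc[field] for doc in docs if field in doc)


def merge_facets(per_shard_counts, limit=3):
    """Sum the per-shard counts, then keep the top `limit` facet values."""
    merged = Counter()
    for counts in per_shard_counts:
        merged.update(counts)  # Counter.update adds counts together
    return merged.most_common(limit)


# Two made-up shards of matching documents for one query.
shard_1 = [{"brand": "Acme"}, {"brand": "Acme"}, {"brand": "Bolt"}]
shard_2 = [
    {"brand": "Bolt"},
    {"brand": "Bolt"},
    {"brand": "Bolt"},
    {"brand": "Cog"},
    {"title": "no brand field"},  # skipped: no value for the facet field
]

top = merge_facets([facet_counts(shard_1, "brand"), facet_counts(shard_2, "brand")])
```

The sketch merges complete counts from every shard, which keeps the totals exact but would be expensive on a large collection. A production engine has to get accurate counts without shipping every value across the network, which is why distributed faceting is singled out in the answer above.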

Would you describe a customer support use case for Lucid Imagination search?  What are some common themes?

Because we have a diverse base of customers, we see a wide range of search applications.  One common theme is relevance tuning: for instance, customers who need help tying certain results to certain queries, or just better optimizing the algorithms built with Solr & Lucene to deliver the right results.  Another common theme, and one that I personally enjoy helping customers with, is performance.  We had one customer who replaced a commercial search engine with Solr, reducing their median query response time from 30 seconds to about four seconds without our help. We then helped them reduce that by another factor of eight, to a median query response time of under half a second.
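The arithmetic behind those figures can be checked in a few lines of Python. The 30 second, four second, and factor-of-eight values come from the answer above. The list of sample timings is invented purely to show why a median is a sensible summary when a few queries are slow outliers.

```python
import statistics

before = 30.0      # seconds: median with the commercial engine
after_solr = 4.0   # seconds: "about four seconds" after the move to Solr
tuning_factor = 8  # "another factor of eight" from tuning

after_tuning = after_solr / tuning_factor  # median after tuning
first_speedup = before / after_solr        # gain from the migration alone
overall_speedup = before / after_tuning    # combined gain

# Hypothetical per-query timings in seconds: one slow outlier
# barely moves the median, unlike the mean.
sample_timings = [0.3, 0.4, 0.5, 0.6, 9.0]
median_timing = statistics.median(sample_timings)
mean_timing = statistics.mean(sample_timings)
```

Taking "about four seconds" as exactly four, the tuned median works out to half a second, and the combined improvement over the original system is roughly sixtyfold.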

With open source search gaining acceptance within large companies like Cisco and high demand Web applications like Twitter, why are the consulting firms giving open source and Lucid so little attention?

One reason is that it’s coming up really, really fast — and they may not see it coming.  Also, open source adoption is often driven by a broad, diffuse population of developers.  The developers don’t generally put much stock in what the analysts say, if they’re even aware of the reports to begin with.  And on the flip side, the analysts are paying attention to their own customers, CIOs and vendor salespeople, who may not know how the work is really getting done.

What do you suggest a procurement team do to evaluate fully an open source search solution such as the one Lucid Imagination offers?

I think they need to make sure their company is comfortable with creating their own applications; it’s not a passive technology, but one that can be actively used to drive competitive advantage.  In looking at vendors, find one that can offer a solution that grows as their needs and skills grow: from something simple in the beginning to something fully customizable as they become more sophisticated consumers.  And most importantly, they should look for a company with the depth and expertise to provide training, support, and consulting to help them harness the full scope of search innovation.  Finally, they should do the math compared to what they might pay for a comparable implementation with a commercial enterprise search vendor. In many cases, they’re already spending many times what it would cost them to buy an open source-based solution. Sometimes they’ll pay more just for the annual maintenance — excluding consulting and license fees — than for a complete subscription for LucidWorks Enterprise.

In several of the recent analyses of enterprise search systems I have reviewed, I learned about such companies as Sinequa, Fabasoft, and Expert System, all examples of firms that have zero profile in many organizations. In your opinion, why are these types of search vendors given so much attention in the search market?

I can imagine that the marketing guys at such organizations are always happy to talk to industry analysts. I spend my time mainly talking to customers and developers.

How can one get more information about Lucid Imagination and its open source enterprise search solution?

Our Web site  www.lucidimagination.com  is full of information about our product, LucidWorks Enterprise, and other information about the open source technologies, Lucene and Solr.  We also have case studies that show how customers are building applications and products with Solr, Lucene, and LucidWorks Enterprise. And I always recommend downloading our product, now available free to developers, and taking it for a spin.

ArnoldIT Comment

My view about consulting firms’ analyses of search and content processing vendors has evolved over the last two years. The economic impact has put pressure on most of the companies that sell technical advice. Since the 2008 financial storm roiled commercial waters, certain advisory firms have shifted from independent analyses to what generates revenue for the consulting firms.

Many of the consulting firms’ reports are white papers or marketing material. The problem is that search is a particularly difficult technical field. Selecting a search system is often a difficult challenge for a procurement team. There are numerous, complex factors to consider.

Consulting firms offer “advice” about which system or systems are the “best” at a particular function. The problem is that writing about search is different from implementing search. It is easier to describe what a search vendor asserts in a demo. It is harder to take that solution and solve a real-world problem in a Microsoft SharePoint environment or in a setting where numerous mission critical applications operate in a stand alone manner.

If you are looking for a search solution, you will need to develop a “tight spec” and then investigate the options that match specific requirements. Few organizations have the time or resources to test multiple systems before making a decision about what search system to license.

The need for information about search creates an opportunity for independent firms to provide information, often at a hefty fee. In my experience, selecting a search system requires an approach close to the one that Martin White and I set forth in our 2009 book Successful Enterprise Search Management, published by Galatea in the UK.

We suggest that procurement teams become familiar with the available literature about search. Then a methodical process of assessment and evaluation can be followed. The short cut often leads to the all-too-common complaints about a search system. Users cannot locate needed information and user satisfaction plummets.

Stephen E Arnold, December 15, 2010

Sponsored

Capgemini Renews Vows with Exalead

December 15, 2010

NMK reports that “Exalead and Capgemini Extend Global Partnership to Provide Innovative Search-Based Solutions.” Congratulations are in order for the happy couple as they step into the international market! Prior to going global, Exalead and Capgemini worked together in France to deliver search-based application (SBA) solutions to joint clients.

“The rapid increase in online data sources means that businesses in today’s 24/7 economy, need quicker, more user-friendly and flexible tools to interpret huge volumes of information. This new global partnership with Exalead means that Capgemini is expected to help companies worldwide in the reduction of costs, and drive innovation through increasing the value of their information with the use of an innovative Information Management solution.”

Exalead Cloudview and Capgemini give their clients an easy, reliable program that can both manage and interpret structured and unstructured data. All levels of office workers will benefit from this program.

Stephen E Arnold, December 16, 2010

Freebie

A Clear Video from Lucid Imagination

December 13, 2010

Lucid Imagination has posted a video on their blog: “Solr Tuning Tips From Sourcesense.” It’s eighteen minutes of pure Solr excitement! Gustavo Fernades, Senior Open Source Consultant at Sourcesense UK, gives a rousing discussion about “six Solr tuning tips that can help significantly reduce query times in large-scale and near real time search environments. He covers adding memory, auto warming, application profiling, FieldValueCache, JVM and garbage collection, and using maxWarmingSearchers.”

When you watch the video, it helps to have a working knowledge of Solr so that you can apply the tips to your own deployment. Solr is an open source search platform from the Apache Lucene project. Its main features are dynamic clustering, database integration, full-text search, and hit highlighting.

Stephen E Arnold, December 13, 2010

Freebie

Endeca: Perfect in the Eyes of CTOLabs, Just Perfect

December 13, 2010

CTOlabs has a write up titled “Endeca’s MDEX Engine Review” that finds no fault with the Endeca system whatsoever. The write up glorifies the system by describing the MDEX features and naming popular Web sites that use it; for example, Walmart.com, ESPN.com, and HomeDepot.com.

“Endeca uses a powerful and revolutionary semi-structured database to provide advanced capabilities for high-performance search and information access across the entire spectrum of structured and unstructured enterprise data. The MDEX engine mixes search, navigation and exploration to dig and guide the user to discover.”

Endeca advertises the MDEX engine as having the following features: integrated API, semi-structured database, generational data updates, security, performance and scalability, end-to-end web services support, and extensibility. This is interesting, because most systems we test struggle with petascale dataflows, connecting to such content sources as the i2 ANB format, query response time under heavy load, and generating revenue. Perhaps these issues are not drawbacks in the informed eyes of the CTOLabs’ crew?

Guess Endeca is the leader in search, not Vivisimo, as we learned in the UK last week. Poor Autonomy, Google, Microsoft Fast, and Exalead. Maybe each should just become Endeca resellers.

Stephen E Arnold, December 13, 2010

Freebie

Exalead Brightens the Cloud

December 10, 2010

In many ways the computer makes old business practices obsolete, but in other ways it still feels like we have a messy desk stacked with papers and stickies; it’s just on our hard drive now. Global search solutions provider Exalead (http://www.exalead.com) is attempting to fix this feeling with a new release, CloudView 360, as we learned from Exalead’s Web site.

CloudView, according to a recent press release, aims to simplify search by offering an “extended package of CloudView modules developed specifically to integrate information from legacy CRM, ERP, HR enterprise applications, PLM systems and web content.” This is done by offering applications, “built on a flexible, scalable search backbone,” and “customized to incorporate knowledge of a business process.”

Faster searches, better structure, and Exalead’s reputation for flawless search will make other search tools feel like using a manual typewriter again. When accurate info is necessary in a snap, programs like CloudView look like the answer.

Patrick Roland, December 10, 2010

Freebie

Oracle SES10g and Stellent

December 10, 2010

I have a stack of “read later” documents. I was waiting for a Skype call, and I read the October 2010 Oracle white paper “Searching Oracle Content Server (Stellent) with Oracle Secure Enterprise Search 10g.” The title caught my attention because SES is now at version 11g, but the October 2010 white paper focuses on Oracle SES10g. Interesting.

If you are a Stellent customer, you may want to check out the white paper. If you compete with Oracle, the white paper is a must read. Several points jumped out when I worked my way through the 25 pages of text and diagrams.

First, the good stuff begins on page 10 of the white paper. Here’s the passage that signals the work ahead for the never-to-be-fired Oracle database administrator:

“The first challenge is to retrieve the data and the metadata from the Content Server repository into SES.”

So much for seamless integration of the Stellent and the SES10g software.

Second, the method for mapping metadata does not specify who does the mapping.  A bit of noodling leads me to the hypothesis that the “smart” software does mapping, then the lucky Oracle database administrator gets to dive in. In short, mapping fields and document attributes is likely to be a bit of work.

Third, the security steps are no big surprise. What did catch my attention was the need to install an additional security component into Content Server and then take a snapshot of the repository. My hunch is that one wants to have sufficient system and storage resources to handle beefy repositories. Running out of space is likely to mandate going back to square one. Not a hint of graceful recovery or even a checklist of “Do This Before You Start” items.

Fourth, a number of components and functions have to be configured by the system administrator. Once again: hands on work.

Oracle DBAs have job security by design, I opine.

Stephen E Arnold, December 10, 2010

Freebie

Business Intelligence Bubbling

December 8, 2010

“Open Source Business Intelligence: What Are Your Options?” investigates the momentum of OSBI in recent years.  In the words of Joe Nicholson, marketing VP at Pentaho, “The biggest trend we see is the rapidly increasing adoption of OSBI as a means to provide critical reports, dashboards and analysis to business users at far less costs than the traditional, proprietary BI solutions”.  It appears the interest in OSBI has been on the rise for quite some time, though what this will mean for consumers remains unclear.

What we are seeing is a good deal of posturing from the leaders in the field. With companies like Jaspersoft, SpagoBI, and the aforementioned Pentaho (which claims it saves clients $1.5 million in fees over a three year span) clamoring for a spot at the head of the queue, speculation on the future of the OSBI market is dramatic.  Aside from lowering costs, the idea of tapping into innovations spawned from the OS community itself has taken root, which according to Nicholson translates to more than six million downloads, upwards of eight thousand active projects, and more than twelve hundred commercial customers.

Currently the community versions are available in a free downloadable format though they lack certain administrative and security functions, in addition to diminished ability to connect to prime data outlets. One still has to pay for premium access.

Sarah Rogers, December 8, 2010
