An Interview with Brian Pinkerton
At a recent conference, there was much buzz about consulting firms’ opinions about enterprise search. I spoke with several people who expressed surprise at the “rankings”. For example, one high-profile firm pronounced Vivisimo as the top vendor in enterprise search. Vivisimo positions itself as an “information optimization” company. I am not sure what that means, but it is clear that “enterprise search” is not the company’s main focus. Nevertheless, Vivisimo is number one.
Okay, but Vivisimo started life as a company with on-the-fly clustering. Then Vivisimo morphed into a vendor of federated search. Next Vivisimo dabbled in government contracts. After an executive shake-up and an infusion of venture capital, Vivisimo emerged as an “information optimization” company. The phrase is as confusing as Google’s “contextual discovery.”
What are these marketers talking about? The answer is sales, wrapped in no-calorie marketing jargon. The consulting firms know a sales opportunity exists when user dissatisfaction with enterprise search is chugging along in the 50 to 70 percent range. Yes, most users of an enterprise “findability” system are unhappy. Procurement teams are, therefore, busy because most companies are looking for a search silver bullet.
To cater to those looking for a quick, simple way to solve an enterprise information access problem, consultants and advisors offer impressionistic write-ups. Madison Avenue works fine when selling toothpaste. Apply that method to the very tough problem of information retrieval, and you end up with confusion, rising costs, and unhappy users.
Let me give you another example that surfaced in my conversations with vendors in London at the December International Online Conference. I learned that one consulting firm named Endeca as the top dog in enterprise search. I am okay with that assertion as long as there are some data to back up the claim. When I hear the name “Endeca”, I think of eCommerce as the core strength. The system can be applied to other information problems, but when I recall Endeca’s patent applications, I think about eCommerce, not discovery and data fusion.
Perhaps some search firms are more adept at social engineering than software engineering? Are some search advisors doing Madison Avenue-type thinking, not engineering analyses?
I don’t have any quibble with consulting firms who peg Autonomy as Number One. The revenue alone makes the difference between Autonomy and other information access vendors evident. Last time I saw Andrew Kanter, the chief operating officer for the vendor of meaning-based computing solutions, I asked him, “When will Autonomy break the $1.0 billion revenue barrier?” He told me and an audience of about 175 people that Autonomy “was only $900 million.” Yep, $900 million, which is orders of magnitude greater than the revenue of most of the 300 vendors whose information retrieval technology I track. IBM, Google, Microsoft, and Oracle do not provide search revenue detail in their financial reports. So on revenue Autonomy has a valid claim to the Number One position in enterprise search.
Consulting Firms Want to Sell Work, Not Expose Warts
Consulting firms, particularly those confined to the mid-tier below the McKinseys, the Bains and the Booz Allens and above the independent experts, have to feed their firms’ revenue hunger. Consulting is an expensive business because full-time employees have to be kept billable. Making sales, therefore, is more important than objectivity in my experience.
What mid-tier consulting firm sales professional wants to irritate an IBM, Google, Microsoft, or Oracle? Big companies, therefore, are often graded on the curve. Is it not easier to rubber stamp search systems from these Big Four vendors? Get along, go along is perhaps the motto in certain situations.
One consequence of the pressure to make sales is that consulting firms have to back certain horses. The idea is to focus on commercial vendors who are likely to have an appetite for buying and paying for the services of the consulting firm.
Somewhat surprisingly, most of the consulting firms’ search analyses fumble the ball when it comes to open source search; namely, Lucene/Solr, FLAX, Tesuji, and others. The fact is that organizations like Cisco Systems, eHarmony, LinkedIn, MTV, and Twitter, among others, are relying on open source “findability” solutions, in particular Lucene/Solr. Open source search is now a viable option for many organizations, and the deprecation of Lucene/Solr is surprising to me.
The bottom line is that most search vendor league tables are suspect. Unfortunately, these league tables are viewed as fact.
On December 10, 2010, I wanted to get an open source technology expert to talk about open source search and how that option is perceived by marketing organizations masquerading as independent analysts.
I spoke with Dr. Brian Pinkerton, Lucid Imagination’s vice president of product development. Brian has a Ph.D. in Computer Science & Engineering and started his work career as a senior software engineer at NeXT. He then developed WebCrawler, the Web’s first comprehensive search engine.
Since then he has been Technical Architect at AOL (which acquired WebCrawler), VP of Engineering and Chief Scientist at Excite, Principal Architect at A9, Director of Search at Technorati, and co-founder/President of Minimal Loop, whose technology was acquired by Scout Labs, where Brian was VP of Engineering.
Today (December 15, 2010) Lucid Imagination is announcing the general availability of its LucidWorks Enterprise product, which is available for free download. The product is described as a search solution development platform built on open source Apache Lucene/Solr.
The full text of my interview with Brian appears below:
Several consulting firms have issued analyses of the enterprise search market. I noted that open source search in general and Lucid Imagination in particular were not highlighted as top candidates for the enterprise. Why is open source search put on the bench?
Economics, primarily. Because customers spend huge amounts of money on commercial packages, a small industry has grown up to support and encourage such decisions. This process is naturally set up to ignore disruptive technologies, especially ones that are price-disruptive. The consulting firms don’t work for free: getting prominent placement in a report usually costs money. Who’s paying that fee for open source? Another important reason is the market: developers, not IT managers, are the main adopters of open source solutions, while IT execs are the main consumers of the fancy reports.
Large organizations rely on consultants’ reports. In your opinion are these reports accurate?
It’s hard to comment on these reports because the methods are not always transparent. These consultants spend a lot of time talking to vendors and customers, and draw some conclusions based on that. Many of them have been at it for a while, and they survive by providing useful insights. One useful thing to note, though, is that their conclusions are biased by those they talk to and their target audience: the IT exec. If you’re one of those, I’m sure you like the reports. If you’re a developer, you might not.
How is Lucid Imagination productizing open source search?
We have released a product, LucidWorks Enterprise, that extends Lucene/Solr with features commonly needed by commercial customers. We focus on providing technology that will make open source Lucene/Solr more accessible to more people. For instance, user interfaces that simplify getting started, or APIs that are specifically targeted to the way enterprises build and integrate applications today.
For example, we extend Solr with RESTful interfaces for configuration; that provides developers with the ability to integrate it more easily. We also simplify functions that could be built from open source, but are more convenient to take as ready-made features. Finally, we add features that 99.9% of software developers probably can’t create easily from scratch, such as our Click Scoring framework, which boosts search results selected most often by users.
Furthermore, open source projects are really good at broad innovation, transparency, and easy access. But the communities around open source projects are not support organizations, so many vendors help companies adopting open source with timely expert support. That’s another one of the things we do at Lucid.
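As an illustration of the click-based boosting mentioned in this answer, here is a minimal sketch of how results that users select more often could be moved up a ranking. It is not Lucid Imagination's Click Scoring implementation, whose internals the interview does not describe. The function name `click_boosted`, the `weight` parameter, and the logarithmic damping are all assumptions made for the example.

```python
import math

def click_boosted(results, clicks, weight=0.5):
    """Re-rank (doc_id, base_score) pairs using historical click counts.

    Hypothetical formula: base_score * (1 + weight * ln(1 + clicks)).
    The logarithm keeps a heavily clicked document from swamping relevance.
    """
    boosted = [
        (doc_id, score * (1.0 + weight * math.log1p(clicks.get(doc_id, 0))))
        for doc_id, score in results
    ]
    # Highest boosted score first.
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)

# Illustrative data only: base relevance scores and click counts per document.
results = [("doc-a", 1.0), ("doc-b", 0.9), ("doc-c", 0.8)]
clicks = {"doc-c": 40, "doc-b": 2}

ranked = click_boosted(results, clicks)
print([doc_id for doc_id, _ in ranked])
```

In this toy data the frequently clicked document rises above documents with higher base scores, while a document with no clicks keeps its original score.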
What steps have you taken to ensure the stability of the open source search product you offer?
We take the latest, most stable innovations from the open source development tree (known as ‘trunk’) and provide rigorous integration testing, as well as regular, stable releases driven by customer opportunities. We follow strict software engineering principles and use a quality-driven release process to build LucidWorks Enterprise. And we provide maintenance fixes and releases for our product in a timely fashion to customers.
Proprietary search vendors emphasize that their approach ensures that licensees get timely bug fixes and updates. Is this a valid statement? What does Lucid Imagination provide a customer who wants timely bug fixes and updates?
I think both open-source vendors and commercial software suppliers provide timely bug fixes and updates. On the open-source side, it’s an interesting challenge because some bugs are fixed nearly instantly by the open source community, but they are not packaged in a way that a production customer can easily consume. Production customers want bug-fix-only branches of the software, not bug fixes accompanied by the latest feature innovations that happened to be committed at the same time. We insulate our customers from the open-source volatility by releasing stable, bug-fix-only branches for our production customers.
Search technology has fragmented into a mind-numbing number of implementations such as an appliance, cloud or hosted search, on-premises search, and combinations of methods. How does Lucid Imagination’s search product fit into this fragmented solutions landscape?
LucidWorks Enterprise is a product that spans the range from software appliance to developer toolkit. Customers new to search can deploy it in a turnkey fashion, while more sophisticated customers can dive under the hood and build a complex application around it. A key secret to great search is how well it fits the business it is meant to serve — in fact, this is true of any application, particularly custom built apps. We believe that anyone who needs better than ‘adequate’ search results will want to build their search solution, and we created LucidWorks Enterprise to provide the best, lowest cost, most scalable platform for building that search solution.
Microsoft SharePoint provides a search solution. Microsoft offers the Fast technology for a more robust solution. What does Lucid Imagination provide to a SharePoint licensee wanting an enhanced search solution?
We will release a robust SharePoint solution in the first two quarters of 2011 and enable anyone to use LucidWorks Enterprise to search their SharePoint data alongside data from other common sources. One of the open questions about the new SharePoint solution is how long Microsoft will support Fast’s integration with anything but SharePoint.
Many search vendors offer faceted search; that is, the system generates hot links to related or supporting content. What is Lucid Imagination’s approach to faceted search?
Both LucidWorks Enterprise and Solr provide faceting support on every query that enables users to refine their results. Faceting is most obviously useful in eCommerce, though a wide variety of applications also take advantage of the feature. LucidWorks Enterprise and Solr support efficient and scalable faceting on any field, providing human-readable labels and accurate facet counts for the top facets. One of the important considerations for large collections is the degree to which faceting works in a distributed configuration. In LucidWorks Enterprise and Solr, faceting is supported seamlessly in distributed situations, offering the full performance at scale.
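To make the faceting idea in this answer concrete, here is a minimal sketch in plain Python. It does not call Solr or LucidWorks Enterprise. It only shows the concept: count the values of a field across the current hits, then let the user refine by one of those values. The sample records, field names, and helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical product records; the field names are illustrative only.
DOCS = [
    {"id": 1, "title": "trail running shoe", "brand": "Acme", "color": "red"},
    {"id": 2, "title": "road running shoe", "brand": "Acme", "color": "blue"},
    {"id": 3, "title": "running sock", "brand": "Bolt", "color": "red"},
    {"id": 4, "title": "hiking boot", "brand": "Bolt", "color": "brown"},
]

def search(docs, term, filters=None):
    """Return docs whose title contains term and that match every filter."""
    filters = filters or {}
    return [
        d for d in docs
        if term in d["title"]
        and all(d.get(field) == value for field, value in filters.items())
    ]

def facet_counts(hits, field, top=10):
    """Count the values of one field across the hits, most common first."""
    return Counter(d[field] for d in hits if field in d).most_common(top)

hits = search(DOCS, "running")
brand_facets = facet_counts(hits, "brand")           # labels plus counts
refined = search(DOCS, "running", {"brand": "Acme"})  # user clicks a facet

print(brand_facets)
print([d["id"] for d in refined])
```

A production engine computes these counts inside the index rather than by scanning records, which is what makes faceting practical on large and distributed collections.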
Would you describe a customer support use case for Lucid Imagination search? What are some common themes?
Because we have a diverse base of customers, we see a wide range of search applications. One common theme is relevance tuning: for instance, customers who need help tying certain results to certain queries, or just better optimizing the algorithms built with Solr & Lucene to deliver the right results. Another common theme, and one that I personally enjoy helping customers with, is performance. We had one customer who replaced a commercial search engine with Solr, reducing their median query response time from 30 seconds to about four seconds without our help. We then helped them reduce that by another factor of eight, to a median query response time of under half a second.
With open source search gaining acceptance within large companies like Cisco and high-demand Web applications like Twitter, why are the consulting firms giving open source and Lucid so little attention?
One reason is that it’s coming up really, really fast — and they may not see it coming. Also, open source adoption is often driven by a broad, diffuse population of developers. The developers don’t generally put much stock in what the analysts say, if they’re even aware of the reports to begin with. And on the flip side, the analysts are paying attention to their own customers, CIOs and vendor salespeople, who may not know how the work is really getting done.
What do you suggest a procurement team do to evaluate fully an open source search solution such as the one Lucid Imagination offers?
I think they need to make sure their company is comfortable with creating their own applications; it’s not a passive technology, but one that can be actively used to drive competitive advantage. In looking at vendors, find one that can offer a solution that grows as their needs and skills grow: from something simple in the beginning to something fully customizable as they become more sophisticated consumers. And most importantly, they should look for a company with the depth and expertise to provide training, support, and consulting to help them harness the full scope of search innovation. Finally, they should do the math compared to what they might pay for a comparable implementation with a commercial enterprise search vendor. In many cases, they’re already spending many times what it would cost them to buy an open source-based solution. Sometimes they’ll pay more just for the annual maintenance — excluding consulting and license fees — than for a complete subscription for LucidWorks Enterprise.
In several of the recent analyses of enterprise search systems I have reviewed, I learned about such companies as Sinequa, Fabasoft and Expert System, all examples of firms that have zero profile in many organizations. In your opinion, why are these types of search vendors given so much attention in the search market?
I can imagine that the marketing guys at such organizations are always happy to talk to industry analysts. I spend my time mainly talking to customers and developers.
How can one get more information about Lucid Imagination and its open source enterprise search solution?
Our Web site www.lucidimagination.com is full of information about our product, LucidWorks Enterprise, and other information about the open source technologies, Lucene and Solr. We also have case studies that show how customers are building applications and products with Solr, Lucene, and LucidWorks Enterprise. And I always recommend downloading our product, now available free to developers, and taking it for a spin.
My view about consulting firms’ analyses of search and content processing vendors has evolved over the last two years. The economic impact has put pressure on most of the companies that sell technical advice. Since the 2008 financial storm roiled commercial waters, certain advisory firms have shifted from independent analyses to what generates revenue for the consulting firms.
Many of the consulting firms’ reports are white papers or marketing material. The problem is that search is a particularly difficult technical field. Selecting a search system is often a difficult challenge for a procurement team. There are numerous, complex factors to consider.
Consulting firms offer “advice” about which system or systems are the “best” at a particular function. The problem is that writing about search is different from implementing search. It is easier to describe what a search vendor asserts in a demo. It is harder to take that solution and solve a real-world problem in a Microsoft SharePoint environment or in a setting where numerous mission critical applications operate in a standalone manner.
If you are looking for a search solution, you will need to develop a “tight spec” and then investigate the options that match specific requirements. Few organizations have the time or resources to test multiple systems before making a decision about what search system to license.
The need for information about search creates an opportunity for independent firms to provide information, often at a hefty fee. In my experience, selecting a search system requires an approach close to the one that Martin White and I set forth in our 2009 book Successful Enterprise Search Management, published by Galatea in the UK.
We suggest that procurement teams become familiar with the available literature about search. Then a methodical process of assessment and evaluation can be followed. The shortcut often leads to the all-too-common complaints about a search system: users cannot locate needed information, and user satisfaction plummets.
Stephen E Arnold, December 21, 2010