Enterprise Search: Evidence It Is a Commodity
January 17, 2015
I was browsing through some information gathered by Overflight last week. I came across an interesting page showing Libraries Australia Architecture Overview. Here’s a miniature of the diagram. The link provides a larger version. Where is search? Well, it is in the middle, represented by a purple storage icon.
The search system is Solr. I find this interesting for several reasons:
First, Solr replaced the Australian-developed TeraText search system, which I think is pretty good. TeraText was a commercial product, and Solr is an open source system.
Second, Solr is a component in a far larger system. No surprise here, but the diagram makes clear that search is a utility supporting many other library functions. For vendors who make search the fabric for a large-scale application, the Libraries Australia team may want you to give them a lecture about ways to improve their system.
Third, Libraries Australia has a number of systems, each of which presumably has its native search tools. The implication is that Solr provides one screen access to these diverse resources. I wonder if the Oracle DBA uses Solr instead of the native Oracle tools. My thought is that the Solr champions see no reason to fool with Oracle command lines. The DBA, on the other hand, may see information access from a different point of view.
Net net: A commercial account closes, and an open source account begins. Does this fact suggest that closing deals for proprietary search systems might be more difficult in 2015?
Stephen E Arnold, January 17, 2015
Worlds Apart: The Schism between Information Access and Old School Keyword Search
January 9, 2015
Ah, Dave Schubmehl. You may remember my adventures with this “expert” in search. He published four reports based on my research, and then without permission sold one of these recycled $3,500 gems on Amazon. A sharp eyed law librarian and my attorney were able to get this cat back into the bag.
He’s back with a 22 page report “The Knowledge Quotient: Unlocking the Hidden Value of Information Using Search and Content Analytics” that is free. Yep, free.
I was offered this report at a Yahoo email address I use to gather the spam and content marketing fluff that floods to me each day. I received the spam from Alisa Lipzen, an inside sales representative, of Coveo. Ms. Lipzen is sufficiently familiar with me to call me “Ben”. That’s a familiarity that may be unwarranted. She wants me to “enjoy.” Okay, but how about some substance.
To put this report in perspective, it is free. To me this means that the report was written for Coveo (a SharePoint centric keyword search vendor) and Lexalytics (a unit of Infonic if this IDC item is accurate). IDC, in my view, was paid to write this report and then cooperated with Coveo and Lexalytics to pump out the document as useful information.
My interest is not in the content marketing and pay-for-fame methods of consulting firms and their clients. Nope. I am focused on the substance of the write up which I was able to download thanks to the link in the spam I received. Here’s the cover page.
For background, I have just finished CyberOSINT: Next Generation Information Access. Fresh in my mind are the findings from our original and objective research. That’s right. I funded the research and I did not seek compensation from any of the 21 companies profiled in the report. You can read about the monograph on my Xenky site.
What’s interesting to me is that the IDC “expert” generated marketing document misses the major shift that has taken place in information access.
Keyword search is based on looking at what happened. That’s the historical bias of looking for content that has been processed and indexed. One can sift through that index and look for words that suggest happiness or dissatisfaction. That’s the “sentiment” angle.
But these methods are retrospective.
As CyberOSINT points out the new approach that is gaining customers and the support of a number of companies like BAE and Google is forward looking.
One looks up information when one knows what one is seeking. But what does the real time flow of information mean for now and the next 24 hours or week? The difference is one that is now revolutionizing information access and putting old school vendors at a disadvantage.
Grand View Research Looks at Enterprise Search and Misses a Market Shift
January 7, 2015
Every time I write about a low-tier or mid-tier consulting firm’s reports, I get nastygrams. One outfit demanded that I publish an apology. Okay, no problem. I apologize for expressing that the research was at odds with my own work. So before I tackle Grand View Research’s $4,700 report called “Enterprise Search Market Analysis By End-Use (Government & Commercial Offices, Banking & Finance, Healthcare, Retail), By Enterprise Size (Small, Medium, Large) And Segment Forecasts To 2020,” let me say, I am sorry. Really, really sorry.
This is a report that is about a new Fantasyland loved by the naive. The year 2020 will not be about old school search.
Image source: http://www.themeparkreview.com/parks/photo.php?pageid=116&linkid=12739
I know I am taking a risk because my new report “CyberOSINT: Next Generation Information Access” will be available in a very short time. The fact that I elected to abandon search as an operative term is one signal that search is a bit of a dead end. I know that there are many companies flogging fixes for SharePoint, specialized systems that “do” business intelligence, and decades old information retrieval approaches packaged as discovery or customer service solutions.
But the reality is that plugging words into a search box means that the user has to know the terminology and what he or she needs to answer a question. Then the real work begins. Working through the results list takes time. Documents have to be read and pertinent passages copied and pasted into another file. Then the researcher has to figure out what is right or wrong, relevant or irrelevant. I don’t know about you, but most 20 somethings are spending more time thumb typing than doing old fashioned research.
What has Grand View Research figured out?
First off, the company knows it has to charge a lot of money for a report on a topic that has been beaten to death for decades. Grand View’s approach is to define “search” by some fairly broad categories; for example, small, medium and large and Government and commercial, banking and finance, healthcare, retail and “others.”
Enterprise Search: Parkour for Venture Funded Enterprise Search Vendors
January 3, 2015
Parkour refers to the sport of jumping and climbing on man made constructions. Note that most of these “obstacles” have doors, staircases, and maybe elevators.
There are some terms that make this seemingly crazy activity sound really cool. For example, I learned whilst on vacation about the KONG. This is a saut de chat and involves “diving forward over an obstacle so that the body becomes horizontal, pushing off with the hands and tucking the legs such that the body is brought back to a vertical position, ready to land.” See Parkour Terminology.
I also found this maneuver fascinating:
Kash vault This vault is a combination of two vaults; the cat pass and the dash vault. After pushing off with the hands in a cat pass, the body continues past vertical over the object until the feet are leading the body. The kash vault is then finished by pushing off the object at the end, as in a dash vault.
Here’s an image of a parkour expert doing parkour, of course:
Image source: http://parkourfreerunningblog.com/wp-content/uploads/2011/10/parkour.jpg
Now this looks like something a crazy person does: Jumping off a large concrete structure. Just my opinion, of course.
And, from my point of view, parkour is very similar to selling proprietary enterprise search and content processing solutions to commercial enterprises. The danger comes from having to pay stakeholders for the cash borrowed to keep the enterprise search company afloat. The thrill comes from the knife edge under feet: one error and some serious pain results. I suppose this focuses the mind.
As 2015 gets underway, enterprise search “experts” and vendors are gearing up to make sales. Some of the antics are beneficial to the mid tier consulting firms and publications that list the “visionaries,” the “companies that matter”, and the “leaders.” There are individual experts who conflate search with mastering Big Data or delivering the fuzzy wuzzy notion of information governance. Then there are the search vendors who wrap keyword search and classification in Dollar General wrapping paper. The idea is that keyword search is customer relationship management, analytics, and business intelligence.
For me, this is search vendor parkour, and it is okay for the tiny percentage of the population who want to jump off man-made structures. But for a person with a bit of information retrieval perspective, there are some other ways to get some exercise, remain whole, and not look absolutely crazy to an outside observer.
Here are some enterprise search realities to ponder this weekend:
First, if IBM and HP actually hit their magical billion dollar goals for Watson and IDOL, how much money will be left for the hundreds and hundreds of smaller search system vendors? The answer is, “Generating billions from search is not possible, and the money available tends to be a tiny fraction of these behemoths’ projections.”
Second, why would a company pay for a commercial keyword search system when there are perfectly functional open source solutions like Elasticsearch, FLAX, and SphinxSearch?
Third, how can keyword search enriched with some clustering deliver actionable intelligence? There are companies specializing in delivering actionable intelligence. Such firms as BAE and Leidos have robust platforms that collect, analyze, and report automatically. Guessing which words unlock the treasures of an index seems somewhat old fashioned to me.
Fourth, how will the companies pouring millions upon millions into Attivio, BA Insight, Coveo, and a dozen other keyword search companies get their money back? I suppose there is the hope that Google, Microsoft, or Oracle will buy one of these firms. But that looks like a long shot. My view is that paying back the investors is going to be difficult, if not impossible.
Now these statements are sobering. One can immerse oneself in that baloney generated by the mid tier consultants (one of which Dave Schubmehls my research), the silliness generated by content management blogs about findability, and the wonkery of search engine optimization wizards.
The year 2015 will witness some significant shifts in the enterprise search landscape. In my forthcoming CyberOSINT: Next Generation Information Access, I explain the type of systems that are underpinning intelligence systems in the US and EC nations. I point out the specific functionalities of these next generation systems that make search a utility. Think of Mac OSX and its inclusion of Spotlight. Nice to have, for sure, but search is not OSX. My research team and I also identify some important lessons the NGIA vendors are teaching their customers. We also look ahead and identify some research areas that are likely to capture investors’ attention and yield measurable results.
Search is a utility. The fact that some brave people convert it to parkour does not change the fact that the activity itself is risky, entertaining, and useless. If I were an athlete, which I am not, I would focus on sports that generate the big bucks. Hoops. Football. Soccer. Parkour? That looks nuts from my vantage point in Harrod’s Creek.
Why not sell something the customer can see solves a problem? Crazy jumps just call attention to the last gasps of a software sector that needs life support.
Stephen E Arnold, January 3, 2015
SAP Hana Search 2014
December 25, 2014
Years ago I wrote an analysis of TREX. At the time, SAP search asserted a wide range of functionality. I found the system interesting, but primarily of use to die hard SAP licensees. SAP was and still is focused on structured data. The wild and crazy heterogeneous information generated by social media, intercept systems, geo-centric gizmos, and humans blasting terabytes of digital images cheek by jowl with satellite imagery is not the playground of the SAP technology.
If you want to get a sense of what SAP is delivering, check out “SAP Hana’s Built-In Search Engine.” My take on the explanation is that it is quite similar to what Fast Search & Transfer proposed for the pre-sale changes to ESP. The built-in system is not one thing. The SAP explainer points out:
A standalone “engine” is not enough, however. That’s why SAP HANA also includes the Info Access “InA” toolkit for HTML5. The InA toolkit is a set of HTML5 templates and UI controls which you can use to configure a modern, highly interactive UI running in a browser. No code – just configuration.
To make matters slightly more confusing, I read “Google Like Enterprise Search Powered by SAP Hana.” I am not sure what “Google like” means. Google provides its ageing and expensive Google Search Appliance. But like Google Earth, I am not sure how long the GSA will remain on the Google product punch list. Furthermore, the GSA is a bit of a time capsule. Its features and functions have not kept pace with next generation information access technologies. Google invested in Recorded Future a couple of years ago and as far as I know, none of the high value Recorded Future functions are part of the GSA. Google also delivers its Web indexing service. Does “Google like” refer to the GSA, Google’s cloud indexing of Web sites, or the forward looking Recorded Future technology?
The Google angle seems to relate to Fiori search. Based on the screenshots, it appears that Fiori presents SAP’s structured data in a report format. Years ago we used a product called Monarch to deliver this type of information to a client.
My hypothesis is that SAP wants to generate more buzz about its search technology. The company has moved on from TREX, positioned Hana search as a Fast Search emulation, and created Fiori to generate reports from SAP’s structured data management system.
For now, I will keep SAP in my “maybe next year” folder. For now. I am not sure what SAP information access systems deliver beyond basic keyword search, some clustering, and report outputs. SAP at some point may have to embrace open source search solutions. If SAP has maintained its commitment to open source, perhaps these technologies are open source. I would find that reassuring.
Regardless of what SAP is providing licensees, it is clear that the basic features and functions of next generation information access systems are not part of the present line up of products. Like other IBM-inspired companies, the future is rushing forward with SAP search receding in tomorrow’s rear view mirror. Calling a system “Google like” is not helpful, nor does it suggest that SAP is aware of NGIA systems. Some of SAP’s customers will be licensing these systems in order to move beyond what is a variation of query, scan results, open documents, read documents, and hunt for useful information. Organizations require more sophisticated information access services. The models crafted in the 1990s are, in my opinion, commoditized. Higher value NGIA operations are the future.
Stephen E Arnold, December 25, 2014
Coveoed Up with End of Week Marketing
December 22, 2014
I am the target of inbound marketing bombardments. I used to look forward to Autonomy’s conceptual inducements. In fact, in my opinion, the all-time champ in enterprise search marketing is Autonomy. HP now owns the company, and the marketing has fizzled in my opinion. I am in some far off place, and I sifted through emails, various alerts, and information dumped in my Overflight system.
I must howl, “Uncle.” I have been covered up or Coveo-ed up.
Coveo is the Canadian enterprise search company that began life as a hard drive search program and then morphed into a Microsoft-centric solution. With some timely venture funding, the company has amped up its marketing. The investors have flown to Australia to lecture about search. Australia as you may know is the breeding ground for the TeraText system which is a darned important enterprise application. Out of the Australia research petri dish emerged Funnelback. There was YourAmigo, and some innovations that keep the lights on in the Google offices in the land down under.
Coveo sent me email asking if my Google search appliance was delivering. Well, the GSA does exactly what it was designed to do in the early 2000s. I am not sure I want it to do anything anymore. Here’s part of the Coveo message to me:
Hi,
Is your Search Appliance failing you? Is it giving you irrelevant search results, or unable to search all of your systems? It’s time you considered upgrading to the only enterprise search platform that:
- Securely indexes all of your on-premise and cloud-based source systems
- Provides easy-to-tune relevance and actionable analytics
- Delivers unified search to any application and device your teams use
If I read this correctly, I don’t need a GSA, an Index Engines, a Maxxcat, or an EPI Thunderstone. I can just pop Coveo into my shop and search my heart out.
How do I know?
Easy. The mid tier consulting firm Gartner has identified Coveo as “the most visionary leader” in enterprise search. I am not sure about the methods of non-blue chip consulting firms. I assume they are objective and on a par with the work of McKinsey, Bain, Booz, Allen, and Boston Consulting Group. I have heard that some mid tier firms take a slightly different approach to their analyses. I know first hand that one mid tier firm recycled my research and sold my work on Amazon without my permission. I don’t recall that happening when I worked at Booz, Allen, though. We paid third parties, entered into signed agreements, and were upfront about who knew what. Times change, of course.
Another message this weekend told me that Coveo had identified five major trends that, wait for it, “increase employee and customer proficiency in 2015.” I don’t mean to be more stupid than the others residing in my hollow in rural Kentucky, but what the heck is “customer proficiency”? What body of evidence supports these fascinating “trends”?
The trends are remarkable for me. I just completed CyberOSINT: Next Generation Information Access. The monograph will be available in early 2015 to active law enforcement, security, and intelligence professionals. If you qualify and want to get a copy, send an email to benkent2020 at yahoo dot com. I was curious to see if the outlook my research team assembled from our 12 months of research into the future of information access matched to Coveo’s trends.
The short answer is, “Not even close.”
Coveo focuses on “the ecosystem of record.” CyberOSINT focuses on automated collection and analytics. An “ecosystem of record” sounds like records management. In 2015 organizations need intelligence automatically discovered in third party, proprietary, and open source content, both historical and real time.
Coveo identifies “upskilling the end users.” In our work, the focus is on delivering to either a human or another system outputs that permit informed action. In many organizations, end users are being replaced by increasingly intelligent systems. That trend seems significant in the software delivered by the NGIA vendors whose technology we analyzed. (NGIA is shorthand for next generation information access.)
Coveo is concerned about a “competent customer.” That’s okay, but isn’t that about cost reduction? The idea is to get rid of expensive call center humans and replace them with NGIA systems. Our research suggests that automated systems are the future, or did I just point that out in the “upskilling” comment?
Coveo is mobile first. No disagreement there. The only hitch in the git along is that when one embraces mobile, there are some significant interface issues and predictive operations become more important. Therefore, in the NGIA arena, predictive outputs are where the trend runway lights are leading.
Coveo is confident that cloud indexes and their security will be solved. That is reassuring. However, the cloud as well as on premises’ solutions, including hybrid solutions, have to adopt predictive technology that automatically deals with certain threats, malware, violations, and internal staff propensities. The trend, therefore, is for OSINT centric systems that hook into operational and intel related functions as well as performing external scans from perimeter security devices.
What I find fascinating is that in the absence of effective marketing from vendors of traditional keyword search, providers of old school information access are embracing some concepts and themes that are orthogonal to a very significant trend in information access.
Coveo is obviously trying hard, experimenting with mid tier consulting firm endorsements, hitting the rubber chicken circuit, and cranking out truly stunning metaphors like the “customer proficiency” assertion.
The challenge for traditional keyword search firms is that NGIA systems have relegated traditional information access approaches to utility and commodity status. If one wants search, Elasticsearch works pretty well. NGIA systems deliver a different class of information access. NGIA vendors’ solutions are not perfect, but they are a welcome advance over the now four decades old approach to finding important items of information without the Model T approach of scanning a results list, opening and browsing possibly relevant documents, and then hunting for the item of information needed to answer an important question.
The trend, therefore, is NGIA. And it is an important shift to solutions whose cost can be measured. I wish Mike Lynch was driving the Autonomy marketing team again. I miss the “Black Hole of Information”, the “Portal in a Box,” and the Digital Reasoning Engine approach. Regardless of what one thinks about Autonomy, the company was a prescient marketer. If the Lynch infused Autonomy were around today, the moniker “NGIA” would be one that might capture Autonomy’s marketing love.
Stephen E Arnold, December 23, 2014
Enterprise Search Vendors: Avoid the Silliness of “All”
December 15, 2014
When enterprise search vendors say their systems index “all” of the information an organization has, the statement is silly, if not downright crazy. A good example of why one does not index “all” information on an organization’s computers appears in “13 Revelations from the Sony Hack.” The next time a search vendor runs the “all” spiel, point them to this write up. In addition to salary information that alleges my former neighbor Jennifer Lawrence gets less dough than some others in her recent film, do I need to know a Sony employee has a drinking problem and nuked his liver? Search works if the content is vetted. That means someone has to do an information inventory and make some decisions about what gets indexed and who may view what. “All”: one more reason why enterprise search has found that open source solutions are “good enough” for many prospects.
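To make the vetting point concrete, here is a minimal sketch in Python. It is not how any vendor named in this blog does it. The category labels, group names, and class names are made up for illustration. The idea is simply that an inventory decides which categories get indexed at all, and a per-document group list decides who may see a hit.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    category: str              # hypothetical labels such as "policy" or "hr"
    allowed_groups: frozenset  # groups permitted to view this document


@dataclass
class VettedIndex:
    # The outcome of the information inventory: categories cleared for indexing.
    indexable_categories: frozenset
    postings: dict = field(default_factory=dict)  # term -> set of doc_ids
    docs: dict = field(default_factory=dict)      # doc_id -> Document

    def add(self, doc: Document) -> bool:
        """Index the document only if its category passed the inventory."""
        if doc.category not in self.indexable_categories:
            return False
        self.docs[doc.doc_id] = doc
        for term in doc.text.lower().split():
            self.postings.setdefault(term, set()).add(doc.doc_id)
        return True

    def search(self, term: str, user_groups: set) -> list:
        """Return ids of matching documents the user is allowed to view."""
        hits = self.postings.get(term.lower(), set())
        return sorted(d for d in hits if self.docs[d].allowed_groups & user_groups)


index = VettedIndex(indexable_categories=frozenset({"policy", "press"}))
index.add(Document("d1", "Travel policy for staff", "policy", frozenset({"staff"})))
index.add(Document("d2", "Salary review for staff", "hr", frozenset({"hr"})))
index.add(Document("d3", "Press policy draft", "press", frozenset({"comms"})))
print(index.search("policy", {"staff"}))  # only d1: d3 matches but is not visible to staff
print(index.search("salary", {"hr"}))     # empty: the hr category was never indexed
```

The two checks are deliberately separate. Content that fails the inventory never reaches the index, so a leaky access rule cannot expose it later.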
Stephen E Arnold, December 15, 2014
Interview with Dave Hawking Offers Insight into Bing, FunnelBack and Enterprise Search
December 9, 2014
The article titled To Bing and Beyond on IDM provides an interview with Dave Hawking, an award-winner in the field of information retrieval and currently a Partner Architect for Bing. In the somewhat lengthy interview, Hawking answers questions on his own history, his work at Bing, natural language search, Watson, and Enterprise Search, among other things. At one point he describes how he arrived in the field of information retrieval after studying computer science at the Australian National University, where the first search engine he encountered was the library’s card catalogue. He says,
“I worked in a number of computer infrastructure support roles at ANU and by 1991 I was in charge of a couple of supercomputers…In order to do a good job of managing a large-scale parallel machine I thought I needed to write a parallel program so I built a kind of parallel grep… I wrote some papers about parallelising text retrieval on supercomputers but I pretty soon decided that text retrieval was more interesting.”
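For readers wondering what a “parallel grep” looks like in miniature, here is a rough Python sketch. It is not Hawking’s program, which the interview does not describe in detail. It only shows the general shape: split the text into chunks, scan the chunks concurrently, and merge the hits in order. The function names are made up. Threads are used to keep the example short; in CPython a process pool would be closer to real parallelism.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def grep_chunk(pattern, numbered_lines):
    """Scan one chunk; return (line_number, line) pairs that match."""
    regex = re.compile(pattern)
    return [(n, line) for n, line in numbered_lines if regex.search(line)]


def parallel_grep(pattern, lines, workers=4):
    """Split lines into chunks, scan the chunks concurrently, merge in order."""
    numbered = list(enumerate(lines, start=1))
    size = max(1, -(-len(numbered) // workers))  # ceiling division
    chunks = [numbered[i:i + size] for i in range(0, len(numbered), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves chunk order, so hits come back in line order.
        results = list(pool.map(lambda c: grep_chunk(pattern, c), chunks))
    return [hit for chunk_hits in results for hit in chunk_hits]


sample = ["alpha", "beta search", "gamma", "search engine", "delta"]
print(parallel_grep("search", sample, workers=2))
```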
When asked about the challenges of Enterprise Search, Hawking went into detail about the complications that arise due to the “diversity of repositories” as well as issues with access controls. Hawking’s work in search technology can’t be overstated, from his contributions to the Text Retrieval Conferences, CSIRO, and FunnelBack to his academic achievements.
Chelsea Kerwin, December 09, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Another Good Enough Challenge to Proprietary Enterprise Search
December 8, 2014
The protestations of the enterprise search vendors in hock for tens of millions to venture funders will get louder. The argument is that proprietary search solutions are just better.
Navigate to “Postgres Full-Text Search Is Good Enough!” This has been the mantra of some of the European Community academics for a number of years. I gave a talk at CeBIT a couple of years ago and noted that the proprietary vendors were struggling to deliver a coherent and compelling argument. Examples of too-much-chest-beating came from speakers representing and Exalead and a handful of consultants. See, for example, http://bit.ly/1zicaGw.
The point of the “Postgres Good Enough” article strikes me as:
Search has become an important feature and we’ve seen a big increase in the popularity of tools like Elasticsearch and SOLR which are both based on Lucene. They are great tools but before going down the road of Weapons of Mass Search, maybe what you need is something a bit lighter which is simply good enough! What do I mean by ‘good enough’? I mean a search engine with the following features: stemming, ranking/boost, multiple languages, fuzzy search, accent support. Luckily PostgreSQL supports all these features.
So not only are the proprietary systems dismissed, so are the open source solutions that are at the core of a number of commercialization ventures.
I don’t want to argue with the premise. What is important is that companies trying to market enterprise search solutions now have to convince a buyer why good enough is not good enough.
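The feature list in the quoted passage is easier to weigh with something concrete in hand. What follows is a toy Python sketch, not PostgreSQL and not its API. It uses only the standard library, every function name is made up, and the stemmer is deliberately crude. It covers four of the five features (stemming, ranking, fuzzy search, accent support) and skips multiple languages.

```python
import unicodedata
from collections import Counter
from difflib import get_close_matches

PUNCT = ".,;:!?\"'()"


def fold_accents(text):
    """Strip diacritics so 'café' and 'cafe' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def stem(word):
    """Very crude suffix stripping; a real stemmer is far more careful."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokens(text):
    """Lowercase, fold accents, trim punctuation, then stem each word."""
    cleaned = (w.strip(PUNCT) for w in fold_accents(text.lower()).split())
    return [stem(w) for w in cleaned if w]


def search(query, docs, fuzzy_cutoff=0.8):
    """Rank docs (id -> text) by how often they contain the query terms."""
    indexed = {doc_id: Counter(tokens(text)) for doc_id, text in docs.items()}
    vocabulary = sorted({t for counts in indexed.values() for t in counts})
    scores = Counter()
    for term in tokens(query):
        # Fuzzy step: fall back to the closest known term for a misspelling.
        if term in vocabulary:
            matches = [term]
        else:
            matches = get_close_matches(term, vocabulary, n=1, cutoff=fuzzy_cutoff)
        for match in matches:
            for doc_id, counts in indexed.items():
                scores[doc_id] += counts[match]
    return [doc_id for doc_id, score in scores.most_common() if score > 0]


docs = {
    "a": "Searching the café menu",
    "b": "Search engines rank search results",
    "c": "Nothing relevant here",
}
print(search("searches", docs))  # stemming maps the query onto "search"
print(search("cafe", docs))      # accent folding lets "cafe" find "café"
```

A few dozen lines get surprisingly far, which is the “good enough” argument in a nutshell. What the toy leaves out (language-aware stemming, persistent indexes, boosts, scale) is where a database or a search engine earns its keep.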
For decades, enterprise search vendors have been engaged in a Cold War style escalation. With each feature addition from one vendor (Autonomy), other vendors pile on more features (Endeca).
The result is that enterprise search tries to push value on customers, not delivering solutions that are valued by customers.
The “good enough” argument is one more example of a push back against the wild and crazy jumbles of code that most enterprise search vendors offer.
The good news is that good enough search is available, and it should be used. In fact, next generation information access solution vendors are including “good enough” search in robust enterprise applications.
What is interesting is that the venture funding firms seem content to move executives in and out of companies not hitting their numbers. Examples include Attivio and LucidWorks (really?). Other vendors are either really quiet or out of business like Dieselpoint and Hakia. I pointed out that the wild and crazy revenue targets for HP Autonomy and IBM Watson are examples of what happens when marketing takes precedence over what a system can do and how many customers are available to generate billions for these big outfits.
Attention needs to shift to “good enough” and to NGIA (next generation information access) vendors able to make sales, generate sustainable revenue, and solve problems that matter.
Displaying a results list is not high on the list of priorities for many organizations. And when search becomes job one, that is a signal the company may not have diagnosed its technological needs accurately. I know there are many mid tier consultants and unemployed webmasters who wish my statements were not accurate. Alas, reality can be a harsh task master or mistress.
Stephen E Arnold, December 8, 2014
Enterprise Search: Confusing Going to Weeds with Being Weeds
November 30, 2014
I seem to run into references to the write up by an “expert”. I know the person is an expert because the author says:
As an Enterprise Search expert, I get a lot of questions about Search and Information Architecture (IA).
The source of this remarkable personal characterization is “Prevent Enterprise Search from going to the Weeds.” Spoiler alert: I am on record as documenting that enterprise search is at a dead end, unpainted, unloved, and stuck on the margins of big time enterprise information applications. For details, read the free vendor profiles at www.xenky.com/vendor-profiles or, if you can find them, read one of my books such as The New Landscape of Search.
Okay. Let’s assume the person writing the Weeds’ article is an “expert”. The write up is about misconcepts [sic]; specifically, crazy ideas about what a 50 year plus old technology can do. The solution to misconceptions is “information architecture.” Now I am not sure what “search” means. But I have no solid hooks on which to hang the notion of “information architecture” in this era of cloud based services. Well, the explanation of information architecture is presented via a metaphor:
The key is to understand: IA and search are business processes, rather than one-time IT projects. They’re like gardening: It’s up to you if you want a nice and tidy garden — or an overgrown jungle.
Gentle reader, the fact that enterprise search has been confused with search engine optimization is one thing. The fact that there are a number of companies happily leapfrogging the purveyors of utilities to make SharePoint better or improve automatic indexing is another.
Let’s look at each of the “misconceptions” and ask, “Is search going to the weeds or is search itself weeds?”
The starting line for the write up is that no one needs to worry about information architecture because search “will do everything for us.” How are thoughts about plumbing and a utility function equivalent? The issue is not whether a system runs on premises, from the cloud, or in some hybrid set up. The question is, “What has to be provided to allow a person to do his or her job?” In most cases, delivering something that addresses the employee’s need is overlooked. The reason is that the problem is one that requires the attention of individuals who know budgets, know goals, and know technology options. The confluence of these three characteristics is quite rare in my experience. Many of the “experts” working enterprise search are either frustrated and somewhat insecure academics or individuals who bounced into a niche where the barriers to entry are a millimeter or two high.
Next there is a perception, asserts the “expert”, that search and information architecture are one time jobs. If one wants to win the confidence of a potential customer, explaining that the bills will just keep on coming is a tactic I have not used. I suppose it works, but the incredible turnover in organizations makes it easy for an unscrupulous person to just keep on billing. The high levels of dissatisfaction result from a number of problems. Pumping money into a failure is what prompted one French engineering company to buy a search system and sideline the incumbent. Endless meetings about how to set up enterprise systems are ones to which search “experts” are not invited. The information technology professionals have learned that search is not exactly a career building discipline. Furthermore, search “experts” are left out of meetings because information technology professionals have learned that a search system will consume every available resource and produce a steady flow of calls to the help desk. Figuring out what to build still occupies Google and Amazon. Few organizations are able to do much more than embrace the status quo and wait until a mid tier consultant, a cost consultant, or a competitor provides the stimulus to move. Search “experts” are, in my experience, on the outside of serious engineering work at many information access challenged organizations. That’s a good thing in my view.
The middle example is what the expert calls “one size fits all.” Yep, that was the pitch of some of the early search vendors. These folks packaged keyword search and promised that it would slice, dice, and chop. The reality is that even the next generation information access companies with which I work focus on making customization as painless as possible. In fact, these outfits provide some ready-to-roll components, but where the rubber meets the road is providing information tailored to each team or individual user. At Target last night, my wife and I bought Christmas gifts for needy people. One of the gifts was a 3X sweater. We had a heck of a time figuring out if the store offered such a product. Customization is necessary for more and more everyday situations. In organizations, customization is the name of the game. The companies pitching enterprise search today lag behind next generation information access providers in this very important functionality. The reason is that the companies lack the resources and insight needed to deliver. But what about information architecture? How does one cloud based search service differ from another? Can you explain the technical and cost and performance differences between SearchBlox and Datastax?
The penultimate point is just plain humorous: Search is easy. I agree that search is a difficult task. The point is that no one cares how hard it is. What users want are systems that facilitate their decision making or work. In this blog I reproduced a diagram showing one firm’s vision for indexing. Suffice it to say that few organizations know why that complexity is important. The vendor has to deliver a solution that fits the technical profile, the budget, and the needs of an organization. Here is the diagram. Draw your own conclusion:
The final point is poignant. Search, the “expert” says, can be a security leak. No, people are the security leak. There are systems that process open source intelligence and take predictive, automatic action to secure networks. If an individual wants to leak information, even today’s most robust predictive systems struggle to prevent that action. The most advanced offerings from Centripetal Networks and Zerofox are robust, but a determined individual can allow information to escape. What is wrong with search has to do with the way in which provided security components are implemented. Again we are back to people. Information architecture can play a role, but it is unlikely that an organization will treat search differently from legal information or employee pay data. There are classes of information to which individuals have access. The notion that a search system provides access to “all information” is laughable.
I want to step back from this “expert’s” analysis. Search has a long history. If we go back and look at what Fulcrum Technologies or Verity set out to do, the journeys of the two companies are quite instructive. Both moved quickly to wrap keyword search with a wide range of other functions. The reason for this was that customers needed more than search. Fulcrum is now part of OpenText, and you can buy nubbins of Fulcrum’s 30 year old technology today, but it is wrapped in huge wads of wool that comprise OpenText’s products and services. Verity offered some nifty security features and what happened? The company chewed through CEOs, became hugely bloated, struggled for revenues, and ended up as part of Autonomy. And what about Autonomy? HP is trying to answer that question.
Net net: This weeds write up seems to have a life of its own. For me, search is just weeds, clogging the garden of 21st century information access. The challenges are beyond search. Experts who conflate odd bits of jargon are the folks who contribute to confusion about why Lucene is just good enough so those in an organization concerned with results can focus on next generation information access providers.
Stephen E Arnold, November 30, 2014