Insight Engines Are the Next Enterprise Upgrade

September 6, 2017

When one buzzword loses its luster, marketing teams do their best to create the next term to stay on top of their competition. When it comes to search, the newest buzzword appears to be “insight engine.” Mindbreeze is the top-selling insight engine, according to its Web site and the recent blog post, “The Global Insight Engine Market: European Solution Scores Top Position.” The post makes the point that quickly retrieving answers to complicated problems is a necessity, but regular enterprise search engines cannot crawl unstructured information.

Insight engines are the next buzzword and also the next generation of enterprise search engines, but what exactly do they do?

This is where so-called insight engines come into play. They interpret unstructured and structured data using semantic analysis, and prepare it for further use. Search results are improved and returned in a structured format. Of course, insight engines don’t just process unstructured information, but also all other existing company information. The connection to the individual data sources is made through so-called connectors. Another feature of insight engines is that search queries can be formulated in natural language. The intelligent tools interpret the query and provide the relevant corresponding search results.

Gartner recently ranked the global insight engines market (they have their own market separate from other search engines?) and Mindbreeze ranks at the top of all the engines in the “challenger” category. What makes this a headliner is that Mindbreeze competed against IBM and HP. Mindbreeze then brags about its features: less than 90 days to integrate into a system, more out-of-the-box solutions for data connectors than other vendors, and greater popularity now that Google has withdrawn from the market.

Since this was published on Mindbreeze’s own blog, of course, it is a publicity piece.  In an objective test, how would Mindbreeze compete against Europe’s other engine, Elasticsearch?

Whitney Grace, September 6, 2017

Lucidworks: The Future of Search Which Has Already Arrived

August 24, 2017

I am pushing 74, but I am interested in the future of search. The reason is that with each passing day I find it more and more difficult to locate the information I need for my routine research for my books and other work. I was anticipating a juicy read when I requested a copy of “Enterprise Search in 2025.” The “book” is a nine-page PDF. After two years of effort and much research, my team and I were able to squeeze the basics of Dark Web investigative techniques into about 200 pages. I assumed that a nine-page book would deliver a high-impact payload comparable to one of the chapters in one of my books like CyberOSINT or Dark Web Notebook.

I was surprised that a nine-page document was described as a “book.” I was quite surprised by Lucidworks’ description of the future. For me, Lucidworks is describing information access already available to me and most companies from established vendors.

The book’s main idea in my opinion is as understandable as this unlabeled, data-free graphic which introduces the text content assembled by Lucidworks.

image

However, the pamphlet’s text does not make this diagram understandable to me. I noted these points as I worked through the basic argument that client server search is on the downturn. Okay. I think I understand, but the assertion “Solr killed the client-server stars” was interesting. I read this statement and highlighted it:

Other solutions developed, but the Solr ecosystem became the unmatched winner of the search market. Search 1.0 was over and Solr won.

In the world of open source search, Lucene and Solr have gained adherents. Based on the information my team gathered when we were working on an IDC open source search project, the dominant open source search system was Lucene. If our data were accurate when we did the research, Elastic’s Elasticsearch had emerged as the go-to open source search system. The alternatives like Solr and Flaxsearch have their users and supporters, but Elasticsearch, created by Shay Banon, was a definite step up from his earlier search service called Compass.

In the span of two and a half years, Elastic had garnered more than $100 million in funding by 2014 and expanded into a number of adjacent information access market sectors. Reports I have received from those attending Elastic meetings were that Elastic was putting considerable pressure on proprietary search systems and a bit of a squeeze on Lucidworks. Google’s withdrawing its odd duck Google Search Appliance may have been, in small part, due to the rise of Elasticsearch and the changes made by organizations trying to figure out how to make sense of the digital information to which their staff had access.

But enough about the Lucene-Solr and open source versus proprietary search yin and yang tension.

Read more

Docurated Expands Salesforce to Broaden Search

August 18, 2017

Enterprise search is evolving to make the user experience easier as demand grows for everyday use by company employees who are not deemed ‘data analysts.’ One company slowly making a name for itself by providing such a service is Docurated.

CMSWire explains the company’s new federated search within Salesforce as follows:

…both sides win with this solution. By delivering content through the native search bar in Salesforce.com — the most used feature of the platform — marketing gets to use the most trafficked channel to drive content consumption, while sales receives content in context…Its Content Cloud uses a combination of inputs and analytics about the effectiveness of content, combined with powerful search, to retrieve relevant content…It fully integrates with all existing cloud and on-premises content repositories and tracks versions of content, sharing only the latest and most accurate version within the organization.

We’re seeing this trend continue to grow with more search vendors making the search process more user-friendly and able to work in multiple functions and across applications. While Google is going ad-happy with their user experience, most search companies are realizing Google had the right idea in the beginning and are making strides to duplicate it within enterprise search.

Catherine Lamsfuss, August 18, 2017

New Enterprise Search Market Study

August 1, 2017

Don Quixote and Solving Death: No Problem, Amigo

I read “Global Enterprise Search Market 2017-2022.” I was surprised that a consulting firm would invest time and energy in writing about a market sector which has not been thriving. Now don’t start sending me email about my lack of cheerfulness about enterprise search. The sector is thriving, but it is doing so with approaches that are disguised as applications which deliver something other than inflated expectations, business closures, and lawsuits.

Image: Don Quixote

I will slay the beast that is enterprise search. “Hold still, you knave!”

First, let’s look at what the report covers, then I will tackle some of the issues about which I think as the author of the Enterprise Search Report and a number of search-related articles and analyses. (The articles are available from the estimable Information Today Web site, and the free analyses may be located at www.xenky.com/vendor-profiles.)

The write up told me that enterprise search boils down to these companies:

Coveo Corp
Dassault Systemes
IBM Corp
Microsoft
Oracle
SAP AG

Coveo is a fork of Copernic. Yep, it’s a proprietary system which originally was focused on providing search for Microsoft. Now the company has spread its wings to include a raft of functions which range from the cloud to customer support / help desk services.

Dassault Systèmes is the owner of Exalead. Since the acquisition, Exalead as a brand has faded. The desktop search system was killed, and its proprietary technology lives on mostly as a replacement for Dassault’s internal search system which was based on Autonomy. Most of the search wizards have left, but the Exalead technology was good before Dassault learned that selling search was indeed a challenge.

IBM offers a number of products which include open source Lucene, acquired technology like Vivisimo’s clustering engine, and home brew code from its IBM wizards. (Did you know that the precursor of PageRank was an IBM “invention”?) The key is that IBM uses search to sell services which have higher margins than providing a free version of brute force information access.

Read more

A Potentially Useful List of Enterprise Search Engine Servers

July 20, 2017

We found a remarkable list at Predictive Analytics Today—“Top 23 Enterprise Search Engine Servers.” The write-up introduces its roster of resources:

Enterprise Search is the search information within an enterprise, searching of content from multiple enterprise-type sources, such as databases and intranets. These search systems index data and documents from a variety of sources including file systems, intranets, document management systems, e-mail, and databases. Enterprise search systems also integrate structured and unstructured data in their collections and also use access controls to enforce a security policy on their users.

Entries are logically presented under two categories, proprietary solutions and open source software. From Algolia to Xapian, the article summarizes pros and cons of each. See the post for details.

However, we have a few notes to add about some particular platforms. For example, the Google Search Appliance has been discontinued, though Constellio is still going… in Canada. SearchBlox is now based on Elasticsearch, and SRCH2 was originally designed for mobile searches. Also, isn’t Sphinx Search specifically for SQL data? Hmm. We suggest this list could make a good springboard, but server shoppers should take its specifics with a grain of salt, and be sure to do their own follow-up research.

Cynthia Murrell, July 20, 2017

Big Data in Biomedical

July 19, 2017

The biomedical field, which is replete with unstructured data, is all set to take a giant leap towards standardization with the Biological Text Mining Unit.

According to PHYS.ORG, in a peer-reviewed article titled “Researchers Review the State-Of-The-Art Text Mining Technologies for Chemistry,” the author states:

Being able to transform unstructured biomedical research data into structured databases that can be more efficiently processed by machines or queried by humans is critical for a range of heterogeneous applications.

Scientific data has a fixed vocabulary, which makes standardization and indexing easier. However, most big names in Big Data and enterprise search are concentrating their efforts on e-commerce.

Hundreds of new compounds are discovered every year. If the data pertaining to these compounds is made available to other researchers, advancements in this field will be very rapid. The major hurdle is that the data is in an unstructured format, which the Biological Text Mining Unit standards intend to overcome.

Vishal Ingole, July 19, 2017

Elastic Search Redefining Enterprise Search Landscape

May 24, 2017

The open source enterprise search engine Elasticsearch is changing the way large IT enterprises enable their users to search for relevant data in a seamless manner.

Apiumhub, in an in-depth report titled “Elastic Search; Advantages, Case Studies & Books,” says:

Elastic search is able to achieve fast search responses because, instead of searching the text directly, it searches an index instead. This is more or less like searching for a keyword by scanning the index at the back of a book, as opposed to searching every word of every page of the book.

The search engine is easily scalable and can accommodate petabytes of data on multiple servers in a short time. Because it is based on Lucene, developers also find it easy to work with. Even if the keywords are misspelled, the search engine will correct the error and deliver accurate results.
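The index-versus-scan idea in the quoted passage, along with the tolerance for misspelled keywords, can be sketched in a few lines of Python. This is a toy illustration only and not how Elasticsearch or Lucene is actually implemented: the function names `build_index` and `search` are hypothetical, the sample documents are invented for the example, and the misspelling handling uses the standard library's `difflib` similarity matching as a simple stand-in for a real fuzzy query.

```python
import difflib
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            # Drop surrounding punctuation so "sources." and "sources" match.
            index[token.strip(".,;:!?\"'")].add(doc_id)
    return index


def search(index, term, fuzzy=True):
    """Look the term up in the index instead of scanning every document."""
    term = term.lower()
    if term in index:
        return sorted(index[term])
    if fuzzy:
        # Hypothetical stand-in for fuzzy matching: pick the closest known term.
        close = difflib.get_close_matches(term, list(index.keys()), n=1, cutoff=0.8)
        if close:
            return sorted(index[close[0]])
    return []


# Invented sample documents for illustration.
docs = {
    1: "Enterprise search indexes documents from many sources.",
    2: "An index at the back of a book lists keywords.",
    3: "Connectors link data sources to the search engine.",
}
index = build_index(docs)
print(search(index, "index"))      # [2]
print(search(index, "conectors"))  # [3], matched via the close term "connectors"
```

The lookup cost depends on the number of distinct terms rather than the total length of the documents, which is the point of the book-index analogy. A production engine adds tokenization, stemming, relevance ranking, and distribution across servers, none of which this sketch attempts.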

At present, large organizations like Tesco, Wikipedia, Facebook, LinkedIn and Salesforce have already deployed the enterprise search engine across their servers. With the advent of voice-based search, Elasticsearch’s capabilities will be in more demand in the near future, experts say.

Vishal Ingole, May 24, 2017

Swiftype Launches SaaS Enterprise Search Platform

May 10, 2017

While AI is a hot commodity, enterprise search has been more of a disappointment. That is why we are surprised by one company’s confidence in the search market—KMWorld shares, “Swiftype Launches AI-Powered Content Discovery Engine for Enterprise Users.” This integration of AI into enterprise search is the firm’s first (formal) venture into cloud services. Writer Joyce Wells tells us:

With a single search, the company says, a user can locate information across accounts in Salesforce, files on Dropbox, documents in Google G Suite or Office 365, information from internal databases, and conversation threads on Gmail. Swiftype also integrates directly into apps such as Salesforce and Confluence to allow users to search and find content across all of these services without disturbing their existing workflows.

According to the vendor, the platform provides Swiftype AI-powered search applications built natively for mobile, desktop, and web browsers, as well as additional workflow integrations that allow users to search all their data from the applications they already use. There is also a Connector Framework to help quickly connect cloud-based platforms.

So far, Swiftype has integrated the platforms of Google, Microsoft, Salesforce, Atlassian, and Zendesk into its product. We also learn the company’s AI platform, dubbed Enterprise Knowledge Graph, will take into account calendar events, email content, and user behavior as it crafts analyses. Launched in 2012, Swiftype is based in San Francisco.

Cynthia Murrell, May 10, 2017

You Do Not Search. You Insight.

April 12, 2017

I am delighted, thrilled. I read “Coveo, Microsoft, Sinequa Lead Insight Engine Market.” What a transformation is captured in what looks to me like a content marketing write up. Key word search morphs into “insight.” For folks who do not follow the history of enterprise search with the fanaticism of those involved in baseball statistics, the use of the word “insight” to describe locating a document is irrelevant. Do you search or insight?

For me, hunkered down in rural Kentucky, with my monitors flickering in the intellectual darkness of Kentucky, the use of the word “insight” is a linguistic singularity. Maybe not on the scale of an earthquake in Italy or a banker leaping from his apartment to the Manhattan asphalt, but a historical moment nevertheless.

Let me recap some of my perceptions of the three companies mentioned in the headline to this tsunami of jargon in the Datanami story:

  • Coveo is a company which developed a search and retrieval system focused on Windows. With some marketing magic, the company explained keyword search as customer support, then Big Data, and now this new thing, “insight”. For those who track vendor history, the roots of Coveo reach back to a consumer interface which was designed to make search easy. Remember Copernic? Yep, Coveo has been around a long while.
  • Sinequa also was a search vendor. Like Exalead and Polyspot and other French search vendors, the company wanted to manage data, provide federation, and enable workflows. After a president change and some executive shuffling, Sinequa emerged as a Big Data outfit with a core competency in analytics. Quite a change. How similar is Sinequa to enterprise search? Pretty similar.
  • Microsoft. I enjoyed the “saved by the bell” deal in 2008 which delivered the “work in progress” Fast Search & Transfer enterprise search system to Redmond. Fast Search was one of the first search vendors to combine fast-flying jargon with a bit of sales magic. Despite the financial meltdown and an investigation of the Fast Search financials, Microsoft ponied up $1.2 billion and reinvented SharePoint search. Well, not exactly reinvented, but SharePoint is a giant hairball of content management, collaboration, business “intelligence” and, of course, search. Here’s a user friendly chart to help you grasp SharePoint search.

image

Flash forward to this Datanami article and what do I learn? Here’s a paragraph I noted with a smiley face and an exclamation point:

Among the areas where natural language processing is making inroads is so-called “insight engines” that are projected to account for half of analytic queries by 2019. Indeed, enterprise search is being supplanted by voice and automated voice commands, according to Gartner Inc. The market analyst released it latest “Magic Quadrant” rankings in late March that include a trio of “market leaders” along with a growing list of challengers that includes established vendors moving into the nascent market along with a batch of dedicated startups.

There you go. A trio like ZZTop with number one hits? Hardly. A consulting firm’s “magic” plucks these three companies from a chicken farm and gives each a blue ribbon. Even though we have chickens in our backyard, I cannot tell one from another. Subjectivity, not objectivity, applies to picking good chickens, and it seems to be what New York consulting firms do too.

Are the “scores” for the objective evaluations based on company revenue? No.

Return on investment? No.

Patents? No.

IRR? No. No. No.

Number of flagship customers like Amazon, Apple, and Google type companies? No.

The ranking is based on “vision.” And another key factor is “the ability to execute” its “strategy.” There you go. A vision is what I want to help me make my way through Kabul. I need a strategy beyond stay alive.

What would I do if I have to index content in an enterprise? My answer may surprise you. I would take out my check book and license these systems.

  1. Palantir Technologies or Centrifuge Systems
  2. Bitext’s Deep Linguistic Analysis platform
  3. Recorded Future.

With these three systems I would have:

  1. The ability to locate an entity, concept, event, or document
  2. The capability to process content in more than 40 languages, perform subject verb object parsing and entity extraction in near real time
  3. Point-and-click predictive analytics
  4. Point-and-click visualization for financial, business, and military warfighting actions
  5. Numerous programming hooks for integrating other nifty things that I need to achieve an objective such as IBM’s Cybertap capability.

Why is there a logical and factual disconnect between what I would do to deliver real world, high value outputs to my employees and what the New York-Datanami folks recommend?

Well, “disconnect” may not be the right word. Have some search vendors and third party experts embraced the concept of “fake news” or embraced the know how explained in Propaganda, Father Ellul’s important book? Is the idea something along the lines of “we just say anything and people will believe our software will work this way”?

Many vendors stick reasonably close to the factual performance of their software and systems. Let me highlight three examples.

First, Darktrace, a company crafted by Dr. Michael Lynch, is a stickler for explaining what the smart software does. In a recent exchange with Darktrace, I learned that Darktrace’s senior staff bristle when a descriptive write up strays from the actual, verified technical functions of the software system. Anyone who has worked with Dr. Lynch and his senior managers knows that these people can be very persuasive. But when it comes to Darktrace, it is “facts R us”, thank you.

Second, Recorded Future takes a similar hard stand when explaining what the Recorded Future system can and cannot do. Anyone who suggests that Recorded Future predictive analytics can identify the winner of the Kentucky Derby a day before the race will be disabused of that notion by Recorded Future’s engineers. Accuracy is the name of the game at Recorded Future, but accuracy relates to the use of numerical recipes to identify likely events and assign a probability to some events. Even though the company deals with statistical probabilities, adding marketing spice to the predictive system’s capabilities is a no-go zone.

Third, Bitext, the company that offers a Deep Linguistic Analysis platform to improve the performance of a range of artificial intelligence functions, is anchored in facts. On a recent trip to Spain, we interviewed a number of the senior developers at this company and learned that Bitext software works. Furthermore, the professionals are enthusiastic about working for this linguistics-centric outfit because it avoids marketing hyperbole. “Our system works,” said one computational linguist. This person added, “We do magic with computational linguistics and deep linguistic analysis.” I like that: magic. Oh, Bitext does sales too with the likes of Porsche, Volkswagen, and the world’s leading vendor of mobile systems and services, among others. And from Madrid, Spain, no less. And without marketing hyperbole.

Why then are companies based on keyword indexing with a sprinkle of semantics and basic math repositioning themselves by chasing each new spun sugar-encrusted trend?

I have given a tiny bit of thought to this question.

In my monograph “The New Landscape of Search” I made the point that search had become devalued, a free download in open source repositories, and a utility like cat or dir. Most enterprise search systems have failed to deliver results painted in Technicolor in sales presentations and marketing collateral.

Today, if I want search and retrieval, I just use Lucene. In fact, Lucene is more than good enough; it is comparable to most proprietary enterprise search systems. If I need support, I can ring up Elastic or one of many vendors eager to gild the open source lily.

The extreme value and reliability of open source search and retrieval software has, in my opinion, gutted the market for proprietary search and retrieval software. The financial follies of Fast Search & Transfer reminded some investors of the costly failures of Convera, Delphes, Entopia, among others I documented on my Xenky.com site at this link.

Recently most of the news I see on my coal fired computer in Harrod’s Creek about enterprise search has been about repositioning, not innovation. What’s up?

The answer seems to be that the cherished myth was that enterprise search was the one, true way to make sense of digital information. What many organizations learned was that good enough search does the basic blocking and tackling of finding a document but precious little else without massive infusions of time, effort, and resources.

But do enterprise search systems–no matter how many sparkly buzzwords–work? Not too many, no matter what publicly traded consulting firms tell me to believe.

Snake oil? I don’t know. I just know my own experience, and after 45 years of trying to make digital information findable, I avoid fast talkers with covered wagons adorned with slogans.

Image: 20th century snake oil salesman

What happens when an enterprise search system is fed videos, podcasts, telephone intercepts, flows of GPS data, and a couple of proprietary file formats?

Answer: Not much.

The search system has to be equipped with extra-cost connectors, assorted oddments, and shimware to deal with a recorded webinar and a companion deck of PowerPoint slides used by the corporate speaker.

What happens when the content stream includes email and documents in six, 12, or 24 different languages?

Answer: Mad scrambling until the proud licensee of an enterprise search system can locate a vendor able to support multiple language inputs. The real life needs of an enterprise are often different from what the proprietary enterprise search system can deal with.

That’s why I find the repositioning of enterprise search technology a bit like a clown with a sad face. The clown is no longer funny. The unconvincing efforts to become something else clash with the sad face, the red nose, and worn shoes still popular in Harrod’s Creek, Kentucky.

Image: Emmett Kelly

When it comes to enterprise search, my litmus test is simple: If a system is keyword centric, it isn’t going to work for some of the real world applications I have encountered.

Oh, and don’t believe me, please.

Find a US special operations professional who relies on Palantir Gotham or IBM Analyst’s Notebook to determine a route through a hostile area. Ask whether a keyword search system or Palantir is more useful. Listen carefully to the answer.

No matter what keyword enthusiasts and quasi-slick New York consultants assert, enterprise search systems are not well suited for a great many real world applications. Heck, enterprise search often has trouble displaying documents which match the user’s query.

And why? Sluggish index updating, lousy indexing, wonky metadata, flawed set up, updates that kill a system, or interfaces that baffle users.

Personally I love to browse results lists. I like old fashioned high school type research too. I like to open documents and Easter egg hunt my way to a document that answers my question. But I am in the minority. Most users expect their finding systems to work without the query-read-click-scan-read-scan-read-scan Sisyphus-emulating slog.

Image: Sisyphus

Ah, you are thinking I have offered no court admissible evidence to support my argument, right? Well, just license a proprietary enterprise search system and let me know how your career is progressing. Remember, when you look for a new job, you won’t search; you will insight.

Stephen E Arnold, April 12, 2017

Attivio Takes on SCOLA Repository

March 16, 2017

We noticed that Attivio is back to enterprise search, and now uses the fetching catchphrase, “data dexterity company.” Their News page announces, “Attivio Chosen as Enterprise Search Platform for World’s Largest Repository of Foreign Language Media.” We’ve been keeping an eye on Attivio as it grows. With this press release, Attivio touts a large, recent feather in their cap—providing enterprise search services to SCOLA, a non-profit dedicated to helping different peoples around the world learn about each other. This tool enables SCOLA’s subscribers to find any content in any language, we’re told. The organization regards today’s information technology as crucial to their efforts. The write-up explains: 

SCOLA provides a wide range of online language learning services, including international TV programming, videos, radio, and newspapers in over 200 native languages, via a secure browser-based application. At 85 terabytes, it houses the largest repository of foreign language media in the world. With its users asking for an easier way to find and categorize this information, SCOLA chose Attivio Enterprise Search to act as the primary access point for information through the web portal. This enables users, including teachers and consumers, to enter a single keyword and find information across all formats, languages and geographical regions in a matter of seconds. After looking at several options, SCOLA chose Attivio Enterprise Search because of its multi-language support and ease of customization. ‘When you have 84,000 videos in 200 languages, trying to find the right content for a themed lesson is overwhelming,’ said Maggie Artus, project manager at SCOLA. ‘With the Attivio search function, the user only sees instant results. The behind-the-scenes processing complexity is completely hidden.’

Attivio was founded in 2007, and is headquartered in Newton, Massachusetts. The company’s client roster includes prominent organizations like UBS, Cisco, Citi, and DARPA. They are also hiring for several positions as of this writing.

Cynthia Murrell, March 16, 2017
