Big Data Continues to Draw Investors

November 16, 2012

Big Data is big news these days, and it is increasingly big business. Several Big Data companies, especially those founded on open source technologies, are raising considerable amounts of capital. Elasticsearch appears to have joined their ranks. Read the full report in the TechCrunch article, “Big Data Search And Analytics Startup Elasticsearch Raises $10M From Benchmark.”

The article states:

Elasticsearch, a real-time big data search and analytics startup, has raised $10 million in Series A financing led by Benchmark Capital. The other investors in the round include Rod Johnson, the creator of Spring and co-founder of SpringSource, and Data Collective . . . The company will use this funding to expand into new geographic regions and for further product development.

Open source is a good investment, and Big Data is definitely on the rise. However, Elasticsearch has yet to be proven in the industry. The article quotes 1.5 million downloads since 2009, but we received a piece just last week that leads us to believe the numbers may be inflated. “Are Elasticsearch Commits Lopsided?” discusses the questionable counting of downloads, but also the number of actual committers to the open source project. LucidWorks is an industry standard in open source search, so its natural extension into Big Data technology is a good fit and a trusted alternative to Elasticsearch.

Emily Rae Aldridge, November 16, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Some Oracle Secrets Revealed

November 16, 2012

We’ve found a resource for Oracle users at the Independent Oracle UCM Knowledge Center: “Secrets of the Full Text Search.” This clearly written and illustrated article explores the details of Oracle Content Server’s full text indexing. This might be one to peruse now, then tuck away for future reference. Writer Dmitri Khanine explains:

“Spend 15 min to understand exactly how Content Server’ Full Text search is working!

“This article takes you behind the scenes and shows you exactly how the full text indexing works in Oracle Content Server. If you ever tried to troubleshoot your search, indexer, batch loader or a performance issue – without a full understanding of how the things really work under the hood – I don’t have to tell you how much time this article can really save you. So without any further due – here it comes.”

And with that, Khanine dives into the technical details, walk-through style. Once you have your full text search enabled and options configured, he takes you through creating and working with a PDF.

I will share the one point Khanine saw fit to emphasize with a paragraph full of italics—since only the latest revision of any document is stored in the IdcColx tables, full text searches are only done on the latest released revision. See the article for more such technical tidbits.

Cynthia Murrell, November 16, 2012

ElasticSearch Analysis Published by IDC

November 15, 2012

Short honk: The ArnoldIT team worked with IDC to produce “ElasticSearch: An Open Source Search Option for Big Data.” The write up discusses the origins of the company. Compass was the precursor of ElasticSearch. My understanding is that Compass was built on Lucene. The new incarnation is built on Solr. There is a discussion of ElasticSearch’s enhancements to the Solr system. The most significant part of the report is the explanation of the advantages and disadvantages of the ElasticSearch approach. The analysis is not written from the developers’ point of view. The focus is on the business value of ElasticSearch in the highly volatile, increasingly crowded market for search systems based on open source technology. Already published in the multi-part research series are analyses of Attivio, LucidWorks, and PolySpot. Unlike the cheerleading on free blogs and developer forums, the IDC analyses cost $3,500 per report. IDC customers have access to the analyses, but should check with their IDC account manager to determine if access is permitted under their subscription plan.

Stephen E Arnold, November 15, 2012

Sponsored by ArnoldIT. Watch for our new professional social media consulting services coming in 2013

An ElasticSearch Feature Comparison: Where Is the Beef?

November 15, 2012

There is an interesting but somewhat incomplete “feature comparison” between Solr and ElasticSearch. ElasticSearch, as you may know, is the new $10 million darling of the search world. Well, maybe Attivio with $42 million or Palantir with $150 million is “darlinger”?

You can find the write up at “Apache Solr vs ElasticSearch.” I want to point out that the comments to the basic information are quite useful. Among the points included in the comments which I found helpful were:

  • The notion of dynamic fields, field copying via multi-fields, and alternative query parsers
  • A reference to DataStax, Cassandra, and Solr
  • A suggestion that an eZ Publish reference be added.

However, I want to point out that in our analysis of ElasticSearch, there is one big factor not embraced by a feature list. Organizations want a system which is easy to install, maintain, and extend. Cost is a big deal, but when one factors in the costs associated with start up companies, there may be less predictability than with more established open source vendors such as Attivio, IBM, LucidWorks, and others.

As a side note, for the publisher of the first three editions of the Enterprise Search Report, which I wrote, I had to produce nearly 20 feature charts. Guess what? Most of the feature charts were identical on the main points. The differences were of great technical importance to developers at the vendors’ firms. However, to the companies licensing software, the decisive factors were usually based on business considerations; for example:

  • Customer live demos and references from these customers
  • Pricing including support and training
  • Business stability
  • Engineering depth of the vendor
  • Financial performance over time
  • Management experience.

The fact that one vendor’s approach to k-means was “faster”, the metatagging system was “self learning”, or that another vendor’s system could index 10 gigabytes of content in X time slices was often irrelevant at decision time. Maybe open source search will be different, but right now, the open source world is on a vector that leads to the same business models which the traditional proprietary software vendors used with varying degrees of success.

In my view, a company in growth mode is juggling many balls at once and riding a unicycle. Consequently, the marketing and developer hyperbole may distract from the pure business considerations which garnered Attivio four times the funding that ElasticSearch obtained. The downside is that Attivio has to generate sufficient revenue to hit financial targets. Some financial types want five, 10, or 17 times the investment. I am too old and frail for that type of pressure. Even a $10 million cash infusion works out to $50, $100, or $170 million in revenues.

Only a handful of the 50 search vendors I track have revenues in shouting distance of $50 million. On a call with some MBA types last week, I learned that blowing past the revenues of Autonomy, Endeca, and Fast Search before their sale or implosion was a “no brainer.”

I am not so sure. Building and sustaining revenue is more than a feature punch list. The real challenge is building and sustaining a business. Look at the present situation for HP Autonomy. Fast Search is, in my opinion, an end of life product. Endeca is an “all things to all people” solution. Endeca is darned good at eCommerce and processing certain types of data sets.

Open source software is important. Open source search is important too. What is more important is the constellation of factors that make “free” software into a viable commercial product which delivers a return to its funding sources. Will the open source community cheerlead when the VCs force the innovators who took those millions to produce a hefty profit? More than marketing and feature lists are needed. Just my opinion.

You can purchase the ElasticSearch analysis at this link for $3,500. Why so much? IDC has to generate revenue and return a profit. My hunch is that this is a fact of economic life that some open source code surfers do not yet hug and cuddle every hour or two.

Stephen E Arnold, November 14, 2012

Leading Austrian IT News Portal Adopts Mindbreeze InSite Solution

November 15, 2012

Monitor.at is Austria’s leading IT news portal for small and medium businesses. The organization recently added the Mindbreeze InSite search solution to their Web site. This integrated cloud solution helps site visitors quickly and efficiently find important and relevant facts. Details of the InSite adoption can be read in the article, “Monitor.at with New Site Search.” The author includes this comment from Monitor’s editor:

“With over 13,500 products on monitor.at, it is not always easy for visitors to find the desired information. By integrating Mindbreeze InSite, we are offering our visitors a convenient feature for finding that information quickly and easily. In addition, Mindbreeze InSite makes our work easier: posts on the top topics appear automatically, based on search,” explains Ing. Markus Klaus Eder, editor of Monitor.

Monitor.at is a good example of a major Web site that has incorporated a powerful search solution for improved site experience. Growing Web traffic and retaining site visitors are increasingly becoming major avenues for business success, as a Web site is often the first customer interaction with a business. The power of semantic search in addition to relevant content is necessary for gaining and retaining an audience. A powerful search system that can make connections among vast amounts of data can also help deliver a better search experience for the user. InSite is capable of searching a wide variety of specific documents, including PDFs, Excel sheets, and Word documents, as well as searching social media sites and Web sites. Consider the free trial to see if the Mindbreeze solution works for you.

Philip West, November 15, 2012

SLI Systems Helps Stanfords Increase Conversion

November 15, 2012

SLI Systems has generated a conversion improvement, we learn from their press release, “Stanfords Creates 3.5X Improvement in Conversion Rate and 3X Higher Per-Visit Value with SLI Systems Site Search.” The write up tells us:

“Stanfords, the UK’s leading specialist retailer of maps, travel books, and travel accessories, is seeing a conversion rate for site search users that is 3.5 times the rate for non-site search users after implementing Learning Search from SLI Systems. In addition, per-visit value for visitors who use site search is three times higher than per-visit values for visitors who don’t use search. Stanfords chose SLI’s customizable refinements and learning-based approach to replace the site search built into its e-commerce platform from Exact Abacus.”

Interesting metric. Could there be something about users who don’t use site search that predisposes them to not buy?

Stanfords’ e-commerce manager Joanna Lawton explained that the recent expansion into travel-related products prompted the move. She is happy with the increased relevance of her company’s results pages, as well as with the system’s intuitive user tools, she said.

SLI Systems supplies tools for site search, navigation, merchandising, and search engine optimization. They boast that their technology ‘learns’ from the behavior of visitors over time, resulting in more relevant results. The privately held company has offices in the US, the UK, Australia, and New Zealand.

Cynthia Murrell, November 15, 2012

New KnowledgeLake Capture Features Announced

November 14, 2012

KnowledgeLake, Inc., headquartered in St. Louis, develops document imaging related products and solutions for Microsoft SharePoint. In the press release, “KnowledgeLake Continues to Advance the Capabilities of Capture Solution for Microsoft SharePoint,” some new KnowledgeLake capabilities are announced. The author states:

Tightly integrated with Microsoft SharePoint, KnowledgeLake Capture enables end users to easily scan and index documents and store them in the appropriate SharePoint repository. A few of the new developments added to the robust solution include the ability to scan and index documents faster, scan multiple batches at a time, added language support and advances to prioritization functionality.

This is also added about Capture capabilities:

Capture’s sophisticated Batch Processing and Monitoring, allow for multiple documents of many different types to be scanned, viewed and indexed efficiently.

It seems like KnowledgeLake is making a few strides in SharePoint solution development. But when it comes to extending SharePoint capabilities, you may want to consider industry leaders, like Mindbreeze, that provide more than just SharePoint tailored services and have already been offering robust document indexing capabilities. Fabasoft Mindbreeze provides comprehensive access to business knowledge for everyone on the team and is backed by a customer focused support team that shares your purpose. The Microsoft SharePoint and Exchange Connectors facilitate comprehensive incorporation of all your electronic data repositories.

Philip West, November 14, 2012

Liferay Launches Operations in Japan

November 14, 2012

Liferay, a provider of an enterprise class open source portal platform, is expanding its operations into Japan. Liferay is enjoying the momentum being built by open source solutions in general, as organizations see the benefits of open source for the enterprise without a hefty price tag. PRWeb covers the full press release in the article, “Liferay Launches Japan Operations.”

The author gets the input of the Liferay CEO:

“‘Organizations around the world are embracing the innovation, business agility, and lower cost of ownership benefits of open source,’ said Liferay CEO Bryan Cheung, who will be giving the keynote at the launch event. ‘We’re excited to make new connections and explore how Liferay can help solve the quickly evolving business challenges of enterprise users in the Asia Pacific region.’”

Open source does boost innovation and agility all while lowering costs. However, many open source solutions are not viable options for smaller organizations because of the high developer needs for customization and implementation. LucidWorks is one open source solution provider for the enterprise that ensures that solutions are ready to go out-of-the-box. LucidWorks Search is commercial grade and fully backed by the industry-trusted support of the LucidWorks team. Open source will continue to make headlines for its innovative advancements; however, when it comes to a practical open source solution, LucidWorks cannot be beat.

Emily Rae Aldridge, November 14, 2012

ElasticSearch: Was Google Right about Simplicity?

November 13, 2012

When the Google Search Appliance became available nine or 10 years ago, I was the victim of a Google briefing. The eager Googler showed me the functions of the original Google Search Appliance. I was not impressed. As I wrote in the Google Legacy, the GSA was a “good start” and showed promise.

But one thing jumped out at me. Google’s product planners had identified the key weakness or maybe “flaw” in most of the enterprise search solutions available a decade ago: complexity. No single Googler could install Autonomy, Endeca, Fast Search & Transfer, or Convera without help from the company. Once the system was up and running, not even a Googler could tune the system, perform reliable hit boosting, or troubleshoot indexers which could not update. Not surprisingly, most of the flagship enterprise search systems ran up big bills for the licensees. One vendor went down in flames because there were not enough engineers to keep the paying customers happy. So ended an era of complexity with the Google Search Appliance.

I may have been wrong.

I just read “Indexing BigData with ElasticSearch.” If you are not familiar with ElasticSearch (formerly Compass), think about the Compass search engine and the dozens of companies surfing on Lucene/Solr to get in the search game. Even IBM uses Lucene/Solr to slash development costs and free up expensive engineers for more value added work like the wrappers that allow Watson to win a TV game show. I have completed for IDC an analysis of 13 open source search vendors and some of these profiles are available for only $3,500 each. See http://www.idc.com/getdoc.jsp?containerId=236511 for an example.

Is your search system as easy to learn to ride as a Big Wheel toy? If not, there may be some scrapes and risks ahead. In today’s business climate, who wants to incur additional risks or costs in pursuit of a short cut only a developer can appreciate? Not me or the CFOs I know. A happy quack to http://www.bigwheeltricycle.net/ for this image.

The write up explains how to perform Big Data indexing with ElasticSearch. I urge you to read the write up. Consider this key passage:

The solution finally appeared in the name of ElasticSearch, an open-source Java based full text indexing system, based on the also open-source Apache Lucene engine, that allows you to query and explore your data set as you collect it. It was the ideal solution for us, as doing BigData analysis requires a distributed architecture.

Sounds good. With a fresh $10 million, ElasticSearch seems poised to revolutionize the world of enterprise search, big data, and probably business intelligence, search based applications, and unified information access. Why not? Most open source vendors exercise considerable license in an effort to differentiate themselves from next generation solutions such as CyberTap, Digital Reasoning, and others pushing the envelope of findability technology.

Monitor in Austria Adds Mindbreeze InSite for Improved Web Site Search

November 13, 2012

monitor.at is Austria’s IT guide for small and medium sized businesses, whether looking to stay in the loop on IT happenings or advance their business with more efficient and effective usage of information technology. The company recently added a new feature to their Web site: Mindbreeze InSite. Details of monitor.at’s usage of InSite can be read in, “monitor.at uses Mindbreeze InSite.” The new feature is explained:

With just one search inquiry website visitors receive all relevant information from the site, clearly structured using search tabs. Results can be further refined using facets so that the right information can be found in a matter of seconds.

The content of the page ‘Top-Themen’ (top topics) as well as the associated overview pages are automatically created by InSite – without any manual effort by the web editors. This service is enabled by Mindbreeze’s information pairing technology.

In addition, pre-defined search queries set the parameters for the content. This means that Mindbreeze constantly checks the content and updates the landing pages automatically. A powerful Web site search can help drive traffic to your site and increase the number of revisits to your pages. InSite is a fast and intuitive search feature that is powered by information pairing technology. And with no install required, the self-service solution can save you time and resources.

Philip West, November 13, 2012
