Congratulations Apache Foundation on Solr 4.0

November 6, 2012

LucidWorks products, including LucidWorks Search and LucidWorks Big Data, are built upon the winning combination of open source products Apache Lucene and Solr. With the general availability and release of Apache Solr 4.0, LucidWorks sends its support and congratulations. Read more in the full story, “LucidWorks Congratulates Apache Foundation on General Release of Solr 4.0.”

The press release begins:

“A milestone in the maturity of open source search was reached today with the general availability of the Apache Foundation’s Solr 4.0 search server. Integrated tightly with the Apache Lucene search library, the combination, Lucene/Solr, is the industry’s most widely used platform for writing real-time embedded search applications that can scale to handle billions of documents and high query volumes. Solr is considered by many to be the open source standard for fast, flexible and scalable implementation.”

With its recently redesigned brand and Web presence, LucidWorks also invites the open source community to join them on SearchHub.org to share praise and talk through the new Solr offering.

“To congratulate the Apache Solr community on this release, LucidWorks, the trusted name in Search, Discovery and Analytics, invites the community to mark the occasion by sharing praise for Solr and posting comments to the Solr 4.0 goes GA blog post located on SearchHub.org.”

Join in the conversation at this SearchHub thread.

Emily Rae Aldridge, November 06, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Open Source Search: The Me Too Method Is Thriving

November 5, 2012

In the first three editions of The Enterprise Search Report (2003 to 2007), which my team and I wrote, we made it clear that the commercial enterprise search vendors were essentially a bunch of me-too services.

The diagrams for the various systems were almost indistinguishable. Some vendors used fancy names for their systems and others stuck with the same nomenclature used in the SMART system. I pointed out that every enterprise search system has to perform certain basic functions: Content acquisition, indexing, query processing, and administration. But once those building blocks were in place, most of the two dozen vendors I profiled added wrappers which created a “marketing differentiator.” Examples ranged from Autonomy’s emphasis on the neuro linguistic processing to Endeca’s metadata for facets to Vivisimo’s building a single results list from federated content.


The rota fortunae of the medieval software licensee. A happy quack to http://www.artlex.com/ArtLex/Ch.html

The reality was that it was very difficult for the engineers and marketers of these commercial vendors to differentiate clearly their system from dozens of look-alikes. With the consolidation of the commercial enterprise search sector in the last 36 months, the proprietary vendors have not changed the plumbing. What is new and interesting is that many of them are now “analytics,” “text mining,” or “business intelligence” vendors.

The High Cost of Re-Engineering

The key to this type of pivot is what I call “wrappers” or “add ins.” The idea is that an enterprise search system is similar to the old Ford and GM assembly lines of the 1970s. The cost for changing those systems was too high. The manufacturers operated them “as is”, hoping that chrome and options would give the automobiles a distinctive quality. Under the paint and slightly modified body panels, the cars were essentially the same old vehicle.

Commercial enterprise search solutions are similar today, and none has been overhauled or re-engineered in a significant way. That is okay. When a company licenses an enterprise search solution from Microsoft or Oracle, the customer is getting the brand and the security which comes from an established enterprise search vendor.

Let’s face it. The RECON or SDC Orbit system is usable without too much hassle by a high school student today. The precision and recall are in the 80 to 85 percent range. The US government has sponsored a text retrieval program for many years. The results of the tests are not widely circulated. However, I have heard that the precision and recall scores mostly stick in the 80 to 85 percent range. Once in a while a system will perform better, but search technology has, in my opinion, hit a glass ceiling. The commercial enterprise search sector is like the airline industry. The old business model is not working. The basic workhorse of the airline industry delivers the same performance as a jet from the 1970s. The big difference is that the costs keep on going up and passenger satisfaction is going down.
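For readers who have not worked with these two measures, here is a minimal sketch of how precision and recall are computed for a single query. The document identifiers and counts are made up for illustration; they are not drawn from any of the tests mentioned above.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents returned / all documents returned
    recall    = relevant documents returned / all relevant documents
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query: the system returns doc1..doc10,
# while the truly relevant documents are doc3..doc12.
retrieved = [f"doc{i}" for i in range(1, 11)]
relevant = [f"doc{i}" for i in range(3, 13)]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.0%} recall={recall:.0%}")
```

In this invented case eight of the ten returned documents are relevant and eight of the ten relevant documents are found, so both figures come out at 80 percent.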

Open Source: Moving to Center Stage

But I am not interested in commercial enterprise search systems. The big news is the emergence of open source search options. Until recently, open source search was not mainstream. Today, open source search solutions are mainstream. IBM relies on Lucene/Solr for some of its search functions. IBM also owns Web Fountain, STAIRS, iPhrase, Vivisimo, and the SPSS Clementine technology, among others. IBM is interesting because it has used open source search technology to reduce costs and tap into a source of developer talent. Attivio, a company which just raised $42 million in additional venture funding, relies on open source search. You can bet your bippy that the investors want Attivio to turn a profit. I am not sure the financial types dive into the intricacies of open source search technology. Their focus is on the payoff from the money pumped into Attivio. Many other commercial content processing companies rely on open source search as well.

The interesting development is the emergence of pure play search vendors built entirely on the Lucene/Solr code. Anyone can download this “joined at the hip” software from the Apache Foundation. We have completed an analysis of a dozen of the most interesting open source search vendors for a big time consulting firm. What struck the ArnoldIT research team was:

  1. The open source search vendors are following the same path as the commercial enterprise search vendors. The systems are pretty much indistinguishable.
  2. The marketing “battle” is being fought over technical nuances which are of great interest to developers and, in my opinion, almost irrelevant to the financial person who has to pay the bills.
  3. The significant differentiators among the dozen companies we analyzed boil down to the companies’ financial stability, full time staff, value-adding proprietary enhancements, customer support, training, and engineering services.

What this means is that the actual functionality of these open source search systems is similar to the enterprise proprietary solutions. In the open source sector, some vendors specialize by providing search for a Big Data environment or for remediating the poor search system in MySQL and its variants. Other companies sell a platform and leave the Lucene/Solr component as a utility service. Others just take the Lucene/Solr and go forward.

The Business View

In a conversation with Paul Doscher, president of LucidWorks, I learned that his organization is working through the Project Management Committee (PMC) Group of the Lucene/Solr project within the Apache Software Foundation to build the next-generation search technology. The effort is to help transform people’s ability to turn data into decision making information.

This next generation search technology is foundational in developing a big data technology stack to enable enterprises to reap the rewards of the latest wave of innovation.

The key point is that figuring out which open source search system does what is now as confusing and time consuming as figuring out the difference between the proprietary enterprise search systems was 10 years ago.

Will there be a fix for me-too’s in enterprise search? I think that some technology will be similar and probably indistinguishable to non-experts. What is now raising the stakes is that search systems are viewed as utilities. Customers want answers, visualizations, and software which predicts what will happen. In my opinion, this is search with fuzzy dice, 20 inch chrome wheels, and a 200 watt sound system.

The key points of differentiation for me will remain the company’s financial stability, its staff quality, its customer service, its training programs, and its ability to provide engineering services to licensees who require additional services. In short, the differentiators may boil down to making systems pay off for licensees, not marketing assertions.

In the rush to cash in on organizations’ need to cut costs, open source search is now the “new” proprietary search solution. Buyer beware? More than ever. The Wheel of Fortune in search is spinning again. Who will be a winner? Who will be a loser? Place your bets. I am betting on open source search vendors with the service and engineering expertise to deliver.

Stephen E Arnold, November 5, 2012

AvePoint Continues Support for SharePoint

November 5, 2012

Many excellent options exist for third party SharePoint support. AvePoint, with its focus on governance and management solutions, has been supporting SharePoint deployments for eleven years. In the full report, “AvePoint Showcases Tradition of Excellence and Support for Microsoft SharePoint 2013 at SharePoint Conference 2012,” the author discusses how AvePoint will participate in the upcoming SharePoint Conference 2012.

AvePoint, leader in governance and management solutions for Microsoft SharePoint, today announced that it is a Platinum and Keynote Sponsor of Microsoft SharePoint Conference (SPC) 2012, taking place from November 12-15 at Mandalay Bay in Las Vegas, NV. AvePoint’s Platinum and Keynote sponsorship for the third consecutive SPC highlights the company’s continued commitment to the global SharePoint community, and provides an opportunity to showcase the DocAve Software Platform, its award-winning product line for managing, governing, and scaling enterprise technology platforms with support for Microsoft SharePoint 2013.

Some companies need more than governance, management, and backup support.  Some enterprises struggle to implement a SharePoint infrastructure because of staffing, funding, or other shortages.  That is where a third party add-on can help bridge the gap.  Fabasoft Mindbreeze Enterprise is quick, cost-efficient, and service oriented.  Its intuitive interface flattens out a potentially steep learning curve.  Furthermore, Mindbreeze works as a standalone solution, or as an addition to an existing infrastructure.

Emily Rae Aldridge, November 5, 2012


Google Promotes Open Source in Partnership with Eclipse

November 5, 2012

News from The H Open informs us that Google is embracing Eclipse and open source. We learn in the article “Google Becomes Strategic Member of the Eclipse Foundation” that Google has been welcomed by the Eclipse Foundation as a new strategic member of the open source organization. What does this mean for Eclipse? The foundation will gain $250,000 in donations from the search giant as well as eight full-time developers.

The article continues:

“The company’s involvement in the Eclipse community is not new. It had long been an Eclipse Foundation Gold Sponsor. Google staff have also frequently collaborated as committers on a range of Eclipse projects and developed tools for the Eclipse platform to improve support for Google projects such as the Android SDK, AppEngine, GWT and Dart. Google projects such as WindowBuilder and CodePro Profiler have also previously been adopted as Eclipse projects. In September, Google also contributed $20,000 to the Foundation to purchase hardware to help performance testing of the Eclipse IDE.”

Google joins Eclipse in the shadow of other big names such as CA Technologies, Oracle, and IBM. Does this mean Google is following in the footsteps of these big companies in the continued use and promotion of open source software? Looks like the days of open source being embraced by only small companies and startups are gone. We are interested to see where this is headed.

Andrea Hayden, November 05, 2012


The Decline of PCs and Search?

November 4, 2012

I worked through “The Slow Decline of PCs and the Fast Rise of Smartphones/Tablets Was Predicted in 1993.” The main point is that rocket scientist, cook, and patent expert Nathan P. Myhrvold anticipated the shift from desktop computers to more portable form factors. Years earlier I remember a person from Knight Ridder pitching a handheld gizmo which piggybacked on the Dynabook. When looking for accurate forecasts and precedents, those with access to a good library, commercial databases, and the Web can ferret out many examples of the Nostradamus approach to research. I am all for it. Too many people today do not do hands on research. Any exercise of this skill is to be congratulated.

Here’s the main point of the write up in my opinion:

His memo is amazingly accurate. Note that his term “IHC” (Information Highway Computer) could be roughly equated with today’s smartphone or tablet device, connecting to the Internet via WiFi or a cellular network. In his second last paragraph, Myhrvold predicts the winners will be those who “own the software standards on IHCs” which could be roughly equated with today’s app stores, such as those on iOS (Apple), Android (Google, Amazon) and Windows 8 (Microsoft). The only thing you could say he possibly didn’t foresee would be the importance of hardware design in the new smartphone and tablet industry.

Let’s assume that Mr. Myhrvold was functioning in “I Dream of Jeannie” mode. Now let’s take that notion of a big change coming quickly and apply it to search. My view is that traditional key word search was there and then—poof—without a twitch of the soothsayer’s nose, search was gone.

Look at what exists today:

  1. Free search which can be downloaded from more than a dozen pretty reliable vendors plus the Apache Foundation. Install the code and you have state of the art search, facets, etc.
  2. Business intelligence. This is search with grafted on analytics. I think of this as Frankensearch, but I am old and live in rural Kentucky. What do you expect?
  3. Content processing. This is data management with some search functions and a bunch of parsing and tagging. Indexing is good, but the cost of humans is too high for many government intelligence organizations. So automation is the future.
  4. Predictive search. This is the Google angle. You don’t need to do anything, including think too much. The system does the tireless nanny job.

So is search in demise mode? Yep. Did anyone predict it? I would wager one thin dime that any number of azure chip consultants will have documents in their archive which show that the death of search was indeed predicted. One big outfit killed a “magic carpet tile” showing the search industry and then brought it back.

So search is not dead. Maybe it was Mark Twain who said, “The reports of my death have been greatly exaggerated.” Just like PCs, mainframes, and key word search?

Stephen E Arnold, November 4, 2012

Optimizing a Web Site and Google Places to Improve User Experience

November 2, 2012

Terry Van Horne has over 17 years of experience in Web development and search engine optimization (SEO) and is founder of SEO Pros and the OSEOP Organization, the first organization for search engine optimization professionals. He shares some tips and tricks for optimizing your site and Google Places landing page in his article, “Google Local Search: Optimization of Your Google Places or +Local Website Landing Page.” Van Horne has this to share about providing good content on a site:

I strongly recommend using structured data on your contact page and I always include full contact details in the footer of every page. I now recommend this information be marked up in structured data as well. Reviews, events, testimonials and more can be included in the SERP and these ‘Rich Snippets’ always drive more clicks on your listing. The number of testimonials and ratings on your website affects the ranking.

He adds that microdata information and syntax is available at Schema.org and strongly encourages linking your home page to your testimonials page to maximize the link equity to the page and structured data it contains. The power of semantic search in addition to relevant content is a big part of gaining and retaining an audience. One way to boost your site’s experience is to add a powerful search system, like InSite. Mindbreeze’s InSite solution gives a custom searching experience unique to the user with the added benefit of mobile capabilities. Latest content can easily be displayed to Web site visitors to keep your site fresh, and robust reports provide you with the feedback on what and why your visitors are searching.
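To make the structured data advice concrete, here is a minimal sketch of a footer contact block marked up with Schema.org microdata. It assumes the `LocalBusiness` and `PostalAddress` types from Schema.org; the business name, address, phone number, and the `contact_footer` helper are invented for illustration and do not come from Van Horne's article.

```python
from html import escape


def contact_footer(name, street, city, phone):
    """Render contact details as HTML carrying Schema.org microdata attributes."""
    return (
        '<div itemscope itemtype="https://schema.org/LocalBusiness">\n'
        f'  <span itemprop="name">{escape(name)}</span>\n'
        '  <div itemprop="address" itemscope '
        'itemtype="https://schema.org/PostalAddress">\n'
        f'    <span itemprop="streetAddress">{escape(street)}</span>,\n'
        f'    <span itemprop="addressLocality">{escape(city)}</span>\n'
        '  </div>\n'
        f'  <span itemprop="telephone">{escape(phone)}</span>\n'
        '</div>'
    )


# Hypothetical business details, escaped before being placed in the markup.
footer_html = contact_footer(
    "Example Plumbing & Heating", "123 Main Street", "Springfield", "+1-555-0100"
)
print(footer_html)
```

The same block can be dropped into the footer template of every page, which matches the recommendation to repeat full contact details site-wide.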

Philip West, November 2, 2012


Boosting Web Site Impact with Optimized Search

November 1, 2012

Web site search optimization is a way to give your visitors a reason to stay on your site and revisit. In addition, a powerful search feature can introduce users to good content on your site they may not have found otherwise. In “Changed Way of Site Optimization,” the author discusses the importance of SEO for Web site design and SEO Brisbane Company as one option. The author has this to say about boosting traffic with SEO Brisbane:

Regardless to what type of website you are building, implementing SEO Brisbane into your website will result in better search results and high traffic numbers. And whether you plan on hiring a company or doing it yourself, it’s imperative that you know how to properly implement SEO techniques into your website. Some techniques are smaller than others, while other techniques take time. But there are a few SEO Brisbane techniques that you can use to help your website grow undeniably.

The author also points out that navigation tabs should be coded so that search engines can crawl them and to ensure no loss of pages as URLs change. However, techniques are still only vaguely explained and a glance at the SEO Brisbane Web site shows they heavily promise results without being as clear about the technology behind the solution, which makes us worry about the lack of substance in the tool. When you’re weighing options for Web site search solutions, consider Mindbreeze InSite to add a powerful and customized search experience for your Web site visitors. InSite is powered by semantic search and a visit to their site quickly illustrates the benefits of the cloud deployment options, customizable tabs, multiple data sources options, and faceted search features.

Philip West, November 1, 2012


Google Personal Search Integrates Disparate Tools

November 1, 2012

All Things D reveals Google’s plans to serve up yet more integration in, “Google Amps Up Personal Search to Combine Gmail, Calendar, Drive, and More.” The product combination, which brings together personal and public services, is bound to rub some people the wrong way. Writer Liz Gannes hypothesizes that this is why Google is currently only making the change for users who opt in to the “field trial.” This Googley experiment is only available in English, and only for personal Gmail accounts.

The write up explains:

“Starting today, Gmail users will be able to search across their mail, contacts, Google Drive documents and Google Calendar appointments, all from the search bar at the top of the Webmail application. But that’s only if they choose to opt into a “field trial” of the new product.

“This builds on top of an existing field trial that combines Gmail and search on Google.com. That experiment launched in August and as of today also includes Google Drive documents, spreadsheets and files. Users who opted into the first field trial will have to opt in again.”

Such a to-do about integrating pieces and parts reminds us that Google has a big-data view of small things. Does that perspective hurt or help the search giant? Or some of each?

Cynthia Murrell, November 01, 2012


How to Make a Search Play Work

October 31, 2012

I used to think that technology, solving a problem, and hard work were the keys to success. What do I know? Nothing if “Want to Be Rich? Be Lucky, Know the Right People” is accurate. The NPR story reports on what one successful North Carolinian believes:

One important decision that led to his success, Hatley says, was moving to Raleigh, where he went to work for Wachovia Bank and made a point of meeting the right people. “I learned a long time ago it’s not what you know, it’s who you know. Interpersonal skills trump brains,” Hatley says. “I happened to be on coffee breaks with successful people. I was going on calls with successful people. I was picking up the paper, reading about successful people that I would soon be working with,” he says. “I attribute so much of it to that.”

Assume this is accurate. How bitter must be those innovators who created Convera, Entopia, Delphes, and dozens of other search and content processing vendors. Some of Convera’s ideas are just now finding their way into systems. Entopia had a vision for federated search which included social elements. Delphes designed a system which made sense to financial professionals.

Why did these companies fail? The wrong connections? Bad luck?

What does that mean for venture firms pumping millions into promising companies in hopes of scoring a big win? I think it means that search, like many tough problems, is a roll of the dice. Join a club. Be sociable. Consult the Delphic oracle.

Learning? Persistence? Maybe over-rated. Little wonder search and content processing systems continue to disappoint many users. Are you feeling lucky?

Stephen E Arnold, October 31, 2012

Using SharePoint to Streamline Common Business Processes

October 31, 2012

SharePoint is a complex system and it helps to do some outside reading to stay abreast of helpful features and tips to share with your team. Ellen van Aken looks at some ways SharePoint can help save time and effort in an organization in her article, “4 Common Processes that SharePoint Can Streamline.” Van Aken explains that creating a tailor-made subsite and making a template out of it can allow for creating a ready-to-use team site for every project in no time. This could help make recurring projects more efficient.

She also explains streamlining requests with incomplete data coming from multiple channels:

How often do people send you a request, by plain email, telephone, or Word/Excel document? And how often do you have to contact them again to ask for missing information?

Depending on the complexity, you can use a simple SharePoint list, an Office template in a Document Library, or an InfoPath form in a Forms Library, with mandatory fields. As additional advantage SharePoint stores all your requests in one central place, so you do not have to spend time on filing them…The finished requests can be used to gain insights in your process.

She adds that a filter can be put in place to show which requests still need processing. Another way to streamline information requests with missing data is to provide a search feature that can tap into all company knowledge, whether text or person related. For example, Fabasoft Mindbreeze with the SharePoint connector provides an all-inclusive search without redundancy or loss of data. Additionally, Mindbreeze has the ability to process data and turn it into relevant knowledge for users.
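The idea of mandatory fields plus a filter for unprocessed requests can be sketched in a few lines. This is plain Python, not SharePoint or InfoPath code; the field names and the `submit` and `needs_processing` helpers are assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical mandatory fields for an incoming request.
MANDATORY_FIELDS = ("requester", "description", "due_date")


@dataclass
class Request:
    fields: dict
    processed: bool = False


def missing_fields(fields):
    """Return the mandatory field names that are absent or blank."""
    missing = []
    for name in MANDATORY_FIELDS:
        value = fields.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def submit(store, fields):
    """Reject incomplete requests up front; store complete ones centrally."""
    missing = missing_fields(fields)
    if missing:
        raise ValueError("missing mandatory fields: " + ", ".join(missing))
    request = Request(fields=dict(fields))
    store.append(request)
    return request


def needs_processing(store):
    """The filter: only requests nobody has handled yet."""
    return [request for request in store if not request.processed]


store = []
first = submit(store, {"requester": "Ann", "description": "New laptop", "due_date": "2012-11-15"})
second = submit(store, {"requester": "Bob", "description": "Badge renewal", "due_date": "2012-11-20"})
first.processed = True
print([request.fields["requester"] for request in needs_processing(store)])
```

Because incomplete requests are refused at submission, nobody has to go back and ask for the missing information later.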

Philip West, October 31, 2012

