A New French Business Search Engine
March 16, 2015
France is not the first country you think about when it comes to developing new search techniques, much less search engines. However, Web Time Media reports that, “France Datafari Labs Launches New Business Search Engine” meant to rival Polyspot, Exalead, and Sinequa.
France Labs designed the Datafari 1.0 specifically for the cloud and big data and it offers a complete open source enterprise search solution. It was made to be the top performing search application available via open source, making it stiff competition for Apache Solr and ElasticSearch as well.
The description of its offerings is pretty exciting [via Google Translate]:
“The promise to companies is to allow them to retrieve the data wherever they are, whatever they are, safe. Datafari for that innovates on several axes. At the technical level, it manages corpus big data, integrating Apache SolrCloud. Level analysis, it offers analytical queries and dashboards corpus. At development, it is Apache license, non-viral for business (they do not have the obligation to provide the community the developments they do). Finally, the interoperability level Datafari provides a set of REST APIs to expose its connectors as well as its search engine.”
Datafari 1.0 is already being downloaded and tested by developers to see if it offers a new, viable, and flexible solution for enterprise and singular networks. The open source search market is already swollen in the English-speaking world, so Datafari needs to explain more about what makes it different from other search applications.
Whitney Grace, March 16, 2015
Sponsored by ArnoldIT.com, developer of Augmentext
Microsoft Makes Bing Faster
March 16, 2015
Bing is classified as a generic search engine living in Google’s as well as DuckDuckGo’s shadows. In an attempt to make Bing a more viable product, ExtremeTech tells us that “Microsoft To Accelerate Bing Search With Neural Network.” When Bing scours the Internet, it pulls results from a Web index that is half the size of Google’s. Microsoft wants to increase Bing’s efficiency and speed, so it is turning to field-programmable gate array (FPGA) technology.
Microsoft breaks Bing’s search into three parts: machine learning scoring, feature extraction, and free-form expressions. Bing still uses Xeon processors for its document selection service and it needs to switch over to new FPGA software to increase its search speed. Microsoft called the team developing the new FPGA technology Project Catapult. Project Catapult uses tech similar to what was designed in 2011, but it relies on half as many servers as it did in the past.
Microsoft is relying on convolutional neural network (CNN) accelerators for the project:
“Convolutional neural networks (CNNS) are composed of small assemblies of artificial neurons, where each focuses on a just small part of an image — their receptive field. CNNs have already bested humans in classifying objects in challenges like the ImageNet 1000. Classifying documents for ranking is a similar problem, which is now one among many Microsoft hopes to address with CNNs.”
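To make the “receptive field” idea in the quote concrete, here is a minimal sketch in plain Python. It is not Microsoft’s code and has nothing to do with the FPGA implementation; the toy image, the kernel values, and the function name convolve2d_valid are all assumptions made up for illustration. Each number in the output depends only on a small 2x2 patch of the input, which is the “small part of an image” the quote describes.

```python
def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (both lists of lists) with no padding.

    As in most CNN libraries, the kernel is not flipped, so strictly
    speaking this is cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # The receptive field of output[r][c] is the k_rows x k_cols
            # patch of the image whose top-left corner is (r, c).
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 "image": dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d_valid(image, kernel)
print(feature_map)
```

The middle column of the result is the only one that is non-zero, because it is the only position where the 2x2 patch straddles the edge. A real network learns many such kernels and stacks them in layers.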
Armed with the new FPGA, Microsoft hopes to increase Bing’s search and rank business to compete at a greater level with Google. While that may increase Bing’s chances of returning better results, remember that Microsoft still creates OS’s that still fail on initial public releases.
Whitney Grace, March 16, 2015
Swiftype Raises More Money for Web Site Search
March 16, 2015
TechCrunch tells us that search startup “Swiftype Raises $13M More For Its Starter Site And App Search.” Swiftype’s mission is pretty straightforward: they want to create customizable search tools that do not suck (TechCrunch’s own language). You have to admit that it is a bold move, considering many out-of-the-box solutions do stink worse than dial-up from 1995 and open source (while it is free and awesome) requires a bit of developer experience. Swiftype takes out the guesswork and makes a tailored solution without the hassle or the need for developer experience.
While Swiftype originally started out for Web sites, they have moved into other areas:
“On the other hand, online publishers might not be the most lucrative customer base, so while co-founders Matt Riley and Quin Hoxie told me they still support publishers (and we still use Swiftype at TechCrunch), they’ve also expanded into other areas, particularly knowledge bases (basically, FAQs and customer support sites) and e-commerce.”
The search company will probably invest the $13 million to expand its already popular search tools. New Enterprise Associates led the Series B funding and also backed the original Series A round. Swiftype has formed a long-term partnership with New Enterprise Associates.
Whitney Grace, March 16, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
Square 9 Upgrades with Global Search
March 16, 2015
Square 9 Softworks is famous for its document management service and dtSearch is known for its document filters and developer text retrieval. The companies have partnered their technology on Square 9’s SmartSearch Document Management product line. SD Times shares with us a new development from another team-up: “Square 9’s Award-Winning SmartSearch Document Management Installs Now Include GlobalSearch Embedding The dtSearch Engine.”
SmartSearch products will feature the new GlobalSearch, which enables intranet access to all SmartSearch repositories. SmartSearch is marketed as an out-of-the-box file management system for small businesses and enterprises. The GlobalSearch only improves the product line:
“Square 9’s GlobalSearch platform extends the reach of a SmartSearch installation by delivering anywhere, anytime access to documents from any browser or mobile device. Mobile users can search a single repository or across an entire database quickly and easily, locating exactly what they need. With their documents in hand, GlobalSearch users can securely take whatever action necessary to continue the flow of business information. Features include not only complete navigation and editing, but also automated routing, automatic notification and granular document security.”
An improvement on an already highly praised product will only increase Square 9’s sales. Why is it hard for other out-of-the-box solutions to provide such ease of use?
Whitney Grace, March 16, 2015
HP Autonomy: HP Shareholder Deal
March 14, 2015
I am no legal eagle. I did find “US Judge Approves HP Shareholder Deal over Autonomy Acquisition” interesting. (If the news story disappears, well, that’s life in the world of real news.)
The story reports:
Hewlett-Packard Co won preliminary approval from a U.S. judge to settle shareholder litigation on Friday involving the information technology company’s botched acquisition of Autonomy Plc. The ruling, from U.S. District Judge Charles Breyer in San Francisco, comes after HP failed to win approval of two previous proposed deals. Breyer had written that the last deal may not have been fair for shareholders because it could have forced them to give up claims beyond the Autonomy deal.
The battle over HP’s decision to purchase Autonomy continues. I assume the lawyers representing the parties to this matter are thinking about appeals. Will billing cross their minds?
Stephen E Arnold, March 14, 2015
Attensity Understands Emoticons
March 13, 2015
Part of big data is being able to make sense of unstructured data, including the pieces that natural language processing software cannot understand, like emoticons. Emoticons are an Internet phenomenon where people use punctuation marks, numbers, and letters to represent feelings and ideas.
They cannot be spoken, so if an organization wants to analyze all of its data, it needs to be able to interpret emoticons. PC World tells us that Attensity has already created a way to understand emoticons without turning to a teenage girl to translate: “For Attensity’s BI Parsing Tool, Emoticons Are No Problem.”
Attensity’s Semantic Annotation natural-language processing tool was designed to handle large data loads. It can monitor and extract insights from unstructured data, including data from social media platforms and internal information like customer surveys and calls.
“Rather than relying on traditional keyword-based approaches to assessing sentiment and deriving meaning, Attensity’s Java-based product takes a more flexible natural-language approach. By combining and analyzing the linguistic structure of words and the relationship between a sentence’s subject, action and object, it’s designed to decipher and surface the sentiment and themes underlying many kinds of common language—even when there are variations in grammatical or linguistic expression, emoticons, synonyms and polysemies.”
This means Attensity can generate data straight from sentences rendered entirely in emoticons and acronyms.
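As a rough illustration of the first step such a tool has to take, here is a minimal Python sketch that keeps emoticons intact as tokens instead of discarding them as stray punctuation. This is not Attensity’s method: the article describes a linguistic approach in Java, while this is a toy lookup table. The lexicon EMOTICON_SENTIMENT, its labels, and the function names are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny emoticon lexicon (an assumption for
# illustration; a real system would need a far larger one).
EMOTICON_SENTIMENT = {
    ":)": "positive",
    ":-)": "positive",
    ":D": "positive",
    ":(": "negative",
    ":-(": "negative",
    ";)": "playful",
}

# Try longer emoticons first so ":-)" is matched whole.
_EMOTICON_ALTERNATION = "|".join(
    re.escape(e) for e in sorted(EMOTICON_SENTIMENT, key=len, reverse=True)
)
_TOKEN_RE = re.compile(
    "(?P<emoticon>" + _EMOTICON_ALTERNATION + r")|(?P<word>\w+)"
)


def tokenize(text):
    """Return (kind, value) pairs, keeping emoticons as single tokens."""
    tokens = []
    for match in _TOKEN_RE.finditer(text):
        if match.group("emoticon"):
            tokens.append(("emoticon", match.group("emoticon")))
        else:
            tokens.append(("word", match.group("word").lower()))
    return tokens


def emoticon_sentiments(text):
    """Look up the sentiment label of each emoticon found in the text."""
    return [
        EMOTICON_SENTIMENT[value]
        for kind, value in tokenize(text)
        if kind == "emoticon"
    ]


print(tokenize("Great service :) but shipping was slow :-("))
print(emoticon_sentiments("Great service :) but shipping was slow :-("))
```

A plain word tokenizer would drop the two emoticons entirely. Here they survive as tokens, so a later analysis stage can weigh them alongside the words.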
Another practical use for Attensity’s Semantic Annotation would be to create a translation app for parents trying to decipher their teenager’s text messages.
Whitney Grace, March 13, 2015
Google: Similarity Function Drifts from Relevance
March 12, 2015
I ran a test query for “Concept Searching,” an indexing outfit. I noticed that Google generated a list of companies with the label “People Also Search For.”
What I find interesting is that the list of companies is a bit of a grab bag. Here is the list presented to me:
- X1 Technologies, now in the eDiscovery business
- GenieKnows, an SEO outfit
- Funnelback, Squiz’s search solution which has gone quiet since David Hawking shifted roles
- dtSearch, a Microsoft centric desktop and CD-ROM search system for Windows
- Northern Light Group, now a research firm
- Coveo, the ageing startup once focused exclusively on Microsoft centric solutions
- ZyLAB Technologies, a legal document management and search solution
- Metalogix, a SharePoint migration specialist
- Convera, one of the spectacular business implosions which I documented in the Xenky profile available at www.xenky.com/vendor-profiles
- Dieselpoint, a search outfit that went quiet a couple of years ago
- Axceler, now a unit of Metalogix
- Fast Search & Transfer, the search company that has the distinction of a financial misstep and a founder with a painful brush with Norwegian law enforcement
- Exalead, now a unit of Dassault Systèmes, a company which has largely faded from the North American market
- Expert System, a quite good semantic vendor based in Modena, Italy
- Vivisimo, a metasearch outfit acquired by IBM and now part of the IBM Big Data machine.
Quite an assortment. I assume that these suggestions are helpful to the LinkedIn experts, the failed webmasters now rebranded as search wizards, and wanna-be academics looking for consulting revenue.
For me, the list is an illustration of what Google wants to do, provide on point suggestions. However, the list makes vivid the limitations of the Google methods. Hey, the company is focusing attention on balloons.
Stephen E Arnold, March 12, 2015
Open Source ElasticSearch Added to Google Cloud Platform
March 12, 2015
ElasticSearch is a popular open source search engine that has been downloaded over 10 million times since its release in 2010. Amazon recently announced it is planning to add an ElasticSearch management service to EC2 to relieve workloads for developers. Rival Google announced on the Google Cloud Platform Blog that it will add ElasticSearch support to its own cloud computing platform: “Deploy ElasticSearch On Google Compute Engine.”
Google is enthusiastic that ElasticSearch can be deployed on Compute Engine and is actively encouraging end users to try it. It even made a list of reasons why people should start using ElasticSearch:
1 “Based on Lucene: Elasticsearch is an open source document-oriented search server based on Lucene. Lucene is a time tested open source library that is capable of reading everything from HTML to PDFs.
2 Designed for cloud: Elasticsearch was designed first for the cloud with its capabilities around simple cluster configuration and discovery and high-availability by default. This means you can expand your Elasticsearch deployment simply by adding new nodes. This expansion of your cluster — or in the case of a hardware failure, reduction — results in automatic reconfiguration of your document indices across the cluster.
3 Native use of JSON over HTTP: Extending the platform is simple for developers. The schema doesn’t need to be defined up front and your cluster can be extended with a variety of libraries in your languages of choice, even using the command line.”
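The third point, JSON over HTTP with no schema defined up front, can be sketched with nothing but the Python standard library. This is a minimal illustration, not an official client: the host and port (localhost:9200), the index name "blog", the type name "post", and the helper build_index_request are assumptions, and the index/type/id URL layout reflects the document API of Elasticsearch releases from this period. The sketch only builds the request, so it runs without a cluster.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) an HTTP PUT that would index one JSON document.

    Hypothetical helper for illustration; the URL layout assumes the
    index/type/id document API of Elasticsearch releases from this period.
    """
    url = "{}/{}/{}/{}".format(base_url.rstrip("/"), index, doc_type, doc_id)
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# No mapping or schema is declared first; the document is just a dict.
post = {
    "title": "Open Source ElasticSearch Added to Google Cloud Platform",
    "author": "Whitney Grace",
    "published": "2015-03-12",
}

request = build_index_request("http://localhost:9200", "blog", "post", "1", post)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# Against a running node, the request could then be sent with:
#   urllib.request.urlopen(request)
```

Because the payload is plain JSON, the same request could be issued from the command line or from any language with an HTTP library, which is the point the list makes.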
ElasticSearch can be deployed with a few easy clicks, and once it is working you can immediately use it for log processing and analysis with Logstash, keyword text search, and data visualization with Kibana.
Deployment on the Google Compute Engine means ElasticSearch will reach an entirely new customer line. Other open source search engines will be pressured to up their ante with new features and services that ElasticSearch does not have. LucidWorks and other open source based search companies are feeling the pressure.
Whitney Grace, March 12, 2015
AWS Wants to Make Using ElasticSearch Easier
March 12, 2015
Amazon Web Services is one of the biggest purveyors of cloud and remote computing, but it still faces stiff competition from its rivals. AWS continues to add features and new technology to attract more users. TechTarget alerted us to how AWS is making developments with its search offerings: “Amazon Preps AWS ElasticSearch To Ease EC2 Integration.” AWS wants to make running ElasticSearch, an open source search engine, easier on its Elastic Compute Cloud (EC2) with a new service option.
CloudSearch, another search offering based on the open source Apache Solr, is already available on AWS, but ElasticSearch has become more popular in recent years. Solr is still considered by many an open source project rather than a competitive application. ElasticSearch has remained among the most valued open source products since it was created in 2010.
Response to an ElasticSearch service for EC2 has been positive and end users are eager to see it deployed. Integrating ElasticSearch into EC2 is tricky and can lead to memory shortages and leaks. If AWS manages the backend for ElasticSearch integrations, it would be a relief for users who have had to deal with the issue. They would be able to focus on other projects rather than keeping the backend running.
“‘I wouldn’t be surprised to see this kind of offering,’ said Dan Sullivan, a consultant with DS Applied Technologies, located in Portland, Ore., who did not have any direct knowledge of the upcoming service, but said it would make sense. ‘ElasticSearch is growing in popularity … and [an AWS service] would be something a lot of people would be interested in.’”
What does this spell for Apache Solr-based companies like LucidWorks? It puts more pressure on them to be a more viable rival.
Whitney Grace, March 12, 2015
HP Autonomy Blog: Everything but Autonomy Technology
March 10, 2015
I have been checking out the search and content processing vendors who have gone quiet. In my lingo, “quiet” means the company outputs little or no news in the form of blog posts, news releases, slide decks on Slideshare, etc.
One of the most aggressive and effective marketing outfits in search and content processing was Autonomy. Since the HP deal, the majority of the Autonomy related news concerns the litigation between HP and Autonomy about HP’s purchase of Autonomy.
I checked links to the Autonomy blog on www.autonomy.com and clicked on the link at the top of this page:
The link is dead if this message is correct:
I then navigated to the GOOG and ran the query “Autonomy blog.” The first link pointed me to this page:
The only hurdle I encountered in my fly over was that the information is not “about” Autonomy, IDOL, or the Dynamic Reasoning Engine.
Perhaps I am overlooking HP’s brilliant marketing, but it seems to me that HP is not making much of an effort to take a page from Autonomy’s marketing plan book. That might be a mistake in some niches.
When a company goes quiet, I interpret the behavior as a signal about management resolve, financial resources, or the lack of something substantive to communicate. Call me old fashioned, but I like a stream of information about sales, enhancements, bug fixes, and other artifacts of a growing company.
Stephen E Arnold, March 10, 2015