Lookeen Desktop Search: Exclusive Interview Reveals Lucene as a Personal Search Solution
March 17, 2015
Axonic’s enterprise-centric search products eliminate most, if not all, of the problems a Windows user encounters when trying to locate related information produced by different applications on a desktop computer. Email and other types of information are findable with a few keystrokes.
When I was in Germany in June 2014, I learned about Lookeen, a desktop search product that was built on Lucene. The idea was to tap the power of Lucene to put content on a user’s computer at one’s fingertips. Imagine working in Outlook, reading a message, and seeing a reference to a PowerPoint on the user’s external storage device. Lookeen allows access to the content from within Outlook. Now the company is releasing a commercial version of its desktop search product that promises to be a game changer on the desktop and in the enterprise. The company offers robust functionality at a very attractive price point.
The role of Lucene and other technical innovations in the high-performance software appears in an exclusive interview with Lookeen’s chief operating officer. You can find the interview at http://bit.ly/1LizbkQ.
The Lookeen interface is intuitive. No training is required to install the Lucene-based system nor to use it for simple or complex information retrieval tasks. Image used with the permission of Axonic GmbH.
Lookeen is a product developed by Axonic, a software and services firm located in Karlsruhe, Germany, in the Rhine Valley, a short distance from Stuttgart. Axonic is one of the leading software development and services firms for Outlook and Exchange Server search technologies in Europe. The company specializes in enterprise applications and has a core competency in Microsoft technologies.
I wanted more detail about Lookeen’s approach to desktop search. In an exclusive interview, Peter Oehler, COO, revealed the company’s breakthrough approach to desktop search. The company’s Lookeen software gives Windows users industry-leading search technology tuned for the Microsoft environment. Outlook email, PowerPoint decks, Word documents and other common file types are instantly findable.
Peter Oehler said:
We’ve utilized Lucene’s extensive query syntax to enable users to use familiar Google-like Boolean search, as well as wildcard, proximity, and keyword matching. The introduction of more search strings and filter features enable users to narrow down searches in an easy and intuitive way, and more proficient searchers can access the best of Lucene’s query syntax.
He added:
Lucene is a very good, widely used open source search system. Many of the innovations we’ve developed on top of the Lucene engine stem directly from our extensive experience with Outlook. For example, the Lookeen context menu allows a user to open, reply to, forward, move and summarize emails and topics, all from within Lookeen.
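The query features Oehler mentions (Boolean, wildcard, proximity, and keyword matching) correspond to operators in Lucene's classic query parser syntax. The sketch below only composes query strings in that syntax; it does not call Lucene or Lookeen, and the helper functions and the field names `subject` and `body` are hypothetical, chosen for illustration. It also does not escape special characters, which a real application would need to do.

```python
# Sketch: composing query strings in Lucene's classic query parser syntax.
# Helper names and the field names "subject" and "body" are hypothetical.

def phrase(text: str) -> str:
    """Exact phrase match: wrap the words in double quotes."""
    return f'"{text}"'

def proximity(text: str, distance: int) -> str:
    """Proximity match: the words must occur within `distance` positions."""
    return f'"{text}"~{distance}'

def prefix(stem: str) -> str:
    """Multi-character wildcard at the end of a term."""
    return f"{stem}*"

def field(name: str, clause: str) -> str:
    """Restrict a clause to one indexed field."""
    return f"{name}:{clause}"

def all_of(*clauses: str) -> str:
    """Boolean AND over the given clauses."""
    return "(" + " AND ".join(clauses) + ")"

def any_of(*clauses: str) -> str:
    """Boolean OR over the given clauses."""
    return "(" + " OR ".join(clauses) + ")"

query = all_of(
    field("subject", prefix("budget")),
    any_of(
        field("body", phrase("quarterly report")),
        field("body", proximity("sales forecast", 5)),
    ),
)
print(query)
```

The printed string is what a proficient searcher could also type by hand; a friendlier interface would presumably build something similar from filters and check boxes.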
What sets Lookeen apart from proprietary, freeware, and shareware products is that Axonic has engineered its system to provide real-time access to information on the user’s computer. The system can handle terabytes of user content, returning results almost instantaneously.
Axonic has deep experience with Microsoft technology. Oehler told me:
Lucene is a beast within the Microsoft environment. Microsoft doesn’t make it easy to work with Outlook without causing problems or affecting performance. Outlook is the lifeblood of most professionals – the most important tool. If it stops working, you stop working. The art of our product is how we tackle the complex code hiding under the surface of Outlook and combine it with Lucene to create a deceptively smooth and simple search solution.
Beyond Search ran tests on Lookeen and compared the results with outputs from a number of test systems. Lookeen’s response times were among the fastest. When indexing and searching email, including archived collections of emails, Lookeen was the top performer. Our test systems include Copernic, dtSearch, Effective File Search, Gaviri, ISYS Desktop Search, and X1.
Lookeen requires no special training or complex set up. Lookeen allows a user to search external shared content directly from the Lookeen app. The interface is clear and logical. A busy professional can access needed documents, view and interact with them without launching an external application.
A 14-day free trial is available. The license fee is $58 for a single-user version. The company offers a business edition (at $83), which adds group policy functions, and an enterprise edition, which begins at about $116 per user. Volume discounts are available.
To read the complete exclusive interview with Peter Oehler, navigate to the Search Wizards Speak service at this link on ArnoldIT. More information about the company is available at http://www.lookeen.com.
Stephen E Arnold, March 17, 2015
Square 9 Upgrades with Global Search
March 16, 2015
Square 9 Softworks is famous for its document management service and dtSearch is known for its document filters and developer text retrieval. The companies have partnered their technology on Square 9’s SmartSearch Document Management product line. The San Diego Times shares with us a new development from another team-up: “Square 9’s Award-Winning SmartSearch Document Management Installs Now Include GlobalSearch Embedding The dtSearch Engine.”
SmartSearch products will feature the new GlobalSearch, which enables intranet access to all SmartSearch repositories. SmartSearch is marketed as an out-of-the-box file management system for small businesses and enterprises. The GlobalSearch only improves the product line:
“Square 9’s GlobalSearch platform extends the reach of a SmartSearch installation by delivering anywhere, anytime access to documents from any browser or mobile device. Mobile users can search a single repository or across an entire database quickly and easily, locating exactly what they need. With their documents in hand, GlobalSearch users can securely take whatever action necessary to continue the flow of business information. Features include not only complete navigation and editing, but also automated routing, automatic notification and granular document security.”
An improvement on an already highly praised product will only increase Square 9’s sales. Why is it hard for other out-of-the-box solutions to provide such ease of use?
Whitney Grace, March 16, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
Funnelback: Getting Quieter and Quieter
March 7, 2015
Prior to some management changes, Funnelback was on my radar. The company seemed active in the UK and Australia. For 2015, the company has commenced operations in the US. The firm’s office is in Santa Monica. There is a Web site refreshing at www.funnelback.com. The last big news item I noticed was that Funnelback became a Crown Commercial Service Supplier on G-Cloud 6. Was Funnelback a bit of a one man band? When the drummer departed, the search focus softened. Worth watching.
Stephen E Arnold, March 7, 2015
Enterprise Search: Roasting Chestnuts in the Cloud
March 6, 2015
I read “Seeking Relevancy for Enterprise Search.” I enjoy articles about “relevance.” The word is ambiguous and must be disambiguated. Yep, that’s one of those functions that search vendors love to talk about and rarely deliver.
The point of the write up is that enterprise content should reside in the cloud. The search system can then process the information, build an index, and deliver a service that allows a single search to output a mix of hits.
Sounds good.
My concern is that I am not sure that’s what users want. The reason for my skepticism is that the shift to the cloud does not fix the broken parts of information retrieval. The user, probably an employee or consultant authorized to access the search system, has to guess which keywords unlock the information in the index.
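The keyword-guessing problem shows up even in a toy version of the pipeline described above. The sketch below assumes a bare inverted index over two invented documents from two hypothetical sources; it is not how any particular vendor's system works. Real engines add stemming, synonyms, and ranking, but the basic dependence on the user picking the indexed word remains.

```python
from collections import defaultdict

# Hypothetical documents from two sources; ids and text are invented.
documents = {
    "sharepoint/q3-invoice.docx": "Invoice for third quarter consulting services",
    "email/msg-1042": "Please pay the attached bill by Friday",
}

def build_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)

index = build_index(documents)
print(search(index, "invoice"))  # finds only the SharePoint document
print(search(index, "bill"))     # finds only the email
print(search(index, "payment"))  # finds nothing, though both items concern payment
```

Both items are about money owed, yet no single keyword retrieves them together. Moving the same index to the cloud changes where it runs, not what the user has to guess.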
Search vendors continue to roast the chestnuts of results lists, keyword search, and work arounds for performance bottlenecks. The time is right to move from selling chestnuts to those eager to recapture a childhood flavor and move to a more efficient information delivery system. Image source: http://www.mybalkan.org/weather.html
That’s sort of a problem for many searchers today. In many organizations, users express frustration with search because multiple queries are needed to find information that seems relevant. Then the mind numbing, time consuming drudgery begins. The employee opens a hit, scans the document, copies the relevant bit if it is noted in the first place, and pastes the item into a Word file or a OneNote type app and repeats the process. Most users look at the first page of results, pick the most likely suspect, and use that information.
No, you say.
I suggest you conduct the type of research my team and I have been doing for many years. Expert searchers are a rare species. Today’s employees perceive themselves as really busy, able to make decisions with “on hand” information, and believe themselves to be super smart. Armed with this orientation, whatever these folks do is, by definition, pretty darned good.
It is not. Just don’t try telling a 28 year old that she is not a good searcher and is making decisions without checking facts and assessing the data indexed by a system.
What’s the alternative?
My most recent research points to a relatively new branch or tendril of information access. I use the term “cyberosint” to embrace systems that automatically collect, analyze, and output information to users. Originally these systems focused on public content like Facebook, Twitter posts, and Web content. Now the systems are moving inside the firewall.
The result is that the employee interacts with reports generated with information presented in the form of answers, a map with dynamic data showing where certain events are now taking place, and in streams of data that go into other systems such as a programmatic trading application on Wall Street.
Yes, keyword search is available to these systems which can be deployed on premises, in the cloud, or in a hybrid deployment. The main point is that the chokehold of keyword search is broken using smart software, automatic personalization, and reports.
Keyword search is not an enterprise application. Vendors describe the utility function as the ringmaster of the content circus. Traditional enterprise search is like a flimsy card table upon which has been stacked a rich banquet of features and functions.
The card table cannot support the load. The next generation information access systems, while not perfect, represent a significant shift in information access. To learn more, check out my new study, CyberOSINT.
Roasting chestnuts in the cloud delivers the same traditional chestnut. That’s the problem. Users want more. Maybe a free range, organic gourmet burger?
Stephen E Arnold, March 6, 2015
Enterprise Search: Is Keyword Search a Lycra-Spandex Technology?
March 3, 2015
I read a series of LinkedIn posts about why search may be an enterprise application flop. To access the meanderings of those who believe search is a young Bruce Jenner, you will have to sign up for LinkedIn and then wrangle an invitation to this discussion. Hey, good luck with this access to LinkedIn thing.
Over the years, enterprise search has bulked up. The keyword indexing has been wrapped in layers of helper code. For example, search now classifies, performs workflow operations, identifies entities, supports business intelligence dashboards, delivers self service Web help, handles Big Data, and dozens of other services.
Image Source: www.sochealth.co.uk.
I have several theories about this chubbification of keyword search. Let me highlight the thoughts that I jotted down as I worked through the “flop” postings on LinkedIn.
First, keyword search is not particularly useful to some people looking for information in an organization. The employee has to know what he or she needs and the terminology to use to unlock the secrets of the index. Add some time pressure and keyword search becomes infuriating. The fix, which began when Fulcrum Technologies pitched a platform approach to search, was to make search a smaller part of a more robust information management solution. You can still buy pieces of the original 1980s Fulcrum technology from OpenText today.
Second, system users continue to perceive the results list as a type of homework. The employee has to browse the results list, click on documents that may contain the needed information, scan the document, identify the factoid or paragraph needed, copy it to another document, and then repeat the process. Employees want answers. What better way to deliver those answers than a “point and click” interface? Just pick what one needs and be done with the drudgery of the keyword search.
Third, professionals working in organizations want to find information from external sources like Web pages and blogs and from internal sources such as the server containing the proposals or president’s PowerPoint presentations. Enterprise search is presented as a solution to information access needs. The licensee quickly learns that most enterprise search systems require money, engineers, and time to set up so that content from disparate sources can be presented from a single interface. Again employees grouse when videos from YouTube and from the training department are not in the search results. Some documents containing needed information are not in the search system’s index but a draft version of the document is available via a Bing or Google search.
Fourth, the enterprise search system built on keywords lacks intelligence. For many vendors the solution is to add semantic intelligence, dynamic personalization which figures out what an employee needs by observing his information behaviors, and predictive analytics which just predicts what is needed for the company, a department and an individual.
Fifth, vendors have emphasized that a smart organization must have a taxonomy, a list of words and concepts tailored to the specific organization. These terms enrich the indexing of content. To make taxonomy management easy as pie, search vendors have tossed in editorial controls for indexing, classification, and hit boosting so that certain information appears whether the employee asked for the data or not.
In short order, the enterprise search system looks quite a bit like the “Obesity Is No Laughing Matter” poster.
This state of affairs is good for consulting engineers (SharePoint search, anyone?), mid tier consulting firm pundits, failed webmasters recast as search experts, and various hangers on. The obese enterprise search system is not particularly good for the licensing organization, the employees who are asked to use the system, or for the system administrators who have to shoehorn search into their already stuffed schedule for maintaining databases, accounting systems, enterprise resource planning, and network services.
Search is morbidly obese. No diet is going to work. The fix, based on the research conducted for my new monograph CyberOSINT is that a different approach is needed. Automated collection, analysis, and outputs are the future of information access.
Keyword search is a utility and available in next generation information access (NGIA) systems. Unlike the obese keyword search systems, NGIA systems have been engineered to deliver more integrated services to users relying on mobile devices as well as traditional desktop computers.
Obese search is no laughing matter. One cannot make a utility into an NGIA system. However, an NGIA system can incorporate search as a utility function. Keep this in mind if you are embracing Microsoft SharePoint-type systems. Net net: traditional enterprise search is splitting its seams, and it is unsightly.
Stephen E Arnold, March 3, 2015
Automated Collection Keynote Preview
February 14, 2015
On February 19, 2015, I will do the keynote at an invitation only intelligence conference in Washington, DC. A preview of my formal remarks is available in an eight minute video at this link. The preview has been edited. I have inserted an example of providing access to content not requiring a Web site.
I also comment on the speed with which information and data change and become available. Humans cannot keep up with external and most internal-to-the-organization information.
The preview also includes a simplified schematic of the principal components of a next generation information access system. The diagram is important because it reveals that keyword search is a supporting utility, not the wonder tool many marketers hawk to unsuspecting customers. The supporting research for the talk and the full day conference appears in CyberOSINT, which is now available as an eBook.
Stephen E Arnold, February 14, 2015
Coveo Asserts Record Growth and Improved Relevance
February 12, 2015
Proprietary enterprise search is one reason DARPA has made noise about a new threat center. The idea is that cyber intelligence is a hot issue. Without repeating the information in CyberOSINT, suffice it to say that keyword search is not up to the findability tasks in today’s world. For more on the threat center integration, you may want to review “New Threat Center to Integrate Cyber Intelligence.”
In this context, I read “Coveo Announces Record Growth in 2014.” The company was founded in 2005 in Canada. In the last nine years, according to Crunchbase, the company has ingested $34.7 million from eight investors. The most recent funding round was in December 2012 when the company obtained an additional $18 million. Let’s assume the data are accurate.
In the “record growth” announcement, the company states:
Coveo today announced accelerated growth in 2014 via strong demand for its enterprise search-based applications that help employees upskill as they work, and driven in large part by its continued strategic partnerships with leading organizations such as Salesforce. The year was also marked by the best quarter in the history of the company and the 1,000th enterprise activation of its software, with new customer Sonus Networks.
The “record growth” news story omits an important data point: Financial results with numbers. Coveo is a privately held company and under no obligation to provide any hard numbers. In lieu of metrics, the story provides this interesting item: Enhanced relevance tuning. After nearly nine years in the enterprise market, I had assumed that Coveo had figured out relevance.
Coveo, like its fellow travelers in the keyword search sector Attivio and BA Insight, is recognized in different “expert” advisory firms’ lists of important companies. Also, each of these three keyword search companies is working overtime to generate Autonomy or Endeca scale revenues. The three keyword search vendors have to differentiate themselves as the US Department of Defense actively seeks next generation approaches. The sunny days of Autonomy and Endeca have been hit by climate change even as they recline in the shelter of Hewlett Packard and Oracle, their new owners.
My hunch is that if the financials back up the assertions in the “record growth” story, stakeholders will be happy campers. On the other hand, if those funding traditional search systems relying on proprietary code do not see a solid payback, dreary days may be ahead.
For functional information retrieval, many large companies, including the firms developing next generation information access systems, ignore proprietary search solutions. Open source software delivers a lower cost, license fee free commodity function.
Did anyone bring umbrellas? In the heyday of enterprise search, vendors gave away bumbershoots with logos affixed. These may be needed because the search climate has changed with heavier rainfall predicted.
Stephen E Arnold, February 12, 2015
Enterprise Search: Security Remains a Challenge
February 11, 2015
Download an open source enterprise search system or license a proprietary system. Once the system has been installed, the content crawled, the index built, the interfaces set up, and the system optimized, the job is complete, right?
Not quite. Retrofitting a keyword search system to meet today’s security requirements is a complex, time consuming, and expensive task. That’s why “experts” who write about search facets, search as a Big Data system, and search as a business intelligence solution ignore security or reassure their customers that it is no big deal. Security is a big deal, and it is becoming a bigger deal with each passing day.
There are a number of security issues to address. The easiest of these is figuring out how to piggyback on access controls provided by a system like Microsoft SharePoint. Other organizations use different enterprise software. As I said, using access controls already in place and diligently monitored by a skilled security administrator is the easy part.
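The "easy part" is often implemented as security trimming: the indexer copies each document's access control list from the source system, and the search layer drops hits the current user is not entitled to see. The sketch below is a minimal illustration under that assumption. The `Hit` class, the `trim_results` function, and the group names are hypothetical; none of this is SharePoint's or any vendor's API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hit:
    doc_id: str
    # Groups allowed to read the document, assumed to have been copied
    # from the source system's access control list at index time.
    allowed_groups: frozenset

def trim_results(hits, user_groups):
    """Keep only hits whose ACL shares at least one group with the user."""
    user_groups = frozenset(user_groups)
    return [hit for hit in hits if hit.allowed_groups & user_groups]

# Hypothetical result list and group names.
hits = [
    Hit("proposal.docx", frozenset({"sales", "executives"})),
    Hit("salary-review.xlsx", frozenset({"hr"})),
    Hit("handbook.pdf", frozenset({"all-staff"})),
]

visible = trim_results(hits, {"sales", "all-staff"})
print([hit.doc_id for hit in visible])
```

A filter like this is only as good as the ACLs it copies and how fresh they are. It does nothing for content that must stay out of the index altogether.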
A number of sticky wickets remain; for example:
- Some units of the organization may do work for law enforcement or intelligence entities. There may be different requirements. Some are explicit and promulgated by government agencies. Others may be implicit, acknowledged as standard operating procedure by those with the appropriate clearance and the need to know.
- Specific administrative content must be sequestered. Examples range from information assembled for employee health or compliance requirements for pharma products or controlled substances.
- Legal units may require that content be contained in a managed system and that administrative controls be put in place to ensure that no changes are introduced into a content set, that access is provided only to those with specific credentials, or that content is kept “off the radar” as the in house legal team tries to figure out how to respond to a discovery activity.
- Some research units may be “black”; that is, no one in the company, including most information technology and security professionals, is supposed to know where an activity is taking place or what information is of interest to the research team, and specialized security steps must be enforced. These can include dongles, air gaps, and unknown locations and staff.
An enterprise search system without NGIA security functions is like a 1960s Chevrolet project car. Buy it ready to rebuild for $4,500 and invest $100,000 or more to make it conform to 2015’s standards. Source: http://car.mitula.us/impala-project
How do enterprise search systems deal with these access issues? Are not most modern systems positioned to index “all” content? Are the procedures for each of these four examples part of the enterprise search systems’ administrative tool kit?
Based on the research I conducted for CyberOSINT: Next Generation Information Access and my other studies of enterprise search, the answer is, “No.”
Has Lightning Struck for MaxxCat?
February 10, 2015
Have you ever heard of MaxxCat? The company has popped up in the back of our RSS feed every now and then when it has accomplished a major breakthrough. It skipped to the forefront of enterprise search news this morning with one of its products. Before we discuss what wonders MaxxCat plans to do for enterprise search, here is a little more about the company.
MaxxCat was established in 2007 to take advantage of the growing enterprise search solutions market. The company specializes in low cost search and storage as well as integration and managed hosting services. MaxxCat creates well-regarded hardware with an emphasis that their clients should be able to concentrate on more important things than storage. The company’s search appliance hosting page explains a bit more about what MaxxCat offers:
“MaxxCAT can provide complete managed platforms using your MaxxCAT appliances in one or more of our data centers. Our managed platforms allow you to focus on your business, and allow us to focus on getting the maximum performance and uptime from your enterprise search appliances. Nobody can host, tune or manage MaxxCAT appliances as well as the people who invented them.”
Enterprise search appliances without a headache? It is a new and interesting concept that MaxxCat seems to have a handle on.
Whitney Grace, February 10, 2015
Sponsored by ArnoldIT.com, developer of Augmentext
Attivio: New, New, New after $70 Million and Seven Years
February 7, 2015
With new senior managers and a hunt on for a new director of financial services, Attivio is definitely trying to shake ‘em up. I received some public relations spam about the most recent version of the Attivio system. The approach combines open source software with home brew code, an increasingly popular way to sell licenses, consulting, and services. To top it off, Attivio is an outfit that has the “best company culture” and Dave Schubmehl’s IDC report about Attivio with my name on it available for free. This was a $3,500 item on Amazon earlier this year. Now. Free.
Attivio’s February 3, 2015, news release explains that Attivio is in the enterprise search business. You can read the presser at this link. Not too long ago, Attivio was asserting that it was the solution to some business intelligence woes. I suppose search and business intelligence are related, but “real” intelligence requires more than keyword search and a report capability.
The release explains that Attivio is, and I find this fascinating, “reinventing Big Data Search and Dexterity.” Not bad for open source, home brew, and Fast Search & Transfer flavoring. Search and dexterity. Definitely a Google Adword keeper.
Attivio’s presser says:
Attivio 4.3 delivers new functionality and improvements that make it dramatically easier to build, deploy, and manage contextually relevant applications that drive revolutionary insight. Companies with structured and unstructured data in disparate silos can now quickly gain immediate access to all information with universal contextual enrichment, all delivered from Attivio’s agile enterprise platform.
I like “revolutionary insight.” Keep in mind that Attivio was formed by former Fast Search & Transfer executives in 2007 and has ingested, according to Crunchbase, $71.1 million in seven years. That works out to $10 million per year to do various technical things and sell products and services to generate money.
More significant to me than money that may be difficult or impossible to repay with a hefty uptick is that in seven years, Attivio has released four versions of its flagship software. With open source providing a chunk of functionality, it strikes me that Attivio may be lagging behind the development curve of some other companies in the content processing sector. But with advisors like Dave Schubmehl and his colleagues, the pace of innovation is likely to be explained as just wonderful. At Cambridge University, one researcher pointed out that work done in 2014 is essentially part of ancient history. There is perhaps a difference between Cambridge in the UK and Cambridge in Massachusetts.
What does Attivio 4.3 offer as “key features”? Here’s what the news release offers:
- ASAP: Attivio Search Application Platform – a simple, intuitive user interface for non-technical users building search-based applications;
- SAIL: Search Analytics Interactive Layer – offers more robust functionality and an enhanced user experience;
- Advanced Entity Extraction: New machine-learning based entity extraction module enriches content with higher accuracy and improved disambiguation, enabling deeper discovery and providing a smart alternative to managing entity dictionaries;
- Simplified Management: Empowers business users to handle documents and manage settings in a code-free environment;
- Composite Documents: Unique ability to search across document fragments optimized to deliver sub-second response times;
- New Designer Tools: Simplifies Attivio management through Visual Workflow and Component Editors, enables all users to design and build custom processing logic in an integrated UI.
Several important features that are available in other vendors’ systems do not appear in the list; for example, geographic functions, automated real-time content collection, automated content analytics, and automated outputs to a range of devices, humans, or other systems.
ASAP and SAIL are catchy acronyms, but I find them less than satisfying. The entity extraction function is interesting but there is no detail about how it works in languages that do not use Roman character sets, how the system deals with variants, and how the system maps one version of an entity to another in content that is either static imagery or video.
I am not sure what a composite document is. If a document contains images and videos, what does the system do with these content objects? If the document is an XML representation, what’s the time penalty to convert content objects to well formed XML? With interfaces becoming the new black, Attivio is closing the gap with the Endeca interface toolkit. Endeca dates from the late 1990s and has blazed a trail through the same marketing jungle that Attivio is now retracing.
For more information about Attivio, visit the company’s Web site at www.attivio.com. The company will be better equipped to explain virtual, enterprise search, big data, and the company’s financial posture than I.
Stephen E Arnold, February 7, 2015