Exalead Clicks with Olympus
January 21, 2011
Olympus, a respected manufacturer of equipment such as cameras, audio devices, and microscopes, is embarking on a relaunch of its 47 subsidiary websites around the world. It has chosen to serve its customers using the e-Spirit Enterprise Search Module, which is based on Exalead’s technology.
Integrating with the FirstSpirit content management system, also from e-Spirit, EnterpriseSearch promises to:
“. . .allow for much more efficient searching of relevant information than is possible with standard full-text searching. Among the benefits for Olympus are a convenient search interface with fast filter options, personalized search results for protected documents, and integrated search via linked ‘non FirstSpirit sources’ such as the shop system.”
Among Olympus’ challenges are managing content in 30 languages and simplifying editing and administration processes. The company also hopes to represent its innovative nature and appeal to its customers’ emotions. We’ll see whether FirstSpirit and e-Spirit are up to the challenge.
Cynthia Murrell January 21, 2011
Attivio Adds Transaction-Like Document Processing
January 17, 2011
We learned that Attivio has added transaction-like document processing to its Active Intelligence Engine. AIE is a scalable, parallel, asynchronous messaging system. Unlike other asynchronous systems, Attivio gives licensees controls to determine when a message is processed. Attivio reveals in “Transaction-like Document Processing in AIE”:
Grouped message processing allows a set (usually small) of related messages to be processed as a group (in order) when needed. This capability can be turned on or off by individual document processing transformers. What this means is that when the grouping isn’t required (the transformer doesn’t have side-effects which depend on the group ordering) then the messages are processed independently, delivering the maximum throughput. When a document transformer does require this transaction-like behavior, a simple configuration change is all that is necessary. This change causes the following semantics to come into play for the component:
- All non-document messages are blocked while processing the group. A system commit or optimize message cannot be processed until the group is complete. This means the processing of all the messages of the group will occur together before or after the commit, ensuring a consistent state.
- The documents in the group are sent to the component in the order they were added to the group.
- Only a single instance of the component is used to process the documents in the group.
The component that most heavily uses message groups is the ContentDispatcher (the gateway component for the AIE index).
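The semantics Attivio describes can be illustrated with a small sketch. This is not Attivio's API; the class `GroupingTransformer` and its method names are hypothetical, and the sketch simplifies by buffering a group until it is closed. It shows the three rules from the list above: control messages such as a commit wait until the group is complete, documents keep the order in which they were added, and one instance handles the whole group.

```python
from collections import deque


class GroupingTransformer:
    """Hypothetical single-instance component (not Attivio's API)."""

    def __init__(self, grouping_enabled=True):
        self.grouping_enabled = grouping_enabled
        self.log = []              # order in which messages were processed
        self._open_group = None    # documents collected for the open group
        self._deferred = deque()   # control messages held back by the group

    def begin_group(self):
        # With grouping switched off, messages are handled independently.
        if self.grouping_enabled:
            self._open_group = []

    def submit_document(self, doc_id):
        if self._open_group is not None:
            self._open_group.append(doc_id)
        else:
            self.log.append(("doc", doc_id))

    def submit_control(self, name):
        # A commit or optimize message is blocked while a group is open.
        if self._open_group is not None:
            self._deferred.append(name)
        else:
            self.log.append(("control", name))

    def end_group(self):
        if self._open_group is None:
            return
        # Documents are processed in the order they joined the group.
        for doc_id in self._open_group:
            self.log.append(("doc", doc_id))
        self._open_group = None
        # Only now are the blocked control messages released.
        while self._deferred:
            self.log.append(("control", self._deferred.popleft()))


grouped = GroupingTransformer(grouping_enabled=True)
grouped.begin_group()
grouped.submit_document("a")
grouped.submit_control("commit")
grouped.submit_document("b")
grouped.end_group()
print(grouped.log)
# [('doc', 'a'), ('doc', 'b'), ('control', 'commit')]
```

With `grouping_enabled=False`, the same calls leave the commit between the two documents, which is the independent, maximum-throughput behavior the quoted passage mentions.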
You can get additional information, block diagrams, and links to special features on the Attivio Web site or in Attivio’s blog.
Stephen E Arnold, January 17, 2011
Freebie
Oracle and Drive BI Targets SAP
January 17, 2011
Search is in the Oracle enterprise products and services. Search is just not where the action is at either Oracle or SAP. As more agile competitors like Exalead erase the boundaries between traditional big iron business intelligence and petascale unstructured information, the solution is to go after established competitors. My view is that this approach will be good for those who are working in open source business intelligence and in the next generation business intelligence solutions from outfits like Digital Reasoning.
A good example of dinosaurs snorting appears in “Introduction of Oracle Financial Analytics for SAP Extends Intelligence to SAP Financial Accounting.” Oracle’s initiative is a product and service package laser-sighted on SAP’s ageing system. Now “ageing” is relative. Oracle’s technology is no spring chicken either. Here’s what Oracle says about the business intelligence offering for the Bob Cratchits of industry:
“Organizations that rely on SAP Financial Accounting can now turn to Oracle Financial Analytics for SAP to further improve business visibility and align decision-making,” said Paul Rodwick, vice president of Product Management, Oracle Business Intelligence. “For years, SAP customers have used Oracle Business Intelligence Enterprise Edition to deliver complete, relevant insight across their organizations. Having prebuilt analytic applications to support the Oracle E-Business Suite, Oracle’s PeopleSoft Enterprise, Oracle’s Siebel CRM and Oracle’s JD Edwards EnterpriseOne, it was natural to extend that support to SAP with Oracle Financial Analytics for SAP.”
Will this work? Sure. Oracle has won its recent legal tussle with SAP, and Oracle gets interest on the judgment. SAP is a juicy target. The problem is that new players in business intelligence are getting more attention. Even SAS is upping its game. IBM continues to move in erratic ways but may get its act together in business intelligence.
My view is that fixation on SAP is good and bad. The good is that SAP is an outfit with some challenges. The bad is that Oracle’s concentration may leave it vulnerable to some lateral pressure.
Stephen E Arnold, January 17, 2011
Freebie unlike some of the Oracle and SAP licensee support.
Taxonomy and Efficiency
January 14, 2011
One of the leaders in the data and content management field, Access Innovations, Inc., has compiled a list in “10 Reasons to Resolve to Create a Taxonomy for Your Business in 2011.” A taxonomy creates an organized system of classification.
Here are three of the reasons with which we resonated:
Every person or department uses a different term, even though they’re all talking about the same thing. Your coworkers can’t find the company policy for the Fourth (or Fifth, or Sixth) of July, because it’s tagged as Independence Day? An enterprise taxonomy can get all of you searching the same language, if not talking it.
A coworker just spent 45 minutes trying to locate a document, but didn’t know what search term to use. Taxonomy browsing should work for him or her. And with synonyms, he/she can look for eye doctors or even “optimalogists” and find ophthalmologists.
Everything for HR gets called “HR” – all 10,000 documents. Get your indexers, taggers, and searchers browsing down to the more specific terms that a taxonomy can show them. You have HR documents on free pizza as a fringe benefit? Add Fringe benefits as a narrower term, and add Free pizza under Fringe benefits, so people can save some dough.
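To make these three reasons concrete, here is a minimal sketch of a taxonomy with synonyms and narrower terms. It is not Access Innovations' product; the `Taxonomy` class and its methods are hypothetical, and the terms come from the quoted examples. Variants are normalized for case and punctuation, mapped to one preferred term, and a broad term such as HR can be expanded to the narrower terms beneath it.

```python
class Taxonomy:
    """Hypothetical toy taxonomy: preferred terms, synonyms, narrower terms."""

    def __init__(self):
        self._preferred = {}  # normalized variant -> preferred term
        self._narrower = {}   # preferred term -> list of narrower terms

    @staticmethod
    def _norm(term):
        # Lowercase, turn punctuation into spaces, collapse whitespace.
        cleaned = "".join(
            ch if ch.isalnum() or ch.isspace() else " " for ch in term.lower()
        )
        return " ".join(cleaned.split())

    def add_term(self, preferred, synonyms=(), broader=None):
        self._narrower.setdefault(preferred, [])
        for variant in (preferred, *synonyms):
            self._preferred[self._norm(variant)] = preferred
        if broader is not None:
            self._narrower.setdefault(broader, []).append(preferred)

    def resolve(self, query):
        """Return the preferred term for a query, or None if unknown."""
        return self._preferred.get(self._norm(query))

    def expand(self, preferred):
        """Return the term followed by all of its narrower terms."""
        result = [preferred]
        for child in self._narrower.get(preferred, []):
            result.extend(self.expand(child))
        return result


tax = Taxonomy()
tax.add_term("Independence Day", synonyms=["Fourth of July", "July 4th"])
tax.add_term("Ophthalmologists", synonyms=["eye doctors", "optimalogists"])
tax.add_term("HR")
tax.add_term("Fringe benefits", broader="HR")
tax.add_term("Free pizza", broader="Fringe benefits")

print(tax.resolve("fourth of JULY!"))  # Independence Day
print(tax.resolve("Eye-doctors"))      # Ophthalmologists
print(tax.expand("HR"))                # ['HR', 'Fringe benefits', 'Free pizza']
```

A production taxonomy would also need to guard against cycles in the broader/narrower links and handle terms with more than one parent; the sketch leaves both out.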
Access Innovations asserts that its taxonomy system can help eliminate irrelevant and unwanted search results and will enable a searcher to use related terms for the same search regardless of punctuation, spelling, or terminology.
“The bottom line is that a good taxonomy can save your staff time, and your organization time and money.”
If this system can save companies both time and money, why not give it a try? Let’s face it: in today’s economy, where every penny counts, waste and inefficiency can lead to the failure of a business venture, while efficiency and ease tend to lend themselves to success.
Leslie Radcliff, January 14, 2011
Freebie
IBM OmniFind Search Documentation
January 12, 2011
We fielded a call about the architecture of IBM’s portal and OmniFind search technologies. The new OmniFind may have hit the market, but the IBM online documentation carries the date of January 2008. We still think that anyone integrating various IBM bits and pieces for an Intranet will want to check out the “old” documentation.
Here are the types of information available and a hot link to each set of Web pages. When we last checked (January 11, 2011), the information was online and did not require an IBM account or password to access.
- General information.
- Integration points. Important information.
- Installation procedure. Not exhaustive but useful.
- Specific integration of the IBM portal and search systems.
- Some additional IBM resources.
Can you install various IBM components without reading the documentation? Sure, but you will have some backtracking to do in order to figure out dependencies. Even though the documents pointed to date from 2008, the approach and method are useful.
Stephen E Arnold, January 12, 2011
Freebie
SAP Embedded Search
January 12, 2011
On a call today (January 10, 2011), we fielded a question about TREX, the plumbing for SAP’s bespoke search and retrieval service. The question had to do with what SAP calls search. Well, we have had good luck locating information about TREX in the current SAP environment by searching for the string “embedded search.” For queries about TREX in earlier versions of R/3, you may have more luck with the phrase “search engine service.” Years ago, I did a chapter in a book about TREX. My recollection is that information in 2006 and 2007 was difficult to find as well.
If you want to know about TREX and its various incarnations, you might consider these links:
- Basic description of TREX
- Discussion of classification functions
- Connecting a local search with an SAP search hub
We continue to hear chatter about third party solutions that “snap in” to SAP R/3 environments. We also hear comments about SAP’s on again, off again interest in buying a search vendor. Our suggestion: stick with SAP’s engine. A good alternative may become available.
Stephen E Arnold, January 12, 2011
Freebie
Wikileaks and Metadata
January 7, 2011
ITReseller’s “Working to Prevent Being the Next Wikileak? Don’t Forget the Metadata.” is worth a look. The write up calls attention to indexing as part of an organization’s buttoning up its document access procedures.
ITReseller says this about metadata:
A key part of the solution is metadata – data about data (or information about information) – and the technology needed to leverage it. When it comes to identifying sensitive data and protecting access to it, a number of types of metadata are relevant: user and group information, permissions information, access activity, and sensitive content indicators. A key benefit to leveraging metadata for preventing data loss is that it can be used to focus and accelerate the data classification process. In many instances the ability to leverage metadata can speed up the process by up to 90 percent, providing a shortlist of where an organisation’s most sensitive data is, where it is most at risk, who has access to it and who shouldn’t. Each file and folder, and user or group, has many metadata elements associated with it at any given point in time – permissions, timestamps, location in the file system, etc. – and the constantly changing files and folders generate streams of metadata, especially when combined with access activity. These combined metadata streams become a torrent of critical metadata. To capture, analyze, store and understand so much metadata requires metadata framework technology specifically designed for this purpose.
Some good points here, but what raised our eyebrows was the thought that organizations have not yet figured out how to “index”. Automation is a wonderful thing; however, the uses of metadata are often anchored in humans. One can argue that humans need play no part in indexing or metadata.
We don’t agree. Maybe organizations will take a fresh look at adding trained staff to tackle metadata. By closing in house libraries, many organizations lost the expertise needed to deal with some of the indexing issues touched upon in the article.
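For readers who want to see how the four metadata types in the ITReseller passage fit together, here is a minimal sketch. It is not any vendor's framework; the `FileMetadata` record, the `shortlist` function, and the sample data are all hypothetical. It combines permissions, group membership, access activity, and a sensitive content count to produce the kind of shortlist the article describes. A trained person still has to decide what counts as sensitive and what to do with each flagged file.

```python
from dataclasses import dataclass, field


@dataclass
class FileMetadata:
    """Hypothetical per-file metadata record."""
    path: str
    allowed_groups: set   # permissions information
    sensitive_hits: int   # sensitive content indicator (e.g. pattern matches)
    accessed_by: set = field(default_factory=set)  # access activity


def shortlist(files, group_members, broad_groups=frozenset({"Everyone"})):
    """Return (path, reasons) for files that look sensitive and exposed."""
    flagged = []
    for f in files:
        if f.sensitive_hits == 0:
            continue  # nothing sensitive detected, skip
        reasons = []
        if f.allowed_groups & broad_groups:
            reasons.append("open to a broad group")
        # Everyone who could open the file, via group membership.
        permitted = set()
        for group in f.allowed_groups:
            permitted |= group_members.get(group, set())
        # Users who hold access but have never used it.
        idle = sorted(permitted - f.accessed_by)
        if idle:
            reasons.append("access never used by: " + ", ".join(idle))
        if reasons:
            flagged.append((f.path, reasons))
    return flagged


group_members = {
    "Finance": {"ann", "bob"},
    "Everyone": {"ann", "bob", "cy"},
}
files = [
    FileMetadata("/hr/salaries.xls", {"Finance"}, 12, {"ann"}),
    FileMetadata("/pub/menu.txt", {"Everyone"}, 0, {"cy"}),
    FileMetadata("/legal/contract.doc", {"Everyone"}, 3, {"ann", "bob", "cy"}),
]
for path, reasons in shortlist(files, group_members):
    print(path, reasons)
```

The sketch only narrows the list of files to review. Choosing the sensitive content patterns and judging each flagged file is the indexing work we think still calls for people.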
Stephen E Arnold, January 7, 2011
Freebie
Oracle Documentation for SES11g
January 7, 2011
On a phone call yesterday (January 5, 2011), we learned that Oracle has a public documentation page at this location. The point made during the conversation was that this Oracle documentation page does not include an explicit link to either Oracle Text or Oracle’s enterprise search systems, Oracle SES10g and SES11g.
Frankly, we did not believe this statement. We took a look.
We found that the person telling us about this omission was partially correct. If you download the documentation for the Oracle Database, there are references to Oracle Text. We did not spot a direct link on this Oracle page to the company’s enterprise search system.
You cannot locate the documentation by running the query “SES11g” from this link.
So what do you do if you want SES11g documentation?
Well, you have to do some scouting around. If you click this link, you will get the PDF of “Oracle Universal Content Management.” The document is dated May 2010, and the information in that file will get you rolling.
We had in our bookmarks a link to a Web page on the Oracle site called “Oracle Secure Enterprise Search”. You can get what appears to be reasonably complete installation information at this link. If you are working with SES11g, you may already have this page bookmarked. If you want to know more about SES11g, this Installation and Upgrade Guide will be useful at some point.
You may find the mini-access page called Tahiti helpful as well: http://tahiti.oracle.com/
What does this exercise suggest about Oracle’s commitment to search and retrieval? We were surprised, to say the least. Adding an explicit link to the Oracle documentation page seems easy from our vantage point in Harrod’s Creek.
Stephen E Arnold, January 7, 2011
Freebie
Lucid Imagination Moves to the Enterprise
January 7, 2011
“Has Lucid Imagination Found the Open Source Solution for Enterprise Search?” asks if, like the Star Trek Enterprise, Lucid Imagination has done what no other open source search engine has done before and created a product worth paying for. Why not just use Apache Solr/Lucene?
The article points out that without Lucid you can’t just index and search a set of documents; you have to create each connection type, and, most importantly, there is no security. It’s also easy to change over to Lucid, since it’s built on top of the Apache engine without any significant alterations. To sum up:
Lucid Imagination reduces the technical complexity of leveraging Solr by providing an automated installer, configurable data connectors and a web-based administration interface. In addition to the add-on to Solr/Lucene users can easily observe, multiple enhancements were made to make the solution easier to deploy to the cloud.
If you’re interested in trying it out, the annual fees are straightforward too.
Alice Wasielewski, January 7, 2011
Manifold CF for Open Source SharePoint Search
January 6, 2011
Although we think Microsoft Fast is the cat’s pajamas for enterprise search, some Microsoft SharePoint admins may have an interest in an open source alternative. Manifold CF (formerly known as Apache Connectors Framework) is undergoing incubation at Apache. If you’re not familiar with this project, it is explained as follows: “ManifoldCF is an effort to provide build and support an open source framework for connecting source content repositories like Microsoft SharePoint and EMC Documentum, to target repositories or indexes, such as Apache Solr.” One of our open source contacts reminded us that ManifoldCF was formerly the Lucene Connectors Framework.
The ManifoldCF splash page says that the project is in incubation:
Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
The info page adds, “There are currently no released versions of ManifoldCF available, although one is planned. However, a trunk svn checkout has been built and successfully hand-tested. Nightly builds and online javadoc will be coming shortly.” Worth watching.
Alice Wasielewski, January 6, 2011