Exclusive Interview: Sam Brooks, EBSCO Publishing
January 18, 2011
We have been covering “discovery” in Beyond Search since 2008. We added a discovery-centric blog called IntelTrax to our lineup in September 2010. One of the companies that caught our attention was EBSCO Publishing, one of the leaders in the commercial database, library information, and electronic publishing sectors. EBSCO has embraced discovery technology, making “search without search”, faceted navigation, and other user-centric features available to EBSCO customers. Chances are your university, junior college, middle school, and primary school libraries use EBSCO products and services. Thousands of organizations worldwide rely on EBSCO for high-value, third-party content, including rich media. You can get the details of the EBSCO content and information services offerings at http://www.ebscohost.com/.
I wanted to know how a company anchored in online technology moved “beyond search” so effectively. I spoke last week with Sam Brooks, senior vice president of EBSCO Publishing. He told me:
As library users have grown accustomed to the simplicity and one-stop shopping of web search engines, EDS allows users to initiate a comprehensive search of a library’s entire collection via a single search box. The true value of EDS is that while providing a simple, familiar search experience to end users, the sophistication of the service combined with the depth of available metadata allows EDS to return extensive results as if the user had performed more advanced searches across a number of premium resources.
EBSCO’s presentation is easily customized. This particular user interface matches the rich options available from such companies as i2 Ltd. and Palantir, two leaders in the “beyond search” approach to information.
The new discovery interface makes it easy to pull together a broad range of content to answer a user’s query. The interface then goes farther. Exploring a topic or following a research thread is facilitated with the hot links displayed to the user. The technology for the user interface is intuitive. Mr. Brooks told me:
By using our EBSCOhost infrastructure as the foundation for EBSCO Discovery Service (EDS), the entire library collection becomes available through a fast, familiar, full-featured experience that requires no additional training. Additionally, unprecedented levels of interface customization allow libraries to use EDS as the basis for creating their own “discovery” service. Currently, users can access EDS via the mobile version of the EBSCOhost interface. Further, there will soon also be a dedicated iPhone/iPad app for use with EDS as well.
For the full text of the exclusive interview, navigate to the Search Wizards Speak feature at this link.
Stephen E Arnold, January 18, 2011
Freebie
Oracle and Drive BI Targets SAP
January 17, 2011
Search is in the Oracle enterprise products and services. Search is just not where the action is at either Oracle or SAP. As more agile competitors like Exalead erase the boundaries between traditional big iron business intelligence and petascale unstructured information, the solution is to go after established competitors. My view is that this approach will be good for those who are working in open source business intelligence and in the next generation business intelligence solutions from outfits like Digital Reasoning.
A good example of dinosaurs snorting appears in “Introduction of Oracle Financial Analytics for SAP Extends Intelligence to SAP Financial Accounting.” Oracle’s initiative is a product and service package laser sighted on SAP’s ageing system. Now “ageing” is relative. Oracle’s technology is no spring chicken either. Here’s what Oracle says about the business intelligence offering for the Bob Cratchits of industry:
“Organizations that rely on SAP Financial Accounting can now turn to Oracle Financial Analytics for SAP to further improve business visibility and align decision-making,” said Paul Rodwick, vice president of Product Management, Oracle Business Intelligence. “For years, SAP customers have used Oracle Business Intelligence Enterprise Edition to deliver complete, relevant insight across their organizations. Having prebuilt analytic applications to support the Oracle E-Business Suite, Oracle’s PeopleSoft Enterprise, Oracle’s Siebel CRM and Oracle’s JD Edwards EnterpriseOne, it was natural to extend that support to SAP with Oracle Financial Analytics for SAP.”
Will this work? Sure. Oracle has won its recent legal tussle with SAP and Oracle gets interest on the judgment. SAP is a juicy target. The problem is that new players in business intelligence are getting more attention. Even SAS is upping its game. IBM continues to move in erratic ways but may get its act together in business intelligence.
My view is that fixation on SAP is good and bad. The good is that SAP is an outfit with some challenges. The bad is that Oracle’s concentration may leave it vulnerable to some lateral pressure.
Stephen E Arnold, January 17, 2011
Freebie unlike some of the Oracle and SAP licensee support.
Finding Needles in OmniFind XML Haystacks
January 13, 2011
Yes, you can use IBM DB2 for XML searches. “CI Can . . . Search XML in OmniFind V1R2” gives examples from the XML search in IBM DB2 OmniFind V1R2. To sum up: “XML search allows a search to be scoped to a specific element or attribute, rather than the entire document. In addition, the search syntax allows comparing an element or attribute value to a numeric, ISO date or ISO dateTime value during the search.” It also supports XML namespaces on element and tag names. No need to break out the metal detector; the needles will pop right out of those haystacks.
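To make the idea of a scoped search concrete, here is a minimal sketch in Python. It does not use OmniFind or its query syntax; it relies on the standard library's `xml.etree.ElementTree`, and the sample catalog, the `scoped_search` function, and every element and attribute name are invented for illustration. The sketch matches a word only inside `title`, compares `price` to a number, and compares the `published` attribute to an ISO date.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical sample document; element and attribute names are invented.
SAMPLE = """
<catalog>
  <book id="b1" published="2010-03-15"><title>Needle Work</title><price>12.50</price></book>
  <book id="b2" published="2008-11-02"><title>Haystack Farming</title><price>30.00</price></book>
  <book id="b3" published="2010-12-01"><title>Farming Needle Crops</title><price>18.00</price></book>
</catalog>
"""


def scoped_search(xml_text, title_word, max_price, published_after):
    """Return ids of books whose <title> contains title_word, whose <price>
    is below max_price, and whose published attribute is after the date."""
    root = ET.fromstring(xml_text)
    hits = []
    for book in root.iter("book"):
        # Scope the text match to the <title> element only.
        title_words = book.findtext("title", default="").lower().split()
        # Numeric comparison on an element value.
        price = float(book.findtext("price", default="inf"))
        # ISO date comparison on an attribute value.
        published = date.fromisoformat(book.get("published"))
        if (title_word.lower() in title_words
                and price < max_price
                and published > published_after):
            hits.append(book.get("id"))
    return hits


if __name__ == "__main__":
    print(scoped_search(SAMPLE, "needle", 20.0, date(2010, 1, 1)))
```

A whole-document keyword search for "farming" would return two books here; the scoped version with the price and date conditions narrows that to one.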
Alice Wasielewski, January 11, 2011
MarkLogic and Change
January 10, 2011
Short honk: I read “You Say Goodbye, I Say Hello.” The write up by Dave Kellogg reported that he will be leaving MarkLogic. MarkLogic Corp. has gained considerable traction in publishing and a couple of other business sectors with the MarkLogic Server product. You can get more information about that product at this link. MarkLogic Server is a database built for unstructured information. Depending on the licensee’s use of MarkLogic Server, the resulting implementation can “look like” search, business intelligence, a custom-publishing system, or another information application.
In his blog post, Mr. Kellogg said:
I am proud of what we accomplished during my six years at the MarkLogic: acquiring over 200 enterprise customers, growing annual revenues at a 75% CAGR, raising $27.5M in venture capital, and growing the company from 40 to over 230 employees. I am particularly happy to say that I will be leaving the company in a position of strength, having exceeded the 2010 revenue plan targets and with nearly $20M cash in the bank.
What’s next for MarkLogic? We have been impressed with the MarkLogic technology for years. We will keep you posted.
Stephen E Arnold, January 10, 2011
Oracle Documentation for SES11g
January 7, 2011
On a phone call yesterday (January 5, 2011), we learned that Oracle has a public documentation page at this location. The point made during the conversation was that this Oracle documentation page does not include an explicit link to either Oracle Text or Oracle’s enterprise search systems, Oracle SES10g and SES11g.
Frankly, we did not believe this statement. We took a look.
We found that the person telling us about this omission was partially correct. If you download the documentation for the Oracle Database, there are references to Oracle Text. We did not spot a direct link on this Oracle page to the company’s enterprise search system.
You cannot locate the documentation by running the query “SES11g” from this link.
So what do you do if you want SES11g documentation?
Well, you have to do some scouting around. If you click this link, you will get the PDF of “Oracle Universal Content Management.” The document was dated May 2010, and the information in that file will get you rolling.
We had in our bookmarks a link to a Web page on the Oracle site called “Oracle Secure Enterprise Search”. You can get what appears to be reasonably complete installation information at this link. If you are working with SES11g, you may already have this page bookmarked. If you want to know more about SES11g, this Installation and Upgrade Guide will be useful at some point.
You may find the mini-access page called Tahiti helpful as well: http://tahiti.oracle.com/
What does this exercise suggest about Oracle’s commitment to search and retrieval? We were surprised, to say the least. Adding an explicit link to the Oracle documentation page seems easy from our vantage point in Harrod’s Creek.
Stephen E Arnold, January 7, 2011
Freebie
Lexalytics and DataSift
December 31, 2010
If sentiment analysis is the key ingredient in the social content technology cocktail, then Lexalytics aims to be the brand of choice for businesses and individual consumers everywhere. MediaSift Ltd., the British company behind the DataSift social media filtering engine, is eager to see a partnership with the Lexalytics text analysis software take root.
We learned in “DataSift Taps Lexalytics to Help ‘Tune Your Data’” that one focus of the alliance is the ever-increasing accumulation of data generated from tweeting. “Lexalytics provides the ability to automatically extract companies, people or product names, without having a list of them ahead of time; the ability to calculate tweet, entity, and “linked-content” sentiment; output lists of positive/negative entities; and more.”
Nick Halstead is the founder and CEO of Favorit Ltd., owner of Tweetmeme, a service designed to total all links and ascertain which are the most popular. “An important part of the metrics we provide through Datasift is the sentiment, or tonality of the data. We needed an engine that could integrate quickly into our environment and start immediately providing accurate sentiment analysis across all our data services,” says Halstead. “Lexalytics Salience gives us a great combination of flexible integration, high performance and accurate sentiment analysis.”
Another goal of the union is to give users the tools to observe and respond in real-time. This is accomplished through the interpretation of massive amounts of data from a variety of online sources. The Lexalytics software possesses the capability of converting all English text and is compatible with multiple systems. Looks like another player in social content technology is being added to the shaker.
Sarah Rogers, December 31, 2010
Freebie
Clarabridge: Quick Update
December 30, 2010
Clarabridge, Inc., has announced new deals and partnerships with J.D. Power & Associates, Teradata and Verint.
Clarabridge describes itself as “the leading provider of sentiment and text analytics software for customer experience management.”
Clarabridge’s agreement with J.D. Power & Associates is part of J.D.’s efforts to expand the reach of its social-media research. J.D. Power also signed a separate agreement with NetBase. J.D. Power explained that it chose Clarabridge “to provide best-in-class analysis of verbatim texts aggregated from CRM systems and from consumers in response to surveys.”
Clarabridge’s partnership with data warehousing leader Teradata also includes a reseller agreement. The companies believe that their complementary voices will allow end users to “integrate structured and unstructured data in a single warehouse for a 360 degree view of their quantitative and qualitative customer analysis.”
The OEM/Partnership with Verint Systems, Inc., “a global leader in Actionable Intelligence solutions and value-added services,” revolves around Verint’s new Impact 360 Text Analytics solution, which is part of its Customer Interaction Portfolio from Verint Witness Actionable Solutions. Impact 360 Text Analytics will enable Clarabridge to gain “an aggregated and unified view of customer service, experiences and opinions across both direct and indirect customer communications channels.”
Pete Fernbaugh, December 30, 2010
Freebie
Big Data, CAP, and NoSQL
December 19, 2010
We came across an interesting series of Web write ups about big data. You may know about the CAP theorem. The idea is that it is “impossible for a distributed computer system to provide simultaneously” guarantees of “consistency (all nodes see the same data at the same time), availability (node failures do not prevent survivors from continuing to operate), [and] partition tolerance (the system continues to operate despite arbitrary message loss).” For more, read the Wikipedia entry here.
Nati Shalom’s blog series, Parts I, II, and III, on the CAP theorem postulates that if you are worried about CAP, then maybe you just need to redefine your needs. Shalom’s general thesis is as follows:
One of the core principals behind the CAP theorem is that you must choose two out of the three CAP properties. In many of the transactional systems giving away consistency is either impossible or yields a huge complexity in the design of those systems. In this series of posts, I’ve tried to suggest a different set of tradeoffs in which we could achieve scalability without compromising on consistency. I also argued that rather than choosing only two out of the three CAP properties we could choose various degrees of all three.
Some useful info to tuck away for future reference and consult before talking to a vendor who is pitching scale and the cloud when you need search or content processing.
Alice Wasielewski, December 19, 2010
Freebie
Why Big Name Enterprise Search Is So Costly
December 17, 2010
“How Much Time Out of Your Day Does IBM Waste?” is about IBM’s WebSphere Application Server and related components. The author does a good job of explaining how undocumented dependencies and bugs suck up his work time. Of course, a company that relies on IBM technology has made a business decision that probably had little to do with the challenges the firm’s technical professionals must face on an ongoing basis.
Here’s a passage from the write up that caught my attention:
The sad thing is that RAD is nothing but Eclipse, weighted down with IBM plugins and I love Eclipse. The latest release, Helios, is one of the nicest IDEs that you can use and it is totally free. It does everything RAD can do and it leaves a lighter foot print…
If this is an accurate statement, it shines a bright light on IBM’s use of open source technology. I am not sure I enjoy what the light shows me. Like many other 66 year old geese, I prefer the stage illusion of a well-oiled machine and its bulletproof engineering. Reality is often different from the marketing collateral, I suppose.
The other passage I downloaded to my IBM file was this one:
No one ever calculates the lost productivity when the consider IBM products and really no one looks at the amount of money spent either. There are plenty of open source solutions that are faster, easier to configure and support is a Google away. My preference is to use Tomcat. Since every sane developer pretty much uses Spring anyway, Tomcat is the perfect choice and it is easy to support and maintain. JBoss is another great choice if you must have more J2EE container features, but again, by using Spring, they are mostly unnecessary.
Lost productivity. That means money. And when chief financial officers look to reduce costs, will the beancounter’s eyeballs focus on the expenses (both direct and indirect) that some large vendors’ software imposes? I know the answer is, “It depends.”
And enterprise search?
OmniFind 9.x is based on open source technology. I did this mental calculation: What’s the cost of direct and indirect engineering associated with a full IBM-centric search system? I ran through the costs of the hardware, field replaceable units, engineering support, and maintenance for WebSphere, OmniFind, and training for the bits and pieces. How much?
A lot. What got me thinking was that IBM is using open source to generate revenue for its high margin businesses like consulting, engineering support, and maintenance.
The point of the Jeviathon article was that the author wanted to use other, lower cost tools, but the IBM commitment locks in certain technical challenges and, of course, the revenue for IBM from services.
After reading Jeviathon’s article, I formed a different impression of IBM’s commitment to open source. Thinking about Oracle’s stance on open source, I concluded that open source may be a stalking horse. If big name search vendors follow in IBM’s footsteps, the deployments have built in costs that may be difficult to control.
Big time search solutions are expensive because they are designed to generate a revenue stream for the vendor. No problem with that, of course. I like the idea of open source software providing the base and then the vendor wrapping the solution in Velcro so the hooks dig in and keep the money flowing from the client to IBM. Would IBM take such actions to generate revenue? I don’t know, and it is an interesting hypothesis to consider.
Stephen E Arnold, December 17, 2010
Freebie
CopperEye: Speedy Stuff
December 10, 2010
I came across CopperEye several years ago. I was looking for a solution that would cope with large volumes of data, mainframe and client server hardware, and specific performance requirements. CopperEye met the specs. In London last week, I engaged in a conversation and learned that CopperEye was not widely known in the more traditional search and retrieval field. The purpose of this write up is to provide some basic information about the company. In a nutshell, the firm offers a system that can discover, parse and index data in a relational database or flat file output. The method can handle “big data”. (A video demo is available on YouTube.)
In 2007, In-Q-Tel, the investment arm of the Central Intelligence Agency, signed a deal for a strategic investment in CopperEye. In that 2007 announcement, an In-Q-Tel spokesperson said:
“We selected CopperEye because it offers superior technology in the area of the retention and retrieval of structured, historical data,” said Troy M. Pearsall, Executive Vice President of Technology Transfer at In-Q-Tel. “Given the volume of information gathered by organizations within the public and private sectors, it made perfect sense to invest in an innovative data access technology that will potentially meet the critical needs of the U.S. Intelligence Community. We look forward to working with CopperEye in the coming months and years.”
Based on the information in my Overflight system, CopperEye is privately held. Now about 10 years old, the company provides enterprise class archiving solutions, including compliance archiving. The firm’s search product is called CopperEye Search. The Greenwich product uses standard SQL to retrieve records from log files. The Secure Data Retrieval Server is an appliance that complies with data retention regulations. The CopperEye Indexing function is optimized for high speed.
The current version of the Retrieval Server features improved compression. The compression introduces no latency while yielding more efficient storage and reduced disk accesses. The system has been engineered for high availability. When deployed as a distributed system, queries operate as though the data set were a single environment. One interesting feature is that the system can be configured to process queries using parallel, failover, or round robin methods.
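For readers who have not met the three query-processing methods before, here is a minimal Python sketch of the general ideas. It is not CopperEye code and says nothing about how the Retrieval Server is built; the `QueryRouter` class, the `make_node` helper, and the toy in-memory "nodes" are all hypothetical.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


def make_node(name, rows, healthy=True):
    """Build a hypothetical node: a function that returns matching rows."""
    def run(query):
        if not healthy:
            raise ConnectionError(f"{name} is unreachable")
        return [f"{name}:{row}" for row in rows if query in row]
    return run


class QueryRouter:
    """Illustrative dispatcher over a list of node callables."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._cycle = itertools.cycle(self.nodes)

    def round_robin(self, query):
        # Each call goes to the next node in turn, spreading the load.
        return next(self._cycle)(query)

    def failover(self, query):
        # Try nodes in order; move on only when one is unreachable.
        last_error = None
        for node in self.nodes:
            try:
                return node(query)
            except ConnectionError as err:
                last_error = err
        raise RuntimeError("all nodes failed") from last_error

    def parallel(self, query):
        # Ask every node at once and merge the partial results, so the
        # caller sees one result set as if the data lived in one place.
        with ThreadPoolExecutor(max_workers=len(self.nodes)) as pool:
            partials = pool.map(lambda node: node(query), self.nodes)
            return [row for part in partials for row in part]


if __name__ == "__main__":
    a = make_node("a", ["error 1", "ok 2"])
    b = make_node("b", ["error 3"])
    print(QueryRouter([a, b]).parallel("error"))
```

Round robin and failover assume each node can answer the whole query by itself (replicas); the parallel method assumes the data is split across nodes and the partial answers must be merged.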
The CEO of the firm is Carmen Carey. The founders are Paul McCafferty (COO) and Duncan Pauly. You can get more information about the company at www.coppereye.com.
Stephen E Arnold, December 10, 2010
Freebie

