Nstein in the News
October 11, 2010
I had a couple of comments about my not mentioning Nstein, now a unit of OpenText. Nstein has been an interesting company or unit of a bigger enterprise. Last year, one of Nstein’s executives set up a meeting with me and then did not show up. I pinged the fellow and learned that his plans had changed. Since then, my plans for covering Nstein changed as well. Seemed only fair.
To assuage the aggrieved reader, I took a quick look at the content sucked into my Overflight system about Nstein. One of the more interesting items appeared in a publication for which I write a for-fee column. I don’t cover search in that publication, but Archana Venkatraman wrote “Semantic Content Analytics Can Resolve Digital Information Problems.” I was surprised because a picture of me and links to my recent write ups about SAP appeared in the border of the Web version of Venkatraman’s article. I was flattered, but I was confused about the premise of the article; to wit, analytics resolving digital information problems. I think of analytics as causing problems, particularly with regard to the methods used to generate output. Data type and source, privacy, and latency – these topics cross the goose’s mind when he thinks about content analytics.
With regard to Nstein, the passage that caught my attention was information which is attributed, I assume, to an OpenText Nstein executive, Lubor Ptacek, vice president, product marketing:
Semantic Navigation first collects content through a crawling process. Then the content is automatically analyzed and tagged with relevant and insightful entities, topics, summaries and sentiments – the key to providing an engaging online experience. Next, content is served to users through intuitive navigation widgets that encourage audiences to discover the depth of available information or share it on social networks, such as Facebook and Twitter. From there, it supports placement of product and service offerings or advertising to convert page views into sales.
Ptacek gives the example of a medical information professional is searching for the name of a disease, content analytics technology can provide him additional information such as the side effects of the illness the drugs used in the past and so on. “And this logic can be applied to other industries as well.” The solution comes after Open Text acquired Nstein Technologies, a content analytics company, six months ago. It acquired Nstein at a time when analysts were suggesting that such e-discovery solutions could provide sophisticated search and content navigation options that info pros are seeking.
I am hearing similar explanations of functionality from a number of companies. These include “sentiment specialists” like Attensity and Lexalytics and certain mashup vendors such as Digital Reasoning and Kapow Technologies. I have heard the leaders in enterprise search like Autonomy and Exalead reference similar functions. I could toss in IBM, Google, and Microsoft, but I think you get the idea. Quite a few search vendors are morphing into solutions.
If you want more information about OpenText / Nstein, navigate to www.opentext.com. I would also suggest a look at the other vendors making similar assertions. I may have to start covering this new segment of search. Perhaps it warrants a separate Web log?
Stephen E Arnold, October 11, 2010
Freebie
Google Oracle and a Cup of Java
October 10, 2010
I don’t understand how the legal methods work in the US. Oracle files suit. I read in Digital Daily that Google has responded with its own request. “Google Asks Court to Toss Oracle’s Android Lawsuit” informed me that
the search sovereign filed an answer to Oracle’s suit, denying all seven of its patent-infringement charges, and asking that the company’s copyright-infringement claim be dismissed because Google (GOOG) feels it is “legally deficient.”
My take on this particular matter is that Google is a big, fat, easy, rich target. The fact is that a company that was a Google Search Appliance-tolerant outfit has lost patience with Android. The vehicle for the annoyance is Java, a programming language that I had assumed was available for anyone to use. I remember the old “write once, run anywhere” pitch from my long ago tie up with Kendara. That team used Java to create a personalized experience and the technology ended up in the hot hands of the AT&T @Home folks.
Now one needs a lot of time and a lawyer sitting nearby to figure out what the next move will be. Oracle and Google: A pretty interesting spat. At least it was until Microsoft aimed its legal eagles at Motorola in order to take a whack at the GOOG.
My question: After 12 years, isn’t it a bit late to try and corral Googzilla? I remember those briefings I was doing in 2004 and 2005. Telcos and the world’s biggest software company just snorted. Now it is not snorting; it is hyperventilating.
The collateral implications of the Oracle Google matter may spill over into the Wild West of open source software. Whenever smart attorneys take over from technologists, the unpredictable becomes the new normal.
Maybe it is a bit too late. I just read Android is the most popular operating system in the US among recent smartphone buyers. What happened to Apple? Research in Motion? The Kin?
Did Yogi Berra really say, “It gets late early out there”?
Stephen E Arnold, October 10, 2010
Freebie sort of like Android and its sweet names
A Google Triple
October 9, 2010
It has been a long day. More than 300 people coding and chatting. The goslings and I flapped around the Lucene Revolution and survived the debate among some industry big dogs.
After the crowd thinned, I flipped through my newsreader and spotted three items in quick succession. Are these true? Who knows? I did find the three items taken as a bap, bap, bap quite suggestive.
First, the Google, according to one of the super confident Web traffic outfits, accounted for 72 percent of searches in September 2010. The story ran in Search Engine Watch. Is that a monopoly? I sure don’t know, but that’s a hefty chunk of the market.
Second, with that many users, I assumed that happiness was a warm puppy or at least a warm Googzilla. Not according to the Better Business Bureau. “Google Gets a C-Minus for Customer Service” reports that Google, despite its A+ in math, has received a dull normal in helping those customers. As a C minus goose, dull normal is not too bad or is it two bad? For the Hahvad bound, gloom falls or is it fails?
Third, “Former FTC Staffer Files a Complaint against Google” contains some allegations. The story asserts:
the search engine and advertising outfit shares data with third parties. Soghoian doesn’t mince words, asking the FTC to “compel Google to take proactive steps to protect the privacy of individual users’ search terms”. His complaint also includes the aforementioned allegations of personal information being shared with third parties.
One wonders why the US government agency fiddled while Rome smoked?
Now, assume these statements from three sources are semi-accurate. The customer thing seems to be like one of those social type things, not a math type thing. Bap, bap, bap.
Stephen E Arnold, October 9, 2010
Freebie or is it furby? It’s a C minus thing.
Autonomy Spins Another Rich Media Deal
October 8, 2010
Searching radio content based on its meaning alone is indeed unique. That’s how semantics is rapidly transforming traditional search. The Wallstreet Online.de press release “Radioplayer Selects Autonomy” describes the task assigned to Autonomy’s core infrastructure software, the Intelligent Data Operating Layer (IDOL): powering Radioplayer’s search box. Equipped with “unrivalled meaning-based search capabilities, unmatched scalability, and easy maintenance,” as the press release states, “Radioplayer listeners will be able to search every station, identifying news programs, sports highlights, musical genres or even individual songs, and potentially also find a specific place within the program that mentions the topic they are searching for.”
Not only does Radioplayer encourage radio listening, but it also puts the audience in control and makes radio listening easy and accessible. The IDOL search platform will make rich content available across the radio industry through meaningful, relevant search results, and retrieve audio content based on the meaning of key concepts in the live stream. That’s really quick and effortless.
Harleena Singh, October 8, 2010
Freebie
Facebook Shuns Google, Not too Social
October 7, 2010
The questions surrounding Google and the release of a new social “Facebook Killer” network continue to keep the world buzzing. Google’s Vice President Marissa Mayer, in Venture Beat’s “Google’s Mayer Criticizes Content “Locked” Inside Facebook,” expresses her feelings about the social leader Facebook. The article reports that, in a question and answer segment, “her concern about social networks, particularly Facebook, was the fact that so much of their content is hidden from Google and other search engines.” Google wants individuals conducting searches to have access to relevant, non-sensitive information on Facebook. Google does currently pull information from some social sites, but Facebook gives it access to very little information. Given Facebook’s popularity, it is clear that, if Google were granted access, the available content would add to the overall quality of Google’s search results. With the Facebook and Microsoft partnership still taking shape, perhaps that means Google is the odd man out. Facebook is not behaving in what one might describe as a “social” manner.
April Holmes, October 7, 2010
Freebie
PoolCorp and Exalead Complete a High Scoring Deployment
October 6, 2010
PoolCorp has immersed itself in Exalead’s search enabled applications framework, CloudView. According to information from Exalead, in 2009, PoolCorp evaluated its existing e-commerce platform and decided that a complete ground-up rewrite was needed to help its customers find products more easily and quickly. PoolCorp has always provided value-added tools for its dealers to grow their businesses and saw this as an opportunity to combine the best features of all of those tools into a new solution that would address all of the current customer adoption obstacles. With 35 percent of all customer feedback surrounding search and search related functions, it was clear that in order for the new e-commerce site to provide an industry-leading customer purchase experience, an enterprise search solution was required.
Before deciding upon CloudView, PoolCorp reviewed a number of enterprise search solutions including Microsoft FAST, Endeca, Autonomy and open source Solr. Exalead told Beyond Search, “PoolCorp chose Exalead because it was cost effective, scalable and much easier to install than competing products.”
Dustin Hughes, the senior software architect of PoolCorp, said:
Great software like Exalead is like a ball of clay. You can easily push and mold it into how you need to use it. We had an extremely tight timeline for installing this software – due in part to Exalead’s fantastic customer support we got our beta up and working within weeks and rolled out to 500+ customers within 2 months. The entire system became generally available to 30,000+ U.S. customers within four months of the start of development and initial customer feedback has been very positive.
Beyond Search learned that since the POOL360 beta test in July 2010, PoolCorp found:
- The Exalead-based system offered remarkable response times, often within 1/50th of a second even when the application was pulling information directly from PoolCorp’s existing ERP system.
- Exalead’s ability to compress data from its original SQL format resulted in a 10:1 compression ratio, significantly reducing the amount of hardware necessary to deploy the POOL360 solution.
- PoolCorp saved more than $30,000 in hardware costs and licensing fees over alternative SQL-based solutions.
- The Exalead CloudView technology would be an ideal system for an internal enterprise search system.
Founded in 2000 by search engine pioneers, Exalead is the leading search-based application platform provider to business and government. Exalead’s worldwide client base includes leading companies such as PricewaterhouseCoopers, ViaMichelin, GEFCO, World Bank and Sanofi Pasteur, and more than 100 million unique users a month use Exalead’s technology for search. Today, Exalead is reshaping the digital content landscape with its platform, Exalead CloudView, which uses advanced semantic technologies to bring structure, meaning and accessibility to previously unused or under-used data in the new hybrid enterprise and Web information cloud. CloudView collects data from virtually any source, in any format, and transforms it into structured, pervasive, contextualized building blocks of business information that can be directly searched and queried, or used as the foundation for a new breed of lean, innovative information access applications.
For more information about Exalead visit the firm’s Web site at www.exalead.com. Beyond Search uses Exalead’s technology for its blog indexing demonstration here. Our experience has been positive, with zero setup hassles and exceptional stability and performance.
Stephen E Arnold, October 6, 2010
A Google Goal? Capture Indian SME Market
October 6, 2010
Google now has a target that matches its size and reputation. The Financial Express article “Google has 35m Indian SMEs on its radar,” reveals that India presently has only 200,000 SMEs with an online presence, which is less than 1 percent of its entire SME sector. As per the article, Google wants to convert the entire Indian SME sector into a potential customer base, and is on a “large-scale mission to educate smaller businessmen about the viability of the Internet for finding a market for their products.”
Even though the article reports Google’s doubling of its Indian SME customer base in a year, an aggressive campaign for online advertising, and plans for doubling its call center support, we reckon it will still be tough for Google to change the mindsets of most Indian businesspersons. Having said that, we note that Internet usage is on the rise in India, which increases the chances of these businesspersons being lured to Google’s plan. I am not writing from the goose pond in Harrod’s Creek. I am writing from India. Different perspective perhaps?
Harleena Singh, October 6, 2010
A Linux Warning: Information or Disinformation
October 5, 2010
A reader sent me a link to a story on TechNewsWorld. “Penguins Old, Penguins New, Penguins Battered and Penguins Blue” provides some cautionary words about open source in general and Linux in particular. I am not able to say whether the information in the article is 100 percent spot on, but I did want to capture its main points. The arguments may be germane as open source software continues to chug forward. Later this week I will be at the Lucene Revolution Conference, and I want to make sure I know the pros and cons of the commercial versus open source landscape.
The key point in the TechNewsWorld write up is that Windows 7 is a better option for the use case described in the story. Here’s the passage that caught my attention:
The project’s afflictions included implementation delays, immature software and “disgruntled employees whose displeasure allegedly culminated in the creation of a home page dedicated to venting their gripes and who were so busy grappling with Linux that they no longer managed to do their jobs,” explains a special report in The H.
After indicating that there may be a role for Linux in this particular client situation, the author says:
Three “not-so-easy lessons” can be taken away from the Solothurn story, Hudson suggested:
Problem #1: “There will always be a significant minority that will resist any change.”
Lesson #1: “Plan for resistance, and be ready to modify plans accordingly. Giving up a little early on can mean not losing everything later. No battle plan survives the first engagement intact.”
Problem #2: “Trying to change from one computer monoculture to another ignores practicalities.”
Lesson #2: “Be practical. Save ideology for church on Sunday or discussing politics at the family reunion.”
Problem #3: “Nothing was ready on time, and a lot didn’t work as promised.”
Lesson #3: “Don’t over-promise, don’t over-sell. You’re not the 800-pound monkey — you can’t sell vaporware and then fling poo at your customers and hope some of it sticks.”
Yep, without resources, knowledge, and commitment, change is tough. I suppose that’s why 66 percent of Windows users are still running XP.
I am not convinced that this use case is necessarily representative.
Stephen E Arnold, October 5, 2010
Freebie
Yahoo and Prediction
October 5, 2010
Yahoo’s public relations machine is working hard to deal with the flood of news about executive turnover and the questions about Yahoo’s management leadership. I wanted to snag this “What Can Search Predict?” item before it becomes unfindable. Google Instant and Bing are wonder services but pinning down specific documents in the brave new world is getting more difficult in my opinion.
The point of the write up is that user behavior at a point in time provides information about what’s hot and what’s not. I understand this. Analyzing usage data is not a new thing, nor is the math used to clump clicks and plot them, massage them, and extrapolate from them. Most college grads had a chance to try their hand at this type of math in classes from psychology to biology and statistics. (I can hear the groans now.)
Yahoo says:
In many cases, we found that these traditional predictions performed on par with those generated from search. Although search data are indeed predictive of future outcomes, alternative information sources often perform equally well.
The idea is that big data are good, but specific, narrow sets of data from specific corpuses may deliver better indicators of users’ future actions.
Makes sense to me. Big data are big. More precisely constrained data are narrower. When looking for a specific indicator, why not consider the constrained data? Makes sense to me, but I would prefer a method that uses big data, when available, and more constrained data. Two sets of outputs can be examined.
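Examining two sets of outputs side by side can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the choice of mean absolute error as the yardstick, and every number in it are assumptions invented for the example, not anything taken from the Yahoo write up.

```python
from statistics import mean


def mean_abs_error(predicted, actual):
    """Average absolute gap between predictions and observed outcomes."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def compare_sources(big, narrow, actual):
    """Score both prediction sets and name the one with the lower error."""
    errors = {
        "big": mean_abs_error(big, actual),
        "narrow": mean_abs_error(narrow, actual),
    }
    best = min(errors, key=errors.get)
    return errors, best


# Hypothetical figures, made up purely for illustration.
actual = [10.0, 12.0, 9.0, 15.0]
big_data_forecast = [11.0, 13.0, 8.0, 13.0]
narrow_data_forecast = [10.0, 12.5, 9.5, 14.0]

errors, best = compare_sources(big_data_forecast, narrow_data_forecast, actual)
print(errors, best)
```

With both error figures in hand, one can pick whichever source did better for a given indicator instead of committing to big or constrained data in advance.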
Yahoo adds:
The potential for search-based predictions seems greatest for applications like financial analysis where even a minimal performance edge can be valuable, or for situations in which it is cumbersome or expensive to collect and parse data from traditional sources. Ultimately, search can be useful in predicting real-world events, not because it is better than other traditional data, but because it is fast, convenient, and offers insight into a wide range of topics.
Several questions waddled across my mind:
- What is the current Yahoo use case for its insight? I know that each time I return to my Yahoo Mail, the system does not remember me, nor, as far as I can see, does it present options to me based on my behavior or a larger group’s. I have to click, click, click to see a list of email. Maybe Yahoo can provide some concrete examples?
- In the midst of the shift to Bing search, where does this predictive stuff fit? I was looking for a “mens black watch” on Yahoo Shopping. Try the query. I am not sure what can be done to improve the results, but the search results mixed price ranges with specific prices on specific models. Huh? With user data – either big or constrained – predictive methods should reduce confusion, not create a “huh” moment for me.
- Is this a “level” problem? Here’s what I am thinking. The problem in search that Yahoo is addressing seems to be down in the weeds. There are larger findability problems with Yahoo’s system. For example, in the shopping example a user must click on a “more” link in order to access the shopping search feature. Most users don’t know to what that “More” refers. Is this a contributing factor to user frustration which in turn may explain some of the loss of polish on the purple Yahoo Y?
Worth reading. In my opinion, though, Yahoo should find a use case (which I may be missing) before recycling information already in the channel.
Stephen E Arnold, October 5, 2010
Freebie
Floss Plone Information
October 4, 2010
I have been listening to podcasts when at the gym. New to the podcast world, I have been downloading programs to try and find out which ones have consistent, solid content. Yesterday I listened to Floss Weekly Number 137: Plone, produced by an outfit called Twit. You can get the show and information about Twit from the company’s Web site at http://twit.tv. I was surprised by the information revealed on this particular podcast, hosted by Randal Schwartz (aka merlyn), a Perl expert.
The guest on the program to which I listened was Alexander Limi, former Googler, employee at Mozilla, and user experience specialist for Plone. If you are not familiar with Plone, it is an open source content framework. You can use it to create content for industrial strength applications like the FBI and Discover Web sites. For more information about Plone, navigate to http://plone.org/.
I have no solid information about the accuracy of this particular podcast. I do want to highlight two points made in the podcast because I don’t want them to slip away.
The first point concerns Microsoft SharePoint. On the podcast I heard that Microsoft is not really selling or licensing SharePoint. Instead the model is shifting to providing the software and relying on services to generate revenue. I will have to poke around to find out if this is an early warning of a shift in the SharePoint business model or if there are only certain situations in which Microsoft is providing access to SharePoint in this way. The reason this is important is that SharePoint is, in my opinion, the fertile soil of an ecosystem that supports quite a few third-party vendors. These include Microsoft Certified Partners who produce software that snaps into or overlays SharePoint. Examples range from European vendors like Fabasoft to US firms like BA-Insight. In addition, there are many engineers who take some Microsoft classes and then support themselves making SharePoint work as the licensee requires. The notion of a “free” SharePoint or even a low cost SharePoint can explain why so many English majors, unemployed journalists, and third string business school MBAs are vociferously marketing their SharePoint expertise. This is a big ecosystem and it is going to get even bigger. I documented a study that suggested some SharePoint installations were challenges. The pricing implications are significant and the outlook for companies which can actually make SharePoint work is significant as well. I think most of the SharePoint snap in vendors could still be walking on a knife edge. The reason is that big accounts will be sucked up by Microsoft itself. Why let that revenue go to those who cultivated the cornfield? Just like big agriculture, the small farmer gets an opportunity to find a new future.

