Inteltrax: Top Stories, September 19 to September 23
September 26, 2011
Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, each centered on the idea of data analytics and business intelligence coming to the rescue in one way or another.
Our first story, “BI Rescues Legal World” (http://inteltrax.com/?p=2392), took a look at how the legal billing world is saving firms money by using business intelligence.
Another rescue tale, the aptly titled “Cloud Computing to Rescue Struggling Ledgers” (http://inteltrax.com/?p=2407), used Amazon as an example of how melding cloud computing and BI is putting many companies into the black.
Also, in “Wine Gets the Big Data Treatment” (http://inteltrax.com/?p=2412), we explored how the wine industry, which has taken quite a hit during tough economic times, is staying afloat with big data analytic techniques.
No matter if an organization is dishing up legal briefs or chardonnay, there seems to be a need for book balancing by way of big data. We’ll keep an eye on this development as economic belt tightening continues around the world.
Follow the Inteltrax news stream by visiting www.inteltrax.com
Patrick Roland, Editor, Inteltrax. September 25, 2011
Software Giants Race to Solve the Big Data Puzzle
September 23, 2011
Other software companies have also thrown their hats into the ring to come up with a potential solution to this issue. The Computing.Co.UK article, Essential Guide to Big Data: Part One, states:
Microsoft, Oracle, SAP and Endeca are looking to sell enhanced database, analytics and business intelligence tools based on the big data concept, though the very definition of the term tends to be manipulated to play to individual product strengths in each case, meaning big data remains a moving target in many respects.
Sponsored by Pandia.com
Decide: Another Predictive Play
September 21, 2011
Purchasing big ticket items, specifically technology, is always a challenge. The prices, trends, and features of products vary greatly depending on the timing of your purchase.
New startup Decide is focused on helping consumers decide when to purchase items, based on intelligent predictions drawn from price trends, rumors, news, and technical specifications. Technology Review’s article, “Algorithms Tell Consumers When to Buy Tech Products,” discusses the process. We learn:
Etzioni (chief technology officer and cofounder of Decide) says the long-term impact of price prediction could be huge. It’s not just a question of when to buy a flashy new toy, he says. As companies become better at predicting prices and features for all types of devices, buying at the right time could help consumers own better-quality products across the board.
At what point do I end my search and let vendors predict what I need and when I need it? The company seems to have good intentions about saving me money, but I enjoy the independence of shopping and deciding for myself when to purchase items. I like to wait when new products come on the market, research the reviews, and watch the price drop. I also like to think I am intelligent enough to complete this process on my own without waiting so long that the product becomes outdated.
We have noticed a flurry of publicity about Dr. Etzioni, and we are forming the hypothesis that he may be in Decide marketing mode.
Andrea Hayden, September 21, 2011
Sponsored by Pandia.com
Ric Upton Leads Digital Reasoning
September 20, 2011
Digital Reasoning empowers decision makers with timely, actionable intelligence by creating software that automatically makes sense of complex data.
Our flagship product, Synthesys®, solves the problem of achieving actionable intelligence out of massive amounts of unstructured and structured text . . . A typical customer might be trying to completely understand how to locate an individual within massive amounts of reports . . . Sifting through all this data to accurately develop this profile even among misspellings, aliases, code names, etc. is typically something that can only be done by reading. Our ability to automate understanding is critical to customers with concerns about time, accuracy, completeness, or even the ability to leverage the massive amount of data they have generated.
Inteltrax: Top Stories, September 12 to September 16
September 19, 2011
Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, all from the world of big data and business intelligence.
Our flagship story this week was the feature, “Solving Big Data’s Problems Stirs Controversy,” http://inteltrax.com/?p=2522 that gave a deeper look at how quickly our online data is piling up and whether all the talk of harnessing its real power is just that: talk.
Another big data tale, “Big Data Skeptics Still Lingering” http://inteltrax.com/?p=2350 illuminated just how young the analytics industry really is and had a little fun at our own expense and that of other industry insiders.
Finally, we took another look at the growing world of online data with, “Data Analytics Needs More Specialization, not Less,” http://inteltrax.com/?p=2357 and discovered niches might just be the solution to all the analytic nightmares out there.
The theme for this week seems to have been the mounting concern over data buildup. We can’t stop, we know that. And, thankfully, it looks like we’ll be able to do some fascinating stuff with it—though not everyone agrees. You can bet, as innovations and setbacks happen along this road, we’ll be watching it closely.
Follow the Inteltrax news stream by visiting www.inteltrax.com
Patrick Roland, Editor, Inteltrax. September 19, 2011
Linguamatics Scores Big with Text Mining
September 6, 2011
Wouldn’t it be great if there was a way to sift through all the chatter on Twitter and other social media sites to get to the real meat and potatoes? What if companies could find the proverbial needle in the Twitter-haystack? All this is being done by Cambridge-based Linguamatics as reported in the article, Tweet Smell of Success, on Business Weekly.
The small company (only 50 employees after expanding) caught the world’s attention with its text-mining skills. Last year, using their search expertise, they were able to predict, very accurately, the outcome of an election based on the tweets posted during a live, televised debate.
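Linguamatics has not published the details of its election work, so as a purely illustrative sketch, the basic idea of predicting a debate outcome from tweets can be reduced to tallying sentiment-bearing words per candidate. The word lists, tweet samples, and candidate names below are all invented for the example; real text mining uses far richer linguistic analysis.

```python
from collections import Counter

# Tiny illustrative lexicons -- real systems use much richer NLP,
# and Linguamatics' actual method is not public.
POSITIVE = {"great", "strong", "win", "convincing", "good"}
NEGATIVE = {"weak", "evasive", "lost", "bad", "poor"}

def score_tweet(text: str) -> int:
    """Return a crude +/- sentiment score for one tweet via word lookup."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)

def predict_winner(tweets_by_candidate: dict) -> str:
    """Pick the candidate with the highest total tweet sentiment."""
    totals = Counter({c: sum(score_tweet(t) for t in ts)
                      for c, ts in tweets_by_candidate.items()})
    return totals.most_common(1)[0][0]

tweets = {
    "Candidate A": ["Great answer on jobs!", "A looked strong tonight"],
    "Candidate B": ["B seemed evasive", "Weak closing from B"],
}
print(predict_winner(tweets))  # Candidate A
```

Even this toy version shows why timing matters: the tally can be updated tweet by tweet during a live broadcast, which is what made the real-time prediction notable.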
Their core technology was developed by the four original founders, three of whom remain at the company. They have expanded rapidly in their ten years of business, relying solely on their own revenue. They believe their success is due to their unique approach to search.
David Milward, CTO and co-founder said: ‘We knew that language processing could get people relevant information much faster than traditional search methods. However, previous systems needed reprogramming for different questions: we wanted to give users the flexibility to extract any information they wanted.’
Linguamatics is just one of many emerging search management companies, each with its own niche. With business and technology constantly shifting to newer and faster methods of getting information, it is no surprise that businesses demand better search methods. More and more information is popping up on the Internet, in intranets, file shares, and other data stores. Traditional brute-force search looks less and less useful to the professionals in some of these hot new market sectors.
Catherine Lamsfuss, September 6, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Oracle Data Mining Update
September 5, 2011
The new Oracle Data Mining Update is generating buzz, including a piece by James Taylor entitled, “First Look – Oracle Data Mining Update.” Oracle Data Mining (ODM) is an in-database data mining and predictive analytics engine, which allows for the building of predictive models. The features added in the latest version are highlighted.
The fundamental architecture has not changed, of course. ODM remains a “database-out” solution surfaced through SQL and PL-SQL APIs and executing in the database. It has the 12 algorithms and 50+ statistical functions I discussed before and model building and scoring are both done in-database. Oracle Text functions are integrated to allow text mining algorithms to take advantage of them. Additionally, because ODM mines star schema data it can handle an unlimited number of input attributes, transactional data and unstructured data such as CLOBs, tables or views.
The ability of ODM to build and execute analytic models completely in-database is a real plus in the market. The software would be a good candidate for anyone interested in using predictive analytics to take advantage of their operational data.
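To make the “database-out” idea concrete: instead of extracting rows to an external analytics tool, the scoring formula is pushed into the database so only results leave the engine. The sketch below is purely conceptual and does not use Oracle at all; sqlite3 stands in for the database, and the table, columns, and linear “model” coefficients are invented for illustration.

```python
import sqlite3

# Conceptual sketch only: Oracle Data Mining runs real mining algorithms
# inside the database; here sqlite3 merely illustrates scoring rows with
# SQL rather than extracting them to an external tool.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, recency REAL, spend REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, 0.9, 120.0), (2, 0.2, 15.0), (3, 0.7, 80.0)])

# A hypothetical pre-trained linear model: score = 0.6*recency + 0.004*spend.
# Applying it in the SELECT keeps the scoring in-database.
rows = conn.execute("""
    SELECT id, 0.6 * recency + 0.004 * spend AS score
    FROM customers
    ORDER BY score DESC
""").fetchall()
for cust_id, score in rows:
    print(cust_id, round(score, 3))
```

The design point is data movement: when the model runs where the data lives, operational tables with millions of rows never have to be exported before they can be scored.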
Emily Rae Aldridge, September 5, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Recorded Future and Predicting Investments
August 22, 2011
Recorded Future’s analytic trading signal now has the ability to allow customers to access data from the API in time to trade before the equity markets close. The company's current blog post serves as a self-reported report card.
In March 2011, Recorded Future announced their success in building a whole-market factor model that uses media analytic data to predict excess returns.
Since then, the firm has allowed others to benefit from this service. The company asserts the following in their August 9, 2011 post:
Taking the same strategy we presented earlier, and using the live data as it was available to our customers at 3:30, we have rolled our backtest forward, and looked at the performance of this strategy over the last few tumultuous months. Between May 13, and August 5, this strategy returned 10.4%, while the market lost 9.9% of its value.
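The headline numbers in the quote are easy to sanity-check. Reading them as simple percentage returns, the strategy's edge over the market is the difference in percentage points, as in this small sketch (real factor-model evaluation uses more careful benchmarks and risk adjustment; this is only the back-of-the-envelope reading of the blog's figures):

```python
def excess_return(strategy_pct: float, market_pct: float) -> float:
    """Excess return as the simple difference in percentage points.

    A naive comparison: +10.4% for the strategy against -9.9% for the
    market is roughly a 20.3-point spread over the same period.
    """
    return strategy_pct - market_pct

print(round(excess_return(10.4, -9.9), 1))  # 20.3
```

That spread, not the absolute 10.4% return, is what makes the claim interesting: the strategy was profitable in a period when the market as a whole fell.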
Hakia has pushed into similar territory. No one knows how many investment firms are using similar technology to maximize their returns.
Megan Feil, August 22, 2011
Sponsored by Pandia.com
IBM May Need a More Robust Classification Solution
August 18, 2011
According to talk around the water cooler, some IBM content and search units are poking around for a classification “solution”. We think the rumor is mostly big company confusion since IBM already has software available to assess and address an organization’s content classification needs through the use of several components. According to the IBM website:
Most unstructured content is either trapped in silos across the organization or entirely unmanaged “content in the wild.” A majority of that unstructured content can be deemed unnecessary – over-retained, irrelevant, or duplicate – and should be either decommissioned or deleted.
As we understand it, one licenses the Classification Module and/or Content Analytics software to prevent the previously stated problem and to provide content classification.
Sounds great, much like the ads for IBM mainframes and the company's other promotional information. But a disturbing question occurs to the ArnoldIT goslings who wear blue IBM logos: What if this stuff costs too much and does not deliver on-the-fly classification for real-time processing of tweets and Google Plus public content?
Maybe an IBM box of parts with an expensive IBM engineering team is not exactly what some outfits require. Perhaps IBM should look around and snap up one of the hot players in the space. IBM has been announcing partnerships with a number of interesting companies. We track Digital Reasoning and think its technology looks very promising. IBM is in a good position to have an impact in the data analysis space, but, in our opinion, it needs tools that go beyond its in-house code and its Cognos and SPSS methods.
Jasmine Ashton, August 19, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Exclusive Interview with Ana Athayde, Spotter SA
August 16, 2011
I have been monitoring Spotter SA, a European software development firm specializing in business intelligence, for several years. A lengthy interview with the founder, Ana Athayde, appears in the Search Wizards Speak section of the ArnoldIT.com Web site.
The company has offices throughout Europe, the Middle East, and in the United States. The firm offers solutions in market sentiment, reputation management, risk assessment, crisis management, and competitive intelligence.
In the wide-ranging interview, Ms. Athayde mentioned that she had been recognized as an exceptional manager, but she was quick to give credit to her staff and her chief technical officer, who was involved in the forward-looking Datops SA content analytics service, now absorbed into the LexisNexis organization.
I asked her what pulled her into the vortex of content processing and analytics. She told me:
My background is business and marketing management in the sports field. In my first professional experience, I had to face major challenges in communication and marketing working for the International Olympic Committee. The amount of information published on those subjects was so huge that the first challenge was to solve the infoglut: not only to search for relevant information and build a list, but to understand opinions and assess reputation at an international level….I decided to found a company to deliver a solution that could make use of information in textual form, what most people call unstructured data. But I knew that the information had to be presented in a way that a decision maker could actually use. Data dumps and row after row of numbers usually mean no one can tell what’s important without spending minutes, maybe hours deciphering the outputs.
I asked her about the firm’s technical plumbing. She replied:
The architecture of our own crawling system is based on proprietary methods to define and tune search scenarios. The “plumbing” is a fully scalable architecture which distributes tasks to schedulers. The content is processed, and we syndicate results. We use what we call “a source monitoring approach” which makes use of standard Web scraping methods. However, we have developed our own methods to adjust the scraping technology to each source in order to search all available documents. We extract metadata and relevant content from each page or content object. Only documents which have been assessed as fresh are processed and provided to users. This assessment is done by a proprietary algorithm based on rules involving such factors as the publication date. This means that each document collected by Spotter’s tracking and monitoring system is stamped with a publication date. This date is derived from several signals: the Web scraping technology extracts it from the document content; the behavior of the source (that is, the source’s known update cycle); our analysis of the text content of the document; and the date and time stamp on the document itself.
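The freshness gate Ms. Athayde describes, where each scraped document is stamped with a publication date and only recent documents move on to processing, can be sketched in a few lines. Spotter's actual algorithm is proprietary; the seven-day window, document fields, and dates below are invented for the illustration.

```python
from datetime import datetime, timedelta

# Invented parameter: Spotter's real freshness rules are proprietary.
FRESHNESS_WINDOW = timedelta(days=7)

def is_fresh(published: datetime, now: datetime) -> bool:
    """A document is fresh if its stamped date falls inside the window."""
    return now - published <= FRESHNESS_WINDOW

def filter_fresh(docs, now):
    """Keep only documents whose stamped publication date is recent;
    everything else is dropped before any downstream processing."""
    return [d for d in docs if is_fresh(d["published"], now)]

now = datetime(2011, 8, 16)
docs = [
    {"url": "a", "published": datetime(2011, 8, 15)},  # fresh
    {"url": "b", "published": datetime(2011, 7, 1)},   # stale
]
print([d["url"] for d in filter_fresh(docs, now)])  # ['a']
```

The interesting engineering is upstream of this gate: the gate is trivial, but it is only as good as the date stamp, which is why the interview dwells on the multiple signals used to recover an accurate publication date per source.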
Anyone who has tried to use the dates provided in some commercial systems realizes that without accurate time context, much information is essentially useless without additional research and analysis.
To read the complete interview with Ms. Athayde, point your browser to the full text of our discussion. More information about Spotter SA is available at the firm’s Web site www.spotter.com.
Stephen E Arnold, August 16, 2011
Freebie but you may support our efforts by buying a copy of The New Landscape of Enterprise Search