Inteltrax: Top Stories, September 19 to September 23

September 26, 2011

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically stories about data analytics and business intelligence coming to the rescue in one way or another.

Our first story, “BI Rescues Legal World,” http://inteltrax.com/?p=2392, took a look inside how the legal billing world is saving firms money by using business intelligence.

Another rescue tale was found in the aptly titled “Cloud Computing to Rescue Struggling Ledgers,” http://inteltrax.com/?p=2407, which used Amazon as an example of how melding cloud computing and BI is putting many companies into the black.

Also, in “Wine Gets the Big Data Treatment,” http://inteltrax.com/?p=2412, we explored how the wine industry, which has taken quite a hit during tough economic times, is staying afloat with big data analytic techniques.

No matter if an organization is dishing up legal briefs or chardonnay, there seems to be a need for book balancing by way of big data. We’ll keep an eye on this development as economic belt tightening continues around the world.

Follow the Inteltrax news stream by visiting www.inteltrax.com

Patrick Roland, Editor, Inteltrax. September 25, 2011

Software Giants Race to Solve the Big Data Puzzle

September 23, 2011

Many companies need a new way to store large amounts of data because their existing hardware and software cannot handle it efficiently. McKinsey Global Institute (MGI) estimates that organizations across nearly all sectors of the US economy store on average at least 200 terabytes somewhere within their IT infrastructure, with many storing more than one petabyte. These exceptionally large and diverse repositories of information are known as “big data,” and companies have begun to seek out data warehouse specialists such as Greenplum to help maintain them.

Other software companies have also thrown their hats into the ring to come up with a potential solution to this issue. The Computing.co.uk article “Essential Guide to Big Data: Part One” states:

Microsoft, Oracle, SAP and Endeca are looking to sell enhanced database, analytics and business intelligence tools based on the big data concept, though the very definition of the term tends to be manipulated to play to individual product strengths in each case, meaning big data remains a moving target in many respects.
The article goes on to explain the need for big data hardware and software and how it can increase the success of businesses that choose to utilize it. I found this information useful.

Jasmine Ashton, September 23, 2011

Sponsored by Pandia.com

Decide: Another Predictive Play

September 21, 2011

Purchasing big ticket items, specifically technology, is always a challenge. The price, trends, and efficiency of products vary greatly depending on the timing of your purchase.

New startup Decide is focused on helping consumers decide when to purchase items, basing its intelligent predictions on price trends, rumors, news, and technical specifications. Technology Review’s article “Algorithms Tell Consumers When to Buy Tech Products” discusses the process. We learn:

Etzioni (chief technology officer and cofounder of Decide) says the long-term impact of price prediction could be huge. It’s not just a question of when to buy a flashy new toy, he says. As companies become better at predicting prices and features for all types of devices, buying at the right time could help consumers own better-quality products across the board.
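
Decide has not published its models, so here is only a minimal sketch of the general idea: turning a price history into a buy-or-wait signal. The moving-average heuristic, function name, and window sizes below are our own illustrative assumptions, not the company’s method.

```python
# A minimal sketch, assuming a simple moving-average heuristic. Decide's
# actual models are proprietary; this only illustrates the general idea of
# turning a price history into a buy-or-wait recommendation.

from statistics import mean

def buy_or_wait(prices, short_window=7, long_window=30):
    """Return 'buy' when the short-term trend suggests prices have stopped
    falling; otherwise 'wait'. `prices` lists daily prices, oldest first."""
    if len(prices) < long_window:
        return "wait"  # not enough history to judge the trend
    short_avg = mean(prices[-short_window:])
    long_avg = mean(prices[-long_window:])
    # A short-term average below the long-term average means the price is
    # still trending down, so keep waiting for a better deal.
    return "wait" if short_avg < long_avg else "buy"

print(buy_or_wait([499] * 25 + [479, 469, 459, 449, 439]))  # -> wait
```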

At what point do I end my search and let vendors predict what I need and when I need it? The company seems to have good intentions about saving me money, but I enjoy the independence of shopping and deciding for myself when to purchase items. When new products come on the market, I like to wait, research reviews, and watch the price drop. I also like to think I am intelligent enough to complete this process on my own without waiting so long that the product becomes outdated.

We have noticed a flurry of publicity about Dr. Etzioni, and we are forming the hypothesis that he may be in Decide marketing mode.

Andrea Hayden, September 21, 2011

Sponsored by Pandia.com

Ric Upton Leads Digital Reasoning

September 20, 2011

Digital Reasoning is proving to be one of the leaders in entity-based analytics. The firm’s web site declares this mission:

Digital Reasoning empowers decision makers with timely, actionable intelligence by creating software to automatically make sense of complex data.

Dr. Ric Upton leads Digital Reasoning’s DC area office and team. In a recent exclusive interview he expounds upon the services offered by his company. He maintains his company is set apart from the competition because the “Digital Reasoning approach is a complete solution, offering intelligent mechanisms leveraging advanced NLP and related technologies to identify and extract entities . . . unstructured (and structured) data, sophisticated methods for dealing with co-referencing and context . . . and highly effective mechanisms for identifying and understanding the relationships between entities over space and time . . . the Digital Reasoning solution is the first solution to deal with the entire problem.”
Dr. Upton also explains how Synthesys helps a client make an informed decision:

Our flagship product, Synthesys®, solves the problem of achieving actionable intelligence out of massive amounts of unstructured and structured text . . . A typical customer might be trying to completely understand how to locate an individual within massive amounts of reports . . . Sifting through all this data to accurately develop this profile even among misspellings, aliases, code names, etc. is typically something that can only be done by reading. Our ability to automate understanding is critical to customers with concerns about time, accuracy, completeness, or even the ability to leverage the massive amount of data they have generated.
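
Synthesys itself relies on far richer NLP and co-reference methods, but a toy sketch shows why the misspelling-and-alias problem resists simple keyword search. The alias table, names, and the 0.85 threshold below are invented for illustration only.

```python
# Toy sketch only: grouping name mentions despite misspellings and code
# names with fuzzy string matching. The alias table and the 0.85 threshold
# are invented; Synthesys uses far more sophisticated techniques.

from difflib import SequenceMatcher

ALIASES = {"the falcon": "john smithson"}  # hypothetical known code name

def same_entity(mention_a, mention_b, threshold=0.85):
    a = ALIASES.get(mention_a.lower(), mention_a.lower())
    b = ALIASES.get(mention_b.lower(), mention_b.lower())
    return SequenceMatcher(None, a, b).ratio() >= threshold

print(same_entity("John Smithson", "Jon Smithsen"))  # misspelling -> True
print(same_entity("The Falcon", "John Smithson"))    # alias -> True
```
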
Digital Reasoning, under the leadership of Ric Upton, is worth keeping an eye on. Stay tuned.

Emily Rae Aldridge, September 20, 2011

Inteltrax: Top Stories, September 12 to September 16

September 19, 2011

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically in the world of big data and business intelligence.

Our flagship story this week was the feature “Solving Big Data’s Problems Stirs Controversy,” http://inteltrax.com/?p=2522, which gave a deeper look at how quickly our online data is piling up and whether all the talk of harnessing its real power is just that: talk.

Another big data tale, “Big Data Skeptics Still Lingering,” http://inteltrax.com/?p=2350, illuminated how young the analytics industry really is and had a little fun at our own expense and that of other industry insiders.

Finally, we took another look at the growing world of online data with “Data Analytics Needs More Specialization, not Less,” http://inteltrax.com/?p=2357, and discovered that niches might just be the solution to all the analytic nightmares out there.

The theme for this week seems to have been the mounting concern over data buildup. We can’t stop, we know that. And, thankfully, it looks like we’ll be able to do some fascinating stuff with it—though not everyone agrees. You can bet, as innovations and setbacks happen along this road, we’ll be watching it closely.

Follow the Inteltrax news stream by visiting www.inteltrax.com

Patrick Roland, Editor, Inteltrax. September 19, 2011

Linguamatics Scores Big with Text Mining

September 6, 2011

Wouldn’t it be great if there were a way to sift through all the chatter on Twitter and other social media sites to get to the real meat and potatoes? What if companies could find the proverbial needle in the Twitter haystack? Cambridge-based Linguamatics is doing exactly that, as reported in the Business Weekly article “Tweet Smell of Success.”

The small company (only 50 employees after expanding) caught the world’s attention with their text-mining skills. Last year, using their search expertise, they were able to predict, very accurately, the outcome of an election based on the tweets posted during a live, televised debate.
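
Linguamatics’ text mining goes well beyond keyword counting, but a toy tally conveys the flavor of mining debate tweets for a prediction. The word lists, candidate names, and scoring rule below are all invented for illustration.

```python
# Toy illustration only: crude per-candidate sentiment tallied over a
# stream of debate tweets. All word lists and names are invented;
# Linguamatics' actual approach is far more sophisticated.

POSITIVE = {"great", "win", "strong", "love"}
NEGATIVE = {"weak", "lose", "bad", "boring"}

def score_tweets(tweets, candidates):
    scores = {c: 0 for c in candidates}
    for tweet in tweets:
        words = set(tweet.lower().split())
        for candidate in candidates:
            if candidate in words:  # tweet mentions this candidate
                scores[candidate] += len(words & POSITIVE) - len(words & NEGATIVE)
    return scores

tweets = ["Clegg was great tonight", "Brown seemed weak and boring"]
print(score_tweets(tweets, ["clegg", "brown"]))  # {'clegg': 1, 'brown': -2}
```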

Their core technology was developed by the four original founding members, three of whom remain at the company. They have expanded rapidly in their ten years of business and rely solely on their own income. They believe their success is due to their unique search approach.

David Milward, CTO and co-founder, said: ‘We knew that language processing could get people relevant information much faster than traditional search methods. However, previous systems needed reprogramming for different questions: we wanted to give users the flexibility to extract any information they wanted.’

Linguamatics is just one of many emerging search management companies, each with its own niche. With business and technology constantly shifting to newer and faster methods of getting information, it is no surprise that businesses demand better search methods. More and more information is popping up on the internet, intranets, file shares, and other data stores. Traditional brute-force search looks less and less useful to the professionals in some of these hot new market sectors.

Catherine Lamsfuss, September 6, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

Oracle Data Mining Update

September 5, 2011

The new Oracle Data Mining update is generating buzz, including a piece by James Taylor titled “First Look – Oracle Data Mining Update.” Oracle Data Mining (ODM) is an in-database data mining and predictive analytics engine that allows for the building of predictive models. The features added in the latest version are highlighted.

The fundamental architecture has not changed, of course. ODM remains a “database-out” solution surfaced through SQL and PL-SQL APIs and executing in the database. It has the 12 algorithms and 50+ statistical functions I discussed before and model building and scoring are both done in-database. Oracle Text functions are integrated to allow text mining algorithms to take advantage of them. Additionally, because ODM mines star schema data it can handle an unlimited number of input attributes, transactional data and unstructured data such as CLOBs, tables or views.
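
As a rough sketch of what “database-out” means in practice, the snippet below drives ODM’s PL/SQL API from Python. The connection details, table, column, and model names are our own assumptions, and the exact settings vary by release, so treat this as an outline rather than working production code.

```python
# A hedged sketch of in-database mining through ODM's PL/SQL API. The
# connection details, table, and column names are assumptions; consult
# the Oracle documentation for the settings your release expects.

import oracledb  # Oracle's Python driver

conn = oracledb.connect(user="miner", password="secret", dsn="localhost/orcl")
cur = conn.cursor()

# The model is built inside the database; the client only issues the call.
cur.execute("""
    BEGIN
      DBMS_DATA_MINING.CREATE_MODEL(
        model_name          => 'CHURN_MODEL',
        mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
        data_table_name     => 'CUSTOMERS',
        case_id_column_name => 'CUSTOMER_ID',
        target_column_name  => 'CHURNED');
    END;""")

# Scoring is plain SQL: PREDICTION() applies the model in-database, so the
# data never leaves the server.
for row in cur.execute(
        "SELECT customer_id, PREDICTION(CHURN_MODEL USING *) FROM customers"):
    print(row)
```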

The ability of ODM to build and execute analytic models completely in-database is a real plus in the market. The software would be a good candidate for anyone interested in using predictive analytics to take advantage of their operational data.

Emily Rae Aldridge, September 5, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

Recorded Future and Predicting Investments

August 22, 2011

Recorded Future’s analytic trading signal now allows customers to access data from the API in time to trade before the equity markets close. The firm’s current blog post serves as a self-graded report card.

In March 2011, Recorded Future announced their success in building a whole-market factor model that uses media analytic data to predict excess returns.

Since then, the firm has allowed others to benefit from this service. The company asserts the following in their August 9, 2011 post:

Taking the same strategy we presented earlier, and using the live data as it was available to our customers at 3:30, we have rolled our backtest forward, and looked at the performance of this strategy over the last few tumultuous months. Between May 13, and August 5, this strategy returned 10.4%, while the market lost 9.9% of its value.
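
Some quick arithmetic puts those figures in perspective. The returns below come straight from the quoted post; the compounding convention in the second line is our own assumption.

```python
# Back-of-the-envelope check of the quoted backtest figures.
strategy_return = 0.104   # strategy gained 10.4% between May 13 and Aug 5
market_return = -0.099    # the market lost 9.9% over the same stretch

# Simple spread over the market: about 20.3 percentage points.
print(f"spread: {strategy_return - market_return:.1%}")

# Compounded excess return relative to the market: about 22.5%.
print(f"excess: {(1 + strategy_return) / (1 + market_return) - 1:.1%}")
```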

Hakia has pushed into similar territory. No one knows how many investment firms are using similar technology to maximize their returns.

Megan Feil, August 22, 2011

Sponsored by Pandia.com

IBM May Need a More Robust Classification Solution

August 18, 2011

According to talk around the water cooler, some IBM content and search units are poking around for a classification “solution”. We think the rumor is mostly big company confusion since IBM already has software available to assess and address an organization’s content classification needs through the use of several components. According to the IBM website:

Most unstructured content is either trapped in silos across the organization or entirely unmanaged “content in the wild.” A majority of that unstructured content can be deemed unnecessary – over-retained, irrelevant, or duplicate – and should be either decommissioned or deleted.

As we understand it, one licenses the Classification Module and/or Content Analytics software to prevent the problem described above and to provide content classification.

Sounds great, much like the ads for IBM mainframes and the rest of the company’s promotional information.

But a disturbing question occurs to the ArnoldIT goslings who wear blue IBM logos: what if this stuff costs too much and does not deliver on-the-fly classification for real-time processing of tweets and Google Plus public content?
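
To make the question concrete, here is a toy stand-in for that kind of real-time labeling, built on a tiny scikit-learn model. IBM’s Classification Module works quite differently, and every text, label, and category below is invented.

```python
# Toy stand-in for "on the fly" classification of short texts such as
# tweets. IBM's Classification Module works differently; this only shows
# the kind of real-time labeling the post asks about.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = ["quarterly earnings beat estimates",
               "server outage reported overnight",
               "new phone launch rumored",
               "merger talks confirmed by the board"]
train_labels = ["finance", "operations", "product", "finance"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Label an incoming tweet the moment it arrives.
print(model.predict(["rumored phone specs leak ahead of launch"])[0])
```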

Maybe an IBM box of parts with an expensive IBM engineering team is not exactly what some outfits require? Perhaps IBM should look around and maybe snap up one of the hot players in the space. IBM has been announcing partnerships with a number of interesting companies. We track Digital Reasoning and think its technology looks very promising. IBM is in a good position to have an impact in the data analysis space, but in our opinion it needs tools that go beyond its in-house code and its Cognos and SPSS methods.

Jasmine Ashton, August 19, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search

Exclusive Interview with Ana Athayde, Spotter SA

August 16, 2011

I have been monitoring Spotter SA, a European software development firm specializing in business intelligence, for several years. A lengthy interview with the founder, Ana Athayde, appears in the Search Wizards Speak section of the ArnoldIT.com Web site.

The company has offices throughout Europe, the Middle East, and in the United States. The firm offers solutions in market sentiment, reputation management, risk assessment, crisis management, and competitive intelligence.

In the wide-ranging interview, Ms. Athayde mentioned that she had been recognized as an exceptional manager, but she was quick to give credit to her staff and her chief technical officer, who was involved in the forward-looking Datops SA content analytics service, now absorbed into the LexisNexis organization.

I asked her what pulled her into the vortex of content processing and analytics. She told me:

My background is business and marketing management in the sports field. In my first professional experience, I had to face major challenges in communication and marketing working for the International Olympic Committee. The amount of information published on those subjects was so huge that the first challenge was to solve the infoglut: not only to search for relevant information and build a list, but to understand opinions and assess reputation at an international level…. I decided to found a company to deliver a solution that could make use of information in textual form, what most people call unstructured data. But I knew that the information had to be presented in a way that a decision maker could actually use. Data dumps and row after row of numbers usually mean no one can tell what’s important without spending minutes, maybe hours, deciphering the outputs.

I asked her about the firm’s technical plumbing. She replied:

The architecture of our own crawling system is based on proprietary methods to define and tune search scenarios. The “plumbing” is a fully scalable architecture which distributes tasks to schedulers. The content is processed, and we syndicate results. We use what we call “a source monitoring approach” which makes use of standard Web scraping methods. However, we have developed our own methods to adjust the scraping technology to each source in order to search all available documents. We extract metadata and relevant content from each page or content object. Only documents which have been assessed as fresh are processed and provided to users. This assessment is done by a proprietary algorithm based on rules involving such factors as the publication date. This means that each document collected by Spotter’s tracking and monitoring system is stamped with a publication date. This date is extracted by the Web scraping technology from several signals: the document content; the behavior of the source, that is, whether the source has a known update cycle; our analysis of the text content of the document; and the date and time stamp on the document itself.
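
The freshness gate Ms. Athayde describes can be sketched as a simple rule. Spotter’s real algorithm is proprietary; the field names and the one-update-cycle rule below are assumptions made purely for illustration.

```python
# Hypothetical sketch of a freshness gate: only documents judged fresh are
# processed and shown to users. The field names and the one-update-cycle
# rule are assumptions; Spotter's actual algorithm is proprietary.

from datetime import datetime, timedelta

def is_fresh(doc, source_update_cycle_days, now=None):
    """Treat a document as fresh if its extracted publication date falls
    within one update cycle of its source."""
    now = now or datetime.utcnow()
    published = doc.get("publication_date")  # stamped by the scraper
    if published is None:
        return False  # no reliable date, so do not pass it to users
    return now - published <= timedelta(days=source_update_cycle_days)

doc = {"publication_date": datetime(2011, 8, 15)}
print(is_fresh(doc, source_update_cycle_days=7, now=datetime(2011, 8, 16)))
```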

Anyone who has tried to use the dates provided in some commercial systems realizes that without an accurate time context, much information is essentially useless, or at least requires additional research and analysis.

To read the complete interview with Ms. Athayde, point your browser to the full text of our discussion. More information about Spotter SA is available at the firm’s Web site www.spotter.com.

Stephen E Arnold, August 16, 2011

Freebie, but you may support our efforts by buying a copy of The New Landscape of Enterprise Search.
