Inteltrax: Top Stories, December 19 to December 23, 2011

December 26, 2011

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, how people, goods, and voices move and travel with the aid of big data analytics.

Our story “Unstructured Social Data is a Gold Mine for Travel Sites” shed some light on how travel sites like Travelocity are utilizing big data to aid customers.

Similarly, our story “Airports and Analytics Grow Closer Together” showed how the complex world of airports is getting less cumbersome as airports sort their unstructured data.

Our third story, “Telecom Attracting Big Data Heavyweights,” deals more with our voices traveling; it shows how phone companies are embracing this technology to improve the customer experience.

Clearly, it’s a growing time for travel and analytics. We’re keeping a close eye on the developments and you can be assured that we’ll keep you informed as things change.

Follow the Inteltrax news stream by visiting www.inteltrax.com

Patrick Roland, Editor, Inteltrax.


Big Data Analytics and Sense Making with Synthesys

December 19, 2011

Tim Estes is the CEO and co-founder of Digital Reasoning. Digital Reasoning develops and markets solutions that provide Automated Understanding for Big Data.

There’s a great deal of talk about “big data” today. If you walk into an AT&T store near you, you may see statistics such as users sending over 3 billion text messages a day or posting over 250 million tweets. Compare that to the 100 million or fewer tweets a day of a year or two ago, and it’s daunting how rapidly the volume of digital information is increasing. A mobile phone without expandable storage frustrates users who want to keep a contacts list, rich media, and apps in their pocket. In organizations, the appetite for storage is significant. EMC, Hewlett Packard, and IBM are experiencing strong demand for their storage systems. Cloud vendors such as Amazon and Rackspace are also experiencing strong demand from companies offering compelling services to end users on their infrastructure. At a recent Amazon conference in Washington, Werner Vogels revealed that the AWS Cloud has hundreds of thousands of companies/customers running on it at some level. Finally, companies like Digital Reasoning are working on the next generation of Cloud, automated understanding, which goes from a focus on infrastructure to sense-making of data that sits in hosted or private clouds.

While most of the attention has been on infrastructure such as virtualization/hypervisors, Hadoop, and NoSQL data storage systems, we think those are really the enablers of the killer app for Cloud, which is making sense of data to solve information overload. Without next generation analytics and supporting technology, it is essentially impossible to:

  • Analyze a flow of data from multiple sensors deployed in a factory
  • Process mobile traffic at a telephone company
  • Make sense of unstructured and structured information flowing through an email system
  • Identify key entities and their importance in a stream of financial news and transaction data

These are the real world problems that have engaged me for many years. I founded Digital Reasoning to automatically make sense of data because I believed that someday all software would learn and that would unleash the next great revolution in the Information Age. The demand for this revolution is inevitable because while data has increased exponentially, human attention has been essentially static in comparison. Technology to create better return on attention would go from “nice to have” to utterly essential. And now, that moment is here.

Digging a little deeper, Digital Reasoning has created a way to take human communication and use algorithms to make sense of it without having to depend on a human design, an ontology, or some other structure. Our system looks at patterns and the way a word is used in its context and bootstraps the understanding much like a human child does – creating associations and building into more complex relationships.
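Digital Reasoning’s algorithms are proprietary and not described in detail here, but the core intuition in the paragraph above, that repeated co-occurrence in context builds word associations, can be illustrated with a toy sketch (the function name and the tiny corpus below are invented for illustration):

```python
from collections import Counter

def learn_associations(sentences, window=2):
    """Count how often word pairs co-occur within `window` tokens of each other."""
    pairs = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            # look ahead up to `window` tokens for co-occurring words
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                if word != tokens[j]:
                    pairs[tuple(sorted((word, tokens[j])))] += 1
    return pairs

corpus = [
    "acme shares rose sharply",
    "acme shares fell today",
    "acme shares rose again",
]
associations = learn_associations(corpus)
# the strongest association emerges purely from repeated co-occurrence
print(associations.most_common(1))
```

Real systems replace raw counts with statistical association measures, handle much richer context than a fixed window, and distribute the counting over a cluster, but the bootstrapping principle is the same: structure emerges from usage rather than from a hand-built ontology.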

In 2009, we migrated onto Hadoop and began taking on the problem of managing very large scale unstructured data, moving the industry beyond counting things that are well structured and toward figuring out exactly what the data you are measuring means.

Digital Reasoning asks the question: “How do you take loose, noisy information that is disconnected and unstructured and then make sense of it so that you can then apply analytics to it in a way that is valuable to business?”

We identify actors, actions, patterns, and facts and then put them into the context of space and time in an efficient and scalable way. In the government scenario, that can mean finding and stopping bad guys. In the legal environment, clients want to answer the questions of “who,” “what,” “where,” and “when.”

Digital Reasoning initially set our focus on the complex task of making sense out of massive volumes of unstructured text within the US Government Intelligence Community after the events of 9/11. But we also believe that our Synthesys software can be utilized in the commercial sector to create great value from the mountains of unstructured data that sit in the Enterprise and stream in from the Web.

Companies with large-scale data will see value in investing in our technology because they cannot hire 100,000 people to go through and read all of the available material. This matters if you are a bank and trying to make financial trades. This matters for companies doing electronic discovery. This matters for health sectors that need help organizing medical records and guarding against fraud.

We are an emerging firm, growing rapidly and looking to have the best and the brightest join our quest to empower users and customers to make sense of their data through revolutionary software. With the recent investment from In-Q-Tel and partners of Silver Lake, I believe that Digital Reasoning has a great future ahead. We are on the bleeding edge of what is going on with Hadoop and Big Data in the engineering area and how to make sense of data through some of the most advanced learning algorithms in the world. Most of all we care that people are empowered with technology so that they can recover value and time in the race to overcome information overload.

To learn more about Digital Reasoning, navigate to our Web site and download our white paper.

Tim Estes, December 19, 2011

Sponsored by Pandia.com

Inteltrax: Top Stories, December 12 to December 16

December 19, 2011

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, the issue of change in the analytic world—for the better, for the worse and everything in between.

One example of change came from our story “Data Mining Changing Scientific Thought,” which shows how the way scientists think is being streamlined by analytics.

On the other hand, “ManTech has Uphill Climb with Intelligence Analytics” shows that not all change looks promising, like one company’s new focus on intelligence.

And some change, well, we’re just not sure how it’ll pan out, like the story “Predicting the Ponies is Just Unstructured Data,” which exposes how the gambling industry could be changed by analytic tools. Whether for better or worse is up for debate.

Change, in any aspect of life, is inevitable. However, the world of big data analytics seems more susceptible than most. And we couldn’t be happier, as we watch the unexpected turns these changes bring to the industry every day.

Follow the Inteltrax news stream by visiting http://www.inteltrax.com/

Patrick Roland, Editor, Inteltrax.

Predictions on Big Data Miss the Real Big Trend

December 18, 2011

Athena, the goddess of wisdom, does not spend much time in Harrod’s Creek, Kentucky. I don’t think she’s ever visited. However, I know that she is not hanging out at some of the “real journalists’” haunts. I zipped through “Big Data in 2012: Five Predictions”. These are lists which are often assembled over a lunchtime chat or a meeting with quite a few editorial issues on the agenda. At year’s end, the prediction lunch was a popular activity when I worked in New York City, which is different in mental zip from rural Kentucky.

The write up churns through some ideas that are evident when one skims blog posts or looks at the conference programs for “big data.” For example—are you sitting down?—the write up asserts: “Increased understanding of and demand for visualization.” There you go. I don’t know about you, but when I sit in on “intelligence” briefings in the government or business environment, I have been enjoying the sticky tarts of visualization for years. Nah, decades. Now visualization is a trend? Helpful, right?

Let me identify one trend which is, in my opinion, an actual big deal. Navigate to “The Maximal Information Coefficient.” You will see a link and a good summary of a statistical method which allows a person to process “big data” in order to determine if there are gems within. More important, the potential gems pop out of a list of correlations. Why is this important? Without MIC methods, the only way to “know” what may be useful within big data was to run the process. If you remember guys like Kolmogorov, the “we have to do it because it is already as small as it can be” issue is an annoying time consumer. To access the original paper, you will need to go to the AAAS and pay money.

The abstract for “Detecting Novel Associations in Large Data Sets” by David N. Reshef, Yakir A. Reshef, Hilary K. Finucane, Sharon R. Grossman, Gilean McVean, Peter Turnbaugh, Eric S. Lander, Michael Mitzenmacher, and Pardis C. Sabeti, Science, December 16, 2011, is:

Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination (R^2) of the data relative to the regression function. MIC belongs to a larger class of maximal information-based nonparametric exploration (MINE) statistics for identifying and classifying relationships. We apply MIC and MINE to data sets in global health, gene expression, major-league baseball, and the human gut microbiota and identify known and novel relationships.

Stating a very interesting although admittedly complex numerical recipe in a simple way is difficult. I think this paragraph from “The Maximal Information Coefficient” does a very good job:

The authors [Reshef et al.] go on to show that the MIC (which is based on “gridding” the correlation space at different resolutions, finding the grid partitioning with the largest mutual information at each resolution, normalizing the mutual information values, and choosing the maximum value among all considered resolutions as the MIC) fulfills this requirement and works well when applied to several real world datasets. There is a MINE Web site with more information and code on this algorithm, and a blog entry by Michael Mitzenmacher which might also link to more information on the paper in the future.
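As a rough illustration of the mechanism just described, here is a simplified, hypothetical sketch in Python. It restricts the search to equal-width grids, whereas the published MIC optimizes over grid partitions and caps the grid size relative to the sample size, so this is not the authors’ algorithm, only the flavor of it:

```python
import numpy as np
from itertools import product

def binned_mutual_information(x, y, nx, ny):
    """Mutual information (in bits) of x and y under an nx-by-ny equal-width grid."""
    x_edges = np.linspace(x.min(), x.max(), nx + 1)[1:-1]  # interior bin edges
    y_edges = np.linspace(y.min(), y.max(), ny + 1)[1:-1]
    cx = np.digitize(x, x_edges)  # cell index 0..nx-1 for each point
    cy = np.digitize(y, y_edges)
    joint = np.zeros((nx, ny))
    for a, b in zip(cx, cy):
        joint[a, b] += 1.0
    joint /= joint.sum()                  # empirical joint distribution
    px = joint.sum(axis=1)                # marginals
    py = joint.sum(axis=0)
    mi = 0.0
    for i, j in product(range(nx), range(ny)):
        if joint[i, j] > 0.0:
            mi += joint[i, j] * np.log2(joint[i, j] / (px[i] * py[j]))
    return mi

def mic_sketch(x, y, max_bins=8):
    """Simplified MIC: best normalized mutual information over a family of grids."""
    best = 0.0
    for nx in range(2, max_bins + 1):
        for ny in range(2, max_bins + 1):
            mi = binned_mutual_information(x, y, nx, ny)
            best = max(best, mi / np.log2(min(nx, ny)))  # normalize to [0, 1]
    return best
```

Even in this crude form the key property shows through: a perfect functional relationship (for example y = x) scores at or near 1 regardless of the relationship’s shape, while independent noise scores much lower, which is what lets the method surface candidate “gems” from a list of pairwise scores without running a full analysis on every pair.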

Another take on the MIC innovation appears in “Maximal Information Coefficient Teases Out Multiple Vast Data Sets”. Worth reading as well.

Forbes will definitely catch up with this trend in a few years. For now, methods such as MIC point the way to making “big data” a more practical part of decision making. Yep, a trend. Why? There’s a lot of talk about “big data,” but most organizations lack the expertise and the computational know-how to perform meaningful analyses. Similar methods are available from Digital Reasoning and the Google love child Recorded Future. Palantir is more into the make-pictures world of analytics. For me, MIC and related methods are not just a trend; they are the harbinger of processes which make big data useful, not a public relations, marketing, or PowerPoint chunk of baloney. Honk.

Stephen E Arnold, December 18, 2011

Sponsored by Pandia.com, a company located where high school graduates actually can do math.

FirstRain Gets Some Azure Chip Love

December 18, 2011

According to the October 25 news release “FirstRain Recognized as ‘Innovative Business Analytics Company under $100M to Watch in 2011’ by Leading Market Research Firm,” the analyst firm IDC has included FirstRain, an analytics software company, in its 2011 list of “Innovative Business Analytics Companies Under $100M to Watch.”

FirstRain is an analytics software company that uses its Business Monitoring Engine to provide professionals with access to the business Web. The company’s semantic-categorization technology instantly cuts through the clutter of consumer Web content, delivering only highly-relevant intelligence.

The company was highlighted in the “Cloud-based Analytics” category for its innovative use of semantic analysis to extract and deliver high-value information from the Web.

IDC observed:

The value in using FirstRain is the breadth of its coverage, combined with its depth of selection and filtering, so that it delivers the information that users need to see without cluttering their desktops or their minds with too much that is extraneous. It was easy to integrate into existing information delivery channels because of the high relevance of the information that it delivered.

The fact that IDC even has a list of top business analytics companies shows how important search optimization software is becoming in the business world. Who knew that business intelligence would be the new black?

Jasmine Ashton, December 18, 2011

Sponsored by Pandia.com

Digimind 9 Now Available

December 17, 2011

In the current economic climate, businesses are under more pressure than ever to consolidate their resources and invest in products that will maximize cost and efficiency. When choosing information management solutions, it is especially important that companies keep their specific needs in mind.

According to a recent PRWeb news release, “Digimind launches Digimind 9 – Next generation Competitive Intelligence for Smarter Decision Making,” competitive intelligence software provider Digimind has released Digimind 9, updated software designed to accompany organizations throughout their intelligence workflows.

The article states:

Digimind 9 comes in response to a growing demand from companies willing to complement their CI apparatus with such features included as “advanced semantic analysis”, “social media monitoring”, and “intelligence profile management”. Indeed, beyond the conventional intelligence workflows, more intelligence requirements surface nowadays to leverage on social networks, unstructured data, and related analysis.

As data is being created faster than ever, Digimind 9 meets the ever-growing need for companies to improve their capacity to react to and anticipate rapid changes.

Jasmine Ashton, December 17, 2011

Sponsored by Pandia.com

IBM Redbooks Reveals Content Analytics

December 16, 2011

IBM Redbooks has put out some juicy reading for the azure chip consultants wanting to get smart quickly with IBM Content Analytics Version 2.2: Discovering Actionable Insight from Your Content. The sixteen chapters of this book take the reader from an overview of IBM content analytics, through understanding the details, to troubleshooting tips. The above link provides an abstract of the book, as well as links to download it as a PDF, view in HTML/Java, or order a hardcopy.

We learned from the write up:

The target audience of this book is decision makers, business users, and IT architects and specialists who want to understand and use their enterprise content to improve and enhance their business operations. It is also intended as a technical guide for use with the online information center to configure and perform content analysis with Content Analytics.

The product description notes a couple of specifics. For example, creating custom annotators with the LanguageWare Resource Workbench is covered. So is using the IBM Content Assessment to weed out superfluous data.

The content is, of course, slanted toward working with IBM solutions. However, there is also some more general information included. This is a good place to go to get a better handle on content management.

Cynthia Murrell, December 16, 2011

Sponsored by Pandia.com

Hewlett Packard Lusts after Big Data

December 16, 2011

As Web users continue creating structured and unstructured data at higher volumes than ever before, we are starting to need technology to analyze it.

According to the December 1 Front Line article “HP Predicts 50 Zettabytes of Data will be Created Annually by 2020,” Hewlett Packard (HP) predicts that by 2020, fifty zettabytes (fifty billion terabytes) of data will be created every year. This will present a major challenge for businesses.

Prith Banerjee, head of HP Labs, said at the firm’s Discover event:

By 2020 there could be as many as 10 billion people on the planet and some four billion of these will be online interacting on social networks. While now there are 2.5 million tweets per day this will rise to tens of millions. There’s also going to be a huge increase of sensors on the network measuring everything from temperature to heart monitoring. We expect there to be one trillion sensors by 2020.

HP Labs is currently working to address this issue by investigating technology that tracks a variety of complex events which must be correlated so that patterns can be detected. It could contextually analyze what customers say on Twitter a mere ten seconds after the tweet is sent.

What will Autonomy’s role in this big data love fest be? Stay tuned.

Jasmine Ashton, December 16, 2011

Karmasphere and MapR Team Up on Hadoop Help

December 15, 2011

Karmasphere and MapR Technologies are working together to make Hadoop’s Big Data Analytics platform more accessible, announces Karmasphere in “Combination Offers Self-Service Big Data Analytics with Minimal IT Support.” Hadoop, of course, is free as open source software. You can, however, purchase help in managing it.

Karmasphere Analytics is now available on MapR’s Hadoop distribution system. The write up notes:

‘Karmasphere’s graphical Big Data Analytics workspace is the perfect complement to MapR’s easy to use, dependable and fast platform,’ said Jack Norris, vice president of marketing, MapR. ‘With the availability of Karmasphere products on our distribution, data analysts can derive insights from their structured and unstructured data in Hadoop without developing MapReduce programs.’

Karmasphere helps its customers use Hadoop to extract patterns, relationships, and drivers from big data. The company boasts that its Analytics Engine is intuitive and simplifies data analysis.

MapR Technologies helps business users who don’t also happen to be IT pros efficiently manage their Hadoop implementation. It prides itself on making Hadoop more reliable and easier to use.

Cynthia Murrell, December 15, 2011

Sponsored by Pandia.com

HP Announces Autonomy’s Software Will be Integrated Across All Products

December 8, 2011

When Hewlett-Packard (HP) purchased Autonomy this past August for $10.3 billion, nearly 24 times the small Cambridge-based software company’s earnings, it vowed to integrate Autonomy’s software with HP’s hardware products.

Three months later, ZDNet UK reported on November 29, in the article “Autonomy plots HP-spanning tech,” that Autonomy’s advanced data search, analysis, and augmented reality technology will be integrated across HP’s products.

Autonomy’s chief executive, Mike Lynch said:

“There is a lot of work going on between the different business units at HP [to integrate Autonomy technology]. Servers and storage is obviously key [but with the] Personal Systems Group stuff is going to come that was only available for very large companies. There’s also some really stunning technology for printing being done by both HP research and development people and Autonomy’s. More detail will be given very shortly.”

Autonomy’s technology helps make sense of information generated by social media, phone calls, video feeds, and other types of unstructured data. I am very interested to see what this technology brings to HP’s products over the next year.

Jasmine Ashton, December 08, 2011
