Big Data Hailed as Triumphant

January 30, 2013

We’ve tripped over more big-data cheerleading, and we are ready to say, “enough already.” The timpark.io blog trumpets, “Data Trumps Everything.” Oh, really?

Mr. Park uses the example of the modern supermarket to illustrate his assertion: the use of big-data analysis has eclipsed human experience and intuition. While information technology was adopted to assist the seasoned manager with time-consuming calculations, the write-up asserts that big data has now taken over. Using grocery-receipt data, software now analyzes a myriad of factors, builds sophisticated models, and directs in-store humans in order to maximize profits. Park notes:

“That Halloween expansion of candy?   That wasn’t a guess – the supermarket knows down to a matter of hours of when to roll that out.   This is an obvious example, but a data scientist at one major retailer confided to me that they have over 550 such rotations that happen in a year to capture ebbs and flows in certain products.  Some of these are obvious, like Halloween candy or Valentine’s Day cards, that any human manager could have predicted — perhaps not with the accuracy of the data driven approach — but close enough.   But the vast majority of these are changes that frankly they don’t completely understand, like that having a sale on cereal on Tuesdays results in 17% more profit in breakfast products during 2 week periods where less than 4 sunny days are forecast.”

Park is correct that this is now our grocery-store reality. He is even correct to extrapolate that many other types of business are following suit. However, going on to say that data trumps “everything” is, shall we say, a bit simplistic. Even at large retail chains, humans take ultimate responsibility for decisions, including whether or not to follow the suggestions of that pricey software they chose to buy.

Now, if Watson ever takes over as CEO at IBM, that will be a different matter.

Cynthia Murrell, January 30, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

Stunning Visuals Show How Datasets Connect

January 25, 2013

Data analysis can be tricky business, especially when you have been staring at a computer screen and all the information blurs together. What if there were a way to make the data more visually stimulating and take the guesswork out of finding correlations? Gigaom may have found the answer in “Has Ayasdi Turned Machine Learning Into A Magic Bullet?” Ayasdi is a startup that has created software for visually mapping hidden connections in massive datasets. The company just opened its doors with $10.25 million in funding, but what is really impressive is its software offering:

“At its core, Ayasdi’s product, a cloud-based service called the Insight Discovery Platform, is a mix of distributed computing, machine learning and user experience technologies. It processes data, discovers the correlations between data points, and then displays the results in a stunning visualization that’s essentially a map of the dataset and the connections between every point within it. In fact, Ayasdi is based on research into the field of topological data analysis, which Co-founder and President Gunnar Carlsson describes a quest to present data as intuitively as possible based solely on the similarity of (or distance between, in a topological sense) the data points.”

The way the software works is similar to social networking. Social networking software maps connections between users and their content, but the algorithms do not understand what the connections mean. Ayasdi makes it easier for its users to attach meaning to the correlations. The article also points out that Ayasdi’s software is hardly a new concept, but for some working in BI it takes out a lot of the discovery work. The software may be really smart, but humans are still needed to interpret the data.

Whitney Grace, January 25, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

SAP Pulls Ahead with HANA In Memory Database

January 24, 2013

Is SAP HANA the future of databases? ReadWrite seems to think so. Their Anton Gonsalves declares, “SAP’s HANA Deployment Leapfrogs Oracle, IBM, and Microsoft.” He reports that HANA now possesses a distinct edge—it is the only in-memory database that can perform both business analysis and transactions with a single database. That is an ability that the competition cannot (yet) provide. The article goes on to tell us:

“For SAP customers, HANA-powered applications can speed up the sales process dramatically. For example, today when salespeople for a large manufacturer takes a large order from a customer, they may not be to say on the spot exactly when the order will be fulfilled. That information often comes hours later after the numbers are run separately through forecasting applications.

“With HANA running SAP’s enterprise resource planning applications – called Business Suite – salespeople will be able to take the order and get forecasting information in seconds.”

That is indeed a big advantage. SAP’s Hasso Plattner promises that the company will eventually make HANA available in all its products, both on-premise and in the cloud. Will the big names be able to catch up before SAP captures the market?

According to Gonsalves, Oracle and IBM are expected to do so in time, and Microsoft says it should have the capability by the next iteration of MS SQL. Gartner analyst Donald Feinberg expects it will take those companies between two and five years to implement such technology. That gives SAP plenty of time to run with it.

A longstanding leader in enterprise software, SAP serves over 183,000 customers. Founded in 1972 by five former IBM workers, the company is headquartered in Walldorf, Germany, and maintains operations in over 50 countries.

Cynthia Murrell, January 24, 2013

Social Norms on Social Media

January 22, 2013

It is no secret that social networking is growing rapidly and is here to stay. But now we can sit back and watch as the world tries to figure out the boundaries of conversational topics on social media. Widely accepted social norms, such as which topics you can or cannot bring up in certain situations, get foggy when discussions move online. We learn in “Religion, Politics, Sports… What People Around the World Do and Do Not Talk About on Social Media” on Quartz that citizens of different countries already appear to be setting vastly different new rules.

We learn:

“Different countries are writing those rules differently. A recent survey, conducted by the Pew Research Center, shows how common it is in each of 20 countries to talk about politics, religion, sports, and music or movies on social media. There are some surprising differences.

Europeans generally talk politics online less than people in the Americas. Middle Easterners are the most voluble of all, which is no surprise given the recent Arab Spring.”

It also appears that religious discussions vary greatly all over the world and pop culture is popular everywhere. Now that we know what is popular, perhaps this will be a starting place for new social norms to be set regarding social networking. We wonder what Miss Manners would have to say about this.

Andrea Hayden, January 22, 2013

Complex Facebook Analytics Tool Available from Wolfram Alpha

January 22, 2013

Wolfram Alpha is famous for its knowledgeable tools and widgets that involve highly complex algorithms and computations. However, many may be surprised to hear about the Facebook analytics tool available from the systematic knowledge engine. The article “Use Wolfram Alpha to Dig Up Cool Statistics About Your Facebook Account [Weekly Facebook Tips]” on MakeUseOf tells readers how to get detailed information about their Facebook accounts.

The article shares:

“With the Wolfram Alpha Facebook analytics tool, you can find out a huge amount of information about your Facebook account. It’s quite fun to see which of your posts or photos are the most popular, who your top commenters are, who is sharing your posts the most and more interesting tidbits. Plus, it’s easy to use this tool and completely free. Why not have a go?”

I decided to have a go with the Facebook tool, and was overwhelmed by the amount of detailed information it provided. Wolfram Alpha told me everything from the moon phase at the time of my birth to statistical data about the top contributors on my page. Of course, all of this information is readily available to anyone with access to my page. This tool is fun, but may encourage others to consider resetting the privacy settings on their accounts.

Andrea Hayden, January 22, 2013

The Question Drives the Search

January 22, 2013

Over at Chiliad, an article called “Search Vs. Correlation Vs. Causality-What Do Your Goals Require?” discusses how different types of questions change search results. Business intelligence and search are different aspects of the same end result, and together they can generate more useful results. Correlations provide analytics, thus turning up unexpected and often useful relationships. The value is not in observations, but rather in connections between data, which then influence decision making. The “why” factor is also a big part, because it explains how the data will be used and what the end result will be.

It involves more legwork than anything else:

“Iterative Discovery—understanding “why”—requires a different approach. Not only does digging in deliver more information, it suggests new inquiry and allows you to dig deeper. It helps you understand—across all your sources—what matters most. Although Chiliad named this approach Iterative Discovery, we didn’t invent it. Great researchers and analysts did. We simply observed them—and created a tool tuned to figuring out…’What does it mean?’”

If the “why” question cannot be answered, then search, business intelligence, and everything else are useless. Users conduct these actions to find an answer, and if an answer is not provided, the actions are worthless.

Whitney Grace, January 22, 2013

Accuracy Proves Quality Analytics

January 21, 2013

Accuracy is key for analytics, because it validates the analysis performed by a computer while the human user was away doing other business. The only way to measure accuracy is to compare human analysis to computer analysis. The Attensity Blog focuses on “How Accuracy In Analytics Matters For Businesses.” The article explains that accuracy is measured by how well a computer can mimic a human brain:

“Computers only do what we tell them to do. They have (almost) infinite computational power, and can apply any set of rules to any computational variables. This means that if we tell computers that a specific word or combination of words means something positive, then the computer cannot make it mean something negative. In other words, we are not really rating the computer’s ability to determine a sentiment we are rating whether humans did a good job, or not, in biasing the computer to pick that sentiment. This means we can accurately predict an outcome selected by the computer before the first variable is computed against the first rule.”

In other words, accuracy reflects human bias, and for better analytics that bias should be reduced. To reduce bias, analytics’ core elements must be examined: what is analyzed and what it is compared to. The article outlines the steps taken to help reduce bias and how doing so can improve a company’s standing, finances, etc. It looks like accuracy means adding the extra ingredient of love that grandma puts in her cookies, i.e., you have to care about it.

Whitney Grace, January 21, 2013

Indiana University and a Big Data Set

January 20, 2013

Short honk: If you are looking for a big data set to show off your Big Data system, Indiana University can help. “Click Dataset” says:

To foster the study of the structure and dynamics of Web traffic networks, we make available a large dataset (‘Click Dataset’) of HTTP requests made by users at Indiana University. Gathering anonymized requests directly from the network rather than relying on server logs and browser instrumentation allows one to examine large volumes of traffic data while minimizing biases associated with other data sources.

There are some caveats, but for the firms with sci-fi type Big Data analytics systems, the issues should be irrelevant. “Truthy” in advertising? For companies with real world systems, the caveats are important.

Stephen E Arnold, January 20, 2013

Social Search: Don Quixote Is Alive and Well

January 18, 2013

Here I float in Harrod’s Creek, Kentucky, an addled goose. I am interested in other geese in rural Kentucky. I log into Facebook, using a faux human alias (easier than one would imagine), and run a natural language query (human language, of course). I peck with my beak on my iPad using an app, “Geese hook up 40027.” What do I get? Nothing. Zip, zilch, nada.

Intrigued, I query, “modern American drama.” What do I get? Nothing. Zip, zilch, nada.

I give up. Social search just does not work under my quite “normal” conditions.

First, I am a goose spoofing the world as a human. Not too many folks like this on Facebook, so my interests and my social graph are useless.

Second, the key words in my natural language query do not match the Facebook patterns, crafted by former Googlers and 20 somethings to deliver hook up heaven and links to the semi infamous Actor’s Theater or the Kentucky Center.

Social search is not search. Social search is group centric. Social search is an outstanding system for monitoring and surveillance. For information retrieval, social search is a subset of information retrieval. How do semantic methods improve the validity of the information retrieved? I am not exactly sure. Perhaps the vendors will explain and provide documented examples?

Third, without context, my natural language queries shoot through the holes in the Swiss Cheese of the Facebook database.

After I read “The Future of Social Search,” I assumed that information was available at the peck of my beak. How misguided was I? Well, one more “next big thing” in search demonstrated that baloney production is surging in an ailing economy. Optimism is good. Crazy predictions about search are not so good. Look at the sad state of enterprise search, Web search, and email search. Nothing works exactly as I hope. The dust up between Hewlett Packard and Autonomy suggests that “meaning based computing” is a point of contention.

If social search does not work for an addled goose, for whom does it work? According to the wild and crazy write up:

Are social networks (or information networks) the new search engine? Or, as Steve Jobs would argue, is the mobile app the new search engine? Or, is the question-and-answer formula of Quora the real search 2.0? The answer is most likely all of the above, because search is being redefined by all of these factors. Because search is changing, so too is the still maturing notion of social search, and we should certainly think about it as something much grander than socially-enhanced search results.

Yep, Search 2.0.

But the bit of plastic floating in my pond is semantic search. Here’s what the Search 2.0 social crowd asserts:

Let’s embrace the notion that social search should be effortless on the part of the user and exist within a familiar experience — mobile, social or search. What this foretells is a future in which semantic analysis, machine learning, natural language processing and artificial intelligence will digest our every web action and organically spit out a social search experience. This social search future is already unfolding before our very eyes. Foursquare now taps its massive check in database to churn out recommendations personalized by relationships and activities. My6sense prioritizes tweets, RSS feeds and Facebook updates, and it’s working to personalize the web through semantic analysis. Even Flipboard offers a fresh form of social search and helps the user find content through their social relationships. Of course, there’s the obvious implementations of Facebook Instant Personalization: Rotten Tomatoes, Clicker and Yelp offer Facebook-personalized experiences, essentially using your social graph to return better “search” results.

Semantics. Better search results. How does that work on Facebook images and Twitter messages?

My view is that when one looks for information, there are some old fashioned yardsticks; for example, precision, recall, editorial policy, corpus provenance, etc.
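For readers who have not met the first two yardsticks, here is a minimal sketch in Python. It assumes a query returns a list of document IDs and that someone has already judged which documents are actually relevant. The function name and the sample IDs are hypothetical, made up for illustration and not taken from any system mentioned here.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # documents that are both returned and relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: the system returns four documents,
# and a human judge says three documents in the corpus are relevant.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc7"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.50 recall=0.67"
```

Both numbers depend on knowing the full set of relevant documents, which is exactly the knowledge of corpus and editorial policy that a social graph does not supply.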

When a clueless person asks about pop culture, I am not sure that traditional reference sources will provide an answer. But as information access is trivialized, the need for knowledge about the accuracy and comprehensiveness of content, the metrics of precision and recall, and the editorial policy or degree of manipulation baked into the system decreases.

See Advantech.com for details of a surveillance system.

Search has not become better. Search has become subject to self referential mechanisms. That’s why my goose queries disappoint. If I were looking for pizza or Lady Gaga information, I would have hit pay dirt with a social search system. When I look for information based on an idiosyncratic social fingerprint or when I look for hard information to answer difficult questions related to client work, social search is not going to deliver the input which keeps this goose happy.

What is interesting is that so many are embracing a surveillance based system as the next big thing in search. I am glad I am old. I am delighted my old fashioned approach to obtaining information is working just fine without the special advantages a social graph delivers.

Will today’s social search users understand the old fashioned methods of obtaining information? In my opinion, nope. Does it matter? Not to me. I hope some of these social searchers do more than run a Facebook query to study for their electrical engineering certification or to pass board certification for brain surgery.

Stephen E Arnold, January 18, 2013

QlikTech and Attivio Partner for Big Data Analytics

January 17, 2013

More and more collaboration continues to emerge in the Big Data community, particularly among open source-based software companies. Attivio and QlikTech recently formed one such partnership. Daily Finance covers the story in, “QlikTech and Attivio Partner to Deliver QlikView Direct Discovery for Big Data Analytics.”

The article begins:

“QlikTech, a leader in user-driven Business Intelligence (BI), and Attivio today announced a partnership to give customers the ability to combine QlikView in-memory data with Attivio’s Active Intelligence Engine (AIE®) via QlikView Direct Discovery. QlikTech and Attivio have collaborated to test and validate Direct Discovery with Attivio’s AIE to leverage AIE’s capabilities of unifying the variety of Big Data information to give business users a full understanding of not just what has happened, but also why.”

Providing meaning to Big Data is a continuing challenge, but many excellent Big Data solutions are emerging. LucidWorks Big Data is another to consider. LucidWorks boasts industry respect and long-standing investment in the open source community. Its platform is built on Apache Lucene and is considered a leader in cost effective enterprise open source search.

Emily Rae Aldridge, January 17, 2013
