Japanese Government Uses Social Network Data to Reduce Suicide
August 13, 2012
Technology Review recently reported on behavior analysis through social networks in the article “Spotting Suicidal Tendencies on Social Networks.”
According to the article, a history of abnormally high suicide rates among Japanese men (ages 20 to 44) and women (ages 15 to 34) has caused the Japanese government to invest heavily in suicide research and prevention in hopes of cutting the rate by 20 percent by 2017.
One tactic under discussion is identifying people who have regular thoughts of suicide, also known as suicide ideation, through their social networks. At the University of Tokyo, Naoki Masuda and a few others have taken to researching the popular Japanese social network Mixi, which has over 25 million members.
After identifying user communities that may be more prone to suicide ideation, and comparing them with a control group, Masuda found that the differences were quite subtle. There were no differences in friend numbers, age, or gender between the two groups.
Some differences included:
“People prone to suicide ideation are likely to be members of more community groups than the control group. That may be the result of spending longer online and of a desire to want to interact. But a key indicator seems to be that these people are much less likely to be members of friendship triangles. In other words, they have fewer friends who also friends of each other. This low density of friendship triangles appears to be a crucial.”
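The "friendship triangle" idea in the quote can be made concrete with a small sketch. This is not Masuda's code or data; the names, the toy network, and the helper functions below are all hypothetical. It simply counts, for one person, how many pairs of their friends are also friends with each other, and divides by the number of possible pairs (often called the local clustering coefficient).

```python
from itertools import combinations

def build_graph(edges):
    """Turn a list of (a, b) friendships into an undirected adjacency map."""
    friends = {}
    for a, b in edges:
        friends.setdefault(a, set()).add(b)
        friends.setdefault(b, set()).add(a)
    return friends

def triangle_count(friends, person):
    """Count friendship triangles that include `person`."""
    neighbours = sorted(friends.get(person, set()))
    return sum(
        1
        for a, b in combinations(neighbours, 2)
        if b in friends.get(a, set())
    )

def local_clustering(friends, person):
    """Share of a person's friend pairs who are friends with each other."""
    k = len(friends.get(person, set()))
    if k < 2:
        return 0.0  # fewer than two friends: no pair can form a triangle
    possible_pairs = k * (k - 1) / 2
    return triangle_count(friends, person) / possible_pairs

# Hypothetical toy network, made up purely for illustration.
EDGES = [
    ("ana", "ben"), ("ana", "cho"), ("ben", "cho"),
    ("ana", "dai"), ("eri", "dai"), ("eri", "fumi"),
]
GRAPH = build_graph(EDGES)

for name in ("ana", "ben", "eri"):
    print(name, triangle_count(GRAPH, name), round(local_clustering(GRAPH, name), 2))
```

In this toy network "eri" has two friends who do not know each other, so her triangle count is zero, which is the kind of low triangle density the quote describes.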
This is an interesting application of algorithms. Utilizing social networks to discover the links between online and offline behavior is still a burgeoning field and there still remain gaps in our understanding.
Jasmine Ashton, August 13, 2012
Sponsored by ArnoldIT.com, developer of Augmentext
Rapid-I and Radoop Partnership: Working Toward User-Friendly Big Data
August 11, 2012
A new partnership announced between a business intelligence firm and a big data problem solver is one to put on your radar.
Rapid-I, a leading provider of open source solutions, and Radoop, an interface for big data analytics, have joined forces, according to a blog post on Rapid-I, “Partnership Between Rapid-I and Radoop.” Both parties are excited about the collaboration and state that the development of the RapidMiner extension offered by Radoop is just the first step in a long-term business partnership. Dr. Ingo Mierswa, CEO of Rapid-I, says of the move:
“The extension is assuming a pioneering role and is already being used with great satisfaction. The best of two worlds is brought together by Radoop: The analysis of large data volumes enabled by Hadoop and an intuitive user interface brought in by RapidMiner. The real-time insights offered by the solution make a tight interlocking with operational business processes possible and therefore give enterprises a real benefit, e.g. for the early detection of customer churn or sales optimizations through combining CRM data with social media analyses.”
With a whirlwind of nonsense and heavy words surrounding the big data trend, it is refreshing for us to see two companies working toward user-friendly big data analytics. We look forward to the possibilities that will be sure to arise out of this collaborative effort.
Andrea Hayden, August 11, 2012
Ami Revamps Web Site
August 11, 2012
Ami, the enterprise intelligence software company, has refreshed its Web site. If you navigate to the UK version of the company’s site here, you can learn about the firm’s current positioning. (Be warned, though: there is an auto-run video explaining the firm’s approach.) There is also an “old” Web site which is still online at this link.
The new site is more slick and sleek than the old. The color combinations are less jarring, and there is much less clutter on the page. The auto-run video, though, could be a problem. A software company should be aware that, yes, even now, not everyone’s system can handle such an imposition.
The company’s mission as described on the new site reads:
“Our mission is to enable our customers to develop the most far reaching insight and intelligence about the markets and sectors in which they operate through the optimised acquisition, analysis and presentation of information from both internal and external sources. . . .
“AMI Enterprise Intelligence specialises in the development of information and content processing software designed to capture, organise and analyse information from both internal and external software using horizon scanning techniques that are widely considered as best practice in competitor analysis.”
Ami was formed in 1999 by some individuals from the areas of aviation and electronics. These professionals applied the rigorous standards from those fields to the development of their software; those standards, they say, impart an unrivaled level of reliability to their products.

Cynthia Murrell, August 11, 2012
IntelTrax: Top Stories July 27 to August 2
August 6, 2012
Data analytics solutions and other Business Intelligence tools were the primary focus of many of this week’s IntelTrax stories.
Big Data is a continued source of controversy within the analytics community, particularly regarding its existence and whether or not it is something old or new. “Big Data is Analytics for Dummies” argues that big data is simply the rebranding of an old concept.
The referenced article explains the reasoning behind the rebranding argument:
“Cloud computing, for instance, offers much the same thing “ASPs” offered ten years before, with the difference that this time round it is going to work. Similarly, analytics has been available for many years, as a high-cost service using high value supercomputers, and operated by white-coated high priests who have come into the field from linguistics, philosophy and computer science. If you have a big data set, and the money to have it explored, analytics has been there to reveal the secret trends within you information, which might give your business an edge.”
Another notable post from last week is “Data Miners and the Need for Certificates Debunked.” According to the article, because data mining has infiltrated every field, a need for experts and certifications in the field has arisen.
When discussing whether or not certifications have value, the article states:
“The “data mining” definition has been created by marketing industries just to summarize in a buzz word techniques of applied statistics and applied mathematics to the data stored in your hard disk. I don’t want say that tools are useless, but it should be clear that tools are only a mean to solve a problem, not the solution. In the real world the problems are never standard and really seldom you can take an algorithm as is to solve them! …maybe I’m unlucky but I never solved a real problem through a standard method.”
A story that explains the importance of data analytics technology within the insurance industry is “Insurance Doubles Down on Analytics.” According to the article, insurance companies looking to detect fraud are strongly impacted by data and statistics, which is one of the reasons they are embracing the big data revolution.
The story cited:
“The report, which covers the spectrum of tools from business intelligence tools to advanced analytics tools, finds that the average insurer invests 9 percent of the IT budget on data and analytics. This amounts to almost $10 billion per year, and while the insurance industry has long used analytics for traditional risk-centric analysis, there is a shift in the ‘how, where, and when’ the industry leverages data and analytics, according to the report.”
As you can see, text analytics and big data analysis are becoming increasingly important for companies looking to manage their content in a way that makes the most of many different types and structures of data. Digital Reasoning is an analytics company with experience providing affordable solutions for both the government and private sectors.
Jasmine Ashton, August 6, 2012
Epic Analysis
August 2, 2012
A unique application of text analytics may hint at a future for English majors, at last! Phys.org News informs us, “Physicists Study the Classics for Hidden Truths.” Scholars at Coventry University analyzed the Iliad, Beowulf, and the Irish epic Táin Bó Cuailnge. They found that, in all three mythological works, character interactions mirror those found in today’s social networks.
The write up describes the study’s methodology:
“The researchers created a database for each of the three stories and mapped out the characters’ interactions. There were 74 characters identified in Beowulf, 404 in the Táin and 716 in the Iliad.
“Each character was assigned a number, or degree, based on how popular they were, or how many links they had to other characters. The researchers then measured how these degrees were distributed throughout the whole network.
“The types of relationships that existed between the characters were also analysed using two specific criteria: friendliness and hostility.
“Friendly links were made if characters were related, spoke to each other, spoke about one another or it is otherwise clear that they know each other amicably. Hostile links were made if two characters met in a conflict, or when a character clearly displayed animosity against somebody they know.”
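The method quoted above (assign each character a degree, then look at how degrees are distributed, with links tagged friendly or hostile) can be sketched in a few lines. The link list below is a hand-picked toy example, not the researchers' database, and the function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical, hand-picked links: (character_a, character_b, kind).
# This is NOT the study's data set, only a toy illustration.
LINKS = [
    ("Beowulf", "Hrothgar", "friendly"),
    ("Beowulf", "Grendel", "hostile"),
    ("Beowulf", "Wiglaf", "friendly"),
    ("Hrothgar", "Grendel", "hostile"),
    ("Hrothgar", "Wealhtheow", "friendly"),
]

def degrees(links, kind=None):
    """Count links per character, optionally restricted to one kind of link."""
    deg = Counter()
    for a, b, link_kind in links:
        if kind is None or link_kind == kind:
            deg[a] += 1
            deg[b] += 1
    return deg

def degree_distribution(deg):
    """Map each degree value to the number of characters that have it."""
    return Counter(deg.values())

all_degrees = degrees(LINKS)
print(all_degrees.most_common())          # characters ranked by number of links
print(degree_distribution(all_degrees))   # how many characters share each degree
print(degrees(LINKS, kind="hostile"))     # degrees counting hostile links only
```

Comparing the shape of such a degree distribution against that of a real-life social network is, roughly, the kind of comparison the write-up describes.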
These interaction maps paralleled those found in real-life networks. On the other hand, the same analysis of four fictional tales, Les Misérables, Richard III, The Fellowship of the Ring, and Harry Potter, turned up clear differences from real-life interactions. (See the article for more details on these differences.)
Interesting—the classical epics are more true-to-life than fiction. This is not to say that everything in them can be taken as facts, of course; no one insists Beowulf slew a real dragon, for example. However, the study does suggest that as the craft of story writing was refined, it moved away from realistic portrayal of societies and the ways folks related to each other. Why would that be? Ask an English major.
Cynthia Murrell, August 2, 2012
Disagreement on Value of Big Data
August 1, 2012
Is the Big Data phenomenon good or bad for society? The Pew Research Center’s Internet & American Life Project and Elon University’s Imagining the Internet Center recently performed a study that gathered some pretty strong opinions on both sides of the issue, we learn in MediaPost’s “Pew: Value of ‘Big Data’ Debated.”
The survey asked over a thousand technology pros about Big Data, and more than half of them agreed that, by 2020, it will be a “huge positive for society in nearly all respects.” Researchers noted that:
“Big Data proponents predict continuing development of real-time data analysis and enhanced pattern recognition that could bring revolutionary change to personal life, business, and government.”
Probably so. However, a sizable minority (thirty-nine percent) disagree with the rosy outlook, asserting that, by 2020, Big Data will prove to be “a big negative.” I suspect that a field of (informed) non-technical respondents might have turned up an even larger proportion of naysayers. Writer Mark Walsh tells us these survey takers:
“. . . noted that the people controlling the collection and management of large data sets are typically governments or corporations with their own agendas. Dissenters also pointed to a shortage of human curators with the tools to sort through the glut of data, increasing the possibility that data can be manipulated or misread.”
Also true. Hmm.
The write up is an interesting read, and the opinions that accompany the survey results even more so (if you have the time to go through them). My take—like any powerful invention, Big Data collection and analysis can be employed for weal or woe, depending on who’s using it. Where would our society be if we rejected every technology that could be used nefariously? I’m afraid that individual, corporate, and governmental integrity are still the keys. Yes, even now.
Cynthia Murrell, August 1, 2012
Making Data Easy with Training Wheels? The Nielsen Dust Up
July 31, 2012
In the Honk newsletter, I have been plugging away at some of the flights of fancy that surround big data, next generation analytics, and all things predictive. I am nervous about “training wheels” on complex mathematical processes. Like the fill-in-the-blanks functions in Excel, a person without a foundation in math can fiddle around until the software spits out a number which “looks good.” In one of my jobs, my boss was a master at “the flow.” The idea was that numbers can be shaped to support a particular point. I recall his comment to me in 1974, “Most of our clients are not smart enough to work through the math. We have to generate outputs which flow.” The idea is one that troubled me. I moved on to greener and less slippery pastures and I kept that notion of “flow” squarely in mind. Numbers should not cause the person looking at a chart or a table to say, “Wow, that number looks weird.” Hence, flow allows the reasoning process to be guided.
I just read a story which I hope is not accurate. I want to document my coming across the item, however. I think it will be an interesting touchstone as search and content processing companies race to become players in big data and analytics. The story appeared in the Hollywood Reporter, a publication about which I know little. The headline caught my attention because it resonates with advertising and advertising automatically evokes the Google logo for me. “Nielsen Sued for Billions over Allegedly Manipulated TV Ratings” carries a hard hitting subtitle too: “In a huge new lawsuit, the business of TV ratings is fingered for rampant corruption by India’s largest TV news network.” I know even less about India than the Hollywood Reporter.

Fancy math underlies the products and services of many analytics firms which offer products and services to licensees that make interacting with data a matter of pointing and clicking. A happy quack for the equation to http://goo.gl/lBlXV
Here’s the passage I noted:
In a 194-page lawsuit filed in New York court late last week, NDTV accuses Nielsen of violating the Foreign Corrupt Practices Act by manipulating viewership data in favor of channels that are willing to provide bribes to its officials. According to NDTV, rampant manipulation of viewership data has been going on for eight years, and when presented with evidence earlier this year, top executives at Nielsen pledged to make changes. But the Indian news giant says these promises have been false ones.
Like most litigation, the story will unfold slowly and perhaps not at all. The i2 Group Palantir litigation is a relatively recent example. Based on my experience with the boss who wanted numbers to flow, I can see how the possibility of tweaking could be useful to some companies. However, with the dismal state of math skills, how can I know if the problem was a result of human intent, human error, or a training wheels type system driven over rocky terrain? I can’t, and I bet that most people thinking about this situation cannot either.
What is interesting to me, however, are these notions:
- How many other fancy math systems are open to similar allegations from their licensees?
- Will this type of legal action cause some of the vendors pitching fancy math and predictive systems to modify their marketing materials to include more caveats and real world anchors instead of bold assertions?
- How will the legal system deal with fancy math litigation? I don’t know many attorneys. The handful with whom I have some experience have been quick to point out that math, engineering, and science were not their strengths. Logic and reasoning were their strong suits.
With many search and content processing companies embracing fancy math, sentiment analysis, smart indexing and other math-based functions, will a search vendor find itself in the hot seat? I hope not but the market wants to buy fancy math. Understanding the fancy math may drive demand for individuals who can figure out if the systems and methods do what the licensee believes they do.
Oh, I like the word “billions.” Big money adds to the drama of analytics risk management in my opinion.
Stephen E Arnold, July 31, 2012
Inteltrax: Top Stories, July 23 to July 27
July 30, 2012
The growing availability of cloud based and open source analytics and the resulting beefed-up security demands surrounding such easily accessible software are the pervasive topics in this week’s posts on Inteltrax. As usual, the nature and existence of ‘big data’ were discussed, and the post, “Big Data is Analytics for Dummies”, accurately gets to the heart of the matter using text from the Tech Week Europe article as evidence.
“The author is probably spot-on with his analysis of the ‘big data’ hype invading industries around the world. Being right doesn’t change the fact that thanks to open source and cloud technologies more companies than ever before now have access to analytics. If the sage analysts need to dumb down their definitions then so be it. Thankfully, there are analytics providers committed to the industries and companies previously not invited into the analytics club.”
All that talk of openness among comrades came to a point with the announcement that Datameer was offering their analytics free of charge to academics mired in the muck of unstructured data. As the post, “Datameer Offers Free License to Academic Researchers”, quotes Market Watch,
“Academic researchers are particularly challenged by the massive amounts of data needed for their research. Collecting and analyzing data requires enormous computational effort and has typically been slow and tedious, often requiring a computer science background. Datameer offers an end-user focused tool that enables researchers themselves to integrate large quantities of data, do complex analysis in a familiar spreadsheet-like interface, and then visualize their results to easily understand, communicate and share their findings.”
Open and free are great especially in the world of costly analytics but both come with a price – heightened security risks. Inteltrax author Catherine Lamsfuss tackled security concerns with the post, “Security Top Concern for Cloud Based Software”. Live Mint compared cloud breaches to a door lock – it’s not a question of if the door will be broken down, but when. The article summarizes the state of security surrounding today’s cloud.
“These security issues should be at the forefront of companies’ decision making process when it comes to choosing a cloud based analytics provider. All cloud based software is protected to some degree but if protecting sensitive information is important than a thorough investigation into a provider’s security background is due.”
Whether one’s company is struggling with finding affordable cloud based analytics, applying open source to existing systems, or trying to strengthen security, Digital Reasoning is a solid analytics provider more than capable of helping. With an extensive relationship with the intelligence community, they understand the need for security but also are realistic about budgets, especially those of small and midsized businesses.
Follow the Inteltrax news stream by visiting www.inteltrax.com
Patrick Roland, Editor, Inteltrax.
July 30, 2012
So Let’s Talk About MBAs Understanding Analytic Methods
July 27, 2012
Apparently biologists and scientists are having communication issues, but fortunately they have a medium to provide some counsel. Repercussions can be limited: according to PNAS’s article “Heavy use of Equations Impedes Communication Among Biologists,” scientists just need to work on relevance and presentation in order to improve communication and move forward.
The method to the MBA’s madness still seems a little mad, as:
“Most research in biology is empirical, yet empirical studies rely fundamentally on theoretical work for generating testable predictions and interpreting observations. Despite this interdependence, many empirical studies build largely on other empirical studies with little direct reference to relevant theory, suggesting a failure of communication that may hinder scientific progress.”
Their actual point was that too many mathematical equations have a negative effect on the number of citations a paper receives. It seems that more equations per page mean fewer citations. The irony is that the language of science is often one of mathematics.
Communication is the key to progress, and obviously scientists are not locked out since they are still progressing. The MBAs think they need to enhance the presentation of the mathematical models utilized. This makes sense, but science is based on research and theory, so eventually the professionals will draw their own conclusions.
So, let us talk about MBAs’ understanding of analytic methods. Is their conclusion feasible, or will it only lead to more miscommunication?
Jennifer Shockley, July 27, 2012
Latent Semantic Technology Tops Business Strategy List
July 26, 2012
The year 2012 is going by quickly, but there is still time to implement business strategies that could gain your company a bigger presence on the Internet. Venture Beat reported on the “Top 10 Most Important SEO and Social Marketing Tactics of 2012.” Generally these top ten lists yield information we already know: distribute content via social channels, list your social media connection buttons prominently on the page, enable sharing content, join Pinterest, etc. Some of the ideas are new: author guest blog posts, keep your own blog content interesting and new, but the number one suggestion that caught our attention was:
“Get an onsite SEO audit: an onsite SEO audit is the foundation of your SEO campaign. Getting one will help you answer questions like: Are your title and meta tags optimized? How’s your keyword density? Have you correlated certain pages with certain keywords? Is that evident in the copy? Have you done your LSI (latent semantic indexing) research and incorporated it into the copy? An onsite SEO audit is relatively cheap, and it’s a one-time payment that you shouldn’t need to address more than once a year.”
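One of the audit questions quoted above, "How's your keyword density?", comes down to simple arithmetic: the number of times a keyword appears divided by the total number of words. The sketch below is a minimal illustration under that common definition. The function name, the tokenizing rule, and the sample sentence are assumptions, it handles single-word keywords only, and it says nothing about what density a search engine might prefer.

```python
import re

def keyword_density(text, keyword):
    """Return the fraction of words in `text` that equal `keyword` (case-insensitive)."""
    # Assumed tokenizing rule: runs of letters, digits, or apostrophes count as words.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return hits / len(words)

# Hypothetical page copy, made up for illustration.
SAMPLE = "Fresh coffee beans make better coffee. Buy coffee beans online."

print(f"coffee: {keyword_density(SAMPLE, 'coffee'):.0%}")
print(f"beans:  {keyword_density(SAMPLE, 'beans'):.0%}")
```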
An SEO audit done by a professional company will work wonders; heck, if you do your research, you can provide the service for yourself. One important aspect of the audit is latent semantic indexing, a powerful component of text and document analysis.
Whitney Grace, July 26, 2012
Sponsored by Polyspot

