Big Data Shows Its Return on Investment

January 13, 2016

Big data was the word that buzzed through the IT community and made companies reevaluate their data analytics and consider new ways to use structured and unstructured information to their benefit. Business2Community shares how big data has affected companies in sixteen case studies: “16 Case Studies Of Companies Proving ROI Of Big Data.” One of the problems companies faced when implementing a big data plan was whether or not they would see a return on their investment. Some companies saw an immediate return, but others are still scratching their heads. Enough time has passed to see how various corporations in different industries have fared.

Companies remain committed to building big data plans into their frameworks; what most of them want to learn is how to use the data effectively:

  • “91% of marketing leaders believe successful brands use customer data to drive business decisions (source: BRITE/NYAMA)
  • 87% agree capturing and sharing the right data is important to effectively measuring ROI in their own company (BRITE/NYAMA)
  • 86% of people are willing to pay more for a great customer experience with a brand (source: Lunch Pail)”

General Electric uses big data to test its products’ efficiency and to crunch the analytics to increase productivity. The Weather Channel analyzes its users’ behavior patterns along with climate data in individual areas to become an advertising warehouse. The big retailer Wal-Mart has added machine learning, synonym mining, and text analysis to increase search result relevancy. Semantic search has also increased online shopping by ten percent.
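
The synonym-mining and semantic-search work attributed to Wal-Mart is easiest to picture with a toy example. The Python sketch below expands a query with a hand-built synonym table before matching it against product descriptions; the table and the catalog are invented for illustration and are not Wal-Mart’s actual system.

```python
# A toy sketch of synonym expansion for search relevancy. The synonym table
# and the product catalog below are invented; a production system would mine
# synonyms from query logs and catalog text rather than hard-code them.
SYNONYMS = {
    "sofa": {"couch", "loveseat"},
    "sneakers": {"trainers", "running shoes"},
}

PRODUCTS = [
    "3-seat couch in grey fabric",
    "leather loveseat",
    "men's running shoes size 10",
    "oak coffee table",
]

def search(query: str) -> list[str]:
    # Expand the query with its synonyms, then match any expanded term.
    terms = {query.lower()} | SYNONYMS.get(query.lower(), set())
    return [p for p in PRODUCTS if any(t in p.lower() for t in terms)]

if __name__ == "__main__":
    print(search("sofa"))      # finds both the couch and the loveseat
    print(search("sneakers"))  # finds the running shoes
```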

The article highlights many other big-brand companies and how big data has become a boon for businesses looking to improve customer relations, increase sales, and improve their services.


Whitney Grace, January 13, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

How Big Data Is Missing the Mark

January 5, 2016

At this point in the Big Data sensation, many businesses are swimming in data without the means to leverage it effectively. TechWeek Europe cites a recent survey from storage provider Pure Storage in its write-up, “Big Data ‘Fails Businesses’ Due to Access, Skills Shortage.” Interestingly, most of the problems seem to have more to do with human procedures and short-sightedness than any technical shortcomings. Writer Tom Jowitt lists the three top obstacles as a lack of skilled workers, limited access to information, and bureaucracy. He tells us:

“So what exactly is going wrong with Big Data to be causing such problems? Well over half (56 percent) of respondents said bureaucratic red tape was the most serious obstacle for business productivity. ‘Bureaucratic red tape around access to information is preventing companies from using their data to find those unique pieces of insight that lead to great ideas,’ said [Pure Storage’s James] Petter. ‘Data ownership is no longer just the remit of the CIO, the democratisation of insight across businesses enables them to disrupt the competition.’ But regulations are also causing worry, with one in ten of the companies citing data protection concerns as holding up their dissemination of information and data throughout their business. The upcoming EU General Data Protection Regulation will soon affect every single company that stores data.”

The survey reports that missed opportunities have cost businesses billions of pounds per year, and almost three-quarters of respondents say their organizations collect data that is just collecting dust. Both cost and time are reasons that information remains unprocessed. On the other hand, Jowitt points to another survey by CA Technologies; most of its respondents expect the situation to improve, and for their data collections to bring profits down the road. Let us hope they are correct.


Cynthia Murrell, January 5, 2016

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Data Managers as Data Librarians

December 31, 2015

The tools of a librarian may be the key to better data governance, according to an article at InFocus titled, “What Librarians Can Teach Us About Managing Big Data.” Writer Joseph Dossantos begins by outlining the plight data managers often find themselves in: executives can talk a big game about big data, but want to foist all the responsibility onto their overworked and outdated IT departments. The article asserts, though, that today’s emphasis on data analysis will force a shift in perspective and approach—data organization will come to resemble the Dewey Decimal System. Dossantos writes:

“Traditional Data Warehouses do not work unless there [is] a common vocabulary and understanding of a problem, but consider how things work in academia. Every day, tenured professors and students pore over raw material looking for new insights into the past and new ways to explain culture, politics, and philosophy. Their sources of choice: archived photographs, primary documents found in a city hall, monastery or excavation site, scrolls from a long-abandoned cave, or voice recordings from the Oval office – in short, anything in any kind of format. And who can help them find what they are looking for? A skilled librarian who knows how to effectively search for not only books, but primary source material across the world, who can understand, create, and navigate a catalog to accelerate a researcher’s efforts.”

The article goes on to discuss the influence of the “Wikipedia mindset”; data accuracy and whether it matters; and devising structures to address different researchers’ needs. See the article for details on each of these (especially on meeting different needs). The write-up concludes with a call for data-governance professionals to think of themselves as “data librarians.” Is this approach the key to more effective data search and analysis?
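
The “data librarian” framing is easy to make concrete. Below is a minimal, hypothetical Python sketch of a catalog in which each data set is described with owner, location, and subject metadata so researchers can find it by topic, much as a library catalog points to primary sources; the fields and entries are assumptions, not anything prescribed by the article.

```python
# A minimal sketch of a searchable data catalog, the "data librarian" idea.
# Fields and entries are hypothetical placeholders.
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    name: str
    owner: str
    location: str                  # e.g. a table name, bucket, or file path
    subjects: set[str] = field(default_factory=set)

CATALOG = [
    CatalogEntry("call_center_transcripts", "cx-team",
                 "s3://warehouse/cx/calls/", {"customers", "sentiment"}),
    CatalogEntry("store_sales_2015", "finance",
                 "warehouse.sales.fact_2015", {"revenue", "retail", "customers"}),
]

def find_by_subject(subject: str) -> list[CatalogEntry]:
    # Return every data set tagged with the requested subject.
    return [e for e in CATALOG if subject.lower() in e.subjects]

if __name__ == "__main__":
    for entry in find_by_subject("customers"):
        print(entry.name, "->", entry.location)
```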

Cynthia Murrell, December 31, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

New Year’s Resolutions in Personal Data Security

December 22, 2015

The article on ITProPortal titled “What Did We Learn in Records Management in 2015 and What Lies Ahead for 2016?” delves into the unlearnt lessons in data security. The article begins with a look back over the year’s major data breaches, including Ashley Madison, JP Morgan, and VTech, and draws from them a common thread: hackers are increasingly targeting personal information. The article reports,

“A Crown Records Management Survey earlier in 2015 revealed two-thirds of people interviewed – all of them IT decision makers at UK companies with more than 200 employees – admitted losing important data… human error is continuing to put that information at risk as businesses fail to protect it properly…but there is legislation on the horizon that could prompt change – and a greater public awareness of data protection issues could also drive the agenda.”

The article also makes a few predictions about upcoming developments in our approach to data protection. Among them are the passage of the European Union General Data Protection Regulation (EU GDPR) and its resulting effect on businesses. In terms of apps, the article suggests that more people might start asking questions about the information required to use certain apps (especially when the data requested is completely irrelevant to the app’s functions). The outlook is generally optimistic, but these developments will only occur if people, businesses, and governments take data breaches and privacy more seriously.


Chelsea Kerwin, December 22, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph


Topology Is Finally on Top

December 21, 2015

Topology’s time has finally come, according to “The Unreasonable Usefulness of Imagining You Live in a Rubbery World,” shared by 3 Quarks Daily. The engaging article reminds us that the field of topology emphasizes connections over geometric factors like distance and direction. Think of a subway map as compared to a street map; or, as writer Jonathan Kujawa describes:

“Topologists ask a question which at first sounds ridiculous: ‘What can you say about the shape of an object if you have no concern for lengths, angles, areas, or volumes?’ They imagine a world where everything is made of silly putty. You can bend, stretch, and distort objects as much as you like. What is forbidden is cutting and gluing. Otherwise pretty much anything goes.”

Since the beginning, this perspective has been dismissed by many as purely academic. However, today’s era of networks and big data has boosted the field’s usefulness. The article observes:

“A remarkable new application of topology has emerged in the last few years. Gunnar Carlsson is a mathematician at Stanford who uses topology to extract meaningful information from large data sets. He and others invented a new field of mathematics called Topological data analysis. They use the tools of topology to wrangle huge data sets. In addition to the networks mentioned above, Big Data has given us Brobdinagian sized data sets in which, for example, we would like to be able to identify clusters. We might be able to visually identify clusters if the data points depend on only one or two variables so that they can be drawn in two or three dimensions.”

Kujawa goes on to note that one century-old tool of topology, homology, is being used to analyze real-world data, like the ways diabetes patients have responded to a specific medication. See the well-illustrated article for further discussion.
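
The clustering idea at the heart of topological data analysis can be sketched without any specialized library: connect points that lie within a chosen distance of one another and read off the connected components; watching components merge as the threshold grows is the zero-dimensional case of persistent homology. The Python below is a minimal illustration with made-up data, not the tooling Carlsson’s group uses.

```python
# Minimal sketch: clusters as connected components at a distance threshold,
# the 0-dimensional idea behind persistent homology. Data are invented.
from itertools import combinations
from math import dist

def clusters(points, threshold):
    parent = list(range(len(points)))  # union-find forest over point indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Union any two points closer than the threshold.
    for a, b in combinations(range(len(points)), 2):
        if dist(points[a], points[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())

if __name__ == "__main__":
    data = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (9.0, 0.0)]
    print(clusters(data, threshold=1.0))  # three clusters at this scale
```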

Cynthia Murrell, December 21, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph


Old School Mainframes Still Key to Big Data

December 17, 2015

According to ZDNet, “The Ultimate Answer to the Handling of Big Data: The Mainframe.” Believe it or not, a recent Syncsort survey of 187 IT pros found the mainframe to be important to their big data strategies. IBM has even created a Hadoop-capable mainframe. Reporter Ken Hess lists some of the survey’s findings:

  • More than two-thirds of respondents (69 percent) ranked the use of the mainframe for performing large-scale transaction processing as very important
  • More than two-thirds (67.4 percent) of respondents also pointed to integration with other standalone computing platforms such as Linux, UNIX, or Windows as a key strength of the mainframe
  • While the majority (79 percent) analyze real-time transactional data from the mainframe with a tool that resides directly on the mainframe, respondents are also turning to platforms such as Splunk (11.8 percent), Hadoop (8.6 percent), and Spark (1.6 percent) to supplement their real-time data analysis […]
  • 82.9 percent and 83.4 percent of respondents cited security and availability, respectively, as key strengths of the mainframe
  • In a weighted calculation, respondents ranked security and compliance as their top areas to improve over the next 12 months, followed by CPU usage and related costs and meeting Service Level Agreements (SLAs)
  • A separate weighted calculation showed that respondents felt their CIOs would rank all of the same areas in their top three to improve

Hess goes on to note that most of us probably utilize mainframes without thinking about it; whenever we pull cash out of an ATM, for example. The mainframe’s security and scalability remain unequaled, he writes, by any other platform or platform cluster yet devised. He links to a couple of resources besides the Syncsort survey that support this position: a white paper from IBM’s Big Data & Analytics Hub and a report from research firm Forrester.


Cynthia Murrell, December 17, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph


Big Data Gets Emotional

December 15, 2015

Christmas is the biggest shopping season of the year, and retailers spend months studying consumer data. They want to understand consumer buying habits, popular trends in clothing, toys, and other products, physical versus online retail, and especially what sales the competition will run to entice customers to buy more. Smart Data Collective recently wrote about the science of shopping in “Using Big Data To Track And Measure Emotion.”

Customer experience professionals study three things related to customer spending habits: ease, effectiveness, and emotion. Emotion is the biggest of the three and the strongest driver of customer loyalty. If data specialists could figure out the perfect way to measure emotion, shopping and science as we know them would change.

“While it is impossible to ask customers how do they feel at every stage of their journey, there is a largely untapped source of data that can provide a hefty chunk of that information. Every day, enterprise servers store thousands of minutes of phone calls, during which customers are voicing their opinions, wishes and complaints about the brand, product or service, and sharing their feelings in their purest form.”

The article describes some of the methods by which emotional data is gathered: phone recordings, surveys, and, above all, analysis of vocal and speech layers. Analytics platforms measure the relationships between words and phrases to understand sentiment, ranging the emotions on a five-point scale from positive to negative to discover the patterns that trigger reactions.
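
As a rough illustration of the kind of scoring described above, the Python sketch below maps the words in a call transcript onto a five-point scale, from 1 (very negative) to 5 (very positive). The lexicon and the transcript are invented; real analytics platforms rely on far richer language and acoustic models.

```python
# Toy five-point sentiment scoring over a call transcript. The word lists
# and the example call are made up for illustration only.
POSITIVE = {"love", "great", "happy", "recommend", "thanks"}
NEGATIVE = {"broken", "refund", "angry", "cancel", "disappointed"}

def five_point_sentiment(transcript: str) -> int:
    words = transcript.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    # Clamp the raw tally into the 1-5 range, with 3 meaning neutral.
    return max(1, min(5, 3 + score))

if __name__ == "__main__":
    call = "I love the product but it arrived broken and I want a refund"
    print(five_point_sentiment(call))  # 2: leaning negative
```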

Customer experience input is a data analyst’s dream as well as a nightmare, given all of the data constantly coming in.

Whitney Grace, December 15, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Computers Pose Barriers to Scientific Reproducibility

December 9, 2015

These days, it is hard to imagine performing scientific research without the help of computers. Phys.org details the problem that poses in its thorough article, “How Computers Broke Science—And What We Can Do to Fix It.” Many of us learned in school that reliable scientific conclusions rest on a foundation of reproducibility. That is, if an experiment’s results can be reproduced by other scientists following the same steps, the results can be trusted. However, now many of those steps are hidden within researchers’ hard drives, making the test of reproducibility difficult or impossible to apply. Writer Ben Marwick points out:

“Stanford statisticians Jonathan Buckheit and David Donoho [PDF] described this issue as early as 1995, when the personal computer was still a fairly new idea.

‘An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.’

“They make a radical claim. It means all those private files on our personal computers, and the private analysis tasks we do as we work toward preparing for publication should be made public along with the journal article.

This would be a huge change in the way scientists work. We’d need to prepare from the start for everything we do on the computer to eventually be made available for others to see. For many researchers, that’s an overwhelming thought. Victoria Stodden has found the biggest objection to sharing files is the time it takes to prepare them by writing documentation and cleaning them up. The second biggest concern is the risk of not receiving credit for the files if someone else uses them.”

So, do we give up on the test of reproducibility, or do we find a way to address those concerns? Well, this is the scientific community we’re talking about. There are already many researchers in several fields devising solutions. Poetically, those solutions tend to be software-based. For example, some are turning to executable scripts instead of the harder-to-record series of mouse clicks. There are also suggestions for standardized file formats and organizational structures. See the article for more details on these efforts.
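
The “executable script” remedy mentioned above is simply the practice of putting every step from raw data to reported numbers into one runnable file that can be published alongside the paper. The sketch below assumes a hypothetical CSV of measurements and a single derived summary; the file paths and column name are placeholders, not from the article.

```python
# A minimal reproducibility sketch: one script regenerates the reported
# summary from the raw data on every run. Paths and column names are
# hypothetical placeholders.
import csv
import statistics
from pathlib import Path

RAW = Path("data/raw/measurements.csv")   # raw input, archived with the paper
OUT = Path("results/summary.txt")         # derived output, always regenerated

def main() -> None:
    with RAW.open(newline="") as f:
        values = [float(row["response"]) for row in csv.DictReader(f)]
    OUT.parent.mkdir(parents=True, exist_ok=True)
    OUT.write_text(
        f"n = {len(values)}\n"
        f"mean = {statistics.mean(values):.3f}\n"
        f"stdev = {statistics.stdev(values):.3f}\n"
    )

if __name__ == "__main__":
    main()
```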

A final caveat: Marwick notes that computers are not the only problem with reproducibility today. He also cites “poor experimental design, inappropriate statistical methods, a highly competitive research environment and the high value placed on novelty and publication in high-profile journals” as contributing factors. Now we know at least one issue is being addressed.

Cynthia Murrell, December 9, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Interview with Informatica CEO

November 26, 2015

Blogger and Datameer CEO Stefan Groschupf interviews Anil Chakravarthy, acting CEO of Informatica, in a series of posts on his blog, Big Data & Brews. The two executives discuss security in the cloud, data infrastructure, schemas, and the future of data. There are four installments as of this writing, but it was an exchange in the second iteration, “Big Data & Brews: Part II on Data Security with Informatica,” that captured our attention. Here’s Chakravarthy’s summary of the challenge now facing his company:

Stefan: From your perspective, where’s the biggest growth opportunity for your company?

Anil: We look at it as the intersection of what’s happening with the cloud and big data. Not only the movement of data between our premise and cloud and within cloud to cloud but also just the sheer growth of data in the cloud. This is a big opportunity. And if you look at the big data world, I think a lot of what happens in the big data world from our perspective, the value, especially for enterprise customers, the value of big data comes from when they can derive insights by combining data that they have from their own systems, etc., with either third-party data, customer-generated data, machine data that they can put together. So, that intersection is good for, and we are a data infrastructure provider, so those are the two big areas where we see opportunity.

It looks like Informatica is poised to make the most of the changes prompted by cloud technology. To check out the interview from the beginning, navigate to the first installment, “Big Data & Brews: Informatica Talks Security.”

Informatica offers a range of data-management and integration tools. Though the company has offices around the world, they maintain their headquarters in Redwood City, California. They are also hiring as of this writing.

Cynthia Murrell, November 26, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph


No Mole, Just Data

November 23, 2015

It all comes down to putting together the pieces, we learn from Salon’s article, “How to Explain the KGB’s Amazing Success Identifying CIA Agents in the Field?” For years, the CIA was convinced there was a Soviet mole in its midst; how else to explain the KGB’s uncanny knack for identifying CIA agents throughout the 20th century? Now we know it was due to the brilliance of one data-savvy KGB agent, Yuri Totrov, who analyzed the U.S. government’s personnel data to separate the spies from the rest of our workers overseas. The technique was very effective, and all without the benefit of today’s analytics engines.

Totrov began by searching the KGB’s own data, and that of allies like Cuba, for patterns in known CIA agent postings. He also gleaned a lot of info from publicly available U.S. literature and from local police. Totrov was able to derive 26 “unchanging indicators” that would pinpoint a CIA agent, as well as many other markers that were less universal but still useful: CIA agents driving the same car and renting the same apartment as their immediate predecessors, for instance. Apparently, the logistics staff back at Langley did not foresee that such consistency, though cost-effective, could be used against us.

Reporter Jonathan Haslam elaborates:

“Thus one productive line of inquiry quickly yielded evidence: the differences in the way agency officers undercover as diplomats were treated from genuine foreign service officers (FSOs). The pay scale at entry was much higher for a CIA officer; after three to four years abroad a genuine FSO could return home, whereas an agency employee could not; real FSOs had to be recruited between the ages of 21 and 31, whereas this did not apply to an agency officer; only real FSOs had to attend the Institute of Foreign Service for three months before entering the service; naturalized Americans could not become FSOs for at least nine years but they could become agency employees; when agency officers returned home, they did not normally appear in State Department listings; should they appear they were classified as research and planning, research and intelligence, consular or chancery for security affairs; unlike FSOs, agency officers could change their place of work for no apparent reason; their published biographies contained obvious gaps; agency officers could be relocated within the country to which they were posted, FSOs were not; agency officers usually had more than one working foreign language; their cover was usually as a ‘political’ or ‘consular’ official (often vice-consul); internal embassy reorganizations usually left agency personnel untouched, whether their rank, their office space or their telephones; their offices were located in restricted zones within the embassy; they would appear on the streets during the working day using public telephone boxes; they would arrange meetings for the evening, out of town, usually around 7.30 p.m. or 8.00 p.m.; and whereas FSOs had to observe strict rules about attending dinner, agency officers could come and go as they pleased.”

In the era of Big Data, it seems like common sense to expect such deviations to be noticed and correlated, but it was not always so obvious. Nevertheless, Totrov’s methods caused embarrassment for the agency when they were revealed. Surely the CIA has changed its logistical ways dramatically since then to avoid such discernible patterns. Right?
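
Totrov’s method amounts to scoring each overseas employee against a checklist of indicators and flagging anyone who matches too many. The Python sketch below captures that logic with invented stand-in indicators and a made-up threshold; it is not a reconstruction of his actual 26 criteria.

```python
# Hedged sketch of indicator-based screening: count how many tell-tale
# markers a personnel record matches and flag heavy matches. Indicators,
# fields, and the threshold are invented stand-ins.
INDICATORS = [
    lambda p: p.get("entry_pay_grade", 0) > 10,            # unusually high starting pay
    lambda p: p.get("years_abroad", 0) > 4,                 # never rotated home
    lambda p: not p.get("in_state_dept_listing", True),     # missing from official listings
    lambda p: bool(p.get("inherited_predecessors_flat")),   # same apartment as predecessor
]

def flag_candidates(people, threshold=3):
    flagged = []
    for person in people:
        hits = sum(bool(check(person)) for check in INDICATORS)
        if hits >= threshold:
            flagged.append(person["name"])
    return flagged

if __name__ == "__main__":
    staff = [
        {"name": "A", "entry_pay_grade": 12, "years_abroad": 6,
         "in_state_dept_listing": False, "inherited_predecessors_flat": True},
        {"name": "B", "entry_pay_grade": 7, "years_abroad": 2,
         "in_state_dept_listing": True, "inherited_predecessors_flat": False},
    ]
    print(flag_candidates(staff))  # ['A']
```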

Cynthia Murrell, November 23, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

