Data: Such a Flexible and Handy Marketing Tool

July 5, 2019

Love Big Data? Like New Age research? Enjoy studies funded by commercial enterprises? If you are nodding in agreement, head on over to “Evidence-Based Medicine Has Been Hijacked: A Confession from John Ioannidis.”

Here’s a statement to ponder:

Since clinical research that can generate useful clinical evidence has fallen off the radar screen of many/most public funders, it is largely left up to the industry to support it.  The sales and marketing departments in most companies are more powerful than their R&D departments. Hence, the design, conduct, reporting, and dissemination of this clinical evidence becomes an advertisement tool. As for “basic” research, as I explain in the paper, the current system favors PIs who make a primary focus of their career how to absorb more money. Success in obtaining (more) funding in a fiercely competitive world is what counts the most. Given that much “basic” research is justifiably unpredictable in terms of its yield, we are encouraging aggressive gamblers. Unfortunately, it is not gambling for getting major, high-risk discoveries (which would have been nice), it is gambling for simply getting more money.

Does this observation apply to the world of Big Data, online advertising, and the spreadsheet fever plaguing MBAs? Yep.

  1. People believe numbers and most do not ask, “Where did this number come from? What was the sample? How did you verify these data?”
  2. Outputs can be shaped. Check out your college class notes for Statistics 101; that is, I am assuming you kept your college notes. See anything about best practices? Validity tests?
  3. What about those thresholds? Many Bayesian methods are based upon guesses. Toss in some Monte Carlo? How representative of the outputs? What are the deltas between the current outputs and other available data?
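Point 3 can be made concrete. Here is a minimal sketch, assuming a textbook Beta-Binomial model. The data (7 hits in 10 trials) and both priors are made-up illustrations, not numbers from any study mentioned in this post. The same data run through two different guesses produce two different outputs.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a success rate under a Beta(prior_a, prior_b) prior."""
    if successes < 0 or trials < successes:
        raise ValueError("need 0 <= successes <= trials")
    return (prior_a + successes) / (prior_a + prior_b + trials)


# Hypothetical data: 7 successes in 10 trials.
successes, trials = 7, 10

# Guess 1: a flat prior, Beta(1, 1).
flat_guess = posterior_mean(1, 1, successes, trials)

# Guess 2: a skeptical prior, Beta(2, 8), which assumes a low success rate.
skeptical_guess = posterior_mean(2, 8, successes, trials)

# The delta is produced by the prior alone; the data never changed.
delta = flat_guess - skeptical_guess

print(f"flat prior:      {flat_guess:.3f}")
print(f"skeptical prior: {skeptical_guess:.3f}")
print(f"delta:           {delta:.3f}")
```

The gap between the two outputs comes entirely from the guess, which is the question to put to whoever hands over the number.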

Our next Factualities will appear in this blog on Wednesday. There are some special numbers in that roundup.

A friend of mine who owns a successful online business said, “Nobody cares.”

Nobody cares?

Stephen E Arnold, July 5, 2019

Knowledge Graphs: Getting Hot

July 4, 2019

Artificial intelligence, semantics, and machine learning may lose their pride of place in the techno-jargon whiz bang marketing world. I read “A Common Sense View of Knowledge Graphs,” and noted this graph:

image

This is a good, old fashioned, Gene Garfield (remember him, gentle reader?) citation analysis. The idea is that one can “see” how frequently an author or, in this case, a concept has been cited in the “literature.” Now publishers are dropping like flies and are publishing bunk. Nevertheless, one can see that using the phrase knowledge graph is getting popular within the sample of “literature” parsed for this graph. (No, I don’t recommend trying to perform citation analysis in Bing, Facebook, or Google. The reasons will just depress me and you, gentle reader.)

The section of the write up I found useful and worthy of my “research” file is the collection of references to documents defining “knowledge graph.” This is useful, helpful research.

The write up also includes a diagram which may be one of the first representations of a graph centric triple. I thought this was something cooked up by Drs. Bray, Guha, and others in the tsunami of semantic excitement.

One final point: The list of endnotes is also useful. In short, good write up. The downside is that if the article gets wider distribution, a feeding frenzy among money desperate consultants, advisers, and analysts will be ignited like a Fourth of July fountain of flame.

Stephen E Arnold, July 4, 2019

15 Reasons You Need Business Intelligence Software

May 21, 2019

I read StrategyDriven’s “The Importance of Business Intelligence Software and Why It’s Integral for Business Success.” I found the laundry list interesting, but I asked myself, “If BI software is so important, why is it necessary to provide 15 reasons?”

I went through the list of items a couple of times. Some of the reasons struck me as a bit of a stretch. I had a teacher at the University of Illinois who loved the phrase “a bit of a stretch, right” when a graduate student proposed a wild and crazy hypothesis or drew a nutsy conclusion from data.

Let’s look at four of these reasons and see if there’s merit to my skepticism about delivering fish to a busy manager when the person wanted a fish sandwich.

Reason 1. Better business decisions. Really? If a BI system outputs data to a clueless person or uses flawed, incomplete, or stale data to present an output to a bright person, are better business decisions an outcome? In my experience, nope.

Reason 6. Accurate decision making. What the human does with the outputs is likely to result in a decision. That’s true. But accurate? Too many variables exist to create a one to one correlation between the assertion and what happens in a decider’s head or among a group of deciders who get together to figure out what to do. Example: Google has data. Google decided to pay a person accused of improper behavior millions of dollars. Accurate decision making? I suppose it depends on one’s point of view.

Reason 11. Reduced cost. I am confident when I say, “Most companies do not calculate or have the ability to assemble the information needed to produce fully loaded costs.” Consequently, the cost of a BI system is not the license fee. There are the associated directs and indirects. And when a decision from the BI system is wrong, there are some other costs as well. How are Facebook’s eDiscovery systems generating a payback today? Facebook has data, but the costs of its eDiscovery systems are not known, nor does anyone care as the legal hassles continue to flood the company’s executive suite.

Reason 13. High quality data. Whoa, hold your horses. The data cost is an issue in virtually every company with which I have experience. No one wants to invest to make certain that the information is complete, accurate, up to date, and maintained (indexed accurately and put in a consistent format). This is a pretty crazy assertion about BI when there is no guarantee that the data fed into the system is representative, comprehensive, accurate, and fresh.

Business intelligence is a tool. Use of a BI system does not generate guaranteed outcomes.

Stephen E Arnold, May 21, 2019

Into R? A List for You

May 12, 2019

Computerworld, which runs some pretty unusual stories, published “Great R Packages for Data Import, Wrangling and Visualization.” “Great” is an interesting word. In the lingo of Computerworld, a real journalist did some searching, talked to some people, and created a list. As it turns out, the effort is useful. Looking at the Computerworld table is quite a bit easier than trying to dig information out of assorted online sources. Plus, people are not too keen on the phone and email thing now.

The listing includes a mixture of different tools, software, and utilities. There are more than 80 listings. I wasn’t sure what to make of XML’s inclusion in the list, but the source is Computerworld, and I assume that the “real” journalist knows much more than I.

Three observations:

  • Earthworm lists without classification or alphabetization are less useful to me than listings which are sorted by tags and alphabetized within categories. Excel does perform this helpful trick.
  • Some items in the earthworm list have links and others do not. Consistency, I suppose, is the hobgoblin of some types of intellectual work.
  • An indication of which item is free or for fee would be useful too.
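A minimal sketch of the first observation, assuming each entry carries a category tag: sort by tag, then alphabetize within the tag. The five entries below are my own illustration, not a copy of the Computerworld table, and the category labels are assumptions.

```python
def sort_listing(entries):
    """Return (name, category) pairs sorted by category, then by name.

    Names are compared case-insensitively so that capitalization
    does not scramble the alphabetical order within a category.
    """
    return sorted(entries, key=lambda entry: (entry[1].lower(), entry[0].lower()))


# Illustrative entries only; the categories are assumed labels.
earthworm_list = [
    ("ggplot2", "visualization"),
    ("dplyr", "wrangling"),
    ("readr", "import"),
    ("tidyr", "wrangling"),
    ("leaflet", "visualization"),
]

sorted_list = sort_listing(earthworm_list)

for name, category in sorted_list:
    print(f"{category:15s} {name}")
```

A third field for “free” or “for fee” could be added to each tuple in the same way, which would cover the last observation too.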

Despite these shortcomings, you may want to download the list and tuck it into your “Things I love about R” folder.

Stephen E Arnold, May 12, 2019

Cognos: Now Transforming Business After Only 50 Years

May 3, 2019

In 1969, Cognos officially opened for business. That was a half century ago. In its 50 years of “transformation,” Cognos has absorbed a number of other technologies. Anyone remember Databeacon, the mid market analytics outfit? Cognos strikes me as an umbrella brand. According to CIO’s article “5 Ways IBM Cognos Analytics Is Transforming Business,” IBM’s Cognos Analytics has integrated the artificial intelligence capabilities of IBM Watson Analytics.

Okay, 50 years, much thrashing, and IBM is not on a par with the zippier outfits like DataRobot’s Eureqa. The idea of transforming is interesting, but I am not sure I buy into what looks to me like an example of IBM marketing and PR. Sorry, CIO. I am just as suspicious as my neighbors here in Harrod’s Creek.

Here are the transforming things:

  1. Maximizing charitable donations (No, I am not kidding.)
  2. Optimizing retail operations with purchasing analytics. (What about Amazon’s data for merchants?)
  3. Leveraging data to maximize fan engagement. (No, I am not making this up.)
  4. Predicting audience viewing preferences.
  5. Deploying data science to keep salmon healthy. (Watson may not be a winner in the cancer thing, but it appears to work on fish.)

After 50 years, the write up points to these examples or use cases as transformational. Amazing.

Eureqa may not capture what Cognos with Watson can deliver. The experience, however, could cause DataRobot’s phone to ring.

PS. What’s even more amazing, one of the DarkCyber team had to register to read what is marketing collateral. Interesting.

Stephen E Arnold, May 3, 2019

Cognitive Engine: What Powers the USAF Platform?

May 1, 2019

Last week I met with a university professor who does cutting edge data and text mining and also shepherds PhD candidates. In the course of our 90 minute conversation, I noticed some reference books which had SPSS on the cover. The procedures implemented at this particular university worked well.

After the meeting, I was thinking about the newer approaches which are becoming publicly available. The USAF has started talking about its “cognitive engine.” I thought I heard at a conference that some technology developed by Nutonian, now part of a data and text mining roll up, had influenced the project.

The Nutonian system is predictive with a twist. The person using the system can rely on the smart software to perform the numerous intermediary steps required when using more traditional systems.

The article is “The US Air Force Will Showcase Its Many Technological Advances in the USAF Lab Day.” The original is in Chinese, but Freetranslate.com can help out if you don’t read Chinese or have a close by contact who does.

The USAF wants to deploy a cognitive platform into which vendors can “plug in” their systems. The Chinese write up reported:

AFRL’s Autonomy Capability Team 3 (ACT3) is developing artificial intelligence on a large scale through the development and application of the Air Force Cognitive Engine (ACE), an artificial intelligence software platform. Put into application. The software platform architecture reduces the barriers to entry for artificial intelligence applications and provides end-user applications with the ability to cover a range of artificial intelligence problem types. In the application, the software platform connects educated end users, developers, and algorithms implemented in software, task data, and computing hardware to the process of creating an artificial intelligence solution.

The article also provides some interesting details which were not included in some of the English language reports about this session; for example:

  • Smart rockets
  • An agile pod
  • Pathogen identification.

A couple of observations:

First, obviously the Chinese writer had access to information about the Lab Day demonstrations.

Second, the write up about the cognitive platform does not mention foundation vendors, which I understand.

Third, it would be delightful to visit a university and see documentation and information about the next-generation predictive analytics systems available.

Stephen E Arnold, May 1, 2019

Analytics Leaders: No Google, No Voyager Labs

April 25, 2019

I read “Top 50 Organizations for Data Analytics to Be Honored.” Interesting idea: Identify outfits which are really, really good with analytics: Data mining, text mining, math, and numerical recipes which are yummy, yummy.

The names on the list were a bit of a surprise to me; for instance:

  • A football team, Philadelphia Eagles. The Eagles?
  • A not for profit with an interesting history, United Way
  • A company unable to build a 5G technology based product and unable to deliver certain silicon to some customers, Intel.

What’s up?

I expected that the Google would get a mention and a footnote to Recorded Future, partially funded by Google and In-Q-Tel, the investment arm of the Central Intelligence Agency. Where’s Voyager Labs, the developer of Voyager Analytics?

After reading the article, I am not sure how the list was developed, and I am not confident that the organizations cited for excellence in analytics would make my list of analytics leaders.

But the important thing is the PR.

Stephen E Arnold, April 25, 2019

The Surf Is Up for the Word Dark

April 4, 2019

Just a short note. I read this puffy wuffy write up about a new market research report. Its title?

The Research Report “Dark Analytics Market: Global Industry Analysis 2013 – 2017 and Opportunity Assessment; 2018 – 2028 ” provides information on pricing, market analysis, shares, forecast, and company profiles for key industry participants

What caught my attention is not the authors’ attempt to generate some dough via open source data collection and a touch of Excel fever.

Here’s what caught my attention:

Dark analytics is the analysis of dark data present in the enterprises. Dark data is generally is referred as raw data or information buried in text, tables, figures that organizations acquire in various business operations and store it but, is unused to derive insights and for decision making in business. Organizations nowadays are realizing that there is a huge risk associated with losing competitive edge in business and regulatory issues that comes with not analyzing and processing this data. Hence, dark analytics is a practice followed in enterprises that advances in analyzing computer network operations and pattern recognition.

Yes, buried data treasure. Now the cost of locating, accessing, validating, and normalizing these time encrusted nuggets?

Answer: A lot. A whole lot. That’s part of the reason old data are not particularly popular in some organizations. The idea of using a consulting firm or software from SAP is not particularly thrilling to my DarkCyber team. (Our use of “dark” is different too.)

Stephen E Arnold, April 4, 2019

A Statistics Rebellion? One Can Only Hope

March 21, 2019

Yesterday I mentioned to a reporter that most smart software is “right” somewhere between 50 and 80 percent of the time. The reporter asked, “Does that mean results are incorrect half to one third of the time?”

My answer, “Probably worse.”

The reporter changed the subject. My hunch is that the hyperbole about the accuracy of smart software suggests that the systems are better than a human. Some may be better at some specific tasks.

In many cases, the number crunching chops down what a human must examine. In an age of data, chopping down what one has to examine is a very important task. For applications like online advertising, 70 percent accuracy is close enough to keep the advertiser semi happy and spending money to reach a target. For other applications like where will a bad actor commit a crime, the game is “close enough for horseshoes.”
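The arithmetic is worth a quick check. A minimal sketch, assuming “accuracy” simply means the share of outputs that are right; the 1,000 item batch is a made-up figure for illustration. On that assumption, a system that is right 50 to 80 percent of the time is wrong 20 to 50 percent of the time, that is, between one fifth and one half of its outputs.

```python
def error_share(accuracy_pct):
    """Percentage of outputs that are wrong, given the percentage that are right."""
    if not 0 <= accuracy_pct <= 100:
        raise ValueError("accuracy must be between 0 and 100 percent")
    return 100 - accuracy_pct


def wrong_outputs(total_items, accuracy_pct):
    """Expected count of wrong outputs in a batch, using integer arithmetic."""
    return total_items * error_share(accuracy_pct) // 100


# The range mentioned to the reporter, plus the online advertising figure.
for accuracy in (50, 70, 80):
    share = error_share(accuracy)
    # 1,000 items is a hypothetical batch size.
    count = wrong_outputs(1000, accuracy)
    print(f"{accuracy}% right -> {share}% wrong, about {count} of 1,000 outputs")
```

At 70 percent accuracy, roughly 300 of every 1,000 outputs miss. That may be tolerable for ad targeting. It is a different matter when the output is a prediction about a person.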

Why talk about numbers? My observations, with which you are invited to disagree, are a prelude to my recommending that you read “Scientists Rise Up Against Statistical Significance.” Here is a passage I underlined:

In 2016, the American Statistical Association released a statement in The American Statistician warning against the misuse of statistical significance and P values. The issue also included many commentaries on the subject. This month, a special issue in the same journal attempts to push these reforms further. It presents more than 40 papers on ‘Statistical inference in the 21st century: a world beyond P < 0.05’. The editors introduce the collection with the caution “don’t say ‘statistically significant’”. Another article with dozens of signatories also calls on authors and journal editors to disavow those terms. We agree, and call for the entire concept of statistical significance to be abandoned.

What if one is using a system which bakes in statistical procedures and locks them away from users? What if those procedures are introducing errors?

Tough questions for vendors of smart software.

Stephen E Arnold, March 21, 2019

Who Is Assisting China in Its Technology Push?

March 20, 2019

I read “U.S. Firms Are Helping Build China’s Orwellian State.” The write up is interesting because it identifies companies which allegedly provide technology to the Middle Kingdom. The article also uses an interesting phrase; that is, “tech partnerships.” Please, read the original article for the names of the US companies allegedly cooperating with China.

I want to tell a story.

Several years ago, my team was asked to prepare a report for a major US university. Our task was to try and answer what I thought was a simple question when I accepted the engagement, “Why isn’t this university’s computer science program ranked in the top ten in the US?”

The answer, my team and I learned, had zero to do with faculty, courses, or the intelligence of students. The primary reason was that the university’s graduates were returning to their “home countries.” These included China, Russia, and India, among others. In one advanced course, there was no US born, US educated student.

We documented that over a seven year period, when the undergraduate, the graduate students, and post doctoral students completed their work, they had little incentive to start up companies in proximity to the university, donate to the school’s fund raising, and provide the rah rah that happy graduates often do. To see the rah rah in action, may I suggest you visit a “get together” of graduates near Stanford or an eatery in Boston or on NCAA elimination week end in Las Vegas.

How could my client fix this problem? We were not able to offer a quick fix or even an easy fix. The university had institutionalized revenue from non US students and was, when we did the research, dependent on non US students. These students were very, very capable and they came to the US to learn, form friendships, and sharpen their business and technical “soft” skills. These, I assume, were skills put to use to reach out to firms where a “soft” contact could be easily initiated and brought to fruition.

Follow the threads and the money.

China has been a country eager to learn in and from the US. The identification of some US firms which work with China should not be a surprise.

However, I would suggest that Foreign Policy or another investigative entity consider a slightly different approach to the topic of China’s technical capabilities. Let me offer one example. Consider this question:

What Israeli companies provide technology to China and other countries which may have some antipathy to the US?

This line of inquiry might lead to some interesting items of information; for example, a major US company which meets on a regular basis with a counterpart with what I would characterize as “close links” to the Chinese government. One colloquial way to describe the situation is like a conduit. Digging in this field of inquiry, one can learn how the Israeli company “flows” US intelligence-related technology from the US and elsewhere through an intermediary so that certain surveillance systems in China can benefit directly from what looks like technology developed in Israel.

Net net: If one wants to understand how US technology moves from the US, the subject must be examined in terms of academic programs, admissions, policies, and connections as well as from the point of view of US company investments in technologies which received funding from Chinese sources routed through entities based in Israel. Looking at a couple of firms does not do the topic justice and indeed suggests a small scale operation.

Uighur monitoring is one thread to follow. But just one.

Stephen E Arnold, March 20, 2019
