I read “5 Critical Steps for Identifying the Value in Your Unstructured Information.” The points in the write up are fine. In fact, anyone who has worked with unstructured data in the form of emails, tweets, Facebook posts, intercepts, etc. knows that a lot of work is required.
My problem with the write up in Datanami is that smart software keeps its nose tucked under the covers. I thought that smart software was able to perform collection (er, that’s a step not included in the list of five steps but let’s move on).
Smart software is supposed to discover important information. That’s fine, but what is the process for configuring the smart software, checking that the system outputs useful or semi useful data, and presenting those data in a form which does not trigger another wave of manual effort? There are some systems which perform discovery; however, like today’s driverless autos, a human has to have his or her hands on the wheel. Otherwise a dead pedestrian or a dead driver can be an outcome. I recall a Tesla on Autopilot plowed into a white truck because its sensors could not tell the truck from the bright sky. Yeah, right.
The reality is that generalizations about what is required to make sense of unstructured data are only marginally useful. Anyone licensing a smart system from outfits like IBM, Palantir, BAE Systems, Textron, etc. must be prepared for the surprises which lurk in the software.
For instance:
- Much of the work is manual. How does data get into Palantir Gotham?
- Setting up the system is iterative work. Have you ever heard about tuning?
- Creating and enforcing procedures for keeping data clean and happy is work. Automatic feeds and real time flows are super, but what happens when high value data is filtered and put in an exception folder?
- Analysis is work that needs a trained, attentive, subject matter expert. Who makes sense of the puzzle pieces and assembles them?
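The exception folder point can be made concrete. What follows is a minimal sketch, not how Palantir or any other vendor handles ingestion: the `validate` rules, the record fields (`source`, `text`), and the `ingest` function are all hypothetical names invented for illustration. The point is that every rejected record keeps a reason and lands in a queue that a person has to read.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IngestResult:
    accepted: list = field(default_factory=list)
    # (record, reason) pairs waiting for a human reviewer
    exceptions: list = field(default_factory=list)


def validate(record: dict) -> Optional[str]:
    """Return a rejection reason, or None if the record passes.

    These two rules are hypothetical; a real feed needs rules tuned by hand.
    """
    if not record.get("text", "").strip():
        return "empty text"
    if "source" not in record:
        return "missing source"
    return None


def ingest(records: list) -> IngestResult:
    """Split a feed into accepted records and exceptions with reasons."""
    result = IngestResult()
    for record in records:
        reason = validate(record)
        if reason is None:
            result.accepted.append(record)
        else:
            # Never drop silently: keep the record and why it was rejected.
            result.exceptions.append((record, reason))
    return result


feed = [
    {"source": "email", "text": "Shipment delayed at port"},
    {"text": "Wire transfer flagged"},  # possibly high value, but no source
    {"source": "tweet", "text": "   "},
]
result = ingest(feed)
for record, reason in result.exceptions:
    print(f"needs review ({reason}): {record}")
```

The second record is the interesting one. A cleanliness rule filters it out, and unless someone opens the exception queue, no one ever learns it was there.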
The real world requires that magic be confined to children’s books. Using the tools available today does not eliminate the need for manual work.
Smart software is a knee brace. The human has to carry the load. Omitting this reality creates false expectations and either puts lives at risk or makes decision making riskier. Smart software can do some functions well. Not all functions are smart.
Stephen E Arnold, November 25, 2018