Watson Weakly: Another Game. This Time I Spy. Huh?
March 28, 2016
I survived the Go games. In case you have been on an extended vacation, Google’s smart software beat a human at the game of Go. I assume that this smart software did not drive the car which ran into a bus, but that’s another issue.
I then noted “IBM Watson Could Soon Use Artificial Intelligence to Beat You at a Game of I Spy.” I love the use of the word “could.” I prefer supposition to reality. Contrast the satisfaction of “I could go to the gym” with “I am eating potato chips.” Which does IBM prefer? If you answered, “Generate substantial revenue”, you are incorrect.
The write up in question reports that IBM has “updated” Watson. I noted this statement about the updated Watson:
IBM has created a ‘Visual Recognition Demo’ to showcase Watson’s latest trick, which allows users to feed Watson an image before it tells you what it believes it sees. For example, supplying Watson with the image of a tiger throws up the result 77 per cent tiger, 26 per cent wild cat and 63 per cent cat.
In my experience, when one is trying to determine whether an animal is a real, live, and possibly hungry tiger, that error could be darned interesting. On my last trip to Africa, I learned that a hapless trekker discovered that confusing “cat” with “tiger” can have interesting consequences.
Sigh. IBM appears to be making news out of some image processing capabilities which I have seen in action before. How long “before”? Think more years than IBM has been reporting declining revenues. Watson, what can one do about that? Hello, Watson. Are you there?
Stephen E Arnold, March 28, 2016
Reputable News Site Now on the Dark Web
March 28, 2016
Does the presence of a major news site lend an air of legitimacy to the Dark Web? Wired announces, “ProPublica Launches the Dark Web’s First Major News Site.” Reporter Andy Greenberg tells us that ProPublica recently introduced a version of their site running on the Tor network. To understand why anyone would need such a high level of privacy just to read the news, imagine living under a censorship-happy government; ProPublica was inspired to launch the site while working on a report about Chinese online censorship.
Why not just navigate to ProPublica’s site through Tor? Greenberg explains the danger of malicious exit nodes:
“Of course, any privacy-conscious user can achieve a very similar level of anonymity by simply visiting ProPublica’s regular site through their Tor Browser. But as Tigas points out, that approach does leave the reader open to the risk of a malicious ‘exit node,’ the computer in Tor’s network of volunteer proxies that makes the final connection to the destination site. If the anonymous user connects to a part of ProPublica that isn’t SSL-encrypted—most of the site runs SSL, but not yet every page—then the malicious relay could read what the user is viewing. Or even on SSL-encrypted pages, the exit node could simply see that the user was visiting ProPublica. When a Tor user visits ProPublica’s Tor hidden service, by contrast—and the hidden service can only be accessed when the visitor runs Tor—the traffic stays under the cloak of Tor’s anonymity all the way to ProPublica’s server.”
The article does acknowledge that Deep Dot Web has been serving up news on the Dark Web for some time now. However, some believe this move from a reputable publisher is a game changer. ProPublica developer Mike Tigas stated:
“Personally I hope other people see that there are uses for hidden services that aren’t just hosting illegal sites. Having good examples of sites like ProPublica and Securedrop using hidden services shows that these things aren’t just for criminals.”
Will law-abiding, but privacy-loving, citizens soon flood the shadowy landscape of the Dark Web?
Cynthia Murrell, March 28, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Retraining the Librarian for the Future
March 28, 2016
The Internet is often described as the world’s biggest library, one containing all the world’s knowledge that someone dumped on the floor. The Internet is the world’s biggest information database as well as the world’s biggest data mess. In the olden days, librarians were the gateway to knowledge management, but they now need to ramp up their skills beyond the Dewey Decimal System and database searching. Librarians need to do more, and Christian Lauersen’s personal blog explains how in “Data Scientist Training For Librarians-Re-Skilling Libraries For The Future.”
DST4L is a boot camp where librarians and other information professionals learn new skills to stay relevant. Lauersen describes last year’s event:
“DST4L has been held three times in The States and was to be set for the first time in Europe at Library of Technical University of Denmark just outside of Copenhagen. 40 participants from all across Europe were ready to get there hands dirty over three days marathon of relevant tools within data archiving, handling, sharing and analyzing. See the full program here and check the #DST4L hashtag at Twitter.”
Over the course of three days, the participants learned about OpenRefine, a spreadsheet-like application that can be used for data cleanup and transformation. They also learned about the benefits of GitHub and how to program using Python. These skills are well beyond the classes taught in library graduate programs, but it is a good sign that the profession is evolving even if the academic side lags behind.
Whitney Grace, March 28, 2016
Wanna Be a Googler? Consider The Questions If Not the Grammar
March 27, 2016
I see a lot of saucisson each day. I was amused by “Interview Questions Google Pulled Down Because They Were Impossibly Difficult to Answer.” Like the fabled Google Labs Aptitude Test, Google tries, like the addled goose, to winnow the goose feathers from the giblets. The addled goose focuses on the world of search and content processing. Google tries to perform the same trick with humans. You remember humans, the reason Google’s self driving cars have accidents.
I found this question interesting:
How many times a day does a clock’s hands overlap?
What does this sentence suggest about those creating exam questions? Hint: Do not ask a Googler.
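Grammar aside, the arithmetic behind the question fits in a few lines of Python. This is a back-of-the-envelope sketch, not anything taken from the Google article. It assumes an ideal clock with continuously moving hands and counts midnight as the first overlap; the function and constant names are invented for the illustration.

```python
from fractions import Fraction

# Minute hand: 360 degrees every 60 minutes. Hour hand: 360 degrees every 720 minutes.
MINUTE_HAND_DEG_PER_MIN = Fraction(360, 60)   # 6 degrees per minute
HOUR_HAND_DEG_PER_MIN = Fraction(360, 720)    # 1/2 degree per minute


def overlap_times(total_minutes=24 * 60):
    """Return the minutes after midnight at which the two hands coincide."""
    # The minute hand gains 11/2 degrees on the hour hand every minute.
    relative_speed = MINUTE_HAND_DEG_PER_MIN - HOUR_HAND_DEG_PER_MIN
    # It needs a full 360 degree lap to catch up again: 720/11 minutes.
    gap = Fraction(360) / relative_speed
    times = []
    t = Fraction(0)  # assumption: the hands start together at midnight
    while t < total_minutes:
        times.append(t)
        t += gap
    return times


overlaps = overlap_times()
print(len(overlaps))       # 22
print(float(overlaps[1]))  # roughly 65.45 minutes after midnight
```

The hands meet 11 times in every 12 hours, not 12, because the hour hand keeps moving while the minute hand chases it.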
Stephen E Arnold, March 27, 2016
SEO Consultants Face Google Reality
March 27, 2016
The entire search engine optimization boomlet annoyed me. The idea that individuals could trick algorithms into displaying content where it should not appear goes against my old fashioned notions of relevance, precision, and recall.
I blame lots of people for this destruction of on point search and retrieval. Consultants, vendors, the wonderful Google—yep, sorry. I want a search system to deliver information directly germane to my query. I don’t want search systems to think for me.
I read an amusing write up called “Google ch-ch-ch-changes. How They’re Affecting Publishers and SEOs.” The focus is not on the users’ needs for relevant information. Nah, the focus is on publishers and the members of the class SEO.
The write up bemoans the fact that Google no longer has a wizard to explain how to fool Google’s algorithms. That’s a positive in my opinion. Next the write up points out that Google wants to use even smarter algorithms to determine what is and is not relevant. Does Google’s notion of relevance match mine? Nah, I don’t care about advertising, but my hunch is that Google cares a great deal about money with relevance a consideration. But the goal is money.
The part of the article I liked was the section labeled “SEO Is Dead.” Good. The result was a surprise. The article points out that Facebook is a better place to get information. I highlighted in social scarlet this statement:
More and more, I go to Facebook for answers because I can no longer find them in Google. Google uses AI to throw me a kitchen sink when it is not sure, and that kitchen sink rarely has much in it that’s useful for me.
How does one find on point information from a Web search engine dependent on advertising? The write up dodges the question and suggests:
If you are using informational search, SEO hasn’t gotten harder — it has just become much more irrelevant. Whereas Google used to be very good at returning exact query results, AI goes with the “broad net” approach. If Google does not have a specific “thing” it can return, it will often return a set of more general results, leaving words out of the query set. Often, the word it leaves out is the most relevant modifier.
Sound like baloney?
Stephen E Arnold, March 27, 2016
Search as a Framework
March 26, 2016
A number of search and content processing vendors suggest their information access system can function as a framework. The idea is that search is more than a utility function.
If the information in the article “Abusing Elasticsearch as a Framework” is spot on, a non-search vendor may have taken an important step toward turning an assertion into a reality.
The article states:
Crate is a distributed SQL database that leverages Elasticsearch and Lucene. In it’s infant days it parsed SQL statements and translated them into Elasticsearch queries. It was basically a layer on top of Elasticsearch.
The idea is that the framework uses Elasticsearch’s discovery, master election, replication, etc., along with the Lucene search and indexing operations.
Crate, the framework, is a distributed SQL database “that leverages Elasticsearch and Lucene.”
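To make the “layer on top of Elasticsearch” idea concrete, here is a minimal sketch of what translating a SQL statement into an Elasticsearch query body can look like. It is not Crate’s code. The tiny grammar, the `sql_to_es` function, and the sample table and column names are all invented for the illustration; only the `query`, `term`, `match_all`, `_source`, and `size` keys come from Elasticsearch’s query DSL.

```python
import re

# Hypothetical, deliberately tiny grammar (an assumption for this sketch):
#   SELECT col[, col...] FROM table [WHERE col = 'value'] [LIMIT n]
_PATTERN = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<field>\w+)\s*=\s*'(?P<value>[^']*)')?"
    r"(?:\s+LIMIT\s+(?P<limit>\d+))?\s*;?\s*$",
    re.IGNORECASE,
)


def sql_to_es(sql):
    """Return (index_name, search_body) for one simple SELECT statement."""
    m = _PATTERN.match(sql)
    if m is None:
        raise ValueError("unsupported statement: " + sql)

    body = {}
    cols = [c.strip() for c in m.group("cols").split(",")]
    if cols != ["*"]:
        body["_source"] = cols  # return only the selected fields

    if m.group("field"):
        # Equality predicate becomes an exact-match term query.
        body["query"] = {"term": {m.group("field"): m.group("value")}}
    else:
        body["query"] = {"match_all": {}}

    if m.group("limit"):
        body["size"] = int(m.group("limit"))

    return m.group("table"), body


index, body = sql_to_es("SELECT name, city FROM users WHERE city = 'Berlin' LIMIT 5")
print(index)
print(body)
```

A real SQL layer needs a proper parser, type handling, joins, and aggregations. The sketch only shows the shape of the translation step the article describes.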
Stephen E Arnold, March 26, 2016
Advertising and Search Confidence: Google As Government
March 26, 2016
I read “US State Department Emails: Google Wanted in 2012 to Help Syria’s Rebels Overthrow Assad.” The story might be a load of horse feathers. I stopped and read the article and noted this passage:
Messages between former secretary of state Hillary Clinton’s team and one of the company’s executives detailed the plan for Google to get involved in the region. “Please keep close hold, but my team is planning to launch a tool … that will publicly track and map the defections in Syria and which parts of the government they are coming from,” Jared Cohen, the head of what was then the company’s “Google Ideas” division, wrote in a July 2012 email to several top Clinton officials.
Perhaps this is Palantir envy? Clever folks are confident of their abilities. And here is a See Also reference.
Stephen E Arnold, March 26, 2016
Ixquick and StartPage Become One
March 25, 2016
Ixquick was created by a person in Manhattan. Then the system shifted from the USA to Europe. I lost track. I read “Ixquick Merges with StartPage Search Engine.” Web search is a hideously expensive activity to fund. Costs can be suppressed if one just passes the user’s query to Bing, Google, or some other Web indexing search system. The approach delivers what is called a value-added opportunity. Vivisimo used the approach before it morphed into a unit of IBM and emerged not as a search federation system but a Big Data system. Most search traffic flows to the Alphabet Google advertising system. Those who use federated search systems often don’t know the difference and, based on my observations, don’t care.
According to the write up:
The main difference between StartPage and the current version of Ixquick is that the former is powered exclusively by Google search results while the latter aggregates data from multiple search engines to rank them based on factors such as prominence and quantity. Both search engines are privacy orientated, and the merging won’t change the fact. IP addresses are not recorded for instance, and data is not shared with third-parties.
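The write up does not say how Ixquick weighs “prominence and quantity,” so the following is only a guess at the general technique. It assumes quantity means the number of engines returning a URL and prominence means the best position any engine gave it. The `aggregate` function and the engine names are made up for the example.

```python
from collections import defaultdict


def aggregate(results_by_engine):
    """Merge ranked URL lists from several engines into one ranking.

    results_by_engine maps an engine name to its ordered list of URLs.
    """
    votes = defaultdict(int)  # quantity: how many engines returned the URL
    best_rank = {}            # prominence: best (lowest) position seen
    for urls in results_by_engine.values():
        # dict.fromkeys drops duplicates within one engine, keeping order.
        for position, url in enumerate(dict.fromkeys(urls), start=1):
            votes[url] += 1
            best_rank[url] = min(position, best_rank.get(url, position))
    # More engines first, then better position, then URL for a stable order.
    return sorted(votes, key=lambda u: (-votes[u], best_rank[u], u))


merged = aggregate({
    "engine_a": ["x.com", "y.com", "z.com"],
    "engine_b": ["y.com", "x.com"],
    "engine_c": ["y.com", "w.com"],
})
print(merged)  # ['y.com', 'x.com', 'w.com', 'z.com']
```

A Google-only system like the StartPage described above skips this step entirely. There is one list and nothing to merge.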
Like DuckDuckGo.com, Ixquick.com and StartPage.com “protect the user’s privacy.” My thought is that I am not confident Tor sessions are able to protect a user’s privacy. A general interest search engine which delivers on this assertion is interesting indeed.
If you want to use the Ixquick function that presents only Google results, navigate to www.ixquick.eu. There are other privacy oriented systems; for example, Gibiru and Unbubble.
Sorry, I won’t/can’t go into the privacy angle. You may want to poke around to learn how secure a VPN session, Tails, and Tor are. The exploration may yield some useful information. Make sure your computing device does not have malware installed, please. Otherwise, the “privacy” issue is off the table.
Stephen E Arnold, March 25, 2016
Some News, Maybe None That Is Not Sort of True?
March 25, 2016
I read “Proposed Truthfulness Law Spooks Russian News Aggregators.” I came away a little puzzled. My perception is that the “news,” regardless of country, is a weird amalgam of infotainment, bias, and theater (political, social, and William Wycherley fare). Whenever the notions of “real,” “accurate,” “objective,” and “true” enter from stage right or left, I wonder what these folks’ definitions of the glittering generalities are.
According to the write up, “Russia has tight media controls that include a requirement to make sure all print, broadcast and online news is true.”
A new bill (not yet a law, gentle reader) “would effectively say that news aggregators are the same as mass media operations.” News aggregators like Yandex and the Alphabet Google thing:
would become liable if they spread false information and state agencies complain about it.
The write up, from a “real” journalism outfit, observes:
Although the law would create a handy way of further restricting information flows, when the bill came out, the Russian communications ministry indicated it was not keen on the idea. That said, the Kremlin has already been making life hard for big online players, particularly by mandating that they store users’ personal data on servers in Russia.
May I suggest a quick romp through Jacques Ellul’s Propaganda: The Formation of Men’s Attitudes?
Stephen E Arnold, March 25, 2016
Not So Weak. Right, Watson?
March 25, 2016
I read an article which proved to be difficult to find. None of my normal newsreaders snagged the write up called “The Pentagon’s Procurement System Is So Broken They Are Calling on Watson.” Maybe it is the singular Pentagon hooked with the plural pronoun “they”? Hey, dude, colloquial writing is chill.
Perhaps my automated systems’ missing the boat was the omission of the three impressive letters “IBM”? If you follow the activities of US government procurement, you may want to note the article. If you are tracking the tension between IBM i2 and Palantir Technologies, the article adds another flagstone to the pavement that IBM is building to support its augmented intelligence activities in the Department of Defense and other US government agencies.
Let me highlight a couple of comments in the write up and leave you to explore the article at whatever level you choose. I noted these “reports”:
The Air Force is currently working with two vendors, both of which have chosen Watson, IBM’s cognitive learning computer, to develop programs that would harness artificial intelligence to help businesses and government acquisitions officials work through the mind-numbing system.
The write up identifies one of the vendors working on IBM Watson for the US Air Force. The company is Applied Research.
I circled this quote: The Pentagon’s procurement system is the “perfect application for Watson.”
The goslings and I love “perfect” applications.
How does Watson learn about procurement? The approach is essentially the method used in the mid 1990s by Autonomy IDOL. Here’s a passage I highlighted:
But first Watson must be trained. The first step is to feed it all the relevant documents. Then its digital intellect will be molded by humans, asking question after question, about 5,000 in all, to help understand context and the particular nuance that comes with federal procurement law.
How does this IBM deal fit into the Palantir versus IBM interaction? That’s a good question. What is clear is that the US Air Force has embraced a solution which includes systems and methods first deployed two decades ago.
What’s that about the pace of technology?
Stephen E Arnold, March 25, 2016

