Wolfram Mathematica
March 19, 2020
DarkCyber noted “In Less Than a Year, So Much New: Launching Version 12.1 of Wolfram Language & Mathematica” contains highly suggestive information. Yes, this is a mathy program. The innovations are significant for analysts and some government professionals. To cite one example:
I’ve been recording hundreds of hours of video in connection with a new project I’m working on. So I decided to try our new capabilities on it. It’s spectacular! I could take a 4-hour video, and immediately extract a bunch of sample frames from it, and then—yes, in a few hours of CPU time—“summarize the whole video”, using SpeechRecognize to do speech-to-text on everything that was said and then generating a word cloud…
DarkCyber reacts positively to other additions and enhancements to the Mathematica “system.” Version 12.1 will make it easier to develop specific functions for policeware and intelware use cases.
Remarkable because the “system” can geo-everything. That’s important in many situations.
Stephen E Arnold, March 19, 2020
Israel and Mobile Phone Data: Some Hypotheticals
March 19, 2020
DarkCyber spotted a story in the New York Times: “Israel Looks to Repurpose a Trove of Cell Phone Data.” The story appeared in the dead tree edition on March 17, 2020, and you can access the online version of the write up at this link.
The write up reports:
Prime Minister Benjamin Netanyahu of Israel authorized the country’s internal security agency to tap into a vast, previously undisclosed trove of cell phone data to retrace the movements of people who have contracted the corona virus and identify others who should be quarantined because their paths crossed.
Okay, cell phone data. Track people. Paths crossed. So what?
Apparently not much.
The Gray Lady does the handwaving about privacy and the fragility of democracy in Israel. There’s a quote about the need for oversight when certain specialized data are retained and then made available for analysis. Standard journalism stuff.
DarkCyber’s team talked about the write up and what the real journalists left out of the story. Remember. DarkCyber operates from a hollow in rural Kentucky and knows zero about Israel’s data collection realities. Nevertheless, my team was able to identify some interesting use cases.
Let’s look at a couple and conclude with a handful of observations.
First, the idea of retaining cell phone data is not exactly a new one. What if these data can be extracted using an identifier for a person of interest? What if a time-series query could extract the geolocation data for each movement of the person of interest captured by a cell tower? What if this path could be displayed on a map? Here’s a dummy example of what the plot for a single person of interest might look like. Please note that these graphics are examples selected from open sources. The examples are not related to a single investigation or vendor. These are for illustrative purposes only.
Source: Standard mobile phone tracking within a geofence. Map with blue lines showing a person’s path. SPIE at https://bit.ly/2TXPBby
Useful indeed.
Second, what if the intersection of the paths of two or more individuals can be plotted? Here’s a simulation of such a path intersection:
Source: Map showing the location of a person’s mobile phone over a period of time. Tyler Bell at https://bit.ly/2IVqf7y
Would these data provide a way to identify an individual with a mobile phone who was in “contact” with a person of interest? Would the authorities be able to perform additional analyses to determine who is in either party’s social network?
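To make the first two hypotheticals concrete, here is a minimal sketch in Python. It is not a description of any real system. The record layout (device, tower, timestamp), the function names, and the 15-minute window are all assumptions made for illustration, and the data are entirely synthetic.

```python
from datetime import datetime, timedelta
from itertools import combinations

# Entirely synthetic records: (device_id, tower_id, timestamp).
# The field layout and values are hypothetical, for illustration only.
PINGS = [
    ("A", "T1", datetime(2020, 3, 1, 9, 0)),
    ("A", "T2", datetime(2020, 3, 1, 9, 30)),
    ("B", "T2", datetime(2020, 3, 1, 9, 40)),
    ("B", "T3", datetime(2020, 3, 1, 11, 0)),
    ("C", "T1", datetime(2020, 3, 1, 15, 0)),
]


def path_for(pings, device_id):
    """Return the time-ordered list of (timestamp, tower_id) for one device."""
    return sorted((ts, tower) for dev, tower, ts in pings if dev == device_id)


def crossings(pings, window=timedelta(minutes=15)):
    """Return the set of device pairs seen at the same tower within `window`."""
    found = set()
    for (d1, t1, ts1), (d2, t2, ts2) in combinations(pings, 2):
        if d1 != d2 and t1 == t2 and abs(ts1 - ts2) <= window:
            found.add(tuple(sorted((d1, d2))))
    return found


if __name__ == "__main__":
    print(path_for(PINGS, "A"))  # device A's path, oldest ping first
    print(crossings(PINGS))      # pairs whose paths crossed
```

In this toy data set, devices A and B touch tower T2 ten minutes apart, so they count as a crossing. A and C both touch T1, but six hours apart, so they do not. The pairwise comparison is fine for a handful of records; a real trove would need indexing by tower and time.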
Third, could these relationship data be mined so that connections can be further explored?

Source: Diagram of people who have crossed paths visualized via Analyst Notebook functions. Globalconservation.org
Can these data be arrayed on a timeline? Can the routes be converted into an animation that shows a particular person of interest’s movements at a specific window of time?
Source: Vertical dots diagram from Recorded Future showing events on a timeline. https://bit.ly/39Xhbex
These hypothetical displays of data derived from cross correlations, geotagging, and timeline generation based on date stamps seem feasible. If earnest individuals in rural Kentucky can see the value of these “secret” data disclosed in the New York Times’ article, why didn’t the journalist and the others who presumably read the story?
What’s interesting is that systems, methods, and tools clearly disclosed in open source information are overlooked, ignored, or just not understood.
Now the big question: Do other countries have these “secret” troves of data?
DarkCyber does not know; however, it seems possible. Log files are a useful function of data processes. Data exhaust may have value.
Stephen E Arnold, March 19, 2020
First Counting Bees, Now Predicting Parrots
March 5, 2020
DarkCyber found the write up “Parrots Can Make Predictions Based on Probabilities” amusing. With the corona virus data widely available, will these poly-nomial avians lend their expertise to global health administrators?
The write up asserts:
They [scientists] discovered the kea, a species of large parrot found in New Zealand, can make inferences and predict events based on previous knowledge or experience. They [yep, this is a reference to the parrots] even performed better than chimps in some experiments.
The write up states:
The team said it is the first time this complex cognitive ability has been demonstrated in an animal outside of the great apes, which could help shed light on the “evolutionary history of statistical inference”.
Now is the time to apply parrot intelligence to tough computing problems like the Corona virus research. Polly, do you want a protein predictive output?
Stephen E Arnold, March 5, 2020
Amazon: Buying More Innovation
February 26, 2020
DarkCyber noted the article “Amazon Acquires Turkish Startup Datarow.” The word “startup” is rather loosely applied. Datarow was founded in 2016. In DarkCyber’s view, a four year old outfit is not a spring chicken.
What’s interesting about this acquisition is that it provides the sometimes unartful Amazon with an outfit that specializes in making easier-to-use data tools. The firm appears to have been built around AWS Redshift.
The company’s quite wonky Web site says:
We’re proud to have created an innovative tool that facilitates data exploration and visualization for data analysts in Amazon Redshift, providing users with an easy to use interface to create tables, load data, author queries, perform visual analysis, and collaborate with others to share SQL code, analysis, and results. Together with AWS, we look forward to taking our tool to the next level for customers.
The company provides what it calls “data governance,” a term which DarkCyber takes to mean “get your act together” with regard to information. This is easier said than done, but it is a hot button among companies struggling to reduce costs, comply with assorted rules and regulations, and figure out what’s actually happening in their lines of business. Profit and loss statements are not up to the job of dealing with diverse content, audio, video, real time data, and tweets. Well, neither is Amazon, but that’s not germane.
Will Amazon AWS Redshift (love the naming, don’t you?) become easier to use? Perhaps Datarow will become responsible for the AWS Web site?
Stephen E Arnold, February 26, 2020
Facial Recognition: Those Error Rates? An Issue, Of Course
February 21, 2020
DarkCyber read “Machines Are Struggling to Recognize People in China.” The write up asserts:
The country’s ubiquitous facial recognition technology has been stymied by face masks.
One of the unexpected consequences of the Covid 19 virus is that citizens with face masks cannot be recognized.
“Unexpected” when adversarial fashion has been getting some traction among those who wish to move anonymously.
The write up adds:
Recently, Chinese authorities in some provinces have made medical face masks mandatory in public and the use and popularity of these is going up across the country. However, interestingly, as millions of masks are now worn by Chinese people, there has been an unintended consequence. Not only have the country’s near ubiquitous facial-recognition surveillance cameras been stymied, life is reported to have become difficult for ordinary citizens who use their faces for everyday things such as accessing their homes and bank accounts.
Now an “admission” by a US company:
Companies such as Apple have confirmed that the facial recognition software on their phones need a view of the person’s full face, including the nose, lips and jaw line, for them to work accurately. That said, a race for the next generation of facial-recognition technology is on, with algorithms that can go beyond masks. Time will tell whether they work. I bet they will.
To sum up: Masks defeat facial recognition. The future is a method of identification that can work with what is not covered plus any other data available to the system; for example, pattern of walking and geo-location.
For now, though, the remedy for the use of masks is lousy facial recognition and more effort to find innovations.
The author of the write up is a — wait for it — venture capital professional. And what country leads the world in facial recognition? China, according to the VC professional.
The future is better person recognition of which the face is one factor.
Stephen E Arnold, February 21, 2020
Map Economics: Useful Content and One Major Omission
February 13, 2020
DarkCyber spotted a paper called “The Economics of Maps.” The authors have presented some extremely useful and interesting information about depicting the real world.
One of the most useful aspects of the article is the list of companies providing different types of mapping services and data. The list of firms in this business includes such providers, vendors, and technology companies as:
Airbus
Farmers Edge
Mapbox
Pitney Bowes
There are some significant omissions. One is the category of geo-analytics for law enforcement and intelligence applications; for example, the low profile Geogence and investigative tools like those available from Verint.
Worth reading and tucking into one’s intelligence folder in our opinion.
Stephen E Arnold, February 13, 2020
Easy Facial Recognition
February 11, 2020
DarkCyber spotted a Twitter thread. You can view it here (verified on February 8, 2020). The main point is that using open source software, an individual was able to obtain (scrape; that is, copy) images from publicly accessible services. Then the images were “processed.” The idea was to identify a person from an image. Net net: People can object to facial recognition, but once a technology migrates from “little known” to publicly available, there may be difficulty putting the tech cat back in the black bag.
Stephen E Arnold, February 11, 2020
Math Resources
January 27, 2020
One of the DarkCyber team spotted a list of available math resources. Some cost money; others are free. Math Vault lists courses, platforms, tools, and question-answering sites. Some are relatively mainstream like Wolfram Alpha; others are less well publicized, like ProofWiki. You can find the listing at this link.
Kenny Toth, January 26, 2020
Google and Data: Doing Stuff Without Data?
January 25, 2020
The Verge has been one of the foot soldiers carrying a pointy stick toward the Google. A few days ago, Google mobilized its desktop search results. The idea was to make search results look the same; that is, virtually impossible to determine where a link came from, who paid for it, and how it was linked to a finger tap or an honest-to-goodness thumb typed word or phrase.
The Verge noted the difference because its experts looked at a page of results on a tiny display device and then on a bigger device and noted the similarity or differences. “Google’s Ads Just Look Like Search Results Now” stated on January 23, 2020:
In what appears to be something of a purposeful dark pattern, the only thing differentiating ads and search results is a small black-and-white “Ad” icon next to the former.
Yikes, a dark pattern. Tricking users. Changing to match mobile.
A day later, The Verge reported that “Google is backtracking on its controversial desktop search results redesign.” The write up stated:
The company says it will experiment with favicon placement.
But the point is not the Verge’s useful coverage of the Google shift. For DarkCyber, the new interface illustrates that the talk about Google using data to determine its actions, the importance of A B testing, and the overall brilliance of Googlers is baloney. The GOOG does what it wants.
If Google’s “data” cannot inform the company that an interface change will irritate outfits like the Verge, users, and denizens of the Twitter thing — maybe the company’s data dependence is a shibboleth?
If Google cannot interpret A B data in a way to avoid backlash and crawfishing, maybe Google’s data skills are not what the PR machine says?
DarkCyber thought experimenting and analysis came first at the Google. It seems that these steps come after guessing. Ah, the Google.
Stephen E Arnold, January 25, 2020
Abandoned Books: Yep, Analytics to the Rescue
January 6, 2020
DarkCyber noted “The Most ‘Abandoned’ Books on GoodReads.” The idea is that by using available data, a list of books people could not finish reading can be generated. Disclosure: I will try free or $1.99 books on my Kindle and bail out if the content does not make me quiver with excitement.
The research, which is presented in academic finery, reports that the author of Harry Potter’s adventures churned out a book few people could finish. The title? The Casual Vacancy by J.K. Rowling. I was unaware of the book, but I will wager that the author is happy enough with the advance and any royalty checks which clear the bank. Success is not completion; success is money, I assume.
I want to direct your attention, gentle reader, to the explanation of the methodology used to award this singular honor to J.K. Rowling, who is probably pleased as punch with the bank interaction referenced in the preceding paragraph.
Several points merit brief, very brief comment:
- Bayesian. A go to method. Works reasonably well. Guessing has its benefits.
- Data sets. Not exactly comprehensive. Amazon? What about the Kindle customer data, including time to abandonment, page of abandonment, etc.? Library of Congress? Any data to share? Top 20 library systems in the US? Got some numbers; for example, number of copies in circulation?
- Communication. The write up is a good example of why some big time thinkers ignore the inputs of certain analysts.
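For readers wondering what “Bayesian” might mean in this setting, here is a rough sketch in Python. It does not reproduce the GoodReads analysis, whose exact model is not described here. It assumes a simple Beta-Binomial estimate, and the function name, the prior values, and the sample counts are hypothetical.

```python
def posterior_abandon_rate(abandoned, total, prior_a=2.0, prior_b=8.0):
    """Posterior mean abandonment rate under an assumed Beta(prior_a, prior_b) prior.

    The default prior is a hypothetical guess that roughly 20 percent of
    readers abandon a typical book; it is not taken from the study.
    """
    return (prior_a + abandoned) / (prior_a + prior_b + total)


if __name__ == "__main__":
    # Made-up counts: a book with 4 readers versus a book with 1,000 readers.
    print(posterior_abandon_rate(3, 4))       # raw rate 0.75, pulled toward the prior
    print(posterior_abandon_rate(300, 1000))  # raw rate 0.30, barely moved
```

The point of the prior is to keep a book with a handful of readers from topping the list on the strength of three quitters. With enough data, the guess stops mattering.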
 
To sum up, perhaps The Casual Vacancy may make a great gift when offered by Hamilton Books? A coffee table book perhaps?
Stephen E Arnold, January 6, 2020
	

