Australian Software Developer Revealed the Panama Papers
May 23, 2016
The Panama Papers have unleashed a slew of scandals whose ripples we will be dealing with for years to come. They also mark another notch for the power of software and another sign that nothing is private anymore. But how did reporters make sense of the leaked documents? Reuters reports that a “Small Australian Software Firm Helps Join The Dots On The Panama Papers”.
Nuix Pty Ltd. is a Sydney-based software development company that donated its document analysis program to the International Consortium of Investigative Journalists (ICIJ) to sift through the data leaked from Mossack Fonseca, the Panamanian law firm at the center of the story. Reporters searched through the data for some time and discovered within the 2.6 terabytes the names of politicians and public figures with questionable offshore financial accounts.
“By using the software, the Washington-based ICIJ was able to make millions of scanned documents, some decades old, text-searchable and help its network of journalists cross reference Mossack Fonseca’s clients across these documents. The massive leak has prompted global investigations into suspected illegal activities by the world’s wealthy and powerful. Mossack Fonseca, the firm at the center of the leaks, denies any wrongdoing. The use of advanced document and data analysis technology shows the growing importance of technology’s role in helping journalists make better sense of increasingly bigger news discoveries.”
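To make the idea of “text-searchable” and “cross reference” concrete, here is a minimal sketch of an inverted index, a common way to look up which documents mention a name. It is not Nuix's method, which the article does not describe. It assumes the scanned pages have already been converted to text, and the document IDs, sample text, and function names are all hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercased term to the set of document IDs containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def docs_with_all_terms(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return IDs of documents containing every term in the query.

    Note: this is an AND match on terms, not an exact phrase match.
    """
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical text extracted from scanned documents.
DOCS = {
    "scan_001": "Certificate of incorporation for Example Holdings Ltd, director: Jane Doe",
    "scan_002": "Invoice addressed to Jane Doe, Panama City",
    "scan_003": "Power of attorney granted to John Roe",
}

INDEX = build_index(DOCS)
```

Calling `docs_with_all_terms(INDEX, "Jane Doe")` returns the two scans that mention both terms, which is the cross-referencing step in miniature.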
Nuix Pty is a ten-year-old company, and its products have been used to conduct data analysis in investigations of child pornography rings, human trafficking, and high-end tax evasion. Another selling point for the company is its dedication to its clients’ privacy. Nuix did not allow itself to have access to the information within the Panama Papers. That is an interesting fact, considering how some tech companies need to have total access to their clients’ information.
Nuix sounds like the Swiss bank of software companies, offering high-quality products and services that deliver results, plus ironclad privacy.
Whitney Grace, May 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
A Snapshot of American Innovation Today
May 23, 2016
Who exactly are today’s innovators? The Information Technology & Innovation Foundation (ITIF) performed a survey to find out, and shares a summary of their results in, “The Demographics of Innovation in the United States.” The write-up sets the context before getting into the findings:
“Behind every technological innovation is an individual or a team of individuals responsible for the hard scientific or engineering work. And behind each of them is an education and a set of experiences that impart the requisite knowledge, expertise, and opportunity. These scientists and engineers drive technological progress by creating innovative new products and services that raise incomes and improve quality of life for everyone….
“This study surveys people who are responsible for some of the most important innovations in America. These include people who have won national awards for their inventions, people who have filed for international, triadic patents for their innovative ideas in three technology areas (information technology, life sciences, and materials sciences), and innovators who have filed triadic patents for large advanced-technology companies. In total, 6,418 innovators were contacted for this report, and 923 provided viable responses. This diverse, yet focused sampling approach enables a broad, yet nuanced examination of individuals driving innovation in the United States.”
See the summary for results, including a helpful graphic. Here are some highlights: Unsurprisingly to anyone who has been paying attention, women and U.S.-born minorities are woefully underrepresented. Many of those surveyed are immigrants. The majority of survey-takers have at least one advanced degree (many from MIT), and nearly all majored in a STEM subject as undergrads. Large companies contribute more than small businesses do, while innovations are clustered in California, the Northeast, and close to sources of public research funding. And take heart, anyone over 30, for despite the popular image of 20-somethings reinventing the world, the median age of those surveyed is 47.
The piece concludes with some recommendations: We should encourage both women and minorities to study STEM subjects from elementary school on, especially in disadvantaged neighborhoods. We should also lend more support to talented immigrants who wish to stay in the U.S. after they attend college here. The researchers conclude that, with targeted action from the government on education, funding, technology transfer, and immigration policy, our nation can tap into a much wider pool of innovation.
Cynthia Murrell, May 23, 2016
Big Data and Value
May 19, 2016
I read “The Real Lesson for Data Science That is Demonstrated by Palantir’s Struggles · Simply Statistics.” I love write ups that plunk the word statistics near simple.
Here’s the passage I highlighted in money green:
… What is the value of data analysis?, and secondarily, how do you communicate that value?
I want to step away from the Palantir Technologies’ example and consider a broader spectrum of outfits tossing around the jargon “big data,” “analytics,” and synonyms for smart software. One doesn’t communicate value. One finds a person who needs a solution and crafts the message to close the deal.
When a company and its perceived technology catch the attention of allegedly informed buyers, a bandwagon effect kicks in. Talk inside an organization leads to mentions in internal meetings. The vendor whose products and services are the subject of these comments begins to hint at bigger and better things at conferences. Then a real journalist may catch a scent of “something happening” and write an article. Technical talks at niche conferences generate wonky articles usually without dates or footnotes which make sense to someone without access to commercial databases. If a social media breeze whips up the smoldering interest, then a fire breaks out.
A start up should be so clever, lucky, or tactically gifted to pull off this type of wildfire. But when it happens, big money chases the outfit. Once money flows, the company and its products and services become real.
The problem with companies processing a range of data is that there are some friction inducing processes that are tough to coat with Teflon. These include:
- Taking different types of data, normalizing them, indexing them in a meaningful manner, and creating metadata which is accurate and timely.
- Converting numerical recipes, many with built in threshold settings and chains of calculations, into marching band order able to produce recognizable outputs.
- Figuring out how to provide an infrastructure that can sort of keep pace with the flows of new data and the updates/corrections to the already processed data.
- Generating outputs that people in a hurry or in a hot zone can use to positive effect; for example, in a war zone, not get killed when the visualization is not spot on.
The write up focuses on a single company and its alleged problems. That’s okay, but it understates the problem. Most content processing companies run out of revenue steam. The reason is that the licensees or customers want the systems to work better, faster, and more cheaply than predecessor or incumbent systems.
The vast majority of search and content processing systems are flawed, expensive to set up and maintain, and really difficult to use in a way that produces high reliability outputs over time. I would suggest that the problem bedevils a number of companies.
Some of those struggling with these issues are big names. Others are much smaller firms. What’s interesting to me is that the trajectory content processing companies follow is a well worn path. One can read about Autonomy, Convera, Endeca, Fast Search & Transfer, Verity, and dozens of other outfits and discern what’s going to happen. Here’s a summary for those who don’t want to work through the case studies on my Xenky intel site:
Stage 1: Early struggles and wild and crazy efforts to get big name clients
Stage 2: Making promises that are difficult to implement but which are essential to capture customers looking actively for a silver bullet
Stage 3: Frantic building and deployment accompanied by heroic exertions to keep the customers happy
Stage 4: Closing as many deals as possible either for additional financing or for licensing/consulting deals
Stage 5: The early customers start grousing and the momentum slows
Stage 6: Sell off the company or shut down like Delphes, Entopia, Siderean Software and dozens of others.
The problem is not technology, math, or Big Data. The force which undermines these types of outfits is the difficulty of making sense out of words and numbers. In my experience, the task is a very difficult one for humans and for software. Humans want to golf, cruise Facebook, emulate Amazon Echo, or like water find the path of least resistance.
Making sense out of information when someone is lobbing mortars at one is a problem which technology can only solve in a haphazard manner. Hope springs eternal and managers are known to buy or license a solution in the hopes that my view of the content processing world is dead wrong.
So far I am on the beam. Content processing requires time, humans, and a range of flawed tools which must be used by a person with old fashioned human thought processes and procedures.
Value is in the eye of the beholder, not in zeros and ones.
Stephen E Arnold, May 19, 2016
Signs of Life from Funnelback
May 19, 2016
Funnelback has been silent as of late, according to our research, but the search company has emerged from the tomb with eyes wide open and a heartbeat. The Funnelback blog has shared some new updates with us. The first bit of news, “Searchless In Seattle? (AKA We’ve Just Opened A New Office!),” explains that Funnelback opened a new office in Seattle, Washington. The search company already has offices in Poland, the United Kingdom, and New Zealand, but now it wants to establish a branch in the United States. Given its successful track record with the finance, higher education, and government sectors in the other countries, it stands a chance to offer more competition in the US. Seattle also has a reputable technology center, and Funnelback will not have to deal with the Silicon Valley group.
The second piece of Funnelback news deals with “Driving Channel Shift With Site Search.” Channel shift is the process of creating the most efficient and cost effective way to deliver information access and usage to users. It can be difficult to implement a channel shift, but increasing the effectiveness of a Web site’s search can have a huge impact.
Being able to locate information on a Web site quickly and effectively not only saves time; it can also drive sales, enhance reputation, and so on.
“You can go further still, using your search solution to provide targeted experiences; outputting results on maps, searching by postcode, allowing for short-listing and comparison baskets and even dynamically serving content related to what you know of a visitor, up-weighting content that is most relevant to them based on their browsing history or registered account.
Couple any of the features above with some intelligent search analytics, that highlight the content your users are finding and importantly what they aren’t finding (allowing you to make the relevant connections through promoted results, metadata tweaking or synonyms), and your online experience is starting to become a lot more appealing to users than that queue on hold at your call centre.”
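As a rough illustration of the search analytics the quote describes, the sketch below counts the queries that returned nothing and shows how a synonym table might connect them to existing content. It is only a sketch under stated assumptions: the log format, the sample queries, the synonym table, and the function names are hypothetical and are not taken from Funnelback's product.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def zero_result_queries(
    log: Iterable[Tuple[str, int]], top: int = 3
) -> List[Tuple[str, int]]:
    """Return the most frequent queries that came back with no results."""
    misses = Counter(
        query.strip().lower() for query, hit_count in log if hit_count == 0
    )
    return misses.most_common(top)


def expand_with_synonyms(query: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Return the normalized query plus any editor-supplied synonyms for it."""
    key = query.strip().lower()
    return [key] + synonyms.get(key, [])


# Hypothetical log entries: (query text, number of results returned).
SAMPLE_LOG = [
    ("bin collection", 12),
    ("rubbish pickup", 0),
    ("Rubbish Pickup", 0),
    ("council tax", 8),
    ("parking fine appeal", 0),
]

# Hypothetical synonym table an editor might build after reading the report.
SAMPLE_SYNONYMS = {"rubbish pickup": ["bin collection"]}
```

Running `zero_result_queries(SAMPLE_LOG)` surfaces what visitors are not finding. `expand_with_synonyms` then shows one way a failed query could be mapped to terms the site does cover.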
I have written about it many times, but a decent Web site search function can make or break a site. A poor one not only demonstrates that the Web site is not professional, it also fails to inspire confidence in a business. It is a very big rookie mistake to make.
Whitney Grace, May 19, 2016
Travel to South Africa Virtually with Google’s Mzansi Experience
May 18, 2016
The article on Elle titled Google SA Launches the Mzansi Experience On Maps illustrates the new Google Street View collection for South Africa. For people without the ability to travel, or scared of malaria or Oscar Pistorius, this collection offers an in-depth platform to view some of South Africa’s natural wonders and parks. The article explains,
“Using images collected by the Street View Tripod and Trekker, Google has created 360-degree imagery of some of South Africa’s most beautiful locations, and created virtual tours that enable visitors to see the sights for themselves on their phones, tablets or computers. Visitors will be able to, for the first time, visit a family of elephants in the Kruger National Park, take a virtual walk on Table Mountain, admire Cape Point, or take a walk along Durban’s Golden Mile.”
For South Africa, this initiative might spark increased tourism once people realize just how much the country has to offer. So many of the images of Africa that we are exposed to in the US are reductive and patronizing, like those ceaseless commercials depicting all of Africa as a small, poverty-stricken village. Google’s new collection helps to promote a more diverse and appealing look at one African country: South Africa. Whether you want to go in person or virtually, this is worth checking out!
Chelsea Kerwin, May 18, 2016
The Trials, Tribulations, and Party Anecdotes Of “Edge Case” Names
May 16, 2016
The article titled These Unlucky People Have Names That Break Computers on BBC Future delves into the strange world of “edge cases” or people with unexpected or problematic names that reveal glitches in the most commonplace systems that those of us named “Smith” or “Jones” take for granted. Consider Jennifer Null, the Virginia woman who can’t book a plane ticket or complete her taxes without extensive phone calls and headaches. The article says,
“But to any programmer, it’s painfully easy to see why “Null” could cause problems for a database. This is because the word “null” is often inserted into database fields to indicate that there is no data there. Now and again, system administrators have to try and fix the problem for people who are actually named “Null” – but the issue is rare and sometimes surprisingly difficult to solve.”
People with names like Null are rare, which is part of why the problem persists. Because of the way systems validate names, issues generally arise for people like Null on systems where the name actually does matter, like government forms. This is not an issue unique to the US, either. One Patrick McKenzie, an American programmer living in Japan, has run into regular difficulties because of the length of his last name. But that is nothing compared to Janice Keihanaikukauakahihulihe’ekahaunaele, a Hawaiian woman who campaigned for more flexibility in name length restrictions for state ID cards.
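The “Null” problem quoted above can be sketched in a few lines. The example is illustrative only and is not drawn from any system named in the article. It assumes a hypothetical input-cleaning step that treats the text “null” as missing data, and contrasts it with a version that reserves “missing” for a genuinely absent value. The function and table names are made up for the example.

```python
import sqlite3
from typing import Optional


def naive_clean(value: str) -> Optional[str]:
    """Buggy on purpose: treats the text 'null' as 'no data'."""
    if value.strip().lower() in {"", "null", "none"}:
        return None
    return value


def safer_clean(value: Optional[str]) -> Optional[str]:
    """Only an absent value (None) or blank input counts as missing."""
    if value is None or value.strip() == "":
        return None
    return value


def store_and_fetch(surname: Optional[str]) -> Optional[str]:
    """Round-trip a surname through an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (surname TEXT)")
    # Parameter binding keeps the string 'Null' distinct from SQL NULL.
    conn.execute("INSERT INTO people (surname) VALUES (?)", (surname,))
    row = conn.execute("SELECT surname FROM people").fetchone()
    conn.close()
    return row[0]
```

With `naive_clean`, Jennifer Null's surname is silently replaced by an empty field before it ever reaches the database. With `safer_clean` and a bound parameter, the string survives the round trip.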
Chelsea Kerwin, May 16, 2016
Anonymous Hacks Turkish Cops
May 16, 2016
Anonymous has struck again, this time hacking the Turkish General Directorate of Security (EGM) in its crusade against corruption. The International Business Times reports, “Anonymous: Hacker Unleashes 17.8 GB Trove of Data from a Turkish National Police Server.” It is believed that the hacker responsible is ROR[RG], who was also deemed responsible for last year’s Adult Friend Finder breach. The MySQL-friendly files are now available for download at TheCthulhu website, which seems to be making a habit of posting hacked police data.
Why has Anonymous targeted Turkey? Reporter Jason Murdock writes:
“Anonymous has an established history with carrying out cyberattacks against Turkey. In 2015 the group, which is made up of a loose collection of hackers and hacktivists from across the globe, officially ‘declared war’ on the country. In a video statement, the collective accused Turkish President Recep Tayyip Erdoğan’s government of supporting the Islamic State (Isis), also known as Daesh.
“‘Turkey is supporting Daesh by buying oil from them, and hospitalising their fighters,’ said a masked spokesperson at the time. ‘We won’t accept that Erdogan, the leader of Turkey, will help Isis any longer. If you don’t stop supporting Isis, we will continue attacking your internet […] stop this insanity now Turkey. Your fate is in your own hands.’”
We wonder how Turkey will respond to this breach, and what nuggets of troublesome information will be revealed. We are also curious to see what Anonymous does next; stay tuned.
Cynthia Murrell, May 16, 2016
Parts Unknown of Dark Web Revealed in Study
May 13, 2016
While the parts unknown of the internet are said to be populated by terrorists’ outreach and propaganda, research paints a different picture. Quartz reports on this in the article, The dark web is too slow and annoying for terrorists to even bother with, experts say. The research mentioned comes from Thomas Rid and Daniel Moore of the Department of War Studies at King’s College London. They found just 140 extremist Tor hidden services. Inaccessible or inactive services topped the list with 2,482, followed by 1,021 non-illicit services. Among illicit services, those related to drugs, at 423, far outnumbered extremist ones. The write-up offers a few explanations for the lack of terrorists publishing on the Dark Web:
“So why aren’t jihadis taking advantage of running dark web sites? Rid and Moore don’t know for sure, but they guess that it’s for the same reason so few other people publish information on the dark web: It’s just too fiddly. “Hidden services are sometimes slow, and not as stable as you might hope. So ease of use is not as great as it could be. There are better alternatives,” Rid told Quartz. As a communications platform, a site on the dark web doesn’t do what jihadis need it to do very well. It won’t reach many new people compared to “curious Googling,” as the authors point out, limiting its utility as a propaganda tool. It’s not very good for internal communications either, because it’s slow and requires installing additional software to work on a mobile phone.”
This article provides fascinating research and interesting conclusions. However, we must add unreliable and insecure to the descriptors for why the Dark Web may not be suitable for such uses.
Megan Feil, May 13, 2016
Smart Software a Derby Winner. Watson Does Not Show
May 12, 2016
I read “AI Predicts All Four Top Places in the Kentucky Derby: Machine Uses Swarm Intelligence to Turn $20 bet into $11,000.” Let’s assume that the magic revealed in the write up is spot on. I will not ask the question, “How much would IBM have won if it had bet a couple of hundred million on the Kentucky Derby using the revealed technology?” Believe me, I want to ask that question, but I will exercise restraint.
According to the write up:
An artificial intelligence program developed by Unanimous A.I. successfully predicted the Superfecta at the 142nd Kentucky Derby last Saturday, turning a $20 bet into nearly $11,000. Using ‘Swarm Intelligence,’ the AI was able to correctly choose the winning horse, Nyquist – along with the second, third, and fourth finishers.
The article includes a nifty, real anigif to illustrate how “swarm intelligence” made big money at the track.
The idea originated at the real news outfit TechRepublic.
The trick:
Many minds are better than one.
Should one ask Watson if it can perform the same big payday magic? Nah.
Stephen E Arnold, May 12, 2016
Amusing Mistake Illustrates Machine Translation Limits
May 12, 2016
Machine translation is not quite perfect yet, but we’ve been assured that it will be someday. That’s the upshot of Business Insider’s piece, “This Microsoft Exec’s Hilarious Presentation Fail Shows Why Computer Translation is so Difficult.” Writer Matt Weinberger relates an anecdote shared by Microsoft research head Peter Lee. The misstep occurred during a 2015 presentation, for which Lee set up Skype Translator to translate his words over the speakers into Mandarin as he went. Weinberger writes:
“Part of Lee’s speech involved a personal story of growing up in a ‘snowy town’ in upper Michigan. He noticed that most of the crowd was enraptured — except for a few native Chinese speakers in the crowd who couldn’t stop giggling. After the presentation, Lee says he asked one of those Chinese speakers the reason for the laughter. It turns out that ‘snowy town’ translates into ‘Snow White’s Town.’ Which seems innocent enough, except that it turns out that ‘Snow White’s town’ is actually Chinese slang for ‘a town where a prostitute lives,’ Lee says. Whoops.
“Lee says it wasn’t caught in the profanity filters because there weren’t actually any bad words in the phrase. But it’s the kind of regional flavor where a direct translation of the words can’t bring across the meaning.”
Whoops indeed. The article notes that another problem with Skype Translator is its penchant for completely disregarding non-word utterances, like “um” and “ahh,” that often carry necessary meaning. We’re reminded, though, that these and other problems are expected to be ironed out within the next few years, according to Microsoft Research chief scientist Xuedong Huang. I wonder how many more amusing anecdotes will arise in the meantime.
Cynthia Murrell, May 12, 2016

