DeepGram: Audio Search in Lectures and Podcasts
March 23, 2016
I read “DeepGram Lets You Search through Lectures and Podcasts for Your Favorite Quotes.” I don’t think the system is available at this time. The article states:
Search engines make it easy to look through text files for specific words, but finding phrases and keywords in audio and video recordings could be a hassle. Fortunately, California-based startup DeepGram is working on a tool that will make this process simpler.
The hint is the “is working.” Not surprisingly, the system is infused with artificial intelligence. The process is to convert speech to text and then index the result.
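To make the "convert speech to text and then index the result" idea concrete, here is a minimal sketch in Python. It is not DeepGram's method, which the article does not describe. It assumes the speech-to-text step has already produced timestamped transcript segments; the sample segments, `build_index`, and `search_phrase` are all made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(segments):
    """Build an inverted index: word -> set of segment ids containing it.

    `segments` is a list of (start_seconds, text) pairs. They are assumed
    to come from a speech-to-text step that is not shown here.
    """
    index = defaultdict(set)
    for seg_id, (_, text) in enumerate(segments):
        for word in tokenize(text):
            index[word].add(seg_id)
    return index


def search_phrase(phrase, segments, index):
    """Return the start times of segments that contain the exact phrase."""
    words = tokenize(phrase)
    if not words:
        return []
    # Candidate segments must contain every word of the phrase.
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    hits = []
    n = len(words)
    for seg_id in sorted(candidates):
        tokens = tokenize(segments[seg_id][1])
        # Confirm the words appear next to each other, in order.
        if any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1)):
            hits.append(segments[seg_id][0])
    return hits


# Hypothetical transcript segments: (start time in seconds, recognized text).
segments = [
    (0.0, "Welcome to the lecture on search engines"),
    (12.5, "Finding phrases in audio could be a hassle"),
    (30.0, "Search engines index text, not audio"),
]
index = build_index(segments)
print(search_phrase("search engines", segments, index))
```

The lookup itself is cheap. The expensive part is the transcription that feeds it, which is where the processing power concern comes in.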
Exalead had an interesting system seven or eight years ago. I am not sure what happened to that demonstration. My recollection is that the challenge is to have sufficient processing power to handle the volume of audio and video content available for indexing.
When an outfit like Google is not able to pull off a comprehensive search system for its audio and video content, my hunch is that the task for a robust volume of content might be a challenge.
But if there is sufficient money, engineering talent, and processing power, perhaps I will no longer have to watch serial videos and listen to lousy audio to figure out what some folks are trying to communicate in their presentations.
Stephen E Arnold, March 23, 2016
Amazon Web Services: Crushing the Competition?
March 23, 2016
I read “Attack! Run. WTF? A Decade of Enterprise Class Fear and Uncertainty with AWS.” I am not sure if Amazon’s Web Services’ business is being praised or criticized. Nevertheless, the write up has some interesting factoids. I highlighted these statements:
IBM’s Cloud Services
- IBM, … was so flabbergasted [when Amazon won a US government contract] that the Blue Shirts of Armonk decided on the old-school route to victory and filed a legal complaint asking the government to re-evaluate IBM’s deal against that of Amazon, which Big Blue later withdrew.
- Famed for re-inventing itself around software in the 1990s under Lou Gerstner, the majority of IBM’s focus for the 2000s was devoted to unloading the PC and the server businesses on China. The firm is now trapped in a maelstrom of transition, restructuring and layoffs. Like Microsoft, IBM seems to have believed AWS couldn’t happen to it, that what the world needed was the same server software and services. It was nearly seven years after AWS that IBM realized something was afoot – probably when it lost both the CIA deal and got slapped about its attempts to make the CIA love it – that Big Blue said it would spend $2bn buying computing player SoftLayer and in 2014 throw $1.2bn into a massive data centre expansion to host your data and compute.
Microsoft Cloud Services
- Azure succumbed to classic innovator’s dilemma: how to sell a new platform as a package and at a price to maximize revenue without cannibalizing the company’s actual main money-makers – PC and server software. After delayed starts under Ray Ozzie and Bob Muglia, the technology roadmap only really clicked under new CEO Satya Nadella and executive software nerd Scott Guthrie. One brought the CEO-level commitment, the other made Azure work for developers.
- Gartner today regards Azure as number two, behind AWS, and yet… According to Gartner’s incumbent Cloud Queen Lydia Leong, Azure lacks the polish of AWS.
Oracle Cloud Services
- Oracle, which bought Sun, preferred to play a Game of Thrones that was corporate M&A to hold onto its position in IT. Sadly, it chose wrong; Oracle spent $8.5bn on Sun but ultimately discontinued the company’s fledgling utility computing service. Hardware and Java was what Oracle wanted.
- Today, Oracle’s resultant hardware business makes just half the revenue of AWS and is shrinking – falling 13 per cent to $1.1bn – versus AWS’s 69 per cent growth last quarter to $2.4bn. That past complacency of Oracle’s CEO on cloud has put Oracle firmly in a pack of also rans behind AWS on platform cloud, with Oracle now throwing PR at a problem to convince Wall St it is credible as a provider of IT as a service.
And what about Amazon? The write up points out:
- AWS is still attacking – growing at a phenomenal rate, 71 per cent in its recent quarter to $2.4bn and 69 per cent for the year to $7.88bn. The appetite among enterprises for AWS’s style of technology and model of delivery clearly hasn’t yet been satiated.
- …the truth is AWS now has its fences across so much of the cloud, removing them isn’t an option. The big question then for AWS at the age of 10 is this: when will the old men of IT regain their wind? How big will be their counter-attack and will it be concerted? Will it pose a tangible threat and how would AWS respond?
I noted that Apple has shifted some of its cloud business to the Google from AWS. I assume the Board of Directors’ excitement is now behind the kids from Cupertino. What’s clear is that IBM and Oracle seem to face an uphill slog if I understand the write up. Read the original and decide for yourself. I love the WTF. Some stakeholders may be asking this question too.
Stephen E Arnold, March 23, 2016
Weekly Watson: IBM Watson Has a Sister
March 23, 2016
I read “In Africa, Watson’s Sister Lucy Is Growing Up with the Help of IBM’s Research Team.” I did not know that. According to the write up:
Lucy, named after the fossil ancestor Australopithecus afarensis, is more of a system than a sci-fi super machine. “Lucy is many things, but it’s not just one talking computer in a room,” said Dr. Kamal Bhattacharya, Director of IBM Research–Africa. “We are using Watson related technology and big data analytics to develop solutions to African problems.”
I have been to different countries in Africa a handful of times. I have seen some of the problems first hand. I learned from the description of Lucy, sister of IBM Watson, that:
On the execution side, IBM Research Africa has launched problem solving groups around issues such as education, infrastructure, health care, and economic inclusion. Partners include African universities, telcos, hospitals, tech startups, and the Kenyan ICT Authority.
Research is good. Research which helps people is good. My concern is that IBM remains mired in years of revenue challenges. Marketing, not generating benefits for its stakeholders, seems to be a core IBM Watson competency. Also, the company is improving its ability to terminate unneeded employees. Lucy, what’s the fix for declining IBM revenues?
I await word from Watson’s sister?
Stephen E Arnold, March 23, 2016
The Dark Web Cuts the Violence
March 23, 2016
Drug dealing is a shady business that takes place in a nefarious underground and runs discreetly under our noses. Along with drug dealing comes a variety of violence involving guns, criminal offenses, and often death. Countless people have lost their lives to violence related to drug dealing, and that does not even include the people who overdosed. Would you believe that the drug dealing violence is being curbed by the Dark Web? TechDirt reveals, “How The Dark Net Is Making Drug Purchases Safer By Eliminating Associated Violence And Improving Quality.”
The Dark Web is the Internet’s underbelly, where stolen information and sex trafficking victims are sold, terrorists mingle, and, of course, drugs are peddled. Who would have thought that the Dark Web would actually provide a beneficial service to society by sending drug dealers online and taking them off the streets? With the drug dealers goes the associated violence. There also appears to be a system of checks and balances, where drug users can leave feedback a la eBay. It pushes the drug quality up as well, but is that a good or bad thing?
“The new report comes from the European Monitoring Centre for Drugs and Drug Addiction, which is funded by the European Union, and, as usual, is accompanied by an official comment from the relevant EU commissioner. Unfortunately, Dimitris Avramopoulos, the European Commissioner for Migration, Home Affairs and Citizenship, trots out the usual unthinking reaction to drug sales that has made the long-running and totally futile “war on drugs” one of the most destructive and counterproductive policies ever devised:
‘We should stop the abuse of the Internet by those wanting to turn it into a drug market. Technology is offering fresh opportunities for law enforcement to tackle online drug markets and reduce threats to public health. Let us seize these opportunities to attack the problem head-on and reduce drug supply online.’”
The war on drugs is a futile fight, but illegal substances do not benefit anyone. While it is a boon to society for the crime to be taken off the streets, take into consideration that the Dark Web is also a breeding ground for crimes arguably worse than drug dealing.
Whitney Grace, March 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Stanford Offers Course Overviewing Roots of the Google Algorithm
March 23, 2016
The course syllabus for Stanford’s Computer Science class titled CS 349: Data Mining, Search, and the World Wide Web on Stanford.edu provides an overview of some of the technologies and advances that led to Google search. The syllabus states,
“There has been a close collaboration between the Data Mining Group (MIDAS) and the Digital Libraries Group at Stanford in the area of Web research. It has culminated in the WebBase project whose aims are to maintain a local copy of the World Wide Web (or at least a substantial portion thereof) and to use it as a research tool for information retrieval, data mining, and other applications. This has led to the development of the PageRank algorithm, the Google search engine…”
The syllabus alone offers some extremely useful insights that could help students and laypeople understand the roots of Google search. Key inclusions are the Digital Equipment Corporation (DEC) and PageRank, the algorithm named for Larry Page that enabled Google to become Google. The algorithm ranks web pages based on how many other websites link to them. Jon Kleinberg also played a key role by realizing that websites with lots of links (like a search engine) should also be seen as more important. The larger context of the course is data mining and information retrieval.
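For readers who want to see the link-based ranking idea in action, here is a minimal sketch of the textbook PageRank power iteration in Python. It is not Google's production algorithm. The damping value of 0.85 is the conventional textbook choice, and the four-page graph and the `pagerank` function name are assumptions made for illustration. The textbook version also weights each inbound link by the rank of the page it comes from, so it counts more than the raw number of links.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank scores by power iteration.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to roughly 1.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with equal rank everywhere.
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank to all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b, and d all link to c; c links back to a.
graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy graph, page c collects the most links and ends up on top. Page a has only one inbound link, but it comes from the top-ranked page, so a outranks b and d, which have none.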
Chelsea Kerwin, March 23, 2016
Interview with Stephen E Arnold, Reveals Insights about Content Processing
March 22, 2016
Nikola Danaylov of the Singularity Weblog interviewed technology and financial analyst Stephen E. Arnold on the latest episode of his podcast, Singularity 1 on 1. The interview, Stephen E. Arnold on Search Engines and Intelligence Gathering, offers thought-provoking ideas on important topics related to sectors — such as intelligence, enterprise search, and financial — which use indexing and content processing methods Arnold has worked with for over 50 years.
Arnold attributes the origins of his interest in technology to a programming challenge he sought and accepted from a computer science professor, outside of the realm of his college major of English. His focus on creating actionable software and his affinity for problem-solving of any nature led him to leave PhD work for a job with Halliburton Nuclear. His career includes employment at Booz, Allen & Hamilton, the Courier Journal & Louisville Times, and Ziff Communications, before starting ArnoldIT.com strategic information services in 1991. He co-founded and sold a search system to Lycos, Inc., worked with numerous organizations including several intelligence and enforcement organizations such as US Senate Police and General Services Administration, and authored seven books and monographs on search related topics.
With a continued emphasis on search technologies, Arnold began his blog, Beyond Search, in 2008 aiming to provide an independent source of “information about what I think are problems or misstatements related to online search and content processing.” Speaking to the relevance of the blog to his current interest in the intelligence sector of search, he asserts:
“Finding information is the core of the intelligence process. It’s absolutely essential to understand answering questions on point and so someone can do the job and that’s been the theme of Beyond Search.”
As Danaylov notes, the concept of search encompasses several areas where information discovery is key for one audience or another, whether counter-terrorism, commercial, or other purposes. Arnold agrees,
“It’s exactly the same as what the professor wanted to do in 1962. He had a collection of Latin sermons. The only way to find anything was to look at sermons on microfilm. Whether it is cell phone intercepts, geospatial data, processing YouTube videos uploaded from a specific IP address– exactly the same problem and process. The difficulty that exists is that today we need to process data in a range of file types and at much higher speeds than ever anticipated, but the processes remain the same.”
Arnold explains the iterative nature of his work:
“The proof of the value of the legacy is I don’t really do anything new, I just keep following these themes. The Dark Web Notebook is very logical. This is a new content domain. And if you’re an intelligence or information professional, you want to know, how do you make headway in that space.”
Describing his most recent book, Dark Web Notebook, Arnold calls it “a cookbook for an investigator to access information on the Dark Web.” This monograph includes profiles of little-known firms which perform high-value Dark Web indexing and follows a book he authored in 2015 called CYBEROSINT: Next Generation Information Access.
Yellowfin: Emulating i2 and Palantir?
March 22, 2016
I read “New BI Platform Focuses on Collaboration, Analytics.” What struck me about this explanation of a new version of Yellowfin is that the company is adding the type of features long considered standard in law enforcement and intelligence. The idea is that visualizations and collaboration are components of a commercial business intelligence solution.
I noted this paragraph:
Other BI vendors have tried to push data preparation and analysis responsibilities onto business users “because it’s easier to adapt what they have to fulfill that goal.” But Yellowfin “isn’t a BI tool attempting to make the business user a techie. It is about presenting data to users in an attractive visual representation, backed-up with some of the most sophisticated collaboration tools embedded into a BI platform on the market.”
Analyst involvement in the loading of data is a way to eliminate the issue of content ownership, indexing, and knowledge of what is in the system’s repository. I am not confident that any system which allows the user to whack away at whatever data have been processed by the system is ready for prime time. Sure, Google can win at Go, but the self driving auto ran into a bus.
The write up, which strikes me as New Age public relations, seems to want me to remember what’s new with Yellowfin with this mnemonic example: Curated. Baffled? Here’s what curated means:
- Consistent: Governed, centralized and managed
- Usable: by any business to consume analytics
- Relevant: connected to all the data users need to do their jobs well
- Accurate: data quality is paramount
- Timely: Provide real time data and agile content development
- Engaging: Offer a social or collaborative component
- Deployed: widely across the organization.
Business intelligence is the new “enterprise search.” I am not sure the use of notions like curated and adding useful functions delivers the impact that some marketers promise. Remember that self driving car. Pesky humans.
Stephen E Arnold, March 23, 2016
More Amazing Factoids: US Government Web Sites Best Amazon and Google in User Satisfaction
March 22, 2016
I read “Government Websites Best Amazon, Google in User Satisfaction.” From the write up generated by a “real” journalist at a “real” media outfit, I learned:
By one measure, a well-established gauge of user satisfaction, the government actually beats out many of the top business sites on the Web, including perennial consumer favorites Amazon, Expedia and Google.
Where doth the datum originate? Well, the hardly annoying pop up survey outfit ForeSee. According to the write up:
ForeSee evaluates websites on a 100-point customer-satisfaction scale, looking at a variety of factors like search, functionality and ease of navigation. The firm also focuses on outcomes, such as the likelihood that users would return to the site or recommend it to others.
Now for the data:
… 36 percent of the 101 websites ForeSee evaluated in the fourth quarter of 2015 notched scores of 80 or above, what the firm deems as the threshold where websites are “meeting or exceeding the standards of excellence for highly satisfied visitors.” That mark was up from 30 percent in the first quarter of the year. Leading the pack were four websites maintained by the Social Security Administration. Two SSA sites scored 90 on ForeSee’s satisfaction index, and two others scored 89. For comparison, Amazon netted an 86 on the same index. Vanguard.com came in at 80, followed by Google (78), Pinterest (78), Expedia (77) and NYTimes.com (76).
I have added some bold face to make it easier to see the slam dunk the US government Web sites are putting in the face of Team Traffic.
Wow, up from 30% in a matter of months. The Social Security Administration must be doing something right. A couple of questions:
- Does the SSA site support remembering certain passwords for users, or do some must-have functions lose the state of certain users?
- Has foot traffic at Social Security offices declined because the SSA Web sites are satisfying such a large percentage of users?
- Are the SSA Web sites integrated, or are disparate systems, including mainframes, still generating content for internal reports and public Web queries?
Well, the write up focuses on the lousy job some consumer centric sites are doing with user satisfaction. Are we comparing apples and oranges, or is this just a convenient way to reward some good government clients and remind the most used Web sites that some folks don’t like the modern Web?
No answers, but I am sure some of the university-inspired wizards at ForeSee will have logical, but glib, answers.
By the way, what’s the traffic at the four best Web sites doing in the same time period? My information suggests that traffic to US government Web sites is not booming because the US government Web sites have not made the transitions required to deal with the growing base of users with mobile devices.
Stephen E Arnold, March 22, 2016
Hot Data Startups to Notice
March 22, 2016
An outfit called UBM, which looks a lot like the old IDC I knew and loved, published “9 Hot Big Data and Analytics Startups to Watch.” The article is a series of separate pages. Apparently the lust for clicks is greater than the MBAs’ interest in making information easy to access. Progress in online publishing is zipping right along the information highway it seems.
What are the companies the article and UBM are describing as “hot”? I interpret the word to mean “having a high degree of heat or a high temperature” or “(of food) containing or consisting of pungent spices or peppers that produce a burning sensation when tasted.” I have a hunch the use of the word in this write up is intended to suggest big revenue producers which you must license in order to get or keep a job. Just a guess, mind you.
The companies are:
AtScale, founded in 2013
Algorithmia, founded in 2013
Bedrock Data, founded in 2012
BlueTalon, founded in 2013
Cazena, founded in 2014
Confluent, founded in 2014
H2O.ai, founded in 2011
RJMetrics, founded in 2008
Wavefront, founded in 2013
The list is US centric. I assume none of the Big Data and analytics outfits in other countries are “hot.” I think the reason is that the research process looked at Boston, Seattle, and the Sillycon Valley pool and thought, “Close enough for horseshoes.” Just a guess, mind you.
If you are looking for the next big thing founded within the last two to eight years, the list is just what you need to make your company or organization great again. Sorry, some catchphrases are tough to purge from my addled goose brain. Enjoy the listicle. On high latency systems, the slides don’t render. Again. Do MBAs worry about this stuff? A final comment: I like the name “BlueTalon.”
Stephen E Arnold, March 22, 2016
Allegedly Secretive Palantir Technologies Getting Chatty?
March 22, 2016
Many of the articles I read about Palantir Technologies describe the company as secretive. I am not sure that is 100 percent accurate. The company has videos on YouTube for goodness sake.
I noted “How Palantir Uses Big Data to Find Missing Kids.” This article came hard on the heels of “Is Morgan Stanley Wrong about Big Palantir Valuation Markdown?”
The missing kids story emphasizes Palantir’s social “good” work. I noted this passage:
Lucky for Palantir, big data challenges are just as common in the nonprofit world as in the for-profit sector. Recently, the company, which started out partnering with the U.S. intelligence and defense communities in antiterrorism efforts, has turned its attention to one of the biggest current problems: The Syrian civil war and subsequent refugee crisis, via a collaboration with The Carter Center. “We’re a company that focuses on the world’s hardest problems,” says Karin Knox, head of Palantir’s philanthropy engineering team. “Right now we probably have a hand in all of them.”
Lucky.
Stephen E Arnold, March 22, 2016

