AI Technology Poised to Spread Far and Wide
April 3, 2015
Artificial intelligence is having a moment; the second half of last year saw about half a billion dollars invested in the AI industry. Wired asks and answers, “The AI Resurgence: Why Now?” Writer Babak Hodjat observes that advances in hardware and cloud services have allowed more contenders to afford to enter the arena. Open source tools like Hadoop also help. Then there’s public perception; with the proliferation of Siri and her ilk, people are more comfortable with the whole concept of AI (Steve Wozniak aside, apparently). It seems to help that these natural-language personal assistants have a sense of humor. Hodjat continues:
“But there’s more substance to this resurgence than the impression of intelligence that Siri’s jocularity gives its users. The recent advances in Machine Learning are truly groundbreaking. Artificial Neural Networks (deep learning computer systems that mimic the human brain) are now scaled to several tens of hidden layer nodes, increasing their abstraction power. They can be trained on tens of thousands of cores, speeding up the process of developing generalizing learning models. Other mainstream classification approaches, such as Random Forest classification, have been scaled to run on very large numbers of compute nodes, enabling the tackling of ever more ambitious problems on larger and larger data-sets (e.g., Wise.io).”
The investment boom has produced a surge of start-ups offering AI solutions to companies in a wide range of industries. Organizations in fields as diverse as medicine and oil production seem eager to incorporate these tools; it remains to be seen whether the tech is a good investment for every type of enterprise. For his part, Hodjat has high hopes for its use in fraud detection, medical diagnostics, and online commerce. And for ever-improving personal assistants, of course.
Cynthia Murrell, April 3, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
Hidden Value Oxymoron: Another Me Too Webinar?
April 2, 2015
Look what I received in my email on April Fool’s Day.
I know zero about direct marketing. It did cross my mind that when sending out content marketing spam, one should make sure the message does not appear as a spoof. “Hidden value.” Okay.
I also wondered why IDC and BA Insight would want me to attend a webinar when I have been an outspoken critic of webinars and of mid tier consulting companies that recycle my content without bothering to issue a contract, pay for rights, or make sure I am okay with the pricing and the method of selling. I don’t want my content on Amazon, a company focused on offering one button ordering of laundry detergent, thank you.
Mid tier consultancies and their experts are another kettle of fish from an unregistered trawler operating near Samut Sakhon, Thailand.
This buzzword filled marketing spam is an invitation to yet another advertising webinar. The company footing the bill is BA Insight, which is one of the SharePoint centric search vendors working to generate sufficient revenue to keep its stakeholders calm and carrying on. The best way to achieve sales, it appears, is to pay IDC’s “search expert” to explain the value of hidden information. Oh, yes. The value of hidden information. There is gold in them thar hills.
If you are not familiar with IDC and information, may I point you to this item about IDC’s Dave Schubmehl. You may also find this article mildly amusing: Meme of the Moment.
Keep in mind when you listen to this infomercial that IDC and Mr. Schubmehl sold my content on Amazon without my permission. I buy from Amazon. I don’t sell via Amazon. My legal eagle managed to get the $3,500 eight page document out of the Amazon store. I make my information somewhat more affordable. CyberOSINT is only $99 with the offer code LEA99. That’s a good price for an original chunk of work. The Amazon $3,500 eight page item is, even with my name on it, a pretty crazy play for cash. Maybe an adventuresome five year old might fall for the $3,500 price tag. I would not.
How much of the information in this BA Insight infomercial will be recycled? How much of the information will be of “value”? Well, sign up and drink deep of the Pierian spring.
Remember: If products are not advertised, products may not sell. If products do not sell, there is no money to pay back investors keeping outfits in business. Without business, the mid tier consultants will get fired. Money is what’s important.
Value? Hmm. Good question when experts who use other individuals’ information are the “talent” on a Web infused late night infomercial. Why not hire Guthy Renker and get the job done in a manner that can be measured? Talk about value is not value. Remember. Eight pages of stuff with my name on it was only $3,500.
Such a deal. Ah, the power of presumptive management and challenged search vendors. Why not invite me? I just love this content marketing, webinar, value, best practice fluff.
Note: I almost wrote, “Don’t fail to miss it.” I did not.
Stephen E Arnold, April 2, 2015
Google: Three Elephants Preparing to Fight Among Themselves
April 2, 2015
I love the Google. I found two unrelated articles interesting for one simple reason: Google is getting ready for its own version of Wrestlemania.
The first write up is “China Blasts Google Security Move as ‘Unacceptable’.” Most outfits doing business in China seek to avoid getting into an awkward position with the Chinese authorities. Anyone remember the mobile death vans? Well, check up on your allegedly accurate current history. According to the write up, Google is not recognizing certain Chinese certificates:
The Google posting was updated Wednesday to note that CNNIC’s certificates “will no longer be recognized in Google products” adding that the Chinese organization was “welcome… to reapply once suitable technical and procedural controls are in place”. An anti-censorship group, GreatFire.org – which has accused Beijing of attacking its services—said the original revelation was evidence that CNNIC had been “complicit” in so-called man-in-the-middle operations. Such attacks involve an unauthorized intermediary inserting themselves between computer users and their online destinations, usually undetected, allowing them to harvest data including passwords.
For me, the point is that Google is lighting up the radar of the CNNIC just as the fly bys by Mr. Putin’s armed forces catch the attention of Russia’s neighbors.
The second write up is even more fascinating, if it is accurate. The article is “EU Lays Groundwork for Antitrust Charges Against Google.” You will either have to buy a newspaper and kill a small bush or tree or pay. Or you will have to pony up money for online access. If the link works, wow, you are lucky.
The passage I noted was:
The European Commission, the European Union’s top antitrust authority, has been asking companies that filed complaints against Google for permission to publish some information they previously submitted confidentially, according to several people familiar with the requests. Shopping, local and travel companies are among those that have been contacted, one of those people said.
Assume the “one of those people said” is delivering on-the-money information. The idea that there will be legal documents available for analysis is darned interesting. I have reviewed one or two court documents in my work career and some of them are chock full of useful information. Too bad that some documents, like those in the i2 vs Palantir matter, disappear after the proceedings, but that’s life in the aeries of legal eagles.
The net net of this is that Google is not just jousting with the Xooglers at Facebook and the world’s smartest man at Amazon. The Google appears to be entering a two front war. One hopes those online advertising revenues continue to pump cash into the Google’s coffers. Two front wars can be costly for humans, companies, and the victims of the proverb which asserts:
When elephants fight, only the grass gets trampled.
I am delighted I live in Harrod’s Creek. The mine drainage run off makes grass a scarce item. So who will be the grass when Google tussles with China and the EC? I am interested in how a company battling nation states will move forward.
Stephen E Arnold, April 2, 2015
Intranet Connections and Super Search Version 13
April 2, 2015
I read “Maximize Productivity with Super Search from Intranet Connections (Version 13.0 Release).”
For decades I have been gathering information about enterprise search and content processing. The name of the company was not familiar to me. The assertions in the news article were, however.
Puzzled, I went through my archive of search vendor information and did not find content about Intranet Connections. I noted the date on this article and wondered if the company were an April Fool’s spoof. I know I am getting on in years, but when I wake up and plop in front of my primitive, coal-fired computer, my memory works reasonably well. I know my name and the day of the week.
The write up touts an enterprise search system that includes:
- New Search Engine
- Preview Display Cards
- Intuitive Search Filters
- Advanced Search Options
- Controlled Search Security.
I know from the years of experience I have logged examining, testing, and creating search and content processing systems like the one we sold to the long ago Lycos, that “new” is a slippery concept. For some folks, learning about Google’s site operator is a new thing. For others it is a reminder of the many useful search functions that Google no longer exposes to the ad consumers looking for objective information via Google.com.
I am not sure how often a search system innovates across 13 versions. Intranet Connections seems to have been founded in 1999, which makes the company 16 years young. Most of the long lived search engines don’t change too much from the original core; for example, Autonomy IDOL. On the other hand, other companies just discard a search system and graft in open source Lucene and slap on the “new” label. Others take inspiration from Fast Search and call it new.
The company states on its LinkedIn page here:
Intranet Connections is a business intranet software solution that enables organizations to connect, collaborate and create more efficiently yielding significant time-cost savings and stronger employee engagement. We combine key business tools to automate workflows and processes, while delivering improved communication and collaboration among employees to engage and promote culture within the digital workplace.
How new is Version 13 of a search system? I learned from the write up:
“Enterprise intranet search functionality is proving to be more critical than ever as today’s mature intranets face thousands of data entries, pages, forms, and uploaded corporate documents and policies. Employees’ expectations are higher than ever to deliver on an intranet search utility that is fast, focused, intelligent and super simple. We wanted to introduce intranet search that is not just functional but an entire experience.” Douglas also reports that Super Search was a result of close collaboration with Intranet Connections’ customers who were active in feedback for the design and feature set capabilities, ensuring the product release would enhance their needs for enterprise intranet search.
The company adds that it can deliver software capable of “triggering emotions on the Intranet using Intranet design such as theme, videos, and photos.” Furthermore, the system seems to be able to marry an “Intranet and an enterprise social network.”
These are significant assertions.
Okay, that sounds great but we are in Version 13, not Version 1.1. The new version was announced in August 2014. In “Intranet Search Designed for Maximum Productivity” I learned:
The first thing customers will notice is the completely redesigned user interface. It is really geared towards making search simple and fast as possible for the average user. We also introduced “one-click filtering”. If a user knows they are looking for a document, a form, or a person, they have the option to filter their results with a single click. This automatically removes search results that aren’t in the specified category. More advanced search options are available for power users, but are hidden by default. These users can choose to filter their results by specific sites, application, modified date, author, or tags. We also introduced the content of feature cards. Most of the time, users can determine if a search result is what they are looking for by the title, category, or short description. However, if there are multiple documents that are similar, a little bit more information may be necessary. Instead of requiring the user to click into the item to view more details, and navigate away from search, we introduced the concept of a feature card. Additional summary information for a search result can be displayed within the search screen, preventing the need to jump back and forth from search and content.
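The filtering described in that passage is simple to picture in code. What follows is my own minimal sketch, not Intranet Connections’ implementation: the `SearchResult` fields, the function names, and the sample records are all hypothetical, chosen only to mirror the categories (document, form, person) and the advanced options (author, modified date, tags) the quote mentions.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SearchResult:
    """Hypothetical search hit; field names are assumptions, not the vendor's schema."""
    title: str
    category: str  # e.g. "document", "form", "person"
    author: str = ""
    modified: Optional[date] = None
    tags: List[str] = field(default_factory=list)


def one_click_filter(results, category):
    """Keep only the results in the single category the user clicked."""
    return [r for r in results if r.category == category]


def advanced_filter(results, author=None, modified_after=None, tag=None):
    """Apply the optional 'power user' filters; a filter left as None is ignored."""
    kept = []
    for r in results:
        if author is not None and r.author != author:
            continue
        if modified_after is not None and (r.modified is None or r.modified < modified_after):
            continue
        if tag is not None and tag not in r.tags:
            continue
        kept.append(r)
    return kept


# Made-up sample data for illustration only.
RESULTS = [
    SearchResult("Expense Policy", "document", "hr", date(2015, 1, 27), ["policy"]),
    SearchResult("Expense Claim", "form", "finance", date(2014, 8, 1), ["expenses"]),
    SearchResult("Pat Example", "person"),
]

if __name__ == "__main__":
    print([r.title for r in one_click_filter(RESULTS, "form")])
    print([r.title for r in advanced_filter(RESULTS, modified_after=date(2015, 1, 1))])
```

The point of the sketch is that one-click filtering is a category match on results already retrieved. The hard parts of enterprise search, such as crawling, security trimming, and relevance, sit elsewhere.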
Intrigued by my Overflight system’s lack of information about Super Search, I visited the company’s Web site. I learned that the system begins at $15,000, which strikes me as a bargain. Low cost search systems often face significant financial demands as the company struggles to keep pace with the needs of customers, support demands, and the inevitable tweaks that are needed to deal with the wild and crazy nature of behind-the-firewall content.
At www.IntranetConnections.com I learned that the company makes “Intranet software made for you.” I assume that means me. I do have a 2.5 million test corpus which has been known to take days of indexing. One German company promised speedy performance, and I had to leave the system on for five days before I could run a test query. The initial crawl failed because this particular German, Lucene based system choked on Microsoft’s file locks. Yep, every Microsoft system has these types of files. I wondered, “Yo, why not provide some tools to deal with this like a !readme.txt file?”
Back to Intranet Connections.
The company delivers what I think of as a one-stop, 7-11 solution. The Web site highlights a people directory, forms, document management, Intranet Web sites, but not search. I scrolled through information about the corporate Intranet, the finance Intranet, and the healthcare Intranet, but found no direct link to Super Search.
I used my tools to examine the site and located a blog post about Super Search in an article labeled “Super Search Launch [sic] Scavenger Hunt.” There was a phrase about Super Search, “And much more”. But there was no link. I did locate a link to a story with the title “Super Search (V13.0),” dated January 27, 2015. That page did provide links to a feature guide, a support page, a webinar recording (Does anyone have webinar fatigue as I do?) and a recursive link to the blog. There is also a link to the installation guide. The guide is 300 words long and helpfully provides me with a user name and password. The guide also makes clear that I need to be deep into the Microsoft world. Mac and Linux users do not seem to be encouraged. Unlike the German outfit, Intranet Connections provides a link to information necessary to get the search engine working.
It appears that the company offers an alternative to Microsoft SharePoint. The firm, based in Vancouver, has 1,600 customers. Some have a high profile like NASA and the Mayo Clinics. My hunch is that the company has assembled / developed a suite of software. Search is included.
Other observations:
- I struggled with the “new” concept. I mean after 13 versions, how “new” is “new”?
- I had to do some poking around to get access to the fact sheet and basic information.
- The pricing of $15,000 seems to apply to the full collection of software available from the company
- I have yet to figure out how I managed to know zero about a company with a search system named “super.”
I need to improve my enterprise search information collection. Some help from vendors with more comprehensive and easy-to-find information would be helpful.
Stephen E Arnold, April 2, 2015
Pooling the Pangaea Ad Pool
April 2, 2015
In order to capitalize more on Internet ads, some of the biggest news publishers have pooled their resources to form the Pangaea Alliance, says Media Post in the article, “Premium Publishers Including Guardian, Reuters, FT Launch Programmatic Alliance.” The Pangaea Alliance includes CNN International, the Financial Times, The Guardian, Reuters, and The Economist. Combined, these publishers have an audience of over 110 million users. The Pangaea Alliance will make ad inventory available to advertisers using programmatic buying.
All participating members will pool their audiences and share their data with each other. This is very big news, considering most companies keep their customer lists a secret.
“‘We know that trust is the biggest driver of brand advocacy, so we have come together to scale the benefits of advertising within trusted media environments,’ stated Tim Gentry, global revenue director at Guardian News and Media and Pangaea Alliance project lead.”
Rubicon Project will power the Pangaea Alliance. The alliance feeds into the demand for premium programmatic advertising venues on a massive scale. The biggest problem it faces will be the customers. They might have a large combined clientele, but will they actually want to pay for these outfits’ information?
Whitney Grace, April 2, 2015
EBay Develops Open Source Pulsar for Real Time Data Analysis
April 2, 2015
A new large-scale, real-time analytics platform has been launched in response to one huge company’s huge data needs. VentureBeat reports, “EBay Launches Pulsar, an Open-Source Tool for Quickly Taming Big Data.” EBay has made the code available under an open-source license. It seems traditional batch processing systems, like those built on the widely used open-source Hadoop, just won’t cut it for eBay. That puts them in good company; Google, Microsoft, Twitter, and LinkedIn have each also created their own stream-processing systems.
Shortly before the launch, eBay released a whitepaper on the project, “Pulsar—Real-time Analytics at Scale.” It describes the what and why behind Pulsar’s design; check it out for the technical details. The whitepaper summarizes itself:
“In this paper we have described the data and processing model for a class of problems related to user behavior analytics in real time. We describe some of the design considerations for Pulsar. Pulsar has been in production in the eBay cloud for over a year. We process hundreds of thousands of events/sec with a steady state loss of less than 0.01%. Our pipeline end to end latency is less than a hundred milliseconds measured at the 95th percentile. We have successfully operated the pipeline over this time at 99.99% availability. Several teams within eBay have successfully built solutions leveraging our platform, solving problems like in-session personalization, advertising, internet marketing, billing, business monitoring and many more.”
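To put the whitepaper’s figures in perspective, here is a short back-of-the-envelope sketch. It is not eBay’s code. The function names are hypothetical, the 200,000 events/sec rate is an assumed stand-in for “hundreds of thousands,” the latency samples are invented, and nearest-rank is only one of several percentile definitions.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def allowed_downtime_minutes(availability_pct, days=365):
    """Minutes of downtime permitted over the period at the stated availability."""
    return (1 - availability_pct / 100) * days * 24 * 60


def lost_events_per_sec(events_per_sec, loss_pct):
    """Events dropped each second at the stated steady state loss rate."""
    return events_per_sec * loss_pct / 100


if __name__ == "__main__":
    # Invented latency samples in milliseconds, for illustration only.
    latencies_ms = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35,
                    40, 42, 45, 50, 55, 60, 70, 80, 95, 400]
    print("p95 latency (ms):", percentile_nearest_rank(latencies_ms, 95))
    print("downtime budget (min/year):", round(allowed_downtime_minutes(99.99), 2))
    # 200,000 events/sec is an assumption; the paper says only "hundreds of thousands".
    print("events lost per second:", lost_events_per_sec(200_000, 0.01))
```

Measuring at the 95th percentile means the slowest five percent of events, like the 400 ms outlier in the invented sample, do not count against the “less than a hundred milliseconds” claim. The 99.99% availability figure works out to less than an hour of downtime per year.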
For updated information on Pulsar, monitor their official website at gopulsar.io.
Cynthia Murrell, April 2, 2015
Mistakes to Avoid When Migrating to Office 365
April 2, 2015
Sadly, many migrations are considered failures by the organization and users, even if all the content survives. Why is this the case? Well, user experience usually suffers greatly. Redmond Magazine offers more insight and advice in their article, “5 Mistakes To Avoid When Migrating from SharePoint to Office 365.”
The article starts with a mention of the upcoming SharePoint 2016 release, and the ever evolving Office 365 before stating:
“The question for many organizations isn’t whether to stay with SharePoint — rather, IT managers are grappling with how to advance its use in the most strategic and cost-effective way possible. As organizations consider a myriad of options from Microsoft, it becomes essential to have not only a long-term strategic technology vision — but also a SharePoint migration and upgrade roadmap that’s big on efficiency and low on cost.”
It is easy to be shortsighted. And while planning is hard and cumbersome, having a long-term plan is one of the only ways to avoid some of the mistakes mentioned in the article. Stephen E. Arnold is another resource to consider when planning. His Web site, ArnoldIT.com, is a top destination for the latest news in search, including SharePoint. His SharePoint feed provides a one-stop-shop for all the latest tips and tricks to assist your organization with its SharePoint planning.
Emily Rae Aldridge, April 2, 2015
CyberOSINT Update: No Fooling
April 1, 2015
Two quick items about cyber OSINT. These are not April Fool jokes and the information is available without nag screens, registration forms, or blinking ads.
First, we have posted a five minute video that explains what cyber OSINT means. I was interviewed by award winning tech journalist Ric Manning. You can view the video at this link.
Second, we have started a new interview series. Like the original Search Wizards Speak series of interviews, the Cyber Wizards Speak interviews provide more first-person information about cyber OSINT from those working in the field. The interviews are intended for those interested in law enforcement, intelligence, and security. The first interview in the series presents the viewpoints of Luca Scagliarini, one of the original developers of the Expert System Cogito system. You can find the interview at www.xenky.com/expert-system.
Watch for upcoming announcements about more cyber OSINT videos and interviews with the principals of BrightPlanet and Recorded Future.
Copies of my new study CyberOSINT: Next Generation Access are available at www.xenky.com/cyberosint.
Stephen E Arnold, April 1, 2015
The Clever Folks at Yale Remind Us We Are Not Clever
April 1, 2015
Years ago I gave a lecture at Yale University. Very interesting experience. Everyone in the audience knew what was in my monographs about Google. Incredible. I thought I had gathered original information. Well, did I learn how dumb I was. Invigorating.
I read with an eye on the April Fool’s notation on my calendar “Yale Study: You’re Not as Clever As Your Googling Suggests.” I must admit that after I learned I was hopelessly stupid after my lecture, I knew this.
According to the write up:
Yale psychology professor Frank Keil argues that having the internet’s vast resources at your fingertips causes people to confuse their internal knowledge base (what they personally know) with their external knowledge base (knowing where to find the information they need). In short, it acts as a sort of cognitive opiate, convincing people they know more than they do even when the search results come up empty.
Yes. Proof. Not just the attitude of my audience nor their somewhat snarky questions at the meet and mingle, now there is proof.
Isn’t it wonderful to have confirmation that you, like me, are stupider than we knew?
Stephen E Arnold, April 1, 2015
GitHub: More Than Code
April 1, 2015
Short honk: Google killed off its open source software thing. GitHub seems to be the go to repository. However, GitHub is more than code. Navigate to “Le Code Civil français, sous Git.” Is it important that a code repository is growing its content pool? Nah, just a blip. There is that denial of service attack. But that is probably unrelated to GitHub’s activities.
Stephen E Arnold, March 31, 2015