Semantic Search Explained
June 19, 2010
I get asked about semantic search once a day, often more frequently. I usually say, “Semantic search means software can figure out what something is about.” If that does not do the trick, I trot out the more detailed explanation Martin White and I put in our 2009 study “Successful Enterprise Search Management.”
I neglected to write about “10 Things that Make Search a Semantic Search.” The information in that write up by the founder of Hakia, Dr. Riza C. Berkan, is useful. If you have not reviewed the write up, you will want to put this reading on your To Do list.
I don’t want to reproduce the full list. Navigate to the original article and work through it. I do want to highlight three points with which I agree.
First, a semantic search can handle synonyms. Languages are like roads in Kentucky, full of potholes. Disambiguation and figuring out synonyms are two important tasks. Their presence signals a semantic component in the content processing system.
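To make the synonym point concrete, here is a minimal sketch of query expansion in Python. The synonym table, the sample documents, and the function names `expand_query` and `search` are all hypothetical and invented for illustration. A production system would draw its synonyms from a thesaurus or a statistical model and would also need to handle disambiguation, which this sketch ignores.

```python
# Hypothetical synonym table for illustration only; a real system would
# derive this from a thesaurus or a learned model.
SYNONYMS = {
    "car": {"automobile", "auto"},
    "automobile": {"car", "auto"},
    "auto": {"car", "automobile"},
}

# Made-up sample documents.
DOCS = [
    "Used automobile prices fell in June",
    "Kentucky roads are full of potholes",
]


def expand_query(query):
    """Return the query words plus any known synonyms, lowercased."""
    terms = set()
    for word in query.lower().split():
        terms.add(word)
        terms |= SYNONYMS.get(word, set())
    return terms


def search(query, documents):
    """Return documents that share at least one word with the expanded query."""
    terms = expand_query(query)
    return [doc for doc in documents if terms & set(doc.lower().split())]


if __name__ == "__main__":
    # A plain keyword match on "car" would miss the first document.
    print(search("car", DOCS))
```

With the expansion step, a query for “car” finds the document that only says “automobile.”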
Second, a search system that can present a snippet or a highlight of the key sentence or paragraph is quite useful. I find that some snippeting technology is designed to meet the needs of folks selling ads. The snippeting function I want works with the honesty and zeal of a prisoner who is due to be released from prison in two days.
Finally, a user can enter a query without having to formulate a query with Boolean operators or special instructions such as CC=. Systems have to be smart but not biased or tilted for the benefit of advertisers. Objectivity is important in delivering this type of query support. Alas, I think this is a difficult goal to achieve. Humans are humans and often prefer clicking the ad for a vacation rental to running a query, perusing results, and then making an informed decision.
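One simple way to think about an honest snippet is to pick the sentence that shares the most words with the query. The Python sketch below does only that. The function name `best_snippet` and the sample text are assumptions made up for illustration, and commercial snippeting systems are considerably more elaborate.

```python
import re


def best_snippet(text, query):
    """Return the sentence in text that shares the most words with query.

    Ties go to the earliest sentence. This is a naive word-overlap
    heuristic, not a description of any vendor's method.
    """
    query_terms = set(re.findall(r"\w+", query.lower()))
    # Split after ., !, or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    def overlap(sentence):
        words = set(re.findall(r"\w+", sentence.lower()))
        return len(query_terms & words)

    return max(sentences, key=overlap)


if __name__ == "__main__":
    sample = (
        "Google scans books. "
        "The Austrian library holds 400,000 volumes. "
        "Money matters."
    )
    print(best_snippet(sample, "Austrian library"))
```

The scoring function contains no notion of ads or sponsors, which is the point: the snippet reflects the query and nothing else.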
A happy quack to Hakia for the post.
Stephen E Arnold, June 19, 2010
Freebie
Google Austria Book Scanning Deal
June 17, 2010
Google keeps on plugging away with its book scanning project. I have been one of the people who think that Google has flowed into a vacuum. The sniping and legal flaps have not taken my eye off the ball, however. Google wants to scan, ingest the content, and make money from its effort. I think a big part of the book scanning effort is directed at Google’s knowledge base initiative. The more content processed by the Google, the better able its numerical recipes are at making decisions. The making money part is important but not the whole story.
Google, according to India’s Economic Times, has a deal with Austria. The story “Google to Scan 400,000 Austrian Library Books” said:
Austria’s national library said on Tuesday it has struck a 30-million-euro deal with US Internet giant Google to digitize 400,000 copyright-free books, a vast collection spanning 400 years of European history. Johanna Rachinger, the head of the ONB library, hailed what she called an “important step,” arguing at a news conference that “there are few projects on such a scale elsewhere in Europe.” The Austrian library project concerns one of the world’s five biggest collections of 16th- to 19th-century literature, totaling some 120 million pages, the ONB said in a statement.
Important points. This is a 30 million euro deal. The content is non exclusive. The library solves a preservation problem along with some access and money issues.
Look ahead 10 years. When you want a book from this collection, will you use Google or some other service? Google is aiming for the long haul and a much bigger play. What about the “regular” scanning activity? Just keeps on clicking along in my opinion.
Stephen E Arnold, June 17, 2010
Freebie
McKinsey to Squeeze the Azure Chip Consultants in Social Media
June 14, 2010
The azure chip crowd is going to have to up their game. I read “Nielsen Partners with McKinsey to Create Social Media Consultancy” and chuckled. The blue chip firms don’t twitch and jump. Their business is predicated on 80 percent plus repeat business from Fortune 500 firms. The new work comes from the churn and drama of business activities. The azure chip crowd usually lacks the luxury of the blue chip firms’ momentum.
Social media intelligence is one of those odd little markets which overlap traditional competitive analysis, the whizzy new “voice of the customer” baloney, and the Google-like “big data” approach to decision making.
According to the write up:
Research firm Nielsen has partnered its social monitoring service BuzzMetrics with management consultancy McKinsey to form NM Incite, a social media consultancy…In January, Nielsen announced it would extend its partnership with Facebook to measure the impact of online branding ads on Facebook.
What happened to comScore and other firms with a “core competency” in social media metrics? McKinsey has chosen its partner for the first dance in the social media waltz. What will the azure chip crowd do? Probably launch a Twitter campaign and cook up more white papers. The big money jobs will now be more exciting for the azure chip folks. Just my opinion. Ah, don’t know what makes the blue chip consulting firms different? There’s a useful data point, gentle reader. Some tips are here.
Stephen E Arnold, June 14, 2010
Freebie
Morgan Stanley Wants You to Churn Your Investments
June 13, 2010
Short honk: The excitement is back. Forget the fire fights among Apple, Google, Microsoft, and others. Forget the lousy economic outlook. Forget the oil spill. Remember the good old pre crash days. To document this moment in time, navigate to “Mary Meeker’s Amazing Internet Presentation.” You can view the great news here. Churn those holdings of yours now. Yes, right now. Those data are hot, objective, and darn near as solid as anything Wall Street has to offer its partners. Amazing for sure.
Stephen E Arnold, June 13, 2010
Freebie which is a word that Morgan Stanley does not use with high frequency.
Quote to Note: Management Excellence at AOL and Time Warner
June 13, 2010
This quote to note appeared in the Daily Telegraph’s “Yahoo! Shakes on a New Type of Partnership.” If the statement is accurate, it helps me understand the AOL and Time Warner way.
Here’s the passage that made me honk:
A former AOL executive said the best line to me this week – which summed up the crazy technology acquisition culture perfectly: “Every business we [AOL] ever bought we destroyed – until we bought Time Warner and they destroyed us.”
The write up provides some insight into Yahoo, but Yahoo’s track record in acquisitions is notable for the number of business school analyses each has triggered in my opinion.
Stephen E Arnold, June 13, 2010
Freebie
Quote to Note: Data Pig
June 10, 2010
I don’t use an iPhone. Yes, I pay AT&T for one of my broadband landlines. Yes, I have an AT&T landline. I am not sure if I sympathize with people who make a conscious choice to purchase services which can impose punitive variable pricing. Maybe most people don’t remember the pre-Judge Greene days when a person rented a Western Electric telephone device and never owned it? I was at the Piscataway IBM facility when the order was enforced, with one part of the building becoming Bellcore and the other part remaining Bell Labs. The object of the company was to make money, pay for the fancy stuff like PICS, and build phones you could toss from the second floor of the Western Electric building confident that the clunky thing would work after the 26 foot fall to the concrete below.
Money.
When a telephone carrier with the “old” AT&T DNA offers a deal, I chuckle. I used to put on my Young Pioneers hat, but Tess ate it. Sigh. Memories of a monopoly don’t fade quickly.
Point your browser at “AT&T Learns Exactly The Wrong Thing About Data Usage.” Agree or disagree with the write up. What I noted was:
AT&T says that 65% of its users use less than 200 megabytes per month; a whopping 98% use less than 2 gigabytes. (NYT) AT&T looked at these numbers and concluded it was time for tiered pricing; time to soak these “data pigs”.
Now that’s a quote to note: “data pigs.” You can take the old AT&T out of the phone business but you can’t alter that DNA easily. Ah, “data pigs”.
Stephen E Arnold, June 10, 2010
Freebie, unlike a long distance call in 1950 when a ringy dingy to Brazil was a major event. Remember differential pricing by class of customer? Ah, remember.
Another Upstart Nation State Bans Google
June 10, 2010
I may have to fire up my old copy of XyWrite III+, create a template, and assign standing text to an Alt key. I read “Turkey Bans Use of Google, Services.” If I weren’t so busy with my World Cup paperwork, I would create a chart with such categories as “banned”, “sued”, “threatened”, and probably a couple of other categories.
The most recent nation state to get frisky with Google is Turkey. Long viewed by the US as a cheerleader, Turkey seems to be willing to make pals with certain countries which are annoyed with the United States.
Here’s the passage I noted:
In an official statement, Turkey’s Telecommunications Presidency said it has banned access to many of Google IP addresses without assigning clear reasons. The statement did not confirm if the ban is temporary or permanent….The banned IP addresses include translate.google.com, books.google.com, Google-analytics.com, tools.google.com and docs.google.com.
I thought companies had an obligation to shareholders to maximize returns. Getting in hot water in countries where there are potentially lucrative markets strikes me as losing an opportunity to make money. After the World Cup, I will work through the countries in which Google faces push back. Fascinating that a single company can become the focal point for frequent hassles with nation states.
Maybe this is a trend, not an outlier? A good question in my opinion: “Who is at fault? The country, a politician, a government, a company?” I can hear my seventh grade teacher now: Discuss in less than 250 words. What’s next? Educational institutions?
Stephen E Arnold, June 10, 2010
Freebie
The UX Crowd Does Harvey the Rabbit
June 9, 2010
I like the blinking dot interface. The 20 somethings poke with fingers. Sigh. The user experience chatter goes unheard by me. I find the cartoons, the mini motion pictures, and cluttered “assisted navigational aids” annoying. The future of interfaces is certainly less cluttered. Point your browser at “‘Imaginary’ Interface Could Replace Real Thing.” And, for a bonus, you get the “real thing”, a phrase much loved by some azure chip poobahs. The point of the write up is that the interface is – well – imaginary. For me, the key passage was:
Researchers are experimenting with a new interface system for mobile devices that could replace the screen and even the keyboard with gestures supported by our visual memory.
I have a gesture in mind.
Stephen E Arnold, June 9, 2010
Freebie.
Evidence of an Open Source Boomlet?
June 8, 2010
I read “What Is Data Science?” with interest. This is a long O’Reilly Radar essay by Mike Loukides. The write up has a message that is going to be of interest to those looking for the next big thing and to some giant companies with data and not much leverage from that asset. The key point in the write up is that there is money to be made by converting data into products. Note that this is not the tired old data-information-knowledge mantra. The quasi-intellectual approach to making money is not sufficiently pragmatic for these economic times. The key is to take data and make a product. When I read the essay, I thought about various online vendors who are doing this now. Candidates for poster children include Google, Facebook, and Yahoo along with lots of other folks. Statistics Canada once signed a deal with a vendor to crunch the StatsCan stuff into more saleable products.
But for me the most interesting item in the write up was a chart that showed the number of job listings for a couple of open source products; specifically, Hadoop and Cassandra.
You can see the lines trending upwards.
My take: there are some tangible data that indicate open source software in the data management sector is gaining traction. I am not sure what this means for other open source software. But I found this factoid interesting.
Stephen E Arnold, June 8, 2010
Freebie
Mobile Search and Apple
June 8, 2010
Short honk: Fascinating chart. Navigate to “iPad Web Usage Passes iPod.” The chart is tough for 65 year old geese to read. The message conveyed is that Android’s owners lag iPod and iPad Web surfing usage. Google may be selling 65,000 Android devices via its partners each day, but Apple’s two million iPads are sucking Web content. The startling factoid is that the iPad accounted for more Web content than Apple’s iPod. If the data are correct, Google needs to whip its pony.
Stephen E Arnold, June 8, 2010
Freebie