Taxonomies: 24 Carat or Fool’s Gold
July 25, 2008
I have been bedeviled by taxonomies in the last two weeks. Vendors want to demo their systems. Clients want to find out how to make their taxonomies improve search. Even an entrepreneur showed up, gave me money, and outlined his taxonomy scheme for world domination.
Fancy tools are not needed to create a useful taxonomy.
Yikes!
The purpose of this feature is to provide some basic places to seek taxonomy lists, services, and functions. The list is not complete, and I will add to it over time.
- Dow Jones Factiva. You can get librarians to give you a hand and license software too. Click here for traditional media’s taxonomy resource.
- Interse. A Microsoft SharePoint-centric system. Click here.
- SV Technologies, now part of Sydney Plus. Legal taxonomy. Click here.
- Taxonomy Warehouse. This is the place to start. Click here to start your quest.
- WAND. Software, services, and term lists based on business units. Click here.
- Wordmap. The grandpappy of many whippersnappers’ word lists. Click here.
In my April 2008 Beyond Search study, I provide in-depth analyses of Access Innovations’ and SchemaLogic’s taxonomy management systems. You can get information about the for-fee profiles here.
This is not a complete list. If you wish to add companies, please use the comment form for this Web log.
Stephen Arnold, July 25, 2008
Google Israel: Beavering Away
July 24, 2008
Google Israel remains my pick for the smartest Google operation outside of Mountain View. My opinion may rile the whiz kids working near Seattle and annoy the heck out of the wizards in Beijing, but I’m entitled to my view.
Noa Parag’s “Google as a Start Up?” is an important essay, and I suggest you click this link to Globes Online and read his English language article here. It’s one part interview with Googler Meir Brand who is pretty good at math and one part business analysis. Don’t delay. Globes Online doesn’t claim to be an online archive, but it does claim bragging rights to its coverage of Israel’s business affairs.
So, what did I learn?
- Google Israel is operated like a start up. The company has 100 employees in two R&D centers, one in Haifa and one in Tel Aviv.
- Google Israel developed Google Trends and the overlay technology that puts text content on video clips available on YouTube.com.
- Mobile advertising is a significant opportunity.
Noa Parag’s write up underscores two points about Google. First, the company delegates and relies on email, Google’s internal online system, to keep Google Israel “down the hall”. Second, Google, despite its size, is allowing Google Israel to run with the start up ball.
Stephen Arnold, July 24, 2008
A David Outperforming Two Goliaths: Factiva, Lexis, Silobreaker
July 24, 2008
A thoughtful reader sent me a screen shot of a Compete.com report. This is the metrics company that says, “Track your rivals. Then eat their lunch.” As you may know, I don’t get too excited by third party analytics. The data have to show me a big jump, or most of the market share information is a statistical fuzz ball. When I saw this chart, I took notice.
The time period is a 12 month span, ending on June 30, 2008. The first company on the chart is Dow Jones’s “other” online service, Dow Jones Factiva. You can read more about this outfit here. This online service is so adept that its Google ad today (July 24, 2008) returns a 404 error or “File Not Found”. I clicked on the ad eight or nine times to see if it was traditional media latency or just carelessness. Answer: carelessness.
The second company charted by Compete is Lexis Nexis, one of the two monopolies in legal information. I love the Lexis tag line: “Lead with Confidence. Work with Confidence. Grow with Confidence.” Unfortunately this Compete.com chart shows Lexis following, not something to inspire confidence or trigger growth. Lexis Nexis sells online information to lawyers, but not surprisingly, lawyers have been finding out that their clients expect the legal eagles to use publicly accessible services, not the high priced services. Accordingly, Lexis Nexis has been working overtime to make Lexis spin more money. Nexis has been paddling upstream for years, and the brand has less visibility than the hair product (Nexxus) in my opinion. Lexis tried to get the hair product company to change its name. Didn’t work. Tough to confuse a sagging online service with shampoo and conditioners in my opinion.
Now, the third company is co-founded by the former McKinsey manager and intelligence officer, Mats Bjore, and the CEO Kristofer Mansson. Their company, Silobreaker, is the one with the soaring line on the chart. When a third party generates an upward curve that rises steeply, I take notice. The absolute numbers are less important than the fact that the third party’s sampling process notes a significant change. You can read my interview with Mr. Bjore here.
What’s this chart tell me?
First, Silobreaker is gaining attention at the expense of Factiva and LexisNexis. You can see that in the up and down red and green lines.
Second, Silobreaker’s upward ascent tells me that the company is getting new customers, not just sucking oxygen from the bigger guys’ base.
Third, whatever goosed Silobreaker to rapid growth took place early in 2008, and the momentum appears to be holding up. There will be a tail off in the summer when information junkies head for the beach or a trout stream.
But the useful piece of data is that the combined “people” score for Silobreaker.com is only slightly less than the combined “people” score of Factiva.com and Nexis.com.
Silobreaker may be a David. The two Goliaths, owned by traditional media companies with a track record of throwing money and people at a “problem”, are not out of the game. But if I were the product manager for either of these two companies, I would be considering one of these actions:
[a] Killing Silobreaker.com with a price war or carpet bomb marketing campaign
[b] Polishing my résumé because I am getting humiliated by a company in Sweden, which has a GDP smaller than my employer’s annual revenue
[c] Buying Silobreaker.com and taking credit for the company’s rapid growth, nifty technology, and developers
[d] Deleting my Silobreaker.com bookmark and pretending that the company does not exist.
Since I worked for the world’s smartest publisher, William Ziff, I would go for [c]. Why pretend that a giant traditional publishing company can make a product that people want, that’s sexy, and that has lift? Buy it, issue a news release, and collect that bonus.
Will Factiva and Lexis wake up? I will keep you posted.
Stephen Arnold, July 24, 2008
Microsoft: What Now for Search?
July 24, 2008
Googzilla twitches its tail and Microsoft goes into convulsions. When I was in the management consulting game, my boss, Dr. William Sommers, talked about “hyper-actions”. The idea was that a single event or a minor event would trigger excessive reactions.
Brain scan of a person undergoing excessive “excitement” and “over reaction”.
When I read the flows-like-water prose of Kara Swisher’s “Microsoft’s Latest Web Stumble: Kevin Johnson Out” and then her brief introduction to Mr. Steve Ballmer’s “Full Memo to the Troops about New Reorg”, I thought about Dr. Sommers’s “hyper-action” neologism. In my opinion, we are watching the twitch in Mountain View triggering via management string theory the convulsions in Redmond.
First, let me identify for you the points that jumped from screen to neurons in Ms. Swisher’s write ups.
- Ms. Swisher reports that Mr. Kevin Johnson was the architect behind the Yahoo buy out. I thought that the idea was cooked in Mr. Chris Liddell’s lamb-roasting pit. Obviously my sources were off base. Mr. Johnson moves to Juniper and Mr. Liddell continues to get a Microsoft paycheck. Mr. Liddell’s remarks at the March 2008 Morgan Stanley Technology Conference left me with the impression that he was being “systematic” in his analysis. Here’s one take on his remarks.
- Ms. Swisher’s run down of Microsoft’s actions so far in 2008 is excellent, and she reminded me that Microsoft bought aQuantive, a fact which had slipped off my radar. What has happened to aQuantive, for which Microsoft paid $6 billion, more than what Microsoft paid for Fast Search & Transfer and Powerset combined? Her mentioning aQuantive reminded me of those wealthy car collectors on the Speed Channel’s exotic automobile auctions. What do you do with a $1.2 million Corvette? You put it in a garage. You don’t run down to the Speedway in Harrods Creek, Kentucky, to buy a pack of chewing tobacco.
- Ms. Swisher turns a great phrase; specifically, “Microsoft has succeeded in burnishing its image as a Web also-ran and still has an uncertain path to change that.” I quite like the notion that a large company takes one action and succeeds in producing an opposite reaction. I think the Google folks would peg that as one of the Laws of Google Dynamics applied to Microsoft. For every action, there is a greater, opposite reaction that persists through time. (Ms. Swisher’s statement that Yahoo looks stable brought a smile to my face as well.)
Next, let me comment on the Mr. Steve Ballmer reorg memo, which will be a classic in business schools for years to come. The opening line will probably read, “Mr. Steve Ballmer, firmly in control of Microsoft, sat at his desk and looked across the Microsoft campus. He knew a bold strategic action was needed to deal with the increasing threat of Google, etc. etc.”
After the razzle dazzle about goals, the memo gets down to business:
We will out-innovate Google in key areas—we’re already seeing this in our maps and news search. Third, we are going to reinvent the search category through user experience and business model innovation. We’ll introduce new approaches that move beyond a white page with 10 blue links to provide customers with a customized view of their world. This is a long-term battle for our company—and it’s one we’ll continue to fight with persistence and tenacity.
MicroStrategy: TSA Swims through Data with PIMS
July 24, 2008
Government information technology makes me perspire. When a government news item renders in my news reader, I don’t pause. I want to make an exception. MicroStrategy is a very intriguing company. The fact that the firm has ramped its services to a law enforcement agency is interesting. MicroStrategy has been working with TSA since 2004. The deal signed in 2006 has saved TSA more than $100 million. The sentence that caught my attention was:
The TSA is a metrics-based organization… We [the TSA] use metrics every day to drive our decision making and quantify security effectiveness, operational efficiency and workforce management.
An example of this metrics focus is that since 2004, the TSA has used PIMS to run one million reports per year. TSA has about 12,000 users of the system. Each user prints about two reports a week. TSA is right in line with the Office of Management & Budget’s guidelines for managers to make decisions based on hard data, not hunches.
MicroStrategy, as you may know, popped in and out of the news in the 2000-2002 period. One of the items I recalled reading is here. Some former MicroStrategy professionals founded Clarabridge, a company focused on the overlap between business intelligence and content processing. You can find information about that company here.
I want to pay closer attention to MicroStrategy. Companies that can help Federal agencies save $100 million are the taxpayers’ best friends. I am interested in the MicroStrategy – Clarabridge alignment as well. Off to the library in the morning to find what I can find.
Stephen Arnold, July 24, 2008
Googzilla Swallows Telegraph Media Group
July 24, 2008
Traditional media has been my favorite whipping boy for a long time. The Telegraph Media Group may force me to rethink my critical view of companies who write stuff, print on dead trees, and employ folks at near starvation wages to get the messy artifacts to a declining readership. Silicon.com reported here that the publisher of The Daily Telegraph, Sunday Telegraph and Weekly Telegraph, as well as the telegraph.co.uk Web site, will standardize on Google Apps–word processing, mail, collaboration. The whole shooting match.
My reading of the announcement suggests that TMG did the math and calculated that it could save a bundle. More significantly, TMG lets Google worry about software, presumably so the newspapers can worry about selling adverts. The most interesting statement in the Silicon.com write up is this remark attributed to one of TMG’s managers:
We see the levels of innovation happening in the consumer space…you can actually take advantage of within the enterprise space.
Microsoft, among other traditional software companies, is going to learn first hand how fissionable material goes critical. A few things happen, then a few more things, and then the game changes. Is Google Apps ready to go critical?
My view: yes.
Stephen Arnold, July 24, 2008
Google’s Schmidt on Google as an Application Platform
July 24, 2008
TechCrunch’s Erick Schonfeld’s “Liveblogging Eric Schmidt Google Interview at Brainstorm” snagged my attention. You can read the full text of his write up here. The document is also available on the TechCrunch.com Web site.
The points of interest to me were:
- Mr. Schmidt: Because of the way technology works, all the technology companies are aggregating information about people. It is a political debate.
- Mr. Schmidt: The most interesting next-generation social apps will be mobile.
- Mr. Schmidt: The easiest [way] for us to enter the enterprise is to address high pain levels like e-mail, messaging, calendaring. We have something like a million companies using these services, mostly small. My view is that it will be a many-year process…
What did I learn?
First, Google is explicitly describing itself as an application platform.
Second, usage data is the common denominator among technology companies. Aggregation is underway and the stern wake is political process. The data collection continues, of course. Politics has to play catch up to reality.
Third, mobile is a big deal. That’s for sure. Google has been beavering away on mobile technology for at least nine years.
Finally, Google is dead serious about the enterprise. Over a span of years, Google’s winning the hearts and minds of today’s students is an important part of Google’s being pulled into organizations.
The summer of transparency is yielding some useful insights into Googzilla.
Stephen Arnold, July 24, 2008
Knol: Encyclopedia or Brain Food for the Googleplex
July 23, 2008
Jason Kincaid does a fine job summarizing the Knol launch. He characterizes Google’s new service as “the monetizable Wikipedia” for TechCrunch. There’s a lot of repetition floating around, and you will do well to read this take on the new Google service here.
I want to identify a key point in Mr. Kincaid’s write up and then offer a different view of the service. Be aware: I will be expressing an opinion based on the research I conducted for my studies about Google. My angle of attack, therefore, is different from those who are viewing the service as a Wikipedia clone or killer.
The key point in Mr. Kincaid’s essay for me was:
The big news here is that by assigning ownership and allowing authors to include AdSense ads on their articles, Google is effectively offering a monetary incentive to create good content. In theory, the best articles will get the most attention, and in turn the most revenue. Unfortunately, this plan may backfire on Google. We’re going to start seeing a flurry of articles on the most popular content…
Dead on target.
The obvious question is, “Does Google know that many people will write about hot topics?” This will create duplicate information which will be tough for a 7th grader to figure out.
My take on this concentration on hot topics is that Knol is not an encyclopedia. Knol is a mechanism to add information to the Google knowledge bases. Why does Google give away free voice directory assistance? The reason is that Google is building its knowledge base of morphemes.
Knol is another nozzle on the Google vacuum cleaner of data. What better way to approach disambiguation than to have a flow of content from known contributors, data from user behavior, and pools of information on popular subjects? The system also shakes the bugs out of the JotSpot plumbing that lurks somewhere in the bowels of Knol.
Knol is important. Knol could become an encyclopedia. But Knol pays dividends to Googzilla in many ways. In my opinion, the data Knol produces are more important than some of the other features highlighted by the many people writing about this service.
Stephen Arnold, July 23, 2008
Google Does Yahoo-Style Math
July 23, 2008
Not long ago, a Yahoo guru opined that building a Web search system cost about $300 million. I made a feeble attempt to point out that if that were indeed true, Yahoo would have accomplished the task and not collected search engines the way my mother adds to her collection of knickknacks. Similarly, Microsoft would not have bought Powerset, Fast Search & Transfer, and football-field sized data centers. I think the Yahoo math in that essay, which you can read here, was 1 + 1 = 3 bazillion. A bazillion is a technical term favored by the mathematically challenged. You can read about Yahoo math here.
Now, Fortune, sponsor of the Brainstorm Tech conference that featured the Beavis and Butt-head analysis I noted yesterday, offers another number. This time the number is $100 million and the person doing the calculation is Google’s employee #12, Marissa Mayer, high wizardette of the Googleplex.
You can read Fortune’s own take on this calculation here. I cannot do justice to the Fortune writer’s discourse. I can remark that Google News was a project that showed off some flashy Google technology circa 2001. Google News’s developer, based on what I learned in my research, was Krishna Bharat. In 2006, Google News became official. In the last seven years, Google News has expanded, making it easy for me to see what’s shakin’ in France or a score of other countries.
Google’s technology spiders a subset of important news sources. The system “discovers” the important stories. The front page or splash page is automatically generated with follow on stories appearing in a cluster of related links. There have been some hiccups. These have included major media outlets reminding Google that a newspaper accepting a feed from the mothership should not appear at the top of the news stack for a story. Google fixed this with some “human intervention”, which is now a key distinction of Google’s intelligent software; that is, a human makes sure the numerical recipe doesn’t add too much sugar and not enough salt to the output. The service also provided me with a good example to tell traditional publishers that unless some rethinking of their news operations took place, digital news would erode the traditional news base. I started yapping about this in 2001, but then and now, traditional publishers prefer to talk with my partners not me. I guess I’m too blunt for the white shoes and panama hat set on sultry summer days. Gee, the truth is the truth and the earnings of Time Warner (Fortune’s owner) support my 2001 insights I suppose.
Now to the magic number.
Google News is a demo; it is not a revenue producer. I provide some information about the technology Google uses for this service in The Google Legacy and Google Version 2.0. If you want to know more, click here. The technology in Google News is darned impressive and generally unappreciated by users and competitors alike. By the way, do you wonder what Mr. Bharat has been doing since 2003, the most recent technical paper on his Google official biography page? (He’s been busy, but he is remiss in updating his research activities. For a peek, check out US20080097833.)
How do I know this? Do you see any ads on this page? As my Wall Street pals tell me, “Google’s revenue comes from ads.” So, no ads means no revenue on public facing Google pages. There’s not even a link to Google Enterprise that I could find.
“Why,” I ask, “are there zero ads on Google News?”
If my research is correct (which it may not be), the reason may tie back to those traditional publishers. Not even Google can figure out how to divvy up a $0.83 click among the contentious, cantankerous publishers whose headlines are presented on the Google News page. A misstep can trigger another $1 billion lawsuit. Not even Google wants more of these old media new media face offs. I read here that Google even has a deal with the Associated Press, a forward looking outfit if there ever was one. Too many lawyers undermine one’s ability to do math, I have heard.
After seven or eight years, Google News presents an ad free face to me. Your mileage may differ, of course. And, according to the Fortune write up, Google wants to handle consumer health records in the same way; that is, traffic generation; to wit:
Mayer said that’s the way Google thinks about monetizing digital consumer health records. The company is one of many working to make it convenient for people to store and access their medical records online, a move that proponents say will improve health care by empowering consumers. But Mayer said that after some internal discussions, Google brass decided not to put ads on health record pages.
I think the strategy is for Google News and the other Google services to pull traffic to Google’s information amusement park. Overall traffic is a net benefit as long as Google can manage the costs associated with scaling to handle the hordes of visitors, buyers, and gawkers.
What strikes me as weird is why Google feels compelled to use Yahoo math–that is, making up a number–to justify its 10-year-old business strategy. The top line revenue and net profit reveal the underlying math and the wisdom of Google’s approach. What’s $100 million to Google? I can’t even get my Bowmar calculator to calculate the percentage because the zeros overflow the display.
Yahoo math from Google I don’t need. Agree? Disagree?
Stephen Arnold, July 23, 2008
Privacy Flash Point
July 23, 2008
When I speak with professional groups, I dance around the issue of “smart software”. The idea is that scripts do more than handle situations as a zero or one, white or black, on or off. The computers are binary, but programmers have numerous methods for helping a script deal with ambiguity.
One of the ways is to know what a single user or a group of users who share characteristics actually do. Looking at what a person does nine times out of ten makes it easy to tell a script, “When this person takes this action, you take that action.”
The key to making this type of “smart software” work is data. The more data one has about an individual or a group of like-acting individuals, the easier it is to cook up simple rules. The script runs the actions. When a decision is needed, the script looks at the usage data and makes a decision.
Endeca can integrate saved queries into a work flow. When the sales person reaches a particular point in a selling script, the Endeca system runs the query and displays the information based on a combination of rules and looking at some data about what sells, what product returns the largest commission, or some other factor.
Again, the key is rules and data.
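For readers who want to see the “nine times out of ten” idea in miniature, here is a rough Python sketch. It is my own toy illustration, not how Endeca or any other vendor does it. The function name `learn_rules`, the 0.9 threshold, and the made-up usage log are all assumptions for the example.

```python
from collections import Counter, defaultdict


def learn_rules(usage_log, threshold=0.9):
    """Turn (trigger, response) observations into 'when X, do Y' rules.

    A rule is kept only when a single response follows a trigger in at
    least `threshold` of the observed cases (hypothetical cutoff).
    """
    responses = defaultdict(Counter)
    for trigger, response in usage_log:
        responses[trigger][response] += 1

    rules = {}
    for trigger, counts in responses.items():
        response, hits = counts.most_common(1)[0]
        if hits / sum(counts.values()) >= threshold:
            rules[trigger] = response
    return rules


# Hypothetical usage data: nine times out of ten this user files a
# report after opening it; inbox behavior is a coin toss.
log = [("open_report", "file_report")] * 9 + [("open_report", "delete_report")]
log += [("open_inbox", "reply")] * 5 + [("open_inbox", "archive")] * 5

rules = learn_rules(log)
print(rules)
```

With this log the script keeps a rule for opening a report and declines to guess about the inbox. More data about the user tightens or loosens which rules survive the cutoff.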
The rules are tedious to set up and test. But once in place, the real nourishment for smart software is data. But most users are themselves unaware of what actions they take when using a computer. If I remind a user that email can be analyzed for syntactical fingerprints, friends, and insight into the preferences of the user, people are shocked. This amazes me.
Closed doors–that is, privacy–are tough to live behind in an online world.
I was thinking about this issue and privacy because the current issue of KMWorld, a tabloid published by Information Today, arrived via snail mail this afternoon. My monthly column was no more. In the July August 2008 issue, my column had become a feature story, “Cloud Computing and the Issue of Privacy”, pages 14, 15, and 22. The highlight of the story is a graphic from one of Google’s patent documents showing an exemplary data model for usage information about an individual or a group of users. The idea is that when a person can be assigned to a cluster based on some discovered similarity, probability methods make it trivial to “predict” what most members of the group will next do or prefer. This is not magic, but it is complicated and requires a honking big computer to work when there are lots of people and many groups.
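The cluster idea can be sketched in a few lines of Python. This is a toy under my own assumptions, not the data model in Google’s patent document: I use simple trait overlap to pick a cluster and a frequency count as the “probability”. The names `predict_next`, `clusters`, and the sample traits are hypothetical.

```python
from collections import Counter


def predict_next(user_traits, clusters):
    """Assign a user to the cluster sharing the most traits, then return
    that cluster's name, its most common next action, and the share of
    cluster members who took that action."""
    def overlap(cluster):
        return len(user_traits & cluster["traits"])

    best = max(clusters, key=overlap)
    counts = Counter(best["next_actions"])
    action, hits = counts.most_common(1)[0]
    return best["name"], action, hits / sum(counts.values())


# Hypothetical clusters built from observed usage data.
clusters = [
    {
        "name": "legal researchers",
        "traits": {"lawyer", "case_law", "weekday"},
        "next_actions": ["search_cases", "search_cases", "print"],
    },
    {
        "name": "news junkies",
        "traits": {"headlines", "mobile", "weekend"},
        "next_actions": ["read_news", "share", "read_news"],
    },
]

name, action, share = predict_next({"mobile", "headlines"}, clusters)
print(name, action, round(share, 2))
```

The toy runs on a laptop. The honking big computer comes in when the clusters number in the millions and the traits have to be discovered rather than typed in by hand.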
To prepare for the one or two emails I get when my for-fee articles appear, I thought it might be a good idea to see what’s online. I know a little about Google but I don’t know much beyond my little area of expertise that I hone against the whetstone of Kentucky culture.