What Is Vint Cerf Saying

February 16, 2009

Lidija Davis’s “Vint Cerf: Despite Its Age, the Internet Is Still Filled with Problems” does a good job of providing an overview of Vint Cerf’s view of the Internet. You can read the article here. Ms. Davis provides a snapshot of the issues that must be addressed, assuming she captured the Google evangelist’s thoughts accurately:

According to Cerf, and many others, inter-cloud communication issues such as formats and protocols, as well as inter or intra-cloud security need to be addressed urgently.

I found the comments about bit rot interesting and highly suggestive. She quite rightly points out that her summary presents only a small segment of the talk.

When I read her pretty good write up, I had one thought: “Google wants to become the Internet.” If the company pulls off this grand slam play, then the issues identified by Evangelist Cerf can be addressed in a more forthright manner. My reading of the Guha patent documents, filed in February 2007, reveals some of the steps Google’s programmable search engine includes to tackle the problems Mr. Cerf identified and Ms. Davis reported. I find the GoogleNet an interesting idea to ponder. With some content pulled from Google caches and the Google CDN (content delivery network), Google may be the appropriate intermediary and enforcer in this increasingly unstable “space”.

Stephen Arnold, February 16, 2009

Another Google Glitch

February 16, 2009

More technical woes befuddle the wizards at Google. According to SERoundTable’s article “Google AdSense and AdWords Reportings Takes a Weekend Break” [sic] here, these systems’ analytics reports did not work. I wonder if Googzilla took a rest on Valentine’s Day? The story provides a link to Google’s “good news” explanation of the problem in AdWords help. SERoundTable.com provides links to the various “discussions” and “conversations” about this issue. This addled goose sees these as “complaints” and “snarls”, but that’s the goose’s refusal to use the lingo of the entitlement generation.

Call it what you will. The GOOG has been showing technical missteps with what the goose sees as increasing frequency. The Google plumbing reached state of the art in the 1998 to 2004 period. Now the question is can the plumbing and the layers of software piled on top of Chubby and the rest of the gang handle the challenges of Facebook.com and Twitter.com? Google knows what to do to counter these real time search challengers. The question is, “Will its software system and services allow Googzilla to deal with these threats in an increasingly important search sector?” I am on the fence because of these technical walkabouts in mission critical systems like AdSense and AdWords. Who would have thought that the GOOG couldn’t keep its money machine up and running on Cupid’s day? Is there a lack of technical love in Mountain View due to other interests?

Stephen Arnold, February 16, 2009

Mysteries of Online 6: Revenue Sharing

February 16, 2009

This is a short article. I was finishing the revisions to my monetization chapter in Google: The Digital Gutenberg and ran across notes I made in 1996, the year in which I wrote several articles about online for Online Magazine. One of the articles won the best paper award, so if you are familiar with commercial databases, you can track down this loosely coupled series in the LITA reference file or other Dialog databases.

Terms Used in this Write Up

database A file of electronic information in a format specified by the online vendor; for example, Dialog Format A or EBCDIC
database producer An organization that creates a machine-readable file designed to run on a commercial online service
online revenue Cash paid to a database producer, generated when a user connected to an online database and displayed the results of a search online or output them to a file or a hard copy
online vendor A commercial enterprise that operated a time sharing service, search system, and customer support service on a fee basis; that is, annual subscription, online connect charge, online type or print charge
publisher An organization engaged in creating content by collecting submissions or paying authors to create original articles, reports, tables, and news
revenue Money paid by an organization or a user to access an online vendor’s system and then connect and access the content in a specific database; for example, Dialog File 15 ABI/INFORM

My “mysteries” series has evoked some comments, mostly uninformed. The number of people who started working in search when IBM STAIRS was the core tool is dwindling. The people who cut their teeth in the granite choked world of commercial online comprise an even smaller group. Commercial online began with US government funding in the early 1960s, so Ruby loving script kiddies are blissfully ignorant of how online files were built and then indexed. No matter. The lessons form foundation stones in today’s online world.

Indexing and Abstracting: A Backwater

Aggregators collect content from many different sources. In the early days of online, this meant peer reviewed articles. Then the net gathered magazines and non-peer reviewed publications like trade association magazines. Indexing and abstracting in the mid 1960s was a backwater because few publishers knew much about online. Permission to index and abstract was often not required and when a publisher wanted to know why an outfit was indexing and abstracting a publication, the answer was easy. “We are creating a library reference book.” Most publishers cooperated, often providing some of the indexing and abstracting outfits with multiple copies of their publications.

Some of the indexing and abstracting was very difficult; for example, legal, engineering, and medical information posed special problems. The vocabulary used in the documents was specialized, and word lists with Use For and See Also references were essential to indexing and abstracting. The abstract might define a term or an acronym when it referenced certain concepts. When abstracts were included with a journal article, the outfit doing the indexing and abstracting would often ask the publisher if it was okay to include that abstract in the bibliographic record. For decades publishers cooperated.
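For readers who have never seen a controlled vocabulary, here is a minimal sketch of how Use For and See Also references steer an indexer (or a lookup program) from a variant term to the preferred term and its related entries. The terms and relationships are invented for illustration; no actual A&I word list is reproduced here.

```python
# A toy controlled vocabulary: "use_for" maps a variant term to the preferred
# term; "see_also" lists related preferred terms. All entries are invented.
VOCABULARY = {
    "heart attack": {"use_for": "myocardial infarction"},
    "myocardial infarction": {"see_also": ["coronary disease"]},
    "coronary disease": {"see_also": ["myocardial infarction"]},
}

def preferred_term(term):
    """Resolve a candidate term to its preferred form, if the word list has one."""
    entry = VOCABULARY.get(term.lower(), {})
    return entry.get("use_for", term.lower())

def related_terms(term):
    """Return the See Also references for the preferred form of a term."""
    entry = VOCABULARY.get(preferred_term(term), {})
    return entry.get("see_also", [])

if __name__ == "__main__":
    print(preferred_term("Heart attack"))  # -> myocardial infarction
    print(related_terms("Heart attack"))   # -> ['coronary disease']
```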

The reason was that publishers and indexing and abstracting outfits were mutually reinforcing operations. The publishers collected money from subscribers, members, and in some cases advertisers. The abstracting and indexing shops earned money by creating print and electronic reference materials. In order to “read the full text”, the researcher had to have access to a hard copy of the source document or, in some cases, a microfilm instance of the document.

No money was exchanged in most cases. I think there was trust among publishers and indexing and abstracting outfits. Some of the people engaged in indexing and abstracting created products so important to certain disciplines that courses were taught in universities worldwide to teach budding scientists and researchers how to “find” and “use” indexes, abstracts, and source documents. Examples include the Chemical Abstracts database, Beilstein, and ABI/INFORM, the database with which I was associated for many years.

Pay to Process Content

By 1982, some publishers were aware that abstracting and indexing outfits were becoming important revenue generators in their own right. Libraries were interested in online, first in catalogs for their patrons, and then in licensing certain content directly from the abstracting and indexing shops. One reason for this interest from libraries (medical, technical, university, public, etc.) was that the technology to ingest a digital file (originally on tape) was becoming available. Second, the cost of using commercial online services, which made hundreds of individual abstract and index databases available, was variable. The library (academic or corporate) would obtain a password and a license. Each database incurred a charge, usually billed either by the minute or per query. Then there were online connect charges imposed by outfits like Tymnet or other network services. And there were even charges for line returns on the original Lexis system. Libraries had limited budgets, so it made sense for some libraries to cut the variable costs by loading databases on a local system.
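The budget arithmetic behind that decision is worth a sketch. Below is a rough model of the comparison a library might have run; every rate and figure in it is hypothetical and invented for illustration, not drawn from any actual Dialog, Tymnet, or Lexis price sheet.

```python
# Hypothetical figures only: a sketch of variable online costs versus a
# flat fee for loading a database tape locally.

def annual_online_cost(searches_per_year, minutes_per_search,
                       connect_rate_per_minute, network_rate_per_minute,
                       records_displayed_per_search, display_charge_per_record):
    """Variable cost: connect time plus network time plus per-record display charges."""
    per_search = (minutes_per_search * (connect_rate_per_minute + network_rate_per_minute)
                  + records_displayed_per_search * display_charge_per_record)
    return searches_per_year * per_search

if __name__ == "__main__":
    variable = annual_online_cost(
        searches_per_year=5000, minutes_per_search=8.0,
        connect_rate_per_minute=1.50, network_rate_per_minute=0.20,
        records_displayed_per_search=12, display_charge_per_record=0.25)
    local_load = 40000.0  # hypothetical flat annual license for a local tape load
    print(f"Estimated variable online cost: ${variable:,.0f}")
    print(f"Hypothetical local load license: ${local_load:,.0f}")
```

At that made-up volume the variable charges overtake the flat license, which is exactly the pressure on library budgets described above.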

By 1985, full text became more attractive to users. The reason was that A&I (abstracting and indexing) services provided pointers. The user then had to go find and read the source document. The convenience of having the bibliographic information and the full text online was obvious to anyone who performed research in anything other than a casual, indifferent manner. The notion of disintermediation expanded first in the A&I field because with full text, why pay to create a formal bibliographic record and manually assign index terms? The future was full text because systems could provide pointers to documents. Then the document of interest to the researcher could be saved to a file, displayed on screen, or printed for later reference.

The shift from the once innovative A&I business to the full text approach threw a wrench into the traditional reference business. Publishers were suspicious and then fearful that if the full text of their articles were in online systems, subscription revenues would fall. The publishers did not know how much risk these systems posed, but some publishers like Crain’s Chicago Business wanted an upfront payment to permit my organization to create full text versions of certain articles in the Crain publications. The fees were often in the five figure range and had additional contractual obligations attached. Some of these original constraints may still be in operation.


Negotiating an online deal is similar to haggling to buy a sheep in an open market. The authors were often included among the sheep in the traditional marketplace for information. Source: http://upload.wikimedia.org/wikipedia/commons/thumb/0/0e/Haggling_for_sheep.jpg/800px-Haggling_for_sheep.jpg

Revenue Sharing

Online vendors like Dialog Information Services knew that change was in the air. Some vendors like Dialog and LexisNexis moved to disintermediate the A&I companies. Publishers jockeyed to secure premium deals for their full text material. One deal which still resonates at LexisNexis today was the New York Times’s arrangement to put its content on the LexisNexis services. At its height, the rumor was that LexisNexis paid more than $1 million for that exclusive. The New York Times decided that it could do better by starting its own online system. Because publishers saw only part of the online puzzle, the New York Times’s decision was a fateful one which has hobbled the company to the present day. The New York Times did not understand the cost of the infrastructure and the importance of habituated users who respond to the magnetism of an aggregate service. Pull out a chunk of content, even the New York Times’s content, and what you get is a very expensive service with insufficient traffic to pay the overall cost of the online operation. Publishers making this same mistake include Dow Jones, the Financial Times, and others. The publishers will bristle at my assertion that their online businesses are doomed to be second string players, but look at where the money is today. I rest my case.

To stay in business, online players cooked up the notion of revenue sharing. There were a number of variations of this business model. The deal was rarely 50 – 50 for the simple reason that as contention and distrust grew among the vendors, the database companies, and the publishers, knowledge of costs was very difficult to get. Without an understanding of costs in online, most organizations are doomed to paddling upstream in a creek that runs red ink. The LexisNexis service may never be able to work off the debt that hangs over the company from its money sucking operations that date from the day the New York Times broke off to go on its own. Dow Jones may never be able to pay off the costs of the original Dow Jones online service which ran on the mainframe BRS search system and then the expensive joint venture with Reuters that is now a unit in Dow Jones called Factiva. Ziff Communications made online pay with its private label CompuServe service and its savvy investments in high margin database operations that did business as Information Access. Characteristic of Ziff’s acumen, the Ziff organization exited the online database business in the early 1990s and sold off the magazine properties, leaving the Ziff group with another fortune in the midst of the tragedy of Mr. Ziff’s health problems. Other publishers weren’t so prescient.

With knowledge in short supply, here were the principal models used for revenue sharing:

Tactic A: Pool and Payout Based on Percentage of Content from Individual Publishers

This was a simple way to compensate publishers. The aggregator would collect revenues. The aggregator would scrape off an amount to cover various costs. The remainder would then be divided among the content providers based on the amount of content each provider contributed. To keep the model simple (it wasn’t), think of a gross online revenue of $110. Take off $10 for overhead (the actual figure was variable and much larger). The remainder is $100. One publisher provided 60 percent of the content in the pay period. Another publisher provided 40 percent of the content in the pay period. One publisher got a check for $60 and the other a check for $40. The pool approach guarantees that most publishers get some money. It also makes it difficult to explain to a publisher how a particular dollar amount was calculated. Publishers who turned an MBA loose on these deals would usually feel that their “valuable” content was getting shortchanged. It wasn’t. The fact is that disconnected articles are worth less in a large online file than a collection of articles in a branded traditional magazine. But most publishers and authors today don’t understand this simple fact of the value of an individual item within a very large collection.
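For readers who prefer code to prose, here is a minimal sketch of the Tactic A arithmetic using the toy numbers from the paragraph above; the real overhead deduction was larger and the accounting far less tidy.

```python
# Pool-and-payout sketch: deduct overhead, then split the remainder in
# proportion to the share of content each publisher contributed.

def pool_payout(gross_revenue, overhead, content_share):
    """Return each publisher's payout from the pool after overhead."""
    pool = gross_revenue - overhead
    return {publisher: round(pool * share, 2)
            for publisher, share in content_share.items()}

if __name__ == "__main__":
    payouts = pool_payout(
        gross_revenue=110.0,   # toy figure from the paragraph above
        overhead=10.0,         # the actual overhead was larger and variable
        content_share={"Publisher A": 0.60, "Publisher B": 0.40})
    print(payouts)  # {'Publisher A': 60.0, 'Publisher B': 40.0}
```

The split is easy to compute and hard to explain to a publisher, which is exactly the tension described above.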

I was fascinated when smart publishers would pull out of online services and then try to create their own stand alone online services without understanding the economic forces of online. These forces operate today and few understand them after more than 40 years of use cases.


Truescoop: Social Search with a Twist

February 16, 2009

As social media keeps expanding, privacy and security are in the spotlight, especially on sites like MySpace and Facebook, where you can list home address, birthdays, phone numbers, e-mails, family connections, and post pictures and life details. That information is then available to anyone. Every time you access an application on Facebook, you have to click through a warning screen that tells you that app will be gathering your personal information. And now there’s Truescoop, http://www.truescoop.com, a Facebook tool at http://apps.facebook.com/truescoop/ specifically designed to target that personal information. Truescoop’s database of millions of records and photos is meant to help people discover personal and criminal histories. So if you find a date on Facebook, you can check them out first, right? Whoa. We all know that there are issues with the Internet and personal privacy, but how much is going too far? Although Truescoop says its service is confidential, the info isn’t – Truescoop also allows users to share discoveries with others and comment on someone’s personal profile. Time to be more cautious. Consider what information you post on your blogs and sites carefully. You don’t want some other goose to steal your golden egg.

Jessica W. Bratcher, February 16, 2009

Google and Torrents: Flashing Yellow Lights

February 15, 2009

Ernesto, writing in Torrent Freak here, may have flashed the first yellow warning signal for Google’s custom search service. You can learn about the CSE here. The article that caught my attention as I was recycling some old information for part six of my mysteries of online opinion series was “uTorrent Adds Google Powered Torrent Search.” If you don’t know what a torrent is, ask your child or your local hacker. uTorrent is now using “a Google powered torrent search engine”. Ernesto said:

While the added search is not a particular good way to find torrents, its addition to the site is an interesting move by BitTorrent Inc. Not so long ago, uTorrent removed the search boxes to sites like Mininova and isoHunt from their client, as per requests from copyright holders. However, since BitTorrent Inc. closed its video store, there is now no need to please Hollywood and they are free to link to torrent sites again.

With more attention clumping to pirated software and digital content, Ernesto’s article might become the beacon that attracts legal eagles, regulators, and folks looking to get something for nothing. I will keep my eye open for other Google assertions. Until I get more information, I want to remind you that I am flagging an article by another person. I am not verifying Ernesto’s point. The story could be the flashing signal or a dead bulb. I find it interesting either way. Google’s index has many uses; for example, looking for the terms hack, password, confidential, etc.

Stephen Arnold, February 15, 2009

SEO: Costly, Noisy, and Uninteresting

February 15, 2009

I enjoy reading the comments from aggrieved search engine optimization wizards. I know SEO is a big business. I met a fellow in San Jose who boasted of charging “clueless clients” more than $5,000 a month for adding a page label and metatags to a Web page. Great work if you want to do it. I don’t. I gave talks for a couple of years at various SEO conferences. I grew tired of 20 somethings coming up to me and asking, “How do I get a high Google ranking?” The answer is simple, “Follow Google’s guidelines and write information that is substantive.” Google makes the rules of its information toll road pretty clear. Look here, for example. Google even employs some mostly socially acceptable engineers to baby sit the SEO mavens.

I am not alone in taking a dim view of SEO. I have spoken with several of the Beyond Search goslings about methods to fool Mother Google. These range from ripping off content from other sources in violation of copyright to loading up pages with crapola that Google’s indexing system interprets as “content.” Here’s an article that I keep in my ready file when I get asked about SEO. I love the title: “Make $200K+ a Year Running the SEO Scam.” I also point to an SEO “expert’s” own tips to help avoid the most egregious scam artists. You can read this checklist from AnyWired.com here. Finally, navigate here and look at the message in words and pictures. The message is pretty clear. Pay for rankings whether the content is information, disinformation, good, bad, or indifferent.

My suggestion is take a writing class and then audit a course in indexing at an accredited university offering a degree in library science. Oh, too much work. Too bad for me because I have to wade through false drops in public Web search engines. SEO is contributing to information problems, not solving them.

In Washington, DC, a few days ago, I heard this comment, “We have to get our agency to appear higher in the Google rankings. The House finance committee uses Google results to determine who is doing a good job.” Great. Now the Federal Web sites, which are often choked with data, will be doing SEO to reach elected officials. Wonderful.

SEO is like kudzu. I’m glad I confine my SEO activities to recommending that sites use clean code, follow Google’s rules, include content that is substantive, and update information frequently. I leave the rest to the trophy generation carpetbaggers.

Stephen Arnold, February 15, 2009

HiQube Update

February 15, 2009

Beyond Search miniprofiled HiQube here in May 2008 when it was launched as a new company. HiQube emerged from Hicare Research (http://www.hicare.it or http://www.hicare.com), which had been bought by Altair Engineering in January 2007 and renamed. Here comes the shuffleboard. In September 2008, Altair and Hicare (which Altair had supposedly absorbed) announced an agreement to continue their business intelligence work independently, both using the HiQube technology.

In my opinion, what the agreement boils down to is that Altair received full ownership of HiQube while the Hicare founders got their company back, and both now have the right to all the software assets of HiQube. Altair and HiQube are moving into engineering software, and Hicare is sticking to its business intelligence product, Lillith. It seems a bit foggy, as the same vague news release was posted on all three sites. Neither hicare.com nor hiqube.com has posted a news item since the one in September (although Hicare had a Lillith update in October). This URL, http://www.altair.com/hicare, happily delivered up a 404 on February 14, 2009. But Altair seems to be chugging right along. If more substance becomes available, I will pass it along.

Stephen Arnold, February 15, 2009

Microsoft and Luck, Fate, Whatever

February 15, 2009

I read eWeek’s Microsoft Watch column by Joe Wilcox tagged “Microsoft’s 10 Unlucky Breaks.” You can read it here. The hook for the write up was Friday the 13th, an “unlucky” day. Mr. Wilcox romped through a collection of challenges, gaffes, and fumbles. You will want to read the top 10 unlucky breaks in their entirety, but I want to give you a flavor of the analysis.

For example, Mr. Wilcox points to the lack of “luck” about Microsoft’s share price. He also flagged “the Google economy”. And he identified the World Wide Web as another unlucky factor.

Let’s think about Microsoft and “luck”. I am not certain the word “luck” is the appropriate one to describe the challenges and issues identified by Mr. Wilcox. I recall reading a play in college with the interesting title Oedipus Rex. Oedipus “knew” that he was going to whack his dad and then marry his mom. As Oedipus reacted as a thinking person with some emotional triggers, he did indeed off pops and woo his mother.

The point was that good old Oedipus “knew” what would happen and then he made decisions that delivered his mama to the marriage bed. At the end of the play, poor old Oedipus figured out the consequences of his actions and embraced a life without YouTube.com. Teiresias’ comment sticks in my mind: “It’s a terrible thing to be wise when there’s nothing you can do.”

Microsoft seems to be operating within an ecosystem in which failure is predestined. Let me recast these three points in Sophoclean terms:

  1. Microsoft’s share price. This is a reflection of investor perception about the value of a company. Microsoft’s share price has been in the value stock range for years. Keep in mind that Microsoft had and still has a monopoly of the enterprise desktop operating system. The company’s revenue is north of $65 billion a year. The company is profitable. Yet there’s that share price. That’s not a matter of luck. The share price is an indicator of perceived value. Actions and decisions create the perception. No luck involved.
  2. The Google economy is not a matter of luck for Microsoft. In 1999, Microsoft hired some AltaVista.com talent but decided to follow a different path. Google hired some AltaVista.com talent and went a different way. Microsoft decided and took actions that today have significant impact on the company’s competitiveness. Google was able to overcome its early mistakes and act opportunistically to implement another company’s business model. Microsoft hired one of the Overture wizards and made decisions that keep Microsoft far behind Google. No luck. Just actions based on what Microsoft executives believed to be appropriate. This is not luck; this is decision making that did not pan out.
  3. The World Wide Web. No luck there. Microsoft smothered Netscape with “love” and then watched or sort of watched as the Google moved into Web search, then Web applications, and then into the enterprise. Now the Microsoft managers have to get out of the NASCAR grandstand and behind the wheel of a race car. No luck involved. Microsoft watched and now has an older race car which seems to defy souping up to beat the Google’s ride.

I think the unfolding of events documented in Mr. Wilcox’s good list is little more than worry beads that remind me of decisions and actions that have created the present situation. The only predestination is that Microsoft has not learned from its past. Now, like Oedipus, it is, in some sense, blind.

Stephen Arnold, February 15, 2009

Enterprise Software Pre-Buy Checklist

February 15, 2009

Enterprise software is a sector showing signs of stress. Some enterprise systems don’t work very well regardless of which vendor’s system is in a data center. Other vendors are gobbling up companies, disregarding the energy depletion that occurs when acquisitions occur. The talk is about synergy, not the performance enhancing supplements required to make the deal work. Some buyers are following the “Fire, Ready, Aim” approach to decision making. Other idiosyncrasies exist. I found the Inside ERP “Midmarket ERP Solutions Checklist”, a two page write up, interesting. You can download the paper here, but you will have to register to get the document. I don’t want to reproduce the full checklist. I do want to highlight three items and offer a comment about each. Most of the items in the checklist apply to enterprise search and content processing.

  1. Who is the owner of the project? Good question. In my experience, most organizations rotate “owners”, creating an ownerless situation that helps increase the likelihood of cost overruns.
  2. What is the specific business problem the system must solve? This basic question is usually answered in clumps of problems. The problem with clumps is that, like a shotgun blast, no single pellet will kill Bambi. A blast can wound Bambi, not get the job done in a clean, efficient, humane manner. Most enterprise search systems wound the information problem and then create havoc as the procurement team tries to chase the wounded problem to ground.
  3. Will the enterprise system adapt to change? In my experience, enterprise software expects the licensee to change. The result is an SAP type experience which grinds down the customer. Once the system is installed, who has the energy to repeat the process?

As I read this checklist, I said to myself, “That cloud computing approach looks mighty appealing.” Snag the white paper and look at the other items. Soul searching time arrived as I worked through the list.

Stephen Arnold, February 15, 2009

Google’s Radio Ad Failure

February 15, 2009

If you are interested in Google’s failures, you will want to take a quick look at “BIA/Kelsey Commentary: Fratrik on Google’s Departure from Radio.” This is a “free” consultant write up, so keep that in mind when you read the article here. The write up provides a mini analysis of how Google fumbled the ball and withdrew like Jackie Smith, the former Dallas Cowboys receiver famous for dropping a pass that would have won the big one. Google is a digital Jackie Smith when it comes to radio advertising. The most interesting comment in the write up was:

“Radio operators were never comfortable getting in bed with Google,” he said. “Among other things, the Google model asked for information that broadcasters thought was confidential. It also required the purchase of equipment. I heard the pitch when it was first launched, and I couldn’t see how this would be successful.” Why didn’t Google’s entry into the radio advertising market work out?  “The initial read three years ago was somewhat positive – they were going to use their core strengths in Internet scalability and transactional efficiencies to attract buyers and sell inventory that local stations were unable to sell. But, even with their model and their reach to many more potential advertisers, they could not sell enough to make it a profitable business line.”

The notion of “comfort” is important. When Googzilla is not comfortable with its potential customers’ comfort with Googzilla, Googzilla says, “Adios.” The Kelsey Group write up points out that some broadcasters are embracing digital ad technologies. That’s encouraging to some, but not to me.

Here’s why.

Traditional broadcasting companies are in the same boat as dead tree publishers. The demographics and the costs of their business model are like a current rushing down the Green River. If you go with the flow, you get carried along. If you try to paddle against the current, you fail, walk, or dock. Googzilla did not just exit; Googzilla wrote off an entire business sector as unable to “get it.” Trouble looms for traditional broadcasters, I fear. The Sirius XM financial challenge is a harbinger. The Kelsey Group’s article omitted this nuance, which surprised me.

Stephen Arnold, February 15, 2009
