Forbes on Powerset

June 19, 2008

Forbes Magazine has an interesting article about Powerset, Chris Taylor’s “The Next Search Frontier: Just Ask Your Question”. (I often have difficulty locating information on the Forbes Web site. Sometimes I grow frustrated with the pop up ads and page latency, so snag this article quickly.)

The key point in the article for me was this statement:

Powerset’s main asset is a partnership with PARC, the Palo Alto research center that incubated the computer mouse and the laser printer. In 2005, Pell discovered that PARC researchers had been working for 30 years on turning English into software code. Pell promptly licensed PARC’s research and hired the top scientists in the field, starting with Powerset co-founder Lorenzo Thione.

Xerox PARC (now simply PARC — it’s officially a subsidiary company of Xerox) has been an innovator for many years. But my experience has been that some of its better ideas are difficult to commercialize and convert into major revenue winners. Inxight Software, a PARC spin out, gained some market success and was acquired by Business Objects, which in turn was acquired by SAP. Powerset’s tie up with PARC will be another opportunity to convert ideas into revenue.

You can test drive Powerset here. Information about PARC is here.

I am accustomed to formulating queries with Boolean ANDs and NOTs. Typing questions is too much work for me. With the average query creeping up to 2.3 words on major public search engines, the idea that a well formed question will revolutionize search seems unlikely.

Natural language processing, like semantic and linguistic mechanisms, may be best suited for work behind the scenes, not in front of the user.

Stephen Arnold, June 19, 2008

Google’s Udi Manber on Search Quality

June 18, 2008

The Googlers were out in force, chipper and explaining, to the 150 or so attendees of the Gilbane Group’s annual content management conference.

The key reason that drives Google forward, asserted Dr. Manber, is that users have rising expectations. Google, therefore, must use smart software, innovate, and scale. In 2007, Google tweaked its PageRank algorithm more than 450 times. Google works to keep bureaucracy at a minimum, empowering engineers to make necessary changes.

PageRank changes are not based on hunches. Extensive data analysis underlies tweaks.

The 21st century, asserted Dr. Manber, is about understanding people; that is, social interactions. Starting points for analysis are user intent. Queries are diverse, for example “hairstyles for ears that stick out” or “i’m going to win the lottery”.

Like other search systems, Google looks terms up in its index. Then Google uses other functions in order to determine intent; for example, time, place, context, and user information from “individualized Google,” if available.

You can see this in action. Run the queries “GM cars” then “GM food”. Google returns different results for each query even though the acronym GM appears in each query.
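To make the index lookup concrete, here is a minimal sketch of a toy inverted index. It is not Google's method; the documents, the `build_index` and `search` names, and the Boolean AND matching are all assumptions made up for illustration. It shows only why the second term in “GM cars” and “GM food” steers the shared acronym toward different documents.

```python
from collections import defaultdict

# Hypothetical mini-corpus; these documents are invented for illustration.
DOCS = {
    1: "gm cars trucks general motors dealers",
    2: "general motors gm cars recall",
    3: "gm food genetically modified crops",
    4: "genetically modified gm food labeling",
}


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of documents containing every query term (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return []
    # .get() avoids creating empty entries for unknown terms.
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


INDEX = build_index(DOCS)
print(search(INDEX, "GM cars"))  # [1, 2]
print(search(INDEX, "GM food"))  # [3, 4]
```

A production system layers ranking and the intent signals mentioned above (time, place, context) on top of a lookup like this; the sketch stops at the lookup.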

User expectations are now growing quickly. Google, therefore, must innovate and continue to scale.

Some development features were referenced, but these were not active in “regular” Google when I ran these sample queries. The presentation was well received and triggered a flurry of questions about site search and universal or federated search. Attendees applauded enthusiastically. The Googley magic was working today.

Stephen Arnold, June 18, 2008

The LinkedIn Bet: $1 Billion Social Valuation

June 18, 2008

The chatter about the LinkedIn valuation of $1 billion is choking my trusty RSS readers. The voice that reached me was Om Malik’s; his comments are here. The essay is “Is LinkedIn worth $1 Billion.” Mr. Malik makes two points that warrant highlighting in the midst of the cacophony:

  • The notion that smart money has picked a winner may be suspect.
  • The per subscriber valuation is generous.

Mr. Malik nails this financial optimism as out of step with the company’s performance.

There are three other factors that Mr. Malik’s must-read essay surfaced in my mind:

  1. Social networks can be gamed. My experience with LinkedIn suggests that the controls on abuse are not as fine-grained as they should be.
  2. The layers of fees are annoying to me, and I suspect that others will find, as I do, that invitations often carry along unwanted obligations.
  3. In a deteriorating economy, referrals are indeed important. However, LinkedIn often wobbles into probes for intelligence in the form of questions from people whom I don’t know and into thinly disguised marketing pitches.

These three factors when combined with Mr. Malik’s analysis suggest an optimistic valuation. “Social” is hot. I am not convinced that today’s flag carriers will be tomorrow’s winners.

Stephen Arnold, June 17, 2008

Gilbane Chats Up a Silly Goose: The Arnold Interview

June 18, 2008

On Wednesday, June 18, 2008, I will be interviewed in front of an audience completely unaware of why a fellow from Harrod’s Creek, Kentucky, is sitting on a stage answering questions. No one is more baffled than I. Based on my knowledge of the big city, I anticipate confusion, torpor, and indifference to my comments.

In this essay, which will become available on June 18, 2008, the curious will have a reference document that summarizes my thoughts on issues about which I may be asked. There has been no dry run for this interview. The last one in which I participated–the Associated Press’s invitation-only gathering last year–left the audience with little appetite for food. Some found the beverage table a more welcome destination.

Anticipated Question 1: What’s “beyond search” mean?

In research conducted by me and others, about two-thirds of the users of an enterprise search system are dissatisfied with that system. “Beyond search” implies that we have to move to another approach because what is now available in the organizations that I and the other researchers have investigated is not well liked. Due to the cost of some systems, annoying two-thirds of the users is tantamount to getting a D or an F on a report card.

Anticipated Question 2: What’s “behind the firewall search” mean?

I wrote about the search elephant here. Many different functions involving information access are made available to an employee, contractor, or authorized user. The idea is that “behind the firewall search” is not public; an organization makes it available to a select group of users. The “search elephant” refers to the many different ways in which search is understood and perceived within an organization.

Anticipated Question 3: Why are there so many search vendors and more coming each day?

There is a belief that existing systems are not tapping into what I have estimated to be a $2.5 billion market for information access in the enterprise. Entrepreneurs and people with money look at Google and think, “We should be able to make gains like that in the enterprise market.” I also think that the market itself is trying to figure out the search elephant. Buyers don’t know what is needed. When entrepreneurs, money, and confused customers with severe information access problems come together, we have the type of market place that exists today.

Anticipated Question 4: What about Microsoft and Fast Search & Transfer?

I understand that it is business as usual at Microsoft and Fast Search. For Microsoft, this means trying to get 10,000 motorboats to go in roughly the same direction. For Fast Search, the company continues to license its Enterprise Search Platform and service customers. There are many bits of grit in the working parts where Microsoft and Fast Search mesh. It is too soon to tell if these inhibitors are trivial or whether the machine will sputter, maybe stop. What I tell people is to ignore the Microsoft-Fast Search tie up, and get a solution for a SharePoint environment that works. There are good choices ranging from a lower cost solution like dtSearch to a competitively priced system from Coveo, Exalead, ISYS Search Software, or another Microsoft Certified vendor.

Anticipated Question 5: What’s the impact of the Google Search Appliance?

Many vendors will tell you that Google has delivered a second-class system. That’s not exactly true. With the OneBox API, Google has a very solid solution. The impact is that Google has about 10,000 enterprise customers. These are sales made, in many cases, under the noses of incumbent vendors. Google’s a player in the enterprise market and a serious one. I have uncovered one impactful bit of research at Google that could–note, I said, could–change the search landscape. I have tried to ask Google about this development, but the GOOG thinks I do not merit its attention. Too bad for me, I guess.

Anticipated Question 6: What’s the impact of text processing, semantic search, and other new technologies on enterprise search?

These are hot terms that will open doors. Some vendors will make sales because of their ability to mesh trendy concepts with more traditional search.

Stephen Arnold, June 18, 2008

Mark Logic: Content Applications Fuel Company’s Growth

June 17, 2008

Mark Logic provides information access and delivery solutions that accelerate the creation of content applications. Customers across a range of industries rely on Mark Logic to repurpose content and deliver that information through channels. Some vendors describe this suite of functions as an enterprise publishing system.

The company has been growing at a furious pace. Dave Kellogg, a former Business Objects executive, said:

Mark Logic… is a database management system built to natively manage XML documents and optimized for handling vast numbers of them (I mean hundreds of terabytes) with high performance. It’s a read/write system. It has a query language (XQuery). It has transactions and logging. You can use it, by itself–without the need to bolt it on to either a relational database or an application server–as the basis for content applications.

The company’s customers include Oxford University Press, O’Reilly Media, and the Congressional Quarterly. The company builds relationships with its customers. Mr. Kellogg says, “Our philosophy is to sell solutions to problems and avoid the stereotypical ‘drive-by’ technology sale, where companies dump the software in the parking lot and leave.”

The full interview appears as part of the Search Wizards Speak series published by ArnoldIT.com. You can read the transcript of the interview with Mr. Kellogg here. The index to the full series of interviews is here.

Stephen Arnold, June 17, 2008

Google: From the Disruptor to the Disrupted

June 17, 2008

I am a fan of ReadWriteWeb, and I found the essay by Bernard Lunn quite interesting. Mr. Lunn has identified the “11 Search Trends that May Disrupt Google.” ReadWriteWeb.com makes it easy to locate its articles, so you can track this story down easily.

The list covers factors that may be moving Google into a different role: from disruptor to a company that is itself disrupted. On the whole, I agree with the ReadWriteWeb analysis. Of particular importance is the notion of “start ups using a new outsourced infrastructure.” Powerset is an example of a company taking a different approach. I have heard that Powerset makes use of Amazon Web Services, and I think this is an important aspect of the company to monitor if my information is accurate.

The other point that I found on target is the impact tagging may have upon Google. Not long ago Vivisimo announced that its system made it possible for a user to add a tag–that is, index term–to an item in a result list. Tagging is becoming one of the everyday activities for those who write Web logs. The Semantic Web has been slow in coming, but I think the “social tagging” function may be providing some opportunities that search engines, including Google, have yet to exploit fully.

I would add one other point to the factors that are likely to influence Google–the challenge of size. Google is now 10 years old, and it is getting big enough to encounter the friction that plagues any large organization. Google, therefore, changes more slowly even though certain innovations make users gasp. A competitor can exploit Google’s own inertia but that competitor must take care to stay clear of Google’s momentum.

A happy quack for a useful and thought provoking write up, ReadWriteWeb!

Stephen Arnold, June 17, 2008

Microsoft’s Web Search Strategy Revealed: The Scoble Goldberg Interview

June 16, 2008

Online video does not match my mode of learning. Robert Scoble, a laurel leaf wearer in the new world of video and text Web logs, conducted an interview with Brad Goldberg.

The interview is part of the Fast Company videos, and it is available here. The interview is remarkable, and I urge you to spend 31 minutes and listen to Brad Goldberg, General Manager of Microsoft Search Business Group.

The interview reveals useful information about the time line for Microsoft to capture market share from Google and Microsoft’s ideas for differentiating itself from Google in Web search.

Surprisingly, there were no references that I could pick up to enterprise search, nor was there any indication that Mr. Goldberg was aware of the Fast Search & Transfer Web search technology which was quite good. As you may know, Fast Search withdrew from Web search in 2003, selling its AllTheWeb.com Web index to Overture. Yahoo gobbled Overture and used bits and pieces of the Fast Search technology recently. The “auto suggest” feature is still available from Yahoo’s AllTheWeb.com site. My tests suggest that today’s AllTheWeb.com uses the Yahoo Search index built by the Slurp crawler and the Fast Search technology for some of the bells and whistles on the site. The news search function is actually quite useful. If you are not familiar with it, you can try it here.

During the interview, Mr. Goldberg uses some sample queries to illustrate his claims about Live.com’s search performance, precision, and recall. I ran the “Paris” query on each of these systems, and I ran comparative queries on this Web log as well. After the interview, I took a look at the 2005 analysis of mainstream Web search systems here so I could gauge how much change has taken place in the last three years. Quick impression: Not much. You may want to perform similar as-you-listen tests. It is easy to see what search system responds most quickly, how the search results differ, and the features that each system makes available.

Three points in Mr. Goldberg’s remarks stuck in my mind. I want to mention each of these and then offer a few observations. Judging from the edgy comments to some of my essays, I want you to know that you may not agree with me. That’s okay with me. Please, use the comments section to set me straight. Providing some facts to go along with your push back is helpful to me.

Key Points for Me

1. Parity or Microsoft’s Relevancy Is As Good as Google’s

Mr. Goldberg asserted that the major search services were at parity in terms of relevance and coverage. I found this notion somewhat difficult to comprehend. The data about Web search market share undermine any argument about parity, which means, according to my understanding of the word, “equality” or “equivalence”. I have had difficulty interpreting comments by whiz kids before, so I may be off base. My thought was that Google continues to gain market share at the expense of both Microsoft and Yahoo. The dis-parity is significant because Google, according to data mavens, accounts for 60 percent or more of user queries in the US. In Europe, the market share is higher. US search systems do not hold commanding leads in China, Korea, and other Eastern markets.

Should parity mean visual appearance, yes, Microsoft is looking more like Google. Here is the result of one of my test queries: “real estate baltimore maryland”.

[Screenshots: Google search results and Live Search results]

On the surface these look alike. Closer inspection reveals that Google includes a canned form so I can narrow my result by location and property type. Google eliminates a step in looking for real estate in Baltimore. Microsoft’s result does not offer this feature, preferring to show “related searches”. I like the Google approach. I don’t make much use of machine-generated related queries. I have specialized tools to discern relationships in result sets.


Google Customers, Actually Superstar Customers

June 15, 2008

Google is a secretive outfit, despite the river of information about various doings at the Googleplex in Mountain View, California. If you want to know who some of Google’s enterprise customers are, you can find six of them with profiles here. I assume the notion of a “superstar” is different from being a real live Googler, but the designation is interesting as is the first name familiarity. These superstars may warrant a contact at Google who takes their calls and answers their emails. Grab the names and profiles before the information drifts away. The enterprise applications range from federated search to collaboration.

Stephen Arnold, June 15, 2008

Search Wizards Speak Interviews

June 15, 2008

One of the handful of people who read my musings in this Web log told me that it was hard to locate the interviews with influential people in the enterprise search market. You can find the index to the 18 interviews at http://www.arnoldit.com/search-wizards-speak/ or click on this link.

On Monday, we will post another interview. This one is particularly exciting because it takes a look at a company with sophisticated technology based on search, database and content processing. The firm provides a solution of which search and indexing are components. I will give you one clue: the firm is growing at a double-digit pace in the enterprise publishing system sector. The company competes with some of the Big Names in search as well as with firms that are off the radar of many information access vendors. We’ll post the new interview on Monday, June 16, 2008. With this interview, I am going to take a hiatus because it is becoming quite difficult to reach people over the summer period in the United States.

However, I will be posting brief profiles of companies in the search and content processing sector. These will be updated on an irregular basis, and I will post an index page to these profiles. I want to create several test profiles, gauge reader reaction, and then finalize a standard format for the information.

Here are the direct links to the interviews completed through June 12, 2008. These are listed in alphabetical order.

Company                Interview Subject                   Date of Interview
Attivio                Ali Riaz                            26-May-08
Bitext                 Antonio S. Valderrábanos            14-Apr-08
Blossom Software       Alan Feuer, Ph.D.                   18-Feb-08
Brainware              James Zubok                         31-Mar-08
Coveo Solutions Inc.   Laurent Simoneau                    11-Mar-08
Deep Web Technologies  Abe Lederman                        10-Jun-08
Endeca                 Pete Bell                           17-Mar-08
Exalead                François Bourdoncle                 25-Feb-08
Intelligenx            Iqbal & Zubair Talib                12-May-08
ISYS Search Software   Ian Davies                          5-Mar-08
Kroll                  David Chaplin                       28-Apr-08
Northern Light         David Seuss                         2-Jun-08
PolySpot               Olivier Lefassy                     19-May-08
Silobreaker            Mats Bjore                          12-Jun-08
Sinequa                Jean Ferré                          21-Apr-08
Thunderstone           John Turnbull                       7-Apr-08
Vivísimo               Raul Valdes-Perez & Jerome Pesenti  24-Mar-08
ZyLAB                  Johannes Scholtes                   5-May-08

A special thanks to the executives who participated in the interviews. I extended invitations to Microsoft and Fast (sorry, we can’t talk to you), Google (no one responded including the top gun in enterprise search David Girouard), and Autonomy’s Andrew Kanter (no, can’t talk. Life is too manic). Too bad for me, I guess. For the folks who could talk, were thoughtful enough to respond to email, and in control of their time–thank you and a contented quack, quack.

Stephen Arnold, June 15, 2008

The Duh Factor: Email, Distractions, and Workers

June 15, 2008

When I awakened at 6:30 am, I took a look at what my crawlers snagged as I slept. Email stories. Hundreds of email stories. You can sample the floods on Techmeme.com, Megite.com, and other aggregation services. The catalyst for this blog-astrophy appears to be this essay by Matt Richtel of the New York Times. I’m not sure you need a link in this Web log because this story has gone viral. Email, distraction, and digital addiction are, in my view, part of the furniture of living. These behaviors will be with us for some time.

Now, let me summarize “Lost in E-Mail, Tech Firms Face Self-Made Beast.” Email is a problem. Workers, companies, and anyone else in the message flow spend time fiddling around. Wasted time means an expensive, often futile, experience.

Some of the pundits commenting on this essay by Mr. Richtel have made the leap to distractions, of which email is one, in the modern work space. You can sample this line of thought in the Business Week article “May We Have Your Attention, Please?” by Maggie Jackson.

More, Not Fewer, Messages

I have a different view of the email problem, and I am not sure what to make of the furor over any digital messaging.

First, we have more types of electronic messaging, and these are multiplying the attention problem and its costs in money and time. SMS, for example, adds to the message traffic. When Bear Stearns used to be in business, my client sent me SMS messages, and these to him were must-answer communications. Email was too slow. SMS traffic, I learned in one of my studies, is larger than email traffic. I don’t have the motivation to dig out the 2007 data I have, but I recall being flabbergasted at the number of SMS messages fired off and the revenue these generate for telcos.

Second, we now must deal with micro blogging. Twitter.com, the Silicon Valley in-crowd communication medium, allows one to broadcast information of great import to anyone interested in receiving a friend’s postings. Here’s a “tweet” I cadged from Popurls.com, a service which presents random tweets:

YoungnRich Apparel california. @holli I nap every Saturday there [sic] so rejuvenating

Very helpful this post, Holli.

Third, the notion of “soft interruptions” just adds to the flow of message traffic. Toss in instant messaging–a form of information that pops up–that runs across networks. I dislike instant anything, but those are my 64-year-old biases coming to the fore.

Stepping Back

Distraction is a fact of life. When I leave my log cabin in rural Kentucky and venture to the big city, I find myself in meetings. One experience I had in Seattle in the last month is illustrative of the situations I encounter.

I am giving a talk about a company’s technology. I don’t work for the company whose technology I am describing. I don’t even care if it works or not. I’m describing what this company says its technology will do. There are five people in the room. Each has a laptop with a wireless connection, a smartphone, and a beverage. A person rushes into the room with a laptop, smartphone, and dog. The late comer is the “boss” and he says, “That diagram is wrong. That technology will never work.” He then sits down, attends to his laptop, and sends one message on his smartphone. He then interrupts and asserts, “That device can’t perform that function. It doesn’t have two radios. Quit telling us about that function. It won’t work.”

I’m not sure what to do, so I say, “No problem.” I go on to another slide. A short time later the “boss” leaves. After the presentation, the other attendees wander off.

This situation is representative of what I find in my work.

  • Many employees are bright and confident in their ability to multi task. I find that humans who multi task may be operating at less than 100 percent efficiency, not the 100 percent plus that these folks think doing two tasks at a time delivers. This old human does not multi task. I do one thing at a time. In fact, I have to work to keep my mind from wandering. Many of the people I encounter actively embrace wandering thoughts.
  • Over the years, I have found that people in knowledge jobs go to great lengths to appear busy, engaged, and in demand. At Booz, Allen & Hamilton in the late 1970s, the fellow who trained me–Dr. William P. Sommers–appeared calm and unhurried. It was a false front. He was busy, but he managed his time effectively and separated himself from the lesser beings at Booz, Allen because he was under control. Today, the appearance of busy-ness is highly valued. Intrusive crap is embraced because it connotes success.
  • The devices are fun for many people. I find small gadgets, including mini-notebook computers, maddening. I can’t see the screen. The keyboard is too small for my large, increasingly clumsy fingers. The gizmos are fragile, and I drop small slippery gadgets with great frequency. Younger folks enjoy the complexity and some watch videos on screens the size of match books. My hunch is that these professionals are playing with toys. Instead of Lego blocks, professionals today keep their childhood habits alive with digital play things.
  • Most professionals I encounter don’t know what the heck they are doing. Their expertise often lacks a broader business context. Reinventing the wheel is a popular pastime in many of the high-tech environments in which I find myself. The interest in mobile search is somehow new. Nope, like metatagging, it is the same old stuff gussied up with a new name. When people don’t know what to do to make a direct and immediate contribution in their work, they generate fake smoke. The blue flickers on digital gizmos are the equivalent of laser light shows for a touring rock band’s stage dressing. Distraction is a bit of fakery.

You probably disagree with my take on this email discussion. My reaction is like the Cheers character who says, “Duh.” As professionals become more cut off from meaningful work, distractions become more attractive. I can gauge the focus of a meeting by counting the number of laptops and smartphones in the room. When there are more gizmos than people, I know the company is in a management whirlpool. In one meeting, I had an audience of 62 people. There were 109 devices. This company, if I were to name it, is one the media, investors, and customers believe is in a death spiral.

So, as the economy falters, the pressure on employees goes up. If the employee doesn’t have a clue about managing time and setting priorities, the distractions flow. Forget how many emails pile up. When I return from a trip outside the US, I winnow emails ruthlessly. I don’t waste time on email. If a person wants me to do something, there are ways to get my attention. One young consultant at an Internet research firm wrote me to participate in a survey. I wrote back, “No. I will now delete email from you and your company automatically.” End of problem from my point of view.

Distractions, therefore, provide a way to measure the intellectual and managerial skill of a worker. The best employees and colleagues know how to manage distractions. I like to think about Alexander, sitting in a stinking tent, somewhere east of modern Afghanistan and his ability to manage distractions. I can imagine his hearing, “The troops don’t have water” and “We don’t know where the enemy is” and “We don’t have enough fodder for the pack animals”. A New York Times writer observing this situation could easily report that Alexander is overwhelmed by yammering requests from his lieutenants, too many parchment or wax messages, and intrusions such as a raiding party intent on killing him. Alexander dealt reasonably well with distractions.

Is it possible that these squawks, cheeps, howls, and tweets about “the email problem” reveal the flaws of the individuals, not the problems of the messaging environment? Agree? Disagree? Let me know if you are not too distracted.

Update. Times of London introduces the notion of a “pond-skater mind” here. The fix may be to use other tools.

Stephen Arnold, June 15, 2008

Update 1: June 23, 2008 You may find “The Myth of Multitasking” by Christine Rosen germane. Writing in The New Atlantis, she summarizes the challenges of multi tasking. I particularly liked her use of the phrase “acquired inattention”. You can read the full essay here. Highly recommended.

Update 2: June 24, 2008 Ars Technica has a useful essay about multi tasking. J.M. Gitlin’s “The Boss Made Me Do It” is here.
