MyRoar: NLP Financial Information Centric Service

March 6, 2009

A happy quack to the reader who alerted me to MyRoar.com. This is a vertical search service that relies on natural language processing. I did some sleuthing and learned that François Schiettecatte joined the company earlier this year. Mr. Schiettecatte has a distinguished track record in search, natural language processing, and content processing. French by birth, he went to university in the UK and has lived and worked in the US for many years. Here’s what the company says about MyRoar.com:

In today’s current political and economic environment people have never had more questions. MyRoar helps people sort through the hype to find just the answers they are looking for. Extraneous information is eliminated, while saving hours of time or abandonment of search. We provide a fun new interface that keeps users up to date on current news, which helps them formulate the best questions to ask. MyRoar is a Natural Language Processing Question Answering Search Engine. Using integrated technologies we are able to offer high precision allowing users to ask questions relating to finance and news. MyRoar integrates proprietary Question Answer matching techniques with the best English NLP tools that span the globe.

You can use the system here. The system performed quite well on my test queries; for example, “What are the current financials for Parker Hannifin?” returned two results with the data I wanted. I will try to get Mr. Schiettecatte to participate in the Search Wizards Speak interview series. Give the system a whirl.

Stephen Arnold, March 6, 2009

Facebook in Flux

March 6, 2009

The Financial Times’ Chris Nuttall wrote “Facebook’s Identity Crisis” here. I agree in general that Facebook is reacting to Twitter and other rivals. I think Facebook is a walled garden; that is, a proprietary space. Most software and online service companies are walled gardens or want to be. The principal difference among these organizations is how many gates there are to the walled garden and what one has to do to get inside. The article missed an important point; namely, Facebook is a sector leader, but it has to deal with the likes of Google and Twitter. This is a tough competitive sandwich to choke down. Google is not a Facebook, but Google wants to be Facebooky enough to snag the ad revenue, the eyeballs, the clicks, and the data. Twitter, on the other hand, is the leader in real time search and it can morph in several directions serially or just do a number of things in parallel. Facebook, therefore, has an increasingly traditional competitor that wants to move into the Facebook market space. And Facebook has to figure out how to deal with the real time, microblogging content catnip that sets Twitter apart. I don’t think one can fault Facebook for looking confused. Compare its efforts to innovate with those of the Financial Times. Facebook looks innovative and much more aware of where the information and financial action is than most traditional information companies in general and newspapers like the Financial Times in particular.

Stephen Arnold, March 6, 2009

Vyre: Software, Services, Search, and More

March 6, 2009

A happy quack to the reader who sent me a link to Vyre, whose catchphrase is “dissolving complexity.” The last time I looked at the company, I had pigeonholed it as a consulting and content management firm. The news release my reader sent me pointed out that the company has a mid market enterprise search solution that is now at version 4.x. I am getting old, or at least too sluggish to keep pace with content management companies that offer search solutions. My recollection is that Crown Point moved in this direction. I have a rather grim view of CMS because software cannot help organizations create high quality content or at least what I think is high quality content.

The Wikipedia description of Vyre matches up with the information in my archive:

VYRE, now based in the UK, is a software development company. The firm uses the catchphrase “Enterprise 2.0” to describe its enterprise solutions for business. The firm’s core product is Unify. The Web based services allows users to build applications and content management. The company has technology that manages digital assets. The firm’s clients in 2006 included Diageo, Sony, Virgin, and Lowe and Partners. The company has reinvented itself several times since the late 1990s doing business as NCD (Northern Communication and Design), Salt, and then Vyre.

You can read the Wikipedia summary here. You can read a 2006 Butler Group analysis here. My old link worked this evening (March 5, 2009), but click quickly. In my files I had a link to a Vyre presentation, but it was not about search. The presentation is dated 2008, and you may find the information useful. The Vyre presentations are here. The link worked for me on March 5, 2009. The only name I have in my archive is Dragan Jotic. Other names of people linked to the company are here. Basic information about the company’s Web site is here. Traffic, if these data are correct, seems to be trending down. I don’t have current interface examples. The wiki for the CMS service is here. (Note: the company does not use its own CMS for the wiki. The wiki system is from MediaWiki. No problem for me, but I was curious about this decision because the company offers its own CMS system.) You can get a taste of the system here.

image

Administrative Vyre screen.

After a bit of poking around, it appears that Vyre has turned up the heat on its public relations activities. The Seybold Report here presented a news story / news release about the search system here. I scanned the release and noted this passage as interesting for my work:

…version 4.4 introduces powerful new capabilities for performing facetted and federated searching across the enterprise. Facetted search provides immediate feedback on the breakdown of search results and allows users to quickly and accurately drill down within search results. Federated search enables users to eradicate content silos by allowing users to search multiple content repositories.

Vyre includes a taxonomy management function with its search system, if I read the Seybold article correctly. I gravitate to the taxonomy solution available from Access Innovations, a company run by my friends and colleagues Marje Hlava and Jay Ven Eman. Their system generates ANSI standard thesauri and word lists, which is the sort of stuff that revs my engine.

Endeca has been the pioneer in the enterprise sector for “guided navigation,” which is a synonym in my mind for faceted search. Federated search gets into the functions that I associate with Bright Planet, Deep Web Technologies, and Vivisimo, among others. I know that shoving large volumes of data through systems that both facetize content and federate it is computationally intensive. Consequently, some organizations are not able to put the plumbing in place to make these computationally intensive systems hum like my grandmother’s sewing machine.
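
For readers who want a concrete picture of the two terms in the Vyre release, here is a minimal Python sketch. It is an illustration under my own assumptions, not Vyre’s or Endeca’s implementation. The two in-memory “repositories,” the record fields, and the function names (`federated_search`, `facet_counts`, `drill_down`) are all hypothetical.

```python
from collections import Counter

# Two hypothetical content repositories (assumed data, not from any vendor).
# In a real deployment these would be separate systems behind their own APIs.
CMS_DOCS = [
    {"title": "Brand guidelines", "type": "pdf", "dept": "marketing"},
    {"title": "Brand launch plan", "type": "doc", "dept": "marketing"},
]
DAM_ASSETS = [
    {"title": "Brand logo", "type": "image", "dept": "design"},
    {"title": "Product photo", "type": "image", "dept": "marketing"},
]


def search_repository(records, term):
    """Naive case-insensitive keyword match on the title field."""
    term = term.lower()
    return [r for r in records if term in r["title"].lower()]


def federated_search(repositories, term):
    """Send the same query to every repository and merge the hits."""
    hits = []
    for name, records in repositories.items():
        for record in search_repository(records, term):
            # Tag each hit with the repository it came from.
            hits.append({**record, "source": name})
    return hits


def facet_counts(hits, field):
    """Count how many hits fall under each value of one facet field."""
    return Counter(hit[field] for hit in hits)


def drill_down(hits, field, value):
    """Narrow the result set to hits with one facet value."""
    return [hit for hit in hits if hit[field] == value]


repositories = {"cms": CMS_DOCS, "dam": DAM_ASSETS}
results = federated_search(repositories, "brand")
print(facet_counts(results, "type"))
print(drill_down(results, "dept", "marketing"))
```

Even in this toy version, every facet field requires a pass over every hit, and every query goes to every repository. Scale that to millions of documents and a dozen repositories, and the plumbing problem becomes easier to appreciate.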

If you are in the market for a CMS and asset management company’s enterprise search solution, give the company’s product a test drive. You can buy a report from UK Data about this company here. I don’t have solid pricing data. My notes to myself record the phrase, “Sensible pricing.” I noted that the typical cost for the system begins at about $25,000. Check with the company for current license fees.

Stephen Arnold, March 6, 2009

Yidio: Video Search

March 6, 2009

A happy quack to the reader who alerted me to Yidio, a video search system that indexes 200 million videos. The search system is powered by Truveo. Here’s what Yidio said about itself:

Yidio is owned and operated by 2ten Media LLC, based in San Diego, California which also owns Sportsnipe.com, a sports news aggregator that combines sports news from thousands of sources around the world.  It is 2ten Media’s mission to provide an Internet experience to users that is not only simple and efficient, but employs the highest technology available while adding value to every user.  We are constantly expanding our Internet properties, so be on the look out for good thing’s in the near future from 2ten Media.

Here’s what Truveo said about itself:

Today, Truveo is one of the largest video search engines on the Web. Truveo is the search engine that powers many of the Web’s most popular video destinations. Truveo currently powers video search for AOL, Microsoft Corporation, CNET’s Search.com, Brightcove, Qwest, Kosmix, CSTV, Infospace, Excite, and hundreds of other applications worldwide. Across the network of websites it powers, Truveo reaches an audience of over 40 million users every month. The Truveo video search engine is widely recognized as being the most comprehensive and up-to-date video search service on the Web.

I am not a video consumer. If you can help me understand these two services, let me know. I am also trying to map these services to Blinkx, which bills itself as a big video search system. And YouTube.com? Check out Yidio.

Stephen Arnold, March 6, 2009

Google Twitter: Miscommunication

March 5, 2009

Henry Blodget’s “Google’s Schmidt: I Didn’t Diss Twitter” made me laugh. When I saw the blogosphere lightning strikes about an alleged remark by Google’s top wizard, I wondered if the reporters heard correctly. I don’t do hard news. I point to stories I find interesting. Mr. Blodget wrote on March 4, 2009, a story that allegedly set the record straight. You can read it here.

image

Which interstellar object is growing? Which is dying? Which is the winner? Which will become a charcoal briquette in a manner of speaking?

Please, navigate to Silicon Valley Insider because the good stuff is in capital letters with some words tinted red in anger. For me, the most interesting comment was:

In context if you read what I said, I was talking about the fact that communication systems are not going to be separate. They’re all going to become intermixed in various ways.

Several comments:

  1. The quote sounds like something I heard George Gilder say years ago. (For the record, the fellow who paid Mr. Gilder and me for advice sided with me about convergence. I prefer the term “blended”, and I still do.) Think a digital Jamba cooler.
  2. Google’s top Googler comes across as more politically sensitive. In Washington, DC, saying nothing whilst saying something that seems coherent is an art form. Mr. Schmidt is carrying a tinge of Potomac fever in my opinion.
  3. The Twitter “thing” is clearly on Mr. Schmidt’s mind. My conclusion after reading the capital letters and red type is that Twitter has become a wisdom toothache. The pain is deep and it is getting worse.

No one is more interested in real time search than sentiment miners, intelligence professionals, and some judicially oriented researchers. The more Twitter and real time search gain traction, the older and slower Google looks. In case you missed my post here, is this another sign of a generation gap between Google’s “old style” indexing and Twitter’s here and now flow? Note: Facebook.com is getting with the program too. eWeek has an interesting article here.

In my opinion we have a fuzzy line taking shape like those areas between galaxies that NASA distributes to show the wonders of the universe.

Metadata Perp Walk

March 5, 2009

I mentioned the problems of eDiscovery in a briefing I did last year for a content processing company. I have not published that information. Maybe some day. The point that drew a chuckle from the client was my mentioning the legal risk associated with metadata. I was reporting what I learned in one of my expert witness projects. Short take: bad metadata could mean a perp walk. Mike Fernandes’ “Think You’re Compliant? Corrupt Metadata Could Land You in Jail” here tackled this subject in a more informed way than my anecdote. He does a good job of explaining why metadata are important. Then he hits the marrow of this info bone:

Data recovery cannot be treated as the ugly stepsister of enterprise backup, and the special needs that ECM systems place on backup must not be ignored. Regulatory authorities and industry experts are beginning to demand more ECM- and compliance-savvy recovery management strategies, thereby setting new industry-wide legal precedents. One misstep can lead to disaster; however, there are approaches and ECM solutions that help avoid noncompliance, downtime and other incidents.

If you are floating through life assuming that your metadata are shipshape, you will want to make a copy of Mr. Fernandes’ excellent write up. Oh, and why the perp walk? Bad metadata can annoy a judge. More to the point, bad metadata in the hands of the attorney from the other side can land you in jail. You might not have an Enron problem, but from the inside of a cell, the view is the same.

Stephen Arnold, March 5, 2009

Endeca: Push into Education and Training

March 5, 2009

Endeca, http://www.endeca.com, is expanding its information access software business by connecting education and training customers with more specialized solutions. You can read the press release here. Solutions from Endeca Education Services, here, include customized training curriculum served on site or online. The goal is to get pre-packaged, flexible solutions that speed up business performance to their customers in these trying economic times. Part of the attraction of Endeca’s expanded offerings is the ability to pre-purchase training at a discounted rate. There are a lot of information access companies in this industry, and education services are particularly dependent on critical technology. It’s a really good move on Endeca’s part to expand in that venue to tap so much opportunity. On the other hand, the shadows of Apple and Google have begun to creep into the education market. Excitement ahead in a large business sector perhaps?

Jessica W. Bratcher, March 5, 2009

Yahoo: Inventing the Next Facebook

March 5, 2009

Reuters issued a story with the social network worm dangling in front of the Web surfers fishing for information. You can read “Yahoo CEO Interested in Social Networks and Search” here. The Thomson Reuters’ links can be slippery eels themselves, so the link may be dead when you read my comments. The story summarized Yahoo’s new chief executive’s comments at a bank’s high tech conference. Yep, banks. High tech. Credibility. Whatever. For me, the most remarkable comment attributed to Ms. Bartz, chief Yahoo, was:

“I do not believe we can invent the next Facebook,” Bartz said.

When I read this, I realized that Yahoo’s research and development effort might be redirected. Acquisitions, partnerships, and deals may not require the innovations that have been released in the last year; for example, BOSS, the build your own search system. Maybe Yahoo will stick to its historical path of buying companies and trying to grow seedlings into giant redwoods. I liked Ms. Bartz’s pragmatism. I wonder what it means for Yahoo’s R&D initiatives and what companies will become the focal point for Yahoo. With “every business up for examination”, the stability of Yahoo may be a future goal, not a here and now reality.

Stephen Arnold, March 5, 2009

Facebook: Moving into Business Directory Territory

March 5, 2009

I saw “Facebook Creates New Profiles for Public Figures and Organizations” by Nicholas Kolakowski here. The write up struck me as important and indicative of a content processing opportunity. Facebook.com is one of the two online services that have managed to defy Googzilla. The other is the leader in real time search, Twitter.com. According to the eWeek story:

Facebook announced on March 4 the launch of new profiles for public figures and organizations. These profile pages might belong to a large company or famous politician, but nonetheless will function like the “regular” user pages already present on the site. These new profiles will let their administrators post status updates, videos, and photos, as well as provide information via a real-time news feed to users.

When I read this passage, I thought business directory. The “old” Hoover’s pointed toward a new, more useful type of business directory. Dun & Bradstreet (quite a url the dnb.com moniker) dominates this sector. Hoover’s disappeared into the D&B combine and its usefulness has deteriorated. The void has not been filled on a large scale. The eWeek story sparked the thought in my addled goose brain that Facebook.com organization pages could revivify the business directory business.

Why? Three reasons:

  1. User generated content plus content generated by the system (in this case Facebook.com) may provide a nice mix of subjective and objective information, particularly for publicly traded companies.
  2. The possibility of linking Facebook.com users to a particular organization is a potentially useful tool for text mining and relationship analysis.
  3. The updating problems of the traditional business directory companies could be eliminated. Facebook.com could update pages in near real time. A Twitter-like function for news about an organization would be a boon to researchers, analysts, job hunters, and law enforcement.

What will Facebook.com do? People are now worrying about what Google will do, a decade after the company began its run. A 20 something might want to ask the question about Facebook.com-like outfits. And D&B? I am not sure the company is aware of Facebook-like companies and their potential in business directory information. My hunch is that a Facebook-like business directory could erode traditional business directory revenues. Maybe I’m off base, so set me straight.

Stephen Arnold, March 5, 2009

Libraries: A Tipping Point in Commercial Online

March 5, 2009

Libraries find themselves in a tough spot. The economic downturn has created a surge in walk in traffic. In Louisville, Kentucky, I watched as patrons waited to use various online systems available. I spoke with several people. Most were looking for employment information or government benefit resources. I pop into the downtown library a couple of times a month, and at 4 pm on a Thursday, the place was busy.

In Massachusetts, four libraries found themselves in the spotlight. According to the Wicked Local Brockton here, “Wareham, Norton Libraries Lose Certification; Brockton, Rockland Given Reprieve”. The libraries, according to Maria Papadopoulos’ article, had cut their budgets too much. As a result, the libraries lost their state certification, which further increases budget pressure. Across the country the Seattle Post Intelligencer reported “Big Challenges Await City’s New Librarian.” Kathy Mulady wrote:

Actual library visits are up 20 percent, and virtual visits online are up even more. About 13 million people visited city library branches last year.

That’s the good news. The bad news is that Seattle, home of Amazon (king of ebooks) and Microsoft (the go-to company for software and online information) has a budget crunch. The new library director will have to deal with inevitable financial pressure at a time when demand for services is going up. Tough job.

What’s this mean for commercial online services?

image

View of a collision between light rail and a freight locomotive. Will this happen when library budgets collide with the commercial online vendors in 2010? Image source: http://www.calbar.ca.gov/calbar/images/CBJ/2005/Metrolink-Train-Wreck.jpg

My view is that the companies dependent on libraries for their revenue will be facing a very lean 2009. The well managed companies will survive, but those companies that are highly leveraged may find themselves facing significant revenue pressure. Most of the vendors dependent on libraries for revenue are low profile operations. These companies aggregate information and make that information available to individual libraries or to groups of libraries that join together to act as a buying club. Most library acquisitions occur on a cycle that is governed by the budget authority funding a library. In effect, library vendors will receive orders and payments in 2009.

The big crunch may occur in 2010. When that happens, the library vendors will be put under increasing pressure. I have identified three potential developments to watch.

First, I think some high profile library dependent information companies will be forced to merge, cut back on staff and product development, or shut their doors. Size of the library centric company may not protect these firms. The costs of creating and delivering electronic information of higher value than this goose-based Web log are often high and difficult to compress. The commercial database companies are dependent on publishers for content. Publishers are in a difficult spot themselves. As a result, the interlocks between commercial publishing, traditional database companies, and libraries are complex. Destabilize one link and the chain disintegrates. No warning. Pop. Disintegration.

image

Image source: http://harvardinjurylaw.com/broken-chain.jpg

Second, the libraries themselves are going to have to rethink what they do with their budgets. This type of information decision has been commonplace for many years. For example, libraries have to decide what books to buy. Libraries have to decide what percent of their budget gets spent on periodicals in print or online. Libraries have to decide whether to cut hours or cut acquisitions. Libraries, in short, make life and death information decisions each day. The forced choices mean that libraries have to decide between serving patrons with online access to Internet resources or online access to high value information sources like those purchased from Cambridge Scientific Abstracts (privately held), Ebsco (privately held), Reed Elsevier (tie up between two non US commercial entities, one Dutch, one British), Thomson Reuters (public company), Wolters Kluwer (public, non US company), and some other companies that are not household names. Free services from Google, Microsoft, and Yahoo plus Web logs, Twitter, and metasearch systems like IxQuick.com would look pretty good to me if I had to decide between a $200,000 payment to a commercial database company and providing services to my patrons, students, and consortium partners.

Third, Google’s steady indexing of content in Google Books and in its government service and the general Google Web index offers an alternative to the high value, six figure deals that library centric information companies pursue. If I were working in a library, I would not hesitate to focus on Google-type resources. I would shift money from the commercial database line item to those expenses associated with keeping the library open and the public access terminals connected to the Internet available.

In short, the economic problems for companies in the search and content processing sector are here-and-now problems. The managers of these firms need to make sales in order to stay in business. The library centric information companies are sitting on railroad tracks used by the TGV, just waiting for the real budget collision to arrive. The traditional library information companies cannot get off the tracks even though they know that 2010 is going to arrive right on schedule.

I want to steer clear of these railroad tracks. Debris can do some collateral damage.

Stephen Arnold, March 5, 2009
