NoSQL to Die in Train Wreck
March 30, 2010
“I Can’t Wait for NoSQL to Die” adds fuel to the “SQL sucks” versus “NoSQL is stupid” fire fight. I want to make clear that the addled goose paddles in his pond filled with mine run off fluids and takes no strong position on either side in the battle. When the bullets fly, the goose submerges his head and hopes no stray slug makes him oven ready.
The write up’s position is that SQL is pretty darned good. It sums up the NoSQL pitch this way:
The idea is that object relational databases like MySQL and PostgreSQL have lapsed their useful lifetimes, and that document-based or schemaless databases are the wave of the future. Never mind of course that MySQL was the perfect solution to everything a few years ago when Ruby on Rails was flashing in the pan. Never mind that real businesses track all of their data in SQL databases that scale just fine. (For Silicon Valley readers, Wal-Mart is a real business, Twitter is not.)
The write up points out some notable flaws in the NoSQL solutions. Here’s an example and a good one in the goose’s opinion:
So you’ve magically changed your backend from MySQL to Cassandra. Stuff will just work now, right? Well, no. Did you know that Cassandra requires a restart when you change the column family definition? Yeah, the MySQL developers actually had to think out how ALTER TABLE works, but according to Cassandra, that’s a hard problem that has very little business value. Right.
Source: http://teddziuba.com/2010/03/i-cant-wait-for-nosql-to-die.html
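For readers who want to see the ALTER TABLE point from the quote in action, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the table name, column names, and the helper add_column_live are hypothetical and do not come from the cited write up. It only illustrates a relational engine accepting a schema change on a live connection, not how MySQL implements ALTER TABLE internally.

```python
import sqlite3


def add_column_live(conn, table, column_def):
    """Add a column to an existing table without closing the connection.

    Hypothetical helper for illustration; table and column_def are
    trusted strings here, not user input.
    """
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column_def}")


# An in-memory database stands in for a running production server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO tweets (body) VALUES ('hello')")

# Change the schema while the connection stays open and the data stays put.
add_column_live(conn, "tweets", "lang TEXT DEFAULT 'en'")

# New rows can use the new column right away; old rows pick up the default.
conn.execute("INSERT INTO tweets (body, lang) VALUES ('bonjour', 'fr')")
rows = conn.execute("SELECT id, body, lang FROM tweets ORDER BY id").fetchall()
print(rows)
```

The row written before the schema change comes back with the default value in the new column, and no restart was involved.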
Other SQL advocates to whom I have spoken have pointed out that even Google uses MySQL in its advertising system. Yes, even Google.
My view of this is that I want to start gathering these pro and con arguments. When emotions run high over a technical issue, I think there may be some interesting examples and possibly some financial information beneath the bluster.
Dr. Codd made a wonderful contribution to data management. My hunch is that with an efflorescence of non-Codd methods, perhaps some useful learnings will emerge. It takes years for an innovation to survive the tests imposed by the real world. When the shooting stops, SQL will remain a useful tool and we may have other useful tools to use to solve certain types of problems.
But the arguments and the verbal sharpshooting are a great deal of fun as long as no goose is killed. I don’t want to be dinner or pâté just yet.
Stephen E Arnold, March 30, 2010
Nope. No one paid me to write about my interest in self preservation.
IBM and an Open Source Milestone
March 30, 2010
I think the buzz about open source is interesting. I had forgotten that IBM had grabbed the tailgate of the open source bandwagon years ago. “IBM Celebrates a Decade of Linux on Its System Z Mainframe” reminded me that IBM saw an opportunity to use open source to address some of the objections to the z/OS operating system and its interesting idiosyncrasies. The article said:
Since opening the mainframe to run popular open source Linux applications 10 years ago, there are today 3,150 Linux applications enabled for System Z and 70 per cent of the top one hundred global mainframe customers run Linux. Rosamilia [IBM manager for mainframes] described the mainframe as middle-aged because “it’s come a long way and has a long way to go.” And it’s timeless. “We made compatible changes so that stuff that ran a long time ago still runs today and yet we’ve still invested in brand new architectures so that we can take advantage of lots of things,” said Rosamilia. He named two top drivers for Linux on System Z. The speed with which it allows new server provisioning, and the control IT administrators have to maintain over servers while lowering risks in the data centre environment. [Emphasis added]
I love IBM mainframes because there is not as much competition to keep these puppies alive and well. IBM loves them too, and I think these comments indicate that open source is more than chopping licensing fees. Open source makes it possible to use “middle-aged” and quite expensive hardware in today’s code-and-run world. STAIRS III, anyone?
Stephen E Arnold, March 30, 2010
A freebie. An unsponsored write up.
Oracle Text Visualization
March 30, 2010
Short honk: A happy quack to the reader and obvious Oracle aficionado for this tip. You can download the Java library for Oracle Text visualization directly from Oracle and at no charge. The information on the Oracle Web site is sparse:
This Java library for Oracle Text visualization is incorporated in the Oracle Text sample code application to visualize clusters, categories, and themes.
Act now.
Stephen E Arnold, March 30, 2010
No one paid me to pass along this link.
Autonomy Takes a Poke in the Ribs
March 29, 2010
I saw a news item on Yahoo a few minutes ago (March 28, 2010, 2 pm Eastern), and I wondered if the item were accurate. The news story “Shares of Autonomy Appear Overvalued” may be little more than the stuff flowing from a disgruntled analyst struggling with a wrinkled shirt. Potentially negative financial news becoming available on a Sunday is not the same as a negative story fired out before trading begins on a Monday. At any rate, the story reports that the financial publication Barron’s has done some tea leaf analysis and concluded that “the Britain-based software maker may now be overvalued and set for a pullback.” For me, the interesting comment was:
Given its modest returns on invested capital and “granting it some premium for improving returns, a multiple closer to its organic earnings growth could trim shares by roughly 20 percent,” the report said.
I will now try to chase down the Barron’s report and make sure I keep an eye on Autonomy’s response. I thought the company was taking other search and content processing companies to school. Even OpenText, which has had a somewhat similar approach to generating growth and buzz, has not impressed me as much as Autonomy.
Stephen E Arnold, March 29, 2010
No pay for this one. I will report no dough to the SEC, an outfit on the alert always for financial silliness like working for no money.
Hakia Enterprise Search
March 27, 2010
A happy quack to the Hakia executive who confirmed the firm’s new enterprise search solution. Hakia added a page to its Web site identified as “Semantic Booster”. When I examined it, the page was about Hakia’s brand new enterprise search appliance. (Yikes, Beyond Search has a news scoop!)
The Hakia SemanticBooster appliance. Source: http://company.hakia.com/new/semanticbooster.html
According to the write up, Hakia offers “the lowest cost and the best performance.” The write up continues:
The main utility is to provide internal search function within organization’s document repository. Options include other vital functions like powering a consumer facing search, providing targeted Web search to the workers inside the corporation, external news monitoring and alerting, harvesting quality content from the Web to enrich organizations’ information repository, and categorization of documents for better management.
The options for the system include news monitoring (which seems to hook into Hakia’s invitation only service SenseNews), “automated content acquisition”, and semantic categorization.
Source: Hakia.com
The solution is available as a fixed price solution “delivered in a box”. The document limit on the box is pegged at 30 million. When you need more capacity, just add another appliance. Hakia provides a 20 page description of its enterprise search solution here.
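The “just add another appliance” sizing rule can be sketched as simple ceiling division. This is a back-of-the-envelope illustration only: the 30 million document limit comes from the write up, while the function name appliances_needed and the sample repository sizes are made up for the example. It says nothing about how Hakia actually distributes an index across boxes.

```python
DOCS_PER_APPLIANCE = 30_000_000  # per-box document limit cited in the write up


def appliances_needed(document_count, capacity=DOCS_PER_APPLIANCE):
    """Return how many boxes are needed to hold document_count documents.

    Hypothetical helper: integer ceiling division, so one document past
    the limit means one more appliance.
    """
    if document_count < 0:
        raise ValueError("document_count must be non-negative")
    return (document_count + capacity - 1) // capacity


# Illustrative repository sizes, not figures from Hakia.
sizes = [30_000_000, 30_000_001, 100_000_000]
counts = [appliances_needed(n) for n in sizes]
print(counts)
```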
I chased down a Hakia wizard, Dr. Riza Berkan, CEO and Founder of hakia, who told me:
Semantic technology in enterprise search is now becoming such a competitive advantage that the corporations using it are making it part of their trade secret and remaining silent about it. We help corporations in this transition with our complete semantic solution with unprecedented performance.
Prices begin in the $20,000 range but you will want to deal directly with the company. You can contact the company by emailing bdev@hakia.com.
Stephen E Arnold, March 27, 2010
No one paid me to write this. I would report non payment to the GSA, but Hakia’s appliance is not listed on the GSA schedule. Maybe I was not running the correct search because the GSA search system is pretty darned good.
Mindbreeze Desktop Search
March 26, 2010
A reader in Europe alerted me to the Mindbreeze Desktop Search download on Netzwelt.de. According to the Netzwelt write up:
The free Mindbreeze Desktop Search software makes it easy for you to find files and folders. With the Mindbreeze Desktop Search software it is possible to search for individual files and folders and access them. The user-friendly interface of the software, with its search and user-input screen, also allows you to search beyond the traditional concepts in a range of file types, including Adobe PDF files. Similarly, the search includes all Microsoft Outlook emails, attachments, and calendar entries.
To download the program click here.
Mindbreeze, a unit of Fabasoft, offers an enterprise search system that can reduce the time and costs associated with some search systems. For more information about the company and its software, navigate to www.mindbreeze.com.
Stephen E Arnold, March 26, 2010
A freebie. No one paid me to write about this free software. I am not sure to whom in Washington, DC to report writing for free about a company based in a city with which a Mozart symphony has been associated. Does the Marine Corps do Mozart music? Sousa? Yes. Mozart? No clue.
Exalead Tightens NewspaperArchive Tie Up
March 26, 2010
A happy quack to the reader who alerted me to a Marketwire story about Exalead’s deal with NewspaperArchive.com. Exalead is one of the most interesting search applications and content processing companies we monitor. The story I read was “NewspaperArchive.com Scales With Exalead”.
The story reported:
NewspaperArchive.com is the largest historical newspaper database online. It contains tens of millions of newspaper pages from 1753 to present. Every newspaper in the archive is fully searchable by keyword and date, making it easy for people to quickly explore historical content. NewspaperArchive.com had bumped up against limitations of having nearly 100 million records. After the switch to Exalead in December 2009, NewspaperArchive.com has been able to scale again, increasing the number of records by 20%; while at same time reducing the amount of hardware by 75%.
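The two percentages in that passage combine into a single density figure, and the arithmetic is worth a quick sketch. The 20% and 75% values come from the quoted story; the function density_gain is a hypothetical helper, and the calculation assumes “hardware” can be treated as one uniform quantity, which the story does not spell out.

```python
def density_gain(record_growth, hardware_reduction):
    """Return the change in records handled per unit of hardware.

    Both arguments are fractions: 0.20 means 20% more records,
    0.75 means 75% less hardware. Hypothetical helper for illustration.
    """
    records_after = 1.0 + record_growth         # relative to the old record count
    hardware_after = 1.0 - hardware_reduction   # relative to the old hardware footprint
    return records_after / hardware_after


gain = density_gain(0.20, 0.75)
print(f"Records per unit of hardware: about {gain:.1f}x the old figure")
```

Under those assumptions, 1.2 times the records on 0.25 times the hardware works out to about 4.8 times as many records per unit of hardware.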
The performance angle is important based on our research. There are very few companies with the engineering and architecture to deal with the types of data flows found in many organizations today. One of the founders of Exalead worked on the AltaVista.com search system. I have identified a number of Exalead innovations that moved beyond the Digital Equipment approach to search. One of the most important is scaling and a design that permits enterprise applications to break free of their lock step methods of making data available to users. Exalead can give today’s iPod savvy user a way to access business information with the fluidity of downloading a tune from Apple’s system. In the enterprise, this type of functionality is a rare animal in my experience.
Exalead, founded in 2000,
…is the leading search-based application platform provider to business and government. Exalead’s worldwide client base includes leading companies such as PricewaterhouseCooper, ViaMichelin, GEFCO, American Greetings and Sanofi Pasteur, and more than 100 million unique users a month use Exalead’s technology for search. Today, Exalead is reshaping the digital content landscape with its platform, Exalead CloudView™, which uses advanced semantic technologies to bring structure, meaning and accessibility to previously unused or under-used data in the new hybrid enterprise and Web information cloud. Cloudview collects data from virtually any source, in any format, and transforms it into structured, pervasive, contextualized building blocks of business information that can be directly searched and queried, or used as the foundation for a new breed of lean, innovative information access applications. Exalead is an operating unit of Qualis, an international holding company, with offices in Paris, San Francisco, Glasgow, Milan and Darmstadt.
I want to let you know that the last time I was in Paris I got a preview of Exalead’s forthcoming search application technology. I am not at liberty to let le chat out of the bag, but I will be describing the system when Exalead makes a formal announcement.
You can get more information about Exalead at www.exalead.com. Additional information about NewspaperArchive is available at
Open Source Users
March 25, 2010
With Red Hat ringing the cash register, interest in open source continues to chug forward. If you are tracking Linux users for competitive intelligence or sales leads, you will want to snag a copy of “50 Places Linux is Running That You Might Not Expect”. Well, I did expect quite a few of these outfits, but there were some interesting open source adopters. One was IBM. I wonder why that outfit doesn’t use its own mainframes and z/OS? Oh, I know. Cost. Darn good list. I did not know that Spain ran Linux. There’s the Bitext outfit which has some government clients, and some of its software supports the Microsoft Windows ethos.
Stephen E Arnold, March 25, 2010
Yep, freebie. I will report this to the Department of Defense. I thought that outfit was into Microsoft technology. Well, the DoD does not work for free as I do.
RDBMS Trounces NoSQL Technology
March 25, 2010
I enjoyed “Fighting The NoSQL Mindset.” The article takes on the arguments advanced by the group of data management upstarts who want to use NoSQL databases. These range from Cassandra to MongoDB and almost any article listed on the NoSQL Web site. The arguments are anchored in some real world testing and the approach reminded me of some of the Googley talks I have heard in the last couple of years. The article is a long one. For me, the key passage was:
They do this because looking up data that can’t be cached in memory is an expensive operation. Yet as has been shown, SSDs, which are getting faster and cheaper regularly, completely flip the I/O equation. SSDs change everything.
My recommendation is to read this NoSQL article and then try to answer these questions which the goslings and I worked up at our social networking event this evening:
- If the SQL database model were the bird in the hand, what was the reason for Google’s investment in its data management systems?
- With hardware prices declining, why would Oracle focus on providing high end and quite expensive database appliances to address certain SQL licensees’ performance problems? Clever engineers should be able to knock down performance problems with off the shelf hardware and a good grasp of database basics.
- With SQL solutions readily available, what’s with the proliferation of the NoSQL alternatives? The number of products and the interest in them suggest that there is some magnetic effect.
My view is that I am delighted to be an addled goose. I don’t have to sit in front of a CFO and explain why the data management systems are expensive and generally a drain on information users. Something is amiss. If not technology, then what? Maybe management? Maybe database expertise?
Stephen E Arnold, March 25, 2010
A freebie. I will report not getting paid to TSA, where Oracle databases are not quite as zippy in certain applications. But TSA does pay for work. Just not the goose.
SAS Teragram in Marketing Push
March 25, 2010
Two readers on two different continents sent me links to write ups about SAS Teragram. As you may know, SAS has been a licensee of the Inxight technology for various text processing operations. Business Objects bought Inxight, and then SAP bought Business Objects. I was told a year or so ago that there was no material change in the way in which SAS worked with Inxight. Not long after I heard that remark, SAS bought the little-known specialist content processing firm, Teragram. Teragram, founded by Yves Schabes and a fellow academic, landed some big clients for the firm’s automated text processing system. These clients included the New York Times and, I believe, America Online.
Teragram has integrated its software with Apache Lucene, and the company has rolled out what it calls a Sentiment Analysis Manager. The idea behind sentiment analysis is simple. Process text such as customer emails and flag the ones that are potential problems. These “problems” can then be given special attention.
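The flag-the-problem-emails idea can be sketched in a few lines. To be clear, this is a toy keyword counter and not Teragram's method, which relies on computational linguistics the goose has not seen under the hood. The term list, the threshold, and the function name flag_problem_emails are all assumptions made up for the illustration.

```python
import re

# Hypothetical lexicon of words that hint at an unhappy customer.
NEGATIVE_TERMS = {"refund", "angry", "broken", "cancel", "terrible", "complaint"}


def flag_problem_emails(emails, threshold=2):
    """Return the emails containing at least `threshold` negative terms.

    A crude stand-in for sentiment analysis: lowercase the text, split it
    into words, and count how many appear in NEGATIVE_TERMS.
    """
    flagged = []
    for email in emails:
        words = re.findall(r"[a-z']+", email.lower())
        hits = sum(1 for word in words if word in NEGATIVE_TERMS)
        if hits >= threshold:
            flagged.append(email)
    return flagged


# Made-up sample messages for the example.
sample = [
    "Thanks for the quick delivery, everything works.",
    "The unit arrived broken and I want a refund.",
    "Please cancel my order.",
]
flagged = flag_problem_emails(sample)
print(flagged)
```

A counter like this misses sarcasm, negation, and anything outside its word list, which is the gap that commercial systems claim to close.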
The first news item I received from a reader was a pointer to a summary of an interview with Dr. Schabes on the Business Intelligence network. Like ZDNet and Fierce Media, these are pay-for-coverage services. The podcasts usually reach several hundred people and the information is recycled in print and as audio files. The article was “Teragram Delivers Text Analytics Solutions and Language Technologies.” You can find a summary in the write up, but the link to the audio file was not working when I checked it out (March 24, 2010, at 8 am Eastern). The most interesting comment in the write up in my opinion was:
Business intelligence has evolved from a field of computing on numbers to actually computing on text, and that is where natural language processing and linguistics comes in… Text is a reflection of language, and you need computational linguistics technologies to be able to turn language into a structure of information. That is really what the core mission of our company is to provide technologies that allow us to treat text at a more elaborate level than just characters, and to add structure on top of documents and language.
The second item appeared as “SAS Text Analytics, The Last Frontier in the Analysis of Documents” in Areapress. The passage in that write up I noted was this list of licensees:
Associated Press, eBay, Factiva, Forbes.com, Hewlett Packard, New York Times Company, Reed Business Information, Sony, Tribune Interactive, WashingtonPost.com, Wolters Kluwer, Yahoo! and the World Bank.
I am not sure how up to date the list is. I heard that the World Bank recently switched search systems. For more information about Teragram, navigate to the SAS Web site. Could this uptick in SAS Teragram marketing be another indication that making sales is getting more difficult in today’s financial climate?
Stephen E Arnold, March 25, 2010
A no fee write up. I will report this sad state of affairs to the IMF, which it appears is not clued in like the World Bank.

