Oracle, Publishing, and XSQL
July 14, 2009
I am a big fan of the MarkLogic technology. A reader told me that I should not be such a fan boy. That’s a fair point, but the reader has not worked on the same engagements I have. As a result, the reader has zero clue about how the MarkLogic technology can resolve some of the fundamental information management, access, and repurposing issues that some organizations face. I am all for making parental type suggestions. I give them to my dog. They don’t work because the dog does not share my context.
The same reader who wanted me to be less supportive of MarkLogic urged me to dig into Oracle’s capabilities in Oracle XSQL, which I know something about because XSQL has been around longer than MarkLogic has.
Now Oracle is a lot like IBM. The company is under pressure because its core business lights up the radar of its licensees’ chief financial officer every time an invoice arrives. Oracle is in the software, consulting, open source, and hardware business. Sure, Oracle may not want to make SPARC chips, but until those units of Sun Micro are dumped, Oracle is a hardware outfit. Like I said, “Like IBM.”
MarkLogic has been growing rapidly. The last time I talked with MarkLogic’s tech team, it was clear to me that the company was thriving. New hires, new clients, and new technologies—these added to the buzz about the company. Then MarkLogic nailed another round of financing to fuel its growth. Positive signs.
Oracle cannot sit on its hands and watch a company that is just up Highway 101 expand into a data management sector right under Oracle’s nose. Enter Oracle XSQL, which is Oracle’s answer to MarkLogic Server.
The first document I examined was “XSQL Pages Publishing Framework” from the Oracle 9i/XML Developer’s Kits Guide. I printed out my copy, but you can locate an online instance on the Oracle West download site. I am not sure if you will have to register. Parts of Oracle recognize me; other parts want me to set up a new account. Go figure. Also, Oracle has published a book about XSQL, and you can learn more about that from eBooksLab.com. You can also snag a Wiley book on the subject: Oracle XSQL: Combining SQL, Oracle Text, XSLT, and Java to Publish Dynamic Web Content (2003). A Google preview is available as well. (I find this possibly ironic because I think Wiley is a MarkLogic licensee but I might be wrong about that.)
Oracle has an Oracle BI Publisher Web log that provides information about the use of XSQL. The most recent post I located was a June 11, 2009, write up but the link pointed to “Crystal Fallout” dated May 22, 2009. Scroll to the bottom of this page because the results are listed in chronological order, with the most recent write up at the bottom of the stack. The first article, dated May 3, 2006, is interesting. “It’s Here: XML Publisher Enterprise Is Released” by Tim Dexter provides a run down of the features of this XSQL product. A download link is provided, but it points to a registration process. I terminated the process because I wasn’t that interested in having an Oracle rep call me.
I found “BI Publisher Enterprise 10.1.3.2. Comes Out of Hiding” interesting as well. The notion that an Oracle product cannot be found underscores another aspect of Oracle’s messaging. From surprising chronological order to hiding a key product, Oracle XSQL seems to be on the sidelines in my opinion.
An August 31, 2007 post “A Brief History of BIP” surprised me. The enterprise publishing project was not a main development effort. It evolved out of frustration with circa 2007 Oracle tools. Mr. Dexter wrote:
Three years later and the tool has come a long way … we still have a long way to go of course. But you’ll find it in EBS, PeopleSoft, JDE, BIEE as a standalone product, integrated with APEX and maybe even bundled with the database one day – its a fun ride, exhausting but fun.
This statement, if accurate, pegs one part of XSQL in 2004. (I apologize that the links point to the long list of postings, but Oracle’s system apparently cannot link to a single Web log post on a separate Web page. Annoying, I know. MarkLogic’s system provides such fine grain control with a mouse click, gentle reader.)
When we hit 2009, posts begin to taper off. A new release, 10.1.3.3.3, was announced in May 2008. The interesting posts described the method of tapping into External Data Engines (Part 1, May 13, 2008, and Part 2, May 15, 2008).
The flow seems somewhat non intuitive to me, even after reading two detailed Web log posts.
An iPhone version of Publisher became available on July 17, 2008.
In August 2008, Version 10.1.3.4 was released. The principal features, as I understand them, were:
- Integration with Oracle Enterprise Performance Management Workspace
- Integration with Oracle “Smart Space”
- Support for multidimensional data sources, including Hyperion Essbase, SQL Server, and SAP Business Information Warehouse (!)
- Usability and operation enhancements which seem to eliminate the need to write scripts for routine functions
- Support for triggers
- Enhanced Web services support
- A Word template builder
- Support for BEA Web Logic, JBoss, and Mac OS X.
Another release came out in April 2009. This one was 10.1.3.4.1 and focused on enhancements. When I scanned the list of changes, most of these modifications looked like bug fixes to me. In April 2009, Tim Dexter explained a migration gotcha. I read this as a pretty big glitch in one Oracle service integrating with another Oracle service.
Stepping back, I am left with the impression that XSQL and this product are not the mainstream interest of “big” Oracle. In fact, if I had to decide between Oracle’s XSQL and MarkLogic Server, I would not hesitate to select MarkLogic’s solution for these reasons:
- MarkLogic has one mission: facilitate content and information management. The company is not running an XQuery side show. The company runs an XQuery main event.
- The MarkLogic server generates pages that make it easy to produce crunchy content. The Oracle system produces big chunks of content that are difficult to access and print out. Manual copying and pasting is necessary to extract information from the referenced Web log.
- The search function in MarkLogic works. Search in Oracle is slow and returns unpredictable results. I encountered this problem when trying to figure out whether “search” means “Ultra Search” or “SES”.
So, I appreciate the feedback about my enthusiasm for MarkLogic. I think my judgment is sound. Go with an outfit that does something well, not something that is a sideline.
Stephen Arnold, July 14, 2009
Mysteries of Online Available as a Free Report
July 13, 2009
A student at a library school in Toronto sent me an email asking for permission to reuse two of the write ups in this Web log’s “Mysteries of Online” series. I wrote nine essays which are findable via the Blossom search box on any of this Web log’s pages. After that request, I decided to make life easy for students and any other person who wanted to review what I have learned in the last couple of decades about online information and deriving revenue from that type of information.
You can now click here and download a PDF that contains the nine essays. I have added a short disclaimer and a basic table of contents so you can locate the essay you wish to review. I did not prepare an index or insert the illustrations that I use in my formal lectures and presentations.
The only caveat attached to the document is that if you work for a commercial enterprise, write me at seaky2000 at yahoo dot com to let me know what you want to do. There is some legal boilerplate that must be inserted if you want to recycle my work.
Stephen Arnold, July 13, 2009
Arnold Study Explains Google as a New Information Medium
July 13, 2009
A new in-depth search industry monograph that examines how Google’s highly sophisticated technical resources could position it to become a formidable player in 21st Century media is now available.
“Google: The Digital Gutenberg” offers critical analysis and information to anyone connected with traditional media enterprises and to anyone charged with calculating threats and potential for media companies in book and magazine publishing, directory and guide compilation, television and video. It is authored by Stephen E. Arnold, http://www.arnoldit.com, a consultant in search, content processing and text analytics.
Arnold explains in the monograph that Google is a type of large-scale disruptor. “It’s the poster child of larger changes made possible by technology, infrastructure and user demands,” he said.
Google continues to push products and services into different business sectors, and media is an obvious target. These waves can be disruptive and often the cause of surprising reactions; “Google: The Digital Gutenberg” explains the current industry climate surrounding Google and will help the traditional media industry understand how the technological landscape is changing as led by Google’s charge.
Google is best known as a Web search vendor and an online advertising system. But Google as a publisher is a relatively new concept. The company already offers a number of revenue-generating opportunities such as the AdSense program to publishers and business at-large.
But Google is now moving to also offer video content organization, actual advertising sales and Google search within sites, among other functions. The monograph closes with a discussion of the Google App Engine, which offers users the ability to build and host web applications on Google’s infrastructure. Essentially, the partner who uses Google as a back office can negotiate revenue splits with Google, making it an effective shot in the arm for failing traditional media revenues.
The study reviews Google’s content automation methods, dataspace functions and the company’s increasing impact on education, scholarly publishing, and commercial online business as Google positions itself as a major player in the media industry.
Arnold closes “Google: The Digital Gutenberg” by urging readers to develop products and services for the Google platform. Those who choose to ignore Google risk being left behind as Internet technology forges the new media of the future.
The monograph, available now, is published by Infonortics. Orders may be placed online at https://www.infonortics.com//https/goog-ord.html or by mail using the order form here. A table of contents is posted here.
Jessica Bratcher, July 13, 2009
Google’s Demise with a Network World Imprimatur
July 11, 2009
I sat on the May 2009 write up by Kaila Colbin. She wrote the first part of her “Google’s Dilemma and Why It Will Die, Part One” and Network World published the essay. You can read Part One here. Part Two became available on June 9, 2009. You can read that installment here. The “it” in the title refers to Google not the dilemma, but I pushed that ambiguity aside and jumped into the argument.
Part One trots out the problem innovators have; that is, in the intellectual shade of Clayton Christensen, Google won’t be able to keep pace with the many problems, see through its blind spots, and recapture the sizzle that made the steak great before it stayed on the fire too long.
Part Two recycles more of the good professor Christensen’s findings and races to this conclusion:
Finally, any investor in Google would have to concerned about the narrowness of its success. They’ve never gone through a leadership transition. They’ve never generated significant revenue from anything other than AdWords. As of now, they’re Gloria Gaynor; they’re Dexys Midnight Runners; they’re Harper Lee. Don’t get me wrong — I’d welcome the kind of success and financial returns associated with I Will Survive, Come On Eileen, or To Kill a Mockingbird — but it’s generally not desirable to predicate long-term business strategy on a single major success.
Let me offer several observations:
- What if Google is, as I argued in my three Google monographs here, a new type of company? Will the lessons of those who found themselves crushed by the logic of Mr. Christensen’s analysis apply? One quick question: who regulates the GOOG if it parks its data beyond the three mile limit on its data center barges?
- How will Google’s controlled chaos, eternal betas, and surround and seep tactics help protect the company from the type of ossification that characterizes some of its competitors? Will companies like Microsoft and Yahoo, to pick two competitors, themselves be able to change to keep pace with the multi-front tech probes Google delights in launching?
- What is Google doing that makes it vulnerable to the specific points Ms. Colbin plucked from the Christensen book? Are there any steps Google has taken to resolve those issues, say, for example, some of the ideas in The Innovator’s Prescription?
Network World’s imprimatur is not enough to make me grab on to this analysis.
Stephen Arnold, July 11, 2009
Lingospot: Technology for Publishers
July 10, 2009
I have been thinking about the problem facing traditional publishing companies. Some pundits suggest that publishers need to solve their problems with technology. My view is that publishers think about technology in a way that makes sense for their company’s business and work processes. I have come across two companies who have been providing publishers with quite sophisticated technology. The management of these two firms’ clients gets the credit. The technology vendors are enablers.
I provided some recent information about Mark Logic Corporation in my summary of my presentation to the Mark Logic user’s group a few weeks ago. You can refresh your memory about Mark Logic’s technology by scanning the company’s Web site.
The other company is Lingospot. Among its customers are Forbes Magazine and the National Newspaper Association. The company offers hosted software solutions. Publishing companies comprise some of Lingospot’s customers, but the firm has deals with marketing firms as well.
The company describes its technology in this way:
Lingospot’s patented topic recognition and ranking technology is the result of more than eight years of development and four years of machine learning. To date, our algorithm has identified over 30 million unique topics drawn from more than two billion pages that we have crawled and analyzed. During the last four years, we have collected over five billion data points on such topics, including the context in which readers have chosen to interact with each topic. What does all this mean for our clients? By partnering with Lingospot you have access to the leading topic recognition, extraction and ranking technology, as well as the accumulated machine learning of our platform. This translates into a more engaging experience for your readers and substantially higher metrics and revenue for you.
My understanding of this technology is that Lingospot can automatically generate topic pages from a client’s content and then handle syndication of the content. Lingospot works with text, images, and video.
The company is based in Los Angeles. Its founder, Nikos Iatropoulos, was involved with several other marketing-centric companies. He worked for Credit Suisse and, like the addled goose, did a stint with Booz, Allen & Hamilton. His co-founder is Gerald Chao, who is the company’s chief technical officer. Prior to founding Lingospot, Gerald served as a Senior Programmer at WebSciences International and as a programmer at IBM. Gerald holds an MS in Computer Science and a PhD in statistical natural language processing, both from UCLA.
Publishers are embracing technology. My hunch is that the business models need adjustment.
Stephen Arnold, July 10, 2009
The Guardian Sees Opportunity in User Generated Content
July 9, 2009
The Guardian is taking a week off from its nibbling at Googzilla’s paws. I found “User-Generated Content Is Only the Beginning” here to be a gust of early autumn wind. Jeff Jarvis, WWGD author, is the author of the essay. He wrote:
Bild’s [a newspaper] partnering with readers is part of a trend that this paper’s editor, Alan Rusbridger, is calling the “mutualisation of news”. Last week, he talked with staff about the concept, using as examples the paper’s own collaboration with readers. The separation between reporter and reader, he said, blurs as they work together. That is the future – and natural state – of media; collaboration not just in content creation, but now in advertising as well.
The idea is that publishing companies can tap into the stem cell growth of user generated content. Sounds good. My thought is that those autumn winds, no matter how warm and gentle, come before the long winter. Where is my parka?
Stephen Arnold, July 9, 2009
Elsevier Policy Change for Sponsored Publications
July 8, 2009
Short honk: CBC News has published “Publisher Elsevier to Revise Journal Sponsorship Rules” here. The CBC said:
“Elsevier will review practices related to all article reprint, compilation or custom publications and set out guidelines on content, permission, use of imprint and repackaging to ensure that such publications are not confused with Elsevier’s core peer-reviewed journals and that the sponsorship of any publication is clearly disclosed,” the company said in a statement on Thursday. The company expects to complete its review and issue new guidelines by June 30. An internal review showed a series of publications produced in Australia between 2000 and 2005 used the words “Journal of” in their name but lacked proper disclosure of sponsors, were not, in fact, journals and should not have been titled as such, the company said.
A shift in advertorials and their attendant revenue is underway. No word about the impact on those researchers who were unaware of the difference between peer review and sponsored publications. A somewhat spicy commentary is here.
Stephen Arnold, July 8, 2009
Print and the Digital Gutenberg
July 7, 2009
The premise of my new study Google: The Digital Gutenberg is that a new medium has supplanted traditional media. I don’t mean print. I mean video, images, and constructs. The Google Wave is a bundle of Google components that eliminate the need for separate types of communication modes. Wave may or may not work, but whoever pulls it off will be rolling in money and opportunity. Hamilton Nolan, writing in Gawker, reprinted David Eggers’ email, which is germane to my thesis. You can find Mr. Eggers’ email here. Mr. Eggers wrote:
Anyway. I would like to say to you good print-loving people that for every dire bit of news there is out there, there is also some good news, too. The main gist of my (rambling) speech at the Author’s Guild was that because I work with kids in San Francisco, I see every day that their enthusiasm for the printed word is no different from that of kids from any other era. Reports that no one reads anymore, especially young people, are greatly overstated and almost always factually lacking. I’ve written about youth readership elsewhere, but to reiterate: sales of young adult books are actually up. Total volume of all book sales is actually up. Kids get the same things out of books that they have before. Reading in elementary schools and middle schools is no different than any other time. We have work to do with keeping high schoolers reading, but then again, I meet every week with 15 high schoolers in San Francisco, and all we do is read (literary magazines, books, journals, websites, everything) in the process of putting together the Best American Nonrequired Reading. And I have to say these students, 14 to 18 years old, are far better read and more astute than I was at their age, and there are a million other kids around the country just like them.
I agree. I just think that a new medium will supplant ink-on-paper and to a certain extent other media that are expensive to distribute via traditional means. I have three or four publishers selling my books, and I know that a downturn is underway. My newest publication “Mysteries of Online” will be made available without charge on my Web site. One of my publishers expressed interest in the material, but I am shifting my approach. Call it a test.
Information is not at risk. Certain approaches and business models are at risk. Innovation and experimentation are needed.
Mr. Eggers wrote:
As long as newspapers offer less each day— less news, less great writing, less graphic innovation, fewer photos— then they’re giving readers few reasons to pay for the paper itself. With our prototype, we aim to make the physical object so beautiful and luxurious that it will seem a bargain at $1. The web obviously presents all kinds of advantages for breaking news, but the printed newspaper does and will always have a slew of advantages, too. It’s our admittedly unorthodox opinion that the two can coexist, and in fact should coexist. But they need to do different things. To survive, the newspaper, and the physical book, needs to set itself apart from the web. Physical forms of the written word need to offer a clear and different experience. And if they do, we believe, they will survive. Again, this is a time to roar back and assert and celebrate the beauty of the printed page. Give people something to fight for, and they will fight for it. Give something to pay for, and they’ll pay for it.
My view is that the medium of innovations like Wave may offer an opportunity. The paper “thing” is expensive and becoming an issue for some concerned with the environment. Information is thriving.
Stephen Arnold, June 6, 2009
Sci Tech Publishers: Doom Looms for the Tech Challenged
July 3, 2009
Quite interesting essay by Michael Nielsen: “Is Scientific Publishing about to Be Disrupted?” The answer is soon. I don’t agree. Sci tech publishing is in the midst of a crisis. If you want to know about Mr. Nielsen’s good news interpretation of the coming disruption, dive in.
Mr. Nielsen, in case you haven’t been keeping up with quantum computation, is a real life wizard. He is one of the pioneers of quantum computation. Together with Ike Chuang of MIT, he wrote the standard text on quantum computation. This is the most highly cited physics publication of the last 25 years, and one of the ten most highly cited physics books of all time (Source: Google Scholar, December 2007). He is the author of more than fifty scientific papers, including invited contributions to Nature and Scientific American. His research contributions include involvement in one of the first quantum teleportation experiments (related), named as one of Science Magazine’s Top Ten Breakthroughs of the Year for 1998, quantum gate teleportation, quantum process tomography, the fundamental majorization theorem for comparing entangled quantum states, and critical contributions to the formula for the quantum channel capacity.
He explains that publishers are victims of a local optimum; that is, publishers know where they should take their companies. Publishers just can’t bridge the gap. He provides a useful discussion of the knocks traditional media deliver to the digital door to online information.
But the guts of the write up are gathered in his discussion of non traditional publishing of scientific and technical information. The links are useful and the examples are compelling. Let me mention one; the others you can glean directly from his write up. He wrote:
Or consider startups like SciVee (YouTube for scientists), the Public Library of Science, the Journal of Visualized Experiments, vibrant community sites like OpenWetWare and the Alzheimer Research Forum, and dozens more. And then there are companies like WordPress, Friendfeed, and Wikimedia, that weren’t started with science in mind, but which are increasingly helping scientists communicate their research. This flourishing ecosystem is not too dissimilar from the sudden flourishing of online news services we saw over the period 2000 to 2005.
He concludes his essay with some examples of new opportunities. His recipe for success is that publishers must understand technology in the way Steve Jobs and Messrs Brin and Page do. That’s where he and I part company. A technologist like Mr. Nielsen assumes that a motivated manager can identify, recruit, and manage a world class technologist or somehow edge closer to this capability.
Won’t happen. Technologists like Mr. Nielsen come from a different dimension; sci tech publishers adopt a very different technology world. Nevertheless, the essay is interesting and worth reading.
Stephen Arnold, July 3, 2009
WSJ: Now Upgraded to Viagra Class Spammer
July 1, 2009
Short honk: Yep, at 7:56 am the Wall Street Journal began spamming me to become a subscriber. Well, the newspaper achieved one objective. I have suspended my Wall Street Journal subscription. I did enjoy this type of information about the proud, oh, so proud New Jersey publication. I wrote. I called customer support. I posted two previous stories about this company’s spamming of existing customers. Now there is one less customer and my legal eagle is writing the consumer complaint entities in the great state of New Jersey and the Commonwealth of Kentucky. One person asked me not to describe newspapers as the “dead tree” crowd. Sorry. When a paying customer gets spammed, not only is the organization a fully fledged dead tree publisher, it has achieved the rank of Viagra class spammer. As an observer, I can be critical. As a customer, I can be miffed. The spammer (Rupert, are you listening?) has lost one real, live, paying customer. How many more will your silly marketing methods drive away? Oh, I know. The Wall Street Journal is too important, too big to fail. Gee, I hear an echo.
Stephen Arnold, July 1, 2009

