Bitext Augments Enterprise Search with Natural Language Technology
May 7, 2012
Social media is swarming with sound bites about social media. We recently came across a bit of information about Bitext’s recent SIG brainstorming meeting, which prompted further investigation into the company. As the name implies, Bitext is concerned with text bits. Or, as we know it: unstructured content.
Their event was a big success, with attendance turning out to be double what they expected. Social media and business strategies were discussed, particularly in relation to the company’s primary concern of semantics.
Amongst several solutions, consulting services and research and development, NaturalFinder stood out as having value on par with other semantically enriched search technology:
“NaturalFinder is the essential complement for any Internet or intranet search engine as it allows users to query in natural language (Spanish, English, French…) without using Booleans or wildcards. Thanks to its linguistic technology, users can focus on typing their queries in his/her own words as if he/she talked to another person. NaturalFinder will return all relevant documents and more documents than traditional search engines, which are based on keywords.”
It is clear here that technology is continuing to adapt to the larger trend of pervasive informal language. First, we saw unstructured content, as opposed to traditional structured content, utilized for business analytics. Now, we are creating tools that allow search engines to mimic human intelligence.
Megan Feil, May 7, 2012
Sponsored by Ikanow
No Big Deal: Beyond Search Passes 8,000 Articles
May 6, 2012
Beyond Search began in January 2008. I wanted to find a way to keep track of the most interesting news which I had been placing in my Overflight system. You can see some of the Overflight functionality at www.arnoldit.com/trax or www.arnoldit.com/taxonomy. A few days ago, Beyond Search passed the 8,000 post mark. You can search the archive of content using either the site search system, provided by Blossom.com, or the Google Custom Search Engine which indexes site content plus the links Beyond Search editors include in stories. Blossom is the search box at the top of the page. The CSE is labeled “Google.”
You can use the content to track a leading vendor; for example, enter the query “Autonomy” in the site specific search box and you see the events which we consider significant. You can also get my personal views on online products and services. Just run a query for “mysteries of online.” You can use the categories to limit a display to indexed content. No index is perfect, but you can look at a result set for a hot topic like “indexing” with a mouse click or two.
Now about the content.
First, I am not running a news operation. In fact, I don’t do news. Neither my editorial team nor I are real journalists. I am supposed to know about medieval religious sermons in Latin. The writers are mostly librarians or researchers who have been trained to produce the equivalent of a debate note card. I learned how to prepare 5×8 inch note cards when I returned to the US from Brazil and entered a wonderful American high school. Let’s see. That was in 1957 or 1958. In short, I have been doing one thing as my core research method for more than 50 years. Do you think I am going to change because a PR maven, an unemployed middle school teacher, an English major turned search expert or a Panda wants me to? In case you don’t know the answer, the answer is, “No.”
Second, we run sponsored content. We use Google AdSense. We run ads for companies who want to get a message in front of my two or three readers. I wish I knew what the business model for Beyond Search is, but the content continues to flow, seven days a week, year round. When I was in intensive care in January for more than a week, the content flowed. I know one of the editors smuggled my laptop into the hospital lock up where I was. We kept publishing. Those working on the blog just kept on going. My writing was given an extra cycle of editing because I was, quite literally, close to being a gone goose. Keep in mind that the only difference between a note card content object and sponsored content is that the subject of the write up gets a chance to provide input to an editor. The ironic or cynical comments remain. If I get fascinated with a topic, I write about it or get one of the editors to produce content objects on the subject. If you find certain topics get covered and then dropped, it is because I lose interest. You want news? Find a real journalist. Examples of what I follow and then drop range from European search systems to ways to federate the text and numeric data associated with building a fungible product like a personal computer.
Third, I am usually biased, often incorrect, and completely indifferent to the hottest trends that azure chip consultants pump out to sell consulting work. If you read the content in Beyond Search or any of the blogs which we produce, you have the obligation to think about what we present and make your own judgment about its usefulness, accuracy, or appropriateness for your particular situation.
Fourth, I use the content in Beyond Search for my columns in Enterprise Technology Management magazine, Online magazine, Information Today (a library oriented tabloid), KMWorld (an enterprise information tabloid), and Searcher magazine (a specialist publication for people who know how to use the old fashioned Dialog and Lexis systems). The content in my for fee articles is closer to the type of reports I prepare for my one or two clients. I am not a great writer. I try to look at popular or emerging technical trends and put them into the frame of my experience. If you want stories that reinforce received wisdom, you will find Beyond Search inappropriate for your needs. In my for fee columns, I knit together a number of items of information and interpret those items in a business context. The for fee columns, therefore, go beyond what is in the free blog.
My plan is to keep the information stream flowing and free. If you have a comment to make about the point of view or the information in a content object (my word for article or story), use the comments section of the blog. If you write me with spam, silly news releases, and baloney I did not specifically request—be advised: I may write about what I call “desperation marketing.” Don’t like the term? Well, I do, and it is accurate. The facile notion of “pivoting” a company is mostly marketing baloney. I don’t like baloney.
For more information about the editorial policies or how to contact us to get access to our two or three readers, navigate to the About page.
Stephen E Arnold, May 6, 2012
Sponsored by Stephen E Arnold
Buried Alive by Data
May 1, 2012
This recent blog post on the Search Technologies’ Web site makes some amusing and thought provoking comparisons between the reality TV show “Hoarding, Buried Alive!”, and the state of unstructured data within some organizations.
This phrase, “I am absolutely overwhelmed by this, I just don’t know where to start,” is attributed to a hoarder on the TV show. The speaker is contemplating how to tackle a sink piled with dirty dishes. The phrase also applies to an enterprise search program manager contemplating how to begin a project.
The article, Buried Alive by Data, is worth a read for the amusement value alone. However, it also makes some important points. Discipline and due process are key parts of the success recipe. For enterprise search, the award-winning search assessment methodology is cited as a proven approach to project discipline. The comparison made between the lawlessness of a hoarder’s kitchen and the average corporate file share may seem familiar to many readers.
Iain Fletcher, May 1, 2012
Sponsored by Search Technologies
Open Source Search Profiles Available
April 25, 2012
OpenSearchNews.com, the new information service from ArnoldIT, has rolled out a new profile service. The first profile describes the Basho Riak Search system. Although proprietary, the Basho team has made the Riak search system open source. You can request a copy of the Basho Riak profile, which is available without charge, from the Open Source Search Profiles link.
Stephen E Arnold, publisher of OpenSearchNews, said:
Consulting firms specializing in open source search have been slow on the trigger when it comes to vendors who offer an alternative to proprietary, “closed” search systems. My team has completed analyses of a dozen open source search vendors and will post a fresh profile every seven to 10 days. The profiles follow the same type of format which we used in such monographs as The Google Legacy, Beyond Search (published by the “old” Gilbane Group), Enterprise Search Report, Successful Enterprise Search Management, and The New Landscape of Enterprise Search. Instead of paying hundreds, maybe thousands of dollars, ArnoldIT is making the information available without charge to facilitate greater understanding and discussion of open source search options.
Profiles contain:
- Background of the company
- Principal features and functions of the systems
- The upside and downside of the system
- An ArnoldIT “net net” which puts the system in context.
The content of the profiles is intended for individuals, students, and teachers. Libraries are free to use the content without seeking permission. Any other use requires written permission from Stephen E Arnold.
For a complete collection of the 12 profiles, an introduction to open source search, and a summary of where open source search is gaining traction, contact us by writing seaky2000 at yahoo dot com. The information is available in the form of an online or on site briefing. There is a charge for the complete set of information and/or the briefing.
For up-to-date information about open source search solutions built on Lucene, Solr, and Xapian, among others, check out OpenSearchNews.com. You can, of course, wait for one of the azure chip consultants, unemployed Webmasters, or newly minted search experts to recycle ArnoldIT content. However, the profiles are current and will be available without charge. Enjoy.
Donald C Anderson, April 24, 2012
Sponsored by ArnoldIT, your source for strategic information services
Protected: Exclusive Interview: David B. Camarata, IKANOW
April 9, 2012
SharePoint: Internal and External Functionality
March 29, 2012
When you hear the term “dark” these days it usually refers to the surge in vampire romance fiction or Star Wars variants. In the world of web design, however, dark means that web developers use dark web templates on their pages. It takes a good eye for color and contrast to make a dark web site work, and Top SharePoint found “50 Beautiful Dark Web Sites Built on SharePoint.”
As the article’s author explains:
“Personally I am very fond of dark websites even though clean light web design is the main choice, especially in the corporate world. It is true that dark designs have a tendency to feel a bit heavy and harder to read if lot of text is presented but I feel they look more elegant and creative. Besides, using dark backgrounds you can make the content stand out and be the main focus.”
Perusing the list, the web sites that catch my eye are “The City of Calgary,” “Hard Rock Casino Tulsa,” “Club Paradiso,” and “NSU.” Pick your own favorites and discover new ideas for SharePoint web design. No matter what graphic colors you use for your SharePoint site, take into account that if you want visitors to find information you will need excellent enterprise search.
If you want to use SharePoint for more than internal document sharing, you can turn to Search Technologies to assist you in leveraging SharePoint. In addition to dynamic content and rich media, a Search Technologies’ implementation can integrate your public facing Web site with your internal SharePoint system. You will be able to establish immediate and direct interactions with your prospects and customers without losing the SharePoint functionality you need to run your business in a cost effective manner. To learn more, visit www.searchtechnologies.com.
Iain Fletcher, March 29, 2012
Ikea Italy Selects Autonomy
March 28, 2012
Ikea Italy is giving its business to Autonomy: Market Watch informs us that “Autonomy Powers Intelligent Document Management and Process Automation at IKEA Italy.” It seems the deal clincher was the system’s single platform to be deployed across the enterprise. The write up reveals:
IKEA Italy needed a single, centralized repository where employees could quickly and efficiently access relevant business documents and improve the automation of its internal and customer-facing business processes. Furthermore, it needed to improve collaboration between different departments. IKEA Italy selected Autonomy WorkSite and TeleForm to fulfill these requirements. Autonomy’s solutions, built on the Intelligent Data Operating Layer (IDOL), provide IKEA Italy with a single platform for document management and business process automation across the company’s multiple repositories.
Other benefits for Ikea Italy include robust document management functionality; adequate scalability; and multi-language search capabilities. That last facet should prove very valuable; the company’s internal documents are in several languages.
HP bought Autonomy in 2011. The company, originally founded in 1996, is a leader in meaning-based information technology. They take great pride in building tools that efficiently extract meaning from unwieldy tangles of unstructured data.
Originally founded in Sweden, Ikea’s quality, customer-assembled furniture business now spans the globe. It arrived in Italy in 1989. Ikea arrived at Autonomy in 2012. What took so long?
Cynthia Murrell, March 28, 2012
Sponsored by Pandia.com
What Do Search Buy Outs Mean?
March 21, 2012
I worked through the 75 profiles I maintain on search and content processing vendors. Here’s a list of the Big Dogs in search in Year 2000 and what happened to these companies since this date.
Original Name | Buyer | Comment |
Autonomy | Hewlett Packard | “A baby tiger” |
Blossom | Available | Hosted search |
Brainware | Lexmark | Back office |
Convera | Out of business | Parts sold off |
dtSearch | Available | Low cost leader |
Endeca | Oracle | Unclear |
Exalead | Dassault Systèmes | Unclear |
Fast Search | Microsoft | An add in for SharePoint |
Innerprise | GoDaddy | Search |
InQuira | Oracle | Unclear |
Inxight Software | SAP property | Unclear |
Isys Software | Lexmark | Unclear |
Mindbreeze | Part of Fabasoft | Replacement for SharePoint search |
Mondosoft | SurfRay | On shelf |
Ontolica | SurfRay | Replacement for SharePoint search |
Panoptic | Squiz | Now Funnelback |
Recommind | Available | In and out of enterprise search |
Stratify | Autonomy | Formerly Purple Yogi |
Teratext | SAIC | Unclear |
Thunderstone | Available | Enterprise search |
TREX | SAP | Unclear |
TripleHop | Oracle | Unclear |
Vivisimo | Available | Customer support |
This is a selected list. These 22 companies provide a snapshot of what’s happened in enterprise search in the last 12 years. Some observations:
First, in the list of 22 entries, I have used the word “unclear” as a comment eight times. The reason is that I am not sure how the technology will be deployed or if the technology has been orphaned (TREX) or held in reserve (Mondosoft). How does one apply a “system” to a search system (Dassault Exalead)?
Second, of this set of 22 companies which I have written about in Enterprise Search Report (2004 to 2006), Beyond Search (Gilbane), and The New Landscape of Search (Pandia in Oslo), five have not been acquired to my knowledge. One wonders if and when these search vendors will be taken off the table.
Third, the list begs the question, “What is the next wave of search and content processing companies to be purchased, merged, or integrated into a larger entity?” Great question and one which I will not answer in a free blog post.
My thoughts, before they slip away, are:
- With the interest in open source search, what will be the long term revenue and cost picture for proprietary search solutions?
- Will content analytics vendors become the “new search vendors”? IBM’s use of Lucene for its various search solutions provides a suggestion of this shift in its Content Analytics product.
- How will the companies which have acquired search technology make money from these purchases AND be able to invest in the research and development necessary to keep the systems in step with licensee requirements? Frankly, I don’t know. There is only so much money available to pump into the black hole of information retrieval for technology, which in some cases is almost 25 years young.
Net net: Okay, lots of companies have acquired search and retrieval systems. Now what? Not my problem.
Stephen E Arnold, March 21, 2012
Sponsored by Pandia.com
Google and the Enterprise: The Point? Money
March 19, 2012
You must read “Google Enterprise chief Girouard Heads to Startup Upstart.com.” I wondered if a simple executive shuffle many months after a de facto demotion was news. Apparently the poobahs and “real” journalists find a Xoogler worthy of a headline. I have a different view about Google and the enterprise. I write about Google’s latest adventures in my Enterprise Technology Management column, published in the UK, each month.
Google pumped quite a bit of time, effort, money, and Google mouse pads into its enterprise initiative. In the salad days, Google could not learn enough about the companies dominating the enterprise search space. As I researched my Google monographs, I was picking up from interview subjects anecdotal information about the paucity of knowledge Googlers had about what enterprise procurement teams required.
In one memorable, yet still confidential interaction, Google allegedly informed a procurement manager that Google disagreed with a requirement. Now, if that were true, that is something one hears about a kindergarten teacher scolding a recalcitrant five year old. Well, that may have been a fantasy, but there were plenty of rumblings about a lack of customer support, a “fluid” approach to partners, and a belief that whatever Google professionals did was the “one true path.” I never confused Google and Buddha, but for some pundits, Google was going to revolutionize the enterprise. Search was just the pointy end of the spear. The problem, of course, is that organizations are not Googley. In fact, Googley-type actions make some top dogs uncomfortable.
What happened?
Based on my research, which I shifted to the back burner, I learned:
- Google was unable to put on an IBM type suit. The Googley stuff opened doors, but the old Wendy’s hamburger ad sums up what happened after the mouse pads and sparkle pins were distributed: “Where’s the beef?”
- The products and services were not industrial strength and ready for prime time. The notion of an endless beta and taxi meter pricing, no matter how “interesting”, communicated a lack of commitment.
- The enterprise market likes the idea of paying money to be able to talk to a person who in most cases semi-cares about a problem. AT&T makes tons of dough making clients pay four times an engineer’s salary to get a human on the phone any time. Google delegated support down to partners. Won’t work. A Fortune 100 company wants to call Google, not send an email.
- Pricing. If you are not sure what the ballpark cost is for indexing 100 million documents using a search appliance, ensuring 24×7 uptime, and backing up, navigate to www.gsaadvantage.com and look up the price of a Google Search Appliance. Now figure out how much it will cost to process an additional one million documents. How’s that price grab you?
When Larry Page assumed control of the company, I wrote about the wizards who were reporting directly to him. The head of the enterprise unit was not one of those folks. My conclusion: game over.
Like AOL, the notion of having a Google person on staff is darned appealing to some, but as the AOL experience makes clear, a Xoogler is not a sure fire money maker.
Here’s the quote I jotted down from the GigaOM story:
Still, market share and revenue may never have been Google’s goal. By offering a lower-cost option to the Office/Exchange tandem, Google forced the market leader to respond, and that may have been the point all along.
Baloney. Google expected to have big outfits roll over and wag their tails. The US government did not roll over. Most big IBM, Microsoft, and Oracle customers did not roll over. More important, the new wave of enterprise service and solutions providers did not roll over. Why? A lack of focus and a dependence on online advertising, legal hassles, privacy chatter, and a failure to deliver competitive products and services made the enterprise initiative a tough sell. Betas may be great for market tests. For the enterprise, a beta may be a hindrance.
Stephen E Arnold, March 19, 2012
Sponsored by Pandia.com
Protected: AvePoint Takes SharePoint to Japan
March 16, 2012