Limitations of MSFT Exchange 2010

March 16, 2010

I am not sure how one of my goslings came across this spreadsheet tucked away on the Microsoft Exchange Web log. When I tried to access the file, the system did not recognize my “official” Microsoft MSDN user ID nor my Windows Live credentials. So you may have to register to access the blog. Once there, you need to look for the download section and visually inspect the file names for the one that points to the Exchange Performance Excel spreadsheet. Running a query in the blog’s search box produced zero hits for me. But with some persistence and patience I was able to get a copy of the spreadsheet. Latency was a problem when I was fiddling with this download. (Note: if the link is dead, write one of the goslings at benkent2020 at yahoo dot com, and maybe he will email you a copy of this document.)

Once you get the document “Scalability Limitations”, you will see some pretty interesting information. One quick example is that the spreadsheet includes three columns of specifics about scaling amidst the more marketing oriented data on the spreadsheet. These three juicy columns are:

  • Limitation
  • Issue
  • Mitigation.

Here’s the information for the row Database Size:

  • Limitation–Exchange 2007 – 200GB; Exchange 2010 – 2TB or 1 disk, whichever is less
  • Issue–The DB size guidance changed from 200GB (if you are in CCR) to 2TB or 1 disk, whichever is greater (if you have 2+ copies of the DB in question)
  • Mitigation—Blank. No information.

Okay.

I hope you are able to locate this document. For those of you eager to install Exchange 2010, SharePoint 2010, and Fast Search 2010, you will want to make sure you have these types of spreadsheets at your fingertips * before * you jump on the Microsoft Enterprise steam engine. The information in the spreadsheet makes clear why some types of email content processing may be expensive to implement.

Stephen E Arnold, March 16, 2010

This is the equivalent of the free newspaper Velocity in Louisville. Read it for nothing. I will report working for no dough to the Jefferson County agency that thinks I work in Louisville when I spend most of my time in the warm embrace of airlines.

Indexing Craziness

March 15, 2010

I read “Folksonomy and Taxonomy – do you have to choose?,” which takes the position that a SharePoint administrator can use a formal controlled term list or just let the users slap their own terms into an index field. The buzzword for allowing users to index documents is part of a larger 20 something invention—folksonomy. The key segment for me in the SharePoint centric Jopx blog was:

The way that SharePoint 2010 supports the notion of promoting free tags into a managed taxonomy demonstrates that a folksonomy can be used as a source to define a taxonomy as well.

Let me try and save you a lot of grief. Indexing must be normalized. The idea is to use certain terms to retrieve documents with reasonable reliability. Humans who are not trained indexers do a lousy job of applying terms. Even professional indexers working in production settings fall into some well known ruts. For example, unless care is exercised in management and making the term list available, humans will work from memory. The result is indexing that is wrong about 15 percent of the time. Machine indexing when properly tuned can hit that rate. The problem is that the person looking for information assumes that indexing is 100 percent accurate. It is not.

The idea behind controlled term lists is that these are logically consistent. When changes are made such as the addition of a term such as “webinar” as a related term to “seminar”, a method exists to keep the terms consistent and a system is in place to update the index terms for the corpus.
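Here is a minimal sketch of what "normalized" means in practice. It is an illustration, not how SharePoint 2010 handles managed terms. The term list, the variants, and the function names are all hypothetical. The point is that every tag a user types either maps to one preferred term or gets kicked out for a human to review.

```python
# Hypothetical controlled term list: each preferred term lists the variant
# forms folded into it, plus its related terms (assumed data, not from any
# real vocabulary).
CONTROLLED_TERMS = {
    "seminar": {"variants": {"seminars"}, "related": {"webinar"}},
    "webinar": {"variants": {"webinars", "web seminar"}, "related": {"seminar"}},
    "purchase order": {"variants": {"po", "purchase orders"}, "related": set()},
}


def build_lookup(terms):
    """Map every preferred term and variant (lowercased) to its preferred term."""
    lookup = {}
    for preferred, entry in terms.items():
        lookup[preferred.lower()] = preferred
        for variant in entry["variants"]:
            lookup[variant.lower()] = preferred
    return lookup


def normalize_tags(raw_tags, lookup):
    """Split user-assigned tags into preferred terms and rejected tags."""
    accepted, rejected = [], []
    for tag in raw_tags:
        # Lowercase and collapse runs of whitespace before the lookup.
        key = " ".join(tag.lower().split())
        preferred = lookup.get(key)
        if preferred is None:
            rejected.append(tag)  # needs review by whoever owns the term list
        elif preferred not in accepted:
            accepted.append(preferred)
    return accepted, rejected


lookup = build_lookup(CONTROLLED_TERMS)
accepted, rejected = normalize_tags(
    ["Webinars", "PO", "  purchase   orders ", "misc stuff"], lookup
)
print(accepted)  # ['webinar', 'purchase order']
print(rejected)  # ['misc stuff']
```

The lookup is the easy part. Keeping the term list current, and re-indexing the corpus when it changes, is where the work is.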

When there is a mix of indexing methods, the likelihood of having a mess is pretty high. The way around this problem is to throw an array of “related” links in front of the user and invite the user to click around. This approach to discovery entertains the clueless but leads to the potential for rat holes and wasted time.

Most organizations don’t have the appetite to create a controlled term list and keep it current. The result is the approach that is something I encounter frequently. I see a mix of these methods:

  1. A controlled term list from someplace (old Oracle or Convera term list, a version of the ABI/INFORM or some other commercial database controlled vocabulary, or something from a specialty vendor)
  2. User assigned terms; that is, uncontrolled terms. (This approach works when you have big data like Google but it is not so good when there are little data, which is how I would characterize most SharePoint installations.)
  3. Indexes based on parsing the content.

A user may enter a term such as “Smith purchase order” and get a bunch of extra work. Users are not too good at searching, and this patchwork of indexing terms ensures that some users will have to do the Easter egg drill; that is, look for the specific information needed. When it is located, some users like me make a note card and keep it handy. No more Easter egg hunts for that item for me.

What about third party SharePoint metadata generators? These generate metadata but they don’t solve the problem of normalizing index terms.

SharePoint and its touting of metadata as the solution to search woes are interesting. In my opinion, the approach implemented within SharePoint will make it more difficult for some users to find data, not easier. And, in my opinion, the resulting index term list will be a mess. When a search engine uses these flawed index terms, the search results force the user to look for information the old fashioned way.

Stephen E Arnold, March 15, 2010

A free write up. No one paid me to write this article. I will report non payment to the SharePoint fans at the Department of Defense. Metadata works first time every time at the DoD I assume.

BA-Insight: New Angle on Lead Generation

March 13, 2010

The Microsoft Fast search road show was in New York this week. I stayed in rural Kentucky watching the acid run off trickle into my goose pond. I took time out from this strenuous activity to read “BA-Insight Announces New Direct Access to Free Information and Resources for SharePoint Search and Fast.”

BA-Insight develops software, including Longitude which “helps people find and analyze relevant information across the entire enterprise independently of format or location.” The firm’s Web site has been revamped and features “an enhanced support portal and new free resource library specially designed for enterprise evaluating SharePoint or Fast Search or engaging in SharePoint or Fast Search deployments, including Fast ESP.”

I took a look at the site. The splash page is below, but you will see different graphics because the rectangular area features a slide show of information.

[Image: BA-Insight splash page]

Source: http://www.ba-insight.net/Pages/Home.aspx

You can download white papers, get links to videos, and access the company’s Web logs. One of the documents is the Microsoft Enterprise Search 2010 Roadmap. When I clicked on that link, I saw another link and the icon labeled premium shown below.

[Image: BA-Insight premium icon]

In order to access that document, I was given an option to fill in a form with my name, title, organization, phone, email, and interests. The angle seems to be that to get this document, one must go through a vendor like BA-Insight.

One of the goslings filled in the form and the road map is a single page that explains Microsoft’s five search technologies and lists the capabilities, repository indexing, and manageability features of each product. Interesting stuff.

Here’s one snippet of the roadmap, which is more of a table than a map in my opinion:

[Image: snippet of the BA-Insight roadmap]

Interesting stuff. Particularly with regard to scaling, I wonder if organizations will have the appetite for this type of hardware footprint on site. Will enterprise Fast ESP work from the cloud? © Microsoft 2010.

Several questions:

  • Will more search vendors shift into education or missionary marketing mode to move their systems?
  • In today’s financial climate, will the portal approach supplant the more traditional features-benefit type of marketing that characterizes some search vendors’ Web sites?
  • Has the complexity of the product offering broken the back of the adage “KISS” for business oriented communications?

I will watch to see if other vendors embrace the educational portal approach to sales and lead generation. The addled goose just makes information available via a blog, assuming that content with an edge will generate inquiries. Perhaps once again I am wrong?

Stephen E Arnold, March 13, 2010

No one paid me to write this short article. Because of the references to Microsoft and its five search options, I will report non payment to the Department of Defense, an organization with an interest in Microsoft’s technology.

Hewlett Packard Trim 7

March 12, 2010

Hewlett Packard, a company that I continue to associate with low cost printers and high cost ink, lit up my radar with its acquisition of Lexington, Kentucky-based Exstream Software two years ago. Exstream (now Enterprise Document Automation), like IBM Ricoh Infoprints and Streamserve, generates outputs like invoices with warranty reminders and auto payment bills with coupons for oil change discounts. I learned that in February 2010, HP stepped up its footprint in document management. One of the source documents I examined is “HPTrim 7… How We Got Here?”. The gray background and the dark blue highlights on text were a bit much for the addled goose’s eyes, however. For me, the most interesting segment in the history of Trim 7 was this passage:

Market consolidation meant that lots of little players were gobbled up, as the larger vendors strived to meet the ever challenging demands of the marketplace, picking up technology from these smaller companies and making them a part of their overall product line. Hewlett-Packard, one of the largest IT companies in the world, did the same, acquiring TOWER Software in 2008, but with one subtle difference. Rather than cannibalize the technology and abandon the product, they kept almost all of the staff from the TOWER acquisition and told them to build the next version of what is now known as HP TRIM. And – there were no other products that HP TRIM had to compete with internally unlike a lot of the other acquisitions: IBM/FileNet, OpenText/Hummingbird/Vignette, and Autonomy/Zantaz/Interwoven/Meridio. HP wanted to concentrate on the product that was HP TRIM, and add the backing that only a company like HP can bring to a product. And so, HP TRIM 7 was born.

Digging through the text, HP bought an outfit called Tower and is rolling in other software to create the “new” document management business. You can locate the main page here. Three points jumped out:

First, I did not see any indication that HP’s dynamic document system integrates or “touches” the Trim 7 product. That strikes me as an indication that HP is chasing revenues from silo sales, not integration.

Second, how does one find a document? I could not locate any information about the search and retrieval functions within Trim 7. I surmise that if I use Trim 7 for SharePoint, I in theory would be able to use the Microsoft Fast ESP system to search for content. That also seems to be quite a bit of work; that is, consulting revenue for HP or its partners. My query “search HP Trim” resulted in 10 hits but nothing on point. One result was this page, which was heavy on marketing and light on locating information within the Trim 7 system. After a legal eagle drops a gift on a company named as a party in a legal matter, job one is answering the question, “What’s this about?” Trim 7 may not be able to answer that question.

Third, HP seems to be grabbing enterprise software companies that address really big information problems. With HP’s push into printers and ink, I saw a success that may have caught the firm’s hardware mavens by surprise. The trajectory in enterprise software is being driven by big money acquisitions. I think that the surprise of printing consumables will be different from the surprise of acquisition-based growth. The former was emergent; the latter is closer to MBA spreadsheet fever.

Big bets. Big win or big loss? I am leaning toward the loss option. Outlook: worth monitoring.

Stephen E Arnold, March 12, 2010

No one paid me to write this. Because HP derives significant revenue from ink, I think I have to report non payment to the US government’s printer, GPO.

Bitrix in the Enterprise Search Game

March 12, 2010

Short honk: A happy quack to the reader who sent me a link to “Bitrix Introduces the D.I.G.™ Engine: the Ultimate in Enterprise 2.0 and Web 2.0 Search Technology.” Bitrix was a company not familiar to me and there were no data in my Overflight service.

Bitrix, founded in 1998 and based in the Washington, DC area, asserts that it is a “technology trendsetter.” The company says:

Bitrix, Inc. specializes in the development of content management systems and intranet portal solutions for managing web projects and multifunctional information systems on the Internet. Deployed at more than 30,000 customers worldwide, Bitrix products are fast, reliable, easy to use and highly scalable…Bitrix takes pride in serving clients ranging from Fortune 500 companies to funded startups, including enterprises like Xerox, Toshiba, Epson, Samsung, Panasonic, Volkswagen, Hyundai, KIA, Gazprom, VTB, Zurich Insurance, DPD, PriceWaterHouseCoopers, Cosmopolitan, Vogue, PC Magazine, and many more.

The search system makes use of the firm’s D.I.G. Engine. D.I.G. is “an advanced search engine developed specifically for enterprise Intranets and Web sites that enables high-performance data search in texts, media content and documents with smart ranking, sorting and display. The engine is available in the company’s flagship products – Bitrix Intranet Portal and Bitrix Site Manager.”

The system “enumerates texts, media content and documents while looking for morphological stems and considering their density.” The search results are “filtered with respect to the user access rights before being displayed.” The company adds:

D.I.G. offers manual or immediate automatic data indexing, making content searchable right after its submission. Users may create complex search queries using query language, inclusion/exclusion masks and logic operators, as well as choose specific site sections for a highly targeted search. The technology supports AJAX-powered interactive pages, provides advanced taxonomy service with automatic tag cloud generation, allows making Google Sitemap, as well as a user-specific search form design. It covers English, German and Russian and enables fast and painless connecting of other languages with third-party stemming tables.
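The company does not publish how D.I.G. works under the hood, so what follows is a toy sketch of the three ideas named in the marketing copy: stems, density, and filtering by access rights. Everything here is an assumption made for illustration, including the crude suffix stripper (a stand-in for a real morphological stemmer), the document fields, and the group names.

```python
import re


def crude_stem(word):
    """Very rough suffix stripper; a stand-in for a real stemmer."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stems(text):
    """Lowercase the text, split it into words, and stem each word."""
    return [crude_stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]


def density(query, text):
    """Share of the document's stems that match any stem in the query."""
    doc_stems = stems(text)
    if not doc_stems:
        return 0.0
    wanted = set(stems(query))
    return sum(1 for s in doc_stems if s in wanted) / len(doc_stems)


def search(query, documents, user_groups):
    """Rank documents by stem density, skipping those the user may not see."""
    results = []
    for doc in documents:
        # Access check first: no shared group means the document is skipped.
        if not (doc["allowed_groups"] & user_groups):
            continue
        score = density(query, doc["text"])
        if score > 0:
            results.append((score, doc["id"]))
    return [doc_id for score, doc_id in sorted(results, reverse=True)]


# Hypothetical documents with hypothetical access groups.
DOCS = [
    {"id": "memo-1", "text": "Invoices and invoice approvals",
     "allowed_groups": {"finance"}},
    {"id": "memo-2", "text": "Holiday schedule and one invoice reminder for all staff",
     "allowed_groups": {"finance", "staff"}},
    {"id": "memo-3", "text": "Invoice invoice invoice",
     "allowed_groups": {"executives"}},
]

print(search("invoices", DOCS, {"finance"}))  # ['memo-1', 'memo-2']
print(search("invoices", DOCS, {"staff"}))    # ['memo-2']
```

Note that memo-3 has the highest density but never shows up for either user. That is the access rights filter doing its job before ranking matters.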

There are screenshots of the company’s products on the firm’s Media Gallery page, but I did not see a search results example.

The company offers a “virtual appliance”. The idea is that multiple instances of Bitrix products can run on the same computer each in a virtual space.

Prices for the system are located at http://www.bitrixsoft.com/buy/intranet.php with the range in the $1,500 to $20,000 spectrum.

My impression is that search is an embedded feature, which exemplifies the trend of content management vendors trying to improve the utility of their systems.

Stephen E Arnold, March 12, 2010

No one paid me to write this. With the firm’s location near several interesting Federal entities, I will send an email to one of those Dot Mil addresses and report my status of free writer.

Cisco Throw Down: Accelerating the Internet

March 11, 2010

I keep track of the network hardware folks, but I don’t write about them in Beyond Search. Most of my readers are interested in search, content processing, and electronic information. I am pretty confident about what my readers want because I only have two or three readers. One is my assistant and the other two people are actually Internet café terminals I hacked to get my RSS feed. So, I’m unpopular. No problem for the addled goose in rural Kentucky.

I read “Cisco Shows Off Super Router” because the article gets close to what will be an interesting front in the traditional networking sector’s battle with Google. Yep, I know. Google is a search and advertising company. Save that for the search engine optimization crowd and the azure chip crowd.

The core of the story is the statement allegedly made by John Chambers, Cisco CEO:

“Video is the killer app,” Chambers said. “Video brings the Internet to life.”

The idea is that textual information is yesterday. He is right even though I hate to see the end of an era. I regret the loss of mainframes and the wonderful revenue stream those gizmos delivered to me, but time moves on.

What the article triggered in my thinking was that Hewlett Packard and Cisco had a love spat. Now Cisco is going to find itself going where the money is, and that means into traditional telco land. The problem is that the Google with its own home brew telecommunications capabilities, the stuff it has acquired, and the technology in which it invests is going to be a much larger factor in Cisco’s future. I think that may be bad news for Cisco and for some telcos. The reason is that the Google is pushing toward efficient, automated, lower cost methods.

To learn about one of Google’s little adventures, check out my KMWorld column about a company with “wireless networks that simply work”. Who will win the next series of battles in this coming collision of Google and outfits allied with traditional telecommunications companies? I don’t know, but I wager that the “real” consultants, the poobahs, mavens, and self appointed experts will discover this skirmish soon enough.

Stephen E Arnold, March 11, 2010

I was paid for the KMWorld article, but I was not paid for this reference to the KMWorld article. So, this write up falls into the category of shameless marketing and self promotion. I love it.

The Taxonomy Torpedo

March 9, 2010

Quite an interesting phone call today (March 8, 2010). Apparently the article “A Guide to Developing Taxonomies for Effective Data Management” caught this person’s attention. The write up boils down the taxonomy job to a couple of pages of tips and observations. Baloney in my opinion.

The caller wanted to have me and a gosling provide the green light to a taxonomy project. The method was to use a couple of subject matter experts from marketing and an information technology intern. The idea was to take a word list and use it to index content with the organization’s enterprise search system.

The caller told me, “We let the staff add their own key words. There has been a lot of inconsistency. We will develop our controlled term list and that way we have date, time, and creator; the terms the users assign; and the words in our taxonomy. What do you think?”

What I think is that no one will be able to find some of the relevant data. I am surprised that so many vendors point out that their systems “discover” metadata and provide users with suggestions, lists of related content, and the ability to search by entities.

Doesn’t work.

Here’s why:

  1. Fancy interfaces (user experience in today’s lingo) require consistent, appropriate, and known tags. Most organizations, fresh from doing taxonomy push ups for a day, have wildly inconsistent term lists. A user may know how to locate a document in an idiosyncratic way. If that method involves a controlled word, the user may not get the results she was expecting.
  2. Automatic processes work well when the information objects have enough substantive content to make key word indexing work. I have examined a number of organizations’ content and found inconsistencies in the way in which the organization referred to itself. The controlled terms were rarely used. When a query included a controlled term, the user was puzzled why the result set was not complete.
  3. Most organizations lack the expertise and resources to create a well-formed controlled term list. Ad hoc lists are useful sometimes to those who cooked them up. A comprehensive controlled term list is a great deal of work.
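Point 2 is easy to check before anyone green lights a taxonomy project. The sketch below is one rough way to do it, with made up sample documents and a made up controlled term. It counts what share of documents actually contain each controlled term. It uses plain substring matching, which is a simplifying assumption and not how any particular enterprise search system works.

```python
def term_coverage(documents, controlled_terms):
    """Return, for each controlled term, the share of documents containing it."""
    coverage = {}
    total = len(documents)
    for term in controlled_terms:
        needle = term.lower()
        # Case-insensitive substring match; crude, but fine for a quick audit.
        hits = sum(1 for text in documents if needle in text.lower())
        coverage[term] = hits / total if total else 0.0
    return coverage


# Hypothetical content in which one organization refers to itself four ways.
SAMPLE_DOCS = [
    "Acme Corp quarterly results",
    "ACME Corporation purchase order for Smith",
    "Acme Inc. travel policy",
    "The company announces a new plant",
]

print(term_coverage(SAMPLE_DOCS, ["Acme Corporation"]))
# {'Acme Corporation': 0.25}
```

A user who queries on the controlled term in this example sees one document in four. That is the incomplete result set that puzzles people.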

What’s this mean? The stampede to taxonomies will yield the same dissatisfaction that other partially implemented search features have produced. Talk is easy. Taxonomies and controlled term lists are tough to develop and even harder to keep current.

Stephen E Arnold, March 9, 2010

No one paid me to write this. I mention indexing, so I will report non payment to the Librarian of Congress, or maybe the librarian for the House library, or maybe the librarian for the Senate library. I wonder why there are three libraries for Congress.

Autonomy and Its Single Platform

March 8, 2010

I read “Autonomy Interwoven First to Deliver Integrated Web Content Management, Search, Optimization, and Rich Media on a Single Platform” and realized that the Autonomy team was beginning a positioning campaign. The Cambridge UK firm explained that it is “a global leader in infrastructure software for the enterprise.”

The key word in the write up, in my opinion, is “integrated.” Autonomy offers an integrated bundle of Web content management, search, search engine optimization, and rich media on the IDOL platform. With enterprises struggling to hold down costs, the bundle may allow some organizations to reduce certain license fees, engineering and professional services costs, and the problem solving that goes hand in hand with complicated multi-vendor set ups.

One added twist to the bundle is its support for targeting. The idea is that the Autonomy system can get the appropriate content in front of a specific target. The company said:

The platform provides the full spectrum of implicit and explicit targeting capabilities across all channels, including web, contact center, and social media. Unlike other marketing technologies, Autonomy automatically tags and forms a conceptual understanding of all content, inside and outside a corporation. This underlying capability allows businesses to instantly deliver more relevant, targeted content to customers.

My opinion is that integrated systems are of considerable interest to many CFOs. Information technology costs are difficult to control. When the money is rolling in, that’s not a big deal. When money is not so plentiful, IT becomes a problem. CFOs need solutions that jump over the hurdles an existing software system presents. Autonomy’s new bundle will get a close look. Will it resolve the problems that keep CFOs up at night? It’s too soon to tell.

You can learn more about this integrated system at www.autonomy.com.

Stephen E Arnold, March 8, 2010

No one paid me to write this news item. Since the company is based in the UK, to whom do I report my failure to get cash on the barrel head? Ah, the Department of State, a globally aware organization if there ever was one.

JustSystems in Flux?

March 8, 2010

I received a call about JustSystems, the Japanese company that figured out how to enter complex characters using a three digit code from a mobile phone keypad. A deal with a large mobile device company was the firm’s go-to revenue stream. With changes in mobile technology, that revenue began to dwindle. JustSystems turned to software development and consulting, which are difficult businesses to scale. When I visited the company for my key fob I noted that the firm had more than 500 employees in several locations. The information I reviewed this morning suggested that JustSystems had about 900 employees at the end of 2009.

The firm was of interest to me. I received a Japanese dinner and a key fob after giving a briefing to the company’s owners four or five years ago. I also reported in my story “JustSystems ConceptBase” that the company rolled out a search appliance in some sort of tie up with IBM.

I dug through my files and noticed a data point that I wanted to surface. In April 2009, JustSystems became a subsidiary of the Keyence Corporation. (Asiajin reported this story in April 2009.) Keyence makes a wide range of electronic gizmos. JustSystems pushed into search and content processing, purchasing a US content processing company called Clairvoyance, founded by wizard Dr. David A Evans. The push did not work and the company turned to Keyence, which bought 43.96 percent of JustSystems, valued at about US$50 million. Six months later in 2009, the founders, Kazunori Ukigawa and Hatsuko Ukigawa, a husband and wife team, resigned as chairman and vice chairperson and quit the Board of Directors. Mrs. Ukigawa was one of the most visible female Japanese company heads in a very male Japanese technology sector.

On Friday, I was able to speak to a customer support representative on the firm’s North American hot line. I was not able to get much information about the status of the products, particularly the search appliance. I asked about the office in Pittsburgh, where Clairvoyance was located. I learned that the Pittsburgh office had been closed.

JustSystems is hosting Webinars and publicizing that it is one of the 100 firms identified as a “company that matters” by the prestigious, widely read KMWorld Magazine. The company lists as its customers, Amazon, Thomson, Symantec, Cisco, WIPO, Jaguar, and other high profile firms.

The company’s flagship product is XMetal and the firm offers a “maturity model” for an enterprise “semantic ecosystem.”

Several observations:

  1. JSERI–the original Claritech – Clairvoyance – JustSystems Evans Research–seems to have shut down. See drakesbaycompany.com/documents/JSERI_ExecSummary.pdf
  2. The USPTO published in November 2009 US, “Methods and Apparatus for Interactive Document Clustering.” The assignee is JustSystems Evans Research. The Wikipedia entry is here.
  3. There is no search function on the company’s English language Web site. I took a quick look at the Japanese site and was not able to locate a search function. When I searched Keyence’s Web site for the ConceptBase I got zero hits. Maybe the appliance is a goner?

To sum up, I don’t know if the ConceptBase appliance is currently for sale. I will keep poking around.

Stephen E Arnold, March 8, 2010

No one paid me to write this. I did get paid to go to Tokyo to get my JustSystems key fob. I suppose that counts for something.

Microsoft SharePoint: The CMS Killer

March 7, 2010

I read “Interesting Perspective on How SharePoint Is Capturing the ECM Market.” The write up references a post by Lee Dallas who writes the Big Men on Content blog. The idea is that SharePoint works seamlessly with Active Directory. As a result, access and identity are part of the woodwork, and no information technology staff have to futz around so employees can find and manipulate documents, presentations, or spreadsheets. Furthermore, SharePoint put a stake in the heart of enterprise content management systems by adding collaboration to the create it, find it, and use it approach of the traditional content management vendors. SharePoint won because it added these features and did a great job marketing.

I agree that Microsoft SharePoint seems to be everywhere. I also know that Microsoft has pumped Tiger Juice into its partners and resellers to push the SharePoint solution. The marketing message is reinforced with zeal and great prices. Keep in mind that SharePoint requires a dump truck full of other Microsoft software to deliver on the bullet points in the SharePoint sales presentation.

Now my view on this brilliant success is a bit different.

First, Microsoft SharePoint has been around a long time. It is a combination of products, features, functions. When I hear SharePoint, I see the nCompass logo, circa 2001. I also think “content server”. The current incarnation of SharePoint is a bunch of stuff that requires even more Microsoft stuff to work. A number of Microsoft partners have built software to snap into SharePoint to deliver some of the features that Microsoft talks about but cannot get to work. These range from search to content management itself. I wrote about a SharePoint expert who uses WordPress because SharePoint is too much of a headache. Age can bring wisdom, but I think SharePoint’s trajectory has been one that delivers  mind boggling complexity. SharePoint consultants love the product. Addled geese like me see it as one more crazy enterprise solution that today’s top managers just pay for reflexively.

Second, the world of content management has become mired in muddy road after muddy road. Some projects make travel by donkey delightful. CMS was created to help outfits without any expertise in producing information post Web pages. Then the Web morphed into an applications platform and the CMS vendors were like the buggy whip manufacturers who thought horse powered carriages were a fad. Big CMS projects almost never worked without application of generous layers of money and custom engineering. At the same time, information management became important due to the fine work of the SEC, Enron, Tyco, and other outfits. Now many organizations have to keep track of documents, not lose them like White House email. It turns out that managing electronic information is pretty difficult. The bubble gum approach of Web CMS won’t work for a nuclear power plant engineering change order. Some folks are discovering that a Web page is different from tracking the versions of a diagram for a cooling pipe in an ageing pressurized water reactor. Imagine that!

Third, companies lack the dough to spend wildly for information technology. The financial challenges of many organizations have not been prevented by fancy systems. Some might argue that fancy systems accelerated the impact of certain financial problems. The reason there are the alleged 100 million SharePoint users is a result of really aggressive marketing and bundling. If SharePoint provides job security, go for it. I have heard this sentiment expressed by an information technology company in Europe on more than one occasion.

The net net of SharePoint is that Microsoft is going to make a great deal of money, but there will be a gradual loss of customers. The reason is partly due to demographics and partly due to what I call SharePoint fatigue. When users discover that the fancy metadata functions don’t work, some will poke around. Metadata must be normalized; otherwise, fancy functions don’t work very well. Fixing metadata is expensive. When a cloud service comes along with the function that normalizes metadata transparently, then SharePoint will be behind an eight ball.

SharePoint, like other Microsoft software, is reaching a point where moving forward becomes more difficult and more expensive. That’s the signal for outfits like Google to strike. The death of CMS has given SharePoint a good run. Now that SharePoint may be difficult to scale, stabilize, and extend, SharePoint becomes catnip for Googzilla. Just my opinion.

Stephen E Arnold, March 7, 2010

No one paid me to write this. Since I mention Microsoft, I think I have to report non payment to the many SharePoint fans at the Department of Defense.
