Protected: DotNetNuke and the New SharePoint Additions
August 2, 2011
Oracle Updates SES11g
August 1, 2011
We wanted to mention the update to Secure Enterprise Search (SES) to our Oracle fans. Users will want to upgrade to 11g Release 1 (11.1.2.2), which can be downloaded at the link above.
First the token bullet list of “what’s new” straight from the Web site:
- All platforms available for download, including Windows 64-bit
- Oracle Access Manager integration for crawler and search application
- Autovue CAD file support
- Custom lexers and stop words lists, on per-data source granularity
It’s nice to see that the add-on is ready to cooperate with Oracle’s own Autovue; including drawings in an index is a must for several industries. Provided it proves functional, adding more flexibility with the stoplist should increase accuracy and weed out those pesky repetitive user-specific terms.
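For readers who want to picture what a stop word list "on per-data source granularity" buys you, here is a minimal Python sketch. It is not Oracle's API: the source names, word lists, and the index_terms helper are all made up for illustration, and a real lexer does far more than split on whitespace.

```python
# Hypothetical illustration only; this is not how SES is configured.
DEFAULT_STOP_WORDS = {"the", "a", "an", "of", "and"}

# Assumed per-source additions: terms that are noise in one source
# but may carry meaning in another.
SOURCE_STOP_WORDS = {
    "cad_drawings": {"rev", "sheet", "drawing"},
    "hr_wiki": {"confidential"},
}


def index_terms(text, source):
    """Return the lowercased tokens of text that survive the stop list for source."""
    stop = DEFAULT_STOP_WORDS | SOURCE_STOP_WORDS.get(source, set())
    return [word for word in text.lower().split() if word not in stop]


print(index_terms("Drawing of the pump housing rev B", "cad_drawings"))
print(index_terms("Drawing of the pump housing rev B", "hr_wiki"))
```

The same sentence yields different index terms depending on where it came from, which is the point of scoping the stop list to the data source.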
I scanned the release notes; no surprises here (a patch is a patch is a patch). There are several known issues, but with a few exceptions the workarounds are adequately documented. Watch out for a possible compatibility issue with IPv6. Keep in mind that Oracle bought a natural language search engine with its InQuira purchase. NLP seems to be an interest of Oracle.
Sarah Rogers, August 1, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Protected: SharePoint Licensees May Want to Check Out Daytona
August 1, 2011
Lucene/Solr Get Bitten by Java Bugs
July 30, 2011
The open source search crowd loves Lucene/Solr. The Java goodness helps deliver cross platform goodness. The article “Apache and Oracle Warn of Serious Java 7 Compiler Bugs” suggests that scratchy bites in interesting places are now arriving. The subtitle delivers a payload:
The newly released Java upgrade suffers hotspot-compiler problems that affect Lucene and Solr
Whoa, horsey.
The bottom line, per Apache Project: Don’t use Apache Lucene/Solr with Java 7 releases before Update 2. If you do, the group recommends that you at least disable loop optimizations using the
-XX:-UseLoopPredicate
JVM option to avoid the risk of index corruptions. The Apache Project strongly recommended that users avoid running any of the hotspot optimization switches in any version of Java without extensive testing. Additionally, Apache Project advised that users who upgrade to Java 7 may need to reindex, “because the Unicode version shipped with Java 7 changed and tokenization behaves differently (e.g. lowercasing).” More information is available in the JRE_VERSION_MIGRATION.txt file that comes with the distribution package.
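As a rough illustration of that advice, the Python sketch below decides whether a launch script should add the flag. The helper names are hypothetical, the "before Update 2" cutoff is simply the one quoted above, and the parser assumes the legacy version-string format (for example 1.7.0_01) that Java 6 and 7 report.

```python
import re

# Flag quoted in the advisory above; it disables the loop predication optimization.
WORKAROUND_FLAG = "-XX:-UseLoopPredicate"


def parse_java_version(version):
    """Parse a legacy string such as '1.7.0_01' into (major, update)."""
    match = re.match(r"^1\.(\d+)\.\d+(?:_(\d+))?", version)
    if not match:
        raise ValueError(f"unrecognized Java version string: {version!r}")
    major = int(match.group(1))
    update = int(match.group(2) or 0)  # no '_NN' suffix means the initial release
    return major, update


def extra_jvm_args(version):
    """Return the workaround flag for Java 7 builds before Update 2, else nothing."""
    major, update = parse_java_version(version)
    if major == 7 and update < 2:
        return [WORKAROUND_FLAG]
    return []


# Assumed usage: prepend the result to whatever command starts the search server.
print(extra_jvm_args("1.7.0"))
print(extra_jvm_args("1.6.0_26"))
```

The flag is a stopgap, not a fix. It does nothing about the reindexing caveat, which applies to any Java 7 upgrade.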
Our view is probably superficial. With Oracle slipping on an open source T shirt and hanging with the community, open source goodness is floating down Dolphin Way. However, missteps can create hassles for the Lucene/Solr folks. One wonders if these errors were a result of smart folks rushing a project out the door in order to hit the beach or if the Oracle touch is like one of those Bruce Lee fingers of death. One touch and you are a goner.
Excitement is here.
Stephen E Arnold, July 30, 2011
Freebie just like some open source search software but not all.
Connectbeam May Be Disconnected
July 29, 2011
Short honk: The Connectbeam.com Web site is down and listed as available on GoDaddy. The company was an enterprise social search vendor. Here’s what I have from one of my Overflight files:
Founded in 2006, Connectbeam provides modern Web services applications to enterprises and their employees. The underlying premise behind Connectbeam’s application is that informal social networks already exist inside all companies and with the right platform, those networks will grow in practical value through efficient information sharing and expert-colleague discovery. Connectbeam was the first company to integrate concepts of social book-marking and tagging with those of social networking specifically for the enterprise. The Connectbeam application resides securely behind your corporate firewall easily deployed as a physical appliance enhancing information-sharing and collaboration in the daily work-flow of enterprise employees, boosting their innovation, improving their decision-making, intensifying their collaboration, and helping them to build valuable relationships across the company, effortlessly. Connectbeam, headquartered in Mountain View, California, is venture-backed and privately held.
If anyone has information, please, post it using the comments section of the blog.
Stephen E Arnold, July 29, 2011
Sponsored by ArnoldIT.com, author of The New Landscape of Enterprise Search
Protected: Double Happiness: SharePoint 2007 and 2010 Together
July 29, 2011
Make Metadata Useful. But What If the Tags Are Lousy?
July 28, 2011
I must be too old and too dense to understand why the noise about metadata gives me a headache. I came across a post or story on the CNBC.com Web site that was halfway between a commercial and a rough draft of an automated indexing vendor’s temp file stuffed with drafts created by a clever intern. The post hauled around this weighty title: “EMA and ASG Webinar: 7 Best Practices For Making Metadata Useful”. The first thing I did was look up EMA and ASG because I was unfamiliar with the acronyms.
I learned that EMA represents a firm called Enterprise Management Associates. The company does information technology and data management research, industry analysis, and consulting. Fair enough. I have done some of the fuzzy wuzzy work for a couple of reasonably competent outfits, including the once stellar Booz, Allen & Hamilton and a handful of large, allegedly successful companies.
ASG is an acronym for ASG Software Solutions. The parent company grows via acquisitions just like Progress Software and, more recently, Google. The focus of the company seems to be “the cloud in your hand.” I am okay with a metaphorical description.
I am confused about metadata. Source: http://www.thebusyfool.com/wp-content/uploads/2011/05/Decisions_clipart.jpg
What caught my attention is the focus on metadata, which in my little world, is the domain of people with degrees in library and information science, years of experience in building ANSI standard controlled term lists, and hands on time with automated and human centric indexing, content processing, and related systems. An ANSI standard controlled term list is not management research, industry analysis, consulting, or the “cloud in your hand.” Controlled term lists which make life bearable for a person seeking information are quite difficult work, combining the vision of an architect and the nitty gritty stamina of a Roman legionnaire building a road through Gaul.
Here’s the passage that caught my attention and earned a place in my “quotes to note” folder:
As data grows horizontally across the enterprise, businesses are faced with the urgent need to better define data and create an accurate, transparent and accessible view of their metadata. Metadata management and business glossary are foundational technologies that can help companies achieve this goal. EMA developed seven best practices that guide companies to get the most of their data management. All attendees receive the complimentary White Paper Managing Metadata for Accessibility, Transparency and Accountability authored by Shawn Rogers.
I am not sure what some of these words and phrases mean. For example, “better define data”. My question, “What data?” Next I struggled with “create an accurate, transparent, and accessible view of their metadata.” Now there are commercial systems which allow “views” of controlled term lists. One such vendor is Access Innovations, an outfit which visited me in rural Kentucky to talk about new approaches to indexing certain types of problematic content which is proliferating in organizations. Think in terms of social content without much context other than a “handle”, date, and time even within a buttoned up company.
What do users know? Image source: http://www.computersunplugged.com.au/images/angry-man.gif
Another phrase that caught my limited attention was “metadata management and business glossary are foundational”. Okay, but before one manages, one must do a modest amount of work. Even automated systems benefit from smart algorithms helped with a friendly human crafted training document set or direct intervention by a professional information scientist. Some organizations use commercial controlled term lists to seed the automatic content tagging system. I am all for management, but I don’t think I want to jump from the hard work to “management” without going to the controlled vocabulary gym and doing some push ups. “Business glossary” baffled me and I was not annoyed by what seems to be a high school grammar misstep. Nope. The “business glossary” is a good thing, but it must be constructed to match the language of the users, the corpus, and the accepted terminology. Indexing a document with the term “terminal” is not too helpful unless there is a “field code” that pegs the terminal as one where I find airplanes, trains, death, or computer stuff. A “business glossary” does not appear from thin air, although a “cloud” outfit may have that notion. I know better.
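The “terminal” example can be made concrete with a small Python sketch. Everything here is hypothetical (the TermIndex class, the field code labels, the document IDs); it only shows the idea of tagging a document with a term plus a field code so that a lookup returns the right sense of the word.

```python
from collections import defaultdict


class TermIndex:
    """Toy index keyed on (term, field code) rather than on the bare term."""

    def __init__(self):
        self._postings = defaultdict(set)

    def tag(self, doc_id, term, field_code):
        """Record that doc_id is about term in the sense given by field_code."""
        self._postings[(term.lower(), field_code)].add(doc_id)

    def find(self, term, field_code):
        """Return the sorted document IDs tagged with term under field_code."""
        return sorted(self._postings.get((term.lower(), field_code), set()))


index = TermIndex()
# Field codes below are invented labels, not drawn from any standard list.
index.tag("doc-1", "terminal", "transport.air")
index.tag("doc-2", "terminal", "medicine.prognosis")
index.tag("doc-3", "Terminal", "computing.hardware")
index.tag("doc-4", "terminal", "transport.air")

print(index.find("terminal", "transport.air"))
print(index.find("terminal", "computing.hardware"))
```

The code is the easy part. Deciding which field codes exist and applying them consistently across a corpus is the Roman legionnaire work.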
I did a quick Google search for “Shawn Rogers,” author of the white paper. Note: I don’t know what a white paper is. The first hit is to a document which is on what I think is a pay-to-play information service called “b-eye”. The second hit points to a LinkedIn profile. I don’t know if this is “the” Shawn Rogers whom I seek. I learned that he is:
[a professional who] has more than 19 years of hands-on IT experience with a focus on Internet-enabled technology. In 2004 he cofounded the BeyeNETWORK and held the position of Executive Vice President and Editorial Director. Shawn guided the company’s international growth strategy and helped the BeyeNETWORK grow to 18 web sites around the world making it the largest and most read community covering the business intelligence, data warehousing, performance management and data integration space. The BeyeNETWORK was sold to TechTarget in April 2010.
I concluded this was “the” Mr. Rogers I sought and that he or his organization is darned good at search engine optimization type work.
What clicked in my mind was a triple tap of hypotheses:
- A couple of services firms have teamed up to cash in on the taxonomy and metadata craze. I thought metadata had come and gone, but obviously these firms are, to use Google’s metaphor, putting more wood behind the metadata thing. So, this is marketing in order to sell services. As I said, I am okay with that.
- These firms have found a way to address the core problem of indexing by people who do not have the faintest idea of what’s involved in metatagging that helps users. One hopes.
- The two companies are not sure what the outcome of the webinar and the white paper distribution will be. In short, this is a fishing trip or an exploration of the paths on an island owned by a cruise company. There’s not much at risk.
Okay, enough.
Here’s my view on metadata.
First, most organizations have zero editorial policy and zero willingness to do the hard work required to dedupe, normalize, and tag content in a way that allows a user to find a particular item without sticky notes, making phone calls, or clicking and scanning stuff for the needed items. I think vendors promise the sun and moon and deliver gravel. Don’t agree? Use the comments section, please. Don’t call me.
Second, most of the vendors who offer industrial strength indexing and content processing systems know what needs to be done to make content findable. But the licensees often want a silver bullet. So the vendors remain silent on certain key points such as the Roman legionnaire working in the snow part. The cost part is often pushed to the margin as well.
Third, the information technology professionals “know” best. Not surprisingly most content access in organizations is a pretty lukewarm activity. I received an email last week chastising me for pointing out that more than half of an organization’s search system users were dissatisfied with whatever system the company made available. Hey, I just report the facts. I know how to find information in my organization.
Fourth, no one pays real attention to the user of a system. The top brass, the IT experts, and the vendors talk about the users. The users don’t know anything and whatever input those folks provide is not germane to the smarties. Little wonder that in some organizations systems are just worked around. Tells range from a Google search appliance in marketing to sticky notes on monitors.
Will I attend the webinar? Nah. I don’t do webinars. Do I want to change the world and make every organization have a super duper controlled term list and findable content? Nah. Don’t care. Do I want outfits like CNBC to do a tiny bit of content curation before posting unusual write ups with possible grammatical errors? You bet. What if those metadata and other tags are uncontrolled, improperly applied, and mismatched to the lingo? Status quo, I assert.
Enjoy the webinar. Good luck with your metadata and the “cloud in your hands” approach. Back to the goose pond. Honk.
Stephen E Arnold, July 29, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search, which is not a white paper and it is not free. But at $20, such a deal.
Protected: SharePoint and Your Feed Need
July 28, 2011
Protected: A Tips Spreadsheet for SharePoint Columns
July 27, 2011
Making Libraries Easier to View in SharePoint
July 26, 2011
Working with a single monitor and multiple windows open is a constant pain, especially with a teeny, tiny laptop. Switching between the windows is frustrating, but the smart computer user averts this problem by hooking up two or more monitors to the computer. SharePoint users face a similar problem when working with multiple libraries at once, but it’s not as simple as hooking up another cable to a computer. The wonderful Laura Rogers at SPTech Web wrote a great article called “Display Multiple Libraries in SharePoint 2010” that helps explain the many ways to handle this issue. The article said:
A frequent requirement in SharePoint projects is to display documents from multiple libraries together. There are several different methods in which this can be achieved in SharePoint 2010, ranging from simple out-of-the-box Web parts to more advanced data view Web parts. Of course, as with most tasks in SharePoint, it can be done with custom development, but I usually try to steer towards out-of-the-box functionalities before going that route.
The different methods for displaying documents are outlined in this article along with the pros and cons of each one. The web parts she describes are the “relevant documents,” “content query,” “what’s new,” “data view-merge sources,” “data view-content roll-up,” and “data view-search results.” Read about each one to learn which method will work the best for you.
While you’re at it, head on over to SurfRay.com to learn how to improve your SharePoint search.
Stephen E Arnold, July 26, 2011
Sponsored by SurfRay, developers of Ontolica.