Track the Output of SharePoint Fast Search Crawl Logs

August 7, 2012

Do you need to pull SharePoint Fast Search crawl logs? We do. We read with interest an item on Microsoft’s TechNet Web site. “Get SharePoint Search Crawl Logs” provides an almost ready-to-run script which will accept a search service name and display the associated crawl logs. If there is a crawl log with an error, the script flags that instance. The script can be edited so that it returns different information from the crawl logs. To make this tweak, edit the $crawlLogFilters value.

SharePoint Fast usually does an excellent job of processing content. However, some documents can be malformed or an unexpected network issue can arise. As a result, certain content can be skipped or ignored. A visual inspection of crawl logs is not practical when SharePoint is processing large volumes of content.

If you want to view the crawl logs, TechNet provides a wealth of information. A good place to begin your investigation is in the TechNet Library. If you want to export the SharePoint 2010 search crawl logs, you will find a useful PowerShell script in Dave Mc’s Blog in the article “Export the SharePoint 2010 Search Crawl Log.” MSDN also provides information about exporting SharePoint 2010 search crawl logs. To access this information, navigate to the SharePoint Escalation Team’s blog.
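Once a crawl log has been exported, flagging the problem entries does not require reading every row. The following is a minimal Python sketch of that idea, not the TechNet script or Dave Mc’s export script. The CSV layout, the column names (Url, Status, Message), and the sample rows are all assumptions made for illustration; a real export may use different fields.

```python
import csv
import io
from collections import Counter

# Hypothetical export layout: these column names and rows are assumptions
# for illustration, not the output format of any particular script.
SAMPLE_EXPORT = """Url,Status,Message
http://intranet/docs/a.docx,Success,
http://intranet/docs/b.pdf,Error,The item could not be parsed
http://intranet/docs/c.xlsx,Warning,Content truncated
http://intranet/docs/d.pdf,Error,Network timeout
"""


def flag_problem_rows(csv_text, ok_statuses=("Success",)):
    """Return the rows whose Status is not one of ok_statuses."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["Status"] not in ok_statuses]


def summarize(rows):
    """Count the flagged rows per status value."""
    return Counter(row["Status"] for row in rows)


problems = flag_problem_rows(SAMPLE_EXPORT)
summary = summarize(problems)

# Print one line per skipped or failed item.
for row in problems:
    print(f'{row["Status"]}: {row["Url"]} ({row["Message"]})')
```

Changing the ok_statuses argument plays a role similar to editing a filter in the script: it decides which entries count as worth a second look.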

Search Technologies’ team of experienced engineers can provide automation tools which eliminate the need to search for solutions to common problems. To learn more about our SharePoint and Fast Search implementation services, navigate to http://www.searchtechnologies.com/microsoft-search.html or contact us at info@searchtechnologies.com.

Iain Fletcher, August 7, 2012

Sponsored by Augmentext

Discussion on Plans for SharePoint 2013 Migrations

August 7, 2012

In “Migrating to SharePoint 2013,” Chris Wright speculates on the new SharePoint release, potential adoption rates, Cloud versus on-premises deployments, and third party options. The author points out that those users of SharePoint Online have a relatively clear upgrade path without much to worry about. However, he adds this about on-premises users:

On-premises users of SharePoint have a much bigger decision to make, and more traditional upgrade options. Early commentators suggest that the full locally installed version of SharePoint has seen slightly less focus than the cloud version. The biggest areas of improvement are web content management, enterprise content management and search.

Wright also suggests that if all else fails, look into a third party migration tool for an easier solution. Third party tools should not be overlooked when adding value to your SharePoint system. We like the feedback we’ve seen about Fabasoft Mindbreeze. Here you can read about the mobility solutions from Mindbreeze:

Fabasoft Mindbreeze Mobile makes company knowledge available on all mobile devices. You can act freely, independently and yet always securely. Irrespective of what format the data is in. Full functionality: Search results are displayed homogenously to the web client with regards to clear design and intuitive navigation.

And with information pairing of your cloud and on-premise data, users can easily access important business information on the go from their smartphones and tablets. The well-established and cost-effective solution is worth a second look at http://www.mindbreeze.com/.

Philip West, August 7, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Connecting Engineering to Other Departments through Data Management

August 7, 2012

One of the greatest problems, historically, impacting manufacturing enterprises is the lack of connectivity between engineering and other departments.  Product lifecycle management (PLM) solutions have helped diminish that problem but haven’t eliminated it entirely. A recent Concurrent Engineering article, “PLM and Product Development Connects Engineering and Service”, discusses how PLM solutions need to bridge the gap between engineering and other departments.

As the article explains it,

“Developments in PLM now look to decrease the gap between engineering and service parts of an organisation. This makes collaboration and data sharing more convenient and more sustainable. It will reduce reliance on meetings and manual feedback procedures because the PLM system will automatically feed data across departments. Extending the scope of PLM across an organisation increases serviceability and creates a more unified approach to product development.”

Inforbix is a PLM provider that approaches data management differently.  As their Website explains,

“Inforbix captures engineering information (eg. CAD data, bill of materials, information from PDM, and other enterprise systems) and makes it available for people outside of engineering. Inforbix product data apps are intuitive and easy to use. It’s a “Google-like” approach that makes finding and sharing engineering and manufacturing data fast and easy.”

As more enterprises demand that their PLM solutions do just what Inforbix describes above, we will see more PLM providers follow suit.

Catherine Lamsfuss, August 7, 2012
Sponsored by ArnoldIT.com, developer of Augmentext.

Lexmark Touts Brainware as a Global Player

August 6, 2012

In their News Blog, Lexmark praises Brainware’s latest global associations in “Brainware Update: Capturing Relationships Globally.” The piece explains that data capture outfit Brainware is “expanding more than ever before,” with several new partnerships in new regions formed over the last six weeks alone. These new allies include Mexico’s STN Latam, whose specialty is finance resource planning; outsourcing and IT consultants Novosit in the Dominican Republic; and IT services firm Content Concepts, operating in the Asia Pacific market. Nice work. See the write up for more details on each enterprise.

The Lexmark second quarter earnings call also mentioned Brainware, stating:

“Perceptive announced that the University of Kansas plans to expand the use of Perceptive Software solutions to a university-wide contract. This will also include the use of Brainware’s award-winning Distiller software to streamline invoice processing. . . .

“And we are leveraging our MPS enterprise presence along with our new Brainware intelligent capture expertise to help our customers extend their smart MFPs to now scan, classify and extract key content from documents all automatically and deposit the content directly into a core system or process, reducing time and manual labor costs in the process.”

It sounds like Brainware is bringing a lot to Lexmark’s projects. Brainware was formed in 2006 with the buyout of technology from SER Solutions. The company emphasizes that its auto-learning data capture and search solutions are scalable and user-friendly.

Veteran enterprise search technology vendor ISYS Search Software, another Lexmark acquisition, also received a (brief) mention in the earnings call.

Cynthia Murrell, August 6, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Perfecting Web Site Semantics

August 6, 2012

Web site search is most often frustrating, and at its worst, a detriment to customers and commerce.  Fabasoft Mindbreeze, a company heralded for its advances in enterprise search, is bringing its semantic specialization to the world of Web site search with Fabasoft Mindbreeze InSite.  Daniel Fallmann, Fabasoft Mindbreeze CEO, highlights the features of the new product in his blog entry, “4 Points for Perfect Website Semantics.”

Fallmann lays out the problem:

The problem: Standard search machines, in particular the one provided by CMS, are unproductive and don’t consider the website’s sophisticated structure. The best example: enter the search term ‘product’ and the search delivers no results, even though product is its own category on the site. Even if the search produces a result for another term, there’s nothing more than a ‘relatively un-motivating list of links,’ not really much help to a website visitor.

Using semantics in the search means that the Web site is being understood, not just keyword searched.  Automatic indexing preserves the existing site structure, while providing hassle-free search for the customer.  In addition, InSite benefits the Web site developer, in that he/she can see how users are navigating the site and which elements are most often searched.

The attractive “behind-the-scenes” functioning of Fabasoft Mindbreeze InSite means that customers benefit from the intuitive, semantic search without the distraction of a clunky search layer.  Satisfy your customers and your developers by exploring InSite today.

Emily Rae Aldridge, August 6, 2012

Sponsored by ArnoldIT.com, developer of Augmentext.

The Debate between PLM and PDM Continues

August 6, 2012

Differentiating between Product Data Management (PDM) and Product Lifecycle Management (PLM) is difficult at times and usually starts a very heated debate. One such debate is documented on Engineering Matters in the article, “PLM is Just Data Management… Whatever Dude?”. The author, Chad Jackson, writes in response to a blog post by Adam O’Hearn in which O’Hearn claims PLM and PDM are one and the same.

Jackson summarizes his thoughts on the matter:

“…it’s probably obvious that I disagree Adam’s statement that PLM is nothing more than PDM, ‘whatever dude’ objection withstanding. But I understand where he is coming from. I think his perspective represents the more recent view of PLM that has its foundation in PDM, design release and change management. Additionally, I’d advise Adam not to hold his breath for a PLM Software Provider to step forward to address the shortcomings of PDM.”

One PLM provider has addressed this exact issue – Inforbix. The young company tackles the issue head-on in a blog post on their website:

“Instantly having access to the data you need is the ultimate goal of any data management system. Inforbix takes an approach that involves the application of web, semantic, and cloud technologies to provide users an alternative yet easy means of aggregating and exposing data where ever it’s located. Inforbix embraces the notion that when it comes to data management, less is more.”

Both O’Hearn and Jackson are correct in identifying the fuzziness surrounding PLM and PDM and their distinguishing characteristics. Hopefully, more PLM providers will address this problem, leading to a clearer understanding of it and to a solution.

Catherine Lamsfuss, August 6, 2012

Sponsored by ArnoldIT.com, developer of Augmentext.

 

How I Know Facebook Faces Challenges

August 6, 2012

The big tip off is the story in USA Today, “4 Reasons Investors Don’t Like Facebook.” The story appeared in the dead tree edition on August 2, 2012. Another clue is the clumsy handling of Facebook developers. This fumble was given that “real” journalism twist in “Schadenfreude, Anyone? In Wake of Facebook Bullying Claims, Google+ Chief Vic Gundotra Woos Developers.” In case you don’t remember, schadenfreude suggests enjoying another’s discomfort. But for me, the bright yellow flashing light is the Barron’s story “UBS Hit by Facebook Loss; Vows Legal Action Against Nasdaq.” Definitely not a good thing when cousins shoot at one another with real bullets.

Can Facebook get its act together? My hunch is that social media push back is likely to take a toll on Facebook and probably some other social media companies. When one is sitting in the dorm without much desire to study coefficients of friction, fiddling with Facebook is a nifty distraction. Working as an intern allows some time to connect with friends. However, once one realizes that time is a scarce resource, Facebook and other social media lovers may start looking for a new hook up.

What I find fascinating is that Google and Microsoft Bing don’t want to accept that social media may not be the innovation to ignite these firms’ online revenues. Google has suggested that Google Plus is the new Google. Microsoft is a me-too outfit, so social content is getting attention in Redmond. What happens when social drops to a utility function?

The search giants are going to have to focus on relevance and finding high value content. Facebook’s challenges, therefore, are going to be on deck to cause headaches for the search services in my opinion.

Stephen E Arnold, August 6, 2012

Sponsored by Augmentext

Short Honk: High Value Podcast about Solr

August 4, 2012

If you are interested in Lucene/Solr and have a long commute, you will want to check out Episode 187 of the IEEE’s Software Engineering Podcast. You can find the podcast on iTunes. Grant Ingersoll, one of Lucid Imagination’s experts in open source search and a committer on the Apache Lucene/Solr project, reviews the origins of Lucene, explains the features of Solr, and covers a range of important, hard to get search information. According to IEEE, the podcast offers a:

dive into the architecture of the Solr search engine. The architecture portion of the interview covers the Lucene full-text index, including the text ingestion process, how indexes are built, and how the search engine ranks search results.  Grant also explains some of the key differences between a search engine and a relational database, and why both have a place within modern application architectures.

One of the highlights of the podcast is Mr. Ingersoll’s explanation of vector space indexing. Even a high school brush with trigonometry is sufficient to make this important subject fascinating. Highly recommended.
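For readers who want to see where the trigonometry comes in, here is a minimal Python sketch of the textbook vector space idea: each document and query becomes a vector of term counts, and relevance is the cosine of the angle between them. This is a generic illustration, not a reproduction of Mr. Ingersoll’s explanation and not how Lucene or Solr actually scores results. The function names and sample documents are made up for the example.

```python
import math
from collections import Counter


def term_vector(text):
    """Turn text into a sparse vector of raw term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Score every document against the query, best match first."""
    q = term_vector(query)
    scored = [(cosine_similarity(q, term_vector(d)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative documents only.
docs = [
    "solr search engine",
    "relational database tables",
    "lucene full text search index",
]
ranked = rank("search engine", docs)
for score, doc in ranked:
    print(f"{score:.3f}  {doc}")
```

A document that shares no terms with the query sits at a right angle to it and scores zero, while a document pointing in the same direction scores close to one. Production engines layer term weighting and length normalization on top of this basic geometry.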

Stephen E Arnold, August 4, 2012

Sponsored by Augmentext

Vertical Searches for User Manuals

August 4, 2012

Makeuseof presents a handy collection of vertical search sites in “Can’t Find a User Manual for Your Gear? Search These Specialist Websites.” Writer Saikat Basu observes that, in the excitement of a new purchase, most of us stuff our user manuals into some corner and forget about them—until we need them! He comments:

“User manuals – those thick (or thin) soft covered sheaf’s of paper with multi-lingual instructions and weird hieroglyphics that we don’t bother to read. . . . We all have rummaged through the house looking for the user manual we ‘misplaced’. No luck.

“Here’s where a bit of smarts comes in. The meticulous guy with foresight will either scan it and keep a softcopy in his computer, or look for a softcopy that’s usually available as PDF on the manufacturer’s site.

“There’s a third option – a bunch of specialist websites which does the hard work for us lazybones, and stockpiles user manuals for us to search and download.”

So, instead of combing through the filing cabinet or, worse, those paper-piles every office seems to collect, turn to this list of sites that can put the desired information at your fingertips at the speed of, well, of your Internet connection. Basu details six sites, describing the purpose behind each, how it works, and what he values most about each one. For example, he likes the forums on Safe Manuals, and appreciates the teardown diagrams at iFixit.

The other four sites that made the list include Retrevo, Manuals Online, eSpares, and Free Manuals (aka TheManuals.com). I recommend tucking the article away for your next manual-related urgency. At the end of the article, Basu puts out the call for reader recommendations, so check the comments section for similar sites.

Cynthia Murrell, August 4, 2012

Sponsored by ArnoldIT.com, developer of Augmentext

Research and Development Innovation: A New Study from a Search Vendor

August 3, 2012

I received a message from LinkedIn about a news item called “What Are the Keys to Innovation in R&D?” I followed the links and learned that the “study” was sponsored by Coveo, a search vendor based in Canada. You can access similar information about the study by navigating to the blog post “New Study: The Keys to Innovation for R&D Organizations – Their Own, Unused Knowledge.” (You will also want to reference the news release about the study. It is on the Coveo News and Events page.)

Engineers need access to the drawings and those data behind the component or subsystem manufactured by their employer. Text based search systems cannot handle this type of specialized data without some additional work or the use of third party systems. A happy quack to PRLog: http://www.prlog.org/10416296-mechanical-design-drawing-services.jpg

The main point of the study, as I interpret it, is marketing Coveo as a tool to facilitate knowledge management. Even though I write a monthly column for the print and online publication KMWorld, I do not have a definition of knowledge management with which I am comfortable. The years I spent at Booz, Allen & Hamilton taught me that management is darned tough to define. Management as a practice is even more difficult to do well. Managing research and development is one of the more difficult tasks a CEO must handle. Not even Google has an answer. Google is now buying companies to have a future, not inventing its future with existing staff.

The unhappy state of many search and content processing companies is evidence that those with technological expertise may not be able to generate consistent and growing revenues. Innovation in search has become a matter of jazzing up interfaces and turning up the marketing volume. The $10 billion paid for Autonomy, the top dog in the search and content processing space, triggered grousing by Hewlett Packard’s top executives. Disappointing revenues may have contributed to the departure of some high profile Autonomy Corporation executives. Not even the HP way can make traditional search technology pay off as expected, hoped, and needed. Search vendors are having a tough time growing fast enough to stay ahead of spiking technical and support costs.

When I studied for a year at the Jesuit-run Duquesne University, I encountered Dr. Frances J. Chivers. The venerable PhD was an expert in epistemology with a deep appreciation for the lively St. Augustine and the comedian Johann Gottlieb Fichte. I was indexing medieval Latin sermons. I had to take “required” courses in “knowledge.” In the mid 1960s, there were not too many computer science departments in the text indexing game, so I assume that Duquesne’s administrators believed that sticking me in the epistemology track would improve the performance of my mainframe indexing software. Well, let me tell you: Knowledge is a tough nut to crack.

Now you can appreciate my consternation when the two words are juxtaposed and used by search vendors to sell indexing. Dr. Chivers did not have a clue about what I was doing and why. I tried to avoid getting involved in discussions that referenced existentialism, hermeneutics, and related subjects. Hey, I liked the indexing thing and the grant money. To this day, I avoid talking about knowledge.

Selected Findings

Back to the study. Coveo reports:

We recently polled R&D teams about how they use and share innovation across offices and departments, and the challenges they face in doing so.  Because R&D is a primary creator and consumer of knowledge, these organizations should be a model for how to utilize and share it. However, as we’ve seen in the demand for our intelligent indexing technology, and as revealed in the study, we found that R&D teams are more apt to duplicate work, lose knowledge and operate in soloed, “tribal” environments where information isn’t shared and experts can’t be found.  This creates a huge opportunity for those who get it right—to out-innovate and out-perform their competition.

The questions I raised to myself were, “How were the responses from Twitter verified as coming from qualified respondents?” And, “How many engineers with professional licenses, versus individuals who, like Yahoo’s former president, just arbitrarily awarded themselves a particular certification, were in the study?” Also, “What statistical tests were applied to the results to validate that the data met textbook-recommended margins of error?”

I may have the answers to these questions in the source documents. I have written about “number shaping” at some of the firms with which I have worked, and I have addressed the issue more directly in my opt in, personal news service Honk. (Honk, a free weekly newsletter, is a no-holds-barred look at one hot topic in search and content processing. Those with a propensity to high blood pressure should not subscribe.)

