Beyond Google, How to Work Your Search Engine

August 28, 2015

The article on Funnelback titled Five Ways to Improve Your Website Search offers tips that may seem obvious but could always stand to be reinforced. Sometimes the Google site:<url> operator is not enough. The first tip, for example, is simply to be helpful. That means recognizing synonyms and perhaps adding an autocomplete function in case your site users think in different terms than you do. The worst-case scenario in search is typing in a term and getting no results, especially when the problem is just language and the thing being searched for is actually present, just not found. The article goes into the importance of the personal touch as well,

“You can use more than just the user’s search term to inform the results your search engine delivers… For example, if you search for ‘open day’ on a university website, it might be more appropriate to promote and display an ‘International Open Day’ event result to prospective international students instead of your ‘Domestic Student Open Day’ counterpart event. This change in search behavior could be determined by the user’s location – even if it wasn’t part of their original search query.”
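The open day example in the quote can be sketched in a few lines of Python. This is only an illustration, not Funnelback's method: the Result class, the audience tag, the home_country default and the boost value are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    score: float           # base relevance reported by the search engine
    audience: str = "any"  # hypothetical tag: "domestic", "international" or "any"


def rerank(results, user_country, home_country="AU", boost=0.5):
    """Sort results by score, boosting those tagged for the visitor's audience."""
    audience = "domestic" if user_country == home_country else "international"

    def adjusted(result):
        return result.score + (boost if result.audience == audience else 0.0)

    return sorted(results, key=adjusted, reverse=True)


results = [
    Result("Domestic Student Open Day", 1.0, "domestic"),
    Result("International Open Day", 0.9, "international"),
    Result("Campus map", 0.4),
]

# Same query, same candidate results, different visitor location.
top_for_visitor = rerank(results, user_country="IN")[0].title
top_for_local = rerank(results, user_country="AU")[0].title
```

The query never changes here; only the location signal moves one event above the other, which is the point the quoted passage makes.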

The article also suggests learning from the search engine. Obviously, analyzing what customers are most likely to search for on your website will tell you a lot about what sort of marketing is working, and what sort of customers you are attracting. Don’t underestimate search.
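A rough sketch of what learning from the search engine can look like in practice: count the most common queries and the queries that returned nothing. The log format, a list of (query, result_count) pairs, and the sample data are assumptions for illustration; a real site would read these from its own search logs.

```python
from collections import Counter


def summarize_queries(log):
    """Count all queries and zero-result queries from (query, result_count) pairs."""
    all_queries = Counter()
    zero_hits = Counter()
    for query, result_count in log:
        normalized = " ".join(query.lower().split())  # fold case and extra spaces
        all_queries[normalized] += 1
        if result_count == 0:
            zero_hits[normalized] += 1
    return all_queries, zero_hits


# Hypothetical log entries.
sample_log = [
    ("open day", 12),
    ("Open Day", 12),
    ("tuition fees", 0),
    ("fees", 30),
    ("tuition  fees", 0),
]

all_queries, zero_hits = summarize_queries(sample_log)
```

The zero-result list is where synonym problems tend to show up: a query like "tuition fees" that keeps returning nothing while "fees" returns plenty suggests the content is present, just not found.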

Chelsea Kerwin, August 28, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Lexmark: Signs of Trouble?

August 27, 2015

I read “Shares of Lexmark International Inc. Sees Large Outflow of Money.”

The main point of the write up in my opinion was:

The company shares have dropped 41.65% in the past 52 Weeks. On August 25, 2014 The shares registered one year high of $50.63 and one year low was seen on August 21, 2015 at $29.11.

Today as I write this (August 26, 2015), Lexmark is trading at $28.25.

Why do I care?

The company acquired several search and content processing systems in the firm’s effort to find a replacement for the firm’s traditional business, printers. As you know, Lexmark is one of the IBM units which had an opportunity to find its future outside of IBM.

The company purchased three vendors which were among the companies I monitored:

  • Brainware, the trigram folks
  • ISYS Search Software, the 1988 old school search and retrieval system
  • Kapow (via Lexmark’s purchase of Kofax), the data normalization outfit.

Also, the company’s headquarters are about an hour from my cabin next to the pond filled with mine run off. Cutbacks at Lexmark may spell more mobile homes in my neck of the woods.

Stephen E Arnold, August 27, 2015

Insights into the Cut and Paste Coding Crowd

August 26, 2015

I read “How Developers Search for Code.” Interesting. The write up points out what I have observed. Programmers search for existing — wait for it — code.

Why write something when there are wonderful snippets to recycle? Here’s the paragraph I highlighted:

We also learn that a search session is generally just one to two minutes in length and involves just one to two queries and one to two file clicks.

Yep, very researchy. Very detailed. Very shallow. Little wonder that most software rolls out in endless waves of fixes. Good enough is the sort of sigma way.

Encouraging. Now why did that air traffic control crash happen? Where are the back ups to the data in Google’s Belgium server center? Why does that wonderful Windows 10 suck down data to mobile devices with little regard for data caps? Why does malware surface in Android apps?

Good enough: the new approach to software QA/QC.

Stephen E Arnold, August 26, 2015

How to Search the Ashley-Madison Data and Discover If You Had an Affair Too

August 26, 2015

If you haven’t heard about the affair-promoting website Ashley Madison’s data breach, you might want to crawl out from under that rock and learn about the millions of email addresses that hackers exposed as linked to the infidelity site. In spite of claims by parent company Avid Life Media that users’ discretion was secure, and that the servers were “kind of untouchable,” as many as 37 million customers have been exposed. Perhaps unsurprisingly, a huge number of government and military personnel have been found on the list. The article on Reuters titled Hacker’s Ashley Madison Data Dump Threatens Marriages, Reputations also mentions that the dump has divorce lawyers clicking their heels with glee at their good luck. As for the motivation of the hackers? The article explains,

“The hackers’ move to identify members of the marital cheating website appeared aimed at maximum damage to the company, which also runs websites such as Cougarlife.com and EstablishedMen.com, causing public embarrassment to its members, rather than financial gain. “Find yourself in here?,” said the group, which calls itself the Impact Team, in a statement alongside the data dump. “It was [Avid Life Media] that failed you and lied to you. Prosecute them and claim damages. Then move on with your life. Learn your lesson and make amends. Embarrassing now, but you’ll get over it.”

If you would like to “find yourself” or at least check to see if any of your email addresses are part of the data dump, you are able to do so. The original data was put on the dark web, which is not easily accessible for most people. But the website Trustify lets people search for themselves and their partners to see if they were part of the scandal. The website states,

“Many people will face embarrassment, professional problems, and even divorce when their private details were exposed. Enter your email address (or the email address of your spouse) to see if your sexual preferences and other information was exposed on Ashley Madison or Adult Friend Finder. Please note that an email will be sent to this address.”

It’s also important to keep in mind that many of the email accounts registered to Ashley Madison seem to be stolen. However, the ability to search the data has already yielded some embarrassment for public officials and, of course, “family values” activist Josh Duggar. The article on the Daily Mail titled Names of 37 Million Cheating Spouses Are Leaked Online: Hackers Dump Huge Data File Revealing Clients of Adultery Website Ashley Madison- Including Bankers, UN and Vatican Staff goes into great detail about the company, the owners (married couple Noel and Amanda Biderman) and how hackers took it upon themselves to be the moral police of the internet. But the article also mentions,

“Ashley Madison’s sign-up process does not require verification of an email address to set up an account. This means addresses might have been used by others, and doesn’t prove that person used the site themselves.”

Some people are already claiming that they had never heard of Ashley Madison in spite of their emails being included in the data dump. Meanwhile, the Errata Security Blog entry titled Notes on the Ashley-Madison Dump defends the cybersecurity of Ashley Madison. The article says,

“They tokenized credit card transactions and didn’t store full credit card numbers. They hashed passwords correctly with bcrypt. They stored email addresses and passwords in separate tables, to make grabbing them (slightly) harder. Thus, this hasn’t become a massive breach of passwords and credit-card numbers that other large breaches have lead to. They deserve praise for this.”

Praise for this, if for nothing else. The impact of this data breach is still only beginning, with millions of marriages and reputations in the most immediate trouble, and the public perception of the cloud and cybersecurity close behind.

 

Chelsea Kerwin, August 26, 2015


SLI Share Price: Headwinds for Search Evident

August 26, 2015

I read “SLI CEO Ryan Bemoans Low Share price, Says It Should Be $2-Plus.” This is a woulda, coulda, shoulda write up. Reality seems to ignore this somewhat lame mantra.

The write up says:

SLI Systems chief executive Shaun Ryan says the company’s share price is “significantly underpriced” and could be at least four times higher based on other public software-as-a-service valuations.

The write up included this bit of information:

The company today reported a loss of $7.1 million in the year ended June 30, widening from a loss of $5.7 million a year earlier. Operating revenue increased 27 percent to $28.1 million, in line with the $28 million guidance given in April, when it flagged that second-half sales would be lower than expected. Annualized recurring revenue (ARR), its preferred financial measure based on forward subscription revenue, rose 39 percent to $34.6 million.

SLI says its system

… helps you increase e-commerce revenue by connecting your online and mobile shoppers with the products they’re most likely to buy. SLI solutions include SaaS-based learning search, navigation, merchandising, mobile, recommendations and user-generated SEO.

Other publicly traded search vendors are struggling with their financial performance too. For example, Sprylogics, a Canadian vendor, sees its shares trading at $0.33. Lexmark shares are at $28 and change.

Search is a tough niche as Hewlett Packard and IBM are learning.

Stephen E Arnold, August 29, 2015

What Might be Left Out of SharePoint 2016

August 25, 2015

When a new version of any major software is released, users get nervous as to whether their favorite features will continue to be supported or will be phased out. Deprecation is the process of phasing out certain components, and users are warily eyeing SharePoint Server 2016. Read all the details in the Search Content Management article, “Where Can We Expect Deprecation in SharePoint 2016?”

The article begins:

“New versions of Microsoft products always include a variety of additional tools and capabilities, but the flip side of updating software is that familiar features are retired or deprecated. We can expect some changes with SharePoint 2016.”

While Microsoft has yet to officially release the list of what will make the cut and what will be deprecated, they have made it known that InfoPath is being let go. To stay on top of future developments as they happen, stay tuned to ArnoldIT.com. Stephen E. Arnold has made a lifetime career out of all things search, and he lends his expertise to SharePoint on a dedicated feed. It is a great resource for SharePoint tips and tricks at a glance.

Emily Rae Aldridge, August 25, 2015


Insights Into SharePoint 2013 Search

August 25, 2015

It has been a while since we have discussed SharePoint 2013 and enterprise search. Upon reading “SharePoint 2013: Some Observations On Enterprise Search” from Steven Van de Craen’s Blog, we noticed some new insights into how users can locate information on the collaborative content platform.

The first item he brings to our attention is the “content source,” an out-of-the-box managed property option that creates result sources that aggregate content from different content sources, i.e. different storehouses within SharePoint. Content source can become a crawled property. What happens is that meta elements from Web pages made on SharePoint can be added to crawled properties and made searchable content:

“After crawling this Web site with SharePoint 2013 Search it will create (if new) or use (if existing) a Crawled Property and store the content from the meta element. The Crawled Property can then be mapped to Managed Properties to return, filter or sort query results.”

Another useful option was made possible by a user’s request: adding query string parameters to crawled properties. This allows more information to be stored in the search index. Unfortunately this option is not available out-of-the-box and has to be programmed using content enrichment.
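To make the idea concrete, here is a minimal Python sketch of the transformation such an enrichment step would perform: pulling the query string parameters out of a crawled URL and turning them into named properties. It does not use the SharePoint content enrichment interface; the function name, the qs_ prefix and the example URL are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def query_params_as_properties(url, prefix="qs_"):
    """Map each query string parameter of a URL to a prefixed property name."""
    params = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per parameter; keep the first value for simplicity.
    return {prefix + name: values[0] for name, values in params.items()}


# Hypothetical crawled URL.
props = query_params_as_properties(
    "https://intranet.example.com/pages/view.aspx?dept=legal&year=2015"
)
```

In a real deployment the resulting values would still have to be mapped to managed properties before they could be used to return, filter or sort results.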

Enterprise search on SharePoint 2013 still needs to be tweaked and fine-tuned, especially as users’ search demands become more complex. It makes us wonder when Microsoft will release the next SharePoint installment and whether the next upgrade will resolve some of these issues or unleash a brand new slew of problems. We cannot wait for that can of worms.

Whitney Grace, August 25, 2015

 

The Integration of Elasticsearch and Sharepoint Adds Capabilities

August 24, 2015

The article on the IDM Blog titled BA Insight Brings Together Elasticsearch and Sharepoint describes yet another vendor embracing Elasticsearch and falling in love again with Sharepoint. The integration of Elasticsearch and Sharepoint enables customers to use Elasticsearch through Sharepoint portals. The integration also made BA Insight’s portfolio accessible through open source Elasticsearch as well as Logstash and Kibana, Elastic’s data retrieval and reporting systems, respectively. The article quotes the Director of Product Management at Elastic,

“BA Insight makes it possible for Elasticsearch and SharePoint to work seamlessly together…By enabling Elastic’s powerful real-time search and analytics capabilities in SharePoint, enterprises will be able to optimize how they use data within their applications and portals.” “Combining Elasticsearch and SharePoint opens up a world of exciting applications for our customers, ranging from geosearch and pattern search through search on machine data, data visualization, and low-latency search,” said Jeff Fried, CTO of BA Insight.

Specific capabilities that the integration will enable include connectors to over fifty systems, auto-classification, federation to improve the presentation of results within the Sharepoint framework, and applications like Smart Previews and Matter Comparison. Users can also decide for themselves whether to use the Sharepoint search engine or Elastic’s, or to combine the two and merge the results into a single set. Empowering users to make the best choice for their data is at the heart of the integration.
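How results from two engines might be put together into one set can be sketched with a simple round-robin merge. This is not BA Insight's federation logic, which the article does not describe; the function and the sample URL lists below are assumptions for illustration only.

```python
def interleave(primary, secondary, limit=10):
    """Round-robin merge of two ranked URL lists, dropping duplicates."""
    merged, seen = [], set()
    for i in range(max(len(primary), len(secondary))):
        for ranked in (primary, secondary):
            if i < len(ranked) and ranked[i] not in seen:
                seen.add(ranked[i])
                merged.append(ranked[i])
    return merged[:limit]


# Hypothetical ranked hits from each engine for the same query.
sharepoint_hits = ["/docs/a", "/docs/b", "/docs/c"]
elastic_hits = ["/docs/b", "/docs/d"]

combined = interleave(sharepoint_hits, elastic_hits)
```

Round-robin merging ignores the engines' scores, which are rarely comparable across systems; it simply alternates ranks and keeps the first occurrence of any document both engines return.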

Chelsea Kerwin, August 24, 2015


 

Search Does Not Work: Maybe, Sometimes

August 23, 2015

I read “Feds Keep Magically Finding Documents They Insisted Didn’t Previously Exist.” I noted that the US government struggles with finding content, if the article is on the money. I learned:

Gawker had sought the email communications of Hillary Clinton deputy Philippe Reines, focused on his conversations with journalists. The State Department came back with a no responsive records reply, which was clearly bullshit, since Reines was known for regularly emailing reporters. So Gawker sued and guess what just happened: the State Department just magically found 17,855 emails that are likely responsive. How about that?

Obviously the US government is not aware of the search systems which can find documents. But what if the US government has these systems? Isn’t the finding issue an indication that basic search and retrieval does not work? Interesting thought.

Stephen E Arnold, August 23, 2015

Do Search and CMS Deliver a Revenue Winner?

August 21, 2015

I spotted a write up called “Look for Enterprise Search, Analytics and These ECM Leaders for Your Transactional Content.” I found the article darned amazing even for public relations about a mid tier consulting firm and one of its analyses.

The main point of the article is that analysts have analyzed enterprise software and identified vendors who provide “ECM Transactional Content Services.” Fabricating collections of objects and slapping a jargon laden label on the batch is okay with me.


Empty calories await you, gentle reader.

What struck me as interesting was this statement:

Forrester Vice President and Principal Analyst Craig Le Clair points to key advancements and opportunities by the leading ECM providers to help enterprises realize greater value in these systems:

  • Ramping analytics to drive insight and reduce administrative burden
  • Accelerating their move to cloud
  • Improved search and content sharing
  • Using stronger and more open application program interfaces (APIs) that spur innovation
  • Moving quickly to fill gaps in their mobile road maps.

Notice the “ECM”. The acronym refers to software which provides editing, access, and publishing functions to its users. The idea, it seems, is that an employee will write a memo and the ECM will keep track of the document. In practice, based on my experience, the ECM recipe usually fails to satisfy my hunger.

ECM and its close cousins in acronym land are similar to the approach articulated by my kindergarten teacher more than half a century ago. She said, according to my mother, “Keep your mittens and lunch in your cubby.” The spirit of the kindergarten teacher lives on in enterprise content management systems.

Unfortunately those who have work to do often create content using tools suited for a specific task. For an engineer, that tool might be Solidworks. Bench chemists are often confused when an ECM is described as the tool for their work. One chemist said to me after an enthusiastic presentation by an information technology person, “I work with chemical structures. What’s this person talking about?” Lawyers in the midst of big risk litigation want to use their own and often flawed document systems. Even the marketer who cheers for ECM for Web content parks some high value data in that wonderful Adobe creative cloud with some back up data on iCloud. I have spotted a renegade analyst with an off the books workstation equipped with an Australian text processing and search system. USA.gov is notable for what is not available because executive branch entities roll their own content solutions.

I was able to review a copy of the consultant report upon which the article was based. Wowza. The write up assembled a grab bag of widely disparate companies, added three cups of buzzwords, and mixed in one kilo of MBAisms.

To be fair, the report identified “challenges.” These items baffled me. For example, “Deep experience in key transactional applications.” This is a challenge, really?

But the vendors in the report are able to “address emerging opportunities.” Okay, so these are not opportunities. The opportunities are emerging. Hmmm. Here’s an example: “Ramping analytics to drive insight and reduce administrative burden.” Yikes. Ramping analytics. Driving analytics. Reducing administrative burden. Very active stuff this ECM. Gerund alert. Gerund alert.

What companies are into this suite of challenges and emerging opportunities? Here’s the list of the mid tier touted stallions from the ECM stable:

  1. EMC, a company which is considering having a subsidiary of itself purchase the parent company. Folks, when a company does this type of recursive stuff, the core business might be a little bit uncertain.
  2. HP. Yep, an outfit which has lost its way, suffered five consecutive quarters of declining revenue, and bought a company for $11 billion and then wrote off most of that expense because the sellers of the company fooled HP, its consultants, accountants, and lawyers. Okay. A winner for the legal eagles maybe.
  3. IBM. Heaven help me. IBM has suffered declining revenues for 13 consecutive quarters, annoyed me with a blizzard of Watson silliness, and spent lots of time getting rid of businesses. I have a difficult time believing that IBM can manage enterprise content. But, hey, that’s just my rural Kentucky ignorance, right?
  4. Laserfiche. The company offers a “flexible, proven enterprise content management system.” I believe this statement. The company was founded in 1987 and sure seems to have its roots in well seasoned technology. The company has lots of customers and lots of awards. The only hitch in the git along is that I never ran across this outfit in my work. Bad luck I guess.
  5. Lexmark. Folks, let us recall the rumor that Lexmark and its content businesses are not money makers. I heard that the content cluster achieved an astounding $70 to $80 million shortfall. Who knows if this rumor is accurate. I do know that Lexmark is cutting staff, and one does not take this drastic step unless one needs to reduce costs pronto.
  6. M Files. I never heard of this outfit. I did a quick check of my files and learned that the company “helps enterprises find, share, and secure documents and information. Even in highly regulated industries.” The company is also “passionate about productivity.” The outfit relies on dtSearch for information access. This is okay because dtSearch can process most of the content within a Microsoft-centric environment. But M Files strikes me as a different type of outfit from HP or IBM. As I flipped through the information I had collected, the company struck me as a collection of components. Assembly required.
  7. Newgen Software. Another newbie for me. The company was in my Overflight archive. The firm provides BPM (business process management), ECM (enterprise content management), DMS (I have no idea what this acronym means), CCM (I have no idea what this acronym means), and workflow (I thought this was the same as BPM). The company operates from New Delhi. My thought? Another collection of components with assembly in someone’s future.
  8. Hyland OnBase. This is the third outfit on the list about which I have a modest amount of information. The company says that it is a “leader in ECM.” I believe it. The firm’s url is the same as its flagship product. The company was founded in 1991 and created OnBase, which is a plus. After 25 years, the darned thing should work better than a Rube Goldberg solution assembled from a box of components.
  9. OpenText. Okay, OpenText is a company which has more search engines and content processing systems than most Canadian firms. The challenge at OpenText is having enough cash to invest in keeping the diverse assortment of systems current. Which of these systems is the one referenced in the mid tier firm’s report? SGML search, BASIS, BRS, Nstein, the Autonomy stub in RedDot, Fulcrum, or some other approach? Details can be important.
  10. Unisys. Okay, finally a company that is essentially an integrator which still supports Burroughs mainframes. Unisys can implement systems because it is an integrator. For government work, Unisys matches the statement of work to available software. Although some might question this statement, Unisys can implement almost any kind of system eventually.

Several observations:

First, enterprise content management is a big and fuzzy concept. The evidence of this is the number of acronyms some of the companies use to explain what they do. I assume that it is my ignorance which prevents me from understanding exactly how scanning, indexing, retrieval, repurposing, workflow, and administrative functions work in a cost constrained, teleworker, mobile gizmo world.

Second, open source is knocking on the door of this sector. At some point, organizations will tire of the cost and complexity of collections of loosely federated and integrated software subsystems and look for an alternative. Toss in the word Big Data, and there will be a stampede of New Age consultants ready to step forward and reinvent these outfits. Disruption is probably less of a challenge than the challenge of keeping existing revenues from doing the HP, IBM, and Lexmark drift down.

Third, the search function seems to be a utility or an afterthought. The only problem is that search does not work particularly well in an enterprise where the workers log in from Starbucks and try to interact with enterprise software from a Blackberry.

Fourth, what an odd collection of outfits: HP, IBM, and Lexmark along with 30 year old imaging firms plus some small outfits. Maybe the selection of firms makes sense to you, gentle reader. For me, the report makes evident the struggles of some experts in ECM, BPM, and the acronyms I know zero about.

In short, this mid tier report strikes me as a russische punschtorte. On the surface, the darned thing looks good, maybe mouth watering. After a chomp or two, I want a paprikahenderl.

This ECM thing is a confection, not a meaty chicken. Mixing in search does nothing for the recipe.

Stephen E Arnold, August 22, 2015
