Want To Know What A Semantic Ecosystem Is

July 8, 2015

Do you want to know what a semantic ecosystem is? The answer is available from TopQuadrant in its article, “Semantic Ecosystem-What’s That About?” According to the article, a semantic ecosystem enables patterns to be discovered, shows the relationships between and within data sources, adds meaning to raw data artifacts, and dynamically brings information together.

In short, it shows how data and data sources connect with each other and extracts the relationships among them.

What follows the brief explanation of what a semantic ecosystem can do is a paragraph about the importance of data, how it takes many forms, etc., etc. Trust me, you have heard it before. The article then makes a comparison with a natural ecosystem, i.e., the ones found in nature.

The article continues with this piece:

“As in natural ecosystems, we believe that success in business is based on capability – and the ability to adapt and evolve new capabilities. Semantic ecosystems transform existing diverse information into valuable semantic assets. Key characteristics of a semantic ecosystem are that it is adaptable and evolvable. You can start small – with one or more key business solutions and a few data sources – and the semantic foundation can grow and evolve with you.”

It turns out a semantic ecosystem is just another name for information management.  TopQuadrant coined the term to associate with their products and services.  Talk about fancy business jargon, but TopQuadrant makes a point about having an information system work so well that it seems natural.  When a system works naturally, it is able to intuit needs, interpret patterns, and make educated correlations between data.

Whitney Grace, July 8, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

 

Semantic Search and Challenging Patent Document Content Domains

July 7, 2015

Over the years, I have bumped into some challenging content domains. One of the most difficult was the collection of mathematical papers organized with the Dienst architecture. Another was a collection of blog posts from African bulletin board systems in a number of different languages, peppered with insider jargon. I also recall my jousts with patent documents for some pretty savvy outfits.

The processing of each of these corpuses and making them searchable by a regular human being remains an unsolved problem. Progress has been slow, and the focus of many innovators has been on workarounds. The challenge of each corpus remains a high hurdle, and in my opinion, no search sprinter is able to make it over the race course without catching a toe and plunging head first into the Multi-layer SB Resin covered surface.

I read “Why Is Semantic Search So Important for Patent Searching?” My answer was and remains, “Because vendors will grab at any buzzy concept in the hopes of capturing a share of the patent research market?”

The write up takes a different approach, an approach which I find interesting and somewhat misleading.

The write up states that there are two ways to search for information: navigational search, sort of like Endeca I assume, and research search, which is the old fashioned Boolean logic which I really like.

The article points out that keyword search sucks if the person looking for information does not know the exact term. That’s why I used the reference to Dienst. I wanted to provide an example which requires precise knowledge of terminology. That’s a challenge and it requires specialized knowledge from a person who recognizes that he or she may not know the exact terminology required to locate the needed information. Try the Dienst query. Navigate to a whizzy new search engine like www.unbubble.eu and plug away. How is that working out for you? But don’t cheat. You can’t use the term Dienst.

If you run the query on a point and click Web search system like Qwant.com, you cannot locate the term without running a keyword search.
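For readers who want to see the exact-term problem in miniature, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of how unbubble.eu, Qwant, or any vendor's system works: the three sample documents, the `build_index` and `boolean_search` names, and the queries are all hypothetical.

```python
from collections import defaultdict

# Hypothetical three-document corpus; the text is invented for illustration.
DOCS = {
    1: "dienst architecture for distributed digital libraries of technical reports",
    2: "protocol for serving mathematical papers from networked repositories",
    3: "patent claims for a watch with a rotating bezel",
}


def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def boolean_search(index, all_ids, must=(), must_not=()):
    """Return ids containing every term in `must` and none in `must_not`."""
    hits = set(all_ids)
    for term in must:
        hits &= index.get(term.lower(), set())
    for term in must_not:
        hits -= index.get(term.lower(), set())
    return sorted(hits)


INDEX = build_index(DOCS)

# Knowing the exact term finds the document.
print(boolean_search(INDEX, DOCS, must=["dienst"]))  # prints [1]
# A reasonable paraphrase without the exact term finds nothing.
print(boolean_search(INDEX, DOCS, must=["library", "system"]))  # prints []
# Boolean NOT removes results the searcher does not want.
print(boolean_search(INDEX, DOCS, must=["for"], must_not=["patent"]))  # prints [1, 2]
```

The second query is the point: "library" never matches "libraries" in a plain exact-match index, so a searcher who does not know the term Dienst comes up empty. Stemming or synonym expansion would narrow that gap, but only for vocabulary someone has anticipated.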

The problems in patents, whether indexed with value added metadata, humans laboring in a warehouse, or with semantic methods are:

  1. Patent documents exist in versions and each document drags along assorted forms which may or may not be findable. Trips to the USPTO with hat in hand and a note from a senator often do not work. Fancy Dan patent attorneys fall back on the good old method of hunting using intermediaries. Not pretty, not easy, not cheap, and not foolproof. The versions and assorted attachments are often unfindable. (There are sometimes interesting reasons for this kettle of fish and the fish within it.) I don’t have a solution to the chains of documents and the versions of patent documents. Sigh.
  2. Patents include art. Usually the novice reacts negatively to lousy screenshots, clunky drawings, and equations which make it tough to figure out what a superscript character is. Keywords and pointing and clicking, metaphors, razzle dazzle search systems, and buzzword charged solutions from outfits like Thomson Reuters and Lexis are just tools, stone tools chiseled by some folks who want to get paid. I don’t have a good solution to the arts and crafts aspect of patent documents. Sigh sigh.
  3. Patent documents are written at a level of generalization, with jargon, Latinate constructs, and assertions that usually give me a headache. Who signed up to read lots of really bad poetry? Working through the Old Norse version of Heimskringla is a walk in the park compared to figuring out what some patents “mean.” I spent a number of years indexing 15th century Latin sermons. At least in that corpus, the common knowledge base was social and political events and assorted religious material. Patents can be all over the known knowledge universe. I don’t know of a patent processing system which can make this weird prose-poetry understandable if there is litigation or findable if there is a need to figure out if someone cooked up the same system and method before the document in question was crafted. Sigh sigh sigh.
  4. None of the systems I have used over the past 40 years does a bang up job of identifying prior art in scientific, technical or medical journal articles, blog posts, trade publications, or Facebook posts by a socially aware astrophysicist working for a social media company. Finding antecedents is a great deal of work. Has been and will be in my opinion. Sigh sigh sigh sigh. But the patent attorneys cry, “Hooray. We get to bill time.”

The write up presents some of those top brass magnets: Snappy visualizations. The idea is that a nifty diagram will address the four problems I identified in the preceding paragraphs. Visualizations may be able to provide some useful way to conceptualize where a particular patent document falls in a cluster of correctly processed patent documents. But an image does not deliver the mental equivalent of a NOW Foods Whey Protein Isolate.

Net net: Pitching semantic search as a solution to the challenges of patent information access is a ball. Strikes in patent searching are not easily obtained unless you pay expert patent attorneys and their human assets to do the job. Just bring your checkbook.

Stephen E Arnold, July 7, 2015

Amazon: Its Search Warrants Watching

July 7, 2015

I read “Amazon Must Face Trademark Lawsuit over Search Results.” The write up reports that “the online retailer’s search results can cause confusion for potential customers.” The product in question is a watch from “high end watchmaker Multi Time Machine.”

My own experience with Amazon search results is that, on the whole, the system outputs “close” results. Close as in horseshoes. My annoyance grows each time I click on a title only to learn that it is not available. Grrr. How tough is it to allow me to NOT out results which I do not want to view? There are other issues as well. These range from the do it yourself approach to content processing for Amazon’s “enterprise search” on AWS to the baffling listing of results which are Amazon’s, in Amazon’s warehouse, available from an Amazon partner, or listed by a now unemployed middle school teacher after the product did not move at a recent garage sale.

The write up points out:

Amazon displays MTM Special Ops in the search field and immediately below the search field, along with similar watches manufactured by MTM’s competitors for sale. MTM alleged this could cause customers to buy from one of those competitors, rather than encouraging the shopper to look for MTM watches elsewhere.

But everyone loves Amazon, the click throughs (which are not used to fund Beyond Search, thank you), and the wonky lovable founder. I am convinced he is the world’s smartest man. I mean who could even think of being more intelligent?

I suppose my dull average intelligence, like Multi Time’s, is just not able to understand the relevance of Amazon’s search and retrieval system.

Stephen E Arnold, July 7, 2015

Coveo Partners with Etherios on Salesforce Services

July 7, 2015

Professional services firm Etherios is teaming up with Coveo in a joint mission to add even more value to customers’ Salesforce platforms, we learn from “Etherios and Coveo Announce Strategic Alliance” at Yahoo Finance. Etherios is a proud Salesforce Platinum Partner. The press release tells us:

 “Coveo connects information from across a company’s IT ecosystem of record and delivers the knowledge that matters to customers and agents in context. Coveo for Salesforce – Communities Edition helps customers solve their own cases by proactively offering case-resolving knowledge suggestions, and Coveo for Salesforce – Service Cloud Edition allows customer support agents to upskill as they engage customers by injecting case-resolving content and experts into the Salesforce UI as they work.

“Etherios provides customers with consulting and implementation services in the areas of Sales, Customer Service, Field Service and IoT [Internet of Things]. … Etherios capabilities span operational strategy, business process, technical design and implementation expertise.”

 Founded in 2005, Coveo leverages search technology to boost users’ skills, knowledge, and proficiency while supplying tools for collaboration and self-service. The company maintains offices in the U.S. (San Mateo, CA), the Netherlands, and Quebec.

 A division of Digi International, Etherios launched in 2008 specifically to supply cloud-based tools for Salesforce users. They prefer to inhabit the cutting edge, and operate out of Chicago, Dallas, and San Francisco.

 Cynthia Murrell, July 7, 2015

Digestible Content Tool For The Busy Person

July 7, 2015

RSS feeds and Web page readers curate content from select Web sites tailored to suit a user’s needs. While all of the content is gathered in one spot and the headlines are available to read, sometimes the readers return hundreds of articles and users do not have the time to read all of them. True, sometimes users can glean the facts from the headlines and the small blurbs included with them, but sometimes that is not enough.

There are apps that gather and summarize a user’s content, but these are usually geared towards a specific industry or an enterprise system. There is a content reader that was designed for the average user, while at the same time it can be programmed to serve the needs of many professionals. The Context Organizer from Content Discovery Inc. is an application that summarizes Web pages and documents in order to pinpoint relevant information. The Context Organizer works via five basic steps:

  1. “Get to the point – Speed-up reading by condensing web pages, emails and documents into keywords and summaries presented in context.
  2. Make a Long Story Short – The Short Summary headlines most important sentences – instant information capsules.
  3. Accelerate Search – Search the web with relevant keywords. Summarize Google search results for rapid understanding.
  4. Take Notes – Quickly collect topics and sentences. Send them to WordPad or Word. Share notes – send them by e-mail.
  5. Visualize – View summaries in context as Mindjet MindManager maps.”

There are three different Context Organizer versions: one that specifically searches the Web, another that searches the Web and Microsoft products, and a third that combines the prior versions and adds the Mindjet MindManager. The prices range from $60 to $120 with a free twenty-one day trial, which we suggest you start with. Always start with the free trial, because you might be throwing away money on an item you do not like. With the amount of content available on the Web, any tool that helps organize and summarize it is worth investigating.

Whitney Grace, July 7, 2015

Bing Game: Search Has to Be Fun, Fun, Fun

July 6, 2015

Navigate to “Microsoft Put a Pong Game in Its Bing Search Engine.” Yep, when I run a query I definitely want to distract myself with a quick video game session. Doesn’t everyone 70 years old have this compelling need to lose focus and forget why one visited a search engine in the first place? No wonder Bing is just so darned wonderful. Just the other day I was looking for information about the Citadel exploit from 2011, and I ended up playing Pong. Wow, as I recall, the experience was really helpful to my work.

The write up states:

People are discovering that if you search for “pong” on the Bing site, the search results include a playable version of one of the first video games ever made. The game allows the classic digital paddles to be moved up and down with a mouse or keyboard on the PC, or via fingers on touch screen.

Let’s have more distractions to prevent me from experiencing incomplete and irrelevant results to my queries.

Stephen E Arnold, July 6, 2015

What Watson Can Do For Your Department

July 6, 2015

The story of Justin Chen, a Finance Manager, is one of many “Stories by Role” now displayed on IBM. Each character has a different job, such as Liza Hay from Marketing, Donny Cruz from IT and Anisa Mirza from HR. Each job comes with a problem for which Watson, IBM’s supercomputer, has just the solution. Justin, the article relates, is having trouble deciding which payments to follow. Watson provides solutions:

“With IBM® Watson™ Analytics, Justin can ask which customers are least likely to pay, who is most likely to pay and why. He can analyze this information… [and] collect more payments more efficiently… With Watson Analytics, Justin can ask which customers are likely to leave and which are likely to stay and why. He can use the answers for analysis of customer attrition and retention, predict the effect on revenue and determine which customer investments will lead to more profitable growth.”

It seems that the now world-famous Watson has been converted from search to a basket containing any number of IBM software solutions. It isn’t stated in the article, but we can probably assume that the revenue from each solution counts toward Watson’s soon to be reported billions in revenue.

Chelsea Kerwin, July 6, 2015


Lexmark: Brainware, ISYS, and Kofax May Not Be Enough

July 5, 2015

Here I am. Sitting in the misty morn contemplating layoffs in the Louisville-Lexington region. At a Fourth of July party, the founder of a large Kentucky-based business reassured his listeners that there would be almost no layoffs as a result of the Aetna-Humana deal. I yawned.

My mind was not attending to the woes of Humana’s soon to be unemployed thousands. I was considering the news item I had just read on my trusty Blackberry Classic (right, no iPhone for me, gentle reader).

The short item was “Insider Selling: Lexmark International CFO David Reeder Sells 7,283 Shares of Stock (LXK).” Who was doing the selling? The person was David Reeder, the Lexmark chief financial officer. Perhaps Mr. Reeder has to send a child to school or must replace a cracking concrete driveway?

Lexmark beat some analyst estimates in its April 2015 quarterly statement. What’s the big deal?

The write up reports:

Several analysts have recently commented on the stock. Analysts at Goldman Sachs initiated coverage on shares of Lexmark International in a research note on Wednesday, June 17th. They set a “sell” rating and a $34.00 price target on the stock. Analysts at Zacks downgraded shares of Lexmark International from a “hold” rating to a “sell” rating in a research note on Wednesday, June 3rd. Analysts at Cross Research upgraded shares of Lexmark International from a “sell” rating to a “hold” rating and raised their price target for the stock from $36.00 to $43.00 in a research note on Thursday, May 14th. Analysts at Brean Capital reiterated a “hold” rating on shares of Lexmark International in a research note on Thursday, April 30th. Finally, analysts at TheStreet upgraded shares of Lexmark International from a “hold” rating to a “buy” rating in a research note on Tuesday, April 28th. Five analysts have rated the stock with a sell rating, four have issued a hold rating and two have assigned a buy rating to the company. The company currently has an average rating of “Hold” and an average target price of $39.29.

My question is, “Will revenues from the content processing acquisitions ignite Lexmark’s revenues and pump up the profits?” My research suggests that Lexmark may find that making big money from content centric software is no picnic on a warm sunny day.

I am rooting for the printer company, but I am a realist. Some Lexmarkians may want to keep their résumés sparkling and bright. When a CFO sells shares, I pay attention.

Stephen E Arnold, July 5, 2015

An Oddly Mystical, Whimsical Listicle Combining Big Data and Search

July 4, 2015

Some listicles are clearly the work of college students after a tough beer pong tournament. Others seem as if they emanate from beyond Pluto’s orbit. I am not sure where on this spectrum between the addled and extraterrestrial the listicle in “Top 11 Open Source big Data Enterprise Search Software” falls.

Here’s the list for your contemplation. I have added some questions after each company’s name. Consult the original write up for the explanation of the inclusion of these systems in the list. I found the write ups without much heft or “wood” to use a Google term.

  1. Apache Solr. Yep, uses Lucene libraries, right. Performance? Exciting sometimes.
  2. Apache Lucene Core. Ah, Lego blocks for the engineer with some aspirations for continuous employment.
  3. Elasticsearch. The leader in search and retrieval. To do big data, there are some other components required. Make sure your programming and engineering expertise are up to the job.
  4. Sphinx. Okay, workable for structured data. Work required to stuff unstructured content into this system.
  5. Constellio. Isn’t this a part time project of a consulting firm focused on Canadian government work?
  6. DataparkSearch Engine. Yikes.
  7. ApexKB. Okay, a script. For enterprise applications. Big Data? Wow.
  8. Searchdaimon ES. Useful, speedier than either Lucene or Elasticsearch. Not a big data engine without some extra work. Come to think of it. A lot of work.
  9. mnoGoSearch. Well, maybe for text.
  10. Nutch. Long in the tooth. Why not use Lucene?
  11. Xapian. Very robust. Make certain that you have programming expertise and engineering knowledge. Often ignored which is too bad. But be prepared for some heavy lifting or paying a wizard with a mental fork lift to do the job.

Now which of these systems can do “big data”? In one sense, if you are exceptionally gifted with engineering and programming skills, I suppose any of these can do tricks. As Samuel Johnson allegedly observed to his biographer:

“Sir, a woman’s preaching is like a dog’s walking on his hind legs. It is not done well; but you are surprised to find it done at all.”

On the other hand, these programs can be used as a utility within a more robust content processing system which has been purpose built to deal with large flows of structured and unstructured content. But even that takes work.

Anyone want to give Constellio a shot at processing real time Facebook posts? Anyone want to use any of these systems to solve that type of search problem? Show of hands, please?

Stephen E Arnold, July 4, 2015

Silobreaker Takes Gold and Silver in Online Decathlon

July 4, 2015

Short honk: I have been a fan of the Silobreaker system, which is available for commercial and governmental content processing. I read the Network Products Guide “New Products and Service: Winners 10th Annual 2015 IT Awards” recommended solutions league table this morning. Silobreaker was founded by a couple of wizards with military and commercial experience. According to the league table, the Silobreaker content processing and information access system is the top dog for applications centering in Europe, the Middle East and Asia. This means that the system’s multi-lingual capabilities were the best, according to the Network Products Guide’s editors. The company also nailed a silver medal for US focused solutions. You can get more information about Silobreaker at www.silobreaker.com. Sign up. Join the thousands of users who want to work with a winner.

Stephen E Arnold, July 4, 2015
