Yippy Revealed: An Interview with Michael Cizmar, Head of Enterprise Search Division

August 16, 2016

In an exclusive interview, Yippy’s head of enterprise search reveals that Yippy launched an enterprise search technology that Google Search Appliance users are converting to now that Google is sunsetting its GSA products.

Yippy also has its sights set on the rest of the high-growth market for cloud-based enterprise search. Not familiar with Yippy, its IBM tie up, and its implementation of the Velocity search and clustering technology? Yippy’s Michael Cizmar gives some insight into this company’s search-and-retrieval vision.

Yippy (OTC PINK:YIPI) is a publicly traded company providing search, content processing, and engineering services. The company’s catchphrase is, “Welcome to your data.”

The core technology is the Velocity system, developed by Carnegie Mellon computer scientists. Yippy had already obtained rights to the Velocity technology before IBM acquired Vivisimo. I learned from my interview with Mr. Cizmar that IBM is one of the largest shareholders in Yippy. Other facets of the deal included some IBM Watson technology.

This year (2016) Yippy purchased one of the most recognized firms supporting the now-discontinued Google Search Appliance. Yippy has been tallying important accounts and expanding its service array.

Image: Michael Cizmar, Yippy’s senior manager for enterprise search

Beyond Search interviewed Michael Cizmar, the head of Yippy’s enterprise search division. Cizmar founded MC+A and built a thriving business around the Google Search Appliance. Google stepped away from on-premises hardware, and Yippy seized the opportunity to bolster its expanding business.

I spoke with Cizmar on August 15, 2016. The interview revealed a number of little known facts about a company which is gaining success in the enterprise information market.

Cizmar told me that when the Google Search Appliance was discontinued, he realized that the Yippy technology could fill the void and offer more effective enterprise findability.  He said, “When Yippy and I began to talk about Google’s abandoning the GSA, I realized that by teaming up with Yippy, we could fill the void left by Google, and in fact, we could surpass Google’s capabilities.”

Cizmar described the advantages of the Yippy approach to enterprise search this way:

We have an enterprise-proven search core. The Vivisimo engineers leapfrogged the technology dating from the 1990s which forms much of Autonomy IDOL, Endeca, and even Google’s search. We have the connector libraries that we acquired from Muse Global. We have used the security experience gained via the Google Search Appliance deployments and integration projects to give Yippy what we call “field level security.” Users see only the part of content they are authorized to view. Also, we have methodologies and processes to allow quick, hassle-free deployments in commercial enterprises to permit public access, private access, and hybrid or mixed system access situations.
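
The phrase “field level security” can be illustrated with a minimal sketch. This is not Yippy’s implementation, which the interview does not describe. The function name, the per-field access list, and the group names below are all hypothetical, and the deny-by-default rule is an assumption made for the example.

```python
from typing import Any


def visible_fields(
    document: dict[str, Any],
    field_acl: dict[str, set[str]],
    user_groups: set[str],
) -> dict[str, Any]:
    """Return only the fields the user is allowed to see.

    A field is shown when its access list shares at least one group with
    the user. Fields with no access list are hidden (deny by default),
    which is an assumption of this sketch.
    """
    return {
        name: value
        for name, value in document.items()
        if field_acl.get(name, set()) & user_groups
    }


# Hypothetical document and access lists, for illustration only.
doc = {
    "title": "Q3 forecast",
    "summary": "Revenue outlook",
    "salary_table": "confidential",
}
acl = {
    "title": {"staff", "finance"},
    "summary": {"staff", "finance"},
    "salary_table": {"finance"},
}

staff_view = visible_fields(doc, acl, {"staff"})
finance_view = visible_fields(doc, acl, {"finance"})
```

In this sketch a staff user gets the title and summary of the same search hit, while a finance user gets all three fields.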

With the buzz about open source, I wanted to know where Yippy fit into the world of Lucene, Solr, and the other enterprise software solutions. Cizmar said:

I think the customers are looking for vendors who can meet their needs, particularly with security and smooth deployment. In a couple of years, most search vendors will be using an approach similar to ours. Right now, however, I think we have an advantage because we can perform the work directly….Open source search systems do not have Yippy-like content intake or content ingestion frameworks. Importing text or an Oracle table is easy. Acquiring large volumes of diverse content continues to be an issue for many search and content processing systems…. Most competitors are beginning to offer cloud solutions. We have cloud options for our services. A customer picks an approach, and we have the mechanism in place to deploy in a matter of a day or two.

Connecting to different types of content is a priority at Yippy. Even though the company has a wide array of import filters and content processing components, Cizmar revealed that Yippy has “enhanced the company’s connector framework.”

I remarked that most search vendors do not have a framework, relying instead on expensive components licensed from vendors such as Oracle and Salesforce. He smiled and said, “Yes, a framework, not a widget.”

Cizmar emphasized that the Yippy, IBM, and Google connections were important to many of the company’s customers. Yippy has also acquired the Muse Global connectors and the ability to build connectors on the fly. He observed:

Nobody else has Watson Explorer powering the search, and nobody else has the Google Innovation Partner of the Year deploying the search. Everybody tries to do it. We are actually doing it.

Cizmar made an interesting side observation. He suggested that Internet search needed to be better. Is indexing the entire Internet in Yippy’s future? Cizmar smiled. He told me:

Yippy has a clear blueprint for becoming a leader in cloud computing technology.

For the full text of the interview with Yippy’s head of enterprise search, Michael Cizmar, navigate to the complete Search Wizards Speak interview. Information about Yippy is available at http://yippyinc.com/.

Stephen E Arnold, August 16, 2016

Why Search Does Not Change Too Much: Tech Debt Is a Partial Answer

August 12, 2016

I read “The Human Cost of Tech Debt.” The write up picks up the theme about the amount of money needed to remediate engineering mistakes, bugs, and short cuts. The cost of keeping an original system in step with newer market entrants’ products adds another burden.

The write up is interesting and includes some original art. Even though the art is good, the information presented is better; for example:

For a manager, a code base high in technical debt means that feature delivery slows to a crawl, which creates a lot of frustration and awkward moments in conversation about business capability.  For a developer, this frustration is even more acute.   Nobody likes working with a significant handicap and being unproductive day after day, and that is exactly what this sort of codebase means for developers. Each day they go to the office knowing that it’s going to take the better part of a day to do something simple like add a checkbox to a form.  They know that they’re going to have to manufacture endless explanations for why seemingly simple things take them a long time.  When new developers are hired or consultants brought in, they know that they’re going to have to face confused looks, followed by those newbies trying to hide mild contempt.

My interest is search and content processing. I asked myself, “Why are search and retrieval systems not better than they were in 1975?” When I queried the RECON system, I was able to find specific documents which contained information matching the terms in my query. Four decades ago, I could generate a useful result set. The bummer was that the information appeared on weird thermal printer paper. But I usually found the answer to my question in a fraction of the time required for me to run a query on my Windows machine or my Mac.

What’s up?

My view is that search and retrieval tends to be a recycling business. The same basic systems and methods are used again and again. The innovations are wrappers. But to make search more user friendly, add-ons look at a user’s query history and, behind the scenes, filter the results to match the history.

The shift to mobile has been translated to providing results that other people have found useful. Want a pizza? You can find one, but if you want Cuban food in Washington, DC, you may find that the mapping service does not include a popular restaurant for reasons which may be related to advertising expenditures.

We ran a series of queries across five Dark Web search and retrieval systems. None of the systems delivered high precision and high recall results. In order to find certain large sites, manual review and one-at-a-time clicking and review were needed to locate what we were querying.
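
For readers who want the two measures pinned down, here is a minimal sketch of how precision and recall are computed for a single query. The document identifiers are made up for illustration and are not data from our Dark Web tests.

```python
def precision_recall(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Compute precision and recall for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list and relevance judgments.
retrieved = ["a", "b", "c", "d"]
relevant = {"a", "c", "e", "f", "g"}

precision, recall = precision_recall(retrieved, relevant)
```

Here two of the four results are relevant, and two of the five relevant documents were found. A system can score well on one measure and poorly on the other, which is why both are reported.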

Regular Web or Dark Web. Online search has discarded useful AND, OR, NOT functions, date and time stamps, and any concern about revealing editorial or filtering postures to a user.
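
The AND, OR, NOT functions are not exotic. As a rough sketch, assuming a toy inverted index built from three made-up documents, the operators map directly onto set operations:

```python
from collections import defaultdict


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercased term to the set of document ids containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


# Hypothetical three-document collection.
docs = {
    "d1": "enterprise search appliance",
    "d2": "open source search",
    "d3": "enterprise cloud storage",
}
index = build_index(docs)

both = index["enterprise"] & index["search"]      # enterprise AND search
either = index["enterprise"] | index["search"]    # enterprise OR search
without = index["search"] - index["enterprise"]   # search NOT enterprise
```

Production systems add ranking, stemming, and positional data on top, but the Boolean layer itself is this simple, which makes its disappearance from consumer search a choice rather than a technical limit.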

Technological debt explains that most search outfits lack the money to deliver a Class A solution. What about the outfits with oodles of dough and plenty of programmers? The desire and need to improve search is not a management priority.

Some vendors’ mobile search operates from the vendor’s copy of the indexed sites. Easy, computationally less expensive, and good enough.

Tech debt is a partial explanation for the sad state of online search at this time.

Stephen E Arnold, August 12, 2016

USAGov Wants More Followers on Snapchat

August 12, 2016

The article on GCN titled Tracking the Ephemeral: USAGov’s Plan for Snapchat portrays the somewhat desperate attempts of the government to reach out to millennials. Perhaps shocking to non-users of the self-immolating picture app, Snapchat claims over a hundred million active users each day, mostly 13 to 34 year olds. The General Services Administration’s USAGov team plans to use Snapchat to study the success of its outreach, such as how many followers it receives and how many views its content gets. The article mentions,

“And while the videos and multimedia that make up “Snapchat stories” disappear after just 24 hours, the USAGov team believes the engagement metrics will provide lasting value. Snapchat lets account owners see how many people are watching each story, if they watch the whole story and when and where they stop before it’s over — allowing USAGov to analyze what kind of content works best.”

If you are wondering how this plan is affected by the Federal Records Act, which stipulates documentation of content, GSA is way ahead of you with a strategy of downloading each story and saving it as a record. All in all the government is coming across as a somewhat clingy boyfriend trying to find out what is up with his ex by using her favorite social media outlet. Not a great look for the US government. But at least they aren’t using ChatRoulette.

Chelsea Kerwin, August 12, 2016

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

There is a Louisville, Kentucky Hidden/Dark Web meet up on August 23, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233019199/

WCC and Elm Developers

August 10, 2016

WCC is a specialist search and content processing company. The firm maintains a low profile, which sparks my interest. I noted that WCC is hosting an Elm programming language meet up. What’s interesting is the write up announcing this initiative. I have reproduced some of the lingo used to make this meet up known to the fans of WCC and, of course, Elm:

WCC is excited to host this meetup at her headquarter. Being very interested in the latests software development technology and the advancement of knowledge, we are happy to facilitate this meetup at our offices. Elm is a functional programming language for declaratively creating web browser-based graphical user interfaces.

Anyone can misspell a word. But I particularly liked “her headquarter.” I was expecting “the company’s headquarters.”

Stephen E Arnold, August 10, 2016

Battle of the Maps

August 10, 2016

Once upon a time Mapquest.com was the best map Web site on the Internet. Then Google Maps came along, and later Apple Maps unleashed its own cartography tool. Which is the better GPS tool? Justin O’Beirne decided to get to the bottom of the question and find out which application is better. He discussed his findings in “Cartography Comparison: Google Maps And Apple Maps.”

Both Google and Apple want their tool to be the world’s first Universal Map, that is, the map most used by the world’s population. Google Maps is used by one billion people, but Apple Maps has its fair share of users too. These tools are not mere applications, however; they are powerful platforms deployed in many apps as well.

These maps have their differences: colors, styles, and even different types of maps.  The article explains:

“At its heart, this series of essays is a comparison of the current state of Google’s and Apple’s cartography. But it’s also something more: an exploration into all of the tradeoffs that go into designing and making maps such as these. These tradeoffs are the joy of modern cartography - the thousands of tiny, seemingly isolated decisions that coalesce into a larger, greater whole. Our purpose here is not to crown a winner, but to observe the paths taken - and not taken.”

After reading the article, take your pick and decide which one appeals to you. From my experience, Google Maps is more accurate and more likely to have the most up-to-date information. Apple makes great technology, but cartography really is not its strongest point.

Whitney Grace, August 10, 2016

Technology: The New Dr. Evil in the Digital Dark Age

August 9, 2016

When I ride my mule down the streets of Harrod’s Creek, I marvel at the young folks who walk while playing with their mobile phones. Heading home after buying oats for Melissa, I look forward to my kerosene lamps.

Technology does not frighten me. I find technology and the whiz kids amusing. I read “Technology Is Now Pop Culture’s Favorite Enemy.” Goodness. I find gizmos and bits fun. The write up suggests that fun loving, top one percenters in education and wealth are finding themselves at the wrong end of a varmint trap.

I find it interesting that technology, which some folks in big cities believe is the way out of a gloomy tunnel, is maybe not flowers, butterflies, and rainbows. (The unicorns have taken to the woods it seems. No unicorns at the moment.)

I learned:

The ubiquitous nature of futuristic technology has lead to an exponential increase in our distrust of each other and the products we use, but most interesting, has taken away some of the blame from government bodies and corporations. We no longer fear agency bodies as much as we fear the physical technology they use.

That seems harsh. I like the phrase, “We’re from the government and here to help you.” Don’t you?

The write up adds a philosophical note:

Despite us being more savvy of how to use social media or despite us having a better understanding of how computers work in general, most of us still aren’t fluent in how it all fits together. We give so much of ourselves over to our devices, and we don’t ask for much in return. When we give something that inanimate that much control over us, it’s terrifying to think that we’re willingly giving up our freedom.

Let’s think about technology in terms of public Web search. One plugs a query into a system. The system returns a list of results; that is, suggestions where information related to the query may be found.

But what is happening is that the person reviewing the outputs does not have to ask, “Are these results accurate? Are they advertising? Are they comprehensive?” There is another question as well, “Is the information objective?” And what about, “Is the information accurate; that is, verifiable?”

The search systems perform another magic trick. The user becomes a content input. This means that the person with access to the queries as a group or the query subset related to a particular individual has new information. In my experience, knowledge is power, and the folks using the search system do not generally have access to this information.

Asymmetry results. The technology outfits offering service have more information than the users. Search does more to illuminate the dark corners of those using the search system than the results of a search illuminate the user’s mind.

Without the inclination to figure out what’s valid and what’s not or lacking the expertise to perform this type of search results vetting, the users become the used.

That sounds philosophical, but there is a practical value to the observation. Without access and capability, the information presented becomes a strong influence on how one thinks, views facts, and behaves.

My thought is, “Welcome to the medieval world.” It is good to be a king or a queen. To be an information peasant is the opposite.

Giddy up, Melissa. Time to be heading back to the digital hollow to think about the new digital Dr. Evil.

Stephen E Arnold, August 9, 2016

Beyond Search HonkinNews Video for 8 August 2016 Online Now

August 9, 2016

You can view the August 8, 2016, HonkinNews program at this link. The video comes from Goodwill-grade 8 mm film equipment. The program highlights recent stories from the free (yep, no cost whatsoever) Beyond Search Web log. Learn about how one Google executive “escaped” life in the fast lane. The Verizon acquisition of Yahoo reminds Stephen of Washington’s wooden false teeth. The deal allows Verizon to own two Internet artifacts. Hewlett Packard Enterprise, owner of Autonomy, faces an uncertain future as it sells units and thinks about selling itself. And there’s more in the six-minute news program; for example, a restrained MBA cheer for Big Data. But that’s a sotto voce rah, rah. Like Beyond Search, the honking video version tries to separate the giblets from the goose feathers in the thrilling world of search, content processing, and related disciplines. That’s not easy in today’s search-centric world where relevance is mostly good enough and jargon is its own virtual reality.

Ken Toth, August 9, 2016

Honkin News: Beyond Search Video News Program Available Now

August 2, 2016

Honkin’ News is now online via YouTube at https://youtu.be/hf93zTSixgo. The weekly program tries to separate the giblets from the goose feathers in online search and content processing. Each program draws upon articles and opinion appearing in the Beyond Search blog.

The Beyond Search program is presented by Stephen E Arnold, who resides in rural Kentucky. The five-minute program highlights stories appearing in the daily Beyond Search blog and includes observations not appearing in the printed version of the stories. No registration is required to view the free video.

Arnold told Beyond Search:

Online search and content processing generate modest excitement. Honkin’ News comments on some of the more interesting and unusual aspects of information retrieval, natural language processing, and the activities of those working to make software understand digital content. The inaugural program highlights Verizon’s Yahoo AOL integration strategy, explores why search fails, and how manufacturing binders and fishing lures might boost an open source information access strategy.

The video is created using high tech found in the hollows of rural Kentucky; for example, eight mm black-and-white film and two coal-fired computing devices. One surprising aspect of the video is the window showing the vista outside the window of the Beyond Search facility. The pond filled with mine drainage is not visible, however.

Kenny Toth, August 2, 2016

From the Amusing Searches File: Trump to Hitler to Omission

August 1, 2016

I read a story which I assume is spot on, dead accurate, and 110 percent true. Navigate to “Google Search Connects Trump’s Book to Hitler’s ‘Mein Kampf’.” The story, in the best traditions of real journalism, reports:

…typing the name of Trump’s 2015 book “Crippled America” into a Google image search, in addition to bringing up images of that book, displayed images of Adolf Hitler’s 1920s manifesto “Mein Kampf.” Google has been in the spotlight before for a connection between Trump and the infamous Nazi leader. In June, Googling the phrase “When was Hitler born” also produced an image of Donald Trump and listed his birthday. In that case, Google said it removed the Trump image, and a recent search confirms that the candidate’s image is no longer connected with Hitler’s birthday.

If you find the Hitler thing amusing, check out “Google Tweaks System after Trump Left Off Search Results for Presidential Candidates.” The write up, which I am sure is right as rain, states:

According to Google, the omissions were the result of a “technical bug” in the Knowledge Graph, the massive information-mapping system that provides the top results bar under many fact-based searches. “Only the presidential candidates participating in an active primary election were appearing in a Knowledge Graph result,” a Google spokesperson said in a statement. “Because the Republican and Libertarian primaries have ended, those candidates did not appear. This bug was resolved early this morning.”

Was this self correcting or did an analog entity make the fix? I recall from some time and place that Google did not fiddle with search results. It must, therefore, be algorithms. Why worry about algorithms driving autos, performing surgery, or filtering information? I don’t worry. I believe everything I read on the Internet.

Those algorithms have a sense of humor. How was this linkage fixed? Maybe a human intervened, but I thought Google’s smart system worked all by its lonesome. I know that relevance is a struggle. Is it mine or others’?

Stephen E Arnold, August 1, 2016

Baidu Hopes Transparency Cleans up Results

July 28, 2016

One of the worries about using commercial search engines is that search results are polluted with paid links. In the United States, paid results are differentiated from organic results with a little banner or font change. It is not so within China, and Seeking Alpha shares an interesting story about a Chinese search engine, “Baidu Cleans Up Search Site, Eyes Value.” Baidu recently did a major overhaul of its search engine, which was long overdue. Baidu was more interested in generating profits than providing its users a decent service. Baidu neglected to inform its users that paid links appeared alongside organic results, but now they have been separated out like paid links in the US.

Results are cleaner, but it did not come in time to help one user:

“For anyone who has missed this headline-grabbing story, the crisis erupted after 21-year-old cancer patient Wei Zexi used Baidu to find a hospital to treat his disease. He trusted the hospital he chose partly because it appeared high in Baidu’s results. But he was unaware the hospital got that ranking because it paid the most in an online auctioning system that has helped to make Baidu hugely profitable. Wei later died after receiving an ineffective experimental treatment, though not before complaining loudly about how he was misled.”

The resulting PR nightmare forced Baidu to clean up its digital act. This example outlines one of the many differences between US and Chinese business ethics. On average the US probably has more educated consumers than China, and those consumers will call out companies when they notice ethical violations. While it is true US companies are willing to compromise ethics for a buck, at least once they are caught they cannot avoid the fallout. China, on the other hand, does what it wants when it wants.

Whitney Grace, July 28, 2016