An Interview with Chen Li and Dev Bhatia
SRCH2, founded in 2008, is now marketing version 3.0 of its search technology. The company was founded by Dr. Chen Li, and the CEO is Dev Bhatia. The company positions its technology as enabling Google-like search for mobile, tablets, alternate devices, and e-commerce.
I spoke with SRCH2's management on May 23, 2013. The full text of my interview with the SRCH2 team appears below.
Thanks for taking the time to speak with me. What's the history of SRCH2?
SRCH2 was founded in 2008. Now entering its 3.0 release cycle, the company’s patented search software enables “rich search” features including instant forward search (using a full-text forward index cached in memory), configurable error tolerance, rapid geo search, configurable rankings, and more.
In many ways, Google’s unveiling of “Instant Search” in 2010 provided consumers with a powerful example of what search could do for them. But extending that idea out to different internal and external corporate search boxes, on a wide variety of platforms and devices, has been difficult. Customizing existing search tools to deliver some version of “rich search” functionality has proven to produce an end-product which is complex, brittle, and slow.
SRCH2 is built to address these needs from the ground up, whereas other technologies have evolved to support them. As such, SRCH2 offers clear differentiation when you also consider complexity and time to market. When you add in-memory performance to this, SRCH2 offers a killer combination for these use cases.
SRCH2 taps the hardware and data innovations available today to deliver great search experiences. Ultimately, SRCH2 wants to change how developers and search pros think about search—to make it a more meaningful, relevant and profitable interaction with data.
When did you become interested in text and content processing?
Dr. Chen Li first published work in the field while he was a Ph.D. candidate at Stanford more than a decade ago. His Ph.D. advisor was Dr. Jeffrey Ullman, who is now an investor and advisor at SRCH2. Chen was at Stanford at a dynamic time in the industry, and a number of his lab mates and contemporaries went on to start great companies in the space, including Junglee, Google, and Aster Data. Chen’s own research focused on the challenges around fast access to data. Now, fully 10+ years later, he’s winning accolades for his foundational work. Last year, SIGMOD gave him its “Test of Time” Award, and just two weeks ago, Chen received the 10-Year Best Paper award from DASFAA in Wuhan, China.
SRCH2 has been investing in search, content processing, and text analysis for several years. What are the general areas of your research activities?
We are directing research resources at databases, search, and big data.
Many vendors argue that mash ups and data fusion are "the way information retrieval will work going forward." I am referring to structured and unstructured information. How does SRCH2 perceive this blend of search and user-accessible outputs? Is consumerization opening new opportunities or distracting people from more challenging applications of SRCH2's technology?
There are two ways to look at mash-ups: in the UX and in the index. Both of these are emerging spaces. The market is only beginning to deliver on the potential to innovate in these areas, and SRCH2 can support both.
The emergence of new data inputs resulting from adoption of unstructured and semi-structured databases and architectures has actually created a new challenge for data scientists: how best to access and display that data. Often, unstructured data can’t be properly indexed because of the limitations of the search software itself. This means that the best way to approach the data properly is with a full-text index. In the case of SRCH2, we offer a full text index cached in memory, with elegant algorithmic devices embedded in the index to enable rapid traversal of a trie.
What’s a trie?
Oh, good question. A trie is an efficient tree data structure that is used to store strings.
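To make the idea concrete, here is a minimal Python sketch of a trie used for type-forward completion. It is an illustration only, not SRCH2's implementation; the class names, the `complete` method, and the sample titles are assumptions made for the example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # maps one character to the next node
        self.is_word = False    # True if a stored string ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Store a string by walking or creating one node per character."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def complete(self, prefix):
        """Return every stored string that starts with prefix, sorted."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []       # no stored string has this prefix
        results = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)


# Hypothetical sample data for the illustration.
trie = Trie()
for title in ["star wars", "star trek", "stargate", "skyfall"]:
    trie.insert(title)

print(trie.complete("star"))  # ['star trek', 'star wars', 'stargate']
```

Strings that share a prefix share the same path from the root, so finding all completions of what the user has typed so far only requires walking the prefix once and then visiting the subtree below it.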
And the payoff?
The result is fast access to search results, and ultimately, to a host of new UX opportunities.
For example, search results can be simpler to configure and manage, and the UX can be much more actionable and profitable than before. This is a new way to think of search—as a profit center and ROI driver for users. Here’s an example. Imagine highly relevant type forward results, on a mobile handset, for an e-commerce retailer. Now, imagine those type-forward results combined with click-to-buy or click-to-call tags, all location-aware. You’ve just delivered extraordinary value through great search.
Without divulging your firm's methods or clients, will you characterize a typical use case for your firm's search and retrieval capabilities?
SRCH2 clients are using it in a wide variety of contexts and devices. One of the things we’re very proud of is the breadth of uses and use cases. A major global handset manufacturer is porting it to the kernel, across millions of handsets. Uses and platforms include mobile, e-commerce, social, and alternative devices, like cable set-top boxes.
There are very specific reasons for this wide variety of end use cases. First is the relative compactness of SRCH2’s binary and index. Second, and equally important, is the configurability of the engine. Using SRCH2, a developer can configure error tolerance by edit-distance. She can prioritize within relevant type-forward results by profitability metrics, or by location, or any other metric. An e-commerce retailer might use this to hash results toward items in stock, or nearby, or by a related metric. The point is that we are handing out the tools, and the answers are in the hands of the great, forward-thinking developers who use them.
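The two ideas in this answer, error tolerance measured by edit distance and ranking by business metrics, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not SRCH2's algorithm or API: the function names, the record fields (`in_stock`, `distance_km`), and the sample catalog are all hypothetical, and a linear scan stands in for whatever index a real engine would use.

```python
def prefix_edit_distance(query, candidate):
    """Smallest edit distance between query and any prefix of candidate."""
    previous = list(range(len(query) + 1))  # distances from the empty prefix
    best = previous[-1]
    for i, c in enumerate(candidate, start=1):
        current = [i]
        for j, q in enumerate(query, start=1):
            cost = 0 if c == q else 1
            current.append(min(previous[j] + 1,          # drop c
                               current[j - 1] + 1,       # add q
                               previous[j - 1] + cost))  # match or substitute
        best = min(best, current[-1])
        previous = current
    return best


def search(query, records, max_edits=1):
    """Keep records within max_edits of the query, then rank them.

    Ranking order (an assumption for this sketch): fewest edits first,
    in-stock items before out-of-stock ones, then nearest first.
    """
    scored = []
    for record in records:
        edits = prefix_edit_distance(query, record["name"])
        if edits <= max_edits:
            scored.append((edits, not record["in_stock"],
                           record["distance_km"], record["name"]))
    return [name for _, _, _, name in sorted(scored)]


# Hypothetical catalog for the illustration.
catalog = [
    {"name": "star wars poster", "in_stock": False, "distance_km": 2.0},
    {"name": "star trek mug", "in_stock": True, "distance_km": 12.0},
    {"name": "stargate dvd", "in_stock": True, "distance_km": 4.0},
    {"name": "skateboard", "in_stock": True, "distance_km": 3.0},
]

# The typo "stra" still finds the "star..." items with one edit allowed.
print(search("stra", catalog, max_edits=1))
# ['stargate dvd', 'star trek mug', 'star wars poster']
```

Changing `max_edits` tightens or loosens the typo tolerance, and changing the sort key changes what the retailer chooses to promote; the matching logic itself stays the same.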
What are the benefits to a commercial organization or a government agency when working with your firm?
Benefits of SRCH2 include our removing the complexity of typical search approaches in a variety of contexts. In addition, our approach permits a rapid time to market. We utilize an in-memory approach. We think speed is the killer app. Finally, we offer improved service based on SRCH2 technology, as compared with typical commercial or open source projects.
Ultimately, our mission is to change the place of search within a commercial organization or government agency. It should be an opportunity to interact with data to deliver results and action. We’ve covered some use cases above. Clearly, in e-commerce, getting people to actionable results when they are hunting for products is an ROI-driver. Merchandisers can improve profits by making sure they have highly relevant forward search results.
But the advantages in other areas are also quite apparent. In mobile, type-forward coupled with error correction offers users the chance to minimize character entry on a touch screen. The fat finger problem is just a universal pain point in mobile, and the topic comes up often when we talk with clients in the handset business. People mistype on their phones, a lot, and SRCH2 minimizes the pain by generating great, relevant type-forward results and tolerating their typos. Finally, this feature also proves quite valuable in non-traditional contexts. Many of our users talk about the problem they have when ordering a movie or TV show using search on a set-top box. They’re using remote control devices to enter characters by pointing. Quite painful. And SRCH2 can help with that.
How does an information retrieval engagement with your firm move through its life cycle?
A client downloads our software, configures the engine on their data, starts the engine, and develops the front end UI. They use our RESTful API to do search queries and update data. With our product, they can provide a very powerful search interface with many Google-like features. One of the keys is simplicity of configuration. We believe this is a “launch in an afternoon” project. Fundamentally different than a typical enterprise development project, as you know. The complexity that has been imposed on the market has, indeed, been stifling. We want to free people up to work on improving the experience, which is an iterative process which can even begin after launch. Imagine launching in a day, then spending the rest of your time figuring out what configuration of results is best for your particular needs in your particular industry.
One challenge to those involved with squeezing useful elements from large volumes of content is the volume of content AND the rate of change in existing content objects. What does your firm provide to customers to help them deal with the volume problem?
We agree. Our engine allows the client to specify how frequently updates should be merged into the main indexes so that users can search on the latest data. The latency is configurable, and it can be less than one second.
Due to the in-memory approach, SRCH2’s engine has the advantages of using efficient representation of the indexes and reducing the cost of disk IOs. This advantage is especially significant when dealing with dynamic changes.
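The notion of a configurable merge latency can be illustrated with a small Python sketch: updates are buffered and only become searchable once a merge interval has elapsed. This is a toy model and not a description of SRCH2's internals. The `BufferedIndex` class, its parameters, and the injected clock are assumptions made for the example, and the sketch ignores deletions and record replacement.

```python
import time


class BufferedIndex:
    """In-memory inverted index that merges buffered updates on an interval."""

    def __init__(self, merge_interval_seconds=1.0, clock=time.monotonic):
        self.merge_interval = merge_interval_seconds
        self.clock = clock        # injectable so the example needs no sleeping
        self.main = {}            # token -> set of record ids, visible to search
        self.pending = []         # (record_id, text) updates not yet merged
        self.last_merge = clock()

    def update(self, record_id, text):
        """Buffer an update; it becomes searchable at the next merge."""
        self.pending.append((record_id, text))
        self._merge_if_due()

    def search(self, token):
        """Return ids of merged records containing the token."""
        self._merge_if_due()
        return sorted(self.main.get(token.lower(), set()))

    def _merge_if_due(self):
        """Fold pending updates into the main index once the interval passes."""
        if self.clock() - self.last_merge >= self.merge_interval:
            for record_id, text in self.pending:
                for token in text.lower().split():
                    self.main.setdefault(token, set()).add(record_id)
            self.pending.clear()
            self.last_merge = self.clock()


# A fake clock stands in for real time in this illustration.
now = [0.0]
index = BufferedIndex(merge_interval_seconds=0.5, clock=lambda: now[0])
index.update(1, "Star Wars")
before = index.search("star")   # update is still buffered
now[0] = 0.6
after = index.search("star")    # interval elapsed, update has been merged
print(before, after)            # [] [1]
```

A shorter interval makes new data visible sooner at the cost of more frequent merge work, which is the trade-off a configurable latency exposes.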
Another challenge, particularly in professional intelligence operations, is moving data from point A to point B; that is, information enters a system but it must be made available to an individual who needs that information or at least must know about the information. What does your firm offer licensees to address this issue of content "push", report generation, and personalization within a work flow?
SRCH2 is not in the database management and data movement business. There are other great vendors out there. One of the things we do provide is elegant integration of our search capabilities alongside some of the data management solutions out there. We feel it’s important to be clear about what we do, and do not do. We tend to stick to our knitting, and just provide great search. Put it this way: if the data can be accessed in some standard format, say JSON, we can index it.
There has been a surge in interest in putting "everything" in a repository and then manipulating the indexes to the information in the repository. On the surface, this seems to be gaining traction because network resident information can "disappear" or become unavailable. What's your view of the repository versus non repository approach to content processing?
We are data-location agnostic. Our clients are using SRCH2 in both centralized and virtualized scenarios, with great success. Users access our engine using URI requests to manage all the data, updates, and queries, and the engine internally translates these operations into the corresponding components.
I am on the fence about the merging of retrieval within other applications. What's your take on the "new" method which some people describe as "search enabled applications"? Endeca and Exalead have work flow components as part of their search platforms. What does your firm offer?
We believe end-users and search integrators should be able to tap search within their applications, broadly. And the applications can be both software, and hardware embedded applications, like GPS devices. The limit should not be imposed by the search software. One factor which is limiting the size and scope of the search market is the limit imposed by the actual physical size of the search software and search index itself. We feel that in a market where the cost of memory is shrinking rapidly, where devices are getting smarter, and where data-driven apps and uses are exploding, search software has to also be accessible. For our part, we’re developing to keep ahead of these trends by constantly focusing on simplicity, elegance, and configurability. Put tools into the hands of application developers, and let them decide how to implement great search.
There seems to be a popular perception that the world will be doing computing via iPad devices and mobile phones. My concern is that serious computing infrastructures are needed and that users are "cut off" from access to more robust systems. How does your firm see the computing world over the next 12 to 18 months? What products / services will you be focusing on to deliver on your next vision?
We agree that the current search market has put itself into a bit of a cul-de-sac, where it’s difficult to leverage great search tools on a mobile or tablet. But that’s precisely the point. Our software is absolutely designed to run on a mobile device, or tablet, where our search features are demonstrably more relevant and valuable. Our goal is to improve the SRCH2 engine to make search on these devices more powerful, so that they are not “cut off” as you put it. The market has spoken on these devices, and they are not going away. The next 12 to 18 months will only bring more innovation and variety into computing. Looking further out, we even see where new computing infrastructures and device categories, like Google Glass, are going. We see new input methods, like voice-activated, or “blink-activated” (!) as opportunities for great search software to master. They’re ultimately just query input mechanisms. In many ways, any non-QWERTY input just is an opportunity for fuzzy error correction and forward search to add more value.
We love mobile, tablets, and whatever comes after that. Bring it on!
Put on your wizard hat. What are the three most significant technologies that you see affecting your search business? How will your company respond?
Mobile applications, location-based services, and cloud-based services.
The current search tools and services, including Google and several open source solutions, are not optimized for such applications and technologies. The shortcoming lies far deeper than product design. It lies in the very foundational algorithms. Our search engine is built on the world's leading research on the type of search that is uniquely tooled for mobile, location, and cloud applications. We don't merely respond to these areas. We lead. The higher and tougher the demands from these new applications are, the more advantage SRCH2's search technology would have over other solutions. So we don't just cope with the challenges in these new areas. We thrive on them.
Where does a reader get more information about your firm?
Our Web site is srch2.com. Contact information is available there.
SRCH2’s technology does bring many Google-like features to enterprise and mobile search. The company’s approach sharply reduces deployment time. Instead of merely talking about a one-day installation, the SRCH2 approach delivers a quick and painless rollout and integration. The system’s response time and its time-saving features such as auto-suggest make the experience seamless and fluid.
Stephen E Arnold, June 4, 2013