An Interview with Gilles Andre
Founded in 2001, PolySpot delivers agile, open enterprise search infrastructure solutions and applications. Unlike other providers, PolySpot focuses firmly on the user experience by enriching raw data upstream, thereby improving decision-making. Companies need search solutions that are capable of covering their entire information system and that can help users to classify, share and search for information quickly. Companies also need to build their business intelligence strategy around these data, ensuring that strategically important information is read, classified and analyzed automatically. Furthermore, they require business-specific applications that provide search functions tailored to each of their activities, such as customer service, HR, compliance and sales management.
PolySpot's solutions have millions of users worldwide, across all business sectors, with customers including Allianz, BNP Paribas, Bureau Veritas, Crédit Agricole, OSEO, Schlumberger, Veolia, Trinity Mirror, and Vinci.
Gilles Andre is PolySpot's chief executive officer. He joined the firm in 2011. I spoke with Mr. Andre right after the American Thanksgiving holiday. I wanted to learn about his plans for PolySpot and how the firm’s technology meshed with the enterprise search and content processing trends which will influence organizations’ buying decisions in 2012. The full text of the interview appears below.
How did you get interested in content processing?
It’s a long story: I set up what is now known as an extract, transform, and load (ETL) company in 1997. After two fast-growth years, my team and I had hit $20 million in annual revenues. Then we made a decision that looked highly attractive for our investors. We sold the company to the Canadian firm Hummingbird. Looking back, I think we sold too early.
What did that experience teach you about digital information and opportunity?
Back to your question: I learned that there are significant opportunities in the market sector. I also learned that unstructured data is amplifying some opportunities.
Well, I began talking with Olivier Lefassy in late 2009 about the digital information space. I have known Olivier for more than a decade, and we began to discuss how the structured and unstructured challenges were triggering some problems that traditional enterprise software vendors could not solve. The major vendors talked a good game, but they could not deliver solutions to the “big data problem” quickly or economically.
I decided to have a closer look at the PolySpot opportunity and conducted intensive due diligence on the market space to evaluate which disruptive strategy and positioning would convert PolySpot from a good “French” product into an international success. I think it was ten months ago that I concluded that Olivier, his team and I could deliver a solution to organizations wanting to gain access to information which answered a business question. Our approach was to leapfrog the study, plan, implement, customize, and upgrade approach that most enterprise vendors force upon their licensees.
What stood out were PolySpot’s agile framework, its use of open source technology like Lucene, and a focus on putting information in the business work flow. Olivier Lefassy, David Fischer - our CTO - and I had developed some interesting ideas, and I was eager to fine tune these elements into a business model that would propel PolySpot over the hurdles which cause many enterprise information solutions to fail.
What is your background?
Ah, I have an MBA in Finance & Marketing. I have been fortunate to develop a facility in French, English, and Spanish. We have some other basic biographical data on LinkedIn www.linkedin.com if anyone is interested. More relevant to the opportunity at PolySpot is that I have 10 years’ corporate experience in a high-level position at Thomson Multimedia.
In addition to the big company background, I have 15 years of successful software entrepreneurial experience. I have worked with three companies through the critical stage of taking an idea and sparking rapid growth.
Can you give me an example?
Yes, there was Leonard’s Logic, which pioneered the ETL segment. We were able to generate about $15 million in our first 18 months of activity. As I said, we were acquired by Hummingbird. After that great learning experience, I was able to garner more international experience working in the US, India, the UK, and Spain.
What’s your exposure to open source?
Good question. When I was a board member at Talend, a very successful French initiative in the ETL segment, from inception in 2006 to December 2010, I came to understand the potential of open source software. PolySpot gives me a chance to leverage my knowledge about fast growth, high potential companies, open source software, and the “big data” opportunity around us. I think you can say that data management and information are woven throughout my business fabric.
What are some of the specifics which you identified as giving PolySpot an advantage in the highly competitive enterprise information market sectors?
Let me come at the question this way. First, I think PolySpot is more than a one-solution company. PolySpot enables information retrieval and findability for a system’s users. PolySpot’s approach makes it possible to transform what I think of as “information chaos” into an “information asset.” The approach, stated simply, makes our licensees’ lives easier.
What do you mean?
I believe that solving the problem of information overload is a challenge in many organizations. Yesterday’s technologies have reached their limits in the face of the “big data” phenomenon.
So, the business value of PolySpot consists of delivering enriched information (knowledge) to the right people at the right time. We do this quickly, at a very competitive price point, and without the consulting and engineering seesaw that many vendors force their customers to endure.
I heard that you have about 60 employees. How many are working on the product?
I don’t want to provide too many details about our staffing, but more than half of our team are working full time on PolySpot products.
I want to dig into what makes PolySpot different from the more than 300 companies I monitor. In your view, what sets PolySpot apart?
PolySpot is involved from the data collection process to information delivery. We have what I call an “innovative enrichment module.” Our product officer calls this core component the PolySpot Sense Builder.
That’s a good name.
I agree. So what we have is proprietary technology to take unstructured data like email or Web pages and add value to the content.
To deal with the data extraction issue, PolySpot has developed more than 70 connectors (http://www.polyspot.com/produits/connecteurs-polyspot.html) enabling data and content extraction from the most widely used sources; for example, file systems, content management systems, Lotus Notes, customer relationship management systems, SQL databases, and so on.
We distribute our connectors across the licensees’ information system. The content goes through the ETL sequence and the value-added indexing process and ends up in an “information warehouse” or info-warehouse. On the surface, our approach looks like a data warehouse but our implementation converts storage into the PolySpot Sense Builder. What we put in place is a dynamic, real time information hub. The Sense Builder handles structured and unstructured data and enables our state-of-the-art cross-referential and semantic enrichment. We extract entities, index, and perform more sophisticated operations to allow a user to get exactly what he or she needs to answer a business question.
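The flow described in this answer, from connector through transformation and enrichment to the info-warehouse, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PolySpot's code: the connector, the sample texts, the `Record` type, and the regular-expression "entity extractor" are all hypothetical stand-ins for far more capable components.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Record:
    """One item in the hypothetical info-warehouse."""
    doc_id: str
    source: str
    text: str
    entities: dict = field(default_factory=dict)


def file_share_connector():
    """Hypothetical connector: yields raw (id, text) pairs from one silo."""
    yield "doc-1", "  Site report by Jane Doe,   Acme UK.  "
    yield "doc-2", "Price catalog updated by John Smith."


def transform(doc_id, raw_text, source):
    """Transform step: normalize whitespace and wrap the text in a Record."""
    return Record(doc_id=doc_id, source=source, text=" ".join(raw_text.split()))


# Toy entity extractor (an assumption): a capitalized first and last name after "by".
AUTHOR_PATTERN = re.compile(r"by ([A-Z][a-z]+ [A-Z][a-z]+)")


def enrich(record):
    """Enrichment step: attach extracted entities to the record."""
    record.entities["people"] = AUTHOR_PATTERN.findall(record.text)
    return record


def run_pipeline(connector, source):
    """Extract, transform, enrich, then load into an in-memory warehouse."""
    warehouse = {}
    for doc_id, raw_text in connector:
        record = enrich(transform(doc_id, raw_text, source))
        warehouse[record.doc_id] = record
    return warehouse


if __name__ == "__main__":
    for rec in run_pipeline(file_share_connector(), "file-share").values():
        print(rec.doc_id, rec.entities)
```

Keeping each stage as a separate function mirrors the separation the answer describes: new connectors or enrichment rules can be added without touching the other stages.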
Would you give me an example of what types of queries I can run with PolySpot?
Keep in mind we are not just a search system. But you can enter a couple of words and search for a specific expertise or job title to retrieve related documents. Because we extract names and entities, we can automatically link the data in a corporate directory and include information about who wrote a document, who read a document, and other actions.
Are you creating reports?
I would say that we are actually blending raw data to produce valuable and relevant information to targeted users. You can use the PolySpot processed content to generate a very useful report, but you can—as I suggested—locate information related to a person, product, competitor, and so on.
We like to describe our unique approach in terms of what it makes possible; for example, our innovative architecture is an “Agile Enterprise Search infrastructure.” We point out that, at a very fundamental level, our approach offers highly flexible information management and a wide range of external processing capabilities to raise raw data to valuable knowledge.
Is the architecture able to handle rich media?
Yes, rich media is a very important data type now. So we can process spoken words into text and then process it within our PolySpot system.
These types of next generation operations permit transformation and cross-enrichment. Because we use asynchronous processes, we can handle the types of content processing spikes that slow down many of our competitors’ systems.
And there is no need to re-index for new enrichment or metadata. As you know many commercial systems endlessly reindex. We have avoided this engineering misstep.
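The interview does not say how PolySpot avoids reindexing. One common way to get that property, shown here purely as an assumption, is to keep enrichment metadata in a store that is separate from the full-text index, so that adding a new field never touches the indexed text. The `InfoHub` class and all of its names are hypothetical.

```python
from collections import defaultdict


class InfoHub:
    """Toy hub: a full-text index plus a separate metadata store."""

    def __init__(self):
        self.text_index = defaultdict(set)   # term -> set of doc ids
        self.metadata = defaultdict(dict)    # doc id -> enrichment fields
        self.index_builds = 0                # counts full-text indexing work

    def index_text(self, doc_id, text):
        """Index the document text once, when the document arrives."""
        self.index_builds += 1
        for term in text.lower().split():
            self.text_index[term].add(doc_id)

    def enrich(self, doc_id, field_name, value):
        """Add or update a metadata field without touching the text index."""
        self.metadata[doc_id][field_name] = value

    def search(self, term, **filters):
        """Match on text, then filter on whatever metadata exists so far."""
        hits = self.text_index.get(term.lower(), set())
        return sorted(
            doc_id for doc_id in hits
            if all(self.metadata[doc_id].get(k) == v for k, v in filters.items())
        )


if __name__ == "__main__":
    hub = InfoHub()
    hub.index_text("d1", "safety procedure")
    hub.index_text("d2", "safety audit")
    hub.enrich("d1", "country", "UK")   # new metadata arrives later
    print(hub.search("safety", country="UK"), hub.index_builds)
```

In this sketch the `index_builds` counter stays at two however many enrichment fields are added afterwards, which is the behavior the answer claims for the product.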
How does PolySpot handle high volume flows of real time information?
PolySpot’s approach consists of consolidating the collected data. We intake the information and process it as it arrives. The indexes are updated in real time.
PolySpot infrastructure is based on Lucene/Solr technologies. What's the value of this new infrastructure?
Like IBM, we rely on the open source community to keep the core search and retrieval functionality working. We believe that Lucene offers the strongest indexing library available on the market with thousands of contributors. For basic search, it is quite challenging to compete with the Lucene/Solr community. We have learned that for an enterprise use case, functionality is very good and because the code is open, new features and functions can be added to meet a specific client’s requirements. A commercial search and retrieval solution, on the other hand, puts road blocks in front of developers who need to make a change to the commercial package.
What has PolySpot built around but separate from the Lucene core?
That’s a great question. We build the connectors I mentioned before and a connector software development kit. We engineered our proprietary transformation and enrichment platform (that’s the Sense Builder components) which adds intelligence to raw information. We also developed a very innovative end-to-end administration console that makes it possible to design and maintain search applications with no particular technical skill. This eases Lucene and Solr configuration but also amplifies the search functionality provided by Solr. Last, we have added display modules, information views, and graphical user interfaces. These can easily be customized. To make it brief, PolySpot delivers the first end-to-end packaged search infrastructure over Lucene and Solr core technologies.
Isn’t deploying Lucene a time consuming activity?
One of the benefits of our approach is that the normal month long or multi week cycle for Lucene is eliminated. We have engineered the PolySpot system so that most simple deployments take a day to five days.
Is this a stripped down implementation?
No, the core system permits search, relationship discovery, and business intelligence functions. In most cases, our licensees use the system from the day it becomes live, adding tweaks as specific needs are identified. The entire process saves 90 percent of the hassle for the installation of basic search and deploying advanced reporting functions.
Most analysts fall into a routine when it comes to analyzing data. Tables of numbers are often ignored. Systems output flashy graphics, but some of the visuals are impenetrable to me. What does PolySpot do to make its systems' outputs useful to an analyst or business professional?
We follow many of the ideas in the book Information Is Beautiful by David McCandless. He does an excellent job of addressing many of the issues associated with presenting and visualizing complex information. We think we have done a very solid job in providing report and visualization templates in our standard product. We include graphical best practices as well.
We provide and maintain a large collection of graphical widgets and objects. These range from lists to sliders to basic charts and graphs. We also offer different display modes, grid views, and mosaics, among others. We include default presentations for documents, news, people, products, images, and video. Each of these has specific presentation and access requirements.
Let’s talk about how a PolySpot client uses your firm’s system. Can you walk me through a use case?
Sure. As you know, employees need relevant information to make decisions at the right time.
Is this a single location install?
No, Eurovia operates globally. Because of the distributed nature of the company and the volume of data growing in the firm’s many offices, an information challenge existed. In our discussions with senior management, we learned that there was the problem of “big data” in both structured and unstructured forms. There were different types of content and many file forms.
Users wanted to tap into the available information and be able to access relevant, in context information quickly and easily. Employees wanted a single point of access. But several systems had to be learned. Information outputs then had to be pieced together from many different sources. There was a need to get information about business procedures, tap into the experience of colleagues, pull information from catalogs and databases of prices, and so on.
What did PolySpot do?
We installed the basic system. We deployed the connectors needed to access the content in the many different “silos” of information. We processed the information with our transformation and enrichment module. We automatically performed semantic and cross referential enrichment processes. As part of this process, we did entity extraction and tagging; for instance, assigning job titles, information from various company directories, and so on.
We made the system available. There was no need for formal training. Users had access to on point information. The technology team was able to make changes via our graphical administration console. The system administrator could edit or create business extraction and transformation rules. Our asynchronous enrichment processes allowed content to remain fresh. As an added benefit, as the system iterates, it automatically enriches content in a progressive way. There is, therefore, no need to reindex content when adding new metadata to the system.
The employees use one screen to access information via a single point of access. The information presented to the user is contextualized and enriched thanks to semantic analysis and cross-referential enrichment.
Would you give me an example of a typical query?
Sure, employees can search like this: “I want documents written by a site manager in the UK.” The user does not have to know the author of the document to find information. If the user does not want to search, a click will retrieve activity reports, business procedures and processes, or compliance functions. This means that PolySpot enforces compliance actions, which raises employee productivity and minimizes most major compliance risks.
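The “site manager in the UK” query works because each document's author has been cross-referenced with a corporate directory, so a search can filter on the author's role and location rather than on the author's name. A minimal sketch of that join follows; the `Person` and `Document` types, the field names, and the sample data are hypothetical, and a real deployment would run such a filter inside the search engine rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical corporate directory entry."""
    name: str
    job_title: str
    country: str


@dataclass(frozen=True)
class Document:
    """Hypothetical document whose author was extracted during enrichment."""
    doc_id: str
    title: str
    author: str


def find_documents(documents, directory, job_title, country):
    """Return documents whose author holds the given title in the given country."""
    by_name = {person.name: person for person in directory}
    results = []
    for doc in documents:
        author = by_name.get(doc.author)  # cross-reference with the directory
        if author and author.job_title == job_title and author.country == country:
            results.append(doc)
    return results


if __name__ == "__main__":
    directory = [
        Person("Jane Doe", "site manager", "UK"),
        Person("John Smith", "site manager", "FR"),
    ]
    documents = [
        Document("d1", "Weekly site report", "Jane Doe"),
        Document("d2", "Rapport de chantier", "John Smith"),
    ]
    for doc in find_documents(documents, directory, "site manager", "UK"):
        print(doc.doc_id, doc.title)
```

The user supplies only a role and a country; the author's name is resolved through the directory, which matches the point that the searcher does not need to know who wrote the document.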
Let’s assume the company operates with many different languages. What is the PolySpot use case?
PolySpot supports many languages. So for our client Bureau Veritas we provided a quality and safety solution. The system had to provide specific information and work flow actions to more than 50,000 employees in 120 countries. Quite a complex challenge. What makes this an interesting use case is the fact that Bureau Veritas grows via acquisitions. So the company is constantly having to process content from these new units. With our pan-enterprise approach, employees could find and act on information regardless of its location in the company. This was work flow and compliance, not just content access.
What types of content can PolySpot process and make actionable?
We deal with more than 500 different file formats. Our ETL capabilities make it easy to handle structured (from databases), semi-structured (documents along with metadata) and unstructured (full text, the Web, social outputs from Facebook, and many other types) content. Our connector SDK makes it easy to process a file type we do not currently support.
Where does PolySpot's product fit in an organization?
We know from what our clients tell us that PolySpot complements content management systems, portals, and similar information centric systems. Many of our licensees use PolySpot to enhance the scope of the initial content available within an enterprise portal solution for staff and other authorized users.
We have been told that PolySpot was used to make a shift from one content management and production system to another vendor’s editorial system. PolySpot delivered transparent information to the new system. PolySpot was used as a change management system.
A number of our clients use the PolySpot modules to enhance and enlarge the scope of the standard search capabilities of existing enterprise applications; for example, adding features to a legacy enterprise search system or an enterprise financial management suite.
Does PolySpot have a partner program?
Yes, and we are expanding that ecosystem now and throughout 2012. We have been creating a partner ecosystem and can select the best partner to fit specific customer needs accordingly. We have partners who are specialists in CMS, audio, video, etc.
Ease of deployment is our flagship theme to convince a very large number of service companies to join our network. Over 20 partners passed our certification program in the last 3 months. That is a sign that PolySpot is delivering innovation along with high ROI.
What can prospects and licensees expect from PolySpot in 2012?
One idea we will be moving forward is based on the fact that open source invites innovative approaches. For two years now, we have been working on our transformation and enrichment module, the PolySpot Sense Builder. We want to see if we can get others to build solutions on our platform and then make these components available to other PolySpot licensees, partners, and integrators.
When you look forward six to 12 months, what are the major trends you are monitoring in entity extraction?
I continue to watch relationship discovery. The idea which I think will gain importance in 2012 is finding out what is happening when, where, and between what entities (people, companies, and events). I am also monitoring apps in the enterprise and the shift to mobile access to a range of enterprise information and activities. Overall in 2012, there will be more attention given to delivering useful facts to business users when the users need the data.
How does a reader get more information about PolySpot?
I suggest that your readers navigate to our Web site at www.polyspot.com. We provide links to our blog, our Twitter posts, and other information about our company, technology, products, and services.
We have monitored PolySpot for several years. The company has made rapid progress by several measures. The company has been adding employees to keep pace with client demand. The firm’s technology has gained important features and functions. Of particular interest is PolySpot’s use of open source search technology to provide core information retrieval functions. Finally, the company has an interesting approach to tap into both the commercial demand for big data solutions and the open source community’s need to have robust infrastructure to which to connect. Our view is that PolySpot has emerged as a firm that warrants close observation. Companies looking for a fast-cycle, high-performance system upon which to develop next-generation information solutions will want to take a close look at PolySpot. The company offers full support for mobile apps and has technology which makes installation, configuration, customization, and content acquisition quick, smooth, and free of the “sell first, code later” approach to real-life enterprise information challenges.
Stephen E. Arnold, December 13, 2011