Interview with Informatica CEO

November 26, 2015

Blogger and Datameer CEO Stefan Groschupf interviews Anil Chakravarthy, acting CEO of Informatica, in a series of posts on his blog, Big Data & Brews. The two executives discuss security in the cloud, data infrastructure, schemas, and the future of data. There are four installments as of this writing, but it was an exchange in the second installment, “Big Data & Brews: Part II on Data Security with Informatica,” that captured our attention. Here is the exchange, in which Chakravarthy sums up the opportunity his company now faces:

Stefan: From your perspective, where’s the biggest growth opportunity for your company?

Anil: We look at it as the intersection of what’s happening with the cloud and big data. Not only the movement of data between our premise and cloud and within cloud to cloud but also just the sheer growth of data in the cloud. This is a big opportunity. And if you look at the big data world, I think a lot of what happens in the big data world from our perspective, the value, especially for enterprise customers, the value of big data comes from when they can derive insights by combining data that they have from their own systems, etc., with either third-party data, customer-generated data, machine data that they can put together. So, that intersection is good for, and we are a data infrastructure provider, so those are the two big areas where we see opportunity.

It looks like Informatica is poised to make the most of the changes prompted by cloud technology. To check out the interview from the beginning, navigate to the first installment, “Big Data & Brews: Informatica Talks Security.”

Informatica offers a range of data-management and integration tools. Though the company has offices around the world, it maintains its headquarters in Redwood City, California. It is also hiring as of this writing.

Cynthia Murrell, November 26, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

 

Rundown on Legal Knowledge Management

September 24, 2015

One of the newest legal buzzwords is knowledge management, and not just old-fashioned knowledge management, but knowledge management that is quick, efficient, and effective.  Time is an expensive commodity for legal professionals, especially given the amount of data they must sift through for cases.  Mondaq explains the importance of knowledge management for law professionals in the article, “United States: A Brief Overview Of Legal Knowledge Management.”

Knowledge management began as an effort to create an effective process for managing, locating, and searching relevant files, but it quickly evolved into implementing document management systems.  While knowledge management companies offered law practices decent document management software to tackle the mountain of data, an even bigger problem arose: the practices needed dedicated people to serve as software experts:

“Consequently, KM emphasis had to shift from finding documents to finding experts. The expert could both identify useful documents and explain their context and use. Early expertise location efforts relied primarily on self-rating. These attempts almost always failed because lawyers would not participate and, if they did, they typically under- or over-rated themselves.”

The biggest problem law professionals face is that they might invest a small fortune in a document-management license yet not know how to use the software or have the time to learn it.  It is a reminder that someone might have all the knowledge and the best tools at their fingertips, but unless they know how to access and use them, that knowledge is useless.

Whitney Grace, September 24, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Online Ads Discriminate

August 3, 2015

In our modern age, discrimination is supposed to be a thing of the past.  When it does appear, people take to the Internet to vent their rage and frustrations, eager to point out the offense.  Online ads, however, lack human intelligence and are only as smart as their programmed algorithms.  Technology Review explains in “Probing The Dark Side of Google’s Ad-Targeting System” that Google’s ad service makes questionable decisions based on gender and other personal information.

A research team from Carnegie Mellon University and the International Computer Science Institute built AdFisher, a tool that tracks targeted third-party ads served through Google.  AdFisher found that ads discriminated against female users; for example, simulated male users were shown ads for high-paying executive jobs more often than simulated female users.  Google offers a transparency tool that allows users to select what types of ads appear in their browsers, but even using the tool does not stop some personal information from being used.

“What exactly caused those specific patterns is unclear, because Google’s ad-serving system is very complex. Google uses its data to target ads, but ad buyers can make some decisions about demographics of interest and can also use their own data sources on people’s online activity to do additional targeting for certain kinds of ads. Nor do the examples breach any specific privacy rules—although Google policy forbids targeting on the basis of “health conditions.” Still, says Anupam Datta, an associate professor at Carnegie Mellon University who helped develop AdFisher, they show the need for tools that uncover how online ad companies differentiate between people.”
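The researchers’ approach is essentially a controlled experiment: create groups of automated browser agents that differ only in one profile attribute, collect the ads each group is served, and test whether any difference between the groups is statistically significant.  Here is a toy sketch of that statistical core in Python; the counts, group sizes, and labels are all hypothetical, not data from the study.

# Toy sketch of an AdFisher-style significance test (hypothetical data).
# Two groups of simulated agents differ only in one profile attribute;
# we record whether each agent was shown a particular ad (1) or not (0)
# and ask whether the observed gap could plausibly arise by chance.
import random

group_a = [1] * 40 + [0] * 60   # hypothetical: agents with profile A
group_b = [1] * 20 + [0] * 80   # hypothetical: agents with profile B

observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))

pooled = group_a + group_b
n_a = len(group_a)
trials, extreme = 10000, 0
for _ in range(trials):
    random.shuffle(pooled)                  # relabel agents at random
    diff = abs(sum(pooled[:n_a]) / n_a -
               sum(pooled[n_a:]) / (len(pooled) - n_a))
    if diff >= observed:
        extreme += 1

print("observed gap: %.2f, permutation p-value: %.4f"
      % (observed, extreme / trials))

A real experiment would, of course, collect live ads with instrumented browsers rather than canned numbers; the point here is only the shape of the analysis.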

The transparency tool controls only some of the ads, and third parties can use their own tools to extract data.  Google stands by its transparency tool and even offers users the option to opt out of targeted ads.  The company is studying AdFisher’s results to determine their implications.

The study shows that personal data spills onto the Internet every time we click a link or open a browser.  It is frightening how that data can be used, and even hurtful when ad systems interpret it incorrectly.  The bigger question is not how retailers and Google use the data, but how government agencies and other institutions plan to use it.

Whitney Grace, August 3, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Hadoop Rounds Up Open Source Goodies

July 17, 2015

Summertime is here, and what better way to celebrate the warm weather and fun in the sun than with some fantastic open source tools?  Okay, so you probably will not take your computer to the beach, but if you have a vacation planned, one of these tools might help you complete your work faster so you can get closer to that umbrella and cocktail.  Datamation has a great listicle focused on “Hadoop And Big Data: 60 Top Open Source Tools.”

Hadoop is one of the most widely adopted open source tools for big data solutions.  The Hadoop market is expected to be worth $1 billion by 2020, and IBM has dedicated 3,500 employees to developing Apache Spark, part of the Hadoop ecosystem.
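For readers who have not touched the ecosystem, the flavor of Spark is easy to convey.  Below is a minimal word-count job in PySpark, a sketch assuming a local Spark installation; the input file name is hypothetical.

# Minimal PySpark word count -- the canonical Hadoop-ecosystem example.
from pyspark import SparkContext

sc = SparkContext("local[*]", "WordCount")   # run Spark locally, all cores

counts = (
    sc.textFile("input.txt")                 # hypothetical input file
      .flatMap(lambda line: line.split())    # split lines into words
      .map(lambda word: (word, 1))           # pair each word with a count
      .reduceByKey(lambda a, b: a + b)       # sum counts per word
)

for word, count in counts.take(10):
    print(word, count)

sc.stop()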

As open source is a huge part of the Hadoop landscape, Datamation’s list provides invaluable information on tools that could mean the difference between a successful project and a failed one.  They could also save some extra cash in the IT budget.

“This area has a seen a lot of activity recently, with the launch of many new projects. Many of the most noteworthy projects are managed by the Apache Foundation and are closely related to Hadoop.”

Datamation has maintained this list for a while and updates it from time to time as the industry changes.  The list is not ranked; instead, the tools are grouped into categories, each with a short description explaining what the tool does.  The categories include Hadoop-related tools, big data analysis platforms and tools, databases and data warehouses, business intelligence, data mining, big data search, programming languages, query engines, and in-memory technology.  There is a tool for nearly every problem that could come up in a Hadoop environment, so the listicle is definitely worth a glance.

Whitney Grace, July 17, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

 
