Is Your Company a Data Management Leader or Laggard?

November 4, 2016

The article titled Companies are Falling Short in Data Management on IT ProPortal describes the obstacles facing many businesses when it comes to data management optimization. Why does this matter? The article states that big data analytics and the internet of things will combine to form an over $300 billion industry by 2020. Companies that fail to build up their capabilities will lose out—big. The article explains,

More than two thirds of data management leaders believe they have an effective data management strategy. They also believe they are approaching data cleansing and analytics the right way…The [SAS] report also says that approximately 10 per cent of companies it calls ‘laggards’, believe the same thing. The problem is – there are as many ‘laggards’, as there are leaders in the majority of industries, which leads SAS to a conclusion that ‘many companies are falling short in data management’.

In order to avoid this trend, company leaders must identify the obstacles impeding their path. A better focus on staff training and development is only possible after recognizing that a lack of internal skills is one of the most common issues. Additionally, companies must clearly define their data strategy and disseminate the vision among all levels of personnel.

Chelsea Kerwin, November 4, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Semantify Secures Second Funding Round

August 4, 2016

Data-management firm Semantify has secured more funding, we learn from “KGC Capital Invests in Semantify, Leaders in Cognitive Discovery and Analytics” at Benzinga. The write-up tells us primary investor KGC Capital was joined by KDWC Venture Fund and Bridge Investments in making the investment, as well as by existing investors (including its founder, Vishy Dasari). The funds from this Series A funding round will be used to address increased delivery, distribution, and packaging needs.

The press release describes Semantify’s platform:

“Semantify automates connecting information in real time from multiple silos, and empowers non-technical users to independently gain relevant, contextual, and actionable insights using a free form and friction-free query interface, across both structured and unstructured content. With Semantify, there would be no need to depend on data experts to code queries and blend, curate, index and prepare data or to replicate data in a new database. A new generation self-service enterprise Ad-hoc discovery and analytics platform, it combines natural language processing (NLP), machine learning and advanced semantic modeling capabilities, in a single seamless proprietary platform. This makes it a pioneer in democratization of independent, on demand information access to potentially hundreds of millions of users in the enterprise and e-commerce world.”

Semantify cites its “fundamentally unique” approach to developing data-management technology as the force behind its rapid deployment cycles, low maintenance needs, and lowered costs. Formerly based in Delaware, the company is moving its headquarters to Chicago (where its investors are based). Semantify was founded in 2008. The company is also hiring; its About page declares, toward the bottom: “Growing fast. We need people;” as of this writing, it is seeking database/BI experts, QA specialists, data scientists & knowledge modelers, business analysts, program & project managers, and team leads.

Cynthia Murrell, August 4, 2016

How-To Overview of Building a Data Platform to Handle Real-Time Datasets

March 11, 2016

The article on Insight Data Engineering titled Building a Streaming Search Platform offers a glimpse into the Fellows Program wherein grad students and software engineers alike build data platforms and learn cutting-edge open source technologies. The article delves into the components of the platform, which enables close to real-time search of a streaming text data source, with Twitter as an example. It also explores the usefulness of such a platform,

“On average, Twitter users worldwide generate about 6,000 tweets per second. Obviously, there is much interest in extracting real-time signal from this rich but noisy stream of data. More generally, there are many open and interesting problems in using high-velocity streaming text sources to track real-time events. … Such a platform can have many applications far beyond monitoring Twitter… All code for the platform I describe here can be found on my github repository Straw.”

Ryan Walker, a Casetext Data Engineer, describes how these products might deliver major results in the hands of a skilled developer. He uses the example of a speech-to-text monitor that transcribes radio or TV feeds and sends the transcriptions to the platform. The platform would then look for key phrases and could even be set up to respond with real-time event management. Industries that depend on real-time information processing, including finance and marketing, will find this capability very intriguing.
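For readers who want a feel for the idea, here is a minimal sketch of matching registered key phrases against a stream of incoming text. It is not the Straw code, which lives in the author's repository; the class name `PhraseMonitor`, the function `monitor_stream`, and the plain substring matching are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Set, Tuple


class PhraseMonitor:
    """Hypothetical in-memory matcher: phrases are registered once,
    then every incoming document is checked against all of them."""

    def __init__(self) -> None:
        # Maps a lowercased phrase to the subscribers interested in it.
        self._subscriptions: Dict[str, Set[str]] = defaultdict(set)

    def subscribe(self, subscriber: str, phrase: str) -> None:
        self._subscriptions[phrase.lower()].add(subscriber)

    def match(self, text: str) -> List[Tuple[str, str]]:
        """Return sorted (subscriber, phrase) pairs whose phrase occurs in text."""
        lowered = text.lower()
        hits = []
        for phrase, subscribers in self._subscriptions.items():
            if phrase in lowered:  # simple substring test, an assumption
                hits.extend((s, phrase) for s in subscribers)
        return sorted(hits)


def monitor_stream(
    monitor: PhraseMonitor, stream: Iterable[str]
) -> Iterator[Tuple[str, str, str]]:
    """Yield (subscriber, phrase, text) for each match as documents arrive."""
    for text in stream:
        for subscriber, phrase in monitor.match(text):
            yield subscriber, phrase, text
```

In a real deployment the `stream` argument would be fed by a message queue or a transcription service rather than a Python list, and the matching step would likely use a proper search index instead of substring tests.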

 

Chelsea Kerwin, March 11, 2016


Barry Zane and SPARQL City Acquired by Cambridge Semantics for Graph Technology

February 12, 2016

The article titled Cambridge Semantics Acquires SPARQL City’s IP, Expanding Offering of Graph-Based Analytics at Big Data Scale on Business Wire discusses the benefits of merging Cambridge Semantics’ Anzo Smart Data Platform with SPARQL City’s graph analysis capacities. The article specifically mentions the pharmaceutical industry, financial services, and homeland security as major business areas that this partnership will directly engage, thanks to the enhanced data analysis and graph technologies now possible.

“We believe this IP acquisition is a game-changer for big data analytics and smart data discovery,” said Chuck Pieper, CEO of Cambridge Semantics. “When coupled with our Anzo Smart Data Platform, no one else in the market can provide a similar end-to-end, semantic- and graph-based solution providing for data integration, data management and advanced analytics at the scale, context and speed that meets the needs of enterprises. The SPARQL City in-memory graph query engine allows users to conduct exploratory analytics at big data scale interactively.”

Barry Zane, a leader in database analytics with 40 years of experience and the CEO and founder of SPARQL City, will become the VP of Engineering at Cambridge Semantics. He mentions in the article that this acquisition has been a long time coming, with the two companies working together over the last two years.

Chelsea Kerwin, February 12, 2016

Machine Learning Hindsight

January 18, 2016

Have you ever found yourself saying, “If I only knew then, what I know now”?  It is a moment we all experience, but instead of stewing over our past mistakes it is better to share the lessons we’ve learned with others.  Data scientist Peadar Coyle learned some valuable lessons when he first started working with machine learning.  He discusses three main things he learned in the article, “Three Things I Wish I Knew Earlier About Machine Learning.”

Here are the three things he wishes he had known then about machine learning, but knows now:

  • “Getting models into production is a lot more than just micro services
  • Feature selection and feature extraction are really hard to learn from a book
  • The evaluation phase is really important”

Developing models is the easy step; putting them into production is difficult. There are many major steps that need attention, and doing all of the little jobs yourself isn’t feasible on huge projects. Peadar recommends outsourcing when you can. Books and online information are good reference tools, but knowledge that cannot be applied to actual situations is useless. Peadar learned that nothing compares to real-world experience. Evaluation is just as invaluable: life does not hand out perfect datasets for experimentation, and testing the model against different situations gives a better picture of how it performs.

Peadar’s advice applies to machine learning, but it applies more to life in general.

Whitney Grace, January 18, 2016

IBM’s CFO Reveals IBM’s Innovation Strategy: Why Not Ask Watson?

January 11, 2016

The article on TechTarget titled IBM CFO Schroeter on the Company’s Innovation Strategy delves into the mind of Martin Schroeter regarding IBM’s strategy for chasing innovation in healthcare and big data. This year alone IBM acquired three healthcare companies with data on roughly one hundred million people as well as massive amounts of data on medical conditions. Additionally, as the article relates,

“IBM’s purchase of The Weather Co.’s data processing and analytics operations brought the company a “massive ingestion machine,” which plays straight into its IoT strategy, Schroeter said. The ingestion system pulls in 4 GB of data per second, he said, and runs a lot of analytics as users generate weather forecasts for their geographies. The Weather Co. system will be the basis for the company’s Internet of Things platform, he said.”

One of many interesting tidbits from Schroeter was this gem about companies being willing to “disrupt [themselves]” to ensure updated, long-term strategies that align technological advancement with business development. The hurtling pace of technology has even led IBM to develop a predictive system to speed up the due diligence process during acquisitions. Analysis that once took weeks, and often cost IBM deals, has now been streamlined to a single day’s work. Kaboom.

Chelsea Kerwin, January 11, 2016

Interview with Informatica CEO

November 26, 2015

Blogger and Datameer CEO Stefan Groschupf interviews Anil Chakravarthy, acting CEO of Informatica, in a series of posts on his blog, Big Data & Brews. The two executives discuss security in the cloud, data infrastructure, schemas, and the future of data. There are four installments as of this writing, but it was an exchange in the second iteration, “Big Data & Brews: Part II on Data Security with Informatica,” that captured our attention. Here’s Chakravarthy’s summary of the challenge now facing his company:

Stefan: From your perspective, where’s the biggest growth opportunity for your company?

Anil: We look at it as the intersection of what’s happening with the cloud and big data. Not only the movement of data between our premise and cloud and within cloud to cloud but also just the sheer growth of data in the cloud. This is a big opportunity. And if you look at the big data world, I think a lot of what happens in the big data world from our perspective, the value, especially for enterprise customers, the value of big data comes from when they can derive insights by combining data that they have from their own systems, etc., with either third-party data, customer-generated data, machine data that they can put together. So, that intersection is good for, and we are a data infrastructure provider, so those are the two big areas where we see opportunity.

It looks like Informatica is poised to make the most of the changes prompted by cloud technology. To check out the interview from the beginning, navigate to the first installment, “Big Data & Brews: Informatica Talks Security.”

Informatica offers a range of data-management and integration tools. Though the company has offices around the world, they maintain their headquarters in Redwood City, California. They are also hiring as of this writing.

Cynthia Murrell, November 26, 2015

IBM Launches Informative Blog

November 20, 2015

IBM has created a free Paper.li blog that features information about the company: IBM’s InfoSphere Master Data Management Roundup. Besides the general categories of Headlines and Videos, readers can explore articles under Science, Technology, Business, and two IBM-specific categories, #Bluemix and #IBM. If you love to watch as Big Blue gets smaller, you will find this free newspaper useful in tracking some of the topics upon which IBM is building its future.

Oddly, though, we did not spot any articles from Alliance at IBM on the site. Some employees are unhappy with the way the company has been treating its workers, and have launched that site to publicize their displeasure. Here’s their Statement of Principles:

“Alliance@IBM/CWA Local 1701 is an IBM employee organization that is dedicated to preserving and improving our rights and benefits at IBM. We also strive towards restoring management’s respect for the individual and the value we bring to the company as employees. Our mission is to make our voice heard with IBM management, shareholders, government and the media. While our ultimate goal is collective bargaining rights with IBM, we will build our union now and challenge IBM on the many issues facing employees from off-shoring and job security to working conditions and company policy.”

It looks like IBM has more to worry about than sliding profits. Could the two issues be related?

Cynthia Murrell, November 20, 2015

Matchlight Lights Up Stolen Data

June 26, 2015

It is a common gimmick on crime shows for the computer expert to be able to locate information, often stolen data, by using a few clever hacking tricks.  In reality it is not that easy and quick to find stolen data, but eWeek posted an article about a new intelligence platform that might be able to do the trick: “Terbium Labs Launches Matchlight Data Intelligence Platform.”  Terbium Labs’ Matchlight is able to recover stolen data as soon as it is released on the Dark Web.

How it works is simply remarkable. Matchlight attaches digital fingerprints to a company’s files, down to the smallest byte. Data recovered on the Dark Web can then be matched against Terbium Labs’ database. Matchlight is available under a SaaS model. Another option for clients is a one-way fingerprinting feature that keeps a company’s data private from Terbium Labs, which would only have access to the digital fingerprints in order to track the data. Matchlight can also be integrated into existing SharePoint or other document management systems. The entire approach behind Matchlight takes a protective stance toward data, rather than a defensive one.

“We see the market shifting toward a risk management approach to information security,” [Danny Rogers, CEO and co-founder of Terbium] said. “Previously, information security was focused on IT and defensive technologies. These days, the most innovative companies are no longer asking if a data breach is going to happen, but when. In fact, the most innovative companies are asking what has already happened that they might not know about. This is where Matchlight provides a unique solution.”

Across the board, data breaches are becoming common and Matchlight offers an automated way to proactively protect data.  While the digital fingerprinting helps track down stolen data, does Terbium Labs have a way to prevent it from being stolen at all?
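The one-way fingerprinting idea can be illustrated with a rough sketch. The article does not detail Terbium Labs’ actual algorithm, so everything here is an assumption: the function names `fingerprints` and `overlap`, the 16-byte sliding window, and the choice of SHA-256. The point is only that a monitoring service can compare hashes without ever holding the client’s original text.

```python
import hashlib
from typing import Set


def fingerprints(text: str, window: int = 16) -> Set[str]:
    """Hash every `window`-byte slice of the normalized text.

    Only these hashes would leave the client's premises; the
    original text is not recoverable from them directly.
    """
    # Normalize case and whitespace so trivial edits do not hide a match.
    data = " ".join(text.lower().split()).encode("utf-8")
    if len(data) <= window:
        return {hashlib.sha256(data).hexdigest()}
    return {
        hashlib.sha256(data[i:i + window]).hexdigest()
        for i in range(len(data) - window + 1)
    }


def overlap(client_prints: Set[str], found_text: str, window: int = 16) -> float:
    """Fraction of the client's fingerprints that appear in text found elsewhere."""
    if not client_prints:
        return 0.0
    found = fingerprints(found_text, window)
    return len(client_prints & found) / len(client_prints)
```

A high overlap score for a document found in the wild would suggest that some of the client’s data has leaked. Note that hashing short windows of low-entropy data can be brute-forced, so a production system would need a more careful design than this sketch.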

Whitney Grace, June 26, 2015

Eric Schmidt On Search Ambition and Attitude at the GOOG

May 20, 2015

The article on Business Insider titled Google’s Former CEO Reveals The Complicated Search Question He Wants Google To Be Able To Answer reports on Eric Schmidt’s speech in Berlin where he mentioned the hurdles Google has yet to overcome. Obviously, Google is an incredibly ambitious company, and should never be satisfied. He spelled out one particular question he would like the search engine to be able to answer,

“Try a query like ‘show me flights under €300 for places where it’s hot in December and I can snorkel,'” Schmidt says. “That’s kind of complicated: Google needs to know about flights under €300; hot destinations in winter; and what places are near the water, with cool fish to see. That’s basically three separate searches that have to be cross-referenced to get to the right answer. Sadly, we can’t solve that for you today. But we’re working on it.”
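Schmidt’s “three separate searches that have to be cross-referenced” amount to intersecting three result sets. The toy sketch below shows that step only; the destination names, prices, temperatures, and function names are invented placeholders, and the hard part (getting reliable answers to each sub-query) is exactly what the code skips.

```python
from typing import Dict, List, Set

# Invented placeholder data; a real system would query live sources.
FLIGHT_PRICES_EUR: Dict[str, int] = {
    "dest_a": 240, "dest_b": 180, "dest_c": 150, "dest_d": 620,
}
DECEMBER_HIGH_C: Dict[str, int] = {
    "dest_a": 27, "dest_b": 22, "dest_c": 2, "dest_d": 29,
}
SNORKELING: Set[str] = {"dest_a", "dest_b", "dest_d"}


def cheap_flights(max_price: int) -> Set[str]:
    """Search 1: destinations with a flight under the price limit."""
    return {d for d, price in FLIGHT_PRICES_EUR.items() if price < max_price}


def hot_in_december(min_celsius: int) -> Set[str]:
    """Search 2: destinations that are warm enough in December."""
    return {d for d, temp in DECEMBER_HIGH_C.items() if temp >= min_celsius}


def snorkel_spots() -> Set[str]:
    """Search 3: destinations where snorkeling is possible."""
    return set(SNORKELING)


def answer(max_price: int = 300, min_celsius: int = 24) -> List[str]:
    """Cross-reference the three searches by intersecting their results."""
    return sorted(
        cheap_flights(max_price) & hot_in_december(min_celsius) & snorkel_spots()
    )
```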

Schmidt also argued on behalf of Google with regard to the EU investigation into whether Google favors its own results over a fair spread of companies. Schmidt claimed that Google is most interested in simplifying search for users, rather than obliging users to click around. Since Google search is admittedly ad-oriented, Schmidt’s position seems to be at least semi-accurate.

Chelsea Kerwin, May 20, 2015

Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

 
