Can Analytics Be Cloud Friendly?
August 24, 2016
One of the problems with storing data in the cloud is that it is difficult to run analytics. Sure, you can run tests to determine the usage of the cloud, but analyzing the data stored in the cloud is another story. Program developers have been trying to find a solution to this problem and the open source community has developed some software that might be the ticket. Ideata wrote about the newest Apache software in “Apache Spark-Comparing RDD, Dataframe, and Dataset.”
Ideata is a data software company that built many of its headlining products on the open source software Apache Spark. The company has used Apache Spark since 2013 and likes it because it offers a rich abstraction, lets developers build complex workflows, and makes data analysis easy.
Apache Spark works like this:
Spark revolves around the concept of a resilient distributed dataset (RDD), which is a fault-tolerant collection of elements that can be operated on in parallel. An RDD is Spark’s representation of a set of data, spread across multiple machines in the cluster, with API to let you act on it. An RDD could come from any datasource, e.g. text files, a database via JDBC, etc. and can easily handle data with no predefined structure.
It can be used as the basis for a user-friendly cloud analytics platform, especially if you are familiar with what can go wrong with a dataset.
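For readers who want to see the RDD idea in miniature, here is a rough sketch in plain Python. It is not Spark's API and not Ideata's code: the ToyRDD class, its method names, and the sample lines are all made up for illustration. It only mimics the three ideas in the description above: data split into partitions, operations applied to each partition in parallel, and lost results rebuilt by replaying the recorded steps.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyRDD:
    """Hypothetical stand-in for an RDD: partitioned data plus a lineage of steps."""

    def __init__(self, partitions, lineage=()):
        self.partitions = partitions  # list of lists, one per pretend machine
        self.lineage = lineage        # recorded (kind, function) steps, applied lazily

    @classmethod
    def parallelize(cls, data, num_partitions):
        # Deal the elements out round-robin across the partitions.
        data = list(data)
        return cls([data[i::num_partitions] for i in range(num_partitions)])

    def map(self, fn):
        # Record the step; nothing is computed yet.
        return ToyRDD(self.partitions, self.lineage + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self.partitions, self.lineage + (("filter", pred),))

    def _compute_partition(self, part):
        # Replay every recorded step over one partition's source data.
        items = part
        for kind, fn in self.lineage:
            if kind == "map":
                items = [fn(x) for x in items]
            else:
                items = [x for x in items if fn(x)]
        return items

    def recompute(self, index):
        # Fault tolerance in miniature: rebuild one partition's result from lineage.
        return self._compute_partition(self.partitions[index])

    def collect(self):
        # Process the partitions in parallel, then gather the results in order.
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(self._compute_partition, self.partitions))
        return [x for part in results for x in part]


# Unstructured text lines, including an empty one, with no predefined schema.
lines = ["spark makes rdds", "", "rdds are resilient", "no fixed schema here"]
rdd = ToyRDD.parallelize(lines, num_partitions=2)
word_counts = rdd.filter(lambda s: s != "").map(lambda s: len(s.split()))
total = sum(word_counts.collect())
print(total)
```

Threads on one machine stand in for the cluster here. The point is only that the caller describes what to do with the data and the collection decides where and when to do it.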
Whitney Grace, August 24, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Neural Networks and Thought Commands
July 22, 2015
If you’ve been waiting for the day you can operate a computer by thinking at it, check out “When Machine Learning Meets the Mind: BBC and Google Get Brainy” at the Inquirer. Reporter Chris Merriman brings our attention to two projects, one about hardware and one about AI, that stand at the intersection of human thought and machine. Neither venture is anywhere near fruition, but a peek at their progress gives us clues about the future.
The internet-streaming platform iPlayer is a service the BBC provides to U.K. residents who wish to catch up on their favorite programmes. In pursuit of improved accessibility, the organization’s researchers are working on a device that allows users to operate the service with their thoughts. The article tells us:
“The electroencephalography wearable that powers the technology requires lucidity of thought, but is surprisingly light. It has a sensor on the forehead, and another in the ear. You can set the headset to respond to intense concentration or meditation as the ‘fire’ button when the cursor is over the option you want.”
Apparently this operation is easier for some subjects than for others, but all users were able to work the device to some degree. Creepy or cool? Perhaps it’s both, but there’s no escaping this technology now.
As for Google’s undertaking, we’ve examined this approach before: the development of artificial neural networks. This is some exciting work for those interested in AI. Merriman writes:
“Meanwhile, a team of Google researchers has been looking more closely at artificial neural networks. In other words, false brains. The team has been training systems to classify images and better recognise speech by bombarding them with input and then adjusting the parameters to get the result they want.
But once equipped with the information, the networks can be flipped the other way and create an impressive interpretation of objects based on learned parameters, such as ‘a screw has twisty bits’ or ‘a fly has six legs’.”
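To make the "adjust the parameters, then flip it" idea concrete, here is a deliberately tiny sketch in plain Python. It is not Google's system: it is a single artificial neuron, and the two input features (has_six_legs, has_twisty_bits), the training samples, and the function names are invented for illustration. Training nudges the weights until flies and screws are told apart. The imagine function then holds the weights fixed and nudges the input instead, which is the "flipped the other way" step.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    # Probability that input x is a "fly" (label 1) rather than a "screw" (label 0).
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, labels, epochs=500, lr=0.5):
    # Show the neuron the inputs repeatedly, adjusting parameters after each one.
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def imagine(weights, bias, steps=50, lr=0.1):
    # Flip the process: keep the parameters fixed and adjust the input
    # so that the "fly" score rises. Features are clamped to [0, 1].
    x = [0.5] * len(weights)
    for _ in range(steps):
        p = predict(weights, bias, x)
        x = [min(1.0, max(0.0, xi + lr * (1.0 - p) * w)) for xi, w in zip(x, weights)]
    return x


# Hypothetical features: [has_six_legs, has_twisty_bits]
samples = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1, 1, 0, 0]  # 1 = fly, 0 = screw

weights, bias = train(samples, labels)
fly_score = predict(weights, bias, [1.0, 0.0])
screw_score = predict(weights, bias, [0.0, 1.0])
dreamed = imagine(weights, bias)
print(fly_score, screw_score, dreamed)
```

The imagined input drifts toward more legs and fewer twisty bits, a crude version of the network's learned notion that "a fly has six legs".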
This brain-in-progress still draws some chuckle-worthy and/or disturbing conclusions from images, but it is learning. No one knows what the end result of Google’s neural network research will be, but it’s sure to be significant. On a related note, the article points out that IBM is donating its machine learning platform to Apache Spark. Who knows where the open-source community will take it from here?
Cynthia Murrell, July 22, 2015
Prepare To Update Your Cassandra
June 2, 2015
It is time for an update to Apache’s headlining, open source, enterprise search software! SD Times let us know that “DataStax Enterprise 4.7 Released” and it has a slew of updates set to make open source search enthusiasts drool. DataStax is a company that built itself around the open source Apache Cassandra software. The company specializes in enterprise applications for search and analytics.
The newest release of DataStax Enterprise 4.7 includes several updates to improve a user’s enterprise experience:
“…includes a production-certified version of Cassandra 2.1, and it adds enhanced enterprise search, analytics, security, in-memory, and database monitoring capabilities. These include a new certified version of Apache Solr and Live Indexing, a new DSE feature that makes data immediately available for search by leveraging Cassandra’s native ability to run across multiple data centers.”
The update also includes DataStax’s OpsCenter 5.2 for enhanced security and encryption. It can be used to store encryption keys on servers and to manage admin security.
The enhanced search capabilities are the real bragging points: fault-tolerant search operations, used to customize failed search responses; intelligent search query routing, in which queries are routed to the fastest machines in a cluster for the quickest response times; and extended search analytics, in which tasks that use Solr search syntax and Apache Spark research and analytics tasks can run simultaneously.
DataStax Enterprise 4.7 improves enterprise search applications. It will probably pull in users trying to improve their big data plans. Has DataStax considered how its enterprise platform could be used for the cloud or on mobile computing?
Whitney Grace, June 2, 2015

