Connecting SharePoint with External Data
July 28, 2015
One of the most frequently discussed SharePoint struggles is integrating SharePoint data with existing external data. IT Business Edge has compiled a short slideshow with helpful tips regarding integration, including the possible use of Business Connectivity Services. See all the details in their presentation, “Eight Steps to Connect Office 365/SharePoint Online with External Data.”
The summary states:
“According to Mario Spies, senior strategic consultant at AvePoint, a lot of companies are in the process of moving their SharePoint content from on-premise to Office 365 / SharePoint Online, using tools such as DocAve Migrator from SharePoint 2010 or DocAve Content Manager from SharePoint 2013. In most of these projects, the question arises about how to handle SharePoint external lists connected to data using BDC. The good news is that SharePoint Online also supports Business Connectivity Services.”
To keep learning the tips and tricks of SharePoint connectivity, stay tuned to ArnoldIT.com, particularly the SharePoint feed. Stephen E. Arnold is a lifelong leader in all things search, and his expertise is especially helpful for SharePoint. Users will continue to be interested in data migration and integration, and in how the upcoming SharePoint 2016 update may make things easier.
Emily Rae Aldridge, July 28, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Hadoop Rounds Up Open Source Goodies
July 17, 2015
Summertime is here, and what better way to celebrate the warm weather and fun in the sun than with some fantastic open source tools? Okay, so you probably will not take your computer to the beach, but if you have a vacation planned, one of these tools might help you complete your work faster so you can get closer to that umbrella and cocktail. Datamation has a great listicle focused on “Hadoop And Big Data: 60 Top Open Source Tools.”
Hadoop is one of the most widely adopted open source tools for big data solutions. The Hadoop market is expected to be worth $1 billion by 2020, and IBM has dedicated 3,500 employees to develop Apache Spark, part of the Hadoop ecosystem.
As open source is a huge part of the Hadoop landscape, Datamation’s list provides invaluable information on tools that could mean the difference between a successful project and a failed one. They could also save some extra cash on the IT budget.
“This area has a seen a lot of activity recently, with the launch of many new projects. Many of the most noteworthy projects are managed by the Apache Foundation and are closely related to Hadoop.”
Datamation has maintained this list for a while and updates it from time to time as the industry changes. The list is not ranked from best to worst; rather, the tools are grouped into categories, and a short description explains what each tool does. The categories include: Hadoop-related tools, big data analysis platforms and tools, databases and data warehouses, business intelligence, data mining, big data search, programming languages, query engines, and in-memory technology. There is a tool for nearly every sort of problem that could come up in a Hadoop environment, so the listicle is definitely worth a glance.
Whitney Grace, July 17, 2015
Algorithmic Art Historians
July 14, 2015
Apparently, creativity itself is no longer subjective. MIT Technology Review announces, “Machine Vision Algorithm Chooses the Most Creative Paintings in History.” Traditionally, art historians judge how creative a work is based on its novelty and its influence on subsequent artists. The article notes that this is a challenging task, requiring an encyclopedic knowledge of art history and the judgement to decide what is novel and what has been influential. Now, a team at Rutgers University has developed an algorithm they say is qualified for the job.
Researchers Ahmed Elgammal and Babak Saleh credit several developments with bringing AI to this point. First, we’ve recently seen several breakthroughs in machine understanding of visual concepts, called classemes, that include recognition of factors from colors to specific objects. Another important factor: there now exist well-populated online artwork databases that the algorithms can, um, study. The article continues:
“The problem is to work out which paintings are the most novel compared to others that have gone before and then determine how many paintings in the future have uses similar features to work out their influence. Elgammal and Saleh approach this as a problem of network science. Their idea is to treat the history of art as a network in which each painting links to similar paintings in the future and is linked to by similar paintings from the past. The problem of determining the most creative is then one of working out when certain patterns of classemes first appear and how these patterns are adopted in the future. …
“The problem of finding the most creative paintings is similar to the problem of finding the most influential person on a social network, or the most important station in a city’s metro system or super spreaders of disease. These have become standard problems in network theory in recent years, and now Elgammal and Saleh apply it to creativity networks for the first time.”
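To make the network idea concrete, here is a toy sketch in Python. It is not the Rutgers team's algorithm: the feature vectors are made-up stand-ins for classemes, and the scoring rule (novelty against earlier works plus influence on later ones) is an assumption chosen only to illustrate the "new when it appeared, echoed afterward" intuition from the quote.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def creativity_scores(paintings):
    """Score each painting as novelty + influence.

    paintings: list of (title, year, feature_vector) tuples.
    The feature vectors are hypothetical stand-ins for classemes.
    novelty   = 1 - mean similarity to earlier paintings (0.0 if none exist)
    influence = mean similarity to later paintings (0.0 if none exist)
    """
    scores = {}
    for title, year, vec in paintings:
        earlier = [cosine(vec, v) for _, y, v in paintings if y < year]
        later = [cosine(vec, v) for _, y, v in paintings if y > year]
        novelty = 1.0 - sum(earlier) / len(earlier) if earlier else 0.0
        influence = sum(later) / len(later) if later else 0.0
        scores[title] = novelty + influence
    return scores


# Invented toy data: painting "C" introduces a feature that later works reuse.
toy = [
    ("A", 1500, [1.0, 0.0]),
    ("B", 1600, [1.0, 0.0]),
    ("C", 1700, [0.0, 1.0]),
    ("D", 1800, [0.0, 1.0]),
    ("E", 1900, [0.0, 1.0]),
]
scores = creativity_scores(toy)
most_creative = max(scores, key=scores.get)
```

In this toy set, "C" comes out on top because nothing before it looks like it and everything after it does. A real system would need far richer features and a proper network-ranking method.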
Just what we needed. I have to admit the technology is quite intriguing, but I wonder: Will all creative human endeavors eventually have their algorithmic counterparts and, if so, how will that affect human expression?
Cynthia Murrell, July 14, 2015
Chrome Restricts Extensions amid Security Threats
June 22, 2015
Despite efforts to maintain an open Internet, malware seems to be pushing online explorers into walled gardens, akin to the old AOL setup. The trend is illustrated by a story at PandoDaily, “Security Trumps Ideology as Google Closes Off its Chrome Platform.” Beginning this July, Chrome users will only be able to download extensions for that browser from the official Chrome Web Store. This change is on the heels of one made in March: apps submitted to Google’s Play Store must now pass a review. These are extreme measures to combat an extreme problem with malicious software.
The company tried a middle-ground approach last year, when it imposed the our-store-only policy on all users except those using Chrome’s development build. The makers of malware, though, are adaptable creatures; they found a way to force users into the development channel, then slip in their pernicious extensions. Writer Nathaniel Mott welcomes the changes, given the realities:
“It’s hard to convince people that they should use open platforms that leave them vulnerable to attack. There are good reasons to support those platforms—like limiting the influence tech companies have on the world’s information and avoiding government backdoors—but those pale in comparison to everyday security concerns. Google seems to have realized this. The chaos of openness has been replaced by the order of closed-off systems, not because the company has abandoned its ideals, but because protecting consumers is more important than ideology.”
Better safe than sorry? Perhaps.
Cynthia Murrell, June 22, 2015
IBM Provides Simple How-To Guide for Cloudant
April 24, 2015
The article titled “Integrate Data with Cloudant and CouchDB NoSQL Database Using IBM InfoSphere Information Server” on IBM offers a breakdown of the steps necessary to load JSON documents and attachments to Cloudant. To follow the steps, the article notes that you will need Cloudant, CouchDB, and IBM InfoSphere DataStage. The article concludes,
“This article provided detailed steps for loading JSON documents and attachments to Cloudant. You learned about the job design to retrieve JSON documents and attachments from Cloudant. You can modify the sample jobs to perform the same integration operations on a CouchDB database. We also covered the main features of the new REST step in InfoSphere DataStage V11.3, including reusable connection, parameterized URLs, security configuration, and request and response configurations. The JSON parser step was used in examples to parse JSON documents.”
Detailed examples with helpful images guide you through each part of the process, and it is possible to modify the examples for CouchDB. Although it may seem like a statement of the obvious to the many loyal IBM users out there, perhaps there are people who still need to be told. If you are interested in learning how to federate information through a logical and simple process, use IBM.
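For readers without DataStage at hand, the underlying operation is a plain HTTP call. The sketch below is not taken from the IBM article; it only shows the shape of a document upload against CouchDB's document API (a PUT to /database/document-id with a JSON body), which Cloudant is compatible with. The host, database name, and document ID are hypothetical, and the request is built but never sent, so no server or credentials are needed.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_put_document(base_url, database, doc_id, document):
    """Build (but do not send) a PUT request that stores one JSON document.

    base_url, database, and doc_id are illustrative values supplied by the
    caller; a real service would also need authentication.
    """
    url = "{}/{}/{}".format(
        base_url.rstrip("/"),
        quote(database, safe=""),
        quote(doc_id, safe=""),
    )
    body = json.dumps(document).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical endpoint and payload, for illustration only.
request = build_put_document(
    "https://example.invalid",
    "articles",
    "post-1",
    {"title": "IBM Provides Simple How-To Guide for Cloudant", "year": 2015},
)
```

Passing the request to urllib.request.urlopen would send it. The sketch stops short of that because the endpoint here is a placeholder.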
Chelsea Kerwin, April 24, 2015
A Little Lucene History
March 26, 2015
Instead of venturing to Wikipedia to learn about Lucene’s history, visit the Parse.ly blog and read the post, “Lucene: The Good Parts.” After detailing how Doug Cutting created Lucene in 1999, the post describes how searching through SQL in the early 2000s was a huge task. SQL databases are not the best when it comes to unstructured search, so developers installed Lucene to make SQL document search more reliable. What is interesting is how much it has been adopted:
“At the time, Solr and Elasticsearch didn’t yet exist. Solr would be released in one year by the team at CNET. With that release would come a very important application of Lucene: faceted search. Elasticsearch would take another 5 years to be released. With its recent releases, it has brought another important application of Lucene to the world: aggregations. Over the last decade, the Solr and Elasticsearch packages have brought Lucene to a much wider community. Solr and Elasticsearch are now being considered alongside data stores like MongoDB and Cassandra, and people are genuinely confused by the differences.”
The post is worth reading if you need a refresher or a brief overview of how Lucene works, related jargon, tips for using it in big data projects, and a few more tricks. Lucene might just be a Java library, but it makes using databases much easier. We have said for a while that information is only useful if you can find it easily. Lucene made information search and retrieval much simpler and more accurate. It laid the groundwork for the current big data boom.
Whitney Grace, March 26, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

