Recent Developments in Deep Learning Architecture from AlexNet to ResNet
September 27, 2016
The article on GitHub titled The 9 Deep Learning Papers You Need To Know About (Understanding CNNs Part 3) is not about the global media giant but about advances in computer vision and convolutional neural networks (CNNs). The article frames its discussion around the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC), which it terms the “annual Olympics of computer vision…where teams compete to see who has the best computer vision model for tasks such as classification, localization, detection and more.” The article explains that the 2012 winners and their network (AlexNet) revolutionized the field:
“This was the first time a model performed so well on a historically difficult ImageNet dataset. Utilizing techniques that are still used today, such as data augmentation and dropout, this paper really illustrated the benefits of CNNs and backed them up with record breaking performance in the competition.”
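For readers unfamiliar with the two techniques named in that passage, here is a minimal Python sketch. It is not taken from the AlexNet paper or from the article. It shows inverted dropout (a common variant that rescales surviving activations during training) and a horizontal-flip augmentation on a toy image stored as nested lists. The function names, the 0.5 default drop probability, and the toy data are illustrative assumptions.

```python
import random


def dropout(activations, drop_prob=0.5, rng=None):
    """Inverted dropout: zero each activation with probability drop_prob
    and scale the survivors by 1 / (1 - drop_prob)."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - drop_prob)
    return [a * keep_scale if rng.random() >= drop_prob else 0.0
            for a in activations]


def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def augment(image, rng=None):
    """Return either the image or its mirror, chosen with equal probability."""
    rng = rng or random.Random()
    if rng.random() < 0.5:
        return horizontal_flip(image)
    return [list(row) for row in image]


if __name__ == "__main__":
    toy_image = [[1, 2, 3], [4, 5, 6]]  # hypothetical 2x3 grayscale image
    print(horizontal_flip(toy_image))
    print(dropout([1.0, 1.0, 1.0, 1.0], rng=random.Random(0)))
```

Both tricks are applied only during training: flipping gives the network extra labeled examples for free, while dropout discourages units from relying too heavily on one another.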
In 2013, CNN entries flooded in, and ZF Net was the winner with an error rate of 11.2% (down from AlexNet’s 15.4%). Prior to AlexNet, though, the lowest error rate was 26.2%. The article also discusses other progress in general network architecture, including VGG Net, which emphasized the depth and simplicity that CNNs need for hierarchical data representation, and GoogLeNet, which tossed the deep-and-simple rule out of the window and paved the way for more creative structuring with its Inception module.
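To put the quoted error rates in perspective, the short Python sketch below computes the relative reduction between successive figures. The helper name is an illustrative assumption, and the only inputs are the three percentages cited above.

```python
def relative_reduction(old_error, new_error):
    """Fraction by which the error rate fell, relative to the old rate."""
    if old_error <= 0:
        raise ValueError("old_error must be positive")
    return (old_error - new_error) / old_error


if __name__ == "__main__":
    # Error rates (in percent) as quoted in the article.
    pre_alexnet, alexnet, zf_net = 26.2, 15.4, 11.2
    print(f"AlexNet vs. prior best: {relative_reduction(pre_alexnet, alexnet):.1%}")
    print(f"ZF Net vs. AlexNet:     {relative_reduction(alexnet, zf_net):.1%}")
```

By this measure AlexNet cut the error rate by about 41 percent, and ZF Net trimmed a further 27 percent or so the following year.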
Chelsea Kerwin, September 27, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Tips on How to Make the Most of Big Data (While Spending the Least)
April 13, 2016
The article titled The 10 Commandments of Business Intelligence in Big Data on Datanami offers wisdom written on USB sticks instead of stone tablets. In the Business Intelligence arena, apparently moral guidance can take a backseat to Big Data cost-savings. Suggestions include: Don’t move Big Data unless you must, try to leverage your existing security system, and engage in extensive data visualization sharing (think GitHub). The article explains the importance of avoiding certain price-gouging traps:
“When done right, [Big Data] can be extremely cost effective… That said…some BI applications charge users by the gigabyte… It’s totally common to have geometric, exponential, logarithmic growth in data and in adoption with big data. Our customers have seen deployments grow from tens of billions of entries to hundreds of billions in a matter of months. That’s another beauty of big data systems: Incremental scalability. Make sure you don’t get lowballed into a BI tool that penalizes your upside.”
The Fifth Commandment reminds us all that analyzing the data in its natural, messy form is far better than flattening it into tables, which risks losing key relationships. The Ninth and Tenth Commandments step back and look at the big picture of data analytics in 2016. What was only a buzzword to most people just five years ago is now a key aspect of strategy for any number of organizations. This article reminds us that thanks to data visualization, Big Data isn’t just for data scientists anymore. Employees across departments can make use of data to make decisions, but only if they are empowered to do so.
Chelsea Kerwin, April 13, 2016
Retraining the Librarian for the Future
March 28, 2016
The Internet is often described as the world’s biggest library, one containing all the world’s knowledge, that someone dumped on the floor. The Internet is the world’s biggest information database as well as the world’s biggest data mess. In the olden days, librarians were the gateway to knowledge management, but they need to ramp up their skills beyond the Dewey Decimal System and database searching. Librarians need to do more, and Christian Lauersen’s personal blog explains how in “Data Scientist Training For Librarians-Re-Skilling Libraries For The Future.”
DST4L is a boot camp where librarians and other information professionals learn new skills to stay relevant. Lauersen describes last year’s session:
“DST4L has been held three times in The States and was to be set for the first time in Europe at Library of Technical University of Denmark just outside of Copenhagen. 40 participants from all across Europe were ready to get there hands dirty over three days marathon of relevant tools within data archiving, handling, sharing and analyzing. See the full program here and check the #DST4L hashtag at Twitter.”
Over the course of three days, the participants learned about OpenRefine, a spreadsheet-like application that can be used for data cleanup and transformation. They also learned about the benefits of GitHub and how to program using Python. These skills are well beyond the classes taught in library graduate programs, but it is a good sign that the profession is evolving even if the academic side lags behind.
Whitney Grace, March 28, 2016
No Search Just Browse Images on FindA.Photo
March 2, 2016
The search engine FindA.Photo proves itself to be a useful resource for browsing images based on any number of markers. The site offers a general search by terms, or the option of browsing images by color, collection (for example, “wild animals” or “reflections”), or source. The developer of the site, David Barker, described his goals for the service on Product Hunt:
“I wanted to make a search for all of the CC0 image sites that are available. I know there are already a few search sites out there, but I specifically wanted to create one that was: simple and fast (and I’m working on making it faster), powerful (you can add options to your search for things like predominant colors and image size with just text), and something that could have contributions from anyone (via GitHub pull requests).”
My first click on a swatch of royal blue delivered 651 images of oceans, skies, panoramas of oceans and skies, jellyfish ballooning underwater, seagulls soaring, etc. That may be my own fault for choosing such a clichéd color, but you get the idea. I had better (more varied) results through the collections search, which includes “action,” “long-exposure,” “technology,” “light rays,” and “landmarks,” the last of which I immediately clicked for a collage of photos of the Eiffel Tower, Louvre, Big Ben, and the Great Wall of China.
Chelsea Kerwin, March 2, 2016
Search the Snowden Documents
July 16, 2015
This cat has long since forgotten what the inside of the bag looked like. Have you perused the documents that were released by Edward Snowden, beginning in 2013? A website simply titled “Snowden Doc Search” will let you do just that through a user-friendly search system. The project’s Description page states:
“The search is based upon the most complete archive of Snowden documents to date. It is meant to encourage users to explore the documents through its extensive filtering capabilities. While users are able to search specifically by title, description, document, document date, and release date, categories also allow filtering by agency, codeword, document topic, countries mentioned, SIGADS, classification, and countries shared with. Results contain not only full document text, pdf, and description, but also links to relevant articles and basic document data, such as codewords used and countries mentioned within the document.”
The result of teamwork between the Courage Foundation and Transparency Toolkit, the searchable site is built upon the document/news story archive maintained by the Edward Snowden Defense Fund. The site’s Description page also supplies links to the raw dataset and to Transparency Toolkit’s GitHub page, for anyone who would care to take a look. Just remember, “going incognito doesn’t hide your browsing from your employer, your internet service provider, or the websites you visit.” (Chrome)
Cynthia Murrell, July 16, 2015

