Big Data Needs to Go Public

December 16, 2016

Big Data touches every part of our lives, and most of us are unaware of it.  Have you ever noticed, when you listen to the news, read an article, or watch a YouTube video, that people say things such as “experts claim” or “science says”?  In the past, these statements relied on less than trustworthy sources, but now they can use Big Data to back up their claims.  However, popular opinion and puff pieces still need to back up their Big Data with hard fact.  In the article “More Accountability For Big-Data Algorithms,” Nature.com says that transparency is a big deal for Big Data and that algorithm designers need to work on it.

One of the hopes is that Big Data will be used to bridge the divide between one bias and another, but the opposite can happen.  In other words, Big Data algorithms can be designed with a bias:

There are many sources of bias in algorithms. One is the hard-coding of rules and use of data sets that already reflect common societal spin. Put bias in and get bias out. Spurious or dubious correlations are another pitfall. A widely cited example is the way in which hiring algorithms can give a person with a longer commute time a negative score, because data suggest that long commutes correlate with high staff turnover.

Even worse, people and organizations can design an algorithm to support science or facts they want to pass off as the truth.  There is a growing demand for “algorithm accountability,” mostly in academia.  The chief demand is that the data sets fed into the algorithms be made public.  There are also plans to make algorithms that monitor other algorithms for bias.
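To make the idea of an algorithm that checks another algorithm a little more concrete, here is a minimal sketch in Python.  It is not taken from the Nature article; it assumes one simple audit, comparing how often a hypothetical hiring algorithm selects candidates from each group.  The group labels, the sample decisions, and the function names are all made up for illustration.

```python
from collections import defaultdict


def selection_rates(decisions):
    """Return the share of positive decisions for each group.

    `decisions` is an iterable of (group, selected) pairs, where
    `selected` is True when the audited algorithm picked the candidate.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, selected in decisions:
        totals[group] += 1
        if selected:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected, so no group is favored
    return min(rates.values()) / highest


# Made-up output from a hypothetical hiring algorithm (illustration only).
sample = (
    [("short_commute", True)] * 8
    + [("short_commute", False)] * 2
    + [("long_commute", True)] * 4
    + [("long_commute", False)] * 6
)

rates = selection_rates(sample)
ratio = disparity_ratio(rates)
print(rates, ratio)
```

A ratio well below 1.0 does not prove bias on its own, but it flags a decision pattern that a human should look at more closely.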

Big Data is here to stay, but relying too much on algorithms can distort the facts.  This is why the human element is still needed to distinguish between fact and fiction.  Minority Report is closer to being our present than ever before.

Whitney Grace, December 16, 2016

At Last an Academic Search, but How Much Does It Cost?

December 9, 2016

I love Google.  You love Google.  Everyone loves Google so much that it has become a verb in practically every language.  Google does present many problems, however, especially the inclusion of paid ads in search results, and Google searches are not academically credible.  Researchers love Google’s ease of use, but no search engine exists that returns results that answer a simple question based on a few keywords, NLP, and citations (those are extremely important).

It is possible that a search engine designed for academia could exist, especially if it can be subject specific and allow full-text access to all results.  The biggest barrier in the way of a complete academic search engine is that scholarly research is protected by copyright and most research is behind paywalls belonging to academic publishers, like Elsevier.

Elsevier is a notorious academic publisher: it provides great publications, but its digital subscriptions are expensive.  The Mendeley Blog shares that Elsevier has answered the academic search engine cry: “Introducing Elsevier DataSearch.”  The Elsevier DataSearch promises to search through reputable information repositories and help researchers accelerate their work.

DataSearch is still in the infant stage and there is an open call for beta testers:

“DataSearch offers a new and innovative approach.  Most search engines don’t actively involve their users in making them better; we invite you, the user, to join our User Panel and advise how we can improve the results.  We are looking for users in a variety of fields, no technical expertise is required (though welcomed).  In order to join us, visit https://datasearch.elsevier.com and click on the button marked ‘Join Our User Panel’.”

This is the right step forward for any academic publisher!  One thing worries me, though: how much is the DataSearch engine going to cost users?  I respect copyright and the need to make a profit, but I wish there were one all-encompassing academic database that was free or had a low-cost subscription plan.

Whitney Grace, December 9, 2016

Bitcoin Textbook to Become Available from Princeton

March 16, 2016

Bitcoin is all over the media, but this form of currency may not be thoroughly understood by many, including researchers and scholars. A post on this topic, “The Princeton Bitcoin textbook is now freely available,” was recently published on Freedom to Tinker, a blog hosted by Princeton’s Center for Information Technology Policy. The post announces the first completed draft of a Princeton Bitcoin textbook. At 300 pages, the manuscript is geared to those who hope to gain a technical understanding of how Bitcoin works and is appropriate for those who have a basic understanding of computer science and programming. According to the write-up,

“Researchers and advanced students will find the book useful as well — starting around Chapter 5, most chapters have novel intellectual contributions. Princeton University Press is publishing the official, peer-reviewed, polished, and professionally done version of this book. It will be out this summer. If you’d like to be notified when it comes out, you should sign up here. Several courses have already used an earlier draft of the book in their classes, including Stanford’s CS 251. If you’re an instructor looking to use the book in your class, we welcome you to contact us, and we’d be happy to share additional teaching materials with you.”

As Bitcoin educational resources catch fire in academia, it is only a matter of time before other Bitcoin experts begin creating resources to help other audiences understand the currency of the Dark Web. Additionally, it will be interesting to see if research emerges regarding connections between Bitcoin, the Dark Web and the mainstream internet.

Megan Feil, March 16, 2016

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Reclaiming Academic Publishing

October 21, 2015

Researchers and writers are at the mercy of academic publishers who control the venues that print their work, select the content of their work, and often control the funds behind their research.  Even worse, academic research is locked behind database walls that require a subscription well beyond the price range of a researcher not associated with a university or research institute.  One researcher was fed up enough with academic publishers that he decided to return the publishing and distribution of work to the common people, says Nature in “Leading Mathematician Launches arXiv ‘Overlay’ Journal.”

The new mathematics journal Discrete Analysis peer reviews and publishes papers free of charge on the preprint server arXiv.  Timothy Gowers started the journal to avoid the commercial pressures that often distort scientific literature.

“ ‘Part of the motivation for starting the journal is, of course, to challenge existing models of academic publishing and to contribute in a small way to creating an alternative and much cheaper system,’ he explained in a 10 September blog post announcing the journal. ‘If you trust authors to do their own typesetting and copy-editing to a satisfactory standard, with the help of suggestions from referees, then the cost of running a mathematics journal can be at least two orders of magnitude lower than the cost incurred by traditional publishers.’ ”

Some funds are required to keep Discrete Analysis running.  Costs are ten dollars per submitted paper to pay for the software that manages peer review and the journal Web site, and arXiv requires an additional ten dollars a month to keep running.

Gowers hopes to extend the journal model to other scientific fields, and he believes it will work, especially for fields that only require text.  The biggest problem is persuading other academics to adopt the model, and because things move slowly in academia, it will probably be years before it becomes widespread.

Whitney Grace, October 21, 2015
