Surprise, Most Dark Web Content Is Legal
November 21, 2016
If you have been under the impression that the Dark Web is that big chunk of the Internet where all activities and content are illegal, you are wrong.
A news report published by Neowin, titled “Terbium Labs: Most of the Dark Web Content, Visible Through Tor, Is Legal,” reveals:
Contrary to popular belief, the majority of the dark web accessible through Tor is mostly legal… or offline, with extremism making up just a minuscule 0.2% of the content looked at.
According to this Quora thread, the Dark Web was developed by US military and intelligence agencies to communicate securely with their assets. The research started in 1995, and in 1997 mathematicians at the Naval Research Laboratory developed The Onion Router project, or Tor. People outside military intelligence started using Tor to communicate securely with others for various reasons. Of course, people with ulterior motives spotted the opportunity and began exploiting Tor as well, including arms dealers, drug dealers, human traffickers, and pedophiles. Mainstream media thus propagated the perception that the Dark Web is an illegal place where criminal actors lurk and all content is illegal.
The Terbium Labs study indicates that 47.7% of the content is legal and that much of the rest is borderline legal, in the form of hacking services. Very little content is outright illegal, such as child pornography, arms dealing, drug dealing, and material related to human trafficking.
The Dark Web, however, is not a fairyland where illegal activities do not occur. As the news report points out:
While this report does prove that seedy websites exist on the dark web, they are in fact a minority, contradictory to what many popular news reports would have consumers believe.
Multiple research agencies have indicated, with figures to back it up, that most content on the Dark Web is legal. But they still have not revealed what this major chunk of legal content is made of. Any views?
Vishal Ingole, November 21, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Sugar Polluted Scientific Research
October 19, 2016
If your diet includes too much sugar, it is a good idea to cut back on the amount you consume. It also turns out that if you have too much sugar in your research, the sugar industry will bribe you to hide the facts. Stat News reports that even objective academic research is not immune from corporate bribes in the article “Sugar Industry Secretly Paid For Favorable Harvard Research.”
In the 1960s, Harvard nutritionists published two reviews in medical journals that downplayed the role sugar played in coronary heart disease. The sugar industry paid Harvard to report favorable results in scientific studies. Dr. Cristin Kearns published a paper in JAMA Internal Medicine about her research into the Harvard sugar conspiracy.
Through her research, she discovered that Harvard nutritionists Dr. Fredrick Stare and Mark Hegsted worked with the Sugar Research Foundation to write a literature review that countered early research linking sucrose to coronary heart disease. This work would later help the sugar industry increase its market share by convincing Americans to eat a low-fat diet.
Dr. Walter Willett, who knew Hegsted and now runs the nutrition department at Harvard’s public health school, defended him as a principled scientist…‘However, by taking industry funding for the review, and having regular communications during the review with the sugar industry,’ Willett acknowledged, it ‘put him [Hegsted] in a position where his conclusions could be questioned. It is also possible that these relationships could induce some subtle bias, even if unconscious,’ he added.
In other words, corporate-funded research can skew scientific data so that it favors the funder’s bottom line. This fiasco happened in the 1960s; have things gotten better or worse since? With fierce competition for funding and for space in scientific journals, the answer appears to be worse.
Whitney Grace, October 19, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Google DeepMind Acquires Healthcare App
April 5, 2016
What will Google do next? Google’s London AI powerhouse has set up a new healthcare division and acquired a medical app called Hark, an article from Business Insider tells us. DeepMind, Google’s artificial intelligence research group, recently launched the new division, called DeepMind Health, and picked up the healthcare app along with it. The article describes Hark:
“Hark — acquired by DeepMind for an undisclosed sum — is a clinical task management smartphone app that was created by Imperial College London academics Professor Ara Darzi and Dr Dominic King. Lord Darzi, director of the Institute of Global Health Innovation at Imperial College London, said in a statement: “It is incredibly exciting to have DeepMind – the world’s most exciting technology company and a true UK success story – working directly with NHS staff. The types of clinician-led technology collaborations that Mustafa Suleyman and DeepMind Health are supporting show enormous promise for patient care.”
The healthcare industry is ripe for disruptive technology, especially technologies that solve information and communication challenges. As the article suggests, many issues in healthcare stem from information conveyed too sparsely and too late. Collaboration between researchers, medical professionals, and tech gurus appears to be a promising answer. Will Google’s Hark lead the way?
Megan Feil, April 5, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Machine Learning Used to Decipher Lute Tablature
December 23, 2015
Oxford Journals’ Early Music publication reveals a very specialized use of machine learning in “Bring ‘Musicque into the Tableture’: Machine-Learning Models for Polyphonic Transcription of 16th-Century Lute Tablature” by music researchers Reinier de Valk and Tillman Weyde. Note that this link will take you to the article’s abstract; to see the full piece, you’ll have to subscribe to the site. The abstract summarizes:
“A large corpus of music written in lute tablature, spanning some three-and-a-half centuries, has survived. This music has so far escaped systematic musicological research because of its notational format. Being a practical instruction for the player, tablature reveals very little of the polyphonic structure of the music it encodes—and is therefore relatively inaccessible to non-specialists. Automatic polyphonic transcription into modern music notation can help unlock the corpus to a larger audience and thus facilitate musicological research.
“In this study we present four variants of a machine-learning model for voice separation and duration reconstruction in 16th-century lute tablature. These models are intended to form the heart of an interactive system for automatic polyphonic transcription that can assist users in making editions tailored to their own preferences. Additionally, such models can provide new methods for analysing different aspects of polyphonic structure.”
The full article lays out the researchers’ modelling approaches and the advantages of each. They report their best model returns accuracy rates of 80 to 90 percent, so for modelers, it might be worth the $39 to check out the full article. We just think it’s nice to see machine learning used for such a unique and culturally valuable project.
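To make the idea of “voice separation” concrete, here is a minimal sketch that assumes nothing about the authors’ actual models: a naive pitch-proximity baseline that assigns each note of a chord to the voice whose previous note is closest in pitch. The function name and the toy cadence are purely illustrative.

```python
# A naive pitch-proximity baseline, NOT the de Valk/Weyde machine-learning models:
# it merely illustrates the voice-separation task on toy data. Pitches are MIDI
# numbers; each "voice" is a running list of pitches (None where the voice rests).

def separate_voices(chords, n_voices=4):
    """Greedily assign each note of each chord to the voice whose previous
    pitch is closest. Assumes each chord has at most n_voices notes."""
    voices = [[] for _ in range(n_voices)]
    last_pitch = [None] * n_voices

    for chord in chords:
        assigned = [None] * n_voices
        for pitch in sorted(chord, reverse=True):  # handle notes from high to low
            free = [v for v in range(n_voices) if assigned[v] is None]
            # Prefer the free voice with the nearest previous pitch; voices that
            # have not sounded yet fall back to their index (top voice first).
            best = min(
                free,
                key=lambda v: abs(pitch - last_pitch[v]) if last_pitch[v] is not None else v,
            )
            assigned[best] = pitch
            last_pitch[best] = pitch
        for v in range(n_voices):
            voices[v].append(assigned[v])
    return voices

# Toy four-voice progression; soprano, alto, tenor, and bass emerge as separate lines.
print(separate_voices([[72, 67, 64, 48], [71, 67, 62, 55], [72, 67, 64, 48]]))
```

The published models go well beyond this kind of heuristic, which is why their reported 80 to 90 percent accuracy on real tablature is notable.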
Cynthia Murrell, December 23, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Understanding Trolls, Spam, and Nasty Content
December 9, 2015
The Internet is full of junk. It is a cold, hard fact, and one that will never die as long as the Internet exists. The flood of trash content has only intensified with the introduction of Facebook, Twitter, Instagram, Pinterest, and other social media platforms, and it keeps pouring onto RSS feeds. The academic community is always up for new studies and new data, so a researcher from the University of Arkansas decided to study mean content. “How ‘Deviant’ Messages Flood Social Media” from Science Daily describes the project and carries the following abstract:
“From terrorist propaganda distributed by organizations such as ISIS, to political activism, diverse voices now use social media as their major public platform. Organizations deploy bots — virtual, automated posters — as well as enormous paid “armies” of human posters or trolls, and hacking schemes to overwhelmingly infiltrate the public platform with their message. A professor of information science has been awarded a grant to continue his research that will provide an in-depth understanding of the major propagators of viral, insidious content and the methods that make them successful.”
Dr. Nitin Agarwal will study which behavioral, social, and computational factors cause Internet content to go viral, especially content with a deviant theme. Deviant here means something along the lines of what a troll would post. Agarwal’s research is part of a bigger investigation funded by the Office of Naval Research, Air Force Research, National Science Foundation, and Army Research Office. Agarwal will focus particularly on how terrorist groups and extremist governments use social media platforms to spread their propaganda. He will also study the bots that post content online.
Many top-brass organizations do not have the faintest idea what some of the top social media platforms even are, much less what their purpose is. A study like this will lift the blinders and teach researchers how social media actually works. I wonder if they will venture into 4chan.
Whitney Grace, December 9, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Linguamatics Clears Copyright
December 1, 2015
What is a researcher’s dream? A researcher’s dream is to be able to easily locate and access viable, full-text resources without having to deal with any copyright issues. One might think that all information is accessible via the Internet and a Google search, but if this is how you think professional research happens then you are sadly mistaken. Most professional articles and journals are locked behind corporate academic walls with barbed wire made from copyright laws.
PR Newswire reports in “Linguamatics Expands Cloud Text Mining Platform To Include Full-Text Articles” on a way for life science researchers to legally get past copyright barriers. Linguamatics is a natural language processing text-mining platform, and it will now incorporate the Copyright Clearance Center’s new text mining solution, RightFind XML. This will give researchers access to over 4,000 peer-reviewed journals from more than twenty-five scientific, technical, and medical publishers.
“The solution enables researchers to make discoveries and connections that can only be found in full-text articles. All of the content is stored securely by CCC and is pre-authorized by publishers for commercial text mining. Users access the content using Linguamatics’ unique federated text mining architecture which allows researchers to find the key information to support business-critical decisions. The integrated solution is available now, and enables users to save time, reduce costs and help mitigate an organization’s copyright infringement risk.”
I can only hope that other academic databases and publishers will adopt a similar and (hopefully) more affordable way to access viable, full-text resources. One of the biggest drawbacks of Internet research is having to rely on questionable sources because they are free and readily available. Easier access to accurate information from viable resources will not only improve research but may also start a trend toward broader access.
Whitney Grace, December 1, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Digging into Google’s Rich Answer Vault
November 4, 2015
Google searching has evolved from entering precise keywords to typing in full questions, complete with question mark. Google has also gone beyond simply answering keyword queries. For over a year now, Google has included content referred to as “rich answers” directly within search results, meaning answers to queries appear without the user having to click through to a Web site. Stone Temple Consulting was curious how much people were actually using rich answers, how they worked, and how they could benefit its clients. In December 2014 and July 2015, the firm ran a series of tests, and “Rich Answers Are On The Rise!” discusses the results.
Using the same data sets for both trials, Stone Temple Consulting discovered that Google’s use of rich answers grew significantly in the first half of 2015, as did the practice of labeling rich answers with titles and accompanying them with images. The data might overstate the actual prevalence of rich answers, because:
“Bear in mind that the selected query set focused on questions that we thought had a strong chance of generating a rich answer. The great majority of questions are not likely to do so. As a result, when we say 31.2 percent of the queries we tested generated a rich answer, the percentage of all search queries that would do so is much lower.”
The post follows with a short discussion of the different types of rich answers Google uses and how each type grew. One conclusion that can be drawn from these types is that people are steadily relying more and more on a single tool to find all of their information, from a basic research question to buying a plane ticket.
Whitney Grace, November 4, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
CSI Search Informatics Are Actually Real
October 29, 2015
CSI might stand for a popular TV franchise, but it also stands for “compound structure identification,” Phys.org explains in “Bioinformaticians Make The Most Efficient Search Engine For Molecular Structures Available Online.” Sebastian Böcker and his team at Friedrich Schiller University are researching metabolites, the chemical compounds that determine an organism’s metabolism. Metabolites are used to gauge the condition of living cells.
While this is amazing science, there are some drawbacks:
“This process is highly complex and seldom leads to conclusive results. However, the work of scientists all over the world who are engaged in this kind of fundamental research has now been made much easier: The bioinformatics team led by Prof. Böcker in Jena, together with their collaborators from the Aalto-University in Espoo, Finland, have developed a search engine that significantly simplifies the identification of molecular structures of metabolites.”
The new search works like a regular search engine, but instead of matching keywords it searches molecular structure databases containing information and structural formulae of metabolites. It will reduce the time needed to identify compound structures, saving costs. The hope is that the new search will further research into metabolites and let researchers spend more time working on possible breakthroughs.
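As a minimal sketch of the concept only (not the Jena team’s actual engine, which works on far richer structural and spectral data), imagine matching a measured molecular mass against a tiny, hypothetical metabolite database within an instrument tolerance instead of matching keywords:

```python
# A toy illustration: a structure-style search matches a measured property
# (here, monoisotopic mass) against candidate metabolites, rather than keywords.
# The mini "database" and tolerance are illustrative assumptions.

# name -> (molecular formula, monoisotopic mass in daltons)
METABOLITES = {
    "glucose":  ("C6H12O6", 180.0634),
    "citrate":  ("C6H8O7",  192.0270),
    "alanine":  ("C3H7NO2",  89.0477),
    "pyruvate": ("C3H4O3",   88.0160),
}

def search_by_mass(measured_mass, tolerance_ppm=10.0):
    """Return candidates whose mass lies within the ppm tolerance, best match first."""
    hits = []
    for name, (formula, mass) in METABOLITES.items():
        ppm_error = abs(measured_mass - mass) / mass * 1e6
        if ppm_error <= tolerance_ppm:
            hits.append((name, formula, round(ppm_error, 2)))
    return sorted(hits, key=lambda hit: hit[2])

print(search_by_mass(180.0630))  # -> [('glucose', 'C6H12O6', 2.22)]
```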
Whitney Grace, October 29, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Algorithmic Bias and the Unintentional Discrimination in the Results
October 21, 2015
The article titled “When Big Data Becomes Bad Data” on Tech In America discusses the legal ramifications for companies of relying on algorithms. The “disparate impact” theory has been used in the courtroom for some time to ensure that discriminatory policies are struck down, whether or not they were created with the intention to discriminate. Algorithmic bias occurs all the time, and according to the spirit of the law it discriminates, albeit unintentionally. The article states,
“It’s troubling enough when Flickr’s auto-tagging of online photos label pictures of black men as “animal” or “ape,” or when researchers determine that Google search results for black-sounding names are more likely to be accompanied by ads about criminal activity than search results for white-sounding names. But what about when big data is used to determine a person’s credit score, ability to get hired, or even the length of a prison sentence?”
The article also reminds us that data is often a reflection of “historical or institutional discrimination.” Under disparate impact, the only thing that matters is whether the results are biased; the question of human intent becomes irrelevant. Legal scholars and researchers are arguing for ethical machine-learning design that roots out algorithmic bias. Stronger regulations and better oversight of the algorithms themselves might be the only way to prevent time in court.
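As a minimal illustration of how disparate impact is commonly quantified (not something drawn from the article), the sketch below compares selection rates between two groups of hypothetical model decisions and applies the familiar four-fifths rule of thumb.

```python
# Quantifying "disparate impact": compare selection rates between groups and
# apply the common four-fifths (80%) rule of thumb. All data here is hypothetical.

def selection_rate(outcomes):
    """Fraction of favorable outcomes (e.g., loans approved, resumes advanced)."""
    return sum(outcomes) / len(outcomes)

def disparate_impact_ratio(protected_outcomes, reference_outcomes):
    """Ratio of the protected group's selection rate to the reference group's."""
    return selection_rate(protected_outcomes) / selection_rate(reference_outcomes)

# Hypothetical model decisions: 1 = favorable outcome, 0 = unfavorable.
group_a = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]   # protected group: 30% selected
group_b = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]   # reference group: 70% selected

ratio = disparate_impact_ratio(group_a, group_b)
print(f"Disparate impact ratio: {ratio:.2f}")   # 0.43 -- well below the 0.8 threshold
```

A ratio this far below 0.8 is exactly the kind of result that can land an algorithm’s owner in court, regardless of anyone’s intent.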
Chelsea Kerwin, October 21, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Reclaiming Academic Publishing
October 21, 2015
Researchers and writers are at the mercy of academic publishers, who control the venues that print their work, select its content, and often control the funds behind their research. Even worse, academic research is locked behind database walls that require a subscription well beyond the price range of a researcher not affiliated with a university or research institute. One researcher was fed up enough with academic publishers that he decided to hand publishing and distribution back to the common people, says Nature in “Leading Mathematician Launches arXiv ‘Overlay’ Journal.”
The new mathematics journal Discrete Analysis peer reviews and publishes papers free of charge on the preprint server arXiv. Timothy Gowers started the journal to avoid the commercial pressures that often distort scientific literature.
“ ‘Part of the motivation for starting the journal is, of course, to challenge existing models of academic publishing and to contribute in a small way to creating an alternative and much cheaper system,’ he explained in a 10 September blog post announcing the journal. ‘If you trust authors to do their own typesetting and copy-editing to a satisfactory standard, with the help of suggestions from referees, then the cost of running a mathematics journal can be at least two orders of magnitude lower than the cost incurred by traditional publishers.’ ”
Some funds are required to keep Discrete Analysis running: it costs ten dollars per submitted paper for the software that manages peer review and the journal’s Web site, and arXiv requires an additional ten dollars a month to keep running.
Gowers hopes to extend the journal model to other scientific fields, and he believes it will work, especially for fields that only require text. The biggest problem is persuading other academics to adopt the model, but things move slowly in academia, so it will probably be years before it becomes widespread.
Whitney Grace, October 21, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

