Shining a Flashlight in Space
November 9, 2016
A tired yet apt metaphor for explaining the dark web is shining a flashlight in space. If you shine a flashlight in space, your puny battery-powered beacon will not shed any light on the trillions of celestial objects that exist in the vacuum. While you wave the flashlight around trying to see something in the cosmos, you remain blind to the grand galactic show that lies beyond the beam. The University of Michigan shared the article “Shadow Of The Dark Web,” about Computer Science and Engineering Professor Mike Cafarella and his work with DARPA.
Cafarella is working on Memex, a project that goes beyond the regular text-based search engine. Using more powerful search tools, Memex concentrates on discovering information related to human trafficking. Older dark web search tools skimmed over information and were imprecise. Cafarella’s work improved dark web search tools, supplying data sets with more accurate information on traffickers, their contact information, and their location.
Humans are still needed to interpret the data as the algorithms do not know how to interpret the black market economic worth of trafficked people. His dark web search tools can be used for more than just sex trafficking:
“His work can help identify systems of terrorist recruitment; bust money-laundering operations; build fossil databases from a century’s worth of paleontology publications; identify the genetic basis of diseases by drawing from thousands of biomedical studies; and generally find hidden connections among people, places, and things.
‘I would never have thought a few years ago that database and data-mining research could have such an impact, and it’s really exciting,’ says Cafarella. ‘Our data has been shipped to law enforcement, and we hear that it’s been used to make real arrests. That feels great.’ ”
In order to see the dark web, you need more than a flashlight. To continue the space metaphor, you need a powerful telescope that scans the heavens and can search the darkness where no light ever passes.
Whitney Grace, November 9, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
The Trials, Tribulations, and Party Anecdotes Of “Edge Case” Names
May 16, 2016
The article titled “These Unlucky People Have Names That Break Computers” on BBC Future delves into the strange world of “edge cases,” or people with unexpected or problematic names that reveal glitches in the most commonplace systems that those of us named “Smith” or “Jones” take for granted. Consider Jennifer Null, the Virginia woman who can’t book a plane ticket or complete her taxes without extensive phone calls and headaches. The article says,
“But to any programmer, it’s painfully easy to see why “Null” could cause problems for a database. This is because the word “null” is often inserted into database fields to indicate that there is no data there. Now and again, system administrators have to try and fix the problem for people who are actually named “Null” – but the issue is rare and sometimes surprisingly difficult to solve.”
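The failure the article describes can be illustrated with a minimal Python sketch. The function names here are hypothetical and do not come from the BBC article or any real system; the sketch simply assumes a form handler that uses the literal text "null" as a stand-in for "no data," which is one plausible way such a bug could arise.

```python
from typing import Optional


def naive_is_missing(value: Optional[str]) -> bool:
    """Buggy check (hypothetical): treats the text "null" as an absent value."""
    return value is None or value.strip().lower() in ("", "null")


def safer_is_missing(value: Optional[str]) -> bool:
    """Only a real None or a blank string counts as missing."""
    return value is None or value.strip() == ""


surname = "Null"
# The naive check rejects a perfectly valid surname.
print("naive check says missing:", naive_is_missing(surname))
# The safer check keeps "no value" (None) separate from the text "Null".
print("safer check says missing:", safer_is_missing(surname))
```

The design point is that "no value" should be represented by a distinct marker (None here) rather than by a string that a real person might legitimately type.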
People with names like Null may be hard to find, but because of the strict controls placed on names, their problems generally arise on systems where it actually matters, like government forms. This is not an issue unique to the US, either. One Patrick McKenzie, an American programmer living in Japan, has run into regular difficulties because of the length of his last name. But that is nothing compared to Janice Keihanaikukauakahihulihe’ekahaunaele, a Hawaiian woman who campaigned for more flexibility in name length restrictions for state ID cards.
Chelsea Kerwin, May 16, 2016
Amazon Punches Business Intelligence
November 11, 2015
Amazon already gave technology a punch when it launched AWS, but now it is releasing a business intelligence application that will change the face of business operations, or so Amazon hopes. ZDNet describes Amazon’s newest endeavor in “AWS QuickSight Will Disrupt Business Intelligence, Analytics Markets.” The market is already saturated with business intelligence technology vendors, but, according to ZDNet, Amazon’s new AWS QuickSight could cause another market upheaval.
“This month is no exception: Amazon crashed the party by announcing QuickSight, a new BI and analytics data management platform. BI pros will need to pay close attention, because this new platform is inexpensive, highly scalable, and has the potential to disrupt the BI vendor landscape. QuickSight is based on AWS’ cloud infrastructure, so it shares AWS characteristics like elasticity, abstracted complexity, and a pay-per-use consumption model.”
Another monkey wrench for business intelligence vendors is that AWS QuickSight’s prices are not only reasonable, but are borderline scandalous: standard for $9/month per user or enterprise edition for $18/month per user.
Keep in mind, however, that AWS QuickSight is the newest shiny object on the business intelligence market, so it will have out-of-the-box problems, its long-term ramifications are unknown, and it relies on database models and schemas. Do not forget that most business intelligence solutions do not resolve all issues, including ease of use and comprehensiveness. It might be better to wait until all the bugs are worked out of the system, unless you do not mind being a guinea pig.
Whitney Grace, November 11, 2015
TemaTres Open Source Vocabulary Server
November 3, 2015
The latest version of the TemaTres vocabulary server is now available, we learn from the company’s blog post, “TemaTres 2.0 Released.” Released under the GNU General Public License version 2.0, the web application helps manage taxonomies, thesauri, and multilingual vocabularies. The web application can be downloaded at SourceForge. Here’s what has changed since the last release:
*Export to Moodle your vocabulary: now you can export to Moodle Glossary XML format
*Metadata summary about each term and about your vocabulary (data about terms, relations, notes and total descendants terms, deep levels, etc)
*New report: reports about terms with mapping relations, terms by status, preferred terms, etc.
*New report: reports about terms without notes or specific type of notes
*Import the notes type defined by user (custom notes) using tagged file format
*Select massively free terms to assign to other term
*Improve utilities to take terminological recommendations from other vocabularies (more than 300: http://www.vocabularyserver.com/vocabularies/)
*Update Zthes schema to Zthes 1.0 (Thanks to Wilbert Kraan)
*Export the whole vocabulary to Metadata Authority Description Schema (MADS)
*Fixed bugs and improved several functional aspects.
*Uses Bootstrap v3.3.4
See the server’s SourceForge page, above, for the full list of features. Though as of this writing only 21 users had rated the product, all seemed very pleased with the results. The TemaTres website notes that running the server requires some other open source tools: PHP, MySQL, and an HTTP web server. It also specifies that, to update from version 1.82, you keep the db.tematres.php file but replace the code. To update from TemaTres 1.6 or earlier, first go in as an administrator and update to version 1.7 through Menu -> Administration -> Database Maintenance.
Cynthia Murrell, November 3, 2015
CSI Search Informatics Are Actually Real
October 29, 2015
CSI might stand for a popular TV franchise, but it also stands for “compound structured identification,” Phys.org explains in “Bioinformaticians Make The Most Efficient Search Engine For Molecular Structures Available Online.” Sebastian Böcker and his team at the Friedrich Schiller University are researching metabolites, chemical compounds that determine an organism’s metabolism. Metabolites are used to gauge information about the condition of living cells.
While this is amazing science, there are some drawbacks:
“This process is highly complex and seldom leads to conclusive results. However, the work of scientists all over the world who are engaged in this kind of fundamental research has now been made much easier: The bioinformatics team led by Prof. Böcker in Jena, together with their collaborators from the Aalto-University in Espoo, Finland, have developed a search engine that significantly simplifies the identification of molecular structures of metabolites.”
The new search works like a regular search engine, but instead of using keywords it searches through molecular structure databases containing information and structural formulae of metabolites. The new search will reduce the time needed to identify compound structures, which also saves on costs. The hope is that the new search will further research into metabolites and help researchers spend more time working on possible breakthroughs.
Whitney Grace, October 29, 2015
Libraries’ Failure to Make Room for Developer Librarians
October 23, 2015
The article titled “Libraries’ Tech Pipeline Problem” on Geek Feminism explores the lack of diverse developers. The author, a librarian, is extremely frustrated with the approach many libraries have taken. Rather than refocusing their hiring and training practices to emphasize technical skills, many are simply hiring more and more vendors, which is hardly a solution. The article states,
“The biggest issue I see is that we offer a fair number of very basic learn-to-code workshops, but we don’t offer a realistic path from there to writing code as a job. To put a finer point on it, we do not offer “junior developer” positions in libraries; we write job ads asking for unicorns, with expert- or near-expert-level skills in at least two areas (I’ve seen ones that wanted strong skills in development, user experience, and devops, for instance).”
The options available are that librarians either learn to code in their spare time (not viable) or enter the tech workforce temporarily and bring their skills back after a few years. The second option is also full of drawbacks, especially since even white women are marginalized in the tech industry. Instead, the article argues that libraries need to make more room for hiring and promoting people with coding skills and interests while also joining coding communities like Code4Lib.
Chelsea Kerwin, October 23, 2015
Reclaiming Academic Publishing
October 21, 2015
Researchers and writers are at the mercy of academic publishers who control the venues to print their work, select the content of their work, and often control the funds behind their research. Even worse is that academic research is locked behind database walls that require a subscription well beyond the price range of a researcher not associated with a university or research institute. One researcher was fed up enough with academic publishers that he decided to return the publishing and distribution of work to the common people, says Nature in “Leading Mathematician Launches arXiv ‘Overlay’ Journal.”
The new mathematics journal Discrete Analysis peer reviews and publishes papers free of charge on the preprint server arXiv. Timothy Gowers started the journal to avoid the commercial pressures that often distort scientific literature.
“ ‘Part of the motivation for starting the journal is, of course, to challenge existing models of academic publishing and to contribute in a small way to creating an alternative and much cheaper system,’ he explained in a 10 September blog post announcing the journal. ‘If you trust authors to do their own typesetting and copy-editing to a satisfactory standard, with the help of suggestions from referees, then the cost of running a mathematics journal can be at least two orders of magnitude lower than the cost incurred by traditional publishers.’ ”
Some funds are still required to keep Discrete Analysis running. Costs are ten dollars per submitted paper, which pays for the software that manages peer review and the journal Web site, and arXiv requires an additional ten dollars a month to keep running.
Gowers hopes to extend the journal model to other scientific fields, and he believes it will work, especially for fields that only require text. The biggest problem is persuading other academics to adopt the model, but things move slowly in academia, so it will probably be years before it becomes widespread.
Whitney Grace, October 21, 2015
Attivio Does Data Dexterity
October 9, 2015
Enterprise search company Attivio has an interesting post in their Data Dexterity Blog titled “3 Questions for the CEO.” We tend to keep a close eye on industry leader Attivio, and for good reason. In this post, the company’s senior director of product marketing Jane Zupan posed a few questions to her CEO, Stephen Baker, about their role in the enterprise search market. Her first question has Baker explaining his vision for the field’s future, “search-based data discovery”; he states:
“With search-based data discovery, you would simply type a question in your natural language like you do when you perform a search in Google and get an answer. This type of search doesn’t require a visualization tool. So, for example, you could ask a question like ‘tell me what type of weather conditions which exist most of the time when I see a reduction in productivity in my oil wells.’ The answer that comes back, such as ‘snow,’ or ‘sleet,’ gives you insights into how weather patterns affect productivity. Right now, search can’t infer what a question means. They match the words in a query, or keywords, with words in a document. But [research firm] Gartner says that there is an increasing importance for an interface in BI tools that extend BI content creation, analysis and data discovery to non-skilled users. You don’t need to be familiar with the data or be a business analyst or data scientist. You can be anyone and simply ask a question in your words and have the search engine deliver the relevant set of documents.”
Yes, many of us are looking forward to that day. Will Attivio be the first to deliver? The interview goes on to discuss the meaning of the company’s slogan, “the data dexterity company.” Part of the answer involves gaining access to “dark data” buried within organizations’ data silos. Finally, Zupan asks what “sets Attivio apart?” Baker’s answers: the ability to quickly access data from more sources; deriving structure from and analyzing unstructured data; and friendliness to “non-technical” users.
Launched in 2008, Attivio is headquartered in Newton, Massachusetts. Their team includes folks with an advantageous combination of backgrounds: in search, database, and business intelligence companies.
Cynthia Murrell, October 9, 2015
Maverick Search and Match Platform from Exorbyte
August 31, 2015
The article titled “Input Management: Exorbyte Automates the Determination of Identities” on Business On (a primarily German-language website) promotes Full Page Entity Detect from Exorbyte. Exorbyte describes itself as a world leader in search and match for large volumes of data. The company boasts clients in government, insurance, input management, and ICT firms, and really any business with identity resolution needs. The article stresses the importance of pulling information from masses of data in the modern office. It explains,
“With Full Page Entity Detect provides exorbyte a solution to the inbox of several million incoming documents. This identity data of the digitized correspondence (can be used for correspondence definition) extract with little effort from full-text documents such as letters and emails and efficiently compare them with reference databases. The input management tool combines a high fault tolerance with accuracy, speed and flexibility. Gartner, the software company from Konstanz was recently included in the Magic Quadrant for Enterprise Search.”
The company promises that their Matchmaker technology is unrivaled in searching text without restrictions, even without language, allowing for more accurate search. Full Page Entity Detect is said to be particularly useful when it comes to missing information or overlooked errors, since the search is so thorough.
Chelsea Kerwin, August 31, 2015
SharePoint 2016 Beta is Coming Soon
August 13, 2015
There is a lot of excitement about the future of SharePoint. Microsoft wants to capitalize on the good buzz, but in its excitement the timeline has gotten skewed. It seems that the most recent change is in its favor, however. CMS Wire covers the story in its article, “Cancel Your Plans: SharePoint 2016 Beta is (Almost) Here.”
The author begins:
“For the past couple of years, we IT pros really haven’t known what our place in the world was going to be with SharePoint. But I feel like in the past couple of months I’ve seen the future. At least for me, as an IT pro, part of that future is identity. So you’re going to be hearing a lot more about that from me. But also the reason you’re going to be hearing about a lot of that is because next month — August — we’re going to get our first public beta of SharePoint 2016.”
The beta release will come earlier than projected. Lots of updates will come fast and frequently once the release is available, making it difficult to stay ahead of the curve. In order to sort through the chaos, stay tuned to ArnoldIT.com, a website carefully curated by Stephen E. Arnold. His SharePoint feed is a great way to stay in touch with the latest news, without being overwhelmed by the unnecessary details.
Emily Rae Aldridge, August 13, 2015

