Easy as 1, 2, 3: Common Mistakes Made with Data Lakes
December 15, 2015
The article titled Avoiding Three Common Pitfalls of Data Lakes on DataInformed explores three pitfalls that could negate the advantages of data lakes. The article begins with the perks, such as easier data access and, of course, the cost-effectiveness of keeping data in a single hub. The first pitfall is sustainability (or the lack thereof): the article emphasizes that data lakes actually require much more planning and management of data than conventional databases. The second pitfall raised is resource allocation:
“Another common pitfall of implementing data lakes arises when organizations need data scientists, who are notoriously scarce, to generate value from these hubs. Because data lakes store data in their native format, it is common for data scientists to spend as much as 80 percent of their time on basic data preparation. Consequently, many of the enterprise’s most valued resources are dedicated to mundane, time-consuming processes that considerably lengthen time to action on potentially time-sensitive big data.”
The third pitfall is technology contradictions, or trying to use traditional approaches on a data lake that holds both big and unstructured data. Be not alarmed, however: the article goes into great detail about how to avoid these issues through data lake development with smart data technologies such as semantic tech.
Chelsea Kerwin, December 15, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Search Data from Bing for 2015 Yields Few Surprises
December 11, 2015
The article on Search Engine Watch titled Bing Reveals the Top US and UK Searches of 2015 covers the top searches in the extremely intellectual categories of Celebs, News, Sport(s), Music, and Film. Starting with the last category, guess what franchise involving Wookiees and Carrie Fisher took the top place? For Celebrity searches, Taylor Swift took first in the UK, and Caitlyn Jenner in the US, followed closely by Miley Cyrus (and let’s all take a moment to savor the seething rage this data must have caused in Kim Kardashian’s heart). Why does this trivia matter? Ravleen Beeston, UK Sales Director of Bing, is quoted in the article with her two cents:
“Understanding the interests and motivations driving search behaviour online provides invaluable insight for marketers into the audiences they care about. This intelligence allows us to empower marketers to create meaningful connections that deliver more value for both consumers and brands alike. By reflecting back on the key searches over the past 12 months, we can begin to anticipate what will inspire and how to create the right experience in the right context during the year to come.”
Some of the more heartening statistics were related to searches for women’s sports news, which increased from last year. Serena Williams was searched more often than the top five male tennis players combined. And saving the best for last, in spite of the dehumanizing and often racially biased rhetoric we’ve all heard involving Syrian refugees, there was a high volume of searches in the US asking how to provide support and aid for refugees, especially children.
Chelsea Kerwin, December 11, 2015
Career Advice from Successful Googlers
November 18, 2015
A few words of wisdom from a Google veteran went from Quora query to Huffington Post article in, “What It Takes to Rise the Ranks at Google: Advice from a Senior Staff Engineer.” The original question was, “How hard is it to make Senior Engineer at Google.” HuffPo senior editor Nico Pitney reproduces the most popular response, that of senior engineer Carlos Pizano. Pizano lists some of his education and pre-Google experience, and gives some credit to plain luck, but here’s the part that makes this good guidance for approaching many jobs:
“I happen to be a believer of specialization, so becoming ‘the person’ on a given subject helped me a lot. Huge swaths of core technology key to Google’s success I know nothing about, of some things I know all there is to know … or at least my answers on the particular subject were the best to be found at Google. Finally, I never focused on my career. I tried to help everybody that needed advice, even fixing their code when they let me and was always ready to spread the knowledge. Coming up with projects but giving them to eager, younger people. Shine the light on other’s accomplishments. All that comes back to you when performance review season comes.”
Knowing your stuff and helping others—yes, that will go a long way indeed. For more engineers’ advice, some of which is more Google-specific, navigate to the list of responses here.
Cynthia Murrell, November 18, 2015
Digging into Google’s Rich Answer Vault
November 4, 2015
Google searching has evolved from entering precise keywords into the search engine to inputting random questions, complete with question mark. Google has gone beyond answering questions and keyword queries. For over a year now, Google has included content referred to as “rich answers” directly within search results, meaning answers to search queries appear without the user having to click through to a Web site. Stone Temple Consulting was curious how much people were actually using rich answers, how they worked, and how they could benefit its clients. In December 2014 and July 2015, the firm ran a series of tests, and “Rich Answers Are On The Rise!” discusses the results.
Using the same data sets for both trials, Stone Temple Consulting discovered that Google’s use of rich answers grew significantly in the first half of 2015, as did the labeling of rich answers with titles and the use of images with them. The data might overstate the actual usage of rich answers, because:
“Bear in mind that the selected query set focused on questions that we thought had a strong chance of generating a rich answer. The great majority of questions are not likely to do so. As a result, when we say 31.2 percent of the queries we tested generated a rich answer, the percentage of all search queries that would do so is much lower.”
The article then briefly discusses the different types of rich answers Google uses and how each type grew. One conclusion that can be drawn from the types of rich answers is that people are steadily relying more and more on one tool to find all of their information, from answering a basic research question to buying a plane ticket.
Whitney Grace, November 4, 2015
CSI Search Informatics Are Actually Real
October 29, 2015
CSI might stand for a popular TV franchise, but it also stands for “compound structured identification,” Phys.org explains in “Bioinformaticians Make The Most Efficient Search Engine For Molecular Structures Available Online.” Sebastian Böcker and his team at the Friedrich Schiller University are researching metabolites, chemical compounds that determine an organism’s metabolism. Metabolites are used to gauge information about the condition of living cells.
While this is amazing science, there are some drawbacks:
“This process is highly complex and seldom leads to conclusive results. However, the work of scientists all over the world who are engaged in this kind of fundamental research has now been made much easier: The bioinformatics team led by Prof. Böcker in Jena, together with their collaborators from the Aalto-University in Espoo, Finland, have developed a search engine that significantly simplifies the identification of molecular structures of metabolites.”
The new search works like a regular search engine, but instead of using keywords it searches through molecular structure databases containing information and structural formulae of metabolites. It should reduce the time needed to identify compound structures, which also saves on costs. The hope is that it will further research into metabolites and free researchers to spend more time working on possible breakthroughs.
Whitney Grace, October 29, 2015
New Search System for Comparing Companies
September 22, 2015
There is a new tool out to help companies compile information on their competitors: RivalSeek. This brainchild of entrepreneur Richard Brevig seeks to combat an issue he encountered when he turned to Google while researching the market for a different project: Google’s “personalized search” filters keep users from viewing the whole landscape of any particular field. Frustration led Brevig to develop some tools of his own, which he realized might appeal to others. The site’s homepage explains simply:
“Find your competitors that Google can’t. RivalSeek’s competitor search engine looks past filter bubbles, finding competitors you’ve never heard of.”
More information can be found in Brevig’s brief introductory video on YouTube. There’s also this “quick demo,” which can be found on YouTube or playing quietly on RivalSeek’s home page. While the tool is still in Beta, Brevig is confident enough in its usefulness to charge $29 a month for access. You can find an example success story, for the Dollar Shave Club, at the company’s blog.
This is a great idea. While Google’s filter bubbles can be convenient, it is clear that confirmation bias is not their only hazard. Perhaps Brevig would be interested in expanding this tool into other areas, like science, literature, or sociology. Just a suggestion.
Cynthia Murrell, September 22, 2015
Where’s the Finish Line, Enterprise Search?
September 16, 2015
What never ceases to amaze me is that people are always perplexed when goals for technology change. The change always comes with a big hullabaloo, but rather than complaining, people would spend their time better learning ways to adapt. Enterprise search is one of those technology topics that sees slow growth, but when changes occur they are huge. Digital Workplace Group tracks the newest changes in enterprise search and explains why they happened and how to adapt in “7 Ways The Goal Posts On Enterprise Search Have Moved.”
After spending an inordinate amount of time explaining how the author arrived at the seven ways enterprise search has changed, the article finally treats us to its bulk. Among the seven ways are obvious insights that have been discussed in prior articles on Beyond Search, but there are new ideas to ruminate about. Among the obvious are that users want direct answers, that they expect search to do more than find information, and that search should understand a user’s intent. While the obvious insights are already implemented in Web search engines, enterprise search lags behind.
Enterprise search should work on a more personalized level because it is part of a closed network and because people rely on it to fulfill an immediate need. A social filter could be applied to display a user’s personal data in search results, and users also rely on the search filter as a quick shortcut feature. Enterprise search is way behind in taking advantage of search analytics and of how users consume and manipulate data.
“To summarize everything above: Search isn’t about search; it’s about finding, connecting, answers, behaviors and productivity. Some of the above changes are already here within enterprises. Some are still just being tested in the consumer space. But all of them point to a new phase in the life of the Internet, intranets, computer technology and the experience of modern, digital work.”
As always, there is a lot of room for enterprise search improvement, but these changes need to be made for an updated and better work experience.
Whitney Grace, September 16, 2015
Mondeca Has a Sandbox
September 15, 2015
French semantic tech firm Mondeca has their own research arm, Mondeca Labs. Their website seems to be going for a playful, curiosity-fueled vibe. The intro states:
“Mondeca Labs is our sandbox: we try things out to illustrate the potential of Semantic Web technologies and get feedback from the Semantic Web community. Our credibility in the Semantic Web space is built on our contribution to international standards. Here we are always looking for new challenges.”
The page links to details on several interesting projects. One entry we noticed right away is for an inference engine; they say it is “coming soon,” but a mouse click reveals that no info is available past that hopeful declaration. The site does supply specifics about other projects; some notable examples include linked open vocabularies, a SKOS reader, and a temporal search engine. See their home page, above, for more.
Established in 1999, Mondeca has delivered pragmatic semantic solutions to clients in Europe and North America for over 15 years. The firm is based in Paris, France.
Cynthia Murrell, September 15, 2015
Bing Snapshots for In-App Searches
September 9, 2015
Developers have a new tool for incorporating search data directly into apps, we learn in “Bing Snapshots First to Bring Advanced In-App Search to Users” at Search Engine Watch. Apparently Google announced a similar feature, Google Now on Tap, earlier this year, but Microsoft’s Bing has beaten them to the consumer market. Of course, part of Snapshots’ goal is to keep users from wandering out of “Microsoft territory,” but many users are sure to appreciate the convenience nevertheless. Reporter Mike O’Brien writes:
“With Bing Snapshots, developers will be able to incorporate all of the search engine’s information into their apps, allowing users to perform searches in context without navigating outside. For example, a friend could mention a restaurant on Facebook Messenger. When you long-press the Home button, Bing will analyze the contents of the screen and bring up a snapshot of a restaurant, with actionable information, such as the restaurant’s official website and Yelp reviews, as well Uber.”
Bing officials are excited about the development (and, perhaps, scoring a perceived win over Google), declaring this the start of a promising relationship with developers. The article continues:
“Beyond making sure Snapshots got a headstart over Google Now on Tap, Bing is also able to stand out by becoming the first search engine to make its knowledge graph available to developers. That will happen this fall, though some APIs are already available on the company’s online developer center. Bing is currently giving potential users sneak peeks on its Android app.”
Hmm, that’s a tad ironic. I look forward to seeing how Google positions the launch of Google Now on Tap when the time comes.
Cynthia Murrell, September 9, 2015
Google Seeks SEO Pro
August 12, 2015
Well, isn’t this interesting. Search Engine Land tells us that “Google Is Hiring an SEO Manager to Improve its Rankings in Google.” The Goog’s system is so objective, even Google needs a search engine optimization expert! That must be news to certain parties in the European Union.
Reporter Barry Schwartz spotted the relevant job posting at the company’s Careers page. Responsibilities are as one might expect: develop and maintain websites; maintain and develop code that will engage search engines; keep up with the latest in SEO techniques; and work with the sales and development departments to implement SEO best practices. Coordination with the search-algorithm department is not mentioned.
Google still stands as one of the most sought-after employers, so it is no surprise they require a lot of anyone hoping to fill the position. Schwartz notes, though, that link-building experience is not specified. He shares the list of criteria:
“The qualifications include:
*BA/BS degree in Computer Science, Engineering or equivalent practical experience.
*4 years of experience developing websites and applications with SQL, HTML5, and XML.
*2 years of SEO experience.
*Experience with Google App Engine, Google Custom Search, Webmaster Tools and Google Analytics and experience creating and maintaining project schedules using project management systems.
*Experience working with back-end SEO elements such as .htaccess, robots.txt, metadata and site speed optimization to optimize website performance.
*Experience in quantifying marketing impact and SEO performance and strong understanding of technical SEO (sitemaps, crawl budget, canonicalization, etc.).
*Knowledge of one or more of the following: Java, C/C++, or Python.
*Excellent problem solving and analytical skills with the ability to dig extensively into metrics and analytics.”
Lest anyone doubt the existence of such an ironic opportunity, the post reproduces a screenshot of the advertisement, “just in case the job is pulled.”
Cynthia Murrell, August 12, 2015

