Web Analytics: A Fancy Way of Saying You Have a Blue Ribbon Winning Bloodhound Tracking You
June 18, 2020
DarkCyber is easily confused. Every day brings more incredible cyber security marketing hoo-hah. And each day more incredible security issues come to light. A good example was the Wall Street Journal’s story “Russian Hackers Evaded Firms’ Detection Tools”, Wednesday, June 18, 2020. Yeah, those cyber tools are special.
The story “Lightweight Alternatives to Google Analytics” is a helpful round up of digital bloodhounds. If you are looking for ways to make sense of Web site log files, you can work through the snapshots of such systems as GoatCounter, Plausible, Simple Analytics, and Fathom.
The intriguing segment of the write up is, in DarkCyber’s opinion, this statement:
Google tracks and stores a huge amount of information about users.
A 2018 paper [PDF] by Douglas Schmidt highlights the extent of Google’s tracking, with location tracking on Android devices as one example:
Both Android and Chrome send data to Google even in the absence of any user interaction. Our experiments show that a dormant, stationary Android phone (with Chrome active in the background) communicated location information to Google 340 times during a 24-hour period, or at an average of 14 data communications per hour.
The paper distinguishes between “active” and “passive” tracking. Active tracking is when the user directly uses or logs into a Google service, such as performing a search, logging into Gmail, and so on. In addition to recording all of a user’s search keywords, Google passively tracks users as they visit web sites that use GA and other Google publisher tools. Schmidt found that in an example “day in the life” scenario, “Google collected or inferred over two-thirds of the information through passive means”.
Schmidt’s paper details how GA cookie tracking works, noting the difference between “1st-party” and “3rd-party” cookies — the latter of which track users and their ad clicks across multiple sites:
While a GA cookie is specific to the particular domain of the website that user visits (called a “1st-party cookie”), a DoubleClick cookie is typically associated with a common 3rd-party domain (such as doubleclick.net). Google uses such cookies to track user interaction across multiple 3rd-party websites. When a user interacts with an advertisement on a website, DoubleClick’s conversion tracking tools (e.g. Floodlight) places cookies on a user’s computer and generates a unique client ID. Thereafter, if the user visits the advertised website, the stored cookie information gets accessed by the DoubleClick server, thereby recording the visit as a valid conversion.
Because such a large percentage of web sites use Google advertising products as well as GA, this has the effect that the company knows a large fraction of users’ browsing history across many web sites, both popular sites and smaller “mom and pop” sites.
In short, Google knows a lot about what you like, where you are, and what you buy. Google does provide ways to turn off features like targeted advertising and location tracking, as well as to delete the personalized profile associated with an account. However, these features are almost entirely opt-in, and most users either don’t know about them or just never bother to turn them off. Of course, just switching away from GA won’t eliminate all of these privacy issues (for example, it will do nothing to stop Android location tracking or search tracking), but it’s one way to reduce the huge amount of data Google collects. In addition, for site owners that use a GA alternative, Google does not get a behind-the-scenes look at the site’s traffic patterns — data which it could conceivably use in the future to build a competing tool.
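The difference between a “1st-party” and a “3rd-party” cookie described in the quoted passage can be sketched in a few lines of Python. This is a toy model, not a description of how GA or DoubleClick is actually implemented: the `Browser` class, the `visit` function, and the domain names (`tracker.example` and the three sites) are all hypothetical. It shows only why an ID tied to one shared domain can link visits across sites while per-site IDs cannot.

```python
import uuid
from collections import defaultdict


class Browser:
    """Toy cookie jar: one random ID per domain that sets a cookie."""

    def __init__(self):
        self.cookies = {}

    def cookie_for(self, domain):
        # The ID is created on first contact with a domain and reused afterwards.
        if domain not in self.cookies:
            self.cookies[domain] = uuid.uuid4().hex
        return self.cookies[domain]


def visit(browser, site, first_party_log, third_party_log,
          tracker="tracker.example"):  # hypothetical shared tracker domain
    # First-party cookie: the ID is scoped to the site being visited.
    first_party_log[browser.cookie_for(site)].append(site)
    # Third-party cookie: every site embeds the same tracker domain,
    # so the same ID is seen on each visit.
    third_party_log[browser.cookie_for(tracker)].append(site)


browser = Browser()
first_party_log = defaultdict(list)
third_party_log = defaultdict(list)
sites = ["news.example", "shop.example", "momandpop.example"]  # made-up sites
for site in sites:
    visit(browser, site, first_party_log, third_party_log)

# Each first-party ID has seen exactly one site.
print(len(first_party_log), "first-party IDs")
# The single third-party ID has seen the whole browsing trail.
print(list(third_party_log.values()))
```

In this model the operator of the shared domain ends up with one ID attached to all three visits, which is the cross-site view the passage attributes to 3rd-party cookies.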
A paywall may be protecting this write up. Nevertheless, if the information in the passage quoted above is accurate, Google’s senior management may have to do some explaining as the company executes some “Dancing with the Stars” footwork if regulators decide to dig into such assertions.
And the bloodhound, “Who me?” Woof.
Stephen E Arnold, June 18, 2020
Organic or Paid Search? Answer: Pay Up
June 16, 2020
There is a weird symbiosis. Unlike the sucker fish clamped on a shark, the predator’s fellow travelers operate in the dark digital ocean. “Organic Vs Paid Search: Explained” correctly points out that traffic costs money. This is not 1994, gentle reader. This is 2020 and the costs of running an ad supported search engine are difficult to control.
The write-up ignores a simple fact: Online advertising companies want anyone who wants clicks and traffic to pay. Like the IRS oriented phrase: Death, taxes, and the online traffic levy.
This means that “organic search” — the 1994 style of Web indexing — is dead like dinosaurs. The future is pay to play.
As output devices become smaller and voice creeps forward as a way to explain where to get a pizza, the free loading sucker fish are going to get scraped off the digital shark. The shark will then eat the sucker fish.
What’s this mean for search engine optimization? More baloney, more hand waving, and another lost cause.
Pay to play, the phrase of the future. There’s no cyber Mother Teresa to intervene.
Stephen E Arnold, June 16, 2020
Google: Is Hiding URLs Part of the Walled Garden Play?
June 15, 2020
I read yet another baffled lament about Google’s hiding urls. Google nuked urls for PDFs a while ago. Now the url itself will go away. “Google Resumes Its Senseless Attack on the URL Bar, Hides Full Addresses on Chrome 85” states:
Google has tried on and off for years to hide full URLs in Chrome’s address bar, because apparently long web addresses are scary and evil. Despite the public backlash that came after every previous attempt, Google is pressing on with new plans to hide all parts of web addresses except the domain name.
The write up asserts:
Google’s goal with Accelerated Mobile Pages (AMP) and similar technologies is to keep users on Google-hosted content as much as possible, and Chrome for Android already modifies the address bar on AMP pages to hide that the pages are hosted by Google.
DarkCyber wants to point out that as regulation becomes increasingly likely, outfits like the Google are doing a “land grab.” Ideas which have been subject to the old-school five-year mentality are now being pushed forward.
Google will become increasingly aggressive in its drive to capture, create, and retain as many clicks as possible. When the url bar is blank, the idea is that a user will just type “CNN” and let Google do the deciding for the user.
The less one knows, the better it is for Google’s ad matching. The more clicks Google can generate by “improving the user experience” with less information, the more revenue the firm projects that it will earn.
Let’s face it. The majority of users are really poor online information searchers. Therefore, Google’s Internet wants to be positioned as the Internet.
When regulators arrive, it is unlikely that those astute individuals will understand how removing information increases Google’s projected online ad revenue.
Nefarious? Nope, just an acceleration of Google management actions.
Where are those clicks going? Same places, but the real estate on which ads can be displayed is tiny compared to those big fat, boat anchor PC screens.
Thus, less info is good for the Google, and the regulatory authorities will be confused. Without an influx of clicks, obtained one way or another, the Google is vulnerable to the likes of Amazon-, Apple-, and Facebook-type predators.
Stephen E Arnold, June 15, 2020
Whom Do We Trust? Facebook, Google, Others?
June 10, 2020
Internet giants Google and Facebook keep assuring us they respect our privacy, but can we trust them? Facebook, for example, just promised the personal data it is supplying to Covid-19 researchers, academics, and humanitarian agencies is stripped of any identifying information. Daijiworld reports, “Facebook Says Not Sharing Users’ Data with Researchers, Academics.” We’re told:
“Over the past few months, public health researchers have used data sets released by Facebook to inform decisions around Covid-19 across Asia, Europe and North America.”
However, we are assured, Facebook’s Data for Good program protects users’ anonymity:
“The social networking giant said it has created a differential privacy framework that protects the privacy of individuals in aggregated datasets by ensuring no one can identify specific people in these datasets. In 2017, the company launched ‘Data for Good’ with the goal of empowering partners with data to help make progress on major social issues. … Facebook said the research partners enrolled in the ‘Data for Good’ programme only have access to aggregate information from Facebook and it does not share any individual information.”
Sounds great—but are we to simply take Facebook’s word for it? The company is not exactly known for its transparency.
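For readers wondering what a “differential privacy framework” might involve, here is a minimal sketch of one textbook approach, the Laplace mechanism applied to aggregate counts. Facebook’s actual implementation is not described in the article, so everything here is an assumption for illustration: the function names, the `epsilon` value, and the region counts are made up.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered on zero with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng):
    # A counting query has sensitivity 1: adding or removing one person
    # changes the count by at most 1. Smaller epsilon means more noise.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(2020)  # seeded only so the sketch is repeatable
region_counts = {"region_a": 1200, "region_b": 35}  # made-up aggregates
noisy_counts = {
    region: private_count(count, epsilon=0.5, rng=rng)
    for region, count in region_counts.items()
}
print(noisy_counts)
```

The published number is the true count plus random noise, so a single person’s presence or absence is hard to infer from it. Whether a given release actually delivers that protection depends on parameters such as `epsilon`, which outsiders typically cannot inspect.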
Meanwhile, Inventiva reports, “Google Is Sued for Secretly Amassing a Vast Trove of User Web Data.” Despite that company’s pledge that users are in complete control of their data, a complaint recently filed in federal court in San Jose claims otherwise. The plaintiffs accuse Google of invasion of privacy and violations of federal wiretapping law. Writer Apurva Saxena reports:
“Google surreptitiously amasses billions of bits of information — every day — about internet users even if they opt out of sharing their information, three consumers alleged in a proposed class action lawsuit. … According to the suit, the company collects information, including IP addresses and browsing histories, whenever users visit web pages or use an app tied to common Google services, such as Google Analytics and Google Ad Manager. This makes ‘Google “one stop shopping” for any government, private, or criminal actor who wants to undermine individuals’ privacy, security, or freedom,’ the consumers allege.”
Companies like Facebook and Google (one might add in Amazon for good measure) have obtained a great deal of power and revenue through data collection, and we have only their promises that they are not violating user privacy. Who will hold them accountable? We shall see how this lawsuit pans out; similar suits have been summarily dismissed.
Cynthia Murrell, June 10, 2020
Google Docs: More Than Enabler of Student Messages via Its Comments Function
June 8, 2020
Teachers are often befuddled by their students. Google Docs makes it possible for students to use the comments features to exchange interesting messages. When an adult approaches, a click makes the content disappear. Great for students, not so good for some teachers and parents.
“How Google Docs Became the Social Media of the Resistance” explains what may be another facet of the Googlers’ code Byzantium. Google is ubiquitous and most people don’t think too much or too long about the implications of collaborative tools for word processing and Excel-like software.
Boring, right?
The write up explains:
… Google Docs has emerged as a way to share everything from lists of books on racism to templates for letters to family members and representatives to lists of funds and resources that are accepting donations. Shared Google Docs that anyone can view and anyone can edit, anonymously, have become a valuable tool for grassroots organizing during both the coronavirus pandemic and the police brutality protests sweeping the US. It’s not the first time. In fact, activists and campaigners have been using the word processing software for years as a more efficient and accessible protest tool than either Facebook or Twitter.
Let’s assume that the article is accurate. Will Google take some action to control what its “users” do with its Microsoft Office clone? WWGD (what will Google do) is a new company watching sport. DarkCyber believes that it may become more interesting than bird watching.
Is information stored on Google Docs accessible for monitoring? Maybe Google is not responsible for what its users do? Hmmmm.
Stephen E Arnold, June 8, 2020
Google to Australia: What! Us Pay You? Take a Walkabout, Mates
June 2, 2020
This will be interesting. Google has found the Australian request for money to save “real news” unacceptable. The information, if accurate, appears in “Google Rejects Call For Huge Australian Media Payout.” DarkCyber learned:
Google has rejected demands it pay hundreds of millions of dollars per year in compensation to Australian news media under a government-imposed revenue sharing deal.
What’s interesting is that Google, working overtime to control its costs of being the Google, said:
The company’s top executive in Australia said Google made barely Aus$10 million (US$6.7 million) per year from news-linked advertising, a fraction of a government watchdog’s estimates for the sector.
Will that explanation fly in Canberra (yep, DarkCyber knows there was an aircraft with the moniker Canberra, but did you know that word may mean “meeting place”?). Unfortunately the meeting place for Google and the government of Australia is likely to be in Oodnadatta in the summer.
Ms. Silva, the chief Googler in Australia:
also denied ACCC arguments that the tech firms gain significant “indirect benefits” from displaying news since the content draws users to their platforms. News “represents only a tiny number of queries” on Google, accounting last year for barely one percent of actions on Google Search in Australia, she said.
After 22 years of almost zero government initiative in regulating or using legislative mechanisms to deal with Google, Australia is moving forward to protect news. The effort will be interesting to watch. Unfortunately companies are likely to have more sticktoativity than some government professionals. What happens if Google hires some of the attorneys pushing the anti Google activity?
Stephen E Arnold, June 2, 2020
Google and Page Experience
June 2, 2020
Just a short item, a question in reality: What’s a “page experience”? I understand a Web page. This is the digital equivalent of a note card or a sheet of paper. I understand experience; for example, a tumble down a flight of steps is an experience.
But “page experience”? Fuzzy, weird word pairing like those old school America Online word pairs.
“Google Will Factor Page Experience into Search Rankings” explains that the phrase means:
page experience will measure how users perceive the experience of interacting with a web page. To determine page experience, Google will consider Core Web Vitals, metrics that measure user experience, as well as existing signals, like mobile-friendliness, safe-browsing and HTTPS-security.
Interesting. The “secret” Google ranking algorithm gets another batch of signals to use to determine relevance. Does “relevance” mean which and how many ads can be matched to information retrieved from one of Google’s universal search indexes?
The article reports:
Google says it will still prioritize the best information…
For decades methods like precision and recall, Boolean logic, and controlled vocabularies provided mechanisms for matching indexed information to a user query.
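For anyone who has not met the terms, precision and recall are simple ratios. The sketch below computes them for one query. The document IDs and the `precision_recall` helper are invented for illustration and say nothing about how Google scores results.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


retrieved = ["doc1", "doc2", "doc3", "doc4"]  # what the system returned
relevant = ["doc2", "doc4", "doc7"]           # what a human judged relevant
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four returned documents are relevant, and two of the three relevant documents were found. Both numbers can be checked by anyone holding the relevance judgments, which is not true of a signal like “page experience.”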
Most people have zero idea how Google determines what information to display. Does it matter? To most people online information is accurate. Jibber jabber like “page experience” is a phrase with a hefty payload of content free suggestivity.
Advertising revenue is the point of the exercise, isn’t it? Perhaps there is a correlation between Amazon’s and Facebook’s growing online advertising businesses and Google verbiage?
Using Google in a quest to find relevant information is an experience in itself.
Stephen E Arnold, June 2, 2020
Google Enters a Summer of Excuses
May 31, 2020
I wrote about the “dog ate my homework” and Google a few days ago. Now we have an Android release delay. The canine excuse was Covid. No dog this time. “Google Postpones Android 11 Unveiling amid U.S. Protests” reports the delay due to civil disorder, protests, and reprises of LA issues. No big deal, of course, but why not just say, “We’re not ready.” Is the new Google an excuse generation system? Are not software updates handled from the cloud? Does not Google have access to a video conferencing system? Interesting. Facebook, Microsoft, and Zoom services work okay for virtual announcements.
Stephen E Arnold, May 31, 2020
Google: The Web. Period.
May 29, 2020
A few years ago, DarkCyber noticed that Google hides the url for PDF files. For anyone interested in creating a reference to a PDF, the Google change was an annoyance. There are workarounds, but Google is pretty skilled at creating work which channels users into the firm’s walled garden. If you look for a url in an AMP page, there are no more urls, at least according to Zenexer via this tweet. DarkCyber believes that the Google walled garden, discussed in Google Version 2: The Calculating Predator, is taking specific steps to become the “Internet.” I am not going to review information I published in 2006. Are there workarounds? Of course. Will 96 percent of the people relying on Google for information use them? Not in a blue moon. What are the implications of this? Many. But after 14 years of erecting the digital equivalent of Machu Picchu, who cares? Who notices? Who bothers to deal with footnotes, provenance, and sources?
Answer: Not the people in regulatory agencies in Washington, DC. And maybe only four percent of Google’s users or products. That Lewis Carroll guy did the garden thing, didn’t he? What were his motives? Yikes.
Stephen E Arnold, May 29, 2020
Google and Its Hard-to-Believe Excuse of the Week
May 27, 2020
I taught for one or two years when I was in graduate school. Did I ever hear a student say, “My dog ate my homework”? I sure did. I heard other excuses as well; for example, “I was shot on Thanksgiving Day.” (A true statement. The student showed me the bullet wound in his arm.) I also heard, “I had to watch my baby sister, and she was sick so I couldn’t get the homework.” True. As it turned out, the kid was an only child.
But I never heard, “The algorithm did it.”
Believe it or not, Engadget reported this: “YouTube Blames Bug for Censoring Comments on China’s Ruling Party.” I think Engadget should have written “about China’s” but these real journalists use Grammarly, like, you know.
The article states:
Whatever way the error made its way into YouTube, Google has been slow to address it.
For DarkCyber, the important point is that software, not a work-from-home or soon-to-be RIFed human, made the error.
The Google plays the “algorithm did it” card.
Despite Google’s wealth of smart software, the company’s voice technology has said nothing about the glitch.
Stephen E Arnold, May 27, 2020

