Be Smart: Live in Greenness

August 27, 2020

I do not want to be skeptical. I do not want to suggest that a study may need what might be called verification. Please, read “Residential Green Space and Child Intelligence and Behavior across Urban, Suburban, and Rural Areas in Belgium: A Longitudinal Birth Cohort Study of Twins.” To add zip to your learning, navigate to a “real” news outfit’s article called “Children Raised in Greener Areas Have Higher IQ, Study Finds.” Notice any significant differences.

First, the spin in the headline. The PLOS article points out that the sample comes from Belgium. How representative is this country when compared to Peru or Syria? How reliable are “intelligence” assessments? What constitutes bad behavior? Are these “qualities” subject to statistically significant variations due to exogenous factors?

I don’t want to do a line-by-line comparison of the write-up, which wants to ring the academic gong. Nor do I want to read how “real” journalists deal with a scholarly article.

I would point out this sentence in the scholarly article:

To our knowledge, this is the first study investigating the association between residential green space and intelligence in children.

Yeah, let’s not get too excited about a sample of 620 in Belgium. Skip school. Play in a park or wander through thick forests.

Stephen E Arnold, August 27, 2020

Bias in Biometrics

August 26, 2020

How can we solve bias in facial recognition and other AI-powered biometric systems? We humans could try to correct for it, but guess where AI learns its biases—yep, from us. Researcher Samira Samadi explored whether using a human evaluator would make an AI less biased or, perhaps, even more so. We learn of her project and others in BiometricUpdate.com’s article, “Masks Mistaken for Duct Tape, Researchers Experiment to Reduce Human Bias in Biometrics.” Reporter Luana Pascu writes:

“Curious to understand if a human evaluator would make the process fair or more biased, Samadi recruited users for a human-user study. She taught them about facial recognition systems and how to make decisions about system accuracy. ‘We really tried to imitate a real-world scenario, but that actually made it more complicated for the users,’ Samadi said. The experiment confirmed the difficulty in finding an appropriate dataset with ethically sourced images that would not introduce bias into the study. The research was published in a paper called A Human in the Loop is Not Enough: The Need for Human-Subject Experiments in Facial Recognition.”

Many other researchers are studying the bias problem. One NIST report found a lot of software that produced a 10-fold to 100-fold increase in the probability of Asian and African American faces being inaccurately recognized (though a few systems had negligible differences). Meanwhile, a team at Wunderman Thompson Data found tools from big players Google, IBM, and Microsoft to be less accurate than they had expected. For one thing, the systems had trouble accounting for masks—still a persistent reality as of this writing. The researchers also found gender bias in all three systems, even though the technologies used are markedly different.

There is reason to hope. Researchers at Durham University’s Computer Science Department managed to reduce racial bias by one percent and improve ethnicity accuracy. To achieve these results, the team used a synthesized data set with a greater focus on feature identification. We also learn:

“New software to cut down on demographic differences in face biometric performance has also reached the market. The ethnicity-neutral facial recognition API developed by AIH Technology is officially available in the Microsoft Azure Marketplace. In March, the Canadian company joined the Microsoft Partners Network (MPN) and announced the plans for the global launch of its Facial-Recognition-as-a-Service (FRaaS).”

Bias in biometrics, and AI in general, is a thorny problem with no easy solution. At least now people are aware of the issue and bright minds are working to solve it. Now, if only companies would be willing to delay profitable but problematic implementations until solutions are found. Hmmm.

Cynthia Murrell, August 26, 2020

Cloud Data: Clear with Rain Predicted for On-Premises Hardware

August 21, 2020

I like surveys which provide some information about sample size. “Survey: How the Pandemic Is Shaking Up the Network Market” says that 2,400 information technology decision makers participated. How were these individuals selected? How was the survey conducted? When was the survey conducted? These questions are not answered. Nevertheless, some of the findings seemed interesting.

One of the surprising factoids was the shift from a license for a period of time to a “subscription.” How many outfits are subscribing to cloud services? The write up reports:

The average proportion of IT services consumed via subscription will accelerate by 38% in the next two years, from 34% of the total today to 46% in 2022, and the share of organizations that consume a majority (over 50%) of their IT solutions ‘as a service’ will increase by approximately 72% in that time.
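
A quick back-of-the-envelope check of those figures, assuming the quoted 34% and 46% are exact values rather than rounded ones (my sketch, not the survey’s own arithmetic):

    today, in_2022 = 0.34, 0.46
    relative_increase = (in_2022 - today) / today
    print(f"Relative increase: {relative_increase:.1%}")  # about 35.3%, versus the quoted 38%

Close, but not the quoted 38 percent. Presumably unrounded survey numbers close the gap, though the write up does not say.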

Automatic monthly payments and probably tricky cancellation policies will be part of the subscription business, but that’s just a hunch, not a survey finding.

Other items of interest included these factoids:

77% [of those responding to the survey] said that investments in networking projects had been postponed or delayed since the onset of COVID-19, and 28% indicated that projects had been cancelled altogether.

35% of ITDMs [an acronym which few probably know; it means “IT decision-makers”] globally are planning to increase their investment in AI-based networking technologies, with the APAC region leading the charge at 44% (including 60% of ITDMs in India and 54% in Hong Kong).

just 8% [of the sample] globally plan to continue with only CapEx investments.

Net net: Subscription pricing and curtailed capital expenditures may be trends. If these data are accurate, they suggest that companies targeting on-premises hardware sales may face some headwinds. Of course, I believe everything I read on the Internet, particularly objective surveys.

Stephen E Arnold, August 21, 2020

True or False: AI Algorithms Are Neutral Little Puppies

August 11, 2020

The answer, according to CanIndia News, is false. (I think some people believe this.) “Google, IBM, Microsoft AI Models Fail to Curb Gender Bias” reports:

new research has claimed that Google AI datasets identified most women wearing masks as if their mouths were covered by duct tapes. Not just Google. When put to work, artificial intelligence-powered IBM Watson virtual assistant was not far behind on gender bias. In 23 per cent of cases, Watson saw a woman wearing a gag while in another 23 per cent, it was sure the woman was “wearing a restraint or chains”.

Before warming up the tar and chasing geese for feathers, you may want to note that the sample was 265 men and 265 women. Note: The subjects were wearing Covid masks or personal protective equipment.

Out of the 265 images of men in masks, Google correctly identified 36 per cent as containing PPE. It also mistook 27 per cent of images as depicting facial hair.

The researchers learned that 15 per cent of images were misclassified as duct tape.

The write up highlights this finding:

Overall, for 40 per cent of images of women, Microsoft Azure Cognitive Services identified the mask as a fashion accessory compared to only 13 per cent of images of men.

Surprised? DarkCyber is curious about:

  1. Sample size. DarkCyber’s recollection is that the sample should have been in the neighborhood of 2,000 or so, with 1,000 possible women and 1,000 possible men (a rough margin-of-error sketch follows this list)
  2. Training. How were the models trained? Were “masks” represented in the training set? What percentage of training images had masks?
  3. Image quality. What steps were taken to ensure that the “images” were of consistent quality; that is, focus, resolution, color, etc.
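
For what the standard formula is worth, here is a rough sketch of the 95 percent margin of error for a proportion at roughly 265 images per group versus the 1,000 per group mentioned above. It assumes simple random sampling and worst-case p = 0.5, which the study may or may not satisfy:

    import math

    def margin_of_error(n, p=0.5, z=1.96):
        # Half-width of an approximate 95% confidence interval for a proportion.
        return z * math.sqrt(p * (1 - p) / n)

    for n in (265, 1000):
        print(f"n = {n}: about +/- {margin_of_error(n):.1%}")
    # n = 265 gives roughly +/- 6%; n = 1,000 tightens that to roughly +/- 3%.

A swing of six points either way is enough to muddy at least the closer comparisons quoted above.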

DarkCyber is interested in the “bias” allegation. But DarkCyber may be biased with regard to studies which make it possible to question sample size, training, and data quality/consistency. The models may have flaws, but the bias thing? Maybe, maybe not.

Stephen E Arnold, August 11, 2020

Need Global Financial Data? One Somewhat Useful Site

July 21, 2020

If you need a financial number, you may not have to dig through irrelevant free Web search results or use your Bloomberg terminal to find an “aggregate” function for the category in which you have an interest. Yippy.

Navigate to “All of the World’s Money and Markets in One Visualization.” As you know, my skepticism filter blinks when I encounter the logical Taser “all.” I also like to know where, when, how, and why certain data are obtained. The mechanism for normalizing the data is important to me as well. Well, forget most of those questions.

Look at the Web page. Pick a category. Boom. You have your number.

Accurate? Timely? Verifiable?

Not on the site. But in a “good enough” era of Zoom meetings, a number is available. Just in a picture.

Stephen E Arnold, July 21, 2020

Data Visualizations: An Opportunity Converted into a Border Wall

May 18, 2020

I read “Understanding Uncertainty: Visualizing Probabilities.” The information in the article is useful. Its examples make clear how easy it is to create a helpful representation of certain statistical data.

The opportunity today is to make representations of numeric data, probabilities, and “uncertainty” more easily understandable.

The barrier is that “good enough” visualizations can be output with the click of a mouse. The graphic may be attractive, but it may distort the information allegedly presented in a helpful way.
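
To make the point concrete, here is a minimal sketch (not from the cited article; the numbers are made up) contrasting a one-click bar chart with the same chart once uncertainty intervals are drawn in:

    import matplotlib.pyplot as plt

    groups = ["A", "B", "C"]
    estimates = [0.42, 0.55, 0.48]   # hypothetical estimated probabilities
    halfwidths = [0.08, 0.03, 0.12]  # hypothetical 95% interval half-widths

    fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3), sharey=True)

    # The "good enough" version: bars only, no hint of how uncertain each value is.
    left.bar(groups, estimates)
    left.set_title("Point estimates only")
    left.set_ylabel("Estimated probability")

    # Same numbers with error bars; overlapping intervals temper the comparison.
    right.bar(groups, estimates, yerr=halfwidths, capsize=6)
    right.set_title("With uncertainty intervals")

    plt.tight_layout()
    plt.show()

The honest version costs one extra argument, which suggests the barrier is habit, not effort.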

But appearance may be more important than substance. Need examples? Check out the Covid-19 “charts.” Most of these are confusing and ignore important items of information.

Good enough is not good enough.

Stephen E Arnold, May 18, 2020

Bayesian Math: Useful Book Is Free for Personal Use

May 11, 2020

The third edition of Bayesian Data Analysis (updated on February 13, 2020) is available at this link. The authors are Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. With Bayes’ principles in hand, making sense of some of the modern smart systems becomes somewhat easier. The book covers the basics and advanced computation. One of the more interesting sections is Part V: Nonlinear and Nonparametric Models. You may want to add this to your library.
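
For readers who want a taste before downloading, here is a minimal Bayes’ rule calculation (my own toy numbers, not an example taken from the book):

    def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
        # Bayes' rule for a binary hypothesis H given one piece of evidence E.
        p_evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
        return p_evidence_given_h * prior / p_evidence

    # A detector that flags 90% of true cases but also 5% of non-cases,
    # applied to something with a 1% base rate.
    print(f"{posterior(0.01, 0.90, 0.05):.1%}")  # about 15.4%

A positive call from that “smart” detector is right about 15 percent of the time, which is the sort of arithmetic the book builds on at much greater depth.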

Stephen E Arnold, May 11, 2020

Facebook Is Definitely Evil: Plus or Minus Three Percent at a 95 Percent Confidence Level

March 2, 2020

The Verge Tech Survey 2020 allegedly and theoretically reveals the deepest thoughts, preferences, and perceptions of people in the US. The details of these people are sketchy, but that’s not the point of the survey. The findings suggest that Facebook is a problem. Amazon is a problem. Other big tech companies are problems. Trouble right here in digital city.

The findings come from a survey of 1,123 people “nationally representative of the US.” There was no information about income, group with which the subject identifies, or methodology. But the reported margin of error is plus or minus three percent at a 95 percent confidence level. That sure seems okay despite DarkCyber’s questions about:

  • Sample selection. Who pulled the sample, from where, were the people volunteers, etc.?
  • “Nationally representative” means what? Was it the proportional representation method? How many people from Montana and the other “states”? What about Puerto Rico? Who worked for which company?
  • Plus or minus three percent. That’s a swing at a 95 percent confidence level. In terms of optical character recognition, that works out to three to six errors per page about 95 percent of the time. Is this close enough for a drone strike or an enforcement action? Oh, right, this is a survey about big tech. Big tech doesn’t think the DarkCyber way, right? (A quick consistency check follows this list.)
  • What were the socio-economic strata of the individuals in the sample?
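
For the record, the plus or minus three percent is at least consistent with the textbook formula for 1,123 respondents, assuming a simple random sample and worst-case p = 0.5, which are exactly the assumptions the write up never establishes:

    import math

    n = 1123
    moe = 1.96 * math.sqrt(0.5 * 0.5 / n)  # approximate 95% margin of error
    print(f"+/- {moe:.1%}")  # about +/- 2.9%

The arithmetic is fine; the sampling questions above are the issue.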

What’s revealed or discovered?

First, people love most of the high-profile “names” or “brands.” Amazon is numero uno, the Google is number two, and YouTube (which is the Google, in case you have forgotten) is number three. So far, the data look like a name-recognition test. “Do you prefer this unknown lye soap or Dove?” Yep, people prefer Dove. But lye soap may be making a comeback.

The stunning finding is that Facebook and Twitter impact society in a negative way. Contrast this to lovable Google and Amazon: 72 percent are favorable to the Google and 70 percent are favorable to Amazon.

Here are the data about which companies people trust. Darned amazing. People trust Microsoft and Amazon the most.

Which companies do the homeless and people in rural West Virginia trust?

Plus, 72 percent of the sample believe Facebook has too much “power.” What does power mean? No clue in the context of this survey.

Gentle reader, please examine the article containing these data. I want to go back in time and reflect on the people who struggled in my statistics classes. Painful memories, but I picked up some cash tutoring. I got out of that business because some folks don’t grasp numerical recipes.

Stephen E Arnold, March 2, 2020

Global CIO Survey: Surprises and Yawns

February 23, 2020

IT Brief published a story about a global CIO survey conducted by Logicalis, an IT integration and services company. The data appeared in “44% of CIOs Think Their AI Comprehension Not Very Successful.” Quite a headline. CIOs are able to give themselves a C minus or D plus in understanding artificial intelligence? Interesting. DarkCyber assumed it was As all the way.

Let’s look at some of the findings, and DarkCyber urges you to check out the original story for more of the data.

  • Nine percent of the respondents “believe that their organization is very successful at comprehending the advantages of AI technology”
  • 44 percent “believe their organization is not very successful at all” at comprehending the advantages of AI technology
  • Internet of Things technologies (not defined, by the way) are used for creating new products, creating operational efficiencies, and enhancing existing products.

These data are based on a sample of 888 CIOs. DarkCyber does not know how these individuals were selected, how the questions were administered, or what methods were used to calculate the nice round percentages.

Stephen E Arnold, February 23, 2020

Want Facebook Statistics?

February 19, 2020

If you want a round-up of Facebook statistics, take a look at “Facebook Statistics You Need to Know.” The data come from secondary sources. You may want to verify the factoids before you head to a job interview at Facebook. If you are applying for work at a social media company or a mid-tier consulting firm, go with the numbers. Here are three which DarkCyber noted:

An “okay, boomer” number: People aged 65 and over are the fastest-growing demographic on Facebook

An Amazon wake up call: In the U.S., 15% of social media users use Facebook to shop

TV executive, are you in touch with viewer preferences? Square Facebook videos get 35% more views than landscape videos

No data are presented about the percentage of Mr. Zuckerberg’s neighbors in Palo Alto who dislike him, however.

Stephen E Arnold, February 19, 2020
