Temptation Is Powerful: Will Big AI Tech Take the Bait?

November 12, 2025

Another short essay from a real and still-alive dinobaby. If you see an image, we used AI. The dinobaby is not an artist like Grandma Moses.

I have been using the acronym BAIT for “big AI tech” in my talks. I find it an easy way to refer to the companies with the money and the drive to try to take over the use of software to replace most humans’ thinking. I want to raise a question, “Will BAIT take the bait?”

What is this lower case “bait”? In my opinion, lower case “bait” refers to information people and organizations would consider proprietary, secret, off limits, and out of bounds. Digital data about health, contracts, salaries, inventions, interpersonal relations, and similar categories of information would fall into the category of “none of your business” or something like “it’s secret.”

A calculating predator is about to have lunch. Thanks, Venice.ai. Not what I specified but good enough like most things in 2025.

Consider what happens when a large BAIT outfit gains access to the contents of a user’s mobile device, a personal computer, storage devices, images, and personal communications. What can a company committed to capturing information to make its smart software models more intelligent and better informed learn from these types of data? What if that data acquisition takes place in real time? In an organization or a personal life situation, an individual entity may not be able to cross tabulate certain data. The information is in the organization or the data stream for a household, but it is not connected. Data processing can acquire the information, perform the calculations, and “identify” the significant items. These can be used to predict or guess what response, service, action, or investment can be made.

Microsoft’s efforts with Copilot in Excel raise the possibility and opportunity to examine an organization’s or a person’s financial calculations as part of a routine “let’s make the Excel experience better.” If you don’t know whether your data are local or on a cloud provider’s server, access to that information may not be important to you. But are those data important to a BAIT outfit? I think those data are tempting, desirable, and ultimately necessary for the AI company to “learn.”

One possible solution is for the BAIT company to tap into personal data, offering assurances that these types of information are not fodder for training smart software. Can people resist temptation? Some can. But others, with large amounts of money at stake, can’t.

Let’s consider a recent news announcement and then ask some hypothetical questions. I am just asking questions, and I am not suggesting that today’s AI systems are sufficiently organized to make use of the treasure trove of secret information. I do have enough experience to know that temptation is often hard to resist in a certain type of organization.

The article I noted today (November 6, 2025) is “Gemini Deep Research Can Tap into Your Gmail and Google Drive.” The write up reports what I assume to be accurate data:

After adding PDF support in May [2025], [Google] Gemini Deep Research can now directly tap information stored in your Gmail and Google Chat conversations, as well as Google Drive files…. Now, [Google] Deep Research can “draw on context from your [Google] Gmail, Drive and Chat and work it directly into your research.” [Google] Gemini will look through Docs, Slides, Sheets and PDFs stored in your Drive, as well as emails and messages across Google Workspace. [Emphasis added by Beyond Search for clarity]

Can Google resist the mouth watering prospect of using these data sources to train its large language models and its experimental AI technology?

There are some other hypotheticals to consider:

What informational boundaries is Google allegedly crossing with this omnivorous approach to information?
How can Google put meaningful barriers around certain information to prevent data leakage?
What recourse do people or organizations have if Google’s smart software exposes sensitive information to a party not authorized to view these data?
How will Google’s advertising algorithms use such data to shape or weaponize information for an individual or an organization?
Will individuals know when a secret has been incorporated in a machine generated report for a government entity?

Somewhere in my reading I recall a statement attributed to Napoleon. My recollection is that in his letters or some other biographical document about Napoleon’s approach to war, he allegedly said something like:

Information is nine tenths of any battle.

The BAIT organizations are moving with purpose and possibly extreme malice toward systems and methods that give them access to data never meant to be used to train smart software. If Copilot in Excel happens and if Google processes data in its grasp, will these types of organizations be able to resist juicy, unique, high-calorie morsels of zeros and ones?

I am not sure these types of organizations can or will exercise self control. There is money and power and prestige at stake. Humans have a long track record of doing some pretty interesting things. Is this omnivorous taking of information wrapped in making one’s life easier an example of overreach?

Will BAIT outfits take the bait? Good question.

Stephen E Arnold, November 12, 2025

Written by Stephen E. Arnold · Filed Under AI, Business strategy, News


Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.