ChatGPT, Claude, and Gemini: Where Your Data Actually Goes (And How to Stop It)


Your staff are already using AI. The question is whether you know about it — and whether you know where their conversations are going.

Right now, somebody in your organisation is pasting client data into ChatGPT. Someone else is summarising a confidential report with Claude. A third person is using Gemini to draft a proposal with commercially sensitive figures. They are not being reckless. They are being productive. But every one of those conversations is being stored, processed, and — on most plans — used to train the next generation of AI models.

This is not a theoretical risk. It is happening today, across every sector, in businesses of every size. And the data policies of the three largest AI providers are far less protective than most people assume.

The good news? You do not have to choose between AI productivity and data safety. You can have both. Here is how.

76% of UK organisations have shadow AI usage
£670,000 average added cost per breach from shadow AI
5 years: Claude’s data retention when training is enabled

The Shadow AI Problem

Shadow AI is the term for employees using AI tools without IT knowledge or approval. It is not new — shadow IT has been a headache for decades — but AI has made it exponentially more dangerous. When someone uses an unauthorised spreadsheet, the risk is limited. When someone pastes a client contract into a public AI tool, that data enters a training pipeline you have zero control over.

Research from IBM’s 2026 Cost of a Data Breach Report found that 76% of UK organisations identify shadow AI as a definite or probable challenge. The same report found that shadow AI incidents add approximately £670,000 to the average cost of a data breach. That is not the total cost; it is the additional cost on top of an already expensive incident.

Why it happens

Staff are not being malicious — they are being productive

Most shadow AI usage comes from well-intentioned employees trying to work faster. They sign up for free ChatGPT accounts, paste in client documents for summarisation, use Claude to draft emails, or feed Gemini with financial data for analysis. They do not read the terms of service. They do not think about where the data goes. The answer is not to ban AI — it is to give your team a better way to use it.
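One practical "better way" is to put a managed layer between staff and any public model, so obvious identifiers never leave the building. As a minimal illustration only, here is what a pre-submission scrub might look like; the patterns below are hypothetical placeholders, not a production PII detector:

```python
import re

# Illustrative patterns only. A real deployment would use proper PII
# detection; these regexes are assumptions for the sake of the sketch.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "uk_phone": re.compile(r"\b(?:\+44\s?|0)\d{4}\s?\d{6}\b"),
    "account_ref": re.compile(r"\b[A-Z]{2}\d{6}\b"),  # a hypothetical client reference format
}

def scrub(text: str) -> str:
    """Replace anything matching a known pattern with a placeholder tag."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()} REDACTED]", text)
    return text

print(scrub("Invoice for client AB123456, contact jane@example.co.uk"))
# -> Invoice for client [ACCOUNT_REF REDACTED], contact [EMAIL REDACTED]
```

The point is not the regexes; it is that a gateway you control can enforce policy automatically, which a ban never will.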

OpenAI (ChatGPT): What Happens to Your Data

OpenAI’s data policies depend entirely on which plan you are using. The differences are significant — and the default settings on consumer plans should concern any business handling sensitive data.

Consumer plans: Free, Plus, and Pro

Training: ChatGPT trains on your conversations by default. There is an opt-out toggle buried in settings, but enabling it disables your chat history. Most users never find it, and fewer still enable it.

Retention: Conversations on consumer plans are retained indefinitely. OpenAI was court-ordered to preserve all chat logs from May to September 2025 as part of the New York Times copyright lawsuit — a reminder that your conversations can become legal evidence.

Data residency: UK data residency is available only for Enterprise customers. Consumer plan data is processed and stored in the United States.

Advertising: As of February 2026, OpenAI is running advertisements on the free tier. Your conversations now exist alongside an ad-supported business model.

Business plans: Team, Enterprise, and API

Training: OpenAI does not train on data from Team, Enterprise, or API usage. This is a contractual guarantee backed by a Data Processing Agreement.

Retention: Admin-controlled retention policies. Enterprise customers can set their own data lifecycle rules.

OpenAI certifications

ISO 27001 • SOC 2 Type 2

OpenAI holds ISO 27001 and SOC 2 Type 2 certifications for its enterprise offerings. These are meaningful security baselines, but they do not change the fundamental data handling on consumer plans. The models themselves — GPT-4o, o3, o4-mini — are exceptionally capable. The problem is not the intelligence. It is where the data goes.

Anthropic (Claude): What Happens to Your Data

Anthropic has positioned Claude as the “safety-first” AI company. But a significant privacy policy change in September 2025 and two data leaks in early 2026 have complicated that narrative.

Consumer plans: Free, Pro, and Max

Training: Since September 2025, Anthropic trains on Free, Pro, and Max conversations by default. This was a notable pivot — prior to that date, Anthropic did not train on user conversations. The change was buried in a terms of service update that most users did not notice.

Retention: When training is enabled, Anthropic retains conversation data for up to 5 years. That is one of the longest retention periods of any major AI provider.

Data residency: All Claude data is stored in the United States. There is no UK or EU data residency option for any Anthropic plan. The only workaround is accessing Claude through AWS Bedrock in an EU region, but this requires technical implementation and is not available to most business users.

Business plans: Work, Enterprise, and API

Training: Claude for Work, Enterprise, and API tiers do not train on your data. API retention is limited to 7 days for trust and safety purposes.

Recent incidents

Two data leaks in March 2026 • $1.5B copyright settlement

In March 2026, Anthropic experienced two separate data exposure events: one involving the Mythos model’s training data and another involving Claude Code source material. Additionally, Anthropic reached a $1.5 billion copyright settlement. Despite these incidents, Claude remains one of the most intelligent models available — Opus 4 is widely considered the strongest reasoning model on the market. Again, the issue is not capability. It is data control.

Google (Gemini): What Happens to Your Data

Google’s Gemini has the broadest distribution of any AI assistant — it is embedded in Gmail, Docs, Search, and Android. That reach makes its data policies particularly important to understand.

Consumer plans: Free Gemini

Training: Google trains on free Gemini conversations by default. There is no ambiguity here — it is stated clearly in their terms.

Retention: Conversations on the free tier are subject to a 3-year human review retention period. That means Google employees can read your conversations for quality assurance and model improvement for up to three years after you send them.

Business plans: Workspace and Enterprise

Training: Google Workspace plans include a contractual guarantee that Gemini does not train on your data. This is backed by a Data Processing Agreement (DPA) — something the consumer tier does not offer.

Data residency: UK data residency has been available since February 2026 for Enterprise and Workspace customers. This is a genuine advantage over Anthropic, which offers no European residency at all.

GeminiJack: the zero-click exploit

Zero-click corporate data exfiltration — June 2025

In June 2025, security researchers disclosed “GeminiJack” — a zero-click exploit that allowed attackers to exfiltrate corporate data through Gemini without any user interaction. The attack leveraged Gemini’s deep integration with Google Workspace to extract sensitive documents. Google patched the vulnerability, but the incident demonstrated the risks of tight AI-productivity suite integration.

Certifications: Google holds ISO 27001, ISO 42001, SOC 2, and FedRAMP certifications — the broadest certification portfolio of the three providers.

Side-by-Side Comparison

Here is how the three providers stack up on the policies that matter most to UK businesses.

| Policy | ChatGPT (OpenAI) | Claude (Anthropic) | Gemini (Google) |
|---|---|---|---|
| Training (free tier) | Yes, with an opt-out toggle | Yes, since September 2025 | Yes, by default |
| Training (enterprise) | No (contractual) | No (contractual) | No (contractual + DPA) |
| UK data residency | Enterprise only | None (US only) | Enterprise and Workspace |
| Retention (free tier) | Indefinite | 5 years (if training is on) | 3-year human review |
| Retention (enterprise) | Admin-controlled | 7-day API / admin-set | Admin-controlled |
| Recent incidents | NYT lawsuit log preservation; ads on free tier | Two March 2026 data leaks; $1.5B copyright settlement | GeminiJack zero-click exploit; Personal Intelligence class-action |

The pattern is clear: every provider is safe at the enterprise tier and unsafe at the consumer tier. The problem is that your employees are not using the enterprise tier. They are using free accounts with their personal email addresses.

But here is the thing most articles miss: these are brilliant models. ChatGPT, Claude, and Gemini represent the most capable AI ever built. The intelligence is not the problem. The data handling is. And that is a solvable problem.

Nerdster.ai

We give you the best AI models — without the data risk

We help UK businesses use ChatGPT, Claude, Gemini, and open-source models safely. Custom AI agents, private LLMs, and full data control. Book a free 30-minute consultation.

How Nerdster.ai Solves This

At Nerdster.ai, we believe you should not have to choose between AI capability and data safety. Every major AI model — GPT-4o, Claude Opus, Gemini Pro, Llama, Mistral — is available through secure APIs and private hosting. The question is not which model to use. It is how to use it without exposing your clients’ data.

That is exactly what we do. We work with UK professional services firms to deploy AI that is as intelligent as anything you can get from ChatGPT or Claude — but where your data never leaves your control.

We help you choose and use the best AI models

Not every task needs the same model. A contract review might need Claude’s deep reasoning. A quick email draft might work best with GPT-4o. A coding task might suit an open-source model running locally. We help you understand which model fits which workload — and set up the secure infrastructure to use them all without worrying about data exposure.
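As a rough sketch of what per-workload routing can look like in practice, here is the idea in a few lines; the model identifiers and task labels are illustrative assumptions, not a fixed mapping:

```python
# A minimal routing sketch: map workload types to model endpoints.
# All names below are placeholders for whatever your deployment exposes.
ROUTES = {
    "contract_review": "claude-opus",  # deep reasoning
    "email_draft": "gpt-4o",           # fast general drafting
    "code_task": "llama-local",        # open-source model on local hardware
}

DEFAULT_MODEL = "gpt-4o"

def route(task_type: str) -> str:
    """Return the model assigned to a task type, falling back to a default."""
    return ROUTES.get(task_type, DEFAULT_MODEL)

print(route("contract_review"))  # -> claude-opus
```

In a real deployment the routing table lives in configuration, so switching a workload to a cheaper or more private model is a one-line change rather than a retraining exercise.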

We build custom AI agents for your workloads

Generic AI tools are powerful but inefficient. They do not know your processes, your templates, your compliance requirements, or your clients. A custom AI agent does.

These agents run on the most capable AI models available — but they are configured to work within your data boundaries. No client information leaves your environment. Every interaction is logged. Every output is traceable.

We offer your own private LLM — locally or privately hosted

For firms that need absolute data control, we deploy private large language models that run entirely on your infrastructure. This is not a watered-down version of AI. Modern open-source models like Llama 4 and Mistral Large rival the performance of ChatGPT and Claude on most business tasks — and they run on hardware you own.
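For a sense of how little changes from a developer's point of view, here is a minimal sketch of querying a privately hosted model, assuming the server exposes an OpenAI-compatible chat endpoint (as vLLM and Ollama both do); the host, port, and model name are placeholders:

```python
import json
from urllib import request

# Placeholder endpoint for a locally hosted model serving the
# OpenAI-compatible chat completions API.
ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_request(prompt: str, model: str = "llama-local") -> request.Request:
    """Construct the HTTP request; nothing leaves your network."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return request.Request(ENDPOINT, data=payload,
                           headers={"Content-Type": "application/json"})

def extract_reply(raw: str) -> str:
    """Pull the assistant message out of a chat-completions response body."""
    return json.loads(raw)["choices"][0]["message"]["content"]

# To actually send it (requires the local server to be running):
#   with request.urlopen(build_request("Summarise this clause: ...")) as resp:
#       print(extract_reply(resp.read().decode()))
```

Because the API shape matches the public providers, tooling written against ChatGPT largely carries over to a private model unchanged.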

What private hosting looks like

Your data never leaves — full stop

On-premises: We install and configure AI models on your existing servers or dedicated GPU hardware in your office. The models run on your network. Nothing touches the internet. Ideal for law firms, financial advisers, and healthcare providers handling the most sensitive client data.

Private cloud

Enterprise-grade AI in your own cloud environment

Dedicated cloud: We deploy AI models in your own AWS, Azure, or Google Cloud tenancy — in the UK region of your choice. You get the scalability of cloud computing with the data isolation of on-premises. All traffic stays within your virtual private cloud. No shared infrastructure. No third-party data access.

Air-gapped deployment

Zero internet connection — zero exfiltration risk

Air-gapped: For maximum security, we deploy air-gapped AI systems on hardware with no internet connection whatsoever. The AI processes your data in a sealed environment. No network traffic in or out. This is the gold standard for firms handling privileged legal matters, financial due diligence, or classified information.

With any of these options, you get the full capability of modern AI — document analysis, summarisation, drafting, coding, research, translation — without a single byte of client data leaving your control. Every query and every response is logged within your systems, giving you a complete audit trail for regulators.
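A local audit trail of that kind can be very simple. The sketch below appends every prompt and response to a JSONL log with a timestamp and content hashes; the file location and field names are assumptions for illustration:

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path

# Illustrative log location; a real deployment would use a managed,
# access-controlled store rather than the temp directory.
LOG_PATH = Path(tempfile.gettempdir()) / "ai_audit.jsonl"

def log_interaction(user: str, prompt: str, response: str) -> dict:
    """Append one prompt/response pair to the local audit log."""
    entry = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "user": user,
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "response_sha256": hashlib.sha256(response.encode()).hexdigest(),
        "prompt": prompt,       # store full text only if policy permits
        "response": response,
    }
    with LOG_PATH.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

The hashes let you prove later that a logged interaction has not been altered, which is exactly the kind of evidence a regulator asks for.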

What This Means for Your Firm

If you operate in a regulated sector — legal, financial services, healthcare, accounting — the implications of public AI data policies are severe. But the solution is straightforward.

The regulatory landscape

“The risk is not that AI is dangerous. The risk is that uncontrolled AI creates liability you cannot manage, audit, or defend to a regulator. Private AI eliminates that risk entirely.”

Getting Started with Nerdster.ai

You do not need to solve this overnight. But you do need to start. Here is how working with us typically looks.

1. Free AI audit (30 minutes)

We map your AI exposure — no cost, no commitment

We identify which AI tools your staff are using, what data is being shared, and where your compliance gaps are. You get a clear picture of your current shadow AI risk — and a practical roadmap for fixing it.

2. Model selection and agent design

We match the best AI to your specific workloads

We evaluate your workflows and recommend which AI models and custom agents will deliver the most value. Contract review? Client intake? Research? Reporting? We design agents tailored to how your team actually works — not generic chatbots.

3. Secure deployment and training

Your team gets AI that is better than what they were using — and completely safe

We deploy your AI solution, whether that is secure API access, custom agents, a private LLM, or an air-gapped system, and train your team to use it. The result: shadow AI disappears, because the approved alternative is faster, smarter, and easier to use.

The Bottom Line

ChatGPT, Claude, and Gemini are extraordinary tools. They are also, on their consumer plans, extraordinary risks for any business that handles client data. The default settings train on your conversations. The retention periods are measured in years. The data residency is overwhelmingly US-based. And none of them offer a Data Processing Agreement on free or personal plans.

But the intelligence behind these models is available to you — privately, securely, and on your terms. You can use the most capable AI ever built without worrying about where your client data ends up. You can build custom AI agents that know your business and your compliance requirements. You can run your own private LLM that never connects to the internet.

That is what Nerdster.ai does. We help UK businesses understand, choose, and deploy AI safely. If that sounds like what your firm needs, let’s talk.

Frequently Asked Questions

Does ChatGPT train on my data?
Yes. On Free, Plus, and Pro plans, ChatGPT trains on your conversations by default. You can opt out via a toggle in settings, but this disables your chat history. OpenAI Team, Enterprise, and API tiers do not train on your data and offer admin-controlled retention policies. Nerdster.ai can set up secure enterprise API access so your team gets ChatGPT’s full capability with proper data protections.
Is Claude safe for client data?
Not on consumer plans. Since September 2025, Anthropic trains on Free, Pro, and Max conversations by default, with a 5-year retention period. All data is stored in the US only. However, Claude remains one of the most capable AI models available. Nerdster.ai deploys Claude via secure enterprise API or AWS Bedrock, giving you Claude’s intelligence with full data control and UK hosting options.
What is shadow AI and how do I stop it?
Shadow AI refers to employees using AI tools without IT knowledge or approval. 76% of UK organisations are affected, adding approximately £670,000 to average breach costs. The most effective way to stop shadow AI is to replace it with something better — approved AI tools that are faster, smarter, and safer than free alternatives. Nerdster.ai helps you deploy exactly that.
What is a private LLM and can my firm use one?
A private LLM is a large language model that runs entirely on your own infrastructure — on-premises servers, your own cloud tenancy, or air-gapped hardware. Modern open-source models like Llama 4 and Mistral Large deliver performance comparable to ChatGPT and Claude on most business tasks. Nerdster.ai handles the entire deployment, from hardware selection to model configuration to staff training.
Can Nerdster.ai build custom AI agents for my business?
Yes. We design and build AI agents tailored to your specific workflows — document review, client intake, research, reporting, and more. These agents use the best available AI models but run within your data boundaries, with full audit trails and compliance built in from day one.
Nerdster.ai

The best AI models.
Your data stays put.

Custom AI agents, private LLMs, and secure model access — built for UK professional services. Book a free 30-minute consultation to see how we can help your firm.