Fundamentals · 6 min read

Can an AI repeat what you typed?

Yes, and it's documented: large models memorize part of their data and can emit fragments verbatim — PII included. What the research actually shows.

By Pierre de ONYRI

Yes, it's possible — and the research documents it. Large language models “memorize” part of their training data and can, under certain conditions, emit fragments of it verbatim, including personal information. The effect was first measured on GPT-2, then confirmed on production models like ChatGPT. The practical takeaway is simple: what you paste into an AI doesn't necessarily “get lost in the crowd” — the best protection is to not send it in the clear in the first place.

What “memorize” means

A model is trained to predict the continuation of a text. For some sequences, especially ones seen multiple times, it doesn't just learn a style: it retains the exact passage. In 2021, a team led by Nicholas Carlini showed you could query GPT-2 to make it emit hundreds of verbatim sequences from its training data, including names, email addresses, phone numbers and identifiers. Nobody had set out to expose that data through the model: it resurfaced because the model had memorized it.
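To make “verbatim” concrete, here is a minimal sketch of one simple way to check whether a model's output shares an exact run of tokens with a known reference text. The function name `shared_ngrams`, the whitespace tokenization and the window size of 5 are illustrative assumptions, not the procedure used in the studies cited here.

```python
from typing import Set, Tuple


def shared_ngrams(output: str, reference: str, n: int = 5) -> Set[Tuple[str, ...]]:
    """Return every run of n consecutive tokens that appears in both texts."""
    def ngrams(text: str) -> Set[Tuple[str, ...]]:
        tokens = text.split()  # naive whitespace tokenization (assumption)
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    return ngrams(output) & ngrams(reference)


# Made-up example strings: a "training" sentence and a model reply.
reference = "for support write to jane.doe@example.com or call 555-0100 today"
output = "Sure! You can write to jane.doe@example.com or call them later"

matches = shared_ngrams(output, reference, n=5)
print(len(matches))  # 1: the shared run "write to jane.doe@example.com or call"
```

A longer window makes accidental matches less likely, at the cost of missing short leaked fragments.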

Memorization grows with model size

A follow-up study (Carlini et al., ICLR 2023) quantified the effect and found three clear relationships: memorization grows with model size, with how many times an example is duplicated in the data, and with the length of the context used to prompt the model. The authors' conclusion: memorization is more widespread than previously believed, and it tends to get worse as models scale — absent active mitigations.
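Two of these relationships, duplication and context length, can be illustrated with a toy count-based model. This is only a sketch under strong assumptions: the corpus, the `ORDER` of 2 and all function names are invented for the example, and a lookup table of counts is not a neural language model. The effect of model size cannot be shown this way.

```python
from collections import Counter, defaultdict

ORDER = 2  # the toy model conditions on the last 2 tokens only


def train(corpus):
    """Count which token follows each ORDER-token context."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for i in range(len(tokens) - ORDER):
            context = tuple(tokens[i:i + ORDER])
            counts[context][tokens[i + ORDER]] += 1
    return counts


def greedy_continue(counts, prefix, n):
    """Extend prefix by up to n tokens, always taking the most frequent successor."""
    out = list(prefix)
    for _ in range(n):
        followers = counts.get(tuple(out[-ORDER:]))
        if not followers:
            break
        out.append(followers.most_common(1)[0][0])
    return out[len(prefix):]


def extractable(counts, example, prefix_len):
    """True if prompting with the first prefix_len tokens reproduces the rest verbatim."""
    tokens = example.split()
    suffix = tokens[prefix_len:]
    return greedy_continue(counts, tokens[:prefix_len], len(suffix)) == suffix


SECRET = "my key is A1 B2 C3"
DISTRACTOR = "my key is lost again today"


def corpus(duplicates):
    """The secret repeated `duplicates` times, plus three copies of a distractor."""
    return [SECRET] * duplicates + [DISTRACTOR] * 3


seen_once = train(corpus(1))
seen_often = train(corpus(5))

print(extractable(seen_once, SECRET, 3))   # False: the distractor is more frequent
print(extractable(seen_often, SECRET, 3))  # True: duplication tips the balance
print(extractable(seen_once, SECRET, 4))   # True: a longer prompt pins down the secret
```

In this toy setting the secret seen once is not reproduced from a three-token prompt, but it comes back either when it is duplicated or when the prompt is one token longer.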

[Figure: data typed by a user enters a model; later, a stranger's query makes the same fragment come back out verbatim. Below, three rising bars: memorization grows with model size, duplication and context length. After Carlini et al. (2021, 2023) and Nasr et al. (2023).]

And on ChatGPT, not just old models?

Yes. In late 2023, a team including several of the same researchers (Nasr et al.) described a “divergence attack” that pushes a production model, ChatGPT included, to break out of its chatbot behavior and emit training data at roughly 150 times the rate seen when it behaves normally. The point of that research isn't to help attack a model, but to prove an uncomfortable fact: alignment and the “assistant” format don't remove memorization, they hide it.

Among the kinds of content these studies saw resurface from a model:

  • personal information (names, emails, phone numbers);
  • code snippets and technical identifiers;
  • whole passages of text, copied verbatim from training.

Study | Model tested | What it shows
--- | --- | ---
Carlini et al., 2021 | GPT-2 | Verbatim extraction of training sequences, including names, emails and phone numbers
Carlini et al., 2023 (ICLR) | GPT-Neo family, etc. | Memorization grows with model size, duplication and context length
Nasr et al., 2023 | ChatGPT (production) | A “divergence attack” makes training data resurface ~150× more often

Three peer-reviewed works, from open model to production model.

What this changes for you

Today's prompts aren't yesterday's training set, but the principle holds for anything that can be retained and later reused to improve a model. If sensitive data enters a system, assume it can one day come back out in a way you don't control. Hence three common-sense rules:

  1. Treat every prompt as potentially retained, not ephemeral.
  2. Never paste in the clear what you wouldn't want resurfacing: identities, identifiers, keys, secrets.
  3. Remove the sensitive data before sending, instead of hoping it “dilutes.”

That's exactly what ONYRI Sanitize does: detect the sensitive data in your text and replace it with reversible tokens before sending, then restore the answer in your browser. The model never sees the real information — so it can neither memorize it nor emit it. Detection and the token↔value mapping stay in your browser.
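The detect, replace and restore cycle described above can be sketched in a few lines. This is a hypothetical illustration of the general technique, not ONYRI Sanitize's implementation: the two regular expressions, the `[LABEL_n]` token format and the function names `sanitize` and `restore` are all assumptions, and real detection needs far broader coverage than two patterns.

```python
import re

# Illustrative patterns only (assumption); they will miss many real-world formats.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def sanitize(text):
    """Replace detected values with tokens; return the masked text and the mapping."""
    mapping = {}          # token -> original value, kept locally and never sent
    tokens_by_value = {}  # original value -> token, so a repeated value reuses its token

    for label, pattern in PATTERNS.items():
        def swap(match, label=label):
            value = match.group(0)
            if value not in tokens_by_value:
                token = f"[{label}_{len(tokens_by_value) + 1}]"
                tokens_by_value[value] = token
                mapping[token] = value
            return tokens_by_value[value]

        text = pattern.sub(swap, text)
    return text, mapping


def restore(text, mapping):
    """Put the original values back into a reply that contains tokens."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


prompt = "Email jane.doe@example.com or call +1 415 555 0100 about invoice 42."
masked, mapping = sanitize(prompt)
print(masked)  # Email [EMAIL_1] or call [PHONE_2] about invoice 42.

# A made-up model reply that only ever saw the tokens.
reply = "I will write to [EMAIL_1] and phone [PHONE_2]."
print(restore(reply, mapping))
# I will write to jane.doe@example.com and phone +1 415 555 0100.
```

Only `masked` would leave the machine in this sketch; `mapping` stays local, which is what makes the tokens reversible for you and meaningless to the model.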

Frequently asked questions

Can an AI really emit my data verbatim?
Peer-reviewed work has shown it on GPT-2 and then on production models like ChatGPT: a model can return memorized fragments verbatim, including personal information. The risk grows with model size and with how often the data is duplicated.

Are my prompts used to train the model?
It depends on the provider and your settings, and it can change. The safest move is not to rely on it: treat every prompt as potentially retained and never include sensitive data in the clear.

How do I keep sensitive data from being memorized?
By not sending it. An anonymization engine detects the sensitive data and replaces it with a reversible token before sending; the model only receives neutralized text, so your real data is never there to memorize.


