Anonymization vs pseudonymization vs tokenization

Anonymization, pseudonymization, tokenization: three often-confused terms. Clear definitions, GDPR status and concrete consequences for AI.

By Pierre de ONYRI

In short: pseudonymization replaces an identifier but keeps a re-identification key (the data remains “personal” under the GDPR); anonymization makes re-identification reasonably impossible (the data falls outside the GDPR); tokenization is a reversible replacement technique that serves either one, depending on where the mapping table lives. For an AI prompt, the goal is clear: the external service receives only tokens, and the key never reaches it.

Pseudonymization: reversible, so still “personal”

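Pseudonymization (GDPR Article 4(5)) replaces direct identifiers with pseudonyms, while additional information kept separately (a key or a lookup table) still lets you map back to the person. As long as that key exists, the data remains personal data: the GDPR treats pseudonymization as a safeguard (Articles 25 and 32), which lightens the regime, but the data stays subject to the regulation.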

Anonymization: irreversible, outside the GDPR

Anonymization aims to make re-identification reasonably impossible, taking into account all the means reasonably likely to be used (GDPR Recital 26). If it is effective, the data is no longer “personal” and falls outside the GDPR. It's a demanding standard: surface masking isn't enough if cross-referencing with other data can still reach the person.
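To make the cross-referencing risk concrete, here is a minimal sketch in Python. The records, field names and the "public" list are all invented for illustration; the point is only that removing names does not help when a combination of remaining attributes is unique.

```python
from collections import Counter

# Hypothetical "masked" records: names removed, quasi-identifiers kept.
masked = [
    {"zip": "75011", "birth_year": 1984, "diagnosis": "asthma"},
    {"zip": "75011", "birth_year": 1984, "diagnosis": "diabetes"},
    {"zip": "69003", "birth_year": 1990, "diagnosis": "migraine"},
]

# Hypothetical auxiliary data someone else could hold (a directory, a leak...).
public = [
    {"name": "Claire Martin", "zip": "69003", "birth_year": 1990},
    {"name": "Marc Petit", "zip": "75011", "birth_year": 1984},
]


def quasi_id(row: dict) -> tuple:
    """Combination of attributes that can single a person out."""
    return (row["zip"], row["birth_year"])


# How many masked records share each combination.
group_sizes = Counter(quasi_id(row) for row in masked)

# A person is re-identified when their combination matches exactly one record.
reidentified = {
    person["name"]: row["diagnosis"]
    for person in public
    for row in masked
    if quasi_id(person) == quasi_id(row) and group_sizes[quasi_id(row)] == 1
}
```

In this toy dataset, the record with a unique zip code and birth year is linked back to a name, while the two records sharing the same combination stay ambiguous. Real anonymization assessments are far more involved; this only shows why masking names alone falls short.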

Tokenization: the technique, not the status

Tokenization replaces a value with a token and keeps a mapping to restore it later. It's a technique, not a legal category: its status depends on who holds the mapping table. The key question is: where does that table live?
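As a rough sketch of the technique, assuming a simple in-memory mapping table: the `TokenVault` class, its method names and the token format below are hypothetical, chosen for the example rather than taken from any particular product.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory mapping table between tokens and original values."""

    def __init__(self) -> None:
        self._token_by_value: dict[str, str] = {}
        self._value_by_token: dict[str, str] = {}

    def tokenize(self, value: str, kind: str = "VALUE") -> str:
        # Reuse the existing token so the same value always gets the same token.
        if value in self._token_by_value:
            return self._token_by_value[value]
        # Random suffix: the token carries no information about the value.
        token = f"<{kind}_{secrets.token_hex(4)}>"
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def restore(self, token: str) -> str:
        # Only whoever holds the vault can perform this lookup.
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenize("alice@example.com", kind="EMAIL")
original = vault.restore(token)
```

Whoever holds `vault` can go from token back to value; anyone who only sees `token` cannot. That is why the location of the table, not the token itself, decides the status of the data.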

What it changes for your prompts

  • Sending pseudonymized text WITH the key: the data stays personal on the recipient's side.
  • Sending tokenized text WITHOUT the key: the recipient gets nothing to re-identify with.
  • Keeping the key on the client: you restore the answer locally without ever exposing the key.

ONYRI Sanitize applies exactly this principle: the token ↔ value mapping stays in browser memory and never travels to the backend. The AI service only sees tokens; you alone restore the answer.
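The flow can be sketched as follows. This is an illustration under stated assumptions, not ONYRI Sanitize's actual code: the email-only detection regex, the `[EMAIL_n]` token format, and the `fake_ai_service` stub standing in for the external service are all hypothetical.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def sanitize(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace each email with a token; return the safe text and the local mapping."""
    mapping: dict[str, str] = {}  # token -> original value, never sent anywhere

    def to_token(match: re.Match) -> str:
        value = match.group(0)
        # Reuse the token if this value was already seen.
        for token, known in mapping.items():
            if known == value:
                return token
        token = f"[EMAIL_{len(mapping) + 1}]"
        mapping[token] = value
        return token

    return EMAIL_PATTERN.sub(to_token, prompt), mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back into the answer, on the client side."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


def fake_ai_service(safe_prompt: str) -> str:
    """Stand-in for the external service: it only ever receives tokens."""
    return f"Draft ready: {safe_prompt}"


prompt = "Write to alice@example.com and copy bob@example.org."
safe_prompt, mapping = sanitize(prompt)      # mapping stays local
answer = restore(fake_ai_service(safe_prompt), mapping)
```

Only `safe_prompt` crosses the boundary to the service; `mapping` is created and used on the client, so the restored `answer` is the only place where tokens and real values meet again.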

Frequently asked questions

Is pseudonymized data covered by the GDPR?
Yes. As long as a re-identification key exists, the data remains personal and subject to the GDPR, even if the regime is lighter than for clear-text data.

Is tokenization enough to “anonymize”?
Not on its own: it all depends on the mapping table. If the table is never transmitted to the recipient, that recipient gets nothing to re-identify with, which directly serves data minimization.

Why aim for anonymization rather than pseudonymization for a prompt?
Because the recipient (the AI service) must receive no means to re-identify. By keeping the key on the client, you hand it the equivalent of anonymized data from its standpoint.
