Fundamentals · 6 min read

Sensitive data: United States vs France, what changes

SSN, EIN, ZIP in the US; social security number, SIREN, IBAN in France. National identifiers differ by country, which is why detection must adapt.

By Alexis de ONYRI

Sensitive data doesn't take the same shape in the United States and in France: a US SSN (9 digits) is nothing like a French social security number (15 digits), and an EIN or a ZIP code does not map one-to-one onto the French formats (SIREN, postal code, IBAN). The practical consequence: an effective detection engine must be country-aware, adapting its rules to the country. Otherwise it misses real identifiers and multiplies false positives.

Personal identifiers: different formats

  • United States: SSN (9 digits), sometimes formatted 123-45-6789.
  • France: social security number (15 digits, with a check key).
  • Phone, address, date: distinct conventions and formats (MM/DD vs DD/MM).
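To make these differences concrete, here is a minimal Python sketch. The pattern names, the function names and the simplified regular expressions are assumptions made for illustration, not the rules of any particular product. It shows a shape-only pattern for a formatted US SSN, a pattern plus check-key validation for the French number (key = 97 minus the first 13 digits modulo 97, with Corsican departments 2A and 2B computed as 19 and 18), and how the same numeric date reads differently in each country.

```python
import re
from datetime import date

# Hypothetical, simplified patterns; a production detector would be stricter.
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Sex, year, month, department (digits or 2A/2B), commune, order number, key.
# Simplification: only a leading 1 or 2 is accepted.
FR_NIR = re.compile(
    r"\b[12] ?\d{2} ?\d{2} ?(?:\d{2}|2[AB]) ?\d{3} ?\d{3} ?\d{2}\b"
)


def nir_key_ok(raw: str) -> bool:
    """Check the 2-digit key of a 15-character French social security number."""
    compact = raw.replace(" ", "").upper()
    # Corsican departments use letters: 2A is computed as 19, 2B as 18.
    compact = compact.replace("2A", "19", 1).replace("2B", "18", 1)
    if re.fullmatch(r"[0-9]{15}", compact) is None:
        return False
    body, key = int(compact[:13]), int(compact[13:])
    return key == 97 - body % 97


def parse_numeric_date(text: str, country: str) -> date:
    """Read 'a/b/yyyy' as MM/DD for the US and as DD/MM otherwise."""
    first, second, year = (int(part) for part in text.split("/"))
    month, day = (first, second) if country == "US" else (second, first)
    return date(year, month, day)
```

With these definitions, `nir_key_ok("2 69 05 49 588 157 80")` returns `True` while the same number ending in `81` is rejected, and `parse_numeric_date("03/04/2024", "US")` gives March 4 where the `"FR"` reading gives April 3.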

Corporate and tax identifiers

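  • United States: EIN (employer ID, 9 digits, usually written 12-3456789), bank routing/account numbers.
  • France: SIREN (9 digits) / SIRET (14 digits), IBAN, tax identifier.
  • Addresses: ZIP code (5 digits, optionally extended to ZIP+4) vs French postal code (also 5 digits, but the first two generally encode the département).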
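Several of these identifiers carry a checksum, which a detector can use to separate real identifiers from look-alike digit strings. The sketch below is illustrative only: the function names and the `EIN_SHAPE` pattern are assumptions, and the sample values are used simply because they pass their checksums. It applies the Luhn checksum to a SIREN, the ISO 13616 modulo-97 check to an IBAN, and a shape-only pattern to an EIN, which has no comparable check key to test here.

```python
import re

# Shape-only pattern: two digits, a hyphen, seven digits.
EIN_SHAPE = re.compile(r"\b\d{2}-\d{7}\b")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum over a string of ASCII digits."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def is_siren(candidate: str) -> bool:
    """Nine digits (spaces allowed) that also satisfy the Luhn checksum."""
    compact = candidate.replace(" ", "")
    return re.fullmatch(r"[0-9]{9}", compact) is not None and luhn_ok(compact)


def iban_ok(candidate: str) -> bool:
    """ISO 13616 check: move the first 4 characters to the end, then mod 97."""
    compact = candidate.replace(" ", "").upper()
    if re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}", compact) is None:
        return False
    rearranged = compact[4:] + compact[:4]
    # Letters become numbers A=10 ... Z=35, which is what int(ch, 36) yields.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

Here `is_siren("732 829 320")` and `iban_ok("FR76 3000 6000 0112 3456 7890 189")` both return `True`, and changing the last digit of either value makes the check fail. A 14-digit SIRET can generally be checked with the same `luhn_ok` helper.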

Why detection must be country-aware

Applying French rules to a US text (or vice versa) creates two problems: leaks (an unrecognized SSN isn't masked, which hurts recall) and noise (digit strings wrongly taken for identifiers, which hurts precision). Adapting detectors to the country, including formats, lengths and check keys, improves both recall and precision.
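The leak and the noise can both be reproduced with a few lines of Python. This is a sketch under stated assumptions: the `RULES` table, its labels and the deliberately naive patterns are hypothetical and only serve to show what happens when the wrong country's rules are applied.

```python
import re

# Hypothetical, deliberately simplified rule sets keyed by country code.
RULES = {
    "US": {
        "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "zip": re.compile(r"\b\d{5}(?:-\d{4})?\b"),
    },
    "FR": {
        "nir": re.compile(r"\b[12]\d{14}\b"),
        "siren": re.compile(r"\b\d{9}\b"),  # shape only, no checksum
    },
}


def detect(text: str, country: str) -> list[tuple[str, str]]:
    """Return (label, matched text) pairs using one country's rules."""
    found = []
    for label, pattern in RULES[country].items():
        for match in pattern.finditer(text):
            found.append((label, match.group()))
    return found


us_text = "Employee SSN 123-45-6789, order ref 123456789."
print(detect(us_text, "US"))  # [('ssn', '123-45-6789')]
print(detect(us_text, "FR"))  # [('siren', '123456789')]
```

With the US rules the SSN is found and the order reference is left alone. With the French rules the SSN leaks through undetected and the order reference is wrongly flagged as a SIREN, which a checksum test would have rejected.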

Beyond FR and US

Many organizations operate across several countries. Ideally, the tool lets you choose the detection country, applies full rules where coverage is solid, falls back to reasonable generic rules elsewhere, and clearly flags when a country's rules aren't yet optimal.
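One way to picture this selection logic is a lookup that always returns a usable rule set together with an explicit coverage flag. The sketch below is hypothetical: `GENERIC_RULES`, `NATIONAL_RULES`, `RuleSet` and `rules_for` are names invented for illustration, and the patterns are kept minimal.

```python
import re
from dataclasses import dataclass

# Hypothetical generic families that apply whatever the country.
GENERIC_RULES = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}

# Hypothetical national rules; only covered countries appear here.
NATIONAL_RULES = {
    "US": {"ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b")},
    "FR": {"nir": re.compile(r"\b[12]\d{14}\b")},
}


@dataclass
class RuleSet:
    country: str
    rules: dict
    full_coverage: bool  # False means national identifiers are not covered


def rules_for(country: str) -> RuleSet:
    """Pick national plus generic rules, or generic rules alone as a fallback."""
    code = country.upper()
    national = NATIONAL_RULES.get(code)
    if national is None:
        return RuleSet(code, dict(GENERIC_RULES), full_coverage=False)
    return RuleSet(code, {**GENERIC_RULES, **national}, full_coverage=True)
```

Calling `rules_for("DE")` in this sketch returns only the generic families with `full_coverage=False`, so the caller can warn the user, while `rules_for("us")` adds the SSN rule and reports full coverage.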

ONYRI Sanitize offers country-aware detection (full FR and US rules: SSN, EIN, ZIP, US dates… on the US side; social security, SIREN, IBAN… on the FR side), with the ability to add your own business rules.

Frequently asked questions

Why not apply the same rules everywhere?
Because national identifiers have different formats, lengths and check keys. Generic rules miss real identifiers and trigger false positives on harmless numbers.

Are a US SSN and a French social security number the same?
No: 9 digits in the US, 15 digits (with a check key) in France. Detecting them correctly requires country-specific rules.

What happens for an uncovered country?
A good tool applies a reasonable fallback (common families: email, IBAN, card…) while indicating that national-identifier coverage isn't yet optimal for that country.
