Anonymize an Excel or CSV before analyzing it with AI
An HR, customer or finance export holds dozens of personal data points per row. Here is how to analyze it with AI without exposing them.
To analyze an Excel or CSV file with AI without exposing sensitive data, anonymize the identifying columns before sending: each value (name, email, phone, IBAN…) is replaced with a consistent token, the AI produces its analysis on the tokens, and you restore the real values in your browser. The analysis stays accurate; the people behind the rows are never exposed.
Why a spreadsheet is a high-risk case
An HR, customer or accounting export concentrates personal data across entire columns and hundreds of rows. Pasting it as-is into an assistant means handing over a file that can directly re-identify people, which is exactly what the data minimization principle aims to prevent.
- Identity columns: first/last name, email, phone, address.
- Financial columns: IBAN, salary, tax identifier.
- HR columns: employee ID, status, possibly health data.
- Cross-referencing: even after surface-level anonymization, a combination of columns (for example job title, site and hire date) can still re-identify a person.
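The cross-referencing risk can be checked before sending anything. The sketch below is a minimal illustration, not a full re-identification audit: it lists the value combinations that appear in exactly one row, since those rows point at a single person. The function name `unique_combinations` and the sample columns (`job`, `site`, `hire_year`) are assumptions made for the example.

```python
from collections import Counter

def unique_combinations(rows, columns):
    """Return the value combinations of `columns` that occur in exactly one row."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]

# Hypothetical HR extract with names already removed.
rows = [
    {"job": "Analyst", "site": "Lyon", "hire_year": "2019"},
    {"job": "Analyst", "site": "Lyon", "hire_year": "2019"},
    {"job": "Analyst", "site": "Paris", "hire_year": "2021"},
    {"job": "CFO", "site": "Paris", "hire_year": "2015"},
]

# One column alone singles out one row; three columns together single out two.
print(unique_combinations(rows, ["job"]))
print(unique_combinations(rows, ["job", "site", "hire_year"]))
```

Any combination returned here is a candidate indirect identifier: tokenize at least one of its columns, or generalize it (for example, keep the hire year but drop the site).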
The method, column by column
- 1. Identify the identifying columns (direct and indirect).
- 2. Tokenize: the same value always gets the same token (aggregates stay correct).
- 3. Run your analysis (summary, trend, categorization) on the tokenized text.
- 4. Restore the output in your browser to map results back to real rows.
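The four steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not ONYRI Sanitize's implementation: the `Tokenizer` class, the `[COLUMN_n]` token format and the sample rows are all hypothetical, and the list of sensitive columns is supplied by hand rather than detected.

```python
import re

class Tokenizer:
    """Consistent, reversible tokens: the same (column, value) always maps to the same token."""

    def __init__(self):
        self.forward = {}  # (column, value) -> token
        self.reverse = {}  # token -> original value

    def token_for(self, column, value):
        key = (column, value)
        if key not in self.forward:
            token = f"[{column.upper()}_{len(self.forward) + 1}]"
            self.forward[key] = token
            self.reverse[token] = value
        return self.forward[key]

    def tokenize_rows(self, rows, sensitive_columns):
        # Only the listed columns are replaced; amounts, dates, etc. pass through.
        return [
            {c: self.token_for(c, v) if c in sensitive_columns else v
             for c, v in row.items()}
            for row in rows
        ]

    def restore(self, text):
        # Swap known tokens back; anything in brackets that is not a known token is left as-is.
        return re.sub(r"\[[^\[\]]+\]",
                      lambda m: self.reverse.get(m.group(0), m.group(0)),
                      text)

rows = [
    {"name": "Alice Martin", "email": "alice@example.com", "amount": "120"},
    {"name": "Bob Durand", "email": "bob@example.com", "amount": "80"},
    {"name": "Alice Martin", "email": "alice@example.com", "amount": "45"},
]

tok = Tokenizer()
safe = tok.tokenize_rows(rows, {"name", "email"})  # this is what gets sent to the AI

ai_answer = "[NAME_1] has the highest total (165)."  # hypothetical model output
print(tok.restore(ai_answer))  # prints "Alice Martin has the highest total (165)."
```

The mapping tables never leave the machine that built them, which is what makes the restore step possible locally while the AI only ever sees tokens.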
Keep the analysis accurate
The goal isn't to degrade the data but to remove the identifier. Amounts, dates and categories can stay (depending on your need), while identities become tokens. You get a faithful analysis without disclosing who is who.
ONYRI Sanitize's Tables section applies this flow at scale (Web Worker processing beyond 1,000 rows), with country-aware detection and the ability to add your own sensitive columns.
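To see why consistent tokens keep the analysis faithful, here is a small check, assuming a hypothetical `customer` / `amount` table: per-customer totals computed on the tokenized rows match the real totals once the keys are translated through the mapping. The helper names `totals_by` and `tokenize_column` are illustrative.

```python
from collections import defaultdict

def totals_by(rows, key, value):
    """Sum `value` per distinct `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row[value])
    return dict(totals)

def tokenize_column(rows, column):
    """Replace one column with consistent tokens; return the new rows and the mapping."""
    mapping = {}
    out = []
    for row in rows:
        token = mapping.setdefault(row[column], f"[{column.upper()}_{len(mapping) + 1}]")
        out.append({**row, column: token})
    return out, mapping

rows = [
    {"customer": "Alice Martin", "amount": "120"},
    {"customer": "Bob Durand", "amount": "80"},
    {"customer": "Alice Martin", "amount": "45"},
]

real = totals_by(rows, "customer", "amount")
safe_rows, mapping = tokenize_column(rows, "customer")
masked = totals_by(safe_rows, "customer", "amount")

# Same groups, same sums: only the labels differ.
print({mapping[name]: total for name, total in real.items()} == masked)  # prints True
```

If each occurrence received a fresh random token instead, Alice's two rows would land in two separate groups and the totals would no longer match.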
Frequently asked questions
- Does anonymization skew my statistics?
- No, if tokens are consistent: the same value keeps the same token, so counts, per-group averages and groupings stay correct. Only identities are replaced.
- Do I need to anonymize the whole file?
- Only the identifying columns (direct and indirect). Purely numeric, non-identifying columns can often stay, depending on your analysis and internal policy.
- What about large files?
- Background in-browser processing (Web Worker) anonymizes thousands of rows without freezing the UI — and without the file leaving your machine.
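Outside the browser, the same concern (large files without loading everything at once) can be handled by streaming. The sketch below is not the Web Worker approach described above but a different technique, row-by-row processing with Python's `csv` module; the function name `tokenize_stream` and the sample data are assumptions. Only the token mapping is held in memory, and it grows with the number of distinct values, not the number of rows.

```python
import csv
import io

def tokenize_stream(source, target, sensitive_columns):
    """Read CSV rows from `source` one at a time and write tokenized rows to `target`."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    mapping = {}  # (column, value) -> token, kept locally for later restoration
    for row in reader:
        for column in sensitive_columns:
            key = (column, row[column])
            if key not in mapping:
                mapping[key] = f"[{column.upper()}_{len(mapping) + 1}]"
            row[column] = mapping[key]
        writer.writerow(row)
    return mapping

# In-memory streams stand in for real files opened with open(path, newline="").
source = io.StringIO("name,amount\nAlice Martin,120\nBob Durand,80\nAlice Martin,45\n")
target = io.StringIO()
mapping = tokenize_stream(source, target, ["name"])
print(target.getvalue())
```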
Keep your sensitive data in your browser
ONYRI Sanitize detects and masks your sensitive data before it reaches the AI, then restores the answer — from names to API keys.