The Samsung lesson: pasting internal data into AI
In 2023, Samsung engineers pasted confidential source code and meeting notes into ChatGPT, and several leaks surfaced within weeks. Here is what the incident teaches every team.
In 2023, Samsung engineers pasted confidential source code and internal meeting notes into ChatGPT to save time. Within a few weeks, several such leaks were spotted, and the company responded by restricting generative AI tools on its devices and networks. The lesson isn't “AI is dangerous”: it's that pasting sensitive content into an external tool sends it outside your control perimeter — and once it's gone, you can't recall it.
What happened
Shortly after allowing ChatGPT in one division, Samsung observed — according to press reports — three separate episodes of confidential data leaking through the tool:
1. an engineer pastes proprietary source code into ChatGPT to track down an error;
2. another submits equipment-related code to get optimizations;
3. a third has an internal recorded meeting summarized.
In each case, the sensitive content left the company for a third party's servers. Per the reporting, Samsung then restricted external generative AI on its devices, pointing to a central problem: once data is transmitted, it's hard to “retrieve and delete,” and it could be exposed to others.
Why copy-paste is the weak link
Copy-paste is instant, invisible and irreversible. It triggers no alert, doesn't prompt the user to re-read what they're exposing, and transmits the exact text, with identities, identifiers and secrets included. That's precisely why a large company could leak three times in a few weeks: not out of malice, but because the convenient gesture and the risky gesture are one and the same.
The right lesson (not “ban AI”)
Banning AI deprives teams of a real productivity gain and drives shadow AI: people use it anyway, on their personal phones, outside any oversight. The right answer is to keep the tool and neutralize the gesture, so that sensitive content never leaves in the clear.
- Anonymize before sending: replace names, code, identifiers and secrets with reversible tokens.
- Keep the token↔value mapping on the user's side, never on the model's server.
- Restore the answer locally, to keep usable text without ever exposing the original.
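The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ONYRI Sanitize's implementation: the two regular expressions, the `[LABEL_n]` token format, and the function names `anonymize` and `restore` are all hypothetical, and a real detector would need far broader coverage (names, source code, internal identifiers).

```python
import re

# Illustrative patterns only (assumption): an email address and an
# "sk-"-prefixed API key. Real detection needs many more rules.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
}


def anonymize(text, patterns=PATTERNS):
    """Replace each sensitive match with a token.

    Returns the masked text and the token -> value mapping.
    The mapping is meant to stay on the user's side and never be sent.
    """
    mapping = {}   # token -> original value
    seen = {}      # original value -> token, so repeated values share a token
    counters = {}  # per-label counter used to number the tokens

    def make_sub(label):
        def sub(match):
            value = match.group(0)
            if value not in seen:
                counters[label] = counters.get(label, 0) + 1
                token = f"[{label}_{counters[label]}]"
                seen[value] = token
                mapping[token] = value
            return seen[value]
        return sub

    for label, pattern in patterns.items():
        text = pattern.sub(make_sub(label), text)
    return text, mapping


def restore(text, mapping):
    """Swap tokens in the model's answer back to the original values, locally."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


prompt = "Debug this: Client(key='sk-abcdef1234567890XYZ', owner='alice@example.com')"
masked, mapping = anonymize(prompt)
# Only `masked` would be sent to the external tool; `mapping` stays local.
reply = "The key [API_KEY_1] is hard-coded; ask [EMAIL_1] to rotate it."
restored = restore(reply, mapping)
```

Here `masked` contains `[API_KEY_1]` and `[EMAIL_1]` in place of the real values, and `restore` turns the model's tokenized answer back into usable text without the originals ever leaving the machine.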
| Risky reflex | Safe reflex |
|---|---|
| Paste raw source code to debug it | Paste the code with secrets and identifiers swapped for tokens |
| Submit a meeting note with names | Submit the note with names and clients tokenized |
| Hope the provider keeps nothing | Never send the real data |
That's ONYRI Sanitize's approach: the engine detects sensitive data in your text and replaces it with reversible tokens before anything is sent to the AI; detection and the mapping stay in your browser, and only an anonymized version is transmitted. Your teams keep AI's speed without repeating Samsung's mistake.
Frequently asked questions
- What actually happened at Samsung with ChatGPT?
  Per press reports (2023), employees pasted proprietary source code and internal meeting notes into ChatGPT across separate episodes. The company then restricted external generative AI on its devices and networks.
- Should I ban ChatGPT at work?
  Banning mostly drives covert use, outside any oversight. It's more effective to keep the tool but anonymize sensitive content before sending, so no real data leaves your control.
- How do I prevent this kind of leak in my team?
  By neutralizing the copy-paste gesture: an anonymization engine detects names, code, identifiers and secrets and replaces them with reversible tokens before sending, with the mapping staying on the user's side.