The European Data Protection Board (EDPB) has released an opinion on data protection in AI models. It tackles key questions such as when an AI model counts as anonymous, the legal bases for processing personal data, and how tech companies can reduce the impact on individuals’ privacy within the EU.
The opinion was issued in response to a request from Ireland’s Data Protection Commission, which acts as the lead supervisory authority under the GDPR for many multinational technology companies.
What are the main takeaways?
First, the EDPB clarified when an AI model qualifies as “anonymous.” A model can be considered anonymous if the likelihood of tracing its training data back to a specific individual is “insignificant.” Anonymity cannot simply be claimed; it requires a thorough, case-by-case assessment by the supervisory authorities.
Developers can show anonymity by:
- Limiting the collection of personal data from the start by excluding irrelevant sources.
- Implementing strong technical measures that keep the risk of re-identification insignificant.
- Making sure data is genuinely anonymized.
- Applying data minimization techniques to cut out unnecessary personal data (see the sketch after this list).
- Regularly checking the risks of re-identification through audits and testing.
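To make the data minimization point concrete, here is a minimal Python sketch of a pre-training filter that drops direct-identifier fields and redacts email addresses from free text. The field names, identifier list, and regex are illustrative assumptions rather than anything prescribed by the EDPB; a real pipeline would need far broader coverage (names, addresses, quasi-identifiers) plus re-identification testing.

```python
import re

# Hypothetical schema: these field names and this identifier list are
# illustrative assumptions, not part of the EDPB opinion.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "national_id"}
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def minimize_record(record: dict) -> dict:
    """Drop direct-identifier fields and redact email addresses from free text."""
    kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if isinstance(kept.get("text"), str):
        kept["text"] = EMAIL_RE.sub("[REDACTED_EMAIL]", kept["text"])
    return kept

if __name__ == "__main__":
    raw = {
        "name": "Jane Doe",
        "email": "jane@example.com",
        "text": "Contact me at jane@example.com about the order.",
        "order_total": 42.0,
    }
    print(minimize_record(raw))
    # {'text': 'Contact me at [REDACTED_EMAIL] about the order.', 'order_total': 42.0}
```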
According to Kathryn Wynn, a data protection lawyer, these requirements complicate how AI companies can approach claims of anonymity. She believes the interpretation of the law could impose heavy compliance demands, particularly regarding purpose limitation and transparency.
Next, the EDPB discussed when AI companies can process personal data without obtaining consent. They can rely on a “legitimate interest” legal basis if they can show that their interests, such as improving AI models, are not overridden by the rights of the individuals concerned. This matters for tech firms, because obtaining consent from everyone represented in vast training data sets is neither practical nor cost-effective.
To qualify, companies must pass three tests:
- Legitimacy test: Identify a legitimate, lawful interest that the processing pursues.
- Necessity test: Prove that the processing is essential for their purpose, with no less intrusive alternatives available.
- Balancing test: Show that the legitimate interest is not outweighed by the impact on individuals’ rights and freedoms.
If a company cannot fully satisfy the balancing test, it may still be able to proceed without consent by implementing mitigating measures that lessen the processing’s impact on individuals. Examples include:
- Technical safeguards: Using encryption to lower security risks.
- Pseudonymisation: Replacing identifiers with tokens so the data can no longer be linked to an individual without separately held information (see the sketch after this list).
- Data masking: Using fake data when real data isn’t necessary.
- User rights mechanisms: Making it easy for individuals to opt out or request corrections.
- Transparency: Being open about data processing practices through campaigns and labels.
- Web scraping restrictions: Preventing unauthorized scraping with opt-out lists or excluding sensitive information.
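As an illustration of the pseudonymisation measure above, the sketch below replaces a direct identifier with a keyed HMAC token; the key would be stored separately from the data so the mapping cannot be reversed without it. The key handling and field names are assumptions for illustration, not a technique mandated by the EDPB.

```python
import hashlib
import hmac

# Assumption: in practice this key lives in a separate secrets store, so
# pseudonymised records cannot be linked back to individuals without it.
PSEUDONYMISATION_KEY = b"replace-with-a-separately-stored-secret"

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a deterministic keyed token."""
    return hmac.new(PSEUDONYMISATION_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

if __name__ == "__main__":
    record = {"user_id": "jane.doe@example.com", "event": "page_view"}
    record["user_id"] = pseudonymise(record["user_id"])
    print(record)  # user_id is now an opaque 64-character token
```

Because the token is deterministic, records about the same individual can still be linked for analysis while the original identifier stays out of the training data. Under the GDPR, pseudonymised data still counts as personal data, so measures like this mitigate impact rather than achieve anonymity.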
Malcolm Dowden, a technology lawyer, noted that the term “legitimate interest” has sparked debate, especially concerning the U.K.’s Data (Use and Access) Bill. Supporters of AI argue that processing data fuels innovation and social good. Critics warn that this doesn’t account for potential privacy concerns or risks of misinformation.
Concerns have also been raised by advocacy groups, like Privacy International, about whether AI models, such as OpenAI’s, face adequate scrutiny under these tests.
Lastly, the opinion addresses the consequences of unlawfully processing personal data during AI development. If a model is built on improperly handled data, that can affect whether the model may lawfully be used. Supervisory authorities will evaluate each situation individually, and the opinion outlines three scenarios:
- If the same company retains and processes personal data, they must ensure compliance in both development and deployment phases.
- If a different firm is involved during deployment, they need to have properly assessed the model’s lawful status beforehand.
- If the data is anonymized after the improper processing, only the subsequent processing of non-personal data falls outside the GDPR; the initial unlawful processing can still be sanctioned.
Why should AI firms pay attention to this guidance? Although the opinion is not legally binding, it shapes how privacy law is enforced across the EU. GDPR violations can bring fines of up to €20 million or 4% of global annual turnover, whichever is higher, and in serious cases companies may be required to alter or even halt their AI operations.
AI companies often find it hard to meet GDPR standards because of the sheer volume of personal data needed for training, much of it drawn from publicly available sources. This can lead to legal disputes and fines. In September, for instance, the Dutch Data Protection Authority fined Clearview AI €30.5 million for unlawfully collecting facial images from the internet. Around the same time, action by the Irish DPC led Elon Musk’s X to agree to stop using public posts from EU users, without their consent, to train its AI chatbot, Grok.