
Data Protection - AI and Chat GPT


Data Protection and Generative AI – what are the concerns?

AI, or artificial intelligence, has been in existence for many years and is in all likelihood already incorporated within software products used by organisations. However, the launch of the Chat GPT ‘chat bot’ by OpenAI towards the end of 2022 has brought the subject of AI (specifically what is known as ‘generative AI’) into news headlines and data protection publications as the undoubted benefits of generative AI also create a complex web of legal, regulatory and ethical issues. 

Generative AI, like Chat GPT, can be asked by a user to create new content (as opposed to just analysing something that already exists) such as written text, audio, code, videos and pictures, which it will do very quickly. It uses data and algorithms to create content on request in a realistic manner, so that it appears to have been created by a human.

What data protection issues are we seeing with generative AI? 

The data protection and privacy issues with generative AI become apparent when we look at how it creates content. Chat GPT and other similar generative AI products use ‘large language models’, which means they have had vast amounts of information fed into them, much of which has been scraped from the internet, to ‘train’ them. This information may include personal data, with such personal data being processed to design, train, test and deploy AI.

When considering using generative AI, like Chat GPT, the questions to consider from a data protection perspective include:

1.   Whose information was it that was used to train the AI and did it include personal data?

2.   If the training data or the content produced by generative AI includes personal data, was relevant data protection law complied with when the information was scraped from the internet, so that it can lawfully be used and reproduced by AI (and used by a user who has requested content)?

Under the EU and UK GDPR a legal basis is required for the collection and processing of personal data. A privacy notice should be presented to data subjects at the point personal data is collected explaining how their personal data will be used. The data subjects should also have the opportunity to exercise their rights in respect of their personal data e.g. the “right to be forgotten”. These are not easy concepts to address in generative AI modelling, making compliance difficult to truly achieve.

3.   What information is the user putting into Chat GPT in order to request the new content? Personal data and confidential information should not be inputted into Chat GPT as there is a risk, if not a probability, that Chat GPT will use the data and reproduce it for other users. 

4.   Is the content produced correct and accurate? Some generative AI output has been found to be incorrect or misleading (or in some cases entirely made up!) and concerns have also been raised about AI having inherent bias, as it is dependent on the type, quality and accuracy of the machine-learning techniques and information it is trained with and the data which it is being asked to source, review and interpret. If personal data is involved these issues are concerning as a key principle of data protection law is that personal data held by a controller should be complete, up-to-date and accurate.

5.   If there is a data protection compliance issue relating to personal data in content produced by generative AI, who is responsible for that? In all likelihood, once an organisation creates content using generative AI, that organisation becomes the controller for that content and is therefore responsible for its compliance with data protection law.

Aside from pure data protection issues there are various other concerns with generative AI, which we will not go into in detail here, including breaches of intellectual property rights and issues with ownership of content. In a cyber security context, it is worth data protection specialists being aware that generative AI could be used as a phishing or scamming tool. For example, we are all used to receiving phishing emails which usually don’t read particularly well and contain mistakes, making them easier to spot as a potential risk to an organisation and to the personal data it holds. With the launch of Chat GPT and similar products, it seems these AI products are being used to create more believable and sophisticated phishing emails.

What is happening to try and protect personal data in a generative AI context? 

In the news recently we have seen the following:

1.   A Chat GPT data breach resulting in Italy’s data protection authority announcing a temporary ban on Chat GPT.

2.   The European Data Protection Board has “decided to launch a dedicated task force to foster cooperation and to exchange information on possible enforcement actions conducted by data protection authorities” in relation to Chat GPT.

Given the speed at which generative AI is taking off and the long list of concerns which have been raised about it, regulation is having to be developed quickly and is, in truth, playing catch up. This is how regulation is progressing in the EU and UK:

1.   Progress on the EU AI Act is being made, although this is still some way off coming into effect. This Act is expected to introduce comprehensive regulation which imposes a broad range of mandatory requirements on the developers and deployers of AI systems, across all sectors.

2.   The UK government issued an AI white paper (called ‘A pro-innovation approach to AI regulation’) setting out how the UK proposes to regulate AI. You can access a copy of the white paper here.

The first thing to note about this white paper is that the UK is not proposing any new legal requirements at this stage, but it does set out “5 principles to guide and inform the responsible development and use of AI in all sectors of the economy”, which are:

(1) Safety, security and robustness
(2) Appropriate transparency and explainability
(3) Fairness
(4) Accountability and governance
(5) Contestability and redress

These are all principles which are familiar from data protection law.

3.   The UK Information Commissioner’s Office (ICO) has updated its guidance relating to AI and data protection and its AI toolkit.

4.   The ICO has published a blog post on key questions that developers and users of generative AI need to ask to ensure compliance with UK GDPR – read here.

Our data protection and technology teams have many years of experience in advising organisations on both their technology requirements and those related to compliance with data protection law. To speak to one of the team you can get in touch by calling us on 0800 2800 421.

If you have not received this article directly, but would like to receive articles and data protection news alerts from Trethowans, please contact us to sign up to our alerts.
