Data privacy

[Image: computer data represented as networked lines and binary numbers, with a security shield and a padlock at the centre.]

The terms and conditions of some of these LLMs mean that any data you put into the tool can be used by the company as part of its ongoing training data sets. You may therefore effectively lose any copyright you had over those materials.

Some of the large volumes of data used to train the models are under copyright, and some were posted publicly but subject to certain usage conditions (for example, only for use on a particular platform).

Nevertheless, this material was used within the training data of the tools, and there are a number of copyright court cases pending in different countries to establish whether the GenAI companies are liable for these breaches (for example, at the time of writing, cases involving the BBC, Getty and Disney). It is currently unclear whether a user would be liable for breach of copyright if the output of a GenAI tool were substantially based on copyrighted material.

Finally, it is also not clear who legally owns the output of a GenAI tool. Many of the terms and conditions state that the output is owned by the user, but the courts may not uphold this if the output is based on copyrighted material or is very generic.

