AI needs a label

Prof. Dr.-Ing. Christian Rieß (Photo: private)

FAU computer scientist Christian Riess explains why the labeling of AI-generated content is so difficult.

A photo that was never taken. A text that wasn’t written by a human. Both are now part of everyday life, yet dealing with AI-generated content touches on fundamental questions of democracy, media trust, and digital public discourse. Among other things, the EU AI Act aims to regulate how such content should be made recognizable in a meaningful and legally sound way in the future. Taking part in the debate is computer scientist Prof. Dr. Christian Riess, Chair of Applied Cryptography at Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU). As part of a group of experts, he has developed recommendations for a code of practice for the labeling of AI-generated content.

Many are calling for the labeling of AI-generated content. It sounds simple – so why is implementation in practice so complicated?

It is difficult to address the diversity of possible media and use cases with a single framework regulation in a way that achieves the actual goal: informing users when they encounter AI-generated content.

We distinguish between labels that are perceptible to humans and labels that are machine-readable. The appropriate form of perceptible labeling always depends on the medium and use case: for an audio document, it could be an automatic announcement like “This audio document was AI-generated”; for an image, a small “AI” marking at the edge.

Imperceptible, machine-readable labeling is important so that the information that a document is AI-generated can be extracted automatically and made known to users. The challenge here is that the label needs to persist when the document is processed by the platforms where it is ultimately published. Standardized methods for inserting labels into the metadata of documents already exist, but large platforms often remove such metadata automatically. The code of practice thus calls for labels to be added to the content itself as a digital watermark that can still be read after publication on platforms. But there are still many other technical questions that need to be resolved.
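The difference between a metadata label and a label embedded in the content can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the `Document` type, the `ai_generated` metadata key, the 8-bit marker, and the `platform_upload` function are hypothetical, and the least-significant-bit scheme is a deliberately simple stand-in. It is not the method proposed in the code of practice, and unlike a real watermark it would not survive recompression or resizing.

```python
from dataclasses import dataclass, field

# Hypothetical 8-bit marker standing in for an "AI-generated" label.
LABEL_BITS = [1, 0, 1, 1, 0, 0, 1, 0]


@dataclass
class Document:
    pixels: list[int]  # toy grayscale image, values 0..255
    metadata: dict[str, str] = field(default_factory=dict)


def add_metadata_label(doc: Document) -> Document:
    """Attach the label as a metadata entry (hypothetical key name)."""
    return Document(list(doc.pixels), {**doc.metadata, "ai_generated": "true"})


def add_watermark(doc: Document) -> Document:
    """Write the marker into the lowest bit of the first pixels (toy scheme)."""
    pixels = list(doc.pixels)
    for i, bit in enumerate(LABEL_BITS):
        pixels[i] = (pixels[i] & ~1) | bit
    return Document(pixels, dict(doc.metadata))


def has_watermark(doc: Document) -> bool:
    """Check whether the lowest bits of the first pixels match the marker."""
    return [p & 1 for p in doc.pixels[: len(LABEL_BITS)]] == LABEL_BITS


def platform_upload(doc: Document) -> Document:
    """Hypothetical platform: drops all metadata, leaves the pixels unchanged."""
    return Document(list(doc.pixels), {})


original = Document(pixels=[120, 33, 200, 17, 64, 99, 250, 8, 77, 41])
labeled = add_watermark(add_metadata_label(original))
published = platform_upload(labeled)

print("metadata label before upload:", "ai_generated" in labeled.metadata)
print("metadata label after upload: ", "ai_generated" in published.metadata)
print("watermark after upload:      ", has_watermark(published))
```

In this sketch the metadata entry disappears on upload while the embedded marker can still be read, which is the property the interview describes. Making such a marker robust against the processing that real platforms apply is one of the open technical questions.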


Companies, civil society, and legal experts with sometimes vastly different perspectives sit at the negotiating table. Where do interests clash most sharply?

In principle, any form of regulation imposes a burden on companies, whereas civil society has a legitimate interest in being informed when it is interacting with AI-generated content. From the perspective of civil society, it is important to find a solution that works well for the citizens of Europe without disclosing potentially private information about individuals. From the companies’ perspective, it is important to find solutions that are cost-effective to implement and that do not impair the actual AI-generated content to the extent that it no longer serves its intended purpose.

And how can we tell that a viable compromise has been reached?

Companies can demonstrate their support for the code of practice by signing it, and gain legal certainty at the same time. Civil society can rely on the signatories to label artificially generated content in accordance with standards. For civil society, and particularly for professional groups such as journalists, fact-checkers, researchers, and law enforcement agencies, the quality of the regulations will become evident in their practical work, when they are able to detect AI labels. For both parties, it is important to know that the code of practice can be updated if certain aspects turn out not to be practical.

Where is the boundary between human work and AI-generated content? Do students now have to label their work as AI-generated content if ChatGPT merely suggested individual phrases?

There is a difference between labeling fully AI-generated content, AI-modified content, and AI-supported functions such as spell-checking. However, the boundaries are indeed fluid, and educational institutions like FAU are rightly developing their own specific guidelines on which forms of AI assistance are appropriate in which context.

How will we be able to tell in the future that the regulation has been effective?

The intended societal impact, namely combating disinformation more effectively, must be examined scientifically once the regulation has been implemented. However, the code of practice cannot go beyond the provisions of the AI Act; for example, the AI Act does not provide for the explicit labeling of human-generated (non-AI) content.

Further information