Generative AI and Copyright: The First Court Case in Europe

On 27 September 2024, the Hamburg Regional Court (Ger. Landgericht Hamburg) ruled in the case Kneschke v. LAION e.V. (hereinafter: the LAION case). Kneschke, a professional photographer, brought a claim against the non-profit organisation LAION for including one of his photographs in its “LAION-5B” dataset, which is intended for training artificial intelligence (AI) models. Kneschke argued that this constituted an unlawful use of his copyright work and an infringement of his copyright in the photograph.

Introduction

The Hamburg court ruled that LAION did not infringe Kneschke’s copyright because the use of the photograph was permitted under the text and data mining (TDM) exception for scientific research purposes, as governed by German copyright law. The relevant provision was introduced into the German Act on Copyright and Related Rights (Ger. Urheberrechtsgesetz, UrhG) in transposition of Article 3 of Directive (EU) 2019/790 on copyright and related rights in the Digital Single Market (hereinafter: the CDSM Directive).

This is the first case of judicial review of the legal basis for TDM introduced by the European Union (hereinafter: the EU) through the CDSM Directive. The TDM provisions provide a legal basis for the large-scale collection of data for AI models and might also serve as a legal basis for AI model training. Aside from the positive outcome for LAION, the case is also a significant milestone for those interested in the transparency of datasets.

The ruling confirms that the exceptions for TDM introduced by the CDSM Directive in 2019 also apply to the use of copyright works in connection with generative AI models. By recognising LAION as a research organisation, the court highlighted the important role that non-profit organisations can play in making datasets publicly available. This is crucial for ensuring transparency of the data used in the training and building of AI models.

Facts of the Case

Photographer Kneschke brought a claim against the German non-profit organisation LAION for the unauthorised use of his photograph. LAION operates the “LAION-5B” dataset, which is available online and intended for AI training. The disputed photograph is not itself contained in the dataset; rather, the dataset includes only textual descriptions (metadata) and hyperlinks to websites where the images were publicly available at the time the dataset was created.

Thus, Kneschke could not sue LAION on the basis that it had made his photograph publicly available without authorisation, since the “LAION-5B” dataset does not actually contain any image files. Instead, Kneschke argued that LAION had created a reproduction of his photograph in the course of using an AI model. LAION had gathered photographs and their associated descriptions from the internet and used the model to verify that each description matched the content of the image. In other words, LAION did not merely collect photographs; it actively checked the accuracy of the accompanying descriptions.
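The workflow described above can be illustrated with a minimal Python sketch. It is not LAION's actual pipeline: the `Candidate` and `DatasetEntry` types, the `similarity` callback and the 0.3 threshold are all assumptions made for illustration. The sketch shows only the structural point at issue in the case: the image is needed transiently for the check, while the stored entry keeps just the hyperlink and the description.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """An image found online, held only while it is being checked (hypothetical type)."""
    url: str
    caption: str
    image_bytes: bytes


@dataclass(frozen=True)
class DatasetEntry:
    """What the published dataset stores: a hyperlink and a description, no image."""
    url: str
    caption: str


def build_entries(
    candidates: Iterable[Candidate],
    similarity: Callable[[bytes, str], float],
    threshold: float = 0.3,  # assumed cut-off, not taken from the judgment
) -> List[DatasetEntry]:
    """Keep only the pairs whose caption matches the image well enough."""
    entries = []
    for candidate in candidates:
        # The image is analysed here; this is the step that requires a copy.
        score = similarity(candidate.image_bytes, candidate.caption)
        if score >= threshold:
            # Only the URL and the caption are retained in the dataset.
            entries.append(DatasetEntry(url=candidate.url, caption=candidate.caption))
    return entries
```

In practice the `similarity` argument would be backed by an AI model that compares image and text; here it is deliberately left as a parameter so that no particular model is implied.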

Court’s Decision

The court held that reproductions made in the course of creating the dataset constituted acts of text and data mining and therefore fell under a statutory exception covering such acts. The court further found that LAION, as a non-profit organisation whose goal is to promote scientific understanding of AI, qualifies for the TDM exception for scientific research as defined in Article 3 of the CDSM Directive (transposed in German law as § 60d UrhG). In doing so, the court rejected the argument that LAION failed to meet the standard of independence from commercial entities (in this case, from the company Stability AI).

The court granted LAION legal protection under the TDM exception for scientific research purposes, as provided by Article 3 of the CDSM Directive, finding that LAION’s activities make a significant contribution to scientific research. LAION is not a typical research organisation such as a university, but rather a non-profit organisation whose mission is to enhance scientific, and thereby public, understanding of datasets that may be used to train AI models. Datasets are a key element in what is otherwise a largely non-transparent process of training AI models. Accordingly, and in line with academic commentary, the court’s decision to classify LAION as a “research organisation” within the meaning of Article 3 of the CDSM Directive and to grant it legal protection on this basis is welcome (Keller, 2024; Guadamuz, 2024).

Unresolved Issues

The LAION judgment is relatively narrow, addressing only the use of TDM for the creation of a dataset, and not the subsequent training of an AI model using that dataset. In this regard, future rulings across different jurisdictions will be of interest, especially given that Member States have implemented the CDSM Directive differently (Keller, 2024).

The court also indirectly addressed the possible application of § 44b UrhG, which implements Article 4 of the CDSM Directive and governs TDM for any purpose, including commercial purposes. This exception provides the legal basis for the lawful use of copyright works in training AI models for commercial use, except where rights holders have explicitly and appropriately opted out. German law requires that such opt-outs be “machine-readable”. Although the court touched on this issue, it was not decisive in the present case, but it may prove crucial in the future.
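What counts as a “machine-readable” opt-out is precisely the open question, so the following Python sketch illustrates only one conceivable form, under clearly stated assumptions: that a rights holder expresses the reservation in a robots.txt file, and that a crawler identifying itself as `ExampleTDMBot` (a hypothetical name) checks that file before fetching anything. Whether such a file satisfies § 44b UrhG is a legal question the sketch does not answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a rights holder (an assumption,
# not a form of opt-out endorsed by the judgment).
ROBOTS_TXT = """\
User-agent: ExampleTDMBot
Disallow: /

User-agent: *
Allow: /
"""


def may_mine(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules do not exclude this crawler from the URL."""
    parser = RobotFileParser()
    # Parse the rules from text instead of fetching them over the network.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Under these assumptions, `may_mine(ROBOTS_TXT, "ExampleTDMBot", "https://example.org/photo.jpg")` is false, while the same call with a crawler name that is not listed is true. A reservation written only in natural language, for instance in a website's terms of use, could not be evaluated by a simple check of this kind.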

The Hamburg court ruled only on the legality of using copyright works under the TDM exception for creating a dataset and did not rule on the applicability of this exception to the subsequent training of AI models (Rosati, 2024). In an obiter dictum, however, the court observed that the Artificial Intelligence Act (hereinafter: the AI Act), which also governs AI model training, explicitly refers to Article 4 of the CDSM Directive (see Recital 105 and Article 53(1)(c) of the AI Act). Accordingly, at least in principle, there should no longer be any doubt that the Article 4 exception extends to the training of AI models, as both the AI Act and the LAION judgment affirm this interpretation (Guadamuz, 2024).

Conclusion

It appears that Kneschke sued the wrong party. Rather than bringing the claim against LAION, he should have brought it against Stability AI (or another commercial entity that uses the datasets to train its AI models).

The judgment is also relevant for Slovenia, which transposed the relevant provisions of the CDSM Directive into its Copyright and Related Rights Act (ZASP). Article 57b ZASP provides a TDM exception for scientific research, while Article 57a ZASP applies to all other purposes, including commercial use.

The Slovenian exceptions allow the free reproduction of lawfully accessed works for the purpose of TDM. They also permit the digitisation of analogue content and remote access to analogue content, but only for TDM purposes. In the case of the scientific research exception (Article 57b ZASP), the sharing and making available to the public of the results of TDM is also permitted. Rights holders are obliged to ensure that beneficiaries of both exceptions can effectively perform TDM. If they fail to do so, they must take action within 72 hours, or they may face sanctions (Bogataj Jančič, 2023).

A shortcoming of the Slovenian implementation is that the national legislator defined the concept of “lawful access” more narrowly than the European legislator. Under the ZASP, lawful access includes access based on free and open licences, contracts, or other legal bases, such as exceptions and limitations to copyright and provisions of special laws (e.g., the Legal Deposit Act). However, even though Recital 14 of the CDSM Directive explicitly states that “lawful access” should also cover access to content that is freely available online, the Slovenian implementation fails to include this (or at least fails to do so explicitly).

Nevertheless, it is entirely possible that in an actual dispute, the courts could (and should) interpret the ZASP more broadly and, in light of Recital 14 of the CDSM Directive, find that lawful access also includes access to content that is freely available online. This issue was raised in a letter dated 30 June 2024 to the European Commission, which decided not to initiate infringement proceedings against Slovenia. The European Commission explained that, although the Slovenian implementation does not explicitly refer to content that is freely available online as part of the definition of lawful access, it also does not expressly exclude it. Therefore, the CDSM Directive can be considered transposed in accordance with its objective and purpose, as stated in the recitals.

As a result, Articles 57a and 57b ZASP can (and should) be interpreted in light of Recital 14 of the CDSM Directive, so that lawful access under Slovenian law also includes access to content that is freely available online.

This article by Mark Bauer and Dr. Maja Bogataj Jančič, LL.M., LL.M., titled “Generative AI and Copyright”, was published on 7 November 2024 in the journal Pravna praksa, Year 43, Issue Nos. 42–43/2024, pp. 12–13.