The EU faces a pivotal moment in adapting copyright law to the rise of generative artificial intelligence (GenAI) in music and art, ushering intellectual property (IP) into new and uncharted territory.
Legal clarity is needed regarding the use of copyrighted works as training data and the resulting questions about ownership of AI-generated content.
At the same time, a novel market is emerging for training data and direct creative licensing, offering new opportunities for IP rights holders, developers and investors alike. All of this underscores the need for a new type of direct licensing agreement.
At the heart of the issue lies a fundamental question: who owns the output of a machine? Under current EU copyright legislation, protection is granted only to works created by humans. This leaves AI-generated text, music, paintings and other creative outputs in a legal grey area. If no human author is involved, can these works be copyrighted at all? And if a human prompts the AI, does that make them the author?
These questions are not just theoretical – they have real implications for IP rights holders, developers, and users.
The principle that copyright can belong only to human authors was reaffirmed in the "Artificial Intelligence Strategy" of 2023 issued by the Court of Justice of the European Union (CJEU), which excludes purely machine-generated content from copyright protection.
The legal vacuum surrounding AI-generated works has prompted calls for reform. Some stakeholders advocate for a sui generis right or a new category of protection for AI-generated content, while others caution against diluting the human-centric foundation of copyright law. For now, the EU maintains its position that copyright must remain tied to human creativity, though the debate is far from being settled.
While it is easy to focus on the prompt-based result returned by the AI platform, since it is the most "tangible" element from a user perspective, this also raises a fundamental question: how does the AI know what to reply?
The answer is complex and merits a separate in-depth analysis. For the sake of brevity, however, it can be reduced to the datasets that developers use to train the AI model.
GenAI systems are trained on massive datasets, in which each piece of information is first collected, cleaned, annotated and processed. Only once this step has been completed can developers pre-train the AI model and then fine-tune it. The final step in the training process involves additional input in the form of reinforcement learning.
In simplified form, the process outlined above consists of five stages: data collection, data pre-processing, model pre-training, model fine-tuning and reinforcement learning.
Since prompts and their answers cover a wide array of topics, the underlying datasets must be equally broad. The gathered data may therefore include copyrighted works, such as literary works, music and paintings, raising concerns about whether using such data without permission constitutes infringement.
The EU's Copyright in the Digital Single Market (CDSM) Directive 2019/790 provides some guidance through its Text and Data Mining (TDM) exceptions, but the rules are complex and often misunderstood. Moreover, since it is an EU Directive, its provisions are not directly applicable, and each EU Member State must transpose it into its national legislation.
One of the EU's key priorities is increasing transparency in GenAI systems. The EU's AI Act came into force in August 2024 as Regulation (EU) 2024/1689. It includes provisions requiring developers to disclose whether content was generated by AI and to provide information about the data used to train their models. This is essential for building trust and ensuring that consumers and creators can distinguish between human-made and machine-made works.
To reduce overhead costs, improve response times and provide up-to-date answers, some GenAI platforms have repositioned themselves and implemented Retrieval-Augmented Generation (RAG) technology. This technology combines GenAI with real-time information retrieval. In practice, the engine generates answers by searching for, identifying and synthesising up-to-date information available online.
With RAG, copyright-protected content is used not only for training but also for content generation, which could potentially lead to infringement cases – even if the developers did not initially intend to infringe. This risk stems from the need to deliver quick and relatively comprehensive prompt-based responses to end users.
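To make the point more concrete, the retrieval step can be sketched in a few lines of Python. This is a deliberately simplified illustration and not how any particular platform works: the word-overlap ranking stands in for a real search engine, and the function names, sample passages and ".example" sources are all hypothetical. What it shows is that the retrieved passages are copied verbatim into the text handed to the model at the moment of generation.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=2):
    """Rank documents by how many words they share with the query (toy stand-in for search)."""
    query_tokens = tokenize(query)
    scored = []
    for doc in documents:
        overlap = len(query_tokens & tokenize(doc["text"]))
        if overlap > 0:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def build_prompt(query, retrieved):
    """Place the retrieved passages, verbatim and with their source, ahead of the question."""
    context = "\n".join(f"[{doc['source']}] {doc['text']}" for doc in retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical online sources; any of them could be copyright-protected.
documents = [
    {"source": "news-site.example", "text": "The AI Act entered into force in August 2024."},
    {"source": "blog.example", "text": "A recipe for apple strudel with cinnamon."},
    {"source": "journal.example", "text": "The AI Act requires transparency about training data."},
]

query = "When did the AI Act enter into force?"
retrieved = retrieve(query, documents)
prompt = build_prompt(query, retrieved)          # this text would be sent to the GenAI model
sources_used = [doc["source"] for doc in retrieved]
```

Because the sketch keeps track of `sources_used`, it also hints at how a record of the works relied upon for a given answer could be maintained, which is relevant to the transparency considerations discussed here.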
Ultimately, if IP rights holders can trace how their works were used in training, they are better positioned to assert their rights. In this sense, transparency supports enforcement.
As a side note, a novel and interesting step in this direction was taken by Denmark, where the government is exploring the possibility of allowing citizens to protect their likeness and image via copyright. This initiative is a response to the unauthorised use of AI for creating deepfakes.
It is not uncommon for online platforms to advocate for transparency while simultaneously relying on datasets gathered through web scraping to train their AI models. Although this remains a controversial issue, it raises an important question: what comes next?
As previously touched upon, the only way to ensure greater clarity moving forward is for the parties to enter into a legally binding agreement that clearly defines the rights and obligations of all involved.
The enforcement of IP rights in the context of GenAI presents unique challenges. Identifying infringing content, tracing its origins and attributing liability can be difficult when AI systems operate as black boxes. Moreover, the decentralised nature of AI development, often involving multiple actors across jurisdictions, complicates enforcement efforts.
The EU's approach to GenAI and copyright is still evolving, but one thing is clear: the future of creativity will be collaborative. IP rights holders and developers will increasingly work together, and the law must adapt to reflect this new reality.
Up to this point, many IP rights holders have worried that their works were being used without consent or compensation, leading to a clear imbalance.
Article 3 of the aforementioned CDSM Directive allows TDM for scientific research purposes by research organisations and cultural heritage institutions. Article 4, by contrast, permits TDM for any purpose, provided the IP rights holder has not expressly reserved their rights in this regard. This opt-out mechanism has become a key tool for creators and publishers to control the use of their works in training AI models.
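By way of illustration, a developer's crawler might check for a machine-readable reservation before mining a page. The sketch below assumes that the rights holder signals the opt-out through a robots.txt file, which is only one possible signal and whose legal sufficiency under Article 4 is not settled. The crawler name "ExampleAIBot", the helper `may_mine` and the publisher URL are hypothetical; the parsing itself relies on Python's standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a rights holder: the AI crawler is
# excluded from the whole site, while all other crawlers remain welcome.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

def may_mine(robots_txt, user_agent, url):
    """Return True if the robots.txt rules let this crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

page = "https://publisher.example/articles/1"
ai_allowed = may_mine(ROBOTS_TXT, "ExampleAIBot", page)         # False: rights reserved
other_allowed = may_mine(ROBOTS_TXT, "GenericSearchBot", page)  # True: no reservation
```

A check of this kind only tells the developer what must not be collected. It does not give access to the reserved works.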
Yet the opt-out alone does not facilitate access. For AI developers seeking high-quality and legally compliant datasets, direct licensing agreements with IP rights holders are emerging as a preferred solution. These agreements provide legal certainty, enable compensation for the IP rights holders and support the development of trustworthy GenAI platforms.
Rather than viewing AI as a threat, the 2025 Executive Briefing on "The Development of Generative Artificial Intelligence from a Copyright Perspective", issued by the European Union Intellectual Property Office, encourages IP rights holders to see it as an opportunity.
This new type of direct licensing agreement could enable IP rights holders to monetise their works by making them available for AI training under clear and fair terms, allowing developers to use them directly as datasets. In turn, it could lead to the emergence of new markets for training data and creative content.
By implementing these direct licensing agreements, IP rights holders could potentially control how their works are used, receive equitable compensation, and gain greater transparency.
In return, developers stand to gain legal certainty, a reduced risk of litigation, improved data quality and enhanced scalability.
Overall, such an agreement can set out the contractual framework necessary for defining data ownership, usage and treatment, as well as for improving data collection and compilation, alongside data security policies, practices and protocols.
As this is a novel and emerging market, many key players are still trying to set the pace and establish a standard contractual framework that benefits both developers and IP rights holders.
As GenAI continues to reshape the creative industries, the EU is laying the groundwork for a licensing ecosystem that respects IP rights while enabling innovation. Direct licensing agreements between AI developers and rights holders are not only a legal necessity but also a strategic opportunity to build a more equitable and sustainable digital economy.
By supporting innovative direct licensing models, clarifying legal frameworks and promoting transparency, the EU Member States appear to be shaping a landscape that balances the human and the machine – all with the goal of advancing creativity, creating new opportunities and pushing IP into new frontiers.
author: Tiberiu Protopopescu
Senior Attorney at Law
romania