Introduction

Since digital online transactions became ubiquitous in the late 1990s, the copyright world has been grappling with metadata problems. Metadata can generally be defined as “data about data”: for a musical recording, for example, the recording itself is the central data point, and the song’s title, composer(s), performer(s), etc. are the metadata. Incorrect or missing metadata can cause missed licensing opportunities, because the relevant rightholders or collecting societies cannot be traced, or because works are not detected. Moreover, data deficiencies can lead to recommender system biases, as well as a lack of adequate information for the general public, cultural archives and historical research. Recommender systems nowadays play a crucial role in the online dissemination of (copyrighted) works, and can be defined as fully or partially automated systems used by online platforms to suggest specific information to recipients of the service in their interface, or to prioritise that information.

In 2020, a contribution to this blog reaffirmed that metadata matter for the future of copyright. Since then, the importance of metadata for a fair and well-functioning copyright system has arguably only further increased. Throughout the last five years, we have witnessed an ever-increasing online streaming market, and a boom in large-scale text and data mining (TDM) of online works for generative AI models. Both developments add to the importance of a satisfactory metadata ecosystem to enable proper remuneration and licensing practices.

The 2020 metadata post noted the then-recent “Feasibility study for the establishment of a European Music Observatory”, issued by the European Commission (EC). The matter remained on the EC’s agenda, spurring the launch of the EU Horizon Project Open Music Europe (OpenMusE) in 2022. The ongoing OpenMusE project provides new insights – both theoretical and empirical – that help uncover the metadata problems in the music industry. This blog post first outlines the main metadata challenges in the (European) music industry, distinguishing between technical, economic and legal challenges. It then turns to what copyright law, and other fields of law, can offer to overcome those challenges.

Metadata design – technical challenges

Missing or incorrect metadata in a dataset could be overcome, or at least mitigated, by interoperability between datasets. If a model describes metadata consistently, entities and concepts from various datasets can be linked. However, music metadata design faces numerous interoperability challenges. Interoperability would require, inter alia, a music metadata model that spans different genres and historical periods and accommodates various use cases over heterogeneous data sources. It therefore requires an approach that harmonises the requirements of different stakeholders, so that the resulting model can be adapted to different datasets (de Berardinis et al. 2023).

Requirements for interoperability can revolve around metadata granularity, i.e. the level of detail metadata provide. A specific category of data that can be expressed in terms of granularity is provenance metadata, which detail the origin, history, and lineage of data.
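To make the notions of a shared model, granularity and provenance more concrete, the following is a minimal Python sketch. Everything in it is a hypothetical illustration: the `Recording` class, the two source formats, the field names and the sample values are assumptions made for this example and are not taken from the OpenMusE project or from any existing standard.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Recording:
    """Hypothetical shared model for one recording (illustrative only)."""
    title: str
    performers: list[str]             # fine-grained: one entry per performer
    isrc: Optional[str] = None        # identifier, if the source supplies one
    provenance: list[str] = field(default_factory=list)  # sources of the record


def from_distributor(row: dict) -> Recording:
    """Map a coarse-grained source that stores all performers in one string."""
    performers = [p.strip() for p in row["artist"].split("&")]
    return Recording(row["track"], performers, row.get("isrc"), ["distributor-feed"])


def from_archive(row: dict) -> Recording:
    """Map a fine-grained source that already lists performers separately."""
    return Recording(row["title"], list(row["performers"]), row.get("isrc"),
                     ["archive-catalogue"])


def link(a: Recording, b: Recording) -> Optional[Recording]:
    """Merge two records if they plausibly describe the same recording."""
    same_isrc = a.isrc is not None and a.isrc == b.isrc
    same_title = a.title.casefold() == b.title.casefold()
    shared_performer = bool(set(a.performers) & set(b.performers))
    if not (same_isrc or (same_title and shared_performer)):
        return None
    performers = sorted(set(a.performers) | set(b.performers))
    # Provenance is concatenated so the merged record shows where it came from.
    return Recording(a.title, performers, a.isrc or b.isrc,
                     a.provenance + b.provenance)


# Made-up sample rows from two differently structured sources.
feed = {"track": "Example Song", "artist": "Alice & Bob", "isrc": None}
archive = {"title": "example song", "performers": ["Alice", "Carol"],
           "isrc": "ZZABC2400001"}
merged = link(from_distributor(feed), from_archive(archive))
print(merged)
```

The matching rule used here (same identifier, or same title plus one shared performer) is deliberately crude. A real model would need agreed rules for each stakeholder's use case, which is precisely the harmonisation problem described above.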

For music metadata, discrepancies in metadata granularity and provenance between datasets are – in part – caused by the disparate processes of releasing recordings in the music industry. Smaller, independent labels and artists commonly release their music online through music distributors. Research commissioned by the UK IPO in 2019 already noted that when smaller labels or artists move their work from one distributor to another, the new distributor is often free to assign new International Standard Recording Codes (ISRCs) instead of adopting the ISRCs assigned by its predecessor. Moreover, some assignors do not assign the code in accordance with the standard and guidelines of the International Federation of the Phonographic Industry (IFPI). New empirical findings from the OpenMusE project confirm that these problematic practices persist (Vieira 2025).
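The effect of reassigned or malformed ISRCs on dataset linking can be sketched in a few lines of Python. The sketch assumes the commonly described 12-character ISRC layout (a two-letter prefix, a three-character alphanumeric registrant code, a two-digit year of reference and a five-digit designation code). The function names, source labels and code values are made up for illustration and do not refer to real recordings.

```python
import re
from collections import defaultdict
from typing import Optional

# Assumed layout: 2 letters + 3 alphanumerics + 2 digits (year) + 5 digits.
ISRC_PATTERN = re.compile(r"[A-Z]{2}[A-Z0-9]{3}[0-9]{2}[0-9]{5}")


def normalise_isrc(raw: str) -> Optional[str]:
    """Strip separators, upper-case, and return None if the layout is wrong."""
    code = raw.replace("-", "").replace(" ", "").upper()
    return code if ISRC_PATTERN.fullmatch(code) else None


def group_by_isrc(rows: list[dict]) -> dict:
    """Group source names by normalised ISRC; invalid codes fall under None."""
    groups = defaultdict(list)
    for row in rows:
        groups[normalise_isrc(row["isrc"])].append(row["source"])
    return dict(groups)


# Four made-up reports that all concern the same recording.
rows = [
    {"source": "old distributor", "title": "Example Song", "isrc": "zz-abc-24-00001"},
    {"source": "streaming report", "title": "Example Song", "isrc": "ZZABC2400001"},
    {"source": "new distributor", "title": "Example Song", "isrc": "ZZXYZ2500007"},
    {"source": "manual upload", "title": "Example Song", "isrc": "ZZ-ABC-24-1"},
]
groups = group_by_isrc(rows)
for isrc, sources in groups.items():
    print(isrc, sources)
```

Although all four rows concern one recording, joining on the ISRC yields three separate groups: the two consistent reports, the reassigned code from the new distributor, and the malformed code that cannot be matched at all.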

The ISRC system is decentralized, with each national authority being responsible for managing the ISRCs in a specific country or territory. There is a public, apparently worldwide ISRC search tool on IFPI’s website. However, it shows signs of incompleteness for ISRCs of lesser-known (independent) repertoires, and for additional metadata in general. A likely partial cause is that there is no requirement to submit metadata during or after the process of assigning the codes.

That these issues persist around ISRCs, one of the most central data points in music metadata, raises the question of how much is still lacking in the availability and interoperability of other relevant music metadata that are (even) less consistently adopted.

Metadata availability – economic challenges

Small labels and artists can only do so much to remedy insufficient metadata. As could be inferred from the above, music metadata are created and held by many different actors, such as collecting societies, music labels, publishers and distributors, online content-sharing service providers (OCSSPs), and artists themselves. This makes for a fragmented landscape full of “data silos”. Interoperability between those silos is hampered not just by metadata design issues, but perhaps mostly by a lack of incentives for the most influential actors to share. Having a wealth of (music) metadata gives OCSSPs, music labels and conceivably collecting societies an advantage vis-à-vis their competitors.

This raises the question of what breakthrough could make the metadata flow. Fund one or more public organizations that function as metadata hubs? Stimulate commercial incentives to nudge big data owners to share? Or encourage them through legislative measures? Likely, a plethora of repertoires, actors and metadata requires a plethora of solutions.

A private initiative like Digiciti focuses on data exchange intermediation between private and public data holders. The company’s aim is that the data exchange “should be funded as far as possible via the organizations providing services which use it and benefit from it”. The initiative also hopes to make use of the framework of the 2022 Data Governance Act (DGA). The next section considers whether the DGA could be useful for improving music metadata interoperability.

What does the law have to offer?

That the current state of music metadata poses problems for effective copyright enforcement does not per se mean that copyright law, or the law in general, can solve those problems. Still, it is worth examining what adjustments in the EU legal framework can potentially ameliorate some of the dysfunctionalities surrounding music metadata and the copyright system.

In current EU copyright law, Article 17(4)(b) of the Copyright in the Digital Single Market (DSM) Directive can provide a starting point. The article imposes on OCSSPs a “best efforts” obligation to ensure the unavailability of specific works for which the rightholders have provided the OCSSPs with the relevant and necessary information. Rightholders’ notifications could be complemented with descriptive metadata requirements on the relevant works (Senftleben et al. 2022). More complete metadata within OCSSPs’ data pools can already have positive effects, e.g. more accurate remuneration. However, it would be particularly beneficial if interoperability between OCSSPs’ data sources and a (more) public repository were realised, or, alternatively, if Article 17(4)(b) also required the metadata to be submitted in parallel to a central body managing an EU copyright data repository.

That is where the potential of the DGA comes in. One of the DGA’s central aims is to further develop the borderless digital internal market, with domain-specific European data spaces for data pooling and sharing. An important role is given to data intermediation service providers (Chapter III). Article 2(11)(b) DGA excludes “services that focus on the intermediation of copyright-protected content”, of which OCSSPs are an example, as Recital 29 specifies. However, it could be argued that intermediation services for music metadata would not intermediate the copyright-protected content itself (i.e. the musical recordings), but merely the metadata of that content. If such a demarcation is applied, the DGA’s European Data Innovation Board (EDIB), which is tasked with DGA implementation and guidelines, could then facilitate interoperability standards for a European data space for the music sector.

Forthcoming research output from the OpenMusE project will further develop the ideas outlined above, as well as other legal solutions relating to Article 4(3) DSM Directive, the Data Act, and the Collective Rights Management Directive.

 

Conclusion

As mentioned, a plethora of repertoires, actors and metadata likely requires a plethora of solutions for a fair and well-functioning EU copyright marketplace. To legal practitioners active in the music sector: start (or continue) assisting your label and musician clients with completing all potentially relevant metadata. Of course, for fundamental, long-term solutions, we must turn to the relevant EU bodies. So, to the two EU legislative bodies: make sure that the music sector can also benefit from new data space legislation like the DGA, and consider aligning Article 17 DSM Directive with the metadata needs of the copyright system.


________________________

To make sure you do not miss out on regular updates from the Kluwer Copyright Blog, please subscribe here.

