Executive summary
The principal objective of the Health Data Catalog Application Profile (HealthDCAT-AP) is to establish a standardised, interoperable metadata schema tailored to the health domain. This schema is intended to facilitate the discovery, sharing, and reuse of health datasets across the European Union. By aligning with the broader DCAT-AP (Data Catalog Vocabulary Application Profile) standard, HealthDCAT-AP describes health-related datasets uniformly, which promotes interoperability and supports data exchange among researchers, public health institutions, policymakers, and other stakeholders. HealthDCAT-AP introduces specific extensions to the DCAT-AP model to meet the requirements of the health sector. This standardisation is crucial for the European Health Data Space (EHDS) framework because it enables effective health data sharing across Europe. By making health data easier to find and access, HealthDCAT-AP helps the EHDS achieve its objectives of improving public health outcomes, advancing research, and informing policy-making across the EU, all while upholding stringent data protection and privacy standards.

The Health Data Cataloging Literacy digital platform is based on the second and final deliverable of Work Package 6 of the EHDS2 pilot project consortium. The first deliverable, the draft specification of HealthDCAT-AP, is published as the HealthDCAT-AP Draft Specification. The development of HealthDCAT-AP will continue transparently and publicly within the framework of the Second Joint Action Towards the European Health Data Space (TEHDAS2). TEHDAS2 lays the groundwork for the harmonised implementation of secondary health data use within the European Health Data Space (EHDS), advancing the EU's vision of a connected and interoperable health data ecosystem.

This platform is designed to provide practical support to data holders, including health data access bodies, researchers, policymakers, and other stakeholders involved in managing and sharing health data across Europe. It offers guidance on cataloging health data in compliance with the HealthDCAT-AP specification, facilitating the adoption of metadata standards that enhance data discoverability, accessibility, and interoperability.

By promoting structured and harmonised metadata practices, the platform aims to empower data providers to document and share their datasets effectively, in compliance with EU regulations and in a way that fosters trust in cross-border health data exchange. The platform also serves as an educational resource, equipping users with the knowledge and tools needed to navigate the evolving landscape of health data governance. Through interactive resources, best practices, and case studies, it supports stakeholders in implementing metadata strategies that align with the principles of FAIR (Findable, Accessible, Interoperable, and Reusable) data management. By fostering digital literacy in health data cataloging, the platform contributes to the broader goal of building a robust and sustainable European health data infrastructure that enables secure and ethical data reuse for research, innovation, and policy-making.

HealthDCAT-AP is an extension of the DCAT-AP metadata standard specifically tailored to describe health datasets in the European Health Data Space (EHDS). It aims to improve the interoperability, discoverability, and reuse of health data across Europe by aligning with DCAT-AP and incorporating health-specific metadata elements. This initiative supports the EHDS framework by facilitating standardised metadata descriptions for datasets, enabling seamless data sharing and secondary use for research and policy-making. By leveraging FAIR data principles, HealthDCAT-AP ensures that health datasets are findable, accessible, interoperable, and reusable. Its development follows an open and collaborative process, with stakeholders contributing via GitHub to refine and enhance its specifications. Ultimately, HealthDCAT-AP serves as a crucial component in harmonising metadata management across EU health data catalogues.

HealthDCAT-AP extends DCAT-AP to meet the specific metadata requirements of the health sector within the European Health Data Space (EHDS). While it maintains the fundamental structure of DCAT-AP, it introduces additional classes and metadata elements designed for health dataset catalogues. This ensures compliance with EHDS regulations, including metadata interoperability between national and EU-level dataset catalogues. The profile also supports structured descriptions of dataset sources, scope, main characteristics, and access conditions, allowing data users to efficiently discover and assess relevant health data. Furthermore, HealthDCAT-AP promotes semantic annotation, knowledge graph integration, and support for advanced search functionalities, ensuring metadata is both machine-readable and suitable for AI-driven applications. Its ultimate goal is to streamline metadata exchange, enhance dataset interoperability, and enable federated catalogues for health data sharing.
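
As a rough illustration of what federated discovery over harmonised metadata could look like, the following Python sketch merges records from two catalogues and filters them by keyword and access condition. It is not part of the HealthDCAT-AP specification: the CatalogueRecord class, its field names, the access-rights values, and the example.org URIs are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class CatalogueRecord:
    """Hypothetical, simplified metadata record (not the HealthDCAT-AP model)."""
    uri: str               # persistent identifier of the dataset
    title: str
    source_catalogue: str  # which catalogue supplied the record
    access_rights: str     # illustrative values: "public", "restricted"
    keywords: tuple = ()


def federate(*catalogues):
    """Merge several catalogues, keeping the first record seen for each URI."""
    merged = {}
    for catalogue in catalogues:
        for record in catalogue:
            merged.setdefault(record.uri, record)
    return list(merged.values())


def search(records, keyword, access_rights=None):
    """Return records matching a keyword (case-insensitive) and, optionally, an access condition."""
    wanted = keyword.lower()
    hits = []
    for record in records:
        if wanted not in (k.lower() for k in record.keywords):
            continue
        if access_rights is not None and record.access_rights != access_rights:
            continue
        hits.append(record)
    return hits


# Made-up records from two imaginary national catalogues.
national_a = [
    CatalogueRecord("https://example.org/dataset/1", "Example cancer registry",
                    "catalogue-a", "restricted", ("cancer", "registry")),
    CatalogueRecord("https://example.org/dataset/2", "Example vaccination survey",
                    "catalogue-a", "public", ("vaccination",)),
]
national_b = [
    CatalogueRecord("https://example.org/dataset/1", "Example cancer registry (copy)",
                    "catalogue-b", "restricted", ("cancer",)),
    CatalogueRecord("https://example.org/dataset/3", "Example cancer screening cohort",
                    "catalogue-b", "public", ("Cancer", "screening")),
]

federated = federate(national_a, national_b)
for hit in search(federated, "cancer"):
    print(hit.uri, hit.source_catalogue, hit.access_rights)
```

Keying the merge on the dataset URI is one simple way to avoid listing the same dataset twice when it appears in more than one catalogue. A real federated catalogue would need a more careful reconciliation policy than "first record wins".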

The HealthDCAT-AP model provides a structured approach to describing health datasets, integrating both standard DCAT-AP elements and new extensions tailored to the health domain. It includes specific metadata elements to facilitate data discovery, access, and quality assessment, ensuring datasets are described in a way that meets the needs of researchers, policymakers, and data users. The model introduces health-specific properties such as healthCategory, qualityAnnotation, and populationCoverage, allowing datasets to be classified and evaluated according to EHDS requirements. It also supports interoperability through persistent dereferenceable URIs and aligns with existing health data standards and vocabularies. Additionally, the model enables federated cataloguing by linking metadata across different sources, contributing to the creation of a comprehensive health data knowledge graph. By incorporating these features, HealthDCAT-AP ensures that health datasets are not only discoverable but also properly structured for advanced analytics and AI applications.
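
To make the idea of health-specific properties more concrete, the sketch below builds a JSON-LD-style dataset description in Python and checks it for missing properties. It is an illustration under stated assumptions, not an excerpt from the specification: the healthdcatap namespace IRI is a placeholder, the property spellings follow the names mentioned above, the set of properties treated as required is invented for the example, and all dataset values are made up.

```python
import json

# The dcat and dct IRIs are the standard W3C DCAT and Dublin Core Terms namespaces.
# The healthdcatap IRI is a placeholder (assumption); take the real one from the
# published HealthDCAT-AP specification.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "healthdcatap": "https://example.org/healthdcatap#",
}

# Hypothetical choice of properties this sketch treats as required.
REQUIRED = ("dct:title", "dct:description", "healthdcatap:healthCategory")


def make_dataset(uri, title, description, health_categories, population_coverage=None):
    """Build a JSON-LD-style dictionary describing one health dataset."""
    record = {
        "@context": CONTEXT,
        "@id": uri,  # persistent identifier of the dataset
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "healthdcatap:healthCategory": list(health_categories),
    }
    if population_coverage:
        record["healthdcatap:populationCoverage"] = population_coverage
    return record


def missing_properties(record):
    """List the required properties that are absent or empty in a record."""
    return [name for name in REQUIRED if not record.get(name)]


# Made-up example values.
record = make_dataset(
    "https://example.org/dataset/cancer-registry",
    "Example cancer registry",
    "Illustrative record only.",
    ["https://example.org/health-category/registry"],
    population_coverage="Adults diagnosed in an example region",
)

print(json.dumps(record, indent=2))
print("Missing:", missing_properties(record))
```

A check like missing_properties only covers presence. Validating a record against the actual profile would typically rely on the constraints published with the specification instead of a hand-written list.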