This addition is critical because the purpose behind data collection matters for any dataset, whether personal or non-personal. The metadata becomes especially important for personal data, where it supports the requirements of the General Data Protection Regulation (GDPR). Under GDPR, organisations must clearly articulate the legal basis and purpose for processing personal data, so this information is essential for datasets involving personal information. Including these properties in HealthDCAT-AP gives a more complete picture of the context and rationale for data collection, improves dataset transparency, and helps users make informed decisions when accessing and using health data.

Property: Purpose
URI: dpv:hasPurpose
Range: Literal
Definition: A free text statement of the purpose of the processing of data or personal data.
Usage note: The purpose or goal should sufficiently describe the intention or objective of why the data or technology is being used, and should be broader than a mere technical description of achieving a capability. For example, "Analyse Data" is an abstract purpose that gives no indication of what the analysis is for, whereas a purpose such as "Marketing" or "Service Provision" makes the 'purpose' clear and comprehensible and can be enhanced with additional descriptions.

Property: Legal basis
URI: dpv:hasLegalBasis
Range: rdfs:Resource, expressed as a URI.
Definition: The legal basis used to justify processing of personal data.
Usage note: Legal bases are defined by legislation and regulations whose applicability is usually restricted to specific jurisdictions, which can be represented using dpv:hasJurisdiction or dpv:hasLaw. A legal basis can be used without such declarations, e.g. 'Consent'; however, its interpretation will require association with a law, e.g. 'EU GDPR'.
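
Because the range is a resource expressed as a URI, a legal basis can be given as a direct reference to a concept rather than as a described blank node. The following is a minimal sketch only: the dataset IRI is hypothetical, and both the eu-gdpr namespace and the concept identifier eu-gdpr:A6-1-e are assumptions that should be checked against the DPV release in use.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dpv:     <https://w3id.org/dpv#> .
# Assumption: namespace of the DPV extension for EU GDPR (DPV 2.x layout).
@prefix eu-gdpr: <https://w3id.org/dpv/legal/eu/gdpr#> .

# Hypothetical dataset IRI, for illustration only.
<https://example.org/dataset/1>
  a dcat:Dataset ;
  # Assumed identifier for GDPR Art. 6(1)(e), a task carried out in the public interest.
  dpv:hasLegalBasis eu-gdpr:A6-1-e .
```

Referencing a concept from a law-specific vocabulary ties the legal basis to the law that defines it, which is the association the usage note asks for.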

HealthDCAT-AP also incorporates the Personal Data extension of the Data Privacy Vocabulary (DPV) specification, which provides additional concepts to represent different types and categories of personal data. The Personal Data extension (DPV-PD) thus offers a harmonised, standardised RDF framework for describing sensitive information in the context of HealthDCAT-AP. Extending DCAT-AP with DPV-PD allows the sensitive nature of health datasets containing personal information to be described and understood uniformly across dataset catalogues, promoting interoperability and consistency.

Property: Personal Data
URI: dpv:hasPersonalData
Range: rdfs:Resource, expressed as a URI.
Definition: Key elements that represent an individual in the dataset.
Usage note: This definition of personal data encompasses the concepts used in GDPR Art.4-1 for 'personal data' and ISO/IEC 2700 for 'personally identifiable information (PII)'.

Real-world example:
The dataset 'Linking of registers for COVID-19 vaccine surveillance' links selected variables from existing registries in order to monitor COVID-19 vaccines in the phase following their marketing authorization. Because this dataset contains person-level information on Belgian citizens, its creation required approval from the Information Security Committee. The dataset's legal basis, purpose, and personal data categories are transparently communicated to data users through the Data Privacy Vocabulary properties dpv:hasLegalBasis, dpv:hasPurpose, and dpv:hasPersonalData.
# dpv:hasPurpose (range: dpv:Purpose)
dpv:hasPurpose [
  a dpv:Purpose ;
  dct:description """The primary objective of Sciensano's LINK-VACC project is
to monitor COVID-19 vaccines post-authorization and evaluate the public health value
of prioritizing vaccination for people with comorbidities.
This involves assessing the vaccines' effectiveness and safety in
the broader population context, beyond the limited scope of clinical trials,
and determining future vaccination policies
in public health emergencies such as epidemics or pandemics"""@en
] ;
# dpv:hasLegalBasis (range: dpv:LegalBasis)
dpv:hasLegalBasis [
  a dpv:LegalBasis ;
  dct:description "CSI Deliberation no. 21/028 of february 18, 2021, last amended on june 18, 2021, relating to the communication of data to pseudonymized personal character relating to the health of vaccinnet+, healthdata covid-19 database i and ii, healthdata covid-19 clinical database, cobrha, statbel and the agency intermutualist in sciensano, as part of the link-vacc project and the subsequent processing of personal data pseudonymised by the federal drug agency in view monitoring the safety of covid-19 vaccines"@en ;
  dct:source <https://www.ehealth.fgov.be/ehealthplatform/file/view/AXkNfdPml9vUUfvGGfJr?filename=21-028-f212-AFMPS-vaccinnet-modifi%C3%A9e%20le%2018%20juin%202021.pdf>, <https://www.ehealth.fgov.be/ehealthplatform/file/view/AX_-9sZSuwVJMAnC0ENo?filename=21-028-f166-LINK-VACC-modifi%C3%A9e%20le%205%20avril%202022.pdf>
] ;

# dpv:hasPersonalData (range: dpv:PersonalData)
dpv:hasPersonalData dpv-pd:Gender, dpv-pd:Age, dpv-pd:Location, dpv-pd:Nationality,
  dpv-pd:Education, dpv-pd:HealthRecord ;
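
The fragments above omit prefix declarations and a subject, so they do not parse on their own. The following sketch shows one way they might be assembled into a complete Turtle document. The dataset IRI is hypothetical, the descriptions are shortened for readability, and the dpv-pd namespace is an assumption that should be checked against the DPV release in use.

```turtle
@prefix dcat:   <http://www.w3.org/ns/dcat#> .
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix dpv:    <https://w3id.org/dpv#> .
# Assumption: DPV-PD namespace as laid out in DPV 2.x; adjust to the release in use.
@prefix dpv-pd: <https://w3id.org/dpv/pd#> .

# Hypothetical dataset IRI, used only to give the fragments a subject.
<https://example.org/dataset/link-vacc>
  a dcat:Dataset ;
  dct:title "Linking of registers for COVID-19 vaccine surveillance"@en ;
  dpv:hasPurpose [
    a dpv:Purpose ;
    dct:description "Monitor COVID-19 vaccines post-authorization (shortened for this sketch)."@en
  ] ;
  dpv:hasLegalBasis [
    a dpv:LegalBasis ;
    dct:description "CSI Deliberation no. 21/028 (shortened for this sketch)."@en
  ] ;
  dpv:hasPersonalData dpv-pd:Gender, dpv-pd:Age, dpv-pd:HealthRecord .
```

Attaching all three properties to the same dcat:Dataset keeps the purpose, legal basis, and personal data categories discoverable together in a catalogue record.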