Glossary of terms

Most of the terms listed here have been defined in a glossary produced by the Connected Health Cities Programme in the UK. Some have been drawn from the IMI Code of Practice. Some terms have been defined in accordance with international (ISO) standards or European Commission publications.

They have been reproduced or adapted here with permission.



Access Control: A means of ensuring that the people who have been given access to all or part of a data record have been approved to do so.

Aggregated data: Statistical data about multiple individuals that has been combined to show general trends or values without identifying individuals within the data.
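
As an illustration only, the idea of aggregation can be sketched in a few lines of Python. The records, field names and the `aggregate_by_band` function below are invented for this example and do not come from any real data set. The sketch reduces individual-level rows to a count and an average per age band, so that only group-level values remain. Very small groups can still point to individuals, so real releases usually need further safeguards that this sketch leaves out.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual-level records (invented for illustration).
records = [
    {"age_band": "40-49", "systolic_bp": 128},
    {"age_band": "40-49", "systolic_bp": 134},
    {"age_band": "50-59", "systolic_bp": 141},
    {"age_band": "50-59", "systolic_bp": 137},
    {"age_band": "50-59", "systolic_bp": 145},
]


def aggregate_by_band(rows):
    """Return the count and mean systolic blood pressure per age band."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["age_band"]].append(row["systolic_bp"])
    return {
        band: {"count": len(values), "mean_systolic_bp": mean(values)}
        for band, values in sorted(groups.items())
    }


# Only group-level values are kept; the individual rows are not reproduced.
summary = aggregate_by_band(records)
print(summary)
```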

Algorithm: A process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer.

Anonymisation: The process of rendering data into a form which does not identify individuals and where identification is not considered feasible.

Anonymous: Information that has been modified so that it is no longer reasonably possible to identify any of the individuals whose data is included.

Artificial Intelligence (AI): Artificial intelligence refers to systems designed by humans that, given a complex goal, act in the physical or digital world by perceiving their environment, interpreting the collected structured or unstructured data, reasoning on the knowledge derived from this data and deciding the best action(s) to take (according to pre-defined parameters) to achieve the given goal.

Audit trail or Audit log: A record of everyone who has looked at or changed a record, why and when they did so and what changes were made.

Authentication: A process of reliably identifying persons or devices by securely associating an identifier with a means of verification.


Blockchain: The best-known distributed ledger technology. It provides trust, traceability and security in systems that exchange data or assets by enabling a final and definitive record of transactions to be held in a network across a series of nodes, avoiding both a single centralised location and the need for intermediaries’ services.


Care Pathway: A plan agreed by all disciplines involved in the care of a specific group of patients (such as patients who have suffered a stroke), which outlines the agreed standards for treatment, based on the best available evidence. It makes clear what different tasks (interventions) need to be done by which professionals, when and where.

Carer or Caregiver: An individual who provides unpaid care to a patient or service user, most commonly a family member or friend.

Clinical Audit: A method for improving practice, patient care or services provided. It is used to compare current practice with a set of standards or criteria, then identify areas for improvement, make changes to practice and re-audit to ensure that improvement has been achieved.

Clinical Trial: An investigation in human subjects intended to discover or verify the effect of one or more investigational health interventions (e.g., drugs, diagnostics, devices, therapy protocols) and to generate safety and efficacy data before the health intervention is made available in health care.

Cloud Computing: The storing, processing and use of data on remotely located computers accessed over the internet.

Confidential Data or Confidential Information: See Personally identifiable data.

Confidentiality: Ensuring that information is not made available or disclosed to unauthorised individuals or organisations.

Consent: See Explicit Consent and Implied Consent.

Cyber Security: The processes used to safeguard and secure information assets from being stolen or attacked.


Database: A structured set of data held in a computer, in a form that is well suited to analysis.

Data Breach: Any failure to meet the requirements of the General Data Protection Regulation, unlawful disclosure or misuse of personally identifiable data, or inappropriate invasion of people’s privacy.

Data Controller: An individual or organisation that determines the purposes for which, and the manner in which, any personally identifiable data is or will be processed. It is the responsibility of the Data Controller to ensure that any processing of personally identifiable data has an appropriate legal basis.

Data Linkage: A technique that involves bringing together and analysing data from a variety of sources, typically data that relates to the same individual.
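
A minimal sketch of the idea in Python, for illustration only. The two sources, their fields, the shared keys ("p01" and so on) and the `link` function are all invented for this example. It assumes both sources already identify each person with the same key, which is often the hardest part of linkage in practice.

```python
# Hypothetical extracts from two sources that use the same key per person.
gp_records = {"p01": {"asthma": True}, "p02": {"asthma": False}}
hospital_records = {"p01": {"admissions": 2}, "p03": {"admissions": 1}}


def link(left, right):
    """Combine the fields of records whose key appears in both sources."""
    shared_keys = left.keys() & right.keys()
    return {key: {**left[key], **right[key]} for key in sorted(shared_keys)}


# Only "p01" appears in both sources, so only that record is linked.
linked = link(gp_records, hospital_records)
print(linked)
```

The linked record holds facts that neither source contains on its own, which is why linked data can need stronger protection than each source separately.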

Data Processor: Any person (other than an employee of the Data Controller) who processes personally identifiable data on behalf of the Data Controller. Data Controllers must choose Data Processors carefully and have in place a written contract (detailing the information governance requirements) and effective means of monitoring, reviewing and auditing their processing.

Data Security: Protecting data and information systems from unauthorised access, use, disclosure, disruption, modification or destruction.

Data Sharing: The disclosure of data from one or more organisations to another organisation or organisations, or the sharing of data between different parts of a single organisation.

Data Subject: An identified or identifiable natural person, who is the subject of personal data.

Deep Learning: A machine learning approach that is often particularly accurate and needs less human guidance. The name refers to the fact that the neural network has several layers between the input and the output, learning the overall input-output relation in successive steps.

De-identification: A general term for any process that removes the association between a set of identifiable data and the person it relates to. It is used here specifically to mean the removal of patient identifiers from data.

De-personalised data: This is information that does not identify an individual, because identifiers have been removed or scrambled. However, the information is still about an individual person and so needs to be protected. It might, in theory, be possible to re-identify the individual if the data was not adequately protected, for example if it was combined with different sources of information.


Encryption: A process of maintaining data integrity and confidentiality by converting plain data into a secret code with the help of an algorithm. The corresponding reverse process is "decryption", a transformation that restores encrypted data to its original form. Only authorised users with a key can decrypt encrypted data. Encryption is regarded as an effective defence against the abuse of information technology, such as hacking, identity and personal data theft, fraud and the improper disclosure of confidential information. It supports cybersecurity, data protection and privacy.


Identifiable information: See ‘Personally identifiable data’.

Identifier: An item of data, which by itself or in combination with other identifiers, enables an individual to be identified.

Implied consent: An unwritten agreement between the patient and the health and social care professionals who provide their care, which allows the patient's data to be collected, processed and shared as long as it is relevant to their care, it is kept confidential and the patient has not objected.

Indirect care: Purposes other than the individual care of the patient. This includes activities that contribute to the overall provision of services to a population as a whole or to a group of patients with a particular condition. It also covers health services management, preventative medicine and medical research. Examples of such activities are risk prediction and stratification, service evaluation, needs assessment and financial audit.

Information governance: How organisations manage the way that data is handled within the health and social care system in England. It covers the policy and legal requirements that organisations need to meet to ensure that data is handled legally, securely, efficiently, effectively and in a manner which maintains public trust.


Linkage: The merging of information or data from two or more sources, with the object of combining facts concerning an individual or an event, which are not available in any separate record.


Machine learning: A sub-discipline of artificial intelligence concerned with the ability of software or a computer to learn from its environment or from a very large set of representative data, enabling systems to adapt their behaviour to changing circumstances or to perform tasks for which they have not been explicitly programmed.


Patient data: Data that is collected about a patient whenever they go to a doctor or receive social care. It may include details about the individual’s physical or mental health, such as height and weight or detail of any allergies, and their social care needs and services received. It may also include next of kin information.

Personally identifiable data: Personal information about identified or identifiable individuals, which should be kept securely and only used for agreed, legally approved purposes.

Personal data: According to the General Data Protection Regulation, this is data that relates to a living individual who can be identified from this data, or from a combination of this data and other data which is in the possession of, or is likely to come into the possession of, the data controller.

Primary care: Primary care refers to services provided by organisations such as GP practices, dental practices, community pharmacies and high street optometrists.

Pseudonym: A unique identifier (sometimes created by scrambling an actual identifier), which does not itself reveal an individual’s ‘real world’ identity but distinguishes between different individuals in a data set.
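
One common way to "scramble" an identifier is a keyed hash. The Python sketch below is an illustration under stated assumptions, not a description of how any particular organisation creates pseudonyms. The `pseudonymise` function, the example identifiers and the key are all hypothetical. It uses HMAC-SHA256 from the Python standard library: the same identifier always yields the same pseudonym, different identifiers yield different pseudonyms, and the original identifier does not appear in the result.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Derive a pseudonym from an identifier using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Placeholder key for illustration; a real key must be generated and stored securely.
key = b"example-key-keep-secret"

first = pseudonymise("patient-0001", key)
repeat = pseudonymise("patient-0001", key)
other = pseudonymise("patient-0002", key)

print(first == repeat)  # same person, same pseudonym
print(first == other)   # different people, different pseudonyms
```

Anyone who holds the key can recompute the pseudonym for a known identifier, so pseudonyms of this kind are only as well protected as the key itself.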

Pseudonymous Data: Data in which individuals are distinguished by a unique identifier that does not reveal their ‘real world’ identity (see also Anonymisation).

Public interest: Something ‘in the public interest’ is something that serves the interests of society as a whole. The ‘public interest test’ is used to determine whether the benefit of disclosing personally identifiable patient data outweighs both the personal interest of the individual concerned and the need to protect the public’s trust in the confidentiality of services.


Sensitive personal data: Data that identifies a living individual regarding his or her: racial or ethnic origin, political opinions, religious beliefs or other beliefs of a similar nature, membership of a trade union, physical or mental health or condition, sexual life, convictions, legal proceedings against the individual or allegations of offences committed by the individual.

Study Participant: Any person participating in a research study, whether or not a clinical trial. It can refer to patients or healthy volunteers (it does not include health care professionals).