‘Big Data’ used for the early identification of other diseases associated with cancer
A novel computer programme to help doctors and scientists to better understand which other diseases are likely to occur in patients with cancer.
What was the health challenge?
There were 17 million new cases of cancer worldwide in 2018 and that number continues to rise. Cancer has a huge impact on patients. It is also a major burden for health care systems across the world. Patients with cancer may be at a higher risk of also developing other diseases. If clinicians knew which other diseases are most likely to occur in patients with each type of cancer, they could prioritise detecting those diseases early or, if possible, preventing them from occurring.
What was the purpose of the research?
This research was undertaken to improve medical and research understanding of which additional diseases are most likely to occur in patients with cancer. The work focused on patients with the nine most commonly occurring cancers. The researchers first looked at the diseases most often occurring before each patient developed the cancer, in case any of those diseases actually increased the risk of the cancer occurring. They then looked at diseases occurring after the cancer was diagnosed, in case the presence of the cancer increased a patient’s risk of certain other diseases. Their aim was to develop a way of helping clinicians and researchers in the future to discover which other diseases might be associated with cancer in the same patient over time.
Why and how was health data used?
Researchers working at the College of Medical Science and Technology, Taipei Medical University in Taiwan, obtained permission to analyse data from the Taiwan National Health Insurance database, which covers almost all of the Taiwanese population. This database contains health information on around 782 million outpatient visits. The data they analysed included outpatient visits, dental visits, hospitalisations, medications prescribed, medications refilled, laboratory and imaging examinations, and operations.
What was the legal basis for using the data?
The data provided to the research team was fully anonymised.
What were the results?
The patient records in the database were first grouped and analysed according to each patient’s age and gender. The computer programme was set up to detect diseases occurring in any patient within three years of each other, and to display these disease associations in ways that would help a scientist to determine the strongest disease connections. These data were accumulated across millions of patients to highlight the disease associations that occur most often, rather than those that have merely coincided by chance. This collection of disease associations was used to create a visual image that can be rotated or zoomed to enable very detailed inspection. The resulting system is called the Cancer Associations Map Animation (CAMA).
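The grouping and pairing steps described above can be illustrated with a short Python sketch. This is not the CAMA implementation, which the text does not describe in detail. The record layout, the function name count_disease_pairs and the day-based approximation of the three-year window are all assumptions made for illustration, and the sketch only counts co-occurrences; it does not test whether an association is stronger than chance.

```python
from collections import Counter
from itertools import combinations

# Three-year window from the text, approximated in days (an assumption).
WINDOW_DAYS = 3 * 365


def count_disease_pairs(records):
    """Count patients per (age group, gender) stratum who had two diseases
    first diagnosed within WINDOW_DAYS of each other.

    records: iterable of (patient_id, age_group, gender, disease, date)
    tuples. This layout is hypothetical, not the real database schema.
    """
    first_seen = {}  # (patient_id, disease) -> earliest diagnosis date
    stratum = {}     # patient_id -> (age_group, gender); last record wins
    for pid, age_group, gender, disease, when in records:
        stratum[pid] = (age_group, gender)
        key = (pid, disease)
        if key not in first_seen or when < first_seen[key]:
            first_seen[key] = when

    # Collect each patient's diseases with their first diagnosis dates.
    by_patient = {}
    for (pid, disease), when in first_seen.items():
        by_patient.setdefault(pid, []).append((when, disease))

    counts = Counter()
    for pid, events in by_patient.items():
        events.sort()  # chronological order, ties broken by disease name
        for (d1, first), (d2, second) in combinations(events, 2):
            if (d2 - d1).days <= WINDOW_DAYS:
                # 'first' was diagnosed no later than 'second'
                counts[(stratum[pid], first, second)] += 1
    return counts
```

Each key in the returned counter is a stratum together with an ordered pair of diseases, so the same structure can show diseases that tend to precede a cancer as well as those that tend to follow it. A real analysis would also need a statistical comparison against the rates expected by chance.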
What was the benefit to healthcare systems?
Patients with cancer could benefit from this development, as their clinicians can more easily see which other diseases a patient might be at risk of, and can then seek to detect those diseases early or, where possible, prevent them from occurring.
The results of this research have been openly published. The CAMA computer system is also openly available to researchers, worldwide.
This research also illustrates a benefit of using ‘Big Data’: the ability to analyse a large number of patient records whilst protecting each patient’s identity.
Further information
Scientific research paper: