The impact of commercial health datasets on medical research and health-care algorithms.

Title: The impact of commercial health datasets on medical research and health-care algorithms.
Publication Type: Journal Article
Year of Publication: 2023
Authors: Alberto IRI, Alberto NRI, Ghosh AK, Jain B, Jayakumar S, Martinez-Martin N, McCague N, Moukheiber D, Moukheiber L, Moukheiber M, Moukheiber S, Yaghy A, Zhang A, Celi LA
Journal: Lancet Digit Health
Volume: 5
Issue: 5
Pagination: e288-e294
Date Published: 2023 May
ISSN: 2589-7500
Keywords: Algorithms, Biomedical Research, Consumer Health Information, Datasets as Topic, Humans, Privacy, Reproducibility of Results
Abstract

As the health-care industry emerges into a new era of digital health driven by cloud data storage, distributed computing, and machine learning, health-care data have become a premium commodity with value for private and public entities. Current frameworks of health data collection and distribution, whether from industry, academia, or government institutions, are imperfect and do not allow researchers to leverage the full potential of downstream analytical efforts. In this Health Policy paper, we review the current landscape of commercial health data vendors, with special emphasis on the sources of their data, challenges associated with data reproducibility and generalisability, and ethical considerations for data vending. We argue for sustainable approaches to curating open-source health data to enable global populations to be included in the biomedical research community. However, to fully implement these approaches, key stakeholders should come together to make health-care datasets increasingly accessible, inclusive, and representative, while balancing the privacy and rights of individuals whose data are being collected.

DOI: 10.1016/S2589-7500(23)00025-0
Alternate Journal: Lancet Digit Health
PubMed ID: 37100543
PubMed Central ID: PMC10155113
Grant List: R01 EB017205 / EB / NIBIB NIH HHS / United States
R56 EB017205 / EB / NIBIB NIH HHS / United States