The right to individual privacy meets the need for data to enhance public and personal health
The synergy of artificial intelligence (AI) and medical imaging has opened new horizons for healthcare. AI-mediated computer vision has been in use in oncology for more than 20 years; however, its arrival in other medical fields, notably dentistry, is more recent. Dentistry is particularly notable here because, while relatively few people ever see an oncologist, almost everyone sees a dentist at least every few years. The application of AI in dentistry thus extends to a much broader population. Rapid advances in AI oblige us to consider system design choices in light of possible futures that we may discern only indistinctly today. In healthcare, the stakes are high – higher, arguably, than in any other area of AI deployment.
- AI systems used in healthcare are "discriminative," meaning that they are trained to classify data based on expert-labeled examples. In contrast, "generative" AI systems create new data.
- Existing data protection regulations such as GDPR and HIPAA are important for patient privacy, but pose ethical challenges for training AI systems.
- All healthcare stakeholders will have to work together to find ways of pooling data for further medical AI development while preserving individual privacy.
Discriminative vs generative AI
The family of machine learning algorithms most often used in medical imaging and diagnostic applications is called “discriminative,” as these are techniques for discriminating between existing data points. Discriminative systems are trained to classify data, estimating the likelihood that a feature of interest is present in the signal data (an early-stage lung nodule in a chest CT scan, for example). These algorithms should not be confused with the “generative” algorithms applied in popular new AI tools like ChatGPT or DALL-E, which create – or “hallucinate” – entirely new data points that in some way resemble the data on which they are trained.
The ability to fabricate outputs so believable that we cannot tell they are fabrications has raised concerns about the application of generative algorithms in some arenas – including medicine, where human health is at stake. Those concerns will no doubt moderate the deployment of generative algorithms in healthcare. The algorithms with the greatest immediate utility in medicine, however, are discriminative. Discriminative algorithms are measured by the objective accuracy of their output, not their semblance of accuracy, and, like all machine learning algorithms, they are able to ingest and draw value from data at a rate that far exceeds human capacity.
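The distinction above can be made concrete with a minimal sketch of how a discriminative model scores a scan. The logistic form is standard, but the feature names and weights below are purely illustrative, not drawn from any real diagnostic system; a deployed model would learn its weights from expert-labeled examples.

```python
import math

def sigmoid(z: float) -> float:
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def nodule_probability(features: list[float], weights: list[float], bias: float) -> float:
    # A discriminative model estimates P(label | data) directly:
    # here, the probability that a feature of interest (e.g., a lung
    # nodule) is present, given measurements extracted from the scan.
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# Hypothetical learned parameters and hypothetical per-scan measurements
# (say, candidate-region diameter in mm and a brightness statistic).
weights = [0.8, 0.3]
bias = -2.0
scan_features = [4.2, 1.1]

p = nodule_probability(scan_features, weights, bias)
print(f"P(nodule present) = {p:.2f}")
```

The key point is what the model does *not* do: it never produces new image data, only a probability judgment about data it is shown. A generative model, by contrast, would sample entirely new data points resembling its training distribution.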
Ethical considerations
The linkage between AI and data clearly has enormous potential significance for medicine. Unfettered access to medical data hones the precision of diagnostic tools, and the ability of AI to detect patterns in large volumes of data will reveal connections and interactions that we do not now suspect. The bottleneck of hypothesis – the requirement to begin with a theory in order to define and obtain funding for medical research – is eliminated when the data itself yields answers without waiting for questions to be asked.
Currently, medical information systems are largely of the “walled garden” type: They are held within a practice, insurer, or medical center with limited outside access. In order to make the best use of data generated within these systems, methods of anonymizing and pooling large masses of patient data will be needed, together with scalable AI systems capable of scouring immense reservoirs of multimodal data.
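One common first step toward pooling records across such walled gardens is pseudonymization: replacing direct identifiers with salted one-way hashes before data leaves an institution. The sketch below illustrates only that single step under assumed field names; it is emphatically not full anonymization in the GDPR or HIPAA sense, which requires handling indirect identifiers as well.

```python
import hashlib

# Salt kept private by the data-holding institution, so pseudonyms
# cannot be recomputed by outside parties. Illustrative value only.
SALT = b"institution-secret-salt"

def pseudonymize(patient_id: str) -> str:
    """Replace a patient identifier with a stable, non-reversible pseudonym."""
    return hashlib.sha256(SALT + patient_id.encode()).hexdigest()[:16]

# Hypothetical record shape; real systems carry far richer metadata.
record = {"patient_id": "MRN-00123", "finding": "early-stage nodule"}
pooled_record = {**record, "patient_id": pseudonymize(record["patient_id"])}
```

Because the same identifier always maps to the same pseudonym within an institution, longitudinal records can still be linked after pooling, which is what makes this approach attractive for training AI systems on multi-institution data.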
Of course, unchecked access to medical data adds to the ethical considerations that attend any discussion of AI-integrated healthcare. As AI technologies require vast amounts of data to improve accuracy and efficacy, the question of how to protect patient privacy while leveraging data insights becomes increasingly complex. The General Data Protection Regulation (GDPR) in Europe and the Health Insurance Portability and Accountability Act (HIPAA) in the United States offer frameworks for data protection and privacy, but they currently fail to account for the considerable value that pooled data could bring to medical research and patient care.
Collaborating for the future
As the interoperability of digital systems continues to grow, we can expect a dramatic increase in the novel information produced by AI. The rapid advancement of AI technologies often outpaces regulatory measures, however, so finding an ideal balance between innovation and privacy will require considerable effort from all corners of the healthcare system. That effort must begin with thoughtful collaboration among policymakers, technologists, and healthcare providers.
To maximize tomorrow’s benefits, we need to think today about the most efficient ways to foster interconnectivity and the most effective ways to resolve complex questions about privacy, ownership, intellectual property, and system supervision. We should, for example, develop ethical guidelines around health data that account not only for the privacy of patients (as GDPR and HIPAA do) but also for the benefits that innovative new technologies can bring to patient care.
In medicine – as opposed to, say, advertising – data is a public good. The challenge for medicine in the digital age will be to find ways of pooling information for the common benefit of all patients, while still protecting individual privacy and preserving the privatized nature of our healthcare system.