Multimodal AI in medicine: radiology, pathology and the future of genomic precision

11 Jun 2026

Med-Gemini-Polygenic, a Google DeepMind model, predicted the risk of depression, stroke, glaucoma, rheumatoid arthritis, all-cause mortality, coronary artery disease, COPD and type 2 diabetes from genomic data, and it outperformed traditional linear polygenic scores on all eight outcomes. For six additional conditions, it made predictions without having been specifically trained on them. This is what AI adds to precision medicine: the model not only classifies better, it generalizes.

Why multimodality changes medicine

Medical practice was never monomodal. A lung cancer diagnosis draws on tomographic imaging (visual), the radiology report (text), a biopsy with immunohistochemistry (microscopic image plus molecular data), patient history (text), blood tests (numerical data) and tumor genomic analysis (sequence). Processing each modality separately, with distinct tools interpreted by distinct specialists, is like reading a book one letter at a time.

Multimodal models that integrate all these sources into a single inference have the potential to capture correlations across modalities that human specialists cannot systematize. An experienced radiologist intuitively integrates what they see in the image with the patient's history. A model trained on millions of cases can systematize this process in a scalable way.

Med-Gemini: radiology, pathology and genomics in one architecture

Med-Gemini from Google DeepMind is the family of models that demonstrates this potential in the most comprehensive way documented through 2026. It is organized into four sub-models:

Med-Gemini-L (text and long context): Scores 91.1% on MedQA, 4.6 percentage points above its predecessor Med-PaLM 2. It uses uncertainty-guided web search to bring in up-to-date medical literature. It outperformed GPT-4 on every benchmark where direct comparison was possible and set a new state of the art on 10 of the 14 medical benchmarks evaluated.

Med-Gemini-2D (2D medical images): Trained on chest X-rays, CT slices, histopathology slides, and ophthalmology and dermatology images. Its chest X-ray reports surpass the previous state of the art by up to 12 percentage points across normal and abnormal exams. Radiologists rated 57% of the reports generated for normal exams as equivalent or superior to the original reports, a result that would have seemed implausible only a few years earlier.

Med-Gemini-3D (3D volumetric images): Processes complete CT volumes rather than individual slices. Radiologists judged that more than half of the generated CT reports would lead to the same management recommendations as a radiologist's own report.

Med-Gemini-Polygenic (genomic data): Predicts health outcomes from polygenic data — combinations of genetic variants of low individual effect that together predict disease risk. It outperforms traditional linear models in 8 conditions and generalizes to 6 additional ones not included in training.
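For readers unfamiliar with the baseline being beaten, a traditional linear polygenic score is just a weighted sum of allele counts. The sketch below is a minimal illustration in Python; the function name, the effect sizes and the genotype are all invented for the example and are not taken from the Med-Gemini work.

```python
def linear_polygenic_score(dosages, weights):
    """Weighted sum of allele dosages: score = sum(beta_i * g_i)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(beta * g for beta, g in zip(weights, dosages))


# Hypothetical per-variant effect sizes (for example, log odds ratios from a GWAS).
weights = [0.12, -0.05, 0.30, 0.08]

# Allele dosages for one person: 0, 1 or 2 copies of the effect allele per variant.
dosages = [2, 1, 0, 1]

score = linear_polygenic_score(dosages, weights)
print(f"{score:.2f}")  # prints 0.27
```

Because each variant contributes independently, a score of this form cannot represent interactions between variants. That is one commonly cited limitation of linear scores, and a plausible place for a learned model to gain ground.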

Med-Gemini is not a publicly available product. Access is through research partnerships with Google Cloud for healthcare. MedGemma (described in the previous article in this series) is the derived open-weight version, available to developers.
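The uncertainty-guided search mentioned for Med-Gemini-L can be pictured with a small sketch. The assumption here, which this article does not confirm as the exact production method, is that uncertainty is measured as the Shannon entropy of several sampled answers, and that a search is triggered only when the samples disagree. The function names and the threshold are hypothetical.

```python
from collections import Counter
from math import log2


def answer_entropy(sampled_answers):
    """Shannon entropy, in bits, of the distribution of sampled answers."""
    counts = Counter(sampled_answers)
    total = len(sampled_answers)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def needs_search(sampled_answers, threshold_bits=0.8):
    """Return True when the samples disagree enough to justify retrieval.

    threshold_bits is an illustrative value, not a published setting.
    """
    return answer_entropy(sampled_answers) > threshold_bits


# Five sampled answers to the same multiple-choice question (invented data).
confident = ["B", "B", "B", "B", "B"]
uncertain = ["A", "B", "C", "B", "D"]

print(needs_search(confident))  # False: all samples agree, entropy is 0
print(needs_search(uncertain))  # True: the samples are spread over four options
```

The design point is that search is conditional: when the model already agrees with itself it answers directly, and only the ambiguous questions pay the cost of retrieval.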

Radiology: the use case closest to scale

Of all medical specialties, radiology is where AI has come closest to real clinical impact. The FDA has authorized more than 950 AI medical devices through early 2026, and the majority are in radiology — especially detection of lung nodules in CT, mammography analysis, and stroke triage in head CT images.

Current models can detect specific findings in images with accuracy comparable to that of specialists. The productivity gain can be significant: a radiologist who reviews 30 CTs per hour unaided can review around 60 when AI pre-classifies each exam (normal/abnormal/urgent), because attention goes first to the problematic cases. The model does not replace the radiologist; it prioritizes.
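The pre-classification step amounts to reordering the reading worklist. The sketch below assumes a hypothetical model output with the three labels named above; the `Study` class, the field names and the sample data are invented for illustration.

```python
from dataclasses import dataclass

# Lower number means read sooner.
PRIORITY = {"urgent": 0, "abnormal": 1, "normal": 2}


@dataclass
class Study:
    study_id: str
    ai_label: str         # hypothetical model output: urgent, abnormal or normal
    minutes_waiting: int  # time since the exam was acquired


def prioritize(worklist):
    """Order by AI triage label first, then by longest wait within each label."""
    return sorted(worklist, key=lambda s: (PRIORITY[s.ai_label], -s.minutes_waiting))


worklist = [
    Study("CT-001", "normal", 50),
    Study("CT-002", "urgent", 5),
    Study("CT-003", "abnormal", 30),
    Study("CT-004", "urgent", 20),
]

ordered_ids = [s.study_id for s in prioritize(worklist)]
print(ordered_ids)  # ['CT-004', 'CT-002', 'CT-003', 'CT-001']
```

Nothing is dropped from the list: every exam still reaches the radiologist, only the order changes.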

The next frontier in radiology is radiomic-genomic integration: correlating image features (such as texture, volume, heterogeneity of a tumor on CT) with molecular profiles from the biopsy. This integration — called "radiogenomics" — may allow molecular characterization of the tumor without invasive biopsy, from the image.
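As a toy illustration of the idea, the sketch below derives one image feature (heterogeneity, taken here as the standard deviation of voxel intensities inside the tumor) and correlates it with a molecular measurement. All values are invented, and a single-feature Pearson correlation is an assumption made for brevity; real radiogenomics studies use many features, large cohorts and validated pipelines.

```python
from math import sqrt
from statistics import mean, pstdev


def heterogeneity(voxel_intensities):
    """Population standard deviation of intensities inside the tumor region."""
    return pstdev(voxel_intensities)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented voxel intensities for three tumors (a real tumor has thousands of voxels).
tumors = {
    "T1": [40, 42, 41, 39],
    "T2": [30, 55, 20, 60],
    "T3": [35, 45, 38, 50],
}

# Invented expression level of some marker gene, measured from each biopsy.
expression = {"T1": 0.2, "T2": 0.9, "T3": 0.5}

ids = sorted(tumors)
features = [heterogeneity(tumors[t]) for t in ids]
markers = [expression[t] for t in ids]

r = pearson(features, markers)
print(f"correlation between heterogeneity and expression: {r:.2f}")
```

A strong correlation in data like this would only be a starting hypothesis. Predicting a molecular profile from the image alone requires held-out validation against biopsy results.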

Computational pathology: beyond the human eye

In histopathology, the analysis of tissue slides for cancer diagnosis, models such as Phikon were trained on millions of tissue image tiles and learned representations of microscopic patterns that human pathologists cannot always articulate. These models detect subtle features of tumor aggressiveness, predict response to specific treatments and identify molecular subtypes from cellular morphology.

MerMED-FM, launched in 2025, took this approach further: a vision model trained on 3.3 million medical images from more than 10 specialties and 7 modalities (CT, X-ray, ultrasound, histopathology, fundoscopy, OCT, dermatology). The premise is that a model trained on the multiple visual languages of medicine develops richer representations than models trained on an isolated specialty.

The horizon: the virtual cell

The most ambitious goal of the field is what Recursion and other labs call the "virtual cell" — a computational model capable of simulating the response of a human cell to any intervention (drug, gene editing, environmental perturbation) before any physical experiment. If the "virtual cell" becomes viable, it would allow in silico screening of billions of pharmacological compounds, personalized to the genome of a specific patient.

It is still a research aspiration. But the trajectory of 2024-2026, from protein structure prediction (AlphaFold 3) through long-context genomic models (Evo 2) to clinical multimodal integration (Med-Gemini), traces the path. Precision medicine that combines imaging, genomics, clinical history and biochemistry into a single decision support system is no longer science fiction. It is now a matter of scale and validation.
