Research


Pathology Image Analysis Laboratory

The Mahmood Lab (www.mahmoodlab.org) at Harvard Medical School, BWH and MGH is dedicated to advancing the field of computational pathology by developing state-of-the-art machine learning methods that deepen our understanding of disease processes and enhance diagnostic workflows. By leveraging advances in representation learning, multimodal data integration, and generative and agentic approaches, we aim to create robust tools capable of extracting meaningful insights from clinical data. A central focus of our research is not only improving diagnostic accuracy and patient outcome prediction but also ensuring the interpretability and clinical utility of these tools. We are a highly interdisciplinary team combining expertise in computational science, pathology, and medicine to address critical challenges in the diagnosis and treatment of complex diseases. A summary of current and future research directions in the lab is given below.

Representation Learning for Pathology

Computational pathology fundamentally hinges on our ability to learn and distill deep representations from histopathology images. Crafting universal, domain- and task-agnostic representations is pivotal for addressing challenges such as data scarcity, minimizing intra- and inter-observer variability, and reducing the need for labor-intensive annotations. Our research harnesses the power of self-supervised learning (SSL) to train large-scale foundation models that build histology representations. Using cutting-edge models, we build vision encoders for histology regions of interest, which we then extend to gigapixel whole-slide images for slide representation learning. To further enrich our models, we develop strategies for multimodal pretraining that weave together gene expression data and textual information, crafting a multidimensional understanding of pathology data.
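The hierarchical idea behind slide representation learning can be sketched in a few lines: patch embeddings from a pretrained encoder are pooled into region embeddings, which are in turn pooled into a single slide-level embedding. The numpy sketch below uses random stand-in embeddings and simple mean pooling; it illustrates only the general two-stage aggregation pattern, not the actual UNI or HIPT implementations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for pretrained patch embeddings: a slide split into
# 4 regions, each containing 16 patch embeddings of dimension 8.
# (All shapes and values here are illustrative only.)
patch_embeddings = rng.normal(size=(4, 16, 8))  # (regions, patches, dim)

# Stage 1: aggregate patch embeddings into one embedding per region.
region_embeddings = patch_embeddings.mean(axis=1)  # (4, 8)

# Stage 2: aggregate region embeddings into a single slide embedding
# that downstream diagnostic or prognostic heads can consume.
slide_embedding = region_embeddings.mean(axis=0)   # (8,)

print(slide_embedding.shape)  # (8,)
```

In practice each pooling stage would be a learned transformer or attention module rather than a mean, but the patch-to-region-to-slide flow is the same.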

Key Publications:

  • UNI – Towards a general-purpose foundation model for computational pathology – Nature Medicine 2024 [link]
  • HIPT – Hierarchical Image Pyramid Transformer – CVPR 2022 [link]

Multimodal Representation Learning and Generative AI for Pathology

Natural language data contain rich information about cell and tissue morphology and its interpretation in the context of diagnosis, prognosis, and patient management. When paired with corresponding histology images, image captions can offer strong supervisory signals beyond the simple discrete class labels used in supervised learning or the views of semantics-preserving transformations used in self-supervised contrastive learning. Our research leverages vision-only, text-only, and paired vision-language data for representation learning (both unimodal and multimodal) and develops multimodal models specialized for visual language understanding in pathology.
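One concrete payoff of paired vision-language pretraining is zero-shot classification: an image is assigned to whichever class prompt's text embedding is most similar to its image embedding. The numpy sketch below uses random stand-in embeddings to show only the mechanics (cosine similarity followed by a softmax); it is not the CONCH or MI-Zero implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

rng = np.random.default_rng(1)

# Hypothetical outputs of a paired vision-language model: one image
# embedding, and one text embedding per candidate class prompt
# (e.g. "an H&E image of <subtype>"). Values are random stand-ins.
image_emb = l2_normalize(rng.normal(size=(1, 32)))   # (1, dim)
text_embs = l2_normalize(rng.normal(size=(3, 32)))   # (classes, dim)

# Zero-shot prediction: cosine similarity between the image and each
# prompt, softmax over classes, argmax as the predicted label.
logits = image_emb @ text_embs.T                     # (1, classes)
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
pred = int(probs.argmax())
```

Because the class set lives entirely in the text prompts, new diagnostic categories can be added without retraining the encoders.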

Key Publications:

  • PathChat – A Multimodal generative AI chatbot for pathology – Nature 2024 [link]
  • CONCH – Vision-language foundation model for pathology – Nature Medicine 2024 [link]
  • MI-Zero – Multiple Instance Zero-Shot Transfer for Histopathology Images – CVPR 2023 [link]

3D Pathology

Although human tissue is inherently three-dimensional (3D) in structure, the prevailing diagnostic methodology relies on the analysis of thin, two-dimensional (2D) tissue sections placed on glass slides. This 2D sampling captures merely a fraction of the morphological complexity present in the full 3D tissue. Our research aims to close this gap by developing a computational framework based on state-of-the-art 3D deep learning, with the primary goal of achieving better diagnostic and prognostic performance than current clinical practice. We accomplish this by explicitly encoding 3D morphology and aggregating heterogeneous morphology from the entire tissue volume.
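As a toy illustration of how a volume might be prepared for such a framework, the numpy sketch below tiles a small 3D array into non-overlapping volumetric patches, the bag of units a 3D encoder and aggregator would consume. All sizes are arbitrary stand-ins; this is not the TriPath pipeline.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy 3D volume (e.g. from volumetric microscopy); real volumes are
# orders of magnitude larger than this illustrative (z, y, x) array.
volume = rng.normal(size=(32, 64, 64))

# Tile the volume into non-overlapping 3D patches ("cuboids"). Each
# patch would be embedded by a 3D encoder, and the resulting bag of
# embeddings aggregated under weak supervision, analogous to 2D MIL.
pz, py, px = 8, 16, 16
patches = (volume
           .reshape(32 // pz, pz, 64 // py, py, 64 // px, px)
           .transpose(0, 2, 4, 1, 3, 5)
           .reshape(-1, pz, py, px))

print(patches.shape)  # (64, 8, 16, 16): 4 * 4 * 4 patches
```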

Key Publications:

  • TriPath – Weakly Supervised AI for Efficient Analysis of 3D Pathology Samples – Cell 2024 [link]

Weakly Supervised Learning and Applications

Advances in scanning systems, imaging technologies, and storage devices are generating an ever-increasing volume of whole-slide images (WSIs) acquired in clinical facilities, which can be computationally analyzed using artificial intelligence (AI) and deep learning. The digitization and automation of clinical pathology, also referred to as computational pathology (CPath), can provide patients and clinicians with more objective diagnoses and prognoses, allows the discovery of novel biomarkers, and can help predict response to therapy. Our research uses the latest developments in multiple instance learning (MIL) to achieve superior performance across various diagnostic tasks without being subject to inter-observer variability.
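A common MIL pattern in CPath is attention-based pooling: a small scoring network weights each patch, and the slide embedding is the attention-weighted sum of patch embeddings, trained with only a slide-level label. The numpy sketch below shows this pooling step with random, untrained weights; it is a generic ABMIL-style illustration, not the CLAM implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)

# A "bag" of patch embeddings from one whole-slide image; under weak
# supervision only the slide-level label is known during training.
patches = rng.normal(size=(100, 16))   # (n_patches, dim)

# Attention scoring network (here random, untrained stand-in weights):
# each patch gets a scalar score, normalized to attention weights.
W = rng.normal(size=(16, 8))
v = rng.normal(size=(8,))
scores = np.tanh(patches @ W) @ v      # (n_patches,)
attn = softmax(scores)                 # non-negative, sums to 1

# Slide embedding = attention-weighted sum of patch embeddings; a
# slide-level classifier head would consume this vector.
slide_embedding = attn @ patches       # (dim,)

print(slide_embedding.shape)  # (16,)
```

The attention weights double as an interpretability map, highlighting which patches drove the slide-level prediction.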

Key Publications:

  • TOAD – AI-based pathology predicts origins for cancers of unknown primary – Nature 2021 [link]
  • CLAM – Data-efficient and weakly supervised computational pathology on whole-slide images – Nature Biomedical Engineering 2021 [link]
  • CRANE – Deep learning-enabled assessment of cardiac allograft rejection from endomyocardial biopsies – Nature Medicine 2022 [link]

Multimodal Data Fusion

While WSIs already provide detailed descriptions of patient status through rich morphological cues, it has become clear that cancer progression and response to therapeutic regimens are governed by a multitude of factors, warranting the incorporation of additional modalities for improved outcome prediction. Omics data, such as gene expression and mutations, are a natural choice, as they provide comprehensive molecular detail about the tissue, which may or may not be reflected in morphology, and can be obtained within the routine clinical workflow. Leveraging the latest developments in multimodal fusion strategies, we develop frameworks that use the complementary information in histology and genomics for better patient prognosis.
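One simple fusion strategy is a tensor (Kronecker) product of the unimodal embeddings, which captures all pairwise histology-genomics interactions; appending a constant 1 to each vector also preserves the unimodal terms in the joint feature. The numpy sketch below illustrates this with tiny random vectors; it is a generic tensor-fusion sketch under those assumptions, not the Pathomic Fusion or MCAT implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative unimodal embeddings (sizes and values are arbitrary):
# one from a histology encoder, one from a genomics encoder.
h = rng.normal(size=(4,))   # histology embedding
g = rng.normal(size=(3,))   # genomics embedding

# Append a constant 1 to each vector so the outer (Kronecker) product
# retains the unimodal terms alongside every pairwise interaction.
h1 = np.concatenate([h, [1.0]])   # (5,)
g1 = np.concatenate([g, [1.0]])   # (4,)
fused = np.outer(h1, g1).ravel()  # (5 * 4 = 20,) joint feature

# A downstream prognosis/survival head would operate on `fused`.
print(fused.shape)  # (20,)
```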

Key Publications:

  • PORPOISE – Pan-cancer integrative histology-genomic analysis via multimodal deep learning – Cancer Cell 2022 [link]
  • MCAT – Multimodal co-attention transformer for survival prediction in gigapixel whole slide images – ICCV 2021 [link]
  • Pathomic Fusion – An integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis – IEEE Transactions on Medical Imaging 2022 [link]

Multiplex Imaging

Multiplex imaging technologies provide a unique capability to delve into the intricate biological interactions unfolding within tissues, surpassing the limitations of conventional tissue microscopy (histology). In this context, our research is centered on the development of innovative machine learning-based computational methods designed to tackle various challenges, including cell phenotyping and unraveling the complexities of the tumor microenvironment, leveraging the rich information contained within multiplex images.

Key Publications:

  • MAPS – Pathologist-level cell type annotation from tissue images through machine learning – Nature Communications 2024 [link]
  • Deep Learning Model Imputes Missing Stains in Multiplex Images – bioRxiv 2023 [link]

Agentic AI for Diagnosis and Biomedical Discovery

Our lab is exploring the development of agentic AI systems—models that move beyond passive predictions to actively assist in diagnostic decision-making and biomedical research. These systems are designed to autonomously analyze data, generate hypotheses, and provide actionable recommendations in real-time clinical and research settings. By integrating multimodal data, such as histopathology, genomics, and imaging, agentic AI can uncover novel insights into disease mechanisms while supporting clinicians in navigating complex cases. This forward-looking research aims to create AI tools that not only complement human expertise but also drive discovery and innovation in healthcare and the life sciences.

Open-source Research Software

We are committed to fostering transparency, reproducibility, and collaboration in computational pathology by developing and releasing open-source software tools. Our lab has created widely used tools such as CLAM, UNI, and CONCH, which have been adopted in hundreds of studies across the globe. These tools empower researchers and clinicians to analyze complex datasets, develop new methodologies, and build on our work to advance the field. By making our software openly available, we aim to lower barriers to innovation and enable the global research community to collaboratively address critical challenges in healthcare and biomedical science. See our open-source tools at https://github.com/mahmoodlab