PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks

  • 2025-03-31 17:28:02
  • Fang Yan, Jianfeng Wu, Jiawen Li, Wei Wang, Jiaxuan Lu, Wen Chen, Zizhao Gao, Jianan Li, Hong Yan, Jiabo Ma, Minda Chen, Yang Lu, Qing Chen, Yizhi Wang, Xitong Ling, Xuenian Wang, Zihan Wang, Qiang Huang, Shengyi Hua, Mianxin Liu, Lei Ma, Tian Shen, Xiaofan Zhang, Yonghong He, Hao Chen, Shaoting Zhang, Zhe Wang

Abstract

The complexity and variability inherent in high-resolution pathological images present significant challenges in computational pathology. While pathology foundation models leveraging AI have catalyzed transformative advancements, their development demands large-scale datasets, considerable storage capacity, and substantial computational resources. Furthermore, ensuring their clinical applicability and generalizability requires rigorous validation across a broad spectrum of clinical tasks. Here, we present PathOrchestra, a versatile pathology foundation model trained via self-supervised learning on a dataset comprising 300K pathological slides from 20 tissue and organ types across multiple centers. The model was rigorously evaluated on 112 clinical tasks using a combination of 61 private and 51 public datasets. These tasks encompass digital slide preprocessing, pan-cancer classification, lesion identification, multi-cancer subtype classification, biomarker assessment, gene expression prediction, and the generation of structured reports. PathOrchestra demonstrated exceptional performance across 27,755 WSIs and 9,415,729 ROIs, achieving over 0.950 accuracy in 47 tasks, including pan-cancer classification across various organs, lymphoma subtype diagnosis, and bladder cancer screening. Notably, it is the first model to generate structured reports for high-incidence colorectal cancer and diagnostically complex lymphoma, areas that are infrequently addressed by foundational models but hold immense clinical potential. Overall, PathOrchestra exemplifies the feasibility and efficacy of a large-scale, self-supervised pathology foundation model, validated across a broad range of clinical-grade tasks. Its high accuracy and reduced reliance on extensive data annotation underline its potential for clinical integration, offering a pathway toward more efficient and high-quality medical services.