TILs Grand Challenges

Artificial Intelligence/Deep Learning/Machine learning approaches to TIL-assessment

At present, clinico-pathological risk stratification in early-stage disease is performed using a limited set of features such as tumor size and lymph node status. Very large adjuvant trials such as ALTTO and APHINITY that have applied these stratification schemes have illustrated the key problem with the current classification scheme: it does not stratify patients with sufficient granularity to permit selection for clinical trials. The current scheme also takes the approach of placing patients on a continuum of risk. This is at odds with results from high-throughput technologies such as gene expression profiling and genomic assays, which focus on identifying individual patient groups with particular clinical behavior. Several results in this area have identified genomic, transcriptomic or proteomic features which, in hindsight, are associated with particular histological features. This suggests that the histological appearance of a tumor represents a useful cancer phenotype which can be further explored and can contribute to staging and stratification.

Machine learning refers to the general computational approach whereby data is used by algorithms to develop predictive models. These models are finely tuned to optimize accuracy and generalizability as applied to new data. Although machine learning has existed for some time, recent advances in algorithm development and hardware infrastructure have enabled ‘deep learning’ approaches. Deep learning was originally designed to mimic the neural architecture of the human brain, and conceptually uses a series of connected nodes (neural nets) which respond to input in a way that is tuned with repeated cycles of learning. Neural nets have the ability to learn rich representations of complex data, which may contain hierarchical and non-linear relationships. These abilities make neural nets well suited to image classification. They have exhibited spectacular results in this area, often matching or exceeding human performance.

Machine learning provides a toolkit with many applications in digital pathology. On one level, the robust performance and training characteristics of deep learning allow us to develop accurate automated assays for pathological features such as grade and lymphocyte infiltration. These have the potential to be ‘learn once, apply everywhere’. This is in contrast to existing methods, which must be constantly calibrated and validated. Such assays would aid pathologists in routine reporting, and could be expected to improve the validity of current pathology-based clinico-pathological features. They would also help standardize pathology results within and across trials.

TILs have been shown to be a reliable and reproducible marker of tumor immunogenicity in breast cancer. It is clear that higher levels of TILs are associated with improved prognosis in early-stage TNBC and HER2-positive breast cancer, as well as a higher probability of achieving pCR in the neoadjuvant setting. Analysis of TILs in residual disease specimens after neoadjuvant therapy has also been shown to have prognostic value. The evaluation of TILs as a biomarker in breast cancer is likely to be extended from the research domain to the clinical setting in the near future. The assessment of TILs by digital image analysis might be useful for standardization in the future, since this approach has the potential, for example, to determine the number of TILs per mm² of stromal tissue as an exact measurement, in contrast to the approximate semi-quantitative evaluation currently recommended. In the first International Guidelines on TIL-assessment in breast cancer we proposed to develop an inter-laboratory ring study to assess the reproducibility and clinical validity of TILs assessment, including machine learning algorithms. While TILs have been measured morphologically and have been shown to add predominantly prognostic information, open methodological questions in the morphological evaluation of TILs still remain, for example the assessment and importance of spatial TIL-heterogeneity. The measurement on H&E-stained slides most likely represents the beginning of the efforts to use infiltrating cell properties as companion diagnostic tests. Thus, as a field, we should be open to the introduction of molecular methods, most likely in situ, that can classify the TILs-component and bring higher levels of information to the patient sample. However, at this time, these artificial intelligence methods are still experimental and not sufficiently documented for introduction into standard practice.
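
The per-area measurement mentioned above reduces to simple arithmetic once an image analysis tool has produced a TIL count and a stromal mask. The following is a minimal sketch, not a validated method: the function name `tils_per_mm2`, the example counts, and the 0.5 microns-per-pixel scan resolution are all illustrative assumptions.

```python
def tils_per_mm2(til_count: int, stromal_pixels: int, microns_per_pixel: float) -> float:
    """Return TIL density per square millimetre of stroma.

    All arguments are hypothetical outputs of an upstream detection and
    segmentation step; nothing here is specific to any published tool.
    """
    if stromal_pixels <= 0:
        raise ValueError("stromal area must be positive")
    # One pixel covers microns_per_pixel**2 square microns;
    # 1 mm^2 equals 1e6 square microns.
    stromal_area_mm2 = stromal_pixels * microns_per_pixel ** 2 / 1e6
    return til_count / stromal_area_mm2


# Assumed example: 350 detected TILs in 4,000,000 stromal pixels at 0.5 um/pixel,
# which corresponds to exactly 1 mm^2 of stroma.
density = tils_per_mm2(til_count=350, stromal_pixels=4_000_000, microns_per_pixel=0.5)
print(density)
```

Unlike a visually estimated percentage, this value depends only on the detected cells and the segmented stromal area, which is why it lends itself to standardization.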

On another level, however, machine learning also permits discovery of image-based features which may be very difficult for current approaches to identify, particularly if they only exist in small groups of patients. The key benefit of machine learning here is to rapidly identify pathological features in large datasets in a standardized way. This is an essential first step in deciding if these features are clinically relevant, and is largely infeasible using current approaches. Machine learning also permits modification and retraining of the feature set to optimize accuracy and interpretability, which is again infeasible with current methods.

  • Watch our Training videos here 
  • See more Reference images here
  • See more Pitfalls images here
  • More WSI Images here!
    This is a set of Whole Slide Images of matched core-biopsies (for example 09BXXX) and corresponding surgical full section specimens of the same case (09SXXXX) of representative breast cancer cases, including matched cases of invasive lobular cancer. The TIL-counts of these slides are pending!

NuCLS benchmark TILs dataset

The NuCLS dataset contains over 220,000 labelled nuclei from breast cancer images from TCGA. These nuclei were annotated through the collaborative effort of pathologists, pathology residents, and medical students using the Digital Slide Archive. These data can be used in several ways to develop and validate algorithms for nuclear detection, classification, and segmentation, or as a resource to develop and evaluate methods for interrater analysis. Data from both single-rater and multi-rater studies are provided. For single-rater data we provide both pathologist-reviewed and uncorrected annotations. For multi-rater datasets we provide annotations generated with and without suggestions from weak segmentation and classification algorithms.

For more details consult our paper, or contact us directly with questions.
Related: If you like this work, you will probably be interested in our 2019 region crowdsourcing paper and dataset.

Automated Breast Cancer Detection to Identify Spatial Distributions of Tumor Infiltrating Lymphocytes in Invasive Breast Cancer

Click here to access the tool.

TIGER - challenge (Tumor InfiltratinG lymphocytes in breast cancER)


We will conduct a public Grand Challenge, in which the scientific community is invited to participate. Participants will be able to develop their own TIL machine learning tools using a set of freely available training data, and then apply these tools to a large independent test set consisting of HER2-positive and triple-negative breast cancer cases from both clinical routine and a large phase 3 clinical trial.



CATALINA-challenge (CollAborative Til vALidatIoN chAllenge)


This is the evaluation of the performance of two different existing machine learning tools on a set of seven phase 3 adjuvant TNBC clinical trials (doi: 10.1200/JCO.18.01010).


1. Computational and Integrative Pathology Group, Department of Pathology, Northwestern University, Chicago, IL, USA

The Computational and Integrative Pathology Group (CIPG) is a research team from the Department of Pathology at Northwestern University in Chicago, IL, with a documented track record in developing accurate algorithms for computational assessment of TILs.


Mohamed Amgad Tageldin, M.D., Ph.D. is a physician with a passion for, and expertise in, computational pathology applications. He obtained his medical degree from the Department of Medicine, Cairo University, Egypt. He then moved to the United States and obtained his Ph.D. in Computer Science from Emory University in Atlanta, with a focus on the computational discovery of interpretable histopathologic prognostic biomarkers in invasive carcinomas of the breast. Mohamed has won multiple awards, including the 2019 Chris Schoettle Graduate Research Award, the 2021 Trainee Award from the Association for Pathology Informatics, and the 2021 Travel Award from the Digital Pathology Association.

Lee A. D. Cooper, Ph.D. is an Associate Professor of Pathology and the director of the Center for Computational Imaging and Signal Analytics in Medicine at Northwestern University Feinberg School of Medicine in Chicago. He obtained his Ph.D. in Electrical and Computer Engineering from the Ohio State University, Ohio. Before moving to Northwestern University, he held a joint appointment as an Assistant Professor in the Departments of Biomedical Informatics and Biomedical Engineering at Emory University and the Georgia Institute of Technology in Atlanta. Dr. Cooper’s research focuses on predicting clinical outcomes from genomic, imaging, and histopathology data, and extracting quantitative phenotypic information from digital pathology images.


The CIPG has published two large-scale open-access datasets, BCSS and NuCLS, containing 20,000+ and 200,000+ annotations of tissue regions and cell nuclei in breast cancer. The datasets enable training accurate supervised deep-learning models for TILs detection:

They published a number of algorithms for automated detection of TILs in breast cancer, documented in the following publications:

Members of the CIPG are actively involved with the digital pathology community and are active participants in multiple professional societies including the Association of Pathology Informatics, Digital Pathology Association, American Association for Cancer Research, and the International Immuno-Oncology Working Group. In 2020, they led the effort to develop consensus recommendations for computational TILs assessment consistent with clinical guidelines:


CIPG’s approach relies on a lightweight, multi-resolution convolutional neural network model called MuTILs. MuTILs uses two parallel U-Nets to jointly detect histopathologic tissue regions and nuclei at 1 and 0.5 microns-per-pixel resolutions, respectively. It imposes a set of biological constraints to ensure compatibility between tissue region and nuclear predictions. The MuTILs method won the Best Poster Award at the 2021 Pathology Informatics Summit. MuTILs was trained on the BCSS and NuCLS datasets, and achieves high accuracy for stromal segmentation and TILs detection. MuTILs produces the following variants of the TILs score (Fig 1):

A detailed validation of the algorithm can be found here.

2. Computational Pathology and Integrative Genomics Team, Centre for Evolution and Cancer & Division for Molecular Pathology, The Institute of Cancer Research, London

The Computational Pathology and Integrative Genomics team (Yuan Lab) bridges the interface between artificial intelligence, cancer biology, and clinical science, to decipher and target cancer evolution and immune escape. The Yuan Lab has pioneered several AI pathology innovations for accurate prediction of cancer survival and treatment response, with a focus on understanding tumors as evolving ecosystems. The team specializes in driving novel integration between pathological sections and next-generation sequencing to offer new insights into why some cancers are so difficult to treat.

Full list of publications can be found here:


Yeman Brhane Hagos is a PhD student trained in computer vision, medical image analysis and machine learning. He has over 5 years’ experience in medical image analysis and deep learning. He develops deep learning based computational pipelines to analyse H&E, multiplex immunohistochemistry and multispectral immunofluorescence images, and to understand the association between the complex spatial tumour microenvironment and patient clinical outcome.

Dr Nick Trahearn is a Postdoctoral Training Fellow at the Institute of Cancer Research, London. In his 9 years working in the field of Digital Pathology he has developed tools for both assistive and exploratory applications. His current research interests are in the use of image analysis and deep learning to identify features of the tissue that are predictive of patient outcome in prostate and colorectal cancers.

Dr Khalid AbdulJabbar trained as an R&D engineer and holds a PhD in computer vision and machine learning. Khalid studies the evolution of lung cancers by integrating omics data with computational pathology. He is currently a postdoctoral fellow at ICR London.

Prof Yinyin Yuan leads the Computational Pathology and Integrative Genomics team at ICR, London. She brings over a decade of experience in machine learning and digital pathology to cancer research to develop innovative clinical tests and cancer therapies.



Our AI-TIL test utilizes the latest deep learning advances and scientific knowledge on the critical role of immune cells in cancer. This test is IP-protected under the patent “Scoring of tumor infiltration by lymphocytes” (WO/2016/083791), at regional/national phase in Europe (EP3224624B1 granted) and in the U.S. (US10346973B2 granted) [1]. The test was further developed with the advice of Dr Roberto Salgado, and has been demonstrated to correlate well with pathology sTIL in lung [2] and breast cancers.

The AI-TIL pipeline consists of:

  1. A generalizable, accurate and extensively validated AI method to differentiate cell types in H&E images based on morphological differences and map their spatial locations [2].
  2. Spatial, unsupervised analysis of immune cells to classify them into three types: intratumor, adjacent-tumor, or distal-tumor lymphocytes, based on their spatial proximity to cancer cells. This generates quantitative TIL scores that indicate the level of immune infiltration in the tumor [1].



AI-TIL: Computed using spatial modelling. All lymphocytes are identified and classified, using unsupervised clustering, into intratumor lymphocytes (itl), adjacent-tumor lymphocytes (atl), and distal-tumor lymphocytes (dtl). This classification is based on their spatial proximity to epithelial cell nests in the H&E image. The total number of adjacent-tumor lymphocytes (atl), i.e. stromal TILs, is divided by the total stromal cell count to generate the estimate of pathological sTILs. Using this framework, we will also test the ITLR score (intra-tumor lymphocytes).
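
The scoring step described above can be illustrated with a small sketch. It is not the patented AI-TIL implementation: the text does not specify the clustering algorithm or its features, so this example assumes a plain one-dimensional k-means on each lymphocyte's distance to its nearest cancer cell. The function names, coordinates, and stromal cell count are all hypothetical.

```python
import numpy as np


def nearest_cancer_distance(lymph_xy, cancer_xy):
    """Distance from each lymphocyte to its closest cancer cell."""
    diff = lymph_xy[:, None, :] - cancer_xy[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


def kmeans_1d(values, k=3, n_iter=100):
    """Plain 1-D k-means; labels are ordered from nearest (0) to farthest (k-1)."""
    # Deterministic initialisation at evenly spaced quantiles (an assumption).
    centers = np.quantile(values, np.linspace(0.0, 1.0, k))
    for _ in range(n_iter):
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        updated = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
    # Relabel so that cluster 0 always has the smallest centre.
    rank = np.empty(k, dtype=int)
    rank[np.argsort(centers)] = np.arange(k)
    return rank[labels]


# Hypothetical cell centroids (arbitrary units) from an upstream cell classifier.
cancer_xy = np.array([[0.0, 0.0]])
lymph_xy = np.array([[1, 0], [0, 1], [10, 0], [0, 10], [-10, 0], [50, 0], [0, 50]], dtype=float)

distances = nearest_cancer_distance(lymph_xy, cancer_xy)
classes = kmeans_1d(distances)  # 0 = itl, 1 = atl, 2 = dtl
counts = np.bincount(classes, minlength=3)
n_itl, n_atl, n_dtl = counts

stromal_cell_count = 30  # assumed total number of stromal cells in the section
ai_til_score = n_atl / stromal_cell_count
print(n_itl, n_atl, n_dtl, ai_til_score)
```

Clustering on distance avoids hand-picking proximity thresholds, at the cost of making the class boundaries data-dependent for each slide.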

More details can be found here and here.




Immune hotspot: Computed as the fraction of immune hotspots (clustering) identified using the Getis–Ord algorithm, weighted by the tumor area. Unexpectedly high levels of immune-cancer co-clustering are indicative of favourable long-term prognosis in ER-negative breast cancer.
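
For readers unfamiliar with Getis–Ord hotspot analysis, the sketch below computes the standard Gi* statistic on a grid of per-square immune cell counts and then derives a hotspot fraction. Several choices are assumptions rather than details from the text: the 3x3 binary neighbourhood, the z > 1.96 cut-off for calling a hotspot, and the interpretation of "weighted by the tumor area" as dividing by the number of tumor-containing squares. The grid values are made up.

```python
import numpy as np


def neighborhood_sum(grid):
    """Sum of each cell and its (up to) 8 neighbours, with zero padding at edges."""
    padded = np.pad(grid, 1, mode="constant", constant_values=0.0)
    rows, cols = grid.shape
    total = np.zeros((rows, cols), dtype=float)
    for dr in range(3):
        for dc in range(3):
            total += padded[dr:dr + rows, dc:dc + cols]
    return total


def getis_ord_gi_star(counts):
    """Gi* z-score per grid square, using binary 3x3 weights that include the square itself."""
    x = counts.astype(float)
    n = x.size
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)
    if s == 0:
        return np.zeros_like(x)  # constant grid: no spatial structure to detect
    # With binary weights, sum(w) equals sum(w**2): the neighbourhood size.
    w_sum = neighborhood_sum(np.ones_like(x))
    local = neighborhood_sum(x)
    denom = s * np.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
    return (local - mean * w_sum) / denom


# Hypothetical 6x6 grid of immune cell counts with one dense 2x2 cluster.
immune_counts = np.zeros((6, 6))
immune_counts[2:4, 2:4] = 10

# Hypothetical mask of squares that contain tumor (central 4x4 region).
tumor_mask = np.zeros((6, 6), dtype=bool)
tumor_mask[1:5, 1:5] = True

z = getis_ord_gi_star(immune_counts)
hotspot = z > 1.96  # assumed significance cut-off
hotspot_fraction = hotspot.sum() / tumor_mask.sum()
print(hotspot.sum(), hotspot_fraction)
```

Gi* compares each neighbourhood's total with what the global mean would predict, so a square is flagged only when it and its neighbours are jointly dense, not when a single square has a high count in isolation.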







Ecological measure of immune-cancer colocalization: computed as the Morisita-Horn index between identified cancer and immune cells. This index is used in ecology as a measure of community structure and predator–prey interactions; here it quantifies the extent to which cancer cells and immune cells colocalize in routine histology whole-sections. It has been shown to be prognostic in HER2+ breast cancer.
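
The Morisita-Horn index has a simple closed form, sketched below for two vectors of per-square cell counts. The tiling of the slide into squares and the example counts are assumptions for illustration; the text does not state how the published analysis partitions the section.

```python
import numpy as np


def morisita_horn(x, y):
    """Morisita-Horn overlap between two count vectors over the same spatial squares.

    Returns 1.0 when the two cell types are distributed in identical proportions
    across squares and 0.0 when they never share a square.
    """
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    total_x, total_y = x.sum(), y.sum()
    if total_x == 0 or total_y == 0:
        raise ValueError("both cell types must be present")
    # Simpson-type dominance terms for each cell type.
    d_x = (x ** 2).sum() / total_x ** 2
    d_y = (y ** 2).sum() / total_y ** 2
    return 2.0 * (x * y).sum() / ((d_x + d_y) * total_x * total_y)


# Hypothetical counts of cancer and immune cells in four squares of a section.
cancer = [10, 0, 5, 5]
immune_colocalized = [20, 0, 10, 10]  # same proportions as the cancer cells
immune_segregated = [0, 7, 0, 0]      # only present where cancer cells are absent

high = morisita_horn(cancer, immune_colocalized)
low = morisita_horn(cancer, immune_segregated)
print(high, low)
```

Because the index is normalized by the totals of each cell type, it reflects how similarly the two populations are distributed rather than how abundant either one is.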