Artificial Intelligence/Deep Learning/Machine learning approaches to TIL-assessment
At present, clinico-pathological risk stratification in early-stage disease is performed using a limited set of features, such as tumor size and lymph node status. Very large adjuvant trials that have applied these stratification schemes, such as ALTTO and APHINITY, have illustrated the key problem with the current classification scheme: it does not stratify patients with sufficient granularity to permit selection for clinical trials. The current scheme also places patients on a continuum of risk. This is at odds with results from high-throughput technologies such as gene expression profiling and genomic assays, which focus on identifying distinct patient groups with particular clinical behavior. Several results in this area have identified genomic, transcriptomic or proteomic features which, in hindsight, are associated with particular histological features. This suggests that the histological appearance of a tumor represents a useful cancer phenotype which can be explored further and can contribute to staging and stratification.
Machine learning refers to the general computational approach whereby algorithms use data to develop predictive models. These models are tuned to optimize accuracy and generalizability when applied to new data. Although machine learning has existed for some time, recent advances in algorithm development and hardware infrastructure have enabled ‘deep learning’ approaches. Deep learning was originally designed to mimic the neural architecture of the human brain, and conceptually uses a series of connected nodes (neural nets) whose response to input is tuned over repeated cycles of learning. Neural nets can learn rich representations of complex data, which may contain hierarchical and non-linear relationships. These abilities make neural nets well suited to image classification. They have exhibited spectacular results in this area, often matching or exceeding human performance.
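The idea of a node whose response is tuned over repeated cycles of learning can be illustrated with a single artificial neuron. The sketch below is a toy example only: the data, the function names (`predict`, `train`, `mean_loss`) and the learning settings are assumptions made for illustration, and a real deep learning model for pathology images would stack many such nodes in layers and train on image data with a dedicated library.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, inputs):
    """Response of one node: weighted sum of inputs passed through a sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def mean_loss(weights, bias, samples):
    """Average cross-entropy between the node's responses and the targets."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for inputs, target in samples:
        p = predict(weights, bias, inputs)
        total -= target * math.log(p + eps) + (1 - target) * math.log(1 - p + eps)
    return total / len(samples)

def train(samples, epochs=2000, learning_rate=0.5):
    """Tune the node by repeated passes (epochs) over the training samples."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # Gradient of the cross-entropy loss with respect to the weighted sum.
            error = predict(weights, bias, inputs) - target
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias

# Toy data (an assumption for illustration): output is 1 if either input is 1.
TOY_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

if __name__ == "__main__":
    weights, bias = train(TOY_SAMPLES)
    for inputs, target in TOY_SAMPLES:
        print(inputs, target, round(predict(weights, bias, inputs), 3))
```

Before training, the node answers 0.5 for every input; after training, its responses move toward the targets. A single node can only learn linearly separable patterns, which is why the hierarchical and non-linear relationships mentioned above require multiple layers.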
Machine learning provides a toolkit with many applications in digital pathology. On one level, the robust performance and training characteristics of deep learning allow us to develop accurate automated assays for pathological features such as grade and lymphocyte infiltration. These have the potential to be ‘learn once, apply everywhere’. This is in contrast to existing methods, which must be constantly calibrated and validated. Such assays would aid pathologists in routine reporting, and could be expected to improve the validity of current pathology-based clinico-pathological features. They would also help standardize pathology results within and across trials.
TILs have been shown to be a reliable and reproducible marker of tumor immunogenicity in breast cancer. It is clear that higher levels of TILs are associated with improved prognosis in early-stage TNBC and HER2-positive breast cancer, as well as with a higher probability of achieving pCR in the neoadjuvant setting. Analysis of TILs in residual disease specimens after neoadjuvant therapy has also been shown to have prognostic value. The evaluation of TILs as a biomarker in breast cancer is likely to be extended from the research domain to the clinical setting in the near future. The assessment of TILs by digital image analysis might be useful for standardization in the future, since this approach has the potential, for example, to determine the number of TILs per mm² of stromal tissue as an exact measurement, in contrast to the approximate semi-quantitative evaluation currently recommended. In the first International Guidelines on TIL-assessment in breast cancer, we proposed to develop an inter-laboratory ring study to assess the reproducibility and clinical validity of TIL assessment, including machine learning algorithms. While TILs have been measured morphologically and have been shown to add predominantly prognostic information, open methodological questions in the morphological evaluation of TILs remain, for example the assessment and importance of spatial TIL heterogeneity. The measurement on H&E-stained slides most likely represents the beginning of efforts to use infiltrating cell properties as companion diagnostic tests. Thus, as a field, we should be open to the introduction of molecular methods, most likely in situ, that can classify the TIL component and bring higher levels of information to the patient sample. However, at this time, these artificial intelligence methods are still experimental and not sufficiently documented for introduction into standard practice.
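The density measurement mentioned above, the number of TILs per mm² of stromal tissue, reduces to simple arithmetic once an image analysis pipeline has produced a TIL count and a stromal area. The sketch below assumes both of those upstream results already exist; the function names, the pixel-based area input and the 0.5 µm per pixel resolution in the usage example are illustrative assumptions, not part of any published protocol.

```python
def stromal_area_mm2(stromal_pixels, microns_per_pixel):
    """Convert a stromal pixel count into an area in square millimetres."""
    if stromal_pixels < 0 or microns_per_pixel <= 0:
        raise ValueError("pixel count must be >= 0 and resolution must be > 0")
    area_um2 = stromal_pixels * microns_per_pixel ** 2  # square microns
    return area_um2 / 1_000_000  # 1 mm² = 1,000,000 µm²

def til_density_per_mm2(til_count, stromal_pixels, microns_per_pixel):
    """Number of detected TILs per mm² of stromal tissue."""
    area = stromal_area_mm2(stromal_pixels, microns_per_pixel)
    if area == 0:
        raise ValueError("stromal area is zero; density is undefined")
    return til_count / area

if __name__ == "__main__":
    # Hypothetical values: 1200 detected TILs in 4,000,000 stromal pixels
    # scanned at an assumed resolution of 0.5 microns per pixel.
    print(til_density_per_mm2(1200, 4_000_000, 0.5))
```

In this hypothetical case the stromal region covers 1 mm², so the density equals the raw count. The accuracy of the final number depends entirely on the upstream cell detection and stroma segmentation, which is where validation effort would need to focus.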
On another level, however, machine learning also permits the discovery of image-based features which may be very difficult for current approaches to identify, particularly if they exist only in small groups of patients. The key benefit of machine learning here is the ability to rapidly identify pathological features in large datasets in a standardized way: this is an essential first step in deciding whether these features are clinically relevant, and it is largely infeasible using current approaches. Machine learning also permits modification and retraining of the feature set to optimize accuracy and interpretability, which is again infeasible with current methods.
The Working Group will soon launch a collaboration with the Massive Analysis and Quality Control Consortium (www.maqcsociety.org) to characterize tumor infiltrating lymphocytes using machine learning algorithms. Developing a machine learning-based assay for tumor infiltrating lymphocytes would enable rapid expansion of the use of this potential pathological feature and, by providing an adjunct to human pathologists, enhance its validity and robustness for prognosis and prediction.