Can Machine Learning Improve the Prediction of Neoadjuvant Therapy Effects in Breast Cancer?

What is neoadjuvant therapy?

To increase survival and the success rate of surgical tumor removal, breast cancer patients are often treated before surgery with what is known as neoadjuvant therapy.

Most often this is chemotherapy, in which drugs indiscriminately kill fast-growing cells; targeted therapy, in which drugs specifically block the growth of cancer cells; or hormonal therapy, in which certain hormones related to cancer growth are blocked. The overall goal is to shrink the tumor before surgery and to kill cancer cells that have already spread through the body.

However, many patients do not respond to therapy. In some studies, a pathological complete response (pCR), meaning the absence of all detectable cancer after completing treatment, was achieved by only 19% of patients [1]. There is therefore a strong need to better predict response to therapy, so that treatment can be tailored to specific groups of patients and lead to better responses.

PREDICT online tool for predicting response to therapy

To improve the prediction, it is crucial to know which characteristics of both patient and tumor should be taken into consideration.

Our knowledge of these characteristics is still limited.

The widely used online tool PREDICT, with an accuracy of about 70%, relies on a handful of clinical variables such as age at diagnosis, tumor size, number of lymph nodes, and ER, HER2, and Ki-67 status. Its first version was developed in 2010, and it has been the state-of-the-art solution ever since.

Other research has analyzed the prognostic influence of various molecular [2,3,4] and tissue-level [5] factors, but its wider applicability is limited because it has mostly relied on small datasets and differing treatments.

Using machine learning to integrate datasets and improve robustness

A recent study [6], led by the Cancer Research UK Cambridge Institute, took an important step forward.

The researchers combined clinical, multi-omic (DNA and RNA), and digital pathology (lymphocyte density) data obtained from clinical trials and analyzed how well these data predict response to neoadjuvant breast cancer therapy. As the prediction target they used pCR, assessed at surgery.
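One simple way to picture this kind of data integration is to merge per-patient tables from each data type into a single feature row per patient. The sketch below is only an illustration under stated assumptions and is not the study's actual pipeline: the patient IDs, feature names, values, and the `integrate` helper are all hypothetical.

```python
# Hypothetical per-patient tables keyed by patient ID.
# All names and values are invented for illustration; none come from the study.
clinical = {
    "P1": {"age": 52, "tumor_size_mm": 23, "er_positive": 1},
    "P2": {"age": 61, "tumor_size_mm": 35, "er_positive": 0},
    "P3": {"age": 47, "tumor_size_mm": 18, "er_positive": 1},
}
dna = {
    "P1": {"mutation_burden": 1.8},
    "P2": {"mutation_burden": 3.1},
}
rna = {
    "P1": {"proliferation_score": 0.42},
    "P2": {"proliferation_score": 0.77},
    "P3": {"proliferation_score": 0.30},
}
pathology = {
    "P1": {"lymphocyte_density": 0.12},
    "P2": {"lymphocyte_density": 0.31},
    "P3": {"lymphocyte_density": 0.08},
}


def integrate(**modalities):
    """Merge feature dicts per patient, keeping only patients found in every table."""
    shared_ids = set.intersection(*(set(table) for table in modalities.values()))
    merged = {}
    for patient_id in sorted(shared_ids):
        row = {}
        for name, table in modalities.items():
            # Prefix each feature with its data type to avoid name clashes.
            for feature, value in table[patient_id].items():
                row[f"{name}.{feature}"] = value
        merged[patient_id] = row
    return merged


integrated = integrate(clinical=clinical, dna=dna, rna=rna, pathology=pathology)
for patient_id, features in integrated.items():
    print(patient_id, features)
```

In this toy example patient P3 has no DNA data and is therefore dropped. Real pipelines may instead impute missing values, which is a design choice this sketch does not address.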

The preliminary analysis showed that none of these datasets alone performed robustly enough.

Therefore, the authors used a machine learning framework to integrate the datasets into a single predictive model of pCR for breast cancer. They built a series of six pCR prediction models, each using a different combination of features, and reported the following accuracies: (1) clinical features only, 70%; (2) clinical and DNA, 80%; (3) clinical and RNA, 86%; (4) clinical, DNA and RNA, 86%; (5) clinical, DNA, RNA and digital pathology, 85%; (6) the fully integrated model (clinical, DNA, RNA, digital pathology and treatment), 87%.
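To make the comparison easier to read, the short sketch below tabulates the six reported accuracies and computes each model's gain over the clinical-only baseline in percentage points. The numbers are taken from the text above; the `MODELS` list and the `gains_over_baseline` helper are illustrative names, not part of the study.

```python
# Feature combinations and accuracies (in %) as reported in the text above.
MODELS = [
    ("clinical", 70),
    ("clinical + DNA", 80),
    ("clinical + RNA", 86),
    ("clinical + DNA + RNA", 86),
    ("clinical + DNA + RNA + digital pathology", 85),
    ("clinical + DNA + RNA + digital pathology + treatment", 87),
]


def gains_over_baseline(models):
    """Return (name, accuracy, gain) with gain measured against the first model."""
    baseline = models[0][1]
    return [(name, acc, acc - baseline) for name, acc in models]


for name, acc, gain in gains_over_baseline(MODELS):
    print(f"{name:<55} {acc:>3}%  {gain:+d} pts")
```

Laid out this way, the largest single jump comes from adding molecular data to the clinical features, while the later additions change the reported accuracy by only a point or two.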

The way forward in predictive modelling

Two main takeaways from this study should guide further research:

First, the increase in accuracy of 17 percentage points over clinical data alone (from 70% to 87%) shows that treatment response is determined to a great degree by the characteristics of the tumor ecosystem as a whole. Among these characteristics, tumor proliferation and immune activation emerged as some of the most important determinants of response.

Second, the authors clearly demonstrated the importance of data integration in developing models that predict tumor treatment response. This could open the path to even more powerful predictive models that can be extended to other tumor types.


