Enhancing the Prediction of Breast Cancer Progression Through Multi-modal Data Transformation

Date
2025

Author
Abdullakutty, Faseela
Akbari, Younes
Al-Maadeed, Somaya
Bouridane, Ahmed
Hamoudi, Rifat
Abstract
The ability to predict breast cancer metastasis is essential for effective clinical decision-making and patient management. Traditional models rely predominantly on structured clinical data, which often lacks essential contextual detail, limiting predictive accuracy. To address this limitation, a multi-modal approach is introduced in which structured data is transformed into unstructured text, preserving contextual richness. From this text, a pre-trained diffusion model generates synthetic medical images across three key diagnostic modalities: histopathology, mammography, and ultrasound. The impact of several text-description variants on image quality and metastasis prediction was assessed; comprehensive tumor descriptions, or a combination of histological type and differentiation status, were the most effective generation strategies. Three prediction approaches were compared: a unimodal approach, an early fusion approach based on concatenation, and the Multi Co-Guided Attention (MCGA) approach. Through mutual attention, MCGA enhances feature alignment by addressing inter- and intra-modal heterogeneity and capturing complex cross-modal relationships. Unimodal and multi-modal methods were evaluated with SMOTE applied to mitigate class imbalance. Multi-modal fusion significantly outperformed unimodal methods, especially when class imbalance was mitigated with SMOTE: Ultrasound+BERT achieved the highest accuracy (0.90), followed by Histopathology+BERT (0.88) and Mammogram+BERT (0.88). Compared with early fusion, MCGA demonstrated better class balance and improved minority-class detection. Incorporating unstructured text with synthetic imaging modalities improves the accuracy of metastasis prediction by preserving contextual information, and MCGA fusion is particularly effective at maintaining balanced class performance when rectifying class imbalance. This approach leverages the complementary strengths of textual and visual data to overcome limitations in multi-modal integration. These results demonstrate the potential to advance breast cancer metastasis prediction, offering a more robust and context-aware framework for clinical decision-making.
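The abstract's first step is serializing structured clinical records into free text. The sketch below illustrates one plausible way to do this; the field names, template wording, and example values are assumptions for illustration, not the paper's actual schema.

```python
# Hypothetical structured-to-text transformation: turn tabular clinical
# fields into a narrative description that a text encoder (e.g. BERT) or a
# text-to-image model can consume. Field names are assumed, not the paper's.

def record_to_text(record: dict) -> str:
    """Serialize one structured clinical record into a free-text description."""
    return (
        f"A {record['age']}-year-old patient with "
        f"{record['histological_type']} of the breast, "
        f"{record['differentiation']} differentiation, "
        f"tumor size {record['tumor_size_mm']} mm."
    )

example = {
    "age": 54,
    "histological_type": "invasive ductal carcinoma",
    "differentiation": "moderate",
    "tumor_size_mm": 22,
}
print(record_to_text(example))
# -> A 54-year-old patient with invasive ductal carcinoma of the breast,
#    moderate differentiation, tumor size 22 mm.
```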
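The text descriptions then condition a pre-trained diffusion model to generate synthetic images. As a minimal sketch of that step, the snippet below uses the Hugging Face diffusers library; the checkpoint name and prompt are assumptions, since the abstract does not identify the specific model used.

```python
# Minimal sketch of text-conditioned image generation with a pre-trained
# diffusion model via Hugging Face diffusers. Checkpoint and prompt are
# illustrative assumptions, not the paper's configuration.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed checkpoint for illustration
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Prompt built from histological type + differentiation status, one of the
# generation strategies the abstract reports as most effective.
prompt = (
    "Histopathology slide of invasive ductal carcinoma, "
    "moderate differentiation"
)
image = pipe(prompt, num_inference_steps=50).images[0]
image.save("synthetic_histopathology.png")
```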
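To make the comparison between fusion strategies concrete, here is a generic PyTorch sketch contrasting early fusion by concatenation with a two-way cross-attention fusion in the spirit of MCGA. The layer sizes and the mutual-attention wiring are assumptions; the abstract does not specify MCGA's internal architecture.

```python
# Sketch: early fusion (concatenation) vs. mutual cross-attention fusion.
# Dimensions and wiring are assumed for illustration, not MCGA's actual design.
import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    """Concatenate pooled text and image features, then classify."""
    def __init__(self, dim: int, n_classes: int = 2):
        super().__init__()
        self.head = nn.Linear(2 * dim, n_classes)

    def forward(self, text_feat, img_feat):  # each (B, dim)
        return self.head(torch.cat([text_feat, img_feat], dim=-1))

class MutualAttentionFusion(nn.Module):
    """Two-way cross-attention: each modality attends to the other,
    which aligns features across inter- and intra-modal heterogeneity."""
    def __init__(self, dim: int, n_heads: int = 8, n_classes: int = 2):
        super().__init__()
        self.txt2img = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.img2txt = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.head = nn.Linear(2 * dim, n_classes)

    def forward(self, text_tokens, img_tokens):  # (B, T, dim), (B, I, dim)
        t_att, _ = self.txt2img(text_tokens, img_tokens, img_tokens)
        i_att, _ = self.img2txt(img_tokens, text_tokens, text_tokens)
        fused = torch.cat([t_att.mean(dim=1), i_att.mean(dim=1)], dim=-1)
        return self.head(fused)

# Smoke test with random features (BERT-sized 768-dim tokens assumed).
text_tokens = torch.randn(8, 16, 768)
img_tokens = torch.randn(8, 49, 768)
print(MutualAttentionFusion(768)(text_tokens, img_tokens).shape)  # (8, 2)
```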
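Finally, the abstract reports results with SMOTE applied to counter class imbalance. A standard way to do this is with the imbalanced-learn library, sketched below on synthetic feature vectors; applying SMOTE only to the training split is the usual precaution against leakage.

```python
# Sketch: SMOTE oversampling of the minority (metastasis) class with
# imbalanced-learn. The feature matrix and class ratio here are synthetic.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split

X = np.random.randn(500, 768)               # e.g. fused text/image features
y = np.array([0] * 450 + [1] * 50)          # imbalanced metastasis labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
print(np.bincount(y_tr), "->", np.bincount(y_res))  # minority class balanced
```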
Collections
- Computer Science & Engineering

