Breast cancer is a genetically heterogeneous disease, with distinct gene expression patterns within a single tumor. However, genomic examination is invasive and expensive, which limits its use in routine clinical practice. Magnetic resonance imaging (MRI), by contrast, is noninvasive and widely used in cancer diagnosis and treatment. We therefore developed a contrastive learning-based framework to synthesize genomic characteristics from MRI. Specifically, image features were extracted with a 3D-ResNet18 architecture, and cell subpopulation features were obtained through a multilayer perceptron (MLP). The contrastive learning network aligns the image features and genomic features in the representation space using a contrastive loss. The weights of the image feature extractor from the contrastive learning stage were then used as pretrained weights for the generator in the generative model, and a discriminator was used to distinguish generated immune cell subpopulations from real ones. Survival analysis of the generated immune cell subpopulations was conducted using the log-rank test. The dataset comprised 135 patients, with 81 assigned to the training set and 54 to the testing set. A univariate Cox proportional hazards model identified ten immune cell subpopulations significantly associated with overall survival. These subpopulations were generated with the proposed model, and a risk score was calculated using multivariate Cox regression. The generated immune cell risk score achieved an R-squared of 0.48 and 0.43 in the validation and test cohorts, respectively. Grouping patients by risk score revealed significant differences in prognosis, with p values of 0.033 and 0.011 in the validation and test sets, respectively.
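To illustrate how a contrastive loss can align image and genomic features in a common representation space, the following is a minimal NumPy sketch. The abstract does not state the exact form of the loss, so a symmetric InfoNCE (CLIP-style) objective is assumed here; the function names, the temperature of 0.07, and the embedding sizes are all hypothetical choices made for illustration, and random arrays stand in for the 3D-ResNet18 and MLP outputs.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(image_emb, genomic_emb, temperature=0.07):
    # Assumed form: symmetric InfoNCE. Row i of each matrix is the same patient,
    # so the diagonal of the similarity matrix holds the positive pairs.
    img = l2_normalize(image_emb)
    gen = l2_normalize(genomic_emb)
    logits = img @ gen.T / temperature
    idx = np.arange(logits.shape[0])
    loss_img_to_gen = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_gen_to_img = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_img_to_gen + loss_gen_to_img)

# Random stand-ins for encoder outputs: 8 patients, 16-dimensional embeddings.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(8, 16))

# Perfectly aligned pairs versus deliberately mismatched pairs (reversed order).
aligned_loss = symmetric_contrastive_loss(image_emb, image_emb)
shuffled_loss = symmetric_contrastive_loss(image_emb, image_emb[::-1])
print(aligned_loss < shuffled_loss)
```

Under this assumed objective, the loss falls as each patient's image embedding moves closer to that patient's genomic embedding and away from those of other patients, which is the alignment behaviour the framework relies on before the image encoder weights are reused in the generator.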