Deep multimodal state-space fusion of endoscopic-radiomic and clinical data for survival prediction in colorectal cancer
PMC12756232 · DOI: 10.1038/s41746-025-02236-3
Gap Declaration
All analyses rely on retrospective public datasets with variable annotation quality, small sample sizes for certain tasks (particularly survival prediction), and no paired endoscopy-CT data at the subject level, which limits evaluation of true cross-modal complementarity. External validation covers only a small number of centres and acquisition protocols, and comparisons with prior survival models rely on results reported in separate cohorts rather than direct head-to-head evaluation. Important areas for future work include prospective, multi-centre validation; additional modalities such as MRI, pathology, and molecular data; paired multimodal datasets that allow better analysis of cross-modal interactions; and research into workflow integration and practical clinical impact.
Abstract
Integrating complementary surface and cross-sectional cues is central to preoperative assessment of colorectal cancer, but it is technically challenging because endoscopic images and pelvic CT encode anatomy at different scales. Here we present HydraMamba, a multimodal selective state-space framework that fuses endoscopy and CT for joint lesion segmentation, lesion detection, and survival prediction. The model couples a shared state-space backbone with two lightweight modules. Across the endoscopic dataset and the CT dataset, HydraMamba achieved state-of-the-art lesion analysis (endoscopy: Dice 0.856, F1 0.918; CT: Dice 0.812, F1 0.888) and delivered calibrated survival modeling on the CT dataset (Harrell’s C-index 0.832, Uno’s C@1y 0.853, integrated Brier score 0.161, calibration slope ≈1.01). …
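For readers unfamiliar with the lesion-analysis metrics quoted above, the following is a minimal sketch of how a Dice score and a detection F1 are conventionally computed. It is not the authors' evaluation code: the function names `dice_coefficient` and `f1_score` are hypothetical, masks are assumed to be flat binary sequences, and treating two empty masks as a perfect match is a convention chosen here for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def f1_score(tp, fp, fn):
    """Detection F1 from true-positive, false-positive, false-negative counts."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2.0 * tp / denom


if __name__ == "__main__":
    # Toy values only, unrelated to the reported results.
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))
    print(f1_score(tp=9, fp=1, fn=1))
```

Dice and F1 share the same algebraic form; by convention Dice is applied to pixel overlap in segmentation, while F1 is applied to matched lesion counts in detection.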
Conclusions / Discussion
HydraMamba unifies endoscopic and CT information within a selective state-space framework and delivers consistent gains across tasks. On endoscopy, the model attains Dice 0.856 and detection F1 0.918; on CT, Dice is 0.812 with F1 0.888. For survival prediction, HydraMamba reaches C = 0.832 with IBS = 0.161 and a calibration slope of 1.01, outperforming Cox+radiomics and DeepSurv, and exceeding transformer/Mamba baselines while maintaining superior calibration.
Ablations clarify complementary roles. Removing AnatoTI reduces boundary accuracy on both modalities; removing APSI primarily lowers detection recall and degrades survival discrimination and calibration. The full configuration yields the strongest survival performance and the lowest prediction error, indicating that anatomy-aware token interpolation and prototype-driven context injection together are necessary to realize the model’s performance envelope. The results show that the proposed modules, coupled with a state-space backbone, provide an effective and calibrated multimodal solution for colorectal cancer segmentation, detection, and survival prediction.
The current work has a number of shortcomings. The retro…
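The survival discrimination figure quoted in the discussion is a concordance index. As a rough illustration of what that number measures, the sketch below computes Harrell's C by direct pair counting. It is an assumption-laden toy, not the paper's pipeline: the name `harrell_c_index` is hypothetical, pairs with tied survival times are simply skipped, and a higher risk score is assumed to mean shorter expected survival.

```python
def harrell_c_index(times, events, risks):
    """Fraction of comparable pairs whose risk ordering matches outcome ordering.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : model risk scores (assumed: higher means earlier event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier subject of a pair must have an observed event
        for j in range(n):
            if times[i] < times[j]:  # tied times are skipped in this sketch
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy cohort only, unrelated to the reported results.
    print(harrell_c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. The C-index measures discrimination only, which is why the discussion pairs it with the integrated Brier score and the calibration slope.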
Keeper Review
The Appreciated Gateway must be evaluated by a human keeper.
Does this declaration represent a genuine open research gap?
Structural Hole
40% bridge
Technique originates in neuroscience; functional analogues in psychology, criminal justice literature are absent.
○ NAUGHT — Open Opportunity
No paper has claimed this gap. Appreciate the opportunity.
Provenance