Construction and Performance Optimization of a Multimodal Transformer-Based Fake News Detection Model
DOI: https://doi.org/10.70695/IAAI202601A7
Keywords: Transformer; False News; Detection; Cross Modal Interaction; Feature Fusion
Abstract
False news currently spreads as fused multimodal content, and existing detection methods offer limited interpretability. To address both issues, we explore a multimodal Transformer detection model (SEL-MSIT) that integrates supervised contrastive learning with multi-stage interaction. After feature extraction, a supervised contrastive learning module enhances the discriminative power of the features, and a multi-scale cross-modal interaction module captures deep correlations between modalities. A consistent attention mechanism then fuses the features efficiently. Experimental results show that SEL-MSIT outperforms mainstream baselines in accuracy, precision, recall, and F1-score. Ablation experiments verify the effectiveness of each optimization module, and the results can serve as supplementary data for decision-making.
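
The abstract does not state the exact form of the supervised contrastive objective, so the following is only a minimal sketch of the commonly used formulation, in which each sample is pulled toward other samples with the same label and pushed away from the rest of the batch. The function names, the temperature value of 0.1, and the toy feature vectors are all illustrative assumptions and are not taken from SEL-MSIT.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Sketch of a supervised contrastive loss over one batch.

    features: list of equal-length float vectors (hypothetical fused embeddings)
    labels:   list of class labels, e.g. 0 = real, 1 = fake (assumed encoding)
    temperature: softmax temperature (0.1 is an assumed default)
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Scaled cosine similarity between anchor i and every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Mean negative log-probability of the positives for this anchor.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two samples near one direction, two near another.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned_loss = supervised_contrastive_loss(toy_features, [0, 0, 1, 1])
shuffled_loss = supervised_contrastive_loss(toy_features, [0, 1, 0, 1])
```

In this toy batch the loss is smaller when the labels agree with the geometric clusters than when they are shuffled, which is the sense in which such an objective can sharpen feature discrimination. In practice the loop would be replaced by batched tensor operations in a deep learning framework.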