TRUST-VL: A Unified and Explainable Vision–Language Model for General Multimodal Misinformation Detection
Zehong Yan, Peng Qi, Wynne Hsu and Mong Li Lee
National University of Singapore
EMNLP 2025 / PDF / Project Page / Code / Data
We present TRUST-VL, a unified and explainable vision–language model for multimodal misinformation detection that covers textual, visual, and cross-modal distortions. The model is enhanced by a Question-Aware Visual Amplifier (QAVA) module and trained on TRUST-Instruct, a large-scale instruction dataset of 198K reasoning samples.
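To make the QAVA idea concrete, below is a minimal, hedged sketch of what a question-aware visual amplifier could look like: question-token embeddings cross-attend to visual patch tokens so that the visual features passed to the language model are focused on the detection question at hand. The class name, dimensions, and the specific cross-attention design are illustrative assumptions, not the released TRUST-VL implementation.

```python
# Illustrative sketch only -- NOT the official QAVA implementation.
# Assumption: QAVA conditions visual features on task-specific question embeddings
# via cross-attention; all dimensions below are placeholders.
import torch
import torch.nn as nn


class QuestionAwareVisualAmplifier(nn.Module):
    """Question tokens query visual patch tokens, yielding question-focused
    visual features to feed into the language model."""

    def __init__(self, vis_dim: int = 1024, txt_dim: int = 768, num_heads: int = 8):
        super().__init__()
        self.vis_proj = nn.Linear(vis_dim, txt_dim)  # align visual dim to text dim
        self.cross_attn = nn.MultiheadAttention(txt_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(txt_dim)

    def forward(self, question_tokens: torch.Tensor, visual_tokens: torch.Tensor) -> torch.Tensor:
        # question_tokens: (B, Lq, txt_dim); visual_tokens: (B, Lv, vis_dim)
        vis = self.vis_proj(visual_tokens)
        attended, _ = self.cross_attn(query=question_tokens, key=vis, value=vis)
        return self.norm(question_tokens + attended)  # residual + layer norm


if __name__ == "__main__":
    qava = QuestionAwareVisualAmplifier()
    q = torch.randn(2, 16, 768)    # toy question embeddings
    v = torch.randn(2, 256, 1024)  # toy visual patch embeddings
    print(qava(q, v).shape)        # torch.Size([2, 16, 768])
```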