IQBench: How "Smart" Are Vision-Language Models? A Study with Human IQ Tests Paper • 2505.12000 • Published May 17, 2025 • 2
MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder Paper • 2409.14074 • Published Sep 21, 2024 • 2
SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization Paper • 2412.16771 • Published Dec 21, 2024
SilVar-Med: A Speech-Driven Visual Language Model for Explainable Abnormality Detection in Medical Imaging Paper • 2504.10642 • Published Apr 14, 2025 • 2