Geometry-Aware Rotary Position Embedding for Consistent Video World Model Paper • 2602.07854 • Published Feb 8 • 10
meta-llama/Llama-3.2-11B-Vision-Instruct Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 212k • 1.58k
Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study Paper • 2406.07057 • Published Jun 11, 2024 • 17