
Alibaba's open-weight multimodal model, released under Apache 2.0, natively accepts text, image, video, and audio inputs.
Released: September 22, 2025
Parameters: Unknown
Context: 128K tokens
Pricing: Free
| Benchmark | Category | Score | Performance |
|---|---|---|---|
| MMLU | knowledge | 81.2% | 81 |
| HellaSwag | language | 86.5% | 87 |
Last updated: March 15, 2026
Benchmark scores may vary based on evaluation methodology and conditions.