
Meta's Maverick AI Model Falls Short on Benchmark
- Earlier this week, Meta faced scrutiny for using an experimental Llama 4 Maverick model to score high on LM Arena.
- LM Arena's maintainers updated their policies and re-evaluated the unmodified Maverick, which ranked lower than rival models.
- The unmodified Maverick fell behind models like OpenAI’s GPT-4o and Google’s Gemini 1.5 Pro.
- Meta explained that its experimental Maverick was optimized for conversationality, which skewed the benchmark results.
- A Meta spokesperson expressed excitement for developers to customize Llama 4 and provide feedback.
Source: TechCrunch
