All charts and visualisations are created using Lava Metrics.
Marketing AI Performance Leaderboard - June 2025 Results
Overall Leaderboard June 2025
Overall Scores by LLM
What are the overall results?
🏆 Overall Winner: OpenAI: o4-mini-high
❌ Overall Loser: Qwen: qwen-max
Category Winners and Losers
Category Ranking (Best → Worst)
Ranking reflects the average performance of all tested LLMs in each task category, ordered from highest to lowest.
- Copywriting
- Online Research
- Internal Data Analysis
- Strategic Planning
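As a rough illustration of how a category ranking like this can be derived, the sketch below averages each category's scores across all models and sorts the averages from highest to lowest. The model names, the score values, and the `rank_categories` helper are hypothetical placeholders, not the leaderboard's actual data or implementation.

```python
from statistics import mean

# Hypothetical scores (NOT the leaderboard's real results):
# model name -> task category -> score
scores = {
    "model_a": {"Copywriting": 90.0, "Online Research": 80.0},
    "model_b": {"Copywriting": 70.0, "Online Research": 60.0},
}


def rank_categories(scores):
    """Average each category across all models, then sort best to worst."""
    by_category = {}
    for per_model in scores.values():
        for category, value in per_model.items():
            by_category.setdefault(category, []).append(value)
    averages = {category: mean(values) for category, values in by_category.items()}
    # Highest average first, matching the "Best → Worst" ordering
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = rank_categories(scores)
for category, average in ranking:
    print(f"{category}: {average:.1f}")
```

A plain mean treats every model equally; a weighted mean or a median would be alternatives if some models were tested on fewer tasks.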
FAQs
What does this Leaderboard represent?
We have designed tests that simulate a marketer’s interaction with native platform UIs (e.g., ChatGPT, Gemini) across several marketing domains:
- Copywriting: Generating ad copy, email subject lines, and social media posts.
- Internal Data Analysis: Interpreting sample CRM data to identify trends and insights.
- Strategic Planning: Creating marketing plans based on given scenarios.
- Online Research: Gathering information from the web to support marketing decisions.
How were the tests scored?
Each test output is evaluated by specialised AI “judges.”
- Judges are themselves AI agents configured with specific evaluation criteria.
- They parse the Test Answer, compare it against expected outcomes or benchmarks, and score on multiple dimensions (e.g., factual correctness, tone, format).
- Scores across dimensions are then normalised and aggregated to produce a single final score per test.
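The normalise-and-aggregate step could look something like the minimal sketch below. It assumes each judge reports a raw score on its own scale, that min-max scaling is used for normalisation, and that dimensions are combined with an equal-weight mean. None of these details are confirmed by the page; the dimension scales, the values, and the `normalise` and `aggregate` helpers are illustrative assumptions.

```python
def normalise(raw, low, high):
    """Min-max scale a raw judge score into the range 0..1 (assumed method)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return (raw - low) / (high - low)


def aggregate(dimension_scores, weights=None):
    """Weighted mean of normalised scores; equal weights when none are given."""
    if weights is None:
        weights = {name: 1.0 for name in dimension_scores}
    total_weight = sum(weights[name] for name in dimension_scores)
    weighted_sum = sum(
        dimension_scores[name] * weights[name] for name in dimension_scores
    )
    return weighted_sum / total_weight


# Hypothetical judge output: dimension -> (raw score, scale minimum, scale maximum)
judge_output = {
    "factual_correctness": (4, 1, 5),
    "tone": (8, 0, 10),
    "format": (1, 0, 1),
}

normalised = {name: normalise(*values) for name, values in judge_output.items()}
test_score = aggregate(normalised)  # single value for this test, roughly 0.85
print(round(test_score, 2))
```

Passing a `weights` dictionary lets one dimension, such as factual correctness, count for more than the others.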