Overall Leaderboard June 2025

All charts and visualisations are created using Lava Metrics.

 
Marketing AI Performance Leaderboard - June 2025 Results


Overall Scores by LLM

🔗 See full results dashboard

What are the overall results?

🏆 Overall Winner: OpenAI: o4-mini-high
❌ Overall Loser: Qwen: qwen-max
 

Category Winners and Losers

Losers by Category ❌

🔗 See full results dashboard
 

Winners by Category 🏆

🔗 See full results dashboard
 

Category Ranking (Best → Worst)

Categories are ranked by the average score of all tested LLMs in each task category, from highest to lowest.
  1. Copywriting
  2. Research Online
  3. Analysis of Internal Data
  4. Strategic Planning
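
As a rough illustration of how a category ranking like the one above could be derived, the sketch below averages per-test scores across all models within each category and sorts the categories from best to worst. The function name, the tuple layout, and the sample values are all hypothetical placeholders, not the leaderboard's actual data or pipeline.

```python
from collections import defaultdict
from statistics import mean


def rank_categories(results):
    """Rank categories by the mean score of all models, best first.

    `results` is an iterable of (llm, category, score) tuples.
    This layout is an assumption made for illustration only.
    """
    by_category = defaultdict(list)
    for _llm, category, score in results:
        by_category[category].append(score)

    # Average across every model tested in the category.
    averages = {cat: mean(scores) for cat, scores in by_category.items()}

    # Highest average first, matching the Best -> Worst ordering.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Placeholder data: model names, category names and scores are invented.
sample = [
    ("model_a", "category_x", 0.9),
    ("model_b", "category_x", 0.7),
    ("model_a", "category_y", 0.6),
    ("model_b", "category_y", 0.5),
]

ranking = rank_categories(sample)
print([category for category, _ in ranking])  # ['category_x', 'category_y']
```

Averaging across models means a category's position says how well LLMs handle that kind of task in general, not how any single model performs.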
 

FAQs

What does this Leaderboard represent?
We have designed tests that simulate a marketer’s interaction with native platform UIs (e.g., ChatGPT, Gemini) across four marketing domains:
 
  • Copywriting: Generating ad copy, email subject lines, and social media posts.
  • Internal Data Analysis: Interpreting sample CRM data to identify trends and insights.
  • Strategic Planning: Creating marketing plans based on given scenarios.
  • Online Research: Gathering information from the web to support marketing decisions.
 
How were the tests scored?
Each test output is evaluated by specialised AI “judges.”
 
  • Judges are themselves AI agents configured with specific evaluation criteria.
  • They parse the test answer, compare it against expected outcomes or benchmarks, and score it on multiple dimensions (e.g., factual correctness, tone, format).
  • Final scores are normalised and aggregated to produce a single value per test.
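
The normalisation and aggregation step can be pictured with a minimal sketch. It assumes each judge scores a dimension on a 0 to 10 scale and that dimensions are combined with an unweighted mean. Both are assumptions for illustration, since the exact scale and weighting are not stated here, and the function names are hypothetical.

```python
from statistics import mean

# Assumed raw scale for judge scores (not confirmed by the leaderboard).
RAW_MIN, RAW_MAX = 0.0, 10.0


def normalise(raw, lo=RAW_MIN, hi=RAW_MAX):
    """Map a raw judge score onto the 0-1 range, clamping out-of-range values."""
    clamped = min(max(raw, lo), hi)
    return (clamped - lo) / (hi - lo)


def aggregate_score(dimension_scores):
    """Combine normalised dimension scores into one value per test.

    Uses an unweighted mean; a real pipeline may weight dimensions differently.
    """
    if not dimension_scores:
        raise ValueError("at least one dimension score is required")
    return mean(normalise(score) for score in dimension_scores.values())


# Invented example scores for the dimensions named in the text.
example = {"factual_correctness": 8.0, "tone": 6.0, "format": 10.0}
print(round(aggregate_score(example), 2))  # 0.8
```

Normalising before aggregating keeps dimensions comparable if judges ever use different raw scales.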
Where can I see the full results?