Claude AI Knocks GPT-4 Out of First Place in Chatbot Arena

by Amira Saadeh - March 28, 2024

According to the Chatbot Arena Leaderboard, Anthropic’s Claude 3 Opus has dethroned OpenAI’s GPT-4 as the top LLM.

Chatbot Arena, overseen by LMSYS ORG, offers an interactive platform for users to evaluate and compare LLMs through crowdsourced assessments.
Claude 3 Opus boasts advanced reasoning, math, and coding abilities and an extensive knowledge base, with a context window of 200,000 tokens in its public version and reportedly up to 1 million tokens in a restricted version.

Anthropic’s Claude 3 Opus has overtaken OpenAI’s GPT-4 as the top Large Language Model (LLM), according to the Chatbot Arena leaderboard.

Chatbot Arena, managed by the Large Model Systems Organization (LMSYS ORG), is an online platform designed to evaluate LLMs and introduce new ones. Users can compare and rate different AI chatbots based on their own preferences. The platform relies on a crowdsourced approach where users interact with two unlabeled chatbots at a time and choose the one they find better.

Basically, you have a conversation with two anonymous chatbots at the same time and then judge which one answered your questions better, without knowing which is which. Try it; it’s fun. Because the models stay anonymous until after the vote, the setup makes it hard for model trainers to manipulate outcomes. Even though it is a qualitative assessment, it provides valuable insights for AI researchers seeking to understand user preference and model performance in real-world scenarios.
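The article does not say how these head-to-head votes are turned into a ranking. One common way to rank from pairwise comparisons is an Elo-style rating, sketched below purely as an illustration. The model names, the starting rating of 1000, and the K-factor of 32 are assumptions made for the example, not details from the leaderboard.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(votes, k=32.0, initial=1000.0):
    """Apply Elo updates for a sequence of (model_a, model_b, winner) votes.

    winner is "a", "b", or "tie". k and initial are illustrative defaults.
    """
    ratings = defaultdict(lambda: initial)
    outcome = {"a": 1.0, "b": 0.0, "tie": 0.5}
    for model_a, model_b, winner in votes:
        ra, rb = ratings[model_a], ratings[model_b]
        ea = expected_score(ra, rb)   # expected result for model A
        sa = outcome[winner]          # actual result for model A
        ratings[model_a] = ra + k * (sa - ea)
        ratings[model_b] = rb + k * ((1.0 - sa) - (1.0 - ea))
    return dict(ratings)


if __name__ == "__main__":
    # Hypothetical votes between two placeholder models.
    votes = [
        ("model-x", "model-y", "a"),
        ("model-x", "model-y", "tie"),
        ("model-y", "model-x", "b"),
    ]
    ranked = sorted(update_ratings(votes).items(), key=lambda kv: -kv[1])
    for name, rating in ranked:
        print(f"{name}: {rating:.1f}")
```

Each vote moves points from the loser to the winner, and an upset against a higher-rated model moves more points than an expected win, so the gap between two ratings reflects how often users prefer one model over the other.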

Ever since OpenAI came out with ChatGPT, its models have dominated the Chatbot Arena rankings. Now, however, Anthropic’s Claude 3 Opus has dethroned GPT-4.

LMSYS ORG shared Peter Gostev’s analysis of the Top-15 Chatbot Arena LLM ratings on its X (formerly Twitter) account. According to the bar chart race, it was a close call.

[Community creation]
Top-15 Chatbot Arena LLM ratings (May '23 – Now)

Credit: Peter Gostev https://t.co/OgnLu3rj64 pic.twitter.com/Ueq7DZpu8N
— lmarena.ai (formerly lmsys.org) (@lmarena_ai) March 27, 2024

Claude 3 Opus has advanced reasoning, mathematical prowess, coding capabilities, and an expansive knowledge base. It also boasts an impressive context window: it can handle up to 200,000 tokens in its public version and is reportedly capable of processing 1 million tokens with remarkable retrieval rates in a restricted version.

It’s really giving GPT-4 a run for its money.

Claude 3 Opus is not the only Anthropic AI model that made the cut. Sonnet, available for free, and Haiku, a smaller, faster model, have demonstrated competitive performance compared to their counterparts.

Interestingly, Meta is not on the list.

The results have multiple implications for the LLM race as a whole. The Arena is based on user preferences rather than purely objective metrics. This perspective may nudge AI development more towards human values and priorities in conversations.

Beyond that, this win positions Anthropic as a major competitor to OpenAI. All eyes are now on the two AI companies.
