Claude-3 Beats Gemini Ultra & GPT-4

Anthropic’s latest models outperform Google’s Gemini 1.0 Ultra and OpenAI’s GPT-4 across benchmarks.

Anthropic’s Claude is a family of AI models from Anthropic, an AI research company with safety as one of its core tenets. The company has seen an influx of $100 million led by Amazon, Google, and a bunch of other investors. That money has created the next, upgraded set of Claude models. In the official announcement, the company teased the three models or weights of Claude-3: Haiku, Sonnet, and Opus. Opus, as the name suggests, is the biggest, baddest of the lot that outperforms GPT-4 and Gemini 1.0 Ultra (not 1.5 Pro) in literally every benchmark.

Benchmarks give you a good idea but companies can easily fine-tune certain parameters within their models to just outperform a competitor for marketing brownie points. At worst, they might also just cherry-pick the most favorable results.

But in the case of Claude-3 Opus, the performance seems to be truly significant. It’s fast and can reason much better than the best models from Google and OpenAI, which is a staggering feat. With this, Anthropic has (properly) cemented itself as the third major AI chatbot in the world.

This new advancement boasts a 200k token context length (the tool can take longer inputs and have a better memory within a chat instance). The company promises a 1 million token context soon. Claude-3 also introduces multimodality, meaning now it can process media as well. What’s more, you can have it analyze up to 20 pictures in one prompt.

Opus: $15 per million input tokens, $75 per million output tokens
Sonnet: $3 per million input tokens, $15 per million output tokens
Haiku: $0.25 per million input tokens, $1.25 per million output tokens

At this point, you can safely tune yourself out of the incremental advances being made by these AI companies. More hiring, more experience, more knowledge, higher profits, and more funding are only going to take us from 75% to 80%, then to 85%, then to 90%, and so on. GPT-5 from OpenAI will most likely beat Claude-3, and Gemini 2 Ultra will beat GPT-5, and so on. Something truly revolutionary such as a model scoring 100% on the GPQA, the official release of an AGI model, or rogue AI launching nukes into your backyard will tune you right back in.

By Abhimanyu