Falcon 40B is the New Open-Source LLM Champion

TII’s Falcon LLM is the new #1 among open-source language models, on par with GPT-3, StableLM, and PaLM-62B.

Falcon 40B is an open-source large language model released under the Apache License 2.0, which means it can be used for commercial purposes. Abu Dhabi’s Technology Innovation Institute (TII) created this top-performing LLM on Amazon Web Services. Anyone can run Falcon through SageMaker on AWS.

Official page: Falcon LLM.

Stability AI and Hugging Face are already using Falcon through AWS. This 40-billion-parameter LLM is a good fit for anyone eager to build AI applications without a large investment. The idea behind TII’s Falcon LLM is to give a taste of what generative AI can do. The main focus of the Falcon LLM is not chatbot conversations like ChatGPT but programming.

As of June 15, 2023, it ranks #1 on Hugging Face’s Open LLM Leaderboard, ahead of CalderaAI’s Lazarus, on the strength of its accuracy and reliability. It has already outperformed many popular open-source LLMs, such as Meta’s LLaMA, Stability AI’s StableLM, RedPajama, and MPT. Trained for two months on 384 A100 GPUs, it surpasses OpenAI’s GPT-3 while needing a smaller compute budget and less inference time, mainly thanks to FlashAttention. It also required significantly less compute than Chinchilla and PaLM-62B.
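To put that training setup in perspective, the GPU-hours behind a two-month run on 384 A100 GPUs can be estimated with simple arithmetic. This is only a rough sketch: it assumes "2 months" means about 60 days of continuous use, and both that day count and the function name are illustrative assumptions rather than figures from TII.

```python
def gpu_hours(num_gpus: int, days: float) -> float:
    """Total GPU-hours for a run that keeps every GPU busy around the clock."""
    return num_gpus * days * 24


# Assumption: "2 months" is taken as roughly 60 days of continuous training.
falcon_gpu_hours = gpu_hours(num_gpus=384, days=60)
print(f"{falcon_gpu_hours:,.0f} GPU-hours")  # 552,960 GPU-hours
```

Under that assumption the run comes to a little over half a million A100 GPU-hours; a shorter or interrupted run would lower the figure accordingly.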

The training data draws on high-quality sources, including web crawls, research papers, and conversations (Reddit, Stack Exchange, and Hacker News).

Falcon can be tried out on lower-end hardware using the smaller Falcon 7B model, though it’s 20% slower. As a versatile LLM, Falcon 40B can also be fine-tuned for specific language processing tasks.
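A quick way to see why Falcon 7B suits lower-end hardware is to estimate how much memory the weights alone occupy. The sketch below assumes nominal parameter counts of 7 billion and 40 billion and 16-bit weights (2 bytes each); it ignores activations and other runtime overhead. It also includes an uncalled helper showing one plausible way to generate text with the Hugging Face transformers pipeline. The model ID tiiuae/falcon-7b and the helper name are assumptions not taken from this article.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: nominal parameter counts taken from the model names.
print(weight_memory_gb(7e9))   # 14.0 (GB, 16-bit weights)
print(weight_memory_gb(40e9))  # 80.0 (GB, 16-bit weights)


def generate_with_falcon(prompt: str, model_id: str = "tiiuae/falcon-7b") -> str:
    """Hedged sketch of text generation with a Falcon checkpoint.

    Not called here, because it downloads several GB of weights.
    """
    # Imported lazily so the estimate above runs without these packages.
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model=model_id,
        torch_dtype=torch.bfloat16,  # 16-bit weights, matching the estimate above
        device_map="auto",           # needs the `accelerate` package installed
        trust_remote_code=True,      # early Falcon repos shipped custom model code
    )
    result = generator(prompt, max_new_tokens=50, do_sample=False)
    return result[0]["generated_text"]
```

By this estimate the 7B weights fit on a single 16 GB or 24 GB GPU, while the 40B weights need roughly 80 GB before any overhead, which usually means multiple GPUs or quantization.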

TII trained the LLM on a whopping 1 trillion tokens.
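The token count also allows a back-of-the-envelope check on the compute comparisons above. A common rule of thumb puts training compute at about 6 x parameters x tokens floating-point operations. The sketch below applies it to Falcon 40B using this article's figures; the parameter and token counts for GPT-3, Chinchilla, and PaLM-62B are the commonly reported ones and should be treated as assumptions, as should the rule of thumb itself.

```python
def training_flops(num_params: float, num_tokens: float) -> float:
    """Approximate training compute with the 6 * N * D rule of thumb."""
    return 6 * num_params * num_tokens


# Falcon 40B: parameter and token counts as stated in this article.
falcon = training_flops(40e9, 1e12)

# Assumptions: commonly reported sizes and token counts, not from this article.
others = {
    "GPT-3": training_flops(175e9, 300e9),
    "Chinchilla": training_flops(70e9, 1.4e12),
    "PaLM-62B": training_flops(62e9, 780e9),
}

print(f"Falcon 40B: {falcon:.2e} FLOPs")
for name, flops in others.items():
    # Ratio below 1.0 means Falcon 40B used less estimated training compute.
    print(f"Falcon 40B / {name}: {falcon / flops:.2f}")
```

Under these assumptions Falcon 40B comes out at roughly 0.76 of GPT-3's estimated training compute, 0.41 of Chinchilla's, and 0.83 of PaLM-62B's. The estimate ignores hardware efficiency and architectural differences, so it is a sanity check rather than a measurement.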

Open-source LLMs are making a big entrance on the world stage, and TII’s Falcon is a state-of-the-art model that has opened up new possibilities. It’s on par with models from DeepMind and Anthropic and with GPT-3 without being closed-source, and that’s saying something.

By Abhimanyu

Unwrapping the fast-evolving AI popular culture.