Mistral AI’s 7B is a small language model that matches or beats Llama 2 13B’s performance at a lower compute cost, released under a no-restrictions license.
Mistral AI, a French AI startup, has released its first language model, a small model that it is making free for everyone with no restrictions. The company claims it performs better than other language models of the same size.
Named Mistral 7B, the model can be downloaded via torrent (link to 14.5GB .pth file) or from GitHub (link), or used via Hugging Face (link). It has also been integrated into Perplexity Labs, where anyone can test the 7B-instruct version.
It outperforms Llama 2 13B on all benchmarks and approaches CodeLlama 7B’s performance on code and programming tasks. Mistral AI’s stated mission is to provide “credible alternatives to the emerging AI oligopoly,” and the model is released under an Apache 2.0 license.
In a public tweet, the company shared the magnet link for the torrent download without any further context.
On Reddit, users reported that the model holds up better than other small models (echoing the company’s own claims) and that its responses were genuinely good for smaller tasks. One user also pointed out that it has little built-in censoring and can return dangerous responses to user queries.
Because the license places no restrictions on use, you can run the model locally and repurpose, retrain, or deploy it however you like. Notably, Mistral 7B has a “considerably smaller compute cost” than comparable models such as Llama 2.
Mistral 7B is optimized for low latency, text summarization, classification, text completion, and code completion. It has natural coding abilities and an 8k-token sequence length.
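For readers who want to try the model locally, a minimal sketch using the Hugging Face transformers library might look like the following. The repository id mistralai/Mistral-7B-v0.1 is an assumption based on the Hugging Face listing mentioned above, and loading the model requires a transformers release recent enough to include Mistral support.

```python
# Minimal sketch: running Mistral 7B locally for text completion with transformers.
# The model id below is assumed from the Hugging Face listing; point it at your
# own download location if you fetched the weights via torrent or GitHub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory footprint
    device_map="auto",          # place layers on available GPU/CPU automatically
)

prompt = "Summarize the following text in one sentence:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The base model is a plain completion model; instruction-style prompts
# generally work better with the 7B-instruct variant.
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```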
In the official blog post, the Mistral AI team wrote, “Our ambition is to become the leading supporter of the open generative AI community, and bring open models to state-of-the-art performance. We will make them the go-to solutions for most of the generative AI applications. Many of us played pivotal roles in important episodes in the development of LLMs; we’re thrilled to be working together on new frontier models with a community-oriented mindset.”
On the official product page, the company has said that “larger models, better reasoning, multiple languages” are coming soon.