Google Releases Open LLMs

Google releases two “Gemma” models: open models with publicly available weights, accessible through Colab and Hugging Face, among other platforms.

Google has been doing AI and ML research since well before the current AI boom, and much of that work has been open-source. In a bid to make “AI helpful for everyone,” the company has announced two sizes of its open model, Gemma. Note that “open” here does not mean open-source. The technology behind Gemma is the same one used to create Gemini, but the Gemma models are naturally not as powerful, though still useful for many use cases.

Gemma 2B and Gemma 7B are both available to developers on Colab, Kaggle notebooks, Hugging Face, MaxText, and Nvidia NeMo. Like Falcon, Mixtral, and Llama, other openly available AI models, they can also be run locally on your own PC’s hardware.
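For readers who want to try the Hugging Face route, here is a minimal sketch in Python. It assumes the `transformers` and `torch` packages are installed, that the 2B checkpoint is published under the model id `google/gemma-2b`, and that you have accepted the model's terms on Hugging Face and logged in. The `generate` helper and its parameters are illustrative names, not anything from Google's announcement.

```python
MODEL_ID = "google/gemma-2b"  # assumed Hugging Face model id for the 2B checkpoint


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load Gemma from Hugging Face and return a completion for `prompt`."""
    # Imported lazily so this file can be read or imported without the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Downloads the tokenizer and weights on first use (several GB).
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Turn the prompt into PyTorch tensors, generate, and decode back to text.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("What is an open model?")` would download the weights on first use and run the model on the CPU by default, so expect it to be slow without a GPU.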

According to the official announcement, these models are “inspired by Gemini” and “state-of-the-art,” though the announcement itself says little about how they were trained. The idea is to give developers more flexibility through access to better weights. Open models already let individuals, organizations, and entire companies build their own models on top of pre-trained ones, for example by fine-tuning Llama on a company’s proprietary data after cleaning and tokenizing internal documents.
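To make "cleaning and tokenizing" a little more concrete, here is a toy sketch in plain Python. It is not how Llama or Gemma tokenize text (real pipelines use the model's own subword tokenizer), and every function name here is hypothetical. It only illustrates the shape of the step: strip markup, normalize whitespace, then map text to integer ids.

```python
import re


def clean_document(text: str) -> str:
    """Strip HTML-style tags and collapse runs of whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)  # drop anything that looks like a tag
    text = re.sub(r"\s+", " ", text)      # collapse spaces, tabs, and newlines
    return text.strip()


def tokenize(text: str) -> list:
    """Toy tokenizer: lowercase words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def build_vocab(docs) -> dict:
    """Give each distinct token an integer id; id 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for doc in docs:
        for tok in tokenize(doc):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(text: str, vocab: dict) -> list:
    """Map text to token ids, using 0 for tokens not seen when building the vocab."""
    return [vocab.get(tok, 0) for tok in tokenize(text)]


# Hypothetical internal document, used only to exercise the functions.
cleaned = clean_document("<p>Q3  revenue\n grew.</p>")
vocab = build_vocab([cleaned])
ids = encode("Revenue fell.", vocab)
```

In a real fine-tuning run, the id sequences would come from the base model's tokenizer so that they line up with the vocabulary the pre-trained weights expect.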

Google’s Gemma aims to achieve something similar.

On Hugging Face’s leaderboard for open LLMs, Gemma’s best average score is 64.29 (Gemma-7B), lower than many variants of Mistral, Llama 2, Platypus2, and others. Many of those models are fine-tuned on domain-specific datasets for particular tasks, which can push their scores higher. The top spot, for example, is held by Abacus AI’s Smaug, a 72B LLM, with a score of 80.48.
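For context on what an "average score" means here, the sketch below assumes the leaderboard's headline number is the unweighted mean of its per-benchmark scores, and that the benchmark set is the six listed in the code, as it was around the time of writing. The scores in the example are made-up placeholders, not Gemma's actual results.

```python
from statistics import mean

# Assumed benchmark set behind the leaderboard average.
BENCHMARKS = ("ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K")


def leaderboard_average(scores: dict) -> float:
    """Return the unweighted mean of the per-benchmark scores, rounded to 2 places."""
    missing = [b for b in BENCHMARKS if b not in scores]
    if missing:
        raise ValueError(f"missing benchmark scores: {missing}")
    return round(mean(scores[b] for b in BENCHMARKS), 2)


# Placeholder numbers for illustration only; not real leaderboard data.
example_scores = {
    "ARC": 60.0,
    "HellaSwag": 80.0,
    "MMLU": 65.0,
    "TruthfulQA": 45.0,
    "Winogrande": 78.0,
    "GSM8K": 50.0,
}
average = leaderboard_average(example_scores)
```

Because the mean is unweighted, a model tuned hard for one or two of these benchmarks can raise its headline score without being better across the board.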

According to Google, Gemma’s training data was filtered to remove personal and sensitive information, and the models are also aligned using RLHF. Google, though, had this to say:

“Notably, Gemma surpasses significantly larger models on key benchmarks while adhering to our rigorous standards for safe and responsible outputs.”

The announcement points to a technical report (PDF) in which Gemma 7B is compared with Llama 2 7B, Llama 2 13B, and Mistral 7B across question answering, reasoning, math/science, and coding. In question answering the models are toe-to-toe, but in every other category Gemma 7B comes out ahead of the rest.

How suitable developers find the Gemma models for fine-tuning and inference remains to be seen.

By Abhimanyu

Unwrapping the fast-evolving AI popular culture.