Groq’s LPUs Offer Instant Generative AI Responses


Using new hardware, generative AI outputs have become nearly instantaneous, opening up new opportunities for AI companies.

Groq is a California-based semiconductor company (not to be confused with Elon Musk’s Grok LLM) that has created the LPU – Language Processing Unit (a trademarked term) – for generative AI workloads. Its LPU Inference Engine uses this new hardware to generate responses to prompts almost instantly.

Background: Most companies use GPUs with dedicated AI accelerators to train their models, run inference, and serve users (such as when you type a prompt into ChatGPT or Gemini). GPUs are graphics cards originally designed for gaming. They can do the job, but they were not built specifically for AI inference tasks like processing natural language, so many companies are designing more specialized chips. Nvidia, the leading chipmaker with the biggest slice of the GPU market, is banking heavily on the explosion in generative AI by offering increasingly AI-oriented GPUs. At the same time, Big Tech is investing heavily in fully specialized AI chips that can do the job faster. After all, generative AI tools don’t need to run games at high FPS.

Groq’s LPU is one of the first practical AI-only chips, and it could transform generative AI. A chatbot can answer far faster when its inference runs on LPUs instead of GPUs. On the official website, you can test the LPU’s scary fast speeds on Llama, Mixtral, and Mistral LLMs.
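To see why raw decode speed matters for a chatbot, here is a minimal back-of-envelope sketch. The throughput figures below are illustrative placeholders, not measured Groq or GPU benchmarks:

```python
# Back-of-envelope: how decode throughput (tokens/second) translates
# into how long a user waits for a full chatbot answer.
# All throughput numbers are hypothetical, for illustration only.

def generation_time(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to stream num_tokens at a given decode throughput."""
    return num_tokens / tokens_per_second

RESPONSE_TOKENS = 300  # a typical medium-length chatbot answer (assumed)

gpu_tps = 50.0   # hypothetical GPU-class decode speed
lpu_tps = 500.0  # hypothetical LPU-class decode speed

print(f"GPU-class: {generation_time(RESPONSE_TOKENS, gpu_tps):.1f} s")
print(f"LPU-class: {generation_time(RESPONSE_TOKENS, lpu_tps):.1f} s")
```

At these assumed rates, a 300-token reply takes 6 seconds on the slower chip but under a second on the faster one – the difference between watching text trickle in and getting a near-instant answer.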


If all goes according to plan, Groq’s LPUs could displace GPUs in data centers – and threaten Nvidia’s entire business – unless others step forward with similar technology.

Groq clarifies on its website that it claimed the name back in 2016, well before Elon Musk’s Grok LLM appeared. It also notes that it owns the trademark, so Musk should “please choose another name, and fast.” The online demo is already drawing record numbers of users, and the company looks set to give ChatGPT-on-Nvidia and Elon Musk a run for their money.

Tech companies such as Microsoft, Google, Apple, and Amazon have a decision to make – ditch Nvidia to experiment with these flashy new LPUs, or double down on funding their own custom AI chips to compete with Groq’s LPUs.

Groq says it can currently deliver 390 racks of its LPUs within 6-12 months, with lead times of 6-12 weeks for individual orders.

By Abhimanyu

Unwrapping the fast-evolving AI popular culture.