Google has given select companies early access to Gemini, Google DeepMind’s multimodal LLM.
According to a report by The Information (paywalled), Google has given a handful of companies access to an early version of its conversational AI tool, Gemini. The people The Information spoke with described how outside developers were given access to the much-anticipated software. Gemini will be offered as a cloud-based service for businesses and organizations, competing directly with OpenAI’s GPT-4 model, which is available through ChatGPT and ChatGPT Enterprise.
Organizations are increasingly looking to incorporate AI tools, especially those with generative capabilities, into their existing workflows, because doing so lets them get more done faster and makes their teams more productive. Though many companies are staying quiet about it, they are working behind the scenes to reduce the workforce they need wherever an AI tool can replace a human job.
As such, AI tools from companies like Google and OpenAI are a major attraction for big corporations, which can save millions over the years by deploying secure, well-built AI ecosystems to handle rote tasks.
In a nutshell, Gemini is a set of language models. You can use it to power chatbots or create organization-wide tools to summarize text. If you thought Bard was supposed to be the competitor to ChatGPT, it’s time to think again.
Though Google is developing Bard separately and diligently, Gemini is a DeepMind project. DeepMind was acquired by Google for a reported $500 million in 2014 and has since operated as a Google subsidiary, but its projects rarely overlap with Google’s main products. In many ways, it works differently from the typical product teams at Google.
For that reason, it’s quite remarkable that DeepMind and Google are “partnering” to give OpenAI tough competition.
Details about Gemini’s actual performance are scarce. As a multimodal LLM, it will have both text and image capabilities (with the possibility of additional data types). Ultimately, the aim is to make natural conversation better and more effective.
SEJ writer Kristi Hines unpacked a lot of what we know so far about Gemini, such as how it could integrate with APIs, support memory and planning, or be the largest language model, with 175B+ parameters.