A Medium blog post has leaked several screenshots hinting at the possible features of Google’s upcoming LLM, Gemini, which will replace PaLM 2.
Apparently, Google’s much-anticipated GPT-4 competitor, Gemini, has been leaked in a Medium blog post by one Bedros Pamboukian. It’s noteworthy that Gemini is the replacement for PaLM 2, Google’s current LLM, which powers Bard. After the shift, Bard will be powered by Gemini instead. Gemini is a multimodal LLM, meaning it can handle input and output in text as well as images.
Throughout the blog post, the writer provides several screenshots and comments on the new features Gemini is expected to include. By the looks of it, the leaks are substantial and serious. This could be Google’s next breakthrough, one that finally puts it back on the map as #1 rather than a follower behind OpenAI’s ChatGPT.
A new feature called “Stubbs” will apparently allow users to create actual apps with a single prompt. Rather than being full-fledged programs, these will likely be closer to prototypes or low-code applications that require a developer’s touch to fully customize or operate. These apps will be created and launched within Gemini’s UI, making it convenient to handle small tasks without needing to know how to code.
The multimodal capabilities shown are pretty standard, and the leak does not reveal their true extent. The examples provided are straightforward and cover the usual functions, such as recognizing text and objects, generating captions, and understanding images.
Quite surprisingly, the Gemini-based chat feature will apparently not support multimodality (image input or output), even though the PaLM 2-based Bard does.
Google MakerSuite is a tool that allows you to quickly prototype with generative language models such as PaLM 2. It’s good for writing prompts and experimenting with model parameters. All of the leaked Gemini UI appears to have been designed in MakerSuite.