Multimodal – AI Commenter

Apple Quietly Drops Research on a Multimodal LLM

Thirteen Apple researchers published a paper outlining the best practices they identified for building a multimodal large language model that delivers quality results and outperforms other models.

ChatGPT Plus Update Allows Users to Access All Tools in All Chats

A new ChatGPT Plus update brings all tools together in a single chat, including DALL-E, Code Interpreter, document upload, data analysis, and web search.

Google Gemini Leaked in a Medium Blog Post

A Medium blog post has leaked several screenshots hinting at the possible features of Google’s upcoming LLM, Gemini, which will replace PaLM 2.

Free GPT-4 Competitor LLaVA Announced by Researchers

A new multimodal AI model can match GPT-4’s performance while being free to train and use.

ChatGPT Adds Voice & Image Capabilities

ChatGPT teased voice & image features that will allow users to speak to the chatbot, get voice responses, upload images, and receive images in responses.