Microsoft has discovered a “skeleton key” jailbreak that allows chatbots to bypass safety restrictions and generate harmful content. Affected chatbots include Llama3-70b-instruct, Gemini Pro, GPT 3.5 and 40, Mistral Large, Claude 3 Opus, and Commander R Plus.
Tag: Safety
Anthropic Trained a Rogue LLM, It Can’t Be Fixed
Anthropic created a bad AI to see if a poisoned AI model can be fixed using our current tech. The researchers found we can’t fix such an LLM.
Google Writes “Robot Constitution” to Make Sure Its Robots Don’t Kill Us
DeepMind writes a Robot Constitution based on Isaac Asimov’s Three Laws of Robotics to ensure safety when the LLMs inside Google’s bots work on tasks involving humans, animals, sharp objects, electrical appliances, etc.
Meta, IBM, Dell, Sony, AMD, Intel, and Others Create “AI Alliance”
Meta, IBM, Intel lead the 47-strong AI Alliance, a new partnership for responsible AI research & development.