MIT’s novel machine learning method for AI safety testing uses curiosity to elicit a broader range of toxic responses from chatbots, and does so more effectively than previous red-teaming efforts…