The misbehavior of AI tools, such as Microsoft's Bing AI losing track of what year it is, has become a recurring subplot of AI reporting. But it is often hard to tell the difference between a bug, such as Google's Gemini image generator depicting racially diverse Nazis because of a filter setting, and a shortcoming of the underlying AI model itself, which analyzes input data and predicts an acceptable response.
Now OpenAI has published the first draft of a proposed framework, called the Model Spec, that will shape how AI tools like its own GPT-4 model respond in the future. The framework lays out three broad objectives: AI models should assist developers and end users with helpful, responsive behavior, benefit humanity by weighing potential benefits and harms, and reflect well on OpenAI by respecting social norms and the law.
OpenAI is also considering letting companies and users toggle the "spiciness" of its AI models, saying it is assessing whether it can responsibly provide the ability to generate NSFW content in age-appropriate contexts through the API and ChatGPT.
Don't provide information hazards: The assistant should not provide instructions on how to create chemical, biological, radiological, and nuclear (CBRN) threats. The assistant should default to providing information that has reasonable uses and is not a CBRN threat, or information that is generally readily available online.
The section of the Model Spec on how AI assistants should handle information hazards. Screenshot: OpenAI.
Joanne Jang, a product manager at OpenAI, explains that the goal is to get input from the public to help determine how AI models should behave, and that the framework will help draw a clearer line between what is intentional and what is a bug. Among the default behaviors OpenAI proposes for its models: assume that users and developers have good intentions, ask clarifying questions, don't overstep, take objective viewpoints, discourage hate, don't try to change anyone's mind, and express uncertainty.
"We think we can create building blocks for people to have more nuanced conversations about models," Jang tells The Verge, raising questions such as: if models have to obey the law, whose law should they obey? "I hope we can decouple whether something is a bug from whether an answer reflects a principle that people disagree with, because that facilitates conversations about what we should bring to the policy team."
The Model Spec will not have an immediate impact on OpenAI’s existing model versions, such as GPT-4 and DALL-E 3.
Jang notes that model behavior is an "early science" and that the Model Spec is a living document that will be updated frequently. For now, OpenAI is seeking feedback from the public and from the various stakeholders that use its models, including "policymakers, trusted institutions and domain experts."
OpenAI did not specify how much of the public's feedback is likely to be adopted, or who will decide what should change. Ultimately, the final say on how its models behave rests with the company, which said in its post that the spec offers an early look at its approach "as we develop a robust process for collecting and incorporating feedback to ensure we are responsibly moving towards our mission."