ChatGPT’s Voice Mode has some security flaws, but OpenAI says it’s on top of it.
On Thursday OpenAI published a report on GPT-4o’s safety features, addressing known issues that occur when using the model. GPT-4o is the underlying model that powers the latest version of ChatGPT, and comes with a Voice Mode that was recently released to a select group of users with a ChatGPT Plus subscription.
The “safety challenges” identified include standard risks like prompting the model to generate erotic and violent responses or other disallowed content, as well as “ungrounded inference” and “sensitive trait attribution” — in other words, assumptions about a speaker that might be discriminatory or biased. OpenAI says it has trained the model to block any outputs flagged in these categories. However, the report also says mitigations don’t cover “nonverbal vocalizations or other sound effects” such as erotic moans, violent screams, and gunshots. One can infer, then, that prompts involving certain sensitive nonverbal sounds might improperly receive a response.
OpenAI also mentioned challenges unique to communicating with the model by voice. Red-teamers discovered that GPT-4o could be prompted to impersonate someone or could accidentally emulate the user’s voice. To combat this, OpenAI only allows pre-authorized voices (minus the notorious Scarlett Johansson-sounding voice). GPT-4o can also identify voices other than the speaker’s, which presents a serious privacy and surveillance issue. But it has been trained to deny such requests — unless the prompt involves a famous quote.
Red-teamers also noted that GPT-4o could be prompted to speak persuasively or emphatically, a feature that could be more harmful than text outputs when it comes to misinformation and conspiracy theories.
Notably, OpenAI also addressed potential copyright issues that have plagued the company and the overall development of generative AI, which trains on data scraped from the web. GPT-4o has been trained to refuse requests for copyrighted content and has additional filters for blocking outputs containing music. On that note, ChatGPT’s Voice Mode has been directed not to sing under any circumstances.
OpenAI’s numerous risk mitigations covered in the lengthy document were carried out before Voice Mode was released. So the ostensible message of the report is that while GPT-4o is capable of certain risky behavior, it won’t engage in it.
However, OpenAI says, “These evaluations measure only the clinical knowledge of these models, and do not measure their utility in real-world workflows.” In other words, the model has been tested in a controlled environment, but once the broader public gets its hands on GPT-4o, it could behave very differently in the wild.
Mashable reached out to OpenAI for additional clarity about these mitigations, and will update if we hear back.