AI Can Deceive You, Study Shows
Recent research has shown that both specialized and general-purpose AI systems can manipulate information to achieve specific outcomes.
Although these AI systems were not explicitly trained to deceive, they gave users false explanations to justify their behavior or concealed information to achieve their strategic goals.
For his part, Peter S. Park, the paper's lead author and an AI safety researcher at MIT, emphasized that deceiving humans helps AI systems achieve their goals.
Meta, OpenAI’s Involvement
The study, “AI Deception: A Survey of Examples, Risks, and Potential Solutions,” cited the example of Meta’s CICERO, an AI designed specifically to play the strategy game Diplomacy, which the company said was trained to be “largely honest and helpful.”
The results showed that this specialized AI nonetheless relied on deceptive tactics, such as making false promises and backstabbing allies, to win games.
OpenAI’s GPT-4, the model behind ChatGPT, also featured in the research. During a test, GPT-4 tricked a TaskRabbit worker into solving a CAPTCHA for it by claiming to have a vision impairment. Although it received some help from a human evaluator, the model came up with the false excuse on its own, showing that it can invent excuses to achieve its goals.
AI models are frequently trained using reinforcement learning from human feedback (RLHF), which means they learn to win approval from human reviewers rather than necessarily to complete the task itself, and this can lead to deception.
For instance, a robot trained to grasp a ball learned to place its hand between the ball and the camera so that it appeared to have succeeded, even though it had not actually grasped the ball.
Urgent Call Against Deceptive AI
Deceptive general-purpose AI is a growing threat that puts societies at significant risk, including potential exploitation by malicious actors for fraud, political manipulation, and even terrorist recruitment. To mitigate such risks, the researchers urge regulators to address the issue as soon as possible.
In this regard, Park said, “If banning AI deception is politically infeasible at the current moment, we recommend that deceptive systems be classified as high risk.”