Cybersecurity

GPT-4 Successfully Breaks Zero-Day Vulnerabilities

by Inside Telecom Staff - June 10, 2024

Researchers revealed that autonomous teams of GPT-4 bots succeeded in hacking over half of their test websites using real-world zero-day exploits.

These bots coordinated with one another and generated new bots as needed in order to exploit these vulnerabilities.

In a previous paper, the same research team showed that GPT-4 can autonomously exploit known security flaws, specifically one-day vulnerabilities.

These are security issues that have been identified but do not yet have an officially released fix. In that study, the researchers gave GPT-4 entries from the Common Vulnerabilities and Exposures (CVE) list, a database of disclosed security vulnerabilities. Using this information, GPT-4 was able to exploit 87% of the vulnerabilities classified as critical severity without any assistance.

HPTSA Outperforms Single LLM

This week, the team released a follow-up paper with even more striking results, announcing that they had successfully exploited zero-day vulnerabilities, security flaws that are not yet known. To do so, the researchers used a group of autonomous, self-replicating Large Language Model (LLM) agents coordinated through a method called Hierarchical Planning with Task-Specific Agents (HPTSA).

This method differs from the traditional approach, in which a single LLM must handle every part of a complex task. HPTSA instead assigns a planning agent to oversee the whole hacking process. The planner coordinates and deploys subagents, each of which performs a specific task, making the process more efficient.
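The planner-and-subagent structure described above can be illustrated with a minimal Python sketch. It is not the researchers' implementation: every name here (`Task`, `Report`, `Planner`, `make_stub_agent`, and the task labels) is hypothetical, and the subagents are stubs that only record what they were asked to do. The sketch shows only the dispatch pattern, in which one planner routes each task to a registered specialist.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    kind: str    # category of work (hypothetical labels such as "sqli")
    target: str  # identifier of the test page to examine


@dataclass
class Report:
    agent: str
    target: str
    outcome: str


# A subagent is modeled as a plain function from Task to Report.
Subagent = Callable[[Task], Report]


def make_stub_agent(name: str) -> Subagent:
    """Build a placeholder specialist that only records the request."""
    def run(task: Task) -> Report:
        return Report(agent=name, target=task.target, outcome="reviewed (stub)")
    return run


@dataclass
class Planner:
    """Top-level agent: decides which specialist handles each task."""
    specialists: Dict[str, Subagent] = field(default_factory=dict)

    def register(self, kind: str, agent: Subagent) -> None:
        self.specialists[kind] = agent

    def run(self, tasks: List[Task]) -> List[Report]:
        reports: List[Report] = []
        for task in tasks:
            agent = self.specialists.get(task.kind)
            if agent is None:
                # No specialist registered for this kind of task.
                reports.append(
                    Report("planner", task.target, "no specialist available")
                )
                continue
            reports.append(agent(task))
        return reports


planner = Planner()
planner.register("sqli", make_stub_agent("sqli-agent"))
reports = planner.run([Task("sqli", "/login"), Task("csrf", "/profile")])
for report in reports:
    print(report)
```

In this toy version the planner simply looks up a specialist by task type. In the system the article describes, the planning agent is itself an LLM that decides which subagents to deploy.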

HPTSA is similar to the method Cognition Labs applies in its Devin AI software development team, which plans jobs, identifies the workers needed, and manages projects by generating specialist employees for specific tasks.

When benchmarked against 15 real-world web-focused vulnerabilities, HPTSA proved 550% more efficient than a single LLM, successfully exploiting 8 of the 15 zero-day vulnerabilities. In contrast, the single LLM exploited only 3 of the 15.
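For readers who want the benchmark counts as percentages, the short Python sketch below converts them into success rates. It uses only the counts stated above (8 and 3 out of 15). The variable and function names are illustrative, and the 550% efficiency figure is a separately reported number that is not recomputed here.

```python
from fractions import Fraction

TOTAL_VULNERABILITIES = 15
HPTSA_SUCCESSES = 8        # count reported in the article
SINGLE_LLM_SUCCESSES = 3   # count reported in the article


def success_rate(successes: int, total: int) -> Fraction:
    """Return the exact fraction of vulnerabilities exploited."""
    return Fraction(successes, total)


hptsa_rate = success_rate(HPTSA_SUCCESSES, TOTAL_VULNERABILITIES)
single_rate = success_rate(SINGLE_LLM_SUCCESSES, TOTAL_VULNERABILITIES)

print(f"HPTSA success rate: {float(hptsa_rate):.1%}")        # 53.3%
print(f"Single LLM success rate: {float(single_rate):.1%}")  # 20.0%
```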

Concerns Over Misuse

The potential for misuse of these models raises concerns. In this regard, Daniel Kang, one of the researchers and an author of the paper, emphasized that in chatbot mode the AI model is “insufficient for understanding LLM capabilities” and therefore cannot hack anything independently.
