Hackers News Hackers News
  • CyberSecurity News
  • Threats
  • Attacks
  • Vulnerabilities
  • Breaches
  • Comparisons

Social Media

Hackers News Hackers News
  • CyberSecurity News
  • Threats
  • Attacks
  • Vulnerabilities
  • Breaches
  • Comparisons
Search the Site
Popular Searches:
technology Amazon AI
Recent Posts
Cerberus Stalkerware Abuses Google Play for Leverages Accessibility
May 5, 2026
Education Sector Under Attack: Espionage & Phishing
May 5, 2026
DAEMON Tools Hacked: Supply Chain Attack Software Deliver
May 5, 2026
Home/Threats/New Study Shows GPT-5.2 Can Reliably Develop Zero-Day Exploits at
Threats

New Study Shows GPT-5.2 Can Reliably Develop Zero-Day Exploits at

Advanced language models are now capable of creating working exploits for previously unknown security vulnerabilities, a groundbreaking new experiment has revealed. Security researcher Sean Heelan...

Sarah simpson
Sarah simpson
January 20, 2026 3 Min Read
0 0

Advanced language models are now capable of creating working exploits for previously unknown security vulnerabilities, a groundbreaking new experiment has revealed.

Security researcher Sean Heelan recently tested two sophisticated systems built on GPT-5.2 and Opus 4.5, challenging them to develop exploits for a zero-day flaw in the QuickJS Javascript interpreter.

The results point to a significant shift in offensive cybersecurity capabilities, where automated systems can generate functional attack code without human intervention.

The testing involved multiple scenarios with different security protections and objectives. GPT-5.2 successfully completed every challenge presented, while Opus 4.5 solved all but two scenarios.

Together, the systems produced over 40 distinct exploits across six different configurations.

These ranged from simple shell spawning to complex tasks like writing specific files to disk while bypassing multiple modern security protections.

The experiment demonstrates that current-generation models possess the necessary reasoning and problem-solving capabilities to navigate complex exploitation challenges.

Independent analyst Sean Heelan noted that the implications extend beyond simple proof-of-concept demonstrations.

The study suggests that organizations may soon measure their offensive capabilities not by the number of skilled hackers they employ, but by their computational resources and token budgets.

Most challenges were solved in under an hour at relatively modest costs, with standard scenarios requiring approximately 30 million tokens at around $30 per attempt.

Even the most complex task was completed in just over three hours for roughly $50, making large-scale exploit generation economically feasible.

The research raises important questions about the future of cybersecurity defenses.

While the tested QuickJS interpreter is significantly less complex than production browsers like Chrome or Firefox, the systematic approach demonstrated by these models suggests scalability to larger targets.

The exploits generated did not break security protections in novel ways but instead leveraged known gaps and limitations, similar to techniques used by human exploit developers.

How the Advanced Exploit Chains Work

The most sophisticated challenge in the study required GPT-5.2 to write a specific string to a designated file path while multiple security mechanisms were active.

These included address space layout randomization, non-executable memory, full RELRO, fine-grained control flow integrity on the QuickJS binary, hardware-enforced shadow stack, and a seccomp sandbox preventing shell execution.

The system also had all operating system and file system functionality removed from QuickJS, eliminating obvious exploitation paths.

GPT-5.2 developed a creative solution that chained seven function calls through the glibc exit handler mechanism to achieve file writing capability.

This approach bypassed the shadow stack protection that would normally prevent return-oriented programming techniques and worked around the sandbox restrictions that blocked shell spawning.

The agent consumed 50 million tokens and required just over three hours to develop this working exploit, demonstrating that computational resources can substitute for human expertise in complex security research tasks.

The verification process for these exploits was straightforward and automated. Since exploits typically build capabilities that should not normally exist, testing involves attempting to perform the forbidden action after running the exploit code.

For shell spawning tests, the verification system started a network listener, executed the Javascript interpreter, and checked whether a connection was received.

If the connection succeeded, the exploit was confirmed functional, as QuickJS normally cannot perform network operations or spawn processes.

Disclaimer: HackersRadar reports on cybersecurity threats and incidents for informational and awareness purposes only. We do not engage in hacking activities, data exfiltration, or the hosting or distribution of stolen or leaked information. All content is based on publicly available sources.

Tags:

AttackCybersecurityExploitHackerSecurityzero-day

Share Article

Sarah simpson

Sarah simpson

Sarah is a cybersecurity journalist specializing in threat intelligence and malware analysis. With over 8 years of experience covering APT groups, zero-day exploits, and advanced persistent threats, Sarah brings deep technical expertise to breaking cybersecurity news. Previously, she worked as a security researcher at leading threat intelligence firms, where she analyzed malware samples and tracked cybercriminal operations. Sarah holds a Master's degree in Computer Science with a focus on cybersecurity and is a regular contributor to major security conferences.

Previous Post

Raaga Data Breach Exposes 10.2 Million User Records

Next Post

Critical WordPress Plugin Flaw Exposes 1 Vulnerability Sites

No Comment! Be the first one.

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Popular Posts
New Framework Connects APT Campaigns Across All Layers
May 5, 2026
WhatsApp Flaw Uses Instagram Reels for Malicious URL Execution
May 5, 2026
Instagram Ending Encrypted Direct Messages Encryption
May 5, 2026
Top Authors
Marcus Rodriguez
Marcus Rodriguez
Sarah simpson
Sarah simpson
Jennifer sherman
Jennifer sherman
Let's Connect
156k
2.25m
285k

Related Posts

Jennifer sherman
By Jennifer sherman
Threats

GlassWorm Attacks macOS via Malicious VS Code…

January 1, 2026
Emy Elsamnoudy
By Emy Elsamnoudy
Attacks

ClickFix Attack Hides Malicious Code via Stegan Security

January 1, 2026
Sarah simpson
By Sarah simpson
Vulnerabilities

MongoBleed Detector Tool Detects Critical MongoDB CVE-

January 1, 2026
Emy Elsamnoudy
By Emy Elsamnoudy
Breaches

Conti Ransomware Gang Leaders & Infrastructure Exposed

January 1, 2026
Hackers News Hackers News
  • [email protected]

Quick Links

  • Contact Us
  • Privacy Policy
  • Terms of service

Categories

Attacks
Breaches
Comparisons
CyberSecurity News
Threats
Vulnerabilities

Let's keep in touch

receive fresh updates and breaking cyber news every day and week!

All Rights Reserved by HackersRadar ©2026

Follow Us