Single Line of Code Jailbreaks 11 Including ChatGPT

A newly detailed jailbreak technique, dubbed “sockpuppeting,” enables attackers to bypass the safety guardrails of 11 major large language models (LLMs) with just a single line of code.

Unlike complex attacks, this method exploits APIs that support assistant prefill to inject fake acceptance messages, forcing models to answer prohibited requests.

The attack exploits “assistant prefill,” a legitimate API feature developers use to force specific response formats.

Attackers abuse this by injecting a compliant prefix, such as “Sure, here is how to do it,” directly into the assistant’s role.

Comparison of normal and sockpuppet flows(source : trendmicro )

Because LLMs are heavily trained to maintain self-consistency, the model continues generating harmful content rather than triggering its standard safety mechanism.

Model Vulnerability Testing

According to researchers from Trend Micro, this black-box technique requires no optimization and no access to model weights.

Gemini 2.5 Flash was the most susceptible with a 15.7% attack success rate, while GPT-4o-mini demonstrated the highest resistance at 0.5%.

When attacks succeeded, affected models generated functional malicious exploit code and leaked highly confidential system prompts.

Multi-turn persona setups proved to be the most effective strategy for executing the sockpuppeting exploit.

In these scenarios, the model is told it operates as an unrestricted assistant before the attacker injects the fabricated agreement.

ASR by model, ranked highest to lowest, with blocked models shown at 0%(source : trendmicro)

Additionally, task-reframing variants successfully bypassed robust safety training by disguising harmful requests as benign data formatting tasks.

Major API providers handle assistant prefills differently, which dictates whether their underlying models remain exposed to this vulnerability.

OpenAI and AWS Bedrock block assistant prefills entirely, serving as the strongest possible defense by eliminating the attack surface.

Conversely, platforms like Google Vertex AI accept the prefill for certain models, forcing the AI to rely solely on its internal safety training.

The three defense layers: API Block, Model Resistance, and Broadly Vulnerable(source : trendmicro)

Defending against this vulnerability requires security teams to implement message-ordering validation that blocks assistant-role messages at the API layer.

According to Trend Micro, organizations using self-hosted inference servers like Ollama or vLLM must manually enforce message validation, as these platforms do not ensure proper message ordering by default.

Security teams must also proactively include assistant prefill attack variants in their standard AI red-teaming exercises.

Disclaimer: HackersRadar reports on cybersecurity threats and incidents for informational and awareness purposes only. We do not engage in hacking activities, data exfiltration, or the hosting or distribution of stolen or leaked information. All content is based on publicly available sources.

Tags:

Social Media

Single Line of Code Jailbreaks 11 Including ChatGPT

Model Vulnerability Testing

Tags:

Marcus Rodriguez

Hackers Hide Magecart Skimmer on Magento Using SVG On

DesckVB RAT Evades Detection with Ob Uses Obfuscated

No Comment! Be the first one.

Leave a Reply Cancel reply

Popular Posts

Hackers Deploy Modular RAT for Credential Theft With Screenshot

PamDOORa Backdoor Attacks Linux, Attacking Systems

Škoda Online Shop Security Incident Exposes Customers Data

Top Authors

Let's Connect

Related Posts

GlassWorm Attacks macOS via Malicious VS Code…

ClickFix Attack Hides Malicious Code via Stegan Security

MongoBleed Detector Tool Detects Critical MongoDB CVE-

Conti Ransomware Gang Leaders & Infrastructure Exposed

Quick Links

Categories

Let's keep in touch

Follow Us

Social Media

Search the Site

Recent Posts

Single Line of Code Jailbreaks 11 Including ChatGPT

Model Vulnerability Testing

Tags:

Share Article

Hackers Hide Magecart Skimmer on Magento Using SVG On

DesckVB RAT Evades Detection with Ob Uses Obfuscated

No Comment! Be the first one.

Leave a Reply Cancel reply

Popular Posts

Top Authors

Let's Connect

Related Posts

Quick Links

Categories

Let's keep in touch

Follow Us