OpenAI Launches Bug Bounty for AI Model Vulnerabilities
Key Takeaways
- OpenAI has initiated a new public bug bounty program focused on AI abuse and safety risks.
- The program, hosted on Bugcrowd, complements an existing security bounty by targeting non-traditional vulnerabilities specific to AI models.
- Researchers are encouraged to report issues like agentic hijacking, proprietary information leaks, and platform integrity weaknesses.
- General “jailbreaks” and content policy bypasses without demonstrable safety impact are explicitly out of scope.
OpenAI has launched a public Safety Bug Bounty program, a dedicated initiative to identify and mitigate AI abuse and safety vulnerabilities across its product suite. The program extends the company’s efforts to address risks that are specific to artificial intelligence systems and that go beyond conventional cybersecurity flaws.
Hosted on the Bugcrowd platform, this new bug bounty aims to complement OpenAI’s existing Security Bug Bounty program. It specifically accepts submissions detailing abuse and safety risks that might not qualify as traditional security vulnerabilities but nonetheless present a clear potential for real-world harm.
The submission process involves a joint triage by OpenAI’s Safety and Security Bug Bounty teams. Reports may be re-routed between the two programs based on their specific scope and the internal team responsible for their resolution.
AI-Specific Risk Categories in Focus
The new program outlines several distinct categories of AI-specific safety scenarios that are of particular interest to OpenAI:
Agentic Risks, Including MCP
This category encompasses scenarios involving third-party prompt injection and data exfiltration. It specifically targets instances where attacker-controlled text can reliably hijack a victim’s AI agent, such as Browser, ChatGPT Agent, or similar products, to execute harmful actions or leak sensitive user data. To be considered valid, the reported behavior must be reproducible at least 50% of the time. Reports concerning agentic products performing disallowed or potentially harmful actions at scale also fall within this scope.
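As a rough illustration of the 50% reproducibility bar described above, a researcher might run the same attack repeatedly and record how often it succeeds before filing a report. The sketch below is only an assumption about how such a check could look: `reproduction_rate`, `meets_threshold`, and the simulated `attempt` callable are hypothetical names, not part of any OpenAI or Bugcrowd tooling, and the article does not say how OpenAI itself measures reproducibility.

```python
from typing import Callable


def reproduction_rate(attempt: Callable[[], bool], trials: int = 20) -> float:
    """Call `attempt` `trials` times and return the fraction of successes.

    `attempt` is a hypothetical callable that runs one attack attempt and
    returns True if the harmful behavior was observed.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    successes = sum(1 for _ in range(trials) if attempt())
    return successes / trials


def meets_threshold(rate: float, threshold: float = 0.5) -> bool:
    """Return True if the rate reaches the threshold (50% per the article)."""
    return rate >= threshold


# Simulated outcomes stand in for real attack attempts (illustrative only).
outcomes = iter([True, False, True, True])
rate = reproduction_rate(lambda: next(outcomes), trials=4)
print(rate, meets_threshold(rate))  # 0.75 True
```

Recording the number of trials alongside the rate makes the claim easier for a triage team to check, since a rate from a handful of attempts is much weaker evidence than one from many.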
OpenAI Proprietary Information
Researchers are invited to report model generations that inadvertently expose reasoning-related proprietary information. This category also includes vulnerabilities that lead to the leakage of other confidential OpenAI data.
Account and Platform Integrity
This category covers weaknesses in account and platform integrity signals, including bypassing anti-automation controls, manipulating account trust signals, and evading account restrictions, suspensions, or bans.
OpenAI has also defined which submissions are out of scope for the new program. Generic “jailbreaks” that merely produce rude language or surface publicly available information will not be accepted, and general content-policy bypasses that lack a demonstrable safety or abuse impact are likewise excluded. OpenAI does, however, periodically run private bug bounty campaigns targeting specific harm types, such as biorisk content issues in ChatGPT Agent and GPT-5, and invites researchers to apply for these programs when they become available.
For vulnerabilities that enable unauthorized access to features, data, or functionality beyond permitted permissions, researchers are directed to submit their findings to the existing Security Bug Bounty program instead.
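The scope rules described in this article can be summarized as a simple lookup from finding type to the program it most likely belongs in. The sketch below is an informal reading of the article, not OpenAI's triage logic: the category keys and the `suggest_program` helper are hypothetical labels chosen for illustration, and actual routing is decided by OpenAI's Safety and Security Bug Bounty teams.

```python
from enum import Enum


class Program(Enum):
    SAFETY = "Safety Bug Bounty"
    SECURITY = "Security Bug Bounty"
    OUT_OF_SCOPE = "Out of scope"


# Hypothetical category labels mapped to programs, based on the article's
# description of scope. These keys are not an official taxonomy.
ROUTING = {
    "agentic_hijack": Program.SAFETY,
    "proprietary_info_leak": Program.SAFETY,
    "account_platform_integrity": Program.SAFETY,
    "unauthorized_access": Program.SECURITY,
    "generic_jailbreak": Program.OUT_OF_SCOPE,
    "content_policy_bypass_no_impact": Program.OUT_OF_SCOPE,
}


def suggest_program(category: str) -> Program:
    """Return the program a finding of this category would likely go to."""
    try:
        return ROUTING[category]
    except KeyError:
        raise ValueError(f"unknown category: {category!r}") from None


print(suggest_program("unauthorized_access").value)  # Security Bug Bounty
```

A table like this is only a starting point for deciding where to submit; reports may still be re-routed between the two programs after joint triage.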
The safety-focused bounty reflects a growing industry recognition that AI systems introduce an attack surface that traditional security frameworks were not designed to address. By rewarding safety-focused research alongside conventional vulnerability disclosure, OpenAI is establishing a structured framework for AI-specific threat modeling and mitigation.
Researchers interested in participating in the program can apply directly through OpenAI’s Safety Bug Bounty page on Bugcrowd.
What You Should Do
- If you are a security researcher, consider participating in OpenAI’s Safety Bug Bounty program to report AI abuse and safety risks.
- Familiarize yourself with the specific scope and out-of-scope criteria to ensure your submissions are relevant.
- Report traditional security vulnerabilities to OpenAI’s existing Security Bug Bounty program.
Disclaimer: HackersRadar reports on cybersecurity threats and incidents for informational and awareness purposes only. We do not engage in hacking activities, data exfiltration, or the hosting or distribution of stolen or leaked information. All content is based on publicly available sources.


