codex - 💡(How to fix) Fix Cyber-safety filter still triggers on Codex Business plan after individual chatgpt.com/cyber verification (verified identity, OWN application code) [1 participants]

Root Cause

I am performing this as part of a continuous security validation workflow for an HR-tech / recruiting platform that holds candidate PII (passports, ID scans, employment records). We have GDPR obligations to demonstrate ongoing security due diligence. The audit is on OUR OWN code; no third-party system is targeted; no offensive payloads are generated by the prompts.

We previously had ONE clean successful adversarial-review session (519 K + 760 KB analytical output earlier on 2026-05-12) that delivered 15 distinct defensive findings on our StartEuropa application. We are using those findings to ship security fixes right now. Today's blockage is preventing the same workflow on our sister BridgeTest codebase.

Fix Action

Fix / Workaround

I expected to be able to run a defensive adversarial review of my own Laravel application code, which is owned by my own legal entity (Start Europa sp. z o.o.). The codebase is local, the workspace is read-only, and the prompt is framed as a senior pentester reviewing OUR application's AI integration for blind spots so we can patch them.

What version of Codex CLI is running?

codex-cli 0.124.0

What subscription do you have?

Codex Business (Team plan, 2 seats)

Which model were you using?

gpt-5.5 with reasoning effort xhigh, context tag adversarial-review

What platform is your computer?

Windows 11

What terminal emulator and version are you using (if applicable)?

Git Bash via Claude Code orchestration

What did you expect to happen?

I have already completed individual identity verification at chatgpt.com/cyber on my user account ([email protected]) earlier today after a similar flag occurred yesterday. I expected that verification to lift the cyber-safety filter.

What actually happened?

Two separate flagged sessions, even AFTER individual identity verification was completed.

Incident 1 (2026-05-12, pre-verification):

Session ID: 019e1c28-2858-7720-80b3-c6ce89753a4a (related thread; specific failed sub-thread b2dkiy9gq)
Prompt: adversarial security review of OUR BridgeTest Laravel codebase, workspace at C:\Users\micha\gpt-workspace\bridgetest-readonly\
Approx 367 K input tokens consumed during workspace exploration
Terminated with: ERROR: This content was flagged for possible cybersecurity risk. If this seems wrong, try rephrasing your request. To get authorized for security work, join the Trusted Access for Cyber program: https://chatgpt.com/cyber

I then completed identity verification at chatgpt.com/cyber as instructed.

Incident 2 (2026-05-13, POST-verification):

Codex CLI background task: b3dun7y33
Same prompt structure as a prior successful StartEuropa adversarial-review run from 2026-05-12
Workspace: same C:\Users\micha\gpt-workspace\bridgetest-readonly\ (read-only copy of OUR own code)
The model began executing normally, ran 2 PowerShell exec calls to enumerate AI/Chatbot controllers, then was terminated with the SAME flag message after only ~6 K tokens

This is the second incident, post-verification. Identity verification at chatgpt.com/cyber clearly does NOT propagate to Codex Business API requests authenticated under the organization's plan.
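For anyone automating similar runs, here is a minimal sketch of how a wrapper could recognize this termination in output it has already captured, so the job can stop and record the session ID instead of retrying and consuming more tokens. The marker string is copied from the error message above; the helper name and the idea of scanning captured text are assumptions for illustration and are not part of Codex CLI.

```python
from typing import Optional

# Marker copied from the error message quoted in this report.
FLAG_MARKER = "This content was flagged for possible cybersecurity risk"


def find_cyber_flag(captured_output: str) -> Optional[str]:
    """Hypothetical helper: return the first captured line that contains
    the flag message, or None if the run was not flagged."""
    for line in captured_output.splitlines():
        if FLAG_MARKER in line:
            return line.strip()
    return None


if __name__ == "__main__":
    sample = (
        "exec: enumerating controllers\n"
        "ERROR: This content was flagged for possible cybersecurity risk. "
        "If this seems wrong, try rephrasing your request.\n"
    )
    print(find_cyber_flag(sample))
```

A caller would pass whatever text it collected from the run and treat a non-None result as a signal to halt and log the session or task ID.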

The classifier appears to over-trigger on standard pentester vocabulary in our prompt: "adversarial", "exploit", "POC", "attack", "bypass". These are unavoidable when describing a defensive code review.
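As a rough aid when rephrasing, the sketch below counts how often the terms listed above occur in a prompt before it is sent. It assumes only the hypothesis stated in this report, that these words contribute to the flag; the real classifier's inputs are not known to us, and the function name is hypothetical.

```python
import re

# Terms named in this report. Whether the real classifier keys on them
# is the reporter's hypothesis, not confirmed behavior.
SUSPECT_TERMS = ("adversarial", "exploit", "POC", "attack", "bypass")


def count_suspect_terms(prompt: str) -> dict:
    """Count case-insensitive whole-word occurrences of each term.

    Inflected forms such as "exploits" or "bypassing" are not counted.
    Terms with zero occurrences are omitted from the result.
    """
    counts = {}
    for term in SUSPECT_TERMS:
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = re.findall(pattern, prompt, flags=re.IGNORECASE)
        if hits:
            counts[term] = len(hits)
    return counts


if __name__ == "__main__":
    prompt = "Adversarial review of OUR code: list each bypass, no exploit code."
    print(count_suspect_terms(prompt))
```

A clean count does not guarantee the request will pass; it only shows which of the listed words a rewrite still contains.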

Context

We are also building an open-source (Apache-2.0) security-review-pro-mcp pipeline that uses GPT-5.5 calls in this mode to run adversarial reviews on customer-owned codebases.

Specific product feedback (echoing the closed issue #19594)

Individual chatgpt.com/cyber identity verification does NOT carry over to Codex Business plan API requests. The documentation does not make this clear. Please either fix it or document the requirement explicitly.
Filter triggers on prompt vocabulary, not on intent or workspace ownership. Standard pentester / defensive-research vocabulary (adversarial, exploit, POC, attack, bypass) trips the classifier even when the prompt explicitly says "review OUR application code at THIS local read-only path".
Token cost during a flagged session is fully charged to the customer. Cumulative ~373 K tokens over two incidents with zero useful output for us. Per OpenAI's public $10M commitment to accelerate cyber defense, false-positive token burn on verified-identity legitimate defensive workflows should be eligible for credit refund.
No interim allowlist for verified users. Once an account has cleared identity verification, the system should offer at least a 24-72h grace period during which requests from that user go through stronger semantic adjudication before the cheap classifier can flag them.
No private diagnostics channel. Our workspace contains internal application code we cannot post publicly here. We need a private channel (or a guarantee that session IDs are sufficient for OpenAI staff to investigate without the user posting source).
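For reference, the cumulative token figure in the feedback above is the sum of the two approximate per-incident counts reported earlier (about 367 K and about 6 K). A minimal tally, using only those numbers; the variable and function names are illustrative:

```python
# Approximate input-token counts quoted in this report, in thousands.
INCIDENT_TOKENS_K = {
    "2026-05-12 (pre-verification)": 367,
    "2026-05-13 (post-verification)": 6,
}


def total_flagged_tokens_k(incidents: dict) -> int:
    """Sum the per-incident token counts (in thousands)."""
    return sum(incidents.values())


if __name__ == "__main__":
    print(f"~{total_flagged_tokens_k(INCIDENT_TOKENS_K)} K tokens")
```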

Resolution we are seeking

Confirm the path to lift the cyber-safety filter for the entire Codex Business plan (likely enterprise track at openai.com/form/enterprise-trusted-access-for-cyber/).
Refund the token cost of the two flagged sessions per the $10M cyber-defense API credit commitment.
Document plan-tier vs user-tier verification behavior clearly so other Business customers know what to expect.

CC: I have also sent this via [email protected] and [email protected] with the same session/thread IDs.

Happy to provide the full prompt text, output sample from the prior successful run, and our open-source pipeline repo on request.
