codex - Feedback: web automation safety boundary is too conservative for single-run engineering validation

GitHub stats
openai/codex#20124 · Fetched 2026-04-30 06:33:36
Comments: 1
Participants: 2
Timeline events: 5
Reactions: 0
Timeline (top): labeled ×3, closed ×1, commented ×1

I want to report a product/safety-boundary issue with Codex behavior around browser automation tasks.

Root Cause

The current behavior appears to overfit on anti-automation keywords and blocks legitimate single-run validation work. A graduated boundary based on intent, scale, authorization, and concrete behavior would better support real engineering workflows while still blocking abuse.


Summary

I want to report a product/safety-boundary issue with Codex behavior around browser automation tasks.

Scenario

In a local development workspace, I asked Codex to follow an existing technical write-up using Playwright/Firefox to visit a public website (BOSS Zhipin), search for a job, and extract one job-detail result. The provided write-up discussed Firefox automation, setting navigator.webdriver, listening to API responses, and parsing the job detail JSON.

Codex refused to execute the task because it interpreted the workflow as bypassing the site's anti-automation/security checks and scraping third-party data.

User impact

The refusal boundary feels too high for normal engineering/debugging workflows. Browser automation, compatibility testing, data extraction from a user-specified public page, and analysis of automation misclassification often involve Playwright, Firefox/Chromium differences, webdriver, and anti-automation signals.

Treating these cases as categorically disallowed makes Codex less useful for legitimate engineering validation, especially when the requested action is single-run, low-frequency, manually specified, and does not involve accounts, credential abuse, CAPTCHA solving, paywalls, bulk scraping, or persistence of cookies.

Requested improvement

Please consider a more granular policy/product behavior that distinguishes:

  • malicious or abusive bypass of access controls, CAPTCHA, login walls, paywalls, IP bans, account systems, or rate limits;
  • bulk or repeated data collection;
  • versus single-run engineering validation on a user-specified public page with limited extraction scope.

Concrete suggestions:

  1. Allow normal Playwright/browser automation for a single public page when the task is low-volume and user-directed.
  2. Allow parsing of one or a few visible/API-returned records when there is no login, payment, CAPTCHA solving, account rotation, or bulk harvesting.
  3. Restrict scale, persistence, and credential use rather than refusing solely because terms like webdriver, stealth, or anti-automation analysis appear.
  4. Provide clearer executable boundaries, for example: normal browser visit and parsing are allowed; CAPTCHA bypass, login-wall bypass, paywall bypass, IP-ban evasion, account pools, and bulk scraping are not.
  5. When refusing, offer a more actionable fallback path for legitimate automation testing and data parsing.
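The suggestions above can be read as a behavior-based check rather than a keyword-based one. The following is a minimal sketch of that idea, not a description of how Codex actually decides: the `AutomationTask` fields, the `evaluate` function, and the `MAX_PAGES` and `MAX_RECORDS` thresholds are all hypothetical names and values chosen for illustration. Only the list of blocked behaviors is taken from suggestion 4.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ALLOW_WITH_LIMITS = "allow_with_limits"
    REFUSE = "refuse"


# Behaviors the issue names as out of bounds (suggestion 4).
BLOCKED_BEHAVIORS = frozenset({
    "captcha_bypass",
    "login_wall_bypass",
    "paywall_bypass",
    "ip_ban_evasion",
    "account_pool",
    "bulk_scraping",
})

# Assumed thresholds; the issue only says "single page" and "one or a few records".
MAX_PAGES = 1
MAX_RECORDS = 5


@dataclass
class AutomationTask:
    """Hypothetical description of a requested browser-automation task."""
    pages: int = 1
    records: int = 1
    uses_credentials: bool = False
    persists_cookies: bool = False
    behaviors: set = field(default_factory=set)
    # Terms such as "webdriver" or "stealth"; recorded but never used to decide.
    keywords: set = field(default_factory=set)


def evaluate(task):
    """Return (Decision, reasons) based on concrete behavior and scale."""
    blocked = sorted(task.behaviors & BLOCKED_BEHAVIORS)
    if blocked:
        return Decision.REFUSE, [f"blocked behavior: {b}" for b in blocked]

    # Suggestion 3: restrict scale, persistence, and credential use.
    limits = []
    if task.pages > MAX_PAGES:
        limits.append(f"reduce to {MAX_PAGES} page(s)")
    if task.records > MAX_RECORDS:
        limits.append(f"reduce to {MAX_RECORDS} record(s)")
    if task.uses_credentials:
        limits.append("run without credentials")
    if task.persists_cookies:
        limits.append("do not persist cookies")
    if limits:
        return Decision.ALLOW_WITH_LIMITS, limits

    return Decision.ALLOW, []
```

In this sketch, a task that merely mentions `webdriver` is allowed, while the returned reasons give the user the actionable fallback that suggestion 5 asks for: either the specific behavior that caused the refusal or the limits under which the task could proceed.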

Why this matters

The current behavior appears to overfit on anti-automation keywords and blocks legitimate single-run validation work. A graduated boundary based on intent, scale, authorization, and concrete behavior would better support real engineering workflows while still blocking abuse.

extent analysis

TL;DR

Consider revising Codex's policy to distinguish between malicious and legitimate browser automation tasks based on intent, scale, and authorization.

Guidance

  • Evaluate the task's intent: is it for single-run engineering validation or bulk data collection?
  • Assess the task's scale: is it low-volume and user-directed, or repeated and automated?
  • Consider allowing normal Playwright/browser automation for single public pages with limited extraction scope
  • Provide clearer executable boundaries and offer a fallback path for legitimate automation testing and data parsing when refusing a task

Example

No code snippet is provided as it is not clearly supported by the issue.

Notes

The current behavior may be overfitting on anti-automation keywords, blocking legitimate single-run validation work. A more granular policy could better support real engineering workflows while still blocking abuse.

Recommendation

Apply a workaround by revising the task to explicitly state its intent and scope, and provide additional context to help Codex distinguish between legitimate and malicious automation tasks. This could involve adding comments or metadata to the task to clarify its purpose and limitations.
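One way to apply this workaround is to generate a short scope preamble and put it at the top of the request. The helper below is only a sketch: `build_scoped_request` and its parameters are hypothetical, the wording of the preamble is an assumption, and nothing in the issue confirms that Codex will respond differently to it.

```python
def build_scoped_request(goal, url, max_records=1):
    """Render a plain-text preamble stating the intent and limits of a task."""
    lines = [
        f"Goal: {goal}",
        f"Target: {url} (public page, no login)",
        f"Scope: single run, at most {max_records} record(s)",
        "Not requested: CAPTCHA solving, login or paywall bypass, "
        "account rotation, cookie persistence, bulk collection",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # example.com is a placeholder, not the site from the issue.
    print(build_scoped_request(
        goal="Validate that one job-detail response parses correctly",
        url="https://example.com/jobs",
    ))
```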
