OpenAI Teaches You How to Use Codex Safely: Sandbox Boundaries, Automatic Approval, Security Triage, and a Complete Enterprise Deployment Framework
OpenAI reveals how it securely deploys its internal AI coding agent Codex. The core strategy is "sandbox boundary restrictions + automatic approval of low-risk actions + an AI security triage agent handling alerts," which lets development efficiency and enterprise security controls operate in tandem.
(Background summary: OpenAI Codex major upgrade: backend Mac control, built-in browser, image generation, launching 111 new plugins)
(Additional context: OpenAI introduces engineering agent Codex! AI can write functions, fix bugs, run tests)
OpenAI released an internal deployment report this week detailing how its security team manages Codex in production environments. It is a practical operational record, from sandbox configuration to alert triage, showing what security controls large organizations need when adopting AI agents.
Sandbox defines boundaries, approval mechanisms decide when to stop
OpenAI states in its official announcement that the core principle of deploying Codex is simple: keep the agent within clear technical boundaries so it can work efficiently, let low-risk actions proceed without interrupting the user, and halt high-risk actions for human review.
In practice, this principle is implemented through two complementary mechanisms: the sandbox and the approval policy.
The sandbox defines Codex's execution space: which paths are writable, whether external network connections are allowed, and which system directories are protected. Actions outside the sandbox enter an approval process, where users can approve a specific operation once or allow similar operations to proceed automatically for the rest of the session.
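Conceptually, the boundary check reduces to a small decision function. Below is a minimal Python sketch of that logic; the names `SandboxPolicy`, `Action`, and `requires_approval` are hypothetical, since OpenAI has not published Codex's internals.

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class SandboxPolicy:
    writable_roots: list[Path]   # paths the agent may modify freely
    protected_dirs: list[Path]   # system directories that always escalate
    network_allowed: bool        # whether outbound connections are permitted

@dataclass
class Action:
    kind: str                    # e.g. "write_file" or "network_request"
    target: Path | None = None

def requires_approval(action: Action, policy: SandboxPolicy) -> bool:
    """True when an action falls outside the sandbox and must be reviewed."""
    if action.kind == "network_request":
        return not policy.network_allowed
    if action.kind == "write_file" and action.target is not None:
        if any(action.target.is_relative_to(d) for d in policy.protected_dirs):
            return True   # protected directories always require review
        return not any(action.target.is_relative_to(r) for r in policy.writable_roots)
    return False          # everything else runs inside the sandbox silently

policy = SandboxPolicy([Path("/workspace")], [Path("/etc")], network_allowed=False)
print(requires_approval(Action("write_file", Path("/workspace/app.py")), policy))  # False
print(requires_approval(Action("write_file", Path("/etc/hosts")), policy))         # True
```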
For routine operations, OpenAI has enabled an "auto-review mode". This feature sends Codex's planned actions, along with recent activity context, to an auto-approval sub-agent. If the sub-agent judges an action to be low risk, it lets it through directly, keeping the workflow uninterrupted; if it judges the action to be high risk or to carry potential unintended consequences, it escalates to a human for confirmation.
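The auto-review flow can be pictured as a gate in front of every action. The sketch below is illustrative only; `risk_subagent` stands in for the model call OpenAI describes, and its keyword heuristic is a toy placeholder, not the real classifier.

```python
def risk_subagent(planned_action: str, recent_context: str) -> str:
    """Toy stand-in for the auto-approval sub-agent; a real deployment would
    send the planned action plus recent activity context to a model."""
    risky_markers = ("sudo", "rm -rf", "curl")
    return "high" if any(m in planned_action for m in risky_markers) else "low"

def auto_review(planned_action: str, recent_context: str) -> str:
    """Low-risk actions pass silently; anything else escalates to a human."""
    if risk_subagent(planned_action, recent_context) == "low":
        return "auto_approved"
    return "escalated_to_human"

print(auto_review("git status", "inspecting working tree"))      # auto_approved
print(auto_review("sudo rm -rf /tmp/build", "cleanup request"))  # escalated_to_human
```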
Network controls follow the same logic. Codex does not get unrestricted outbound access; OpenAI maintains an allowlist of domains that Codex needs during normal workflows. Domains outside the list are blocked by default, and unfamiliar domains trigger the approval process.
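The allowlist check itself is straightforward. In the sketch below the domains are placeholders, not OpenAI's actual list:

```python
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"github.com", "pypi.org"}  # placeholder allowlist

def network_decision(url: str) -> str:
    """Allow exact matches and subdomains of allowlisted hosts; block the rest."""
    host = urlparse(url).hostname or ""
    if any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS):
        return "allow"
    return "block_and_request_approval"  # unfamiliar domains trigger review

print(network_decision("https://pypi.org/simple/requests/"))  # allow
print(network_decision("https://unknown.example/payload"))    # block_and_request_approval
```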
Authentication falls within the control scope as well. CLI and MCP OAuth credentials are stored in the operating system's secure keychain, login is enforced through the ChatGPT enterprise workspace, and Codex activity is logged to the ChatGPT Enterprise compliance platform for centralized review by security teams.
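On the credential side, the pattern of keeping OAuth tokens in the OS keychain rather than in a config file looks roughly like this in Python. The service and account names below are invented for illustration, and `keyring` is a third-party package (`pip install keyring`), not necessarily what Codex itself uses:

```python
import keyring  # third-party wrapper around the OS secure keychain

# Hypothetical service/account names; the point is that the token lives in
# the keychain, never in a plain-text file on disk.
keyring.set_password("codex-cli", "oauth_token", "<token-from-login-flow>")

token = keyring.get_password("codex-cli", "oauth_token")
print(token is not None)  # True once the credential is stored
```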
Which commands are exempt from approval, and which are blocked outright
OpenAI does not treat all shell commands as equally risky; it maintains a layered rule set instead. Harmless commands that developers use daily are allowed to run outside the sandbox without approval, while certain high-risk commands are blocked outright or always require approval; a minimal sketch of this tiering follows below.
This rule set is enforced through three overlapping configuration layers, which lets OpenAI maintain a unified baseline across the company while testing different configurations per team, user group, or environment. The same settings apply to the Codex desktop application, CLI, and IDE extensions.
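A tiered rule set of this kind can be sketched as a few lists plus a classifier. The commands below are invented examples; OpenAI's actual lists and layer definitions are not public:

```python
EXEMPT = {"ls", "cat", "git status", "git diff"}   # run outside the sandbox, no prompt
BLOCKED = ("mkfs", "dd if=/dev/zero")              # never allowed to run
FORCE_APPROVAL = ("rm -rf", "chmod 777", "curl")   # always pause for a human

def classify_command(cmd: str) -> str:
    if any(cmd.startswith(b) for b in BLOCKED):
        return "blocked"
    if any(pattern in cmd for pattern in FORCE_APPROVAL):
        return "needs_approval"
    if cmd in EXEMPT:
        return "exempt"
    return "sandboxed"  # default: run, but only inside the sandbox

for cmd in ("git status", "rm -rf build/", "python -m pytest"):
    print(f"{cmd!r} -> {classify_command(cmd)}")
```

In a managed deployment, lists like these would live in centrally distributed configuration rather than in code, which is what makes per-team overrides on top of a company-wide baseline practical.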
External research underscores why such controls matter: studies have found that AI-generated code carries a 57% higher rate of security vulnerabilities than human-written code, and GitHub Copilot was recently found to have a CVSS 9.6 critical vulnerability (CVE-2025-53773) that could enable remote code execution via prompt injection.
These data points suggest that without layered controls, enterprise adoption of AI code agents exposes a larger attack surface than expected. OpenAI's approach embeds control logic into configuration layers rather than relying on the agent's own judgment, making rule enforcement a technical fact rather than a matter of operational habit.
AI security triage agent
OpenAI emphasizes in its official announcement that, regardless of how well security controls are implemented, visibility remains essential after deployment. Traditional security logs can answer “what happened,” such as a program starting, a file being modified, or a network connection attempted. But what security teams truly need to know is “why did Codex do this” and “was this aligned with the user’s original intent.”
OpenAI has Codex emit logs via OpenTelemetry, recording user prompts, tool approval decisions, tool execution results, MCP server usage, and network proxy allow/deny events. Enterprise and education customers can access these logs through the OpenAI compliance platform.
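Using the standard OpenTelemetry Python SDK, emitting one such event might look like the sketch below. The `codex.*` attribute names are invented for illustration; OpenAI has not published its telemetry schema.

```python
# pip install opentelemetry-sdk
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("codex-telemetry-demo")

# One span per tool call, carrying the fields the article lists.
with tracer.start_as_current_span("codex.tool_call") as span:
    span.set_attribute("codex.user_prompt", "fix the failing unit test")
    span.set_attribute("codex.approval_decision", "auto_approved")
    span.set_attribute("codex.tool_result", "pytest exit_code=0")
    span.set_attribute("codex.mcp_server", "filesystem")
    span.set_attribute("codex.network_decision", "allow: pypi.org")
```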
More critically, OpenAI feeds these logs into an AI security triage agent. When endpoint detection tools flag suspicious Codex behavior and raise an alert, the triage agent automatically retrieves the relevant Codex logs, reconstructs the original request, tool activity, approval decisions, tool results, and network policy records, generates an analysis report, and submits it to the security team. This helps determine whether the behavior was normal, a benign mistake, or a genuine incident requiring escalation.
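Stripped to its shape, such a triage pipeline is fetch, reconstruct, assess, report. Every function and field name in the sketch below is hypothetical:

```python
def fetch_codex_logs(alert_id: str) -> dict:
    """Stand-in for a query against the compliance/SIEM log store."""
    return {
        "user_prompt": "refactor the auth module",
        "tool_results": ["ran pytest: 42 passed"],
        "approval_decisions": ["auto_approved"],
        "network_decisions": ["blocked: unknown.example"],
    }

def assess(logs: dict) -> str:
    """A real system would hand the full context to a model; here, a toy rule."""
    if any(d.startswith("blocked") for d in logs["network_decisions"]):
        return "review_recommended"
    return "likely_benign"

def triage_alert(alert_id: str) -> dict:
    """Reconstruct what Codex did and why, then package it for reviewers."""
    logs = fetch_codex_logs(alert_id)
    return {
        "original_request": logs["user_prompt"],
        "tool_activity": logs["tool_results"],
        "approvals": logs["approval_decisions"],
        "network_events": logs["network_decisions"],
        "assessment": assess(logs),
    }

print(triage_alert("alert-123")["assessment"])  # review_recommended
```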
The same telemetry data is also used for internal operational analysis: tracking adoption trends, understanding which tools and MCP servers are most frequently used, evaluating network sandbox blocking and trigger rates, and identifying deployment configuration areas needing adjustment. These OpenTelemetry logs can be centrally ingested into SIEM and compliance logging systems.
For organizations still cautious about AI agent security, this report works as a checklist: if your deployment plan does not cover these four layers (sandbox boundaries, approval policy, network controls, and telemetry-backed triage), that is likely where the risk is lurking.