OpenAI Introduces Harness Engineering: Codex Agents Power Large‑Scale Software Development

Feb 21, 2026
by Leela Kumili
OpenAI has detailed a new internal engineering methodology called Harness engineering that leverages AI agents to drive key aspects of the software development lifecycle. The system uses Codex, a suite of AI agents, to perform tasks such as writing code, generating tests, and managing observability, based on declarative prompts defined by engineers. Harness standardizes workflows, reducing reliance on handcrafted scripts and custom tooling.
Ryan Lopopolo, member of the technical staff at OpenAI, mentioned:
We built Harness to provide a consistent and reliable way to run large-scale AI workloads, so teams can focus on research and product development rather than infrastructure orchestration.
In a five-month internal experiment, OpenAI engineers built and shipped a beta product containing roughly a million lines of code without any manually written source code. A small team of engineers guided agents through pull requests and continuous integration workflows. The work included application logic, documentation, CI configuration, observability setup, and tooling. Engineers provided prompts and feedback, while Codex agents iterated autonomously on tasks including reproducing bugs, proposing fixes, and validating outcomes.
Codex Agent‑Driven Application Testing and Feedback (Source: OpenAI Blog Post)
Harness engineering shifts human engineers' focus from implementing code to designing environments, specifying intent, and providing structured feedback. Codex interacts directly with development tools, opening pull requests, evaluating changes, and iterating until task criteria are satisfied. Agents use telemetry, including logs, metrics, and spans, to monitor application performance and reproduce bugs across isolated development environments.
Observability and Telemetry Workflow for Codex Agents (Source: OpenAI Blog Post)
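The propose, evaluate, and iterate cycle described above can be illustrated with a minimal sketch. This is not Codex's interface: `iterate_until_satisfied`, `propose`, `evaluate`, `TaskResult`, and the attempt limit are hypothetical names and choices introduced only to show the shape of such a loop.

```python
from dataclasses import dataclass, field


@dataclass
class TaskResult:
    satisfied: bool                 # True if the criteria were met
    attempts: int                   # number of proposals made
    feedback: list = field(default_factory=list)  # problems seen so far


def iterate_until_satisfied(propose, evaluate, max_attempts=5):
    """Hypothetical loop: propose a change, evaluate it, feed problems back.

    propose(feedback) returns a candidate change given prior feedback.
    evaluate(change) returns a list of problems; an empty list means
    the task criteria are satisfied.
    """
    feedback = []
    for attempt in range(1, max_attempts + 1):
        change = propose(feedback)
        problems = evaluate(change)
        if not problems:
            return TaskResult(True, attempt, feedback)
        feedback.extend(problems)
    return TaskResult(False, max_attempts, feedback)
```

In a real harness, `evaluate` would stand in for signals such as CI results or telemetry checks; here it is just a callable supplied by the caller, and the attempt limit is one possible way to keep the loop bounded.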
Internal documentation is organized in a structured docs directory containing maps, execution plans, and design specifications. These documents serve as the single source of truth for agents. Cross-linked design and architecture documentation is mechanically enforced with linters and CI validation, ensuring consistency and reducing the need for manual oversight.
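A cross-link check of the kind a CI job might run can be sketched as follows. The article does not describe OpenAI's actual linters, so this example assumes Markdown-style links and a hypothetical in-memory mapping of document paths to their text; `find_broken_links` and the file names are illustrative only.

```python
import posixpath
import re

# Matches Markdown inline links such as [label](relative/path.md)
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_links(docs):
    """Return sorted (source_path, link_target) pairs that do not resolve.

    docs maps a document path (for example "docs/index.md") to its text.
    External URLs and same-page anchors are skipped.
    """
    broken = []
    for path, text in docs.items():
        base = posixpath.dirname(path)
        for target in LINK_PATTERN.findall(text):
            if "://" in target or target.startswith(("#", "mailto:")):
                continue
            target_path = target.split("#", 1)[0]  # drop any anchor
            resolved = posixpath.normpath(posixpath.join(base, target_path))
            if resolved not in docs:
                broken.append((path, target))
    return sorted(broken)


# Hypothetical docs tree: the design spec is linked but missing.
example_docs = {
    "docs/index.md": "See [plan](plans/rollout.md) and [spec](design/auth.md).",
    "docs/plans/rollout.md": "Back to [index](../index.md#top).",
}
print(find_broken_links(example_docs))
```

A CI step could fail the build whenever the returned list is non-empty, which is one way documentation consistency can be enforced mechanically rather than by review.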
OpenAI enforces architectural boundaries and dependency layers across domains through mechanical rules and structural tests. Dependencies flow in a controlled sequence from Types → Config → Repo → Service → Runtime → UI, with agents restricted to operating within these layers. Structural tests validate compliance and prevent violations of modular layering.
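A structural test for this kind of layering could look like the sketch below. It is an illustration, not OpenAI's implementation: it assumes a hypothetical `app.<layer>.<module>` package layout, and it reads the layer order as meaning that a module may import from its own layer or an earlier one, but never from a later one.

```python
import ast

# Layer order taken from the article; lower index means "earlier" layer.
LAYERS = ["types", "config", "repo", "service", "runtime", "ui"]
RANK = {name: index for index, name in enumerate(LAYERS)}


def layer_of(module_name):
    """Return the layer of a dotted name like 'app.service.orders', else None."""
    parts = module_name.split(".")
    if len(parts) >= 2 and parts[0] == "app" and parts[1] in RANK:
        return parts[1]
    return None


def imported_modules(source):
    """Collect absolute module names imported by a piece of Python source."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.append(node.module)
    return names


def find_violations(modules):
    """Return sorted (module, imported) pairs where an earlier layer
    imports from a later one. modules maps dotted names to source text."""
    violations = []
    for module_name, source in modules.items():
        source_layer = layer_of(module_name)
        if source_layer is None:
            continue
        for imported in imported_modules(source):
            target_layer = layer_of(imported)
            if target_layer is not None and RANK[target_layer] > RANK[source_layer]:
                violations.append((module_name, imported))
    return sorted(violations)
```

Run as part of a test suite, a non-empty result from `find_violations` would fail the build, so a change that reaches from, say, the service layer into the UI layer is rejected before review regardless of whether a human or an agent wrote it.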
Martin Fowler, author and Thoughtworks technologist, mentioned in a LinkedIn post:
Harness Engineering is a valuable framing of a key part of AI‑enabled software development. Harness includes context engineering, architectural constraints, and garbage collection.
OpenAI reports that Harness encodes scaffolding, feedback loops, documentation, and architectural constraints into machine-readable artifacts, which Codex agents use to execute tasks across development workflows, including code generation, testing, and observability.