DARPA ACE: The Day AI Beat a Human Fighter Pilot

The Setup: Why Air Combat AI Was Considered Impossible

For decades, within-visual-range (WVR) air combat — what pilots call a "dogfight" — was considered the domain of human intuition, spatial reasoning, and the kind of embodied split-second decision-making that machine learning systems could not replicate. The physics involved are extreme: aircraft pulling 9-G turns, closure rates exceeding Mach 1, weapons envelopes that exist for fractions of a second. Pilots spend years mastering the energy management equations, the geometry of missile shots, the timing of breaks and extensions.

Traditional autopilot and flight management systems operated in a completely different performance envelope. Even advanced fly-by-wire systems that prevent structural overstress were designed as augmentation tools, not combat decision-makers. The gap between "autopilot holding altitude" and "autonomous AI winning a knife fight at Mach 0.9" was considered as wide as the gap between a chess program and a world-class sprinter.

DARPA's Air Combat Evolution (ACE) program was launched in 2019 with exactly that assumption as its target. Its stated goal was to demonstrate that AI agents could perform autonomous air combat maneuvers (ACM) at human-competitive or human-exceeding levels — not in simulation, but in actual flight. The program would eventually accomplish that goal in a way that exceeded even its architects' expectations, and it would do so on a timeline that the broader defense community had not anticipated.

Program Background

ACE was part of DARPA's broader Mosaic Warfare concept — a vision in which large numbers of cheaper, more expendable platforms coordinate autonomously to overwhelm adversary defenses. The dogfighting question was both a technical milestone and a philosophical one: could AI be trusted to make lethal engagement decisions in the most demanding combat environment that exists?

AlphaDogfight Trials: The 2020 Watershed

The first major public milestone in the ACE program came not in the air but in simulation. In August 2020, DARPA hosted the AlphaDogfight Trials — a tournament in which eight AI teams competed against each other and, in the final round, against a human F-16 pilot in a simulated WVR engagement environment.

The human pilot, an active-duty U.S. Air Force F-16 weapons instructor with thousands of flight hours, lost all five engagements to Heron Systems' AI agent. The AI flew nothing like a human pilot. It did not use the energy management doctrines that human pilots learn over years of training. Instead, it attacked in nearly straight lines, relying on its ability to track the exact position of the opposing aircraft with perfect precision and fire microsecond-accurate gun bursts at geometrically optimal moments. The human pilot called it "demotivating."

The defense community's reaction was divided. Skeptics noted that the simulation environment was highly simplified — no sensor noise, no communications latency, no real-world aerodynamic edge cases, no rules of engagement complexity. The AI had essentially solved a constrained optimization problem, not proven it could fight in the real world. Proponents argued the results were still significant: the AI had demonstrated that within a defined problem space, machine learning systems could achieve combat decision-making speed and geometric precision that exceeded human capability.

AlphaDogfight Team	Approach	Tournament Result	Notable Feature
Heron Systems	Reinforcement learning, neural net	CHAMPION — 5-0 vs Human	Straight-line attack geometry, gun-focused
Lockheed Martin Skunk Works	Hybrid rule-based + RL	2nd Place	Most conservative engagement style
Aurora Flight Sciences (Boeing)	Behavior trees + deep learning	3rd Place	Aggressive pursuit doctrine
EpiSys Science	Evolutionary algorithms	4th Place	Unconventional maneuvering patterns
Human Pilot (F-16 IP)	Trained human judgment	0-5 vs Heron AI	"Demotivating" post-match comment

From Simulation to the X-62A VISTA

The AlphaDogfight result raised an obvious question: would the same AI approaches work in actual flight, on an actual aircraft, with real sensors, real aerodynamics, and real safety considerations? DARPA's answer was to move the program into the air.

The platform chosen for ACE flight testing was the X-62A VISTA — Variable In-flight Simulator Test Aircraft — a heavily modified two-seat F-16D operated by the Air Force Test Pilot School at Edwards Air Force Base. The VISTA had been specifically designed to simulate the flight characteristics of other aircraft, making it a natural fit for an AI research program that needed to be able to test dangerous maneuvers without risking operational aircraft.

The X-62A's modification for ACE was extensive. The aircraft's flight control computers were interfaced with a new open-architecture mission systems platform called VAULT, developed by Calspan under contract to DARPA. VAULT allowed researchers to upload, test, and modify AI agent software in flight — essentially making the X-62A a flying AI laboratory. Crucially, a safety pilot sat in the rear cockpit at all times with the ability to override AI commands instantly. The safety override was hardwired, not software-dependent, ensuring that no AI failure mode could prevent human intervention.

Safety Architecture

DARPA's ACE safety framework was deliberately conservative. The AI agents were given control authority over flight surfaces and weapons systems within defined envelopes, but structural limits, collision avoidance, and emergency procedures remained hardwired to conventional flight computers. The AI could not exceed the aircraft's structural limits, could not disable the safety pilot's override capability, and could not engage outside defined test parameters. This layered safety architecture was as important to the program as the AI development itself.
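The ordering of those layers can be sketched in a few lines of Python. This is only an illustration of the precedence described above, not the X-62A's actual logic: the class names, the `apply_safety_layers` function, and every numeric limit are hypothetical, and the real override was hardwired rather than implemented in software.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Envelope:
    """Hypothetical limits; illustrative numbers, not X-62A values."""
    max_g: float = 9.0
    min_g: float = -3.0
    min_altitude_ft: float = 5000.0


@dataclass(frozen=True)
class Command:
    g_command: float  # load factor requested by the AI agent
    fire: bool        # simulated weapons release request


def apply_safety_layers(
    cmd: Command,
    altitude_ft: float,
    pilot_override: bool,
    env: Envelope = Envelope(),
) -> Tuple[Optional[Command], str]:
    """Return the command that is passed on and who is in control."""
    # Layer 1: the safety pilot always wins; the agent's command is discarded.
    if pilot_override:
        return None, "pilot"
    # Layer 2: below the altitude floor, ignore the agent and command a
    # fixed recovery pull (4 g is an arbitrary placeholder).
    if altitude_ft < env.min_altitude_ft:
        return Command(g_command=4.0, fire=False), "recovery"
    # Layer 3: clamp the agent's request to the structural limits.
    g = max(env.min_g, min(env.max_g, cmd.g_command))
    return Command(g_command=g, fire=cmd.fire), "agent"
```

The point of the ordering is that each check runs before the agent's request is honored, so nothing the agent outputs can bypass a higher layer.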

The Test Campaigns: 2022-2023

DARPA and the Air Force Test Pilot School conducted extensive AI vs. AI flight testing through 2022 and into 2023. Multiple ACE contractor teams — eventually including Calspan, Lockheed Martin, Northrop Grumman, Heron Systems (acquired by Shield AI), and others — flew their AI agents in the VISTA against both simulated opponents and against each other in coordinated flight test campaigns.

The early flight tests revealed a significant gap between simulation performance and real-world performance. Sensor noise, aerodynamic modeling inaccuracies in the simulation training environment, and latency in the actual flight control loop all degraded AI performance relative to what had been demonstrated in AlphaDogfight. Teams spent much of 2022 working on the "sim-to-real" transfer problem — how to train AI agents in simulation that perform equally well in actual flight.

The approaches to this problem varied significantly across teams. Some used domain randomization: training in many slightly different simulated environments so the AI learned robust policies rather than simulation-specific ones. Others used real flight data to calibrate simulation aerodynamic models more precisely. Shield AI (the acquirer of Heron Systems) focused on ensuring that its Hivemind AI pilot performed well across a wide range of sensor degradation conditions, preparing for real-world electromagnetic environments.
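Domain randomization is easiest to see in code. The sketch below assumes a simulator whose physics and sensor model are controlled by a handful of parameters; the parameter names and ranges are invented for illustration and do not come from any ACE team's setup.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    drag_scale: float          # multiplier on the nominal drag model
    thrust_scale: float        # multiplier on nominal engine thrust
    sensor_noise_std_m: float  # std. dev. of position noise, in metres
    control_latency_ms: float  # delay between command and actuation


def sample_sim_params(rng: random.Random) -> SimParams:
    """Draw one randomized environment (all ranges are hypothetical)."""
    return SimParams(
        drag_scale=rng.uniform(0.9, 1.1),
        thrust_scale=rng.uniform(0.95, 1.05),
        sensor_noise_std_m=rng.uniform(0.0, 15.0),
        control_latency_ms=rng.uniform(10.0, 80.0),
    )


def noisy_position(true_pos_m, params: SimParams, rng: random.Random):
    """Corrupt a true (x, y, z) position with Gaussian sensor noise."""
    return tuple(x + rng.gauss(0.0, params.sensor_noise_std_m) for x in true_pos_m)


# Each training episode gets its own slightly different "world".
rng = random.Random(0)
episode_params = [sample_sim_params(rng) for _ in range(1000)]
```

Because the agent never sees the same drag, thrust, noise, and latency twice, it cannot overfit to one simulator's quirks, which is the property the teams were after.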

The September 2023 ACM Tests

By September 2023, DARPA and the Air Force TPS were ready for the landmark test: AI agents in the VISTA flying against a crewed F-16 flown by a human test pilot in within-visual-range combat. The test took place over the desert ranges of Edwards AFB under carefully controlled conditions. Both aircraft were equipped with instrumentation pods recording every flight parameter. Simulated weapons systems tracked kill probability at each moment of the engagement.

The results were unambiguous. Across multiple engagements, the AI agent defeated the human pilot. The final publicized score, which DARPA announced in a May 2024 press release referencing the September 2023 flights, was effectively 5-0, or comparably lopsided, in ACM engagements. The AI consistently demonstrated superior geometric awareness, faster reaction to changes in the adversary's energy state, and more aggressive exploitation of weapons envelopes.

"The work here shows that AI can handle the physical task of flying and fighting in a dogfight. What remains is the question of how we integrate these systems into the broader mission and how we ensure they operate within intended rules of engagement."
— Col. James Valpiani, Air Force Test Pilot School Commandant, May 2024

The human pilot who flew the crewed F-16 in the September 2023 engagements later described the AI's tactics as "alien." The AI did not fight like any human pilot ever trained. It did not use the standard Western air combat maneuver canon — no lead turns, no extension-and-reversal, no conventional lag pursuit. Instead, it appeared to have developed its own maneuver grammar, optimized for the specific aerodynamic and weapons employment geometry of the engagement, derived entirely from reinforcement learning without any human doctrinal input.

How the AI Fights: Reinforcement Learning in Combat

Understanding why the AI won requires understanding how it was trained. The core methodology across most ACE teams was reinforcement learning (RL) — specifically, a variant of deep RL in which a neural network learns to take actions by receiving reward signals for favorable outcomes and penalty signals for unfavorable ones.

In the ACE context, the AI agents were trained in high-fidelity flight simulators running thousands of simulated hours of air combat. The reward function was designed to give positive rewards for achieving favorable weapons employment position (shooting the adversary), maintaining energy advantage, surviving the engagement, and staying within safe flight envelopes. Negative rewards accrued for being shot, losing energy to unrecoverable states, and exceeding structural limits.
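A reward function of this shape might look like the following sketch. The field names, weights, and penalty sizes are all hypothetical, chosen only to show how the positive and negative terms listed above combine; the actual ACE reward functions have not been published in this form.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    in_firing_position: bool         # agent holds a valid weapons solution
    energy_advantage: float          # energy margin over adversary, in [-1, 1]
    was_shot: bool                   # adversary scored a simulated kill
    energy_unrecoverable: bool       # agent has bled energy past recovery
    exceeded_structural_limit: bool  # agent left the safe flight envelope


# Hypothetical weights; real programs tune these extensively.
W_FIRING, W_ENERGY, W_SURVIVE, W_ENVELOPE = 1.0, 0.1, 0.01, 0.01
P_SHOT, P_ENERGY_LOSS, P_OVERSTRESS = 10.0, 5.0, 5.0


def step_reward(o: StepOutcome) -> float:
    """Combine the reward and penalty terms for one simulation step."""
    r = 0.0
    if o.in_firing_position:
        r += W_FIRING
    # Only a positive energy margin is rewarded.
    r += W_ENERGY * max(0.0, min(1.0, o.energy_advantage))
    r += -P_SHOT if o.was_shot else W_SURVIVE
    r += -P_OVERSTRESS if o.exceeded_structural_limit else W_ENVELOPE
    if o.energy_unrecoverable:
        r -= P_ENERGY_LOSS
    return r
```

The large penalties relative to the per-step rewards reflect a common design choice: one catastrophic outcome should outweigh many steps of marginal advantage.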

Over millions of simulated engagements — a volume of experience no human pilot could ever accumulate — the neural network converged on policies that maximized expected reward. The resulting behavior was highly effective but deeply non-human. The AI had essentially discovered air combat tactics from first principles, without any human doctrinal input, and the tactics it discovered were in several respects superior to human-codified doctrine.

Training Scale

Shield AI, one of the ACE contractors, reportedly trained its AI pilot on the equivalent of thousands of years of simulated flight time before its first real flight. A human pilot accumulates only a few thousand flight hours in a career. Since a single year contains about 8,760 hours, the RL agent's advantage in raw experience is at least three orders of magnitude.

Key AI Tactical Advantages Observed

Post-flight analysis from DARPA and Air Force TPS observers identified several specific areas where the AI consistently outperformed the human pilot:

Reaction time: The AI responded to changes in adversary energy state and geometry in approximately 50-100 milliseconds. Human reaction time in high-G combat environments is typically 300-500 milliseconds, with cognitive processing adding additional latency.
G-tolerance management: The AI could fly continuously at or near the aircraft's aerodynamic limit without the fatigue, G-LOC risk, or physical discomfort that constrains human pilots. It did not need to ease off the stick to maintain visual acuity.
Perfect geometric precision: The AI maintained awareness of the precise 3D geometry of the engagement with no cognitive degradation. Human spatial reasoning under high G and high stress degrades significantly; AI spatial processing does not.
No hesitation under pressure: The AI did not freeze, second-guess, or become conservative under adverse conditions. It executed its policy regardless of tactical situation.
Non-doctrinal unpredictability: Because the AI had not learned human doctrinal tactics, its maneuvers were genuinely unpredictable to human adversaries trained on conventional ACM doctrine.
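To put the reaction-time figures above in spatial terms, the short calculation below converts latency into distance closed before a response begins. It assumes a closure rate of Mach 1 and a speed of sound of 340 m/s (roughly the sea-level value); both are simplifying assumptions, and real engagements vary with altitude and geometry.

```python
SPEED_OF_SOUND_M_S = 340.0  # assumption: approximate sea-level value


def distance_during_latency(closure_mach: float, latency_ms: float) -> float:
    """Metres of closure that elapse before a reaction begins."""
    return closure_mach * SPEED_OF_SOUND_M_S * latency_ms / 1000.0


cases = [
    ("AI, 50 ms", 50),
    ("AI, 100 ms", 100),
    ("human, 300 ms", 300),
    ("human, 500 ms", 500),
]
for label, latency_ms in cases:
    metres = distance_during_latency(1.0, latency_ms)
    print(f"{label}: {metres:.0f} m closed before reacting")
```

Under these assumptions the gap is roughly 17 to 34 metres for the AI against 102 to 170 metres for a human, which is large relative to a gun-range firing window.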

Timeline: From ACE to CCA

2019

DARPA Launches ACE Program

DARPA announces the Air Combat Evolution program with the goal of demonstrating AI agents performing autonomous ACM at human-competitive levels. Initial contracts awarded to multiple teams for simulation development. X-62A VISTA designated as the primary flight test platform.

August 2020

AlphaDogfight Trials

Eight AI teams compete in simulation. Heron Systems AI defeats human F-16 pilot 5-0 in the final round. Defense community divided on significance — simulation vs. real-world gap widely cited. Program receives significant media and congressional attention. DARPA budget increased for ACE flight phase.

2021–2022

VAULT Integration and Early Flights

Calspan completes VAULT open-architecture integration on X-62A VISTA. First AI-controlled flights begin at Edwards AFB. Sim-to-real gap identified as primary technical challenge. Teams begin domain randomization and physics calibration work. Multiple AI vs. AI engagements flown to gather real-world training data.

2022–2023

Expanded Flight Test Campaign

Over 100 AI-controlled flights completed. Multiple ACE contractor teams test their agents in real ACM scenarios. Performance gap between simulation and real flight narrowing as domain adaptation techniques mature. USAF begins parallel investigation of AI combat employment within Advanced Fighter program context.

September 2023

AI vs. Human Engagement Tests

DARPA and USAF TPS conduct landmark tests at Edwards AFB: AI agent in VISTA vs. crewed F-16 flown by test pilot. AI wins multiple WVR engagements, the first documented AI victory over a human in live flight combat. Results classified at time of completion. The two aircraft also fly together in tight formation for photography.

May 2024

Public Announcement

DARPA and Air Force Secretary Frank Kendall publicly announce the AI vs. human dogfight results. Secretary Kendall reveals he personally flew in the back seat of the X-62A VISTA during one AI-controlled engagement. The announcement triggers global reassessment of autonomous air combat timelines.

2024–2026

CCA Program Acceleration

ACE results directly accelerate USAF's Collaborative Combat Aircraft program. FY2026 budget requests $5.8 billion for CCA development and procurement. Anduril Industries and General Atomics selected as initial CCA vendors. The program envisions AI-piloted wingmen flying with crewed F-35s and F-22s in contested airspace.

Air Force Secretary Kendall's Flight

The most remarkable disclosure in DARPA's May 2024 announcement was that Air Force Secretary Frank Kendall had personally flown in the back seat of the X-62A VISTA during one of the AI-controlled combat engagements. This was not a passive observation flight. Kendall sat in the rear cockpit while the AI pilot controlled the aircraft through high-G combat maneuvers against a human adversary in the F-16.

Kendall's decision to fly was deliberate and symbolic. He stated publicly that he wanted to experience AI air combat firsthand before making decisions about the CCA program — decisions involving billions of dollars and the future architecture of the United States Air Force. His assessment after the flight was unambiguous: the AI worked, the safety architecture was adequate, and the capabilities observed warranted accelerated investment in autonomous combat aircraft.

The political significance of a sitting Service Secretary flying in an AI-controlled combat aircraft cannot be overstated. It transformed the ACE results from a DARPA research program into a senior leadership endorsement of autonomous air combat capability — and it accelerated the CCA program's timeline in ways that subsequent budget requests make measurable.

The Collaborative Combat Aircraft Program

The ACE program's most significant legacy is not the dogfight itself but what it enabled: the Collaborative Combat Aircraft program, which represents the first large-scale procurement of autonomous combat aircraft by the United States Air Force.

The CCA concept envisions AI-piloted uncrewed aircraft flying as wingmen alongside crewed F-35s and F-22s. The uncrewed aircraft — designed to be significantly cheaper than crewed platforms, with a target unit cost in the $20-30 million range versus $80M+ for an F-35 — would perform a range of missions: sensor node operation, electronic attack, missile truck (carrying additional weapons for the crewed wingman), and potentially offensive strike. In high-threat environments where crewed aircraft risk being shot down, CCAs could operate in the most dangerous airspace, accepting attrition that would be unacceptable for crewed platforms.
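The cost comparison reduces to simple arithmetic, sketched below using only the figures quoted above. Since the F-35 figure is given as "$80M+", the ratios are lower bounds; the function name is illustrative.

```python
F35_COST_MUSD = 80.0                # lower bound quoted above, in $ millions
CCA_COST_RANGE_MUSD = (20.0, 30.0)  # target unit cost range quoted above


def ccas_per_crewed(crewed_cost_musd: float, cca_cost_musd: float) -> float:
    """How many CCAs cost the same as one crewed aircraft."""
    return crewed_cost_musd / cca_cost_musd


low_cost, high_cost = CCA_COST_RANGE_MUSD
print(f"At ${high_cost:.0f}M each: {ccas_per_crewed(F35_COST_MUSD, high_cost):.1f} CCAs per F-35")
print(f"At ${low_cost:.0f}M each: {ccas_per_crewed(F35_COST_MUSD, low_cost):.1f} CCAs per F-35")
```

At the quoted targets, one F-35's purchase price buys between roughly 2.7 and 4 CCAs, which is the basis for accepting attrition in the uncrewed fleet.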

CCA Budget Scale

The FY2026 budget request includes $5.8 billion for the CCA program, a figure that represents one of the largest single-year investments in autonomous weapons capability in U.S. defense history. This funding covers two CCA designs (Anduril Fury and General Atomics Gambit), autonomy software development, ground control systems, and the start of a multi-year production contract intended to field hundreds of CCAs by 2030.

Anduril and General Atomics: The CCA Contractors

In 2024, the Air Force selected Anduril Industries and General Atomics as CCA Increment 1 vendors, awarding contracts for competitive prototyping. Both companies are building distinct approaches to the CCA problem.

Anduril's Fury is a twin-engine, supersonically capable platform designed to fly tight tactical formations with crewed fighters. Anduril's Lattice AI platform — already deployed in counter-drone applications — forms the autonomy backbone. Lattice was specifically designed to operate in GPS-denied, communications-degraded environments, a critical capability for combat operations against peer adversaries who will actively jam communications and navigation systems.

General Atomics' Gambit represents a family of designs rather than a single platform, with different variants optimized for different mission sets. GA brings decades of experience operating the Reaper and Predator drone programs, giving it the largest operational dataset of any CCA competitor. Its autonomy stack benefits from the enormous volume of real-world flight data accumulated across thousands of Reaper sorties.

The ACE program's AI combat results are directly incorporated into both CCA development programs. The reinforcement learning techniques validated on the X-62A — along with the safety architectures, sim-to-real transfer methodologies, and tactical policy development approaches — form the technical foundation of CCA autonomy software.

Beyond Dogfighting: The Broader Implications of ACE

The dogfight result, while dramatic, was only one element of what ACE demonstrated. The program's broader significance lies in its proof of concept for autonomous decision-making in high-speed, high-stakes tactical environments.

Before ACE, the conventional wisdom in autonomous systems held that AI could handle carefully bounded, predictable domains — landing, navigation, obstacle avoidance — but not the complex adversarial dynamics of tactical combat, where an intelligent opponent is actively trying to defeat your decisions. ACE falsified that assumption. The AI agents did not just perform scripted maneuvers; they adapted in real time to an adversary making intelligent countermoves.

The implications extend well beyond air combat:

Naval surface warfare: autonomous ships or submarines using similar RL approaches to conduct tactical engagements without human direction
Counter-drone systems: AI pilots for interceptor drones that pursue and defeat adversary UAS with the same geometric precision demonstrated in ACE
Missile defense: AI-controlled interceptors making mid-course correction decisions faster than human operators can authorize
Electronic warfare: autonomous aircraft optimizing jamming and spoofing patterns in real time against an adversary also adapting in real time

In each of these domains, the fundamental ACE insight applies: an AI agent trained on millions of simulated engagements, operating faster than human reaction time, with perfect geometric awareness and no physiological limits, will outperform human operators in the execution phase of tactical engagements. The question is no longer whether AI can fight — ACE answered that. The question is what the rules of engagement look like when AI is pulling the trigger.

International Reactions and Competition

The ACE announcement and CCA program have triggered visible acceleration in autonomous combat aircraft development by major military powers:

China

The PLA Air Force has been conducting its own autonomous air combat research in parallel with the ACE program, though with substantially less public transparency. Chinese military publications from 2022-2024 increasingly referenced AI air combat, and China's AI pilot competition — in which multiple AI agents competed in a simulation tournament — was completed in 2023. The PLA's Wing Loong and TB-001 MALE drones are already in service; CCA-equivalent platforms — likely derivatives of the AVIC AV500 or new designs — are believed to be in advanced development.

Russia

Russia has been slower to invest in high-end autonomous air combat capability, constrained by sanctions, production limitations, and the resource demands of the Ukraine war. However, Russian aerospace companies including Sukhoi and MiG have published conceptual work on loyal wingman programs, and ROSTEC has announced an "unmanned wingman for the Su-57." Whether these programs have achieved anything comparable to ACE flight test results is unknown.

United Kingdom and Australia

The UK's LANCA (Lightweight Affordable Novel Combat Aircraft) program and Australia's MQ-28A Ghost Bat, originally known as the Boeing Airpower Teaming System, represent the closest Western equivalents to the CCA program outside the United States. The Ghost Bat has been in flight testing since 2021 and is designed specifically for the loyal wingman role. AUKUS cooperation has deepened the US-UK-Australia alignment on autonomous combat aircraft development.

The Rules of Engagement Problem

The ACE result confronts military lawyers, ethicists, and policymakers with a problem that doctrine has not yet resolved: when an AI agent is making lethal engagement decisions at machine speed, what does meaningful human oversight look like?

The current U.S. policy framework, DoD Directive 3000.09, first issued in 2012 and updated in 2023, requires that autonomous and semi-autonomous weapon systems be designed to allow commanders to exercise appropriate levels of human judgment over the use of force. The ambiguity of "appropriate levels" was intentional, providing flexibility for different weapons systems with different engagement timescales.

In the CCA context, an AI agent making dogfight decisions in 50-millisecond cycles is effectively beyond any form of per-engagement human control. The human commander authorizes the mission, sets the rules of engagement parameters (engage aircraft of type X in zone Y if criteria Z are met), and the AI executes. This is not meaningfully different from a human pilot executing authorized rules of engagement — but the speed and autonomy involved raise questions that will require new doctrinal frameworks to answer.
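The "type X in zone Y if criteria Z" pattern can be pictured as a small data structure that the commander fills in before the mission and the agent consults at machine speed. The sketch below is hypothetical throughout: the class names, fields, and the single positive-identification criterion are stand-ins, not a representation of any fielded rules-of-engagement system.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Track:
    """A sensed aircraft as the agent sees it (hypothetical fields)."""
    aircraft_type: str
    zone: str
    positively_identified: bool


@dataclass(frozen=True)
class EngagementAuthorization:
    """Parameters a human commander sets before the mission."""
    allowed_types: FrozenSet[str]     # "type X"
    allowed_zones: FrozenSet[str]     # "zone Y"
    require_positive_id: bool = True  # one example of "criteria Z"

    def permits(self, track: Track) -> bool:
        """True only if every pre-authorized condition is met."""
        if track.aircraft_type not in self.allowed_types:
            return False
        if track.zone not in self.allowed_zones:
            return False
        if self.require_positive_id and not track.positively_identified:
            return False
        return True


# Example: the commander authorizes one aircraft type in one zone.
roe = EngagementAuthorization(frozenset({"type_x"}), frozenset({"zone_y"}))
print(roe.permits(Track("type_x", "zone_y", True)))   # inside the authorization
print(roe.permits(Track("type_x", "zone_q", True)))   # wrong zone
```

In this framing the human judgment lives entirely in how the authorization is written; the agent only evaluates it, which is exactly where the doctrinal questions arise.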

DARPA and the Air Force have been explicit that ACE was a research program, not a weapons development program — and that deploying lethal autonomous air combat systems requires the resolution of these policy questions before operational use. But the CCA program's aggressive timeline suggests that those resolutions are expected to arrive on schedule with the hardware.

Lessons Learned

Simulation training at scale transfers to real combat

The sim-to-real gap, once considered a fundamental barrier, was narrowed to a workable level through domain randomization and careful physics calibration. AI trained on millions of simulated hours outperformed human pilots trained on thousands of real hours.

AI tactics are genuinely alien to human adversaries

Because AI learns from first principles rather than human doctrine, its maneuver grammar is unpredictable. Human pilots trained to counter conventional ACM doctrine have no framework for countering non-doctrinal AI tactics.

Safety architecture is as important as capability

DARPA's layered safety system — hardwired overrides, structural limits, human safety pilots — was what made the program politically and operationally viable. Capability without safety architecture is not deployable.

Cost asymmetry drives the CCA rationale

A $25M autonomous CCA that can absorb losses in high-threat airspace changes the math of contested air superiority. Adversaries cannot afford to expend their limited stocks of high-end surface-to-air missiles on every $25M drone wingman.

Rules of engagement must evolve ahead of capability

Doctrine has not kept pace with ACE's results. Deploying combat-capable AI aircraft requires resolved legal and ethical frameworks before the first operational engagement, not after.

Physiological limits will decide future dogfights

An AI pilot has no G-LOC threshold, no fatigue, no fear. In sustained high-G combat, it can maintain peak performance indefinitely. Human pilots cannot. The future of WVR combat may be determined by who deploys AI wingmen first.

What Comes Next: Increment 2 and Beyond

The CCA program is structured in increments. Increment 1 — the Anduril and General Atomics prototypes — focuses on the basic loyal wingman mission: flying in formation with crewed aircraft, operating autonomously in communications-degraded environments, and executing pre-authorized weapons employment decisions. First flight of Increment 1 aircraft is targeted for 2026-2027.

Increment 2 and beyond will expand CCA capability toward fully autonomous strike missions, multi-CCA coordination, and operations in highly contested airspace where crewed aircraft cannot safely operate. The ACE program's reinforcement learning approaches will be central to Increment 2 autonomy — particularly the multi-agent coordination capabilities tested in AI vs. AI scenarios at Edwards AFB.

The longer-term vision — which senior USAF leaders have discussed publicly but not formally committed to — involves CCAs capable of operating as the primary air superiority platform in the most dangerous airspace, with crewed aircraft operating as command nodes at safe standoff distances. This would represent a fundamental inversion of the current force architecture, in which crewed aircraft are the primary combat platform and unmanned systems are supporting assets.

Whether that vision is realized on the timeline the FY2026 budget implies depends on the resolution of several outstanding technical challenges: reliable autonomy in GPS-denied and communications-jammed environments, robust identification of friend or foe, and the policy frameworks that authorize AI engagement decisions. The ACE program has resolved the fundamental question of whether AI can fight. Everything that follows is a matter of engineering, policy, and political will.

"This is not science fiction anymore. We flew an AI-controlled aircraft in combat maneuvering against a human pilot, and the AI won. The question for the Air Force now is how fast we can operationalize that."
— Secretary of the Air Force Frank Kendall, May 2, 2024
