The hidden macOS API behind Codex computer use

Apr 19, 2026

Codex can run several computer-use tasks in parallel, which shouldn't really be possible. Most tools in this space take over your mouse and keyboard completely, so you can't use your machine while they work.

Codex doesn't do that. The team at Software Applications Inc. figured out how to avoid it. (They're the people behind Sky, by the way: the former Workflow team, whose app Apple acquired in 2017 and turned into Shortcuts.)

I wanted to know how they pulled it off. So I did some digging. Here's what I found.

The "secret API" is AXUIElement, the core type in macOS's Accessibility API. Apple built it for assistive tech like VoiceOver. The Codex team just repurposed it for LLMs.

On the reading side: the AX Tree.

No screenshots. No OCR. Codex reads the accessibility hierarchy directly. Every button, text field, and menu item is already labeled and structured.

It's basically a DOM equivalent that Apple built decades ago for assistive tech. That's what the model actually sees.
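To make the "DOM equivalent" idea concrete, here's a minimal sketch of turning an accessibility tree into text a model can read. It's a toy model in plain Python, not the real API: `AXNode` and `dump_tree` are names I made up, and the serialization format is my assumption, not what Codex actually sends. On macOS you'd fill a tree like this by walking `AXUIElementCopyAttributeValue` with `kAXChildrenAttribute`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AXNode:
    """Toy stand-in for one accessibility element (hypothetical, not AXUIElement)."""
    role: str                       # e.g. "AXButton", "AXTextField"
    title: str = ""                 # human-readable label, if any
    value: str = ""                 # current contents, if any
    children: List["AXNode"] = field(default_factory=list)


def dump_tree(node: AXNode, depth: int = 0) -> str:
    """Serialize the tree depth-first into indented text, one element per line."""
    parts = [node.role]
    if node.title:
        parts.append(f'title="{node.title}"')
    if node.value:
        parts.append(f'value="{node.value}"')
    lines = ["  " * depth + " ".join(parts)]
    for child in node.children:
        lines.append(dump_tree(child, depth + 1))
    return "\n".join(lines)


# A pretend mail compose window with one text field and one button.
window = AXNode("AXWindow", title="Compose", children=[
    AXNode("AXTextField", title="To", value="sam@example.com"),
    AXNode("AXButton", title="Send"),
])

print(dump_tree(window))
```

The point is that roles and labels arrive as structured text, so nothing has to be guessed from pixels.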

On the writing side: actions go through accessibility APIs, not synthesized CGEvent mouse and keyboard input.

Clicks hit the target element directly, never through the shared system cursor. Those virtual cursors you see wiggling around? Purely cosmetic.

That's why focus doesn't get stolen. That's why N agents can run at the same time without stepping on each other.
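Here's a toy illustration of why the shared cursor is the bottleneck. It's a simulation in plain Python, not the real API: `Desktop` and its methods are hypothetical names, and the interleaving is hand-picked to show the failure. On macOS the direct path would be something like `AXUIElementPerformAction` with `kAXPressAction`, though I can't confirm that's exactly what Codex calls.

```python
from typing import List, Optional, Tuple


class Desktop:
    """Toy desktop (hypothetical): one shared pointer plus a log of what got hit."""

    def __init__(self) -> None:
        self.cursor_target: Optional[str] = None
        self.log: List[Tuple[str, Optional[str]]] = []

    def move_cursor(self, target: str) -> None:
        # Synthesized mouse input: every agent moves the same global pointer.
        self.cursor_target = target

    def click(self, agent: str) -> None:
        # A click lands wherever the pointer happens to be right now.
        self.log.append((agent, self.cursor_target))

    def press_element(self, agent: str, target: str) -> None:
        # Accessibility-style action: addressed to an element, no pointer involved.
        self.log.append((agent, target))


# Cursor-based agents, with their steps interleaved.
shared = Desktop()
shared.move_cursor("Send")      # agent A aims at Send
shared.move_cursor("Delete")    # agent B moves the same pointer
shared.click("A")               # A's click now lands on Delete
shared.click("B")

# Element-addressed agents, same interleaving.
direct = Desktop()
direct.press_element("A", "Send")
direct.press_element("B", "Delete")

print("shared cursor:", shared.log)
print("direct press: ", direct.log)
```

With one pointer, agent A ends up clicking Delete. With element-addressed actions, each agent hits its own target no matter how the steps interleave.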

This approach doesn't work on Windows, so there Codex takes over the cursor.

The Sky team didn't actually invent this approach. UI Browser was using the same APIs years ago.

I find hacks like this super cool, and I hope more people build cool shit with them.
