Agentic Automation · AF Trailer Programs · 2026

APEX

How I built a browser agent to do the work nobody wanted to do, and what happened when it actually worked.

Vishal Gayakwar
Python + Playwright
Live

Every morning,
a queue arrives.

Inside Amazon's freight operations, a team called TFT manages trailer movement across dozens of facilities. Every day, a system called Paragon generates cases, each one representing a shipment that needs a trailer attached before it can move.

The cases land in a queue. Associates work through them one by one. Some of those cases are genuinely hard. A trailer stuck in the wrong yard. A site with no available options. A late shipment where someone needs to make a judgment call.

These are the cases that need a human. And then there's the other kind.

Many incoming cases require no judgment. Open a tool. Find an empty trailer. Check it. Attach it. Resolve. The associate already knows the answer before they open the case.

The manual process, repeated hundreds of times a week
Open Paragon → Find the case → Open YMS → Search the yard → Find empty trailer → Check it's clean → Attach VRID → Resolve → Next case
~2 minutes per case  ·  no decision involved
One case every 32 seconds  ·  live processing speed

The information was already there.
The rule was already written.

The only thing missing was something to click the buttons.

So I built the thing
that clicks the buttons.

APEX is a browser automation agent. It opens a real Chrome window, logs into Paragon, reads the case queue, opens YMS in a second tab, evaluates available trailers, makes the decision, attaches the VRID, and resolves the case.

It does exactly what an associate would do. It takes 32 seconds. Then it moves to the next case.
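The loop described above can be sketched in Python roughly as follows. This is a minimal sketch, not APEX's actual code: `ParagonTab`, `YmsTab`, `Case`, and every method name are hypothetical stand-ins for the two browser tabs and the data they exchange.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Protocol


@dataclass(frozen=True)
class Case:
    case_id: str
    site: str
    vrid: str


class ParagonTab(Protocol):
    """Hypothetical wrapper around the Paragon tab."""
    def read_queue(self) -> Iterable[Case]: ...
    def resolve(self, case: Case, trailer_id: str) -> None: ...
    def escalate(self, case: Case, reason: str) -> None: ...


class YmsTab(Protocol):
    """Hypothetical wrapper around the YMS tab."""
    def find_clear_trailer(self, site: str) -> Optional[str]: ...
    def attach_vrid(self, trailer_id: str, vrid: str) -> None: ...


def run_queue(paragon: ParagonTab, yms: YmsTab) -> dict[str, str]:
    """Work the queue once; return case_id -> 'resolved' or 'escalated'."""
    outcomes: dict[str, str] = {}
    for case in paragon.read_queue():
        trailer_id = yms.find_clear_trailer(case.site)
        if trailer_id is None:
            # No trailer cleared the checks: hand the case to a human.
            paragon.escalate(case, "no trailer cleared the checks")
            outcomes[case.case_id] = "escalated"
            continue
        yms.attach_vrid(trailer_id, case.vrid)
        paragon.resolve(case, trailer_id)
        outcomes[case.case_id] = "resolved"
    return outcomes
```

Keeping the two tools behind small interfaces like this is one way to exercise the loop with fakes before pointing it at a real browser.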

APEX · Case Lifecycle
01 · Case Arrives
A Paragon case hits the TFT queue. Eligible for automation. No judgment needed.
02 · Evaluates
APEX opens YMS, scans for available trailers, and runs 7 checks on each candidate.
03 · Automates
Trailer passes all 7 checks. VRID attached. Case resolved in Paragon. Slack log sent. 32 seconds.
04 · Escalates
No trailer clears the checks. APEX doesn't guess. It flags the case for a human to take over.
The time difference
Manual: ~120s per case  ·  APEX: 32s per case  ·  roughly 73% less time

It doesn't replace judgment. It replaces the absence of it: the cases where no judgment was needed in the first place.

It sounds simple.
It wasn't.

Browser automation against internal enterprise tools is a different problem. Three things made this harder than expected.

[Illustration: midway-auth.amazon.com shows "Access Denied. Headless browser detected. Authentication requires a visible session." headless=True  →  blocked]
[Illustration: yms.amazon.com/inventory shows "Loading YMS inventory..." with a timeout threshold of 90s]
[Illustration: parallel execution across Partition A and Partition B. 2× throughput, zero collisions]
01 · Authentication
Internal tools block headless browsers
Amazon's authentication system (Midway) detects headless browsers and refuses to let them in. APEX had to run as a real, visible Chrome window, saving the session between runs so it doesn't need to re-authenticate each time. The browser is always open. It just runs quietly.
Fix → persistent browser context + saved session profile
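A persistent context with a saved profile might look like the sketch below, using Playwright's `launch_persistent_context`. The directory name, the `ensure_profile_dir` and `open_session` helpers, and the choice of `channel="chrome"` are assumptions for illustration, not details confirmed by the project.

```python
from pathlib import Path


def ensure_profile_dir(base: Path) -> Path:
    """Create (if needed) the directory where Chrome keeps cookies and session state."""
    profile = base / "apex-chrome-profile"  # hypothetical directory name
    profile.mkdir(parents=True, exist_ok=True)
    return profile


def open_session(profile: Path, start_url: str):
    """Launch a visible Chrome window that reuses the saved profile."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    pw = sync_playwright().start()
    context = pw.chromium.launch_persistent_context(
        user_data_dir=str(profile),
        headless=False,    # a visible window, since headless sessions are blocked
        channel="chrome",  # drive the installed Chrome rather than bundled Chromium
    )
    # A persistent context usually opens with one page already present.
    page = context.pages[0] if context.pages else context.new_page()
    page.goto(start_url)
    return pw, context, page
```

With a setup like this, the first run would need a manual login in the visible window. Later runs reuse whatever session state the profile directory still holds. The caller is responsible for `context.close()` and `pw.stop()` when the run ends.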
02 · Page loading
Some pages take 90 seconds to load
YMS, the yard management tool, can take anywhere from 3 to 90 seconds to fully render. The agent needed to detect when a page was stuck, reload it, re-select the site, and retry without losing track of the case it was working on. A slow page could not be allowed to fail the case.
Fix → loading mask detection + recovery + retry flow
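The detect, recover, retry flow can be sketched independently of any browser library. In this minimal sketch, `load_with_recovery`, `PageStuck`, and the `max_attempts` default are hypothetical; only the 90-second threshold comes from the text above.

```python
from typing import Callable


class PageStuck(Exception):
    """Raised when the loading mask never clears after all retries."""


def load_with_recovery(
    mask_cleared: Callable[[float], bool],
    recover: Callable[[], None],
    wait_seconds: float = 90.0,
    max_attempts: int = 3,
) -> int:
    """Wait for the loading mask to clear, recovering and retrying if it doesn't.

    mask_cleared(timeout) should block for up to `timeout` seconds and return
    True once the mask is gone. recover() should reload the page and re-select
    the site. Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        if mask_cleared(wait_seconds):
            return attempt
        if attempt < max_attempts:
            recover()  # reload + re-select the site, then wait again
    raise PageStuck(f"loading mask still visible after {max_attempts} attempts")
```

With Playwright, `mask_cleared` could wrap something like `page.locator(selector).wait_for(state="hidden", timeout=ms)` and return False on a timeout error, where the selector for the YMS loading mask is whatever the page actually uses. Because the case being worked lives outside this function, a reload doesn't lose track of it.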
03 · Throughput
2,600 cases across two tools is slow
Working sequentially would take hours. The solution: split the queue 50/50 and spawn two independent browser sessions in parallel. Each gets its own set of sites. A done flag signals completion. When both finish, the logs merge and post to Slack.
Fix → parallel partition architecture · 2× throughput
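Here is one way the partitioning could work, as a sketch under stated assumptions: threads stand in for the two independent browser sessions, `threading.Event` plays the role of the done flag, and `split_sites`, `run_partitions`, and `work_site` are hypothetical names.

```python
import threading
from typing import Callable, Sequence


def split_sites(sites: Sequence[str]) -> tuple[list[str], list[str]]:
    """Split the site list into two disjoint halves, so no site is worked twice."""
    mid = (len(sites) + 1) // 2
    return list(sites[:mid]), list(sites[mid:])


def run_partitions(
    sites: Sequence[str],
    work_site: Callable[[str], list[str]],
) -> list[str]:
    """Run two workers in parallel, one per partition, then merge their logs."""
    partitions = split_sites(sites)
    logs: list[list[str]] = [[], []]
    done = [threading.Event(), threading.Event()]

    def worker(index: int) -> None:
        try:
            for site in partitions[index]:
                logs[index].extend(work_site(site))
        finally:
            done[index].set()  # done flag: set even if the worker fails

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for flag in done:
        flag.wait()  # block until both partitions report completion
    for t in threads:
        t.join()
    return logs[0] + logs[1]  # merged log, ready to post as one message
```

Giving each worker its own disjoint set of sites is what avoids collisions in this sketch: neither worker can pick up a case or trailer that belongs to the other's yards.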

Before APEX touches a trailer,
it runs through seven checks.

Every trailer in the yard goes through this list. Pass all seven, or move on to the next one.

Fail any check: skip the trailer, try the next one. If nothing clears all seven, APEX doesn't guess. It escalates. A human takes the case.
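The pass-all-or-skip rule can be sketched as a list of named predicates. The seven checks themselves aren't spelled out in this text, so the two below ("empty" and "clean", borrowed from the manual steps) are illustrative only. `Trailer`, `CHECKS`, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Trailer:
    trailer_id: str
    is_empty: bool
    is_clean: bool


Check = Callable[[Trailer], bool]

# Illustrative checks only; the real list has seven.
CHECKS: list[tuple[str, Check]] = [
    ("empty", lambda t: t.is_empty),
    ("clean", lambda t: t.is_clean),
]


def first_failed_check(trailer: Trailer, checks=CHECKS) -> Optional[str]:
    """Return the name of the first check the trailer fails, or None if it passes all."""
    for name, check in checks:
        if not check(trailer):
            return name
    return None


def pick_trailer(yard: Iterable[Trailer], checks=CHECKS) -> Optional[Trailer]:
    """Return the first trailer that passes every check; None means escalate."""
    for trailer in yard:
        if first_failed_check(trailer, checks) is None:
            return trailer
    return None
```

Returning `None` instead of a best guess keeps the escalation decision explicit: the caller has to handle the no-trailer case rather than silently attaching a marginal one.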
Automation rate
Cases APEX handles end-to-end without any human input
Success rate
No incorrect trailer has ever been assigned
32s · Per case
vs ~2 minutes manually, roughly 73% less processing time
Slack · Live run logs
Every run posts a success log to the TFT Slack channel automatically
131 hrs · Saved every week
Time the TFT team no longer spends on routine pre-assignment work

The team didn't get faster.
They got better.

When APEX handles the routine cases, TFT associates spend their time on the ones that actually need them. The complicated redirections. The yards with no good options. The sites where something's gone wrong and someone needs to make a call.

The best automation doesn't replace judgment.
It protects the time needed to use it.

That's the version of this story I find more interesting than the efficiency numbers. Not faster, but the right kind of work, available more of the time.

The cases APEX can't handle
are the interesting ones.

When APEX hits something it can't resolve (a site not in YMS, an unusual equipment restriction, a case modified mid-flight), it escalates. A human takes over.

That escalation path is intentional. APEX was never designed to handle everything. It was designed to handle the cases that don't need a decision.

Now · Live
APEX is running on live cases across the TFT queue. Logs reviewed daily via Slack. No issues flagged since launch.
Soon · More case types
The same architecture applies to other predictable case categories in the queue. The question is always the same: is there a decision here, or just a process?
Later · Claude API for the judgment calls
The escalated cases, the ones APEX currently can't touch, are the next frontier. Not more clicking, but actual reasoning: look at the notes, check the context, decide whether to try a fallback or pass it up.
Vishal Gayakwar
Product Manager  ·  Amazon  ·  2026