Detailed Analysis
AMD's reported use of 50+ simultaneous Claude Code sessions on a single project served as the catalyst for Fleet, an open-source Python supervisor designed to orchestrate parallel coding agents at scale. The project originated when the author investigated a bug report that AMD filed against the Claude Code repository, in which AMD described running a large fleet of Claude Code instances coordinated through a tool called beads, a git-backed issue tracker that functions as a shared task queue. Inspired by this industrial-scale usage pattern, the author first built a minimal proof of concept in bash before moving to the more capable Python-based Fleet system, now published on GitHub.
Fleet's architecture centers on a centralized task database stored in a user's home directory, eliminating the need to initialize project-specific configurations for each codebase. Tasks are created with metadata including title, description, priority, and dependencies, and the system automatically spawns coding agents within the correct working directory context. The tool supports three distinct coding agents — Claude, Antigravity (agy), and OpenAI's Codex — and is designed to make adding new agent backends a trivial extension. A configurable concurrency ceiling (defaulting to three parallel sessions to avoid API rate limits) governs how many agents run simultaneously, and the author reports having scaled their personal workflows from three to ten or more concurrent sessions.
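The scheduling behavior described above can be sketched roughly as follows. This is a minimal illustration, not Fleet's actual code: the `Task` fields, the status strings, the "lower number means more urgent" priority convention, and the `pick_ready` helper are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    # Hypothetical field names that mirror the metadata the article lists.
    id: str
    title: str
    priority: int = 0  # assumption: lower number = more urgent
    depends_on: list[str] = field(default_factory=list)
    status: str = "pending"  # assumed states: pending, running, done


def pick_ready(tasks: list[Task], max_parallel: int = 3) -> list[Task]:
    """Return pending tasks that can start now without exceeding the ceiling."""
    done = {t.id for t in tasks if t.status == "done"}
    running = sum(1 for t in tasks if t.status == "running")
    free_slots = max(0, max_parallel - running)
    # A task is ready only when every dependency has finished.
    ready = [
        t for t in tasks
        if t.status == "pending" and all(dep in done for dep in t.depends_on)
    ]
    ready.sort(key=lambda t: t.priority)
    return ready[:free_slots]


tasks = [
    Task("a", "Set up schema", priority=1, status="done"),
    Task("b", "Write migration", priority=2, depends_on=["a"]),
    Task("c", "Add API endpoint", priority=1, depends_on=["b"]),
    Task("d", "Update docs", priority=3, status="running"),
    Task("e", "Fix flaky test", priority=0),
]
print([t.id for t in pick_ready(tasks)])  # → ['e', 'b']
```

With one task already running and a ceiling of three, two slots are free; task `c` is held back because its dependency `b` has not finished yet.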
The project surfaces a practical reality about frontier AI coding agents at scale: token consumption is the primary operational bottleneck, not compute or task coordination logic. The author describes rotating between multiple Claude subscriptions as each is exhausted and aggressively pruning CLAUDE.md configuration files and skill plugins to reduce context bloat — including discovering that certain plugins had been inadvertently loaded twice, doubling their token overhead. This kind of operational discipline mirrors patterns familiar from managing cloud infrastructure, where resource efficiency requires careful auditing of what is being consumed and why.
Fleet's emergence reflects a broader trend of developers building orchestration layers on top of AI coding agents, treating them less as interactive tools and more as distributed worker processes. The task queue abstraction that beads provides (shared state, dependency resolution, and status tracking across many concurrent agent sessions) mirrors patterns from distributed task queues and background job frameworks such as Celery or Sidekiq, adapted for the non-deterministic, context-sensitive nature of LLM-based code generation. AMD's apparent use of similar infrastructure for serious engineering work suggests that multi-agent coding pipelines are moving from experimental curiosity to legitimate production methodology at major technology firms.
The development of Fleet also underscores an emerging design space around agent coordination primitives. The combination of a centralized task queue, per-task coder and model specification, and real-time monitoring commands (for context consumption, logs, and execution plans) constitutes a lightweight but functional agent operations platform. As Anthropic continues developing Claude Code and its associated APIs, third-party tooling of this kind is likely to proliferate, pushing the ecosystem toward standardized interfaces for task handoff, session lifecycle management, and multi-agent context sharing — challenges that grow significantly more complex as the number of concurrent agents increases.
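One way to picture per-task coder and model selection is a small registry that maps a coder name to a function building the command to launch. Everything in this sketch is hypothetical: the `BACKENDS` registry, the `register` decorator, the `build_command` helper, and the placeholder executable names do not reflect the real agent CLIs or Fleet's own code.

```python
from typing import Callable

# Maps a coder name to a function that builds an argv list for that backend.
BACKENDS: dict[str, Callable[[str, str], list[str]]] = {}


def register(name: str):
    """Decorator that adds a command builder to the registry under `name`."""
    def wrap(builder):
        BACKENDS[name] = builder
        return builder
    return wrap


@register("claude")
def _claude(model: str, prompt: str) -> list[str]:
    # Placeholder argv, not the real Claude Code command line.
    return ["placeholder-claude-cli", model, prompt]


@register("codex")
def _codex(model: str, prompt: str) -> list[str]:
    # Placeholder argv, not the real Codex command line.
    return ["placeholder-codex-cli", model, prompt]


def build_command(coder: str, model: str, prompt: str) -> list[str]:
    """Look up the backend for a task and build its launch command."""
    try:
        return BACKENDS[coder](model, prompt)
    except KeyError:
        raise ValueError(f"unknown coder: {coder!r}") from None


print(build_command("codex", "some-model", "Fix the failing test"))
```

Under this design, supporting another agent means registering one more builder function; the code that dispatches tasks does not change.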