Where I'm at with AI Assisted Building + Current and Future Workflow Overview

A developer has evolved from manually writing code to leveraging AI agents that autonomously handle most software development tasks, using a sophisticated system called Ferdinand composed of Claude Code, Codex, and various orchestrated agents and skills. Over two years, this workflow has transitioned from simple prompt engineering to a complex system that maintains deterministic oversight while allowing AI models to work autonomously, including overnight feature implementations at production quality. The current approach involves cloud agents for pre-work preparation, comprehensive architectural specification before task decomposition, and multiple parallel Claude instances managed through fixed gate sequences to ensure quality control.

Detailed Analysis

A software developer documents a two-year trajectory from rudimentary AI-assisted coding to a nearly fully autonomous agentic software development workflow, with Claude Code serving as the central orchestration layer of a multi-model harness. The author traces an evolution that began with copy-pasting between ChatGPT and an IDE, an early experience characterized as "a slightly faster Stack Overflow search," and moved progressively through prompt engineering, a "human relay pattern" in Cursor, and eventually deep context engineering with sub-agents, skills, and custom agent personas. The author identifies a decisive inflection point in November 2025, coinciding with the release of Claude Opus 4.5 and Codex 5.3, and describes it as the largest step increase in code quality that AI had delivered to that point, one that enabled large overnight feature implementations with minimal supervision.

The author's current technical setup is notably terminal-centric, built around Claude Code as the orchestration hub that directs both Anthropic's Claude models and OpenAI's Codex through a system of runbooks, skills, commands, and hooks. Adjacent tooling includes Ghostty as the terminal environment, Wispr Flow for voice-based steering, and a custom documentation maintenance suite anchored by an Obsidian vault and QMD — a system designed to persist architectural context, images, and notes so that agents do not require constant reorientation across long-running projects. Token efficiency is treated as a first-class engineering concern, with tools like grep AI, jcodemunch, and a Rust-based token reducer explicitly deployed to manage input and output costs at scale. The internal name for this harness at the author's workplace — "Ferdinand," after the Disney character — reflects both the degree of institutional investment in the system and its characterization as a gentle but capable autonomous entity.

The article's most analytically substantive passage concerns the author's central problem: how to maintain determinism and architectural integrity in an inherently probabilistic system while physically removing themselves from the steering loop. The author draws an explicit analogy to engineering management — noting that effective technical leaders remain close enough to architecture to prevent systemic errors without performing the implementation work themselves — and frames this as a model for their relationship to the agentic harness. The challenge of preventing hallucinations, enforcing instruction rigor, and sustaining continuous execution quality is acknowledged as an unsolved frontier even at the current capability level, with the author's ongoing work described as the effort to encode their judgment and steering capacity into the system itself rather than exercising it in real time.
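To make the idea of deterministic oversight concrete, the following is a minimal sketch of a fixed gate sequence of the kind the summary mentions. It is not the author's implementation: the `TaskResult` fields, the gate names, and the `run_gates` helper are all hypothetical, and a real harness would presumably call out to test runners and reviewers rather than inspect flags. The point it illustrates is that the agent's output may be probabilistic while the checks, and the order in which they run, are ordinary deterministic code.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TaskResult:
    """Hypothetical record of what an agent produced for one task."""
    task_id: str
    diff: str
    tests_passed: bool
    spec_sections_covered: set[str] = field(default_factory=set)


# A gate is a plain function: result in, (passed, failure_reason) out.
Gate = Callable[[TaskResult], tuple[bool, str]]


def gate_nonempty_diff(result: TaskResult) -> tuple[bool, str]:
    return bool(result.diff.strip()), "diff must not be empty"


def gate_tests(result: TaskResult) -> tuple[bool, str]:
    return result.tests_passed, "test suite must pass"


def make_spec_gate(required: set[str]) -> Gate:
    """Build a gate that checks coverage of the required spec sections."""
    def gate(result: TaskResult) -> tuple[bool, str]:
        missing = required - result.spec_sections_covered
        return not missing, f"missing spec sections: {sorted(missing)}"
    return gate


def run_gates(result: TaskResult, gates: list[tuple[str, Gate]]) -> tuple[bool, list[str]]:
    """Run gates in their fixed order and stop at the first failure."""
    log: list[str] = []
    for name, gate in gates:
        passed, reason = gate(result)
        log.append(f"{name}: pass" if passed else f"{name}: FAIL ({reason})")
        if not passed:
            return False, log
    return True, log


# The sequence is data, so every task passes through the same checks in the same order.
GATES: list[tuple[str, Gate]] = [
    ("diff", gate_nonempty_diff),
    ("tests", gate_tests),
    ("spec", make_spec_gate({"auth", "logging"})),
]

if __name__ == "__main__":
    result = TaskResult("task-1", "+ added login handler", True, {"auth"})
    accepted, log = run_gates(result, GATES)
    print(accepted)
    for line in log:
        print(line)
```

Under this design, a failed gate would send the task back to the agent with the recorded reason instead of to a human, which is one plausible way to encode steering judgment into the system rather than exercise it in real time.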

This account reflects a broader pattern in advanced AI adoption among senior software practitioners, where the limiting factor has shifted from model capability to workflow architecture and context management. The emphasis on "context engineering," as distinct from prompt engineering, signals a maturation in how sophisticated users conceptualize their role: less as instructors of individual model interactions and more as architects of persistent, stateful systems that can act on behalf of human intent over extended time horizons. The author's explicit concern with preserving understanding while delegating execution also surfaces a tension that is becoming central to professional AI use. As models absorb more cognitive labor, the human's value migrates from production toward judgment, taste, and constraint-setting, which demands new frameworks for thinking about what meaningful oversight looks like at scale.


