Spent an evening making a launch video with Claude + Blender MCP

A solo developer created a promotional video for a habit tracker app using Claude with a Blender MCP server, conversationally describing the desired aesthetic rather than manually coding the 3D effects. The process involved auto-cropping an iPhone screen recording and mapping it to a 3D phone in a Miyazaki-inspired atmosphere, with iterative refinement after the initial render proved too aggressive. Completed in roughly 90 minutes, the project generated approximately 800 lines of Python code handling complex effects including camera trajectories, volumetric fog, and particle animations.

Detailed Analysis

A solo developer building Spira, a habit-tracking app that visualizes habits as blooming flowers, used Claude paired with a Blender MCP (Model Context Protocol) server to produce a polished 10-second vertical launch video in roughly 90 minutes, a task that would conventionally take days of specialized 3D work. The developer described the desired aesthetic in natural language: a phone floating in a "Miyazaki-meets-Apple" atmosphere with drifting dust motes, a slow camera reveal, and a flower close-up. Over three full renders, Claude translated those prompts into approximately 800 lines of Python executed directly inside Blender, handling the camera trajectory, emissive materials, volumetric fog, and particle staggering entirely through conversational iteration.
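To make "camera trajectory" and "particle staggering" concrete, here is a minimal sketch of the kind of arithmetic such a script might contain. It is not the developer's code, which the account does not reproduce. The frame rate, the coordinates, the gap between particles, and every function name are assumptions chosen for illustration. The math is kept in plain Python so it runs outside Blender.

```python
FPS = 24          # assumed frame rate; the account does not state one
DURATION_S = 10   # video length given in the account


def smoothstep(t: float) -> float:
    """Ease-in-out curve: 0 at t=0, 1 at t=1, zero slope at both ends."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def lerp3(a, b, t):
    """Linear interpolation between two 3D points."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def camera_keyframes(start, end, fps=FPS, duration_s=DURATION_S, step=24):
    """Return (frame, location) pairs for a slow, eased move from start to end."""
    last = fps * duration_s
    frames = list(range(1, last + 1, step))
    if frames[-1] != last:
        frames.append(last)  # always key the final frame
    return [
        (f, lerp3(start, end, smoothstep((f - 1) / (last - 1))))
        for f in frames
    ]


def staggered_start_frames(count, first_frame=1, gap=3):
    """Offset each particle's start so the motes do not all move in lockstep."""
    return [first_frame + i * gap for i in range(count)]


# Hypothetical coordinates: the camera starts far back and high, ends close and level.
keys = camera_keyframes((0.0, -8.0, 2.0), (0.0, -3.0, 1.0))

# Inside Blender, each pair could be applied to a camera object `cam` with:
#   cam.location = location
#   cam.keyframe_insert(data_path="location", frame=frame)
```

The easing curve is what makes the reveal read as "slow": the camera accelerates out of its starting position and decelerates into the final framing instead of moving at constant speed.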

Several specific behaviors documented in the account illustrate how Claude approached the creative problem. Before designing the shot, the model assembled what the developer described as a "committee" of cinematographic references — Emmanuel Lubezki, Hokusai, and James Cameron — which initially seemed overengineered but informed a visually coherent output. Claude also autonomously handled a technical preprocessing step: upon receiving an iPhone screen recording, it auto-cropped the iOS recording indicator bar using ffmpeg before mapping the footage onto the 3D phone screen, bridging the gap between real-world asset capture and 3D compositing without explicit instruction. When the first render produced an overly dramatic result — a Fibonacci petal explosion with glowing roots — a single natural-language note ("make it gentler, like a Miyazaki dream") was sufficient to reorient the visual direction.
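The auto-crop step can be sketched as follows. The account says only that ffmpeg was used to remove the iOS recording indicator bar; the exact command is not given, so the frame size, the bar height, the file names, and the helper name below are hypothetical. The sketch relies on ffmpeg's `crop` video filter, whose arguments are `out_w:out_h:x:y`.

```python
def crop_top_bar_args(src, dst, width, height, bar_px):
    """Build an ffmpeg command that drops `bar_px` rows from the top of the frame."""
    if not 0 < bar_px < height:
        raise ValueError("bar_px must be between 0 and the frame height")
    out_h = height - bar_px
    out_h -= out_h % 2  # keep the height even, which common H.264 settings require
    # crop=out_w:out_h:x:y keeps an out_w x out_h window with top-left corner (x, y)
    video_filter = f"crop={width}:{out_h}:0:{bar_px}"
    return ["ffmpeg", "-y", "-i", src, "-vf", video_filter, "-c:a", "copy", dst]


# Hypothetical 1170x2532 recording with an assumed 120 px indicator bar.
args = crop_top_bar_args("recording.mp4", "cropped.mp4", 1170, 2532, 120)

# To run it (requires ffmpeg on PATH):
#   import subprocess
#   subprocess.run(args, check=True)
```

Building the command as an argument list rather than a shell string avoids quoting problems with file names that contain spaces.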

The workflow demonstrates a meaningful shift in how MCP integrations expand Claude's practical utility beyond text generation. By connecting directly to Blender's Python scripting environment, Claude gained the ability to execute, observe, and iterate on complex generative 3D tasks in a tight feedback loop. This is distinct from simply generating code snippets for a developer to paste — the model was operating as an active participant in a render pipeline, making sequential decisions across camera, lighting, material, and particle systems. The 90-minute timeline for a production-ready asset underscores how agentic tool use compresses creative workflows that once required specialist knowledge in both cinematography and 3D software.

This account fits a broader pattern of developers using Claude's agentic capabilities to lower the skill barrier in creative production. The rise of MCP servers for tools like Blender, alongside similar integrations for code execution environments and design tools, reflects an industry-wide move toward AI systems that don't just advise but operate. For solo developers and indie studios in particular, the implications are significant: high-quality motion graphics, previously gated behind expensive software subscriptions and years of Blender expertise, become accessible through iterative natural-language direction. The Spira example is notable not because the output was generated automatically, but because the human remained the creative director while Claude handled the technical translation layer entirely.

The broader trend is AI systems functioning as domain-fluent collaborators rather than general-purpose generators. Claude's invocation of specific cinematographic references before designing the shot suggests a model reasoning about aesthetic coherence at a level above simple instruction-following, synthesizing visual language from art history, film, and product design to serve a specific brief. As MCP ecosystems mature and more creative tools expose scriptable interfaces, the pattern demonstrated here (describe intent, iterate conversationally, receive production-ready output) is likely to become a standard workflow for independent creators working without dedicated production teams.
