Stack Implementation
This section outlines best-in-class, open-source technologies used for each layer of the Continuum Flow architecture. Each selection is optimized for performance, scalability, and compatibility with modern frameworks like Next.js 15 and React 19.
| Layer | Primary Recommendation | Rationale & Key Features | Strong Alternatives |
|---|---|---|---|
| Monorepo Tooling | Nx | Best for complex, polyglot projects. Offers a rich plugin ecosystem (including Python), advanced dependency graphing, and robust caching. | |
| Web Framework | Next.js | The industry standard for building full-stack React applications. Provides optimized performance with SSR, SSG, and React Server Components. | |
| Universal Framework | Expo | Build for Web, iOS, and Android from a single TypeScript codebase. Features a powerful CLI and OTA updates. | |
| API Layer | tRPC | Enables end-to-end typesafe APIs with zero code generation. Unbeatable DX in a full-stack TS monorepo. | |
| Database | PostgreSQL | Powerful, open-source relational database known for reliability and performance at scale. | |
| Database ORM | Drizzle ORM | Lightweight, performant, and type-safe SQL query builder with SQL-like syntax. | |
| Authentication | better-auth | Comprehensive, framework-agnostic auth for TypeScript. Self-hostable and avoids vendor lock-in. | |
| AI/ML Services | FastAPI (Python) | High-performance Python web framework ideal for building AI/ML APIs and leveraging Python's ML ecosystem. | |
| Headless CMS | PayloadCMS | Developer-first, open-source headless CMS built with TS and React. Deep Next.js integration. | |
| Client Data Fetching | TanStack Query | De-facto standard for managing server state in React. Provides caching and background refetching. | |
| UI Data Grids | TanStack Table | Headless UI library for building powerful and fully customizable data tables and grids. | |
| E2E Testing | Playwright | Modern, reliable E2E testing framework with true cross-browser support and auto-waits. | |
| Component Testing | Storybook | Essential tool for developing UI components in isolation. Serves as living documentation. | |
The AI Model Zoo (Execution Layer)
We utilize a Best-in-Class Modular Approach rather than a single provider. This prevents vendor lock-in and allows upgrading specific components (e.g., swapping the Image Generator without breaking the Text Analyzer).
[!NOTE] Cost Analysis: A detailed breakdown of the costing layer is available in the Cost Estimator.
| Component | Model Selected | Provider | Rationale |
|---|---|---|---|
| Logic / Text | Claude 3.5 Sonnet | Anthropic API | Superior reasoning capabilities and a large context window (200k tokens) for analyzing full chapters. |
| Image Gen | Flux.1 [Dev] | Replicate / Fal.ai | Currently beats Midjourney in prompt adherence and text rendering. |
| Video Gen | Luma Dream Machine | Luma API | High temporal coherence. Relies on the "Keyframe" feature for control. |
| Audio / TTS | ElevenLabs (Turbo v2) | ElevenLabs API | Low latency and highest emotional range. |
| Lip Sync | SyncLabs / SadTalker | API / Local | Decoupled lip-syncing ensures we can perfect audio performance before mapping to video. |
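One way to picture the modular approach is a typed registry in which each component slot holds its own model and provider, so replacing one slot leaves the others untouched. The sketch below is illustrative only: the `ModelZoo` type, the `swapComponent` helper, and the slot names are hypothetical, and real provider clients would sit behind each slot.

```typescript
// Hypothetical registry mirroring the table above; not an actual project API.
type Component = "text" | "image" | "video" | "audio" | "lipSync";

interface ProviderChoice {
  model: string;
  provider: string;
}

type ModelZoo = Readonly<Record<Component, ProviderChoice>>;

const defaultZoo: ModelZoo = {
  text: { model: "Claude 3.5 Sonnet", provider: "Anthropic API" },
  image: { model: "Flux.1 [Dev]", provider: "Replicate / Fal.ai" },
  video: { model: "Luma Dream Machine", provider: "Luma API" },
  audio: { model: "ElevenLabs (Turbo v2)", provider: "ElevenLabs API" },
  lipSync: { model: "SyncLabs / SadTalker", provider: "API / Local" },
};

// Returns a new registry with one slot replaced; every other slot is carried over as-is.
function swapComponent(
  zoo: ModelZoo,
  component: Component,
  choice: ProviderChoice
): ModelZoo {
  return { ...zoo, [component]: choice };
}
```

Because `swapComponent` returns a copy instead of mutating the registry, code that holds the old registry (for example a running text-analysis job) keeps seeing the configuration it started with.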
Advanced Document Management
We treat the screenplay not just as text, but as executable documentation.
The Quarto (QMD) Pipeline
- Source: `Chapter_01.md` (Raw Text).
- Processing: The Agent converts this into `Script_01.qmd` (Quarto Markdown).
- Metadata Injection: The Agent embeds JSON metadata (Camera angles, Lighting) inside YAML headers or hidden code blocks within the QMD.
- Render:
  - For Humans: Quarto renders a clean PDF looking like a Hollywood script (Courier font, proper indentation).
  - For Robots: The system parses the underlying JSON data blocks from the same file to drive the video generator.
[!TIP] Single Source of Truth: The readable PDF script reviewed by humans is the exact same code that generates the video.
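The "For Robots" path could be sketched roughly as follows. This is a minimal example under stated assumptions: it assumes the metadata lives in fenced blocks whose info string is exactly `json`, and the `ShotMetadata` fields and the `extractShotMetadata` name are hypothetical. How those blocks are hidden from the rendered PDF is left out of scope.

```typescript
// Hypothetical per-shot metadata; the real field names are not specified in this document.
interface ShotMetadata {
  scene: string;
  camera: string;
  lighting: string;
}

// Fence marker assembled from single characters so this example does not nest code fences.
const FENCE = "`" + "`" + "`";

// Walks the QMD line by line and parses every fenced block tagged "json".
// Blocks with any other info string are skipped.
function extractShotMetadata(qmd: string): ShotMetadata[] {
  const shots: ShotMetadata[] = [];
  let buffer: string[] | null = null; // non-null while inside a json block

  for (const line of qmd.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (buffer === null) {
      if (trimmed === FENCE + "json") {
        buffer = [];
      }
    } else if (trimmed === FENCE) {
      shots.push(JSON.parse(buffer.join("\n")) as ShotMetadata);
      buffer = null;
    } else {
      buffer.push(line);
    }
  }
  return shots;
}
```

A production parser would likely validate each block against a schema before handing it to the video generator, since `JSON.parse` alone only guarantees well-formed JSON.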
Audio & Lip Sync Architecture
Professional production requires Decoupling. We generally avoid “all-in-one” generators to maintain granular control over performance.
- Step 1: Audio Production (The Radio Play)
  - Generate full audio track using ElevenLabs.
  - Forced Alignment: Use tools like Gentle or OpenAI Whisper to get exact timing of every word.
- Step 2: Video Generation (The Silent Film)
  - Generate the 8-second video visuals based on the visual prompt.
- Step 3: The Sync Pass (Post-Process)
  - Lip-Sync: Run Video + Audio through a dedicated Sync engine (Wav2Lip/SyncLabs).
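To illustrate how word timings from forced alignment might be matched to the 8-second clips, here is a small sketch. It assumes the clips are consecutive 8-second windows over the audio track, which this document does not state, and the `WordTiming` shape and `groupWordsByClip` name are hypothetical; the actual output format of the alignment tool will differ.

```typescript
// Hypothetical alignment record; real aligners emit their own formats.
interface WordTiming {
  word: string;
  startSec: number; // assumed non-negative
  endSec: number;
}

// Buckets words by the clip window in which they start.
// Assumes clips are back-to-back windows of `clipSeconds` each.
function groupWordsByClip(
  words: WordTiming[],
  clipSeconds: number = 8
): WordTiming[][] {
  const clips: WordTiming[][] = [];
  for (const w of words) {
    const clipIndex = Math.floor(w.startSec / clipSeconds);
    // Pad with empty clips so silent stretches still get a slot.
    while (clips.length <= clipIndex) {
      clips.push([]);
    }
    clips[clipIndex].push(w);
  }
  return clips;
}
```

Words are assigned by start time only, so a word that straddles a clip boundary lands in the earlier clip; a sync pass that needs exact boundaries would have to split or pad such words.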
5. Asset Management: “The Cloud-Local Mirror”
Team collaboration on 50GB+ video projects is challenging. We solve this with a “Split-Brain” storage strategy.
Storage Strategy
- Code & Scripts: `GitHub` (.md, .qmd, .json) - Version controlled, lightweight.
- Heavy Assets: `AWS S3 / Cloudflare R2` (.mp4, .png, .wav) - Cheap object storage.
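The split above amounts to a routing rule keyed on file extension. A minimal sketch, using only the extensions listed in this section; the `storageTargetFor` helper and the `StorageTarget` labels are hypothetical names introduced for illustration:

```typescript
type StorageTarget = "git" | "object-storage";

// Extensions taken from the storage strategy list; anything else is unclassified.
const GIT_EXTENSIONS = [".md", ".qmd", ".json"];
const HEAVY_EXTENSIONS = [".mp4", ".png", ".wav"];

// Returns the lowercase extension of a bare file name, or "" if there is none.
function extensionOf(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  return dot <= 0 ? "" : fileName.slice(dot).toLowerCase();
}

// Decides where a file belongs; null means "not covered by the strategy".
function storageTargetFor(fileName: string): StorageTarget | null {
  const ext = extensionOf(fileName);
  if (GIT_EXTENSIONS.indexOf(ext) !== -1) return "git";
  if (HEAVY_EXTENSIONS.indexOf(ext) !== -1) return "object-storage";
  return null;
}
```

Returning `null` for unknown extensions, instead of defaulting to one side, keeps large files of an unexpected type from being committed to the repository by accident.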
The Sync Mechanism (`npm run asset:sync`)
- Cloud Worker renders video -> Uploads to S3 -> Pushes Manifest to Database.
- Local CLI detects new manifest.
- Node.js `fs` generates folder structure locally matching the Chapter/Scene hierarchy.
- Pulls only the new video files to your local folder.
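The local half of the sync steps can be sketched as a pure planning function: given a manifest and the files already on disk, work out which folders to create and which files still need downloading. The manifest schema, the `planSync` name, and the example URLs used with it are assumptions, since the real manifest format is not described here.

```typescript
// Hypothetical manifest entry; the actual schema is not specified in this document.
interface ManifestEntry {
  chapter: string;  // e.g. "Chapter_01"
  scene: string;    // e.g. "Scene_02"
  fileName: string; // e.g. "take_01.mp4"
  url: string;      // download location in object storage
}

interface SyncPlan {
  // Folders to create, for example with fs.mkdirSync(dir, { recursive: true }).
  directories: string[];
  // Files that are in the manifest but not yet present locally.
  downloads: { url: string; localPath: string }[];
}

function planSync(
  manifest: ManifestEntry[],
  existingPaths: string[],
  root: string
): SyncPlan {
  const directories: string[] = [];
  const downloads: { url: string; localPath: string }[] = [];

  for (const entry of manifest) {
    const dir = [root, entry.chapter, entry.scene].join("/");
    const localPath = dir + "/" + entry.fileName;
    if (directories.indexOf(dir) === -1) {
      directories.push(dir);
    }
    // Skip anything already downloaded so only new files are pulled.
    if (existingPaths.indexOf(localPath) === -1) {
      downloads.push({ url: entry.url, localPath });
    }
  }
  return { directories, downloads };
}
```

Keeping the plan separate from the file system and network calls makes the "pull only new files" rule easy to test without touching disk or S3.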
6. Execution Environment
| Task | Environment | Why? |
|---|---|---|
| Writing / Logic | Cloud (Anthropic) | Requires large-scale GPU/TPU compute for LLM reasoning. |
| Folder Gen / Management | Local (Node.js) | Fast file system operations; zero latency UI updates. |
| Image/Video Rendering | Cloud (Replicate) | Requires A100 GPUs. Too slow and too hot to run on a local MacBook. |
| Final Assembly | Hybrid | FFmpeg WASM for quick previews; Cloud Lambda for 4K export. |
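The table above could be encoded as a small lookup so that each task is dispatched to its environment in one place. This is only a sketch: the task keys, the `Placement` shape, and the helper names are hypothetical, and treating FFmpeg WASM previews as "local" is an assumption about where that preview runs.

```typescript
type Task =
  | "writing"
  | "folderManagement"
  | "rendering"
  | "previewAssembly"
  | "exportAssembly";

interface Placement {
  environment: "cloud" | "local";
  runner: string;
}

// Mirrors the table; "Final Assembly" is split into its preview and export halves.
const PLACEMENTS: Record<Task, Placement> = {
  writing: { environment: "cloud", runner: "Anthropic" },
  folderManagement: { environment: "local", runner: "Node.js" },
  rendering: { environment: "cloud", runner: "Replicate" },
  previewAssembly: { environment: "local", runner: "FFmpeg WASM" }, // assumed local
  exportAssembly: { environment: "cloud", runner: "Cloud Lambda" },
};

function placementFor(task: Task): Placement {
  return PLACEMENTS[task];
}

// Picks the assembly path: quick previews versus the 4K export.
function assemblyTask(is4kExport: boolean): Task {
  return is4kExport ? "exportAssembly" : "previewAssembly";
}
```

Typing the lookup as `Record<Task, Placement>` means adding a new task without assigning it an environment fails at compile time.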