Differential State Management

In software engineering, the "Diff Problem" is that LLMs are bad at precise edits: asked to change one part of a file, they tend to rewrite more than was requested. In Novel-to-Video, this is the root cause of Video Flicker and Character Hallucination.

When you ask an AI to generate Scene 2, it doesn't "edit" Scene 1; it re-imagines it. It accidentally "refactors" your main character's face, their clothes, or the room layout because it has no way to execute a precise "diff."

The Mapping: Code vs. Narrative

We build the Cursor/Composer "Diff Architecture" directly into Continuum Flow.

The "Cursor" Model (Code)

The "Codebase": file.ts (1,000 lines)
The Diff Problem: The LLM creates syntax errors by deleting a closing bracket '}'.
The Goal: Apply the change ONLY to the function.

The "Continuum Flow" Model

The "Visual State": TOON File (Scene & Characters)
The Hallucination: The LLM gives the character a sword but changes the shirt from Blue to Red.
The Goal: Add the sword, FREEZE all other pixels.

The Solution: "Narrative Edit Trajectories"

You can’t retrain a model easily, but you can force your Orchestrator to use State Diffs instead of State Descriptions.

The "Anti-Regeneration" Rule

Standard prompting forces re-rendering of known assets.

"Generate Scene 2: Arjun is standing in the cave. He is wearing armor. He draws his sword."

Risk: Armor design changes, Cave lighting shifts.

The "Diff-Based" Protocol

Orchestrator generates a PATCH, not a new file.
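
One way an orchestrator could produce such a patch is to compare the locked state with the intended next state and emit a command only for the attributes that differ. The sketch below is a minimal illustration under stated assumptions: the nested-dictionary state layout, the `compute_patch` helper, and the `Setting`/`Walls` keys for the background are hypothetical and not part of Continuum Flow.

```python
from typing import Any, Dict, List

# Hypothetical layout: entity name -> attribute name -> value.
State = Dict[str, Dict[str, Any]]


def compute_patch(old: State, new: State) -> List[Dict[str, Any]]:
    """Return one UPDATE command per attribute whose value differs."""
    patch = []
    for entity, attrs in new.items():
        for attr, value in attrs.items():
            if old.get(entity, {}).get(attr) != value:
                patch.append({
                    "operation": "UPDATE",
                    "target": f"{entity}.{attr}",
                    "value": value,
                    "constraint": "PRESERVE_ALL_OTHER_ATTRIBUTES",
                })
    return patch


frame_01: State = {
    "Arjun": {"Wear": "Rusty Armor", "Face": "Scarred", "Hand": "Empty"},
    "Bg": {"Setting": "Cave", "Walls": "Wet"},  # assumed keys
}
frame_02: State = {
    "Arjun": {"Wear": "Rusty Armor", "Face": "Scarred", "Hand": "Iron Sword"},
    "Bg": {"Setting": "Cave", "Walls": "Wet"},
}

patch = compute_patch(frame_01, frame_02)
print(patch)
```

Only the hand attribute differs between the two states, so the patch contains a single command, and everything else never enters the prompt for re-rendering.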

Step A: Baseline (Locked State)

State_Frame_01:
Arjun: [Wear: Rusty Armor], [Face: Scarred], [Hand: Empty]
Bg: [Cave, Wet Walls]

Step B: The Diff Command

{
  "operation": "UPDATE",
  "target": "Arjun.Hand",
  "value": "Iron Sword",
  "constraint": "PRESERVE_ALL_OTHER_ATTRIBUTES"
}
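
To make the two steps concrete, here is a minimal sketch of applying the diff command to the baseline. It assumes the state is held as nested dictionaries and that `target` is an `Entity.Attribute` path; the `apply_diff` helper and the background keys are hypothetical names chosen for illustration.

```python
import copy
from typing import Any, Dict

State = Dict[str, Dict[str, Any]]


def apply_diff(state: State, diff: Dict[str, Any]) -> State:
    """Return a new state in which only the targeted attribute has changed."""
    if diff["operation"] != "UPDATE":
        raise ValueError(f"unsupported operation: {diff['operation']}")
    entity, attr = diff["target"].split(".", 1)
    if entity not in state or attr not in state[entity]:
        raise KeyError(f"unknown target: {diff['target']}")
    new_state = copy.deepcopy(state)  # the baseline itself stays locked
    new_state[entity][attr] = diff["value"]
    return new_state


state_frame_01: State = {
    "Arjun": {"Wear": "Rusty Armor", "Face": "Scarred", "Hand": "Empty"},
    "Bg": {"Setting": "Cave", "Walls": "Wet"},  # assumed keys
}
diff = {
    "operation": "UPDATE",
    "target": "Arjun.Hand",
    "value": "Iron Sword",
    "constraint": "PRESERVE_ALL_OTHER_ATTRIBUTES",
}

state_frame_02 = apply_diff(state_frame_01, diff)
print(state_frame_02["Arjun"])
```

Copying the baseline rather than mutating it keeps Frame 1 available as a reference, so a later frame can still be checked against it.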

Implementation: The "Visual Patching" Workflow

To leverage "Search and Replace" logic, we implement a specific tool in the Orchestrator that acts like git apply.

State Patching Algorithm

Step 1: Input Patch. Receive the DIFF packet (Target, New Value).

Step 2: Verify State (Safety Guard). Check that the Old Value matches the Current State.

Step 3: Apply Patch. Update the State Buffer and freeze all other pixels.
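
The verify-then-apply flow above can be sketched as follows. This is an illustration, not the Orchestrator's actual tool. It assumes the packet also carries the expected old value, which the verification step needs even though the packet is listed as only (Target, New Value). The `PatchRejected` exception and `apply_verified_patch` function are hypothetical names.

```python
from typing import Any, Dict, List

State = Dict[str, Dict[str, Any]]


class PatchRejected(Exception):
    """Raised when a patch does not match the current state."""


def apply_verified_patch(
    state_buffer: State, target: str, old_value: Any, new_value: Any
) -> List[str]:
    """Apply the patch in place and return the attributes that stay frozen."""
    entity, attr = target.split(".", 1)
    if entity not in state_buffer or attr not in state_buffer[entity]:
        raise PatchRejected(f"{target}: no such attribute")
    current = state_buffer[entity][attr]
    # Safety guard: refuse to patch a state that has drifted.
    if current != old_value:
        raise PatchRejected(f"{target}: expected {old_value!r}, found {current!r}")
    state_buffer[entity][attr] = new_value
    # Everything that was not targeted is reported as frozen.
    return [
        f"{e}.{a}"
        for e, attrs in state_buffer.items()
        for a in attrs
        if (e, a) != (entity, attr)
    ]


buffer: State = {
    "Arjun": {"Wear": "Rusty Armor", "Face": "Scarred", "Hand": "Empty"},
    "Bg": {"Setting": "Cave", "Walls": "Wet"},  # assumed keys
}
frozen = apply_verified_patch(buffer, "Arjun.Hand", "Empty", "Iron Sword")
print(frozen)
```

Like git apply rejecting a hunk whose context no longer matches, a second attempt to apply the same patch fails here, because the hand is no longer "Empty".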

Why This Fixes Video Generation

The biggest issue in AI video is Temporal Stability.

✕ Without Diffs: Frame 1 and Frame 2 are treated as two different paintings. The AI "guesses" the continuity.

✓ With Diffs: You instruct the Video AI (ControlNet): "Keep 90% exactly the same. Only use diffusion to change the pixels around the hand."
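
The idea of freezing pixels can be shown at the simplest possible level with a masked composite: copy the previous frame, then take newly generated pixels only inside the region the diff touched. This is a pixel-level toy, not how ControlNet or any specific video model works, since real pipelines usually apply such masks during generation (for example via inpainting). The `composite` function, the grayscale list-of-rows frame format, and the box coordinates are assumptions made for illustration.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of grayscale pixel values, for illustration only


def composite(prev_frame: Frame, generated: Frame,
              box: Tuple[int, int, int, int]) -> Frame:
    """Keep prev_frame everywhere except inside box, where generated is used.

    box is (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    out = [row[:] for row in prev_frame]  # start from the frozen frame
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = generated[y][x]
    return out


prev_frame: Frame = [[10] * 4 for _ in range(4)]  # locked Frame 1
generated: Frame = [[99] * 4 for _ in range(4)]   # fully re-imagined frame
hand_box = (1, 1, 3, 3)                           # hypothetical region around the hand

frame_02 = composite(prev_frame, generated, hand_box)
changed = sum(
    a != b
    for row_a, row_b in zip(prev_frame, frame_02)
    for a, b in zip(row_a, row_b)
)
print(f"{changed} of 16 pixels changed")
```

Even though the generator re-imagined every pixel, only the four pixels inside the box reach the output, and the other twelve are carried over unchanged from Frame 1.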