Align Shots to Audio
Recomputes each scene's start_time and duration from Whisper word_timestamps so that shots stay in sync with the voiceover that is actually playing. This fixes drift when the audio is regenerated at runtime.
Badge: Direct
Node type: utility:align_shots_to_audio
Category: process
Pricing
- Cost: ~1 credit per unit
Canvas ports
These appear as port handles on the left side of the node.
| ID | Label | Details |
|---|---|---|
| scenes | Scenes (JSON) | JSON. Array of scene definitions with narration text. Can come from an upstream JSON source or from this node's Scenes JSON config field. |
| wordTimestamps | Word Timestamps | JSON (required). Array of {word, start, end} timestamps from a Whisper STT node. Required: without this, alignment falls back to a linear estimate. |
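
The linear fallback mentioned for wordTimestamps is not specified on this page. As a rough illustration only, the sketch below shows one plausible linear estimate: it splits a known total audio duration across scenes in proportion to each scene's narration word count. The `narration` field name, the `total_duration` argument, and the function name are assumptions made for the example, not part of the node's documented schema.

```python
from typing import Any


def linear_estimate(
    scenes: list[dict[str, Any]], total_duration: float
) -> list[dict[str, Any]]:
    """Spread total_duration across scenes by narration word count.

    Assumption: each scene has a "narration" string. The field name is
    hypothetical; the node's real schema may differ.
    """
    # Count words per scene; treat empty narration as one word so every
    # scene receives a non-zero share of the timeline.
    counts = [max(1, len(str(s.get("narration", "")).split())) for s in scenes]
    total_words = sum(counts)

    shots = []
    cursor = 0.0  # running start time in seconds
    for scene, count in zip(scenes, counts):
        duration = total_duration * count / total_words
        shots.append(
            {**scene, "start_time": round(cursor, 3), "duration": round(duration, 3)}
        )
        cursor += duration
    return shots
```

An estimate like this ignores pauses and speaking-rate changes, which is why wiring real word timestamps into the port gives tighter sync.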
Sidebar config
These render as form fields in the right-side config panel when the node is selected.
| ID | Label | Details |
|---|---|---|
| hookDurationSec | Hook Duration (sec) | NUMBER · 0…60. Fixed offset for any pre-made hook scene with baked audio. Non-hook scenes are offset by this amount so the voiceover plays after the hook video ends. |
| matchWindow | Fuzzy Match Window | NUMBER · 1…100. How far forward (in words) to search for each scene's first narration word. Higher = more forgiving when transcription differs from script. |
| scenes_json | Scenes JSON (fallback) | TEXT. Paste scene definitions JSON here if not wired via the scenes port. Parsed as the fallback scenes array. |
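
To make the interaction between matchWindow and hookDurationSec concrete, here is a minimal sketch of how such an alignment could work. It is not the node's actual implementation. It assumes that every scene passed in is a non-hook scene with a `narration` string (a hypothetical field name), that matching compares lowercased words with punctuation stripped, and that each scene lasts until the next scene's first word. The function names are also hypothetical.

```python
import re
from typing import Any


def _norm(word: str) -> str:
    """Lowercase and strip punctuation so "Hello," matches "hello"."""
    return re.sub(r"[^\w]", "", word.lower())


def align_shots(
    scenes: list[dict[str, Any]],
    word_timestamps: list[dict[str, Any]],
    hook_duration_sec: float = 0.0,
    match_window: int = 10,
) -> list[dict[str, Any]]:
    if not word_timestamps:
        raise ValueError("word_timestamps is empty")
    words = [_norm(w["word"]) for w in word_timestamps]
    audio_end = word_timestamps[-1]["end"]

    expected = 0  # transcript index where the next scene should begin
    starts = []   # audio-relative start time of each scene
    for scene in scenes:
        raw = str(scene.get("narration", "")).split()
        tokens = [t for t in (_norm(x) for x in raw) if t]
        # Default to the expected position if no match is found in the window.
        index = min(expected, len(words) - 1)
        if tokens:
            stop = min(expected + match_window, len(words))
            # Search forward, at most match_window words, for the first word.
            for i in range(expected, stop):
                if words[i] == tokens[0]:
                    index = i
                    break
        starts.append(word_timestamps[index]["start"])
        expected = index + max(1, len(tokens))

    shots = []
    for n, scene in enumerate(scenes):
        # A scene ends where the next one starts; the last ends with the audio.
        end = starts[n + 1] if n + 1 < len(scenes) else audio_end
        shots.append({
            **scene,
            # Shift by the hook length so narration follows the hook video.
            "start_time": round(starts[n] + hook_duration_sec, 3),
            "duration": round(max(0.0, end - starts[n]), 3),
        })
    return shots
```

In this sketch, a filler word that appears in the transcript but not in the script is skipped only if the scene's first word still falls inside the window. With too small a window the scene snaps to the expected position instead, which is the trade-off the Fuzzy Match Window setting controls.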
Outputs
| ID | Label | Type |
|---|---|---|
| shots | Aligned Shots | JSON |
Auto-generated from the Wireflow node registry.