Align Shots to Audio

Recomputes each scene's start_time and duration from Whisper word_timestamps so shots stay in sync with whatever voiceover is actually playing. Fixes drift when the audio is regenerated at runtime.

Badge: Direct
Node type: utility:align_shots_to_audio
Category: process

Pricing

  • Cost: ~1 credit per unit

Canvas ports

These appear as port handles on the left side of the node.

ID | Label | Details
scenes | Scenes (JSON) | JSON. Array of scene definitions with narration text. Can come from an upstream JSON source or from this node's Scenes JSON config field.
wordTimestamps | Word Timestamps | JSON (required). Array of {word, start, end} timestamps from a Whisper STT node. If this input is missing, alignment falls back to a linear estimate.
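
The page does not define the "linear estimate" fallback. The sketch below shows one plausible reading, assuming it means splitting a known audio length across scenes in proportion to narration word count. The function name `linear_estimate`, the `narration` field name, and the `total_audio_sec` parameter are all hypothetical; only `start_time` and `duration` come from the description above.

```python
def linear_estimate(scenes, total_audio_sec):
    """Assumed fallback: give each scene a share of the audio
    proportional to its narration word count."""
    # "narration" is a hypothetical field name for the scene's narration text.
    counts = [len(scene.get("narration", "").split()) for scene in scenes]
    total_words = sum(counts)

    shots = []
    cursor = 0.0  # running start time in seconds
    for scene, count in zip(scenes, counts):
        # If no scene has narration, split the audio evenly instead.
        share = count / total_words if total_words else 1 / len(scenes)
        duration = total_audio_sec * share
        shots.append({
            **scene,
            "start_time": round(cursor, 3),
            "duration": round(duration, 3),
        })
        cursor += duration
    return shots


example_scenes = [
    {"narration": "one two three"},
    {"narration": "four"},
]
estimated = linear_estimate(example_scenes, total_audio_sec=8.0)
```

Because an estimate like this ignores pauses and speaking rate, it would drift from the real voiceover, which is presumably why the word timestamps input is marked as required.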

Config fields

These render as form fields in the right-side config panel when the node is selected.

ID | Label | Details
hookDurationSec | Hook Duration (sec) | NUMBER · 0…60. Fixed offset for any pre-made hook scene with baked audio. Non-hook scenes are offset by this amount so the voiceover plays after the hook video ends.
matchWindow | Fuzzy Match Window | NUMBER · 1…100. How far forward (in words) to search for each scene's first narration word. Higher = more forgiving when transcription differs from script.
scenes_json | Scenes JSON (fallback) | TEXT. Paste scene definitions JSON here if not wired via the scenes port. Parsed as the fallback scenes array.
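
To illustrate how a forward match window and a hook offset could interact, here is a minimal sketch. It is not the node's actual implementation. The function `align_shots`, the helper `_norm`, the `narration` field name, and the rule that a scene lasts until the next scene's first word are all assumptions. The sketch treats every scene it receives as a non-hook scene, so each one is shifted by `hook_duration_sec`.

```python
import re


def _norm(word):
    """Lowercase and strip punctuation so 'Hello,' matches 'hello'."""
    return re.sub(r"[^\w]", "", word.lower())


def align_shots(scenes, word_timestamps, match_window=10, hook_duration_sec=0.0):
    """Assumed alignment: find each scene's first narration word within
    match_window transcript words of the current position."""
    if not word_timestamps:
        raise ValueError("word_timestamps must not be empty")

    starts = []  # transcript index where each scene begins
    cursor = 0   # index of the next unconsumed transcript word
    for scene in scenes:
        # "narration" is a hypothetical field name for the narration text.
        words = scene.get("narration", "").split()
        first = _norm(words[0]) if words else ""
        hit = None
        stop = min(cursor + match_window, len(word_timestamps))
        for i in range(cursor, stop):
            if _norm(word_timestamps[i]["word"]) == first:
                hit = i
                break
        if hit is None:
            # No match inside the window: stay at the current position.
            hit = min(cursor, len(word_timestamps) - 1)
        starts.append(hit)
        cursor = hit + max(len(words), 1)

    audio_end = word_timestamps[-1]["end"]
    shots = []
    for k, scene in enumerate(scenes):
        start = word_timestamps[starts[k]]["start"]
        # A scene runs until the next scene starts, or until the audio ends.
        if k + 1 < len(scenes):
            end = word_timestamps[starts[k + 1]]["start"]
        else:
            end = audio_end
        shots.append({
            **scene,
            "start_time": round(start + hook_duration_sec, 3),
            "duration": round(max(end - start, 0.0), 3),
        })
    return shots


demo_words = [
    {"word": "Hello", "start": 0.0, "end": 0.4},
    {"word": "world.", "start": 0.5, "end": 0.9},
    {"word": "Um", "start": 1.0, "end": 1.2},
    {"word": "Next", "start": 1.3, "end": 1.6},
    {"word": "scene", "start": 1.7, "end": 2.0},
]
demo_scenes = [{"narration": "Hello world."}, {"narration": "Next scene"}]
demo_shots = align_shots(demo_scenes, demo_words, match_window=5, hook_duration_sec=3.0)
```

In this demo the transcript contains an extra "Um" that the script lacks. A window of 5 is wide enough to skip past it and find "Next", whereas a window of 1 would leave the second scene anchored at the filler word.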

Outputs

ID | Label | Type
shots | Aligned Shots | JSON

Auto-generated from the Wireflow node registry.

© 2026 Wireflow. All rights reserved.
