Speech to Text

Transcribe audio or video with precise word-level timestamps

Node type: audio:whisper
Category: audio

Description

Transcribe audio or video with precise word-level timestamps

Pricing

  • Cost: ~1 credit per unit

Canvas ports

These appear as port handles on the left side of the node.

ID         Label          Details
audio_url  Audio / Video  VIDEO (required)

These render as form fields in the right-side config panel when the node is selected.

ID           Label        Details
task         Task         TEXT · options: transcribe, translate
language     Language     TEXT · options: ``, en, es, fr, de, +8
chunk_level  Chunk Level  TEXT · options: none, segment, word
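
As a rough illustration, the option lists above can be checked before a config is applied to the node. This is a minimal sketch, not Wireflow's API: the dict-based config shape and the `validate_config` helper are hypothetical, and only the field IDs and option values come from the table. The language list is truncated in the table ("+8"), so the sketch checks only that the value is a string.

```python
# Hypothetical helper: option values are taken from the config table,
# but the dict-based config shape is an assumption for illustration.
ALLOWED = {
    "task": {"transcribe", "translate"},
    "chunk_level": {"none", "segment", "word"},
}


def validate_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for field, options in ALLOWED.items():
        value = config.get(field)
        # Fields that are absent are left to the node's defaults.
        if value is not None and value not in options:
            problems.append(f"{field}: {value!r} is not one of {sorted(options)}")
    # The full language list is elided in the docs, so only check the type.
    language = config.get("language", "")
    if not isinstance(language, str):
        problems.append("language: expected a string such as 'en' or ''")
    return problems
```

For example, `validate_config({"task": "summarize"})` reports one problem, because `summarize` is not a listed task.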

Outputs

ID               Label            Type
text             Transcript       TEXT
chunks           Timed Chunks     ARRAY
word_timestamps  Word Timestamps  JSON
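
One common use of timed output is subtitle generation. The sketch below turns a list of chunks into SRT cues. The chunk shape is an assumption: this page does not document the structure of `chunks`, so the `text`, `start`, and `end` keys (times in seconds) are hypothetical and may need adapting to the node's actual output.

```python
# Assumed chunk shape (not confirmed by the docs):
#   {"text": "hello", "start": 0.0, "end": 0.42}   # times in seconds


def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks: list[dict]) -> str:
    """Render timed chunks as SRT subtitle cues, one cue per chunk."""
    cues = []
    for index, chunk in enumerate(chunks, start=1):
        start = format_srt_time(chunk["start"])
        end = format_srt_time(chunk["end"])
        cues.append(f"{index}\n{start} --> {end}\n{chunk['text'].strip()}\n")
    # Cues are separated by a blank line, as the SRT format expects.
    return "\n".join(cues)
```

With `chunk_level` set to `segment`, each cue would hold a phrase; with `word`, each cue would hold a single word, which is usually too fine for subtitles but useful for karaoke-style highlighting.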

Auto-generated from the Wireflow node registry.

© 2026 Wireflow. All rights reserved.
