Speech to Text
Transcribe audio or video with precise word-level timestamps
Node type: audio:whisper
Category: audio
Description
Transcribe audio or video with precise word-level timestamps
Pricing
- Cost: ~1 credit per unit
Canvas ports
These appear as port handles on the left side of the node.
| ID | Label | Details |
|---|---|---|
| audio_url | Audio / Video | VIDEO (required) |
Sidebar config
These render as form fields in the right-side config panel when the node is selected.
| ID | Label | Details |
|---|---|---|
| task | Task | TEXT · options: transcribe, translate |
| language | Language | TEXT · options: ``, en, es, fr, de, +8 |
| chunk_level | Chunk Level | TEXT · options: none, segment, word |
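The option lists above can be checked before a node is configured. The following is a minimal sketch, not Wireflow's API: the function `build_whisper_config` and the shape of the returned dictionary are hypothetical, and only the field IDs and option values come from the table. The language list is truncated in the table ("+8"), so the sketch does not validate it.

```python
# Option values copied from the sidebar config table.
TASKS = {"transcribe", "translate"}
CHUNK_LEVELS = {"none", "segment", "word"}


def build_whisper_config(*, task, language, chunk_level):
    """Return a config dict for an audio:whisper node.

    The dict layout is a hypothetical illustration, not a documented format.
    """
    if task not in TASKS:
        raise ValueError(f"task must be one of {sorted(TASKS)}, got {task!r}")
    if chunk_level not in CHUNK_LEVELS:
        raise ValueError(
            f"chunk_level must be one of {sorted(CHUNK_LEVELS)}, got {chunk_level!r}"
        )
    # The table lists only five of the language options, so any string
    # (including the empty option) is passed through unchecked.
    return {
        "type": "audio:whisper",
        "config": {
            "task": task,
            "language": language,
            "chunk_level": chunk_level,
        },
    }
```

For example, `build_whisper_config(task="transcribe", language="en", chunk_level="word")` returns a dictionary whose `config` entry mirrors the three sidebar fields, while an unlisted task such as `"summarize"` raises `ValueError`.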
Outputs
| ID | Label | Type |
|---|---|---|
| text | Transcript | TEXT |
| chunks | Timed Chunks | ARRAY |
| word_timestamps | Word Timestamps | JSON |
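A common use of the `chunks` output is subtitle generation. This page does not document the element shape of the array, so the sketch below assumes each chunk is an object with a `text` string and a `timestamp` pair of `[start, end]` in seconds, a layout used by several Whisper wrappers. Treat that shape, and the helper names, as assumptions to verify against real node output.

```python
def format_srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks):
    """Render timed chunks as SRT text.

    Assumed chunk shape: {"text": str, "timestamp": [start, end]}.
    """
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

With `chunk_level` set to `segment`, each cue would hold one segment; with `word`, each cue would hold a single word, which is usually too fine-grained for subtitles without regrouping.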
Auto-generated from the Wireflow node registry.