
Audio Node

Upload, play, sync, and reuse audio files with custom playback controls.

Audio nodes are static media nodes for uploaded files and audio outputs that already exist on the canvas. They keep the playable source, sync-backed media reference, and file metadata with the node so downstream steps can reuse the same audio asset.

Audio display is separate from audio generation
Use the Audio node to store, preview, and pass through an audio file. Use audio-generation nodes, such as SFX Generation, when a workflow needs to create new audio.

What it stores

  • The local preview fields, including src and blobKey, while the browser can still play the staged upload.
  • The durable R2-backed mediaRef after media sync finishes.
  • File metadata such as fileName, mimeType, and optional duration.
  • Playback state is not stored. It is UI state rather than persisted node data, and the player reads the current runtime duration from the loaded audio element.
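The split between persisted node data and UI-only playback state can be pictured as two small records. This is a minimal sketch: the field names come from the list above, but the type names (AudioNodeData, AudioPlaybackState), the optionality of each field, and the hasDurableMedia helper are assumptions for illustration, not the actual Builder Studio types.

```typescript
// Hypothetical shape of the data an Audio node persists.
// Field names follow the list above; types and optionality are assumptions.
interface AudioNodeData {
  src?: string;      // local preview source, usable while the staged upload is playable
  blobKey?: string;  // key of the staged local blob
  mediaRef?: string; // durable R2-backed reference, set after media sync finishes
  fileName?: string;
  mimeType?: string;
  duration?: number; // seconds, optional
}

// Hypothetical UI-only state. It is never written to node data.
interface AudioPlaybackState {
  isPlaying: boolean;
  currentTime: number;
  runtimeDuration: number | null; // read from the loaded audio element
}

// True once the node carries a durable reference instead of only a local preview.
function hasDurableMedia(data: AudioNodeData): boolean {
  return typeof data.mediaRef === "string" && data.mediaRef.length > 0;
}
```

A freshly staged upload would carry src and blobKey but no mediaRef, so hasDurableMedia returns false until sync completes.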

Add or replace audio

  1. Choose the Audio node

    Add an Audio node from the canvas toolbar, or drag compatible audio files into the canvas. Drag-and-drop creates Audio nodes with the standard audio node dimensions.

  2. Drop or select a file

    The hidden file input accepts the shared audio MIME allowlist. The upload validator rejects empty, unsupported, or oversized files before writing media data to the node.

  3. Let sync finish

    Builder Studio stages a local preview first, then syncs the asset to durable media storage. The node prefers the R2 display URL when the synced media reference becomes available.

  4. Use the synced asset

    Downstream nodes receive the stored audio through the audio-out port, including the stored MIME type when it is available.

Supported files

| Format | MIME type | Notes |
| --- | --- | --- |
| MP3 | audio/mpeg, audio/mp3 | Accepted for upload and the default fallback audio MIME type. |
| WAV | audio/wav | Accepted for upload. audio/x-wav and audio/wave normalize to this MIME type. |
| OGG | audio/ogg | Accepted for upload. The .oga extension also maps to this MIME type. |
| WebM audio | audio/webm | Accepted for upload and playback when the browser can decode it. |
| AAC | audio/aac | Accepted for upload. audio/x-aac normalizes to this MIME type. |
| FLAC | audio/flac | Accepted for upload and playback when the browser can decode it. |
| M4A / MP4 audio | audio/m4a, audio/x-m4a | Accepted for upload. The file input also accepts audio/mp4, which normalizes to audio/m4a. |
| PCM | audio/pcm | Accepted for upload. |
| AU | audio/basic | Accepted for upload from .au files. |

Upload limit
Audio uploads are capped at 100 MB. An accepted MIME type only means the file passes upload validation; playback still depends on the codecs the browser supports for that file.
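The normalization rules and rejection cases described above can be sketched as a pure function. This is an illustrative sketch, not the shared validator itself: the names (normalizeAudioMime, validateAudioUpload, MAX_AUDIO_BYTES), the order of the checks, and the reading of 100 MB as binary megabytes are all assumptions. The allowlist and aliases are taken from the table.

```typescript
// Assumption: the 100 MB cap is interpreted as 100 * 1024 * 1024 bytes.
const MAX_AUDIO_BYTES = 100 * 1024 * 1024;

// Aliases from the table, mapped to the MIME type they normalize to.
const MIME_ALIASES = new Map<string, string>([
  ["audio/x-wav", "audio/wav"],
  ["audio/wave", "audio/wav"],
  ["audio/x-aac", "audio/aac"],
  ["audio/mp4", "audio/m4a"],
]);

// MIME types listed as accepted in the table.
const ALLOWED_MIME_TYPES = new Set<string>([
  "audio/mpeg", "audio/mp3", "audio/wav", "audio/ogg", "audio/webm",
  "audio/aac", "audio/flac", "audio/m4a", "audio/x-m4a", "audio/pcm", "audio/basic",
]);

function normalizeAudioMime(mimeType: string): string {
  const lower = mimeType.trim().toLowerCase();
  return MIME_ALIASES.get(lower) ?? lower;
}

type ValidationResult =
  | { ok: true; mimeType: string }
  | { ok: false; reason: "empty" | "unsupported" | "oversized" };

// Accepts anything with a size and type, such as a browser File.
function validateAudioUpload(file: { size: number; type: string }): ValidationResult {
  if (file.size === 0) return { ok: false, reason: "empty" };
  if (file.size > MAX_AUDIO_BYTES) return { ok: false, reason: "oversized" };
  const mimeType = normalizeAudioMime(file.type);
  if (!ALLOWED_MIME_TYPES.has(mimeType)) return { ok: false, reason: "unsupported" };
  return { ok: true, mimeType };
}
```

Running the check before any media data is written keeps a rejected file from leaving partial state on the node. Passing validation says nothing about whether the browser can decode the file.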

Playback controls

A loaded Audio node renders a hidden audio element with metadata preloading, a compact waveform surface, and custom controls. The waveform is a deterministic visual treatment for the node, not decoded waveform analysis from the uploaded file.

  • The play button toggles between Play audio and Pause audio while keeping control events out of canvas drag and pan handling.
  • The progress control is a keyboard-accessible slider labeled Audio progress.
  • Keyboard scrubbing supports Arrow Right or Arrow Up for five seconds forward, Arrow Left or Arrow Down for five seconds back, Home for the beginning, and End for the end.
  • Mute and volume controls share the global media volume state, so a volume change applies consistently across media nodes.
  • If the audio element fails to load, editable sessions show Replace audio.
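The keyboard scrubbing rules above map naturally onto a small pure function. The sketch below is one possible way to express them, assuming key names follow the standard KeyboardEvent.key values and that the target is clamped to the range from 0 to the duration; scrubTarget and SCRUB_STEP_SECONDS are hypothetical names, and the duration is assumed to be a finite number.

```typescript
const SCRUB_STEP_SECONDS = 5;

// Returns the new playback position in seconds for a key press,
// or null when the key is not a scrubbing key.
function scrubTarget(key: string, currentTime: number, duration: number): number | null {
  // Assumption: positions are clamped so they never leave [0, duration].
  const clamp = (t: number): number => Math.min(Math.max(t, 0), duration);
  switch (key) {
    case "ArrowRight":
    case "ArrowUp":
      return clamp(currentTime + SCRUB_STEP_SECONDS);
    case "ArrowLeft":
    case "ArrowDown":
      return clamp(currentTime - SCRUB_STEP_SECONDS);
    case "Home":
      return 0;
    case "End":
      return duration;
    default:
      return null;
  }
}
```

In a keydown handler on the slider, a non-null result would be assigned to the audio element's currentTime. Calling preventDefault and stopPropagation on the event at that point is one way to keep the key press out of canvas drag and pan handling.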
Audio node media data (JSON):

    {
      "type": "audio",
      "data": {
        "mediaRef": "r2://media/workspace/narration",
        "mimeType": "audio/mpeg",
        "fileName": "narration.mp3",
        "duration": 38.2
      }
    }

Sync and retry behavior

Before a source is available, pending media renders the shared media loading state. During sync, the node frame can show the current media state, a sync message, and a retry action when the upload pipeline can be retried.

  • The display source prefers the R2 display URL, then the stored media URL, then the local preview source.
  • Replacing audio clears the previous local source, blob key, media reference, MIME type, file name, and stored duration before writing the next preview.
  • Playback state resets when the source identity changes, preventing stale progress and load errors from following a replaced asset.
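The source preference and the replace behavior listed above can be sketched as two small helpers. The names here (AudioSources, r2DisplayUrl, storedMediaUrl, resolveDisplaySource, replaceAudio, StagedPreview) are hypothetical, and the node data is modeled as a plain record, which is an assumption about how Builder Studio stores it.

```typescript
// Hypothetical view of the candidate sources for one Audio node.
interface AudioSources {
  r2DisplayUrl?: string;   // display URL resolved from the synced media reference
  storedMediaUrl?: string; // media URL stored with the node
  src?: string;            // local preview source
}

// Picks the first available source in the documented order of preference.
function resolveDisplaySource(sources: AudioSources): string | undefined {
  return sources.r2DisplayUrl || sources.storedMediaUrl || sources.src || undefined;
}

// Fields that the docs say are cleared before the next preview is written.
const CLEARED_ON_REPLACE = ["src", "blobKey", "mediaRef", "mimeType", "fileName", "duration"] as const;

// Hypothetical shape of a freshly staged local preview.
interface StagedPreview {
  src: string;
  blobKey: string;
  mimeType: string;
  fileName: string;
}

// Returns new node data: old media fields removed, unrelated fields kept,
// then the staged preview written on top.
function replaceAudio(data: Record<string, unknown>, next: StagedPreview): Record<string, unknown> {
  const cleared: Record<string, unknown> = { ...data };
  for (const key of CLEARED_ON_REPLACE) delete cleared[key];
  return { ...cleared, ...next };
}
```

Clearing mediaRef and duration first matters because the staged preview does not carry them. Without that step, the old durable reference would outrank the new local preview in resolveDisplaySource and the node would keep playing the replaced asset.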

Agent and API notes

Agents should treat Audio nodes as media references, not generation requests. The passthrough executor emits the stored audio URL on audio-out with the stored MIME type when available. Keep mediaRef, src, and MIME metadata together when replacing the underlying asset so downstream audio-aware nodes receive the expected media.
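As a rough illustration of the passthrough behavior, the sketch below emits a payload on audio-out and attaches the MIME type only when one is stored. The payload shape, the url field, and the function name executeAudioPassthrough are assumptions made for this example; the real executor contract may differ.

```typescript
// Hypothetical payload carried on the audio-out port.
interface AudioOutPayload {
  port: "audio-out";
  url: string;
  mimeType?: string;
}

// Hypothetical passthrough: forwards the stored audio and never generates new audio.
function executeAudioPassthrough(stored: { url?: string; mimeType?: string }): AudioOutPayload | null {
  // Nothing stored yet, so there is nothing to emit.
  if (!stored.url) return null;
  const payload: AudioOutPayload = { port: "audio-out", url: stored.url };
  // The MIME type is included only when the node has one.
  if (stored.mimeType) payload.mimeType = stored.mimeType;
  return payload;
}
```

A downstream node that branches on mimeType should therefore handle its absence instead of assuming a default.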
