Legs
Last updated May 26, 2026
Voice call legs (SIP or WebRTC)
List all active legs
Responses
Originate an outbound leg
Originate a new outbound leg. The type field selects the transport: sip originates a SIP INVITE; whatsapp originates a WhatsApp call through Meta; websocket dials a remote WebSocket endpoint (audio is PCM in either binary or json_base64 framing, with bidirectional text and caller-supplied X-/P- headers).
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| type | enum | required | Leg type. Values: sip, whatsapp, websocket. |
| to | string | optional | Destination. For sip legs, a SIP URI (e.g. "sip:alice@example.com"). For whatsapp legs, an E.164 phone number (with or without '+'). |
| uri | string | optional | Deprecated alias for `to` (sip legs only). Prefer `to`. |
| from | string | optional | Caller ID; sets the user part of the SIP From header (e.g. "+15551234567", "alice"). |
| privacy | string | optional | SIP Privacy header value (e.g. "id", "none"). |
| ring_timeout | integer | optional | Seconds to wait for answer; 0 = no timeout. Default: 0. |
| max_duration | integer | optional | Maximum call duration in seconds after connect. Automatically hung up when reached. 0 or omitted = no limit. Default: 0. |
| codecs | array[enum] | optional | Codec preference order (sip legs only). |
| headers | object | optional | Custom headers to include in the outbound INVITE (sip/whatsapp) or the WebSocket upgrade request (websocket). |
| room_id | string | optional | Room ID to auto-add the leg to once media is ready (early_media or connected). If the room does not exist, it is automatically created. |
| auth | any | optional | Digest auth credentials. Required for whatsapp legs (Meta-issued password; username defaults to `from` with '+' stripped). Optional for sip legs (sipgo retries on 401/407 challenge). |
| webhook_url | string(uri) | optional | Route all events for this leg exclusively to this URL instead of global webhooks. |
| webhook_secret | string | optional | HMAC-SHA256 signing secret for the per-leg webhook. |
| amd | any | optional | Enable Answering Machine Detection on outbound calls. Include the object (even empty) to enable with defaults; omit to disable. |
| accept_dtmf | boolean | optional | If false, this leg will not receive DTMF digits broadcast from other legs in the same room. Default: true. |
| app_id | string | optional | Application identifier. Carried through to all events for this leg. Use to filter the WebSocket event stream by app. |
| speech_detection | boolean | optional | If true, emit speaking.started and speaking.stopped events for this leg. If false, suppress them. Omit to use the server default (SPEECH_DETECTION_ENABLED env var, default false). |
| rtt | boolean | optional | For sip legs: offer Real-Time Text (ITU-T T.140 over RTP per RFC 4103) alongside audio. For websocket legs: enable the bidirectional text-message channel. Default: false. |
| url | string(uri) | optional | WebSocket target URL (ws:// or wss://) for outbound websocket legs. Required when type=websocket. |
| sample_rate | enum | optional | PCM sample rate for websocket legs. The room's mixer automatically resamples between this and the room rate. Default: 16000. Values: 8000, 16000, 24000, 48000. |
| wire_format | enum | optional | Audio framing for websocket legs. `binary` ships raw PCM as WebSocket binary frames; `json_base64` wraps PCM as `{"type":"audio","audio":"<base64>"}` text frames (browser-friendly). Default: "binary". Values: binary, json_base64. |
| sample_format | enum | optional | On-the-wire PCM sample encoding for websocket legs. v1 only supports `s16le`. Default: "s16le". Values: s16le. |
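
The per-type rules in the field table above can be checked on the client before the request is sent. The following is a minimal sketch, not the server's validation logic: `build_originate_body` is a hypothetical helper, the room ID is made up, and only rules stated in the table are enforced (url for websocket legs, auth for whatsapp legs, codecs for sip legs only).

```python
import json

LEG_TYPES = {"sip", "whatsapp", "websocket"}


def build_originate_body(leg_type, **fields):
    """Return a JSON-ready dict for an originate request (hypothetical helper).

    Enforces only the per-type rules stated in the field table.
    """
    if leg_type not in LEG_TYPES:
        raise ValueError(f"type must be one of {sorted(LEG_TYPES)}")
    body = {"type": leg_type, **fields}
    if leg_type == "websocket" and not body.get("url"):
        raise ValueError("url is required when type=websocket")
    if leg_type == "whatsapp" and not body.get("auth"):
        raise ValueError("auth is required for whatsapp legs")
    if "codecs" in body and leg_type != "sip":
        raise ValueError("codecs applies to sip legs only")
    return body


# "from" is a Python keyword, so it is passed through a dict.
sip_body = build_originate_body(
    "sip",
    to="sip:alice@example.com",
    **{"from": "+15551234567"},
    ring_timeout=30,
    room_id="room-1",  # hypothetical room ID; created if it does not exist
    amd={},            # an empty object enables AMD with defaults
)
print(json.dumps(sip_body, indent=2))
```

The resulting dict is what would be serialized as the request body; the endpoint path and HTTP client are left out because they are not given in this section.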
Responses
| Field | Type | Required | Description |
|---|---|---|---|
| instance_id | string | optional | Instance identifier |
| id | string | required | Unique leg identifier (UUID) |
| type | enum | required | Leg type. Values: sip_inbound, sip_outbound, webrtc, whatsapp_in, whatsapp_out, websocket_in, websocket_out. |
| state | enum | required | Leg state. Values: ringing, early_media, connected, held, hung_up. |
| room_id | string | optional | Room ID if the leg is in a room, empty otherwise |
| muted | boolean | required | Whether the leg is muted (cannot be heard by others) |
| deaf | boolean | required | Whether the leg is deaf (cannot hear others) |
| accept_dtmf | boolean | required | Whether the leg receives DTMF digits broadcast from other legs in the same room. Defaults to true. |
| held | boolean | required | Whether the call is on hold (SIP legs only) |
| role | string | optional | Routing role used by the room's audio routing matrix (e.g. "customer", "agent", "supervisor"). Empty string means unroled (full mesh). |
| app_id | string | optional | Application identifier for event stream filtering. |
| sip_headers | object | optional | Deprecated: X-* headers from the inbound INVITE. Only present on sip_inbound legs. Use `headers` for new code; it carries the same map plus surfaces handshake headers for websocket legs. |
| headers | object | optional | Custom protocol headers exposed by the leg's transport: X-/P- headers from a SIP INVITE, the WebSocket upgrade request, or supplied at outbound dial time. |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Connect a WebSocket as a leg (HTTP upgrade)
Upgrades the HTTP request to a WebSocket and creates a websocket_in leg.
Query parameters: sample_rate (8000/16000/24000/48000; default 16000); wire_format (binary default, or json_base64); sample_format (s16le only in v1); room_id to auto-add the leg to a room; app_id for event filtering; rtt=true to enable the bidirectional text channel; webhook_url/webhook_secret for per-leg event routing.
X-* and P-* request headers (plus Authorization) are captured into the leg's headers map and surfaced on leg.ringing and in LegView. The leg goes straight to connected (no ringing/answer flow).
Audio frames carry PCM16-LE mono at sample_rate. With wire_format=binary, each WebSocket binary frame is exactly one 20ms PCM frame; with json_base64, the same payload is wrapped as {"type":"audio","audio":"<base64>"}.
Text and control messages always use JSON text frames: {"type":"text","text":...}, {"type":"ping","event_id":N}/{"type":"pong","event_id":N}, and {"type":"hangup"} to terminate from the peer side.
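
The framing rules above imply a fixed frame size per sample rate: 20 ms of 16-bit mono PCM. The sketch below computes that size and encodes one frame in either wire format. The function names are illustrative, not part of the API, and the actual WebSocket send call is omitted because it depends on the client library.

```python
import base64
import json

FRAME_MS = 20          # one audio frame covers 20 ms
BYTES_PER_SAMPLE = 2   # PCM16-LE, mono


def frame_size_bytes(sample_rate):
    """Bytes in one 20 ms PCM16 mono frame at the given sample rate."""
    return sample_rate * FRAME_MS // 1000 * BYTES_PER_SAMPLE


def encode_audio(pcm, sample_rate, wire_format="binary"):
    """Return the payload for one audio frame.

    binary      -> bytes, to be sent as a WebSocket binary frame
    json_base64 -> str, to be sent as a WebSocket text frame
    """
    if len(pcm) != frame_size_bytes(sample_rate):
        raise ValueError("expected exactly one 20 ms frame")
    if wire_format == "binary":
        return pcm
    if wire_format == "json_base64":
        audio = base64.b64encode(pcm).decode("ascii")
        return json.dumps({"type": "audio", "audio": audio})
    raise ValueError(f"unknown wire_format: {wire_format!r}")


def text_message(text):
    """JSON text frame for the bidirectional text channel."""
    return json.dumps({"type": "text", "text": text})


def hangup_message():
    """JSON text frame that terminates the leg from the peer side."""
    return json.dumps({"type": "hangup"})


silence_16k = bytes(frame_size_bytes(16000))  # 640 zero bytes
print(len(silence_16k), encode_audio(silence_16k, 16000, "json_base64")[:40])
```

At the default 16000 Hz a frame is 320 samples, or 640 bytes; at 48000 Hz it is 1920 bytes.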
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Connect a MoQ (Media over QUIC) leg (WebTransport extended-CONNECT, experimental)
**Actual HTTP method: CONNECT** (HTTP/3 extended-CONNECT for WebTransport). OpenAPI 3.1 does not define connect as a path-item method, so this operation is documented under post with an x-actual-method: CONNECT extension. Standard HTTP clients (e.g. curl -X POST) will receive 405 Method Not Allowed — use a WebTransport-capable HTTP/3 client.
**Experimental / PoC.** Upgrades an HTTP/3 extended-CONNECT request to a WebTransport session and creates an inbound MoQ leg. Reachable only over HTTP/3 on the MoQ listener (not on the regular HTTP/1.1 chi listener). Requires MOQ_ENABLED=true plus MOQ_TLS_CERT_FILE and MOQ_TLS_KEY_FILE.
Speaks IETF draft-11 of moq-transport (via mengelbart/moqtransport); browser interop with draft-16 clients (moqtail, moq.dev) is not expected to work. Audio is Opus, framed one frame per MoQ Object (LOC-style), with a single MoQ session per leg.
Query parameters: sample_rate (8000/16000/24000/48000; default 48000); room_id to auto-add the leg to a room; app_id for event filtering; webhook_url/webhook_secret for per-leg event routing.
X-* and P-* request headers (plus Authorization) are captured into the leg's headers map and surfaced on LegView. The leg goes straight to connected (no ringing/answer flow). There is no DTMF and no RTT, and event parity is limited to leg.connected / leg.disconnected.
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Get a single leg
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Required | Description |
|---|---|---|---|
| instance_id | string | optional | Instance identifier |
| id | string | required | Unique leg identifier (UUID) |
| type | enum | required | Leg type. Values: sip_inbound, sip_outbound, webrtc, whatsapp_in, whatsapp_out, websocket_in, websocket_out. |
| state | enum | required | Leg state. Values: ringing, early_media, connected, held, hung_up. |
| room_id | string | optional | Room ID if the leg is in a room, empty otherwise |
| muted | boolean | required | Whether the leg is muted (cannot be heard by others) |
| deaf | boolean | required | Whether the leg is deaf (cannot hear others) |
| accept_dtmf | boolean | required | Whether the leg receives DTMF digits broadcast from other legs in the same room. Defaults to true. |
| held | boolean | required | Whether the call is on hold (SIP legs only) |
| role | string | optional | Routing role used by the room's audio routing matrix (e.g. "customer", "agent", "supervisor"). Empty string means unroled (full mesh). |
| app_id | string | optional | Application identifier for event stream filtering. |
| sip_headers | object | optional | Deprecated: X-* headers from the inbound INVITE. Only present on sip_inbound legs. Use `headers` for new code; it carries the same map plus surfaces handshake headers for websocket legs. |
| headers | object | optional | Custom protocol headers exposed by the leg's transport: X-/P- headers from a SIP INVITE, the WebSocket upgrade request, or supplied at outbound dial time. |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Hang up a leg (asynchronous)
Validates the leg exists and queues a hangup. The HTTP call returns 202 as soon as the leg is found; the SIP work and cleanup run in the background, and the eventual disconnection is observed via the leg.disconnected event.
Without a request body the legacy behavior is preserved: SIP BYE on connected legs (cdr.reason: "api_hangup"), or dialog cancel on unanswered inbound legs (cdr.reason: "caller_cancel").
With {"reason": "<value>"} and an unanswered SIP inbound leg (state ringing or early_media), VoiceBlender sends a final non-2xx response instead of BYE/cancel: busy→486, declined/rejected→603, unavailable→480, not_found→404, forbidden→403, server_error→500. The reason value is passed through to leg.disconnected's cdr.reason.
For connected legs the request body is ignored.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| reason | enum | optional | Disconnect reason. Only honored for unanswered SIP inbound legs (state `ringing` or `early_media`); on connected legs the body is ignored and the leg is hung up with the legacy `api_hangup` reason. The value flows through to `leg.disconnected`'s `cdr.reason` and selects the SIP final response: `busy`→486, `declined`/`rejected`→603, `unavailable`→480, `not_found`→404, `forbidden`→403, `server_error`→500. Values: busy, declined, rejected, unavailable, not_found, forbidden, server_error. |
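
The reason-to-response rules above can be summarized as a lookup plus two state checks. The sketch below predicts the SIP action and the resulting `cdr.reason` for the cases this section describes. `hangup_outcome` is a hypothetical helper for client-side reasoning, not server code, and it deliberately refuses combinations the text does not cover (for example, unanswered outbound legs).

```python
REASON_TO_SIP_STATUS = {
    "busy": 486,
    "declined": 603,
    "rejected": 603,
    "unavailable": 480,
    "not_found": 404,
    "forbidden": 403,
    "server_error": 500,
}
UNANSWERED_STATES = {"ringing", "early_media"}


def hangup_outcome(leg_type, state, reason=None):
    """Return (sip_action, cdr_reason) for a hangup request.

    sip_action is "BYE", "cancel", or an integer SIP final-response code.
    """
    if state == "connected":
        # The request body is ignored for connected legs.
        return "BYE", "api_hangup"
    if leg_type == "sip_inbound" and state in UNANSWERED_STATES:
        if reason is None:
            # Legacy behavior when no body is sent.
            return "cancel", "caller_cancel"
        if reason not in REASON_TO_SIP_STATUS:
            raise ValueError(f"unsupported reason: {reason!r}")
        return REASON_TO_SIP_STATUS[reason], reason
    raise NotImplementedError("combination not covered by the description")


print(hangup_outcome("sip_inbound", "ringing", "busy"))
```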
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Answer a ringing or early-media inbound SIP leg (asynchronous)
Signals the inbound-call goroutine to send 200 OK. The HTTP call returns 202 immediately; the actual SIP 200 OK is sent in the background, and the leg's transition is observed via leg.connected. Pre-condition failures (wrong state, unknown codec) still return 4xx synchronously.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| speech_detection | boolean | optional | If true, emit speaking.started and speaking.stopped events for this leg. If false, suppress them. Omit to use the server default (SPEECH_DETECTION_ENABLED env var, default false). |
| codec | enum | optional | Explicit codec for the answer SDP. Must appear in the remote offer's offered_codecs list. Omit to use the server's default preference order. Values: PCMU, PCMA, G722, opus. |
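
Because an unknown or non-offered codec is rejected synchronously, it can help to pick the codec on the client from the offered list. This sketch assumes the caller already holds the leg's `offered_codecs` list (where it is obtained is not covered here) and a local preference order. `build_answer_body` is a hypothetical helper; when no preferred codec was offered it omits `codec` so that the server's default order applies.

```python
ANSWER_CODECS = ("PCMU", "PCMA", "G722", "opus")


def build_answer_body(offered_codecs, preference=(), speech_detection=None):
    """Build the answer request body.

    Picks the first codec in `preference` that the remote side offered.
    """
    body = {}
    for codec in preference:
        if codec not in ANSWER_CODECS:
            raise ValueError(f"unsupported codec: {codec!r}")
        if codec in offered_codecs:
            body["codec"] = codec
            break
    if speech_detection is not None:
        body["speech_detection"] = speech_detection
    return body


print(build_answer_body(["PCMA", "opus"], preference=["opus", "PCMU"]))
```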
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Send 180 Ringing on a ringing inbound SIP leg (asynchronous)
Queues a SIP 180 Ringing provisional response with no SDP. Use when SIP_AUTO_RINGING=false (the default) and you want to indicate alerting before deciding to early-media or answer. Safe to call repeatedly: each call emits another 180, and receivers tolerate re-sends. The HTTP call returns 202 as soon as the request is validated; SIP-level send failures surface as leg.command_failed with command="ring".
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Enable early media on a ringing inbound SIP leg (asynchronous)
Queues a SIP 183 Session Progress with SDP and the RTP/codec setup. The HTTP call returns 202 as soon as the request is validated; the leg transitions to early_media state asynchronously, observable via leg.early_media. Setup failures surface as leg.command_failed with command="early_media".
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| codec | enum | optional | Explicit codec for the 183 Session Progress SDP. Must appear in the remote offer's offered_codecs list. Omit to use the server's default preference order. Values: PCMU, PCMA, G722, opus. |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Mute a leg
A muted leg's audio is excluded from the room mix and speaking events are suppressed. Taps (recording/STT) still receive the muted leg's own audio.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Unmute a leg
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Put a SIP call on hold (asynchronous)
Queues a re-INVITE with sendonly SDP direction. The HTTP call returns 202 as soon as the leg is validated; the re-INVITE is sent in the background and success surfaces as leg.hold. Failures surface as leg.command_failed with command="hold". The RTP timeout is paused while held, and a 2-hour auto-hangup timer starts.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Resume a held SIP call (asynchronous)
Queues a re-INVITE with sendrecv SDP direction. The HTTP call returns 202; success surfaces as leg.unhold, failures as leg.command_failed with command="unhold".
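
The early-media, hold, and unhold operations share one pattern: the HTTP call returns 202, and the real outcome arrives later as either a success event or `leg.command_failed` with the matching `command`. The sketch below tracks pending commands and resolves them from incoming events. The event envelope keys `type` and `leg_id` are assumptions made for illustration; only the event names and the `command` values come from this section.

```python
# Success event per command, as named in the operation descriptions.
SUCCESS_EVENT = {
    "early_media": "leg.early_media",
    "hold": "leg.hold",
    "unhold": "leg.unhold",
}


class CommandTracker:
    """Correlates accepted (202) commands with later events."""

    def __init__(self):
        self.pending = set()   # {(leg_id, command)}
        self.results = {}      # (leg_id, command) -> "succeeded" | "failed"

    def accepted(self, leg_id, command):
        """Call after the HTTP request returned 202."""
        if command not in SUCCESS_EVENT:
            raise ValueError(f"untracked command: {command!r}")
        self.pending.add((leg_id, command))

    def on_event(self, event):
        """Feed every event for the leg; assumed keys: type, leg_id, command."""
        leg_id = event["leg_id"]
        if event["type"] == "leg.command_failed":
            self._resolve((leg_id, event["command"]), "failed")
            return
        for command, ok_event in SUCCESS_EVENT.items():
            if event["type"] == ok_event:
                self._resolve((leg_id, command), "succeeded")

    def _resolve(self, key, outcome):
        if key in self.pending:
            self.pending.remove(key)
            self.results[key] = outcome


tracker = CommandTracker()
tracker.accepted("leg-1", "hold")
tracker.on_event({"type": "leg.hold", "leg_id": "leg-1"})
print(tracker.results)
```

A real client would also want a timeout for commands that never resolve; that is left out to keep the sketch short.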
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Transfer a SIP leg via REFER (asynchronous)
Asynchronously transfers a SIP leg. The HTTP call returns 202 as soon as the request is validated; the REFER is sent in the background and its outcome is surfaced through leg.transfer_initiated / leg.transfer_progress / leg.transfer_completed / leg.transfer_failed events. The transfer is blind when replaces_leg_id is omitted and attended when it is present (the named leg's dialog identity is embedded as a Replaces parameter per RFC 3891). On terminal 2xx the leg (and the replaces leg, if any) is hung up automatically.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| target | string | required | SIP URI to transfer the call to (e.g. "sip:bob@example.com"). |
| replaces_leg_id | string | optional | ID of an existing connected SIP leg whose dialog should be replaced (attended transfer). Omit for blind transfer. |
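
The only difference between a blind and an attended transfer request is the presence of `replaces_leg_id`. A minimal sketch of both bodies follows; `build_transfer_body` and `transfer_kind` are hypothetical helpers, and the `sip:` prefix check is a client-side sanity check assumed here, not a documented server rule.

```python
# Events that end a transfer attempt, per the description above.
TRANSFER_TERMINAL_EVENTS = {"leg.transfer_completed", "leg.transfer_failed"}


def build_transfer_body(target, replaces_leg_id=None):
    """Build a transfer request body; attended if replaces_leg_id is given."""
    if not target.startswith("sip:"):
        raise ValueError("target should be a SIP URI such as sip:bob@example.com")
    body = {"target": target}
    if replaces_leg_id is not None:
        body["replaces_leg_id"] = replaces_leg_id
    return body


def transfer_kind(body):
    """Classify a request body as 'blind' or 'attended'."""
    return "attended" if "replaces_leg_id" in body else "blind"


blind = build_transfer_body("sip:bob@example.com")
attended = build_transfer_body("sip:bob@example.com", replaces_leg_id="leg-2")
print(transfer_kind(blind), transfer_kind(attended))
```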
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Send DTMF digits on a leg
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| digits | string | required | DTMF digits to send (0-9, *, #) |
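
The allowed digit set is small enough to validate before sending. This sketch uses a regular expression restricted to the characters listed above; `build_dtmf_body` is a hypothetical helper.

```python
import re

# Only the characters listed in the field description: 0-9, *, #.
DTMF_RE = re.compile(r"[0-9*#]+")


def build_dtmf_body(digits):
    """Return the request body, rejecting anything outside 0-9, *, #."""
    if not DTMF_RE.fullmatch(digits):
        raise ValueError(f"invalid DTMF digits: {digits!r}")
    return {"digits": digits}


print(build_dtmf_body("1234#"))
```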
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Enable DTMF reception on a leg
Allow this leg to receive DTMF digits broadcast from other legs in the same room. This is the default state for new legs.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Disable DTMF reception on a leg
Block this leg from receiving DTMF digits broadcast from other legs in the same room. DTMF received from this leg's own far end is still emitted as a leg.dtmf event.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Send Real-Time Text (T.140) on a SIP leg
Sends UTF-8 text on the leg's RTT (T.140 / RFC 4103) media stream. Requires that the SDP offer/answer agreed on an m=text section with the remote UA. Enable RTT on the server with RTT_ENABLED=true.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | required | UTF-8 text to send. May be one or more characters and may include T.140 control codes (e.g. backspace U+0008, CR/LF). |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Enable RTT reception on a leg
Allow this leg to receive RTT text broadcast from other legs in the same room and to emit rtt.received events for incoming text. Default state for new legs.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Disable RTT reception on a leg
Block this leg from receiving RTT text broadcast from other legs in the same room and suppress rtt.received events for this leg.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Start audio playback to a leg
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| url | string(uri) | required | URL of the audio file (mutually exclusive with tone). |
| tone | string | required | Built-in telephone tone name. Format: {country}_{type} or bare {type} (defaults to US). Types: ringback, busy, dial, congestion. Countries: us, gb, de, fr, au, jp, it, in, br, pl, ru. Examples: us_ringback, gb_busy, dial. |
| mime_type | string | required | MIME type (e.g. audio/wav). Required when using url. |
| repeat | integer | required | Number of times to repeat playback (url only). Default: 0. |
| volume | integer | required | Volume adjustment in dB (-8 to 8). Default: 0. |
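
The tone naming scheme and the url/tone exclusivity can be expressed in a few lines. Note that the table marks every field as required while the descriptions say url and tone are mutually exclusive; this sketch follows the descriptions, which is an assumption about the intended contract. Both helper names are hypothetical.

```python
TONE_TYPES = {"ringback", "busy", "dial", "congestion"}
TONE_COUNTRIES = {"us", "gb", "de", "fr", "au", "jp", "it", "in", "br", "pl", "ru"}


def parse_tone(name):
    """Split a tone name into (country, type); a bare type defaults to 'us'."""
    country, sep, kind = name.partition("_")
    if not sep:
        country, kind = "us", name
    if country not in TONE_COUNTRIES or kind not in TONE_TYPES:
        raise ValueError(f"unknown tone: {name!r}")
    return country, kind


def build_play_body(url=None, mime_type=None, tone=None, repeat=0, volume=0):
    """Build a playback body with exactly one of url or tone."""
    if (url is None) == (tone is None):
        raise ValueError("set exactly one of url or tone")
    if not -8 <= volume <= 8:
        raise ValueError("volume must be between -8 and 8")
    if tone is not None:
        parse_tone(tone)  # validates the name
        if repeat:
            raise ValueError("repeat applies to url playback only")
        return {"tone": tone, "volume": volume}
    if not mime_type:
        raise ValueError("mime_type is required when using url")
    return {"url": url, "mime_type": mime_type, "repeat": repeat, "volume": volume}


print(parse_tone("gb_busy"), parse_tone("dial"))
```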
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Change the volume of an active leg playback
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
playbackID | path | string | required | Playback ID |
Request Body
| Field | Type | Description | |
|---|---|---|---|
volume | integer | required | Volume adjustment (-8 to 8, ~3dB per step, 0 = unchanged) |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Stop audio playback on a leg
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
playbackID | path | string | required | Playback ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Synthesize speech and play it on a leg
Synthesizes the provided text using the configured TTS provider and plays the audio on the leg. When TTS_CACHE_ENABLED=true, synthesized audio is stored on disk in TTS_CACHE_DIR and persists across restarts; identical requests (same text, voice, model, language, and prompt) are then served from the cache without calling the external provider.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | required | Text to synthesize |
| voice | string | required | Provider-specific voice identifier. ElevenLabs: voice name or ID. AWS Polly: voice ID (e.g. Joanna, Matthew). Google Cloud: voice name, either full format (e.g. en-US-Neural2-F) or short name for Gemini models (e.g. Achernar, Kore). Deepgram: model name (e.g. aura-2-asteria-en). |
| model_id | string | required | Provider-specific model/engine. ElevenLabs: model ID. AWS Polly: engine (standard, neural, long-form, generative; default neural). Google Cloud: model name (e.g. gemini-2.5-pro-tts, chirp3-hd). |
| language | string | optional | Language code (e.g. "en-US", "pl-pl"). Required for Google Gemini TTS voices that use short names (e.g. Achernar). Auto-extracted from full voice names like en-US-Neural2-F. |
| prompt | string | optional | Style/tone instruction for promptable voice models (Google Gemini TTS only). E.g. "Read aloud in a warm, welcoming tone." |
| volume | integer | required | Volume adjustment in dB (-8 to 8). Default: 0. |
| provider | enum | optional | TTS provider: "elevenlabs" (default), "aws", "google", or "deepgram". Values: elevenlabs, aws, google, deepgram. |
| api_key | string | optional | ElevenLabs: API key override (falls back to ELEVENLABS_API_KEY env var). AWS: optional ACCESS_KEY:SECRET_KEY override (falls back to default AWS credential chain). Google Cloud: optional API key override (falls back to Application Default Credentials). Deepgram: API key override (falls back to DEEPGRAM_API_KEY env var). |
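
To make the cache rule concrete: two requests hit the same cache entry only if text, voice, model, language, and prompt all match, while fields such as volume do not take part. How the server actually derives its cache key is not documented here; the SHA-256-over-JSON scheme below is purely an illustrative assumption, and `tts_cache_key` is a hypothetical function.

```python
import hashlib
import json


def tts_cache_key(text, voice, model_id, language="", prompt=""):
    """Illustrative cache key over the five fields that define 'identical'.

    Assumption: the real server may derive its key differently.
    """
    fields = {
        "text": text,
        "voice": voice,
        "model": model_id,
        "language": language,
        "prompt": prompt,
    }
    # Sorted keys and fixed separators give a stable byte representation.
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


key_a = tts_cache_key("Hello", "Joanna", "neural", language="en-US")
key_b = tts_cache_key("Hello", "Joanna", "neural", language="en-US")
key_c = tts_cache_key("Hello", "Matthew", "neural", language="en-US")
print(key_a == key_b, key_a == key_c)
```

Under this sketch, replaying the same prompt at a different volume would still reuse the cached audio, which matches the list of fields given in the description.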
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Start recording a leg to a WAV file
For SIP legs, recording is stereo (left=incoming, right=outgoing). For legs in a room, recording is stereo at 16 kHz (left=participant audio, right=mixed-minus-self).
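
Since each direction lands in its own channel, a recording can be separated after the fact by de-interleaving the samples. The sketch below does this with the standard library, assuming 16-bit PCM samples (the bit depth is not stated in this section). `split_stereo` and `demo_wav` are hypothetical names, and the demo file is synthetic.

```python
import array
import io
import sys
import wave


def split_stereo(wav_bytes):
    """Return (left, right) sample arrays from a 16-bit stereo WAV."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        if w.getnchannels() != 2 or w.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo WAV")
        samples = array.array("h")
        samples.frombytes(w.readframes(w.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    # Stereo frames are interleaved: L, R, L, R, ...
    return samples[0::2], samples[1::2]


def demo_wav():
    """Build a tiny synthetic stereo WAV in memory for the example."""
    interleaved = array.array("h", [1, -1, 2, -2, 3, -3])
    if sys.byteorder == "big":
        interleaved.byteswap()
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)
        w.setframerate(16000)
        w.writeframes(interleaved.tobytes())
    return buf.getvalue()


left, right = split_stereo(demo_wav())
print(list(left), list(right))
```

For a SIP leg recording, `left` would then hold the incoming audio and `right` the outgoing audio.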
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| storage | enum | required | "file" (default): local disk; "s3": upload to S3 after recording stops. Values: file, s3. |
| multi_channel | boolean | required | When true, record each participant to a separate mono WAV file in addition to the full mix. Only applies to room recordings. Default: false. |
| s3_bucket | string | required | S3 bucket name. Overrides S3_BUCKET env var. Required if env var is not set. |
| s3_region | string | required | AWS region. Overrides S3_REGION env var. Default us-east-1. |
| s3_endpoint | string | required | Custom S3 endpoint (MinIO, etc.). Overrides S3_ENDPOINT env var. |
| s3_prefix | string | required | Key prefix (e.g. recordings/). Overrides S3_PREFIX env var. |
| s3_access_key | string | required | AWS access key ID. Overrides default credential chain. |
| s3_secret_key | string | required | AWS secret access key. Must be set together with s3_access_key. |
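
The S3 fields carry two constraints that are easy to check before sending: they only make sense with storage "s3", and the access key and secret key must be supplied together. The sketch below follows the field descriptions, which present most fields as overrides with defaults, rather than the "required" column; that reading is an assumption. `build_record_body` is a hypothetical helper and the bucket name is made up.

```python
S3_FIELDS = {
    "s3_bucket", "s3_region", "s3_endpoint",
    "s3_prefix", "s3_access_key", "s3_secret_key",
}


def build_record_body(storage="file", **s3):
    """Build a start-recording body, checking the S3 field rules."""
    if storage not in ("file", "s3"):
        raise ValueError("storage must be 'file' or 's3'")
    unknown = set(s3) - S3_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if storage == "file" and s3:
        raise ValueError("s3_* fields only apply when storage is 's3'")
    if ("s3_access_key" in s3) != ("s3_secret_key" in s3):
        raise ValueError("s3_access_key and s3_secret_key must be set together")
    return {"storage": storage, **s3}


body = build_record_body("s3", s3_bucket="call-recordings", s3_prefix="recordings/")
print(body)
```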
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Stop recording a leg
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Pause a leg recording
Replaces incoming audio with silence on the active recording until /record/resume is called. The WAV's timeline is preserved (silent gap where audio was paused), so reviewers can see exactly when sensitive data was excluded. Idempotent: calling while already paused returns status: already_paused.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Resume a paused leg recording
Resumes writing real audio after a prior /record/pause. Idempotent: calling while not paused returns status: not_paused.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
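Because pause and resume are both idempotent, they pair naturally around a block of code that handles sensitive input. The sketch below assumes a hypothetical `post` callable that performs the HTTP POST and returns the parsed JSON body, and a hypothetical `/legs/{id}` path prefix; only the `/record/pause` and `/record/resume` suffixes and the `already_paused` status come from the text above.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator

# Hypothetical transport: takes a path, POSTs to it, returns the JSON body.
PostFn = Callable[[str], Dict[str, str]]


@contextmanager
def recording_paused(post: PostFn, leg_id: str) -> Iterator[str]:
    """Keep the leg recording paused for the duration of the with-block."""
    base = f"/legs/{leg_id}/record"  # path prefix is an assumption
    status = post(f"{base}/pause").get("status", "")
    # If someone else paused first, leave resuming to them.
    paused_by_us = status != "already_paused"
    try:
        yield status
    finally:
        # Runs even if the block raises, so audio is not silenced forever.
        if paused_by_us:
            post(f"{base}/resume")
```

Skipping the resume when the recording was already paused is a design choice of this sketch; a caller that always wants recording restored can call resume unconditionally, since a redundant call only returns `not_paused`.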
Start real-time speech-to-text on a leg
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Description | |
|---|---|---|---|
language | string | required | Language code (e.g. "en", "es") |
partial | boolean | optional | Emit partial (non-final) transcripts. Default: false |
provider | enum | optional | STT provider: "elevenlabs" (default) or "deepgram". Values: elevenlabs, deepgram |
api_key | string | optional | API key override (falls back to ELEVENLABS_API_KEY or DEEPGRAM_API_KEY env var depending on provider) |
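The `api_key` fallback described above amounts to a small lookup keyed by provider. This sketch mirrors the documented behaviour for illustration only; it is not the server's code, and `resolve_stt_api_key` is a hypothetical name.

```python
import os
from typing import Mapping, Optional

# Env var consulted for each provider, per the api_key row above.
ENV_VAR_BY_PROVIDER = {
    "elevenlabs": "ELEVENLABS_API_KEY",
    "deepgram": "DEEPGRAM_API_KEY",
}


def resolve_stt_api_key(
    provider: str = "elevenlabs",
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the explicit key if given, else the provider's env var value."""
    if provider not in ENV_VAR_BY_PROVIDER:
        raise ValueError(f"unknown STT provider: {provider!r}")
    if api_key:
        return api_key  # a request-level override wins
    env = os.environ if env is None else env
    return env.get(ENV_VAR_BY_PROVIDER[provider])
```

Passing `env` explicitly keeps the function easy to test; in normal use it reads `os.environ`.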
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Stop speech-to-text on a leg
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Attach an ElevenLabs ConvAI agent to a leg
Bridges audio bidirectionally with an ElevenLabs conversational AI agent. Standalone legs use direct audio; legs in a room use mixer taps.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Description | |
|---|---|---|---|
agent_id | string | required | ElevenLabs agent ID |
first_message | string | optional | Override the agent's first message |
language | string | optional | Language code (e.g. "en", "es") |
dynamic_variables | object | optional | Key-value pairs passed to the agent as dynamic variables |
api_key | string | optional | API key override (falls back to ELEVENLABS_API_KEY env var) |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Attach a VAPI agent to a leg
Bridges audio bidirectionally with a VAPI conversational AI agent. Standalone legs use direct audio; legs in a room use mixer taps.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Description | |
|---|---|---|---|
assistant_id | string | required | VAPI assistant ID |
first_message | string | optional | Override the agent's first message |
variable_values | object | optional | Key-value pairs passed as VAPI variable values (assistantOverrides.variableValues) |
api_key | string | optional | API key override (falls back to VAPI_API_KEY env var) |
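The `variable_values` row names the VAPI field the values end up in. As a rough illustration of that mapping, the sketch below builds the request body for this endpoint and shows the nested shape the values would take on the VAPI side. Both function names are hypothetical, and the translation itself presumably happens on the server, not in client code.

```python
from typing import Any, Dict, Optional


def build_vapi_attach_body(
    assistant_id: str,
    first_message: Optional[str] = None,
    variable_values: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Request body for attaching a VAPI agent (field names from the table)."""
    if not assistant_id:
        raise ValueError("assistant_id is required")
    body: Dict[str, Any] = {"assistant_id": assistant_id}
    if first_message is not None:
        body["first_message"] = first_message
    if variable_values:
        body["variable_values"] = dict(variable_values)
    return body


def to_vapi_overrides(variable_values: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Nest the values under assistantOverrides.variableValues."""
    if not variable_values:
        return {}
    return {"assistantOverrides": {"variableValues": dict(variable_values)}}
```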
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Attach a Pipecat bot to a leg
Bridges audio bidirectionally with a self-hosted Pipecat bot via WebSocket. Standalone legs use direct audio; legs in a room use mixer taps.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Description | |
|---|---|---|---|
websocket_url | string(uri) | required | WebSocket URL of the Pipecat bot (e.g. ws://my-bot:8765) |
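Since `websocket_url` is the only field, a quick client-side sanity check catches most mistakes early. This is a minimal sketch that assumes the server expects a `ws://` or `wss://` scheme, as the example value suggests; `validate_pipecat_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def validate_pipecat_url(websocket_url: str) -> str:
    """Return the URL unchanged if it looks like a WebSocket endpoint."""
    parsed = urlparse(websocket_url)
    if parsed.scheme not in ("ws", "wss"):
        raise ValueError("websocket_url must use ws:// or wss://")
    if not parsed.hostname:
        raise ValueError("websocket_url must include a host")
    return websocket_url
```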
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Attach a Deepgram Voice Agent to a leg
Bridges audio bidirectionally with a Deepgram Voice Agent. Standalone legs use direct audio; legs in a room use mixer taps.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Description | |
|---|---|---|---|
settings | object | optional | Full Deepgram agent settings object (agent.listen, agent.think, agent.speak, etc.). When omitted, sensible defaults are used (nova-3 STT, gpt-4o-mini LLM, aura-2-asteria-en TTS). |
greeting | string | optional | Agent greeting message |
language | string | optional | Language code (e.g. "en", "es") |
api_key | string | optional | API key override (falls back to DEEPGRAM_API_KEY env var) |
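When the defaults are not enough, a `settings` object can be passed explicitly. The sketch below spells out the documented defaults (nova-3, gpt-4o-mini, aura-2-asteria-en) and adds a prompt. The nested `provider` shape and the `prompt` key are assumptions about Deepgram's Voice Agent settings schema and should be verified against Deepgram's own documentation; `build_deepgram_attach_body` is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def build_deepgram_attach_body(
    prompt: str,
    greeting: Optional[str] = None,
    language: Optional[str] = None,
) -> Dict[str, Any]:
    """Request body with an explicit settings object (schema is assumed)."""
    settings: Dict[str, Any] = {
        "agent": {
            # Speech-to-text stage
            "listen": {"provider": {"type": "deepgram", "model": "nova-3"}},
            # LLM stage; the prompt placement is an assumption
            "think": {
                "provider": {"type": "open_ai", "model": "gpt-4o-mini"},
                "prompt": prompt,
            },
            # Text-to-speech stage
            "speak": {
                "provider": {"type": "deepgram", "model": "aura-2-asteria-en"}
            },
        }
    }
    body: Dict[str, Any] = {"settings": settings}
    if greeting is not None:
        body["greeting"] = greeting
    if language is not None:
        body["language"] = language
    return body
```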
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Inject a message into a running agent session on a leg
Sends a context message or instruction to the running agent. Supported by Deepgram (InjectAgentMessage), Pipecat (TextFrame), and VAPI (control URL). Returns 501 for ElevenLabs.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Description | |
|---|---|---|---|
message | string | required | Context or instruction to inject into the running agent session |
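A caller that tracks which agent type it attached can avoid a round trip that is known to fail. The following sketch encodes the support matrix from the description above; the provider keys (`"deepgram"`, `"pipecat"`, `"vapi"`, `"elevenlabs"`) are hypothetical labels chosen for this example, and `build_inject_body` is a hypothetical helper.

```python
from typing import Dict

# How each supported provider receives the message, per the description.
INJECT_MECHANISM = {
    "deepgram": "InjectAgentMessage",
    "pipecat": "TextFrame",
    "vapi": "control URL",
}


def build_inject_body(provider: str, message: str) -> Dict[str, str]:
    """Validate locally, then return the request body for the inject call."""
    if provider == "elevenlabs":
        # The endpoint is documented to answer 501 for ElevenLabs agents.
        raise NotImplementedError("message injection is not supported for ElevenLabs")
    if provider not in INJECT_MECHANISM:
        raise ValueError(f"unknown agent provider: {provider!r}")
    if not message.strip():
        raise ValueError("message must not be empty")
    return {"message": message}
```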
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Detach the agent from a leg
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Start answering machine detection on a connected leg
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Description | |
|---|---|---|---|
initial_silence_timeout | integer | optional | Max milliseconds of silence before declaring no_speech. Default: 2500 |
greeting_duration | integer | optional | Speech duration threshold (ms) above which answerer is classified as machine. Default: 1500 |
after_greeting_silence | integer | optional | Silence duration (ms) after initial speech to declare human. Default: 800 |
total_analysis_time | integer | optional | Max analysis window in milliseconds. Default: 5000 |
minimum_word_length | integer | optional | Minimum speech burst duration (ms) to count as a word. Default: 100 |
beep_timeout | integer | optional | Max time (ms) to wait for the voicemail beep after machine detection. 0 or omitted = disabled. Default: 0 |
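To build intuition for how these thresholds interact, here is a deliberately simplified classifier over a pre-segmented audio timeline. It is a toy model written from the parameter descriptions alone, not the server's detector: the `(is_speech, duration_ms)` input format, the `classify` function, and the `"not_sure"` result for an inconclusive window are all assumptions of this sketch, and `beep_timeout` is not modelled.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class AmdParams:
    initial_silence_timeout: int = 2500
    greeting_duration: int = 1500
    after_greeting_silence: int = 800
    total_analysis_time: int = 5000
    minimum_word_length: int = 100


def classify(
    segments: Iterable[Tuple[bool, int]], params: Optional[AmdParams] = None
) -> str:
    """Classify a timeline of (is_speech, duration_ms) segments."""
    p = params or AmdParams()
    elapsed = 0
    speech_ms = 0
    heard_speech = False
    for is_speech, duration in segments:
        # Never look past the analysis window.
        duration = min(duration, p.total_analysis_time - elapsed)
        if duration <= 0:
            break
        elapsed += duration
        if is_speech:
            if duration < p.minimum_word_length:
                continue  # burst too short to count as a word
            heard_speech = True
            speech_ms += duration
            if speech_ms > p.greeting_duration:
                return "machine"  # long greeting: likely a recording
        elif not heard_speech:
            if elapsed >= p.initial_silence_timeout:
                return "no_speech"
        elif duration >= p.after_greeting_silence:
            return "human"  # short greeting followed by a pause
    return "not_sure"  # hypothetical label for an inconclusive window
```

With the defaults, a 600 ms "Hello?" followed by 900 ms of silence is classified as human, while a 2-second uninterrupted greeting is classified as machine.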
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
status | string | required |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
Change a leg's routing role
Updates the leg's routing role and, if the leg is currently in a room, recomputes the room's matrix-derived allow-sets atomically (single mixer-mutex acquisition). The next mix tick (≤ 20 ms) reflects the change.
Parameters
| Name | In | Type | Description | |
|---|---|---|---|---|
id | path | string | required | Leg ID |
Request Body
| Field | Type | Description | |
|---|---|---|---|
role | string | required | New routing role for the leg. The room's routing matrix decides which other legs this leg hears and is heard by based on roles. Pass an empty string to clear the role (full mesh). |
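The effect of a role change on who hears whom can be modelled with a small function. This sketch assumes a hypothetical matrix shape, a mapping from listener role to the set of speaker roles it hears, and assumes that an unroled leg hears and is heard by everyone; the real matrix format and the mixer's locking are not shown in this section.

```python
from typing import Dict, Set


def compute_allow_sets(
    leg_roles: Dict[str, str], matrix: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """For each leg ID, return the set of other leg IDs it hears.

    matrix[listener_role] is the set of speaker roles that role hears
    (hypothetical shape). An empty role string means unroled.
    """
    allow: Dict[str, Set[str]] = {}
    for listener, listener_role in leg_roles.items():
        heard: Set[str] = set()
        for speaker, speaker_role in leg_roles.items():
            if speaker == listener:
                continue  # a leg never hears itself
            unroled = not listener_role or not speaker_role
            if unroled or speaker_role in matrix.get(listener_role, set()):
                heard.add(speaker)
        allow[listener] = heard
    return allow
```

Recomputing every allow-set from scratch after a role change, as this function does, keeps the result consistent with the matrix at the cost of a little work per update.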
Responses
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
id | string | required | Unique leg identifier (UUID) |
type | enum | required | Leg type Values: sip_inbound, sip_outbound, webrtc, whatsapp_in, whatsapp_out, websocket_in, websocket_out |
state | enum | required | Leg state Values: ringing, early_media, connected, held, hung_up |
room_id | string | optional | Room ID if the leg is in a room, empty otherwise |
muted | boolean | required | Whether the leg is muted (cannot be heard by others) |
deaf | boolean | required | Whether the leg is deaf (cannot hear others) |
accept_dtmf | boolean | required | Whether the leg receives DTMF digits broadcast from other legs in the same room. Defaults to true. |
held | boolean | required | Whether the call is on hold (SIP legs only) |
role | string | optional | Routing role used by the room's audio routing matrix (e.g. "customer", "agent", "supervisor"). Empty string means unroled (full mesh). |
app_id | string | optional | Application identifier for event stream filtering. |
sip_headers | object | optional | Deprecated: X-* headers from the inbound INVITE. Only present on sip_inbound legs. Use `headers` in new code; it carries the same map and also surfaces handshake headers for websocket legs. |
headers | object | optional | Custom protocol headers exposed by the leg's transport — X-/P- headers from a SIP INVITE, the WebSocket upgrade request, or supplied at outbound dial time. |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |
| Field | Type | Description | |
|---|---|---|---|
instance_id | string | optional | Instance identifier |
error | string | required | Error message |