Schemas

Last updated May 26, 2026

AMDParams

Field	Type		Description
`initial_silence_timeout`	integer	optional	Max milliseconds of silence before declaring no_speech. Default: `2500`.
`greeting_duration`	integer	optional	Speech duration threshold (ms) above which answerer is classified as machine. Default: `1500`.
`after_greeting_silence`	integer	optional	Silence duration (ms) after initial speech to declare human. Default: `800`.
`total_analysis_time`	integer	optional	Max analysis window in milliseconds. Default: `5000`.
`minimum_word_length`	integer	optional	Minimum speech burst duration (ms) to count as a word. Default: `100`.
`beep_timeout`	integer	optional	Max time (ms) to wait for the voicemail beep after machine detection. 0 or omitted = disabled. Default: `0`.
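
The sketch below shows one way a client might assemble the `amd` object for a CreateLegRequest. `build_amd_params` is a hypothetical helper, not part of the API, and the non-negative check is an assumption; the default values are copied from the table above.

```python
import json

# Defaults copied from the AMDParams table (all values in milliseconds).
AMD_DEFAULTS = {
    "initial_silence_timeout": 2500,
    "greeting_duration": 1500,
    "after_greeting_silence": 800,
    "total_analysis_time": 5000,
    "minimum_word_length": 100,
    "beep_timeout": 0,  # 0 = beep detection disabled
}


def build_amd_params(**overrides):
    """Return only the overridden AMD fields (hypothetical client-side helper)."""
    params = {}
    for name, value in overrides.items():
        if name not in AMD_DEFAULTS:
            raise ValueError(f"unknown AMD field: {name}")
        # Assumption: negative durations are never meaningful.
        if not isinstance(value, int) or isinstance(value, bool) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer (ms)")
        params[name] = value
    return params


# An empty object enables AMD with the defaults; omitting "amd" disables it.
leg_with_default_amd = {"type": "sip", "to": "sip:alice@example.com", "amd": build_amd_params()}
leg_with_beep_wait = {
    "type": "sip",
    "to": "sip:alice@example.com",
    "amd": build_amd_params(beep_timeout=3000),
}
print(json.dumps(leg_with_beep_wait))
```

Sending only the overridden fields leaves the remaining thresholds at the server's defaults.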

AddLegRequest

Field	Type		Description
`leg_id`	string	required	ID of the leg to add
`mute`	boolean	optional	If set, apply this mute state to the leg atomically before it joins the mixer (no race where un-muted audio enters the mix). Omit to leave current state untouched (useful when moving between rooms).
`deaf`	boolean	optional	If set, apply this deaf state to the leg atomically before it joins the mixer. Omit to leave current state untouched.
`accept_dtmf`	boolean	optional	If set, control whether this leg receives DTMF digits broadcast from other legs in the same room. Omit to leave current state untouched (default for new legs is true).
`role`	string	optional	If set, apply this routing role to the leg atomically before it joins the mixer. The room's routing matrix (see PUT /v1/rooms/{id}/routing) decides which other legs this leg hears and is heard by based on roles. Pass "" to clear the role (full mesh). Omit to leave the current role untouched.
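
Because omitting a field and sending an explicit value mean different things here, a client needs a way to tell "not set" apart from `false` or `""`. The following is a minimal sketch using a sentinel; `add_leg_body` and the leg ID are hypothetical.

```python
import json

_UNSET = object()  # sentinel: leave the leg's current state untouched


def add_leg_body(leg_id, mute=_UNSET, deaf=_UNSET, accept_dtmf=_UNSET, role=_UNSET):
    """Build an AddLegRequest body containing only the fields the caller set."""
    body = {"leg_id": leg_id}
    optional = {"mute": mute, "deaf": deaf, "accept_dtmf": accept_dtmf, "role": role}
    for name, value in optional.items():
        if value is not _UNSET:
            body[name] = value
    return body


move_only = add_leg_body("leg-123")                             # keep mute/deaf/role as they are
join_muted = add_leg_body("leg-123", mute=True, role="supervisor")
clear_role = add_leg_body("leg-123", role="")                   # "" clears the role (full mesh)
print(json.dumps(clear_role))
```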

AgentMessageRequest

Field	Type		Description
`message`	string	required	Context or instruction to inject into the running agent session

AnswerLegRequest

Field	Type		Description
`speech_detection`	boolean	optional	If true, emit speaking.started and speaking.stopped events for this leg. If false, suppress them. Omit to use the server default (SPEECH_DETECTION_ENABLED env var, default false).
`codec`	enum	optional	Explicit codec for the answer SDP. Must appear in the remote offer's offered_codecs list. Omit to use the server's default preference order. Values: `PCMU`, `PCMA`, `G722`, `opus`

BridgeView

Field	Type		Description
`id`	string	required	Bridge identifier
`room_id`	string	required	The peer room joined to the room in the path
`direction`	enum	required	Audio flow relative to the room in the path: bidirectional, send, receive, or none. Values: `bidirectional`, `send`, `receive`, `none`
`sample_rate`	integer	required	Shared mixer sample rate in Hz (both rooms must match).

CreateLegRequest

Field	Type		Description
`type`	enum	required	Leg type. Values: `sip`, `whatsapp`, `websocket`
`to`	string	optional	Destination. For sip legs, a SIP URI (e.g. "sip:alice@example.com"). For whatsapp legs, an E.164 phone number (with or without '+').
`uri`	string	optional	Deprecated alias for `to` (sip legs only). Prefer `to`.
`from`	string	optional	Caller ID — sets the user part of the SIP From header (e.g. "+15551234567", "alice")
`privacy`	string	optional	SIP Privacy header value (e.g. "id", "none")
`ring_timeout`	integer	optional	Seconds to wait for answer; 0 = no timeout. Default: `0`.
`max_duration`	integer	optional	Maximum call duration in seconds after connect. Automatically hung up when reached. 0 or omitted = no limit. Default: `0`.
`codecs`	array[enum]	optional	Codec preference order (sip legs only)
`headers`	object	optional	Custom headers to include in the outbound INVITE (sip/whatsapp) or the WebSocket upgrade request (websocket)
`room_id`	string	optional	Room ID to auto-add the leg to once media is ready (early_media or connected). If the room does not exist, it is automatically created.
`auth`	any	optional	Digest auth credentials. Required for whatsapp legs (Meta-issued password; username defaults to `from` with '+' stripped). Optional for sip legs (sipgo retries on 401/407 challenge).
`webhook_url`	string(uri)	optional	Route all events for this leg exclusively to this URL instead of global webhooks.
`webhook_secret`	string	optional	HMAC-SHA256 signing secret for the per-leg webhook.
`amd`	any	optional	Enable Answering Machine Detection on outbound calls. Include the object (even empty) to enable with defaults; omit to disable.
`accept_dtmf`	boolean	optional	If false, this leg will not receive DTMF digits broadcast from other legs in the same room. Default: `true`.
`app_id`	string	optional	Application identifier. Carried through to all events for this leg. Use to filter the WebSocket event stream by app.
`speech_detection`	boolean	optional	If true, emit speaking.started and speaking.stopped events for this leg. If false, suppress them. Omit to use the server default (SPEECH_DETECTION_ENABLED env var, default false).
`rtt`	boolean	optional	For sip legs: offer Real-Time Text (ITU-T T.140 over RTP per RFC 4103) alongside audio. For websocket legs: enable the bidirectional text-message channel. Default: `false`.
`url`	string(uri)	optional	WebSocket target URL (ws:// or wss://) for outbound websocket legs. Required when type=websocket.
`sample_rate`	enum	optional	PCM sample rate for websocket legs. The room's mixer automatically resamples between this and the room rate. Default: `16000`. Values: `8000`, `16000`, `24000`, `48000`
`wire_format`	enum	optional	Audio framing for websocket legs. `binary` ships raw PCM as WebSocket binary frames; `json_base64` wraps PCM as `{"type":"audio","audio":"<base64>"}` text frames (browser-friendly). Default: `binary`. Values: `binary`, `json_base64`
`sample_format`	enum	optional	On-the-wire PCM sample encoding for websocket legs. v1 only supports `s16le`. Default: `s16le`. Values: `s16le`
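
To illustrate the `json_base64` wire format, the sketch below packs signed 16-bit little-endian samples and wraps them in the text frame shape shown in the table. The helper names and the target URL are hypothetical, and the sketch assumes each frame carries nothing besides the `type` and `audio` keys.

```python
import base64
import json
import struct


def encode_audio_frame(samples):
    """Pack int16 samples as s16le and wrap them in a json_base64 text frame."""
    pcm = struct.pack(f"<{len(samples)}h", *samples)  # "<h" = little-endian int16
    return json.dumps({"type": "audio", "audio": base64.b64encode(pcm).decode("ascii")})


def decode_audio_frame(frame):
    """Inverse of encode_audio_frame: return the list of int16 samples."""
    message = json.loads(frame)
    if message.get("type") != "audio":
        raise ValueError("not an audio frame")
    pcm = base64.b64decode(message["audio"])
    return list(struct.unpack(f"<{len(pcm) // 2}h", pcm))


# CreateLegRequest body for an outbound websocket leg (url is required for this type).
websocket_leg = {
    "type": "websocket",
    "url": "wss://bot.example.com/audio",  # hypothetical target
    "sample_rate": 16000,
    "wire_format": "json_base64",
    "sample_format": "s16le",
}

frame = encode_audio_frame([0, 1, -1, 32767, -32768])
print(frame)
print(decode_audio_frame(frame))
```

With `wire_format` set to `binary`, the same packed bytes would be sent directly as a binary frame with no JSON wrapper.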

CreateRoomBridgeRequest

Field	Type		Description
`id`	string	optional	Custom bridge ID (auto-generated UUID if omitted)
`room_id`	string	required	The other room to join. Must use the same sample rate as the room in the path.
`direction`	enum	optional	Audio flow relative to the room in the path: bidirectional (both hear each other), send (path room → other only), receive (other → path room only), none (allocated but silent). Default: `bidirectional`. Values: `bidirectional`, `send`, `receive`, `none`
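
The four `direction` values can be read as a pair of one-way switches. The sketch below encodes that reading and checks the sample-rate constraint before building a request body; `bridge_body` and the room IDs are hypothetical, and the check is done client-side only for illustration.

```python
# (path room -> other room, other room -> path room), per the direction descriptions.
DIRECTION_FLOW = {
    "bidirectional": (True, True),
    "send": (True, False),
    "receive": (False, True),
    "none": (False, False),
}


def bridge_body(path_room_rate, other_room_id, other_room_rate, direction="bidirectional"):
    """Build a CreateRoomBridgeRequest body after checking the documented constraints."""
    if direction not in DIRECTION_FLOW:
        raise ValueError(f"unknown direction: {direction}")
    if path_room_rate != other_room_rate:
        raise ValueError("both rooms must use the same sample rate")
    return {"room_id": other_room_id, "direction": direction}


body = bridge_body(16000, "room-b", 16000, direction="send")
path_to_other, other_to_path = DIRECTION_FLOW[body["direction"]]
print(body, path_to_other, other_to_path)
```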

DTMFRequest

Field	Type		Description
`digits`	string	required	DTMF digits to send (0-9, *, #)
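
A client can reject malformed digit strings before sending them. The sketch below accepts only the characters listed in the table (0-9, *, #); `dtmf_body` is a hypothetical helper.

```python
import re

_DTMF_RE = re.compile(r"[0-9*#]+")


def dtmf_body(digits):
    """Build a DTMFRequest body, accepting only the documented characters."""
    if not _DTMF_RE.fullmatch(digits):
        raise ValueError(f"invalid DTMF digits: {digits!r}")
    return {"digits": digits}


print(dtmf_body("1234#"))
```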

DeepgramAgentRequest

Field	Type		Description
`settings`	object	optional	Full Deepgram agent settings object (agent.listen, agent.think, agent.speak, etc.). When omitted, sensible defaults are used (nova-3 STT, gpt-4o-mini LLM, aura-2-asteria-en TTS).
`greeting`	string	optional	Agent greeting message
`language`	string	optional	Language code (e.g. "en", "es")
`api_key`	string	optional	API key override (falls back to DEEPGRAM_API_KEY env var)

DeleteLegRequest

Field	Type		Description
`reason`	enum	optional	Disconnect reason. Only honored for unanswered SIP inbound legs (state `ringing` or `early_media`); on connected legs the body is ignored and the leg is hung up with the legacy `api_hangup` reason. The value flows through to `leg.disconnected`'s `cdr.reason` and selects the SIP final response: `busy`→486, `declined`/`rejected`→603, `unavailable`→480, `not_found`→404, `forbidden`→403, `server_error`→500. Values: `busy`, `declined`, `rejected`, `unavailable`, `not_found`, `forbidden`, `server_error`
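
The reason-to-response mapping above can be kept as a lookup table on the client side, for example to predict what the remote caller will see. This is a sketch; `expected_sip_status` is a hypothetical helper that restates the table's rule that the reason is only honored for unanswered inbound SIP legs.

```python
# Copied from the DeleteLegRequest description.
REASON_TO_SIP_STATUS = {
    "busy": 486,
    "declined": 603,
    "rejected": 603,
    "unavailable": 480,
    "not_found": 404,
    "forbidden": 403,
    "server_error": 500,
}
UNANSWERED_STATES = {"ringing", "early_media"}


def expected_sip_status(leg_type, leg_state, reason):
    """Return the documented SIP final response, or None when the reason is ignored."""
    if reason not in REASON_TO_SIP_STATUS:
        raise ValueError(f"unknown reason: {reason}")
    if leg_type != "sip_inbound" or leg_state not in UNANSWERED_STATES:
        return None  # body ignored; the leg is hung up with the api_hangup reason
    return REASON_TO_SIP_STATUS[reason]


print(expected_sip_status("sip_inbound", "ringing", "busy"))
print(expected_sip_status("sip_inbound", "connected", "busy"))
```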

EarlyMediaLegRequest

Field	Type		Description
`codec`	enum	optional	Explicit codec for the 183 Session Progress SDP. Must appear in the remote offer's offered_codecs list. Omit to use the server's default preference order. Values: `PCMU`, `PCMA`, `G722`, `opus`

ElevenLabsAgentRequest

Field	Type		Description
`agent_id`	string	required	ElevenLabs agent ID
`first_message`	string	optional	Override the agent's first message
`language`	string	optional	Language code (e.g. "en", "es")
`dynamic_variables`	object	optional	Key-value pairs passed to the agent as dynamic variables
`api_key`	string	optional	API key override (falls back to ELEVENLABS_API_KEY env var)

Error

Field	Type		Description
`instance_id`	string	optional	Instance identifier
`error`	string	required	Error message

ICECandidateInit

Field	Type		Description
`candidate`	string	required	ICE candidate string
`sdpMid`	string	optional	Media stream identification tag
`sdpMLineIndex`	integer	optional	Index of the media description

Leg

Field	Type		Description
`instance_id`	string	optional	Instance identifier
`id`	string	required	Unique leg identifier (UUID)
`type`	enum	required	Leg type. Values: `sip_inbound`, `sip_outbound`, `webrtc`, `whatsapp_in`, `whatsapp_out`, `websocket_in`, `websocket_out`
`state`	enum	required	Leg state. Values: `ringing`, `early_media`, `connected`, `held`, `hung_up`
`room_id`	string	optional	Room ID if the leg is in a room, empty otherwise
`muted`	boolean	required	Whether the leg is muted (cannot be heard by others)
`deaf`	boolean	required	Whether the leg is deaf (cannot hear others)
`accept_dtmf`	boolean	required	Whether the leg receives DTMF digits broadcast from other legs in the same room. Defaults to true.
`held`	boolean	required	Whether the call is on hold (SIP legs only)
`role`	string	optional	Routing role used by the room's audio routing matrix (e.g. "customer", "agent", "supervisor"). Empty string means unroled (full mesh).
`app_id`	string	optional	Application identifier for event stream filtering.
`sip_headers`	object	optional	Deprecated: X-* headers from the inbound INVITE. Only present on sip_inbound legs. Use `headers` for new code; it carries the same map plus surfaces handshake headers for websocket legs.
`headers`	object	optional	Custom protocol headers exposed by the leg's transport — X-/P- headers from a SIP INVITE, the WebSocket upgrade request, or supplied at outbound dial time.

PipecatAgentRequest

Field	Type		Description
`websocket_url`	string(uri)	required	WebSocket URL of the Pipecat bot (e.g. ws://my-bot:8765)

PlaybackRequest

Field	Type		Description
`url`	string(uri)	optional	URL of the audio file (mutually exclusive with `tone`).
`tone`	string	optional	Built-in telephone tone name (mutually exclusive with `url`). Format: {country}_{type} or bare {type} (defaults to US). Types: ringback, busy, dial, congestion. Countries: us, gb, de, fr, au, jp, it, in, br, pl, ru. Examples: us_ringback, gb_busy, dial.
`mime_type`	string	optional	MIME type (e.g. audio/wav). Required when using `url`.
`repeat`	integer	optional	Number of times to repeat playback (`url` only). Default: `0`.
`volume`	integer	optional	Volume adjustment in dB (-8 to 8). Default: `0`.
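
The tone name format and the url/tone exclusivity can be checked before a request is sent. The sketch below is one possible client-side validation; both helpers are hypothetical, and it assumes that exactly one of `url` or `tone` must be present.

```python
TONE_TYPES = {"ringback", "busy", "dial", "congestion"}
TONE_COUNTRIES = {"us", "gb", "de", "fr", "au", "jp", "it", "in", "br", "pl", "ru"}


def parse_tone(name):
    """Split a tone name into (country, type); a bare type defaults to 'us'."""
    country, sep, tone_type = name.partition("_")
    if not sep:  # bare {type}, e.g. "dial"
        country, tone_type = "us", name
    if country not in TONE_COUNTRIES or tone_type not in TONE_TYPES:
        raise ValueError(f"unrecognized tone name: {name!r}")
    return country, tone_type


def playback_body(url=None, tone=None, mime_type=None, repeat=0, volume=0):
    """Build a PlaybackRequest body for either a URL or a built-in tone."""
    if (url is None) == (tone is None):
        raise ValueError("set exactly one of url or tone")  # assumption, see lead-in
    if not -8 <= volume <= 8:
        raise ValueError("volume must be between -8 and 8")
    if tone is not None:
        parse_tone(tone)  # validate only; the name is sent as given
        return {"tone": tone, "volume": volume}
    if mime_type is None:
        raise ValueError("mime_type is required when using url")
    return {"url": url, "mime_type": mime_type, "repeat": repeat, "volume": volume}


print(parse_tone("gb_busy"))
print(playback_body(tone="dial"))
```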

RTTRequest

Field	Type		Description
`text`	string	required	UTF-8 text to send. May be one or more characters and may include T.140 control codes (e.g. backspace U+0008, CR/LF).

RecordingRequest

Field	Type		Description
`storage`	enum	optional	Where to store the recording: `file` (default) writes to local disk; `s3` uploads to S3 after recording stops. Values: `file`, `s3`
`multi_channel`	boolean	optional	When true, record each participant to a separate mono WAV file in addition to the full mix. Only applies to room recordings. Default: `false`.
`s3_bucket`	string	optional	S3 bucket name. Overrides S3_BUCKET env var. Required if the env var is not set.
`s3_region`	string	optional	AWS region. Overrides S3_REGION env var. Default us-east-1.
`s3_endpoint`	string	optional	Custom S3 endpoint (MinIO, etc.). Overrides S3_ENDPOINT env var.
`s3_prefix`	string	optional	Key prefix (e.g. recordings/). Overrides S3_PREFIX env var.
`s3_access_key`	string	optional	AWS access key ID. Overrides default credential chain.
`s3_secret_key`	string	optional	AWS secret access key. Must be set together with s3_access_key.
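
The S3 fields interact: the two credential fields must be supplied together, and each `s3_*` field overrides an environment variable or the default credential chain. Below is a minimal sketch of a body builder that enforces the pairing rule; `recording_body` is hypothetical, as are the bucket and prefix values.

```python
S3_OPTIONS = {"bucket", "region", "endpoint", "prefix", "access_key", "secret_key"}


def recording_body(storage="file", multi_channel=False, **s3_options):
    """Build a RecordingRequest body; s3_options keys omit the 's3_' prefix."""
    if storage not in ("file", "s3"):
        raise ValueError(f"unknown storage: {storage}")
    unknown = set(s3_options) - S3_OPTIONS
    if unknown:
        raise ValueError(f"unknown S3 options: {sorted(unknown)}")
    # s3_secret_key must be set together with s3_access_key.
    if ("access_key" in s3_options) != ("secret_key" in s3_options):
        raise ValueError("s3_access_key and s3_secret_key must be set together")
    body = {"storage": storage, "multi_channel": multi_channel}
    for name, value in s3_options.items():
        body[f"s3_{name}"] = value
    return body


print(recording_body("s3", bucket="calls", prefix="recordings/"))
```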

Room

Field	Type		Description
`instance_id`	string	optional	Instance identifier
`id`	string	required	Room identifier
`app_id`	string	optional	Application identifier for event stream filtering.
`sample_rate`	integer	required	Mixer sample rate in Hz (8000, 16000, or 48000).
`participants`	array[object]	required	Legs currently in this room

RoomCreateRequest

Field	Type		Description
`id`	string	optional	Custom room ID (auto-generated UUID if omitted)
`webhook_url`	string(uri)	optional	Route all events for this room exclusively to this URL instead of global webhooks.
`webhook_secret`	string	optional	HMAC-SHA256 signing secret for the per-room webhook.
`app_id`	string	optional	Application identifier. Carried through to all events for this room. Use to filter the WebSocket event stream by app.
`sample_rate`	enum	optional	Mixer sample rate in Hz. Default: `16000`. Values: `8000`, `16000`, `48000`

RoomRoutingRequest

Field	Type		Description
`matrix`	object	required	Listener-role → list of allowed source roles. Omitted listener roles default to full mesh. Empty list = hears nothing.
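
The matrix semantics can be summarized in a few lines. The sketch below models only what the description states: a missing row means full mesh, and an empty list means the listener hears nothing. How unroled source legs are treated for a restricted listener is not specified here, so the sketch simply compares role names; the `recorder` role is made up for the example.

```python
def allowed_sources(matrix, listener_role):
    """Return the listener's allowed source roles, or None for full mesh."""
    if listener_role not in matrix:
        return None  # omitted listener roles default to full mesh
    return list(matrix[listener_role])


def hears(matrix, listener_role, source_role):
    """True if a leg with listener_role hears a leg with source_role."""
    sources = allowed_sources(matrix, listener_role)
    return True if sources is None else source_role in sources


matrix = {
    "customer": ["agent"],                # customer hears only the agent
    "agent": ["customer", "supervisor"],  # agent hears customer and supervisor
    "recorder": [],                       # empty list: hears nothing
}

print(hears(matrix, "customer", "supervisor"))  # supervisor is hidden from the customer
print(hears(matrix, "supervisor", "customer"))  # no row for supervisor: full mesh
```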

RoomRoutingUpdateRequest

Field	Type		Description
`updates`	array[object]	required	Per-listener-role row replacements applied as a single atomic update.

RoomRoutingView

Field	Type		Description
`matrix`	object	required	Listener-role → list of allowed source roles. Roles absent from the matrix default to full mesh.

RoutingRowUpdate

Field	Type		Description
`listener_role`	string	required	The role whose row is being replaced.
`sources`	array[string]	required	New list of allowed source roles for this listener role. Pass null to clear the row (full mesh).
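
A row update either replaces a listener role's source list or, with null, removes the row so that role falls back to full mesh. The sketch below mirrors that on a local copy of the matrix; `apply_routing_updates` is a hypothetical helper, and JSON null is represented as Python `None`.

```python
import copy


def apply_routing_updates(matrix, updates):
    """Return a new matrix with each listed row replaced; sources=None clears the row."""
    new_matrix = copy.deepcopy(matrix)  # build the result aside, then swap it in as a whole
    for row in updates:
        role = row["listener_role"]
        if row["sources"] is None:
            new_matrix.pop(role, None)  # cleared row: role reverts to full mesh
        else:
            new_matrix[role] = list(row["sources"])
    return new_matrix


before = {"customer": ["agent"], "agent": ["customer", "supervisor"]}
updates = [
    {"listener_role": "customer", "sources": None},  # clear: full mesh again
    {"listener_role": "supervisor", "sources": ["agent", "customer"]},
]
after = apply_routing_updates(before, updates)
print(after)
```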

SIPAuth

Field	Type		Description
`username`	string	optional	Digest auth username. Optional for whatsapp legs (defaults to `from` with '+' stripped, per Meta's spec).
`password`	string	required	Digest auth password.
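
For whatsapp legs the username may be left out, in which case it is documented to default to `from` with the '+' stripped. The sketch below builds such a request and shows the resulting username; both helpers and all values are hypothetical.

```python
def whatsapp_leg_body(to, from_number, password, username=None):
    """Build a CreateLegRequest body for a whatsapp leg with digest auth."""
    auth = {"password": password}
    if username is not None:
        auth["username"] = username
    return {"type": "whatsapp", "to": to, "from": from_number, "auth": auth}


def effective_username(body):
    """Username documented to apply when auth.username is omitted."""
    return body["auth"].get("username", body["from"].lstrip("+"))


body = whatsapp_leg_body("+15557654321", "+15551234567", password="meta-issued-secret")
print(effective_username(body))
```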

STTRequest

Field	Type		Description
`language`	string	required	Language code (e.g. "en", "es")
`partial`	boolean	optional	Emit partial (non-final) transcripts. Default: `false`.
`provider`	enum	optional	STT provider: "elevenlabs" (default) or "deepgram". Values: `elevenlabs`, `deepgram`
`api_key`	string	optional	API key override (falls back to ELEVENLABS_API_KEY or DEEPGRAM_API_KEY env var depending on provider)

SetLegRoleRequest

Field	Type		Description
`role`	string	required	New routing role for the leg. The room's routing matrix decides which other legs this leg hears and is heard by based on roles. Pass an empty string to clear the role (full mesh).

StatusResponse

Field	Type		Description
`instance_id`	string	optional	Instance identifier
`status`	string	required

TTSRequest

Field	Type		Description
`text`	string	required	Text to synthesize
`voice`	string	required	Provider-specific voice identifier. ElevenLabs: voice name or ID. AWS Polly: voice ID (e.g. Joanna, Matthew). Google Cloud: voice name — either full format (e.g. en-US-Neural2-F) or short name for Gemini models (e.g. Achernar, Kore). Deepgram: model name (e.g. aura-2-asteria-en).
`model_id`	string	required	Provider-specific model/engine. ElevenLabs: model ID. AWS Polly: engine (standard, neural, long-form, generative; default neural). Google Cloud: model name (e.g. gemini-2.5-pro-tts, chirp3-hd).
`language`	string	optional	Language code (e.g. "en-US", "pl-pl"). Required for Google Gemini TTS voices that use short names (e.g. Achernar). Auto-extracted from full voice names like en-US-Neural2-F.
`prompt`	string	optional	Style/tone instruction for promptable voice models (Google Gemini TTS only). E.g. "Read aloud in a warm, welcoming tone."
`volume`	integer	optional	Volume adjustment in dB (-8 to 8). Default: `0`.
`provider`	enum	optional	TTS provider: "elevenlabs" (default), "aws", "google", or "deepgram". Values: `elevenlabs`, `aws`, `google`, `deepgram`
`api_key`	string	optional	ElevenLabs: API key override (falls back to ELEVENLABS_API_KEY env var). AWS: optional ACCESS_KEY:SECRET_KEY override (falls back to default AWS credential chain). Google Cloud: optional API key override (falls back to Application Default Credentials). Deepgram: API key override (falls back to DEEPGRAM_API_KEY env var).

TransferRequest

Field	Type		Description
`target`	string	required	SIP URI to transfer the call to (e.g. "sip:bob@example.com").
`replaces_leg_id`	string	optional	ID of an existing connected SIP leg whose dialog should be replaced (attended transfer). Omit for blind transfer.

UpdateRoomBridgeRequest

Field	Type		Description
`direction`	enum	required	New audio flow relative to the room in the path: bidirectional, send, receive, or none. Values: `bidirectional`, `send`, `receive`, `none`

VAPIAgentRequest

Field	Type		Description
`assistant_id`	string	required	VAPI assistant ID
`first_message`	string	optional	Override the agent's first message
`variable_values`	object	optional	Key-value pairs passed as VAPI variable values (assistantOverrides.variableValues)
`api_key`	string	optional	API key override (falls back to VAPI_API_KEY env var)

VolumeRequest

Field	Type		Description
`volume`	integer	required	Volume adjustment (-8 to 8, ~3dB per step, 0 = unchanged)
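
To get a feel for what a volume step means, the sketch below converts a step to decibels and to a linear amplitude factor. It assumes exactly 3 dB per step, whereas the table only says approximately; `volume_step_to_gain` is a hypothetical helper.

```python
def volume_step_to_gain(step):
    """Return (gain in dB, linear amplitude factor) for a volume step in [-8, 8]."""
    if not isinstance(step, int) or isinstance(step, bool) or not -8 <= step <= 8:
        raise ValueError("volume must be an integer between -8 and 8")
    gain_db = 3.0 * step                 # assumption: exactly 3 dB per step
    return gain_db, 10 ** (gain_db / 20)  # amplitude ratio for a dB value


for step in (-8, 0, 2, 8):
    db, factor = volume_step_to_gain(step)
    print(f"step {step:+d}: {db:+.0f} dB, amplitude x{factor:.3f}")
```

Under this assumption, two steps up roughly doubles the signal amplitude.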

WebRTCCandidatesResult

Field	Type		Description
`candidates`	array[object]	required
`done`	boolean	required

WebRTCOfferRequest

Field	Type		Description
`sdp`	string	required	SDP offer from the browser

WebRTCOfferResult

Field	Type		Description
`leg_id`	string	required
`sdp`	string	required

Event envelope delivered via HTTP POST to registered webhook URLs. Event-specific fields are flattened into the top-level object (there is no "data" wrapper). Requests include an X-Signature-256 header when a secret is configured.

Field	Type		Description
`type`	enum	required	Values: `leg.ringing`, `leg.early_media`, `leg.connected`, `leg.disconnected`, `leg.joined_room`, `leg.left_room`, `leg.muted`, `leg.unmuted`, `leg.deaf`, `leg.undeaf`, `leg.hold`, `leg.unhold`, `leg.command_failed`, `dtmf.received`, `rtt.received`, `speaking.started`, `speaking.stopped`, `playback.started`, `playback.finished`, `playback.error`, `tts.started`, `tts.finished`, `tts.error`, `recording.started`, `recording.finished`, `recording.paused`, `recording.resumed`, `leg.transfer_initiated`, `leg.transfer_requested`, `leg.transfer_progress`, `leg.transfer_completed`, `leg.transfer_failed`, `room.created`, `room.deleted`, `room.bridged`, `room.bridge_updated`, `room.unbridged`, `room.routing_changed`, `leg.role_changed`, `stt.text`, `agent.connected`, `agent.disconnected`, `agent.user_transcript`, `agent.agent_response`, `amd.result`, `amd.beep`
`timestamp`	string(date-time)	required
`instance_id`	string	optional	Instance identifier
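
A receiver would typically verify the signature and then dispatch on `type`. The sketch below is written under loud assumptions: the exact encoding of X-Signature-256 is not described here, so the code assumes a hex HMAC-SHA256 digest of the raw request body, optionally prefixed with `sha256=`. The function names, the secret, and the `digit` field in the sample event are hypothetical.

```python
import hashlib
import hmac
import json


def sign_body(secret, raw_body):
    """Hex HMAC-SHA256 of the raw body (assumed signature scheme)."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def verify_signature(secret, raw_body, header_value):
    """Constant-time check of the header; accepts an optional 'sha256=' prefix."""
    prefix = "sha256="
    if header_value.startswith(prefix):
        header_value = header_value[len(prefix):]
    expected = sign_body(secret, raw_body)
    return hmac.compare_digest(expected.encode("ascii"), header_value.encode("utf-8"))


def dispatch(raw_body, handlers):
    """Parse the flattened envelope and call the handler registered for its type."""
    event = json.loads(raw_body)
    handler = handlers.get(event["type"])
    return handler(event) if handler else None


raw = b'{"type": "dtmf.received", "timestamp": "2026-05-26T12:00:00Z", "digit": "5"}'
signature = sign_body("per-leg-secret", raw)  # what the sender would compute
handlers = {"dtmf.received": lambda event: f"DTMF at {event['timestamp']}"}

if verify_signature("per-leg-secret", raw, signature):
    print(dispatch(raw, handlers))
```

Verifying against the raw bytes, before any JSON parsing, avoids mismatches caused by re-serialization.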