Schemas

Last updated May 26, 2026

AMDParams

| Field | Type | Description |
| --- | --- | --- |
| initial_silence_timeout | integer, optional | Max milliseconds of silence before declaring no_speech. Default: 2500. |
| greeting_duration | integer, optional | Speech duration threshold (ms) above which the answerer is classified as a machine. Default: 1500. |
| after_greeting_silence | integer, optional | Silence duration (ms) after initial speech to declare human. Default: 800. |
| total_analysis_time | integer, optional | Max analysis window in milliseconds. Default: 5000. |
| minimum_word_length | integer, optional | Minimum speech burst duration (ms) to count as a word. Default: 100. |
| beep_timeout | integer, optional | Max time (ms) to wait for the voicemail beep after machine detection. 0 or omitted = disabled. Default: 0. |
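
As a rough illustration of how the AMDParams defaults above combine with caller overrides, here is a minimal client-side sketch. The helper name `effective_amd_params` is hypothetical and not part of the API; the server is assumed to apply the same per-field defaults on its own, so this only shows what an empty or partial object would resolve to.

```python
# Defaults copied from the AMDParams fields documented above.
AMD_DEFAULTS = {
    "initial_silence_timeout": 2500,
    "greeting_duration": 1500,
    "after_greeting_silence": 800,
    "total_analysis_time": 5000,
    "minimum_word_length": 100,
    "beep_timeout": 0,  # 0 means beep detection is disabled
}


def effective_amd_params(overrides=None):
    """Hypothetical helper: merge caller overrides onto the documented defaults."""
    overrides = overrides or {}
    unknown = set(overrides) - set(AMD_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown AMDParams fields: {sorted(unknown)}")
    for name, value in overrides.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an integer number of milliseconds")
    return {**AMD_DEFAULTS, **overrides}
```

Passing `{}` returns the defaults unchanged, which mirrors sending an empty `amd` object to enable detection with defaults.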

AddLegRequest

| Field | Type | Description |
| --- | --- | --- |
| leg_id | string, required | ID of the leg to add |
| mute | boolean, optional | If set, apply this mute state to the leg atomically before it joins the mixer (no race where un-muted audio enters the mix). Omit to leave current state untouched (useful when moving between rooms). |
| deaf | boolean, optional | If set, apply this deaf state to the leg atomically before it joins the mixer. Omit to leave current state untouched. |
| accept_dtmf | boolean, optional | If set, control whether this leg receives DTMF digits broadcast from other legs in the same room. Omit to leave current state untouched (default for new legs is true). |
| role | string, optional | If set, apply this routing role to the leg atomically before it joins the mixer. The room's routing matrix (see PUT /v1/rooms/{id}/routing) decides which other legs this leg hears and is heard by based on roles. Pass "" to clear the role (full mesh). Omit to leave the current role untouched. |

AgentMessageRequest

| Field | Type | Description |
| --- | --- | --- |
| message | string, required | Context or instruction to inject into the running agent session |

AnswerLegRequest

| Field | Type | Description |
| --- | --- | --- |
| speech_detection | boolean, optional | If true, emit speaking.started and speaking.stopped events for this leg. If false, suppress them. Omit to use the server default (SPEECH_DETECTION_ENABLED env var, default false). |
| codec | enum, optional | Explicit codec for the answer SDP. Must appear in the remote offer's offered_codecs list. Omit to use the server's default preference order. Values: PCMU, PCMA, G722, opus. |

BridgeView

| Field | Type | Description |
| --- | --- | --- |
| id | string, required | Bridge identifier |
| room_id | string, required | The peer room joined to the room in the path |
| direction | enum, required | Audio flow relative to the room in the path. Values: bidirectional, send, receive, none. |
| sample_rate | integer, required | Shared mixer sample rate in Hz (both rooms must match). |

CreateLegRequest

| Field | Type | Description |
| --- | --- | --- |
| type | enum, required | Leg type. Values: sip, whatsapp, websocket. |
| to | string, optional | Destination. For sip legs, a SIP URI (e.g. "sip:alice@example.com"). For whatsapp legs, an E.164 phone number (with or without '+'). |
| uri | string, optional | Deprecated alias for `to` (sip legs only). Prefer `to`. |
| from | string, optional | Caller ID: sets the user part of the SIP From header (e.g. "+15551234567", "alice"). |
| privacy | string, optional | SIP Privacy header value (e.g. "id", "none"). |
| ring_timeout | integer, optional | Seconds to wait for answer; 0 = no timeout. Default: 0. |
| max_duration | integer, optional | Maximum call duration in seconds after connect. The call is hung up automatically when the limit is reached. 0 or omitted = no limit. Default: 0. |
| codecs | array[enum], optional | Codec preference order (sip legs only). |
| headers | object, optional | Custom headers to include in the outbound INVITE (sip/whatsapp) or the WebSocket upgrade request (websocket). |
| room_id | string, optional | Room ID to auto-add the leg to once media is ready (early_media or connected). If the room does not exist, it is automatically created. |
| auth | any, optional | Digest auth credentials. Required for whatsapp legs (Meta-issued password; username defaults to `from` with '+' stripped). Optional for sip legs (sipgo retries on 401/407 challenge). |
| webhook_url | string(uri), optional | Route all events for this leg exclusively to this URL instead of global webhooks. |
| webhook_secret | string, optional | HMAC-SHA256 signing secret for the per-leg webhook. |
| amd | any, optional | Enable Answering Machine Detection on outbound calls. Include the object (even empty) to enable with defaults; omit to disable. |
| accept_dtmf | boolean, optional | If false, this leg will not receive DTMF digits broadcast from other legs in the same room. Default: true. |
| app_id | string, optional | Application identifier. Carried through to all events for this leg. Use to filter the WebSocket event stream by app. |
| speech_detection | boolean, optional | If true, emit speaking.started and speaking.stopped events for this leg. If false, suppress them. Omit to use the server default (SPEECH_DETECTION_ENABLED env var, default false). |
| rtt | boolean, optional | For sip legs: offer Real-Time Text (ITU-T T.140 over RTP per RFC 4103) alongside audio. For websocket legs: enable the bidirectional text-message channel. Default: false. |
| url | string(uri), optional | WebSocket target URL (ws:// or wss://) for outbound websocket legs. Required when type=websocket. |
| sample_rate | enum, optional | PCM sample rate for websocket legs. The room's mixer automatically resamples between this and the room rate. Values: 8000, 16000, 24000, 48000. Default: 16000. |
| wire_format | enum, optional | Audio framing for websocket legs. `binary` ships raw PCM as WebSocket binary frames; `json_base64` wraps PCM as `{"type":"audio","audio":"<base64>"}` text frames (browser-friendly). Values: binary, json_base64. Default: "binary". |
| sample_format | enum, optional | On-the-wire PCM sample encoding for websocket legs. v1 only supports `s16le`. Values: s16le. Default: "s16le". |
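
To make the `json_base64` wire format concrete, the following sketch packs signed 16-bit little-endian samples (`s16le`) into the `{"type":"audio","audio":"<base64>"}` text frame described above and unpacks it again. The function names are hypothetical, and details such as frame duration or extra JSON fields are not covered by this reference, so treat it as an illustration of the framing only.

```python
import base64
import json
import struct


def encode_json_base64_frame(samples):
    """Pack signed 16-bit samples as little-endian PCM and wrap them in a JSON text frame."""
    pcm = struct.pack(f"<{len(samples)}h", *samples)
    return json.dumps({"type": "audio", "audio": base64.b64encode(pcm).decode("ascii")})


def decode_json_base64_frame(frame):
    """Inverse of encode_json_base64_frame: return the list of 16-bit samples."""
    message = json.loads(frame)
    if message.get("type") != "audio":
        raise ValueError("not an audio frame")
    pcm = base64.b64decode(message["audio"])
    if len(pcm) % 2:
        raise ValueError("s16le payload must contain an even number of bytes")
    return list(struct.unpack(f"<{len(pcm) // 2}h", pcm))
```

With `wire_format` set to `binary`, the same packed `pcm` bytes would be sent directly as a WebSocket binary frame instead.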

CreateRoomBridgeRequest

| Field | Type | Description |
| --- | --- | --- |
| id | string, optional | Custom bridge ID (auto-generated UUID if omitted) |
| room_id | string, required | The other room to join. Must use the same sample rate as the room in the path. |
| direction | enum, optional | Audio flow relative to the room in the path: bidirectional (both hear each other), send (path room → other only), receive (other → path room only), none (allocated but silent). Values: bidirectional, send, receive, none. Default: "bidirectional". |

DTMFRequest

| Field | Type | Description |
| --- | --- | --- |
| digits | string, required | DTMF digits to send (0-9, *, #) |

DeepgramAgentRequest

| Field | Type | Description |
| --- | --- | --- |
| settings | object, optional | Full Deepgram agent settings object (agent.listen, agent.think, agent.speak, etc.). When omitted, defaults are used (nova-3 STT, gpt-4o-mini LLM, aura-2-asteria-en TTS). |
| greeting | string, optional | Agent greeting message |
| language | string, optional | Language code (e.g. "en", "es") |
| api_key | string, optional | API key override (falls back to DEEPGRAM_API_KEY env var) |

DeleteLegRequest

| Field | Type | Description |
| --- | --- | --- |
| reason | enum, optional | Disconnect reason. Only honored for unanswered SIP inbound legs (state `ringing` or `early_media`); on connected legs the body is ignored and the leg is hung up with the legacy `api_hangup` reason. The value flows through to `leg.disconnected`'s `cdr.reason` and selects the SIP final response: `busy`→486, `declined`/`rejected`→603, `unavailable`→480, `not_found`→404, `forbidden`→403, `server_error`→500. Values: busy, declined, rejected, unavailable, not_found, forbidden, server_error. |
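
The reason-to-response mapping described for `reason` can be written down as a small lookup. This is only a client-side restatement of the documented behavior; the helper names are hypothetical, and what the server does when `reason` is omitted on an unanswered leg is not stated here, so the sketch does not model that case.

```python
# SIP final response selected by each documented DeleteLegRequest reason.
SIP_STATUS_BY_REASON = {
    "busy": 486,
    "declined": 603,
    "rejected": 603,
    "unavailable": 480,
    "not_found": 404,
    "forbidden": 403,
    "server_error": 500,
}

# States in which an inbound SIP leg is still unanswered.
UNANSWERED_STATES = {"ringing", "early_media"}


def reason_is_honored(leg_type, state):
    """True when the request body's reason would be honored rather than ignored."""
    return leg_type == "sip_inbound" and state in UNANSWERED_STATES


def sip_status_for(reason):
    """Return the SIP final response code for a documented reason value."""
    try:
        return SIP_STATUS_BY_REASON[reason]
    except KeyError:
        raise ValueError(f"unsupported reason: {reason!r}") from None
```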

EarlyMediaLegRequest

| Field | Type | Description |
| --- | --- | --- |
| codec | enum, optional | Explicit codec for the 183 Session Progress SDP. Must appear in the remote offer's offered_codecs list. Omit to use the server's default preference order. Values: PCMU, PCMA, G722, opus. |

ElevenLabsAgentRequest

| Field | Type | Description |
| --- | --- | --- |
| agent_id | string, required | ElevenLabs agent ID |
| first_message | string, optional | Override the agent's first message |
| language | string, optional | Language code (e.g. "en", "es") |
| dynamic_variables | object, optional | Key-value pairs passed to the agent as dynamic variables |
| api_key | string, optional | API key override (falls back to ELEVENLABS_API_KEY env var) |

Error

| Field | Type | Description |
| --- | --- | --- |
| instance_id | string, optional | Instance identifier |
| error | string, required | Error message |

ICECandidateInit

| Field | Type | Description |
| --- | --- | --- |
| candidate | string, required | ICE candidate string |
| sdpMid | string, optional | Media stream identification tag |
| sdpMLineIndex | integer, optional | Index of the media description |

Leg

| Field | Type | Description |
| --- | --- | --- |
| instance_id | string, optional | Instance identifier |
| id | string, required | Unique leg identifier (UUID) |
| type | enum, required | Leg type. Values: sip_inbound, sip_outbound, webrtc, whatsapp_in, whatsapp_out, websocket_in, websocket_out. |
| state | enum, required | Leg state. Values: ringing, early_media, connected, held, hung_up. |
| room_id | string, optional | Room ID if the leg is in a room, empty otherwise |
| muted | boolean, required | Whether the leg is muted (cannot be heard by others) |
| deaf | boolean, required | Whether the leg is deaf (cannot hear others) |
| accept_dtmf | boolean, required | Whether the leg receives DTMF digits broadcast from other legs in the same room. Defaults to true. |
| held | boolean, required | Whether the call is on hold (SIP legs only) |
| role | string, optional | Routing role used by the room's audio routing matrix (e.g. "customer", "agent", "supervisor"). Empty string means unroled (full mesh). |
| app_id | string, optional | Application identifier for event stream filtering. |
| sip_headers | object, optional | Deprecated: X-* headers from the inbound INVITE. Only present on sip_inbound legs. Use `headers` for new code; it carries the same map and also surfaces handshake headers for websocket legs. |
| headers | object, optional | Custom protocol headers exposed by the leg's transport: X-/P- headers from a SIP INVITE, headers from the WebSocket upgrade request, or headers supplied at outbound dial time. |

PipecatAgentRequest

| Field | Type | Description |
| --- | --- | --- |
| websocket_url | string(uri), required | WebSocket URL of the Pipecat bot (e.g. ws://my-bot:8765) |

PlaybackRequest

| Field | Type | Description |
| --- | --- | --- |
| url | string(uri), optional | URL of the audio file (mutually exclusive with tone). |
| tone | string, optional | Built-in telephone tone name (mutually exclusive with url). Format: {country}_{type} or bare {type} (defaults to US). Types: ringback, busy, dial, congestion. Countries: us, gb, de, fr, au, jp, it, in, br, pl, ru. Examples: us_ringback, gb_busy, dial. |
| mime_type | string, optional | MIME type (e.g. audio/wav). Required when using url. |
| repeat | integer, optional | Number of times to repeat playback (url only). Default: 0. |
| volume | integer, optional | Volume adjustment in dB (-8 to 8). Default: 0. |
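
The tone naming rule above ({country}_{type}, or a bare {type} that defaults to US) can be checked client-side before sending a request. The sketch below is a hypothetical validator, not the server's parser; it assumes names are lowercase, which matches the listed examples but is not stated explicitly.

```python
# Tone types and country codes listed for PlaybackRequest.tone.
TONE_TYPES = {"ringback", "busy", "dial", "congestion"}
TONE_COUNTRIES = {"us", "gb", "de", "fr", "au", "jp", "it", "in", "br", "pl", "ru"}


def parse_tone(name):
    """Split a tone name into (country, type); a bare type defaults to the US variant."""
    country, separator, tone_type = name.partition("_")
    if not separator:
        # No underscore: the whole name is the tone type.
        country, tone_type = "us", name
    if country not in TONE_COUNTRIES:
        raise ValueError(f"unknown tone country: {country!r}")
    if tone_type not in TONE_TYPES:
        raise ValueError(f"unknown tone type: {tone_type!r}")
    return country, tone_type
```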

RTTRequest

| Field | Type | Description |
| --- | --- | --- |
| text | string, required | UTF-8 text to send. May be one or more characters and may include T.140 control codes (e.g. backspace U+0008, CR/LF). |

RecordingRequest

| Field | Type | Description |
| --- | --- | --- |
| storage | enum, optional | Where to store the recording. "file" (default): local disk; "s3": upload to S3 after recording stops. Values: file, s3. |
| multi_channel | boolean, optional | When true, record each participant to a separate mono WAV file in addition to the full mix. Only applies to room recordings. Default: false. |
| s3_bucket | string, optional | S3 bucket name. Overrides S3_BUCKET env var. Required if env var is not set. |
| s3_region | string, optional | AWS region. Overrides S3_REGION env var. Default: us-east-1. |
| s3_endpoint | string, optional | Custom S3 endpoint (MinIO, etc.). Overrides S3_ENDPOINT env var. |
| s3_prefix | string, optional | Key prefix (e.g. recordings/). Overrides S3_PREFIX env var. |
| s3_access_key | string, optional | AWS access key ID. Overrides default credential chain. |
| s3_secret_key | string, optional | AWS secret access key. Must be set together with s3_access_key. |

Room

| Field | Type | Description |
| --- | --- | --- |
| instance_id | string, optional | Instance identifier |
| id | string, required | Room identifier |
| app_id | string, optional | Application identifier for event stream filtering. |
| sample_rate | integer, required | Mixer sample rate in Hz (8000, 16000, or 48000). |
| participants | array[object], required | Legs currently in this room |

RoomCreateRequest

| Field | Type | Description |
| --- | --- | --- |
| id | string, optional | Custom room ID (auto-generated UUID if omitted) |
| webhook_url | string(uri), optional | Route all events for this room exclusively to this URL instead of global webhooks. |
| webhook_secret | string, optional | HMAC-SHA256 signing secret for the per-room webhook. |
| app_id | string, optional | Application identifier. Carried through to all events for this room. Use to filter the WebSocket event stream by app. |
| sample_rate | enum, optional | Mixer sample rate in Hz. Values: 8000, 16000, 48000. Default: 16000. |

RoomRoutingRequest

| Field | Type | Description |
| --- | --- | --- |
| matrix | object, required | Listener-role → list of allowed source roles. Omitted listener roles default to full mesh. Empty list = hears nothing. |
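
One way to read the matrix semantics above is as a lookup from listener role to allowed source roles. The sketch below encodes that reading in a hypothetical `can_hear` helper. It assumes an unroled listener (empty role) hears everyone, per the "full mesh" wording, and that a source is heard only if its role appears verbatim in the listener's row; how the server treats unroled sources against a restricted row is not specified here, so that part is an assumption.

```python
def can_hear(matrix, listener_role, source_role):
    """Return True if a leg with listener_role hears a leg with source_role."""
    # Unroled listeners and roles without a row fall back to full mesh.
    if listener_role == "" or listener_role not in matrix:
        return True
    # An empty list means the listener hears nothing.
    return source_role in matrix[listener_role]


# Illustrative matrix: a supervisor monitors the call and can coach the agent
# without being heard by the customer. Role names are examples only.
EXAMPLE_MATRIX = {
    "customer": ["agent"],
    "agent": ["customer", "supervisor"],
    "supervisor": ["customer", "agent"],
    "recorder_tap": [],
}
```

In this example the customer never hears the supervisor, while the agent hears both.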

RoomRoutingUpdateRequest

| Field | Type | Description |
| --- | --- | --- |
| updates | array[object], required | Per-listener-role row replacements applied as a single atomic update. |

RoomRoutingView

| Field | Type | Description |
| --- | --- | --- |
| matrix | object, required | Listener-role → list of allowed source roles. Roles absent from the matrix default to full mesh. |

RoutingRowUpdate

| Field | Type | Description |
| --- | --- | --- |
| listener_role | string, required | The role whose row is being replaced. |
| sources | array[string], required | New list of allowed source roles for this listener role. Pass null to clear the row (full mesh). |
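
The effect of a list of row updates on an existing matrix can be sketched as follows. This is a hypothetical client-side model, not the server implementation: it treats a null (`None`) `sources` value as removing the row, so the role falls back to full mesh, and it builds the result on a copy so a malformed update leaves the input untouched.

```python
def apply_routing_updates(matrix, updates):
    """Return a new matrix with each RoutingRowUpdate applied in order."""
    result = {role: list(sources) for role, sources in matrix.items()}
    for update in updates:
        role = update["listener_role"]
        sources = update["sources"]
        if sources is None:
            # Clearing the row: the role is absent again, meaning full mesh.
            result.pop(role, None)
        else:
            result[role] = list(sources)
    return result
```

For example, applying `[{"listener_role": "customer", "sources": None}]` to `{"customer": ["agent"]}` yields `{}`.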

SIPAuth

| Field | Type | Description |
| --- | --- | --- |
| username | string, optional | Digest auth username. Optional for whatsapp legs (defaults to `from` with '+' stripped, per Meta's spec). |
| password | string, required | Digest auth password. |
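
The username default for whatsapp legs can be illustrated with a short sketch. The helper name `resolve_whatsapp_auth` is hypothetical; the server is assumed to perform the equivalent defaulting itself, so this only shows which credentials would end up in use.

```python
def resolve_whatsapp_auth(auth, from_number):
    """Fill in the documented username default: `from` with the leading '+' stripped."""
    if not auth.get("password"):
        raise ValueError("SIPAuth.password is required")
    username = auth.get("username") or from_number.lstrip("+")
    return {"username": username, "password": auth["password"]}
```

Calling it with `{"password": "secret"}` and `"+15551234567"` yields the username `15551234567`.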

STTRequest

| Field | Type | Description |
| --- | --- | --- |
| language | string, required | Language code (e.g. "en", "es") |
| partial | boolean, optional | Emit partial (non-final) transcripts. Default: false. |
| provider | enum, optional | STT provider: "elevenlabs" (default) or "deepgram". Values: elevenlabs, deepgram. |
| api_key | string, optional | API key override (falls back to ELEVENLABS_API_KEY or DEEPGRAM_API_KEY env var depending on provider) |

SetLegRoleRequest

| Field | Type | Description |
| --- | --- | --- |
| role | string, required | New routing role for the leg. The room's routing matrix decides which other legs this leg hears and is heard by based on roles. Pass an empty string to clear the role (full mesh). |

StatusResponse

| Field | Type | Description |
| --- | --- | --- |
| instance_id | string, optional | Instance identifier |
| status | string, required | |

TTSRequest

| Field | Type | Description |
| --- | --- | --- |
| text | string, required | Text to synthesize |
| voice | string, required | Provider-specific voice identifier. ElevenLabs: voice name or ID. AWS Polly: voice ID (e.g. Joanna, Matthew). Google Cloud: voice name, either full format (e.g. en-US-Neural2-F) or short name for Gemini models (e.g. Achernar, Kore). Deepgram: model name (e.g. aura-2-asteria-en). |
| model_id | string, required | Provider-specific model/engine. ElevenLabs: model ID. AWS Polly: engine (standard, neural, long-form, generative; default neural). Google Cloud: model name (e.g. gemini-2.5-pro-tts, chirp3-hd). |
| language | string, optional | Language code (e.g. "en-US", "pl-pl"). Required for Google Gemini TTS voices that use short names (e.g. Achernar). Auto-extracted from full voice names like en-US-Neural2-F. |
| prompt | string, optional | Style/tone instruction for promptable voice models (Google Gemini TTS only). E.g. "Read aloud in a warm, welcoming tone." |
| volume | integer, optional | Volume adjustment in dB (-8 to 8). Default: 0. |
| provider | enum, optional | TTS provider: "elevenlabs" (default), "aws", "google", or "deepgram". Values: elevenlabs, aws, google, deepgram. |
| api_key | string, optional | ElevenLabs: API key override (falls back to ELEVENLABS_API_KEY env var). AWS: optional ACCESS_KEY:SECRET_KEY override (falls back to default AWS credential chain). Google Cloud: optional API key override (falls back to Application Default Credentials). Deepgram: API key override (falls back to DEEPGRAM_API_KEY env var). |

TransferRequest

| Field | Type | Description |
| --- | --- | --- |
| target | string, required | SIP URI to transfer the call to (e.g. "sip:bob@example.com"). |
| replaces_leg_id | string, optional | ID of an existing connected SIP leg whose dialog should be replaced (attended transfer). Omit for blind transfer. |

UpdateRoomBridgeRequest

| Field | Type | Description |
| --- | --- | --- |
| direction | enum, required | New audio flow relative to the room in the path. Values: bidirectional, send, receive, none. |

VAPIAgentRequest

| Field | Type | Description |
| --- | --- | --- |
| assistant_id | string, required | VAPI assistant ID |
| first_message | string, optional | Override the agent's first message |
| variable_values | object, optional | Key-value pairs passed as VAPI variable values (assistantOverrides.variableValues) |
| api_key | string, optional | API key override (falls back to VAPI_API_KEY env var) |

VolumeRequest

| Field | Type | Description |
| --- | --- | --- |
| volume | integer, required | Volume adjustment (-8 to 8, ~3 dB per step, 0 = unchanged) |
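
To get a feel for what a volume step means, the sketch below converts a step to an approximate linear amplitude gain using the standard decibel relation gain = 10^(dB/20). It takes the "~3 dB per step" figure at face value; the exact curve the server applies is not stated here, so the 3.0 constant and the helper name are assumptions.

```python
def volume_step_to_gain(step, db_per_step=3.0):
    """Approximate linear amplitude gain for a volume step in the range -8..8."""
    if not -8 <= step <= 8:
        raise ValueError("volume must be between -8 and 8")
    decibels = step * db_per_step
    # Amplitude ratio for a level change given in decibels.
    return 10 ** (decibels / 20)
```

Under these assumptions, step 0 leaves the signal unchanged, and two steps up (about 6 dB) roughly doubles the amplitude.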

WebRTCCandidatesResult

| Field | Type | Description |
| --- | --- | --- |
| candidates | array[object], required | |
| done | boolean, required | |

WebRTCOfferRequest

| Field | Type | Description |
| --- | --- | --- |
| sdp | string, required | SDP offer from the browser |

WebRTCOfferResult

| Field | Type | Description |
| --- | --- | --- |
| leg_id | string, required | |
| sdp | string, required | |

WebhookEvent

Event envelope delivered via HTTP POST to registered webhook URLs. Event-specific fields are flattened into the top-level object (there is no "data" wrapper). Requests include an X-Signature-256 header when a secret is configured.

| Field | Type | Description |
| --- | --- | --- |
| type | enum, required | Values: leg.ringing, leg.early_media, leg.connected, leg.disconnected, leg.joined_room, leg.left_room, leg.muted, leg.unmuted, leg.deaf, leg.undeaf, leg.hold, leg.unhold, leg.command_failed, dtmf.received, rtt.received, speaking.started, speaking.stopped, playback.started, playback.finished, playback.error, tts.started, tts.finished, tts.error, recording.started, recording.finished, recording.paused, recording.resumed, leg.transfer_initiated, leg.transfer_requested, leg.transfer_progress, leg.transfer_completed, leg.transfer_failed, room.created, room.deleted, room.bridged, room.bridge_updated, room.unbridged, room.routing_changed, leg.role_changed, stt.text, agent.connected, agent.disconnected, agent.user_transcript, agent.agent_response, amd.result, amd.beep. |
| timestamp | string(date-time), required | |
| instance_id | string, optional | Instance identifier |
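
A receiver would typically check the X-Signature-256 header before trusting an event. The sketch below shows one plausible verification, using HMAC-SHA256 as named for `webhook_secret`. Several details are assumptions not confirmed by this reference: that the signature covers the raw request body, that it is hex-encoded, and that it may carry a `sha256=` prefix. The function name is hypothetical.

```python
import hashlib
import hmac


def verify_webhook_signature(secret, raw_body, header_value):
    """Compare the received signature with an HMAC-SHA256 of the raw body.

    Assumes a hex digest over the exact bytes received, optionally
    prefixed with "sha256=" (both are assumptions, not documented facts).
    """
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = header_value.strip()
    if received.startswith("sha256="):
        received = received[len("sha256="):]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii"), received.lower().encode("utf-8"))
```

Because event-specific fields are flattened into the top-level object, the verified body can then be parsed as JSON and dispatched on its `type` field directly.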