What's New in VoiceBlender — WebSocket Legs, MoQ, Room Bridging and More

May 26, 2026 by csiwek

Tags: websocket, moq, webtransport, sip, nat, rooms, rtt, vsi

VoiceBlender has picked up a stack of new capabilities over the last few releases. Here is the short tour.

🔌 WebSocket as a leg

Any WebSocket client can now be treated as a first-class leg in a call, with bi-directional audio streaming. No SIP stack, no WebRTC negotiation — open a socket, push frames in, read frames out. It is the simplest way to plug a browser app, an AI agent or a lightweight service into a VoiceBlender room or bridged call.
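
To make "push frames in" concrete, here is a minimal sketch of how a client might slice raw audio into fixed-duration frames before writing them to the socket. The wire format is an assumption: 16-bit mono PCM at 8 kHz in 20 ms frames is chosen only for illustration, and `pcm_frames` is a hypothetical helper, not part of VoiceBlender. Check the API documentation for the codecs and framing a WebSocket leg actually expects.

```python
from typing import Iterator


def pcm_frames(pcm: bytes, sample_rate: int = 8000, frame_ms: int = 20,
               sample_width: int = 2) -> Iterator[bytes]:
    """Yield fixed-size frames from raw mono PCM (assumed format, see lead-in)."""
    samples_per_frame = sample_rate * frame_ms // 1000
    frame_bytes = samples_per_frame * sample_width
    for offset in range(0, len(pcm), frame_bytes):
        chunk = pcm[offset:offset + frame_bytes]
        if len(chunk) < frame_bytes:
            # Pad the final short frame with silence so every frame has equal length.
            chunk += b"\x00" * (frame_bytes - len(chunk))
        yield chunk
```

Each yielded frame would then go out as one binary WebSocket message, paced at roughly one frame per 20 ms, while a second task reads incoming frames from the same socket.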

⚡ MoQ — Media over QUIC (experimental)

VoiceBlender now has experimental support for Media over QUIC Transport, speaking IETF draft-11 of moq-transport over WebTransport / HTTP/3. MoQ is shaping up to be the next generation of low-latency, scalable media delivery, and VoiceBlender now lets you experiment with it end-to-end against real calls.

🌐 Symmetric SIP / NAT traversal

VoiceBlender now handles NATed peers using source sockets: it sends responses and subsequent requests from the same socket the peer reached it on. This is particularly useful when running behind NAT in Docker, cloud or Kubernetes environments, where the classic “reply to the Contact” approach falls apart, typically because the advertised address is private or otherwise unreachable from outside.
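
The idea can be shown with plain UDP sockets on loopback. This is a sketch of the general symmetric-reply technique, not VoiceBlender's implementation: the function names are hypothetical and the SIP messages are placeholders rather than valid requests or responses.

```python
import socket


def reply_symmetric(sock: socket.socket, response: bytes):
    """Answer one datagram from the socket it arrived on, to its observed source."""
    _request, observed_source = sock.recvfrom(4096)
    # Use the address the packet actually came from, not one advertised in the payload.
    sock.sendto(response, observed_source)
    return observed_source


def loopback_demo():
    """Run one request/response exchange between two local UDP sockets."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))
        client.bind(("127.0.0.1", 0))
        server.settimeout(2.0)
        client.settimeout(2.0)
        # Placeholder request; a real SIP message carries many more headers.
        client.sendto(b"OPTIONS sip:placeholder SIP/2.0\r\n\r\n", server.getsockname())
        observed = reply_symmetric(server, b"SIP/2.0 200 OK\r\n\r\n")
        _reply, reply_source = client.recvfrom(4096)
        return observed, client.getsockname(), reply_source, server.getsockname()
    finally:
        server.close()
        client.close()
```

Because the reply leaves from the port the request arrived on and goes back to the observed source, it follows the mapping the NAT already opened, so the peer does not need a publicly reachable address.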

🎙️ Role-based audio routing

You can now decide who hears whom inside a room. Assign roles to legs and control the audio matrix between them — enough to build moderated conferences, whisper channels, supervisor-agent flows or “barge-in only” participants without juggling multiple rooms.
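
One way to picture the audio matrix is as a table of which roles each role is allowed to hear. The sketch below models a supervisor-agent "whisper" flow. The role names, the `HEARS` table and `sources_for` are all hypothetical and exist only to illustrate the concept; VoiceBlender's real role configuration is described in its API documentation.

```python
# Hypothetical routing table: listener role -> roles whose audio it receives.
HEARS = {
    "customer": {"agent"},                   # never hears the supervisor
    "agent": {"customer", "supervisor"},     # hears the whispered coaching
    "supervisor": {"customer", "agent"},     # monitors both sides
}


def sources_for(listener: str, legs: dict) -> list:
    """Return the leg IDs whose audio should be mixed for `listener`.

    `legs` maps leg ID -> role. A leg never hears itself.
    """
    allowed_roles = HEARS.get(legs[listener], set())
    return sorted(
        leg_id
        for leg_id, role in legs.items()
        if leg_id != listener and role in allowed_roles
    )
```

With legs `{"leg-a": "customer", "leg-b": "agent", "leg-c": "supervisor"}`, the customer's mix contains only the agent, while the agent and supervisor each hear the other two legs.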

🌉 Bridge rooms

Rooms can be connected to other rooms to form larger, federated audio topologies. Useful for scaling beyond a single mixer, segmenting participants by region, or composing independent rooms into a single logical conference.
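
As a mental model, bridged rooms form a graph, and a logical conference is every room reachable from a starting room over bridges. The sketch below computes that set with a breadth-first search. It is a conceptual illustration only: the room names and the `logical_conference` function are hypothetical, and it says nothing about how VoiceBlender mixes or forwards audio between bridged rooms.

```python
from collections import deque


def logical_conference(start: str, bridges: list) -> set:
    """Return every room reachable from `start` via bidirectional bridges."""
    adjacency = {}
    for room_a, room_b in bridges:
        adjacency.setdefault(room_a, set()).add(room_b)
        adjacency.setdefault(room_b, set()).add(room_a)

    seen = {start}
    queue = deque([start])
    while queue:
        room = queue.popleft()
        for neighbour in adjacency.get(room, ()):
            if neighbour not in seen:  # visiting each room once also guards against loops
                seen.add(neighbour)
                queue.append(neighbour)
    return seen
```

For bridges `[("eu", "us"), ("us", "apac")]`, starting from `"eu"` yields all three regional rooms as one logical conference.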

⌨️ Real-Time Text (RTT / T.140)

VoiceBlender now speaks T.140 Real-Time Text over RTP (RFC 4103, with RED redundancy per RFC 2198), bridging SIP-based RTT to WebRTC clients via the Streaming Interface. The full story, including the Kamailio World Dangerous Demo it came from, is in Bringing T.140 Real-Time Text to VoiceBlender.
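
For readers curious what RED redundancy looks like on the wire, here is a sketch that assembles an RFC 2198 payload: a 4-byte header per redundant block, a 1-byte header for the primary block, then the block data with the oldest first. It is not VoiceBlender's code. `build_red_payload` is a hypothetical helper, and payload type 98 for T.140 is an assumption, since dynamic payload types are negotiated in SDP.

```python
import struct


def build_red_payload(primary: bytes, redundant: list, pt: int = 98) -> bytes:
    """Build an RFC 2198 RED payload (the part after the RTP header).

    `redundant` is a list of (timestamp_offset, data) tuples, oldest first.
    Offsets are in RTP clock units; for T.140 at 1000 Hz that is milliseconds.
    `pt` is the payload type of the carried blocks (98 is an assumed value).
    """
    if not 0 <= pt < 128:
        raise ValueError("payload type must fit in 7 bits")
    headers = b""
    body = b""
    for ts_offset, data in redundant:
        if not 0 <= ts_offset < (1 << 14):
            raise ValueError("timestamp offset must fit in 14 bits")
        if len(data) >= (1 << 10):
            raise ValueError("block length must fit in 10 bits")
        # F=1 (more headers follow) | 7-bit PT | 14-bit offset | 10-bit length
        word = (1 << 31) | (pt << 24) | (ts_offset << 10) | len(data)
        headers += struct.pack("!I", word)
        body += data
    # Final header: F=0 plus the primary block's payload type, one byte.
    headers += bytes([pt])
    return headers + body + primary
```

Sending the two previous text blocks alongside each new one means a single lost packet does not lose any typed characters, which is the reason RTT is usually carried with redundancy.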

🎛️ Full control via VSI

The VoiceBlender Streaming Interface (VSI) now exposes every aspect of calls, rooms, legs and media programmatically over WebSocket. Subscribe to events, drive call flows, mutate room membership and stream media — all from a single bi-directional control channel. See the VSI documentation for the full surface.

These features are available in the latest VoiceBlender releases on GitHub. The API documentation is regenerated from the OpenAPI and AsyncAPI specs and covers everything above in detail.