Visions for WebRTC Next Steps: Fine-grained Media Control in the browser

By Varun Singh on November 22, 2017

At the beginning of the month, W3C announced the release of WebRTC v1.0 as a Candidate Recommendation, which means that the community considers the standard design-complete and is soliciting feedback on implementations of the API.

The v1.0 Candidate Recommendation (CR) is expected to undergo revisions over the next few months, based on feedback and implementation experience from developers and browser vendors. This also means that, apart from bug fixes, no major new functionality will be added to the current CR.

At the recently concluded W3C Technical Plenary / Advisory Committee (TPAC), we had an open discussion about the next steps in the evolution of WebRTC v1.0. One of the topics centered on adding new transports to WebRTC, concretely the adoption of QUIC for WebRTC.

QUIC transport

A little bit of background: QUIC is currently being standardized in the IETF as an alternative transport for HTTP/2. Since it is defined over UDP, WebRTC has an opportunity to use QUIC as an alternative transport for data channels, which currently use SCTP/DTLS.

Peter Thatcher from Google has a working prototype of QUIC in Chrome and made a proposal for adding QuicTransport and QuicStream as an extension transport to WebRTC. The QuicTransport does the heavy lifting and is a combination of the DtlsTransport (i.e., handles the crypto) and SctpTransport (i.e., handles the local and remote streams). QuicStream maps closely to the QUIC stream being defined in the IETF, which means that an application can:

  • write to a local stream,
  • read from a remote stream,
  • finish when the local stream is done, and
  • reset a remote stream.

A QUIC stream is reliable by default. Hence, the local side may want to reset a stream on a timeout, for example because a particular QuicStream has been trying to deliver its contents for longer than the application considers reasonable. Alternatively, the remote side may decide it no longer needs the data from a particular stream because it has moved on, and can reset it remotely. The end result is that one can build data channels atop these QuicStreams, i.e., it is easy to build big-ordered-reliable messages as well as small-unordered-reliable messages. See slides 90-100 from Peter’s presentation at the TPAC for details.
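
To make the write/finish/reset lifecycle and the timeout-driven reset a bit more concrete, here is a minimal sketch. It is a purely in-memory model: `ModelStream` and `resetStale` are hypothetical names invented for illustration and are not the proposed QuicStream API, the clock is passed in as plain numbers, and delivery acknowledgements are not modeled at all.

```typescript
// Hypothetical in-memory model of a QUIC-like stream.
// Names are illustrative only; this is NOT the proposed QuicStream API.
type StreamState = "open" | "finished" | "reset";

class ModelStream {
  state: StreamState = "open";
  private chunks: Uint8Array[] = [];

  constructor(readonly id: number, readonly openedAtMs: number) {}

  // Queue data on the local stream; only allowed while the stream is open.
  write(data: Uint8Array): void {
    if (this.state !== "open") {
      throw new Error(`stream ${this.id} is ${this.state}`);
    }
    this.chunks.push(data);
  }

  // Signal that the local side has nothing more to write.
  finish(): void {
    if (this.state === "open") {
      this.state = "finished";
    }
  }

  // Abandon the stream: whatever is still queued no longer needs delivery.
  reset(): void {
    this.state = "reset";
    this.chunks = [];
  }

  // Total number of bytes still queued on this stream.
  queuedBytes(): number {
    return this.chunks.reduce((total, c) => total + c.byteLength, 0);
  }
}

// Timeout policy (an assumption for this sketch): reset every stream that is
// still open more than maxAgeMs after it was opened. Returns the reset ids.
function resetStale(streams: ModelStream[], nowMs: number, maxAgeMs: number): number[] {
  const resetIds: number[] = [];
  for (const s of streams) {
    if (s.state === "open" && nowMs - s.openedAtMs > maxAgeMs) {
      s.reset();
      resetIds.push(s.id);
    }
  }
  return resetIds;
}
```

An application would call something like `resetStale` periodically with its own notion of a reasonable delivery time; what counts as reasonable depends on the kind of data the stream carries.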

Media over QUIC

This is where the next idea steps in: Over the summer, some RTP folks (Jörg Ott, Colin Perkins, Roni Even and I) got together and put out a strawman proposal discussing what is needed for sending media over QUIC. Building on Peter’s proposal of sending data over QUIC, it would be pretty awesome to send media over QUIC as well.

Over a single media session’s lifetime, the session is expected to create millions or even billions of streams. Since QUIC streams are lightweight to open and close, QUIC is well suited to sending media frames this way.

Let me explain why I am super excited about this new development and want to encourage the community to seriously consider this.

As with UDP for media, the application is in full control over what it puts into a QUIC stream, i.e., it could put an MTU-sized packet, a frame, or a group of pictures (GoP) into a QUIC stream. On top of that, because QUIC streams are reliable and can be controlled from the remote end, we have a built-in way of skipping frames or a whole GoP if the next frame or GoP is already being decoded. Furthermore, these strategies can be adapted based on the networking environment.

For example, in networks with more losses or higher delay, the application could structure the QUIC streams differently than it would in a more stable network. Further down the road, we may see unreliable streams in QUIC, which would allow an even finer-grained differentiation. I have only scratched the surface though, and I am sure that if people give it more thought, they can come up with more complex hierarchies that may provide a more consistent quality of experience.
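
As a rough sketch of what "structuring the streams differently" could look like, the snippet below maps a sequence of encoded frames onto streams at three granularities: one stream per MTU-sized packet, per frame, or per GoP. Everything here is hypothetical: the type and function names, the 1200-byte default MTU, and especially the thresholds in `chooseGranularity`, which only illustrate that the choice can depend on observed loss and delay.

```typescript
// Hypothetical types and policies for illustration; not part of any proposal.
interface EncodedFrame {
  seq: number;
  keyFrame: boolean; // true if this frame starts a new group of pictures
  bytes: number;
}

type Granularity = "packet" | "frame" | "gop";

interface StreamPlan {
  frameSeqs: number[]; // frames (or the frame a packet belongs to) on this stream
  bytes: number;
}

// Assumed policy: on lossy or high-delay paths use one stream per frame, so a
// late frame can be reset on its own; on stable paths group a whole GoP.
// The thresholds are arbitrary example values.
function chooseGranularity(lossRate: number, rttMs: number): Granularity {
  if (lossRate > 0.01 || rttMs > 200) {
    return "frame";
  }
  return "gop";
}

// Decide which frames share a stream for the given granularity.
function planStreams(frames: EncodedFrame[], g: Granularity, mtu: number = 1200): StreamPlan[] {
  const plans: StreamPlan[] = [];
  let current: StreamPlan | null = null;
  for (const f of frames) {
    if (g === "packet") {
      // One stream per MTU-sized slice of the frame.
      for (let off = 0; off < f.bytes; off += mtu) {
        plans.push({ frameSeqs: [f.seq], bytes: Math.min(mtu, f.bytes - off) });
      }
    } else if (g === "frame" || f.keyFrame || current === null) {
      // New stream for every frame, or at the start of every GoP.
      current = { frameSeqs: [f.seq], bytes: f.bytes };
      plans.push(current);
    } else {
      // GoP mode: append the frame to the stream of the current GoP.
      current.frameSeqs.push(f.seq);
      current.bytes += f.bytes;
    }
  }
  return plans;
}
```

With per-GoP streams a single remote reset skips a whole group of pictures, while per-frame streams let the receiver skip one late frame at the cost of opening many more streams.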

Can we do this all in JS land?

Currently, the QUIC implementation for WebRTC is available for experiments in Chrome, and technically it is possible to send media over QUIC. The simplest missing piece is a way to connect the media tracks to the QuicTransport. We could do this by letting the browser wire up the connection, as it does currently in the PeerConnection API (WebRTC v1.0).

However, I feel that giving JavaScript control over this would be the smart thing to do. In the WebRTC v1.0 PeerConnection API, we let the browser manage the messy parts of the media pipeline: packetization, scheduling, adapting the frame rate and frame size, media coding, etc. In a future extension to WebRTC v1.0, the browser could hand every encrypted frame to the JavaScript application to packetize and send to the remote endpoint. The remote endpoint could decide to play back the frame (if it was received in time) or reset the stream (if the frame would not be rendered in time). On receiving the reset for a particular stream or group of streams, the sending application can choose whether to send an intra frame, or to stop sending the other frames that are part of the temporal set. The application could also modify the dependency structures in the temporal set based on the observed loss patterns. The choices here are endless, and giving more control to JS means adaptive real-time applications will become a reality.
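
To illustrate the kind of logic that could live in JS land, here is a small sketch of both sides of that exchange. It is only a model under stated assumptions: `FrameInfo`, `receiverAction`, `undecodableAfterReset` and `senderReaction` are hypothetical names, each frame is assumed to reference at most one earlier frame, the queue is assumed to be in decode order, and the rule "send an intra frame once more than `maxDropped` frames are lost" is just one possible policy.

```typescript
// Hypothetical model for illustration; none of these names come from a spec.
interface FrameInfo {
  seq: number;
  dependsOn: number | null; // seq of the reference frame, null for an intra frame
}

// Receiver side: play the frame if it is expected before its playout deadline,
// otherwise reset the stream carrying it.
function receiverAction(estimatedArrivalMs: number, playoutDeadlineMs: number): "play" | "reset" {
  return estimatedArrivalMs <= playoutDeadlineMs ? "play" : "reset";
}

// Sender side: given the frames still queued (in decode order), find those
// that can no longer be decoded because they depend, directly or
// transitively, on the frame whose stream was reset.
function undecodableAfterReset(queued: FrameInfo[], resetSeq: number): number[] {
  const lost: number[] = [resetSeq];
  for (const f of queued) {
    if (f.dependsOn !== null && lost.indexOf(f.dependsOn) !== -1) {
      lost.push(f.seq);
    }
  }
  return lost.slice(1); // exclude the reset frame itself
}

// Assumed policy: drop the dependent frames, and additionally send a fresh
// intra frame when more than maxDropped frames became undecodable.
function senderReaction(
  queued: FrameInfo[],
  resetSeq: number,
  maxDropped: number
): { drop: number[]; sendIntra: boolean } {
  const drop = undecodableAfterReset(queued, resetSeq);
  return { drop: drop, sendIntra: drop.length > maxDropped };
}
```

In a temporal-layer structure, a reset on a higher-layer frame typically affects only a few dependents, whereas a reset on a base-layer frame invalidates most of the set, which is where sending an intra frame starts to make sense.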

In the video below, Dan Burnett and I discuss the WebRTC Next Steps/TPAC 2017.
