
Product Update: Circuit Breaker on the Conference Event timeline

By callstats on September 16, 2020

When your network is in bad shape, it causes annoying artefacts in the audio and video rendering. Network quality is typically measured by packet loss, jitter, variations in latency, and fluctuations in bitrate. Circuit breakers are triggered when things are terrible. Much like their electrical counterparts, they must trip in extreme conditions and before the end user can react.

Circuit breakers (CB) for network transport protocols such as User Datagram Protocol (UDP) and Real-time Transport Protocol (RTP) were developed in the IETF as a boundary condition for non-congestion-controlled traffic. In electrical systems, a CB is triggered when there is a power surge due to faulty wiring or a malfunctioning grid or appliance, and triggering the CB protects the appliance, part of a building, or a small community. Similarly, the aim of the circuit breaker in RTP is to prevent the media flow from overwhelming the endpoint or network and starving out other internet traffic.

In RTP or WebRTC there are three circuit breakers:

  1. RTCP and Media Timeout 
  2. Congestion 
  3. Media Usability 

Typically, the media sender and receiver resynchronise state: the receiver asks the sender to scale back the quality or, in adverse conditions, to switch off video in an attempt to reduce the overall bitrate used by the application. These corrections can occur on relatively short timescales, on the order of a couple of seconds. However, in adverse situations where the network conditions fluctuate widely, correcting may be impossible, and the system or end user may eventually end the call and redial, hoping for a better connection. We hope that by detecting these adverse conditions earlier, the application can automatically set up a parallel connection, check whether it is better, and seamlessly move the traffic to the new connection, without the end user interacting with the product or trying to self-diagnose (toggling audio/video, asking other participants to toggle their audio/video, leaving and rejoining, etc.).

 

Timeouts

Before we talk about the timeouts, a quick refresher: media is carried over RTP, and the corresponding feedback or statistics for that media are carried over the RTP Control Protocol (RTCP). You can read more about RTP and RTCP in our earlier blog post.

The media (RTP) and RTCP timeout occurs when an endpoint is sending and receiving media but stops receiving the corresponding RTCP feedback from the other party. The emphasis here is that the endpoint was receiving both RTP and RTCP and then stopped receiving them. Typically, an endpoint expects RTCP feedback every 5-7 seconds; if it does not receive any media or feedback over three consecutive RTCP CB intervals, it triggers the media or RTCP timeout circuit breaker.
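
A minimal Python sketch of the timeout check described above. The function name, the fixed 5-second interval, and the use of plain timestamps in seconds are assumptions made for illustration; RFC 8083 defines how the actual reporting interval is derived.

```python
import math


def timeout_breaker_tripped(last_feedback_at, now, rtcp_interval=5.0, max_missed=3):
    """Return True once `max_missed` full RTCP intervals have passed without feedback.

    last_feedback_at: time (seconds) the last RTCP feedback arrived, or None.
    now:              current time (seconds) on the same clock.
    rtcp_interval:    assumed reporting interval; the post cites 5-7 seconds.
    """
    if last_feedback_at is None:
        # Feedback was never received, so there is no established flow to time out.
        return False
    missed_intervals = math.floor((now - last_feedback_at) / rtcp_interval)
    return missed_intervals >= max_missed
```

With a 5-second interval, the check stays quiet for the first 15 seconds of silence and trips from then on.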

 

Congestion

In the case of congestion, the assumption is that the endpoint is sending 10 times the traffic it should be sending (overuse), which implies that the media is inflicting congestion on other traffic (the end user's own email, Netflix, YouTube, etc., or other people's traffic). The CB is triggered if the overuse persists for three consecutive RTCP CB intervals.
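
The overuse test can be sketched as follows. The post does not say how "what it should be sending" is estimated; this sketch assumes the simplified TCP throughput equation, s / (R * sqrt(2p/3)), which is one common estimate of a TCP-friendly rate. The function names and the tuple layout are hypothetical.

```python
import math


def tcp_friendly_rate(packet_size_bytes, rtt_seconds, loss_fraction):
    """Simplified TCP throughput estimate in bytes per second."""
    if loss_fraction <= 0 or rtt_seconds <= 0:
        # No loss reported (or no valid RTT): no congestion signal, so no upper bound.
        return math.inf
    return packet_size_bytes / (rtt_seconds * math.sqrt(2.0 * loss_fraction / 3.0))


def congestion_breaker_tripped(intervals, overuse_factor=10, required=3):
    """intervals: (send_rate_bytes_per_s, packet_size_bytes, rtt_s, loss_fraction), oldest first."""
    consecutive = 0
    for send_rate, size, rtt, loss in intervals:
        if send_rate > overuse_factor * tcp_friendly_rate(size, rtt, loss):
            consecutive += 1
            if consecutive >= required:
                return True
        else:
            # A single interval within bounds resets the count.
            consecutive = 0
    return False
```

For example, with 1200-byte packets, a 100 ms round-trip time, and 6% loss, the estimate is about 60 kB/s, so the breaker trips only if the endpoint sends more than roughly 600 kB/s for three intervals in a row.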

 

Media Usability

Multimedia can tolerate some amount of packet loss: playback might conceal errors or pixelate, but it continues playing out. If the endpoint is sending video of a certain quality (or some subset) and the receiver sees video of that quality (or some subset), then the media is usable. Conversely, if it is sending higher-bitrate media and the receiver is only able to receive 10% of it, the result is choppy media. If this continues for a longer period of time, the media is completely unusable.

Typically, the endpoints will automatically detect these situations and downscale the audio and video, and may further switch off the video. The media usability circuit breaker triggers only when the endpoint is not making these decisions and is therefore rendering unusable media for three consecutive RTCP CB intervals.
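
As a rough illustration, the usability check can be written as a scan over per-interval packet counts that reports where the breaker would trip. The 50% delivery threshold is an assumption chosen for the example, not a value from the RFC, and the function name is hypothetical.

```python
def usability_breaker_trip_index(intervals, min_delivered=0.5, required=3):
    """Return the index of the interval at which the breaker trips, or None.

    intervals: (packets_sent, packets_received) per RTCP interval, oldest first.
    """
    consecutive = 0
    for index, (sent, received) in enumerate(intervals):
        # Nothing sent means there is no media to be unusable in this interval.
        delivered = received / sent if sent > 0 else 1.0
        if delivered < min_delivered:
            consecutive += 1
            if consecutive >= required:
                return index
        else:
            consecutive = 0
    return None
```

Returning the interval index rather than a boolean makes it easy to place the trigger on a timeline next to the reports that caused it.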

Read more about the circuit breakers: https://tools.ietf.org/html/rfc8083 and https://tools.ietf.org/html/rfc8084 or the corresponding research articles. 

Event Timeline

Currently, browsers do not implement the circuit breakers. However, there is sufficient information in webrtc-stats, including the timestamps of reports, to calculate them. Furthermore, because we are calculating the circuit breakers, we can also calculate when particular media traffic is underperforming due to other traffic on the network. While we cannot point out what that traffic might be, informing the user or IT admin that something else might be causing an issue is insightful for diagnosis.
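
To give a sense of the raw material, here is a hedged sketch of turning two consecutive stats snapshots into per-interval values. It assumes each snapshot is a plain dict that merges the cumulative `packetsSent` and `bytesSent` counters from an `outbound-rtp` report with `packetsLost` from the matching `remote-inbound-rtp` report, plus a `timestamp` in milliseconds. The merging step and the function name are assumptions, and pairing local and remote counters this way is only approximate.

```python
def interval_summary(prev, curr):
    """Turn two cumulative snapshots into per-interval send rate and loss fraction."""
    elapsed_s = (curr["timestamp"] - prev["timestamp"]) / 1000.0  # timestamps are in ms
    if elapsed_s <= 0:
        return None  # out-of-order or duplicate report
    sent = curr["packetsSent"] - prev["packetsSent"]
    # Cumulative loss can go down when late or duplicate packets arrive; clamp at zero.
    lost = max(curr["packetsLost"] - prev["packetsLost"], 0)
    return {
        "elapsed_s": elapsed_s,
        "send_rate_bps": 8 * (curr["bytesSent"] - prev["bytesSent"]) / elapsed_s,
        "loss_fraction": lost / sent if sent > 0 else 0.0,
    }
```

A sequence of such summaries, one per reporting interval, is the kind of input the circuit breaker conditions need.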

 

Tags: WebRTC, getStats API, QoE