
Secrets to Monitoring WebRTC Endpoints in a Cloud Contact Center

By Pallab Gain on October 24, 2019

When you move your contact center to a cloud provider, you improve agility and simplify operations, but you also lose visibility and control over your infrastructure. You’ll need new tools to monitor network performance and gauge the customer experience when using contact-center-as-a-service (CCaaS) platforms like Amazon Connect, Twilio Flex and NewVoiceMedia. In this blog, I look at the critical role the agent endpoint plays in providing visibility into user experiences, and at some of the techniques available to monitor the endpoint. If you are evaluating a WebRTC monitoring solution, it is important to think about the visibility it provides into the agent and customer experiences.

Why You Need a WebRTC Monitoring Solution 

WebRTC is an open standard project for delivering real-time communications capabilities natively in a browser. Many CCaaS platforms use WebRTC for the agent leg of a customer call, as shown below.

 

[Figure: Secrets to Endpoint Monitoring Blog]

WebRTC audio quality can suffer for a variety of reasons, including:

  • Network performance issues with the underlying transport network, i.e. the agent’s end-to-end connection through the public internet to the CCaaS platform.
  • Software issues with the WebRTC protocol stack running in the agent’s browser, or issues related to signaling, media transmission or NAT traversal. 
  • Endpoint platform issues like configuration errors, compatibility issues or environmental issues like CPU or memory constraints.

If you use a CCaaS platform, WebRTC monitoring tools can help you analyze network performance, troubleshoot hardware and software problems, and resolve potential service quality issues. 

Why Monitor the Contact Center Agent Endpoint?

The contact center agent provides the best vantage point to gauge the user experience. At the risk of stating the obvious, because WebRTC is implemented in the browser, the best place to get WebRTC performance data is directly from the browser running on a WebRTC endpoint. The WebRTC getStats API supports an extensive collection of real-time communications statistics that can be queried directly from an agent’s browser.

Generally speaking, the getStats API supports three types of endpoint statistics that are vital to analyzing performance and troubleshooting problems with WebRTC. They correspond to stages in the media pipeline, as described in Marcin Nagy’s blog How Problems in the WebRTC Media Pipeline Affect Quality of Experience:

  • Network connectivity - packet transport metrics, including throughput, loss, delay and jitter.
  • Media - media stream metrics, including throughput per channel (audio, video and data).
  • Signaling - metrics for session negotiation (SDP or XMPP) and address resolution (STUN/TURN/ICE).
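
As a rough illustration of the network and media categories above, the following TypeScript sketch reduces one stats snapshot to a few audio metrics. The field names (`packetsLost`, `packetsReceived`, `jitter`, `currentRoundTripTime`) follow the W3C WebRTC statistics identifiers, but `StatsEntry`, `AudioSummary` and `summarizeAudio` are hypothetical names introduced for this example, and the sample values are made up.

```typescript
// Minimal local shape for the stats entries used here. In a browser, entries
// like these come from the report returned by RTCPeerConnection.getStats().
interface StatsEntry {
  type: string;
  kind?: string;
  packetsLost?: number;
  packetsReceived?: number;
  jitter?: number;               // seconds
  currentRoundTripTime?: number; // seconds
  nominated?: boolean;
}

interface AudioSummary {
  lossRatio: number | null;
  jitterMs: number | null;
  rttMs: number | null;
}

// Hypothetical helper: reduce one snapshot to a few audio metrics.
function summarizeAudio(entries: StatsEntry[]): AudioSummary {
  const summary: AudioSummary = { lossRatio: null, jitterMs: null, rttMs: null };
  for (const s of entries) {
    if (s.type === "inbound-rtp" && s.kind === "audio") {
      const lost = s.packetsLost ?? 0;
      const received = s.packetsReceived ?? 0;
      const expected = lost + received;
      // Cumulative loss ratio since the start of the stream.
      summary.lossRatio = expected > 0 ? lost / expected : null;
      summary.jitterMs = s.jitter !== undefined ? s.jitter * 1000 : null;
    } else if (
      s.type === "candidate-pair" &&
      s.nominated === true &&
      s.currentRoundTripTime !== undefined
    ) {
      summary.rttMs = s.currentRoundTripTime * 1000;
    }
  }
  return summary;
}

// Example snapshot with made-up values.
const snapshot: StatsEntry[] = [
  { type: "inbound-rtp", kind: "audio", packetsLost: 5, packetsReceived: 995, jitter: 0.012 },
  { type: "candidate-pair", nominated: true, currentRoundTripTime: 0.08 },
];
const audioSummary = summarizeAudio(snapshot);
```

In a browser, the input array could be built from the values of the report that `getStats()` resolves to. Signaling metrics are not covered by this sketch.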

Typical challenges associated with using the getStats API to monitor WebRTC endpoints include keeping track of the WebRTC statistics each browser vendor supports, and ensuring endpoint statistics are collected and transported efficiently and securely. 

Tracking Browser Releases

The W3C/IETF getStats API specifications outline hundreds of potential WebRTC statistics. The vast majority of these statistics are optional, so support varies from browser to browser, and from browser release to browser release. (Our getStats API matrix tracks the individual statistics supported in specific browser releases.)
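
Because support for individual statistics varies, a collector may want to check which fields a given browser actually reports before relying on them. The sketch below is one hedged way to do that; `StatsRecord` and `missingFields` are hypothetical names, and the sample snapshot is invented for illustration.

```typescript
// A stats entry is treated as a plain record with a "type" discriminator.
type StatsRecord = { type: string; [field: string]: unknown };

// Hypothetical helper: list which expected fields are absent from the first
// entry of the given type, so a collector can skip or flag them.
function missingFields(entries: StatsRecord[], type: string, expected: string[]): string[] {
  for (const entry of entries) {
    if (entry.type === type) {
      return expected.filter((field) => !(field in entry));
    }
  }
  // No entry of this type at all: treat every expected field as missing.
  return expected.slice();
}

// Made-up snapshot from a browser that does not report jitterBufferDelay.
const reportFromBrowser: StatsRecord[] = [
  { type: "inbound-rtp", packetsLost: 0, jitter: 0.004 },
];
const gaps = missingFields(reportFromBrowser, "inbound-rtp", [
  "packetsLost",
  "jitter",
  "jitterBufferDelay",
]);
// gaps is ["jitterBufferDelay"] for this snapshot
```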

To further complicate matters, browser vendors introduce new releases quite frequently.  Google, for example, introduces new major releases of Chrome every six weeks, sometimes adding new stats. In any given contact center, agents may be running a variety of browsers (Chrome, Firefox, Safari, etc.) at different release levels, so it’s important for a monitoring software provider to follow browser releases closely and perform regular regression tests to avoid incompatibilities and keep pace with change. (At callstats.io, we use an open-source automated test engine called KITE to regularly validate interoperability with various browser releases.)

It is also worth noting that there is some help on the way. Chromium-based browsers like Microsoft Edge and Opera leverage common technology, which helps unify statistics support and reduce regression testing.

Gathering and Transporting WebRTC Statistics

You’ll need to transport WebRTC statistics from endpoints to a central repository for consolidation, analysis and reporting. You may need to make tradeoffs between the breadth and depth of the data you collect and WAN bandwidth consumption. On the one hand, you want to capture and forward as much statistical data as possible, as frequently as possible for ultimate granularity. On the other hand, you don’t want to overwhelm the network with statistical data. (In extreme cases, WebRTC performance monitoring and troubleshooting tools can actually impair service quality and exacerbate problems by seizing bandwidth, a phenomenon known as the observer effect in physics.)  Unfortunately, the less frequently you capture statistics, the more likely you are to miss a short-lived event like burst packet loss.
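
The tradeoff can be made concrete with some back-of-the-envelope arithmetic. The sketch below uses illustrative numbers only (a 2,000-byte report, 50 audio packets per second, a burst of 25 lost packets); none of these figures come from a measurement, and the function names are hypothetical.

```typescript
// Average upstream bitrate consumed by stats reporting, in kbps.
function reportingBitrateKbps(bytesPerReport: number, intervalSeconds: number): number {
  if (intervalSeconds <= 0) throw new RangeError("intervalSeconds must be positive");
  return (bytesPerReport * 8) / intervalSeconds / 1000;
}

// Loss ratio seen in a single reporting window that contains the whole burst.
function windowLossRatio(lostPackets: number, packetsPerSecond: number, windowSeconds: number): number {
  return lostPackets / (packetsPerSecond * windowSeconds);
}

// Reporting every second costs ten times the bandwidth of reporting every 10 s...
const everySecondKbps = reportingBitrateKbps(2000, 1);      // 16 kbps
const everyTenSecondsKbps = reportingBitrateKbps(2000, 10); // 1.6 kbps

// ...but the same 25-packet burst looks very different in each window.
const burstInOneSecond = windowLossRatio(25, 50, 1);   // 0.5, i.e. 50% loss
const burstInTenSeconds = windowLossRatio(25, 50, 10); // 0.05, i.e. 5% loss
```

With these assumed inputs, the longer interval saves bandwidth but averages a severe half-second burst down to a loss figure that may not trigger any alert.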

Augmenting getStats API Data

To develop a complete view of the user experience, you need to collect and examine statistical data from both endpoints in a WebRTC session. Most cloud contact centers establish simple point-to-point WebRTC connections between agents and a PSTN gateway in the CCaaS infrastructure. The PSTN gateway in the CCaaS infrastructure typically does not support the getStats API. You can use the Real-time Transport Control Protocol (RTCP) as an alternative to the getStats API to gain visibility into the PSTN gateway side of the session.

RTCP allows one endpoint to exchange performance statistics with other endpoints in a WebRTC session. RTCP statistics are robust and include delay, loss, jitter and throughput measurements (plus many more statistics if RFC 3611 RTCP Extended Reports [XR] are enabled). By combining getStats API data from an agent endpoint with RTCP data, you can obtain a full, end-to-end view of the session.
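
One way to picture the combination: browsers that implement the `remote-inbound-rtp` stats type expose what the far end reported back in RTCP Receiver Reports, alongside the locally measured `inbound-rtp` values. The sketch below merges the two into a single view. It assumes the gateway sends Receiver Reports and the browser surfaces them; `LocalInbound`, `RemoteInbound`, `SessionView` and `buildSessionView` are hypothetical names, and the values are made up.

```typescript
// Locally measured receive-side stats (from an "inbound-rtp" entry).
interface LocalInbound { packetsLost: number; packetsReceived: number; jitter: number }

// What the remote side reported via RTCP (from a "remote-inbound-rtp" entry).
interface RemoteInbound { fractionLost: number; jitter: number; roundTripTime: number }

interface DirectionView { lossRatio: number; jitterMs: number }

interface SessionView {
  gatewayToAgent: DirectionView; // measured by the agent's browser
  agentToGateway: DirectionView; // reported by the gateway over RTCP
  rttMs: number;
}

// Hypothetical helper: combine both directions into one view of the session.
function buildSessionView(local: LocalInbound, remote: RemoteInbound): SessionView {
  const expected = local.packetsLost + local.packetsReceived;
  return {
    gatewayToAgent: {
      // Cumulative loss ratio since the stream started.
      lossRatio: expected > 0 ? local.packetsLost / expected : 0,
      jitterMs: local.jitter * 1000,
    },
    agentToGateway: {
      // fractionLost covers only the most recent RTCP reporting interval.
      lossRatio: remote.fractionLost,
      jitterMs: remote.jitter * 1000,
    },
    rttMs: remote.roundTripTime * 1000,
  };
}

const sessionView = buildSessionView(
  { packetsLost: 10, packetsReceived: 990, jitter: 0.01 },
  { fractionLost: 0.02, jitter: 0.015, roundTripTime: 0.12 },
);
```

Note that the two loss figures are not directly comparable, since one is cumulative and the other covers a single reporting interval.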

You can also use RTCP to gather statistics from intermediary network elements like TURN servers (or MCUs and SFUs in WebRTC conferencing and collaboration applications). By monitoring both endpoints of the session, along with intermediary network elements, you can increase the depth and breadth of the statistics you gather, which can help you identify and resolve issues more quickly and efficiently. (In a peer-to-peer WebRTC application like a video chat, you can get a complete view of a session by monitoring both endpoints using the getStats API, or by monitoring one endpoint via the getStats API and the other via RTCP stats.)

callstats.io Provides WebRTC Health and Performance Insights 

callstats.io is built from the ground up to help contact center managers optimize WebRTC audio quality and improve user experiences. The callstats.io solution embeds advanced monitoring functionality into WebRTC endpoints, giving operations teams real-time visibility into key network performance indicators and service quality metrics. The solution gathers all supported WebRTC statistics from each endpoint, transforming raw data into actionable insights. 

We use an adaptive querying/reporting algorithm to balance statistical granularity with bandwidth consumption. At the beginning of a WebRTC session the callstats.io client queries the browser for statistics every second, and sends the results to an upstream callstats.io data collector. Once we determine there is no anomalous activity, we back off, and reduce the querying/reporting frequency to conserve bandwidth. Our adaptive stats algorithm provides full visibility into key performance metrics, without overburdening the network or impairing the user experience.
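
To make the back-off idea concrete, here is a minimal sketch of an adaptive polling policy. It is not the callstats.io algorithm, whose details are not described here; the maximum interval, the doubling step and the number of quiet samples before backing off are all assumptions chosen for illustration, as are the names `PollState` and `nextPollState`.

```typescript
// Hypothetical policy values. Only the one-second starting interval is taken
// from the description above; everything else is an assumption.
const MIN_INTERVAL_MS = 1000;
const MAX_INTERVAL_MS = 10000;
const QUIET_SAMPLES_BEFORE_BACKOFF = 5;

interface PollState { intervalMs: number; quietSamples: number }

// Decide the next polling interval from the latest observation.
function nextPollState(state: PollState, anomalous: boolean): PollState {
  if (anomalous) {
    // Something looks wrong: return to the finest granularity.
    return { intervalMs: MIN_INTERVAL_MS, quietSamples: 0 };
  }
  const quietSamples = state.quietSamples + 1;
  if (quietSamples >= QUIET_SAMPLES_BEFORE_BACKOFF) {
    // A run of quiet samples: double the interval, up to the cap.
    return { intervalMs: Math.min(state.intervalMs * 2, MAX_INTERVAL_MS), quietSamples: 0 };
  }
  return { intervalMs: state.intervalMs, quietSamples };
}

// Six quiet samples followed by one anomalous sample.
let pollState: PollState = { intervalMs: MIN_INTERVAL_MS, quietSamples: 0 };
const observations = [false, false, false, false, false, false, true];
const intervals: number[] = [];
for (const anomalous of observations) {
  pollState = nextPollState(pollState, anomalous);
  intervals.push(pollState.intervalMs);
}
// intervals is [1000, 1000, 1000, 1000, 2000, 2000, 1000]
```

Resetting to the minimum interval on any anomaly favors catching short-lived events over saving bandwidth; a real deployment would need its own definition of "anomalous".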

Of course, callstats.io employs strong security measures to protect the privacy of WebRTC metadata. We authenticate callstats.io clients to prevent masquerading, encrypt data in transit to prevent eavesdropping and man-in-the-middle attacks, encrypt data at rest to protect data confidentiality, and implement strong access control mechanisms to prevent unauthorized data disclosure.  (For more details, read Shaohong Li’s blog How callstats.io Monitors WebRTC Sessions Without Compromising Data Privacy.)

Since WebRTC is our sole focus, we closely follow browser releases, and continuously regression test our software against the latest releases to keep pace with change. We also maintain a close working relationship with Google so we can closely track the latest Chrome developments.

Conclusion:  Improve CCaaS User Experiences with WebRTC Endpoint Monitoring 

When you deploy a cloud contact center solution, you’ll need new tools to monitor WebRTC performance and service quality. Whether you develop your own monitoring tools or use a commercial solution like callstats.io, it is important to consider the performance implications of collecting and analyzing massive WebRTC datasets. Endpoint-based monitoring is ideal for gauging user experiences and gathering detailed statistics at scale. However, you need to take bandwidth utilization into account and be sure to track browser changes closely.

Read: How Problems in the WebRTC Media Pipeline Affect Quality of Experience
