One of the most common requests we hear from customers is to measure the quality of experience for a WebRTC session using a Mean Opinion Score (MOS). Our Objective Quality metric has provided a measure of QoE for several years. However, its unfamiliar decibel scale confused some customers and its reliance on a single numerical value to represent complex user perceptions limited its utility. In response, we are happy to introduce the callstats.io Extended Mean Opinion Score (eMOS), which renders call quality ratings on a familiar 1-5 scale and incorporates several innovations not available in other quality metrics.
Extended MOS for deeper insights
As noted in a previous post, measuring Quality of Experience (QoE) in WebRTC is challenging because the codecs and protocols it uses are new and highly sophisticated. In addition, WebRTC services range from the simple two-party audio calls found in contact centers to the complex multiparty video conferences found in collaboration services. For all these reasons, WebRTC QoE is difficult to model.
Nonetheless, our team has made significant strides. Leveraging three years of experience monitoring QoE for WebRTC sessions, eMOS is a next generation metric that enables rapid quality analysis at many levels of aggregation. We call it Extended MOS because it provides deeper insights than typical MOS estimators. Key features include:
- Measures quality for WebRTC voice and video - a single, familiar scale enables at-a-glance analysis for any call type
- Accurately aggregates values for multi-party calls - the model accounts for variations in the number of participants in a call, the duration of their participation, and the media types they exchange
- Continuously measures quality at regular intervals throughout a call - captures spontaneous and intermittent quality variations
- Histogram visualizes quality distributions - the Dashboard displays a histogram that enables operators to quickly analyze the distribution of quality throughout the measurement interval
- Values are aggregated at multiple levels - enables monitoring at the service/application level and rapid drill-down to an individual call, a single user, or even a specific media stream
How we calculate eMOS
Several quality models (commonly called "QoS to QoE" models in the QoE literature) are combined to calculate eMOS values. These models map a range of network performance parameters (loss, delay, etc.) and media encoding parameters (codec, bit rate, etc.) into an estimate of how a typical subject would rate the media on the 5-point Absolute Category Rating (ACR) scale found in many ITU-T standards. This is done for each modality in a WebRTC call (audio, video, screen sharing). It produces values ranging from 1 to 5, corresponding to Bad, Poor, Fair, Good, and Excellent.
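To make the mapping concrete, here is a minimal TypeScript sketch of a QoS-to-QoE model for a single audio stream. The input fields, coefficients, and impairment curves are illustrative assumptions made for this post, not the production models used by callstats.io:

```typescript
// Hypothetical sketch of a QoS-to-QoE mapping for one audio stream.
// The field names, coefficients, and thresholds are illustrative only.
interface StreamQoS {
  packetLossPct: number;   // observed packet loss, 0-100
  roundTripTimeMs: number; // network round-trip time
  bitrateKbps: number;     // media encoding bit rate
}

function estimateAudioMos(qos: StreamQoS): number {
  // Start from the best achievable rating and subtract impairments.
  let mos = 4.5;
  // Loss is often the dominant impairment for audio (assumed curve).
  mos -= 0.1 * qos.packetLossPct;
  // Penalize latency beyond a comfortable threshold (assumed).
  mos -= Math.max(0, (qos.roundTripTimeMs - 150) / 100) * 0.25;
  // Penalize very low bit rates (assumed threshold).
  if (qos.bitrateKbps < 24) mos -= 0.5;
  // Clamp to the 1-5 ACR scale: Bad, Poor, Fair, Good, Excellent.
  return Math.min(5, Math.max(1, mos));
}

// Example: mild loss and latency still rate "Good" (about 4.2).
console.log(estimateAudioMos({ packetLossPct: 2, roundTripTimeMs: 180, bitrateKbps: 32 }));
```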
The models are continuously applied over a sliding time window, taking recency effects into account. This is important because when degradations happen during a call, their effect on the users' perception of quality varies with time. The models pool the quality observed in the window, and aggregate the overall results at the end of the call. This helps to account for interactions between quality and time.
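A recency-weighted pooling step might look like the sketch below, where the window size and decay factor are assumed values rather than our production parameters:

```typescript
// Illustrative sketch: pool per-interval quality samples over a sliding
// window, weighting recent samples more heavily (recency effect). The
// window size and decay factor are assumptions, not production values.
function pooledMos(samples: number[], windowSize = 10, decay = 0.8): number {
  const recent = samples.slice(-windowSize);
  let weightedSum = 0;
  let weightTotal = 0;
  recent.forEach((mos, i) => {
    // Higher i means a more recent sample, so it gets a higher weight.
    const weight = Math.pow(decay, recent.length - 1 - i);
    weightedSum += mos * weight;
    weightTotal += weight;
  });
  return weightTotal > 0 ? weightedSum / weightTotal : NaN;
}
```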
The models consider a range of call-related events, including media stream mute conditions. When a stream is muted or paused, its contribution to the eMOS calculation is ignored. The model can be enriched with more call-related events as we develop them (e.g., considering the dominant speaker, other sources of quality disruption, and more).
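As a small illustration of that behavior, a filtering step can simply drop muted intervals before pooling; the `IntervalSample` shape here is hypothetical:

```typescript
// Sketch: exclude muted/paused intervals from the quality calculation.
// The IntervalSample shape is hypothetical, for illustration only.
interface IntervalSample {
  mos: number;
  muted: boolean; // the stream was muted or paused during this interval
}

function activeSamples(samples: IntervalSample[]): number[] {
  // A muted stream carries no perceivable media, so its intervals
  // contribute nothing to the eMOS calculation.
  return samples.filter((s) => !s.muted).map((s) => s.mos);
}
```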
eMOS values for each media stream are aggregated in a way that mirrors the structure of the call. Consider a multiparty video call, for example. eMOS is calculated for each media stream exchanged by each pair of users (inbound and outbound) and combined to understand the quality perceived by the user. In this multiparty example, the inbound and outbound user streams are typically exchanged with a bridge. These values are aggregated for all users connected to the bridge to produce an overall value for the entire call. The dashboard displays a single eMOS estimate and a more detailed distribution of estimates over the duration of the call. This enables operators to analyze call quality for each user, between pairs of users, for users connected to a bridge, and for the overall call (bridge, plus all users).
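The sketch below illustrates this stream-to-user-to-conference aggregation. Pooling with an unweighted mean is a simplifying assumption; as noted above, the real model also accounts for factors such as duration and media type:

```typescript
// Hypothetical sketch of aggregation that mirrors the call structure:
// stream -> user -> conference.
interface UserStreams {
  userId: string;
  inboundMos: number[];  // one value per inbound stream (from the bridge)
  outboundMos: number[]; // one value per outbound stream (to the bridge)
}

const mean = (xs: number[]) =>
  xs.length ? xs.reduce((a, b) => a + b, 0) / xs.length : NaN;

// Quality as perceived by one user: pool their inbound and outbound streams.
function userMos(u: UserStreams): number {
  return mean([...u.inboundMos, ...u.outboundMos]);
}

// Overall call quality: pool across all users connected to the bridge.
function conferenceMos(users: UserStreams[]): number {
  return mean(users.map(userMos));
}
```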
By aggregating granular eMOS data, callstats.io enables dashboard operators to rapidly drill down and troubleshoot quality problems. For example, if we observe a low eMOS value for a given user, the problem could lie between the bridge and the user (in which case all of that user's inbound qualities would be low) or with one or more remote participants (which reduces the respective stream values and pulls down the overall eMOS score).
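That drill-down reasoning can be expressed as a simple check over one user's inbound streams, as in this sketch; the degradation threshold and the `diagnose` helper are hypothetical:

```typescript
// Sketch: if every inbound stream for a user is degraded, suspect the
// bridge-to-user path; if only some are, suspect specific remote
// participants. The threshold is illustrative.
function diagnose(inbound: { fromUserId: string; mos: number }[]): string {
  const LOW = 2.5; // assumed "degraded" threshold on the 1-5 scale
  const degraded = inbound.filter((s) => s.mos < LOW);
  if (inbound.length > 0 && degraded.length === inbound.length) {
    return "All inbound streams degraded: suspect the bridge-to-user path.";
  }
  if (degraded.length > 0) {
    return `Degraded streams from: ${degraded.map((s) => s.fromUserId).join(", ")}`;
  }
  return "No degraded inbound streams.";
}
```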
eMOS calculations occur roughly once a minute for each call. The results are aggregated at the end of the call, when we calculate the distribution of eMOS values at each level of the call (conference, user, user-to-user, etc.), as well as other summary statistics. The results presented in the dashboard correspond to the distribution itself (or its mode, in the collapsed view) and the median eMOS value.
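An end-of-call summary along these lines could be computed as in the sketch below; bucketing samples into the five ACR categories by rounding is an assumption made for illustration:

```typescript
// Sketch: bucket per-interval eMOS samples into the five ACR categories,
// then report the mode and median shown in the dashboard.
function summarize(samples: number[]) {
  const counts = new Map<number, number>(); // ACR bucket (1-5) -> count
  for (const mos of samples) {
    const bucket = Math.min(5, Math.max(1, Math.round(mos)));
    counts.set(bucket, (counts.get(bucket) ?? 0) + 1);
  }
  // Mode: the most frequent bucket (the value shown in the collapsed view).
  const mode = [...counts.entries()].sort((a, b) => b[1] - a[1])[0]?.[0];
  // Median of the raw samples.
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  let median = NaN;
  if (sorted.length) {
    median = sorted.length % 2
      ? sorted[mid]
      : (sorted[mid - 1] + sorted[mid]) / 2;
  }
  return { distribution: counts, mode, median };
}
```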
Quickly assess QoE with the score distribution
The Call Detail screen presents eMOS values using a stacked bar histogram, enabling dashboard operators to evaluate quality at a glance. The score visualization can be collapsed to the mode of the distribution (i.e., the predominant value), accompanied by the median score and further summary statistics.
To facilitate troubleshooting down to the user level, the Call Detail screen presents point-to-point eMOS values between users. This enables operators to determine whether a problem reported by a user is isolated to a particular subset of participants or affects all participants.
Migrating from Objective Quality to eMOS
The new eMOS metric incorporates improvements in our perceptual quality model for voice. In this first release, we apply the improved model to the Opus codec (the codec used for the overwhelming majority of WebRTC sessions). Subsequent releases will add upgrades for video and other codecs.
In comparison to Objective Quality (OQ), dashboard operators will enjoy these new features with eMOS:
- Histogram enables at-a-glance visualization of quality throughout the duration of a call
- Granular scoring of individual media streams, enabling more rapid troubleshooting
- More accurate perceptual quality estimates for Opus-based voice media streams
Note that customers familiar with OQ can continue to monitor calls with that metric. A switch enables dashboard operators to choose OQ or eMOS as their preferred QoE metric.
eMOS: an actionable metric
The overarching goal of the new eMOS metric is to provide a more comprehensive view of QoE through more actionable and familiar information. This means customers gain a better understanding of how their service is performing, with lower cognitive load. And when it comes to troubleshooting, you will be able to isolate and diagnose problems more rapidly.