Third-party Authentication Proposal

By Lasse Lumiaho on January 21, 2016

We are enhancing the security mechanism for sending data to the measurement service. We will continue to use origin URLs as a whitelisting mechanism; this limits data gathering to a particular domain (or its subdomains). For WebRTC services (our customers) that desire more robust security, we are proposing an enhanced solution. The main effort in implementing the third-party authentication proposal is that the WebRTC service must develop a part of the security protocol, which mitigates the threats described in the basic authentication.

The proposed new authentication occurs between three parties.

WebRTC application service

This is a WebRTC application or service that manages the signaling for its service, integrates with the measurement service, and contains information about its end-users.

WebRTC client

Clients run the WebRTC stack (e.g., send and receive media that they generate, or forward media generated by other endpoints) and submit measurement data to the measurement service.

Client Library

The client library is a piece of code that resides in the client and sends measurement data to a measurement service. In WebRTC, an endpoint is any device along the IP route that handles the mediastream. Typically, these are clients used by the end-users, either a web browser or a native application that captures and renders the mediastream. However, an endpoint can also be a middlebox that forwards media from one participant to another. The middlebox may, for example, transcode the media (Opus <-> G.711, or VP8 <-> H.264) or transrate it (switch between different media qualities or re-encode the media at a different rate).

In addition to sending measurement data, the client library sends conference state events, for example, a user joined, a user left, someone muted audio or paused video, conference created, conference stopped, etc.

Figure 2: Entities in the third-party authentication (all interactions over HTTPS).

Why not OAuth 2.0

In OAuth 2.0, there is an authentication flow named the “three-legged authentication”. It involves the following parties:

  • Consumer (entity that wants to access the user information)
  • Resource owner (something that owns the user information)
  • Service provider (someone that provides the requested user information)

Normally, the resource owner (e.g. a Twitter user) would access the consumer (e.g. an application that allows logging in using the Twitter credentials). The consumer redirects the resource owner to the service provider (e.g. Twitter). The resource owner authorizes the request (i.e., gives the consumer access to the resource owner’s data on the service provider’s service) and the service provider redirects the resource owner back to the consumer with the authorization information.

In our case, the resource is the ability to store measurement data for a particular userID at the measurement service, and the resource owner is the WebRTC application service. Doing this with OAuth 2.0 would require a large number of round-trips, which results in a longer authorization latency (the time to obtain the final token that allows the user-agent to submit measurement data). A long authorization latency is problematic when a conference fails quickly: if the conference fails to set up (e.g., the user does not grant access to the camera) and the user closes the tab before a valid token has been obtained from the measurement service, the failure event may go unreported.

To keep the number of round-trips low, we have chosen not to use standard OAuth 2.0. However, for the sake of simplicity, we use OAuth 2.0 terminology to explain the protocol interaction.

Protocol Details

The WebRTC client (web or native application) receives an __application_token__ associated with the userID from the WebRTC application service. The client sends the application_token to the measurement service during the authentication step. The measurement service sends the application_token along with the userID to the application service, which validates the application_token and sends a __permission_token__ associated with the userID back to the measurement service. Based on the PERMISSIONS in the permission_token, the measurement service generates a __data_token__ for the particular userID and sends it back to the WebRTC client. The WebRTC client can then use the data_token to submit conference events and measurement data.

The motivation for letting the WebRTC application service generate the application_token is to offer the flexibility of associating and validating a token for a particular userID (or sessionID, conferenceID, etc.) without constraining it to a particular tuple of information.

Figure 3: Protocol interaction between different entities in the third-party authentication process.

1. fetchAppToken message (optional)

The WebRTC client receives the application_token from the WebRTC application service. This can be done on the first page load, or the token can be requested explicitly through a REST API. Providing a REST API (e.g., fetchAppToken) is recommended because it enables the WebRTC client to request a new application_token if the authentication fails.

In the WebRTC application service, the application_token is generally associated with the userID. A WebRTC application service may specifically constrain the use of the application_token to an HTTP session, for example. In cases where the userID is authorized to use the WebRTC service from several endpoints (web browser, Android/iOS application, etc.), the WebRTC application service, depending on its design, may use the same token or different tokens across endpoints. The association of the token with the userID or an HTTP session is at the discretion of the WebRTC application developer.

To summarize, the WebRTC client has an application_token identifying a particular userID.
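As a rough illustration of the token issuance described above, the following sketch shows an application service that binds each application_token to a userID and hands out a separate token per endpoint. The class name, the method names, and the in-memory dictionary are assumptions made for this example; the proposal leaves the actual design to the WebRTC application developer.

```python
import secrets


class ApplicationService:
    """Hypothetical WebRTC application service that issues application_tokens."""

    def __init__(self):
        # Maps each issued application_token to the userID it identifies.
        self._tokens = {}

    def fetch_app_token(self, user_id):
        """Issue a fresh, unguessable application_token for user_id."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = user_id
        return token

    def lookup_user(self, application_token):
        """Return the userID bound to the token, or None if it is unknown."""
        return self._tokens.get(application_token)


service = ApplicationService()
browser_token = service.fetch_app_token("alice")
mobile_token = service.fetch_app_token("alice")  # one token per endpoint (a design choice)
```

A service that prefers a single token across endpoints could instead return the previously issued token for the same userID.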

2. Authenticate Transaction

The client sends the application_token to the measurement service in the Authenticate message. The response to the Authenticate message remains pending until the getPermissionToken transaction is complete.

3. getPermissionToken Transaction

The measurement service sends the getPermissionToken message, which contains the application_token, to the WebRTC application service. The measurement service signs the getPermissionToken message with its private key before sending it.

On receiving the getPermissionToken message, the WebRTC application service verifies the message signature and checks whether the application_token is valid.

If the application_token is valid, the WebRTC application service sends back a signed permission_token that contains a list of valid PERMISSIONS for that userID.

If the application_token is invalid, the measurement service receives an error code, which it passes on to the WebRTC client.
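The getPermissionToken transaction can be sketched as follows. Note one loud substitution: the proposal has each party sign with its private key, whereas this example uses HMAC-SHA256 with shared demo secrets, purely because the Python standard library has no asymmetric signatures. The message fields, key values, error names, and helper functions are all assumptions made for illustration.

```python
import hashlib
import hmac
import json

# Assumption: shared secrets stand in for each party's private/public key pair.
MEASUREMENT_KEY = b"measurement-service-demo-key"
APPLICATION_KEY = b"application-service-demo-key"

# Hypothetical state held by the WebRTC application service.
ISSUED_TOKENS = {"app-token-123": "alice"}
USER_PERMISSIONS = {"alice": ["SubmitConferenceEvent", "SubmitConferenceStats"]}


def sign(key, payload):
    """Serialize payload deterministically and attach an HMAC-SHA256 tag."""
    body = json.dumps(payload, sort_keys=True)
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "signature": tag}


def verify(key, message):
    """Return the decoded payload if the tag matches, otherwise None."""
    expected = hmac.new(key, message["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["signature"]):
        return None
    return json.loads(message["body"])


def handle_get_permission_token(message):
    """Application service side: verify the signature, then the token."""
    request = verify(MEASUREMENT_KEY, message)
    if request is None:
        return sign(APPLICATION_KEY, {"error": "bad_signature"})
    user_id = ISSUED_TOKENS.get(request["application_token"])
    if user_id is None or user_id != request["user_id"]:
        return sign(APPLICATION_KEY, {"error": "invalid_application_token"})
    permission_token = {"user_id": user_id, "permissions": USER_PERMISSIONS[user_id]}
    return sign(APPLICATION_KEY, permission_token)


# Measurement service side: sign the request, then verify the signed response.
request = sign(MEASUREMENT_KEY, {"application_token": "app-token-123", "user_id": "alice"})
response = verify(APPLICATION_KEY, handle_get_permission_token(request))
```

In a real deployment the two `verify` calls would check a public-key signature instead, so that neither party needs to hold the other's signing secret.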

4. Authenticate response

On receiving the response to the getPermissionToken message, the measurement service verifies the signature of the response. If the signature is valid, the measurement service stores the PERMISSIONS indicated in the permission_token for the associated userID, and later uses this set of permissions to verify whether the userID is authorized to perform a requested action.

Meanwhile, the measurement service creates a data_token for the associated userID based on the permission_token, and sends the data_token to the WebRTC client. The data_token typically has a limited validity (a fixed amount of time, or until the end of the current conferenceID).

If the application_token or the permission_token is invalid, the measurement service sends an error to the client, which restarts the process by getting a new application_token from the WebRTC application service.
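A minimal sketch of a time-limited data_token, assuming a fixed lifetime, might look like this. The ten-minute lifetime, the class name, and the injectable clock are assumptions for the example; the proposal does not fix a validity period, and validity tied to the end of a conferenceID is not modeled here.

```python
import secrets
import time

DATA_TOKEN_TTL_SECONDS = 600  # assumed lifetime; the proposal does not fix a value


class DataTokenStore:
    """Hypothetical measurement-service store for issued data_tokens."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so that expiry can be exercised in tests
        self._tokens = {}  # data_token -> (userID, permissions, expiry timestamp)

    def issue(self, user_id, permissions):
        """Create a data_token carrying the PERMISSIONS from the permission_token."""
        token = secrets.token_urlsafe(32)
        expires_at = self._clock() + DATA_TOKEN_TTL_SECONDS
        self._tokens[token] = (user_id, frozenset(permissions), expires_at)
        return token

    def lookup(self, token):
        """Return (userID, permissions) for a live token, or None otherwise."""
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user_id, permissions, expires_at = entry
        if self._clock() >= expires_at:
            del self._tokens[token]  # expired tokens are forgotten
            return None
        return user_id, permissions


now = [1000.0]  # fake clock so the example does not need to sleep
store = DataTokenStore(clock=lambda: now[0])
token = store.issue("alice", ["SubmitConferenceStats"])
fresh = store.lookup(token)
now[0] += DATA_TOKEN_TTL_SECONDS
expired = store.lookup(token)
```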

Data submission

On receiving the data_token, the WebRTC client can use it to send new conference events (conference started, conference ended, etc) or measurement statistics (conference stats, bridge overall transport metrics, bridge overall health metrics).

On receiving a data submission message, the measurement service verifies whether the data_token associated with the userID is permitted to perform the action, based on the PERMISSIONS indicated in the permission_token. If the data_token is invalid, the measurement service sends an error message, which restarts the authentication process. Alternatively, if the permission is invalid, i.e., the userID is attempting to perform an action that it is not permitted to perform, the application developer needs to generate a permission_token with the correct PERMISSIONS during the getPermissionToken transaction.
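The two failure cases above call for different reactions, which the following sketch tries to make explicit. The action names, the mapping from actions to permissions, and the status strings are assumptions for illustration only.

```python
# Hypothetical state on the measurement service: data_token -> (userID, PERMISSIONS).
DATA_TOKENS = {
    "data-token-abc": ("alice", {"SubmitConferenceEvent", "SubmitConferenceStats"}),
}

# Assumed mapping from a submitted action to the permission it requires.
REQUIRED_PERMISSION = {
    "conference_created": "CreateConference",
    "user_joined": "SubmitConferenceEvent",
    "conference_stats": "SubmitConferenceStats",
}


def handle_submission(data_token, action):
    """Return a status string for one data submission message."""
    entry = DATA_TOKENS.get(data_token)
    if entry is None:
        # Unknown or expired data_token: the client restarts authentication.
        return "error: invalid data_token, re-authenticate"
    _user_id, permissions = entry
    needed = REQUIRED_PERMISSION[action]
    if "*" in permissions or needed in permissions:
        return "accepted"
    # Valid token but missing permission: the application service has to
    # issue a permission_token with the correct PERMISSIONS.
    return "error: missing permission " + needed
```

For example, `handle_submission("data-token-abc", "conference_created")` is rejected for a missing permission, because this userID was only granted the two Submit permissions.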

List of Permissions

A permission_token contains a list of permissions:

  • Wildcard (*): all permissions are granted.
  • CreateConference: can create a new conferenceID.
  • TerminateConference: can end a conferenceID.
  • SubmitBridgeEvent: can submit events as a middlebox.
  • SubmitBridgeStats: can submit measurement data as a middlebox.
  • SubmitConferenceEvent: can submit events as a userID at an endpoint.
  • SubmitConferenceStats: can submit measurement data as a userID at an endpoint.

In some deployment scenarios, typically in mesh-based conferencing topologies, application developers may prefer granting all the conference call participants wildcard permissions, i.e., anyone can create or end a conference, and send statistics. However, in topologies that have a centralized conference server, the application developer may prefer to limit the permission to create and terminate a conference to the centralized bridge (indicated by a unique bridgeID), instead of granting it to every participant in the conference (each indicated by a userID).
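One way an application service might choose the PERMISSIONS for a permission_token under the two topologies is sketched below. The permission values come from the list above; the `permissions_for` policy, its topology labels, and the `is_allowed` helper are assumptions for the example.

```python
from enum import Enum


class Permission(Enum):
    WILDCARD = "*"
    CREATE_CONFERENCE = "CreateConference"
    TERMINATE_CONFERENCE = "TerminateConference"
    SUBMIT_BRIDGE_EVENT = "SubmitBridgeEvent"
    SUBMIT_BRIDGE_STATS = "SubmitBridgeStats"
    SUBMIT_CONFERENCE_EVENT = "SubmitConferenceEvent"
    SUBMIT_CONFERENCE_STATS = "SubmitConferenceStats"


def permissions_for(topology, is_bridge):
    """Pick the PERMISSIONS to place in a permission_token (illustrative policy)."""
    if topology == "mesh":
        # Every participant may create or end a conference and send statistics.
        return {Permission.WILDCARD}
    if is_bridge:
        # Centralized topology: only the bridge creates and terminates conferences.
        return {
            Permission.CREATE_CONFERENCE,
            Permission.TERMINATE_CONFERENCE,
            Permission.SUBMIT_BRIDGE_EVENT,
            Permission.SUBMIT_BRIDGE_STATS,
        }
    # Ordinary participant behind a centralized bridge.
    return {Permission.SUBMIT_CONFERENCE_EVENT, Permission.SUBMIT_CONFERENCE_STATS}


def is_allowed(granted, wanted):
    """A wildcard grants everything; otherwise the permission must be listed."""
    return Permission.WILDCARD in granted or wanted in granted
```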


We wrote this page to gather feedback. Please send your comments to sec-arch[at]

Tags: WebRTC Monitoring, Security