Technical overview

Media streaming capabilities

The media streaming capabilities of WebRTC are similar to those of traditional VoIP and RTC, but they differ in several details that affect interoperability.

WebRTC mandates support for two audio codecs, Opus and G.711. In practice, browsers are unlikely to implement proprietary codecs such as G.729 that are implemented in many desk phones. Opus is superior to many of the legacy codecs but it does mean there is a possibility of transcoding with a slight degradation in quality when there is interoperability with legacy technology.

Since November 2014, two video codecs, VP8 and H.264, have been mandatory for the browser. VP8 is a royalty-free codec that is likely to be widely adopted by newer technologies. H.264 exists in many legacy video/webcam/teleconferencing solutions and its presence in the browser enables end-to-end video without transcoding, reducing complexity and CPU requirements and avoiding degradation of the picture.

Despite the support for legacy codecs, many other features of the browser's media stack differ and do not offer direct connectivity to most legacy technology.

A major feature of WebRTC is the use of Interactive Connectivity Establishment (ICE) for effective NAT discovery and traversal. Many legacy technologies, including a lot of softphones and desk phones, do not support ICE; at best they support its predecessor, STUN.

The next major feature of WebRTC is encryption. Many legacy phones support basic SDES encryption, with key exchange in the Session Description Protocol crypto attributes. WebRTC insists on the use of DTLS-SRTP which offers more security but with a complete loss of backwards compatibility.

The final point is the RTP packet itself. Many traditional devices and softphones support RTP/AVP. WebRTC requires RTP/AVPF.

The combined effect of these small differences is that WebRTC media streams have a higher quality than traditional RTC but they can't interoperate directly with the majority of desk phones and softphones already deployed. Given the extremely wide deployment of web browsers (hundreds of millions of users have already upgraded to browser versions that include WebRTC support) it is envisaged that vendors of related technologies will aim to interoperate with the browser in future versions of their products.

In the meantime, it is necessary for media streams to be passed through some intermediate network component that can transform the streams between the standards. This component is referred to by different names, including Session Border Controller (SBC), Back-to-Back User Agent (B2BUA) and Media Breaker. In practice, it is relatively easy to configure an Asterisk or FreeSWITCH server to perform this role.
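The three differences described above (ICE, DTLS-SRTP and the AVPF profile) are all visible in the session description that each endpoint sends. The following is a minimal sketch, not taken from any particular gateway, of how an intermediate component might inspect an SDP body to decide which kind of endpoint produced it. The function names and the two abbreviated SDP fragments are invented for illustration, and the key and fingerprint values are placeholders.

```python
# Sketch only: classify an SDP body by the features discussed in this section.
# describe_sdp() and looks_like_webrtc() are hypothetical helper names.

def describe_sdp(sdp: str) -> dict:
    """Report which media features an SDP body advertises."""
    lines = [line.strip() for line in sdp.splitlines() if line.strip()]
    profiles = []
    for line in lines:
        fields = line.split()
        # m=<media> <port> <profile> <formats...>
        if line.startswith("m=") and len(fields) >= 3:
            profiles.append(fields[2])
    return {
        "profiles": profiles,
        "ice": any(l.startswith("a=ice-ufrag:") for l in lines),
        "dtls_srtp": any(l.startswith("a=fingerprint:") for l in lines),
        "sdes": any(l.startswith("a=crypto:") for l in lines),
        "avpf": bool(profiles) and all(p.endswith("AVPF") for p in profiles),
    }


def looks_like_webrtc(info: dict) -> bool:
    """True when ICE, DTLS-SRTP and an AVPF-based profile are all present."""
    return info["ice"] and info["dtls_srtp"] and info["avpf"]


# Abbreviated, made-up fragments: a browser-style offer and a legacy SDES offer.
BROWSER_STYLE = """\
m=audio 9 UDP/TLS/RTP/SAVPF 111 0 8
a=ice-ufrag:abcd
a=ice-pwd:placeholderpassword000000
a=fingerprint:sha-256 00:11:22:33
"""

LEGACY_STYLE = """\
m=audio 49170 RTP/SAVP 0 8
a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:PLACEHOLDERKEY
"""

browser_info = describe_sdp(BROWSER_STYLE)
legacy_info = describe_sdp(LEGACY_STYLE)
print(looks_like_webrtc(browser_info), looks_like_webrtc(legacy_info))
```

When the two sides of a call disagree on these features, the streams have to pass through a component such as the SBC or B2BUA described above.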

Signalling protocols

As already mentioned, WebRTC does not provide a signalling protocol for the purpose of locating other users and establishing the media streams to start a call.

JavaScript does provide the WebSockets API, a mechanism for asynchronously passing messages back and forth between JavaScript and a WebSocket server. Many WebRTC implementors have chosen to use WebSockets as a transport for their signalling protocols.

SIP and XMPP, the most common signalling protocols from traditional RTC, have both been adapted to support WebRTC. RFC 7118 specifies the SIP over WebSocket transport and many leading SIP implementations have implemented it. XMPP supports both an HTTP binding and, more recently, a WebSocket binding specified in RFC 7395. Using one of these protocols is highly recommended for the vast majority of WebRTC projects.
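On the wire, a WebSocket connection begins as an HTTP upgrade request in which the client names the subprotocol it wants to speak: "sip" for RFC 7118 and "xmpp" for RFC 7395. The sketch below builds such a request by hand and computes the Sec-WebSocket-Accept value that a conforming server should return, following RFC 6455. It is intended only to show what the browser does behind the scenes; the function names, the host name and the path are illustrative assumptions.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for deriving Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def upgrade_request(host, path, subprotocol, key=None):
    """Build the text of a WebSocket upgrade request (hypothetical helper)."""
    if key is None:
        # The key is 16 random bytes, base64 encoded.
        key = base64.b64encode(os.urandom(16)).decode("ascii")
    headers = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        f"Sec-WebSocket-Protocol: {subprotocol}",
        "Sec-WebSocket-Version: 13",
    ]
    return "\r\n".join(headers) + "\r\n\r\n"


def expected_accept(key):
    """Value the server should send back in Sec-WebSocket-Accept."""
    digest = hashlib.sha1((key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Example host and path are assumptions, not values from a real deployment.
request = upgrade_request("ws.example.org:8443", "/", "sip")
print(request)
```

In the browser none of this is written by hand: the subprotocol name is simply passed to the WebSocket constructor, usually by a library such as JsSIP.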

User privacy and security

While media streaming allows for more powerful applications to be deployed through the web, it also creates a greater risk to user security and privacy.

When the JavaScript on a site attempts to activate the webcam or microphone, the browser shows a prompt asking the user to authorize the streaming session. The prompt usually allows the user to choose which devices will be used if they have more than one webcam or microphone/audio source.

Authentication

Authentication between a browser and a server can take place in various ways. If a WebSocket connection is used for signalling and if a user has a client certificate in their browser then it should theoretically be possible to use the certificate to authenticate the WebSocket connection. In practice, this is supported by the repro SIP proxy but it is not yet supported by the browsers.

Another possibility is the use of cookies or WebSocket URL parameters. Browser security mechanisms (to protect users from cross-site scripting) only allow cookies to be used if the web server and WebSocket server have the same domain name. If the servers don't use the same domain, URL parameters must be used instead of the cookie. With the cookie method, the web server serving the HTML and JavaScript sends an authentication token to the browser as a cookie and the browser then presents it to the WebSocket signalling server. With the URL parameter method, the script running on the server side usually constructs a WebSocket URL and embeds it in the HTML sent to the browser. When the JavaScript activates the WebSocket connection, the request URL, including the parameters, is sent in the WebSocket upgrade request. Both of these mechanisms are supported in the repro SIP proxy.
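The choice between the two methods can be made by the server-side script that generates the page. As a rough sketch, assuming a strict reading of the same-domain rule (identical host names, ignoring cookies that are scoped to a parent domain), the decision could look like this. The function name and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def token_transport(page_url: str, ws_url: str) -> str:
    """Return "cookie" when both URLs share a host name, else "url-parameter"."""
    page_host = urlsplit(page_url).hostname
    ws_host = urlsplit(ws_url).hostname
    if page_host is not None and page_host == ws_host:
        return "cookie"
    return "url-parameter"


# Example URLs are assumptions chosen only to exercise both branches.
same = token_transport("https://www.example.org/call", "wss://www.example.org:8443/")
different = token_transport("https://www.example.org/call", "wss://ws.example.org:8443/")
print(same, different)
```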

Standard SIP DIGEST authentication can also be used over the WebSocket transport. One benefit of DIGEST authentication is that challenges can be sent from SIP proxy servers or other network components behind the WebSocket server.
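For readers unfamiliar with DIGEST, the response to a challenge is a chain of MD5 hashes over the credentials, the request and the server's nonce, so the password itself never crosses the WebSocket. The following sketch implements the calculation as defined for HTTP in RFC 2617, which SIP reuses; the function name is an invention for this example and a real SIP stack performs this step internally.

```python
import hashlib


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    nc=None, cnonce=None, qop=None):
    """Compute the DIGEST response value (hypothetical helper name)."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")
    ha2 = _md5_hex(f"{method}:{uri}")
    if qop:
        # With qop (e.g. "auth") the nonce count and client nonce are included.
        return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    # Without qop, the older and shorter form applies.
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


# Values from the worked example in RFC 2617, used here as a self-check.
response = digest_response(
    "Mufasa", "testrealm@host.com", "Circle Of Life",
    "GET", "/dir/index.html",
    "dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001", cnonce="0a4f113b", qop="auth",
)
print(response)
```

In a SIP exchange the method would be something like REGISTER or INVITE and the URI would be the SIP request URI, but the arithmetic is the same.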

Cookie and URL parameter authentication in repro

To use either of these mechanisms with the repro SIP proxy, simply make sure that there is a value for the WSCookieAuthSharedSecret parameter in repro.config. If desired, the actual cookie or URL parameter names can be customized, otherwise they use default values.

Example 13.1. repro.config settings for cookie and URL parameter authentication

WSCookieAuthSharedSecret = some-random-string

# Names of the cookies to use for the cookie authentication protocol
# These are the default values:
#WSCookieNameInfo = WSSessionInfo
#WSCookieNameExtra = WSSessionExtra
#WSCookieNameMAC = WSSessionMAC

# Name of the extension header that must match the content of
# the authenticated WSSessionExtra cookie
#WSCookieExtraHeaderName = X-WS-Session-Extra

To send the authentication values as URL parameters, the WebSocket URL passed to JSCommunicator or JsSIP may resemble wss://ws.example.org:8443/WSSessionInfo=1%3A1429975989%3A1429976889%3A%2A%40example.org%3A%2A%40%2A;WSSessionExtra=;WSSessionMAC=7bf9ed44bfe7e10153762639419c52fd712de58e.

Notice that the parameters are sent with the semicolon as a separator. Do not send them as query parameters with the ampersand (&) separator. The value of each parameter has to be URL encoded (for example, using the urlencode() function in PHP). This is demonstrated in the DruCall source code.
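The same construction can be sketched in Python, where urllib.parse.quote() plays the role of PHP's urlencode(). The sketch below only assembles and encodes the URL; it assumes the session info string and the MAC have already been produced elsewhere, and it does not attempt to reproduce how repro calculates the MAC. The function name is hypothetical, while the parameter names are the defaults shown in the example configuration.

```python
from urllib.parse import quote


def build_ws_url(base_url, info, extra, mac,
                 info_name="WSSessionInfo",
                 extra_name="WSSessionExtra",
                 mac_name="WSSessionMAC"):
    """Append URL-encoded, semicolon-separated parameters to base_url."""
    params = [(info_name, info), (extra_name, extra), (mac_name, mac)]
    # safe="" makes quote() encode every reserved character, including ":" and "@".
    encoded = ";".join(f"{name}={quote(value, safe='')}" for name, value in params)
    return base_url + encoded


# Input values chosen to match the example URL given in the text.
url = build_ws_url(
    "wss://ws.example.org:8443/",
    "1:1429975989:1429976889:*@example.org:*@*",
    "",
    "7bf9ed44bfe7e10153762639419c52fd712de58e",
)
print(url)
```

The resulting string is what the server-side script would embed in the HTML for JSCommunicator or JsSIP to use.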

Further details and examples are present in the page on the reSIProcate project wiki.