TELECOMMUNICATION SYSTEM AND METHOD FOR TRAVERSING AN APPLICATION LAYER GATEWAY FIREWALL DURING THE ESTABLISHMENT OF AN RTC COMMUNICATION CONNECTION BETWEEN AN RTC CLIENT AND AN RTC SERVER
A telecommunications assembly and a method for traversing an application layer gateway firewall during the establishment of an RTC communication connection between an RTC client and an RTC server using a proprietary RTC signalling protocol, wherein the firewall has no specific knowledge of the proprietary RTC signalling protocol. The RTC client and the RTC server can negotiate during the establishment of the RTC communication connection which of the ports of the firewall are required for the data packets to be exchanged via the RTC communication connection, wherein they use at least one standardised message element as a component of the proprietary RTC signalling protocol, with which information relating to the ports to be used can be found by the firewall.
This application is the United States national phase under 35 U.S.C. Section 371, of PCT International Patent Application No. PCT/EP2015/002040, which was filed on Oct. 15, 2015, and claims priority to German Application No. DE 10 2014 015 443.2, filed on Oct. 21, 2014.
BACKGROUND OF THE INVENTION
Field of the Invention
Embodiments provide systems and methods for traversing an application layer gateway firewall during the establishment of an RTC communication connection between an RTC client and an RTC server. Computer programs and machine-readable data carriers are also provided.
Background of the Related Art
Embodiments reported herein generally concern traversing of an application layer gateway firewall (hereinafter usually referred to in brief as “firewall”), which refers to data packets passing through such a firewall, for example during communication by means of Voice over IP (VoIP) or Video over IP. These types of communication fall under the category of Real-Time Transport Protocol communication (RTP communication). The following description refers but is not limited to a particular application of this RTC communication (RTC=real-time communication), which is WebRTC communication, carried out via a Web browser.
Firewalls are always an obstacle to the transmission of communications via VoIP or Video over IP. This is due to the UDP (User Datagram Protocol) port numbers negotiated dynamically in VoIP standards (H.323, SIP[RFC3261], etc.) for the RTP voice or video packets (RTP=Real-Time Transport Protocol, see [RFC3550]).
With precisely specified standard signaling protocols (H.323/H.245 (H.323 uses H.245 to handle media data), SIP/SDP (Session Initiation Protocol/Session Description Protocol), XMPP/Jingle (Extensible Messaging and Presence Protocol), MGCP (Media Gateway Control Protocol) [RFC3435], etc.), firewall manufacturers can dynamically track signals by implementing certain protocol portions (the signaling portions that are relevant to handling the UDP port numbers). This allows the firewall to open and close the dynamically negotiated UDP ports for the voice-video RTP packets to be transmitted. This known principle is also known as a firewall application layer gateway (=ALG firewall or simply firewall).
Because the signaling protocol for WebRTC is not standardized, any manufacturer can use his own proprietary protocol or alternatively can build on known protocols. Ultimately, however, ALG firewall manufacturers have the problem that they cannot build on a fixed signaling protocol, as would be the case with SIP/SDP, for example, and also cannot inspect it to get the port information in the signaling messages.
For a better understanding,
The WebSockets protocol optionally includes a field that identifies the signaling protocol used (SIP in this example). This is shown, for example, in an info box 14 under “Browser Request” and “Web Server Response.”
The problem with WebRTC in this interchange is that the signaling protocol for WebRTC is not standardized. This means that every WebRTC server must determine how it will handle signaling communication with its WebRTC client. With this proprietary WebRTC signaling approach, it is not possible for firewall manufacturers to produce general ALG firewall solutions for traversing or crossing firewalls, known as WebRTC Traversal. This can lead to problems with generating WebRTC solutions.
WebRTC is relatively new to commercial applications. However, WebRTC is on the way to becoming a dominant technology for Web-based real-time communication.
Several known techniques are considered for WebRTC firewall traversal:
a. As for SIP or H.323, certain UDP port ranges in the firewall can also be opened permanently for WebRTC. However, for companies with restrictive security requirements, this is often not desirable.
b. HTTP (Hypertext Transfer Protocol) tunneling: Most firewalls have one port always open. This is the TCP port 80 (TCP=Transmission Control Protocol), through which the HTTP data traffic [see RFC2616] also runs (TCP/http port). The idea is to form a TCP tunnel between the WebRTC client and a TURN server (Traversal Using Relays around NAT, NAT=Network Address Translation, see RFC5766) on the other side of the firewall (“TURN access via TCP”) and use it to channel UDP/RTP voice/video packets and data packets through the firewall. Some firewalls/companies are so restrictive that they will not accept HTTP traffic from any client, but instead only that coming from a specific internal server (HTTP proxy). In this case, the WebRTC browser must order the HTTP proxy, using the known HTTP-CONNECT method [RFC2817], to generate the aforementioned TCP tunnel through the firewall, to be used later for the TURN protocol. In another version of this discussion, in IETF, for example, a “TURN over WebSockets” tunnel through the firewall can be used [draft-chenxin-behave-turn-WebSocket].
This HTTP tunnel solution is basically possible, but requires that several conditions be met for uninterrupted use. It must be established that:
- the WebRTC client (browser) has implemented the described features (e.g., HTTP CONNECT). This depends upon the browser manufacturer (Google, Microsoft, Mozilla, etc.). For mobile WebRTC clients like smartphones and tablets (native WebRTC app), the method itself must be implemented;
- the company has and supports the required infrastructure (HTTP proxy);
- the WebRTC solution provider has installed a TURN server behind the firewall as part of its solution.
c. Firewall/Port Control Protocol [RFC6887] (e.g., Cisco). The idea is that the WebRTC client, before it sends a voice or video packet, gives the firewall a command via its own protocol to open a certain UDP port. Firewall control protocols have been known since around 2003. In practice, however, this approach has not yet succeeded, due among other things to security, authentication, and authorization issues. Most companies (CIOs, IT departments) do not want their firewalls to be “controlled” by multiple clients or servers.
d. Port multiplexing: With this approach, some or all RTP streams for a WebRTC call (e.g., all audio and video streams for a call), or even all RTP streams for multiple or all calls on the same system, can be transmitted through a single UDP port. This approach alleviates the firewall port problem in that fewer port resources are needed, but it does not solve the basic problem of first having to overcome the restrictive firewall. To date, no manufacturer of WebRTC clients or servers supports port multiplexing in conjunction with SIP/XMPP/H.323-based systems (optional). Port multiplexing is particularly an option for WebRTC solution manufacturers with large to very large scaling requirements (e.g., public, residential services, e.g., Google, etc.).
BRIEF SUMMARY OF THE INVENTION
The invention is intended to overcome the aforementioned disadvantages and propose a method for traversing a firewall that both satisfies all security requirements and is easy to manage. The invention is further intended to propose a corresponding telecommunication system with which the method can be implemented.
According to embodiments of the invention, when an RTC communication connection needs to be established, as occurs when a website is opened via an HTTP request, for example, using a proprietary (i.e., not standardized) RTC signaling protocol, the RTC client and the RTC server negotiate which ports of the ALG firewall are needed in order to transmit the data packets required for the RTC communication connection, during which they use at least one standardized message element in the context, i.e., as a component of the proprietary RTC signaling protocol, with which the information concerning the ports to be used can be detected. The firewall has no specific knowledge of the proprietary RTC signaling protocol, and when the RTC communication connection is established using the standardized message element, it learns which of the firewall ports were negotiated by the RTC client and the RTC server, i.e., were found to be necessary in order to transmit the data packets to be exchanged via the RTC communication connection. In other words, the firewall can “overhear” which ports are needed, and that allows the firewall to dynamically open and close the necessary ports depending on the result of the negotiation between RTC client and RTC server.
Additional advantages, features, and characteristics of the present invention are presented in the following description of advantageous embodiments with reference to the drawing. The figures show schematically:
As noted above, according to embodiments of the invention, when an RTC communication connection needs to be established, as occurs when a website is opened via an HTTP request, for example, using a proprietary (i.e., not standardized) RTC signaling protocol, the RTC client and the RTC server negotiate which ports of the ALG firewall are needed in order to transmit the data packets required for the RTC communication connection, during which they use at least one standardized message element in the context, i.e., as a component of the proprietary RTC signaling protocol, with which the information concerning the ports to be used can be detected. The firewall has no specific knowledge of the proprietary RTC signaling protocol, and when the RTC communication connection is established using the standardized message element, it learns which of the firewall ports were negotiated by the RTC client and the RTC server, i.e., were found to be necessary in order to transmit the data packets to be exchanged via the RTC communication connection. In other words, the firewall can “overhear” which ports are needed, and that allows the firewall to dynamically open and close the necessary ports depending on the result of the negotiation between RTC client and RTC server.
A message element in a communication protocol is a syntactic segment of one or more signaling messages in which a piece of information is coded for later interpretation in network components and/or communication network terminals as part of a switching process. Message elements can be standardized elements or manufacturer-specific (proprietary) elements; the latter are not essential for basic functions of the communication network and are usually ignored by other manufacturers' network components and/or terminals. The standardized message element according to the invention contains identifying information about the connections established in order to transmit media data from and to a terminal and therefore must pass through the firewall, e.g., through open ports, in both sending and receiving directions.
Additional explanations of such message elements can be found in EP 1 317 150 A2.
In other words, the invented method solves the basic problem by using an add-on as a component of the RTC signaling channel that allows the firewall to overhear, during establishment of the RTC connection, which ports or UDP ports are dynamically negotiated for the exchange of voice and/or video packets, and therefore to dynamically open and close the corresponding UDP ports for the RTP traffic. The aforementioned context can be generated during the creation of the RTC signaling channel, during RTC signaling, or at the end of RTC signaling in the form of an additional field that contains information used for later detection of the RTP ports in the signaling messages. The establishment or standardization of an add-on that defines the context for the RTC signaling portion, which when read by a firewall is adequate to allow UDP/RTP port control, i.e., opening and closing, is also designated in the following as WebRTC signaling or briefly as WebRTCSig.
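The dynamic opening and closing of UDP ports described above can be pictured as a small table of "pinholes" keyed by signaling session. The following is a minimal sketch only: the class name `PinholeTable`, its methods, and the session identifiers are hypothetical illustrations and are not taken from the described method or from any firewall product.

```python
from dataclasses import dataclass, field


@dataclass
class PinholeTable:
    """Hypothetical model of the UDP ports an ALG firewall currently holds open."""

    # Maps a signaling session id to the set of UDP ports opened for it.
    _by_session: dict[str, set[int]] = field(default_factory=dict)

    def open_ports(self, session_id: str, ports: list[int]) -> None:
        """Open the ports that were overheard in the signaling of one session."""
        for port in ports:
            if not 0 < port < 65536:
                raise ValueError(f"invalid UDP port: {port}")
        self._by_session.setdefault(session_id, set()).update(ports)

    def close_session(self, session_id: str) -> None:
        """Close every port that was opened for the given session."""
        self._by_session.pop(session_id, None)

    def is_open(self, port: int) -> bool:
        """Return True if any active session has this port open."""
        return any(port in ports for ports in self._by_session.values())
```

In this sketch a port is open only while some session that negotiated it is still active, which reflects the point that no port range has to stay open permanently.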
Embodiments as reported herein may offer multiple advantages:
- firewall control protocols, which would represent significant obstacles with respect to security requirements, do not have to be implemented;
- no ports or ranges of ports have to be kept permanently open in the firewall, which could be risky for security reasons. It should be noted here that the use of port multiplexing techniques, with which multiple or all UDP streams are sent through a single UDP port, will presumably be supported in the future primarily by manufacturers of large-scale solutions.
- in scenarios where a solution based on HTTP tunneling cannot be applied, the invented method is relatively simple and yet more secure than other alternatives that can require significant expansion of WebRTC; by using this invention, for example, firewall solutions can be implemented that provide a continuous solution in particular for certain WebRTC applications.
- the invented solution can also be standardized easily, for example with IETF, so that generic implementation is possible and open to all manufacturers of WebRTC-based solutions and WebRTC firewalls.
When the Secure WebSockets Protocol (WSS) is used, i.e., a WebSockets connection with TLS (Transport Layer Security), the firewall cannot easily read the higher-layer WebRTC signaling portion contained in the WSS connection. One solution to this problem is a TLS hop-by-hop context, as is used for session border controllers (SBCs). The ALG firewall terminates TLS, i.e., encryption takes place only up to or beginning at the firewall, so TLS is only hop-by-hop. The ALG firewall therefore has one TLS connection to the WebRTC client (or proxy) on one side of the ALG firewall and another TLS connection to the WebRTC server (or access node) on the other side.
According to one preferred embodiment of the invention, a previously defined (randomly numbered) signaling type is used for negotiating the required ports, i.e., for exchanging the RTC signaling information and parameters between the RTC client and the RTC server. This signaling type is exchanged after the initial establishment of an HTTP connection between the RTC client and the RTC server by means of a so-called “WebRTCSig handshake.” In an advantageous development, the WebRTCSig handshake is executed as part of the procedure that upgrades an HTTP connection to a WebSockets connection, and it generates a context for RTC signaling. Expansions to the WebSockets protocol are sometimes necessary for this, for which a special or defined (and usually additional) field is inserted in a header, for example. Alternatively, the WebRTCSig handshake can take place only after the HTTP connection is converted (or upgraded) to a WebSockets connection, which is done by a proprietary protocol that preferably comprises only a few additional bytes and is also known as a “thin layer protocol” or “WebRTCSig over WebSockets.” Compared with this second alternative, the first alternative, in which the handshake is part of the upgrade procedure, offers the advantage of saving the time needed for a round trip. Regarding the precise scheduling or timing, the WebRTCSig handshake takes place, for example, after the RTC client has downloaded the JavaScript (JS client) from the RTC server.
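As an illustration of the first alternative, the sketch below builds a WebSockets upgrade request that carries the signaling type in an additional header field, and shows how a firewall could read that field back. The header name `Sec-WebSocket-WebRTCSig` and the `type=…; offset=…` value syntax are assumptions made for this example; the text only states that a defined, additional field is inserted in a header.

```python
import base64
import os
from typing import Optional

# Hypothetical header name and value syntax, chosen only for illustration.
SIG_HEADER = "Sec-WebSocket-WebRTCSig"


def build_upgrade_request(host: str, path: str, sig_type: int,
                          offset: Optional[int] = None) -> str:
    """Client side: build an HTTP/1.1 WebSocket upgrade request that also
    announces the signaling type (and, optionally, an SDP offset)."""
    value = f"type={sig_type}"
    if offset is not None:
        value += f"; offset={offset}"
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        f"{SIG_HEADER}: {value}",
        "",
        "",
    ]
    return "\r\n".join(lines)


def read_sig_context(request: str) -> Optional[dict[str, int]]:
    """Firewall side: return the announced parameters, or None if the
    request carries no signaling-type header."""
    head = request.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == SIG_HEADER.lower():
            params = {}
            for part in value.split(";"):
                key, _, number = part.partition("=")
                params[key.strip()] = int(number)
            return params
    return None
```

Because the field travels inside the upgrade request itself, no extra round trip is needed before the firewall knows how to interpret later signaling messages.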
Depending on the RTC signaling protocol used, the actual WebRTCSig information can include the following signaling protocol variations:
1) WebRTCSig type 1=SIP and SDP over WebSockets
2) WebRTCSig type 2=XMPP and Jingle over WebSockets
3) WebRTCSig type 3=“Proprietary WebSockets Signaling with SDP Embedded (Offset)”
i.e., WS signaling messages (WS=WebSockets) that carry SDP protocol messages (e.g., WS Setup with SDP Offer; WS Connect with SDP Answer). So that the firewall can find the beginning of the SDP Offer/Answer message, an offset value that addresses the beginning of the SDP message can (or must) be provided here.
It should be noted that SDP is used here as session signaling for two reasons:
a) The WebRTC browser API (standardized in W3C=World Wide Web Consortium) is SDP-based in version 1.
b) SDP is also the session description protocol in SIP.
The offer-answer model is described in RFC 3264 as an example of a standardized message element, with the line “m=video 53000 RTP/AVP 32”, which means that video should be transmitted via port 53000.
SDP thereby facilitates cooperation with the SIP environment and also client-side cooperation between session signaling and WebRTC-API.
If a manufacturer uses a proprietary signaling protocol, it most probably uses SDP with the proprietary messages nonetheless, because WebRTC-API also uses SDP.
With the invented WebRTCSig type 3, for example, the signaling would also indicate, as additional information, that the ALG firewall should start at byte 77 and interpret the content from there on as SDP protocol (again because that is standardized). Everything before that, i.e., up to and including byte 76, is part of the “proprietary setup message.” Alternatively, the browser could also map the SDP of the WebRTC-API to something else (e.g., H.245, Jingle, or a proprietary format) and use it for RTC signaling. It would then be flagged by another WebRTCSig type. This variation corresponds to a preferred embodiment of the invented method, according to which a signaling protocol with a signaling message is used, in which a session description protocol offer message with embedded offset is used, wherein the offset addresses the beginning of that message.
4) WebRTCSig type 4=specific SDP protocol
The SDP protocol could be standardized specifically for WebRTC.
5) WebRTCSig type 5=negotiated ports with pre-defined and communication syntax according to the invention
6) WebRTCSig type 6=negotiated ports in RESTful style (REST=Representational State Transfer): known URI (Uniform Resource Identifier) with defined (sub-)structure, which contains the ports.
7) WebRTCSig type 7=negotiated ports in RESTful style: known URI with a pointer or indicator that indicates a resource (server) that is supposed to contain the ports.
These last two variations also correspond to a preferred embodiment of the invented method, according to which the negotiated ports are defined in the RESTful style in RTC signaling messages.
8) WebRTCSig type 8=a text string is entered as the parameter that designates the start of SDP in the signaling messages. The text string as such is optional; it should not recur anywhere in the rest of the message.
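Types 3 and 8 both come down to locating where the embedded SDP begins and then reading its standardized `m=` lines. The sketch below illustrates this under stated assumptions: offsets are treated as 1-based (so that offset 77 skips bytes 1 to 76, as in the example above), the marker string `SDP_START` in the usage note is invented for the example, and the function names are hypothetical.

```python
import re

# Matches an SDP media line such as "m=video 53000 RTP/AVP 32".
M_LINE = re.compile(r"^m=(\w+) (\d+)(?:/\d+)? ", re.MULTILINE)


def media_ports(sdp: str) -> list[tuple[str, int]]:
    """Return (media type, port) for every m= line in the SDP text."""
    return [(media, int(port)) for media, port in M_LINE.findall(sdp)]


def sdp_from_offset(message: bytes, offset: int) -> str:
    """Type 3: the SDP starts at the given byte offset.
    Assumption: the offset is 1-based, so offset 77 skips 76 bytes."""
    if offset < 1:
        raise ValueError("offset must be 1 or greater")
    return message[offset - 1:].decode("utf-8")


def sdp_from_marker(message: bytes, marker: bytes) -> str:
    """Type 8: the SDP starts directly after a marker string that does
    not recur anywhere else in the message."""
    start = message.find(marker)
    if start < 0:
        raise ValueError("marker not found in signaling message")
    return message[start + len(marker):].decode("utf-8")
```

For a message consisting of 76 proprietary bytes followed by an SDP body, `media_ports(sdp_from_offset(message, 77))` yields the media ports; with a marker, `media_ports(sdp_from_marker(message, b"SDP_START"))` does the same without the firewall needing to understand the proprietary part.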
Further embodiments may provide a telecommunication system that includes at least one RTC client, at least one RTC server, and at least one firewall with multiple ports. According to an embodiment of the invention, the firewall has a control unit that is configured such that the previously described method can be implemented.
In addition, a computer program product for executing the previously described method, and a machine-readable data carrier on which such a computer program product is stored, are possible embodiments.
As it is currently understood, IETF will not standardize the entire WebRTC signaling protocol, as was done for SIP or H.323, for example. An ALG firewall must therefore implement the WebRTC signaling protocols of all WebRTC application manufacturers, if the signaling protocol needs to be understood dynamically in all environments in order to find the negotiated UDP ports to which the proprietary RTP packets are sent. This can be avoided by grouping the chosen signaling protocols into categories (randomly numbered, for example). If the ALG firewall determines or learns that WebRTC signaling type 1 is involved, then it knows that it must parse according to SIP/SDP. On the other hand, if the ALG firewall learns that WebRTC signaling type 3 with offset 77 is being used, then the ALG firewall knows that it must parse the message from byte 77 as SDP protocol, etc. WebRTC signaling type 4 would then be an SDP protocol from byte 1. WebRTC signaling type 5 plus specific source and destination UDP port instructions would inform the ALG firewall of the exact UDP ports, so in this case no SDP protocol is used.
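The category-based handling described above can be sketched as a simple dispatch on the signaling type. This is an illustrative sketch only: the `SigContext` structure and its field names are hypothetical, only types 3, 4, and 5 are covered, byte offsets are assumed to be 1-based, and the way type 5 carries its source and destination ports is an assumption because the text does not fix that syntax.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class SigContext:
    """Hypothetical record of what the firewall learned about the signaling."""
    sig_type: int
    offset: Optional[int] = None    # used by type 3 (assumed 1-based)
    src_port: Optional[int] = None  # used by type 5
    dst_port: Optional[int] = None  # used by type 5


def ports_to_open(ctx: SigContext, message: bytes) -> list[int]:
    """Return the UDP ports the firewall should open for this message."""
    if ctx.sig_type == 3:
        # Proprietary message with embedded SDP: parse from the offset.
        if ctx.offset is None:
            raise ValueError("type 3 requires an offset")
        start = ctx.offset - 1
    elif ctx.sig_type == 4:
        # The whole message is SDP, i.e., SDP from byte 1.
        start = 0
    elif ctx.sig_type == 5:
        # Ports are stated explicitly, so no SDP is parsed.
        return [p for p in (ctx.src_port, ctx.dst_port) if p is not None]
    else:
        raise NotImplementedError(f"signaling type {ctx.sig_type} not covered here")
    sdp = message[start:].decode("utf-8")
    # Read the port from every SDP media line ("m=<media> <port> ...").
    return [int(port) for port in re.findall(r"^m=\w+ (\d+)", sdp, re.MULTILINE)]
```

The firewall thus needs one parser per category rather than one per manufacturer, which is the point of grouping the signaling protocols into numbered types.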
The telecommunication system 10 according to this invention shown in
The remaining designations shown in
After successful completion of the signaling, media data can be transmitted through the firewall 40, for which other protocols, such as RTP (Real-Time Transport Protocol), STUN (Session Traversal Utilities for NAT, NAT=Network Address Translation), and ICE (Interactive Connectivity Establishment), are used.
As previously explained, according to the invention the type of WebRTC signaling is transmitted (this example uses the—randomly selected—type No. 3), and at the position of the SDP_offset designation there is a text string that marks an indicator or pointer for the SDP in the signaling message. The text string as such is optional; it should not recur anywhere in the rest of the message. Instead of the “SDP offset” designation given in the example, any string of adequate length, for example, could satisfy this requirement.
A second embodiment of the invented method, shown in
A third embodiment of the invented method, shown in
It should be noted that the features of the invention described by referencing the presented embodiments, for example the type and configuration of the clients, server, connections, and protocols used, can also be present in other embodiments, unless stated otherwise or prohibited for technical reasons. Not all features of individual embodiments described in combination must necessarily always be implemented in any one particular embodiment.
Claims
1-8. (canceled)
9. A computer-implemented method, comprising:
- negotiating a port for media data packet exchange using a standardized message element of a proprietary Real Time Communication (RTC) protocol that a firewall has no specific knowledge about, wherein the standardized message element includes port information; and
- upon establishing an RTC connection, enabling the firewall to dynamically open and close the port for media data packet exchange using the standardized message element.
10. The computer-implemented method of claim 9, wherein enabling the firewall comprises enabling the firewall to dynamically open and close the port with no specific knowledge of the proprietary RTC protocol.
11. The computer-implemented method of claim 9, further comprising:
- establishing a Hypertext Transfer Protocol (HTTP) connection prior to negotiating the port for media data packet exchange; and
- upgrading the HTTP connection to a WebSocket connection.
12. The computer-implemented method of claim 11, wherein negotiating the port for media data packet exchange is responsive to upgrading the HTTP connection to the WebSocket connection.
13. The computer-implemented method of claim 9, wherein negotiating the port for media data packet exchange comprises exchanging a signal protocol variation.
14. The computer-implemented method of claim 13, wherein exchanging the signal protocol variation occurs through a defined field in a header of an add-on.
15. The computer-implemented method of claim 9, wherein the media data packets are for audio data or video data.
16. A non-transitory, computer-readable medium storing instructions that, when executed by a processor, cause:
- negotiating a port for media data packet exchange using a standardized message element of a proprietary Real Time Communication (RTC) protocol that a firewall has no specific knowledge about, wherein the standardized message element includes port information; and
- upon establishing an RTC connection, enabling a firewall to dynamically open and close the port for media data packet exchange using the standardized message element.
17. The non-transitory, computer-readable medium of claim 16, wherein enabling the firewall comprises enabling the firewall to dynamically open and close the port with no specific knowledge of the proprietary RTC protocol.
18. The non-transitory, computer-readable medium of claim 16, storing further instructions that, when executed by the processor, cause:
- establishing a Hypertext Transfer Protocol (HTTP) connection prior to negotiating the port for media data packet exchange; and
- upgrading the HTTP connection to a WebSocket connection.
19. The non-transitory, computer-readable medium of claim 18, wherein negotiating the port for media data packet exchange is responsive to upgrading the HTTP connection to the WebSocket connection.
20. The non-transitory, computer-readable medium of claim 16, wherein negotiating the port for media data packet exchange comprises exchanging a signal protocol variation.
21. The non-transitory, computer-readable medium of claim 20, wherein exchanging the signal protocol variation occurs through a defined field in a header of an add-on.
22. The non-transitory, computer-readable medium of claim 16, wherein the media data packets are for audio data or video data.
23. A WebRTC server, comprising:
- a processor;
- a memory storing instructions that, when executed by the processor, cause: negotiating a port for media data packet exchange using a standardized message element of a proprietary Real Time Communication (RTC) protocol that a firewall has no specific knowledge about, wherein the standardized message element includes port information; and upon establishing an RTC connection, enabling a firewall to dynamically open and close the port for media data packet exchange using the standardized message element.
24. The WebRTC server of claim 23, wherein enabling the firewall comprises enabling the firewall to dynamically open and close the port with no specific knowledge of the proprietary RTC protocol.
25. The WebRTC server of claim 23, wherein the memory stores further instructions that, when executed by the processor, cause:
- establishing a Hypertext Transfer Protocol (HTTP) connection prior to negotiating the port for media data packet exchange; and
- upgrading the HTTP connection to a WebSocket connection.
26. The WebRTC server of claim 25, wherein negotiating the port for media data packet exchange is responsive to upgrading the HTTP connection to the WebSocket connection.
27. The WebRTC server of claim 23, wherein negotiating the port for media data packet exchange comprises exchanging a signal protocol variation.
28. The WebRTC server of claim 23, wherein exchanging the signal protocol variation occurs through a defined field in a header of an add-on.
Type: Application
Filed: Apr 21, 2021
Publication Date: Aug 12, 2021
Applicant: RINGCENTRAL, INC. (Belmont, CA)
Inventors: Karl KLAGHOFER (Munchen), Thomas STACH (Wien), Jurgen TOTZKE (Poing)
Application Number: 17/236,298