SYSTEM AND METHOD FOR MANAGING LATENCY IN A DISTRIBUTED TELEPHONY NETWORK
A system and method of preferred embodiments include at a signaling gateway of a first region, receiving a communication invitation of a first endpoint from a communication provider; signaling the communication invitation to a communication-processing server in a second region; in response to communication processing of the communication-processing server, dynamically directing signaling and media of the communication according to processing instructions and resources available in at least the first and second regions; wherein dynamically directing signaling and media communication of the communication comprises selectively routing media communication exclusively through communication resources of the first region if resources are available in the first region or selectively routing media communication between the first endpoint, the gateway, and at least the communication-processing server if media resources are not in the first region.
This application is a continuation of co-pending U.S. patent application Ser. No. 13/891,111, which claims the benefit of U.S. Provisional Application Ser. No. 61/644,886, filed on 9 May 2012, both of which are incorporated in their entirety by this reference.
TECHNICAL FIELD
This invention relates generally to the telephony field, and more specifically to a new and useful system and method for managing latency in a distributed telephony network.
BACKGROUND
In recent years, innovations in web applications and Voice over Internet Protocol (VoIP) have brought about considerable changes to the capabilities offered through traditional phone services. In some distributed or cloud-based telephony systems, the routing of audio, video, or other media files can be determined or limited by the location and/or availability of the appropriate computing resources. In some instances, some or all of the callers reside in the same region, country, or continent as the bulk of the computing resources, thereby promoting increased call quality. However, if one or more of the parties to the call is located in a different region, country, or continent, then it is not readily apparent which computing resources should be utilized. Similarly, if the platform infrastructure is based in one region, communication outside of that region will be of poor quality. For example, if the two callers reside in different countries, it might be unclear which of many computing resources should be allocated to the particular session. Furthermore, as more communication platforms are supported by cloud computing services located in distinct areas, core-computing infrastructure may be limited to particular locations. Accordingly, there is a need in the art for determining the shortest, highest quality, and/or optimized route for session traffic in a globally distributed telephony system. This invention provides such a new and useful system and method, described in detail below with reference to the appended figures.
The following description of preferred embodiments of the invention is not intended to limit the invention to these preferred embodiments, but rather to enable any person skilled in the art to make and use this invention.
Preferred System
As shown in
As shown in
The provider services (P1, P2, P3) preferably receive or initiate communication to an endpoint such as a caller, a mobile or browser client. The provider service is preferably an interface between the communication platform of the system 10 and communication providers. Communication providers preferably include telephony carrier networks, client applications using IP based communication protocols, or any suitable outside network. The system 10 may include a plurality of regions in addition to the first and second regions 12, 14. The provider services are preferably specific to each region as they are determined by the communication service providers, networks, and established contracts with various communication entities.
Incoming communications to a destination endpoint are preferably routed to the provider services in response to the destination endpoint being registered with the system 10. For example, a user dialing a PSTN number belonging to the system 10 will preferably have the communication directed to a provider service (P1, P2, or P3). As another example, a user dialing a SIP-based endpoint that specifies a domain registered in DNS to the system 10 will preferably have the communication directed to a provider service (P1, P2, or P3). The provider additionally creates invite requests and responses that are preferably sent to a regional address (e.g., europe.twilio.com) and resolved to a communication gateway. In some variations, communication may be directly connected to a communication gateway to achieve lower-latency audio/video. This may be particularly advantageous to mobile and browser clients. The Domain Name System (DNS), anycast, or any suitable addressing and routing methodology may be used to forward to the closest communication gateway of a particular zone. The provider services preferably use the SIP protocol for communication within the system, but the outside connected communication devices may use any suitable communication protocol. Similarly, the medium of the communication can preferably include any suitable combination of media such as audio, video, screen-sharing, or other suitable synchronous media.
The communication gateways (X1, X2) are preferably configured for both media and signaling. A communication gateway preferably mediates Session Initiation Protocol (SIP) signaling between at least one endpoint of a communication, from call establishment to termination. SIP is a signaling protocol widely used for controlling communication sessions such as voice and/or video calls over Internet Protocol. Any suitable communication protocol such as RTP or combination of protocols may alternatively be used. As a SIP mediator, the communication gateway preferably creates SIP invites, issues other SIP signaling messages, and facilitates transfer of media (e.g., audio, video) between various end-points. The communication gateways (X1, X2, XN) are preferably logical network elements of a SIP application, and more preferably configured as back-to-back user agents (b2bua) for one or both of media and signaling control. A b2bua, as would be readily understood by a person of ordinary skill in the art, preferably operates between endpoints involved in a communication session (e.g., a phone call, video chat session, or screen-sharing session). The b2bua also divides a communication channel into at least two communication legs and mediates signaling between the involved endpoints from call establishment to termination. As such, the communication gateway can facilitate switching the communication flow from flowing through a remote region (to use remote resources) to flowing just within the local region (e.g., when establishing a call with another endpoint in the local region). The communication gateway may additionally include media processing components/resources such as Dual-tone Multi-frequency (DTMF) detector, media recorder, text-to-speech (TTS), and/or any suitable processor or service. The media processing and signaling components of a communication gateway may alternatively be divided into any suitable number of components or services in cooperative communication. 
In one variation, the communication gateway is implemented by two distinct components: a signaling gateway that handles the signaling and a media gateway that handles media processing and media communication. In an alternative embodiment, the communication gateways may be configured as a control channel that functions to allow devices to directly communicate peer-to-peer. Browser clients, mobile clients, or any suitable combination of clients may have direct media communication in this variation. This alternative embodiment is preferably used with low-latency media. As an additional security precaution, communication gateways may be configured to allow traffic from only a distinct set of providers. Other providers are preferably firewalled off to protect infrastructure from the public Internet. The communication gateways will preferably respond to communications and/or propagate the communication messages to a communication-processing server. The communication-processing server may be in a different remote region. Load balancers may additionally facilitate a communication propagating from a communication gateway to an optimal communication-processing server. For example, there may be multiple remote regions with available communication-processing servers that can service a communication. A load balancer or alternatively a routing policy engine may direct the communication to an appropriate region and/or communication-processing server.
The communication-processing servers (H1, H2, H3) function to process communication from a communication gateway. A communication-processing server preferably provides value-added features or services to a communication. A preferred communication-processing server is preferably a call router or telephony application processing component as described in patent application Ser. No. 12/417,630 referenced and incorporated above. A communication-processing server (or more specifically a call router) will preferably retrieve an addressable application resource (e.g., HTTP URI address document) associated with the phone number or communication indicator. In a preferred embodiment, the resource is a telephony application that indicates sequential telephony commands for the communication session of the client(s). The telephony commands may include instructions to call another communication endpoint, to start a conference call, to play audio, to record audio or video, to convert text to speech, to transcribe audio, to perform answering machine detection, to send text or media messages (e.g., SMS or MMS messages), to collect DTMF key entry, to end a call, or perform any suitable action. The telephony instructions are preferably communicated in a telephony instruction markup language such as TwiML. The addressable resource is preferably hosted at the HTTP Server 16. The servers (H1, H2, H3) and HTTP server 16 communications are preferably RESTful in nature in both/all directions. RESTful is understood in this document to describe a Representational State Transfer architecture as is known in the art. The RESTful HTTP requests are preferably stateless, thus each message communicated from any component in the system 10 preferably contains all necessary information for operation and/or performance of the specified function. Signaling will preferably be transferred through the server, but media may not be transferred through the server.
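As a minimal illustrative sketch of the sequential processing described above, a retrieved instruction document can be read into an ordered list of commands. The document contents, the phone number, and the function name `parse_commands` are assumptions made for illustration only; the verb names follow TwiML, and the parsing approach shown is not asserted to be the implementation used by the communication-processing server.

```python
import xml.etree.ElementTree as ET

# Hypothetical instruction document. The verbs (Say, Dial, Hangup) are TwiML
# verbs; the message text and phone number are made up for this example.
TWIML = """<Response>
    <Say>Connecting your call.</Say>
    <Dial>+15550100</Dial>
    <Hangup/>
</Response>"""


def parse_commands(document: str) -> list[tuple[str, str]]:
    """Return (verb, text) pairs in the order they appear in the document."""
    root = ET.fromstring(document)
    if root.tag != "Response":
        raise ValueError("expected a <Response> root element")
    # Child elements are iterated in document order, which preserves the
    # sequential nature of the telephony commands.
    return [(child.tag, (child.text or "").strip()) for child in root]


if __name__ == "__main__":
    for verb, text in parse_commands(TWIML):
        print(verb, text)
```

Because each command is kept in document order, a call router in this sketch could step through the list one verb at a time and resume at the next verb after a dialed leg ends.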
The communication-processing server is preferably part of a telephony application platform and may cooperatively use several other resources in operation. The communication-processing server may be a central component to the service provided by a platform and as such may be associated with considerable stateful data generated in use of the server. The stateful data may be used in internal logic and operation of the platform and/or for providing API accessible data and information. The system 10 is preferably implemented in a multi-tenant environment where multiple accounts share/operate with the same resources. As such, there may be benefits in keeping the communication-processing servers centrally located in a limited number of regions. Since the communication-processing server may not be located in each local region, a local region may call out, bridge or otherwise communicate with a remote region that does hold a communication-processing server. As mentioned above, the communication-processing server may provide any suitable processing services in addition to or as an alternative to the call router variation described above.
As shown in
As shown in
As shown in
As shown in
As shown in
As shown in
The system preferably can be configured to perform one or more of the foregoing functions in a computer-readable medium storing computer-readable instructions. The instructions are preferably executed by computer-executable components preferably integrated with the one or more communication gateways (X1, X2, XN) in the first region 12, the one or more communication-processing servers (H1, H2, H3, HN) in the second region 14, the HTTP server 16, the SIP API 30, and/or the routing policy server 50. The computer-readable medium can be stored on any suitable computer readable media such as RAMs, ROMs, flash memory, EEPROMs, optical devices (CD or DVD), hard drives, floppy drives, or any suitable device. The computer-executable component is preferably a processor but the instructions can alternatively or additionally be executed by any suitable dedicated hardware device.
Preferred Method
As shown in
Block S210, which includes receiving a communication invitation of a first endpoint from a communication provider, functions to initiate a communication session. Preferably a “call” will be directed to the system through a provider service of the first region. The called destination is preferably registered with the system. For example, the telephony endpoint (the phone number, phone number prefix, SIP address, the domain of the SIP address, and the like) is used to route communication to the system in any suitable manner. The provider service preferably ports or provides a network access interface through which outside communication networks connect to the system (and/or conversely, how the system connects to the outside communication networks). A communication will preferably include a call from an endpoint being directed through outside networking to a provider service interface. The provider service will preferably use SIP signaling or any suitable protocol to direct a communication stream to a communication gateway of the first region. A SIP communication invite is preferably received at the communication gateway or more specifically a SIP signaling gateway acting as a b2bua. Herein, “calls” may refer to PSTN phone calls, IP-based video calls, screen-sharing sessions, multimedia sessions, and/or any suitable synchronous media communication. Calls can additionally use mixed mediums/protocols. For example, a call (i.e., communication session) may have one leg connect to a PSTN telephony device while a second leg connects to a SIP-based client application. Calls may alternatively be initiated from within the system such as in response to an API request or any suitable event.
Block S220, which includes signaling the communication invitation to a communication-processing server in a second region, functions to direct the communication to a communication-processing server in another region. The other region (the second region) is preferably spatially separate and remotely located from the first region. The distance of separation is preferably a globally significant distance. Within the US, the distance may be greater than 2000 miles (across country). Across the globe, the distance may be greater than 5000 miles. The communication gateway preferably directs the communication signaling. As shown in
The communication-processing server can provide any suitable communication processing service. Preferably, the communication-processing server acts as a call router that manages execution of a communication session application. Processing a communication application can include operations such as connecting to endpoints, recording media, processing media (converting text-to-speech, transcribing audio to text, transcoding between media codecs), retrieving device inputs (e.g., DTMF capture), sending messages or emails, ending calls, providing answering machine detection, or providing any suitable service. In one preferred variation, the method may additionally include, within the second region, a communication-processing server retrieving application instructions from an internet accessible server at a URI that is associated with a destination endpoint of the communication invitation. In this variation, the communication-processing server is preferably a call router as described in the incorporated patent application Ser. No. 12/417,630. The application instructions are preferably formatted as markup instructions within a document retrieved over HTTP using a web request-response model.
Block S230, which includes dynamically directing signaling and media of the communication according to communication processing instructions and the resources available in at least the first and second regions, functions to redirect communication to appropriate regions. The directing of signaling and media is preferably dynamically responsive to the active state of the communication. Preferably, the signal and media direction is responsive to application state of a communication. Application state may include streaming media between two outside endpoints, playing media from the system to an endpoint, processing or recording media of a communication, or any suitable application state. The communication routing is preferably changed to increase the communication performance of the current state of a communication. For example, if a first endpoint is connected to a second endpoint, and the first and second endpoints are in the same region, the communication media stream is preferably kept within the first region. This can preferably reduce the amount of communication latency that might be involved in routing through a second region. In a contrasting situation, if the communication of a first endpoint necessitates particular media processing not available in the first region, a communication flow may be established with a second region. Additionally, an application can be configured with any suitable logic. For example, a call may be responsive to a new connection to an endpoint, to one of two endpoints hanging up, to initiating media processing (e.g., audio recording, transcription, or DTMF detection), to sending an out-of-stream communication (e.g., SMS or MMS), and the like.
Block S232, which includes selectively routing media communication exclusively through communication resources of the first region if media resources to execute the processing instructions are available in the first region, functions to route communication within a region. The resources of the region are preferably sufficient to support the current state of the communication session. In a preferred variation, the media communication is exclusively routed through the communication resource of the first region for calls to other endpoints in the region. Block S232 preferably includes a communication-processing server inviting a second communication gateway, the second communication gateway inviting a second endpoint accessible through a provider service of the first region, and the communication-processing server re-inviting the first and second communication gateways to establish media communication flow between the first and second endpoints. The communication is also directed away from the communication-processing server of the second region. As a slight variation, the media communication flow may even be established to flow directly between the first and second endpoints without passing through a gateway of the first region. The first and second endpoints can be PSTN-based endpoints, SIP-based endpoints, RTP-based endpoints, or any suitable endpoint. An endpoint is preferably any addressable communication destination, which may be a phone, a client application (e.g., desktop or mobile application), an IP-based device, or any suitable communication device. The endpoints can use any suitable protocol, and the first and second endpoints may additionally use different communication protocols or mediums.
Additionally or alternatively, routing media communication exclusively through communication resources of the first region may include selecting a media resource of the first region to facilitate the media communication flow. In some cases, select media resources may be deployed/implemented in the first region. When the current communication media stream transitions to a state where it requires only the media resources of the first region, the media communication flow will preferably utilize the media resources of the first region, rather than those of the remotely located resources in the second region. For example, an application may initiate a media recording instruction. If a recording resource is in the first region, the communication gateway may direct communication flow to go to the local recording server as opposed to a recording server in a different region. In another example, a media transcoding server may be accessed to transcode media for two endpoints. Two endpoints may use different media codecs that are not compatible. The transcoding service will preferably be added as an intermediary in the communication flow so that the media can be transcoded with low latency.
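The selection between keeping media within the first region and extending the flow to the second region can be sketched as a set comparison. This is a hedged illustration only: the `Region` type, the function `route_media`, the region names, and the resource labels are all hypothetical and are not drawn from the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical description of a region and the media resources it hosts."""
    name: str
    media_resources: frozenset


def route_media(required: set, first: Region, second: Region) -> list:
    """Return the ordered region names the media flow passes through.

    Media stays in the first (local) region when every required resource is
    available there; otherwise the flow is extended to the second region.
    """
    if required <= first.media_resources:
        return [first.name]
    if required <= (first.media_resources | second.media_resources):
        return [first.name, second.name]
    raise LookupError(f"no region provides: {sorted(required)}")


if __name__ == "__main__":
    local = Region("region-1", frozenset({"recording", "transcoding"}))
    remote = Region("region-2", frozenset({"recording", "transcoding", "tts"}))
    print(route_media({"recording"}, local, remote))  # stays local
    print(route_media({"tts"}, local, remote))        # extends to remote
```

An empty requirement set, as in a plain call between two endpoints of the same region, also resolves to the local region alone in this sketch.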
The method may include querying a routing policy service for a selected communication route, which functions to dynamically select a communication route. The routing policy server can use the current state of the system, individual regions, individual resources/services/components of a region, application state, or any suitable parameter as an input. In one variation, the routing policy service is substantially statically defined. A set of rules and/or architecture configuration may be used to select the routes. In another variation, the routing policy service performs an analysis and selects a route that has statistical indications to be an optimal route based on the analysis. The routing policy server is preferably queried by the communication-processing server to select communication gateways. The routing policy server may additionally or alternatively be used by the communication gateway to select a communication-processing server in block S220. There may be one canonical routing policy server or multiple routing policy server instances may be established in multiple regions.
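For the substantially statically defined variation, one possible shape of a rule-based routing policy is an ordered rule list in which the first matching rule wins. The sketch below is an assumption-laden illustration: the class `StaticRoutingPolicy`, the `Rule` tuple, the region labels, and the notion of a set of currently healthy gateways are hypothetical; only the gateway labels X1 and X2 come from the description.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    """Hypothetical static rule: endpoints in a region map to a gateway."""
    endpoint_region: Optional[str]  # None matches any region (fallback rule)
    gateway: str


class StaticRoutingPolicy:
    """Selects a gateway from an ordered rule list; the first match wins."""

    def __init__(self, rules):
        self.rules = list(rules)

    def select_gateway(self, endpoint_region: str, healthy: set) -> str:
        for rule in self.rules:
            region_matches = rule.endpoint_region in (None, endpoint_region)
            # Skip gateways that are not currently available.
            if region_matches and rule.gateway in healthy:
                return rule.gateway
        raise LookupError(f"no gateway available for {endpoint_region!r}")


if __name__ == "__main__":
    policy = StaticRoutingPolicy([
        Rule("europe", "X1"),
        Rule("us", "X2"),
        Rule(None, "X2"),  # fallback when the regional gateway is unavailable
    ])
    print(policy.select_gateway("europe", {"X1", "X2"}))
    print(policy.select_gateway("europe", {"X2"}))
```

The analytical variation described above would replace the fixed rule list with a selection driven by measured system state; that variation is not attempted here.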
Block S234, which includes selectively routing media communication through at least the communication-processing server if media resources are not in the first region, functions to route communication between the first and second regions. This selective option is preferably taken when the resource needed or preferred for handling the communication session is not within the local region (i.e., the first region). As with the initiation of a call, the communication gateway preferably initially connects to a communication-processing server. As was mentioned above, this default behavior may not be taken if the next state of the communication is known without accessing the communication-processing server. Additional resources within the second region may additionally or alternatively be used with the communication-processing server. For example, media resources such as recording service, text-to-speech servers, transcoding servers, transcription/speech recognition servers, and/or any suitable media resource may be implemented in the second region and may act on the media communication flow.
As mentioned above, the directing of the communication can dynamically change. The method may additionally include re-establishing communication with the communication-processing server upon a second endpoint terminating the media communication flow S236 as shown in
As shown in
As shown in
In block S302, the first communication gateway preferably performs any necessary authentications, security checks, verifications, and/or credential checks for one or both of the caller and the recipient. Block S302 can additionally include looking up and/or identifying a target uniform resource identifier (URI) for the invitation, which designates the next destination for the transmission, i.e., the suitable regional communication-processing server H1 for the request. As shown in
In block S306, the server H1 downloads and/or retrieves the TwiML based on the URI associated with the dialed number (which corresponds to an address in one variation of the preferred system and method). Preferably, block S306 can further include determining if there is any media associated with the session. Preferably, the existence or requirement of a particular media can be determined with reference to the TwiML, which can contain predefined actions or verbs. Suitable actions or verbs can include dialing a number, saying text to the caller, sending an SMS message, playing an audio or video file, getting input from the keypad, recording audio or video, connecting the call to another browser client or device, or any other suitable type or media of communication. In the example implementation, the TwiML would contain the “dial” verb, which requires media. Following a series of mutual acknowledgements, the transmission of media is opened up between the POP and the server H1 in block S306.
As shown in
As shown in
Preferably, the server H1 is not involved in the media flow of block S314. Accordingly, another example implementation can include detecting, at each of the first and second communication gateways X1 and X2, whether each respective side of the session has timed out for some reason. In response to a timeout at the first communication gateway X1, the first communication gateway X1 will alert the server H1, which in turn will hang up both the caller side and the callee side of the session. Alternatively, if it is the second communication gateway X2 that times out, then the server H1 can be configured to only terminate or hang up on the callee side in the event that there are more actions or verbs to execute on the caller side.
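The timeout handling just described reduces to a small decision made by the server H1. The following sketch is illustrative only: the function name `legs_to_hang_up` and the leg labels are hypothetical, and the behavior when the second communication gateway X2 times out with no verbs remaining (hanging up both sides) is an assumption that the description does not state.

```python
def legs_to_hang_up(timed_out_gateway: str, remaining_verbs: int) -> set:
    """Return which legs of the session the server H1 should terminate.

    timed_out_gateway: "X1" (caller side) or "X2" (callee side).
    remaining_verbs: number of actions or verbs still to execute for the caller.
    """
    if timed_out_gateway == "X1":
        # Caller-side timeout: both sides of the session are hung up.
        return {"caller", "callee"}
    if timed_out_gateway == "X2":
        if remaining_verbs > 0:
            # Callee-side timeout with verbs left: keep the caller leg alive.
            return {"callee"}
        # Assumption: with nothing left to execute, end the whole session.
        return {"caller", "callee"}
    raise ValueError(f"unknown gateway: {timed_out_gateway!r}")


if __name__ == "__main__":
    print(sorted(legs_to_hang_up("X1", 2)))
    print(sorted(legs_to_hang_up("X2", 2)))
```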
The foregoing example implementation illustrates one aspect of the preferred system and method using a single dial verb between two PSTN users in a telephony system. However, the preferred system and method can be readily configured for any suitable combination of verbs, user types, and media types found in a cloud-based communication network system. Some example alternative implementations can include usage of the say verb, the hang up verb, the gather verb, either alone or in combination with the dial verb described above.
As shown in
As shown in
As shown in
As shown in
A sixth exemplary implementation of the system and/or method of the preferred embodiment can accommodate timeout scenarios on the caller or callee side. Each communication gateway is preferably responsible for detecting timeouts for their respective leg of the communication. As shown in
One or more aspects of the example embodiment can be configured partially or entirely in a computer-readable medium storing computer-readable instructions. The instructions are preferably executed by computer-executable components preferably integrated with one or more APIs, servers, routing policy servers, POP servers, and/or communication gateways. The computer-readable medium can be stored on any suitable computer readable media such as RAMs, ROMs, flash memory, EEPROMs, optical devices (CD or DVD), hard drives, floppy drives, or any suitable device. The computer-executable component is preferably a processor but the instructions can alternatively or additionally be executed by any suitable dedicated hardware device.
As a person skilled in the art will recognize from the previous detailed description and from the figures and claims, modifications and changes can be made to the preferred embodiments of the invention without departing from the scope of this invention defined in the following claims.
Claims
1. A method for architecting a geographically distributed communication computing platform comprising:
- at a SIP signaling gateway of a first region, receiving a SIP communication invitation, the communication originating from a first communication endpoint directed through a provider service;
- signaling a communication invitation to a communication-processing server in a second region, wherein the first region and second region are at least two thousand miles apart;
- at the communication-processing server, retrieving application instructions from an internet accessible server at a URI that is associated with a destination endpoint of the communication invitation;
- at the communication-processing server, processing the retrieved application instructions;
- upon encountering an instruction requiring media flow, establishing a media communication flow with the first endpoint, querying a routing policy service of the second region for a selected communication route, and establishing a media communication flow with a resource of the selected communication route;
- wherein establishing a media communication flow with a resource of the selected communication route comprises if the instruction includes communicating with a second endpoint of the first region, establishing a communication flow with a second gateway, the second gateway establishing communication flow with the second endpoint through a provider service of the first region, and signaling for media communication to flow between the first and second endpoints through the first and second gateways of the first region.
2. The method of claim 1, further comprising if the second endpoint terminates media communication through the two gateways of the first region, re-inviting the communication-processing server, and the communication-processing server processing subsequent application instructions.
3. The method of claim 1, wherein establishing a media communication flow with a resource of the selected communication route comprises if the instruction includes a media resource not in the first region, the communication-processing server establishing media communication flow with the media resource of the second region, the communication gateway, and the first endpoint through the gateway of the first region.
4. The method of claim 1, wherein the first and second endpoints can be communicating with an audio medium, a video medium, or screen-sharing medium.
5. The method of claim 4, wherein the first and second endpoints communicate to the provider service with a PSTN protocol or a SIP protocol.
6. A method for architecting a geographically distributed communication computing platform with a subset of resources in a first region and a subset of resources in a second region comprising:
- at a signaling gateway of a first region, receiving a communication invitation of a first endpoint from a communication provider;
- signaling the communication invitation to a communication-processing server in a second region;
- in response to communication processing of the communication-processing server, dynamically directing signaling and media of the communication according to processing instructions and resources available in at least the first and second regions;
- wherein dynamically directing signaling and media communication of the communication comprises selectively routing media communication exclusively through communication resources of the first region if resources are available in the first region or selectively routing media communication between the first endpoint, the gateway, and at least the communication-processing server if media resources are not in the first region.
7. The method of claim 6, wherein routing media communication exclusively through communication resources of the first region comprises inviting a second gateway, the second gateway inviting a second endpoint accessible through a provider service of the first region, and the communication-processing server re-inviting the first and second communication gateway to establish media communication flow between the first and second communication gateway and away from the communication-processing server.
8. The method of claim 7, wherein upon the second endpoint terminating communication flow, re-establishing communication with the communication-processing server.
9. The method of claim 7, wherein the first and second endpoints are PSTN endpoints.
10. The method of claim 7, wherein the first and second endpoints are client application endpoints.
11. The method of claim 6, wherein the signaling uses a SIP signaling protocol.
12. The method of claim 6, wherein communication flow supports the first endpoint and second endpoint connecting to the provider service through different communication mediums, the possible communication mediums comprising audio, video, and screen-sharing.
13. The method of claim 6, wherein the communication-processing server is a call router; and further comprising the call router retrieving telephony application instructions from an internet accessible server at a URI that is associated with a destination endpoint of the communication invitation, and processing the retrieved telephony application instructions.
14. The method of claim 6, wherein the resources available in the first region are media services.
15. The method of claim 14, wherein the media services comprise a text-to-speech service, a recording service, and a transcoding service.
16. A system for managing synchronous communication across regions comprising:
- a communication-processing server in a first region configured to retrieve application instructions from an addressable resource associated with a destination endpoint of a communication invitation and process the retrieved application instructions;
- service provider interfaces in a second region with communication to outside communication endpoints in the second region;
- a communication gateway of the second region configured to dynamically redirect signaling and media communication flow between at least one endpoint of the second region and the communication-processing server of the first region during the duration of a synchronous communication session;
- wherein the first region and the second region are within geographically distinct regions wherein latency of media communication flow between the first region and the second region is greater than latency of media communication flow between endpoints of the second region.
17. The system of claim 16, further comprising a routing policy server communicatively coupled to the communication-processing server and configured to select a communication gateway for communication between at least the first and second region during a communication session.
18. The system of claim 17, wherein the signaling is SIP signaling; and wherein the first and second communication endpoints can be a PSTN endpoint or a client application endpoint.
19. The system of claim 18, wherein the communication medium supported by the service provider interfaces in the second region comprise audio communication, video communication, and screen-sharing communication.
20. The system of claim 17, further comprising media resources of the second region that are configured for media communication flow within the second region independent of the communication-processing server of the first region.
21. The system of claim 20, wherein the media resources comprise a text-to-speech resource, a recording resource, and a media transcoding resource.
22. The system of claim 16, wherein the distance between the first region and the second region is greater than 2000 miles.
Type: Application
Filed: Jun 6, 2013
Publication Date: Nov 14, 2013
Inventors: Christer Fahlgren (San Francisco, CA), Jonas Boerjesson (San Francisco, CA), John Wolthuis (San Francisco, CA), Peter Shafton (San Francisco, CA)
Application Number: 13/911,896
International Classification: H04L 29/06 (20060101);