EXTENDED HTTP OBJECT CACHE SYSTEM AND METHOD

Info

Publication number: 20160241667
Type: Application
Filed: Feb 18, 2016
Publication Date: Aug 18, 2016
Inventors: Pankaj G. Kulkarni (Koramangala Bangalore), Phaniraj M. Raghavendra (Bangalore), Rakesh Padinjeredil (Mysore), Andrew Foss (San Jose, CA), John Coronella (Winchester, MA)
Application Number: 15/047,594

Abstract

An Extended Object Cache system for reducing network traffic and methods for making and using the same. The Extended Object Cache system allows caching of HTTP responses where the response headers do not include Etag headers to indicate whether contents of the response have changed, but the contents have not changed. The system allows caching of dynamically generated content as well as static content that does not include an ETag in the response headers.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. provisional patent application, Ser. No. 62/117,879, filed Feb. 18, 2015. Priority to the provisional patent application is expressly claimed, and the disclosure of the provisional application is hereby incorporated herein by reference in its entirety and for all purposes.

The following United States nonprovisional patent applications are fully owned by the assignee of the present application and are filed on the same date herewith. The disclosures of the nonprovisional patent applications are hereby incorporated herein by reference in their entireties and for all purposes:

“MULTI-STAGE ACCELERATION SYSTEM AND METHOD,” Attorney Matter No. 29955.4001, filed Feb. 18, 2016; and

“SYSTEM AND METHOD TO ELIMINATE DUPLICATE BYTE PATTERNS IN NETWORK STREAMS,” Attorney Matter No. 29955.4002, filed Feb. 18, 2016.

COPYRIGHT NOTICE

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

FIELD

The disclosed embodiments relate generally to object caching and more particularly, but not exclusively, to methods and systems for an Extended Hypertext Transfer Protocol (HTTP) Object Cache.

BACKGROUND

HTTP caching is described in RFC 2616 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html). An HTTP client, such as a web browser, makes an HTTP Request “R” to a web server. When the web server returns a response “RR” to the HTTP client, the HTTP client can store the response RR in its cache. When the client subsequently makes the same request R, the HTTP client informs the web server that the HTTP client already has a cached response along with some metadata about the cached response, as part of the HTTP request headers of R. Based on the HTTP request headers (for example, If-Modified-Since, If-Match), the web server uses the rules described in RFC 2616 to decide if the cached response at the client is fresh or if it has gone stale. If the cached response at the client is fresh, the web server returns a ‘304 Not Modified’ response to inform the client to continue to use the cached response. If the cached response at the client is stale, the web server sends the new response “RR_UPDATED”. RFC 2616 describes the rules and protocol support for HTTP caching, including headers, response codes, expiration mechanisms, etc.

One of these HTTP caching mechanisms is an Etag header. When a web server sends an HTTP response RR for an HTTP request R, the web server can include an Etag header in the response header of the RR, for example “Etag: Etag-String”, to indicate that Etag-String is the value of the Etag header. The Etag-String is a small opaque string, and can represent a signature for the Contents of the RR (Contents of RR referred to as CRR henceforth). If the CRR does not change, the Etag in the RR does not change. If the CRR changes, then the web server creates a new Etag ETAG_UPDATED and includes that in RR_UPDATED.

The HTTP client can cache the RR locally. When the client makes a subsequent request R for the same resource, the client mentions the Etag in one of the request headers (for example, using an “if-none-match: Etag” header field). The web server checks if the current Etag for the response RR matches what the client has provided. If yes, the server knows that the client has the correct version of the object and sends a ‘304 Not Modified’ response, which is just a few bytes in length. If the Etag does not match, the server sends the complete response RR_UPDATED along with the new Etag ETAG_UPDATED so the client can update its cache.
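The Etag exchange described above can be sketched as follows. This is only a minimal illustration, not the behavior of any particular web server: the function names `make_etag` and `respond` are hypothetical, and deriving the Etag-String from an MD5 digest of the contents is an assumption (the Etag is opaque, and a server may generate it any way it likes).

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Assumption: the opaque Etag-String is derived from an MD5 digest of the
    # contents, so it changes exactly when the contents (CRR) change.
    return '"' + hashlib.md5(body).hexdigest() + '"'


def respond(body: bytes, request_headers: dict) -> tuple:
    """Return (status_code, response_headers, payload) for one request R."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client already holds the current version: send only a short 304.
        return 304, {"ETag": etag}, b""
    # First request, or the contents changed: send the full response and Etag.
    return 200, {"ETag": etag}, body
```

A client would store the `ETag` value from the first response and echo it in `If-None-Match` on the next request for the same resource.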

SUMMARY

Modern web browsers support the Etag mechanism. Many web sites also provide Etags for their content, but this is typically for static files and not for dynamically generated content. Several web sites do not support Etag even for static content, since supporting Etags is optional and requires additional work to be done by the web administrator.

In practice, dynamically generated content is also cacheable. For example, if someone were to look up the timetable of trains from Boston to New York, the web server might fetch this information from a database dynamically, but the response will be the same every time (unless there is a change in schedule which causes the response to change).

An Extended HTTP Object Cache system disclosed herein allows caching of HTTP responses RR where the response headers do not include the Etag header to indicate whether CRR has changed, but the CRR has not changed. The disclosed Extended HTTP Object Cache system allows caching of dynamically generated content as well as static content that does not include an ETag in the response headers.

Though the Extended HTTP Object Cache is applicable in all situations, it is more relevant where the ‘last mile’ connectivity does not have high bandwidth. Examples of a low-bandwidth ‘last mile’ include cellular data and public shared wireless fidelity (Wi-Fi) hot spots, such as those at airports. The Extended HTTP Object Cache drastically cuts down the data transferred in the last mile.

The disclosed system is available in two modes—double-ended and single-ended.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a top-level block diagram illustrating one embodiment of an Extended HTTP Object Cache system operating in double-ended mode.

FIG. 2 is a block diagram illustrating an alternative embodiment of the Extended HTTP Object Cache system of FIG. 1, wherein the system is run in a single ended mode.

FIG. 3 is a flow chart illustrating one embodiment for processing an HTTP Request from the Application by the Client Proxy of FIG. 1.

FIG. 4 is a flow chart illustrating one embodiment for processing an HTTP Request from the Client Proxy on the Server Proxy of FIG. 1.

FIG. 5 is a flow chart illustrating one embodiment for processing the Server Proxy Response RR and sending the response to the Application of FIG. 1.

FIG. 6 is a flow chart illustrating one embodiment for processing an HTTP Request from the Application and sending the Response to the Application of FIG. 2.

It should be noted that the figures are not drawn to scale and that elements of similar structures or functions are generally represented by like reference numerals for illustrative purposes throughout the figures. It also should be noted that the figures are only intended to facilitate the description of the preferred embodiments. The figures do not illustrate every aspect of the described embodiments and do not limit the scope of the present disclosure.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Turning to FIG. 1, a Client 10 is any computer or mobile device that can connect to the Internet and request HTTP objects using an Application 11 running on the Client 10. The Application 11 can also include Web Browsers and applications that make network requests. A Destination Server 14 is a web server serving the HTTP requests. The Client Extended HTTP Proxy 12 is the client side of the disclosed system and runs on the Client 10. For simplicity, the Client Extended HTTP Proxy 12 can be referred to as ‘Client Proxy’ 12. The Application 11 on the Client machine 10 uses the ‘Client Proxy’ 12 as a standard HTTP Proxy as defined in RFC 2068 (https://www.ietf.org/rfc/rfc2068.txt). A Server Extended Object Caching Proxy 13 is an enhanced HTTP Proxy that adds the server side of the disclosed system. For simplicity, the Server Extended Object Caching Proxy 13 can be referred to as ‘Server Proxy’ 13.

As an example with reference to FIG. 1, consider the Client 10 to be a mobile device using a shared Wi-Fi in a public location trying to access a hotel-booking site (e.g., the Destination Server 14) to make a reservation. The Client 10 has a slow network connection to the Internet (the last mile). The Server Proxy 13 is a machine on a network (e.g., cloud computing), which has a very fast network connection to the Internet. Thus, the data exchange rate between the Server Proxy 13 and the Destination Server 14 is much greater than the data exchange rate between the Client 10 and the Destination Server 14. When the Application 11 makes an HTTP request R, the request R is intercepted by the Client Proxy 12. The Client Proxy 12 then forwards the request R to the Server Proxy 13. The Server Proxy 13 fetches the response RR from the Destination Server 14 and sends the response RR back to the Client Proxy 12, which in turn responds to the Application 11.

FIG. 3 is a flow chart illustrating one embodiment for handling the HTTP Request R on the client side, in double ended mode, using the system of FIG. 1. In step 15, the Application 11 sends the HTTP Request R to the Client Proxy 12. In step 16, the Client Proxy 12 determines whether it has a cached response RR for the Request R in its local cache.

If yes, in step 17, if the response RR is found to be fresh according to the RFC specifications, then in step 18, the Client Proxy 12 immediately sends the response RR to the Application 11. In step 17, if the cached response RR is found to be stale, then in step 19, the Client Proxy 12 fetches the response header X-ActEtag from the cached RR and adds it to the HTTP Request R. Then in step 20, the Client Proxy 12 forwards this request to the Server Proxy 13 and fetches the response RR.

However, returning to step 16, if the Client Proxy 12 did not find a cached response RR for the request R, the Client Proxy 12 forwards the request R to the Server Proxy 13 and fetches a new response RR. In step 21, the response RR from the Server Proxy 13 is processed as described in FIG. 5.
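The client-side flow of FIG. 3 (steps 16 through 20) can be sketched as below. This is an illustrative sketch under stated assumptions, not the disclosed implementation: `CachedResponse`, `handle_request`, the `forward` callable standing in for the Server Proxy 13, and the `fresh` flag standing in for the RFC freshness rules are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CachedResponse:
    headers: Dict[str, str]
    body: bytes
    fresh: bool  # assumption: stand-in for the RFC freshness calculation


def handle_request(url: str,
                   request_headers: Dict[str, str],
                   cache: Dict[str, CachedResponse],
                   forward: Callable[[str, Dict[str, str]], object]):
    """Return ("cache", response) or ("server", response) for a request R."""
    cached = cache.get(url)                      # step 16: cached RR for R?
    outgoing = dict(request_headers)             # never mutate the caller's dict
    if cached is not None:
        if cached.fresh:                         # step 17: still fresh?
            return "cache", cached               # step 18: answer locally
        tag = cached.headers.get("X-ActEtag")    # step 19: copy the stored hash
        if tag is not None:
            outgoing["X-ActEtag"] = tag
    # Step 20, or the cache-miss branch of step 16: ask the Server Proxy.
    return "server", forward(url, outgoing)
```

The value returned from `forward` would then go through the response handling of FIG. 5 before reaching the Application 11.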

One embodiment of processing the request R on the Server Proxy 13 in double ended mode is illustrated in FIG. 4. Turning to FIG. 4, in step 22, the HTTP Request R is received from the Client Proxy 12 by the Server Proxy 13. In step 23, the Server Proxy 13 checks if the request R contains a header field “X-ActEtag: value”. If the header field “X-ActEtag: value” is not found in R, in step 29, the Server Proxy 13 fetches the response RR from the Destination Server 14 and then in step 28, the Server Proxy 13 sends the response RR to the Client Proxy 12. On the other hand, if the Request R contains a header “X-ActEtag: value”, in step 24, the Server Proxy 13 first fetches the response RR from the destination server 14. Then in step 25, the Server Proxy 13 computes a hash digest (for example, an MD5 digest) “hash” of the Contents of RR (CRR).

In step 26, the Server Proxy 13 checks if the computed “hash” is the same as the “value” that was present in the X-ActEtag header. If the “hash” is the same as “value”, the Server Proxy 13 knows that the CRR at the Client Proxy 12 is the same as what the Destination Server 14 sent and so in step 27, the Server Proxy 13 sends a “304 Act Not Modified” response to the Client Proxy 12. Advantageously, instead of sending the whole CRR, which can be a large file, only a few bytes of “304 Act Not Modified” are sent over the last mile to the Client Proxy 12, thus making it possible to cache RFC-non-cacheable content. A large data transfer thus can be avoided in the last mile, leading to increased speed of fetching the content and data savings on the last mile.
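The Server Proxy decision of FIG. 4 can be sketched as follows. The names `act_hash` and `server_proxy_handle`, and the `fetch_from_destination` callable standing in for the Destination Server 14, are hypothetical. MD5 is used because the description gives it as an example digest. Sending the full response when the hash differs follows claim 1 rather than an explicit step in the flow chart description.

```python
import hashlib
from typing import Callable, Dict, Tuple

Response = Tuple[str, Dict[str, str], bytes]  # (status line, headers, CRR)


def act_hash(content: bytes) -> str:
    # Step 25: hash digest of the contents of RR (MD5 as the example digest).
    return hashlib.md5(content).hexdigest()


def server_proxy_handle(request_headers: Dict[str, str],
                        fetch_from_destination: Callable[[], Response]) -> Response:
    # Steps 24 and 29: the response is fetched from the destination either way.
    status, headers, content = fetch_from_destination()
    value = request_headers.get("X-ActEtag")     # step 23
    if value is None:
        return status, headers, content          # step 28: plain pass-through
    if act_hash(content) == value:               # step 26
        # Step 27: only a few bytes cross the last mile.
        return "304 Act Not Modified", {}, b""
    # Hash differs: the client's copy is out of date, send the full response.
    return status, headers, content
```

Note that the Server Proxy still downloads the full response from the Destination Server on every request; the saving is on the slow link between the Server Proxy and the Client Proxy.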

FIG. 5 is a flow chart of handling of the HTTP Response RR from the Server Proxy 13 on the Client Proxy 12, in double ended mode of FIG. 1 (step 21 of FIG. 3). In step 30, the Client Proxy 12 receives the HTTP response RR from the Server Proxy 13. In step 31, the Client Proxy 12 checks if the response code RC is 200 OK. If yes, the Client Proxy 12 computes a hash digest (for example, an MD5 digest) “hash” of the Contents of RR (CRR). The Client Proxy 12 and the Server Proxy 13 use the mechanism of step 25 to compute the “hash” value. In step 32, the Client Proxy 12 then adds a response header “X-ActEtag: hash” to the response header of RR. Then, in step 33, the Client Proxy 12 updates its local cache with this response RR, and sends RR to the Application 11.

In step 31, if the response code RC is not 200 OK, then in step 34, the Client Proxy 12 checks if the response code RC is 304. If yes, in step 35, the Client Proxy 12 then checks if the response code RC is “304 Act Not Modified”. If yes, the Client Proxy 12 takes this as an indication from the Server Proxy 13 that the response RR in its cache is the same as what the Destination Server 14 sent, and so in step 36, the Client Proxy 12 fetches the response RR from its local cache and sends RR to the Application 11. In step 35, if the response code RC is not “304 Act Not Modified”, then the Client Proxy 12 understands that the Destination Server itself has sent a “304 Not Modified” and so in step 37, sends “304 Not Modified” to the Application 11. In step 34, if the response code RC is not 304, the Client Proxy 12 forwards the response RR from the Server Proxy 13 to the Application 11.
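The response handling of FIG. 5 (steps 30 through 37) can be sketched as below. The function name `client_proxy_process` and the dictionary-based cache are hypothetical. Returning the cached entry with a “200 OK” status line in step 36 is an assumption, based on the fact that only 200 responses are stored in step 33.

```python
import hashlib
from typing import Dict, Tuple

CacheEntry = Tuple[Dict[str, str], bytes]  # (headers incl. X-ActEtag, CRR)


def client_proxy_process(url: str, status_line: str, headers: Dict[str, str],
                         content: bytes, cache: Dict[str, CacheEntry]):
    """Return the (status line, headers, content) handed to the Application."""
    code = status_line.split(" ", 1)[0]
    if code == "200":                                    # step 31
        tagged = dict(headers)
        # Step 32: same digest mechanism as the Server Proxy (step 25).
        tagged["X-ActEtag"] = hashlib.md5(content).hexdigest()
        cache[url] = (tagged, content)                   # step 33
        return "200 OK", tagged, content
    if code == "304":                                    # step 34
        if status_line == "304 Act Not Modified":        # step 35
            cached_headers, cached_content = cache[url]  # step 36
            # Assumption: cached entries are always 200 responses.
            return "200 OK", cached_headers, cached_content
        # Step 37: the Destination Server itself answered 304.
        return "304 Not Modified", headers, b""
    # Any other response code is forwarded unchanged.
    return status_line, headers, content
```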

In an alternative embodiment, the computation of the “hash” and addition of X-ActEtag header can be done on the Server Proxy 13 instead of the Client Proxy 12 based on other considerations such as client and server compute power.

FIG. 6 is a flow chart of one embodiment for handling the HTTP Request R in single ended mode. A difference between the double-ended mode in FIG. 1 and single-ended mode in FIG. 2 is that in single ended mode, there is no ‘Client Proxy’ 12 running on the Client 10. The system of FIG. 2 uses the Client Application cache and the standard Etag header that is part of the HTTP Object Caching RFC mentioned earlier. In single ended mode, the Application 11 sends the HTTP Request R to the Server Proxy 13 in step 38. In step 39, the Server Proxy 13 checks if there is a request header “if-none-match: act-hashvalue” in the request R. If the answer in step 39 is yes, the Server Proxy 13 then fetches the response RR from the Destination Server 14 in step 40.

Then in step 44, the Server Proxy 13 computes a hash digest “hash” of the contents of the response, in the same manner as step 25 of the double ended system. In step 45, the Server Proxy 13 checks if “hash” matches “hashvalue” from the request Etag header. If the answer in step 45 is yes, the Server Proxy 13 realizes that the Application 11 has the same CRR as sent by the Destination Server 14 and so in step 46, the Server Proxy 13 sends a “304 Not Modified” response to the client. In step 45, if the “hash” does not match “hashvalue” from the request Etag header, then the Server Proxy 13 realizes that the response from the Destination Server 14 is newer than what the Application 11 has in its cache. So, in step 42, the Server Proxy 13 adds an “Etag:act-hash” header to the response RR from the Destination Server 14 and sends the response to the Application 11 in step 43.
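The single-ended flow of FIG. 6 can be sketched as follows. The function name `single_ended_handle` and the `fetch_from_destination` callable are hypothetical. The description does not spell out the branch where the request carries no “if-none-match: act-hashvalue” header, so this sketch assumes that the response is fetched and tagged with “Etag: act-hash” in that case as well.

```python
import hashlib
from typing import Callable, Dict, Tuple

Response = Tuple[str, Dict[str, str], bytes]  # (status line, headers, CRR)
PREFIX = "act-"  # marks Etags generated by the Server Proxy


def single_ended_handle(request_headers: Dict[str, str],
                        fetch_from_destination: Callable[[], Response]) -> Response:
    # Header names are case-insensitive in HTTP, so normalize before lookup.
    lowered = {k.lower(): v for k, v in request_headers.items()}
    client_tag = lowered.get("if-none-match", "")         # step 39
    status, headers, content = fetch_from_destination()   # step 40
    digest = hashlib.md5(content).hexdigest()              # step 44
    if client_tag.startswith(PREFIX) and client_tag[len(PREFIX):] == digest:
        # Steps 45 and 46: the Application's cached copy is current.
        return "304 Not Modified", {}, b""
    # Steps 42 and 43: tag the response so the Application can revalidate later.
    # Assumption: also taken when the request had no act- Etag at all.
    tagged = dict(headers)
    tagged["Etag"] = PREFIX + digest
    return status, tagged, content
```

Because the standard Etag and “304 Not Modified” are used, the Application's own HTTP cache does the work that the Client Proxy 12 performs in double-ended mode.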

In the single ended mode, the Server Proxy 13 is avoiding transmission of the complete response from the Destination Server 14 when the Application 11 has the same content in its local cache as the content from the Destination Server 14. Instead the Server Proxy 13 sends only a few bytes containing “304 Not Modified” so the Application 11 can use the response from its cache. The size of the data transfer thus can be reduced in the last mile leading to increased speed of fetching the content and data savings on the last mile.

It should be noted here that for the sake of clarity, the diagrams do not show handling of other HTTP response codes such as 404 and 503. These response codes are handled by the Server Proxy 13 and Client Proxy 12 as described in RFC 2068 (https://www.ietf.org/rfc/rfc2068.txt).

The described embodiments are susceptible to various modifications and alternative forms, and specific examples thereof have been shown by way of example in the drawings and are herein described in detail. It should be understood, however, that the described embodiments are not to be limited to the particular forms or methods disclosed, but to the contrary, the present disclosure is to cover all modifications, equivalents, and alternatives.

Claims

1. A method for reducing network traffic between an application and a destination server, comprising:

intercepting a network request from the application via a client proxy;

forwarding the network request to a server proxy;

receiving a network response from the destination server at the server proxy;

determining at the server proxy whether the network response is the same as a cached response at the client proxy; and

transmitting a ‘304 Act Not Modified’ response to the client proxy if the cached response at the client proxy is the same as the network response, otherwise transmitting the network response to the client proxy to transmit the network response to the application.

2. The method of claim 1, wherein said intercepting the network request comprises intercepting a Hypertext Transfer Protocol request.

3. The method of claim 1, further comprising computing a hash digest of the network responses to identify if a selected network response has changed.

4. The method of claim 1, further comprising maintaining a cache of network responses at the client proxy.

5. The method of claim 4, further comprising adding a Hypertext Transfer Protocol header “X-ActEtag: value” to a selected network response before storing the selected network response in the cache at client proxy.

6. The method of claim 5, further comprising updating the X-ActEtag header via at least one of the client proxy and the server proxy.

7. The method of claim 4, wherein said forwarding the network request to the server proxy comprises adding a Hypertext Transfer Protocol header “X-ActEtag: value” to the network request when there is a stale cached response in the cache at the client proxy.

8. The method of claim 7, further comprising computing a hash digest of a selected network response and comparing the hash digest with the value from the header “X-ActEtag: value” of the network request to determine if the cached response at the client is the same as the selected network response from the destination server.

9. The method of claim 1, further comprising fetching the cached response from the cache based on said transmitting the ‘304 Act Not Modified’ response and sending the cached response to the application.

10. A method for reducing Hypertext Transfer Protocol traffic between an application and a destination server, comprising:

intercepting a Hypertext Transfer Protocol request from the application via a server proxy;

receiving a Hypertext Transfer Protocol response from the destination server at the server proxy;

determining at the server proxy whether the Hypertext Transfer Protocol response is the same as a cached response at the application; and

transmitting a ‘304 Not Modified’ response to the application if the cached response at the application is the same as the Hypertext Transfer Protocol response, otherwise transmitting the Hypertext Transfer Protocol response to the application.

11. The method of claim 10, further comprising computing a hash digest of the Hypertext Transfer Protocol response to identify if a selected Hypertext Transfer Protocol response has changed.

12. The method of claim 11, further comprising adding a Hypertext Transfer Protocol header “Etag: act-hashvalue” to a selected Hypertext Transfer Protocol response before sending the Hypertext Transfer Protocol response to the application.

13. The method of claim 12, further comprising comparing a hashvalue from the “Etag: act-hashvalue” header with the computed hash digest of the Hypertext Transfer Protocol response from the destination server to determine if the cached response at the application is the same as the Hypertext Transfer Protocol response from the destination server and sending a ‘304 Not Modified’ Hypertext Transfer Protocol response if the hashvalue and the computed hash value of the Hypertext Transfer Protocol response are the same.