RETURN-LINK OPTIMIZATION FOR FILE-SHARING TRAFFIC

- ViaSat, Inc.

Methods, apparatuses, and systems for return-link optimization are provided. Embodiments identify upload-after-download content (e.g., file sharing content) upon download, and generate one or more identifiers characterizing the content (e.g., a digest). The identifiers are stored in a client-side server dictionary model reflecting a presumption that the content is stored in a server-side dictionary. When content is later uploaded, the server dictionary model is used to identify when the upload content matches previously downloaded content. When a match is detected, the stored identifiers are used to generate a highly compressed version of the upload content, which is then uploaded to the server instead of uploading the full content data. In some embodiments, similar techniques are used to optimize return link bandwidth usage for upload-after-upload transactions.

DESCRIPTION
CROSS-REFERENCES

This application claims the benefit of and is a non-provisional of co-pending U.S. Provisional Application Ser. No. 61/144,363, filed on Jan. 13, 2009, titled “SATELLITE MULTICASTING”; and co-pending U.S. Provisional Application Ser. No. 61/170,359, filed on Apr. 17, 2009, titled “DISTRIBUTED BASE STATION SATELLITE TOPOLOGY,” both of which are hereby expressly incorporated by reference in their entirety for all purposes.

BACKGROUND

This disclosure relates in general to communications and, in particular but not by way of limitation, to the optimization of return links of a communications system.

In some satellite communications systems, a single user plays a dual role of client and server (e.g., in a peer-to-peer environment). For example, a user may desire to share previously downloaded content with another user. Certain types of local networking and/or shared caching techniques may be used to limit redundancies and/or other inefficiencies associated with these types of transactions. However, the techniques may rely at times on users sharing a subnet, relatively symmetric client-server storage capabilities, relatively symmetric upload-download capabilities of the network links, or other types of network characteristics.

As such, it may be desirable to further mitigate inefficiencies associated with these types of communications while avoiding limitations of current approaches.

SUMMARY

Among other things, methods, systems, devices, and software are provided for improving utilization of a communications system (e.g., a satellite communications system) through techniques referred to herein as return-link optimization. Embodiments operate in a client-server context (or a more generalized sender-receiver context). When content is downloaded by a client from a server, a client optimizer intercepts the download and generates one or more identifiers characterizing the content (e.g., a digest). The identifiers are stored in a client-side server dictionary model reflecting a presumption that the content is stored in a server-side dictionary. In some embodiments, the actual data blocks (e.g., byte sequences) making up the content are not stored at the client side; only digests or other identifiers are stored.

In some embodiments, when content is uploaded by the client at some later time, the server dictionary model is used to identify when the upload content matches previously downloaded (or, in some embodiments, previously uploaded) content. When a match is detected, the identifiers stored in the server dictionary model are used to generate a highly compressed version of the upload content, which is then uploaded to the server instead of the full content data. In this way, return-link bandwidth usage can be reduced for these types of transactions.

In one set of embodiments, a system is provided for managing return-link resource usage in a communications system. The system includes a local dictionary model configured to store identifiers associated with data blocks stored on a remote dictionary, where the remote dictionary is located at a remote node of the communications system. For example, the remote dictionary may be a server dictionary in communication with a server optimizer. The system further includes a download processor module, configured to: receive a first content data block from a remote device associated with the remote dictionary; store the first content data block in a local store (e.g., a buffer); calculate a first identifier (e.g., a digest) from the first content data block; store the first identifier in the local dictionary model; and remove the first content data block from the local store. The system further includes an upload processor module, configured to: receive a second content data block for upload to the remote device; calculate a second identifier from the second content data block; determine whether the second identifier matches the first identifier stored in the local dictionary model; and when the second identifier matches the first identifier, use the first identifier or the second identifier to compress the second content data block into compressed content.
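For illustration only, the following Python sketch shows one way the download-processor and upload-processor roles described above could be combined around a local dictionary model. The class and function names, and the choice of SHA-256 as the strong identifier, are assumptions made for this sketch rather than details from the disclosure.

```python
import hashlib


class LocalDictionaryModel:
    """Client-side model of the remote (server-side) dictionary.

    Only identifiers (digests) are kept locally; the data blocks themselves
    are presumed to reside in the remote dictionary.
    """

    def __init__(self):
        self._digests = set()

    def add(self, digest: str) -> None:
        self._digests.add(digest)

    def contains(self, digest: str) -> bool:
        return digest in self._digests


def process_download(block: bytes, model: LocalDictionaryModel) -> bytes:
    # Compute a strong identifier for the downloaded block, record it in the
    # local dictionary model, and pass the block on without storing it locally.
    digest = hashlib.sha256(block).hexdigest()
    model.add(digest)
    return block


def process_upload(block: bytes, model: LocalDictionaryModel):
    # If the block was seen before (and is therefore presumed present in the
    # remote dictionary), send only its digest; otherwise send the raw bytes.
    digest = hashlib.sha256(block).hexdigest()
    if model.contains(digest):
        return ("ref", digest)   # compressed: a reference the remote side can resolve
    return ("raw", block)        # literal data for blocks not previously seen
```

The key property is that only identifiers persist on the client side; the block data itself is passed through and discarded.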

Further areas of applicability of the present disclosure will become apparent from the detailed description provided hereinafter. It should be understood that the detailed description and specific examples, while indicating various embodiments, are intended for purposes of illustration only and are not intended to necessarily limit the scope of the disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is described in conjunction with the appended figures:

FIG. 1 shows a simplified block diagram of one embodiment of a communications system for use with various embodiments;

FIG. 2A shows a simplified block diagram of one embodiment of a client-server communications system for use with various embodiments;

FIG. 2B shows a simplified block diagram of an embodiment of a communications system having multiple user systems for use with various embodiments;

FIG. 3 shows a block diagram of an embodiment of a satellite communications system having a server system in communication with multiple user systems via a satellite over multiple spot beams, according to various embodiments;

FIG. 4 shows a block diagram of an embodiment of a communications system, illustrating client-server interactivity through a client optimizer and a server optimizer, according to various embodiments;

FIG. 5 shows a block diagram of an embodiment of a client optimizer having additional storage capacity and mode selection, according to various embodiments;

FIG. 6 shows an illustrative method for performing return-link optimization, according to various embodiments; and

FIG. 7 shows an illustrative method for performing return-link optimization for an upload-after-upload transaction, according to various embodiments.

In the appended figures, similar components and/or features may have the same reference label. Further, various components of the same type may be distinguished by following the reference label by a dash and a second label that distinguishes among the similar components. If only the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label irrespective of the second reference label.

DETAILED DESCRIPTION

The ensuing description provides preferred exemplary embodiment(s) only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description of the preferred exemplary embodiment(s) will provide those skilled in the art with an enabling description for implementing a preferred exemplary embodiment. It is understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope as set forth in the appended claims.

Referring first to FIG. 1, a simplified block diagram is shown of one embodiment of a communications system 100 for use with various embodiments. The communications system 100 facilitates communications between a sender optimizer 120 on a sender side 110 and a receiver optimizer 140 on a receiver side 130. The sender optimizer 120 and the receiver optimizer 140 are configured to effectively provide an optimizer tunnel 105 between the sender side 110 and the receiver side 130 of the communications system 100, including providing certain communications functionality.

Embodiments of the optimizers (e.g., the sender optimizer 120 and/or the receiver optimizer 140) can be implemented in a number of ways without departing from the scope of the invention. In some embodiments, the optimizers are implemented as proxy components (e.g., a two-part proxy client/server topology), such that the optimizer tunnel 105 is a proxy tunnel. For example, a transparent intercept proxy can be used to intercept traffic in a way that is substantially transparent to users at each side of the proxy tunnel. In other embodiments, the optimizers are implemented as in-line optimizers. For example, the optimizers are implemented within respective user or provider terminals. Other configurations are possible in other embodiments. For example, embodiments of the receiver optimizer 140 are implemented in the Internet cloud (e.g., on commercial network leased server space), and embodiments of the sender optimizer 120 are implemented within a user system (e.g., in a user's personal computer, within a user's modem, in a physically separate component at the customer premises, etc.).

Various embodiments of optimizers may include and/or have access to different amounts of storage. Some embodiments are configured to cache data, store dictionaries of byte sequences, etc. For example, in the communications system 100, the receiver optimizer 140 has access to enough storage to maintain a receiver dictionary 144. Embodiments of the receiver dictionary 144 include chunks of content data (e.g., implemented as delta dictionaries, wide dictionaries, byte caches, and/or other types of dictionary structures). For example, when content data is stored in the dictionary, some or all of the blocks of data defining the content are stored in the dictionary in an unordered, but indexed way. As such, content may not be directly accessible from the dictionary; rather, the set of indexes may be needed to recreate the content from the set of unordered blocks.
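As a rough illustration of this unordered-but-indexed structure, the sketch below (an assumed layout, not one specified in the disclosure) keys blocks by an index such as a digest and recreates content only from an ordered list of indexes.

```python
class ReceiverDictionary:
    """Stores content data blocks keyed by an index (e.g., a digest).

    Blocks are kept unordered; content is only recoverable given the ordered
    sequence of indexes that describes it.
    """

    def __init__(self):
        self._blocks = {}  # index -> raw block bytes

    def put(self, index: str, block: bytes) -> None:
        self._blocks.setdefault(index, block)

    def get(self, index: str) -> bytes:
        return self._blocks[index]

    def reassemble(self, ordered_indexes) -> bytes:
        # Content is recreated by concatenating blocks in the order given by
        # the index list, not by any order implied by the dictionary itself.
        return b"".join(self._blocks[i] for i in ordered_indexes)
```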

Other embodiments of optimizers have substantially limited storage. For example, in the communications system 100, the sender optimizer 120 has access only to a small amount of storage. The storage capacity may be too limited to store a full dictionary, but sufficient to store a model of the receiver dictionary 144, illustrated as the receiver dictionary model 124. Embodiments of the receiver dictionary model 124 store digests representing data stored at the receiver dictionary 144. For example, as described more fully below, embodiments of the sender optimizer 120 intercept traffic, and use one or more techniques to generate digests of byte sequences of the traffic. The digests are then stored in the receiver dictionary model 124, and can be used to identify matching byte sequences in the receiver dictionary 144.

As used herein, “digests” may generally include any type of fingerprint, digest, signature, hash function, and/or other functional coding of byte sequences generated so as to provide a strong enough identifier to reliably represent substantially identical matching blocks stored in a dictionary. For example, a user on the sender side 110 of the communications system 100 downloads content from the receiver side 130 of the communications system 100. In one embodiment, the content is intercepted by the sender optimizer 120 and a digest is created and stored in the receiver dictionary model 124. Storage of the digest in the receiver dictionary model 124 indicates that a full copy of the downloaded content is stored in the receiver dictionary 144 on the receiver side 130 of the communications system 100 without storing a copy of the data on the sender side 110 of the communications system 100.

If the user at the sender side 110 later uploads the content to the receiver side 130, embodiments of the sender optimizer 120 intercept the upload to see if the content was previously downloaded from the receiver side 130 (i.e., the content is presumed to be stored in the receiver dictionary 144 according to the receiver dictionary model 124). If the content is determined to be previously downloaded content, a highly compressed version of the content may be uploaded to the receiver side 130. Notably, this technique may allow significant reductions in return-link resource usage for file sharing traffic and/or other upload-after-download traffic, even where there is a very small amount of storage capacity accessible by the sender optimizer 120 (e.g., enough to store only a receiver dictionary model 124).

It will be appreciated that the limited storage capacity at the sender optimizer 120 may be considered differently in different embodiments. In one embodiment, the sender optimizer 120 is implemented within a network device (e.g., a user modem) having minimal storage capacity. In another embodiment, the sender optimizer 120 is configured to operate in different operating modes, where at least one operating mode is configured to use minimal storage capacity. For example, the sender optimizer 120 may operate either in a normal mode that stores dictionary entries for certain types of traffic or in a file sharing mode (when file sharing traffic is detected) that only stores digests without storing the actual file sharing content.

Embodiments of the sender optimizer 120 implement certain functionality described herein when file sharing or similar types of content are detected (e.g., resulting in switching into a file sharing mode, as described above). In some embodiments, the detection involves determining that traffic intercepted during a download is likely to be uploaded at some later time. The determination may account for certain tags or protocols in the metadata, which application is downloading the data, which ports are carrying the traffic, etc. For example, file sharing data may be assumed to have a high probability of upload after download, while Internet-protocol television (IPTV) or voice-over-Internet-protocol (VoIP) content may carry a low probability of upload after download. As used herein, “file sharing” connotes traffic and associated environments in which a downloader of content becomes a provider (e.g., a server) of the content. For example, peer-to-peer and other types of file sharing applications may allow a downloader to become a server in the context of particular traffic.
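A detection heuristic of this kind might look like the following sketch; the specific ports, application names, and content types are illustrative assumptions, not values given in the disclosure.

```python
# Hypothetical heuristic for flagging traffic likely to be uploaded after
# download (i.e., "file sharing" traffic as defined above).
FILE_SHARING_PORTS = {6881, 6882, 6889}                  # example peer-to-peer ports (illustrative)
LOW_REUPLOAD_CONTENT = {"video/iptv", "audio/voip"}      # illustrative content types


def is_upload_after_download_candidate(metadata: dict) -> bool:
    """Return True if the downloaded traffic is likely to be uploaded later."""
    if metadata.get("port") in FILE_SHARING_PORTS:
        return True
    if metadata.get("application", "").lower() in {"bittorrent", "p2p-client"}:
        return True
    if metadata.get("content_type") in LOW_REUPLOAD_CONTENT:
        return False   # IPTV/VoIP-like traffic is unlikely to be re-uploaded
    return False
```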

It is worth noting that many file sharing applications fragment files for communication. For example, some programs allow clients to download a content file in parallel from multiple sources (e.g., other peers on the network) by receiving fragments of the file from each source. As discussed more fully below, embodiments generate identifiers (e.g., digests) at the data block level, rather than at the full-file level. In this way, optimization opportunities may be identified even from file fragments, and even when fragments are received asynchronously, out of order, etc.

It is worth noting that the storage capacity of the sender optimizer 120, as discussed above, may be distinct from other storage capacity at the sender side 110 of the communications system 100. For example, there may be a user machine 114 at one or both sides of the communications system 100. The user machine 114 may broadly include any type of machine through which a user may interact with content over the communications system 100. For example, the user machine 114 may include customer premises equipment (CPE), such as computers, televisions, etc. Further, as illustrated, the user machines 114 may have access to their own respective machine storage 118. The machine storage 118 may include hard-disk space, application storage, cache capacity, etc.

Notably, the optimizers at each side of the communications system 100 may or may not have access to the respective machine storage 118. For example, embodiments of the sender optimizer 120 may typically have little or no access to the machine storage 118. In some embodiments, the sender optimizer 120 is an independent (e.g., transparent) network component that does not have access to the machine storage 118. In other embodiments, it is inefficient or impractical for the sender optimizer 120 to access machine storage 118 for various optimization processes. For example, access to the machine storage 118 may be too slow to provide desirable optimization benefits. As such, embodiments of the sender optimizer 120 are described as having limited storage capacity (e.g., or operating in a mode with limited storage capacity) even where other storage capacity is available at the sender side 110 of the communications system 100.

While the communications system 100 of FIG. 1 is illustrated generically as a sender side 110 and receiver side 130, some typical embodiments operate in a client-server context. FIG. 2A shows a simplified block diagram of one embodiment of a client-server communications system 200a for use with various embodiments. The communications system 200a facilitates communications between a user system 210 and a server system 320 via a client optimizer 220 and a server optimizer 230. The client optimizer 220 and the server optimizer 230 are configured to effectively provide an optimizer tunnel 205 between the user system 210 and the server system 320, including providing certain communications functionality. Notably, client and server are used herein to clarify particular sides of the communications system, and are not intended to limit the respective roles, functions, direction of communications, etc. For example, in a peer-to-peer context, users may act as both clients and servers in file sharing transactions.

In an illustrative file sharing transaction, the client optimizer 220 and the server optimizer 230 implement functionality of the sender optimizer 120 and the receiver optimizer 140 of FIG. 1, respectively. For example, a user downloads content from a content server 250 over a network 240 through the user system 210. Embodiments of the user system 210 may include any component or components for providing a user with network interactivity. For example, the user system 210 may include any type of computational device, network interface device, communications device, or other device for communicating data to and from the user. Typically, the communications system 200a facilitates communications between multiple user systems 210 and a variety of content servers 250 over one or more networks 240 (only one of each is shown in FIG. 2A for the sake of clarity). The content servers 250 are in communication with the server optimizer 230 via one or more networks 240. The network 240 may be any type of network 240 and can include, for example, the Internet, an Internet protocol (“IP”) network, an intranet, a wide-area network (“WAN”), a local-area network (“LAN”), a virtual private network (“VPN”), the Public Switched Telephone Network (“PSTN”), and/or any other type of network 240 supporting data communication between devices described herein, in different embodiments. The network 240 may also include both wired and wireless connections, including optical links.

As used herein, “content servers” is intended broadly to include any source of content in which the users may be interested. For example, a content server 250 may provide website content, television content, file sharing, multimedia serving, voice-over-Internet-protocol (VoIP) handling, and/or any other useful content. It is worth noting that, in some embodiments, the content servers 250 are in direct communication with the server optimizer 230 (e.g., not through the network 240). For example, the server optimizer 230 may be located in a gateway that includes a content or application server. As such, discussions of embodiments herein with respect to communications with content servers 250 over the network 240 are intended only to be illustrative, and should not be construed as limiting.

As described below, the server optimizer 230 may be part of a server system 320 that includes components for server-side communications (e.g., base stations, gateways, satellite modem termination systems (SMTSs), digital subscriber line access multiplexers (DSLAMs), etc., as described below with reference to FIG. 3). The server optimizer 230 may act as a transparent and/or intercepting proxy. For example, the client optimizer 220 is in communication with the server optimizer 230 over a client-server communication link 225, and the server optimizer 230 is in communication with the content server 250 over a content network link 235. The server optimizer 230 may act as a transparent man-in-the-middle to intercept the data as it passes between the client-server communication link 225 and the content network link 235. Further, embodiments of the server optimizer 230 maintain a server dictionary 234 (e.g., like the receiver dictionary 144 of FIG. 1) including byte sequences of some or all of the traffic previously seen by the server optimizer 230.

For example, when the user system 210 downloads content from the content server 250, the server optimizer 230 may intercept the content and store blocks of content data in the server dictionary 234. The content may then be sent (e.g., over the client-server communication link 225) to the user system 210 in response to the user's request for the content. The client optimizer 220 intercepts the traffic at the client side of the optimizer tunnel 205 and generates a digest of the content, as described above. The digest is stored in a server dictionary model 224. In some embodiments, additional data (e.g., fingerprints) are generated to facilitate efficient searches for the digests in the server dictionary model 224. For example, the digest may be a strong identifier that can reliably represent an identical data block stored at the server dictionary 234, and a weak identifier (e.g., a hash) may be generated for quickly finding matching candidates among a large set of digests.
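One plausible way to pair a weak identifier with strong digests for fast candidate lookup is sketched below; the choice of Adler-32 as the weak identifier and SHA-256 as the strong identifier is an assumption made only for this example (the disclosure mentions MD5 as one option for strong identifiers).

```python
import hashlib
import zlib
from collections import defaultdict


class ServerDictionaryModel:
    """Maps a weak identifier to the set of strong digests that share it.

    The weak identifier narrows the search; the strong digest confirms a match.
    """

    def __init__(self):
        self._index = defaultdict(set)  # weak id -> {strong digests}

    @staticmethod
    def weak_id(block: bytes) -> int:
        return zlib.adler32(block)      # cheap checksum; illustrative choice

    @staticmethod
    def strong_id(block: bytes) -> str:
        return hashlib.sha256(block).hexdigest()

    def record_download(self, block: bytes) -> None:
        self._index[self.weak_id(block)].add(self.strong_id(block))

    def match(self, block: bytes):
        # Candidates are found via the weak id, then verified with the strong digest.
        candidates = self._index.get(self.weak_id(block), set())
        digest = self.strong_id(block)
        return digest if digest in candidates else None
```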

In the event that the content is later uploaded to the communications system 200a, the client optimizer 220 may intercept the upload (e.g., the request may be directed or redirected to the client optimizer 220) and look for a match in the server dictionary model 224, indicating presumptive existence of the upload content on the server dictionary 234. If a match is found, a highly compressed version of the content may be communicated to the server system 320 over the client-server communication link 225. For example, the highly compressed version may use the matching digests or other identifiers (e.g., block IDs) from the server dictionary model 224 as indexes to recreate the content at the server side from byte sequences stored in the server dictionary 234.
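Assuming a model object like the one sketched above, the compressed upload might be built roughly as follows, with matched blocks replaced by digest references and unmatched blocks carried as literals; the payload format is hypothetical.

```python
def compress_upload(blocks, match):
    """Build an upload payload from an iterable of byte blocks.

    `match(block)` returns the strong digest if the block is presumed present
    in the server dictionary (e.g., ServerDictionaryModel.match from the
    earlier sketch), or None otherwise.
    """
    payload = []
    for block in blocks:
        digest = match(block)
        if digest is not None:
            payload.append({"type": "ref", "digest": digest})   # reference only
        else:
            payload.append({"type": "raw", "data": block})      # literal bytes
    return payload
```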

It is worth noting that the upload may not be ultimately destined for the server system 320. For example, in a peer-to-peer context, the upload may actually be from one user system 210 to another user system 210. While the communications system 200a illustrated in FIG. 2A shows only one optimizer tunnel 205 between one server system 320 and one user system 210, embodiments typically operate in the context of, and take advantage of, optimization among multiple user systems 210. FIG. 2B shows a simplified block diagram of an embodiment of a communications system 200b having multiple user systems 210 for use with various embodiments. The communications system 200b facilitates communications between a server system 320 and multiple user systems 210, via a respective server optimizer 230 and at least one client optimizer 220.

As described above with reference to FIG. 2A, a first user system 210a may desire to upload content after a previous download of the content from the server system 320. Using the client optimizer 220, the server optimizer 230, the server dictionary 234, and the server dictionary model 224, return-link bandwidth may be optimized for this scenario. Notably, the optimized return-link bandwidth may refer to the return link between the first user system 210a and the server system 320, regardless of the ultimate destination of the upload content. For example, the return link may be optimized even where the ultimate destination of the content is the second user system 210n, such that the content is further communicated from the server system 320 to other nodes of the communications system 200b.

Further, it is worth noting that embodiments may optimize the return-link bandwidth, regardless of whether the ultimate destination terminal includes optimization functionality. For example, some embodiments of the second user system 210n include a second client optimizer 220n that is in communication with the server optimizer 230 and maintains its own server dictionary model 224n. In other embodiments, however, the second client optimizer 220n may be any receiving node anywhere on the network, even one having no client optimizer 220n and/or no server dictionary model 224n. For example, the return-link optimization may be effectuated between the first user system 210a and the server system 320 via their respective client optimizer 220 and server optimizer 230, even where the destination for the traffic is some node of the network other than the server system 320.

FIGS. 1, 2A, and 2B illustrate various types of communications systems for use with embodiments of the invention using generic component designations. It will be appreciated that these components may be implemented in various nodes of various types and topologies of communications systems. For example, the communications systems may include cable communications systems, satellite communications systems, digital subscriber line (DSL) communications systems, local area networks (LANs), wide area networks (WANs), etc. Further, the links of the communications systems may include wired and/or wireless links, Ethernet links, coaxial cable links, fiber-optic links, etc. Some embodiments include shared portions of the forward and/or reverse links between nodes (e.g., a shared spot beam in a satellite communications system), while other embodiments include unshared links between nodes (e.g., in an Ethernet network).

In one illustrative example, FIG. 3 shows a block diagram of an embodiment of a satellite communications system 300 having a server system 320 in communication with multiple user systems 210 via a satellite 305 over multiple spot beams 335, according to various embodiments. The server system 320 may include any server components, including base stations 315, gateways 317, etc. A base station 315 is sometimes referred to as a hub or ground station. In certain embodiments, the base station 315 has functionality that is the same or different from a gateway 317. For example, as illustrated, a gateway 317 provides an interface between the network 240 and the satellite 305 via a number of base stations 315. Various embodiments provide different types of interfaces between the gateways 317 and base stations 315. For example, the gateways 317 and base stations 315 may be in communication over leased high-bandwidth lines (e.g., raw Ethernet), a virtual private large-area network service (VPLS), an Internet protocol virtual private network (IP VPN), or any other public or private, wired or wireless network. Embodiments of the server system 320 are in communication with one or more content servers 250 via one or more networks 240.

As traffic traverses the satellite communications system 300 in multiple directions, the gateway 317 may be configured to implement multi-directional communications functionality. For example, the gateway 317 may send data to and receive data from the base stations 315.

Similarly, the gateway 317 may be configured to receive data and information directed to one or more user systems 210, and format the data and information for delivery to the respective destination device via the satellite 305; or receive signals from the satellite 305 (e.g., from one or more user systems 210) directed to a destination in the network 240, and process the received signals for transmission through the network 240.

In various embodiments, one or more of the satellite links are capable of communicating using one or more communication schemes. In various embodiments, the communication schemes may be the same or different for different links. The communication schemes may include different types of coding and modulation combinations. For example, various satellite links may communicate using physical layer transmission modulation and coding techniques using adaptive coding and modulation schemes, etc. The communication schemes may also use one or more different types of multiplexing schemes, including Multi-Frequency Time-Division Multiple Access (“MF-TDMA”), Time-Division Multiple Access (“TDMA”), Frequency Division Multiple Access (“FDMA”), Orthogonal Frequency Division Multiple Access (“OFDMA”), Code Division Multiple Access (“CDMA”), or any number of other schemes.

The satellite 305 may operate in a multi-beam mode, transmitting a number of spot beams 335, each directed at a different region of the earth. Each spot beam 335 may be associated with one of the user links, and used to communicate between the satellite 305 and a large group (e.g., thousands) of user systems 210 (e.g., user terminals 330 within the user systems 210). The signals transmitted from the satellite 305 may be received by one or more user systems 210, via a respective user antenna 325. In some embodiments, some or all of the user systems 210 include one or more user terminals 330 and one or more CPE devices 360. User terminals 330 may include modems, satellite modems, routers, or any other useful components for handling the user-side communications. Reference to “users” should be construed generally to include any user (e.g., subscriber, consumer, customer, etc.) of services provided over the satellite communications system 300 (e.g., by or through the server system 320).

In a given spot beam 335, some or all of the users (e.g., user systems 210) serviced by the spot beam 335 may be capable of receiving all the content traversing the spot beam 335 by virtue of the fact that the satellite communications system 300 employs wireless communications via various antennae (e.g., 310 and 325). However, some of the content may not be intended for receipt by certain customers. As such, the satellite communications system 300 may use various techniques to “direct” content to a user or group of users. For example, the content may be tagged (e.g., using packet header information according to a transmission protocol) with a certain destination identifier (e.g., an IP address), use different modcode points that can be reliably received only by certain user terminals 330, send control information to user systems 210 to direct the user systems 210 to ignore or accept certain communications, etc. Each user system 210 may then be adapted to handle the received data accordingly. For example, content destined for a particular user system 210 may be passed on to its respective CPE 360, while content not destined for the user system 210 may be ignored. In some cases, the user system 210 caches information not destined for the associated CPE 360 for use if the information is later found to be useful in avoiding traffic over the satellite link, as described in more detail below.

Embodiments of the server system 320 and/or the user system 210 include an accelerator module and/or other processing components. In one embodiment, real-time types of data (e.g., User Datagram Protocol (“UDP”) data traffic, like Internet-protocol television (“IPTV”) programming) bypass the accelerator module, while non-real-time types of data (e.g., Transmission Control Protocol (“TCP”) data traffic, like web video) are routed through the accelerator module for processing. Embodiments of the accelerator module provide various types of applications, WAN/LAN, and/or other acceleration functionality.

In some embodiments, the accelerator module is adapted to provide high payload compression. This allows faster transfer of the data and enhances the effective capacity of the network. The accelerator module can also implement protocol-specific methods to reduce the number of round trips needed to complete a transaction, such as by prefetching objects embedded in HTTP pages. In other embodiments, functionality of the accelerator module is closely integrated with the satellite link through other modules, including the client optimizer 220 and/or the server optimizer 230.

As discussed above, the satellite communications system 300 may be configured to implement various optimization functions through client-server interactions, implemented by the client optimizer 220 and the server optimizer 230. The server optimizer 230 may be configured to maintain a server dictionary and the client optimizer 220 may be configured to maintain a model of the server dictionary. Embodiments of the client optimizers 220 and server optimizer 230 may act to create a virtual tunnel between the user systems 210 and the content servers 250 or the server system 320, as described with reference to FIGS. 2A and 2B. In a topology like the satellite communications system 300 shown in FIG. 3, vast amounts of traffic may traverse various portions of the satellite communications system 300 at any given time. The optimizer functionality may help relieve the satellite communications system 300 from traffic burdens relating to file sharing and similar transactions (e.g., by optimizing return-link resources). This and other functionality of the client optimizer 220 and the server optimizer 230 are described more fully with reference to FIG. 4.

FIG. 4 shows a block diagram of an embodiment of a communications system 400, illustrating client-server interactivity through a client optimizer 220 and a server optimizer 230, according to various embodiments. In some embodiments, the communications system 400 is an embodiment of the communications system 200a of FIG. 2A or the satellite communications system 300 of FIG. 3. As shown, the communications system 400 facilitates communications between a user system 210 and one or more content servers 250 via at least one client-server communication link 225. For example, interactions between the client optimizer 220 and the server optimizer 230 effectively create an optimizer tunnel 205 between the user system 210 and the content servers 250. In some embodiments, the server system 320 is in communication with the content servers 250 via one or more networks 240, like the Internet.

In some embodiments, the user system 210 includes a client graphical user interface (GUI) 410, a web browser 406, and a redirector 408. The client GUI 410 may allow a user to configure performance aspects of the user system 210 (e.g., or even aspects of the greater communications system 400 in some cases). For example, the user may adjust compression parameters and/or algorithms, alter content filters (e.g., for blocking illicit websites), or enable or disable various features used by the communications system 400. In one embodiment, some of the features may include network diagnostics, error reporting, as well as controlling, for example, components of the client optimizer 220 and/or the server optimizer 230.

In one embodiment, the user selects a uniform resource locator (URL) address through the client GUI 410 which directs the web browser 406 (e.g., Internet Explorer®, Firefox®, Netscape Navigator®, etc.) to a website (e.g., cnn.com, google.com, yahoo.com, etc.). The web browser 406 may then issue a request for the website and associated objects to the Internet. It is worth noting that the web browser 406 is shown for illustrative purposes only. While embodiments of the user system 210 may typically include at least one web browser 406, user systems 210 may interact with content servers 250 in a number of different ways without departing from the scope of the invention (e.g., through downloader applications, file sharing applications, applets, etc.).

The content request from the user system 210 (e.g., download request from the web browser 406) may be intercepted by the redirector 408. It is worth noting that embodiments of the redirector 408 are implemented in various ways. For example, embodiments of the redirector 408 are implemented within a user modem as part of the modem's internal routing functionality. The redirector 408 may send the request to the client optimizer 220. It is worth noting that the client optimizer 220 is shown as separate from the user machine 214 (e.g., in communication over a local bus, on a separate computer system connected to the user system 210 via a high speed/low latency link, like a branch office LAN subnet, etc.). However, embodiments of the client optimizer 220 are implemented as part of any component of the user system 210 in any useful client-side location, including as part of a user terminal, as part of a user modem, as part of a hub, as a separate hardware component, as a software application on the user machine 214, etc.

In some embodiments, the client optimizer 220 includes a request manager 416. The request manager 416 may be configured to perform a number of different processing functions, including Java parsing and protocol processing. Embodiments of the request manager 416 may process hypertext transfer protocol (HTTP), file transfer protocol (FTP), various media protocols, metadata, header information, and/or other relevant information from the request data (e.g., packets) to allow the client optimizer 220 to perform its optimizer functions. For example, the request may be processed by the request manager 416 as part of identifying opportunities for optimizing return-link resources for previously downloaded content.

The request manager 416 may forward the request to a request encoder 418. Embodiments of the request encoder 418 encode the request using one of many possible data compression or similar types of algorithms. For example, strong identifiers and/or weak identifiers may be generated using dictionary coding techniques, including hashes, checksums, fingerprints, signatures, etc. As described below, these identifiers may be used to identify digests in a server dictionary model 224 indicating matching data blocks in a server dictionary 234 in, or in communication with, the server optimizer 230.

In some embodiments, the request manager 416 and/or the request encoder 418 process the request content differently, depending on the type of data included in the request. For example, the content portion (e.g., byte-level data) of the data may be processed according to metadata. Some types of schema-specific coding are described in U.S. Provisional Patent Application No. 61/231,265, entitled “METHODS AND SYSTEMS FOR INTEGRATING DELTA CODING WITH SCHEMA SPECIFIC CODING” (026841-002300US), filed on Aug. 4, 2009, which is incorporated herein by reference in its entirety for all purposes.

In some embodiments, the request may be forwarded to a transport manager 428a. In one embodiment, the transport manager 428a implements Intelligent Compression Technology's® (“ICT”) transport protocol (“ITP”). Nonetheless, other protocols may be used, such as the standard transmission control protocol (“TCP”). In one embodiment, ITP maintains a persistent connection with the server system 320 via its server optimizer 230. The persistent connection between the client optimizer 220 and the server optimizer 230 may enable the communications system 400 to eliminate or reduce inefficiencies and overhead costs associated with creating a new connection for each request.

In one embodiment, the encoded request is forwarded from the transport manager 428a in the client optimizer 220 to a transport manager 428b in the server optimizer 230, and then to a request decoder 436. The request decoder 436 may use a decoder which is appropriate for the encoding performed by the request encoder 418. The request decoder 436 may then transmit the decoded request to a content processor 442 configured to communicate the request to an appropriate content source. For example, the content processor 442 may communicate with a content server 250 over a network 240. Of course, other types of content sources are possible. For example, some or all of the data blocks that make up the requested content may be available in the server dictionary 234. As discussed above, embodiments of the server dictionary 234 include indexed blocks of content data (e.g., byte sequences).

In response to the request, response data may be received by the content processor 442. For example, the response data may be retrieved from an appropriate content server 250, from the server dictionary 234, etc. The response data may include various types of information, such as one or more attachments (e.g., media files, text files, etc.), references to “in-line” objects needed to render a web page, etc. Embodiments of the content processor 442 may be configured to interpret the response data, which may, for example, be received as HTML, XML, CSS, JavaScript, or other types of data. In some embodiments, when response data is received, the content processor 442 checks the server dictionary 234 to determine whether the content is already stored by the server optimizer 230. If not, the content may be stored to the server dictionary 234.
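The check-then-store behavior described above might be sketched as follows, assuming the server dictionary is a simple digest-to-block map; the helper name and digest choice are illustrative.

```python
import hashlib


def store_response_blocks(response_blocks, server_dictionary: dict) -> None:
    """Store downloaded content blocks in the server dictionary if not already present.

    `server_dictionary` maps strong digests to raw block bytes (an assumed layout).
    """
    for block in response_blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in server_dictionary:   # already-stored blocks are skipped
            server_dictionary[digest] = block
```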

In some embodiments, the response received at the content processor 442 is parsed by a response parser 444 and/or encoded by a response encoder 440. The response data may then be communicated back to the user system 210 via the transport managers 428 and the client-server communication link 225. After the response data is received at the client optimizer 220 by its transport manager 428a, the response data is forwarded to a response manager 424 for client-side processing.

Embodiments of the response manager 424 generate a strong identifier (e.g., a digest) of the response data for storage in the server dictionary model 224. For example, certain embodiments assume that response data is stored in the server dictionary 234 (i.e., the response data was stored either prior to or upon receipt by the content processor 442 in the server optimizer 230). As such, it may be assumed by embodiments of the client optimizer 220 that the server dictionary model 224 is, in fact, a model of the server dictionary 234 without requiring any explicit messages from the server optimizer 230 to that effect. It is worth noting that, in some embodiments, synchronization techniques are used to ensure that the server dictionary model 224 remains an accurate model of the server dictionary 234. For example, the server optimizer 230 may desire to remove a data block from its server dictionary 234. The server optimizer 230 may notify the client optimizer 220 that it is ready to remove the data block, wait for a notification back from the client optimizer 220 confirming deletion of the data block from the server dictionary model 224, and then remove the data block from the server dictionary 234.
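The removal handshake described above might be expressed roughly as follows; the message names and function signatures are assumptions for illustration.

```python
# Hypothetical removal handshake keeping the client-side model consistent with
# the server dictionary: the server only deletes a block after the client has
# confirmed that the corresponding identifier was dropped from its model.

def server_request_removal(digest, send_to_client):
    # Step 1: the server optimizer announces its intent to remove a block.
    # Removal is deferred until a confirmation arrives (see step 3).
    send_to_client({"msg": "remove-intent", "digest": digest})


def client_handle_remove_intent(message, model_digests, send_to_server):
    # Step 2: the client optimizer drops the identifier from its model and confirms.
    model_digests.discard(message["digest"])
    send_to_server({"msg": "remove-confirm", "digest": message["digest"]})


def server_handle_remove_confirm(message, server_dictionary):
    # Step 3: only now is the block actually deleted from the server dictionary.
    server_dictionary.pop(message["digest"], None)
```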

In certain embodiments, the response manager 424 may further generate a weak identifier (e.g., a checksum, a hash, etc.). The weak identifier may be used to quickly find strong identifier entries in the server dictionary model 224, as described more fully below. Once the server dictionary model 224 is updated, the response manager 424 may forward the response data to the user machine 214 (e.g., via its redirector 408).

At some later time, a user desires to upload the same content that was previously downloaded (referred to as “upload-after-download”). For example, the upload-after-download may occur as part of a file sharing transaction. It is worth noting that the upload-after-download content may be either identical to or different from the originally downloaded content; and where the upload-after-download content is different, it may differ in varying degrees. For example, a user may download a document, modify the document, and re-upload the modified document. Depending on the amount of modification, the upload-after-download content (i.e., the re-uploaded, modified document) may be slightly or significantly different (e.g., at the byte level) from the downloaded version of the document.

When the user uploads the content in the upload-after-download context, the upload request may be intercepted by the redirector 408 and sent to the client optimizer 220. The request manager 416 in the client optimizer 220 parses the upload request to find any object or other content data that should be evaluated for optimization. The parsed data may then be encoded by the request encoder 418 to generate one or more identifiers associated with the content requested for upload.

In some embodiments, the request encoder 418 generates a weak identifier (e.g., by applying a hashing function). The weak identifier is then used to quickly find candidate matches for the content among the digests stored in the server dictionary model 224. As noted above, when matches are found, embodiments of the client optimizer 220 assume that the content (or the data blocks needed to decompress a compressed version of the content) is presently stored in the server dictionary 234. If matches are found, the matching digests may be used to generate a highly compressed version of the upload content. The highly compressed version of the upload content may then be uploaded to the server system 320 for decompression and/or further processing.

It is worth noting that strong and weak identifiers, as used herein, may be generated in different ways according to different functions. In some embodiments, received data blocks are of variable size. The boundaries of the blocks are established, for example, by a function that operates on N bytes. Each time the output of this function has a particular value, a boundary is established. If the value of the function does not match the particular value, the block position may be advanced by one byte and a new function output may be calculated. For computational efficiency, certain embodiments of the function include a rolling checksum or other algorithm that allows the function value to be adjusted as a new byte is added and an old byte is removed from the set of N bytes used to compute the function. This approach may allow the same block boundaries to be established even when starting at different points in a stream (e.g., a session stream).

The boundary points may delimit blocks of variable sizes, and a strong identifier can then be calculated on each block delimited in this way (e.g., using a Message-Digest algorithm 5 (MD5) technique, or other technique). When a boundary point is reached, the strong identifier of the completed block can be compared against the identifiers in the server dictionary model 224 to see if the new block matches data in the server dictionary 234. Other techniques for delimiting blocks and identifying matches with previous blocks are possible.

In one illustrative embodiment, a byte sequence is received as a stream of data. For each N bytes, a rolling checksum is calculated, for example, according to the equation:

\[ \left( \sum_{i=0}^{N-1} f(x, i) \right) \bmod M \]

According to this equation, “i” is the position of a byte in the sequence, so that i=0 for the first byte in the block and i=N−1 for the last byte in the block. Also according to the equation, “x” is the value of the byte at position i, which may, for example, be in the range 0-255. Further according to the equation, “f(x,i)” is a function applied to each entry. For example, the function may use x as an index into an array of prime values “P,” which may be multiplied by the local offset i, so that f(x,i)=P[x]*i. And further, according to the equation, modulo arithmetic can be applied to the total, so that the number of possible output values is the modulo value. Adjusting the modulus may then adjust the average size of the output blocks, as it sets the probability that a match with the special value S (e.g., the “particular value” discussed above) will occur at any point, where S is any value between 0 and M−1. Each time the sum equals the special value S, a boundary point is established in the incoming stream. Each pair of boundary points may define a dictionary block, and a strong identifier is calculated on each such block. The rolling checksum function is applied to every block of N bytes in the incoming stream.
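A direct, unoptimized reading of this boundary-detection scheme is sketched below. The window size N, modulus M, special value S, and prime table are illustrative choices, and the checksum is recomputed at each position for clarity; a production implementation would roll the sum incrementally as described above.

```python
import hashlib

# Illustrative parameters; the disclosure does not fix specific values.
N = 16         # bytes per checksum window
M = 1 << 13    # modulus; a larger M yields larger average block sizes
S = 0          # "special value" that marks a block boundary


def _first_primes(count):
    # Build a table of prime values, one per possible byte value x (0-255).
    primes = []
    n = 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


P = _first_primes(256)


def block_digests(data: bytes):
    """Split `data` at checksum-defined boundaries; return (block, MD5 digest) pairs."""
    boundaries = []
    for end in range(N, len(data) + 1):
        window = data[end - N:end]
        total = sum(P[x] * i for i, x in enumerate(window)) % M   # f(x, i) = P[x] * i
        if total == S:
            boundaries.append(end)
    boundaries.append(len(data))   # close the final partial block, if any

    blocks = []
    start = 0
    for end in boundaries:
        if end > start:
            block = data[start:end]
            blocks.append((block, hashlib.md5(block).hexdigest()))
            start = end
    return blocks
```

Because the boundary decision depends only on the last N bytes, the same boundaries tend to be found even when the same content is encountered starting from different offsets in a stream.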

In one example, a user engages in file sharing by downloading a one-Megabyte content file and then becoming a source (e.g., a server) for that content file, uploading the file multiple times. Without return-link optimization, the entire one-Megabyte of file data may be re-uploaded with each upload request. Using the client optimizer 220, however, the return-link bandwidth usage may be minimized. For example, the digests in the server dictionary model 224 may provide 10,000-to-1 compression, such that the one-Megabyte file can be compressed into only one-hundred bytes of digest data. As such, even multiple upload requests may be compressed into only hundreds or thousands of bytes of total bandwidth usage on the return link.

Notably, as the optimization occurs on the return link from the client optimizer 220 to the server optimizer 230, the optimization may be unaffected by destinations for the upload content beyond the server system 320. For example, the upload from the user system 210 via the client optimizer 220 may be destined for another user system 210 in communication with the server system 320. As discussed above, the optimization may be unaffected by a presence or absence of a client optimizer 220 at the destination user system 210.

Further, it is worth noting that the digest-based optimization may provide optimization benefits (e.g., compression), even where portions of a content file have changed. For example, in a collaborative media editing environment, revisions of large media files may be sent back and forth among a number of users. When each revision upload is intercepted by a client optimizer 220, the server dictionary model 224 may include digests for the unchanged data blocks. As such, those unchanged blocks may still be sent in highly compressed form, while the changes are sent in uncompressed form (e.g., or, at least, not compressed according to the server dictionary model 224). Upon receipt at the server optimizer 230, the server dictionary 234 may then be used to decompress the compressed blocks of the upload and/or updated with the uncompressed revision data.
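On the server side, the decompress-and-update step described above might look roughly like this, assuming the reference/literal payload format from the earlier compression sketch; the dictionary layout is again a simple digest-to-block map.

```python
import hashlib


def decompress_upload(payload, server_dictionary: dict) -> bytes:
    """Rebuild uploaded content from a mixed stream of block references and literals.

    Referenced blocks are looked up in the server dictionary; literal (e.g.,
    revised) blocks are used as-is and added to the dictionary for future matches.
    """
    out = []
    for item in payload:
        if item["type"] == "ref":
            out.append(server_dictionary[item["digest"]])
        else:
            block = item["data"]
            digest = hashlib.sha256(block).hexdigest()
            server_dictionary.setdefault(digest, block)   # learn the revised data
            out.append(block)
    return b"".join(out)
```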

The communications system 400 illustrated in FIG. 4 shows a client optimizer 220 having storage only for a server dictionary model 224. Embodiments of the client optimizer 220 shown in FIG. 4 may have no access (e.g., or no practical or efficient access) to other storage capacity. For example, the client optimizer 220 may not be authorized to access the machine storage 218 and/or may not have additional capacity of its own (e.g., for storage of its own dictionary or for a cache). However, in other embodiments, the client optimizer 220 has additional capacity, which it may manage according to whether return-link optimization is desired.

FIG. 5 shows a block diagram of an embodiment of a client optimizer 220 having additional storage capacity and mode selection, according to various embodiments. As in the client optimizer 220 of FIG. 4, the client optimizer 220a of FIG. 5 includes a request manager 416, a request encoder 418, a response manager 424, a transport manager 428, and a server dictionary model 224. However, the client optimizer 220a of FIG. 5 also includes a client dictionary 524, a mode selector 520, and a file sharing detector 510. Embodiments of the client optimizer 220a operate in a “file sharing” operating mode when file sharing content (e.g., or any content deemed a likely upload-after-download candidate) is detected and in a “normal” mode for other types of traffic, as described below.

When the user downloads content from the server system 320, the content is received via the transport manager 428 by the client optimizer 220. The received content is evaluated by the file sharing detector 510 to determine whether the content includes file sharing content. As discussed above, “file sharing” content is used herein to describe any traffic having a probability of being uploaded after download. This determination can be made in a number of ways. For example, metadata may be evaluated to look for certain file sharing protocols, certain types of content (e.g., file types) may be deemed more likely to be re-uploaded, patterns of use may be evaluated to find upload-after-download candidates, etc.

The determination of the file sharing detector 510 may be used to set the operating mode of the client optimizer 220a for handling that content, and the response manager 424 may process the content according to that operating mode. In some embodiments, if the file sharing detector 510 determines that the content includes file sharing content, the mode selector 520 may be set such that the client optimizer 220a processes the content in “file sharing” operating mode. For example, the file sharing content may be processed as described above with reference to FIG. 4. The response manager 424 may generate one or more identifiers (e.g., digests) for storage in the server dictionary model 224, and may pass the content to the user machine 214.

If the file sharing detector 510 determines that there is no file sharing content, the mode selector 520 may be set such that the client optimizer 220a processes the content in “normal” operating mode. According to the normal operating mode, the content may be processed in a number of ways, including using the client dictionary 524 for various types of optimization. In one embodiment, the normal operating mode exploits deltacasting opportunities, as described in U.S. patent application Ser. No. 12/651,909, entitled “DELTACASTING” (017018-019510US), filed on Jan. 4, 2010, which is incorporated herein by reference in its entirety for all purposes. In other embodiments, the normal operating mode configures the client optimizer 220a to implement functionality of delta coders, caches, and/or other types of network components known in the art.

When the user uploads content from the user machine 214, the upload request may be sent to the client optimizer 220. The request manager 416 in the client optimizer 220 processes (e.g., parses) the upload request to find any object or other content data that should be evaluated for optimization. In some embodiments, the parsed data is then encoded by the request encoder 418 to generate one or more identifiers associated with the content requested for upload. The identifiers may then be evaluated against one or both of the server dictionary model 224 and the client dictionary 524 to find and/or exploit matches.

In other embodiments, information obtained from processing the upload request is used by the file sharing detector 510 to determine whether the upload request includes file sharing traffic. The operating mode may then be selected by the mode selector 520 and the upload request may be encoded by the request encoder 418 according to the determination of the file sharing detector 510. For example, if the file sharing detector 510 detects file sharing content, the mode selector 520 may select the “file sharing” operating mode. In this mode, the request encoder 418 may generate a weak identifier for use in finding matching digests in the server dictionary model 224 without any reference to the client dictionary 524. As discussed above, any matches found in the server dictionary model 224 may then be used to compress the upload request, for example, for return-link optimization.
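The upload-path mode selection described above might be expressed roughly as below; the mode names mirror the description, while the helper functions passed in are assumptions standing in for the server dictionary model lookup and whatever client-dictionary coding applies in normal mode.

```python
def encode_upload(content_blocks, is_file_sharing: bool,
                  server_model_match, client_dictionary_match):
    """Encode an upload according to the selected operating mode.

    In "file sharing" mode only the server dictionary model is consulted
    (digest references, no client dictionary). In "normal" mode the client
    dictionary may be used for other optimizations (e.g., delta coding).
    """
    mode = "file sharing" if is_file_sharing else "normal"
    encoded = []
    for block in content_blocks:
        if mode == "file sharing":
            digest = server_model_match(block)
            encoded.append(("ref", digest) if digest else ("raw", block))
        else:
            # Normal mode: defer to whatever client-dictionary-based coding applies.
            encoded.append(client_dictionary_match(block))
    return mode, encoded
```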

It will be appreciated that, while the above descriptions of content transactions focus on requests and responses, these terms are intended to be broadly construed, and embodiments of the invention function within many other contexts. For example, embodiments of the communication system 400 are used to provide interactive Internet services (e.g., access to the world-wide web, email communications, file serving and sharing, etc.), television services (e.g., satellite broadcast television, Internet protocol television (IPTV), on-demand programming, etc.), voice communications (e.g., telephone services, voice-over-Internet-protocol (VoIP) telephony, etc.), networking services (e.g., mesh networking, VPN, VLAN, MPLS, VPLS, etc.), and other communication services. As such, the “response” data discussed above is intended only as an illustrative type of data that may be received by the server optimizer 230 from a content source (e.g., a content server 250). For example, the “response” data may actually be pushed, multicast, or otherwise communicated to the user without an explicit request from the user.

It will be further appreciated that the systems and components described above represent merely some exemplary embodiments, and various methods of the invention can be performed by those and other system embodiments. FIG. 6 shows an illustrative method 600 for performing return-link optimization, according to various embodiments. Particularly, the method 600 illustrates an “upload-after-download” scenario (e.g., a re-upload to the Internet, P2P file sharing of previously downloaded content, etc.). In some embodiments, the method 600 is performed by one or more components of a client optimizer 220, as described above with reference to FIGS. 1-5.

For the sake of added clarity, the method 600 is shown with reference to client-side activities 602 and server-side activities 604, and with reference to illustrative timing on a timeline 605. Of course, certain client-side functions may be performed by server-side components, certain server-side functions may be performed by client-side components, and specific timing of process blocks may be changed without affecting the method 600. Further, it will be appreciated that the timeline 605 is not intended to show any time scale (relative or absolute), and certain process blocks may occur in series, in parallel, or otherwise, according to various embodiments. For at least these reasons, it will be appreciated that these elements of FIG. 6 are intended only for clarity and are not intended to limit the scope of the method 600 in any way.

Some embodiments of the method 600 begin at a first time 610a (shown on timeline 605), when the client side 602 (e.g., a user of a user machine) requests content for download in block 620. At a second time 610b (e.g., after some delay due to latency of a satellite communication link, etc.), the server side 604 receives and processes the request at block 624. At block 628, the server side 604 transmits the requested content to the requesting client side 602 in response to the request. In some embodiments, the server side 604 also determines whether the content represents an optimization candidate at block 636c. For example, the server side 604 may evaluate the response data to determine whether it includes file sharing content. If the traffic is deemed an optimization candidate (or in all cases, for example, where no determination is made at block 636c), the server side 604 stores the response data in a local dictionary at block 630.
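
For example, the server dictionary updated at block 630 might be maintained as a mapping from block digests to block data, as in the following sketch, which assumes SHA-256 digests and fixed 64 KB blocks for illustration only.

    import hashlib

    BLOCK_SIZE = 64 * 1024  # assumed fixed-size blocking; no size is mandated above

    def store_response_in_server_dictionary(response_data, server_dictionary):
        # Block 630: store the response data, block by block, keyed by digest.
        digests = []
        for i in range(0, len(response_data), BLOCK_SIZE):
            block = response_data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            server_dictionary[digest] = block
            digests.append(digest)
        return digests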

At a third time 610c, the client side 602 receives the content at block 632. In some embodiments, the client side 602 determines whether the content represents an optimization candidate at block 636a. Embodiments of the determination may be similar to those made at the server side 604 in block 636c. For example, the client side 602 may evaluate the response data to determine whether it includes file sharing content. If the traffic is deemed an optimization candidate, identifiers (e.g., digests) may be generated at block 640 and used to update a server dictionary model at the client side 602.
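
A corresponding client-side sketch of block 640, under the same illustrative assumptions (SHA-256 digests, Adler-32 weak identifiers, 64 KB blocking), records identifiers in the server dictionary model without retaining the block data.

    import hashlib
    import zlib

    BLOCK_SIZE = 64 * 1024  # assumed to mirror the server-side blocking

    def update_server_dictionary_model(content, server_dictionary_model):
        # Block 640: generate identifiers for the downloaded content and record
        # them in the client-side server dictionary model; the block data itself
        # is not retained by the optimizer.
        for i in range(0, len(content), BLOCK_SIZE):
            block = content[i:i + BLOCK_SIZE]
            weak = zlib.adler32(block)
            strong = hashlib.sha256(block).hexdigest()
            server_dictionary_model.setdefault(weak, set()).add(strong)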

Sometime later, at a fourth time 610d, the client side 602 makes a request at block 644 that involves upload of the content received in block 632. For example, the client side 602 desires to re-upload the content to another location on the Internet, share the content with another user via the communication system, etc. In some embodiments, at a fifth time 610e, the upload request is intercepted at block 636b and a determination is made (e.g., as in block 636a) as to whether the upload request includes content relating to an optimization candidate (e.g., file sharing content).

If the upload request includes optimizable content, according to the determination of block 636b, an identifier may be generated at block 648. For example, a weak identifier may be generated by applying a hashing function to the upload content data. At block 652, the identifier is used to find any candidate matches among the digests stored in the server dictionary model. If matches are not found, the content data may be uploaded at block 656a. If matches are found, the matching digests may be used to generate and upload a highly compressed version of the upload content at block 656b.
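
Blocks 648 through 656b might, for example, be sketched as follows; the hashing and blocking choices remain illustrative assumptions, and the reference-or-literal list merely stands in for whatever wire format a given embodiment uses.

    import hashlib
    import zlib

    BLOCK_SIZE = 64 * 1024  # assumed blocking, as in the earlier sketches

    def encode_for_upload(content, server_dictionary_model):
        # Blocks 648-656: per block, compute a weak identifier, look for a matching
        # digest in the server dictionary model, and emit either a digest reference
        # (block 656b) or the literal block data (block 656a).
        encoded = []
        for i in range(0, len(content), BLOCK_SIZE):
            block = content[i:i + BLOCK_SIZE]
            candidates = server_dictionary_model.get(zlib.adler32(block), set())
            strong = hashlib.sha256(block).hexdigest()
            if strong in candidates:
                encoded.append(("ref", strong))  # highly compressed: digest only
            else:
                encoded.append(("lit", block))   # unmatched: raw block data
        return encoded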

At a sixth time 610f (e.g., again after some delay due to latency), the server side 604 may receive and process the uploaded content at block 660. At block 664, the server dictionary may be updated with any blocks not already in the dictionary. For example, if the content is uploaded at block 656a without digest-based compression, or if some of the blocks of the content were uploaded without digest-based compression due to changes in the file, the server dictionary may be updated at block 664.
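
One possible server-side counterpart of blocks 660 and 664, consistent with the illustrative encoding sketched above, reconstructs the content from digest references and adds any literal blocks to the server dictionary.

    import hashlib

    def decode_upload(encoded, server_dictionary):
        # Blocks 660-664: rebuild the uploaded content from digest references and
        # literal blocks, adding any literal blocks to the server dictionary.
        content = []
        for kind, payload in encoded:
            if kind == "ref":
                content.append(server_dictionary[payload])  # payload is a digest
            else:
                digest = hashlib.sha256(payload).hexdigest()
                server_dictionary[digest] = payload         # block 664: learn new block
                content.append(payload)
        return b"".join(content)

In this sketch, decode_upload(encode_for_upload(data, server_dictionary_model), server_dictionary) reproduces the original data whenever the server dictionary model and the server dictionary are consistent with one another.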

In some embodiments, the upload-after-download scenario is part of a peer-to-peer (P2P) file sharing process, or some other process in which the upload is destined for a node of the communications system other than the server side 604. In embodiments of these transactions, the uploaded content may then be communicated to a destination node at block 668. For example, the content may be pushed to a user at the same or another client side 602.

It will be appreciated that various embodiments have been described herein with reference to upload-after-download transactions. However, similar functionality may be used to optimize return-link bandwidth usage in the context of multiple uploads of the same content. FIG. 7 shows an illustrative method 700 for performing return-link optimization for an upload-after-upload transaction, according to various embodiments. As with the method 600 of FIG. 6, the method 700 is shown with reference to client-side activities 702 and server-side activities 704, and with reference to illustrative timing on a timeline 705.

Embodiments of the method 700 begin at a first time 710a (shown on timeline 705), when the client side 702 (e.g., a user of a user machine) requests upload of content in block 720. At a second time 710b, the upload request is intercepted at block 724a and a determination is made as to whether the upload request includes content relating to an optimization candidate (e.g., file sharing content). If so, an identifier (e.g., a digest) may be generated at block 728 and added to the server dictionary model. For example, it may be assumed that the data will be stored in the server dictionary after it is received by the server side 704 as part of the present upload request.

The content may then be uploaded at block 732. It is assumed in the illustrative method 700 that this is the first time the content is being uploaded to the server side 704. At a third time 710c, the uploaded content is received and processed by the server side 704 at block 736. In some embodiments, at block 740, the server dictionary is updated to reflect the uploaded content.
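
A minimal sketch of blocks 728 and 732, again under the illustrative assumptions used above, records identifiers in the server dictionary model while still sending the full content data on this first upload.

    import hashlib
    import zlib

    BLOCK_SIZE = 64 * 1024  # assumed blocking, consistent with the earlier sketches

    def first_upload(content, server_dictionary_model):
        # Block 728: record identifiers in the server dictionary model on the
        # presumption that the server side will store the corresponding blocks
        # once this upload is received (block 740).
        for i in range(0, len(content), BLOCK_SIZE):
            block = content[i:i + BLOCK_SIZE]
            server_dictionary_model.setdefault(zlib.adler32(block), set()).add(
                hashlib.sha256(block).hexdigest())
        return content  # block 732: the full data is still sent on this first upload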

Sometime later, at a fourth time 710d, the client side 702 makes a request at block 744 that involves a second upload of the content previously uploaded in block 732. For example, the client side 702 desires to re-upload the content to another location on the Internet, share the content with another user via the communication system, etc. In some embodiments, at a fifth time 710e, the upload request is intercepted at block 724b and a determination is made (e.g., as in block 724a) as to whether the upload request includes content relating to an optimization candidate (e.g., file sharing content).

If the upload request includes optimizable content, according to the determination of block 724b, an identifier may be generated at block 748. For example, a weak identifier may be generated by applying a hashing function to the upload content data. At block 752, the identifier is used to find any candidate matches among the digests stored in the server dictionary model. If matches are not found, the content data may be uploaded at block 756a. If matches are found, the matching digests may be used to generate and upload a highly compressed version of the upload content at block 756b.

At a sixth time 710f, the server side 704 may receive and process the uploaded content at block 760. At block 764, the server dictionary may be updated with any blocks not already in the dictionary. For example, if the content is uploaded at block 756a without digest-based compression, or if some of the blocks of the content were uploaded without digest-based compression due to changes in the file, the server dictionary may be updated at block 764. In some embodiments, the uploaded content may then be communicated to a destination node other than the server side 704 at block 768.

It is worth noting that blocks 744, 748, 752, 756a, 756b, 760, 764, and 768 of the method 700 of FIG. 7 may be implemented substantially identically to blocks 644, 648, 652, 656a, 656b, 660, 664, and 668 of the method 600 of FIG. 6, respectively. For example, once content has been uploaded to the server (e.g., either after a download, as in FIG. 6, or not, as in FIG. 7), the data may be used to reduce return-link bandwidth on future uploads of the same content. As such, embodiments of systems and methods described herein handle both upload-after-download and upload-after-upload transactions.

The above description is intended to provide various embodiments of the invention, but does not represent an exhaustive list of all embodiments. For example, those of skill in the art will appreciate that various modifications are available within the scope of the invention. Further, while the disclosure includes various sections and headings, the sections and headings are not intended to limit the scope of any embodiment of the invention. Rather, disclosure presented under one heading may inform disclosure presented under a different heading. For example, descriptions of embodiments of method steps for handling overlapping content requests may be used to inform embodiments of methods for handling anticipatory requests.

Specific details are given in the above description to provide a thorough understanding of the embodiments. However, it is understood that the embodiments may be practiced without these specific details. For example, well-known processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments. Implementation of the techniques, blocks, steps, and means described above may be done in various ways. For example, these techniques, blocks, steps, and means may be implemented in hardware, software, or a combination thereof. For a hardware implementation, the processing units may be implemented within one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), soft core processors, hard core processors, controllers, micro-controllers, microprocessors, other electronic units designed to perform the functions described above, and/or a combination thereof. Software can be used instead of or in addition to hardware to perform the techniques, blocks, steps, and means.

Also, it is noted that the embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed, but could have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination corresponds to a return of the function to the calling function or the main function.

Furthermore, embodiments may be implemented by hardware, software, scripting languages, firmware, middleware, microcode, hardware description languages, and/or any combination thereof. When implemented in software, firmware, middleware, scripting language, and/or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine readable medium such as a storage medium. A code segment or machine-executable instruction may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a script, a class, or any combination of instructions, data structures, and/or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, and/or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, etc.

For a firmware and/or software implementation, the methodologies may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. Any machine-readable medium tangibly embodying instructions may be used in implementing the methodologies described herein. For example, software codes may be stored in a memory. Memory may be implemented within the processor or external to the processor. As used herein the term “memory” refers to any type of long term, short term, volatile, nonvolatile, or other storage medium and is not to be limited to any particular type of memory or number of memories, or type of media upon which memory is stored.

Moreover, as disclosed herein, the term “storage medium” may represent one or more memories for storing data, including read only memory (ROM), random access memory (RAM), magnetic RAM, core memory, magnetic disk storage mediums, optical storage mediums, flash memory devices and/or other machine readable mediums for storing information. Similarly, terms like “cache” are intended to broadly include any type of storage, including temporary or persistent storage, queues (e.g., FIFO, LIFO, etc.), buffers (e.g., circular, etc.), etc. The term “machine-readable medium” includes, but is not limited to, portable or fixed storage devices, optical storage devices, wireless channels, and/or various other storage mediums capable of storing, containing, or carrying instruction(s) and/or data.

Further, certain portions of embodiments (e.g., method steps) are described as being implemented “as a function of” other portions of embodiments. This and similar phraseology, as used herein, is intended broadly to include any technique for determining one element partially or completely according to another element. In various embodiments, determinations “as a function of” a factor may be made in any way, so long as the outcome of the determination is at least partially dependent on the factor.

While the principles of the disclosure have been described above in connection with specific apparatuses and methods, it is to be clearly understood that this description is made only by way of example and not as limitation on the scope of the disclosure.

Claims

1. A method for managing return-link resource usage in a communications system, the method comprising:

receiving a first content data block at a client-side device from a server-side device, the server-side device being communicatively coupled with a server dictionary and the client-side device being communicatively coupled with a server dictionary model, the server dictionary model configured to store identifiers associated with data blocks stored on the server dictionary, each identifier having a substantially smaller file size than its associated data block;
calculating a first identifier from the first content data block;
storing the first identifier in the server dictionary model at the client-side device;
removing the first content data block from the client-side device;
subsequent to storing the first identifier, receiving a second content data block at the client-side device for upload to the server-side device;
calculating a second identifier from the second content data block;
determining whether the second identifier matches the first identifier stored in the server dictionary model; and
when the second identifier matches the first identifier, using the first identifier or the second identifier to compress the second content data block into compressed content.

2. The method of claim 1, further comprising:

communicating the compressed content to the server-side device.

3. The method of claim 2, wherein the compressed content is the second identifier.

4. The method of claim 1, wherein the first content data block is removed from the client-side device substantially upon calculating the first identifier.

5. The method of claim 1, further comprising:

determining whether the first content data block comprises file sharing content,
wherein the first content data block is removed from the client-side device only when the content data block comprises file sharing content.

6. The method of claim 5, wherein determining whether the first content data block comprises file sharing content comprises:

determining whether the first content data block is configured according to a file sharing protocol.

7. The method of claim 5, wherein determining whether the first content data block comprises file sharing content comprises:

determining a probability that a substantially identical content data block will be received by the client-side device for upload to the server-side device at a subsequent time.

8. The method of claim 1, further comprising:

storing the first content data block in a client data store configured such that the client-side device has substantially no access to the content data block when stored in the client data store.

9. The method of claim 1, wherein the first identifier and the second identifier are calculated such that a probability of the first identifier and the second identifier matching when the first content data block and the second content data block are not identical is effectively zero.

10. A method for managing return-link resource usage in a communications system, the method comprising:

receiving a first content data block at a client-side device from a server-side device, the server-side device being communicatively coupled with a server dictionary and the client-side device being communicatively coupled with a server dictionary model;
calculating a first identifier from the first content data block, such that the first identifier has a substantially smaller file size than the first content data block;
storing the first identifier in the server dictionary model at the client-side device;
determining whether the first content data block comprises file sharing content;
removing the first content data block from the client-side device when the first content data block comprises file sharing content;
subsequent to storing the first identifier, receiving a second content data block at the client-side device for upload to the server-side device;
calculating a second identifier from the second content data block;
determining whether the second identifier matches the first identifier stored in the server dictionary model; and
when the second identifier matches the first identifier, using the first identifier or the second identifier to compress the second content data block into compressed content.

11. The method of claim 10, further comprising:

communicating the compressed content to the server-side device.

12. The method of claim 10, wherein the first content data block is removed from the client-side device prior to receiving the second content data block at the client-side device.

13. The method of claim 10, wherein determining whether the first content data block comprises file sharing content comprises determining whether the first content data block is configured according to a file sharing protocol.

14. The method of claim 10, wherein determining whether the first content data block comprises file sharing content comprises determining a probability that a substantially identical content data block will be received by the client-side device for upload to the server-side device at a subsequent time.

15. The method of claim 10, wherein the first identifier and the second identifier are calculated such that a probability of the first identifier and the second identifier matching when the first content data block and the second content data block are not identical is effectively zero.

16. A system for managing return-link resource usage in a communications system, the system comprising:

a local dictionary model configured to store identifiers associated with data blocks stored on a remote dictionary, the remote dictionary located at a remote node of the communications system;
a download processor module, configured to: receive a first content data block from a remote device associated with the remote dictionary; store the first content data block in a local store; calculate a first identifier from the first content data block; store the first identifier in the local dictionary model; and remove the first content data block from the local store; and
an upload processor module, configured to: receive a second content data block for upload to the remote device; calculate a second identifier from the second content data block; determine whether the second identifier matches the first identifier stored in the local dictionary model; and when the second identifier matches the first identifier, use the first identifier or the second identifier to compress the second content data block into compressed content.

17. The system of claim 16, further comprising:

a communications module configured to communicate the compressed content to the remote device.

18. The system of claim 16, wherein the download processor module is configured to remove the first content data block from the local store substantially upon calculating the first identifier.

19. The system of claim 16, further comprising:

a file sharing detector, communicatively coupled with the download processor module, and configured to: determine whether the first content data block comprises file sharing content, wherein the download processor module is configured to remove the first content data block from the local store only when the content data block comprises file sharing content.

20. The system of claim 19, wherein the file sharing detector is configured to determine whether the first content data block comprises file sharing content by determining whether the first content data block is configured according to a file sharing protocol.

21. The system of claim 19, wherein the file sharing detector is configured to determine whether the first content data block comprises file sharing content by determining a probability that a substantially identical content data block will be received for upload to the remote device at a subsequent time.

22. The system of claim 16, further comprising:

a client storage module, configured to store the first content data block such that the download processor module has substantially no access to the first content data block when stored in the client storage module.

23. The system of claim 16, wherein the local store is a buffer.

24. The system of claim 16, wherein:

the remote device is a server and the remote dictionary is a server dictionary.
Patent History
Publication number: 20100179984
Type: Application
Filed: Jan 4, 2010
Publication Date: Jul 15, 2010
Applicant: ViaSat, Inc. (Carlsbad, CA)
Inventor: William B. Sebastian (Quincy, MA)
Application Number: 12/651,928
Classifications
Current U.S. Class: Client/server (709/203)
International Classification: G06F 15/16 (20060101);