Security techniques for cooperative file distribution
Security techniques are provided for cooperative file distribution. An encryption key or a nonce (or both) are generated for a package containing one or more files that are to be sent in a cooperative file distribution system. Random access encryption techniques can be employed to encrypt a package containing one or more files to be sent in a cooperative file distribution system. One or more storage proxies are allocated to a package to be transmitted in a cooperative file distribution system, based on load. Access to trackers in the cooperative file distribution system is controlled using security tokens. Content can automatically expire using a defined expiration period when the content is uploaded into the system. Variable announce intervals allow the tracker to control how often the tracker will receive a message, such as an announcement or a heartbeat message, from peers in the system.
The present invention relates generally to communication methods and systems, and more particularly, to cooperative and secure methods and systems for sharing one or more files among users.
BACKGROUND OF THE INVENTION
The providers of popular, large digital files, such as software, music or video files, must keep pace with the ever increasing bandwidth demands for delivering such files. As the popularity of a file increases, a larger number of users request the file and more bandwidth is required to deliver the file. With conventional Hypertext Transfer Protocol (HTTP) file delivery techniques, for example, the bandwidth requirements increase linearly with the number of requesting users, and quickly become prohibitively expensive.
A number of techniques have been proposed or suggested for reducing the bandwidth demands of file delivery on the server, using peer-to-peer content sharing. In a peer-to-peer content sharing model, often referred to as “cooperative distribution,” one or more users that have downloaded a file from the server can share the file with other users. A cooperative distribution model allows a server to deliver large files in a reliable manner that scales with the number of requesting users. Among other benefits, cooperative distribution models exploit the underutilized upstream bandwidth of existing users.
The BitTorrent™ file distribution system, described, for example, in http://www.bittorrent.com/documentation.html, or Bram Cohen, “Incentives Build Robustness in BitTorrent,” http://www.bittorrent.com/bittorrentecon.pdf (May 22, 2003) is an example of a cooperative distribution technique. When multiple users are downloading the same file at the same time using the BitTorrent file distribution system, the various users upload pieces of the file to each other. In other words, a BitTorrent user trades pieces of a file that the user has with other required pieces that other users have until the complete file is obtained. In this manner, the cost of uploading a file is redistributed to the users of the file and the cost of hosting a popular file is more affordable.
While the BitTorrent file distribution system provides an effective mechanism for distributing large files in a cost effective manner, it suffers from a number of limitations, which if overcome, could further improve the utility and efficiency of cooperative file distribution. In particular, if a BitTorrent receiver is offline, then the receiver is unable to obtain files from other users. U.S. patent application Ser. No. 11/096,193, filed Mar. 31, 2005, entitled “Method and Apparatus for Offline Cooperative File Distribution Using Cache Nodes,” discloses a cooperative file distribution system that uses one or more storage proxies to store the files that are being transferred among users.
A need still exists for improved security techniques for a cooperative file distribution system.
SUMMARY OF THE INVENTION
Generally, security techniques are provided for cooperative file distribution. According to one aspect of the invention, a method and system are provided for generating an encryption key or a nonce (or both) for a package containing one or more files that are to be sent in a cooperative file distribution system. Initially, samples are obtained of at least a portion of each of the files. Thereafter, a hash is applied to the samples and the encryption key or nonce (or both) are generated from a result of the hash.
According to another aspect of the invention, random access encryption techniques are employed to encrypt a package containing one or more files to be sent in a cooperative file distribution system. The package is first separated into pieces of a predefined size, and then a random access encryption technique is applied to each of the pieces. The encrypted package is comprised of the encrypted pieces.
According to yet another aspect of the invention, one or more storage proxies are allocated to a package to be transmitted in a cooperative file distribution system. The load of each of the storage proxies is evaluated, and a weight is assigned to each storage proxy based on the evaluated load. Thereafter, a storage proxy is selected for the package using one or more predefined criteria to balance a load among the storage proxies.
Another aspect of the invention controls access to a tracker in a cooperative file distribution system. The tracker allows peers associated with related content to discover each other. The tracker receives a request to upload or download content. Thereafter, the tracker determines if the sender of the request is authorized. The tracker will provide a security token to the sender of the request, whereby the security token can then be used to establish an authorization between the sender of the request and the tracker.
According to further aspects of the invention, content, such as a package of one or more files, can automatically expire using a defined expiration period when the content is uploaded into the system. In addition, a variable announce interval mechanism is disclosed that allows the tracker to control how often the tracker will receive a message, such as an announcement or a heartbeat message, from peers in the system.
A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.
The present invention provides improved security techniques for a cooperative file distribution system.
BitTorrent Framework
Generally, to publish one or more files 105 using the BitTorrent file distribution system 100, a corresponding static file 114 with extension .torrent is put on a web server 160. In particular, as shown in
Trackers 130 communicate using a protocol that may be layered on top of HTTP in which a downloader 110, 120 sends information regarding the one or more files that the user is downloading, the port that the user is listening on, and similar information, and the tracker 130 responds with a list of contact information for peers that are downloading the same file. Downloaders 110, 120 then use this information to connect with one another.
To make one or more files 105 available, a downloader 110 having the complete file(s) 105 initiates a seed server 112, using the BitTorrent client 116. The bandwidth requirements of the tracker 130 and web server 160 are low, while the seed must send out at least one complete copy of the original file.
The responsibilities of the tracker 130 are generally limited to helping peers (i.e., users) find each other. Typically, the tracker 130 returns a random list of peers to each user. In order to keep track of the files and file pieces held by each user 110, 120, the BitTorrent file distribution system 100 divides files into pieces of fixed size, typically a quarter megabyte. Each downloader 110, 120 reports to all of its peers via the tracker 130, the pieces held by the respective downloader 110, 120. Generally, each peer sends bit torrent tracker messages 165 to the tracker 130. To verify data integrity, a hash of each piece can be included in the .torrent file 114, and a given peer does not report that it has a given piece until the corresponding hash has been validated.
On the receiver side 120, the receiver 120 reads the web page on the tracker web site 160 with .torrent file 114 attached and uses the browser 126 to click on the .torrent file. As a result, the BitTorrent client 128 is launched on the receiver 120 and the .torrent file 124 is provided to the client process 128. In addition, the BitTorrent client 128 initiates a “Leech” server 122 that allows the receiver 120 to connect to the public tracker 130. In this manner, the file 105 is sent from the “seed” 112 to the “leech” 122 via connection 150, such as an offline peer-to-peer connection or swarm delivery, in a known manner. The file copy 105 can then be opened by the receiver 120, for example, using an operating system function.
Cooperative File Distribution Using Storage Proxies
Storage node 260 can cache communications between two nodes 210, 220. The sender 210 deposits blocks of data into the proxy node 260 for subsequent retrieval by one or more receivers 220. A receiver 220 can thereafter retrieve that data from the storage proxy 260.
The cooperative file distribution system 200 may be implemented, for example, using the BitTorrent file distribution system 100 of
In addition, as discussed further below, the cooperative file distribution system 200 employs a proxy service 250 to identify potential nodes that are available to serve as storage proxy 260. The proxy service 250 may be integrated with the tracker 230, as shown in
The exemplary profile information maintained in the storage proxy database 255 may be obtained, for example, by a profile service that can be integrated with, or independent of, the proxy service 250. For example, the profile service may obtain information directly from each potential storage proxy 260 regarding the state of the node (e.g., whether the node is behind a firewall) and its resources. In addition, in a further variation, after a given receiver 220 has received a file or a portion thereof from a given storage proxy 260, the receiver 220 can provide a confirmation report to the profile service. In this manner, the validating information from the receivers 220 reduces the likelihood of abuse by the storage proxy 260.
Encryption in a Cooperative File Distribution System
According to one aspect of the invention, files 205 that are transmitted in the cooperative file distribution system are encrypted in transit. In this manner, the files 205 are not compromised by eavesdropping. In one exemplary implementation, the Advanced Encryption Standard (AES) with 256-bit keys in Counter (CTR) mode is employed.
In this manner, the encryption key 350 depends on the content of the file(s) 320. In the exemplary implementation shown in
The process 300 produces the same key 350 and nonce 360 for the same package 310 of ordered files 320. In this manner, two users can package the same content (e.g., the same video) and share a torrent. The duplicate content only needs to be stored once. In addition, users who independently publish the same data can take advantage of sharing a P2P torrent without being aware of each other.
If a given file 320 is less than 20K, the whole file is used. The use of the blocks 330 allows the key 350 and nonce 360 to be generated without reading the entire file(s), which can be long, in a similar manner to a thumbprint. Otherwise, each file would have to be scanned twice, once to generate the key and nonce, and once to hash it for torrent packaging, which would take too long.
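By way of illustration only, the sampling-and-hashing approach described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the sample size, the number of samples per file, the choice of SHA-512, and the function names `sample_file` and `derive_key_and_nonce` are all hypothetical. Only the 20K small-file threshold, the preservation of file order, and the derivation of a 256-bit key and a 128-bit nonce from portions of a hash result are taken from the text.

```python
import hashlib
import os

SAMPLE_SIZE = 1024            # hypothetical: bytes per sample block
SAMPLES_PER_FILE = 20         # hypothetical: evenly spaced samples per file
SMALL_FILE_LIMIT = 20 * 1024  # files smaller than 20K are used whole


def sample_file(path):
    """Yield evenly spaced sample blocks, or the whole file if it is small."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if size < SMALL_FILE_LIMIT:
            yield f.read()
            return
        # Space the samples so the last one ends at or before end of file.
        stride = (size - SAMPLE_SIZE) // (SAMPLES_PER_FILE - 1)
        for i in range(SAMPLES_PER_FILE):
            f.seek(i * stride)
            yield f.read(SAMPLE_SIZE)


def derive_key_and_nonce(paths):
    """Hash the samples in package order and split the digest into key and nonce."""
    digest = hashlib.sha512()     # assumed hash; the text only requires a hash
    for path in paths:            # file order is preserved
        for block in sample_file(path):
            digest.update(block)
    out = digest.digest()         # 64 bytes
    return out[:32], out[32:48]   # 256-bit key, 128-bit nonce
```

Because the result depends only on the sampled content and the file order, two users who package the same ordered files would obtain the same key and nonce without reading the files in full.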
In one exemplary embodiment, the encryption process 400 uses an AES 256/CTR technique based on the AES encryption scheme using 256-bit keys 350, 128-bit blocks, and a 128-bit nonce 360. As shown in
According to one aspect of the invention, the data is delivered through the cooperative file distribution system as encrypted data. In other words, the clear data is handed off to the Bit Torrent system as encrypted data. The clear data 310 is encrypted into encrypted data 450 using the exemplary encryption process 400 shown in
In this manner, the encrypted data 450 is delivered without the ability to decrypt the data midstream. The encrypted data 450 is thus delivered with the benefits of Bit Torrent (including piece by piece integrity checks) without being able to access the original data. The data is stored by the storage proxy 515 but the storage proxy 515 has no ability to access the underlying clear data 310.
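The random access property of counter mode, which allows any piece to be encrypted or decrypted independently of the others, can be illustrated with the following minimal sketch. To keep the example self-contained, SHA-256 is used as a stand-in for the AES block cipher; it is not AES and is not suitable for production use, where a vetted AES-CTR implementation would be used instead. The function names and the 32-byte keystream block are assumptions made for illustration.

```python
import hashlib

BLOCK = 32  # keystream bytes per counter value (SHA-256 output size)


def keystream_block(key, nonce, counter):
    """Stand-in for the AES block function: keystream for one counter value."""
    return hashlib.sha256(key + nonce + counter.to_bytes(16, "big")).digest()


def crypt_piece(key, nonce, piece_index, piece_size, data):
    """Encrypt or decrypt piece N; XOR with the keystream is its own inverse."""
    if piece_size % BLOCK:
        raise ValueError("piece_size must be a multiple of BLOCK")
    # The counter for a piece depends only on its index, not on other pieces.
    first_counter = piece_index * (piece_size // BLOCK)
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        ks = keystream_block(key, nonce, first_counter + offset // BLOCK)
        chunk = data[offset:offset + BLOCK]
        out += bytes(a ^ b for a, b in zip(chunk, ks))
    return bytes(out)


def encrypt_package(key, nonce, package, piece_size):
    """Split the package into fixed-size pieces and encrypt each independently."""
    pieces = [package[i:i + piece_size]
              for i in range(0, len(package), piece_size)]
    return [crypt_piece(key, nonce, n, piece_size, p)
            for n, p in enumerate(pieces)]
```

A receiver holding the key and nonce can call `crypt_piece` on any single encrypted piece as it arrives, in any order, while an intermediate node without the key sees only ciphertext.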
Uploading Content
After the sender 510 is validated by the message exchange 650, the sender 510 attempts to start a session using message exchange 655. Generally, the sender 510 sends a “start” message to the services processor 630, which executes a storage proxy allocation process 700, discussed further below in conjunction with
After the sender 510 is notified of the tracker 525 assigned to the bit torrent, the sender 510 announces itself to the tracker 525, during a message exchange 660. As shown in
After implementing the announce interval computation process 800, the tracker 525 will send an announce response to the sender 510. The announce response includes a listing of the peers associated with the bit torrent, discussed further below in a section entitled “Tracker Peer Listing,” as well as the assigned announce interval (two seconds in this example). If a storage proxy is required for the communication, message exchange 670 occurs between the tracker 525 and the assigned storage proxy 515. The message exchange 670 includes a request for the storage proxy 515 to join the bit torrent. The storage proxy 515 will respond to the tracker 525 with an announce message, which will trigger the tracker 525 to execute the announce interval computation process 800.
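For illustration only, a variable announce interval of the kind returned in the announce response might be computed along the following lines. The factors considered (whether the peer is a storage proxy, how recently the torrent was active, the number of peers, and whether peers are firewalled) follow the description; every threshold and returned value other than the two-second example, as well as the names `TorrentState` and `announce_interval`, is a hypothetical assumption and does not reproduce process 800.

```python
from dataclasses import dataclass


@dataclass
class TorrentState:
    seconds_since_activity: float  # recency of activity for the torrent
    peer_count: int                # number of peers currently in the torrent
    any_peer_firewalled: bool      # whether any peer is behind a firewall


def announce_interval(is_storage_proxy, torrent):
    """Return an announce interval in seconds (all thresholds hypothetical)."""
    if is_storage_proxy:
        return 60   # assumed: proxies can announce less often
    if torrent.seconds_since_activity < 30 or torrent.peer_count < 2:
        return 2    # swarm is still forming: announce quickly
    if torrent.any_peer_firewalled:
        return 10   # assumed: firewalled peers need fresher peer lists
    return 30       # assumed: steady state
```

The point of the sketch is only that the tracker, rather than the peer, chooses the interval and can shorten or lengthen it as the state of the torrent changes.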
After the defined announce interval, the sender 510 will send another announce message during message exchange 675. During message exchange 680, the sender 510 publishes the file on the assigned storage proxy 515. The sender 510 will continue to announce periodically to the tracker 525 in accordance with the assigned announce interval. Thereafter, during message exchange 685, the sender 510 notifies the services processor 630 that the uploading is complete. Finally, the session is terminated during a message exchange 690 between the sender 510 and the services processor 630.
Storage Proxy Allocation Process
The storage proxy allocation process 700 includes a section 730 for selecting a storage proxy. As shown in
In one exemplary embodiment, shown in
The weight function 750 of the storage proxy allocation process 700 is shown in
Finally, the weight is computed in statement 780. Since all of the factors are multiplied in the weight computation 780, any one factor being zero (e.g., available disk space) can prevent a storage proxy from being allocated any more torrents. Taking the weight to a fractional power (e.g., ^0.25), for example, smooths the distribution of weights, reducing the tendency of the equation to over-allocate for the most underutilized storage proxy. This factor can be manipulated to make the allocation scheme sufficiently responsive without being over-responsive.
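The multiplicative weight and its smoothing can be illustrated with the following minimal sketch. The particular load factors, their normalization to fractions between 0 and 1, the weighted random selection, and the names `ProxyLoad`, `weight` and `select_proxy` are assumptions made for illustration; only the multiplication of factors, the effect of a zero factor, and the fractional exponent of 0.25 come from the description.

```python
import random
from dataclasses import dataclass


@dataclass
class ProxyLoad:
    name: str
    free_disk: float         # fraction of disk space still available, 0..1
    free_connections: float  # fraction of connection capacity available, 0..1
    free_torrents: float     # fraction of torrent slots available, 0..1


def weight(proxy, exponent=0.25):
    """Multiply the factors, then smooth with a fractional power."""
    product = proxy.free_disk * proxy.free_connections * proxy.free_torrents
    return max(product, 0.0) ** exponent  # any zero factor gives weight 0


def select_proxy(proxies, rng=random):
    """Pick a proxy at random in proportion to its weight (assumed criterion)."""
    weights = [weight(p) for p in proxies]
    if not any(weights):
        return None  # no proxy has spare capacity
    return rng.choices(proxies, weights=weights, k=1)[0]
```

With an exponent of 0.25, a proxy whose raw product is 0.0001 still receives a weight of about 0.1, so a nearly idle proxy is favored without attracting every new torrent.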
Announce Interval Computation Process
As shown in
As shown in
Tracker Peer Listing
As indicated above, the tracker announce response message 660 (
In one exemplary embodiment, the listing of a peer in the tracker announce response message 660 is controlled by the following announce arguments:
- NAT/external_ip—the IP address the announce message arrives from;
- IP—the internal IP address reported in the announce URL;
- port—the listening port reported in the announce URL;
- show_seeds=1|0, 0 default (indicates who has the same content (whole file));
- fw=0 not firewalled|1 firewalled|−1 don't know yet (default)
- left=0 seed|#leech|−1 don't know yet
- type=sp|peer (default)
The response logic for the exemplary embodiment can be expressed as follows:
- An SP peer (type=sp) always gets an empty list (storage proxies do not make outgoing connections).
- A seed peer (left=0) only gets addresses of leeches, unless show_seeds=1 (seeds cannot communicate with other seeds).
- FW=1 is not shown to other peers (peers with firewalls are not shown to other peers), unless both are behind the same NAT.
- Peers behind different NATs don't see each other, unless peer is fw=0 (not firewalled)
- An SP is not listed if there is a certain count of seeds or a certain count of non-firewalled seeds (offloading delivery from storage proxies to peers to reduce costs).
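The response logic above can be sketched, by way of example only, as a filter over the peers known to the tracker. The field names mirror the announce arguments listed earlier; the two seed-count limits, the treatment of an unknown firewall state (fw=-1) as not visible across NATs, and the names `Peer` and `list_peers` are hypothetical assumptions.

```python
from dataclasses import dataclass

SEED_LIMIT = 5       # hypothetical: seeds needed before the SP is hidden
OPEN_SEED_LIMIT = 2  # hypothetical: non-firewalled seeds needed to hide the SP


@dataclass
class Peer:
    external_ip: str    # address the announce message arrived from
    ip: str             # internal address reported in the announce URL
    port: int
    fw: int = -1        # 0 not firewalled, 1 firewalled, -1 unknown
    left: int = -1      # 0 seed, >0 leech, -1 unknown
    type: str = "peer"  # "sp" or "peer"
    show_seeds: int = 0


def list_peers(requester, swarm):
    """Return the peers that the tracker would list for the requester."""
    if requester.type == "sp":
        return []  # storage proxies make no outgoing connections
    seeds = [p for p in swarm if p.type == "peer" and p.left == 0]
    open_seeds = [p for p in seeds if p.fw == 0]
    hide_sp = len(seeds) >= SEED_LIMIT or len(open_seeds) >= OPEN_SEED_LIMIT
    listing = []
    for p in swarm:
        if p is requester:
            continue
        if p.type == "sp" and hide_sp:
            continue  # enough seeds: offload delivery from the SP
        same_nat = p.external_ip == requester.external_ip
        if not same_nat and p.fw != 0:
            continue  # across NATs, only non-firewalled peers are visible
        if requester.left == 0 and not requester.show_seeds and p.left == 0:
            continue  # a seed is only given leeches
        listing.append(p)
    return listing
```

In this sketch the third and fourth rules collapse into a single test: a peer behind a different NAT is listed only when it reports fw=0, while peers behind the same NAT are always candidates.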
In a further variation, if the list is longer than a specified length (such as 40-50 peers), the response can be randomized in the following manner:
- The SP is always the first in the response.
- X peers behind the same NAT as the requested peer are listed next.
- The other peers are uniformly selected from the complete list.
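The ordering rules above for an over-long listing can be sketched as follows. This is an illustrative sketch only: the maximum length, the same-NAT quota X, the use of uniform sampling without replacement for the remaining slots, and the names `Entry` and `trim_listing` are assumptions.

```python
import random
from typing import NamedTuple


class Entry(NamedTuple):
    external_ip: str  # NAT/external address of the listed peer
    address: str      # contact information returned to the requester
    is_sp: bool = False


def trim_listing(requester_nat, entries, max_len=50, same_nat_quota=10,
                 rng=random):
    """Shorten a listing: SP first, then same-NAT peers, then a uniform sample."""
    if len(entries) <= max_len:
        return list(entries)
    sps = [e for e in entries if e.is_sp]
    same = [e for e in entries
            if not e.is_sp and e.external_ip == requester_nat]
    rest = [e for e in entries
            if not e.is_sp and e.external_ip != requester_nat]
    out = sps[:1]                  # the SP is always first
    out += same[:same_nat_quota]   # X peers behind the requester's NAT
    pool = sps[1:] + same[same_nat_quota:] + rest
    remaining = max_len - len(out)
    out += rng.sample(pool, min(remaining, len(pool)))  # uniform selection
    return out
```

Listing same-NAT peers ahead of the random sample favors connections that stay inside the requester's local network, while the uniform sample keeps the remainder of the swarm evenly represented.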
Downloading Content
As shown in
After the receiver 520 is validated by the message exchange 950, the receiver 520 attempts to start a session using message exchange 955. Generally, the receiver 520 sends a “start” message to the services processor 630, which executes the storage proxy allocation process 700, discussed above in conjunction with
After the receiver 520 is notified of the tracker 525 assigned to the bit torrent, the receiver 520 announces itself to the tracker 525, during a message exchange 960. As shown in
After implementing the announce interval computation process 800, the tracker 525 will send an announce response to the receiver 520. The announce response includes a listing of the storage proxy 515 and sender 510 associated with the file(s), as well as the assigned announce interval.
During message exchange 970, the receiver 520 downloads the file from the assigned storage proxy 515 or sender 510 (or both). Thereafter, during message exchange 975, the receiver 520 notifies the services processor 630 that the downloading is complete. Finally, the session is terminated during a message exchange 980 between the receiver 520 and the services processor 630.
Maintenance Operations
As shown in
During a storage proxy registration process 1020, each storage proxy 515 reports its state, such as its current load information, to the services processor 630, and the services processor 630 records the information in the database 635.
As shown in
In a second scenario, the services processor 630 recognizes that a given bit torrent has expired. In one exemplary implementation, bit torrents can be deleted after a defined expiration period. For example, each time a file is uploaded, the expiration period can be extended by two weeks. Therefore, a bit torrent is available for two weeks from the last time the bit torrent was published (i.e., the time pstart was received plus 14 days). The services processor 630 can expire the bit torrent and deallocate the associated storage proxy 515 after the bit torrent expires.
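The expiration rule in this second scenario reduces to simple date arithmetic, sketched below for illustration. The 14-day period comes from the example above; the function and variable names are hypothetical.

```python
from datetime import datetime, timedelta

EXPIRATION_PERIOD = timedelta(days=14)  # two weeks, per the example


def expiration_time(last_pstart):
    """A torrent expires 14 days after pstart was last received."""
    return last_pstart + EXPIRATION_PERIOD


def is_expired(last_pstart, now):
    """True once the expiration period since the last publish has elapsed."""
    return now >= expiration_time(last_pstart)
```

Each new upload replaces `last_pstart` with the time of that upload, which is how republishing extends the availability of the torrent by another two weeks.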
In a third scenario, the storage proxy 515 self terminates by notifying the services processor 630, if the storage proxy believes that the torrent has expired, based on the expiration interval that was indicated in the join torrent message 670 (
As shown in
Tracker Tokens
As previously indicated, tracker tokens are used to control access to and use of the tracker 525 and reduce the number of accesses to the database(s) 635 for authentication purposes. The tracker tracks all peers that are participating in a torrent and helps these peers to discover each other. Peers announce their presence to the tracker 525 at regular (announce) intervals, as discussed above, and are responded to with a listing of the addresses of other peers.
When peers upload or download content (package containing one or more files), as discussed above in conjunction with
In one exemplary implementation, the assigned tokens are valid for a limited time period. Thus, an announce response message may include a “token-expired” error. To obtain a new token, a peer may issue a request for a token from the tracker 525.
In one preferred embodiment, the token is an encrypted binary data structure. The tracker 525 and the services processor 630 can share a secret key. In one implementation, 128-bit AES encryption is used.
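The way a shared secret lets the tracker validate a time-limited token without a database access can be illustrated with the following minimal sketch. Note the substitution: to remain self-contained, the sketch authenticates the token with HMAC-SHA256 rather than encrypting it with AES as in the embodiment described above, so the token contents here are not confidential. The token layout, the field choices, the "token-invalid" result, and the function names are hypothetical; only the "token-expired" error and the use of a secret shared by the issuer and the tracker come from the description.

```python
import hashlib
import hmac
import struct

_LAYOUT = ">QQ"  # hypothetical fields: peer id, expiry time (Unix seconds)
_TAG_LEN = 32    # length of an HMAC-SHA256 tag


def issue_token(secret, peer_id, expires_at):
    """Issuer side: pack the fields and append a tag keyed by the shared secret."""
    body = struct.pack(_LAYOUT, peer_id, expires_at)
    return body + hmac.new(secret, body, hashlib.sha256).digest()


def check_token(secret, token, now):
    """Tracker side: verify the tag, then the expiry, with no database lookup."""
    body, tag = token[:-_TAG_LEN], token[-_TAG_LEN:]
    expected = hmac.new(secret, body, hashlib.sha256).digest()
    if (len(body) != struct.calcsize(_LAYOUT)
            or not hmac.compare_digest(tag, expected)):
        return "token-invalid"  # hypothetical error name
    _, expires_at = struct.unpack(_LAYOUT, body)
    if now >= expires_at:
        return "token-expired"  # the peer must request a new token
    return "ok"
```

A peer would present the token with each announce message, and the tracker would return the error string in the announce response when validation fails.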
System and Article of Manufacture Details
As is known in the art, the methods and apparatus discussed herein may be distributed as an article of manufacture that itself comprises a computer readable medium having computer readable code means embodied thereon. The computer readable program code means is operable, in conjunction with a computer system, to carry out all or some of the steps to perform the methods or create the apparatuses discussed herein. The computer readable medium may be a recordable medium (e.g., floppy disks, hard drives, compact disks, or memory cards) or may be a transmission medium (e.g., a network comprising fiber-optics, the world-wide web, cables, or a wireless channel using time-division multiple access, code-division multiple access, or other radio-frequency channel). Any medium known or developed that can store information suitable for use with a computer system may be used. The computer-readable code means is any mechanism for allowing a computer to read instructions and data, such as magnetic variations on a magnetic media or height variations on the surface of a compact disk.
The computer systems and servers described herein each contain a memory that will configure associated processors to implement the methods, steps, and functions disclosed herein. The memories could be distributed or local and the processors could be distributed or singular. The memories could be implemented as an electrical, magnetic or optical memory, or any combination of these or other types of storage devices. Moreover, the term “memory” should be construed broadly enough to encompass any information able to be read from or written to an address in the addressable space accessed by an associated processor. With this definition, information on a network is still within a memory because the associated processor can retrieve the information from the network.
It is to be understood that the embodiments and variations shown and described herein are merely illustrative of the principles of this invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention.
Claims
1. A method for generating one or more of an encryption key and a nonce for a package containing one or more files to be sent in a cooperative file distribution system, comprising:
- obtaining samples of at least a portion of each of said one or more files;
- applying a hash to said samples; and
- generating one or more of said encryption key and said nonce from a result of said hash.
2. The method of claim 1, further comprising the step of using one or more of said encryption key and said nonce to encrypt said package.
3. The method of claim 1, wherein said samples of at least a portion of each of said one or more files maintain an order of said files in said package.
4. The method of claim 1, wherein said hash is a Secure Hash Algorithm.
5. The method of claim 1, wherein said generating step further comprises the step of generating said encryption key from a predefined portion of said result of said hash.
6. The method of claim 1, wherein said generating step further comprises the step of generating said nonce from a predefined portion of said result of said hash.
7. The method of claim 1, wherein said encryption key is based on a content of said files in said package.
8. The method of claim 1, wherein said samples are a predefined number of evenly spaced samples from each file in said package.
9. A method for encrypting a package containing one or more files to be sent in a cooperative file distribution system, comprising:
- separating said package into pieces of a predefined size;
- applying a random access encryption technique to each of said pieces; and
- generating said encrypted package comprised of said encrypted pieces.
10. The method of claim 9, wherein said random access encryption technique employs an encryption key and a nonce.
11. The method of claim 10, wherein said encryption key is based on a content of said package.
12. The method of claim 9, wherein said random access encryption technique adheres to an Advanced Encryption Standard.
13. The method of claim 9, wherein said random access encryption technique allows any received encrypted piece to be decrypted by an authorized recipient.
14. The method of claim 9, wherein said random access encryption technique encrypts a given piece N of said pieces using an encryption key for said package and applying a result to an exclusive or (XOR) gate with said given piece to generate said encrypted piece.
15. The method of claim 9, further comprising the step of delivering said encrypted pieces to a cooperative file distribution system for transport.
16. The method of claim 15, wherein said cooperative file distribution system is a Bit Torrent system.
17. A method for assigning one of a plurality of storage proxies to a package to be transmitted in a cooperative file distribution system, comprising:
- evaluating a load of each of said plurality of storage proxies;
- assigning a weight to each of said plurality of storage proxies based on said evaluated load; and
- selecting one of said plurality of storage proxies for said package using one or more predefined criteria to balance a load among said plurality of storage proxies.
18. The method of claim 17, wherein said load is based on one or more of used storage space, a number of connections, and a number of active and inactive torrents associated with each of said storage proxies.
19. The method of claim 18, wherein said weight is based on a multiple of two or more of said used storage space, a number of connections, and a number of active and inactive torrents associated with each of said storage proxies and wherein setting any one of said values to zero results in said storage proxy becoming unavailable for being assigned to said package.
20. The method of claim 17, wherein subsets of said plurality of storage proxies are grouped together.
21. The method of claim 17, further comprising the step of smoothing said weights in order to reduce a tendency to assign said package to the most underutilized storage proxy.
22. A method for controlling access to a tracker in a cooperative file distribution system, wherein said tracker allows peers associated with related content to discover each other, comprising:
- receiving a request to upload or download said content;
- evaluating an authorization for said request; and
- providing a security token to a sender of said request, whereby said security token can be used to establish an authorization between said sender of said request and said tracker.
23. The method of claim 22, wherein said security token is provided by said sender of said request to said tracker in an announce message.
24. The method of claim 23, wherein said tracker can validate said sender of said request using said security token.
25. The method of claim 23, wherein said tracker provides a listing of one or more of said peers to said sender of said request in response to said announce message.
26. The method of claim 22, wherein said peers associated with related content are one or more senders and one or more recipients of said content.
27. The method of claim 22, wherein said security token has a defined expiration.
28. The method of claim 22, wherein said security token is an encrypted binary data string.
29. The method of claim 22, wherein said security token contains a last torrent update time and wherein said tracker obtains torrent information if a predefined torrent update time has been exceeded.
30. The method of claim 25, wherein said listing is empty if said peer is a storage proxy.
31. The method of claim 25, wherein said listing comprises addresses of leeches if said peer is a seed peer.
32. The method of claim 25, wherein said listing does not identify peers behind a firewall.
33. The method of claim 25, wherein said listing does not identify peers behind a different number address translator (NAT) unless said peers are not behind a firewall.
34. The method of claim 25, wherein said listing does not identify a storage proxy peer if the number of peers satisfies a predefined criteria.
35. The method of claim 25, wherein said listing has a predefined maximum length, X, and said listing includes a storage proxy, if present, up to X-1 peers behind the same number address translator as the requested peer and then other peers.
36. A method for processing a package containing one or more files to be sent in a cooperative file distribution system, comprising:
- receiving a request to upload said package; and
- responsive to said request, setting an expiration period for said package.
37. The method of claim 36, further comprising the step of deleting said package after said expiration period has expired.
38. The method of claim 36, further comprising the step of deallocating a storage proxy associated with said package after said expiration period has expired.
39. A method performed by a tracker for communicating with one or more peers in a cooperative file distribution system, comprising:
- receiving a request from one of said peers to start a session;
- determining an announce interval within which said one of said peers should provide an announcement; and
- providing a message to said one of said peers comprising said announce interval.
40. The method of claim 39, wherein announce interval is determined based on whether said peer is a storage proxy.
41. The method of claim 39, wherein announce interval is determined based on a recency of activity for a torrent associated with said request.
42. The method of claim 39, wherein announce interval is determined based on a number of peers in a torrent associated with said request.
43. The method of claim 39, wherein announce interval is determined based on whether one or more peers in a torrent associated with said request are behind a firewall.
44. A system for generating one or more of an encryption key and a nonce for a package containing one or more files to be sent in a cooperative file distribution system, comprising:
- a memory; and
- at least one processor, coupled to the memory, operative to:
- obtain samples of at least a portion of each of said one or more files;
- apply a hash to said samples; and
- generate one or more of said encryption key and said nonce from a result of said hash.
45. A system for encrypting a package containing one or more files to be sent in a cooperative file distribution system, comprising:
- a memory; and
- at least one processor, coupled to the memory, operative to:
- separate said package into pieces of a predefined size;
- apply a random access encryption technique to each of said pieces; and
- generate said encrypted package comprised of said encrypted pieces.
46. A system for assigning one of a plurality of storage proxies to a package to be transmitted in a cooperative file distribution system, comprising:
- a memory; and
- at least one processor, coupled to the memory, operative to:
- evaluate a load of each of said plurality of storage proxies;
- assign a weight to each of said plurality of storage proxies based on said evaluated load; and
- select one of said plurality of storage proxies for said package using one or more predefined criteria to balance a load among said plurality of storage proxies.
47. A system for controlling access to a tracker in a cooperative file distribution system, wherein said tracker allows peers associated with related content to discover each other, comprising:
- a memory; and
- at least one processor, coupled to the memory, operative to:
- receive a request to upload or download said content;
- evaluate an authorization for said request; and
- provide a security token to a sender of said request, whereby said security token can be used to establish an authorization between said sender of said request and said tracker.
48. A system for processing a package containing one or more files to be sent in a cooperative file distribution system, comprising:
- a memory; and
- at least one processor, coupled to the memory, operative to:
- receive a request to upload said package; and
- responsive to said request, set an expiration period for said package.
49. A system performed by a tracker for communicating with one or more peers in a cooperative file distribution system, comprising:
- a memory; and
- at least one processor, coupled to the memory, operative to:
- receive a request from one of said peers to start a session;
- determine an announce interval within which said one of said peers should provide an announcement; and
- provide a message to said one of said peers comprising said announce interval.
Type: Application
Filed: Sep 12, 2006
Publication Date: Mar 13, 2008
Inventors: Andrew Hickmott (New York, NY), Laird A. Popkin (West Orange, NJ), Yaar Schnitman (New York, NY)
Application Number: 11/519,990
International Classification: H04L 9/32 (20060101);