Distributed frequency data collection via DNS
Domain Name Service (DNS) requests are used as the reporting vehicle for ensuring that security-related information can be transferred from a network. As one possibility, a central facility for a security provider may maintain a data collection capability that is based upon receiving the DNS requests containing the information being reported. In an email application, if a data block is embedded within or attached to an email message, an algorithm is applied to the data block to generate an indicator that is specifically related to the contents of the data block. As one possibility, the algorithm may generate a hash that provides a “digital fingerprint” having a reasonable likelihood that the hash is unique to the data block. By embedding the hash within a DNS request, the request becomes a report that the data block has been accessed.
The present invention relates generally to computer network security and more particularly to enabling detection of widespread “events” which are indicative of network security concerns, such as a distribution of spam or malware (for example, a virus, worm or spyware).
BACKGROUND ART
Along with the many benefits of data and communication exchanges as a result of the Internet, there are significant risks. Providing security for a computer network so as to prevent disruption of network operations is an increasing concern for network administrators. A security measure that has become a standard practice is to use a firewall as a chokepoint for the network. The firewall enforces one or more sets of rules which determine access to and from nodes of the network.
Firewalls utilize various techniques to provide security for a network. One such technique is packet filtering. The firewall may examine packets to determine origins, destinations and content. All packets that violate a rule are discarded. As another technique, security measures may be applied at lower levels. Thus, there may be rules that are specific to determining whether to enable establishment of a Transmission Control Protocol (TCP) connection or a User Datagram Protocol (UDP) connection. In addition to providing rules that are specific to packet filtering or specific to types of connections, there are approaches that are specific to a particular application. As examples, there may be rules directed to a File Transfer Protocol (FTP) application, a Telnet application, HyperText Transfer Protocol (HTTP), or Simple Mail Transfer Protocol (SMTP).
Network security for a particular network may be provided by using a number of separate components. It is typical for the application-level firewall directed to electronic mail (email) to be a separate component. Thus, there may be a general firewall at the chokepoint of the network and an internal “spam filter” that applies security measures to email messages of the network. For network email messages having destinations or origins outside of the network, there may be different sets of rules applied at the spam filter and at the general firewall.
As used herein, the term “spam” is defined as unsolicited messages intended for bulk distribution. With respect to email, spam is a form of abuse of the SMTP. A spam email may be a mere inconvenience or annoyance, as is the case if the email includes an advertisement. However, a spam email may also include a virus or a “worm” which is intended to affect operation or performance of a device or the entire network. At times, spam is designed to induce a person to disclose confidential personal or business-related information. Additionally, even harmless spam is a financial drain on large corporations.
A commercial supplier of spam filters will often provide regular updates for the application of security rules. The supplier may operate a central location that identifies the need for updated rules and that has Internet access to spam filters located at different networks. A spam filter of a particular network may collect information regarding activity within the network. This activity may be useful to the centralized supplier for the purpose of identifying “events” which indicate the need for rule or definition updates. A concern is that if the reporting information must pass through one or more “chokepoint” firewalls to exit the network for transmission to the central facility, the transmission may be blocked. Because the different security devices are separately controlled, the “innocent” transmission may be interpreted as being a distribution of confidential data, for example. If the centralized facility is to have the ability to quickly identify and respond to an intrusive event, such as a widespread distribution of a virus or worm, reporting information must be allowed to pass from the network. This concern also applies to other network security devices that benefit from the ability of transmitting reporting data.
SUMMARY OF THE INVENTION
The concern that the reporting of information useful to providing network security will be blocked is addressed by utilizing Domain Name Service (DNS) requests as the reporting vehicle. The potential blocking of useful report information occurs because a wide range of different security rules are applied by different independent networks and even by different security devices within a single network. However, nearly all networks allow DNS requests to be forwarded from a system that is not identified as being “suspect.” Thus, reporting information by use of DNS requests allows the information to reach its intended target. For example, a central facility of a security provider may maintain a data collection capability that is based upon receiving requests containing “phantom” domain names which specify the information being reported.
The method of monitoring data traffic at a particular network includes detecting each occurrence of a transfer of a “block of data,” which may be a file or other data assembly, such as the relevant IP addresses of a transmission within the data traffic being monitored. In an email application, the block of data may be one that is embedded within or attached to an email message. As two examples, the block of data may be an image file or a data file. However, the block of data in the email application may merely be the IP address of the sender system, receiver system, or other system (or the corresponding URL) that is referenced within the body of an email message. This reporting of IP addresses may be the focus in other security applications as well.
For each transfer of a block of data, an indicator is generated. The indicator is specifically related to contents of the block of data. In the preferred embodiment, the indicator is generated by applying a particular algorithm to the block of data to provide a “digital fingerprint” as a function of the algorithm. The digital fingerprint may not be unique to the contents, but has a reasonable likelihood of being unique to the particular block of data. A standard cryptographic hash function may be used as the algorithm. MD5 (Message-Digest Algorithm 5) is a known algorithm that is used to verify data integrity, but may be used in the present invention to define the digital fingerprint. An MD5 hash is typically a 32-character hexadecimal number. Following the generation of an MD5 hash for a transferred block of data, the number may be used in forming a “phantom” domain name that is embedded in a DNS request. The original transfer of the block of data is then reported by a transmission of the DNS request. While an algorithm which generates an MD5 hash is one possible approach, alternative algorithms which provide indicators which are reasonably likely to be unique to particular blocks of data may be substituted.
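The fingerprint step described above can be illustrated with a minimal Python sketch. The function name `fingerprint` and the sample bytes are illustrative assumptions, not part of the described embodiment; only the use of MD5 and the 32-character hexadecimal form come from the text.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Return the MD5 digest of a data block as 32 hexadecimal characters."""
    return hashlib.md5(block).hexdigest()


# Hypothetical data block standing in for an email attachment.
attachment = b"example attachment bytes"
digest = fingerprint(attachment)
```

Identical blocks always yield the same digest, which is what allows independent networks to report the same indicator for the same attachment.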
As noted, an application of the invention is one in which the DNS requests are directed to a central facility of a security provider which maintains data collection. The DNS requests may be transmitted to the central facility via the Internet, thereby enabling the remote site to determine a count of occurrences of transfers of each block of data. As an example, when an email filter of a particular network detects the transfer of an email attachment, the algorithm is applied to the attachment and the resulting hash is used as the first portion of a DNS query, such as 2978546CDFADBE.barracuda.com. The second portion of the DNS request ensures that the request is properly directed to the central facility. If the same algorithm is applied at different reporting networks, spikes of “events” that are indicative of security breaches may be identified at an early stage. Such spikes in the frequency of transfers occur with widespread dissemination of a virus, a worm or spam, for example. The security provider may then provide updates of filtering rules so as to combat the potential security breach.
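As a hedged illustration of the two-portion query described above, the following Python sketch places the hash in the first portion and a reporting zone in the second. The helper names `report_name` and `send_report` are assumptions, the zone is taken from the example in the text, and the length checks reflect the general DNS limits of 63 characters per label and 253 characters per name.

```python
import hashlib
import socket

REPORT_ZONE = "barracuda.com"  # zone taken from the example in the text


def report_name(block: bytes, zone: str = REPORT_ZONE) -> str:
    """Build a reporting domain name: <hash>.<zone>."""
    digest = hashlib.md5(block).hexdigest().upper()
    name = f"{digest}.{zone}"
    # DNS limits: 63 characters per label, 253 characters per name.
    if len(digest) > 63 or len(name) > 253:
        raise ValueError("name exceeds DNS length limits")
    return name


def send_report(block: bytes) -> None:
    """Report a transfer by resolving the name (assumed approach).

    The lookup itself is the report, so a failed resolution is ignored.
    """
    try:
        socket.gethostbyname(report_name(block))
    except OSError:
        pass
```

In this sketch the query travels through the system resolver, so it leaves the network along the same path as any ordinary DNS lookup.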
As an alternative to transmission of a DNS request for each occurrence of a transfer of a block of data, the individual networks may independently accumulate counts of different blocks of data and report the counts to a central facility when a threshold number or a threshold time is reached. For example, if the reporting of certain data is not considered time-critical, the data may be accumulated for a selectable period of time (such as one hour) so that the transmission of a “reporting DNS request” at the end of the time period will include the relevant count. Thus, a single transmission will indicate the number of times that a particular digital fingerprint (such as an MD5 hash) has been generated during the time period. In this embodiment, a reporting DNS request is specific to a single MD5 hash. Alternatively, a single reporting DNS request may include an aggregate of MD5 hashes or other indicators that are related to particular blocks of data. For this application, the receiving site for the reporting DNS requests must be configured to dissect a DNS request in order to identify each MD5 hash within an aggregated DNS request. By way of example, each MD5 hash may be separated by a symbol, such as a “punctuation dot.”
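The count-bearing and aggregated requests described above might be formed and dissected as in the following Python sketch. The layout is an assumption made for illustration: a count is carried in a leading label of the form `n<count>`, and aggregated hashes are separated by dots as the text suggests. None of the function names come from the described embodiment.

```python
ZONE = "barracuda.com"  # zone taken from the example in the text


def aggregate_name(hashes, zone=ZONE):
    """Join several hashes, separated by dots, into one reporting name."""
    name = ".".join(list(hashes) + [zone])
    if len(name) > 253:  # overall DNS name length limit
        raise ValueError("too many hashes for a single DNS name")
    return name


def split_aggregate(name, zone=ZONE):
    """Receiving side: recover the individual labels in front of the zone."""
    suffix = "." + zone
    if not name.endswith(suffix):
        raise ValueError("not a reporting name for this zone")
    return name[: -len(suffix)].split(".")


def count_name(digest, count, zone=ZONE):
    """Report one hash with its accumulated count (assumed 'n<count>' label)."""
    return f"n{count}.{digest}.{zone}"


def parse_count_name(name, zone=ZONE):
    """Receiving side: return (digest, count) from a count-bearing name."""
    count_label, digest = split_aggregate(name, zone)
    return digest, int(count_label[1:])
```

Because a 32-character hash plus its separator consumes 33 characters, a 253-character name leaves room for only a handful of MD5 hashes per aggregated request.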
As is known in the art, transmission of a DNS request is accompanied by the expectation of a response. In one embodiment of the invention, the responses to reporting DNS requests may be used to provide security enforcement. If a central facility receives reporting DNS requests from a number of different networks, responses to the different networks may be used to initiate blocking actions by the networks. The data collection capability at the central facility may detect evidence of distribution of spam or malware (e.g., virus, worm or spyware), with the responses then being used to initiate a blocking action. Other possible actions are to quarantine certain transmissions or to defer a decision as to how to proceed.
A potential problem is that while a first DNS request (e.g., 2978546CDFADBE.barracuda.com) having a particular hash will be transmitted from a network and will reach the intended central facility, it is common for DNS information to be locally cached in order to increase efficiency in satisfying subsequent DNS requests. If DNS information relevant to a particular DNS request is locally cached, the information may be used to service subsequent DNS requests, effectively blocking the requests from reaching the central facility. To overcome this problem, each report of information that uses a DNS request as its transmission vehicle may be made unique. One possible solution is to provide a date-and-time stamp for each DNS request intended for the central facility. The format of the stamp is not critical, but must be known at the central facility, so that the stamp may be stripped at the central facility. The stamp may precede or follow the hash. As a second solution, each hash-containing DNS request may include a value segment that is incremented for each transmission of a particular DNS request, such as the prefix N- (where N is the current count 1, 2, 3 . . . ). Thus, the fifth transmission of a particular hash may be a DNS request of 5-2978546CDFADBE.barracuda.com.
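Both cache-defeating approaches described above can be sketched in Python as follows. The `N-` prefix follows the text; the stamp format (`YYYYMMDDHHMMSS` in UTC, placed before the hash) and all function names are assumptions chosen for illustration.

```python
import itertools
import time
from collections import defaultdict

ZONE = "barracuda.com"  # zone taken from the example in the text

# One counter per hash, starting at 1, for the "N-" prefix approach.
_counters = defaultdict(lambda: itertools.count(1))


def counted_name(digest: str) -> str:
    """Prefix the hash with a count incremented on every transmission."""
    n = next(_counters[digest])
    return f"{n}-{digest}.{ZONE}"


def stamped_name(digest: str, now=None) -> str:
    """Prefix the hash with a UTC date-and-time stamp (assumed format)."""
    stamp = time.strftime("%Y%m%d%H%M%S", time.gmtime(now))
    return f"{stamp}-{digest}.{ZONE}"


def strip_prefix(name: str) -> str:
    """Central facility side: drop the prefix and return the bare hash."""
    first_label = name.split(".", 1)[0]
    _prefix, _sep, digest = first_label.partition("-")
    return digest
```

A stamp with one-second resolution does not distinguish two reports of the same hash within the same second, so the incremented count is the safer choice where reports may arrive in bursts.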
Preferably, there is a means for verifying the source of a reporting DNS request. The concern is that data collection at a central facility may be rendered unreliable if it is possible for unauthorized sites to send reporting DNS requests. As one possibility, a digital signature may be required for each reporting DNS request. The use of public key encryption is well known in the art.
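One way to picture source verification is the following Python sketch. It deliberately substitutes a keyed hash (HMAC-SHA256 with a shared secret) for the public-key digital signature mentioned above, because the Python standard library offers no public-key signing; the tag length, the `-` separator and the function names are all assumptions.

```python
import hashlib
import hmac


def sign_label(digest: str, key: bytes, tag_len: int = 16) -> str:
    """Append a truncated HMAC tag to the hash so the label stays under 63 chars."""
    tag = hmac.new(key, digest.encode("utf-8"), hashlib.sha256).hexdigest()[:tag_len]
    return f"{digest}-{tag}"


def verify_label(label: str, key: bytes, tag_len: int = 16) -> bool:
    """Central facility side: accept the report only if the tag matches."""
    digest, sep, tag = label.rpartition("-")
    if not sep:
        return False
    expected = hmac.new(key, digest.encode("utf-8"), hashlib.sha256).hexdigest()[:tag_len]
    # Constant-time comparison on bytes avoids timing leaks.
    return hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8"))
```

A shared secret must be provisioned to every reporting device, which a true public-key signature would avoid; a full signature, however, is usually too long to fit comfortably within a DNS name.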
As viewed from the central security provider for a number of different networks, the invention includes receiving the DNS requests from the various networks, determining the frequencies of transfers of different data blocks based on the reception of DNS requests that include hashes indicative of the different data blocks, and determining adaptive security measures at least partially on the basis of determinations of the frequencies of transfers. The adaptive security measures may be implemented as a step of forwarding security rules or definitions to the different networks in order to block subsequent occurrences of transfers of specific blocks of data into or out of the networks. In an email security application, the rules and definitions may be forwarded to spam filters of the various networks. However, the invention may be applied to other security applications and may be applied within a single network (e.g., a wide area network (WAN)) that does not require connection to the Internet.
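The frequency determination at the central facility can be pictured with a small Python sketch. The class name `FrequencyCollector`, the fixed threshold, and the rule that a hash is flagged once its count reaches that threshold are illustrative assumptions; the text does not specify how a spike is detected.

```python
from collections import Counter


class FrequencyCollector:
    """Count reporting requests per hash and flag hashes that spike."""

    def __init__(self, zone: str = "barracuda.com", threshold: int = 3):
        self.suffix = "." + zone
        self.threshold = threshold  # assumed fixed spike threshold
        self.counts = Counter()

    def receive(self, query_name: str) -> bool:
        """Record one request; return True once the hash reaches the threshold."""
        # DNS names are case-insensitive and may carry a trailing dot.
        name = query_name.rstrip(".").lower()
        if not name.endswith(self.suffix):
            return False  # not a reporting request for this zone
        digest = name[: -len(self.suffix)]
        self.counts[digest] += 1
        return self.counts[digest] >= self.threshold

    def flagged(self):
        """Return the hashes whose counts have reached the threshold."""
        return sorted(d for d, n in self.counts.items() if n >= self.threshold)
```

The flagged hashes are the candidates for which updated rules or definitions would be forwarded to the reporting networks.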
A network that is adapted to use the invention may include all of the conventional components of a network, but will include an “algorithm component” specific to generating the digital fingerprints. This algorithm component may be implemented in software, dedicated hardware, or a combination of software and hardware. Again referring to the email security application, a network email security device, such as a spam filter, may include the algorithm component and a reporting component that forms and transmits DNS requests that include the digital fingerprints. Because the reporting information is contained within a DNS request, the information will exit even a network having highly protective rules applied at a chokepoint firewall.
While the invention has been described primarily with reference to collecting data for purposes of network security, the use of DNS requests as reporting vehicles for data collection may be used in other applications. At each of a number of different nodes, information that is determined to be relevant to the data collection is embedded within a DNS request in a format consistent with the protocol for transmissions. Unlike other DNS requests transmitted from the nodes, the information-reporting DNS requests are forwarded for purposes of enabling the data collection. An advantage of the use of DNS requests as the reporting vehicles is that the transfer of information is less likely to be blocked by rules applied at network security devices, such as chokepoint firewalls. As previously described, there may be a need to incorporate a unique feature into the DNS requests from a particular node, so that the DNS requests are not satisfied by operation of a local DNS server. Merely as two possibilities, the unique feature may be a time (date) stamp or may be an incremented count for duplicate DNS requests from the particular node.
With reference to
The central security provider 10 is connected to the different networks 12, 14 and 16 via the global communications network referred to as the Internet 18, so as to allow updates in response to detecting new security breaches. The invention will be described with reference to its application via the Internet, but may be used within connectivity environments (e.g., WANs) that include exchanges of DNS requests which do not require the Internet. Each network includes a firewall 20, 22 and 24 which functions as a chokepoint on the network. The firewall utilizes a set of rules to determine if access to or from the network should be allowed or denied.
A typical network 12 is shown in
When a person at one of the user devices 26, 28 and 30 sends or receives an email message or accesses a website, the Domain Name Service (DNS) is invoked. The location of a website or node on the Internet is identified by its IP address. The person attempting to reach the website may initiate contact by sending a request using the IP address of the website. The IP address is a long and awkward numerical address that is difficult to remember (up to twelve digits in four groups separated by three dots). However, a domain name may be used in place of the IP address. A Uniform Resource Locator (URL) is based on a domain name with the protocol specified (e.g., http://www.mywebsite.com). The domain name portion of the URL is translated into the appropriate IP address by the DNS. Therefore, a user request for access to a website is normally a combination of two requests. The first request is the DNS request to perform the translation of the domain name portion of the URL to the appropriate IP address. The second request is sent to the actual IP address of the requested URL.
In
As previously described, a concern with a networking environment such as that shown in
A spam blocker 46 is a conventional component. In the preferred embodiment, the spam blocker is responsive to the central security provider 10 for updating rules and definitions. Spam blocking techniques include word filtering, rule-based scoring, the use of allowable IP addresses (white lists), the use of restricted IP addresses (black lists), and Bayesian filtering. With regard to the invention, the spam blocker additionally (or alternatively) applies techniques for combating malware.
Data blocks which are not in violation of any of the rules may be processed in parallel. The email security device represented in
While the input 44, the spam blocker 46 and the output 48 may be conventional components, the hash generator 50 is an algorithm component that is unconventional to security devices such as email spam filters. Nevertheless, features of this component are closely related to teachings within U.S. Pat. No. 6,330,590 to Cotten. The hash generator is used to provide an indicator that is at least partially based on contents of a data block received at the input 44. The data block may be an attachment to an email message or an embedded image, but other applications are contemplated. As one possibility, the hash generator 50 may apply an MD5. As is known in the art, an MD5 is an algorithm that may be used to verify data integrity. However, as used in the present invention, the algorithm is used for data identification. Other algorithms which provide a “digital fingerprint” may also be used. The digital fingerprint is not necessarily unique in the manner that a human fingerprint is unique to a particular person, but the digital fingerprint has a very high likelihood of uniquely identifying the data block.
A DNS request component 52 is at the output of the hash generator 50. The DNS request component forms a DNS request having a conventional format. However, the DNS request includes the indicator of the data block. By way of example, if the hash generator executes a checksum (other than MD5) which provides an indicator of 2978546CDFADBE, a DNS query may be 2978546CDFADBE.barracuda.com. The second portion of this DNS query ensures that the request is properly directed. The DNS request is then forwarded from the network in the usual manner.
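To show what "a conventional format" means on the wire, the following Python sketch encodes the example name as a standard DNS query message: a 12-byte header followed by one question asking for an A record in the IN class. The function name `build_query` and the fixed query identifier are assumptions; the byte layout is that of ordinary DNS queries.

```python
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Encode a standard DNS query (type A, class IN) for the given name."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each DNS label must be 1 to 63 bytes")
        qname += bytes([len(raw)]) + raw  # length-prefixed label
    qname += b"\x00"  # root label terminates the name
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


packet = build_query("2978546CDFADBE.barracuda.com")
```

The resulting bytes are indistinguishable in form from any other lookup, which is why a chokepoint firewall that permits DNS has no structural basis for treating the report differently.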
In some applications, a third portion of DNS requests in accordance with the invention may be beneficial or even required. In
The method steps will be described briefly with reference to
The invention may also include the step 61 of receiving an “enforcement response” to the DNS request that was transmitted at step 60. Rather than the standard response to the DNS request, the enforcement response may include instructions. Thus, a central security provider may provide immediate instructions to the network, rather than being limited to providing updates to a set of rules applied at the network. Possible enforcement responses include instructions to block transmissions to or from a particular IP address, instructions to block transmission of a particular file, instructions to defer a determination, or instructions to enable transmissions.
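The text does not specify how instructions are carried in an enforcement response. Purely as an assumption, the following Python sketch maps the address returned in the answer to one of the four actions listed above, in the style of DNS blocklist return codes; the specific addresses, the `Action` names and the `interpret` helper are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    ALLOW = "enable transmissions"
    BLOCK_FILE = "block transmission of a particular file"
    BLOCK_ADDRESS = "block transmissions to or from an IP address"
    DEFER = "defer a determination"


# Hypothetical mapping from the answered address to an instruction.
RESPONSE_CODES = {
    "127.0.0.1": Action.ALLOW,
    "127.0.0.2": Action.BLOCK_FILE,
    "127.0.0.3": Action.BLOCK_ADDRESS,
    "127.0.0.4": Action.DEFER,
}


def interpret(answer: Optional[str]) -> Action:
    """Translate a response into an action; unknown or missing answers defer."""
    return RESPONSE_CODES.get(answer, Action.DEFER)
```

Defaulting to deferral means that a lost or unrecognized response never causes a block on its own.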
The method of
In
In addition to the DNS data collector 62 and the data analysis component 64,
The invention has been described as being one in which a DNS request is formulated and transmitted each time that the algorithm is applied to a data block. As one alternative, each network may be configured to accumulate a preliminary count which is systematically transferred to the central security provider 10. In order to ensure that the systematic transfer of the subtotal is not blocked by the firewall of the network, the information is reported via a DNS request. The difference is that the DNS request is representative of both the block of data and the subtotal. The systematic transfer may be based on time (e.g., a transfer each hour) or on reaching a threshold number.
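The subtotal alternative can be sketched in Python as a small accumulator that reports when either a threshold count or a time limit is reached. The class name `SubtotalReporter`, the `n<count>` label carrying the subtotal, and the injected `send` and `clock` callables are assumptions made for illustration.

```python
import time
from collections import Counter


class SubtotalReporter:
    """Accumulate per-hash counts and report them in batches."""

    def __init__(self, send, max_count=100, max_age=3600.0,
                 zone="barracuda.com", clock=time.monotonic):
        self.send = send            # callable that transmits one DNS name
        self.max_count = max_count  # threshold number of transfers
        self.max_age = max_age      # threshold time in seconds
        self.zone = zone
        self.clock = clock
        self.counts = Counter()
        self.started = clock()

    def record(self, digest: str) -> None:
        """Count one transfer; flush when either threshold is reached."""
        self.counts[digest] += 1
        too_many = self.counts[digest] >= self.max_count
        too_old = self.clock() - self.started >= self.max_age
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        """Send one reporting name per hash, carrying its subtotal."""
        for digest, n in self.counts.items():
            self.send(f"n{n}.{digest}.{self.zone}")
        self.counts.clear()
        self.started = self.clock()
```

In this sketch the time limit is only checked when a transfer is recorded, so an idle network reports nothing until its next transfer; a periodic timer calling `flush` would be needed for strictly hourly reports.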
Rather than including a single hash (or other type of digital fingerprint representative of a specific data block), the DNS requests that are used to report data may include a string of hashes representative of different data blocks. Thus, if a number of data blocks are received, the algorithm may be applied to the individual data blocks, but the accumulation of hashes may be incorporated into a single DNS request. In one application, the “aggregated” DNS request may be a string of the different hashes generated for a number of independent email attachments for a single email message. However, aggregated DNS requests may include hashes that report blocks of data that are not related in any manner.
As another alternative in the use of the invention, the data collection may be unrelated to providing network security. That is, the ability of a node to transmit DNS requests as reporting vehicles may be used in other applications that require or benefit from data collection from distributed sources. Because the information is reported in the format of DNS requests, the information is less likely to be blocked as a consequence of the application of security rules at networks that include one or more of the sources.
Claims
1. A method of monitoring data traffic comprising:
- detecting occurrence of a transfer of a block of data with respect to a network node;
- generating an indicator that is specifically related to contents of said block of data; and
- reporting said transfer, including utilizing said indicator in a Domain Name Service (DNS) request.
2. The method of claim 1 wherein generating said indicator includes applying a particular algorithm to said block of data to define said indicator as a digital fingerprint that is a function of said algorithm.
3. The method of claim 2 wherein generating said indicator includes outputting a hash as a consequence of applying said algorithm, said reporting including forming said DNS request to include said hash.
4. The method of claim 1 wherein said reporting includes transmitting said DNS request to a remote site via a global communications network, thereby enabling said remote site to determine a count of occurrences of transfers of said block of data.
5. The method of claim 4 further comprising receiving instructions from said remote site as a response to said DNS request, said instructions being relevant to processing of said data traffic being monitored.
6. The method of claim 4 wherein said remote site is maintained by a central security provider enabled to select and implement corrective action on a basis of said count of occurrences of transfers.
7. The method of claim 6 wherein generating said indicator is executed at one of a plurality of independent networks that are enabled to exchange data with said central security provider, said networks using a same algorithm to generate hashes upon said occurrences of transfers of blocks of data.
8. The method of claim 7 further comprising enabling said central security provider to receive DNS requests from each said network, wherein at least some of said DNS requests include said hashes.
9. The method of claim 1 further comprising combining a plurality of different said indicators to form an aggregated said DNS request.
10. The method of claim 1 wherein detecting said transfer is specific to monitoring email transmissions.
11. The method of claim 10 wherein said block of data is an attachment of an email message.
12. The method of claim 10 wherein said block of data is an image which is a component of an email message.
13. The method of claim 10 wherein said indicator is generated to identify Uniform Resource Locators (URLs) detected within said email transmissions.
14. The method of claim 10 wherein said indicator is generated to identify IP addresses relevant to said email transmissions.
15. The method of claim 1 wherein said indicator is generated to identify an IP address of a source of said block of data.
16. A method of providing security for a plurality of networks comprising:
- receiving Domain Name Service (DNS) requests originating from said networks, including DNS requests that include hashes determined at said networks by applications of an algorithm to transferred data blocks;
- determining frequencies of transfers of different data blocks based on receiving said DNS requests that include different said hashes; and
- forwarding security updates to said networks at least partially on a basis of determinations of said frequencies.
17. The method of claim 16 wherein forwarding said security updates relates to updating email security rules for application by spam filters of said networks.
18. The method of claim 17 wherein said hashes are formed upon applying said algorithm to components of emails exchanged via the Internet, said components including attachments and embedded images.
19. The method of claim 16 wherein receiving said DNS requests includes identifications of domain names containing said hashes.
20. The method of claim 16 wherein at least some of said DNS requests include indications of counts of transfers of said data blocks at individual said networks.
21. The method of claim 16 wherein each said DNS request includes a digital signature verifying the source of said DNS request, thereby enabling authentication of authorization to affect determinations of said frequencies.
22. A network comprising:
- a plurality of user devices;
- a network email server configured to enable email exchanges to and from said user devices;
- a network email security device configured to filter said email exchanges, said network email security device including an algorithm component specific to generating digital signatures for components of email messages, said network email security device having a reporting component specific to forming and transmitting domain names that include said digital signatures; and
- a network firewall connected along a path from the Internet and each of said user devices and said network email security devices.
23. The network of claim 22 wherein said reporting component is configured to transmit said domain names as DNS requests.
24. The network of claim 22 wherein said algorithm component is configured to generate a hash for data blocks that are transferred in said email exchanges, said data blocks including images and file attachments.
25. The network of claim 22 wherein said reporting component is configured to transmit said domain names to a central security provider, said network email security device being responsive to security updates received from said central security provider.
26. A method of collecting data from a plurality of nodes comprising:
- at each of said nodes, determining information that is to be reported in order to enable data collection;
- utilizing DNS requests as reporting vehicles for transmitting said information via the Internet, including embedding said information within said DNS requests in a format consistent with a protocol for transmissions via said Internet and further including forwarding said DNS requests for purposes of enabling said data collection; and
- collecting said information as a consequence of said DNS requests.
27. The method of claim 26 wherein formatting said DNS requests includes incorporating a unique feature into said DNS requests from a particular one of said nodes, such that said DNS requests are not satisfied by operation of a local DNS server.
28. The method of claim 27 wherein said formatting utilizes time stamping to provide said unique feature.
29. The method of claim 27 wherein said formatting utilizes incorporating an incremented count for duplicate said DNS requests from said particular node.
Type: Application
Filed: Sep 24, 2007
Publication Date: Mar 26, 2009
Inventors: Zachary S. Levow (Mountain View, CA), Joseph Wilson Evans (Santa Clara, CA)
Application Number: 11/903,605
International Classification: G06F 15/177 (20060101); G06F 21/00 (20060101); H04L 9/00 (20060101);