METHOD AND APPARATUS FOR NETWORK PACKET CAPTURE DISTRIBUTED STORAGE SYSTEM
This invention comprises a method and apparatus for an Infinite Network Packet Capture System (INPCS). The INPCS is a high performance data capture recorder capable of capturing and archiving all network traffic present on a single network or multiple networks. The device can be attached to Ethernet networks via copper or SX fiber, through either a SPAN port (101) router configuration or an optical splitter (102). By this method, multiple sources of network traffic, including gigabit Ethernet switches (102), may provide parallelized data feeds to the capture appliance (104), effectively increasing collective data capture capacity. Multiple captured streams are merged into a consolidated time indexed capture stream to support asymmetrically routed network traffic as well as other merged streams for external consumption.
This is an accelerated examination of application Ser. No. 11/632,249 titled METHOD AND APPARATUS FOR NETWORK PACKET CAPTURE DISTRIBUTED STORAGE SYSTEM, filed Dec. 16, 2005, which claims the benefit of U.S. Provisional Application No. 60/638,707, filed on Dec. 23, 2004. These applications are incorporated herein by reference.
BACKGROUND
The present invention relates to capturing and archiving computer network traffic. Networks allowing computer users to communicate and share information with one another are ubiquitous in business, government, educational institutions, and homes. Computers communicate with one another through small and large local area networks (LANs) that may be wireless or based on hard-wired technology such as Ethernet or fiber optics. Most local networks have the ability to communicate with other networks through wide area networks (WANs). The interconnectivity of these various networks ultimately enables the sharing of information throughout the world via the Internet. In addition to traditional computers, other information sharing devices may interact with these networks, including cellular telephones, personal digital assistants (PDAs) and other devices whose functionality may be enhanced by communication with other persons, devices, or systems.
The constant increase in the volume of information exchanged through networks has made network management both more important and more difficult. Enforcement of security, audit, policy compliance, and network performance and use analysis policies, as well as data forensics investigations and general management of a network, may require access to prior network traffic. Traditional storage systems, generally based on magnetic hard disk drive technology, have not been able to keep pace with expanding network traffic loads due to speed and storage capacity limitations. Use of arrays of multiple hard disks increases speed and capacity, but even the largest arrays based on traditional operating system and network protocol technologies lack the ability to monolithically capture and archive all traffic over a large network. Capture and archive systems based on current technologies also become part of the network in which they function, rendering them vulnerable to covert attacks or “hacking” and thus limiting their security and usefulness as forensic and analytical tools.
To overcome these limitations, a robust network packet capture and archiving system must utilize the maximum capabilities of the latest hardware technologies and must also avoid the bottlenecks inherent in current technologies. Using multiple gigabit Ethernet connections, arrays of large hard disk drives, and software that by-passes traditional bottlenecks by more direct communication with the various devices, it is possible to achieve packet capture and archiving on a scale capable of handling the traffic of the largest networks.
SUMMARY
The present invention describes an Infinite Network Packet Capture System (INPCS). The INPCS is a high performance data capture recorder capable of capturing and archiving all network traffic present on a single network or multiple networks. The captured data is archived onto a scalable, infinite, disk based LRU (least recently used) caching system at multiple gigabit (Gb) line speeds. The INPCS has the ability to capture and stream to disk all network traffic on a gigabit Ethernet network and allows this stored data to be presented as a Virtual File System (VFS) to end users. The file system facilitates security, forensics, compliance, analytics and network management applications. The INPCS also supports this capability via T1/T3 and other network topologies that utilize packet based encapsulation methods.
The INPCS does not require the configuration of a protocol stack, such as TCP/IP, on the network capture device. As a result, the INPCS remains “invisible” or passive and thus is not detectable or addressable from the network devices being captured. Being undetectable and unaddressable, the INPCS enhances security and forensic reliability, as it cannot be modified or “hacked” from external network devices or directly targeted for attack from other devices on the network.
The INPCS also provides a suite of tools and exposes the captured data in time sequenced playback, as a virtual network interface or virtual Ethernet device, as a regenerated packet stream to external network segments, and as a VFS file system that dynamically generates industry standard LIBPCAP (TCPDUMP) file formats. These formats allow the captured data to be imported into any currently available or custom applications that support LIBPCAP formats. Analysis of captured data can be performed on a live network via INPCS while the device is actively capturing and archiving data.
In its basic hardware configuration, the INPCS platform is a rack-mountable device capable of supporting large arrays of RAID 0/RAID 5 disk storage with high performance Input/Output (I/O) system architectures. Storage of high-density network traffic is achieved by using copy-less Direct Memory Access (DMA). The INPCS device can sustain capture and storage rates of over 350 MB/s (megabytes per second). The device can be attached to Ethernet networks via copper or fiber, through either a SPAN port router configuration or an optical splitter. The INPCS also supports the ability to merge multiple captured streams of data into a consolidated time indexed capture stream to support asymmetrically routed network traffic as well as other merged streams for external access, facilitating efficient network management, analysis, and forensic uses.
The INPCS software may be independently used as a standalone software package compatible with existing Linux network interface drivers. This offering of the INPCS technology provides a lower performance metric than that available in the integrated hardware/software appliance but has the advantage of being portable across the large base of existing Linux supported network drivers. The standalone software package for INPCS provides all the same features and application support as the appliance offering described above, but does not provide the high performance disk I/O and copy-less Direct Memory Access (DMA) switch technology of the integrated appliance.
Captured network traffic can be exposed to external appliances and devices or appropriate applications running on the INPCS appliance utilizing three primary methods: a VFS file system exposing PCAP formatted files, a virtual network interface (Ethernet) device, and a regenerated stream of packets to external network segments feeding external appliances. The INPCS file system acts as an on-disk LRU (least recently used) cache that recycles the oldest captured data when the store fills, allowing continuous capture to occur with the oldest data either being recycled and overwritten or transferred to external storage for permanent archive of captured network traffic. This architecture allows for an infinite capture system. The captured packets in the on-disk store at any given time represent a view in time of all packets captured, from the oldest packets to the newest. By increasing the capacity of the disk array, a system may be configured to allow a predetermined time window on all network traffic from a network of a predetermined traffic capacity. For example, a business, government entity, or university can configure an appliance with sufficient disk array storage to allow examination and analysis of all traffic during the prior 24 hours, 48 hours, or any other predetermined time frame.
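As a rough illustration of how a disk array might be sized for such a retention window, the following C sketch computes the required capacity from a sustained capture rate; the 350 MB/s figure is the sustained rate cited above, and any other rate or window may be substituted.

    #include <stdio.h>

    /* Illustrative sizing only: capacity needed to retain a time window of
     * traffic at a sustained capture rate (rate and window are parameters). */
    int main(void)
    {
        const double rate_mb_per_sec = 350.0;  /* sustained capture rate (MB/s) */
        const double window_hours    = 24.0;   /* desired retention window */
        double required_tb = rate_mb_per_sec * window_hours * 3600.0
                             / (1024.0 * 1024.0);   /* MB -> TB */
        printf("Retaining %.0f hours at %.0f MB/s requires ~%.1f TB of store\n",
               window_hours, rate_mb_per_sec, required_tb);
        return 0;
    }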
Other features and advantages of the present invention will be apparent from reference to a specific embodiment of the invention as presented in the following Detailed Description taken in conjunction with the accompanying Drawings.
The INPCS is a high performance data capture recorder capable of capturing all network traffic present on a network or on multiple networks and archiving the captured data on a scalable, infinite, disk based LRU (least recently used) caching system, as is known in the art, at multiple gigabit (Gb) line speeds. INPCS has the ability to capture and stream to disk all network traffic on a gigabit Ethernet network and to present the data as a Virtual File System (VFS). End users may access information by retrieving it from the VFS to facilitate network security, forensics, compliance, analytics and network management applications as well as media applications utilizing video or audio formats. INPCS also supports this capability via T1/T3 and other topologies known in the art that utilize packet based encapsulation methods.
The INPCS does not require the configuration of a protocol stack, such as TCP/IP, on the capture network device. This makes the INPCS “invisible” or passive and not addressable from the capture network segment. In this way, the device cannot be targeted for attack since it cannot be addressed on the network. The INPCS also provides a suite of tools to retrieve the captured data in time sequenced playback, as a virtual network interface or virtual Ethernet device, as a regenerated packet stream to external network segments, or as a VFS that dynamically generates LIBPCAP (packet capture file format), TCPDUMP (TCP protocol dump file format), CAP, CAZ, and other industry standard formats that can be imported into any appropriate application that supports these formats. LIBPCAP is a system-independent interface for user-level packet capture that provides a portable framework for low-level network monitoring. Applications include network statistics collection, security monitoring, and network debugging. The INPCS allows analysis of captured data while the device is actively capturing and archiving data.
The merged data stream is archived to an FC-AL SAN (Fibre Channel Arbitrated Loop Storage Area Network), as is known in the art, via the FC-AL switch (105).
The INPCS platform is a UL/TUV and EC certified platform and is rated as a Class A FCC device. The INPCS unit also meets TUV-1002, 1003, 1004, and 1007 electrostatic discharge immunity requirements and EMI immunity specifications. The INPCS platform allows console administration via SSH (Secure Shell access) as well as through an attached tty serial console on the primary serial port, ensuring a secure connection to the device. The unit supports hot swapping of disk drives and dynamic failover of IDE devices via a RAID 5 fault tolerant configuration. The unit also supports a high performance RAID 0 array configuration for supporting dual 1000 Base T (1 Gb) stream to disk capture.
Captured network traffic stored on the SAN can be exposed to external appliances and devices or appropriate applications running on the INPCS appliance utilizing three primary methods: a VFS file system exposing PCAP formatted files, a virtual network interface (Ethernet) device and through a regenerated stream of packets to external network segments feeding external appliances. The INPCS file system acts as an on-disk LRU (least recently used) cache and recycles the oldest captured data when the store fills and allows continuous capture to occur with the oldest data either being recycled and overwritten or transferred to external storage for permanent archive of captured network traffic. This architecture allows for an infinite capture system.
In the VFS file system, files are dynamically generated by an implemented Linux VFS, known in the art, that resides on top of the disk LRU that INPCS employs to capture network traffic to the disk. Since INPCS presents data via a standard VFS, the data can be easily imported or accessed by applications or exported to other computer systems using network standards such as scp (secure copy), HTTPS (secure Hyper Text Transport Protocol), SMB (Microsoft's Server Message Block protocol), or NFS (the Unix Network File System protocol). This allows the INPCS device to be installed in a wide range of disparate networking environments. Additionally, exposing the captured network traffic through a file system facilitates transfer or backup to external devices including data tapes, compact discs (CD), and data DVDs. A file system interface for the captured traffic allows for easy integration into a wide range of existing applications that recognize and read such formats.
The INPCS allows the archived data to be accessed as a Virtual Network Interface using standard Ethernet protocols. Many security, forensics and network management applications have interfaces that allow them to open a network interface card directly, bypassing the operating system. This allows the application to read packets in their “raw” form from the network segment indicated by the opened device. The INPCS virtual network interface device may be mapped onto the captured data store such that the stored data appears to the operating system as one or more physical network devices and the time-stamped stored data appears as if it were live network traffic. This allows existing applications to mimic their inherent direct access to network interface devices but with packets fed to the device from the captured packets in the INPCS. This architecture allows for ready integration with applications that are designed to access real-time network data, significantly enhancing their usability by turning them into tools that perform the same functions with historical data.
The Virtual Network Interface also allows analysts to configure the behavior of the INPCS virtual Ethernet device to deliver only specific packets desired. For example, since the INPCS device is a virtual device, a user may program its behavior. Tools are provided whereby only packets that match a programmed filter specification (such as by protocol ID or time domain) are delivered. Additionally, while physical Ethernet devices that are opened by an application are rendered unavailable to other applications, the virtual interface employed by INPCS allows multiple applications to read from virtual devices (which may be programmed to select for the same or different packet subsets) without mutual exclusion and without any impact on real-time network performance.
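For illustration only, a minimal sketch in C of such a filter specification; the structure, field names, and match function are assumptions and not taken from the INPCS implementation.

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical filter for a programmable virtual interface: deliver only
     * packets whose IP protocol ID matches and whose capture timestamp falls
     * inside a configured time domain. */
    struct vif_filter {
        uint8_t  protocol_id;   /* e.g. 6 = TCP, 17 = UDP */
        uint32_t start_utc;     /* inclusive start of time domain (UTC seconds) */
        uint32_t end_utc;       /* inclusive end of time domain (UTC seconds) */
    };

    static bool vif_filter_match(const struct vif_filter *f,
                                 uint8_t pkt_protocol, uint32_t pkt_utc)
    {
        return pkt_protocol == f->protocol_id &&
               pkt_utc >= f->start_utc && pkt_utc <= f->end_utc;
    }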
While it may be used to examine historical data, the virtual interface capability also enables near real time monitoring of captured data. It provides applications with a large network buffer that runs concurrently with full data archiving and capture, delivering alerts and live network analysis with none of the packet loss that typically occurs when such applications analyze packets on congested networks as standalone applications.
The INPCS also facilitates data access through regeneration. Captured packets in the INPCS store can be re-transmitted to external devices on attached network segments. This allows for a “regeneration” of packets contained in the store to be sent to external appliances, emulating the receipt of real-time data by such appliances or applications. The INPCS includes tools to program the behavior of regeneration. For instance, packets can be re-transmitted at defined packet rates or packets that meet particular predetermined criteria can be excluded or included in the regenerated stream.
External appliances receiving packets regenerated to them by the INPCS appliance are unaware of the existence of the INPCS appliance, thus integration with existing or future appliances is seamless and easy, including applications where confidentiality and security are of paramount importance.
This regeneration method also facilitates “load balancing” by retransmitting stored packet streams to external devices that may not be able to examine packets received into the INPCS appliance at the real-time capture rate. Additionally, this method can make external appliances more productive by presenting only those packets that a user determines are of interest to the current analysis. Regeneration has no impact on the primary functions of the INPCS, as it can be accomplished while the INPCS appliance continues to capture and store packets from defined interfaces.
The INPCS file system acts as an on-disk LRU (least recently used) cache, as is known in the art, and recycles the oldest captured data when the store fills, allowing continuous capture to occur with the oldest data either being recycled and overwritten or pushed out onto external storage for permanent archive of captured network traffic. This architecture allows for an infinite capture system. The captured packets in the on-disk store at any given time represent a view in time of all packets captured, from the oldest packets to the newest.
The INPCS software is implemented as loadable modules loaded into a modified Linux operating system kernel. These modules provide and implement the VFS, the virtual network device driver (Ethernet), and the services for regeneration of packets to external network segments, as described above. INPCS uses a proprietary file system and data storage. The Linux drivers utilized by the INPCS modules have also been modified to support a copyless DMA switch technology that eliminates all packet copies. Use of the copyless receive and send methodology is essential to achieving the desired throughput of the INPCS. Copyless sends allow an application to populate a message buffer with data before sending, rather than having the send function copy the data.
Captured packets are DMA (direct memory access) transferred directly from the network ring buffers into system storage cache without the need for copying or header dissection typical of traditional network protocol stacks. Similar methods are used for captured packets scheduled for writing to disk storage. These methods enable extremely high levels of performance, allowing packet data to be captured and then written to disk at speeds of over 350 MB/s, and support lossless packet capture on gigabit networks. This enables the INPCS unit to capture full line rate gigabit traffic without any packet loss of live network data. This architecture allows real time post analysis of captured data by applications such as the popular Intrusion Detection System (IDS) software Snort, without the loss of critical data (packets). Additionally, should further research be desired, such as for session reconstruction, the full store of data is available to facilitate error free reconstruction.
These methods are superior to the more traditional “sniffer” and network trigger model that would require users and network investigators to create elaborate triggers and event monitors to look for specific events on a network. With INPCS, since every network packet is captured from the network, sophisticated trigger and event monitor technology is rendered obsolete: analysis operations are simply a matter of post analysis of a large body of captured data. Thus, INPCS represents a new model in network troubleshooting and network forensics and analysis, since it allows analysts an unparalleled view of live network traffic and flow dynamics. Since the unit captures all network traffic, it is possible to replay any event in time which occurred on a network. The device creates, in essence, a monolithic “network buffer” that contains the entire body of network traffic.
In one embodiment, INPCS exposes the captured data via a VFS file system (DSFS) as PCAP files. The mounted DSFS file system behaves like traditional file systems, where files can be listed, viewed, copied and read. Since it is a file system, it can be exported via the Linux NFS or SMBFS to other attached network computers, which can download the captured data as a collection of time-indexed slot files or as consolidated capture files of the entire traffic on a network. This allows analysts the ability to simply copy those files of interest to local machines for local analysis. These captured PCAP files can also be written to more permanent storage, like a CD, or copied to another machine.
The INPCS File System (DSFS) also creates and exposes both time-replay based and real-time virtual network interfaces that map onto the captured packet data, allowing these applications to process captured data in real time from the data storage as packets are written into the DSFS cache system. This allows security applications, for instance, to continuously monitor captured data in real time and provide IDS and alert capability from an INPCS device while it continues to capture new network traffic without interruption. This allows existing security, forensics, compliance, analytics and network management applications to run seamlessly on top of the INPCS device with no software changes required to these programs, while providing these applications with a lossless method of analyzing all traffic on a network.
The INPCS unit can be deployed as a standalone appliance connected either via a Switched Port Analyzer (SPAN) port or via an optical splitter using standard LX or SX fiber optic connections. The unit also supports capture of UTP-based Ethernet at 10/100/1000 Mb line rates.
The INPCS unit can also be configured to support asymmetrically routed networks via dual SX fiber to gigabit Ethernet adapters with an optical splitter connecting the TX/RX ports to both RX ports of the INPCS device.
In SPAN configurations the INPCS unit is connected to a router, and the router is configured to mirror selected port traffic onto the port connected to the INPCS unit.
One distinct advantage of using a SPAN configuration relates to multi-router networks that host large numbers of routers in a campus-wide networked environment such as those that exist at universities or large business establishments. Routers can be configured to mirror local traffic onto a specific port and redirect this traffic to a central router bank that collects data on a campus-wide basis and directs it to a specific router that hosts an INPCS data recording appliance. This deployment demonstrates that even for a very large network utilizing gigabit Ethernet segments, this method is both deployable and practical. At a university of 30,000 or more students, with workstations and servers using Windows, Unix, Linux, and other operating systems, serving faculty, staff, labs and the like, average network traffic in and out of the university may be expected to run at a sustained rate of approximately 55 MB/s with peaks up to 80 MB/s across multiple gigabit Ethernet segments. A deployment of the INPCS appliance utilizing a SPAN configuration can be effected without noticeable impact on the network, and the INPCS can readily capture all network traffic at these rates and thus keep up with capture of all network traffic in and out of the university or a similarly sized enterprise.
The INPCS appliance can be configured to support capture of network traffic via an in-line optical splitter that diverts RX (receive) and TX (transmit) traffic in a configuration that feeds into two SX gigabit Ethernet adapters within the INPCS appliance.
There are further advantages related to support of asymmetric routing. In some large commercial networks RX and TX channels that carry network traffic between routers can be configured to take independent paths through the network fabric as a means of increasing the cross-sectional bandwidth of a network. Networks maintained in large financial markets, for example, may configure their networks in this manner. With this approach, it is required (in both the optical splitter configuration and in configurations involving SPAN port deployment) to re-integrate the captured traffic from one or more capture chains into a consolidated chain so that the network traffic can be reassembled and viewed in a logical arrival order.
The INPCS appliance supports both of these modes and also provides the ability to present the view of the captured network traffic as a merged and consolidated chain of captured packets.
The INPCS provides several utilities that allow configuration of virtual interfaces, starting and stopping data capture on physical adapters, mapping of virtual network interfaces onto captured data in the data store, and monitoring of network interfaces and capture data status. In addition, the entire captured data store is exported via a virtual file system that dynamically generates LIBPCAP files from the captured data as it is captured and allows these file data sets to be viewed and archived for review and forensic purposes by any network forensics programs that support the TCPDUMP LIBPCAP file formats for captured network traffic.
The DSCAPTURE utility configures and initiates capture of network data and also allows mapping of virtual network interfaces and selection of specific time domains based on packet index, date and time, or offset within a captured chain of packets from a particular network adapter or network segment.
The utility provides the following functions as they would appear in a command line environment:
The function DSCAPTURE INIT will initialize the INPCS capture store. DSCAPTURE START and DSCAPTURE STOP start and stop packet capture of network traffic, respectively, onto the local store based on network interface name. By default, Linux names interfaces eth0, eth1, eth2, etc. such that control code would resemble the following:
The DSCAPTURE MAP and DSCAPTURE MAP SHOW functions allow specific virtual network interfaces to be mapped from physical network adapters onto captured data located in the store. This allows SNORT, TCPDUMP, ARGUS, and other forensic applications known in the art to run on top of the INPCS store in a manner identical to their operation when running on a live network adapter. This facilitates the use of a large number of existing or custom-designed forensic applications to concurrently analyze captured traffic at near real-time performance levels. The virtual interfaces to the captured data emulating a live network stream will generate a “blocking” event when they encounter the end of a stream of captured data from a physical network adapter and wait until new data arrives. For this reason, these applications can be used in unmodified form on top of the INPCS store while traffic is continuously captured and streamed to these programs in real time with concurrent capture of network traffic to the data store, as shown in the following command line sequence:
The DSCAPTURE function also allows the mapping of specific virtual interfaces to physical interfaces as shown in the following command line sequence and display:
The DSCAPTURE MAP SHOW function will now display:
There are two distinct types of virtual network interfaces provided by INPCS: ifp<#> and ift<#> named virtual network interfaces. The ifp<#> named virtual interfaces provide the ability to read data from the data store at full rate until the end of the store is reached. The ift<#> named virtual interfaces provide time sequenced playback of captured data at the identical time windows the data was captured from the network. This second class of virtual network interface allows data to be replayed with the same timing and behavior exhibited when the data was captured live from a network source. This is useful for viewing and analyzing network attacks and access attempts as the original timing behavior is fully preserved. The DSCAPTURE function also allows the virtual network interfaces to be indexed into the store at any point in time, packet number, or data offset a network investigator may choose to review, as in the following command line sequence:
These commands allow the user to configure where in the stream the virtual interface should start reading captured packets. In a large system with over two terabytes of captured data, the investigator may only need to examine packets beginning at a certain date and time. This utility allows the user to set the virtual network interface pointer into the capture stream at a specific location. When the virtual device is then opened, it will begin reading packets from that location rather than from the beginning of the capture stream.
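As an illustration of how an unmodified LIBPCAP application can consume packets from such a virtual interface, the following C sketch opens a virtual device with the standard libpcap calls; the device name ifp0 follows the ifp<#> naming described above, and the packet-counting callback is purely illustrative.

    #include <pcap/pcap.h>
    #include <stdio.h>

    /* Count packets delivered by an INPCS virtual interface, exactly as an
     * application would from a physical adapter. */
    static void on_packet(u_char *user, const struct pcap_pkthdr *hdr,
                          const u_char *bytes)
    {
        (void)hdr;
        (void)bytes;
        ++*(unsigned long *)user;
    }

    int main(void)
    {
        char errbuf[PCAP_ERRBUF_SIZE];
        unsigned long count = 0;
        pcap_t *p = pcap_open_live("ifp0", 65535, 0, 1000, errbuf);
        if (!p) {
            fprintf(stderr, "pcap_open_live: %s\n", errbuf);
            return 1;
        }
        /* Blocks at end-of-store until new captured data arrives, so the
         * application can run continuously against the live capture. */
        pcap_loop(p, -1, on_packet, (u_char *)&count);
        pcap_close(p);
        return 0;
    }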
The DSMON utility allows monitoring of an INPCS device from a standard Linux console, a tty, or an xterm window connected to the device via serial port or SSH (Secure Shell login), as is known in the art. This program provides comprehensive monitoring of data capture status, captured data in the store, network interface statistics, and virtual interface mappings.
Described below are typical excerpts from several DSMON panels detailing some of the information provided by this utility to network administrators and forensic investigators from the INPCS appliance and standalone software package.
The INPCS data recorder exposes captured data via a custom Virtual File System (DSFS) that dynamically generates LIBPCAP formatted files from the slots and slot chains in the data store. This data can be accessed via any of the standard file system access methods allowing captured data to be copied, archived and reviewed or imported into any programs or applications that support the LIBPCAP formats. By default, the INPCS system exposes a new file system type under the Linux Virtual File System (VFS) interface as follows:
The DSFS registers as a device based file system and is mounted as a standard file system via the mount command under standard System V Unix systems and systems that emulate the System V Unix command structure. This file system can be exposed to remote users via such protocols as NFS, SAMBA, InterMezzo, and other remote file system access methods provided by standard distributions of the Linux operating system. This allows the DSFS file system to be remotely accessed from Windows and Unix workstation clients from a central location.
DSFS appears to the operating system and remote users as simply another type of file system supported under the Linux Operating System, as shown in the command line sequence below:
Only the underlying capture engine subsystem can write and alter data in the DSFS file system. Beyond the assignment of user permissions to specific files, DSFS prohibits alteration of the captured data by any user, including the system administrator. This ensures the integrity of the captured data for purposes of chain of custody should the captured data be used in criminal or civil legal proceedings where rules of evidence are mandatory.
By default, the DSFS file system is read only for users accessing the system from user space, and the Unix ‘df’ command will always report the store as inaccessible for writing, as shown in the following example of a command line sequence:
The DSFS File System is organized into the following directory structure:
By default, DSFS exposes captured slot chains in the root DSFS directory by adapter number and name in the system as a complete chain of packets that are contained in a LIBPCAP file. If the captured adapter contains multiple slots within a chain, the data is presented as a large contiguous file in PCAP format with the individual slots transparently chained together. These files can be opened either locally or remotely and read into any program that is designed to read LIBPCAP formatted data.
These master slot chains are in fact comprised of sub chains of individual slots that are annotated by starting and ending date and time. There are two files created by default for each adapter. One file contains the full payload of network traffic and the other file has been frame sliced. Frame slicing presents only the first 96 bytes of each captured packet; most network analysis software is concerned only with the network headers and not the associated payload data within a packet. Providing both files reduces the amount of data transferred remotely over a network during network analysis operations, since a frame sliced file is available for those applications that do not need the full network payload.
There are also several subdirectories that present the individual slots that comprise each slot chain represented in the root directory of the DSFS volume. These directories allow a more granular method of reviewing the captured data and are stored by slot and network adapter name along with the start and end capture times for the packets contained in each individual slot. A directory called “slots” is created that presents the full network payload of all packet data, and a directory called “slice” presents the same slot data in frame-sliced format. These slot files are also dynamically generated LIBPCAP files created from the underlying DSFS data store.
A SLOTS directory entry with individual slots for eth1 with full payload would appear as in the following command line sequence:
A SLICE directory entry with individual slots for eth1 with frame sliced payload would appear as follows:
These files can be imported into TCPDUMP or any other LIBPCAP based application from the DSFS File System, as follows:
The master slot chain files can also be imported from the root DSFS directory in the same manner and can be copied and archived as simple system files to local or remote target directories for later forensic analysis, as shown in the following command line example:
It is also possible to copy these files like any other system file for purposes of archiving captured network traffic using the following commands:
The DSFS “stats” directory contains text files that are dynamically updated with specific statistics information similar to the information reported through the DSMON utility. These files can also be opened and copied, thereby providing a snapshot of the capture state of the INPCS system for a particular time interval, as shown:
For example, the file slot.txt contains the current cache state of all slot buffers in the DSFS system and can be displayed and copied as a simple text file with the following command line sequence:
In addition, an existing “merge” directory allows files to be dynamically created to provide merged slot chains for support of asymmetric routed traffic and optical tap configurations of captured data.
All of the standard applications that support network interface commands can be deployed with INPCS through the use of virtual network interfaces.
The SNORT Intrusion Detection System can be run with no software changes on top of the INPCS data recorder through the same use of the virtual network interfaces provided by the INPCS appliance. Since the virtual interfaces block when they reach the end of store data, SNORT can run in the background in real time, reading from data captured and stored in an INPCS appliance as it accumulates. The procedure for invoking and initializing SNORT appears as shown in the following command line sequence and display:
The invention also allows rapid traffic regeneration of the captured data and retrieval of captured data via standard file system and network device interfaces into the operating system. This flexible design allows user space applications to access captured data in native file formats and native device support formats without the need for specialized interfaces and APIs (application programming interfaces).
Data is streamed from the capture adapters into volatile (memory) slot cache buffers via direct DMA mapping of the network adapter ring buffer memory and flushed into non-volatile (disk) storage as the volatile cache fills and overflows. Each slot cache segment is time based and has a start time, end time, size, and chain linkage meta tags, making slot cache segments self-annotated and self-describing units of storage of network traffic. As the slot cache storage system fills with fully populated slot cache segments, older segments in a slot chain are overwritten or pushed/pulled into long term archive storage.
The invention uses two primary disk partition types for the storage and archival of captured network traffic. These on-disk layouts facilitate rapid I/O transactions to the non-volatile (on-disk) storage cache for writing captured network traffic to disk. Three partition types are embodied in the invention: partition types 0x97, 0x98, and 0x99, as are known in the art.
Partition type 0x97 partitions are used by the system to store active data being captured from a live network medium. Partition type 0x98 partitions are long term storage used to archive captured network traffic into large on-disk library caches that can span up to 128 terabytes of disk storage for each primary capture partition. Type 0x97 partitions are described by a Disk Space Record header located on each partition.
The Disk Space Record Header describes the block size, partition table layout, and slot storage layout of a type 0x97 partition. The Disk Space Record Header uses the following on-disk structure to define the storage extents of either a type 0x97 or type 0x98 storage partition.
Disk Space Records also allow chaining of Disk Space Records from multiple type 0x97 or type 0x98 partitions based upon creation and membership ID information stored in a membership cluster map, which allows the creation of a single logical view of multiple type 0x97 partitions. This allows the system to concatenate configured type 0x97 partitions into stripe sets and supports data striping across multiple devices, which increases disk channel performance dramatically.
Disk Space Records also define the internal table layouts for meta-data and chaining tables used to manage slot cache buffer chains within a virtual Disk Space Record set. Disk Space records contain table pointers that define the tables used by the DSFS file system to present slot storage as logical files and file chains of slot storage elements.
Disk Space Record based storage divides the storage partition into contiguous regions of disk sectors called slots. Slots can contain from 16 up to 2048 64K blocks of 512 byte sectors, and these storage elements are stored to disk in sequential fashion. Slots are accessed via a sequential, location dependent numbering scheme starting at index 0 up to the number of slots that are backed by physical storage on a particular disk device partition. Each Disk Space Record contains a space table. The space table is a linear listing of structures that is always NUMBER_OF_SLOTS*sizeof(SPACE_TABLE_ENTRY) in size. The space table maintains size, linkage, and file attribute information for a particular slot and also stores the logical chaining and ownership of particular slots within a logical slot chain.
Virtual Cluster addresses are generated for stripe sets using the following algorithm:
The modulo of a cluster number relative to the number of stripe members is used as an index into a table of per-partition LBA offsets within the disk device partition table, which yields the relative LBA offset of the 64K cluster. Cluster numbers are divided by the number of stripe members to determine the physical cluster address and sector LBA offset within a particular stripe set partition.
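A sketch of this stripe set address translation in C; the structure and field names (stripe_set, base_lba, and so on) are illustrative assumptions rather than the actual on-disk layout.

    #include <stdint.h>

    #define SECTORS_PER_CLUSTER (65536 / 512)   /* 64K clusters of 512-byte sectors */

    struct stripe_set {
        uint32_t members;        /* number of partitions in the stripe set */
        uint64_t base_lba[32];   /* starting LBA of each member partition */
    };

    /* Pick the member partition with the cluster number modulo the member
     * count, divide by the member count for the physical cluster within that
     * member, then convert to an absolute sector (LBA) address. */
    static void map_virtual_cluster(const struct stripe_set *ss,
                                    uint64_t virtual_cluster,
                                    uint32_t *member, uint64_t *lba)
    {
        uint64_t physical_cluster;

        *member          = (uint32_t)(virtual_cluster % ss->members);
        physical_cluster = virtual_cluster / ss->members;
        *lba             = ss->base_lba[*member] +
                           physical_cluster * SECTORS_PER_CLUSTER;
    }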
This optimization allows all I/O requests to the disk layout to be coalesced into 4K page addresses in the disk I/O layer. All read and write requests to the disk device are performed through the I/O layers as a 4K page.
The Disk Space Record (DSR) will occupy the first cluster of an adjusted Disk Space Record partition. The DSR records the cluster offset into the virtual Disk Space Store of the location of the Space Table, and optionally for partition type 0x98, the Name and Machine Tables as well. There is also a cluster record that indicates where the slot storage area begins on a Virtual Disk Space Store Partition.
The DSR also contains a table of slot chain head and tail pointers. This table is used to create slot chains that map to physical network adapters that are streaming data to the individual slot chains. This table supports a maximum of 32 slot chains per Disk Space Record Store. This means that a primary capture partition type 0x97 can archive up to 32 network adapter streams concurrently per active Capture Partition.
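A sketch of what such a Disk Space Record header might look like in C, consistent with the fields described here (partition geometry, cluster offsets of the meta-data tables, and the 32-entry slot chain head/tail table); all names and widths are illustrative assumptions.

    #include <stdint.h>

    #define MAX_SLOT_CHAINS 32   /* the DSR supports up to 32 slot chains */

    struct slot_chain_entry {
        uint32_t head_slot;      /* first slot in the chain */
        uint32_t tail_slot;      /* last slot in the chain */
        uint32_t slot_count;     /* total slots in the chain */
        uint32_t start_utc;      /* UTC time of the oldest packet in the chain */
        uint32_t end_utc;        /* UTC time of the newest packet in the chain */
    };

    struct disk_space_record {
        uint32_t block_size;               /* device block size */
        uint32_t number_of_slots;          /* slots backed by this partition */
        uint32_t clusters_per_slot;        /* 16..2048 64K clusters per slot */
        uint32_t membership_id;            /* creation/membership for stripe sets */
        uint64_t space_table_cluster;      /* cluster offset of the Space Table */
        uint64_t name_table_cluster;       /* type 0x98 partitions only */
        uint64_t machine_table_cluster;    /* type 0x98 partitions only */
        uint64_t slot_data_start_cluster;  /* cluster where slot storage begins */
        struct slot_chain_entry chains[MAX_SLOT_CHAINS];
    };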
Type 0x98 Archive Storage Partitions employ a Name Table and a Machine Table that are used to store slots from primary capture partitions for long term storage and archive of network traffic and also record the host machine name and the naming and meta-tagging information from the primary capture partition. When slots are archived from the primary capture partition to a storage partition, the interface name and machine host name are added to the name table and the host name table on the archive storage partition. This allows multiple primary capture partitions to utilize a pool of archive storage to archive captured network traffic from specific segments into a large storage pool for archival and post capture analysis.
Archive storage can be mapped to multiple Network Capture Appliances as a common pool of slot segments. Archive storage pools can also be subdivided into storage zones with this architecture and tiered as a hierarchical cache, archiving network traffic from target segments for months or even years.
Individual Slot addresses are mapped to the Disk Space Store based upon partition size, number of slots, storage record cluster size, and reserved space based on the following algorithm:
The Start of slot data is the logical cluster address that immediately follows the last cluster of the space table for type 0x97 partitions and the last cluster of the machine table for type 0x98 partitions. Slots are read and written as a contiguous run of sectors to and from the disk storage device starting with the mapped slot cluster address derived from the slot number.
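A sketch of this slot-number-to-sector mapping in C; the parameter names are illustrative, and the 64K cluster of 512-byte sectors follows the description above.

    #include <stdint.h>

    /* Slot data begins at the cluster following the last meta-data table;
     * each slot occupies a fixed, contiguous run of 64K clusters. */
    static uint64_t slot_to_lba(uint64_t slot_data_start_cluster,
                                uint32_t clusters_per_slot,
                                uint32_t slot_number)
    {
        uint64_t cluster = slot_data_start_cluster +
                           (uint64_t)slot_number * clusters_per_slot;
        return cluster * (65536 / 512);   /* sectors per 64K cluster */
    }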
A slot defines a unit of network storage, and each slot contains a slot header and a chain of 64K clusters. The on-disk structure of a slot is identical to the in-memory cache structure, and both the in-memory and on-disk slot caches are viewed and treated by DSFS as specialized forms of LRU (least recently used) cache.
The slot header stores meta-data that describes the content and structure of a slot and its corresponding chain of 64K clusters.
The slot buffer header points to the first index:offset and the last index:offset pair within a slot segment buffer, and also contains a bitmap of buffer indexes that are known to contain valid slot data. These indexes are used by the I/O caching layer for reading sparse slots (slots not fully populated with network packet data) into memory efficiently.
Slot buffer sizes must match the underlying hardware in order for the algorithm to work properly. The high performance of this invention is derived from the technique described for filling of pre-load addresses into a network adapter device ring buffer. Network adapters operate by pre-loading an active ring or table on the adapter with memory addresses of buffers to receive incoming network packets. Since the adapter cannot know in advance how large a received packet may be, the pre-loaded addresses must be assumed to be at least as large as the largest packet size the adapter will support. The algorithm used by DSFS always requires that at least (PACKET_SIZE+1) bytes of free space be available for a pre-load buffer, since received packets can exceed the maximum packet size due to VLAN (Virtual LAN) headers generated by a network router or switch.
The network adapter allocates buffers from the DSFS slot cache into the adapter based upon the next available index:offset pair. The buffers are maintained as a linear list of index addresses that are cycled through during allocation, which allows all ring buffer entries to be pre-loaded from a buffer array (i.e. slot segment) in memory. The number of slot buffers must therefore be (NUMBER_OF_RING_BUFFERS*2) at a minimum in order to guarantee that as buffer elements are received and freed, the adapter will always obtain a new pre-load buffer without blocking on a slot segment that has too many buffers allocated for a given ring buffer.
Since ring buffer pre-load/release behavior is always sequential in a network adapter, this model works very well, and as the buffer chain wraps, the adapter ring buffer will continue to pre-load buffers as free-behind network packets are released to the operating system on receive interrupts.
As buffers are allocated from a slot cache element and pre-loaded into the adapter ring buffer memory, the buffer header is pinned in memory for that particular buffer, and subsequent allocation requests will skip this buffer until the pre-loaded element has been received from the adapter.
This is necessary because the size of the received buffer is unknown. It is possible to round robin allocate pre-load buffers to the maximum size (MTU—maximum transmission unit) of a network packet, however, this method wastes space. In the current invention, preloads pin buffer headers until receipt so that subsequent allocation requests to the buffer will use space more efficiently.
Slot buffers are allocated in a round-robin pattern from each buffer element in a slot buffer list.
The allocation algorithm is as follows:
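A sketch of one way this round-robin, pin-until-receive allocation could be expressed in C; the constants and structure names (MAX_PACKET_SIZE, slot_buffer, and so on) are illustrative assumptions, not the actual implementation.

    #include <stdint.h>
    #include <stddef.h>
    #include <stdbool.h>

    #define MAX_PACKET_SIZE   9216   /* illustrative: largest frame incl. VLAN headers */
    #define NUM_RING_BUFFERS  512    /* illustrative adapter ring depth */
    #define NUM_SLOT_BUFFERS  (NUM_RING_BUFFERS * 2)  /* minimum stated in the text */

    struct slot_buffer {
        uint8_t  data[64 * 1024];    /* one 64K buffer element of the slot segment */
        uint32_t used;               /* bytes consumed by received packets */
        bool     pinned;             /* pre-loaded into the adapter, size unknown */
    };

    /* Walk the buffer list round-robin, skip buffers pinned awaiting receipt,
     * and only hand out a buffer with at least MAX_PACKET_SIZE + 1 bytes free.
     * The address pre-loaded into the ring is b->data + b->used; the buffer
     * stays pinned until the adapter reports the true received length. */
    static struct slot_buffer *preload_alloc(struct slot_buffer bufs[],
                                             unsigned *next_index)
    {
        for (unsigned tried = 0; tried < NUM_SLOT_BUFFERS; tried++) {
            struct slot_buffer *b = &bufs[*next_index];
            *next_index = (*next_index + 1) % NUM_SLOT_BUFFERS;
            if (!b->pinned && sizeof(b->data) - b->used >= MAX_PACKET_SIZE + 1) {
                b->pinned = true;
                return b;
            }
        }
        return NULL;                 /* no buffer currently available */
    }

    /* On receive interrupt the actual length is known; unpin the buffer and
     * account for the space so later allocations pack it efficiently. */
    static void preload_complete(struct slot_buffer *b, uint32_t received_len)
    {
        b->used  += received_len;
        b->pinned = false;
    }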
The Disk Space Record contains a 32 entry slot chain table. The slot chain table defines the starting and ending slot identifiers for a chain of populated slot cache elements that reside in the non-volatile system cache (on-disk). The slot chain table also records the date extents for captured network packets that reside in the time domain that comprises the sum total of elapsed time between the starting and ending slot chain elements.
As slots are filled, each slot records the starting and ending time for the first and last packet contained within the slot cache element. Slots internally record time at microsecond granularity as well as UTC time for each received packet; however, within the Slot Chain and Space Table, only the UTC time is exported and recorded, since microsecond time measurement granularity is not required at these levels for virtual file system interaction.
The Slot Chain Table records the starting slot address for a slot chain, the ending slot address for a slot chain, the number of total slots that comprise a slot chain, and the starting and ending dates for a slot chain. The dates are stored in standard UTC time format in both the Slot Chain Table and the System Space Table.
The slot chain table is contained within these fields in the disk space record header:
The Space Table serves as the file allocation table for Slot Chains in the system.
The space table also stores meta-data used for dynamic file reconstruction that includes the number of packets stored in a slot cache element, the number of total packet bytes in a slot cache element, file attributes, owner attributes, meta-data header size, and the size of packet sliced bytes (96 byte default).
Space Table Entries use the following internal structure:
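A sketch of such a Space Table entry in C, holding the per-slot meta-data enumerated above; field names and widths are illustrative assumptions rather than the actual on-disk layout.

    #include <stdint.h>

    #define SLOT_EOF 0xFFFFFFFFu   /* end-of-chain marker in the next-slot field */

    /* One entry per slot: entry N describes slot N. */
    struct space_table_entry {
        uint32_t next_slot;        /* next slot in the chain, or SLOT_EOF */
        uint32_t packet_count;     /* packets stored in the slot */
        uint64_t packet_bytes;     /* total packet bytes in the slot */
        uint32_t start_utc;        /* UTC time of the first packet */
        uint32_t end_utc;          /* UTC time of the last packet */
        uint32_t file_attributes;  /* attributes used for dynamic file generation */
        uint32_t owner_attributes; /* owning interface / chain information */
        uint32_t metadata_size;    /* size of the slot's meta-data header */
        uint32_t slice_size;       /* frame-slice length, 96 bytes by default */
    };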
Space Table Linkages are created by altering the next slot field which corresponds to a slot on a Disk Space Record Store. The Space Table entries are sequentially ordered based on slot position within the store. Index 0 into the Space Table corresponds to slot 0, index 1 to slot 1, and so forth. Space Table information is mirrored in both a secondary Mirrored Space table, and also exists within the slot cache element header for a slot as well. This allows a Space Table to be rebuilt from slot storage even if both primary and secondary Space Table mirrors are lost and is provided for added fault tolerance.
The slot number address space is a 32-bit value, for which the total number of slot addresses in a unique disk space record store is expressed as:
(0xFFFFFFFF−1) = total number of slot addresses.
Value 0xFFFFFFFF is reserved as an EOF (end of file) marker for the Space Table next slot entry field, which allows a range of 0 to (0xFFFFFFFF−1) permissible slot addresses. Slot chains are created and maintained as a linked list, held in the Space Table, of slots that belong to a particular slot chain. The beginning and ending slots and their starting and ending time domain values are stored in the Slot Chain Table in the DSR, and the actual linkages between slots are maintained in the space table. During Space Table traversal, when the value 0xFFFFFFFF is encountered, this signals that the end of the chain has been reached.
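A sketch of this chain traversal in C, assuming the illustrative space_table_entry layout above: start at the chain's head slot from the Slot Chain Table and follow the next-slot links until the end-of-chain marker is reached.

    #include <stdint.h>
    #include <stdio.h>

    /* Uses struct space_table_entry and SLOT_EOF from the sketch above. */
    static void walk_slot_chain(const struct space_table_entry *space_table,
                                uint32_t head_slot)
    {
        for (uint32_t slot = head_slot; slot != SLOT_EOF;
             slot = space_table[slot].next_slot) {
            printf("slot %u: %u packets, %llu bytes\n", slot,
                   space_table[slot].packet_count,
                   (unsigned long long)space_table[slot].packet_bytes);
        }
    }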
The DSFS space table maintains an allocation table that employs positional chain elements in a forward linked list that describe a slot index within a DSFS file system partition. The Disk Space record stores the actual cluster based offset into a DSFS partition for meta-table and slot storage.
During normal operations in which a disk space record store has not been fully populated, slots are allocated based upon a bit table, built during DSR mount, that indicates the next free slot available on a particular DSR. As slots are allocated and the disk space record store becomes full, it becomes necessary to recycle the oldest slot cache elements from the store. Since the time domain information for a particular slot chain is stored in the Disk Space Record header, it is a simple matter to scan the 32 entries in the table and determine the oldest slot cache element referenced by a slot chain head. When the slot cache has become completely full, the oldest slot segment is pruned from the head of the target slot chain and re-allocated for storage from the volatile (in-memory) slot element cache.
The Slot Chain Heads are correspondingly updated to reflect the pruned slot and the storage is appended to the ending slot of the active slot chain that allocated the slot cache element storage.
During initial mounting and loading of a DSFS disk space record store, the store is scanned, space tables are scanned for inconsistencies, and the chain lengths and consistencies are checked. During this scan phase, the system builds several bit tables that are used to manage allocation of slot cache element storage and chain management. These tables allow rapid searching and state determinations of allocations and chain locations and are used by the DSFS virtual file system to dynamically generate file meta-data and LIBPCAP headers. These tables also enable the system to correct data inconsistencies and to restart rapidly after an incomplete shutdown.
The Space Tables are mirrored during normal operations on a particular DSR and checked during initial mounting to ensure the partition is consistent. The system also builds an allocation map based on those slots reflected to exist with valid linkages in the space table.
It is possible for a user space application to hold a slot open for a particular slot chain, and for the chain to re-cycle the slot underneath the user during normal operations. The slot chain bitmaps allow the DSFS virtual file system to verify a slot's membership in a chain before retrying the read with a known slot offset location.
The volatile (in-memory) slot element cache is designed as a memory based linked listing of slot cache elements that mirrors the slot cache element structure used on disk. The on-disk format is identical to the in-memory format that describes a slot cache element. This list is maintained through three sets of linkages that are combined within the slot buffer header for a slot cache element. The structure of a slot cache element is as follows:
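A sketch of such a slot buffer header in C, consistent with the description that follows (list linkages, index:offset extents, the bitmap of populated buffers, and the state flags used by the flush daemon); all names and widths are illustrative assumptions.

    #include <stdint.h>

    #define CLUSTERS_PER_SLOT 2048            /* maximum 64K clusters per slot */

    struct list_node { struct list_node *next, *prev; };

    struct slot_buffer_header {
        struct list_node all_link;            /* master allocation list */
        struct list_node lru_link;            /* LRU aging list */
        struct list_node hash_link;           /* slot hash list (user-space visible) */
        uint32_t  slot_id;                    /* slot address, or (uint32_t)-1 if evicted */
        uint32_t  state;                      /* L_DIRTY, L_POST, ... (states below) */
        uint32_t  first_index, first_offset;  /* first valid index:offset pair */
        uint32_t  last_index,  last_offset;   /* last valid index:offset pair */
        uint8_t   valid_bitmap[CLUSTERS_PER_SLOT / 8];  /* buffers holding valid data */
        uint8_t  *buffers[CLUSTERS_PER_SLOT]; /* chain of 64K cluster buffers */
        uint32_t  refcount;                   /* outstanding user-space references */
    };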
The slot buffer header that describes a slot cache element is a member of four distinct lists. The first list is the master allocation list. This list maintains a linkage of all slot buffer heads in the system. It is used to traverse the slot LRU listing for aging of slot requests and write I/O submission of posted slots. The slot buffer header also can exist in a slot hash listing.
The LRU list is used by DSFS to determine which slot buffer header was touched last. More recent accesses to a slot buffer header result in the slot buffer header being moved to the top of the listing. Slot cache elements that have valid data, have been flushed to disk, and have not been accessed tend to move to the bottom of this list over time. When the system needs to re-allocate a slot cache element and its associated slot buffer header for a new slot, for either a read or write request to the volatile slot LRU cache, the caching algorithm will select the oldest slot in memory that is not locked, has not been accessed, and has been flushed to disk, and return data from it. In the event of a read request from user space, if the slot does not exist in the slot hash listing, it is added, the oldest slot buffer header is evicted from the cache, and the slot is scheduled for read I/O in order to load the requested data for the user space reader.
Network adapters that are open and capturing network packets allocate an empty slot buffer header, which references a slot cache element and its associated buffer chain, from the LRU cache based on the allocation algorithm described above.
If a reader from user space accesses a slot buffer header and its associated slot cache element buffer chain during a recycle phase of a target slot, the slot LRU allows the network adapter at this layer to reallocate the same slot address in a unique slot buffer header and slot cache element. This process requires that the slot id be duplicated in the slot LRU until the last user space reference to a particular slot address is released. This event can occur if user space applications are reading data from a slot chain and the application reaches a slot in the chain that has been recycled due to the slot store becoming completely full. In most cases, since slot chains contain the most recent data at the end of a slot chain and the oldest data is located at the beginning of a slot chain, this is assumed to be an infrequent event.
The newly allocated slot chain element in this case becomes the primary entry in the slot hash list in the LRU, and all subsequent open requests are redirected to this entry. The previous slot LRU entry for this slot address is flagged with a −1 value and removed from the slot hash list that removes it from the user space portal view into the DSFS volatile slot cache. When the last reference to the previous slot buffer header is released from user space, the previous slot buffer header is evicted from the slot LRU and placed on a free list for reallocation by network adapters for writing or user space readers for slot reading by upper layer applications.
A single process daemon is employed by the operating system that is signaled via a semaphore when a slot LRU slot buffer header is dirty and requires the data content to be flushed to the disk array. This daemon uses the master slot list to peruse the slot buffer header chain to update aging timestamps in the LRU slot buffer headers, and to submit writes for posted LRU elements. By default, an LRU slot buffer header can have the following states:
Entries flagged as L_POST or L_REPAIR are written to non-volatile storage immediately. Entries flagged L_DIRTY are flushed at 30 second intervals to the system store. Meta-data updates to the Space Table for L_DIRTY slot buffer headers are synchronized with the flushing of a particular slot address. Slot buffer headers flagged L_LOADING are read requests utilizing asynchronous read I/O. L_HASHED means the slot address and slot buffer header are mapped in the slot hash list and are accessible by user space applications for open, read, and close requests.
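An illustrative encoding of these states and of the flush daemon's decision rule in C; the bit values and helper names are assumptions.

    #include <stdint.h>

    #define L_DIRTY    0x01   /* modified; flushed on the 30-second interval */
    #define L_POST     0x02   /* posted for write; flushed immediately */
    #define L_REPAIR   0x04   /* repair data; flushed immediately */
    #define L_LOADING  0x08   /* asynchronous read I/O in flight */
    #define L_HASHED   0x10   /* in the slot hash list, visible to user space */

    static int needs_immediate_flush(uint32_t state)
    {
        return (state & (L_POST | L_REPAIR)) != 0;
    }

    static int needs_interval_flush(uint32_t state)
    {
        return (state & L_DIRTY) != 0;   /* picked up on the 30-second pass */
    }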
The directory layouts are all accessible via open( ), read( ), write( ), lseek( ), and close( ) system calls. Slot chains are also exposed as virtual files and can be read with the same standard system calls to retrieve an entire slot chain of captured network traffic. LIBPCAP allows this data to be exported dynamically to a wide variety of user space applications and network forensics monitoring and troubleshooting tools.
The DSFS file system utilizes a P_HANDLE structure to create a unique view into a slot cache element or a chain of slot cache elements. The P_HANDLE structure records the network interface chain index into the Slot Chain table, and specific context referencing the current slot address, slot index address, and offset within a slot chain when a slot chain rather than an individual slot cache element is being accessed.
The P_HANDLE structure is described as:
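The structure listing is not reproduced in this text. A minimal sketch, assuming only the fields implied by the preceding paragraph (slot chain index, current slot address, slot index, and offset within the chain), might look like the following; the field names and types are illustrative:

/* Illustrative P_HANDLE sketch -- field names and types are assumptions.
 * Records a unique view into a single slot cache element or a chain of
 * slot cache elements. */
typedef struct p_handle {
    int              chain_index;   /* index into the Slot Chain table       */
    unsigned long    slot_address;  /* current slot address in the chain     */
    unsigned long    slot_index;    /* index of the current chain element    */
    unsigned long    offset;        /* byte offset within the slot chain,
                                     * used when a whole chain is accessed   */
    struct p_handle **children;     /* child contexts; the structure is
                                     * hierarchical, as described below      */
    int              child_count;
} P_HANDLE;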
The P_HANDLE structure is also hierarchical, allowing P_HANDLE contexts to be dynamically mapped to multiple slot cache elements in parallel, which facilitates time domain based merging of captured network traffic. In the case of asymmetrically routed TX/RX network traffic across separate network segments, or scenarios involving the use of an optical splitter, TX/RX traffic may be stored from two separate network devices that actually represent a single stream of network traffic.
With hierarchical P_HANDLE contexts, it is possible to combine several slot chains into a single chain dynamically by repeatedly selecting the oldest packet from each slot chain through a series of open p_handles, each with its own unique view into a slot chain. This facilitates merging of captured network traffic from multiple networks. It also allows all network traffic captured by the system to be aggregated into a single stream of packets for real time analysis by network forensics applications, such as an intrusion detection system reading from all network interfaces in the system.
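The time domain merge amounts to repeatedly choosing the oldest pending packet across the open p_handles. A sketch of that selection loop, using hypothetical helpers peek_next_ts( ) and emit_next_packet( ) in place of the real per-handle read interfaces:

/* Sketch of merging several open slot chains into one time-ordered
 * stream.  peek_next_ts() and emit_next_packet() are hypothetical
 * helpers standing in for the real per-handle read interfaces. */
#include <stdint.h>

typedef struct p_handle P_HANDLE;     /* as sketched earlier */

#define TS_NONE UINT64_MAX            /* chain exhausted */

extern uint64_t peek_next_ts(P_HANDLE *h);       /* timestamp of oldest unread packet */
extern void     emit_next_packet(P_HANDLE *h);   /* deliver that packet downstream    */

void merge_slot_chains(P_HANDLE **handles, int count)
{
    for (;;) {
        int best = -1;
        uint64_t best_ts = TS_NONE;

        /* pick the handle whose next pending packet is oldest */
        for (int i = 0; i < count; i++) {
            uint64_t ts = peek_next_ts(handles[i]);
            if (ts < best_ts) {
                best_ts = ts;
                best = i;
            }
        }
        if (best < 0)
            break;                     /* all chains exhausted */
        emit_next_packet(handles[best]);
    }
}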
Commands are embedded directly into the created file name; the DSFS virtual file system parses them and uses them to allocate and map P_HANDLE contexts to specific index locations within the specified slot chains. The format of the command language is defined as follows:
Where <int0> is the name or chain index number of a slot chain and <D> is a date in the syntax shown in the examples below, the command specifies either a starting and an ending date, or a starting date and an ending size for a merged series of slot chains. The touch command can be used to create these views into specified slot chains: one form creates a file spanning a starting and ending date range, and a second form creates a file with a starting date that is limited to a specified size, as illustrated below.
An interface number can also be used in place of an interface name. This is supported to allow interfaces to be renamed while preserving the ability to read data captured on a primary partition, including, by way of example, the following data sets and their respective command line entries (a short parsing sketch follows the examples):
all packets captured for a time period of 1 second on Aug. 2, 2004 at 14:15:07 through Aug. 2, 2004 at 14:15:08 on eth1 and eth2
touch eth1:eth2-08.02.2004.14.15.07:d-08.02.2004.14.15.08:d
all packets captured for a time period of Aug. 2, 2004 at 14:15:07 up to the <size> of the specified data range on eth1
touch eth1-08.02.2004.14.15.07:d-300000:s
all packets captured for a time period of 1 second on Aug. 2, 2004 at 14:15:07 through Aug. 2, 2004 at 14:15:08 for eth1(11)
touch 11-08.02.2004.14.15.07:d-08.02.2004.14.15.08:d
all packets captured for a time period of Aug. 2, 2004 at 14:15:07 up to the <size> of the specified data range eth1(11)
touch 11-08.02.2004.14.15.07:d-300000:s
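For illustration, a user space parser for file names of the form shown above might be sketched as follows; the parsing rules are inferred from the examples, and the structure and function names are hypothetical rather than the actual DSFS parser:

/* Sketch of parsing the touch-style file names shown above into chain
 * names, a start date, and either an end date or a size.  The rules are
 * inferred from the examples; this is not the actual DSFS parser. */
#include <string.h>
#include <stdlib.h>

struct view_cmd {
    char chains[64];       /* colon-separated chain names or indexes */
    char start[32];        /* MM.DD.YYYY.HH.MM.SS                    */
    char end[32];          /* MM.DD.YYYY.HH.MM.SS, if end_is_date    */
    long size;             /* byte limit, if !end_is_date            */
    int  end_is_date;
};

static int parse_view_cmd(const char *name, struct view_cmd *cmd)
{
    char buf[160];

    memset(cmd, 0, sizeof(*cmd));
    strncpy(buf, name, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    /* split on '-' into <chains>, <start>:d, and <end>:d or <size>:s */
    char *chains = strtok(buf, "-");
    char *start  = strtok(NULL, "-");
    char *last   = strtok(NULL, "-");
    if (!chains || !start || !last)
        return -1;

    strncpy(cmd->chains, chains, sizeof(cmd->chains) - 1);

    char *suffix = strrchr(start, ':');
    if (!suffix || strcmp(suffix, ":d") != 0)
        return -1;                       /* start must be a date */
    *suffix = '\0';
    strncpy(cmd->start, start, sizeof(cmd->start) - 1);

    suffix = strrchr(last, ':');
    if (!suffix)
        return -1;
    cmd->end_is_date = (strcmp(suffix, ":d") == 0);
    *suffix = '\0';
    if (cmd->end_is_date)
        strncpy(cmd->end, last, sizeof(cmd->end) - 1);
    else
        cmd->size = atol(last);          /* ending size in bytes */
    return 0;
}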
P_HANDLE context structures are also employed via user space interfaces to create virtual network adapters that appear to user space applications as physical adapters, as depicted in the accompanying figure.
This gives all known network forensic applications that use standard network and file system interfaces seamless, integrated access to captured data at real-time performance levels, while additionally providing a multi-terabyte capture store that streams packets to disk as a permanent archive and, at the same time, supports real-time analysis and filtering applications with no proprietary interfaces. Virtual interfaces are created using calls into the sockets layer of the underlying operating system. A call to open a socket results in the creation of a P_HANDLE context pointer mapped into the captured slot chain for the corresponding virtual device. The algorithm that maps a P_HANDLE context to an operating system socket is described as:
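The mapping algorithm itself is not reproduced in this text. A rough sketch of the idea, with hypothetical names (dsfs_socket_open, p_handle_open_chain, vif_to_chain), assuming that socket creation against a virtual interface attaches a P_HANDLE positioned at the current time in that interface's slot chain:

/* Sketch only -- hypothetical names, not the actual DSFS hook.  When a
 * socket is opened against a DSFS virtual interface, allocate a P_HANDLE
 * positioned at the current time in the interface's slot chain and attach
 * it to the socket's private data so later reads can find it. */
#include <stdint.h>

typedef struct p_handle P_HANDLE;   /* as sketched earlier */

struct dsfs_socket {
    P_HANDLE *handle;               /* view into the captured slot chain   */
    int       vif_index;            /* virtual interface (e.g. ifm0, ifm1) */
};

extern P_HANDLE *p_handle_open_chain(int chain_index, uint64_t start_time);
extern int       vif_to_chain(int vif_index);
extern uint64_t  current_utc_usec(void);

int dsfs_socket_open(struct dsfs_socket *sock, int vif_index)
{
    int chain = vif_to_chain(vif_index);
    if (chain < 0)
        return -1;

    /* default position: the time-domain point matching "now" */
    sock->handle = p_handle_open_chain(chain, current_utc_usec());
    if (!sock->handle)
        return -1;

    sock->vif_index = vif_index;
    return 0;
}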
Subsequent IOCTL calls to the virtual device return the next packet in the stream. For merged slot chains, the IOCTL call returns the oldest packet across the entire array of open slot chains. This allows virtual interfaces ifm0 and ifm1 to return the entire payload of a captured system to user space applications through a virtual adapter interface. P_HANDLE contexts are unique and, by default, are indexed to the time the virtual interface is opened relative to the time domain position in a captured slot chain. This mirrors the actual behavior of a physical network adapter. It is also possible, through the P_HANDLE context, to request a starting point in the slot chain at a time index earlier or later than the time the virtual interface was opened. This allows user space applications to move backward or forward in time on a captured slot chain and replay network traffic. Virtual interfaces can also be configured to replay data to user space applications with the exact UTC/microsecond timings at which the network data was actually received from the network segments and archived.
Playback is performed in a slot receive event that is hooked to the underlying operating system's sys_recvmsg sockets call. Calls to recvmsg redirect socket reads to the DSFS slot cache store and read from the mapped slot chain for the particular virtual interface adapter.
The sys_recvmsg algorithm for redirecting operating system user space requests to read a socket from a virtual interface is described as:
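The listing is likewise not reproduced here. A minimal sketch of the redirection, with hypothetical helper names (dsfs_lookup, dsfs_read_next_packet, dsfs_wait_for_packets), assuming the hook falls through to the normal socket path for non-virtual interfaces:

/* Sketch of the recvmsg redirection -- hypothetical helper names.  If the
 * socket is bound to a DSFS virtual interface, satisfy the read from the
 * mapped slot chain; otherwise fall through to the normal socket path. */
#include <stddef.h>

typedef struct p_handle P_HANDLE;   /* as sketched earlier */

extern P_HANDLE *dsfs_lookup(int sockfd);   /* NULL if fd is not a virtual interface */
extern long dsfs_read_next_packet(P_HANDLE *h, void *buf, size_t len); /* 0 at end of stream */
extern int  dsfs_wait_for_packets(P_HANDLE *h);
extern long real_sys_recvmsg(int sockfd, void *msg_buf, size_t len, int flags);

long dsfs_sys_recvmsg(int sockfd, void *msg_buf, size_t len, int flags)
{
    P_HANDLE *h = dsfs_lookup(sockfd);

    if (!h)
        return real_sys_recvmsg(sockfd, msg_buf, len, flags);  /* normal socket path */

    for (;;) {
        long n = dsfs_read_next_packet(h, msg_buf, len);
        if (n > 0)
            return n;                 /* next packet from the mapped slot chain */
        /* end of stream: block on the interruptible semaphore until the
         * capture path appends more packets to the slot chain */
        if (dsfs_wait_for_packets(h) < 0)
            return -1;                /* interrupted */
    }
}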
Virtual network interface mappings also employ an include/exclude mask of port/protocol filters that is configured via a separate IOCTL call and maps a bit table of include/exclude ports to a particular virtual network interface.
The algorithm that performs the filtering of network packets from open slot chains is more fully described as:
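The filtering listing is not reproduced here. A minimal sketch of a per-interface include/exclude port bit table, with assumed names and layout (one bit per TCP/UDP port), might look like:

/* Sketch of the per-interface include/exclude port filter -- the bit table
 * layout and names are assumptions.  One bit per TCP/UDP port; a packet is
 * delivered only if its source or destination port is included. */
#include <stdint.h>

#define PORT_BITS   65536
#define PORT_WORDS  (PORT_BITS / 32)

struct vif_filter {
    uint32_t include[PORT_WORDS];  /* bit set => port passes the filter */
    int      filter_enabled;       /* 0 => pass everything              */
};

static inline int port_included(const struct vif_filter *f, uint16_t port)
{
    return (f->include[port >> 5] >> (port & 31)) & 1;
}

/* Returns nonzero if the packet should be delivered to this interface. */
int vif_filter_match(const struct vif_filter *f,
                     uint16_t src_port, uint16_t dst_port)
{
    if (!f->filter_enabled)
        return 1;
    return port_included(f, src_port) || port_included(f, dst_port);
}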
Virtual network interfaces can also be used to regenerate captured network traffic onto physical network segments for playback to downstream IDS appliances and network troubleshooting consoles.
When a virtual interface encounters end of stream (0xFFFFFFFF), the call blocks on an interruptible system semaphore until more packets are received at the end of the slot chain. Captured network traffic can be regenerated from multiple virtual network interfaces onto a single physical network interface, and filters may also be employed. This implementation allows infinite capture of network traffic with concurrent playback to downstream IDS appliances, and supports real-time monitoring of captured network data by user space applications.
Regeneration creates a unique process for each regenerated virtual network interface to physical interface session. This process reads from the virtual network device and outputs the data to the physical interface upon each return from a request to read a slot chain. A P_HANDLE context is maintained for each unique regeneration session with a unique view into the captured slot chain being read.
The regeneration process can be configured to limit data output on a physical segment in 1 mb/s (megabit per second) increments. The current embodiment of the invention allows these increments to span 1-10000 mb/s, configurable per regeneration thread.
Regeneration steps consist of mapping a P_HANDLE context to a virtual interface adapter and reading packets from an active slot chain until the interface reaches the end of the slot chain and blocks until more packet traffic arrives. As the packets are read from the slot chain, they are formatted into system dependent transmission units (skb's on Linux) and queued for transmission on a target physical network interface.
The regeneration algorithm meters the total bytes transmitted over a target physical interface against the maximum bytes-per-second value set by the user space application that initiated the regeneration process. The current embodiment of packet and protocol regeneration is implemented as a polled method rather than an event driven method.
The regeneration algorithm is more fully described as:
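The listing is not reproduced here. A sketch of a polled, rate-metered regeneration loop, using hypothetical helpers (vif_read_packet, phys_queue_xmit, now_usec, sleep_usec) and assuming a one second metering window against the configured megabit-per-second limit:

/* Sketch of the polled regeneration loop -- helper names are assumptions.
 * Reads packets from a virtual interface's slot chain, queues them on a
 * physical interface, and meters output so the configured megabit-per-
 * second limit is not exceeded within each one second window. */
#include <stdint.h>
#include <stddef.h>

typedef struct p_handle P_HANDLE;   /* as sketched earlier */

extern long     vif_read_packet(P_HANDLE *h, void *buf, size_t len); /* blocks at end of chain */
extern void     phys_queue_xmit(int phys_if, const void *buf, long len);
extern uint64_t now_usec(void);
extern void     sleep_usec(uint64_t us);

void regenerate(P_HANDLE *h, int phys_if, long limit_mbps)
{
    uint8_t  buf[65536];
    uint64_t window_start = now_usec();
    uint64_t bytes_this_window = 0;
    uint64_t limit_bytes = (uint64_t)limit_mbps * 1000000ull / 8; /* bytes per second */

    for (;;) {
        long n = vif_read_packet(h, buf, sizeof(buf));
        if (n <= 0)
            break;                     /* error; blocking at end of chain is
                                        * assumed to happen inside the read */

        phys_queue_xmit(phys_if, buf, n);
        bytes_this_window += (uint64_t)n;

        uint64_t elapsed = now_usec() - window_start;
        if (elapsed >= 1000000ull) {
            window_start = now_usec();        /* new one second window */
            bytes_this_window = 0;
        } else if (bytes_this_window >= limit_bytes) {
            sleep_usec(1000000ull - elapsed); /* throttle until window ends */
            window_start = now_usec();
            bytes_this_window = 0;
        }
    }
}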
The primary capture (type 0x97) disk space record for a DSFS system can be configured to map to multiple Archive Storage (type 0x98) partitions in an FC-AL clustered fiber channel System Area Network.
This architecture allows days, weeks, months, or even years of network packet data to be archived and indexed for off line post analysis operations, auditing, and network transaction accounting purposes.
Primary Capture partitions contain a table of mapped archive partitions that may be used to allocate slot storage. As slots are allocated and pinned by adapters and subsequently filled, if a particular primary capture partition has an associated map of archive storage partitions, the primary capture partition creates dual I/O links into the archive storage and initiates a mirrored write of the slot to both the primary capture partition and the archive storage partition in tandem. Slot chains located on archive storage partitions export only two primary slot chains. The VFS dynamically presents the slots as a replica chain (chain 0) and an archive chain (chain 1).
As slots are allocated from an Archive Storage partition, they are linked into the replica chain. The originating interface name, MAC address, and machine host name are also annotated in the additional tables present on a type 0x98 partition to identify the source machine and interface information for a particular slot. Altering the attributes by setting a slot to read-only on an archive partition moves the slot from the replica slot chain (0) to the permanent archive slot chain (1). Selection of eligible targets for slot recycle on archive storage partitions is always biased toward reclaiming from the replica chain. Slots stored on the archive slot chain (1) are only recycled if all slots in a given archive storage partition's replica chain (0) have been converted to entries on the archive slot chain (1). In both cases, the oldest slots are targeted for recycle when an archive storage partition becomes fully populated. This gives forensic investigators the ability to pin specific slots of interest in an archive chain for permanent archival.
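For illustration only, the recycle bias can be summarized in a short sketch; the structures and helper names are assumptions, and only the policy of preferring the oldest replica chain slot, and touching the archive chain only when the replica chain is exhausted, comes from the description above:

/* Sketch of the archive partition recycle policy -- structure and helper
 * names are assumptions.  Reclaim the oldest slot from the replica chain
 * (chain 0); fall back to the oldest slot on the archive chain (chain 1)
 * only when every slot has already been converted to an archive entry. */
struct slot_chain {
    unsigned long *slots;   /* slot addresses, ordered oldest first */
    int            count;
};

struct archive_partition {
    struct slot_chain replica;  /* chain 0: mirrored, recyclable slots */
    struct slot_chain archive;  /* chain 1: read-only, pinned slots    */
};

unsigned long select_recycle_target(struct archive_partition *p)
{
    if (p->replica.count > 0)
        return p->replica.slots[0];   /* oldest replica-chain slot            */
    if (p->archive.count > 0)
        return p->archive.slots[0];   /* replica chain empty: oldest archive  */
    return 0;                         /* nothing eligible to recycle          */
}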
In the event a storage array has been taken off line temporarily, the slot bitmap table records a value of 0 for any slots that have not been mirrored due to the unavailability, and when the off line storage becomes active again a background re-mirroring process is spawned to re-mirror the affected slot cache elements onto the target archive storage partitions. The system can also be configured to simply drop captured slots on the primary capture partition and not attempt mirroring of slots lost during an off line storage event for a group of archive partitions.
To avoid elevator starvation during re-mirroring, slots may be re-mirrored backwards as a performance optimization, starting at the bottom of a primary capture partition rather than at the beginning, which prevents excessive indexing of coalesced read and write sector run requests at the block I/O layer of the operating system.
Off line indexing is supported by tagging each captured packet with a globally unique identifier that allows rapid searching and retrieval of captured network packets on a per packet basis.
Off line indexes allow external applications to import indexing information for captured network traffic into off line databases and allow rapid search and retrieval of captured network packets through user space P_HANDLE context pointers. The globally unique identifier is guaranteed to be unique because it incorporates the unique MAC address of the network adapter that captured the packet payload. The global packet identifier also stores IPv4 and IPv6 address information per packet and supports IPv4 and IPv6 indexing.
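The exact layout of the global packet identifier is not given in this text. A sketch, assuming it combines the capturing adapter's MAC address with a capture timestamp and sequence count, and that the index entry carries per-packet IPv4/IPv6 addresses; all sizes and field names are illustrative:

/* Sketch of a globally unique per-packet identifier -- the exact field
 * layout is not given in the text; only the ingredients (capturing
 * adapter MAC address, per-packet IPv4/IPv6 addressing) are.  Sizes and
 * ordering here are assumptions. */
#include <stdint.h>

struct packet_guid {
    uint8_t  capture_mac[6];     /* MAC of the adapter that captured the packet */
    uint16_t reserved;
    uint64_t capture_usec;       /* UTC capture time in microseconds            */
    uint64_t sequence;           /* per-adapter monotonically increasing count  */
};

struct packet_index_entry {
    struct packet_guid guid;
    uint8_t  ip_version;         /* 4 or 6                                      */
    uint8_t  src_addr[16];       /* IPv4 in the first 4 bytes, or full IPv6     */
    uint8_t  dst_addr[16];
    uint64_t slot_address;       /* where the packet payload lives on disk      */
    uint32_t slot_offset;
};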
Claims
1. A method of capturing data packets comprising:
- connecting a capture device to a data communications path;
- capturing data packets communicated along the data communications path;
- persistently storing the captured data from the data packets in a predetermined combination of volatile and non-volatile storage media;
- aggregating the persistently stored data packets into a slot of predetermined size;
- annotating the aggregated data packets with persistent storage information;
- storing the annotated data packets using an infinitely journaled, write-once, hierarchical file system;
- reconstructing any corrupted data to ensure data accuracy of the persistently stored data; and
- retrieving a predetermined portion of captured data and persistently stored annotations from the slot;
- creating the slot of predetermined size to have a buffer of a predetermined size; and
- managing the slot based on a least recently used cache to map the data in the slot to a non-volatile storage thereby creating a cache image of the captured data.
2. A method of capturing data packets comprising: connecting a capture appliance to a data communications path;
- capturing data communicated along the data communications path;
- replicating and persistently annotating the captured data in a predetermined combination of volatile and nonvolatile storage;
- aggregating the captured data and persistent annotations in the volatile and non-volatile storage into a slot; and
- storing the data in a non-volatile storage using an infinitely journaled, write-once, hierarchical file system.
3. The method of claim 2 wherein the data is aggregated into a slot by:
- creating the slot; and
- managing the slot based on a least recently used cache.
4. The method of claim 3 wherein the least recently used cache maps the data in the slot to the non-volatile storage to create a cache image of the captured data across sectors of the non-volatile storage using striping and thereby allowing a controller simultaneously to write to a plurality of non-volatile storage devices.
5. The method of claim 4 wherein the data is copied from the slot to the volatile storage using a least recently used algorithm to allocate space in the volatile storage.
Type: Application
Filed: Apr 1, 2009
Publication Date: Jul 16, 2009
Applicant: SOLERA NETWORKS. INC. (LINDON, UT)
Inventors: JEFFREY V. MERKEY (Lindon, UT), BRYAN W. SPARKS (Lindon, UT)
Application Number: 12/416,276
International Classification: G06F 12/12 (20060101); G06F 12/08 (20060101); G06F 12/14 (20060101); G06F 15/16 (20060101); G06F 11/08 (20060101);