System and method for discovery of network entities
A system and method of discovering network entities. Network traffic is monitored, wherein monitoring includes finding network entities in the network traffic. If the network entities are network assets, the system determines if the network entities are critical network assets. If the network entities are network users, the system classifies the network users automatically into user groups. The network traffic is then displayed as a function of the critical network assets and the user groups.
This patent application claims the priority benefit of U.S. Provisional Patent Application Ser. No. 61/054,945 filed May 21, 2008 and entitled “ENHANCED DISCOVERY WITH IDENTITIES”, the content of which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
The disclosure relates generally to network security and in particular to systems and methods for network discovery.
LIMITED COPYRIGHT WAIVER
A portion of the disclosure of this patent document contains material to which the claim of copyright protection is made. The copyright owner has no objection to the facsimile reproduction by any person of the patent document or the patent disclosure, as it appears in the U.S. Patent and Trademark Office file or records, but reserves all other rights whatsoever.
BACKGROUND
Companies today face the task of continuously monitoring and verifying who is accessing critical business systems, what they are doing during each access and where they are accessing from. The challenge is that, when manually attempted, visibility into your network and your critical business applications is often nothing more than a static, after-the-fact “snapshot in time.” U.S. patent application Ser. No. 11/854,392, entitled “Identities Correlation Infrastructure for Passive Network Monitoring”, filed Sep. 12, 2007, describes, however, one approach for identifying and continuously monitoring user access to critical business systems, the description of which is incorporated herein by reference.
The systems proposed to date require administrators to identify users and group them with other users. In addition, the systems to date present data in ways that can be difficult to understand. What is needed is a system and method for identifying and classifying users, and for monitoring network activity as a function of the classifications of network users. What is also needed is a system and method for displaying network activity based on identified groups of users in a clear and concise manner. Finally, what is needed is a system and method for controlling network access as a function of the identified groups of users.
Unsecured and improper practices by authorized insiders can create substantial risk to critical business systems. Outsourcers, offshore developers, contractors, careless employees, partners, joint ventures, and others must be monitored. Yet monitoring security to the standards recommended by CERT and others is nearly impossible to do in real time with traditional security tools. And using log data to get this level of information can drain valuable IT resources while still falling short of delivering real-time operational visibility and control.
In computer system 100 of
In one embodiment, network monitor 108 provides automated, identity-based monitoring to keep computer system 100 in compliance and in control. This comprehensive monitoring solution delivers complete visibility and verification of who is doing what and where on an automated, continuous, real-time basis. By identity, we mean the actual user name, group name, and role correlated to behavior and delivered in real time—not after the fact.
In one embodiment, network monitor 108 communicates with a directory service 110 and an authentication service 111 over network 104. In one such embodiment, network monitor 108 integrates with existing directory stores, such as Microsoft Active Directory, leveraging actual user and group information to dynamically determine when a user accesses the network. In another such embodiment, network monitor 108 integrates with existing directory stores, such as Microsoft Active Directory, leveraging actual user, group, and role information to dynamically determine when a user accesses the network. Network monitor 108 queries the directory in real time, and then correlates users and their groups with all related access and activity. In one embodiment, user identity credentials are detected in the traffic without the use of any agents on the client or server side.
For example, a user named jsmith logs into the network. Network monitor 108 identifies this action and immediately determines that jsmith is part of the marketing group and has a job role that allows her access to the marketing database and a joint-venture database but not the finance database. Network monitor 108 continues to monitor network traffic to ensure that jsmith's actions abide by this policy as well as all other established security controls.
In one embodiment, monitors 108 passively capture, decode, and analyze traffic via native deep packet inspection (DPI). They use port mirroring or passive network taps to obtain full packet data for protocol decoding up to the application layer (layer 7). This level of detail is often required to ensure a tamperproof view of network activity within critical data centers and critical business systems.
In one such embodiment, flow monitors within monitor 108 leverage existing flow-based data from Cisco Netflow, Juniper J-Flow, and others for analysis. This broader network view is often useful for gaining a cost-effective, enterprise-wide view of who is doing what and from where across the entire network, including remote locations.
In another such embodiment, network monitors are used in a “Mixed” mode that combines both DPI and flow-based data.
In one embodiment, network monitor 108 operates on firewall audit logs to analyze traffic through the firewall. In one such embodiment, network entity discovery (e.g., user discovery or asset discovery) is used to drive security policy for both the network monitor 108 and the firewall 112.
In one embodiment, network monitor 108 is part of a tiered architecture that comprises network monitors 108, control centers and report appliances. This approach has the deployment advantages of an out-of-band, network-based solution without the need for agents or application integration.
In such an approach, network monitors 108 provide the cornerstone monitoring function. Monitors 108 are network-based and designed to capture and analyze critical traffic data inside the network using one of the three methods described above.
As shown in
In one such embodiment, large entities can easily stratify and delegate their management capabilities with control center 124. For example, you could retain the ability to analyze and control network activity at an overall organizational level while also allowing your various operating divisions or security zones to monitor and manage network activity that's specific to their group.
In one embodiment, as is shown in
In one embodiment, discovery module 120 provides additional analysis capabilities, including the ability to focus on a single system. For example, you could concentrate network monitor 108 on a specific CRM or accounting system. Likewise, you can use network monitor 108 to discover all user groups or focus on a specific user group, office location, or network boundary. For instance, you could monitor for sales representatives accessing a particular system from headquarters. Additional information on what users are doing is also provided, including protocol decode, ports, bandwidth, URLs, and commands. This level of detail is extremely useful for network rezoning and segmentation, or application and server moves that might impact users' ability to access their applications.
As noted above, in some embodiments, network monitor 108 communicates directly with existing network directories, leveraging existing groups and memberships. Additionally, network monitor 108 may leverage local directories for special-purpose groups and memberships.
In one embodiment, control module 122 applies user-based policies and then graphically illustrates the network usage of users and groups to critical systems, clearly denoting what activity is acceptable, what activity is unacceptable and what activity merits a closer look by the security and operations teams.
In the embodiment shown, DDC 130 runs on the Monitor and stores its data locally. In other embodiments, DDC 130 or discovery data store 132 could be accessed remotely from network monitor 108. In one such embodiment, control center 124 includes a data store (DS) 126 connected to a data analyzer 128 (as shown in
In the embodiment shown, GBP engine 136 runs on monitor 108, processing the DDC data in the data discovery data store 132 at the user's request. In one such embodiment, the GBP engine runs as a standalone Java daemon, exporting an interface to its client.
In one such embodiment, the UI processes GBP data for the local Monitor only. In one such embodiment, GBP data generated by the GBP engine in response to UI initiated queries is used by servlet module 144 to generate UI pages that are in turn displayed by a browser.
In some embodiments, as is shown in
In one embodiment, the GBP engine 136 runs in the Monitor aggregating and processing data provided by the DDC 130. The GBP engine 136 will be described in greater detail below.
The GBP engine 136 provides an interface to the GBP Engine Proxy 142 through which the Web UI components make requests and receive results.
In one embodiment, the GBP Engine Proxy 142 (hereinafter referred to as “proxy 142”) uses a socket interface to communicate with the GBP Engine 136. This interface supports multiple concurrent asynchronous requests. Each request to the proxy is made on a separate thread and that thread is blocked until the request is completed, times out or the connection to the engine is abnormally terminated. Thus, within each socket connection requests are handled synchronously.
The RPC protocol between the engine and the proxy consists of a handshake phase followed by a request/response phase. Each connection is initiated by the proxy and accepted by the engine.
If the engine status is READY the proxy 142 can send a request to be processed. In one embodiment, each request is handled synchronously, i.e., a new request cannot be issued until the previous request has completed.
Requests and responses are encoded as serialized Java objects using the standard Java object serialization mechanism.
In one embodiment, the GBP engine 136 is restarted whenever there is a configuration change or a policy change.
The GBP engine uses a DDC Read Interface 150 to load into memory all the DDC records required to perform its functions. A DDC data record summarizes the traffic observed between two network endpoints (e.g., a user and a server) for a specific network service over a given time period (e.g., one day). In order to produce results within an acceptable time period, in one embodiment, the engine caches its working set data and updates the cache periodically.
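The shape of a DDC record can be illustrated with a minimal Python sketch. The class name DdcRecord, its field names and the summarize helper are assumptions made for illustration; the disclosure does not specify a record layout, and a byte count is only one plausible traffic summary.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DdcRecord:
    """Hypothetical record: traffic between two endpoints for one service and period."""
    client: str       # e.g. a user name or a client IP address
    server: str       # e.g. a critical business system
    service: str      # network service, e.g. "ssh"
    period: date      # the day the record summarizes
    bytes_total: int  # traffic volume observed during that period


def summarize(observations):
    """Fold raw (client, server, service, day, nbytes) tuples into one record per key."""
    totals = {}
    for client, server, service, day, nbytes in observations:
        key = (client, server, service, day)
        totals[key] = totals.get(key, 0) + nbytes
    return [DdcRecord(*key, bytes_total=n) for key, n in totals.items()]
```

Keeping one record per endpoint pair, service and day keeps the working set small enough to cache in memory, which is the property the engine is described as relying on.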
Policy management will be discussed next. In one embodiment, the GBP engine loads the current network security policy from the file system. The policy provides GBP engine 136 with information on already identified network assets, such as Critical Business Systems (CBSs) and network services of interest, as well as currently defined local groups.
Groups will be discussed next. When computing a set of groups to be used to classify users, GBP engine 136 takes into account both the user's membership in network directory groups and the user's membership in any local directory groups, such as those defined in policy. Anonymous users are automatically included in a pre-defined Anonymous group, while users that are not members of any directory group are considered to be members of the Other Groups pre-defined group.
In some embodiments of the network monitor shown in
1) For a given user, the groups of which that user is a member; the set returned by the IAA represents the transitive closure of the user's group membership, including both network directory groups and local directory groups; and
2) For a given group, the number of active users in that group.
In one embodiment, all Identity Acquisition is performed by the Identity Acquisition Manager (IAM) of IAA 140. The IAM computes the number of active users per user group, where an active user is defined as being one that has been authenticated in the last YY days. This information is provided to the Identity Acquisition Agent (IAA) 140 periodically. The number of active users in a group need not be updated more than once a week.
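The active-user count described above can be sketched in Python as follows. The function name, the argument names and the dictionary inputs are hypothetical. The length of the window (YY days) is left as a required parameter because the text does not give a value.

```python
from datetime import datetime, timedelta


def active_user_counts(group_members, last_authenticated, now, window_days):
    """Count, per group, the users authenticated within the last window_days days.

    group_members:      dict mapping group name -> set of user names
    last_authenticated: dict mapping user name -> datetime of last authentication
    window_days:        the "YY days" window; no default is assumed here
    """
    cutoff = now - timedelta(days=window_days)
    counts = {}
    for group, users in group_members.items():
        counts[group] = sum(
            1 for user in users
            if user in last_authenticated and last_authenticated[user] >= cutoff
        )
    return counts
```

Because the counts need not be refreshed more than once a week, a result like this could be computed on a schedule and cached rather than recomputed per query.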
In one embodiment, the IAA access protocol is extended to allow the GBP engine 136 to query the IAA 140 for the current list of groups, their active user counts and their list of users.
In one embodiment, Anonymous Intranet users are segregated by IP address. An anonymous user whose IP address falls within the Intranet is identified by its computer's IP address. Thus, in the list of users that are clients of a service, anonymous Intranet users will be denoted by their IP addresses.
On the other hand, Anonymous users outside the Intranet are not uniquely identified.
An anonymous Internet user is given the name Internet whereas an anonymous Extranet user receives the name Extranet. Thus, all anonymous users outside the Intranet coalesce into a single Internet user and a single Extranet user. However, and optionally, anonymous users may be segregated by their IP addresses, i.e., anonymous Internet and Extranet users are treated identically to anonymous Intranet users.
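The naming rules for anonymous users can be sketched in Python. The network ranges below and the function name anonymous_name are placeholders chosen for illustration; a real deployment would presumably take its Intranet and Extranet definitions from policy.

```python
import ipaddress

# Hypothetical zone definitions, for illustration only.
INTRANET = [ipaddress.ip_network("10.0.0.0/8")]
EXTRANET = [ipaddress.ip_network("192.168.100.0/24")]


def anonymous_name(ip, segregate_external=False):
    """Return the display name for an anonymous user seen at the given IP address."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in INTRANET):
        return ip  # Intranet users are always identified by IP address
    if segregate_external:
        return ip  # optional mode: treat external users like Intranet users
    if any(addr in net for net in EXTRANET):
        return "Extranet"  # all anonymous Extranet users coalesce into one name
    return "Internet"      # everything else coalesces into a single Internet user
```

For example, under these assumed ranges anonymous_name("10.1.2.3") returns the address itself, while anonymous_name("8.8.8.8") returns "Internet" unless segregate_external is set.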
When computing group membership, GBP engine 136 also ignores any groups that have been dismissed through the GBP UI. The built-in groups Anonymous and Other Groups cannot be dismissed.
As noted above, users that are not members of a network directory group or a local directory group are considered to be members of the Other Groups pre-defined group. This may happen because the user is not a member of any valid group, or because all the groups of which a user is a member have been dismissed.
As noted above, a network monitor which analyzes network traffic in the manner described above gives the monitor user the ability to see groups of users and their behavior across a network. Still, given the quantity of data, it can be difficult to see the forest for the trees. What is needed is a way of displaying the data that gives the user a reasonable description of network behavior and that allows the user to modify network behavior based on that description. The problem is how to present as dense a context around network behavior as possible so as to give a good picture of what is going on in the network.
One depiction of network activity is shown in
The bubble table shown in
Any shape could be used at the intersection of each row and column. In one embodiment, the bubble is used as a visual measure, providing low-fidelity quantitative information of network activity as filtered by monitor 108. Multiple bubbles give the user a quick comparison of a large related data set and steer the questions in their investigation of identity policy.
In one embodiment, the bubble is a simple circle. The size of the bubble is determined by the amount of bandwidth it represents. The shade of the bubble is determined by whether or not there are outliers within that bubble's data. In one such embodiment, the bubble comes in seven sizes.
In one embodiment, bubble details are presented in the form of a tooltip, consistent in display with other tool-tips displayed by monitor 108. In one embodiment, bubble details provide the amount of bandwidth shown, the number of users involved, and whether or not there are outliers to investigate.
In one embodiment, the bubble graph can be constrained by service using, for instance, a pull-down menu 352 (as shown in
In one embodiment, pull-down menu 352 displays service as defined in policy. If a service is only covered by the highest-level definitions (“Tcp,” “Udp,” “IP”) then they may be further broken down by common IANA services. For example, if TCP/1521 is not defined in policy, then it would be reported as the generic “Tcp” service within this widget unless it also relates to a common IANA service.
In one embodiment, clicking on the user group brings up a box 340 with information regarding the services the user group is accessing. In the embodiment shown, the number of users accessing a particular service is graphically displayed by increasing or decreasing the size of the bubble 360 for that service. In one such embodiment, the visual representation of bad behavior is scaled to the equivalent range of the visual representation of good events such that good events don't swamp bad events. In one embodiment, the color of the bubble is used to indicate whether the behavior is potentially good or bad. In one such embodiment, a lighter colored bubble indicates potential bad behavior, while a darker bubble indicates expected behavior. Such an approach makes bad behavior obvious.
In addition to color, in one embodiment the bubble is moved along a continuum line 362 used to indicate if the behavior displayed is expected or unexpected. Unexpected behavior is noted by color and by the designation “Investigate” at the end of the row. (See, e.g., the Secure Shell access by users in the Auditing Contractors group to the Finance Servers, a potentially bad behavior for users outside the Finance Department.)
In one embodiment, you can drill down into each bubble to see the events. In one embodiment, events associated with the users' interaction with the CBS are displayed in order of criticality when the bubble is clicked. In one embodiment, one can drill down into the services to see the events, or drill down into the bubble to see events associated with groups of users.
In one embodiment, the standard bubble graph view is configured by the user to select the CBSs and Groups that should be displayed (or ignored). In one such embodiment, this view is unique to each of the system's users. In another embodiment, columns and rows can be dragged and dropped to their intended position so no other configuration process is needed.
In one embodiment, a simple mechanism is included for removing groups from consideration on all discovery data displays. In one such embodiment, this is done by dismissing a group from the bubble graph and/or the CBS view.
In one embodiment, CBSs are displayed left to right in decreasing order of bandwidth occupied.
In one embodiment, the application makes a clear distinction between the activities common to most of a group's users of a specific service on a specific system, who can be considered “mainstream” in their behavior, versus the members of a group whose use of a service on a system greatly differs from their peers, who can be considered “outliers” in their behavior. In one such embodiment, as is shown in
In the embodiment shown in
In one embodiment, a network traffic indicator for each day indicates the amount of traffic saved by monitor 108. The traffic indicator can be used by the user to choose an analysis based on the amount of data available for the given period.
In one embodiment, all selections in timeline 350 extend from the current hour into the past. A selection grows to the left and shrinks to the right. Clicking in an unselected timeframe makes it the current selection. When the selection is changed, monitor 108 triggers a query, refreshing page data. In one such embodiment, clicking the current selection refreshes the data if the last-queried data is more than one hour older than the current time.
In one embodiment, a data refresh is triggered when the current hour moves past the last hour.
In one embodiment, when a new timeframe query is made, monitor 108 waits for three seconds before sending the query, because the query is expensive. If a new timeframe query comes from the same user before the last one returns (such as when the user accidentally made the wrong selection), the first query is canceled and the new one is started.
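The delay-then-replace behavior can be sketched with a per-user timer in Python. The class name TimeframeQueryScheduler and the run_query callback are hypothetical. This sketch only cancels a query that is still inside its delay window; cancelling a query already running against the data store would depend on the backend and is not shown.

```python
import threading


class TimeframeQueryScheduler:
    """Delay each timeframe query; a newer request from the same user replaces it."""

    def __init__(self, run_query, delay_seconds=3.0):
        self._run_query = run_query  # callable(user, timeframe), assumed to be supplied
        self._delay = delay_seconds  # three seconds in the embodiment described
        self._lock = threading.Lock()
        self._pending = {}           # user -> timer for that user's latest request

    def request(self, user, timeframe):
        with self._lock:
            previous = self._pending.pop(user, None)
            if previous is not None:
                previous.cancel()    # drop the superseded query before it is sent
            timer = threading.Timer(self._delay, self._run_query, args=(user, timeframe))
            self._pending[user] = timer
            timer.start()
        return timer
```

With this arrangement, a user who clicks the wrong timeframe and corrects the selection within the delay window causes only one expensive query to be sent.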
In the embodiment shown in
In one embodiment, a three-dimensional grid is used to display network activity. In one such embodiment, this involves:
creating a grid having a first, second and third axis;
assigning client groups to the first axis;
assigning critical business systems to the second axis;
assigning services to a third axis;
monitoring network traffic;
displaying network traffic on the grid as a function of client group, service and critical business system, wherein displaying includes associating a point on the first axis with each client group, associating a point on the second axis with each critical business system, associating a point on the third axis with each service and displaying a shape at intersections in the grid between points on the first, second and third axes, wherein the shape varies in size as a function of network traffic associated with a particular client group and a particular critical business system.
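A minimal Python sketch of the grid described above, assuming traffic arrives as (client group, CBS, service, bytes) tuples. The names build_grid and shape_size and the linear seven-step scale are illustrative assumptions; the text does not say how traffic volume maps to shape size.

```python
import math
from collections import defaultdict


def build_grid(flows):
    """Sum traffic for each (client group, CBS, service) intersection of the grid."""
    cells = defaultdict(int)
    for group, cbs, service, nbytes in flows:
        cells[(group, cbs, service)] += nbytes
    return dict(cells)


def shape_size(nbytes, max_bytes, steps=7):
    """Map a cell's traffic to a discrete size step (assumed linear scale)."""
    if nbytes <= 0 or max_bytes <= 0:
        return 0  # no shape is drawn for an empty cell
    return min(steps, max(1, math.ceil(steps * nbytes / max_bytes)))
```

Each key of the returned dictionary corresponds to one intersection of the three axes, and shape_size gives the relative size of the shape drawn there.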
In one embodiment, as is shown in
As can be seen in
This display and drill-down methodology provides a mechanism for organizing and tracking services and users in computer system 100.
In the embodiment shown in
Also shown on
Also shown on
Once you can display network behavior, it becomes possible to verify traffic against role-based controls and pre-built security best practices. Automated discovery capability helps uncover the “who, what, and where” during the planning phase of change projects, without requiring any rule definition.
In the approach shown in
In one embodiment, network monitor 108 includes a verification capability. In some such embodiments, this verification capability builds on the discovery view. In one embodiment, verification automatically verifies traffic against role-based controls and pre-built security best practices. This verification process can instantly pinpoint and provide real-time alerts on the following representative examples:
1) Access by non-authenticating users, such as terminated employees who have had their access privileges revoked
2) Network access exceptions such as file servers, disallowed network applications, and geographically dispersed printers that are not behaving as expected
3) Verifying access of users that should be on the network, such as reassigned employees or outsourcers who inappropriately, perhaps inadvertently, access systems they shouldn't
4) Unsecured or malicious activities, including tunneling of services like FTP inside of HTTP to transit firewalls
5) Verifying expected usage of administrative protocols or commands, such as web authoring.
The results are shown in a display such as is shown in
In one embodiment, one can easily switch between discovery and verification modes by selecting discovery tab 600 (for data discovery) or control tab 601 (for policy verification).
The above system can be extended beyond network traffic to discover and control firewalls.
The timeline 850 in
In a directory-based system such as Active Directory, a user can exist in a number of different groups. She may be a member of the Finance Group, the Executive Staff, Employees and Headquarters. In the past, systems that wanted to display network activity as a function of user behavior either had to list all users or all groups or were constrained to selected users or groups of users preselected by the network administrator. Each group-based approach had the limitation that actions of individual users distributed across multiple groups would be duplicated across each of the user's groups, overemphasizing the effect of that user. What is needed is a system and method for automatically assigning users to groups in a way that accurately reflects the actions of each group and that makes it easy to write security policies that cover each user. In one embodiment, given the set of users accessing a particular service on a particular server, we determine the best way to group the users.
In one embodiment, the GBP engine 136 uses one or more of the following algorithms during discovery to determine which groups best represent the clients of a CBS or a CBS-service pair: 1) a group representation algorithm, 2) a client representation algorithm and 3) a group ranking algorithm. In one embodiment, the algorithm to be used is selected by the user.
The Group Representation Algorithm is shown in
Given a target system—CBS or group of CBSs, service or group of services, or a combination of the two—and a set of clients for that target system, compute usage “by group representation” for that target as follows:
1) Compute (at 910) the set of all the groups of which the clients (users or computers) are members using the group information retrieved from the IAA 140, as noted above.
2) For each group, determine (at 915) the ratio of clients that are members of the group to the total number of active users of the group; this is termed the percentage of active users.
3) Segregate the groups into two sets: mainstream and outlier. Mainstream Groups are those whose percentage of active users is at or above a certain threshold value. The threshold value is configurable and varies depending on the total number of active users in the group. In one embodiment, the default set of threshold values is 50% for groups under 10 active users, 40% for groups between 11 and 100 active users, 30% for groups between 101 and 1000 active users, and 25% for groups larger than 1000 active users. Outlier Groups are those whose percentage of active users falls below this threshold.
4) Within the mainstream groups, sort (at 920) the groups in increasing order of clients.
5) Within the mainstream groups find (at 925) any and all groups whose entire set of clients is contained by another (larger) mainstream group; subsume (at 930) the smaller mainstream group into the larger mainstream group.
6) Sort (at 935) the mainstream groups in descending order of percentage of active users.
7) In the mainstream groups, assign (at 940) each client to one and only one group, giving priority to the groups with the highest percentage of active users, i.e., groups higher in the sorted list.
8) After this step is performed, any mainstream group that falls below the threshold of active users is moved (at 945) to the outliers group.
9) Remove from the outlier groups (at 950) all clients that have been accounted for (i.e., are members of) any of the mainstream groups.
10) Sort the outlier groups (at 955) in descending order of percentage of active users.
11) Assign (at 960) each client to one and only one outlier group, giving priority to the groups with the highest percentage of active users, i.e., groups higher in the sorted list.
The result is a set of user groups for each particular target. The assignment of clients to user groups is a function of target and may change as monitor 108 filters network traffic by target.
The following example illustrates the Group Representation Algorithm:
Discovery Data indicates that a Web Server has 15 distinct clients as follows:
- a) 8 clients are members of the Quality Assurance, Engineering and Corporate Employees groups;
- b) 2 clients are members of the Engineering and Corporate Employees groups;
- c) 5 clients are members of the Corporate Employees group.
The composition of these groups is as follows:
- a) The Quality Assurance group has 8 active users;
- b) The Engineering group has 20 active users;
- c) The Corporate Employees group has 100 active users.
Thus, the percentage of active users for each group is:
- a) Quality Assurance: 8 out of 8=100%
- b) Engineering: 10 out of 20=50%
- c) Corporate Employees: 15 out of 100=15%
Both the Quality Assurance and the Engineering groups are mainstream groups while the Corporate Employees group is an outlier. However, because all members of the Quality Assurance group are also members of the Engineering group, the former is subsumed by the latter. Thus, after applying the algorithm noted above, Web Server's clients are assigned to groups as follows:
- a) 10 clients are assigned to the Engineering group which is designated a mainstream group;
- b) 5 clients are assigned to the Corporate Employees group which is designated an outlier group.
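The eleven steps and the worked example above can be traced with a Python sketch. It is one reading of the algorithm under stated assumptions, not a definitive implementation: the function names are hypothetical, a group of exactly 10 active users is assumed to fall in the 50% band (the text leaves that case open), and when a mainstream group is demoted in step 8 the mainstream assignment is simply repeated without it.

```python
def threshold_for(active_users):
    """Default mainstream threshold for a group with this many active users."""
    # Assumption: a group of exactly 10 active users uses the 50% band.
    if active_users <= 10:
        return 0.50
    if active_users <= 100:
        return 0.40
    if active_users <= 1000:
        return 0.30
    return 0.25


def assign_in_order(ordered_groups, members, pool):
    """Give each client in pool to the first listed group that contains it."""
    assigned, remaining = {}, set(pool)
    for group in ordered_groups:
        assigned[group] = members[group] & remaining
        remaining -= assigned[group]
    return assigned, remaining


def group_representation(client_groups, active_users):
    # Step 1: every group that at least one client belongs to.
    members = {}
    for client, groups in client_groups.items():
        for group in groups:
            members.setdefault(group, set()).add(client)

    def is_mainstream(group, clients):
        # Step 2: percentage of active users, compared with the group's threshold.
        return len(clients) / active_users[group] >= threshold_for(active_users[group])

    # Step 3: split into mainstream and outlier groups.
    mainstream = [g for g in members if is_mainstream(g, members[g])]
    outliers = [g for g in members if g not in mainstream]

    # Steps 4-5: smallest first; drop a group wholly contained in a later one.
    mainstream.sort(key=lambda g: len(members[g]))
    mainstream = [g for i, g in enumerate(mainstream)
                  if not any(members[g] <= members[h] for h in mainstream[i + 1:])]

    # Step 6: highest percentage of active users first.
    mainstream.sort(key=lambda g: len(members[g]) / active_users[g], reverse=True)

    # Steps 7-8: assign clients, demote groups that drop below their threshold,
    # and (assumption) reassign until no further group is demoted.
    while True:
        assigned, unassigned = assign_in_order(mainstream, members, set(client_groups))
        demoted = [g for g in mainstream if not is_mainstream(g, assigned[g])]
        if not demoted:
            break
        mainstream = [g for g in mainstream if g not in demoted]
        outliers.extend(demoted)

    # Steps 9-11: only clients not covered by a mainstream group remain.
    outliers.sort(key=lambda g: len(members[g] & unassigned) / active_users[g],
                  reverse=True)
    outlier_assigned, _ = assign_in_order(outliers, members, unassigned)

    result = {g: ("mainstream", len(c)) for g, c in assigned.items() if c}
    result.update({g: ("outlier", len(c)) for g, c in outlier_assigned.items() if c})
    return result
```

Called with the 15 Web Server clients and the three active-user counts from the example, this sketch assigns 10 clients to Engineering as a mainstream group and 5 clients to Corporate Employees as an outlier group, with Quality Assurance subsumed into Engineering.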
The Client Representation Algorithm will be described next. This algorithm presents different sets of groups that best represent the client set based on different aggregation levels. An aggregation level denotes the ratio of clients that are members of that group to the total number of clients. In one embodiment, three levels are defined:
- 1—groups that represent 33% or less of the clients;
- 2—groups that represent between 34% and 66% of the clients;
- 3—groups that represent between 67% and 100% of the clients;
The algorithm is shown in
1) Compute (at 1010) the set of all the groups of which the clients (users or computers) are members using the group information retrieved from the IAA 140, as noted above.
2) For each group, determine (at 1020) the ratio of clients that are members of the group to the total number of clients; this is termed the percentage of clients. Map this value into an aggregation level.
3) Aggregate (at 1030) all the groups into aggregation levels and select an aggregation level as the selected aggregation level.
4) Sort (at 1040) the selected groups in decreasing order of percentage of clients.
5) Within the selected groups, assign each client to one and only one group, giving priority (at 1050) to the groups with the highest percentage of clients; if a selected group's aggregation level falls below the selected aggregation level, move it back to the remaining set of groups.
6) Remove (at 1060) all the clients assigned to the selected groups from the remaining groups.
7) Repeat (at 1070) previous 3 steps until there aren't any groups whose aggregation level matches the selected aggregation level.
8) At 1080, increase the aggregation level by one, making it the selected aggregation level, and repeat the procedure starting at step 3. Once the aggregation level has reached its highest value (3), decrease it by one at every iteration until the lowest aggregation level (1) has been reached or the set of groups remaining to be processed is empty.
After applying the above algorithm, any groups whose percentage of clients falls outside the selected aggregation level are considered outliers.
The following example illustrates the Client Representation Algorithm:
Discovery Data indicates that a Web Server has 15 distinct clients as follows:
- a) 8 clients are members of the Quality Assurance, Engineering and Corporate Employees groups;
- b) 2 clients are members of the Engineering and Corporate Employees groups;
- c) 5 clients are members of the Corporate Employees group.
The percentage of clients for each group is:
- a) Quality Assurance: 8 out of 15=53% (aggregation level 2)
- b) Engineering: 10 out of 15=66% (aggregation level 2)
- c) Corporate Employees: 15 out of 15=100% (aggregation level 3)
If the selected aggregation level is 1 (0-33%), none of the groups fall under this aggregation level, therefore, all groups are designated outliers. The assignment of clients to groups proceeds as under aggregation level 2 below. If the selected aggregation level is 2 (34-66%), both Quality Assurance and Engineering fall under this aggregation level. Engineering represents a higher percentage of clients and is picked to represent 10 out of the 15 total number of clients. Quality Assurance is left with 0 (zero) clients and Corporate Employees accounts for the remaining 5 clients which represent 33% of the total clients (aggregation level 1). The algorithm attempts to select groups with next higher aggregation level (3) and, failing to find any, reduces the aggregation level first to 2 and then to 1 where Corporate Employees is selected as representative of the remaining 5 clients. Engineering is designated a mainstream group and Corporate Employees is designated an outlier.
If the selected aggregation level is 3 (67-100%), only Corporate Employees falls under this aggregation level. Since it accounts for 100% of the clients it is designated a mainstream group and no groups are designated outliers.
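By way of illustration only, the percentages and aggregation levels in the example above can be reproduced with the following minimal Python sketch. The function names, the client identifiers and the use of truncating integer division (so that 10 out of 15 yields 66%, as in the example) are assumptions made for this sketch. It covers only the percentage and level mapping, not the full Client Representation Algorithm.

```python
def client_percentages(memberships: dict[str, set[str]]) -> dict[str, int]:
    """Return, for each group, the truncated percentage of all clients in it."""
    total = len(memberships)
    counts: dict[str, int] = {}
    for groups in memberships.values():
        for group in groups:
            counts[group] = counts.get(group, 0) + 1
    # Integer division truncates, matching the 66% figure in the example.
    return {group: count * 100 // total for group, count in counts.items()}


def aggregation_level(percent: int) -> int:
    """Map a percentage to level 1 (0-33), 2 (34-66) or 3 (67-100)."""
    if percent <= 33:
        return 1
    if percent <= 66:
        return 2
    return 3


# Hypothetical client identifiers reproducing the 15-client Web Server example.
memberships: dict[str, set[str]] = {}
for i in range(8):
    memberships[f"client{i}"] = {
        "Quality Assurance", "Engineering", "Corporate Employees"}
for i in range(8, 10):
    memberships[f"client{i}"] = {"Engineering", "Corporate Employees"}
for i in range(10, 15):
    memberships[f"client{i}"] = {"Corporate Employees"}

percentages = client_percentages(memberships)
levels = {group: aggregation_level(p) for group, p in percentages.items()}
```

Here `percentages` holds the per-group client percentages and `levels` the aggregation level to which each group is mapped before any clients are reassigned.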
A Group Ranking Algorithm (1100) will be discussed next. The algorithm proceeds as follows:
1) Compute (at 1110) the set of all the groups of which the clients (users or computers) are members using the group information retrieved from the IAA 140, as noted above.
2) For each group, retrieve from the IAA (at 1115) the count of group members that are active users.
3) Order (at 1120) the groups by increasing number of active users such that a group with a small number of active users ranks higher than a group with a larger number of active users.
4) Assign each user to the highest ranked group of which he is a member.
The following example illustrates the Group Ranking Algorithm:
Discovery Data indicates that a Web Server has 15 distinct clients who are members of 3 different groups whose composition is as follows:
- a) The Quality Assurance group has 8 active users;
- b) The Engineering group has 20 active users;
- c) The Corporate Employees group has 100 active users.
Thus, ordering the groups by increasing number of active users yields:
- 1. Quality Assurance (8)
- 2. Engineering (20)
- 3. Corporate Employees (100)
As a result, any client who is a member of Quality Assurance will be assigned to the Quality Assurance group regardless of any other group membership. Likewise, any client who is a member of Engineering (but not Quality Assurance) is assigned to the Engineering group. All other clients are assigned to the Corporate Employees group.
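By way of illustration only, the Group Ranking Algorithm might be sketched in Python as follows. The function names and the client names (`alice`, `bob`, `carol`) are hypothetical, and the sketch assumes that every client belongs to at least one group for which an active-user count is available.

```python
def rank_groups(active_users: dict[str, int]) -> list[str]:
    """Order groups by increasing active-user count (smallest ranks highest)."""
    return sorted(active_users, key=lambda group: active_users[group])


def assign_clients(memberships: dict[str, set[str]],
                   active_users: dict[str, int]) -> dict[str, str]:
    """Assign each client to the highest ranked group of which it is a member."""
    position = {group: i for i, group in enumerate(rank_groups(active_users))}
    return {
        client: min(groups, key=lambda group: position[group])
        for client, groups in memberships.items()
    }


# Active-user counts from the example above.
active_users = {
    "Quality Assurance": 8,
    "Engineering": 20,
    "Corporate Employees": 100,
}

# Hypothetical clients and their group memberships.
memberships = {
    "alice": {"Quality Assurance", "Engineering", "Corporate Employees"},
    "bob": {"Engineering", "Corporate Employees"},
    "carol": {"Corporate Employees"},
}

ranking = rank_groups(active_users)
assignment = assign_clients(memberships, active_users)
```

In this sketch, `alice` is assigned to Quality Assurance regardless of her other memberships, `bob` to Engineering, and `carol` to Corporate Employees, consistent with the example.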
Asset Discovery
In one embodiment, the GBP engine 136 uses an asset discovery algorithm during discovery to determine which CBSs to include in the list of Critical Business Systems.
In one embodiment of the invention, CBS discovery is automated from the DDC information, as follows.
At 1210, network traffic is reviewed for potential critical assets.
At 1220, the number of events for each asset found is compared to the number of events of all assets found to determine if the asset is a critical asset. If not, control moves to 1260.
If, at 1220, the number of events for each asset found indicates that the asset is a critical asset, control moves to 1230. At 1230, the number of packets for each asset found is compared to the number of packets of all assets found to determine if the asset is a critical asset. If not, control moves to 1260.
If, at 1230, the number of packets for each asset found indicates that the asset is a critical asset, control moves to 1240 and the asset is designated a critical asset (or Critical Business System (CBS)). In one embodiment, an asset is a critical asset if the events per asset of this asset places the asset in the top P1 percentile of all the assets found and if the packets per asset of this asset places the asset in the top P2 percentile of all the assets found. In one such embodiment, P1 and P2 equal 90%. Control then moves to 1250 and the process is repeated until all assets found have been reviewed.
At 1260, the number of users accessing each asset is compared to the number of users accessing each of the other assets found to determine if the asset is a critical asset. If not, control moves to 1250.
If, at 1260, the number of users accessing each asset found indicates that the asset is a critical asset, control moves to 1270. At 1270, the number of packets for each asset found is compared to the number of packets of all assets found to determine if the asset is a critical asset. If not, control moves to 1250. Note: this test may use different parameters than the test at 1230.
If, at 1270, the number of packets for each asset found indicates that the asset is a critical asset, control moves to 1240 and the asset is designated a critical asset (or Critical Business System (CBS)). In one embodiment, an asset is a critical asset if the users per asset of this asset places the asset in the top P3 percentile of all the assets found and if the packets per asset of this asset places the asset in the top P4 percentile of all the assets found. In one such embodiment, P3 and P4 equal 75%. Control then moves to 1250 and the process is repeated until all assets found have been reviewed.
In one embodiment, for each DDC data set, the monitor 108 computes histograms for the number of events, the number of packets, and the number of clients. For example, one histogram lists how many hosts have 1-100 events, how many 100-200 events, and so on. For purposes of CBS discovery, these numbers are summed over all the services accessing the CBS.
From the histogram data, a percentile may be computed by dividing the domain of the data into 100 parts.
From this histogram data, CBSs are assigned as follows:
Hosts above a fixed percentile of events AND of packets (set intersection) are considered to be CBSs. In one embodiment, this fixed percentile is 90 percent.
Hosts at or above a fixed percentile of number of users that are also at or above a second fixed percentile of packets are considered to be CBSs. In one embodiment, the fixed percentile and second fixed percentile are both 75 percent.
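By way of illustration only, the two percentile rules above might be sketched in Python as follows. The `AssetStats` record, the function names and the host names are hypothetical. The sketch assumes a nearest-rank percentile computed directly from per-host totals rather than from histogram bins, and treats a host as "above" a percentile when its value is strictly greater than the cutoff. Other percentile definitions would give different results on small data sets.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetStats:
    events: int
    packets: int
    users: int


def percentile_cutoff(values: list[int], percentile: int) -> int:
    """Nearest-rank percentile (an assumption; other definitions exist)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def discover_critical_assets(stats: dict[str, AssetStats],
                             p1: int = 90, p2: int = 90,
                             p3: int = 75, p4: int = 75) -> set[str]:
    """Designate hosts that pass the events/packets or users/packets test."""
    if not stats:
        return set()
    events_cut = percentile_cutoff([s.events for s in stats.values()], p1)
    packets_cut_hi = percentile_cutoff([s.packets for s in stats.values()], p2)
    users_cut = percentile_cutoff([s.users for s in stats.values()], p3)
    packets_cut_lo = percentile_cutoff([s.packets for s in stats.values()], p4)

    critical = set()
    for host, s in stats.items():
        by_events = s.events > events_cut and s.packets > packets_cut_hi
        by_users = s.users > users_cut and s.packets > packets_cut_lo
        if by_events or by_users:
            critical.add(host)
    return critical


# Hypothetical data: ten hosts with steadily increasing activity.
stats = {
    f"host{i}": AssetStats(events=100 * (i + 1),
                           packets=1000 * (i + 1),
                           users=i + 1)
    for i in range(10)
}
critical_assets = discover_critical_assets(stats)
```

With this hypothetical data, `host9` passes the events-and-packets test at the 90th percentile, while `host8` passes only the users-and-packets test at the 75th percentile; both are designated critical.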
In one embodiment of the invention, this algorithm is applied to subsequent DDC data sets, and newly discovered CBSs are merged in with those discovered in previous applications of the algorithm.
In one embodiment of the invention, the monitor 108 ignores DDC data sets unless a minimum number of events, packets and users appears in the data.
In one embodiment of the invention, the monitor 108 ignores DDC data sets until a fixed time interval has passed after system startup.
In one embodiment of the invention, the monitor 108 maintains a list of hosts that are excluded from consideration as CBSs, such as hosts that are members of a DHCP pool.
In one embodiment of the invention, the monitor 108 additionally requires that a CBS be derived from a DDC data set more than a minimum number of times, such as 2, in an interval of time, such as one day, or it is excluded from consideration.
In one embodiment of the invention, the monitor 108 removes CBSs from its list of previously discovered CBSs if they fail to be identified as CBSs more than a minimum number of times per day, such as 3.
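By way of illustration only, the exclusion list and the minimum-occurrence requirement described above might be combined as in the following Python sketch. The function name, host names and default threshold are hypothetical, and the sketch assumes that the candidate CBSs derived from each DDC data set in the interval (such as one day) are already available as sets of host names.

```python
from collections import Counter


def confirmed_cbs(candidates_per_data_set: list[set[str]],
                  excluded: set[str],
                  min_times: int = 2) -> set[str]:
    """Return hosts derived as CBSs more than `min_times` times in the interval.

    Hosts on the exclusion list (for example, DHCP pool members) are never
    counted.
    """
    counts: Counter[str] = Counter()
    for candidates in candidates_per_data_set:
        counts.update(candidates - excluded)
    return {host for host, seen in counts.items() if seen > min_times}


# Hypothetical candidates from three DDC data sets collected in one day.
data_sets = [
    {"db1", "web1", "dhcp-17"},
    {"db1", "web1", "dhcp-17"},
    {"db1", "dhcp-17"},
]
confirmed = confirmed_cbs(data_sets, excluded={"dhcp-17"}, min_times=2)
```

Here `db1` is derived three times and is retained, `web1` is derived only twice and is dropped, and `dhcp-17` is ignored because it is on the exclusion list. The same counting approach could be applied when removing previously discovered CBSs that are no longer identified often enough.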
In one embodiment of the invention, CBS discovery is performed based on a template of services. The templates are created by the network administrator. Each template indicates a list of services and the kind of CBS that corresponds to this list. For example, a Microsoft Exchange Server might include the services: SMB, SMTP, POP, IMAP, RPC. For each host in each DDC data set, the monitor 108 computes the total services exported by that host, and compares this to each template. Hosts that match are considered to be CBSs.
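By way of illustration only, template-based discovery might be sketched in Python as follows. The function name and host names are hypothetical. The sketch assumes that a host matches a template when it exports every service listed in the template, and that additional services on the host do not prevent a match; a stricter exact-match rule is equally possible.

```python
def match_templates(host_services: dict[str, set[str]],
                    templates: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each matching host to the kinds of CBS whose templates it satisfies."""
    matches: dict[str, list[str]] = {}
    for host, services in host_services.items():
        # A template matches when all of its services are exported by the host.
        kinds = [kind for kind, required in templates.items()
                 if required <= services]
        if kinds:
            matches[host] = kinds
    return matches


# Template taken from the example above.
templates = {
    "Microsoft Exchange Server": {"SMB", "SMTP", "POP", "IMAP", "RPC"},
}

# Hypothetical per-host service totals computed from a DDC data set.
host_services = {
    "mail1": {"SMB", "SMTP", "POP", "IMAP", "RPC", "HTTP"},
    "web1": {"HTTP", "SMB"},
}

template_matches = match_templates(host_services, templates)
```

In this sketch, `mail1` is considered a CBS of kind Microsoft Exchange Server, while `web1` matches no template.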
Discovery for Policy Development
In one embodiment, assigning users to groups that best represent them using the Group Representation Algorithm, the Client Representation Algorithm or the Group Ranking Algorithm allows a network administrator to determine how the resources of the network's critical business systems are being used and tailor policy controls that enable appropriate use while preventing undesirable access behavior, as in a network monitoring system or a firewall.
This process of policy definition and refinement is greatly enhanced and simplified by the application's ability to automatically segregate users into mainstream groups and outlier groups. In one embodiment, the network administrator automatically creates policy controls that enable mainstream groups to access the network resources. Clients that belong to outlier groups are scrutinized and, if deemed to be legitimate users of the resource, the outlier group is sub-divided to include just the set of users that are authorized.
In one embodiment, the process proceeds as follows:
1. Select a timeframe for traffic analysis; this timeframe should be sufficiently long to allow for the collection of a representative sample of traffic, such as one week. Then monitor (at 1310) network traffic for the selected timeframe.
2. Select, at 1315, a CBS for which additional policy controls are desirable; this selection may be based on amount of traffic to that CBS or the number of users that are clients of the CBS, or it may be determined by the relative importance of the CBS.
3. Select, at 1320, a service offered by the CBS and review, at 1325, the group assignments presented by the application using the Group Representation Algorithm. Classify, at 1330, each user group as either mainstream or outlier:
- a. If, at 1335, a very small number of mainstream groups are presented and there are no outlier groups, automatically create policy controls at 1340 for the mainstream groups.
- b. If, at 1335, there is a mix of mainstream and outlier groups presented, use the Client Representation Algorithm at 1345 to determine at 1350 if there is a group, or small number of groups, that comprises the entire client set without any one group being overly broad (e.g., if the only available user group that contains both user groups “Finance” and “Marketing” were the user group “Employees,” that group could potentially be overly broad). Cross-check access by this group to the same or similar services against other CBSs to further substantiate the presumed access rights. If such a group (or groups) is found, create the corresponding policy controls at 1340.
- c. If the Client Representation Algorithm does not yield a strong set of candidate groups, or the groups presented by the Group Representation Algorithm are all outliers, create at 1360 one or more new groups that are a better fit to the client set and recompute the assignment of clients to groups at 1325 using the new groups.
The identity-aware network monitoring system and method described above help lower cost and enable faster and broader deployment of visibility into “who is doing what and where” across applications and networks.
Ultimately, the systems and methods described above help increase efficiency and compliance by:
Replacing time-intensive manual discovery surveys
Simplifying the process of defining identity-based controls
Dispensing with the inaccurate manual verification of logs
Decreasing investigation time for access violations with correlated data
Reducing disruption of erroneous infrastructure and access changes
Monitoring and controlling access to network resources without application recoding
At the same time, the systems and methods described above help reduce risk by:
Detecting inappropriate user behavior after network authentication and authorization
Eliminating the bypassing of security gateways and access controls
Providing a compensating control for unprotected custom applications
Detecting abuses from deprovisioned users and users assigned to new roles
Monitoring the use of privileged accounts.
In an example embodiment of a machine-readable medium 1400 that includes the instruction set 1450, the instructions, when executed by a machine, cause the machine to perform operations such as automatic grouping of users in groups and automatic discovery of network assets and network policy.
Thus, methods and a machine-readable medium including instructions for displaying and controlling network behavior based on identity have been described. Although the various methods for displaying and controlling network behavior based on identity have been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader embodiment of the disclosed subject matter. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.
Although specific embodiments have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that any arrangement that achieves the same purpose, structure, or function may be substituted for the specific embodiments shown. This application is intended to cover any adaptations or variations of the example embodiments of the invention described herein. It is intended that this invention be limited only by the claims, and the full scope of equivalents thereof.
Claims
1. In a network having a plurality of network entities, including network users and network assets, a method of discovering network entities, comprising:
- monitoring network traffic, wherein monitoring includes finding network entities in the network traffic;
- if the network entities are network assets, determining if the network entities are critical network assets; and
- if the network entities are network users, classifying the network users automatically into user groups; and
- displaying the network traffic as a function of the critical network assets and the user groups.
2. The method of claim 1, wherein classifying the network users automatically into user groups comprises:
- assigning clients to user groups, wherein assigning clients to user groups includes assigning one or more clients to multiple user groups;
- sorting the groups; and
- processing the groups so that each client is a member of a single group.
3. The method of claim 2, wherein sorting the groups includes sorting clients into mainstream groups and outlier groups.
4. The method of claim 3, wherein sorting the groups includes applying a client representation algorithm to the groups.
5. The method of claim 3, wherein sorting the groups includes applying a group representation algorithm to the groups.
6. The method of claim 2, wherein sorting the groups includes:
- determining, for each group, a percentage of active users;
- sorting the groups in descending order of percentage of active users; and
- assigning each client to a single group, giving priority to the groups with the highest percentage of active users.
7. The method of claim 1, wherein determining if the network entities are critical network assets includes:
- determining, for each of the network assets, events, packets and users per network asset;
- for each network asset found in the network traffic, if the events per asset of the asset are greater than the events per asset of a first percentile of all assets found in the network traffic and if the packets per asset of the asset are greater than the packets per asset of a second percentile of all assets found in the network traffic, designating said asset as a critical asset; and
- for each network asset found in the network traffic, if the users per asset of the asset are greater than the users per asset of a third percentile of all assets found in the network traffic and if the packets per asset of the asset are greater than the packets per asset of a fourth percentile of all assets found in the network traffic, designating said asset as a critical asset.
8. In a network having a plurality of network entities, including network users and network assets, a network monitor, comprising:
- a data collector, wherein the data collector captures information indicative of network traffic; and
- a data analyzer connected to the data collector, wherein the data analyzer decodes and analyzes the information captured by the data collector;
- a processor connected to the data analyzer, wherein the processor finds network entities in the network traffic and wherein: if the network entities are network assets, the processor determines if the network entities are critical network assets; and if the network entities are network users, classifying the network users automatically into user groups; and
- a user interface connected to the processor, wherein the user interface displays the network traffic as a function of the critical network assets and the user groups.
9. A method of discovering a network policy, comprising:
- selecting a timeframe for traffic analysis;
- selecting a critical business system for which additional policy controls are desirable;
- selecting a service offered by the CBS;
- displaying user group assignments associated with the service; and
- creating policy controls for the service, wherein creating policy controls for the service includes:
- automatically classifying user groups as mainstream or outlier based on the group representation algorithm; and
- if there are no outlier groups, creating policy controls automatically for the mainstream groups.
10. The method of claim 9, wherein creating policy controls includes checking access controls to similar service on other critical business systems to substantiate the created policy controls.
11. The method of claim 9, wherein the method further comprises:
- if there are outlier groups and mainstream groups, determining, based on the client representation algorithm, whether there are one or more groups that cover the client set without any one group being overly broad and, if so, creating policy controls associated with the one or more groups; and
- if there are outlier groups and mainstream groups and one cannot cover the client set without any one group being overly broad, adding new user groups.
12. A method of displaying identity-based network behavior, comprising:
- creating a grid having a first and a second axis;
- assigning client groups to the first axis;
- assigning critical business systems to the second axis;
- monitoring network traffic;
- displaying the network traffic on the grid as a function of client group and critical business system, wherein displaying includes associating a point on the first axis with each client group, associating a point on the second axis with each critical business system and displaying a shape at intersections in the grid between points on the first and second axes, wherein the shape varies in size as a function of network traffic associated with a particular client group and a particular critical business system; and
- displaying the network traffic graphically as an extended timeline of trend data, wherein the timeline of trend data includes clear gradations of time periods, wherein the clear gradations of time periods are used to select data sets associated with the time periods for display on the grid.
13. The method of claim 12, wherein displaying a shape includes filtering network traffic that is not likely to be a problem.
14. The method of claim 12, wherein displaying a shape includes filtering by one or more services observed in the network traffic.
15. The method of claim 14, wherein filtering can be constrained by the user to retain information on critical business systems.
16. The method of claim 12, wherein displaying the network traffic includes displaying points on each axis with labeled tabs such that selection of a tab results in dynamic display of corresponding data in a separate window on the same screen.
17. The method of claim 12, wherein displaying a shape at intersections in the grid between points on the first and second axes includes responding to selection of the bubble by displaying in separate windows, on the same screen as the grid, data corresponding to the critical business system associated with the intersection and data corresponding to the user group associated with the intersection.
18. The method of claim 17, wherein the critical business system window and the user group window both highlight data relating to the intersection.
19. The method of claim 12, wherein assigning client groups to the first axis includes performing user discovery automatically to derive the client groups.
20. The method of claim 12, wherein assigning critical business systems to the second axis includes performing asset discovery automatically to determine the critical business systems.
21. The method of claim 12, wherein the size of the shape is a function of the number of unique users associated with a particular client group and a particular critical business system.
22. The method of claim 12, wherein the dynamic display of corresponding data in a separate window includes display of policy controls, wherein the policy controls are associated with a specific service used by a particular group within a particular system and wherein the policy controls are displayed as an entity that can be selected for subsequent monitoring and control of the policy.
23. The method of claim 12, wherein the dynamic display of corresponding data in a separate window includes display of outlier threshold controls, wherein the outlier threshold controls are associated with a specific service within a particular system and wherein the outlier threshold controls can be actuated to dynamically adjust the outlier threshold.
24. An article comprising a computer readable medium having instructions thereon, wherein the instructions, when executed by a machine, create a system for executing the method of claim 12.
25. A method of displaying identity-based network behavior, comprising:
- creating a grid having a first, second and third axis;
- assigning client groups to the first axis;
- assigning critical business systems to the second axis;
- assigning services to a third axis;
- monitoring network traffic;
- displaying network traffic on the grid as a function of client group, service and critical business system, wherein displaying includes associating a point on the first axis with each client group, associating a point on the second axis with each critical business system, associating a point on the third axis with each service and displaying a shape at intersections in the grid between points on the first, second and third axes, wherein the shape varies in size as a function of network traffic associated with a particular client group, a particular service and a particular critical business system.
26. The method of claim 25, wherein displaying a shape includes filtering network traffic that is not likely to be a problem.
27. The method of claim 25, wherein displaying a shape includes filtering by one or more services observed in the network traffic.
28. The method of claim 27, wherein filtering can be constrained by the user to retain information on critical business systems.
29. A method of controlling a network, comprising:
- storing information on network traffic;
- displaying the information as identity-based network behavior, wherein displaying includes:
- determining if network assets are critical network assets;
- classifying the network users automatically into user groups; and
- displaying the network traffic as a function of the critical network assets and the user groups.
30. The method of claim 29, wherein storing includes applying heuristics to filter network traffic.
31. The method of claim 29, wherein classifying the network users automatically into user groups includes classifying user groups as mainstream or outlier.
32. A method of classifying clients into user groups automatically, comprising:
- assigning clients to user groups, wherein assigning clients to user groups includes assigning one or more clients to multiple user groups;
- sorting the groups; and
- processing the groups so that each client is a member of a single group.
33. The method of claim 32, wherein sorting the groups includes sorting clients into mainstream groups and outlier groups.
34. The method of claim 32, wherein sorting the groups includes:
- determining, for each group, a percentage of active users;
- identifying mainstream groups, wherein mainstream groups are client groups with percentages of active users above a pre-defined threshold;
- eliminating mainstream groups whose clients are all members of a larger mainstream group; and
- sorting the remaining mainstream groups in descending order of percentage of active users; and
- wherein processing the groups so that each client is a member of a single group includes:
- assigning each client in one or more mainstream groups to a single mainstream group, giving priority to the mainstream groups with the highest percentage of active users;
- determining if any of the client groups have a percentage of active users below the pre-defined threshold;
- if any of the client groups have a percentage of active users below the pre-defined threshold, reclassifying those client groups as outlier groups;
- removing from the outlier groups all clients that are members of one of the remaining mainstream groups;
- sorting the outlier groups in descending order of percentage of active users; and
- assigning each client in one or more outlier groups to a single outlier group, giving priority to the outlier groups with the highest percentage of active users.
35. The method of claim 32, wherein sorting the groups includes:
- calculating, for each group, the ratio of clients that are members of the group to the total number of clients;
- mapping each group to an aggregation level as a function of the ratio calculated for each group;
- selecting and processing a selected aggregation level, wherein selecting and processing includes:
- a) selecting all the groups whose aggregation level is the same as the selected aggregation level;
- b) sorting the selected groups in decreasing order of percentage of clients;
- c) within the selected groups in the selected aggregation level, assign each client to one and only one group, giving priority to the groups with the highest percentage of clients;
- d) if the aggregation level of a group within the selected group falls below the selected aggregation level after clients are removed, mapping the group to its appropriate aggregation level;
- e) removing all the clients assigned to the selected groups in the selected aggregation level from the remaining groups; and
- f) selecting and processing another aggregation level until each client is assigned to only one group.
36. The method of claim 35, wherein selecting and processing another aggregation level includes:
- determining if the highest level aggregation level has been selected and processed;
- if the highest level aggregation level has not been selected and processed, determining if the most recent aggregation level selected is the highest level aggregation level;
- if the highest level aggregation level has not been selected and processed and if the most recent aggregation level selected is not the highest level aggregation level, selecting and processing the aggregation level that is one higher than the most recent aggregation level as the selected aggregation level, selecting all the groups whose aggregation level is the same as the selected aggregation level and repeating a-f; and
- if the highest level aggregation level has been selected and processed, selecting and processing the aggregation level that is one lower than the most recent aggregation level as the selected aggregation level, selecting all the groups whose aggregation level is the same as the selected aggregation level and repeating a-f.
37. The method of claim 32, wherein sorting the groups includes:
- determining, for each group, a percentage of active users;
- sorting the groups in descending order of percentage of active users; and
- assigning each client to a single group, giving priority to the groups with the highest percentage of active users.
38. A method of discovering critical assets in a computer network having a plurality of assets, comprising:
- monitoring network traffic, wherein monitoring includes finding assets in the network traffic;
- determining, for each of the assets found in the network traffic, events, packets and users per asset;
- for each asset found in the network traffic, if the events per asset of the asset are greater than the events per asset of a first percentile of all assets found in the network traffic and if the packets per asset of the asset are greater than the packets per asset of a second percentile of all assets found in the network traffic, designating said asset as a critical asset; and
- for each asset found in the network traffic, if the users per asset of the asset are greater than the users per asset of a third percentile of all assets found in the network traffic and if the packets per asset of the asset are greater than the packets per asset of a fourth percentile of all assets found in the network traffic, designating said asset as a critical asset.
39. The method of claim 38, wherein the first and second percentiles are set at the 90th percentile while the third and fourth percentiles are set at the 75th percentile.
40. The method of claim 38, wherein determining includes computing histograms for the number of events, the number of packets, and the number of clients.
41. The method of claim 38, wherein monitoring further includes storing monitored network traffic over predefined periods of time as data sets and wherein finding includes determining if subsequent data sets introduce new assets to be considered as critical assets.
Type: Application
Filed: May 21, 2009
Publication Date: Mar 18, 2010
Inventors: Luis Filipe Pereira Valente (Palo Alto, CA), Derek Patton Pearcy (San Francisco, CA), Geoffrey Howard Cooper (Palo Alto, CA), Kieran Gerard Sherlock (Palo Alto, CA)
Application Number: 12/454,773
International Classification: H04L 12/26 (20060101); H04L 12/28 (20060101);