PLATFORM FOR POLICY-DRIVEN COMMUNICATION AND MANAGEMENT INFRASTRUCTURE

Info

Publication number: 20110066841
Type: Application
Filed: Sep 14, 2010
Publication Date: Mar 17, 2011
Inventors: Dennis Sidney GOODROW (Santa Rosa, CA), Peter Benjamin Loer (Oakland, CA), Christopher Jacob Loer (San Francisco, CA), Jonathan Shih-Shuo Fan (Oakland, CA), Gregory Mitchell Toto (Piedmont, CA), Amrit Tsering Williams (Alamo, CA), John Edward Firebaugh (San Francisco, CA), Jeremy Scott Spiegel (San Francisco, CA), Jesse Ward-Karet (San Francisco, CA), Benjamin John Kus (Alameda, CA)
Application Number: 12/881,995

Abstract

A policy-driven communication and management infrastructure may include components such as Agent, Server and Console, policy messages, and Relays to deliver security and system management to networked devices. An Agent resides on a Client, acting as a universal policy engine for delivering multiple management services. Relays, Clients additionally configured to each behave as though they were a root Server, Relaying information to and from other Clients, permit Clients to interact with the root Server through the Relay, enabling information exchange between Client and Server. Such information exchange allows Clients to gather information, such as new policy messages, from the Server, to pass status messages to the Server and to register their network address so that they can be readily located. Automatic Relay selection enables Clients and Relays to select their own parent Relays, thus allowing Clients and Relays to discover new routing paths through the network without administrator input.

Description

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims benefit of U.S. provisional patent application No. 61/242,278, filed Sep. 14, 2009, the entirety of which is incorporated herein by this reference thereto. This application is related to U.S. patent application Ser. No. 10/804,799, now U.S. Pat. No. 7,398,272, filed Mar. 19, 2004, the entirety of which is incorporated herein by this reference thereto.

BACKGROUND OF THE INVENTION

1. Field of the Invention

Generally, the invention relates to management of enterprise-scale networks of computational devices. More particularly, the invention relates to a Platform for a policy-driven communication and management infrastructure.

2. Background Discussion

Information technology (IT) administrators in enterprises everywhere face a daunting task of managing the software and hardware on tens, hundreds, or thousands of machines in their domains. With many incompatibilities, patches, and policy advisories being announced every day, the management task involves much more than just acquisition and installation of updates and patches, for example. Simply keeping aware of all potentially problematic situations on hardware and software products used in an enterprise is more than a full-time job. Dealing with user requests and complaints adds still further to the demands of the job. Thus, it is required that IT managers be able to anticipate situations which may arise in a specific enterprise and address them proactively. Maintaining such a state of readiness may require an IT manager to understand the configuration of the hardware and software in a given network, to keep track of policy advisories, updates, incompatibilities and patches relevant to the specific enterprise, and to match those policy advisories, updates, and patches with the specific equipment in the enterprise. In a large enterprise, such management tasks involve monitoring of and policy dissemination to, perhaps, hundreds of thousands of computational devices by an administrator. Conventionally, management Platforms in such large enterprises employ a communication infrastructure that is conducive mainly to coarse-grained, one-to-many interaction, typically involving large numbers of devices, occasionally even the entire network, rather than to fine-grained, per-endpoint policy determination.

SUMMARY

A policy-driven communication and management infrastructure may include components such as Agent, Server and Console, policy messages, and Relays to deliver security and system management to networked devices. An Agent resides on a Client, acting as a universal policy engine for delivering multiple management services. Relays are Clients additionally configured to each behave as though they were a proxy for the root Server, Relaying information to and from other Clients, permitting Clients to interact with the root Server through the Relay, and facilitating information exchange between Client and Server. Such information exchange allows Clients to gather information, such as new policy messages, from the Server, to pass status messages to the Server and to register their network address so that they can be readily located. Automatic Relay selection enables Clients and Relays to select their own parent Relays, thus allowing Clients and Relays to discover routing paths through the existing network without administrator input.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 provides a diagram of a machine in the exemplary form of a computer system within which a set of instructions, for causing the machine to perform any one of the methodologies discussed herein below, may be executed;

FIG. 2 provides a block diagram of a Relay hierarchy in a Platform for a policy-driven communication and management infrastructure;

FIG. 2A provides a block diagram of a proxy agent according to the invention;

FIG. 3 provides a flow diagram of a process for manual Relay selection in the Platform of FIG. 2;

FIG. 4 provides a flow diagram of a process for automated Relay selection in the Platform of FIG. 2;

FIG. 5 provides a flow diagram of a Relay selection failover process in the Platform of FIG. 2;

FIG. 6 provides a flow diagram of a Relay reselection process in the Platform of FIG. 2;

FIG. 7 provides a flow diagram of a process for Dynamic download of untrusted content in the Platform of FIG. 2;

FIG. 8 provides a state transition diagram for a Relay in the Platform of FIG. 2;

FIG. 9 provides a state transition diagram for a Server in the Platform of FIG. 2;

FIG. 10 provides a schematic of a process for Client registration in the Platform of FIG. 2;

FIG. 11 provides a schematic of a process for non-repudiation in the Platform of FIG. 2;

FIG. 12 provides a schematic of a process for secure data distribution in the Platform of FIG. 2;

FIG. 13 provides a schematic of a direct connection process between a Console and a Client in the Platform of FIG. 2;

FIG. 14 provides a schematic of a direct connection process between a first Client and a second Client in the Platform of FIG. 2;

FIG. 15 provides a diagram of a Network Asset Map in the Platform of FIG. 2; and

FIG. 16 provides a screen shot of a Console Operator interface from the Platform of FIG. 2.

DETAILED DESCRIPTION

A policy-driven communication and management infrastructure may include components such as Agent, Server and Console, policy messages, and Relays to deliver security and system management to networked devices. An Agent resides on a Client, acting as a universal policy engine for delivering multiple management services. Relays, Clients additionally configured to each behave as though they were a root Server, Relaying information to and from other Clients, permit Clients to interact with the root Server through the Relay, enabling information exchange between Client and Server. Such information exchange allows Clients to gather information, such as new policy messages, from the Server, to pass status messages to the Server and to register their network address so that they can be readily located. Automatic Relay selection enables Clients and Relays to select their own parent Relays, thus allowing Clients and Relays to discover new routing paths through the network without manual administrator input.

DEFINITIONS

Action: actions are typically scripts that can customize a specific solution for each Client, using a series of scripting commands and Relevance expressions. Although the Relevance language itself can't alter a Client, it can be used to direct actions in a way that parallels the original trigger. For instance, a Fixlet might use the Relevance language to inspect a file in the system folder. Using a similar Relevance clause, the Action can then target that same file without knowing explicitly where that folder resides. This allows the Action author (and issuer) to concentrate on the issue at hand without worrying about the vagaries of each individual computer system. AKA “ActionScript”.
ActionID: a unique identifier for an Action
Agent: Software that resides on a Client and acts as a universal policy engine capable of delivering multiple management services. A single Agent can execute a diverse and extensible array of management services ranging from real-time Client status reporting, to patch and software distribution, to security policy enforcement. By assigning responsibility for reporting and management actions to endpoints themselves, the Platform enables visibility and management of IT infrastructures ranging from hundreds to hundreds of thousands of desktop, mobile and Server computers.
Client: an endpoint device in a network under management by a Platform for policy-driven communication and management infrastructure.
Console: an operations control center for administrators, which connects to the Server, that includes graphical displays of device, group, and enterprise-wide device status and dashboards for executing management actions through the infrastructure. The Console also includes reporting functions and templates that enable graphical and tabular views of infrastructure status.
Dashboard: Dashboard documents pop up in the main window of the Console when selected from a ‘Dashboards’ icon in a Domain Panel navigation tree. Dashboards tap into the Platform Database to provide the Operator with timely and compact high-level views of the network and allow an administrator to take action based on those views.
Download Request: In an embodiment, a download request may include a hash and a size that uniquely identify the file being requested, along with the information on how to retrieve the file. If a Client wants multiple files for an Action, it submits a set of DownloadRequests in one interaction with the Relay. Although the interaction is batched, each request is handled individually by both Relays and the Server.
Dynamic Download aka “Client-initiated Download”: In an embodiment, a download whose hash, size and URL are not known at the time an Action is issued. Instead, the Client determines this information and then provides it to the Server, which fetches the file for the Client.
FileID: A FileID is a pair (SHA-1 hash, file size in bytes) used to uniquely identify a file.
Fixlet or Fixlet message: Instructions disseminated to the Agent to perform a management or reporting Action. Fixlet messages can be programmed to target specific groups of devices to perform management actions.
Hash-based Download: In an embodiment, a download that is requested or referred to by a “HashSizePair”. In an embodiment, this type of download is requested by a Client using a “DownloadRequest” plug-in, rather than the magic URLs that index-based downloads rely on. A hash-based download can be either static or dynamic.
Index-based Download aka “Legacy Download”: In an embodiment, a download that is referred to by a Client using an ActionID/Index pair, where the index is generated at the time the Action is issued. In an embodiment, an “indexed download” is a species of static download, because it is difficult to accommodate in the indexing strategy the case where the index is unknown at the time an Action is created. In an embodiment, indexed downloads can be requested without providing a hash, in which case the download represents whatever the URL happens to contain at the time an Action is created.
Relay: A Relay is a software module that executes as a shared service on non-dedicated hardware. Alternatively, “Relay” can refer to the hardware on which Relay software is running. Relays act as concentration points for Fixlet messages on network infrastructures and help reduce network bandwidth requirements for distribution of Fixlets and content such as software, patches, updates, and other information. Relays also offer a failover mechanism to keep managed Clients in touch with the Console should normal communications channels go dark or become overloaded with other traffic.
Server: Software that provides a control center and repository for managed system configuration data, software updates and patches, and other management information. In the alternative, “Server” can denote a computing machine running such software within a network under management.
Site: Sites are collections of Fixlet messages and other content to which an Operator of a Platform deployment may subscribe one or more Clients in the Operator's network. Sites may be created by the Platform manufacturer or by one or more third parties. Additionally, deployment Operators may create custom sites that contain internally generated content. Furthermore, the Operator may create sites, Integrations, which integrate internally- and externally-sourced content.
Static Download aka “Server-initiated Download”: In an embodiment, a download requested by the Console at the time an Action is taking place.
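
As an illustration of the FileID and Download Request definitions above, the following minimal sketch computes a (SHA-1, size) pair for a block of content and removes repeated entries from a batch of requests. The names FileID, file_id_from_bytes and dedupe_requests are hypothetical and do not come from the Platform; only the pairing of a SHA-1 hash with a size in bytes is taken from the definitions.

```python
import hashlib
from typing import NamedTuple


class FileID(NamedTuple):
    """Hypothetical container for the (SHA-1, file size in bytes) pair."""
    sha1: str   # hex-encoded SHA-1 digest of the file content
    size: int   # length of the file content in bytes


def file_id_from_bytes(data: bytes) -> FileID:
    # Hash the full content and record its length in bytes.
    return FileID(hashlib.sha1(data).hexdigest(), len(data))


def dedupe_requests(requests: list[FileID]) -> list[FileID]:
    # A batch may name the same file more than once; keep the first
    # occurrence of each FileID and preserve the original order.
    return list(dict.fromkeys(requests))
```

Because the pair is hashable, it can serve directly as a cache key on a Relay or Server, which is one plausible reason to identify a file by content rather than by URL.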

Referring now to FIG. 1, shown is a diagrammatic representation of a machine in the exemplary form of a computer system 100 within which a set of instructions for causing the machine to perform any one of the methodologies discussed herein below may be executed. In alternative embodiments, the machine may comprise a network router, a network switch, a network bridge, personal digital assistant (PDA), a cellular telephone, a web appliance or any machine capable of executing a sequence of instructions that specify actions to be taken by that machine.

The computer system 100 includes a processor 102, a main memory 104 and a static memory 106, which communicate with each other via a bus 108. The computer system 100 may further include a display unit 110, for example, a liquid crystal display (LCD) or a cathode ray tube (CRT). The computer system 100 also includes an alphanumeric input device 112, for example, a keyboard; a cursor control device 114, for example, a mouse; a disk drive unit 116, a signal generation device 118, for example, a speaker, and a network interface device 128.

The disk drive unit 116 includes a machine-readable medium 124 on which is stored a set of executable instructions, i.e. software, 126 embodying any one, or all, of the methodologies described herein below. The software 126 is also shown to reside, completely or at least partially, within the main memory 104 and/or within the processor 102. The software 126 may further be transmitted or received over a network 130 by means of a network interface device 128.

In contrast to the system 100 discussed above, a different embodiment of the invention uses logic circuitry instead of computer-executed instructions to implement the processing. Depending upon the particular requirements of the application in the areas of speed, expense, tooling costs, and the like, this logic may be implemented by constructing an application-specific integrated circuit (ASIC) having thousands of tiny integrated transistors. Such an ASIC may be implemented with CMOS (complementary metal oxide semiconductor), TTL (transistor-transistor logic), VLSI (very large scale integration), or another suitable construction. Other alternatives include a digital signal processing chip (DSP), discrete circuitry (such as resistors, capacitors, diodes, inductors, and transistors), field programmable gate array (FPGA), programmable logic array (PLA), programmable logic device (PLD), and the like. It is to be understood that embodiments of this invention may be used as or to support software programs executed upon some form of processing core (such as the Central Processing Unit of a computer) or otherwise implemented or realized upon or within a machine or computer readable medium. A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine, e.g. a computer. For example, a machine readable medium includes read-only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other form of propagated signals, for example, carrier waves, infrared signals, digital signals, etc.; or any other type of media suitable for storing or transmitting information.

Referring now to FIG. 2, shown is a Relay hierarchy in a Platform 200 for creating a policy-driven, communications and management infrastructure for delivery of security and management services to networked computational devices, such as desktop, laptop/notebook and Server computers. In an embodiment, components of the Platform may include at least one Client 202 running an Agent, at least one Server and Console 204, Fixlet messages (indicated by the arrows showing data flow between elements), and zero or more Relays 206. The Server and Console are shown as the same machine in FIG. 2, but in many embodiments of the invention the Server and Console are separate machines. Thus, the Server 204 in FIG. 2 may comprise only the server function and a separate computer, connected to the Server, would be provided to implement the Console function. In addition to the Relays 206, the Relay hierarchy typically includes a top-level Relay 208 that directly interacts with the Server 204.

Key components of the Platform include the Agent 202, the Server and Console 204, the Fixlet messages, and the Relays 206, 208. The Platform creates a lightweight communications and management infrastructure for delivery of security and system management services to networked desktop, laptop/notebook and Server computers. By assigning responsibility for reporting and management actions to endpoints themselves, the Platform enables visibility and management of IT infrastructures ranging from hundreds to hundreds of thousands of desktop, mobile and Server computers.

The Agent 202 resides on managed devices and acts as a universal policy engine capable of delivering multiple management services. A single Agent 202 can execute a diverse and extensible array of management services that range from real-time Client status reporting, to patch and software distribution, to security policy enforcement.

The Agent's role in the Platform may be described as that of a Policy Engine: a piece of software and a computational context for evaluating content. Thus, the Agent constitutes a computational resource that uses one or more inspectors to examine its context, decide what is relevant, report properties, take Action in that environment, and report on the success or failure of the actions. Thus, the Agent gives an administrator visibility into the context and controls it. The motivation for provision of a policy engine thus may be the realization that any computing resource, including a physical or virtual machine, or a machine that is a delegate for another machine or a piece of hardware, can benefit from management by having a policy engine that can inspect properties of the entity that is being managed, apply changes to the environment, and report on whether those changes were effective.

The Agent also automatically notifies the Server and Console 204 of changes in managed device configuration, providing a real-time view of device status. In addition to a standard array of management services, customers and developers can create custom policies and services using a published authoring language. In various embodiments, the Agent runs on all versions of the MICROSOFT WINDOWS (Microsoft Corporation, Redmond Wash.) operating system since WINDOWS 95, UNIX, LINUX and MAC OS (APPLE COMPUTER, INC., Cupertino Calif.) operating systems, as well as WINDOWS MOBILE and POINT-OF-SALE variants of the Windows operating system, enabling administrators to consolidate management of heterogeneous infrastructures from the Console.

The invention herein extends the notion of an Agent beyond a computer to devices or logical structures, such as proxy-agents (also referred to as pseudo-agents), that are physically or logically proximate to a computer, and that are used to give visibility and control of assets that cannot, for technical or policy reasons, have a native agent installed. Proxy-agents are disclosed, for example, in co-assigned patent application to Lippincott, L. E., et al, Pseudo-Agents, U.S. patent application Ser. No. 12/044,614 (filed Mar. 7, 2008), which is incorporated herein in its entirety by this reference thereto.

Proxy-agents can be understood by reference to FIG. 2A. A proxy-agent 50 is deployed to manage each of one or more different devices, for example physical machine 1 (54) and physical machine 2 (56) via a virtual machine management system 52. For example, a router can have a proxy-agent. There can be a proxy-agent for such devices as a network printer on the file server, or a mobile device that resides most of its time in the local office, but whose logical presence is over a cell network and that is in touch with a mobile enterprise server back in the central office. The physical device so managed, for example physical machine 1 (54), can itself serve as a natural agent for one or more virtual machines, e.g. VM1 and VM2, which machines can themselves include an agent A.

Another important variant is a proxy-agent that indirectly manages a set of devices by way of one or more other management systems. The example shown in FIG. 2A thus provides a virtual management system 52, for example a Blackberry enterprise server, which is a management system that manages a collection of Blackberry devices. In this example, a proxy-agent manages those devices by interacting with the Blackberry enterprise server.

The Server 204 is a software-based package that provides a control center and repository for managed system configuration data, software updates and patches, and other management information. In an embodiment, the Console 204, which runs from the Server 204, provides an operations control center for administrators that includes graphical displays of device, group, and enterprise-wide device status and dashboards for executing management actions through the management infrastructure. The Console may also include reporting functions and templates that enable graphical and tabular views on infrastructure status.

Fixlet messages are instructions to the Agent 202 to perform a management or reporting Action. Fixlet messages can be programmed to target specific groups of devices to perform management actions. As noted above, in an embodiment, users have the option of writing custom Fixlet messages.

Relays 206, 208 act as concentration points for Fixlet messages on network infrastructures. A Relay is a software module that executes as a shared service on non-dedicated hardware. Relays help reduce network bandwidth requirements for distribution of Fixlets and content such as software, patches, updates, and other information. In an embodiment, Relays 206, 208 include a failover mechanism to keep managed Clients in touch with the Console should normal communications channels go dark or become overloaded with other traffic. In an embodiment, Relays allow an N-tier hierarchy to be created for the transmission of information from the Clients to the Server in the enterprise.

In an embodiment, Relays are included as network components to significantly improve the performance of an installation. Downloads and patches, which are often large files, represent by far the greatest fraction of bandwidth. Relays are designed to take over the bulk of the download burden from the Server. Rather than downloading patches directly from a Server, Clients can instead be instructed to download from designated Relays, significantly reducing both Server load and network traffic. Relays help in the upstream direction as well, compiling and compressing data received from the Clients before passing it on to the Server. As above, any Client can be programmed to serve as a Relay.

Relays simultaneously mitigate at least two bottlenecks:

Relieving the Load on Servers

The Server has many duties, among them, the taxing job of distributing patches and other files. A Relay can be set up to ease this burden, so that the Server does not need to distribute the same files to every Client. Instead, the file is sent once to the Relay, which in turn distributes it to other Clients. The overhead on the Server is reduced by the ratio of Relays to Clients. If one has a hundred Clients and one Relay, the Server need only process one percent of the downloads.
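
The arithmetic in the preceding paragraph can be checked with a short sketch. It assumes, for illustration only, that every Client downloads through a Relay and that each Relay fetches a given file from the Server exactly once; the function names are hypothetical.

```python
def server_downloads(num_clients: int, num_relays: int) -> int:
    """Copies of one file the Server must send.

    Assumption: with no Relays the Server serves every Client directly;
    otherwise it serves each Relay once and the Relays serve the Clients.
    """
    if num_relays == 0:
        return num_clients
    return num_relays


def server_load_fraction(num_clients: int, num_relays: int) -> float:
    # Fraction of the no-Relay download load that remains on the Server.
    return server_downloads(num_clients, num_relays) / num_clients
```

With a hundred Clients and one Relay, server_load_fraction(100, 1) evaluates to 0.01, matching the one percent figure given in the text.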

Reducing Congestion on Low-Bandwidth Connections

If, for example, one has a Server communicating with a dozen computers in a remote office over a slow VPN (virtual private network), one of those computers may be designated as a Relay. Then, instead of sending patches over the VPN to every Client independently, the Server need only send a single copy to the Relay. That Relay, in turn, distributes the file to the other computers in the remote office over its own fast LAN (local area network). This effectively removes the VPN bottleneck for remote groups on the network.

Relays also function to reduce the total network usage when used on subnets connected through switches on a LAN.

Relay Characteristics

In an embodiment, a Relay takes over most of the download duties of the Server. If several Clients simultaneously request files from a Relay, a significant amount of the computer's resources may be used to serve those files. Other than that, the duties of the Relay are relatively undemanding. The requirements for a Relay computer vary widely depending on one or more of the following: (1) the number of connected Clients that are downloading files; (2) the size of each download; and (3) the period of time allotted for the downloads.

A Relay can be installed on any ordinary workstation, but if several Clients simultaneously download files, it may slow the computer down. Workgroup file Servers and other Server-quality computers that are always turned on may be good candidates for installing a Relay.

Relay Selection

Although Clients can automatically seek out and connect to an available Relay, one may want to control the process manually. If so, for each Client in the network, one may specify both a primary and a secondary Relay. The Client first attempts to download any patches from its primary Relay. However, if the primary Relay is unavailable (because the computer has crashed, the hard drive has run out of space, the computer is off, etc.), the Client can download files from the secondary Relay.

In an embodiment, Relays have failover capability. Thus, if the primary Relay fails, the Client connects to the secondary Relay. If the secondary also fails (or if no secondary has been designated) then the Client automatically reverts to downloading files directly from the Server. In an embodiment, one or more tertiary Relays can be designated for a Client. In an embodiment, one can optimize a pair of Relays by splitting the connected Clients into two groups of roughly equal size. One group designates computer A as primary and B as secondary. The other group reverses the order, thus cutting the overhead of each Relay in half, while still providing a backup.
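
The pairing scheme and fallback order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Platform's implementation: the function names, the SERVER placeholder and the is_available callback are all hypothetical, and Clients are split simply by alternating positions in the input list.

```python
from typing import Callable, Optional

SERVER = "server"  # hypothetical placeholder for the root Server


def assign_relay_pairs(
    clients: list[str], relay_a: str, relay_b: str
) -> dict[str, tuple[str, str]]:
    """Map each Client to (primary, secondary), alternating the order."""
    assignment = {}
    for i, client in enumerate(clients):
        if i % 2 == 0:
            assignment[client] = (relay_a, relay_b)
        else:
            # The second group reverses the order, so each Relay is the
            # primary for roughly half of the Clients.
            assignment[client] = (relay_b, relay_a)
    return assignment


def pick_download_source(
    primary: Optional[str],
    secondary: Optional[str],
    is_available: Callable[[str], bool],
) -> str:
    """Try the primary, then the secondary, then fall back to the Server."""
    for relay in (primary, secondary):
        if relay is not None and is_available(relay):
            return relay
    return SERVER
```

Alternating by position keeps the two groups within one Client of each other in size; a real deployment might instead group Clients by subnet or location.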

Setting Up A Relay

In an embodiment, configuring a Client computer as a Relay may involve using the Console to edit settings for the Client computer to run a Relay Server on the Client. FIG. 16 shows a screen shot of an Operator Interface 1600 to a Console. After a Relay is created, a Client can automatically discover it and connect to it by seeking the Relay that is the fewest hops distant from the Client. If there is a need to manually configure Clients, one must direct each computer to point to a specific Relay, as described herein below. Manual configuration of Relay assignment can be defined by policy such that a computer or group of computers can be configured to use a specific set of manual primary, secondary, and failover Relays.

Once a Relay has been set up on a Client, in addition to functioning as a Client, the Client behaves in the same manner as a Root Server, so that other Clients can do all the interactions they would do with a Root Server through the Relay.

The use of Relays significantly reduces the Client/Server communication necessary for patch application and management. Clients may start to download from designated Relays, minimizing the load on thin connections to the Server. The Clients may also upload their status information to the Relay, which compiles it and compresses it before passing it up to the Server.

In an embodiment, Relays may help enormously to spread out and optimize network traffic, ensuring maximum responsiveness with minimum bandwidth. Relays are especially attractive with remote offices connected by relatively slow VPNs. The Server sends a single download to the remote Relay, which can then distribute it to the Clients over a faster local subnet.

Manual Relay Selection 300 (shown in FIG. 3):

By way of the Console UI 1600, for each Client or for groups of Clients:

- Start (302);
- Select a primary Relay (304);
- Select a secondary Relay (306);
- Select at least one tertiary Relay (308); and
- End (310).

Agent Autoselection Algorithm 400 (shown in FIG. 4):

- Determine if any Relay is in my subnet by pinging Relays with a TTL (time to live) of 1. If so, try to register with the Relay. The registration interaction checks to see if the Relay can communicate with the Server. If registration completes, the Agent uses the Relay as normal. If registration fails, the Agent continues its autoselection algorithm (401);
- Ping each Relay with TTL of 2. If any Relay responds, attempt registration. If successful, then done. Otherwise, continue Autoselection (402);
- Continue incrementing TTL and pinging each Relay until a max TTL value is reached. In an embodiment, Max TTL is configured by way of the Console (403);
- If no Relays are found that accept registration, try to register with “Failover Relay” (404);
- If Failover Relay is unavailable, then try to register with the Server (405);
- If Server is unavailable, Autoselection has failed and Client waits for a minimum time period and tries Autoselection again. In an embodiment, “MinRetry” is configurable by way of the Console (406);
- After “MinRetry” has elapsed, try Autoselection again. Double “MinRetry”, wait and try again, doubling “MinRetry” each time (407);
- After a maximum retry time, for example “MaxRetry”, has been reached, continue to retry Autoselection (408).
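
The steps of FIG. 4 can be sketched roughly as below. The ping and register callbacks are hypothetical stand-ins for the network operations, and all function and parameter names are illustrative. One assumption is labeled in the code: once the doubled wait reaches MaxRetry, retries continue at the MaxRetry interval, which is one reading of step 408 rather than something the text states outright.

```python
from typing import Callable, Iterable, Iterator, Optional


def autoselect_once(
    relays: Iterable[str],
    max_ttl: int,
    ping: Callable[[str, int], bool],   # hypothetical: True if relay answers at this TTL
    register: Callable[[str], bool],    # hypothetical: True if registration completes
    failover_relay: Optional[str],
    server: str,
) -> Optional[str]:
    """One pass of steps 401-405; returns the chosen parent, or None."""
    relay_list = list(relays)
    for ttl in range(1, max_ttl + 1):
        for relay in relay_list:
            # A reply at this TTL means the Relay is at most `ttl` hops away.
            if ping(relay, ttl) and register(relay):
                return relay
    # No Relay accepted registration: try the Failover Relay, then the Server.
    for fallback in (failover_relay, server):
        if fallback is not None and register(fallback):
            return fallback
    return None  # Autoselection failed; the caller waits and retries


def retry_delays(min_retry: float, max_retry: float) -> Iterator[float]:
    """Waits between passes: start at MinRetry and double each time.

    Assumption: the wait is capped at MaxRetry and retries then continue
    at that interval indefinitely.
    """
    delay = min_retry
    while True:
        yield min(delay, max_retry)
        delay = min(delay * 2, max_retry)
```

Because the TTL grows one hop at a time, the first Relay to accept registration is also among the nearest in hop count, which matches the stated goal of seeking the Relay that is the fewest hops distant.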

Failover Behavior 500 (shown in FIG. 5)

- Agent posts/gathers/registers to the Relay (501);
- If Agent has a posting issue (or if gathering or registration fails), it notes the failure time (502);
- Agent tries again to post or gather or register on the normal schedule. If there is another failure, the Agent considers the Relay to be down (503);
- At this point, the Agent enters into a failure waiting state for “ResistFailure” time period starting at the failure time (504);
- After the “ResistFailure” time expires, the Agent tries again to post to the Relay. If it fails again, it begins Autoselection (505).
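
The failover steps of FIG. 5 amount to a small state machine, sketched below. The class name, the returned step labels and the use of plain numeric timestamps are hypothetical. Resetting the state after a successful attempt is also an assumption, since the steps above describe only the failure path.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailoverTracker:
    resist_failure: float                      # the "ResistFailure" period, in seconds
    first_failure_at: Optional[float] = None   # failure time noted in step 502
    relay_down: bool = False                   # set in step 503

    def record(self, ok: bool, now: float) -> str:
        """Record one post/gather/register outcome and return the next step."""
        if ok:
            # Assumption: any success clears the failure state.
            self.first_failure_at = None
            self.relay_down = False
            return "normal"
        if self.first_failure_at is None:
            self.first_failure_at = now        # step 502: note the failure time
            return "retry_on_schedule"
        if not self.relay_down:
            self.relay_down = True             # step 503: consider the Relay down
            return "wait"                      # step 504: failure waiting state
        if now - self.first_failure_at >= self.resist_failure:
            return "autoselect"                # step 505: begin Autoselection
        return "wait"
```

Measuring the wait from the first failure time, as step 504 specifies, bounds the total time a Client stays attached to a dead Relay regardless of how long the normal schedule takes to produce the second failure.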

Relay Reselection Strategy (shown in FIG. 6)

Automatic Selection (FIG. 6A)

While Relay selection is in progress (601):

- Get a candidate host from the Relay selection algorithm (602);
- Try to register with that host. If registration succeeds, a new Relay has been selected. If registration fails, continue (603);
- Attempt Failover selection (604); and
- Attempt root Server selection (605).

Manual Selection (FIG. 6B)

- Attempt primary selection (606);
- Attempt secondary selection (607);
- Attempt tertiary selection if one or more tertiary Relays have been designated (608);
- Attempt failover selection (609); and
- Attempt root Server selection (610).

Triggers for Relay Selection

- A pre-configured validity interval for Relay selection expires;
- A Client sets itself up to perform Relay selection when it resets itself, for example, at startup when the Client detects that the Action site masthead points to a different deployment than the one in the data folder;
- If the Action site epoch changes;
- If the clock leaps backward by more than a permissible time interval, for example, five minutes;
- If the IP address table changes;
- If the last Relay selection failed and the retry interval has elapsed.

Intervals are configurable by settings;

- When Relay selection has failed and pending retries are outstanding, if the IP address table changes, it accelerates a Relay selection retry. If this fails, it goes back to the prior Relay selection retry interval;
- Client is unable to post report to its selected Relay for a configured time interval. Once the interval elapses, the Client tries to register. If registration fails, the Client tries to Relay select;
- A ‘Relay Select’ command is executed, for example by an administrator by way of the Console;
- If any of the RelayServer Automatic settings for any designated Relays are changed or deleted by a ‘setting’ or ‘setting delete’ Action command;
- If the registration interval has elapsed and the Client tries to register but registration fails.
- If the Agent on a Client is stopped and the Relay selection(s) is cleared, and the Client is restarted, the Client will begin Relay selection.

Typical Relay Functions

- Relays “Relay” information to and from the Client and another Relay or the Server;
- Relays may enable Clients to gather the latest information about new Fixlet messages, new actions, or new downloads;
- Relays may enable Clients to pass status messages to the Server including Action results, retrieved properties, and relevant Fixlet messages.
- Relays may enable Clients to register their last known IP address so they can be “pinged” later if there is new information to gather.
- Relays may enable BigFix Clients to download patches and other files.

As above, Relays are usually Clients that have been specially configured to function as a Server does, in addition to their normal functioning as a Client. Thus, like Clients, Relays themselves can be configured, as described above, to automatically seek out and connect to the nearest Relay. In effect, the connecting Relay is choosing its parent in a Relay hierarchy. Thus, in an embodiment, automatic Relay selection provides for a Relay that determines its parent Relay dynamically, so that as the state of the network changes, different hierarchies and routing paths through the network are constantly being discovered by Clients and Relays without any modification of the hardware or the network topology and without any input from an administrator. As will be described in greater detail herein below, the ability of Clients and Relays to discover routings through the network enables a multitude of use cases all based on the establishment of dedicated pathways through the network for particular purposes.

Dynamic Download

In an embodiment, Fixlet messages can download and run specified payloads whose SHA-1 checksums have been captured at the time the Fixlet is created. Thus, actions created from such Fixlets will run only the specific executable that was referred to by the source Fixlet.

Certain applications, however, may involve objects, updates for which need to be downloaded regularly. In particular, vendors of antivirus software update their antivirus definitions, occasionally as often as several times per day. There exists, however, a significant possibility of damage or attack when downloading a file without knowing exactly what it is.

While it would be possible to manually download and deploy the object, manual download would be time- and labor-intensive to most users of the Platform. What is needed is a trustworthy way to deploy the latest version of the object, for example, the latest version of an antivirus engine to Clients that request it. It would be desirable to offer providers of anti-virus and of spyware, for example, the ability to deploy a policy Action to tell Agents to periodically update the anti-virus definitions on the Client to the latest version, while taking advantage of the Relay distribution infrastructure.

Furthermore, it would be desirable to be able to configure a Client to automatically apply all critical updates in a particular site. Additionally, it would be desirable to automatically push updated sales lists to field sales laptops, or to push data files to retail locations.

In an embodiment, a Fixlet message is authored and deployed that instructs a Client to trust an arbitrary piece of content to run, delegating the responsibility for knowing that the content is safe to run to a piece of trusted logic on the Client. In order to request the arbitrary piece of content, the Client need only supply certain information about the object, for example, a unique identifier for the object such as a hash of the object. Thus, by means of the Fixlet message, any Client in the system can be configured for this interaction wherein untrusted content is downloaded to the Client. Any Client can ask the Relay to retrieve a particular file by providing the file size and the hash of the file. After the information is provided, the Relay can mirror the file through, from the Root Server, from the Internet and back down through the Relay hierarchy. In an embodiment, the Client knows in advance what it is asking for. Thus, Dynamic downloading provides the ability to use relevance clauses to specify URLs.

An embodiment makes use of the Platform's site-signing and distribution capability to flow untrusted content, such as antivirus definitions, with the ability to merge the untrusted content from other sources with the assurance to users that the particular untrusted content can be trusted. When the content flows down through the Relay infrastructure to the Client, it may be merged with an Action instructing the Client to run whatever the content tells the Client to run.

Thus, in an embodiment, an object or an item of content may need to flow down to the Clients in order to be processed. Trusted software on a Client evaluates the content and decides the URL, the SHA-1 and size of the file necessary to update the Client. Then, the URL, the SHA-1 and file size flow back up from the Clients to the Server. The Server is then able to produce the specified file, whereupon the file flows down through the Relays and is executed in the context of Clients that have been configured to automatically apply an update policy whenever the SHA-1 changed.

Thus, it could be that a single piece of content may contain the information necessary for a piece of antivirus software to update itself. In addition to that, it could also contain antivirus definitions, such that a combined Agent could say, “yes, I need these three files” or an antivirus Agent could say “I only need this one file.” They could then both derive the information necessary to specify what file to download from the same content feed, that is, the same piece of data that flowed down from the Server. The choice would then be conveyed back through the hierarchy to the Server to collect the appropriate file.

It will be apparent, that, at the time when a policy is published, at least some of the information that the policy concerns itself with may not be static. For example, in the case of a virus definition file, the information changes whenever a new version of the virus definition file is published, perhaps as often as several times per day.

In an embodiment, an Operator inspects ActionScripts and approves them for execution on the Client. ActionScripts may be static, in which case it is a fairly simple task to inspect them to see which steps will be executed on the Client. In the case of dynamic content, however, where dynamic elements change in an ActionScript, the ActionScript uses variables to refer to the dynamic content.

Additionally, the foregoing approach protects the confidentiality of customers of the Platform vendor, reassuring them that an excessive amount of control has not been surrendered to, for example, a software vendor who is producing the virus definition file.

In an embodiment, the Client is enabled to look up the dynamic information indirectly and fill it into the variables. In this way, the Operator is able to inspect the sequence of instructions as they are to be executed on the Client, allowing the Operator to better decide whether or not to trust the content and to approve the ActionScript.

One embodiment enables performance of dependency resolution, in order to install various pieces of software and to update that software. Dependency resolution is useful in the case of an arbitrary collection of software, at least some items of which depend on other software being installed. Any particular piece of software might have incompatibilities with other pieces of installed software. There may exist requirements such as if a first piece of software is updated another piece will need to be updated. It becomes a quite complicated process to resolve all those dependencies.

An embodiment of the Dynamic Download application provides data in the form of a set of packages to a process on the machine itself that is able to analyze the set of packages. The process produces a list of URLs, SHA1 checksums, and sizes that need to be downloaded for the particular machine in order for it to update to a new version of a package. That same set of information can be processed by different computers, and each may arrive at a different answer because of the software already installed on the machine.

As an example, one could author and roll out an Action to install the newest version of the [Apache] Web Server.

The Action is rolled out to a number of machines. Each machine may have thereon a data file that defines the set of URLs, SHA1 checksums and sizes for the specific versions of other packages upon which that version of the Web Server depends. From that file, the machine extracts the set of other packages that need to be applied to that machine in order to update it to the newest version of that Web Server.

Thus, in this case, the ActionScript is written such that it may use one or both of relevance substitution and some local processing on the Client, to look through a large list of URLs, SHA1 checksums, sizes and dependency information about what each one of the packages requires and is compatible with, to determine the set of downloads that need to be pulled down to this particular machine, and to execute just that set.

It will be appreciated that a common feature of the foregoing embodiments of the Dynamic Download Application is that they are based on knowledge of the context of the item or items sought. Thus, a requestor doesn't provide just an address. Instead, the requestor is asked to describe, through a SHA1 checksum, exactly what is sought, in order for a Relay to pull it by specifying, at least, the size of the file and the hash of the file. An additional common feature is the evaluation of relevance for a particular Client, because each Client may have different update requirements or download requirements.

An embodiment implements the Dynamic Download application as shown in FIG. 7. As described above, a Site is a collection of Fixlets and other content. Custom sites may contain only internally-sourced content or a combination of internally- and externally-sourced content. Additionally, an Integration 705, as shown in FIG. 7 is a site that may integrate content from a number of sources or providers. For example, an integration may contain Fixlets from one or more anti-virus software manufacturers for downloading anti-virus updates. Referring now to the drawing, a process 700 for implementing the Dynamic Download application may include at least one of the following steps:

- Integration (705) pulls data (1) from the cloud (702);
- Integration (705) modifies (2) the White-list (706) on disk;
- Integration (705) adds meta-file (3) to custom site (704) via Server API;
- Server propagates custom sites (4) to Clients (710, 711);
- Based on Action and meta file, Client 1 (710) submits request (5, 12) for files with hash “aqz24” and “bgf39” to Download request plug-in. Relay (709) has “aqz24” in cache, but does not have “bgf39”, so it initiates a download request (7) for that file and returns (4) “aqz24 available, bgf39 not yet available”;
- Client 2 (711) simultaneously submits a request (8) for file with hash “bgf39”. “bgf39” is already pending, so the Relay (709) simply returns (4) “not yet available”;
- Relay (709) submits request (7) for “bgf39” to Root Server (703). Server (703) checks submitted URL against White-list (706) and determines that the URL is acceptable. Server initiates download request (8) and returns “not yet available”.
- Server (703) fetches (9) “bgf39” from the Internet;
- Server (703) sends (4) “bgf39 available” notification to all children;
- Relay (709) receives “bgf39 available” and begins fetching (10) “bgf39” from cache of parent;
- Relay (709) sends (4) “bgf39 available” notification to all children (711, 710); and
- Both Clients download (11) “bgf39” directly from parent's cache, and if all Action requirements are now satisfied, begin running the Action.

As with static downloads, Dynamic Downloads must specify files with the confirmation of a size or SHA-1. However, the URL, size, and SHA-1 are allowed to come from a source outside of the ActionScript. This outside source may be a manifest containing a changing list of new downloads. This technique makes it easy to access files that change quickly or on a schedule, such as antivirus or security monitors.

This flexibility entails extra scrutiny. Because any Client can use Dynamic Downloading to request a file, it creates an opportunity for people to use the Server to host files indiscriminately. To prevent this, in an embodiment, Dynamic Downloading uses a White-list. Any request to download from a URL (that is not explicitly authorized by use of a literal URL in the ActionScript) must meet one of the criteria specified in a White-list of URLs on the Server. In an embodiment, the White-list may contain one or more regular expressions in, for example, a Perl regex format, separated by newlines, such as shown in Table 1, below:

TABLE 1

http://.*\.site-a\.com/.*
http://software\.site-b\.com/.*
http://download\.site-c\.com/patches/JustThisOneFile\.qfx

The first line is the least restrictive, allowing any file at the entire site-a domain to be downloaded. The second line requires a specific domain host, while the third expression is most restrictive, limiting the URL to a single file named “JustThisOneFile.qfx”. The foregoing description of the White-list is illustrative only and is not intended to be limiting. If a requested URL fails to match an entry in the White-list, the download immediately fails, with status NotAvailable. A note may be made in a Relay log of the URL that failed to pass. In an embodiment, an empty or non-existent White-list causes all URL downloads to fail. On the other hand, a White-list entry of “.*” (dot star) allows any URL to be downloaded. Other methods of composing and formatting a White-list are consistent with the spirit and scope of the subject matter described in the attached Claims.
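
The White-list check may be sketched as follows. This sketch uses Python's `re` module rather than a Perl regex engine, which accepts the three expressions of Table 1 unchanged. The function names are hypothetical, and requiring the expression to match the whole URL (rather than any substring) is an assumption.

```python
import re
from typing import List


def load_whitelist(text: str) -> List[re.Pattern]:
    """Compile one regular expression per non-empty line."""
    return [re.compile(line.strip()) for line in text.splitlines() if line.strip()]


def url_allowed(url: str, whitelist: List[re.Pattern]) -> bool:
    """An empty White-list rejects everything; otherwise any full match admits the URL."""
    return any(p.fullmatch(url) for p in whitelist)


TABLE_1 = r"""
http://.*\.site-a\.com/.*
http://software\.site-b\.com/.*
http://download\.site-c\.com/patches/JustThisOneFile\.qfx
"""
```

With these definitions, `url_allowed(url, load_whitelist(""))` is false for every URL, and a White-list consisting of the single entry `.*` admits every URL.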

While the foregoing embodiments describe Dynamic Downloads either from the Server or from a Relay, an embodiment permits Relays to download directly from the Internet. In such a case, a file that the Root Server has already told the Relay is available can be downloaded directly by the Relay.

In an embodiment, status reporting for Dynamic Downloads is integrated with reporting for static downloads, being displayed side-by-side. In an embodiment, reporting on any given Action is limited to a configurable number of Dynamic Downloads, for example, the twenty most recent, in order to avoid overwhelming an Action document and the connection between Server and Console.

As described above, the primary key of a download request is the hash and the file size. Thus, in a case of different download requests for the same hash/file size, with each request naming a different URL, the second URL is ignored. Alternatively, if the first URL fails, a request for the second URL may succeed by changing the URL of the file recorded on the system.
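
The keying rule may be illustrated with a small table indexed by hash and file size. The class and method names are hypothetical, and normalizing the hash to lower case is an assumption.

```python
from typing import Dict, Set, Tuple

FileKey = Tuple[str, int]   # (SHA-1 as hex, file size in bytes)


class RequestTable:
    def __init__(self) -> None:
        self.url_for: Dict[FileKey, str] = {}
        self.failed: Set[FileKey] = set()

    def submit(self, sha1: str, size: int, url: str) -> str:
        """Return the URL that will actually be fetched for this key."""
        key = (sha1.lower(), size)
        if key not in self.url_for or key in self.failed:
            self.url_for[key] = url      # first request, or replacement after a failure
            self.failed.discard(key)
        return self.url_for[key]         # otherwise the later URL is ignored

    def mark_failed(self, sha1: str, size: int) -> None:
        self.failed.add((sha1.lower(), size))
```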

In the event that a request fails, the Client may re-try the download by resubmitting the request.

In an embodiment, failures may not be propagated down to the network. Instead, Console status reporting is operative to alert the Console Operator of the failure, so that it can issue a notification to the Client to discontinue sending a request that has failed a number of times. In an embodiment, Clients are discouraged from making frequent retry requests by configuring a long delay interval between retries.

“DownloadRequest” Serialization

In an embodiment, DownloadRequests may have a serialization format as shown below in Table 2:

TABLE 2 <response format version number> aid=<Action id or “null”>, hash=<hash as hex or “null”>, status=<“Available” or . . .>

“DownloadResponse” Serialization

In an embodiment, DownloadResponses may have a serialization format as shown below in Table 3:

TABLE 3 <response format version number> Aid=<Action id or “null”>, index=<download index or “null”>, hash=<hash as hex or “null”>, status=<“Available” or . . .>

Requesting Downloads

In an embodiment, Clients and Relays may request a download from their parents by providing, for example:

- SHA-1 of the file;
- File size; and
- URL of the file.

In an embodiment, the file size and the URL are not technically necessary. However, the file size reinforces the SHA-1 mechanism and the URL allows the Server to fetch the file directly from the Internet without having to check a local index.

The file size/SHA-1 uniquely identifies a download request. If the Server has a matching entry in its cache, the provided URL does not need to be used. As above, the URL, in fact may not even match the original URL used to request the file.
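
By way of example, a recipient might confirm that a delivered file is the one requested as sketched below. The function name is hypothetical. `hashlib.sha1` is the Python standard-library SHA-1 implementation.

```python
import hashlib


def matches_request(data: bytes, expected_sha1: str, expected_size: int) -> bool:
    """Check the cheap size test first, then the SHA-1 digest."""
    if len(data) != expected_size:
        return False
    return hashlib.sha1(data).hexdigest() == expected_sha1.lower()
```

Checking the size first lets a mismatched file be rejected without hashing it, which is one way in which the file size reinforces the SHA-1 mechanism.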

In an embodiment, Clients are provided with the ability to request an arbitrary URL.

Dynamic Download Cache Model

In an embodiment, a record of file downloads and progress is stored in a table that uses FileID as the primary key. In an embodiment, the URL, the file location and the status are stored as values.

FIGS. 8 and 9 show state transition diagrams for Relay (800) and Server (900), respectively. When a Client issues a download request, the request goes to the Client's Relay, which then checks the cache for the file.

- If the file exists in the cache (that is, if the state of the FileID in question is AVAILABLE), the Client is then instructed to download the file from the FileID's file location;
- If the Relay does not have the file:
  - The Relay creates an entry for the file in the table, with the state NEW
  - The Relay then proceeds to make the request to its parent about the file and changes the state to REQUEST_SUBMITTED;
  - The Relay informs the Client that the file is not yet available;
  - The Relay passes on the download request to its ancestor. When bytes of the file start arriving at the local Relay, it changes the state to DOWNLOADING.
    When the file is finally on the leaf Relay, the state is then changed to AVAILABLE.
- The download request may pass through the White-list screening at the Server level.

The failure state:

- can be reached from the REQUEST_SUBMITTED state for reasons such as the link being down, and so on;
- can be reached from the DOWNLOADING state for reasons such as the connection dropping;
- means ‘nothing is happening’.
  - In an embodiment, a timeout is configured and the FAILURE state reverts to a NOT STARTED state for that file request. Clients then may retry the request normally.
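
The transitions named in the text may be captured in a small table, as sketched below. Only the transitions described above are included, so FIGS. 8 and 9 may show others. The names `State`, `ALLOWED` and `advance` are hypothetical, and treating a retry as a move from NOT STARTED to NEW is an assumption.

```python
from enum import Enum


class State(Enum):
    NOT_STARTED = "NOT STARTED"
    NEW = "NEW"
    REQUEST_SUBMITTED = "REQUEST_SUBMITTED"
    DOWNLOADING = "DOWNLOADING"
    AVAILABLE = "AVAILABLE"
    FAILURE = "FAILURE"


ALLOWED = {
    State.NOT_STARTED: {State.NEW},          # assumption: a retry starts over as a new entry
    State.NEW: {State.REQUEST_SUBMITTED},
    State.REQUEST_SUBMITTED: {State.DOWNLOADING, State.FAILURE},
    State.DOWNLOADING: {State.AVAILABLE, State.FAILURE},
    State.FAILURE: {State.NOT_STARTED},      # after the configured timeout
    State.AVAILABLE: set(),
}


def advance(current: State, target: State) -> State:
    """Return the target state, or raise if the text describes no such transition."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```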

In an embodiment, the cache is implemented using SQLITE. Other embodiments may employ other database engines that support in-memory databases and triggers.
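
A cache table of this kind might look as follows when built with Python's standard `sqlite3` module and an in-memory database. The table name, column names, helper functions and the `"<hash>:<size>"` form of the FileID are all assumptions made for illustration.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE downloads (
        file_id  TEXT PRIMARY KEY,   -- assumption: "<hash>:<size>" serves as the FileID
        url      TEXT NOT NULL,
        location TEXT,
        status   TEXT NOT NULL
    )
    """
)


def record_request(file_id: str, url: str) -> str:
    """Insert a NEW row unless the FileID is already tracked; return its status."""
    conn.execute(
        "INSERT OR IGNORE INTO downloads (file_id, url, status) VALUES (?, ?, 'NEW')",
        (file_id, url),
    )
    row = conn.execute(
        "SELECT status FROM downloads WHERE file_id = ?", (file_id,)
    ).fetchone()
    return row[0]


def set_status(file_id: str, status: str, location: Optional[str] = None) -> None:
    """Update the state and, when given, the on-disk location of a file."""
    conn.execute(
        "UPDATE downloads SET status = ?, location = COALESCE(?, location) "
        "WHERE file_id = ?",
        (status, location, file_id),
    )
```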

Sending Download Notifications

As above, a download triple consists of SHA-1, filesize and URL. The URL describes the location of the file and the SHA-1 and filesize function to verify the file. In an embodiment, a Client may send a download notification that includes a list of download triples. The Relay evaluates the triples and signals the Client when to start the download. This may be either immediately, if the file is present on the Client's Relay parent or after the download to the Relay is complete.

State Serialization

Given two Clients, C1 and C2 and one Relay, R, it may occur that C1 and C2 request the same file. When the download request comes into Relay R, and is processed, a lock may be held so that only one download request is processed at a time.

Example:

- C1 requests a file from R;
- C2 requests the same file;
- R grabs lock, processes C1's request first:
  - if the file is AVAILABLE, R notifies C1 that it is and begins download;
  - if not, R makes an entry, marks the file IN_TRANSIT, and passes the download request up to R's parent;
- R releases lock
- R grabs the lock to process C2's request;
- R sees that C2 is requesting the same file as C1 and checks the cache to see if it is AVAILABLE. If C1's request has been filled, the file is already there. If the file is still IN_TRANSIT based on C1's request, R notifies C1 and C2 when the file is available.
- R releases lock.

In this way, a request lock avoids multiple downloads being passed up the hierarchy for the same file.
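
The request lock may be illustrated with a short sketch using Python's `threading.Lock`. The class `RelayCache` and the callable `request_from_parent` are hypothetical stand-ins for the Relay and its upstream request.

```python
import threading
from typing import Callable, Dict


class RelayCache:
    def __init__(self, request_from_parent: Callable[[str], None]) -> None:
        self._lock = threading.Lock()
        self._state: Dict[str, str] = {}
        self._request_from_parent = request_from_parent   # hypothetical upstream call

    def handle_request(self, file_id: str) -> str:
        """Process one Client request while holding the request lock."""
        with self._lock:
            state = self._state.get(file_id)
            if state is None:
                self._state[file_id] = state = "IN_TRANSIT"
                self._request_from_parent(file_id)   # only the first requester triggers this
            return state

    def mark_available(self, file_id: str) -> None:
        with self._lock:
            self._state[file_id] = "AVAILABLE"
```

Because the cache lookup and the upstream request happen under the same lock, any number of concurrent requests for one file result in a single request to the parent.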

Download Status Reporting

In an embodiment, failures are not propagated to children. Thus, Clients do not need to be responsible for a retry, eliminating the necessity for a Client that switches to another Relay to check an additional state for a file. Instead, the Client can just do a re-try after a timeout. Such a practice also aids in Relay failure; thus, if a Relay state is lost, the default is that the Client eventually requests a re-try.

In order to keep the Relay cache synchronized with the actual files located on the Relay, on a Relay reboot, all states mapping to a file download request are removed. Thus, the cache can rebuild itself by checking what files are actually on the Relay. Typically, the Relay mailbox contains responses and requests that map to files in the cache with the states NEW and REQUEST_SUBMITTED, respectively. The cache may either remove partially downloaded files or make a list of them and add them as files in the cache with state DOWNLOADING.

Distributed Server Architecture (DSA)

An embodiment incorporates a Distributed Server Architecture. In an embodiment, Distributed Servers do not download from each other because all Servers are assumed to have the same level of network connectivity. Additionally, there is no replication of the Servers' download caches. In an embodiment, download White-lists are not replicated. Thus, they may be manually configured on each Server.

Additionally, Download Requests may succeed and fail completely independently on different Servers. Because all of the necessary logic is stored on the Clients and in the White-list, exchange of information between Servers is rendered unnecessary.

Client Implementation

As described above, the Dynamic Download feature can render the limitation that URLs and SHA-1s be known at Action creation time unnecessary. With Dynamic Downloads it is sufficient that URLs and SHA-1s be computable by the Clients prior to Action execution. Client processing may be impacted in at least the following ways:

- Security: The Platform is capable of making changes to all machines in a deployment in a very short period of time. With the new ability for Clients to request arbitrary downloads, it is up to the ActionScript author to protect users of his actions and to ensure that the downloads and their SHA-1's have not been compromised. An end-to-end authentication mechanism, as described herein below, which is resistant to man-in-the-middle attacks, is an effective defense. In an embodiment, authoring a Dynamic Download ActionScript includes crafting the Action such that it authenticates information before using it, explicitly identifying those steps in the ActionScript that perform the authentication so that users of the Action can audit the mechanism before deciding to trust it.
  - To facilitate authentication and allow custom logic to be used to compute download URLs before the Action becomes active, an embodiment includes the ability to execute short-lived applications to perform these functions.
  - An embodiment includes a trusted software component to perform the authentication as an integrated part of the update process. An embodiment includes the ability for an ActionScript author to specifically call out the reliance on the trusted software component, in a comment, for example.

Download Requests

When processing an ActionScript containing the begin pre-fetch block/end pre-fetch block commands, as shown herein below, a Client can identify files to be downloaded to a Relay by providing the URL/checksum of each file. In an embodiment, multiple requests are consolidated by a Relay into single requests to a parent Relay. Ultimately the requests arrive at the Root Server. The Root Server then verifies the URLs through the White-list, and provides the file, either from its cache or by attempting to download the file. If the URL produces a file with the appropriate SHA-1, the Relays are then notified of the availability of the files, and they pull them down if they have descendants that have requested the file. Agents are notified of the availability of these files via a Notification message, and they pull them down if they are interested.

If a URL/SHA-1 is not available, Agents continue to request it, until (1) the Action that drove the request is stopped or (2) the URL/SHA-1 becomes available, or (3) the request has been made a number of times.

Action Processing Logic

In an embodiment, the Action language provides an explicit pre-fetch block of ActionScript to be used to identify pre-fetch downloads. Actions triggering the dynamic download feature may be authored with the pre-fetch block, thus making it easier to identify pre-fetch Action activity.

Action Language

The following Action language commands identify the boundaries of the pre-fetch block:

TABLE 4

begin pre-fetch block
end pre-fetch block

A number of commands are allowed within the pre-fetch block:

TABLE 5

// comment lines and blank lines
if/elseif/else/endif - properly nested within the pre-fetch block.
parameter
Action parameter query - treated as a comment by the Client

Commands allowed within the pre-fetch block that are not allowed outside it:

TABLE 6

add nohash pre-fetch item [name=<n>] [size=<s>] url=<url>
add pre-fetch item [name=<n>] sha1=<sha1> size=<size> url=<url> [; ...]
add pre-fetch item {[name=<n>] sha1=<sha1> size=<size> url=<url> [; ...]}
collect pre-fetch items
execute pre-fetch plug-in

When processing actions with pre-fetch blocks, certain commands should not be used, such as:

TABLE 7

download as
pre-fetch
download (other than download now, which must appear outside the pre-fetch block)

Command Placement

In addition to the above, when processing actions with pre-fetch blocks, downloading that is permitted during Action execution may be triggered by a ‘download now’ command. In an embodiment, pre-fetching specifications may be placed at the top of the ActionScript, thus making it easier for readers to understand which files are being collected.

Syntax Error Messages

For example:
“Only a single begin pre-fetch block is allowed”;
“Only comments and blank lines are allowed before pre-fetch block”;
“End pre-fetch block found before begin pre-fetch block”;
“Command invalid inside pre-fetch block”;
“Command invalid outside pre-fetch block”;
“Relevance substitution missing trailing ‘}’”;
“Relevance substitution is not allowed”;
“Missing required argument url=”;
“Missing required argument size=”;
“Missing required argument sha1=”;
“Argument not allowed sha1=”; and
“Argument is not recognized”.
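
The three structural messages above may, for example, be produced by a check such as the following. This is a sketch under stated assumptions: the function name is hypothetical, commands are compared as exact lower-case strings, and only the block-marker rules are checked, not the individual command arguments.

```python
from typing import List


def check_block_structure(script: str) -> List[str]:
    """Return structural error messages for the pre-fetch block markers."""
    errors: List[str] = []
    seen_begin = seen_command = False
    for raw in script.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue                       # comments and blank lines are always fine
        if line == "begin pre-fetch block":
            if seen_begin:
                errors.append("Only a single begin pre-fetch block is allowed")
            elif seen_command:
                errors.append(
                    "Only comments and blank lines are allowed before pre-fetch block"
                )
            seen_begin = True
        elif line == "end pre-fetch block" and not seen_begin:
            errors.append("End pre-fetch block found before begin pre-fetch block")
        seen_command = True
    return errors
```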

Command Processing Notes

For example:

TABLE 8 begin pre-fetch block

Presence identifies new style Action;
One allowed per Action;
Comments and blank lines may precede this command; and
Paired with a matching ‘end pre-fetch block’ command.

TABLE 9 end pre-fetch block

Paired with a ‘begin pre-fetch block’ command

TABLE 10 if/elseif/else/endif

Only commands inside true condition pathways are performed.

TABLE 11 add nohash pre-fetch item [name=<n>] [size=<s>] url=<u>

- ‘name=’ is optional;
  - when specified, <n> is limited to 32 characters, each of which is alphanumeric, ‘-’, ‘_’ or a non-leading ‘.’;
  - when not specified, name is taken from last component of URL (after last ‘/’);
- ‘size=’ is optional. When specified, progress information can be more meaningful;
- ‘url=’ is required;
- ‘sha1=’ is NOT allowed;
- ‘keyword=<v>’ can be in any order, unrecognized keywords are a syntax error;
- Clients and Relays collect these files by ActionID/ordinal number;
- Relevance substitution not allowed;
- Not plural: only a single download can be specified;
- Server caches download at Action creation time;
- Relays collect all these if Client requests any ordinal files; and
- Client will download if command is inside a true condition block.

add pre-fetch item [name=<n>] sha1=<h> size=<s> url=<u> [; ...]

- ‘name=’ is optional (same handling as in ‘add nohash pre-fetch item’ above);
- ‘sha1=’, ‘size=’, and ‘url=’ are required;
- ‘keyword=<v>’ can be in any order and unrecognized keywords are ignored;
- Clients and Relays collect files by URL/SHA-1;
- Relevance substitution is allowed;
  - When used, files are NOT cached on Server at Action creation time;
  - When used WITHOUT substitution, files are cached on Server at Action creation time;
- Plural: 0 or more pre-fetch items can be specified, each separated by a ‘;’;
- Relays only collect files that Clients request;
- Clients will only request if inside a true condition block;
- When download items are in a file, one download item per line, use {concatenation “;” of lines of file <your file>}; and
- In cases where a file in a Fixlet site holds the download information, this command can specify the file(s) to download.
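
The argument rules for ‘add pre-fetch item’ may be sketched as a small parser that is applied after any relevance substitution has been performed. The function name and the returned dictionary layout are hypothetical. The sketch assumes that URLs contain neither whitespace nor ‘;’.

```python
import re
from typing import Dict, List, Union

NAME_RE = re.compile(r"[A-Za-z0-9_-][A-Za-z0-9_.-]{0,31}")   # 32 chars max, no leading '.'


def parse_prefetch_items(args: str) -> List[Dict[str, Union[str, int]]]:
    """Parse the arguments of 'add pre-fetch item' into one dict per item."""
    items: List[Dict[str, Union[str, int]]] = []
    for chunk in args.split(";"):
        if not chunk.strip():
            continue                                  # zero items is legal
        fields = dict(tok.split("=", 1) for tok in chunk.split() if "=" in tok)
        for required in ("sha1", "size", "url"):
            if required not in fields:
                raise ValueError(f"Missing required argument {required}=")
        name = fields.get("name")
        if name is None:
            name = fields["url"].rsplit("/", 1)[-1]   # last component of the URL
        elif not NAME_RE.fullmatch(name):
            raise ValueError(f"invalid name {name!r}")
        # Keywords other than name/sha1/size/url are silently ignored.
        items.append({"name": name, "sha1": fields["sha1"],
                      "size": int(fields["size"]), "url": fields["url"]})
    return items
```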

TABLE 12 execute pre-fetch plug-in “full path to executable to launch” <rest of line>

- This command requires the first argument to be the full path to plug-in that should return very quickly;
- Relevance substitution can be specified on this command;
- The remainder of the command line is passed as arguments to the executable;
- If the command takes longer than 2 seconds to execute, the Client will log a message;
- The main thread of the Agent will block for up to 60 seconds while it waits for the command to complete. The only thing that will interrupt this waiting is a shutdown service request;
- If the command takes longer than 60 seconds to execute, the Client will log a message and disable the ‘execute pre-fetch plug-in’ command;
- When disabled, all actions that use this command will not execute until after the Client is restarted;
- This command can be used to authenticate content;
- This command can be used to execute custom logic that can leave behind an artifact for subsequent ‘add pre-fetch items’ commands;
- In the trend integration, this command is used to execute a program that processes a Server_bf.ini file, and produces a file containing the set of URLs to be downloaded;
- The exit code of the execute pre-fetch plug-in application is important as it informs the Client of failure or success:
  - 0=success; and
  - all other exit codes are treated as failures and result in a failed Action attempt. For debugging purposes, the exit code is logged to the Client log.
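The timing and exit-code rules above can be sketched as follows in Python. This is an illustrative approximation only: the function name `run_prefetch_plugin` and its `log` callback are assumptions, and the sketch does not model interruption by a shutdown service request.

```python
import subprocess
import sys
import time

SLOW_WARNING_SECONDS = 2   # log a message beyond this duration
HARD_LIMIT_SECONDS = 60    # disable the command beyond this duration


def run_prefetch_plugin(argv, log, slow=SLOW_WARNING_SECONDS,
                        limit=HARD_LIMIT_SECONDS):
    """Run one plug-in; return (succeeded, command_disabled)."""
    started = time.monotonic()
    try:
        # Blocks the caller for at most `limit` seconds.
        result = subprocess.run(argv, timeout=limit)
    except subprocess.TimeoutExpired:
        log(f"plug-in exceeded {limit}s; disabling 'execute pre-fetch plug-in'")
        return False, True
    elapsed = time.monotonic() - started
    if elapsed > slow:
        log(f"plug-in took {elapsed:.1f}s")
    # The exit code is always logged for debugging; only 0 means success.
    log(f"plug-in exit code {result.returncode}")
    return result.returncode == 0, False


# Usage: the Python interpreter stands in for a plug-in that exits with code 0.
run_prefetch_plugin([sys.executable, "-c", "raise SystemExit(0)"], print)
```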

TABLE 13 collect pre-fetch items

- Client interacts with the Relay to request the set of files specified thus far in the pre-fetch definition block;
- Use this command when a downloaded file is needed in order to compute what additional files should be downloaded;
- Subsequent lines in the ActionScript will not be executed until all files in the pre-fetch list are collected and given the names specified;
- Each instance of ‘collect pre-fetch items’ serves as a synchronization point to make the Client get all the files specified so far;
- Any files not yet on the Client are requested from its parent, and the Action waits until those files are available;
- When they are all available and have been downloaded, the Client re-processes the pre-fetch block from the beginning to refresh the set of files it needs;
- Any files needed by pre-fetch logic are available after the ‘collect pre-fetch items’ command and can be referenced in their pre-fetch location using the download inspectors identified below; and
- When the Client processes the ‘end pre-fetch block’ command, it collects all files in the pre-fetch items list before starting the Action.

Client Download Request Mechanics

When a Client builds a download list, if there are ActionID/ordinal downloads but no URL/SHA-1 downloads, the Client uses the request mechanism without URL/SHA-1. If there are any URL/SHA-1 downloads present, it uses the URL/SHA-1-based request mechanism, which allows for ActionID/ordinal requests and URL/SHA-1 requests to be co-mingled. The Client verifies the signature of the Action before it does any download pre-fetching calculations from the ActionScript. If a Relay or Server does not support the URL/SHA-1-based request mechanism, the Client blocks the Action from executing.
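The selection rule just described may be sketched in Python as follows. The enum and function names are hypothetical and serve only to make the decision explicit.

```python
from enum import Enum


class RequestMechanism(Enum):
    NONE = "no downloads"
    ACTION_ORDINAL = "ActionID/ordinal"
    URL_SHA1 = "URL/SHA-1"


def choose_mechanism(has_ordinal: bool, has_url_sha1: bool) -> RequestMechanism:
    """Any URL/SHA-1 download forces the URL/SHA-1-based mechanism,
    which can also carry co-mingled ActionID/ordinal requests."""
    if has_url_sha1:
        return RequestMechanism.URL_SHA1
    if has_ordinal:
        return RequestMechanism.ACTION_ORDINAL
    return RequestMechanism.NONE


def may_execute(mechanism: RequestMechanism, parent_supports_url_sha1: bool) -> bool:
    """The Action is blocked when the parent Relay or Server lacks URL/SHA-1 support."""
    return mechanism is not RequestMechanism.URL_SHA1 or parent_supports_url_sha1
```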

Inspectors

Several inspectors allow an ActionScript to be written in a consistent manner that refers to files in the pre-fetch folder when an Action is not active, and to files in the download folder when the Action is active. In an embodiment, Pre-fetch files are collected to a per-Action-pre-fetch-folder until the Action is ready to run. They exist in the per-Action-pre-fetch-folder with various names that indicate the progress of the pre-fetch activities. At various stages in processing these files may be renamed to the names specified in the pre-fetch commands. The named versions of the files when the Action is inactive after every ‘collect pre-fetch items’ may be placed into a ‘named’ folder. Before an Action is run, the pre-fetch files are moved from the ‘named’ folder to a ‘Download’ folder of the Action site. When the Action completes, any files remaining in the ‘Download’ folder are moved into the download cache or utility cache and renamed to their SHA-1.

One or more of the following inspectors can be used to locate files during the pre-fetch processing or while the Action is running:

- download folder
  - When the Action is active, this inspector returns a folder object of the location of the ‘Download’ folder;
  - When the Action is not active, this inspector returns a folder object of the location of the named per-action-prefetch-folder;
- download path “myfile”
  - This inspector returns a string containing the full path to the named file; the file need not exist.
- download file “name”
  - This inspector returns a file object of the specified name in the named folder or the download folder.
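The behavior of the three inspectors can be approximated by the following Python sketch. The class name, constructor arguments, and the choice to raise `FileNotFoundError` for a missing file are assumptions made for illustration; the source does not specify them.

```python
from pathlib import Path


class DownloadInspectors:
    def __init__(self, download_dir, named_prefetch_dir, action_active):
        self.download_dir = Path(download_dir)
        self.named_prefetch_dir = Path(named_prefetch_dir)
        self.action_active = action_active

    def download_folder(self) -> Path:
        """'Download' folder while the Action is active, else the named pre-fetch folder."""
        return self.download_dir if self.action_active else self.named_prefetch_dir

    def download_path(self, name: str) -> str:
        """Full path as a string; the file need not exist."""
        return str(self.download_folder() / name)

    def download_file(self, name: str) -> Path:
        """File object for an existing file in the current folder."""
        path = self.download_folder() / name
        if not path.is_file():
            raise FileNotFoundError(path)
        return path
```

Because both states resolve through `download_folder`, an ActionScript can use the same expressions before and after the Action becomes active.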

Client Behavior

Temporal Distribution with Downloads

In an embodiment, the Client asks for a ‘0’ file. Once the ‘0’ file is available, Clients calculate their time to start, causing the Relays to collect the file as soon as the first Client requests it, so that all of the Clients are not downloading at the same time.

In dynamic download situations, a set of pre-fetch files identified by a first ‘collect pre-fetch items’ statement is requested. If no ‘collect pre-fetch items’ statement is used, the full set is requested. When they become available, the Clients calculate their time to start. Once that time to run is reached, the Client sees if there are more files it needs; if so it requests them, then it runs. It will not pick a different time to run. The effect of this is that the Clients that choose an early distribution time trigger any additional files to be downloaded. Thus, the later Clients do not have to wait for them.

Client Requests Files when All Files Already Available in Cache

In an embodiment, Clients go to their caches before they ask the Relay if the files are available.

Name Collisions

In an embodiment, Clients run the Action with the last file with that name in place, regardless of how many other downloads have the same name.

Sample On-Demand Update Action

This example assumes a version comparison is used to detect that a change (upgrade or rollback) is necessary. Other techniques might use Dates, or compute SHA-1's of saved versions of a server configuration file to trigger the update. This is formatted in a fashion that assumes the wizard constructing it has access to key pieces of information required to generate the Action.

TABLE 14

Subject: Update Trend AV pattern files to version <Server_bf.ini.PatternVersion>
Date: <Server_bf.ini.ReleaseDate>
x-relevant-when: name of operating system starts with “Win”
x-relevant-when: exists service “TMAUClient.exe” and version of service “TMAUClient.exe” >= 2
x-relevant-when: version of Client >= “7.1.5”
x-relevant-when: setting “TMAVAUEnabled” of site = “0”
x-relevant-when: <Server_bf.ini.PatternVersion> is greater than <VersionInstalledExpression>

// ActionScript to update to pattern files to version <Server_bf.ini.PatternVersion>
begin pre-fetch block
// pre-fetch the Server_bf.ini
add pre-fetch item name=ini sha1=<Server_bf.ini.Sha1> size=<Server_bf.ini.Size> url=<Server_bf.ini.URL>
// pre-fetch the trend component that produces the download list and updates the pattern files
add pre-fetch item name=tmdl.exe sha1=123 size=12 url=http://trend/downloads/tmav_get_dl_list.exe
// collect above pre-fetch files (needed to compute the url list)
collect pre-fetch items
// execute trend component: given ini data file, it produces a file of pre-fetch items.
execute pre-fetch plug-in “{download path “tmdl.exe”}” /downloads “{download path “ini”}” “{download path
// urllist assumed to be formatted as lines, each containing name=<n> sha1=<h> size=<s> url=<u>
add pre-fetch item {concatenation ″ ; ″ of lines of download file “urllist”}
end pre-fetch block
// Action is now active, update the pattern files now
waithidden “{download path “tmdl.exe”}” /update “{download path “ini”}” “{location of download folder”}”

Sample Auto-Update Action

This example assumes a version comparison can be used to detect that the update is necessary. This arrives as a Fixlet. The values are substituted from a server configuration file when the Fixlet is authored by an on-demand wizard. In this situation, Server_bf.ini.PatternVersion, for example, is read from the Server initialization file when the wizard is used to create an on-demand update Fixlet. To build this expression, the name of the custom site must be known. The Client may be configured to know where the auto-update Server_bf.ini and Server_bf.ini come from.

TABLE 15

Subject: Update Trend AV pattern files to newest version
x-relevant-when: name of operating system starts with “Win”
x-relevant-when: exists service “TMAUClient.exe” and version of service “TMAUClient.exe” >= 2
x-relevant-when: version of Client >= “7.1.5”
x-relevant-when: value of setting “TMAVAUEnabled” of site = “1”
x-relevant-when: <Server_bf.ini.PatternVersion> is greater than <VersionInstalledExpression>

// ActionScript to update automatically to whatever ini file in custom site specifies
begin pre-fetch block
parameter “ini”={pathname of file “Server_bf.ini” of Client folder of site (value of setting “TMAVCustomSite”)
// pre-fetch the trend component that provides the download list
add pre-fetch item name=tmdl.exe sha1=123 size=12 url=http://trend/downloads/tmav_get_dl_list.exe
// collect above pre-fetch files (needed to compute the url list)
collect pre-fetch items
// execute trend component that, given the ini data file, produces a list of pre-fetch items
execute pre-fetch plug-in ″{download path ″tmdl.exe″}″ /downloads “{parameter “ini”}” “{download path “urllist”}”
// urllist assumed to be formatted as lines, each containing name=<n> sha1=<h> size=<s> url=<url>
add pre-fetch item {concatenation ″ ; ″ of lines of download file “urllist”}
end pre-fetch block
// Action is now active, update the pattern files now
waithidden “{download path “tmdl.exe”}” /update “{parameter “ini”}” “{location of download folder}”

Client Credential Security Model

In an embodiment, the Platform provides a security model having at least the following capabilities:

- Clients can trust content received from the Server. All commands and questions that Clients receive are signed by a key that can ultimately be verified against a public key that is distributed to all Clients at install time; and
- Clients can submit reports to the Server without risk of snooping. The Client can choose to encrypt the reports it sends up to the Server, so that no attacker can see what the report contains.

In the foregoing approach, Clients are assigned unique identifiers when they register. Any entity, such as a machine or network, that requests a registration interaction with the Server is issued a unique identifier and is trusted. Many of the properties associated with a particular Client that can be viewed by an operator by way of the UI to the Console are aligned with that Client based on the identifier that was handed out at the time of registration. Accordingly, the foregoing approach provides strong authentication of the Server and the Administrators by the endpoints (Clients). That is, whenever a Client receives a command from an Administrator, the Client knows exactly who issued it by virtue of the strong cryptographic mechanisms. Additionally, the channel can be encrypted through strong cryptographic mechanisms. However, information flowing in the opposite direction, from endpoints (Clients) into the system, is not authenticated because there previously has not existed a reliable way to authenticate the endpoints. Not being able to reliably authenticate an endpoint may provide an opportunity for such attacks as spoofing, in which a person or program successfully masquerades as another by falsifying data and thereby gains some illegitimate advantage.

There exist, for example, simple techniques that attackers use to spoof information, such that the Console would display the spoofed information as if it were genuine—as if it was coming from the particular Client associated to a particular Client identifier. A Client authentication mechanism, in which a cryptographic credential is established on each Client (endpoint), provides a much stronger, more robust security model that greatly minimizes the risk of spoofing attacks.

In an embodiment, the Client Authentication mechanism extends the previous security model to include a mirror image of the above-mentioned capabilities:

- Clients sign every report submitted to the Server, which is able to verify that the report does not come from an attacker; and
- Servers can send data to Clients without risk of snooping. The Server can encrypt data that it sends to a Client so that no attacker can see what data is being sent to the Client.
  While such a model is well-suited to a use case in which Clients send reports to the Server, it is also applicable to various use cases in which Clients authenticate each other in a similar way.

The foregoing embodiments of the security model present complementary challenges:

- The first approach involves generation of a single private/public key pair and distribution of many copies of the public key. Additionally, at install time, the installer naturally has the right to tell a Client to trust a Server because the installer has control over the Client; and
- The Client Authentication mechanism involves generation of many private/public key pairs and wide distribution of each of the many public keys. Additionally, there exists no immediate way to prove that an installer has the right to tell the Server to trust the Client, because the installer may be unknown. For example, the installer may be an attacker installing a new Client on his/her own machine, pretending to be some other resource.

A solution to the above challenges allows anyone to enter the system and generate a new identity and builds trust from that starting point, unlike conventional security systems, which specifically require that a new resource be explicitly joined to the system by an Administrator. Referring now to FIG. 10, at Initial Registration, a Client produces a public/private key pair. The Server then grants a unique Computer ID, which the Server associates to the public key. Thus, after registration, the Computer ID and the public key are associated to the particular unique Client.

Assuming that the private key created on the Client is not distributed to any other devices, the public key held by the Server can authenticate content coming from that Client, making it possible to verify any messages sent from the Client.

Overview

In an embodiment, a cryptographic toolkit, such as OPENSSL, is employed to create public/private key pairs for each new Client in a deployment. When a Client initially registers, it submits a public key with a request that the key be associated to a new computer ID. The response to the Client request, in turn, is signed with a key that can be authenticated by the Client. Thus, the Client cannot be deceived into thinking that it has registered directly with a Root Server when it has, in fact, registered through a malicious middleman who has switched the public key submitted to the Root. The Root Server stores the Client's public key in a map of computer IDs to public keys. The key remains associated with the ID for the life of the ID.

On subsequent interactions, reports or file uploads, for example, the Client signs the interaction with its private key. When the Root Server receives a report, before updating the data for the computer ID provided, it verifies that the report is signed by a key that matches the public key on file for that ID.

To send secure data to a Client, the Root Server exposes APIs, for example, by way of the database or SOAP (simple object access protocol), that allow lookup of public keys given a computer ID. In an embodiment, the data is trusted, to assure that the data gets encrypted against the intended target, and not a maliciously-inserted target. In an embodiment, database security and/or signing the data provide a sufficient degree of trust. Given the public key, any program can encrypt data and provide it to the Client however it wishes.

Details of the Client Authentication Mechanism

Client Data:

- Public key;
- Private key;
- Computer ID;
- Registration interaction number; and
- Report number.

Server Data (per Client):

- Public key;
- Computer ID;
- Registration interaction number;
- Report number; and
- Reject this Computer ID.

As shown in FIG. 10, if Client Computer ID=0 or if Client public/private key pair is missing or non-functional:

- Begin registration;
  - Create public/private key pair;
  - Set registration interaction number to 0;
  - Send computer ID=0, public key, registration interaction number;
  - Receive computer ID;
    - Registration success, begin normal processing;
  - Receive public key in use;
    - Go back to begin registration.

If Client Computer ID !=0, and public/private key pair is functional:

- Subsequent registration;
  - Increment registration interaction number;
    - Send computer ID, Public key, and encrypted registration interaction number;
    - Receive computer ID;
      - Registration success, begin normal processing;
      - Receive clone detected, set computer ID to 0, go to Begin Registration.

If Server Registration Request with Computer ID=0:

- If Public key already in use, reject registration by telling Client ‘public key is in use’;
- Otherwise:
  - Allocate a new Computer ID that is unique;
  - Store a new computer record containing Computer ID, Public key, Registration interaction number=0, report number=0, reject this computer ID=false;
  - Send Computer ID.

If Subsequent Server Registration Request with Computer ID!=0 (FIG. 11):

- Receive Computer ID, Public key and encrypted Registration interaction number;
- Reject if cannot decrypt Registration interaction number with Public key provided;
- Look up Computer ID record;
  - If not found:
    - Store a new computer record containing the Computer ID, Public key, Registration interaction number decrypted, report number=0, reject this computer ID=false;
    - Send Computer ID;
  - else:
    - if (decrypted Registration interaction number > stored value):
      - This is a valid subsequent Registration attempt;
    - else:
      - this is a clone or replay attack;
      - send back a message encrypted with public key provided;
      - Receive response proving it is a clone (it has the private key);
      - If it is a clone:
        - Set ‘reject this Computer ID’=true;
        - Tell clone to reset itself (use a Computer ID=0);
      - Else:
        - inform sender that Registration failed.
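The Server-side registration handling above can be summarized in the following Python sketch. It is a simplified model under stated assumptions: the registration interaction number is passed in already decrypted, the challenge/response that distinguishes a clone from a replay is reduced to a boolean `proves_private_key`, and updating the stored interaction number on a valid attempt is assumed rather than stated in the source. All class, method, and status names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ComputerRecord:
    public_key: str
    registration_number: int = 0
    report_number: int = 0
    rejected: bool = False


class RegistrationServer:
    def __init__(self):
        self.records: Dict[int, ComputerRecord] = {}
        self._next_id = 1

    def register(self, computer_id: int, public_key: str,
                 registration_number: int,
                 proves_private_key: bool = True) -> Tuple[str, Optional[int]]:
        if computer_id == 0:  # initial registration
            if any(r.public_key == public_key for r in self.records.values()):
                return "public key is in use", None
            while self._next_id in self.records:  # keep allocated IDs unique
                self._next_id += 1
            new_id = self._next_id
            self.records[new_id] = ComputerRecord(public_key)
            return "registered", new_id

        record = self.records.get(computer_id)
        if record is None:  # unknown ID: store a new record as presented
            self.records[computer_id] = ComputerRecord(public_key, registration_number)
            return "registered", computer_id
        if registration_number > record.registration_number:
            record.registration_number = registration_number  # assumed update
            return "registered", computer_id
        # Stale or repeated number: clone or replay attack.
        if proves_private_key:
            record.rejected = True
            return "clone detected", None  # sender must reset to Computer ID=0
        return "registration failed", None
```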

Client report

- After preparing report (with report number and Computer ID embedded):
  - Compute SHA-1 of report;
  - Encrypt SHA-1 of report using private key;
  - Tack encrypted SHA-1 to end of report.

Server Report

- When receiving report:
  - Compute SHA-1 of report;
  - Read Computer ID from report headers;
  - Look up public key of this Client;
  - If not found, reject report;
  - Decrypt SHA-1;
  - if SHA-1s match, process report into database.
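The Client and Server report steps above can be illustrated with the following Python sketch. It uses textbook RSA over two small, well-known Mersenne primes purely so that the example is self-contained; this toy key material, the absence of padding, and the function names are all assumptions made for illustration and are not secure or representative of a production implementation.

```python
import hashlib

# Toy key material (assumption): two known Mersenne primes, far too small for real use.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (Python 3.8+)
SIG_BYTES = (N.bit_length() + 7) // 8    # fixed width of the appended value


def sha1_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def sign_report(report: bytes) -> bytes:
    """Client: compute SHA-1, transform it with the private key, append it."""
    signature = pow(sha1_int(report), D, N)
    return report + signature.to_bytes(SIG_BYTES, "big")


def verify_report(computer_id: int, signed: bytes, key_map: dict) -> bool:
    """Server: look up the public key, recover the SHA-1, compare."""
    key = key_map.get(computer_id)
    if key is None or len(signed) < SIG_BYTES:
        return False  # unknown Computer ID or malformed report: reject
    e, n = key
    report, tail = signed[:-SIG_BYTES], signed[-SIG_BYTES:]
    recovered = pow(int.from_bytes(tail, "big"), e, n)
    return recovered == sha1_int(report)
```

In this sketch the Computer ID is passed separately; in the flow described above, the Server reads it from the report headers.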

The person of ordinary skill will notice that the foregoing embodiments employ the SHA-1 cryptographic hash algorithm. Other embodiments may incorporate other cryptographic hash algorithms such as MD4, MD5, SHA-0, SHA-2 or SHA-3.

As shown in FIG. 12, it is apparent that, after a Client registers, barring the circumstance that the Client's private key is somehow installed on another machine, the foregoing Client Authentication model provides a high degree of certainty in subsequent interactions that the Client is authentic, that it is who it says it is.

In addition, the foregoing model also provides a mechanism for doing clone detection, in the event that a key does become compromised. The cloning detection, when it detects a cloned key during a registration attempt, invalidates the Computer ID associated with the cloned key. Subsequently, the Client must generate a new key pair and begin the registration process anew, thus enabling the detection of key reuse by a different party.

It will be appreciated that the level of trust established by the foregoing Client Authentication model may be raised through combination with other authentication mechanisms. For example, a higher level of trust may be achieved by establishing a second data pathway to secure a confirmation; for example, by requiring the registering party to confirm that they, in fact, are the registering party by email. Alternatively, a higher level of trust may be established if a Client is able to authenticate through a Server's active directory, or if the Client and Server can exchange keys via a protocol such as SSH (secure shell). A still higher level of trust may be achieved by physically verifying that the machine's credentials can be trusted; for example, by having an operator access the machine and verify the public key. Additionally, Clients accorded varying levels of trust may be identified in the Console interface. For example, Clients accorded the primary trust level are grouped together in one region of the display, while Clients accorded the highest trust level are grouped together in another region of the display.

While the foregoing Client Authentication model has been discussed primarily in connection with Client/Server interactions, the model also finds application in interactions between Clients, for example in a clustering relationship involving a number of endpoints.

Additionally, while the Client Authentication model has been discussed primarily in connection with Client/Server interaction, in an embodiment, it may also play a role in interactions between a Relay and a Client. As described above, Relays are typically Clients that have been additionally configured to behave as a Server. Accordingly, because a Relay is also a Client, the Relay can also be issued authentication credentials like a Client. By authenticating the Relay, a Client knows that it is talking to a Relay, thus providing additional protection against Snooping attacks, such as man-in-the-middle attacks.

An embodiment of the Client Authentication model finds application in the sending of a password down the hierarchy to a Client from the Server. It is a common IT management task to reset the password on a Client. Conventionally, a password, when it is sent to a Client, is scrambled. The Client is then given a utility to unscramble the password. However, giving the Client the unscramble utility, in essence, gives it to the rest of the world. Thus, even though the scrambled password is not plaintext, it is not secure. There exists, therefore, a great need for a secure way to send a password down to a Client. Because the Client Authentication model includes a key pair for the Client, the password can be encrypted using the Client's public key, and the encrypted password is then pushed to the Client. Because only the Client has the private key, only the Client can decrypt the password.

Direct Connect

As above, an embodiment of the Platform provides the ability to facilitate a connection between a Console operator and a remote computer, as shown in FIG. 13, where a Console 1301 is connected to Client A 1304 through the Root Server 1302 and Relay A 1303. This capability enables a multitude of use cases, many of which fall into one of the below categories:

- Remote control: leveraging the infrastructure to reach out and establish a synchronous encrypted tunnel between a Console operator and an endpoint, even across NAT (network address translation), personal firewalls, and so on; and
- Mailboxing: building a secured channel for asynchronously sending messages to individual machines.

Among the use cases are:

- Remote “QnA”: using the connection as a remote Fixlet debugger;
- Remote Desktop: remote Shell/SSH (secure shell)/VNC (virtual network computing);
- Password mailboxing;
- VPRO (INTEL Corporation, Santa Clara Calif.) tunneling;
- File discovery/sharing;
- On-the-fly VPN (virtual private network): allowance for SMB (Server message block) sharing;
- Connecting “Users” and “Computer IDs” to automatically provide privileges to connect to a set of other computers; and
- Anti-virus management: a Console plug-in synchronously opens a connection to a Client (endpoint) and transfers the log from the Client up to the Console.

Using the Platform to establish either synchronous or asynchronous one-to-one connections between the Console and a Client readily circumvents a host of restrictions imposed by network topology. For example, the Relay hierarchy readily allows penetration of NAT (network address translation) protocols—a technique that allows a number of machines to share a single IP address from the outside world's perspective—so that it is possible, assuming that a Relay exists behind the NAT, to communicate with Clients behind the NAT.

One embodiment enables routing through the infrastructure into a Relay inside a subnet and then allowing the last leg of communication to take place over an IP address that can directly connect to the target machine.

The Relay hierarchy and the Relay hierarchy discovery mechanisms, which employ hop count as a measure of a Relay's suitability for a machine to connect to, greatly simplify the configuration of routes through the hierarchy. When a Client registers with the most suitable Relay, a connection is established not only with the Relay but, through the Relay, all the way up to the Server, such that messages can then be forwarded down the pathway to the particular Client.

In an embodiment, the present Direct Connect methodology uses the pathway to establish a connection. For example, a rendezvous technique may wake up the target machine, inform it that a direct connection is requested and inform the target of the network topology or pathway to use to connect. In an embodiment, it is possible to directly connect across a network.

In an embodiment, the Relay infrastructure may be used as a communication mechanism to trigger a rendezvous, and subsequently to facilitate communications by keeping sockets open in both directions with all of the internet Relays handing off traffic in both connections as packets flow between the two. For example, the Relay infrastructure can be used with certain distributed computing applications wherein a connection is opened up between two ports that wouldn't otherwise be able to connect; the connecting Server can then step out of the middle, so there is no longer any Server involvement.

In an embodiment, as shown in FIG. 14, a direct connection 1400 between two Clients (1401, 1404) may involve two points (1402, 1403) in the Relay hierarchy, without involving the Server at all. For example, in the case of a user who is logged into the same network in two different parts of the world, a direct connection between the two machines allows the machines to interact with each other.

In an embodiment, by means of a user interface displayed on the desktop of each Client in the network, the user is able to specify a machine that the user would like to connect to and initiate a connection, for example, with a simple mouse click, triggering an activity that, behind the scenes, makes the connection available to the Client.

In an embodiment, a Relay may be used to provide an execution environment for other functions inside a container, thus providing a place in which Server functionalities can be made more widely available to Clients on the network.

In an embodiment, Relays may be used to host software depositories, for example software updates, so that the updates could be readily flowed to any Relay that has been configured to host the updates.

In an embodiment, Relays may be used to host computational entities such as distributed pattern databases that ideally are scattered throughout the enterprise.

Additionally, Relays may be used to host computational entities such as virtual environments to give the Relay cross-Platform capability, allowing it to run software for any operating system.

In an embodiment, Relays can be designated as processing points for a variety of computational tasks.

In an embodiment, Relays can provide a direct connection from a management point to an end point, thus enabling management technologies such as VPRO.

Wake-on-LAN

Wake-on-LAN is a computer networking standard that allows a computer to be turned on or woken up by a network message. Conventionally, the wake-up message is referred to as a “magic packet”, for example, a broadcast frame containing within its payload 6 bytes of 255 with all bits set to the ‘on’ position, followed by sixteen repetitions of the target computer's MAC address. Thus, the challenge is to direct a magic packet down to a target computer to wake it up. However, the magic packets used by Wake-on-LAN have the special property that they only work if they are broadcast within a subnet. Additionally, most networks do not permit sending a broadcast packet to other subnets because they can be easily abused to launch, for example, SMURF attacks.
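The magic packet layout described above (six bytes of 255 followed by sixteen repetitions of the target MAC address) can be built as in the following Python sketch. The function names are hypothetical, and the use of UDP port 9 and the limited broadcast address in `send_magic_packet` is a common convention assumed here, not something stated in this description.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return 6 bytes of 0xFF followed by 16 repetitions of the 6-byte MAC."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("a MAC address must be exactly 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast_ip: str = "255.255.255.255",
                      port: int = 9) -> None:
    """Broadcast the packet on the local subnet over UDP (port is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast_ip, port))
```

The resulting payload is always 102 bytes long (6 + 16 × 6).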

To circumvent the limitations involved in using a magic packet to wake up a computer, the Relay infrastructure herein described is used to route a broadcast packet down from any central point within the system, such as the management Console or an integration point, to any computer that exists within the system. This takes advantage of the fact that, when a Client registers with its Relay, and through it up to the root Server, the Client sends up a list of the interfaces that it has available for communication, the subnets they are in, and their MAC addresses. As above, the MAC (media access control) address is the address used for these wakeup commands. Thus, whenever a Client talks to a Relay, it sends up information saying “Here's where I am and here's how you can get in contact with me.”

The Relay retains this information, passing it up through the hierarchy all the way to the root, so that at the root of the deployment, an Administrator is able to readily determine what subnet a target computer occupies. The administrator next needs to find some other computer that is awake in the target computer's subnet that can broadcast the magic packet to the target computer. Because the Relay hierarchy has collected all of the necessary information for the Administrator, he/she knows of, for example, eighty computers that are all on the same subnet as the target computer, and they may be reporting in to, for example, two different Relays.

The administrator may then send a message down through the Relays, to reach the two target Relays which know how to contact the target's subnet, and they both then send out messages to all of the target's peers, requesting that the target be woken up. The Clients are configured to listen for the UDP messages sent out by the Relays asking that the target be woken up. When a Client hears one, it immediately broadcasts one of these Wake-on-LAN messages to the target computer.
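The lookup performed at the root, finding the target's subnet peers and the Relays they report to, can be sketched in Python as follows. The `Interface` record and the `peers_by_relay` function are hypothetical; the registration data is assumed to be available as a flat list of (Client, Relay, subnet, MAC) entries.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Interface(NamedTuple):
    client: str   # Client name or Computer ID
    relay: str    # Relay the Client reports to
    subnet: str   # subnet in CIDR notation
    mac: str      # MAC address of the interface


def peers_by_relay(registrations: List[Interface],
                   target_client: str) -> Dict[str, List[str]]:
    """Group the target's same-subnet peers by the Relay that can reach them."""
    target_nets = {ipaddress.ip_network(r.subnet)
                   for r in registrations if r.client == target_client}
    peers = defaultdict(list)
    for r in registrations:
        if r.client != target_client and ipaddress.ip_network(r.subnet) in target_nets:
            peers[r.relay].append(r.client)
    return dict(peers)
```

Each key of the result is a Relay to route the wake-up request through, and each value lists the peers that Relay should ask to broadcast the magic packet.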

Thus, unlike the conventional approach, which usually requires that a computer be designated in each subnet that must be powered-on at all times to provide a point of communication, all of the computers in the target subnet are told to wake up the target machine. It is highly likely that, out of all of the computers in the target subnet, at least one will be found that is powered-on and can issue a Wake-on-LAN message to the target computer. Because the requirement of a single point of communication has been eliminated, the network is considerably more robust, and easier to configure.

The Clients send out the magic packet on the same interface they're already listening on and they see when other Clients start sending out the same packet. The Clients stop sending immediately when they see this duplicate traffic, so there is a likelihood of a small amount of duplicate traffic, but in the event of duplicate traffic, the Clients elect among themselves which Client will broadcast the magic packet. All Clients that elect to wait are silent the next time they see a forwarding request until a period of time elapses, for example, a second. If they see that the Client queried hasn't responded, for example, because it was powered-off, the next Client in line will try.

The election process uses a technique that relies on a unique computer ID and a comparison operation that each computer can use to decide whether or not it should take precedence over the other computers. Any individual computer observing all the UDP traffic to wake up a particular machine in the subnet can decide whether or not it should take precedence in that subnet over the others. Thus, the Client that takes precedence prevails and takes over. The other Clients stay out of the way unless they detect that the designated computer isn't performing its tasks, in which case they chime in again. Whoever becomes dominant is controlled by the ordering of the individual machines according to the machines' unique identities. Thus, there is a built-in technique where the Clients do this election process based on a unique identifier and a collation order for determining precedence.
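The election can be sketched in Python as follows. The description does not state which end of the ordering wins, so this sketch assumes that the lowest Computer ID takes precedence; the function names are hypothetical.

```python
from typing import Iterable, Optional


def takes_precedence(my_id: int, observed_ids: Iterable[int]) -> bool:
    """True if this Client outranks every other Client seen sending the same
    wake-up packet (assumption: the lowest Computer ID wins)."""
    return all(my_id < other for other in observed_ids)


def next_in_line(peer_ids: Iterable[int],
                 unresponsive: Iterable[int]) -> Optional[int]:
    """Pick the highest-precedence peer that has not gone silent,
    or None if every peer is unresponsive."""
    candidates = sorted(set(peer_ids) - set(unresponsive))
    return candidates[0] if candidates else None
```

Because every Client applies the same comparison to the same observed IDs, each one reaches the same conclusion without any coordinating message.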

It should be noted that a Relay is generally a Client also, so that, as long as it fulfills the requirement of being in the same subnet as the target computer, a Relay could be the one to wake-up the target computer.

In view of the foregoing discussion, it will be apparent that the broadcast packet, within the context of the subnet, is actually a broadcast type of communication. The other messages that are actually happening inside of the system are directed messages. So what's flowing down through the Relay hierarchy after some user says “I want to wake up Bob's machine”, is not a broadcast. It is instead directed to the particular machines that are in that subnet that this particular machine reported that it was a member of.

The target machine resides inside a particular subnet, so its peers within the subnet are notified through directed mechanisms saying “if you're in this subnet, you should wake up Bob (the target machine)”, with his MAC address and so on. Each peer constructs the magic packet with that information and tags it with the unique identifier that allows the peers to coordinate which of them is in charge of delivering that message in that subnet. The peers then transform it into a broadcast message within the subnet.
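As an illustration of the packet construction, the following sketch builds a standard Wake-on-LAN payload (six 0xFF bytes followed by the target MAC address repeated sixteen times) and sends it as a UDP broadcast. How the sender's unique identifier is attached is not specified here, so appending it as a 4-byte big-endian integer is an assumption of this sketch, as are the function names and the choice of UDP port 9.

```python
import socket
import struct


def build_magic_packet(mac, sender_id):
    """Build a Wake-on-LAN payload for `mac`, followed by a sender tag.

    The standard payload is six 0xFF bytes and then the target MAC
    address repeated sixteen times. Appending the sender's ID as a
    4-byte big-endian integer is an assumption made for this sketch.
    """
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16 + struct.pack(">I", sender_id)


def broadcast_magic_packet(packet, port=9):
    """Send the packet as a UDP broadcast on the local subnet."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, ("255.255.255.255", port))
```

A peer that has been told to wake the target might call `broadcast_magic_packet(build_magic_packet("00:11:22:33:44:55", 42))`, where both the MAC address and the ID are placeholder values.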

Thus, a fundamental advantage of the Relays and the Relay hierarchy herein described is that any computer in the system can be contacted through the Relay hierarchy. This is unlike conventional network topologies of, for example, 100,000 machines, wherein each computer has an IP address and routes may exist between all of them, but many of those machines are not allowed to contact each other, or are prevented from contacting each other by firewalls, network segmenting, and so on.

The discovered routing that is established as a result of Relays and the automatic Relay selection makes it possible to reuse that routing to get a message back down to the computer. In fact, it is possible to find a routing between any two computers the administrator might want to talk to. By starting with a Relay and forwarding from one machine to a next until a message reaches the target, the Administrator can get a message through. Thus, it is to be appreciated that the Platform, in addition to providing the one-to-many communication of a broadcast system, allows direct one-to-one communication between any two machines within a network topology under management via the Platform.
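The reuse of discovered routing can be sketched as follows. The sketch assumes that registration data is available as a simple mapping from each machine to the parent it registered with, which is a simplification made for illustration; the machine names and function names are hypothetical.

```python
def path_from_root(target, parent_of):
    """Return the chain of machines from the root Server down to `target`.

    `parent_of` maps each machine to the parent it registered with; the
    root Server has no entry.
    """
    path = [target]
    seen = {target}
    while path[-1] in parent_of:
        parent = parent_of[path[-1]]
        if parent in seen:
            raise ValueError("cycle in registration data")
        seen.add(parent)
        path.append(parent)
    path.reverse()
    return path


def route_between(a, b, parent_of):
    """Return a route from `a` to `b` that climbs to their nearest common
    ancestor in the hierarchy and then descends to `b`."""
    up = path_from_root(a, parent_of)
    down = path_from_root(b, parent_of)
    shared = 0
    while shared < min(len(up), len(down)) and up[shared] == down[shared]:
        shared += 1
    if shared == 0:
        raise ValueError("machines do not share a root")
    # Climb from `a` up to (but not including) the common ancestor,
    # then descend from the common ancestor to `b`.
    return up[:shared - 1:-1] + down[shared - 1:]
```

Forwarding a message hop by hop along the returned list corresponds to starting with a Relay and forwarding from one machine to the next until the message reaches the target.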

Asset Network Mapping

In an embodiment, an Asset Network Map, as shown in FIG. 15, aggregates information collected by the Relay selection algorithm, revealing the gateways between a computer and the Relay it talks to and the number of hops, along with information about the bandwidth of those links, and creates a visual mapping of the information. In some cases, hundreds of thousands of lines of data are aggregated to form a map that gives the Operator a visual representation of his/her network. In its basic form, the information comprises a multitude of points, representing gateways, and lines, representing routes.

The aggregated data is rendered as a human readable graph using, for example, a force-directed algorithm, such as a spring algorithm. Additionally, the Operator can apply various filters to the data in order to create a map that highlights particular aspects of the data. For example, the Operator may specify that the link between a Relay and a Client should be 300 kilobytes/second.
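A minimal sketch of the aggregation and filtering steps follows. It assumes that each collected record is a (source, destination, kilobytes per second) tuple and that a filter keeps links at or above a threshold; both the record layout and the threshold interpretation are assumptions made for illustration, and the layout step itself (for example, a spring algorithm) is left out.

```python
from collections import defaultdict


def aggregate_links(records):
    """Collapse per-Client records into one entry per link.

    Each record is a hypothetical (source, destination, kb_per_s) tuple.
    The result maps each link to the mean of its bandwidth samples.
    """
    samples = defaultdict(list)
    for source, destination, kb_per_s in records:
        samples[(source, destination)].append(kb_per_s)
    return {link: sum(values) / len(values) for link, values in samples.items()}


def filter_links(links, min_kb_per_s):
    """Keep only the links whose bandwidth is at least `min_kb_per_s`."""
    return {link: bw for link, bw in links.items() if bw >= min_kb_per_s}
```

The filtered dictionary could then be handed to whatever graph layout routine draws the points and lines of the map.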

While the Asset Network Map can display historical data, in an embodiment it can be updated in real time as the network infrastructure changes. Thus, the Asset Network Map can display data even as it is being generated. In this way, network traffic can be depicted visually, in real time, so that the Operator can, for example, detect, even as it is happening, that a particular area of the network is becoming overloaded.

In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.

Claims

1. In a policy-based network management and communication infrastructure, a computer-implemented method of providing one-to-one communication between networked computational devices comprising the steps of:

at least one computational device automatically discovering at least one parent computational device and registering at least its location with said discovered parent computational device to form a discovered hierarchy of computational devices;

a first computational device automatically discovering at least one routing path through said discovered hierarchy to a second computational device; and

said first and second computational devices communicating with each other via said discovered routing path.

2. The method of claim 1, wherein said infrastructure includes at least one of:

at least one Root Server;

at least one Console;

zero or more Relays;

zero or more proxy agents; and

at least one Client.

3. The method of claim 2, wherein said at least one Root Server comprises a computational device programmed to provide a control center and repository for system configuration data, software updates and patches and other management information;

wherein said Console comprises an operations control center for administrators that runs from the Server wherein said console includes graphical displays of device, group, and enterprise-wide device status and dashboards for executing management actions through the infrastructure and wherein said Console includes reporting functions and templates that enable graphical and tabular views of infrastructure status;

wherein said at least one Relay comprises a non-dedicated computational device running Relay software as a shared service that acts as a concentration point for Fixlet messages on said infrastructure and reduces network bandwidth requirements for distribution of at least one of software, patches, updates and said Fixlet messages;

wherein said at least one Client comprises an endpoint device in said network executing an Agent, said Agent comprising software that acts as a universal policy engine capable of delivering multiple management services that includes at least one of Client status reporting, patch and software distribution, and security policy enforcement.

4. The method of claim 3, wherein said step of at least one computational device automatically discovering at least one parent computational device and registering at least its location with said discovered parent computational device to form a discovered hierarchy of computational devices comprises the steps of:

a Client determining if a Relay is in said Client's subnet by pinging Relays having a TTL (time-to-live) of 1 and, responsive to no detection of a Relay, incrementing the TTL value and pinging until at least one Relay is detected;

responsive to detection of a Relay, said Client attempting registration with said detected Relay;

responsive to successful registration with said detected Relay, said Client using said Relay as a parent device;

responsive to unsuccessful registration with said detected Relay, said Client continuing to increment TTL and pinging until a Relay is detected and registration is successful or until TTL is incremented to a predetermined value;

responsive to no Relay being detected, said Client attempting to register with a Failover Relay;

responsive to unsuccessful registration with said Failover Relay, said Client attempting to Register with a Server; and

responsive to unsuccessful registration with said Server, said Client attempting detection of a Relay again after elapse of a predetermined MinRetry period.

5. The method of claim 4, wherein said step of attempting to register with a Failover Relay comprises the steps of:

said Client attempting to interact with a Relay;

responsive to a failure of said interaction, said Client saving time of said failure and attempting said interaction a second time;

responsive to said second failure, attempting said interaction after a predetermined ResistFailure time elapses, said ResistFailure time starting at said saved time of Failure;

responsive to a failure following said ResistFailure time expiration, said Client initiating an automatic Relay selection procedure.

6. The method of claim 3, further comprising the step of:

said infrastructure providing means for credentialing a Client using a symmetric key pair in order to protect said Client and its parents from snooping attacks;

a Server signing and sending content down said hierarchy to a predetermined Client;

a predetermined Client encrypting and sending content up said hierarchy to a Server;

a predetermined Client signing and sending content to a Server; and

a Server encrypting and sending content down said hierarchy to said predetermined Client;

a first predetermined Client and a second predetermined Client exchanging content that has been one or both of signed and encrypted.

7. The method of claim 6, wherein the step of said infrastructure providing means for credentialing a Client using a symmetric key pair comprises the steps of:

a Server generating a private/public key pair and distributing copies of said public key to a plurality of Clients in said network;

a plurality of Clients each generating a public/private key pair and distributing a plurality of copies of said Client generated key to parent devices and peer devices on said network.

8. The method of claim 7, further comprising the steps of:

A Server granting a predetermined Client a unique ComputerID and associating said unique ComputerID to said public key generated and distributed by said predetermined Client;

said Client signing content originating from said Client with said private key generated by said predetermined Client;

responsive to a Server receiving said signed content from said predetermined Client, prior to update of said content, said Server verifying that said signed content is signed by a key that matches a public key associated to a ComputerID granted to said predetermined Client.

9. The method of claim 7, further comprising the steps of:

a Server looking up a public key associated with a predetermined ComputerID; and one or both of the steps of:

said Server signing content to be sent to the Client corresponding to said predetermined ComputerID; and

said Server using said public key associated with said predetermined ComputerID to encrypt said signed content for sending to said Client corresponding to said predetermined ComputerID.

10. The method of claim 7, further comprising the steps of:

a Client registering with a Server, wherein said Client sends said public key to its Server;

responsive to detection of a cloned key, said Server invalidating a ComputerID associated to said cloned key; and

said Server requiring said Client granted said ComputerID to generate a new key pair.

11. The method of claim 7, further comprising either of the steps of:

a first Client and a second Client authenticating content exchanged with each other;

and a Client and a Relay authenticating content exchanged with each other.

12. The method of claim 7, further comprising the step of:

a Server sending an encrypted password down said hierarchy to a Client, wherein said Client decrypts said password prior to use.

13. The method of claim 3, wherein said step of either of said first and second computational devices establishing communication with the other of said first and second computational devices via said discovered routing path comprises the step of:

said Console connecting to a predetermined Client via one or both of at least one Server and at least one Relay.

14. The method of claim 13, further comprising any of the steps of:

establishing a synchronous encrypted tunnel between said Console and said Client;

building a secured channel for asynchronously sending messages to individual Clients from said Console;

creating an on-the-fly VPN (virtual private network);

enabling one or both of file discovery and file sharing over a synchronous connection;

mailboxing passwords over an asynchronous connection;

establishing a remote desktop on a Client from a Console;

remotely debugging Actions;

connecting Users and ComputerIDs to automatically provide privileges to connect to a set of other computers;

synchronously opening a connection to a Client and transferring logs from the Client up to the Console;

routing through the infrastructure into a Relay inside a subnet and then allowing the last leg of communication to take place over an IP address that can directly connect to the target machine; and

establishing a direct connection between a first Client and a second Client.

15. The method of claim 13, further comprising the steps of:

routing a broadcast packet from said Console to a target computer in said network in order to wake-up said target computer.

16. The method of claim 15, said step of routing a broadcast packet from said Console to a predetermined computer in said network in order to wake-up said computer comprising at least one of the following steps:

said Console using Client MAC (media access control) addresses provided at registration to identify Clients occupying the same subnet as said target Client;

said Console sending at least one message down through said hierarchy to contact at least one Relay that is able to contact said target's subnet;

said at least one contacted Relay broadcasting messages to peers of said target, requesting that said target be woken up;

at least one of said peers listening for messages sent out by said Relays, detecting said request messages and sending a wake-up message to said target;

each of said Peers listening for duplicate traffic and suspending broadcast upon detection of said duplicate traffic.

17. The method of claim 16, wherein said step of each of said Peers listening for duplicate traffic and suspending broadcast upon detection of said duplicate traffic comprises the step of:

said peers deciding which peer should take precedence over the remaining peers based on a unique computer ID and a collation order for determining precedence.

18. The method of claim 3, wherein said step of either of said first and second computational devices establishing communication with the other of said first and second computational devices via said discovered routing path comprises the steps of:

deploying at least one Fixlet message to at least one Client that instructs said at least one Client to trust an arbitrary piece of content to run, so that responsibility for knowing that the content is safe to run is delegated to a trusted piece of software on said at least one Client;

said Client identifying said arbitrary piece of content according to file size and hash;

said Client requesting a Relay to provide said identified piece of content by providing said file size and said hash; and

said Relay mirroring said requested piece of content back down through said hierarchy to said Client.

19. The method of claim 18, further comprising the step of: merging said mirrored content with an Action instructing said Client to run whatever the content tells said Client to run.

20. The method of claim 18, wherein said content comprises dynamic content that changes and is updated frequently so that it is not known at the time of policy creation.

21. The method of claim 20, wherein said dynamic content comprises updates to anti-virus and spyware definitions.

22. The method of claim 18, comprising the steps of:

using variables to refer to said content in ActionScripts, wherein said Client is enabled to look up dynamic information indirectly and fill it into said variables.

23. The method of claim 20, further comprising the step of determining dependency resolution in order to install various pieces of software in an arbitrary collection of software, at least some items of which depend on other software being installed.

24. The method of claim 20, further comprising the step of providing data in the form of a set of packages to a process on a Client itself that is able to analyze the set of packages, wherein said process produces a list of URLs, hashes, and sizes that need to be downloaded for the particular machine in order for it to update to a new version of a package.

25. The method of claim 24, wherein any request to download from a URL that is not explicitly authorized is checked against a white-list of URLs and must meet at least one of the criteria specified in said white-list.

26. A platform for providing one-to-one communication between networked computational devices in a policy-based network management and communication infrastructure, comprising:

at least one computational device programmed for automatically discovering at least one parent computational device and registering at least its location with said discovered parent computational device to form a discovered hierarchy of computational devices;

a first computational device programmed for automatically discovering at least one routing path through said discovered hierarchy to a second computational device; and

said first and second computational devices programmed for establishing communication with the other of said first and second computational devices via said discovered routing path.

27. The platform of claim 26, wherein said infrastructure includes at least one of:

at least one Root Server;

at least one Console;

at least one Relay; and

at least one Client.

28. The platform of claim 27, wherein said at least one Root Server comprises a computational device programmed to provide a control center and repository for system configuration data, software updates and patches and other management information;

wherein said Console comprises an operations control center for administrators that runs from the Server wherein said console includes graphical displays of device, group, and enterprise-wide device status and dashboards for executing management actions through the infrastructure and wherein said Console includes reporting functions and templates that enable graphical and tabular views of infrastructure status;

wherein said at least one Relay comprises a non-dedicated computational device running Relay software as a shared service that acts as a concentration point for Fixlet messages on said infrastructure and reduces network bandwidth requirements for distribution of at least one of software, patches, updates and said Fixlet messages;

wherein said at least one Client comprises an endpoint device in said network executing an Agent, said Agent comprising software that acts as a universal policy engine capable of delivering multiple management services that includes at least one of Client status reporting, patch and software distribution, and security policy enforcement.

29. The platform of claim 28, wherein said at least one computational device programmed for automatically discovering at least one parent computational device and registering at least its location with said discovered parent computational device to form a discovered hierarchy of computational devices comprises:

a Client programmed for determining if a Relay is in said Client's subnet by pinging Relays having a TTL (time-to-live) of 1 and, responsive to no detection of a Relay, incrementing the TTL value and pinging until at least one Relay is detected;

responsive to detection of a Relay, said Client programmed for attempting registration with said detected Relay;

responsive to successful registration with said detected Relay, said Client programmed for using said Relay as a parent device;

responsive to unsuccessful registration with said detected Relay, said Client programmed for continuing to increment TTL and pinging until a Relay is detected and registration is successful or until TTL is incremented to a predetermined value;

responsive to no Relay being detected, said Client programmed for attempting to register with a Failover Relay;

responsive to unsuccessful registration with said Failover Relay, said Client programmed for attempting to Register with a Server;

responsive to unsuccessful registration with said Server, said Client programmed for attempting detection of a Relay again after elapse of a predetermined MinRetry period.

30. The platform of claim 29, wherein said Client programmed for attempting to register with a Failover Relay comprises:

said Client programmed for attempting to interact with a Relay;

said Client programmed for, responsive to a failure of said interaction, saving time of said failure and attempting said interaction a second time;

said Client programmed for, responsive to said second failure, attempting said interaction after a predetermined ResistFailure time elapses, said ResistFailure time starting at said saved time of Failure;

said Client programmed for, responsive to a failure following said ResistFailure time expiration, initiating an automatic Relay selection procedure.

31. The platform of claim 28, further comprising:

at least one computational device programmed for credentialing a Client using a symmetric key pair in order to protect said Client and its parents from snooping attacks;

a Server programmed for signing and sending content down said hierarchy to a predetermined Client;

a predetermined Client programmed for encrypting and sending content up said hierarchy to a Server;

a predetermined Client programmed for signing and sending content to a Server; and

a Server programmed for encrypting and sending content down said hierarchy to said predetermined Client;

a first predetermined Client and a second predetermined Client programmed for exchanging content that has been one or both of signed and encrypted.

32. The platform of claim 31, wherein said at least one computational device programmed for credentialing a Client using a symmetric key pair comprises:

a Server programmed for generating a private/public key pair and distributing copies of said public key to a plurality of Clients in said network;

a plurality of Clients each programmed for generating a public/private key pair and distributing a plurality of copies of said Client generated key to parent devices and peer devices on said network.

33. The platform of claim 32, further comprising:

a Server programmed for granting a predetermined Client a unique ComputerID and associating said unique ComputerID to said public key generated and distributed by said predetermined Client;

said Client programmed for signing content originating from said Client with said private key generated by said predetermined Client;

a Server programmed for verifying that said signed content is signed by a key that matches a public key associated to a ComputerID granted to said predetermined Client responsive to said Server receiving said signed content from said predetermined Client, prior to update of said content.

34. The platform of claim 32, further comprising:

a Server programmed for looking up a public key associated with a predetermined ComputerID; and one or both of the steps of:

said Server programmed for signing content to be sent to the Client corresponding to said predetermined ComputerID; and

said Server programmed for using said public key associated with said predetermined ComputerID to encrypt said signed content for sending to said Client corresponding to said predetermined ComputerID.

35. The platform of claim 32, further comprising:

a Client programmed for registering with a Server, wherein said Client sends said public key to its Server;

said Server for invalidating a ComputerID associated to a cloned key, responsive to detection of said cloned key, and

said Server programmed for requiring said Client granted said ComputerID to generate a new key pair.

36. The platform of claim 32, further comprising either of:

a first Client and a second Client programmed for authenticating content exchanged with each other;

and a Client and a Relay programmed for authenticating content exchanged with each other.

37. The platform of claim 32, further comprising a Server programmed for sending an encrypted password down said hierarchy to a Client, wherein said Client is programmed for decrypting said password prior to use.

38. The platform of claim 28, wherein either of said first and second computational devices being programmed for establishing communication with the other via said discovered routing path comprise:

said Console programmed for connecting to a predetermined Client via one or both of at least one Server and at least one Relay.

39. The platform of claim 38, further comprising any of:

a computational device programmed for establishing a synchronous encrypted tunnel between said Console and said Client;

a computational device programmed for building a secured channel for asynchronously sending messages to individual Clients from said Console;

a computational device programmed for creating an on-the-fly VPN (virtual private network);

a computational device programmed for enabling one or both of file discovery and file sharing over a synchronous connection;

a computational device programmed for mailboxing passwords over an asynchronous connection;

a computational device programmed for establishing a remote desktop on a Client from a Console;

a computational device programmed for remotely debugging Actions;

a computational device programmed for connecting Users and ComputerIDs to automatically provide privileges to connect to a set of other computers;

a computational device programmed for synchronously opening a connection to a Client and transferring logs from the Client up to the Console;

a computational device programmed for routing through the infrastructure into a Relay inside a subnet and then allowing the last leg of communication to take place over an IP address that can directly connect to the target machine; and

a computational device programmed for establishing a direct connection between a first Client and a second Client.

40. The platform of claim 38, further comprising:

a computational device programmed for routing a broadcast packet from said Console to a target computer in said network in order to wake-up said target computer.

41. The platform of claim 40, said computational device programmed for routing a broadcast packet from said Console to a predetermined computer in said network in order to wake-up said computer comprising at least one of the following:

said Console programmed for using Client MAC (media access control) addresses provided at registration to identify Clients occupying the same subnet as said target Client;

said Console programmed for sending at least one message down through said hierarchy to contact at least one Relay that is able to contact said target's subnet;

said at least one contacted Relay programmed for broadcasting messages to peers of said target, requesting that said target be woken up;

at least one of said peers programmed for listening for messages sent out by said Relays, detecting said request messages and sending a wake-up message to said target;

each of said Peers programmed for listening for duplicate traffic and suspending broadcast upon detection of said duplicate traffic.

42. The platform of claim 41, wherein each of said peers programmed for listening for duplicate traffic and suspending broadcast upon detection of said duplicate traffic are programmed for:

deciding which peer should take precedence over the remaining peers based on a unique computer ID and a collation order for determining precedence.

43. The platform of claim 28, wherein said first and second computational devices programmed for establishing communication with the other of said first and second computational devices via said discovered routing path are programmed for:

deploying at least one Fixlet message to at least one Client that instructs said at least one Client to trust an arbitrary piece of content to run, so that responsibility for knowing that the content is safe to run is delegated to a trusted piece of software on said at least one Client;

said Client identifying said arbitrary piece of content according to file size and hash;

said Client requesting a Relay to provide said identified piece of content by providing said file size and said hash; and

said Relay mirroring said requested piece of content back down through said hierarchy to said Client.

44. The platform of claim 43, further comprising a computational device programmed for merging said mirrored content with an Action instructing said Client to run whatever the content tells said Client to run.

45. The platform of claim 43, wherein said content comprises dynamic content that changes and is updated frequently so that it is not known at the time of policy creation.

46. The platform of claim 45, wherein said dynamic content comprises updates to anti-virus and spyware definitions.

47. The platform of claim 43, further comprising a computational device programmed for:

using variables to refer to said content in ActionScripts, wherein said Client is enabled to look up dynamic information indirectly and fill it into said variables.

48. The platform of claim 45, further comprising a computational device programmed for determining dependency resolution in order to install various pieces of software in an arbitrary collection of software, at least some items of which depend on other software being installed.

49. The platform of claim 45, further comprising a computational device programmed for providing data in the form of a set of packages to a process on a Client itself that is able to analyze the set of packages, wherein said process produces a list of URLs, hashes, and sizes that need to be downloaded for the particular machine in order for it to update to a new version of a package.

50. The platform of claim 49, wherein any request to download from a URL that is not explicitly authorized is checked against a white-list of URLs and must meet at least one of the criteria specified in said white-list.

51. In a platform providing one-to-one communication between networked computational devices, a method for at least one computational device to automatically discover at least one parent computational device comprising the steps of:

a Client determining if a Relay is in said Client's subnet by pinging Relays having a TTL (time-to-live) of 1 and, responsive to no detection of a Relay, incrementing the TTL value and pinging until at least one Relay is detected;

responsive to detection of a Relay, said Client attempting registration with said detected Relay;

responsive to successful registration with said detected Relay, said Client using said Relay as a parent device;

responsive to unsuccessful registration with said detected Relay, said Client continuing to increment TTL and pinging until a Relay is detected and registration is successful or until TTL is incremented to a predetermined value;

responsive to no Relay being detected, said Client attempting to register with a Failover Relay;

responsive to unsuccessful registration with said Failover Relay, said Client attempting to Register with a Server; and

responsive to unsuccessful registration with said Server, said Client attempting detection of a Relay again after elapse of a predetermined MinRetry period.

52. A computer program product method for at least one computational device to automatically discover at least one parent computational device in a platform for providing one-to-one communication between networked computational devices, comprising a tangible computer-readable storage medium having embodied thereon computer-readable instructions for:

a Client determining if a Relay is in said Client's subnet by pinging Relays having a TTL (time-to-live) of 1 and, responsive to no detection of a Relay, incrementing the TTL value and pinging until at least one Relay is detected;

responsive to detection of a Relay, said Client attempting registration with said detected Relay;

responsive to successful registration with said detected Relay, said Client using said Relay as a parent device;

responsive to unsuccessful registration with said detected Relay, said Client continuing to increment TTL and pinging until a Relay is detected and registration is successful or until TTL is incremented to a predetermined value;

responsive to no Relay being detected, said Client attempting to register with a Failover Relay;

responsive to unsuccessful registration with said Failover Relay, said Client attempting to register with a Server; and

responsive to unsuccessful registration with said Server, said Client attempting detection of a Relay again after elapse of a predetermined MinRetry period.

53. In a platform providing one-to-one communication between networked computational devices, a method for credentialing a Client using a symmetric key pair in order to protect said Client and its parents from snooping attacks comprising the steps of:

a Server signing and sending content down said hierarchy to a predetermined Client;

a predetermined Client encrypting and sending content up said hierarchy to a Server;

a predetermined Client signing and sending content to a Server;

a Server encrypting and sending content down said hierarchy to said predetermined Client; and

a first predetermined Client and a second predetermined Client exchanging content that has been one or both of signed and encrypted.

54. A computer program product for credentialing a Client using a symmetric key pair in order to protect said Client and its parents from snooping attacks in a platform providing one-to-one communication between networked computational devices, comprising a tangible computer-readable storage medium having embodied thereon computer-readable instructions for:

a Server signing and sending content down said hierarchy to a predetermined Client;

a predetermined Client encrypting and sending content up said hierarchy to a Server;

a predetermined Client signing and sending content to a Server;

a Server encrypting and sending content down said hierarchy to said predetermined Client; and

a first predetermined Client and a second predetermined Client exchanging content that has been one or both of signed and encrypted.

55. In a platform providing one-to-one communication between networked computational devices, a method for either of first and second computational devices establishing communication with the other via a discovered routing path comprising the steps of:

deploying at least one Fixlet message to at least one Client that instructs said at least one Client to trust an arbitrary piece of content to run, so that responsibility for knowing that the content is safe to run is delegated to a trusted piece of software on said at least one Client;

said Client identifying said arbitrary piece of content according to file size and hash;

said Client requesting a Relay to provide said identified piece of content by providing said file size and said hash; and

said Relay mirroring said requested piece of content back down through said hierarchy to said Client.
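By way of illustration only, the size-and-hash identification and Relay mirroring recited above could be sketched as follows. SHA-256 is an assumption (the claim does not name a hash algorithm), and the `Relay` class, `content_id`, and `client_download` are hypothetical names.

```python
import hashlib
from typing import Dict, Optional, Tuple

ContentKey = Tuple[int, str]  # (file size in bytes, hex digest)


def content_id(data: bytes) -> ContentKey:
    """Identify content by its size and hash (SHA-256 assumed here)."""
    return len(data), hashlib.sha256(data).hexdigest()


class Relay:
    """Hypothetical Relay that mirrors content obtained from its parent."""

    def __init__(self, parent: Optional["Relay"] = None) -> None:
        self.parent = parent
        self.cache: Dict[ContentKey, bytes] = {}

    def store(self, data: bytes) -> ContentKey:
        key = content_id(data)
        self.cache[key] = data
        return key

    def fetch(self, size: int, digest: str) -> Optional[bytes]:
        key = (size, digest)
        if key not in self.cache and self.parent is not None:
            # Ask up the hierarchy, then keep a mirrored copy for later requests.
            data = self.parent.fetch(size, digest)
            if data is not None:
                self.cache[key] = data
        return self.cache.get(key)


def client_download(relay: Relay, size: int, digest: str) -> bytes:
    """Request content by size and hash, and verify it before trusting it."""
    data = relay.fetch(size, digest)
    if data is None or content_id(data) != (size, digest):
        raise ValueError("content missing or does not match requested size and hash")
    return data
```

Because the Client checks the returned bytes against the requested size and hash, in this sketch it does not need to trust any intermediate Relay to have preserved the content.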

56. A computer program product for first and second computational devices establishing communication with each other via a discovered routing path in a platform providing one-to-one communication between networked computational devices, comprising a tangible computer-readable storage medium having embodied thereon computer-readable instructions for:

deploying at least one Fixlet message to at least one Client that instructs said at least one Client to trust an arbitrary piece of content to run, so that responsibility for knowing that the content is safe to run is delegated to a trusted piece of software on said at least one Client;

said Client identifying said arbitrary piece of content according to file size and hash;

said Client requesting a Relay to provide said identified piece of content by providing said file size and said hash; and

said Relay mirroring said requested piece of content back down through said hierarchy to said Client.