Dynamic coordination and control of network connected devices for large-scale network site testing and associated architectures
Dynamic coordination and control of network connected devices within a distributed processing platform is disclosed for large-scale network site testing, or for other distributed projects. For network site testing, the distributed processing system utilizes a plurality of client devices which are running a client agent program associated with the distributed computing platform and which are running potentially distinct project modules for the testing of network sites or other projects. The participating client devices can be selected based upon their attributes and can receive test workloads from the distributed processing server systems. In addition, the client devices can send and receive poll communications that may be used during processing of the project to control, manage and coordinate the project activities of the distributed devices. If desired, a separate poll server system can be dedicated to handling the poll communication and coordination and control operations with the participating distributed devices during test operations, thereby allowing other server tasks to be handled by other distributed processing server systems. Once the tests are complete, the results can be communicated from the client devices to the server systems and can be reported, as desired. Additionally, the distributed processing system can identify the attributes, including device capabilities, of distributed devices connected together through a wide variety of communication systems and networks and utilize those attributes to organize, manage and distribute project workloads to the distributed devices.
This application is a continuation-in-part application of the following applications: application Ser. No. 09/539,448 entitled “CAPABILITY-BASED DISTRIBUTED PARALLEL PROCESSING SYSTEM AND ASSOCIATED METHOD,” now abandoned; application Ser. No. 09/539,428 entitled “METHOD OF MANAGING DISTRIBUTED WORKLOADS AND ASSOCIATED SYSTEM,” and application Ser. No. 09/539,106 entitled “NETWORK SITE TESTING METHOD AND ASSOCIATED SYSTEM,” which was filed on Mar. 30, 2000, now U.S. Pat. No. 6,891,802, and which is hereby incorporated by reference in its entirety. This application is also a continuation-in-part application of the following applications: application Ser. No. 09/603,740 entitled “METHOD OF MANAGING WORKLOADS AND ASSOCIATED DISTRIBUTED PROCESSING SYSTEM,” now abandoned, and application Ser. No. 09/602,983 entitled “CUSTOMER SERVICES AND ADVERTISING BASED UPON DEVICE ATTRIBUTES AND ASSOCIATED DISTRIBUTED PROCESSING SYSTEM,” now U.S. Pat. No. 6,963,897, each of which was filed on Jun. 23, 2000, and each of which is hereby incorporated by reference in its entirety. This application is also a continuation-in-part application of the following application: application Ser. No. 09/648,832 entitled “SECURITY ARCHITECTURE FOR DISTRIBUTED PROCESSING SYSTEMS AND ASSOCIATED METHOD,” which was filed on Aug. 25, 2000, now U.S. Pat. No. 6,847,995, and which is hereby incorporated by reference in its entirety. This application is also a continuation-in-part application of the following co-pending application: application Ser. No. 09/794,969 entitled “SYSTEM AND METHOD FOR MONITIZING NETWORK CONNECTED USER BASES UTILIZING DISTRIBUTED PROCESSING SYSTEMS,” which was filed on Feb. 27, 2001, and which is hereby incorporated by reference in its entirety. This application is also a continuation-in-part application of the following co-pending application: application Ser. No. 09/834,785 entitled “SOFTWARE-BASED NETWORK ATTACHED STORAGE SERVICES HOSTED ON MASSIVELY DISTRIBUTED PARALLEL COMPUTING NETWORKS,” which was filed on Apr. 13, 2001, and which is hereby incorporated by reference in its entirety. The present application also claims priority to the following co-pending U.S. provisional patent application: Provisional Application Ser. No. 60/368,871 entitled “MASSIVELY DISTRIBUTED PROCESSING SYSTEM ARCHITECTURE, SCHEDULING, UNIQUE DEVICE IDENTIFICATION AND ASSOCIATED METHODS,” which was filed Mar. 29, 2002, and which is hereby incorporated by reference in its entirety.
TECHNICAL FIELD OF THE INVENTION
This invention relates to distributing project workloads among a multitude of distributed devices, and more particularly to techniques and related methods for managing, facilitating and implementing distributed processing in a network environment. This invention is also related to functional, quality of service (QoS), and other testing of network sites utilizing a distributed processing platform.
BACKGROUND
Network site testing is typically desired to determine how a site or connected service performs under a desired set of test circumstances. Two commonly attempted tests are site load testing and quality of service (QoS) testing. Quality of service (QoS) testing refers to testing a user's experience accessing a network site under normal or various other usability situations. Load testing refers to testing the load a particular network site's infrastructure can handle in user interactions. An extreme version of load testing is a denial-of-service attack, where a system or group of systems intentionally attempts to overload and shut down a network site. Co-pending application Ser. No. 09/539,106 entitled “NETWORK SITE TESTING METHOD AND ASSOCIATED SYSTEM,” (which is commonly owned by United Devices, Inc.) discloses a distributed processing system capable of utilizing a plurality of distributed client devices to test network web sites, for example, with actual expected user systems. One problem associated with network site testing is the management, control and coordination of the distributed devices participating in the network site testing project.
SUMMARY OF THE INVENTION
The present invention provides architectures and methods for the dynamic coordination and control of network connected devices for network site testing and other distributed computing projects. For the network site testing, the distributed processing system utilizes a plurality of client devices that run client agent programs which are associated with a distributed computing platform and which are running one or more possibly distinct project modules for network site testing or other projects. The participating client devices receive project workload units from the distributed processing server systems. Poll communications between the client systems and the server systems are used during processing of the distributed project to control, manage and coordinate the activities of the distributed devices in accomplishing the project goal, such as network site testing. If desired, a separate poll server system can be dedicated to handle the poll communications and coordination and control operations with the participating distributed devices during test operations, thereby allowing other server tasks to be handled by other distributed processing server systems. Once the tests are complete, the results can be communicated from the client devices to the server systems and can be reported, as desired. Additionally, the distributed processing system can identify the attributes of distributed devices connected together through a wide variety of communication systems and networks and utilize those attributes to organize, manage and distribute project workloads to the distributed devices.
It is noted that the appended drawings illustrate only exemplary embodiments of the invention and are, therefore, not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
The present invention provides a dynamic coordination and control architecture for network site testing within a distributed processing platform that utilizes a plurality of network-connected client devices. The client systems are configured to run a client agent program and project modules for the testing of network sites or other distributed project activities. In addition to project work units, these client devices can receive poll communications that are used during project operations to control, manage and coordinate the project activities of the distributed devices. In addition, if desired, a separate poll server system can be dedicated to handling the poll communications and coordination and control operations with the participating distributed devices during test operation, thereby allowing other server tasks to be handled by other distributed processing server systems. Once the tests are complete, the results can be collected and reported.
Example embodiments for the coordination and control architecture of the present invention, including a poll server, are described with respect to
As described in the co-pending applications, distributed processing systems according to the present invention may identify the capabilities of distributed devices connected together through a wide variety of communication systems and networks and then utilize these capabilities to accomplish network site testing objectives of the present invention. For example, distributed devices connected to each other through the Internet, an intranet network, a wireless network, home networks, or any other network may provide any of a number of useful capabilities to third parties once their respective capabilities are identified, organized, and managed for a desired task. These distributed devices may be connected personal computer systems (PCs), internet appliances, notebook computers, servers, storage devices, network attached storage (NAS) devices, wireless devices, hand-held devices, or any other computing device that has useful capabilities and is connected to a network in any manner. The present invention further contemplates providing an incentive, which may be based in part upon capabilities of the distributed devices, to encourage users and owners of the distributed devices to allow the capabilities of the distributed devices to be utilized in the distributed parallel processing system of the present invention.
The number of usable distributed devices contemplated by the present invention is preferably very large. Unlike a small local network environment, for example, which may include fewer than 100 interconnected computer systems, the present invention preferably utilizes a multitude of widely distributed devices to provide a massively distributed processing system. With respect to the present invention, a multitude of distributed devices refers to greater than 1,000 different distributed devices. With respect to the present invention, widely distributed devices refers to a group of interconnected devices of which at least two are physically located at least 100 miles apart. With respect to the present invention, a massively distributed processing system is one that utilizes a multitude of widely distributed devices. The Internet is an example of an interconnected system that includes a multitude of widely distributed devices. An intranet system at a large corporation is an example of an interconnected system that includes a multitude of distributed devices, and if multiple corporate sites are involved, may include a multitude of widely distributed devices. A distributed processing system according to the present invention that utilizes such a multitude of widely distributed devices, as are available on the Internet or in a large corporate intranet, is a massively distributed processing system according to the present invention.
Looking now to
It is noted that the client systems 108, 110 and 112 represent any number of systems and/or devices that may be identified, organized and utilized by the server systems 104 to accomplish a desired task, for example, personal computer systems (PCs), internet appliances, notebook computers, servers, storage devices, network attached storage (NAS) devices, wireless devices, hand-held devices, or any other computing device that has useful capabilities and is connected to a network in any manner. The server systems 104 represent any number of processing systems that provide the function of identifying, organizing and utilizing the client systems to achieve the desired tasks.
The incentives provided by the incentives block 126 may be any desired incentive. For example, the incentive may be a sweepstakes in which entries are given to client systems 108, 110 . . . 112 that are signed up to be utilized by the distributed processing system 100. Other example incentives are reward systems, such as airline frequent-flyer miles, purchase credits and vouchers, payments of money, monetary prizes, property prizes, free trips, time-share rentals, cruises, connectivity services, free or reduced cost Internet access, domain name hosting, mail accounts, participation in significant research projects, achievement of personal goals, or any other desired incentive or reward.
As indicated above, any number of other systems may also be connected to the network 102. The element 106, therefore, represents any number of a variety of other systems that may be connected to the network 102. The other systems 106 may include ISPs, web servers, university computer systems, and any other distributed device connected to the network 102, for example, personal computer systems (PCs), internet appliances, notebook computers, servers, storage devices, network attached storage (NAS) devices, wireless devices, hand-held devices, or any other connected computing device that has useful capabilities and is connected to a network in any manner. The customer systems 152 represent customers that have projects for the distributed processing system, as further described with respect to FIG. 1B. The customer systems 152 connect to the network 102 through the communication link 119.
It is noted that the communication links 114, 116, 118, 119, 120 and 122 may allow for communication to occur, if desired, between any of the systems connected to the network 102. For example, client systems 108, 110 . . . 112 may communicate directly with each other in peer-to-peer type communications. It is further noted that the communication links 114, 116, 118, 119, 120 and 122 may be any desired technique for connecting into any portion of the network 102, such as Ethernet connections, wireless connections, ISDN connections, DSL connections, modem dial-up connections, cable modem connections, fiber optic connections, direct T1 or T3 connections, routers, portal computers, as well as any other network or communication connection. It is also noted that there are any number of possible configurations for the connections for network 102, according to the present invention. The client system 108 may be, for example, an individual personal computer located in someone's home and may be connected to the Internet through an Internet Service Provider (ISP). Client system 108 may also be a personal computer located on an employee's desk at a company that is connected to an intranet through a network router and then connected to the Internet through a second router or portal computer. Client system 108 may further be a personal computer connected to a company's intranet, and the server systems 104 may also be connected to that same intranet. In short, a wide variety of network environments are contemplated by the present invention on which a large number of potential client systems are connected.
It is noted, therefore, that the capabilities for client systems 108, 110 . . . 112 may span the entire range of possible computing, processing, storage and other sub-systems or devices that are connected to a system connected to the network 102. For example, these subsystems or devices may include: central processing units (CPUs), digital signal processors (DSPs), graphics processing engines (GPEs), hard drives (HDs), memory (MEM), audio sub-systems (ASs), communications subsystems (CSs), removable media types (RMs), and other accessories with potentially useful unused capabilities (OAs). In short, for any given computer system connected to a network 102, there exists a variety of capabilities that may be utilized by that system to accomplish its direct tasks. At any given time, however, only a fraction of these capabilities are typically used on the client systems 108, 110 . . . 112.
As indicated above, to encourage owners or users of client systems to allow their system capabilities to be utilized by control system 104, an incentive system may be utilized. This incentive system may be designed as desired. Incentives may be provided to the user or owner of the client systems when the client system is signed up to participate in the distributed processing system, when the client system completes a workload for the distributed processing system, or at any other time during the process. In addition, incentives may be based upon the capabilities of the client systems, based upon a benchmark workload that provides a standardized assessment of the capabilities of the client systems, or based upon any other desired criteria.
Security subsystems and interfaces may also be included to provide for secure interactions between the various devices and systems of the distributed processing system 100. The security subsystems and interfaces operate to secure the communications and operations of the distributed processing system. This security subsystem and interface also represents a variety of potential security architectures, techniques and features that may be utilized. This security may provide, for example, authentication of devices when they send and receive transmissions, so that a sending device verifies the authenticity of the receiving device and/or the receiving device verifies the authenticity of the sending device. In addition, this security may provide for encryption of transmissions between the devices and systems of the distributed processing system. The security subsystems and interfaces may also be implemented in a variety of ways, including utilizing security subsystems within each device or security measures shared among multiple devices, so that security is provided for all interactions of the devices within the distributed processing system. In this way, for example, security measures may be set in place to make sure that no unauthorized entry is made into the programming or operations of any portion of the distributed processing system including the client agents.
As discussed above, each client system includes a client agent that operates on the client system and manages the workloads and processes of the distributed processing system. As shown in
Also as discussed above, security subsystems and interfaces may be included to provide for secure interactions between the various devices and systems of the distributed processing system 100. As depicted in
In operation, client systems or end-users may utilize the clients subsystem 1548 within the web interface 1554 to register, set user preferences, check statistics, check sweepstakes entries, or accomplish any other user interface option made available, as desired. Advertising customers may utilize the advertisers subsystem 1552 within the web interface 1554 to register, add or modify banner or other advertisements, set up rules for serving advertisements, check advertising statistics (e.g., click statistics), or accomplish any other advertiser interface option made available, as desired. Customers and their respective task or project developers may utilize the task developer subsystem 1550 to access information within database systems 1546 and modules within the server systems 104, such as the version/phase control subsystem 1528, the task module and work unit manager 1530, and the workload information 308. Customers may also check project results, add new work units, check defect reports, or accomplish any other customer or developer interface option made available, as desired.
Advantageously, the customer or developer may provide the details of the project to be processed, including specific program code and algorithms that will process the data, in addition to any data to be processed. In the embodiment shown in
Information sent from the server systems 104 to the client agents 270A, 270B . . . 270C may include task modules, data for work units, and advertising information. Information sent from the client agents 270A, 270B . . . 270C to the server systems 104 may include user information, system information and capabilities, current task module version and phase information, and results. The database systems 1546 may hold any relevant information desired, such as workload information (WL) 308 and client capability vectors (CV) 620. Examples of information that may be stored include user information, client system information, client platform information, task modules, phase control information, version information, work units, data, results, advertiser information, advertisement content, advertisement purchase information, advertisement rules, or any other pertinent information.
It may be expected that different workload projects WL1, WL2 . . . WLN within the workload database 308 may require widely varying processing requirements. Thus, in order to better direct resources to workload projects, the server system may access various system vectors when a client system signs up to provide processing time and other system or device capabilities to the server system. This capability scheduling helps facilitate project operation and completion. In this respect, the capability vector database 620 keeps track of any desired feature of client systems or devices in capability vectors CBV1, CBV2 . . . CBVN, represented by elements 628, 630 . . . 632, respectively. These capability vectors may then be utilized by the control system 304 through line 626 to capability balance workloads.
This capability scheduling according to the present invention, therefore, allows for the efficient management of the distributed processing system of the present invention. This capability scheduling and distribution will help maximize throughput, deliver timely responses for sensitive workloads, calculate redundancy factors when necessary, and in general, help optimize the distributed processing computing system of the present invention. The following TABLE 1 provides lists of capability vectors or factors that may be utilized. It is noted that this list is an example list, and any number of vectors or factors may be identified and utilized, as desired.
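As a hedged illustration only, capability-balanced workload selection might be sketched as follows; the field names, the requirement fields and the simple minimum-requirements rule are assumptions made for this sketch and are not the only way the capability vectors of TABLE 1 could be applied.

#include <string>
#include <vector>

// Illustrative capability vector for a client system (field names are assumptions).
struct CapabilityVector {
    std::string client_id;
    std::string operating_system;   // e.g., "Win98"
    int         cpu_mhz;            // processor speed
    int         memory_mb;          // installed memory
    int         bandwidth_kbps;     // downstream bandwidth
};

// Illustrative minimum requirements attached to a project workload.
struct WorkloadRequirements {
    std::string required_os;        // empty string means any operating system
    int         min_cpu_mhz;
    int         min_memory_mb;
    int         min_bandwidth_kbps;
};

// Select the client systems whose capability vectors satisfy the workload requirements.
std::vector<std::string> capability_balance(const std::vector<CapabilityVector>& vectors,
                                            const WorkloadRequirements& req)
{
    std::vector<std::string> selected;
    for (const CapabilityVector& cv : vectors) {
        bool os_ok = req.required_os.empty() || cv.operating_system == req.required_os;
        if (os_ok && cv.cpu_mhz >= req.min_cpu_mhz &&
            cv.memory_mb >= req.min_memory_mb &&
            cv.bandwidth_kbps >= req.min_bandwidth_kbps) {
            selected.push_back(cv.client_id);
        }
    }
    return selected;
}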
This capability scheduling and management based upon system related vectors allows for efficient use of resources. For example, utilizing the operating system or software vectors, workloads may be scheduled or managed so that desired hardware and software configurations are utilized. This scheduling based upon software vectors may be helpful because different software versions often have different capabilities. For example, various additional features and services are included in MICROSOFT WINDOWS '98 as compared with MICROSOFT WINDOWS '95. Any one of these additional functions or services may be desired for a particular workload that is to be hosted on a particular client system device. Software and operating system vectors also allow for customers to select a wide variety of software configurations on which the customers may desire a particular workload to be run. These varied software configurations may be helpful, for example, where software testing is desired. Thus, the distributed processing system of the present invention may be utilized to test new software, data files, Java programs or other software on a wide variety of hardware platforms, software platforms and software versions. For example, a Java program may be tested on a wide proliferation of JREs (Java Runtime Engines) associated with a wide variety of operating systems and machine types, such as personal computers, handheld devices, etc.
From the customer system perspective, the capability management and the capability database, as well as information concerning users of the distributed devices, provide a vehicle through which a customer may select particular hardware, software, user or other configurations, in which the customer is interested. In other words, utilizing the massively parallel distributed processing system of the present invention, a wide variety of selectable distributed device attributes, including information concerning users of the distributed devices, may be provided to a customer with respect to any project, advertising, or other information or activity a customer may have to be processed or distributed.
For example, a customer may desire to advertise certain goods or services to distributed devices that have certain attributes, such as particular device capabilities or particular characteristics for users of those distributed devices. Based upon selected attributes, a set of distributed devices may be identified for receipt of advertising messages. These messages may be displayed to a user of the distributed device through a browser, the client agent, or any other software that is executing either directly or remotely on the distributed device. Thus, a customer may target particular machine specific device or user attributes for particular advertising messages. For example, users with particular demographic information may be targeted for particular advertisements. As another example, the client agent running on client systems that are personal computers may determine systems that are suffering from numerous page faults (i.e., through tracking operating system health features such as the number of page faults). High numbers of page faults are an indication of low memory. Thus, memory manufacturers could target such systems for memory upgrade banners or advertisements.
Still further, if a customer desires to run a workload on specific device types, specific hardware platforms, specific operating systems, etc., the customer may then select these features and thereby select a subset of the distributed client systems on which to send a project workload. Such a project would be, for example, if a customer wanted to run a first set of simulations on personal computers with AMD ATHLON microprocessors and a second set of simulations on personal computers with INTEL PENTIUM III microprocessors. Alternatively, if a customer is not interested in particular configurations for the project, the customer may simply request any random number of distributed devices to process its project workloads.
Customer pricing levels for distributed processing may then be tied, if desired, to the level of specificity desired by a particular customer. For example, a customer may contract for a block of 10,000 random distributed devices for a base amount. The customer may later decide for an additional or different price to utilize one or more capability vectors in selecting a number of devices for processing its project. Further, a customer may request that a number of distributed devices be dedicated solely to processing its project workloads. In short, once device attributes, including device capabilities and user information, are identified, according to the present invention, any number of customer offerings may be made based upon the device attributes for the connected distributed devices. It is noted that to facilitate use of the device capabilities and user information, capability vectors and user information may be stored and organized in a database, as discussed above.
Referring now to
As shown in
Site testing is typically desired to determine how a site or connected service performs under any desired set of test circumstances. With the distributed processing system of the present invention, site performance testing may be conducted using any number of real client systems 108, 110 and 112, rather than the simulated activity that is currently available. Several tests that are commonly desired are site load tests and quality of service (QoS) tests. Quality of service (QoS) testing refers to testing a user's experience accessing a network site under normal usability situations. Load testing refers to testing what a particular network site's infrastructure can handle in user interactions. An extreme version of load testing is a denial-of-service attack, where a system or group of systems intentionally attempts to overload and shut down a network site. Advantageously, the present invention uses actual systems to test network web sites, as opposed to the simulated tests that others in the industry are capable of providing and which yield only inaccurate and approximate results.
Network site 106B and the multiple interactions represented by communication lines 116B, 116C and 116D are intended to represent a load testing environment. Network site 106A and the single interaction 116A is indicative of a user interaction or QoS testing environment. It is noted that load testing, QoS testing and any other site testing may be conducted with any number of interactions from client systems desired, and the timing of those interactions may be manipulated and controlled to achieve any desired testing parameters. It is further noted that periodically new load and breakdown statistics will be provided for capacity planning.
- Polling. Clients poll the server with a given frequency. The server instructs them to start or stop running the module and can provide other instructions as part of the polling response communications. The number of clients running the module can be adjusted dynamically during the life of the project.
- Non-polling. All cued clients start running the module. The start times can be based on a distribution specified over a “startup period.” Examples of distributions that might be specified are uniform, random, and Poisson. If the startup period has zero duration, the cued clients are started simultaneously.
It is noted that more complicated schemes could be implemented, if desired. It is also noted that although this dynamic coordination and control architecture is particularly useful in supporting website quality of service and load testing, this architecture can more generally be utilized for other projects, if desired.
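As one hedged sketch of the two startup schemes described above, a client-side start-time decision might be implemented along the following lines, using a uniform distribution for the non-polling case; the parameter names (poll_period_sec, startup_start_time, startup_end_time) match the execution attributes listed later, and the remaining details are assumptions.

#include <cstdlib>
#include <ctime>

// Decide when a cued client should start the test module (illustrative only).
// A poll period of zero selects the non-polling scheme; otherwise the client
// waits for an explicit start instruction carried in a poll response.
time_t choose_start_time(int poll_period_sec,
                         time_t startup_start_time,
                         time_t startup_end_time)
{
    if (poll_period_sec > 0) {
        // Polling scheme: the poll server decides when this client starts,
        // so no local start time is computed here.
        return startup_start_time;
    }
    // Non-polling scheme: pick a start time uniformly over the startup period.
    // A zero-length startup period starts all cued clients simultaneously.
    time_t span = startup_end_time - startup_start_time;
    if (span <= 0) {
        return startup_start_time;
    }
    return startup_start_time + std::rand() % span;
}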
Looking first to
As discussed above, the server systems 104 can be connected to and configured to utilize a variety of databases, as desired. These databases can also store information, as needed, that is related to the dynamic coordination and control of tasks and results data. In the embodiment of
The poll server 502 is provided to allow the control server 504 to off-load much of its management tasks for site testing activities during operation of the tests on the participating client systems. As shown in the example embodiment of
The project information and project control information can take any of a variety of forms depending upon the nature of the project being run and the nature of the management and scheduling control desired. For example, as part of the initial project setup or control information provided to the client systems, the client systems can be given poll parameters, such as a poll period, a test start time and a test end time. The poll period refers to information that determines when the client system will communicate with the poll server 502. For example, the poll period information can define a regular time interval, scheduled times or defined times at which the client systems communicate with the poll server 502 to provide project information such as status of the project on the client system, partial result data, local clock information, or any other desired project related data or information that may be utilized by the poll server 502 to help manage and coordinate the project operations of the various different client systems. If the poll period is zero, the client system can simply run the project from its start time to finish time without polling the poll server 502. The poll server 502 can send back information such as clock synchronization information, project instructions, poll period changes, or any other desired instructions or information, as desired to manage and coordinate the activities of the client systems conducting the project processing.
A control interface 509 can also be provided. The control interface 509 allows someone formulating and running a project to communicate through link 511 with the control server 504 and the poll server 502. The control interface 509 can provide a variety of functional controls and information to a user of the interface, such as coordination tools, project overview information, project processing status, project snapshot information during project operations, or other desired information and/or functional controls. For example, with respect to a network site testing project, a tester can use this interface 509 to create the test scripts that are included within the work units that are sent to client systems participating in the test and could set and adjust the poll parameters that are to be used by each client system. The control interface 509 is also used over the duration of the test to view dynamic snapshot information about the current state of the test, including the load on the system, and to use this information to modify test activities such as the number of active clients participating in the test. The broken line 507 represents a demarcation between the servers 502 and 504 and the interface 509. It is noted that the interface 509 could take any of a variety of forms and that the interface 509 can be remote or disconnected from the server systems 104 (which in
Looking back to
If the poll period is greater than zero, then the client agent running the test project code will poll the poll server 502 at periodic intervals. The poll communications that are received from the client systems in block 562 can include a wide variety of information, as desired. These client system communications, for example, can provide information about the current project operations of the client systems and partial test results for the project. In response to the poll communications from the client systems, the poll server 502 can modify test, load and poll parameters as desired in block 564 to manage, control and coordinate the test activities of the client systems. In decision block 560, the determination is made whether the test end time has been reached. If “NO,” then the test continues in block 558. If “YES,” then the test ends in block 566. Test results can then be reported, for example, by being sent from the client systems to the control server 504 for compilation and further processing, as desired. The final results can be stored in a results database 510 and can be provided to the customer that requested or sponsored the site testing project. It is noted that the “load” parameter includes the load on the site under test (SUT), and a change to the load could include increasing or decreasing the number of client systems active in the test project. It is also noted that the poll period can be relatively simple, such as a regular time interval at which the client system communicates with the poll server 502, or more complicated, such as a time interval that changes based upon some condition or criteria, or a communication that occurs after a certain event or events during the test processing, such as each time a test routine is completed. In other words, any of a variety of procedures or algorithms could be utilized, as desired, to set the polling activity of the client systems, and each client system could be set to have unique polling instructions.
As stated above, in one example operation, a goal of the poll server 502 and control server 504 is to coordinate a multitude of clients interconnected over the Internet (or other unbounded network) to conduct a project such as load testing a web site. Some advantageous features of this design are the ability to select clients for the load test based on client characteristics, capabilities, components and attributes, and the ability to dynamically alter the number of active clients actively participating in the test. This is an improvement on the prior techniques where the client systems were typically simulated on a small number of test machines, leading to less accurate results. Other coordinated applications that can use this method of control include measuring the quality of service (QoS) of a site under test.
As shown in
- 1. Dynamic coordination and control of a load test is initiated by sending a create command to the server with information about the time, duration, size and type of the test. The following parameters are specified (an illustrative parameter record is sketched following this list):
- a. Start and end time of the test. The start time is usually specified at some time in the future.
- b. Test script to be run by each client. The scripts can be identical or can be randomized to represent the behavior of several web users.
- c. Specification of number and mix of clients desired. The mix of clients can be based on client geography, machine type, or bandwidth.
- d. Initial number of clients to run
- 2. The server attempts to cue clients for the load test based on the specified mix. All cued clients are sent the following information:
- a. Start and end time of the test
- b. Test script to be run
- c. Poll interval, the interval between successive times when the client contacts (polls) the poll server.
- 3. A control interface or web console 509 is used by the person or developer conducting the test to set parameters for the test and view dynamic statistics as the test progresses.
- 4. After the requested or required number of clients has been cued, the test is ready to begin. At the specified start time, all cued clients contact the poll server for instructions. The poll server 502 tracks the state of each client and is able to estimate the total number of clients available, and the number of clients currently running the test script.
- 5. The target number of running clients can be modified dynamically during the test. A typical usage would be to start the test with a small number of running clients, and then gradually increase the number of running clients, thus increasing the load on the web site. The poll server attempts to adjust the number of running clients to match the target. If the target is increased, the poll server would instruct additional clients to join in the test. To stop the test, the target number of running clients is set to zero. The polling mechanism also allows the system to recover from client failures during a test. In this case, the poll server can detect a client failure and activate another client to take its place in the test.
- 6. The client passes dynamic results to the poll server during each poll. The dynamic statistics include throughput, hits per second and errors found. These statistics are combined to give a snapshot view of the current performance of the web site under test. This snapshot information can be used by the tester to modify the test parameters (number of active clients, poll interval, etc.) or even to stop the test if the desired load level has been reached.
- 7. Upon completion of the test, all participating clients send back detailed statistics from the test, which are aggregated and presented to the person conducting the test.
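For illustration, the create command of step 1 might carry a parameter record along the following lines; the structure and field names are assumptions made for this sketch rather than a required format.

#include <string>
#include <vector>
#include <ctime>

// Illustrative attribute/quota pair used to specify the client mix (step 1c).
struct MixQuota {
    std::string attr_type;    // e.g., "country", "OS", "bandwidth"
    std::string value;        // e.g., "Canada", or "*" as a wildcard
    int         quota;        // maximum number of cued clients with this value
};

// Illustrative parameter record carried by the create command (step 1).
struct CreateTestRequest {
    std::time_t           start_time;         // 1a: start of the test
    std::time_t           end_time;           // 1a: end of the test
    std::string           test_script;        // 1b: script to be run by each client
    std::vector<MixQuota> client_mix;         // 1c: mix of clients to cue
    int                   nhosts_cue;         // 1c: number of clients to cue
    int                   initial_running;    // 1d: initial number of clients to run
    int                   poll_interval_sec;  // 2c: interval between client polls
};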
This coordinated testing architecture could be used for other network site testing operations. For example, it can be used for quality of service (QoS) testing, where the typical goal is to be able to measure response times at Internet connected desktops in order to gauge the user experience when browsing a website (e.g., the site under test (SUT)). The number of active clients selected for QoS testing is typically much smaller than the number for load testing, but the selected active clients are typically spread across the network (e.g., geographically, and by ISP). Each client periodically runs a project workload script making HTTP commands to one or more websites and measures the response times from each. These summarized results are returned to the poll server 502, which aggregates results across all active clients and generates reports for each website being tested. The active clients in this case typically do not, by themselves, add significant load to the SUT. The load on the SUT is the normal load generated by browsing on the Internet. The active clients are merely providing performance measurement data at a wide variety of points across the Internet, and their results tend to provide a true reflection of what a person browsing on his desktop would see when interacting with the SUT. For example, QoS testing can identify performance bottlenecks over time by geography, ISP, machine type, system type or other related factors. For instance, a website might be able to determine that response times at night to machines within a major ISP are much longer than the mean response time.
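A minimal sketch of the per-client QoS measurement, assuming a hypothetical helper fetch_url rather than any particular HTTP library, could look like the following; the summarized timings would then be returned to the poll server 502 with the next poll.

#include <chrono>
#include <string>

// Assumed helper (not a real library call): issues an HTTP GET to the given URL
// and returns true on success. Any HTTP client implementation could be used here.
bool fetch_url(const std::string& url);

// Measure one response time, in milliseconds, for the site under test (SUT),
// returning a negative value on error (illustrative only).
double measure_response_time_ms(const std::string& sut_url)
{
    auto start = std::chrono::steady_clock::now();
    bool ok = fetch_url(sut_url);
    auto end = std::chrono::steady_clock::now();
    if (!ok) {
        return -1.0;
    }
    std::chrono::duration<double, std::milli> elapsed = end - start;
    return elapsed.count();
}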
There are a number of advantages that are provided by the poll server architecture of the present invention. For example, where the network is the Internet, it is expected that the set of clients on the Internet are non-dedicated resources. Thus, there is desirably a mechanism to keep track of the current state of each client system. This task is difficult to accomplish in an efficient and reasonable manner by the dispatch or control server alone, which is also responsible for scheduling distributed computing work to all other clients in the distributed computing network. One method for getting the state of a client machine is to have a listening port on the client, which is queried by the server to get status information. In other words, instead of the polling by the client system to the poll server as indicated above, the poll server could initiate contact to each client system. However, due to the reluctance of information technology managers, individual PC owners, and others who control client systems to have open ports on their machines, the alternative where the client system periodically communicates with the poll server to send summary status information and to receive test instructions is likely a method that is more widely acceptable. It is noted that the poll server 502 and the dispatch/control server 504 can each be one or more server systems that operate to perform desired functions in the dynamic coordination and control architecture. It is also again noted that the poll server 502 and control server 504 could be combined if desired into a single server system or set of systems that handles both roles. However, this would likely lead to a more inefficient operation of the overall distributed processing system.
As discussed above, a poll server 502 can be used to offload the polling connections from the main server 504. (The poll requests can be short, unencrypted, unauthenticated, single-turnaround requests from the client agent running on each client system.) Without the separate poll server, there are communication requirements that would likely reduce the performance of the distributed computing platform, for example, the number of database queries that can be handled at a given time and the number of connected client systems at a given time. This architecture of the present invention helps to improve performance by offloading the work of handling agent poll requests to another server. It is noted, however, that the present invention could still be utilized without offloading the polling functions, if this were desired. In general, the polling server 502 can be designed to open a single connection to a database to retrieve information about active schedex records. Periodically, the poll server 502 can use this database connection to refresh and update current running count information. On each agent poll request, the poll server 502 uses data structures in memory to determine whether the client system should start, stop, or terminate.
The client systems can make the polling connection to the server using TCP. However, UDP could be utilized to reduce the overhead inherent in TCP connection establishment. If the agent has a proxy configured, however, then UDP will likely not work. Otherwise, UDP could be tried, and if no response were received, TCP could be used as a fall back communication protocol. When the agent receives a new schedex record, one of the attributes can be the address of a polling server where the client will send poll requests. If this is not specified, the agent can fall back to using the main server address. It is noted, however, that in the latter case a different port would preferably be utilized on the main server, because the polling server function is best viewed as a separate process from the main server function.
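A hedged sketch of the protocol fall-back just described, with the transport helpers udp_poll and tcp_poll assumed for illustration rather than taken from any particular socket library, might be:

#include <string>

// Assumed transport helpers (illustrative only): each sends a single poll request
// and fills in the response, returning false on failure or timeout.
bool udp_poll(const std::string& server, int port,
              const std::string& request, std::string& response);
bool tcp_poll(const std::string& server, int port,
              const std::string& request, std::string& response);

// Poll the poll server, preferring UDP to avoid TCP connection-establishment
// overhead. If a proxy is configured, or if no UDP response arrives, fall back to TCP.
bool send_poll(const std::string& server, int port, bool proxy_configured,
               const std::string& request, std::string& response)
{
    if (!proxy_configured && udp_poll(server, port, request, response)) {
        return true;
    }
    return tcp_poll(server, port, request, response);
}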
In a more-generalized environment, where the server systems include multiple dispatch servers, each responsible for a different set of project applications, the poll server could have a broader function of tracking outstanding messages for delivery to clients the next time they contact the poll server. Periodic polling by client systems can improve the responsiveness of the system. For example, if the person conducting the test stops a project currently running on the distributed computing system, the poll server can obtain a list of all client systems processing work on behalf of the project and its workloads and can instruct these client systems to stop the currently executing workload and return to the dispatch server to get a new piece of work. In addition, high priority jobs entering the system can be immediately serviced by having the poll server draft clients from a client system resource pool by issuing a preempt call to the client at the next poll. This preempt call would preempt all pending work being done by the client system and would start operation of the high priority job on the selected client systems.
EXAMPLE IMPLEMENTATION DETAILS
To further describe the dynamic coordination and control architecture of the present invention (referred to below in relation to a scheduled execution (schedex) project), example polling procedures, poll communications, initialization parameters, test parameters, management, coordination and control procedures and associated function calls are now discussed.
A scheduled execution (schedex) project can also have associated with it a variety of polling and related test parameters. For example, the following attributes can be provided (an illustrative record collecting these attributes follows the list):
- poll_period_sec—How frequently (in seconds) clients should poll the server while they are running. This determines how long until control actions take effect (see below). Zero for a non-polling execution.
- IDs—task and workunit IDs
- startup_start_time—The beginning of the startup period.
- startup_end_time—The end of the startup period (defined only for a non-polling execution).
- end_time—The end of the execution period. Any clients still running at this time will be gracefully terminated.
- nhosts_cue—How many hosts to cue. NOTE: the server attempts to choose hosts that are likely to be running during the execution period, but not all of them actually will be. So the maximum number of running hosts may be less than this.
- nrunning_target—how many hosts should run the module (defined only for a polling execution).
- state—The example states are “being edited”, “activated”, “running”, and “completed”.
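Collected into a single record for illustration, and using assumed C++ types, these attributes might be represented as follows:

#include <ctime>

// Illustrative states for a scheduled execution project.
enum class SchedexState { BeingEdited, Activated, Running, Completed };

// Illustrative record collecting the schedex attributes listed above.
struct SchedexAttributes {
    int          poll_period_sec;     // zero for a non-polling execution
    int          task_id;             // task ID
    int          workunit_id;         // workunit ID
    std::time_t  startup_start_time;  // beginning of the startup period
    std::time_t  startup_end_time;    // end of the startup period (non-polling only)
    std::time_t  end_time;            // end of the execution period
    int          nhosts_cue;          // how many hosts to cue
    int          nrunning_target;     // target number of running hosts (polling only)
    SchedexState state;
};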
A scheduled execution project can further define client type quotas for the number of cued client systems possessing particular attribute values. The attribute types can include any of a variety of client capabilities, attributes and components as discussed above; for example, with respect to personal computers, the attributes can include geographic location such as country, device operating system, and downstream bandwidth. The client system type quotas can be used to limit the client systems to which the server systems distribute the scheduled execution project. For each quota, the server system can maintain a counter of the number of client systems with that attribute that have been cued so far to participate in the particular scheduled execution project. Client systems can be considered in a non-deterministic order. For each client system, the UD server checks whether the counters for the client system's particular attributes are less than the corresponding quotas. If so, the scheduled execution project is cued on that client system. These selection parameters can be used to accomplish various goals. Some examples are provided below.
For example, suppose that the number of client systems (or hosts) to cue is 1000, such that nhosts_cue=1000.
If the tester wants at least 50% of the hosts to be from Canada, the following could be used:
<attr_type=“country”, value=“Canada”, quota=1000>
<attr_type=“country”, value=“*”, quota=500>
If you want exactly 50% each from Canada and Poland, use
<attr_type=“country”, value=“Canada”, quota=500>
<attr_type=“country”, value=“Poland”, quota=500>
<attr_type=“country”, value=“*”, quota=0>
If, in addition, you want only Windows computers, use
<attr_type=“country”, value=“Canada”, quota=500>
<attr_type=“country”, value=“Poland”, quota=500>
<attr_type=“country”, value=“*”, quota=0>
<attr_type=“OS”, value=“Win95”, quota=1000>
<attr_type=“OS”, value=“Win98”, quota=1000>
<attr_type=“OS”, value=“WinNT”, quota=1000>
<attr_type=“OS”, value=“*”, quota=0>
It is noted that the above parameter system may not be able to express some requirements, such as a requirement that at least 25% of the clients are from one country and at least 25% are from another. However, if desired, additional execution parameters could be added to provide such capability. It is also noted that the client system type quotas discussed above may be designed such that they affect the set of hosts on which the scheduled execution project is cued and not the hosts on which the project actually runs. For example, client systems could be chosen to run the scheduled execution project essentially randomly, so the properties of the set of running hosts will generally approximate those of the set of cued hosts; however, they may not match exactly. There may be exceptions; for example, if the scheduled execution project is scheduled at a time when most hosts in Poland are turned off, the fraction of running Polish hosts may be smaller than desired.
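The quota check described above might be sketched as follows; the data layout, the treatment of the "*" wildcard as a single collective quota for unlisted values, and the handling of attribute types without quotas are all assumptions made for this illustration.

#include <map>
#include <string>
#include <utility>

// Quotas and counters keyed by (attribute type, attribute value); the value "*"
// acts as a single collective quota for any value not listed explicitly.
using AttrKey = std::pair<std::string, std::string>;

// Does any quota entry exist for this attribute type?
bool type_has_quota(const std::map<AttrKey, int>& quotas, const std::string& type)
{
    for (const auto& q : quotas) {
        if (q.first.first == type) return true;
    }
    return false;
}

// The quota entry governing this attribute value: the exact value if listed,
// otherwise the "*" wildcard entry for that attribute type.
AttrKey governing_key(const std::map<AttrKey, int>& quotas,
                      const std::string& type, const std::string& value)
{
    return quotas.count({type, value}) ? AttrKey{type, value} : AttrKey{type, "*"};
}

// Returns true, and bumps the governing counters, if cueing this client would not
// exceed any quota; attribute types with no quotas do not constrain the client.
bool try_cue_client(const std::map<std::string, std::string>& client_attrs,
                    const std::map<AttrKey, int>& quotas,
                    std::map<AttrKey, int>& counters)
{
    for (const auto& attr : client_attrs) {
        if (!type_has_quota(quotas, attr.first)) continue;
        AttrKey key = governing_key(quotas, attr.first, attr.second);
        auto q = quotas.find(key);
        int limit = (q == quotas.end()) ? 0 : q->second;
        if (counters[key] >= limit) return false;
    }
    for (const auto& attr : client_attrs) {
        if (!type_has_quota(quotas, attr.first)) continue;
        ++counters[governing_key(quotas, attr.first, attr.second)];
    }
    return true;
}

Applied to the Canada example above (quota 1000 for Canada, quota 500 for the "*" wildcard), at most 500 non-Canadian hosts would be cued, so at least half of the 1000 cued hosts would be Canadian.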
The control or console interface 509, which can be an Internet web interface, can be configured to allow a variety of tasks, including (1) creating, editing and activating a scheduled execution project, (2) controlling a scheduled execution project while it is running by viewing and adjusting the number of clients running the scheduled execution project (if polling by client systems is implemented, these adjustments will likely have a certain lag time associated with the poll period until they go into effect), and (3) marking a scheduled execution project as “completed” to stop operation on all running clients. Alternatively, the same operations are available as HTTP RPCs (Remote Procedure Calls).
The scheduled execution architecture of the present invention lends itself to a variety of implementations. Example implementation and operation details are provided below with respect to function calls and operations that may be utilized to realize the present invention.
It is noted that this is an example operation to create and activate a scheduled execution project for a given task. Times are given in seconds. The return value “status” is “OK” if the operation succeeded, else a description of the error.
It is noted that this operation requests a change in the number of clients running the scheduled execution project. If client system polling is utilized, it will typically take up to “poll_period” seconds for this target to be reached. If the number is increased, additional clients (cued but not yet running) are started. If the number is decreased, the application is gracefully terminated on some hosts, creating a result file on each host. If the application is later started on the host, additional result files will be created.
It is noted that the scheduled execution project is gracefully terminated on all hosts. In this example, no further operations on the scheduled execution project are allowed. The transfer of result files to the server systems is started.
It is noted that this operation returns the number of clients cued to run the scheduled execution project, the number currently running it, and the number of clients available to run it (i.e., that are actively polling the server). The latter two numbers are defined only for a scheduled execution project where client system polling is utilized.
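The four operations described in the preceding paragraphs could be exposed, for example, through declarations along the following lines; the names, signatures and status convention are assumptions for this sketch and not a required API.

#include <string>

// Illustrative status convention: "OK" on success, otherwise a description of the error.
using Status = std::string;

// Counts reported for a scheduled execution project.
struct SchedexCounts {
    int ncued;       // clients cued to run the schedex
    int nrunning;    // clients currently running it (polling schedex only)
    int navailable;  // clients actively polling the server (polling schedex only)
};

// Create and activate a scheduled execution project for a given task; times in seconds.
Status schedex_create(int task_id, int poll_period_sec, long start_time,
                      long end_time, int nhosts_cue, int nrunning_target);

// Request a change in the number of clients running the schedex; with polling,
// the new target may take up to poll_period seconds to be reached.
Status schedex_set_nrunning_target(int schedex_id, int nrunning_target);

// Gracefully terminate the schedex on all hosts and begin transferring result files.
Status schedex_terminate(int schedex_id);

// Report the cued, running, and available client counts for the schedex.
Status schedex_get_status(int schedex_id, SchedexCounts& out);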
Scheduled Execution (Schedex) Protocol
Regular (<request>) RPCs can include the following item in both requests and replies.
The client tells the server what schedex workloads are currently cued. The server gives the client new schedex workloads to cue.
Clients with a cued, active polling schedex periodically make the following RPC:
It is noted that <schedex_stop/> tells the client to stop a running schedex, <schedex_start/> tells the client to start a cued schedex, and <schedex_terminate/> says to stop a schedex if running and delete it.
Database
The schedex table, in addition to the schedex attributes, can include the following:
The schedex_host table stores hosts on which the schedex is cued.
(It is noted that the number of running clients can be found by counting the number of records with “running” set.)
The schedex_quota table stores quotas:
Server
The server maintains in-memory copies of the schedex and schedex_quota tables.
GLOBALS::check_schedex(CLIENT_CONN&cc)
When the server handles a <request> RPC, and there is a schedex with ncued < nhosts_cue, and the host is of eligible type and not barred by user preferences from running the schedex, and doesn't already have an overlapping schedex, and no quotas are exceeded, the server sends the host that schedex. If the schedex is polling, it creates a schedex_host record. It updates and reloads the schedex and schedex_quota entries.
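A hedged sketch of this selection logic, with the eligibility predicates and record-keeping helpers assumed rather than shown, might read:

#include <vector>

// Abbreviated illustrative types; the full schedex attributes are described above.
struct Host { int id; };
struct Schedex { int id; int ncued; int nhosts_cue; int poll_period_sec; };

// Assumed helpers for this sketch, implemented elsewhere.
bool host_is_eligible_type(const Host&, const Schedex&);
bool barred_by_user_preferences(const Host&, const Schedex&);
bool has_overlapping_schedex(const Host&, const Schedex&);
bool quota_exceeded(const Host&, const Schedex&);
void send_schedex_to_host(Host&, Schedex&);
void create_schedex_host_record(const Host&, const Schedex&);
void update_and_reload_schedex_tables(Schedex&);

// Illustrative body for the check performed while handling a <request> RPC.
void check_schedex(Host& host, std::vector<Schedex>& schedexes)
{
    for (Schedex& sx : schedexes) {
        if (sx.ncued >= sx.nhosts_cue) continue;               // already fully cued
        if (!host_is_eligible_type(host, sx)) continue;        // wrong client type
        if (barred_by_user_preferences(host, sx)) continue;    // user opted out
        if (has_overlapping_schedex(host, sx)) continue;       // time conflict on this host
        if (quota_exceeded(host, sx)) continue;                 // client mix quota reached

        send_schedex_to_host(host, sx);                         // cue the schedex on this host
        if (sx.poll_period_sec > 0) {
            create_schedex_host_record(host, sx);               // track this polling host
        }
        update_and_reload_schedex_tables(sx);                   // refresh counters and copies
    }
}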
CLIENT_CONN::handle_schedex_poll( )
When a <schedex_poll_request> RPC is received, the server looks up the schedex_host record. If not found it returns a <schedex_terminate> (this should never happen). If the client is running this module, and number of running hosts is more than nrunning_target, the server returns a <schedex_stop> and clears the running field in the schedex_host record. Similarly, if the client is not running this module and the number of running hosts is less than nrunning_target, the server returns a <schedex_start> and sets the running field in the schedex_host record. In any case it updates the “last poll time” field in the DB.
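For illustration, the decision made for each poll might be sketched as follows; the record layout and the way the running count is tracked are assumptions, and the returned string simply names the command placed in the poll response.

#include <ctime>
#include <string>

// Abbreviated illustrative records for this sketch.
struct SchedexEntry { int nrunning; int nrunning_target; };
struct SchedexHostRecord { bool found; bool running; std::time_t last_poll_time; };

// Illustrative handling of one <schedex_poll_request> RPC.
std::string handle_schedex_poll(SchedexEntry& sx, SchedexHostRecord& host,
                                bool client_is_running)
{
    if (!host.found) {
        return "<schedex_terminate/>";            // no record: stop and delete the schedex
    }
    host.last_poll_time = std::time(nullptr);      // always update the last poll time

    if (client_is_running && sx.nrunning > sx.nrunning_target) {
        host.running = false;                      // too many hosts running
        --sx.nrunning;
        return "<schedex_stop/>";
    }
    if (!client_is_running && sx.nrunning < sx.nrunning_target) {
        host.running = true;                       // too few hosts running
        ++sx.nrunning;
        return "<schedex_start/>";
    }
    return "";                                     // no command in this poll response
}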
GLOBALS::schedex_timer( )
Each server periodically enumerates all schedex_host records with the “running” flag set and “poll deadline” < now - poll_period, and clears the “running” flag. When a schedex end_time is reached, each server changes the state to “ended” and clears the “running” flag of all schedex_host records. It is noted that in principle the above tasks can be accomplished by one server, but it may be better for all servers to do them.
Client
The client stores a list of pending schedex workloads in memory and in the core state file. It also may have variables, such as:
When a polling schedex becomes active, the client sets the polling timer randomly in the interval [now, now + polling_period].
INSTANCE::schedex_timer_func( )
The client maintains a polling timer for each active polling schedex. When this reaches zero, it sends a poll RPC. If the schedex remains active, it resets the timer. When a nonpolling schedex becomes active, the client picks a start time randomly in the startup period. When the end time of a schedex is reached, the client stops it (if running) and removes it from the data structure. If no other cued schedex references the same workunit, it removes the workunit.
Data Structures
The polling server maintains a list of “active” schedex records and the current number of hosts running that schedex task:
This list is indexed by schedex identification. Schedex records will be added and removed infrequently, but there will be one lookup on this table per poll request.
The SchedexHostList is a list of hosts that are currently running the schedex task. The list consists of records containing the following information:
This list is indexed by host identification. Hosts will be added once during the lifetime of the schedex task, and removed en masse at the end of the schedex. There will be one lookup on this table per poll request.
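The specific record contents are not enumerated here, so the following data structures are a hedged reconstruction consistent with the surrounding description; the field choices are assumptions.

#include <ctime>
#include <map>

// Illustrative per-host record kept by the polling server.
struct SchedexHostEntry {
    bool        is_running;      // whether the host is currently running the schedex task
    std::time_t last_poll_time;  // when the host last polled the server
};

// Per-host records, indexed by host identification (one lookup per poll request).
using SchedexHostList = std::map<int, SchedexHostEntry>;

// Illustrative record for an active schedex and its current running count.
struct ActiveSchedex {
    int             nrunning_target;  // desired number of running hosts
    int             running_count;    // number of host records with is_running set
    std::time_t     end_time;         // end of the execution period
    SchedexHostList hosts;            // hosts cued for this schedex
};

// Active schedex records, indexed by schedex identification.
using ActiveSchedexList = std::map<int, ActiveSchedex>;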
Poll Requests
Each poll request contains the following information:
Schedex id
Host id
Agent's is_running flag
Each poll response can contain zero or one of the following commands:
- (no command)—tells the agent to keep its current running state and continue to poll.
- <schedex_start>—tells the agent to start running the schedex task.
- <schedex_stop>—tells the agent to stop running the schedex task, but continue to poll.
- <schedex_terminate>—tells the agent to stop running the schedex task, remove the schedex record, and no longer poll.
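A sketch of the poll request fields and reply commands listed above, together with a possible agent-side dispatch, is shown below; the type and function names are illustrative only.

```cpp
// Sketch of the poll request fields, the reply commands, and the agent-side
// dispatch on the reply.
struct SchedexPollRequest {
    long schedex_id;
    long host_id;
    bool is_running;     // the agent's current running state
};

enum class SchedexPollReply { None, Start, Stop, Terminate };

struct Agent {
    void start_schedex(long)  {}  // stub: begin running the schedex task
    void stop_schedex(long)   {}  // stub: stop running, keep polling
    void remove_schedex(long) {}  // stub: drop the schedex record, stop polling
};

void handle_poll_reply(Agent& agent, long schedex_id, SchedexPollReply reply) {
    switch (reply) {
    case SchedexPollReply::Start:     agent.start_schedex(schedex_id);  break;
    case SchedexPollReply::Stop:      agent.stop_schedex(schedex_id);   break;
    case SchedexPollReply::Terminate: agent.stop_schedex(schedex_id);
                                      agent.remove_schedex(schedex_id); break;
    case SchedexPollReply::None:      break;  // keep current state, keep polling
    }
}
```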
Operation
On each poll request, the server performs the following sequence of operations:
An invariant after this operation is that the running_count for the schedex should match the number of host records where the is_running flag is set.
The poll server also runs a background process that periodically (every 10 seconds, or perhaps more often) performs the following operations:
- If the schedex poll server crashes, recovery is performed by loading all the schedex records from the database where the current time is greater than or equal to the start time, but less than the end time. These records will contain the running hosts count from the last periodic update. This procedure should happen every time the server is started, so there is no need to detect whether the previous run of the server crashed or not.
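A sketch of the recovery step described above is shown below: on every startup the server reloads each schedex whose time window covers the current time and rebuilds the active list from the stored running-host counts. The SQL text and column names are assumptions.

```cpp
// Sketch of poll-server startup recovery: build the query that reloads all
// schedexes whose [start_time, end_time) window covers "now".
#include <string>

std::string recovery_query(long now) {
    return "SELECT id, nrunning_target, running_count "
           "FROM schedex "
           "WHERE start_time <= " + std::to_string(now) +
           " AND end_time > "     + std::to_string(now) + ";";
}
// Running this unconditionally on every start means no crash detection is
// needed; the active list is simply rebuilt from the last periodic update.
```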
Further modifications and alternative embodiments of this invention will be apparent to those skilled in the art in view of this description. It will be recognized, therefore, that the present invention is not limited by these example arrangements. Accordingly, this description is to be construed as illustrative only and is for the purpose of teaching those skilled in the art the manner of carrying out the invention. It is to be understood that the forms of the invention herein shown and described are to be taken as the presently preferred embodiments. Various changes may be made in the implementations and architectures for database processing. For example, equivalent elements may be substituted for those illustrated and described herein, and certain features of the invention may be utilized independently of the use of other features, all as would be apparent to one skilled in the art after having the benefit of this description of the invention.
Claims
1. A method of providing dynamic coordination of distributed client systems in a distributed computing platform, comprising:
- providing at least one server system coupled to a network;
- providing a plurality of network-connected distributed client systems, the client systems having under-utilized capabilities and running a client agent program to provide workload processing for at least one project of a distributed computing platform;
- utilizing the server system to distribute workloads for the at least one project to the client systems and to distribute initial project and poll parameters to the client systems;
- receiving poll communications from the client systems during processing of project workloads by the client systems, wherein a dynamic snapshot information of current project status is provided based at least in part upon the poll communications;
- analyzing the poll communications to determine whether or not to make one or more modifications to the initial project and poll parameters, wherein the modifications to the initial project and poll parameters utilize the dynamic snapshot information to determine whether to change how many client systems are active in the at least one project, and if a fewer number is desired, including within a polling response communications a reduction in the number of actively participating clients, and if a greater number is desired, adding client systems to active participation in the at least one project;
- sending the poll response communications to the client systems to modify the initial project and poll parameters depending upon one or more decisions reached in the analyzing step; and
- repeating the receiving, analyzing and sending steps to dynamically coordinate project activities of the plurality of client systems during project operations.
2. The method of claim 1, wherein the initial project and poll parameters comprise a poll period setting for each client system that determines when the client system will poll the server system.
3. The method of claim 2, wherein the poll period settings for a plurality of the client systems are the same.
4. The method of claim 1, wherein the poll communications from the client systems comprise current project status information.
5. The method of claim 1, wherein the client systems participating in the at least one project are assigned as active client systems and on-hold client systems, such that the active client systems actively process the project workloads and the on-hold client systems form an on-hold pool of client systems that are capable of being added to active participation.
6. The method of claim 5, wherein the client systems added to active participation in the at least one project are selected from the on-hold pool, and wherein client systems removed from active participation in the at least one project are added to the on-hold pool.
7. The method of claim 1, wherein the network comprises the Internet.
8. The method of claim 1, wherein the at least one project comprises network site testing and the dynamic snapshot information comprises current load on a network site under test (SUT).
9. The method of claim 1, wherein the at least one server system comprises at least one control server and at least one poll server system, the poll server system operating to handle the poll communication with the client systems.
10. The method of claim 1, wherein the at least one project comprises network site testing.
11. The method of claim 10, wherein the site testing is quality of service testing, load testing, or denial of service testing, and wherein the site testing is applied to testing content delivery from a network site.
12. The method of claim 10, wherein the initial test and poll parameters comprise a test start time, test stop time and poll period information.
13. The method of claim 1, further comprising identifying attributes for the client systems, storing the attributes in a database, and utilizing the attributes to select the client systems for participation in the at least one project.
14. The method of claim 13, wherein the attributes comprise device capabilities for the client systems.
15. The method of claim 13, wherein the network comprises the Internet, wherein the at least one project comprises network site testing, and wherein the attributes comprise geographic location of the client systems, type of device for the client systems, or operating system used by the client systems.
16. The method of claim 13, wherein the network comprises the Internet, wherein the at least one project comprises network site testing, and wherein the attributes comprise ISP information (Internet Service Provider) for the client systems, or routing information to a site under test for the client systems.
17. The method of claim 1, wherein one of the at least one projects comprises network site testing, and wherein the method further comprises transferring a core agent module and a site testing project module to the client systems, the site testing project module being capable of operating on the core agent module to process site testing workloads.
18. A distributed computing platform having dynamic coordination capabilities for distributed client systems processing project workloads, comprising:
- a plurality of network-connected distributed client systems, the client systems having under-utilized capabilities;
- a client agent program configured to run on the client systems and to provide workload processing for at least one project of a distributed computing platform; and
- at least one server system configured to communicate with the plurality of client systems through a network to provide the client agent program to the client systems, to send initial project and poll parameters to the client systems, to receive poll communications from the client systems during processing of the project workloads, wherein a dynamic snapshot information of current project status is provided based at least in part upon the poll communications from the client systems, to analyze the poll communications utilizing the dynamic snapshot information to determine whether to change how many client systems are active in the at least one project, wherein if a fewer number is desired, including within a poll response communications a reduction in the number of actively participating clients, and if a greater number is desired, adding client systems to active participation in the at least one project within a poll response communications, the server system repeatedly utilizing the poll communications and the poll response communications to coordinate project activities of the client systems during project operations.
19. The distributed computing platform of claim 18, wherein the initial project and poll parameters comprise a poll period setting that determines when the client system will poll the server system.
20. The distributed computing platform of claim 19, wherein the poll period settings for a plurality of the client systems are the same.
21. The distributed computing platform of claim 18, wherein the poll communications from each client comprise identification, project status information and current poll period setting information.
22. The distributed computing platform of claim 18, further comprising a poll database configured to store poll related information about each client system.
23. The distributed computing platform of claim 22, wherein the at least one server system comprises at least one control server and at least one poll server system, the poll server system being coupled to the poll database and being configured to handle the poll operations of the client systems.
24. The distributed computing platform of claim 23, wherein one of the at least one projects comprises a network site testing project.
25. The distributed computing platform of claim 24, wherein the client agent program comprises a core agent module and a site testing project module, the site testing project module being capable of operating on the core agent module to process site testing workloads.
26. The distributed computing platform of claim 24, wherein the initial project and poll parameters comprise a test start time, a test stop time and poll period information.
27. The distributed computing platform of claim 23, further comprising a control interface configured to communicate with the server system, the control interface allowing coordination of the client systems participating in the network site testing project.
28. The distributed computing platform of claim 27, wherein the poll server is configured to provide dynamic snapshot information through the control interface and to receive modifications to the network testing initial project and poll parameters for ongoing project operations.
29. The distributed computing platform of claim 28, wherein the modifications are configured to include modifications to how many client systems are active in the network site testing project.
30. The distributed computing platform of claim 29, wherein the client systems participating in the at least one project are assigned as active client systems and on-hold client systems, such that the active client systems actively process the project workloads and the on-hold client systems form an on-hold pool of client systems that are capable of being added to active participation.
31. The distributed computing platform of claim 18, further comprising an attributes database coupled to the server system, the database configured to store attributes of the client systems.
32. The distributed computing platform of claim 31, the server system further configured to allow selection of the client systems for project participation based upon identification of desired client system attributes.
33. The distributed computing platform of claim 32, wherein the attributes comprise device capabilities for the client systems.
34. In a server system communicatively coupled to a network via a communication interface, a method of providing dynamic coordination of distributed client systems, the method comprising:
- distributing via the communication interface workloads for at least one project, and initial project and poll parameters, to each of a plurality of client systems that is communicatively connected to the network, each of the plurality of client systems running a client agent program to provide workload processing for the at least one project;
- receiving via the communication interface poll communications from the plurality of client systems during processing of project workloads by the plurality of client systems, the poll communications providing at least part of a dynamic snapshot information of current project status;
- analyzing the poll communications to determine whether or not to make one or more modifications to the initial project and poll parameters, wherein the modifications to the initial project and poll parameters utilize the dynamic snapshot information to determine whether to change how many client systems are active in the at least one project, and if a fewer number is desired, including within a polling response communications a reduction in the number of actively participating clients, and if a greater number is desired, adding client systems to active participation in the at least one project;
- transmitting via the communication interface the poll response communications to the plurality of client systems to modify the initial project and poll parameters depending upon one or more decisions reached in the analyzing step; and
- repeating the receiving, analyzing and transmitting steps to dynamically coordinate project activities of the plurality of client systems during project operations.
35. The method of claim 34, wherein the initial project and poll parameters comprise a poll period setting for each client system that determines when the client system of the plurality of client systems will poll the server system.
36. The method of claim 35, wherein the poll period settings for the plurality of the client systems are the same.
37. The method of claim 34, wherein the poll communications from the plurality of client systems comprise current project status information.
38. The method of claim 34, wherein the client systems participating in at least one project are assigned as active client systems and on-hold client systems, such that the active client systems actively process the project workloads and the on-hold client systems form an on-hold pool of client systems that are capable of being added to active participation.
39. The method of claim 38, wherein the client systems added to active participation in the at least one project are selected from the on-hold pool, and wherein client systems removed from active participation in the at least one project are added to the on-hold pool.
40. The method of claim 34, wherein the network comprises the Internet.
41. The method of claim 34, wherein the at least one project comprises network site testing and the dynamic snapshot information comprises current load on a network site under test (SUT).
42. The method of claim 34, wherein receiving via the communication interface poll communications from the plurality of client systems comprises receiving poll communications at a poll server system of the server system,
- and wherein transmitting via the communication interface the poll response communications to the plurality of client systems comprises sending the poll response communications from the poll server system of the server system.
43. The method of claim 34, wherein the at least one project comprises network site testing.
44. The method of claim 43, wherein the site testing is one of quality of service testing, load testing, and denial of service testing, and wherein the site testing is applied to testing content delivery from a network site.
45. The method of claim 43, wherein the initial test and poll parameters comprise a test start time, test stop time and poll period information.
46. The method of claim 34, further comprising identifying attributes for the plurality of client systems, storing the attributes in a database, and utilizing the attributes to select the client systems for participation in the at least one project.
47. The method of claim 46, wherein the attributes comprise device capabilities for the plurality of client systems.
48. The method of claim 46, wherein the network comprises the Internet, wherein the at least one project comprises network site testing, and wherein the attributes include at least one of: geographic locations of the plurality of client systems, type of device for each of the plurality of client systems, and operating system used by each of the plurality of client systems.
49. The method of claim 46, wherein the network comprises the Internet, wherein the at least one project comprises network site testing, and wherein the attributes include at least one of ISP information (Internet Service Provider) for the plurality of client systems, and routing information to a site under test for the plurality of client systems.
50. The method of claim 34, wherein one of the at least one project comprises network site testing, and wherein the method further comprises transferring a core agent module and a site testing project module to the plurality of client systems, the site testing project module being capable of operating on the core agent module to process site testing workloads.
51. A server system comprising a network interface, the server configured to:
- distribute to each of a plurality of client systems via the network interface workloads for a project that is configured to be carried out by a client agent program executing on each of the plurality of client systems;
- transmit to each of the plurality of client systems via the network interface initial project and poll parameters applicable to workload processing of the project by the client agent program;
- receive via the network interface poll communications indicative of ongoing workload processing of the project by the client agent program executing on each of the plurality of client systems, wherein the poll communications provide at least a partial basis for a dynamic snapshot information of current project status;
- analyze the poll communications utilizing the dynamic snapshot information to make a determination of whether to change a current number of client systems that are active in the project;
- transmit via the network interface a poll response communications, wherein the poll response communications include a reduction in the current number if the determination is to reduce the current number of client systems that are active in the project, and the poll response communications include an increase in the current number if the determination is to increase the current number of client systems that are active in the project;
- repeatedly utilize the poll communications and the poll response communications to coordinate project activities of the client systems during project operations.
52. The server system of claim 51, wherein the initial project and poll parameters comprise a poll period setting that determines when the plurality of client systems will poll the server system.
53. The server system of claim 52, wherein the poll period settings for the plurality of the client systems are the same.
54. The server system of claim 51, wherein the poll communications comprise identification, project status information and current poll period setting information.
55. The server system of claim 51, further comprising a poll database configured to store poll related information about each of the plurality of client systems.
56. The server system of claim 55, further comprising at least one control server and at least one poll server system, the poll server system being coupled to the poll database and being configured to handle poll operations relating to the plurality of client systems.
57. The server system of claim 56, wherein the project comprises a network site testing project.
58. The server system of claim 57, wherein the initial project and poll parameters comprise a test start time, a test stop time and poll period information.
59. The server system of claim 56, further comprising a control interface configured for coordination of a client system participating in the network site testing project.
60. The server system of claim 59, wherein the poll server is configured to provide dynamic snapshot information through the control interface and to receive modifications to the initial project and poll parameters for ongoing project operations.
61. The server system of claim 60, wherein the modifications are configured to include modifications to how many client systems are active in the network site testing project.
62. The server system of claim 61, wherein the plurality of client systems participating in the project are assigned as active client systems and on-hold client systems, such that the active client systems actively process the project workload and the on-hold client systems form an on-hold pool of client systems that are capable of being added to active participation.
63. The server system of claim 51, further comprising an attributes database, the database configured to store attributes of the plurality of client systems.
64. The server system of claim 63, further configured to allow selection of the plurality of client systems for project participation based upon identification of desired client system attributes.
65. The server system of claim 64, wherein the attributes comprise device capabilities for the plurality of client systems.
66. A tangible computer-readable medium having stored thereon computer-executable instructions that, if executed by a server system, cause the server system to perform a method comprising:
- distributing workloads for at least one project, and initial project and poll parameters, to each of a plurality of client systems, each of the plurality of client systems running a client agent program to provide workload processing for the at least one project;
- receiving poll communications from the plurality of client systems during processing of project workloads by the plurality of client systems, the poll communications providing at least part of a dynamic snapshot information of current project status;
- analyzing the poll communications to determine whether or not to make one or more modifications to the initial project and poll parameters, wherein the modifications to the initial project and poll parameters utilize the dynamic snapshot information to determine whether to change how many client systems are active in the at least one project, and if a fewer number is desired, including within a polling response communications a reduction in the number of actively participating clients, and if a greater number is desired, adding client systems to active participation in the at least one project;
- transmitting the poll response communications to the plurality of client systems to modify the initial project and poll parameters depending upon one or more decisions reached in the analyzing step; and
- repeating the receiving, analyzing and transmitting steps to dynamically coordinate project activities of the plurality of client systems during project operations.
4669730 | June 2, 1987 | Small |
4699513 | October 13, 1987 | Brooks et al. |
4815741 | March 28, 1989 | Small |
4818064 | April 4, 1989 | Youngquist et al. |
4839798 | June 13, 1989 | Eguchi et al. |
4893075 | January 9, 1990 | Dierker, Jr. |
4987533 | January 22, 1991 | Clark et al. |
5056019 | October 8, 1991 | Schultz et al. |
5031089 | July 9, 1991 | Takashima |
5332218 | July 26, 1994 | Lucey |
5402394 | March 28, 1995 | Turski |
5483444 | January 9, 1996 | Heintzeman et al. |
5594792 | January 14, 1997 | Chouraki et al. |
5598566 | January 28, 1997 | Pascucci et al. |
5655081 | August 5, 1997 | Bonnell et al. |
5659614 | August 19, 1997 | Bailey, III |
5703949 | December 30, 1997 | Rosen |
5740231 | April 14, 1998 | Cohn et al. |
5740549 | April 1998 | Reilly et al. |
5761507 | June 2, 1998 | Govett |
5768504 | June 16, 1998 | Kells et al. |
5768532 | June 16, 1998 | Megerian |
5790789 | August 4, 1998 | Suarez |
5793964 | August 11, 1998 | Rogers et al. |
5802062 | September 1, 1998 | Gehani et al. |
5806045 | September 8, 1998 | Biorge et al. |
5815793 | September 29, 1998 | Ferguson |
5826261 | October 20, 1998 | Spencer |
5826265 | October 20, 1998 | Van Huben et al. |
5832411 | November 3, 1998 | Schatzmann et al. |
5842219 | November 24, 1998 | High, Jr. et al. |
5848415 | December 8, 1998 | Guck |
5862325 | January 19, 1999 | Reed et al. |
5881232 | March 9, 1999 | Cheng et al. |
5884072 | March 16, 1999 | Rasmussen |
5884320 | March 16, 1999 | Agrawal et al. |
5887143 | March 23, 1999 | Saito et al. |
5893075 | April 6, 1999 | Plainfield et al. |
5893905 | April 13, 1999 | Main et al. |
5907619 | May 25, 1999 | Davis |
5909540 | June 1, 1999 | Carter et al. |
5911776 | June 15, 1999 | Guck |
5916024 | June 29, 1999 | Von Kohorn |
5918229 | June 29, 1999 | Davis et al. |
5921865 | July 13, 1999 | Scagnelli et al. |
5937192 | August 10, 1999 | Martin |
5953420 | September 14, 1999 | Matyas, Jr. et al. |
5958010 | September 28, 1999 | Agarwal et al. |
5964832 | October 12, 1999 | Kisor |
5966451 | October 12, 1999 | Utsumi |
5970469 | October 19, 1999 | Scroggie et al. |
5970477 | October 19, 1999 | Roden |
5978594 | November 2, 1999 | Bonnell et al. |
5987506 | November 16, 1999 | Carter et al. |
6003065 | December 14, 1999 | Yan et al. |
6003083 | December 14, 1999 | Davies et al. |
6009455 | December 28, 1999 | Doyle |
6014634 | January 11, 2000 | Scroggie et al. |
6014712 | January 11, 2000 | Islam et al. |
6024640 | February 15, 2000 | Walker et al. |
6026474 | February 15, 2000 | Carter et al. |
6052584 | April 18, 2000 | Harvey et al. |
6052785 | April 18, 2000 | Lin et al. |
6058393 | May 2, 2000 | Meier et al. |
6061660 | May 9, 2000 | Eggleston et al. |
6065046 | May 16, 2000 | Feinberg et al. |
6070190 | May 30, 2000 | Reps et al. |
6076105 | June 13, 2000 | Wolff et al. |
6078953 | June 20, 2000 | Vaid et al. |
6094654 | July 25, 2000 | Van Huben et al. |
6098091 | August 1, 2000 | Kisor |
6112181 | August 29, 2000 | Shear et al. |
6112225 | August 29, 2000 | Kraft et al. |
6112243 | August 29, 2000 | Downs et al. |
6112304 | August 29, 2000 | Clawson |
6115713 | September 5, 2000 | Pascucci et al. |
6128644 | October 3, 2000 | Nozaki |
6131067 | October 10, 2000 | Girerd et al. |
6134532 | October 17, 2000 | Lazarus et al. |
6135646 | October 24, 2000 | Kahn et al. |
6138155 | October 24, 2000 | Davis et al. |
6148335 | November 14, 2000 | Haggard et al. |
6148377 | November 14, 2000 | Carter et al. |
6151684 | November 21, 2000 | Alexander et al. |
6167428 | December 26, 2000 | Ellis |
6189045 | February 13, 2001 | O'Shea |
6191847 | February 20, 2001 | Melendez et al. |
6208975 | March 27, 2001 | Bull et al. |
6211782 | April 3, 2001 | Sandelman et al. |
6212550 | April 3, 2001 | Segur |
6249836 | June 19, 2001 | Downs et al. |
6253193 | June 26, 2001 | Ginter et al. |
6263358 | July 17, 2001 | Lee et al. |
6308203 | October 23, 2001 | Itabashi et al. |
6334126 | December 25, 2001 | Nagatomo et al. |
6336124 | January 1, 2002 | Alam et al. |
6345240 | February 5, 2002 | Havens |
6347340 | February 12, 2002 | Coelho et al. |
6356929 | March 12, 2002 | Gall et al. |
6370510 | April 9, 2002 | McGovern et al. |
6370560 | April 9, 2002 | Robertazzi et al. |
6374254 | April 16, 2002 | Cochran et al. |
6377975 | April 23, 2002 | Florman |
6389421 | May 14, 2002 | Hawkins et al. |
6393014 | May 21, 2002 | Daly et al. |
6415373 | July 2, 2002 | Peters et al. |
6418462 | July 9, 2002 | Xu |
6421781 | July 16, 2002 | Fox et al. |
6434594 | August 13, 2002 | Wesemann |
6434609 | August 13, 2002 | Humphrey |
6438553 | August 20, 2002 | Yamada |
6463457 | October 8, 2002 | Armentrout et al. |
6473805 | October 29, 2002 | Lewis |
6477565 | November 5, 2002 | Daswani et al. |
6499105 | December 24, 2002 | Yoshiura et al. |
6505246 | January 7, 2003 | Landsman et al. |
6516338 | February 4, 2003 | Landsman et al. |
6516350 | February 4, 2003 | Lumelsky et al. |
6546419 | April 8, 2003 | Humpleman et al. |
6570870 | May 27, 2003 | Berstis |
6574605 | June 3, 2003 | Sanders et al. |
6574628 | June 3, 2003 | Kahn et al. |
6587866 | July 1, 2003 | Modi et al. |
6601101 | July 29, 2003 | Lee et al. |
6604122 | August 5, 2003 | Nilsson |
6615166 | September 2, 2003 | Guheen et al. |
6643291 | November 4, 2003 | Yoshihara et al. |
6643640 | November 4, 2003 | Getchius et al. |
6654783 | November 25, 2003 | Hubbard |
6714976 | March 30, 2004 | Wilson et al. |
6757730 | June 29, 2004 | Berardin |
6775699 | August 10, 2004 | DeLuca et al. |
6792455 | September 14, 2004 | DeLuca et al. |
6847995 | January 25, 2005 | Hubbard et al. |
6871223 | March 22, 2005 | Drees |
6891802 | May 10, 2005 | Hubbard |
6963897 | November 8, 2005 | Hubbard |
7003547 | February 21, 2006 | Hubbard |
7020678 | March 28, 2006 | Hubbard |
7082474 | July 25, 2006 | Hubbard |
7136857 | November 14, 2006 | Chen et al. |
7143089 | November 28, 2006 | Petras et al. |
20010029613 | October 11, 2001 | Fernandez et al. |
20020010757 | January 24, 2002 | Granik et al. |
20020018399 | February 14, 2002 | Schulze et al. |
20020019584 | February 14, 2002 | Schulze et al. |
20020019725 | February 14, 2002 | Petite |
20020065864 | May 30, 2002 | Hartsell et al. |
20020133593 | September 19, 2002 | Johnson et al. |
20020188733 | December 12, 2002 | Collins |
20020194251 | December 19, 2002 | Richter |
20020198957 | December 26, 2002 | Amjadi |
20040098449 | May 20, 2004 | Bar-Lavi et al. |
20070011224 | January 11, 2007 | Mena |
20090132649 | May 21, 2009 | Hubbard |
20090138551 | May 28, 2009 | Hubbard |
20090164533 | June 25, 2009 | Hubbard |
20090171855 | July 2, 2009 | Hubbard |
20090216641 | August 27, 2009 | Hubbard |
20090216649 | August 27, 2009 | Hubbard |
20090222508 | September 3, 2009 | Hubbard |
20100036723 | February 11, 2010 | Hubbard |
0883313 | September 1998 | EP |
WO-2001014961 | March 2001 | WO |
WO-2001073545 | October 2001 | WO |
- Brian Hayes, “Computing Science: Collective Wisdom,” American Scientist, Mar.-Apr. 1998.
- Steve Lawrence, et al., “Accessibility of information on the web,” Nature, vol. 400, pp. 107-109, Jul. 1999.
- Steve Lawrence, et al., “Searching the World Wide Web,” Science, vol. 280, pp. 98-100, Apr. 3, 1998.
- Steve Lawrence, et al., “Context and Page Analysis for Improved Web Search,” IEEE Internet Computing, pp. 38-46, Jul.-Aug. 1998.
- Vasken Bohossian, et al., “Computing in the RAIN: A Reliable Array of Independent Nodes,” California Institute of Technology, Sep. 24, 1999.
- “A White Paper: The Economic Impacts of Unacceptable Web-Site Download Speeds,” Zona research, Inc., pp. 1-17, Apr. 1999.
- Peter J. Sevcik, “The World-Wide-Wait Status Report,” Northeast Consulting Resources, Inc., Global Internet-Performance Conference, Oct. 14, 1999.
- “White Paper: Max, and the Objective Measurement of Web Sites,” WebCriteria, Version 1.00, pp. 1-11, Mar. 12, 1999.
- Renu Tewari, et al., “Design Considerations for Distributed Caching on the Internet,” pp. 1-13, May 1999.
- “Measuring and Improving Your E-Commerce Web Site Performance with Keynote Perspective,” Keynote Systems, pp. 1-15, Mar. 29, 2000.
- Sullivan, et al., “A New Major SETI Project Based On Project Serendip Data and 100,000 Personal Computers,” Proc. of the Fifth Intl Conf on Bioastronomy IAU Colloq No. 161, pp. 729-734, 1997.
- Caronni, et al., “How Exhausting is Exhaustive Search?” RSA Laboratories' CryptoBytes, vol. 2, No. 3, pp. 2-6, Jan.-Mar. 1997.
- Bricker, et al., “Condor Technical Summary,” Computer Sciences Dept., University of Wisconsin, Version 4.1b, pp. 1-10, Jan. 28, 1992.
- Fields, “Hunting for Wasted Computing Power-New Software for Computing Networks Puts Idle PC's to Work,” 1993 Research Sampler, University of Wisconsin, pp. 1-5, 1993.
- Anderson, et al., “SETI@home: Internet Distributed Computing for SETI,” A New Era in Bioastronomy, ASP Conference Series, vol. 213, pp. 511-517, 2000.
- Bowyer, et al., “Twenty Years of Serendip, the Berkeley SETI Effort: Past Results and Future Plans,” Astronomical and Biochemical Origins and the Search for Life in the Universe, pp. 667-676, 1997.
- Litzkow, et al., “Condor—A Hunter of Idle Workstations,” The 8th International Conf. on Distributed Computing Systems, pp. 104-111, 1988.
- Hamidzadeh, et al., “Dynamic Scheduling Techniques for Heterogeneous Computing Systems,” Concurrency: Practice and Experience, vol. 7(7), pp. 633-652, 1995.
- Grimshaw, et al., “The Legion Vision of a Worldwide Virtual Computer,” Communications of the ACM, vol. 40, No. 1, pp. 39-45, 1997.
- Catlett, et al., “Metacomputing,” Communications of the ACM, vol. 35, No. 6, pp. 44-52, 1992.
- Foster, et al., “Globus: A Metacomputing Infrastructure Toolkit,” The International Journal of Supercomputer Applications and High Performance Computing, vol. 11, No. 2, pp. 115-128, 1997.
- Mutka, et al., “The Available Capacity of a Privately Owned Workstation Environment,” Performance Evaluation 12 (1991) pp. 269-284.
- Sullivan, et al., “A New Major SETI Project Based on Project Serendip Data and 100,000 Personal Computers,” Astronomical and Biochemical Origins and the Search for Life in the Universe, 5th International Conference on Bioastronomy, IAU Colloquium No. 161, pp. 729-734, 1996.
- Gelernter, “Domesticating Parallelism,” IEEE Computer, Aug. 1986, 19(8), pp. 12-16.
- Goldberg, et al., “A Secure Environment for Untrusted Helper Applications-Confining the Wily Hacker,” 6th USENIX Security Symposium, pp. 1-13, 1996.
- distributed.net: The fastest computer on Earth: Feb. 8, 1999, http://web.archive.org/web/19990221230053/http://distributed.
- London et al., “Popcorn—A Paradigm for Global-Computing”, Thesis University Jerusalem, Jun. 1998.
- Takagi H. et al., “Ninflet: a migratable parallel objects framework using Java”, Java for High-Performance Network Computing, Syracuse, NY, USA, Feb. 1998, vol. 10, No. 11-13, pp. 1063-1078.
- Waldspurger, C.A. et al., “Spawn: a distributed computational economy” IEEE Transactions on Software Engineering, IEEE Inc., NY, US, Feb. 1992, vol. 18, No. 2, pp. 103-117.
- Neary, M. O., et al., “Javelin: Parallel computing on the internet” Future Generations Computer Systems, Elsevier Science Publishers, Amsterdam, NL, Oct. 1999, vol. 15, No. 5-6, pp. 661-664.
- Foster, Ian et al., “The Physiology of the Grid,” This is a Draft document and continues to be revised. Version Feb. 17, 2002.
- Douceur, John R. et al., “A Large-Scale Study of File-System Contents,” Microsoft Research, Redmond, WA 98052, May 1999.
- Bolosky, William J. et al., “Feasibility of a Serverless Distributed File System Deployed on an Existing Set of Desktop PCs,” Microsoft Research, Redmond, WA 98052, Jun. 2000.
- Regev, Ori; Economic Oriented CPU Sharing System for the Internet; Master of Science in Computer Science thesis; Institute of Computer Science; The Hebrew University of Jerusalem; Jul. 1998.
- May, Michael; Idle Computing Resources as Micro-Currencies—Bartering CPU Time for Online Content; AACE WebNet99; Oct. 25-30, 1999.
- May, Michael; Distributed RC5 Decryption as a Consumer for Idle-Time Brokerage; DCW99 Workshop on Distributed Computing on the Web; Jun. 21-23, 1999.
- May, Michael; Locust—A Brokerage System for Accessing Idle Resources for Web-Computing; Proceedings of the 25th Euromicro Conference; vol. 2, pp. 466-473; Sep. 8-10, 1999.
- Huberman, Bernardo A., et al.; Distributed Computation as an Economic System; Journal of Economic Perspectives; vol. 9, No. 1; pp. 141-152; Winter 1995.
- Hayes, Brian “Computing Science: Collective Wisdom”, Retrieved from: <http://www.americanscientist.org/issues/id.3341,y.0,no.,content.true,page.1,css.print/issue.asp>on Dec. 3, 2009, American Scientist, (Mar. 1998), 3 pages.
- Lawrence, Steve et al., “Accessibility of Information on the Web”, Nature, vol. 400, (Jul. 1999), pp. 107-109.
- Lawrence, Steve et al., “Searching the World Wide Web”, Science, vol. 280, Available at <www.sciencemag.org>, (Apr. 3, 1998), pp. 98-100.
- Lawrence, Steve “Context and Page Analysis for Improved Web Search”, IEEE Internet Computing, vol. 2, No. 4, Available at <http://www.neci.nj.nec.com/homepages/lawrence/papers/search-ic98/>, (Jul. 1998), 11 pages.
- Vasken, Bohossian et al., “Computing in the Rain: A reliable array of independent nodes”, California Institute of Technology, (Sep. 24, 1999).
- “A White Paper: The Economic Impacts of Unacceptable Web-Site Download Speeds”, Zona Research, Inc., (Apr. 1999), pp. 1-17.
- Sevcik, Peter J., “The World Wide Wait Status Report”, Northeast Consulting Resources, Inc.; Global Internet Performance Conference, (Oct. 14, 1999).
- Henry, Shannon “Putting Idle computers to Work”, The Washington Post, (Jun. 15, 2000), 3 pages.
- Shmulik, London “POPCORN-A Paradigm for Global-Computing”, Master of Computer Science thesis, supervised by Prof. Noam Nisan, Institute of Computer Science, The Hebrew University of Jerusalem, (Jun. 1998), 94 pages.
- “GIMPS Finds First Million-Digit Prime, Stakes Claim To $50,000 EFF Award”, (Jun. 1999), 3 pages.
- “Final Office Action”, U.S. Appl. No. 10/687,210, (May 26, 2009), 12 pages.
- “Notice of Allowance”, U.S. Appl. No. 09/834,785, (Jul. 15, 2009), 4 pages.
- “Non-Final Office Action”, U.S. Appl. No. 10/687,210, (Nov. 25, 2009), 20 pages.
- “Non-final Office Action”, U.S. Appl. No. 09/834,785, (Dec. 28, 2009), 8 pages.
- “Final Office Action”, U.S. Appl. No. 09/834,785, (Jun. 24, 2010), 9 pages.
- “Non Final Office Action”, U.S. Appl. No. 10/687,210, (Jun. 24, 2010), 21 pages.
- “Advisory Action”, U.S. Appl. No. 09/834,785, (Sep. 21, 2010), 3 pages.
Type: Grant
Filed: Aug 6, 2009
Date of Patent: Feb 15, 2011
Inventors: Edward A. Hubbard (Round Rock, TX), Krishnamurthy Venkatramani (Austin, TX), David P. Anderson (Berkeley, CA), Ashok K. Adiga (Austin, TX), Greg D. Hewgill (Christchurch), Jeff A. Lawson (Austin, TX)
Primary Examiner: Michael Won
Application Number: 12/462,600
International Classification: G06F 15/16 (20060101);