GPGPU SYSTEMS AND SERVICES

Graphics processing units (GPUs) deployed in general purpose GPU (GPGPU) units are combined into a GPGPU cluster. Access to the GPGPU cluster is then offered as a service to users who can use their own computers to communicate with the GPGPU cluster. The users develop applications to be run on the cluster and a profiling module tracks the applications' resource utilization and can report it to the user and to a subscription server. The user can examine the report to thereby optimize the application or the cluster's configuration. The subscription server can interpret the report to thereby invoice the user or otherwise govern the users' access to the cluster.

Description
CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation and claims the priority benefit of U.S. patent application Ser. No. 12/895,554 filed Sep. 30, 2010, which claims the priority benefit of U.S. provisional application 61/261,973 filed Nov. 17, 2009 and U.S. provisional application 61/247,237 filed Sep. 30, 2009, the disclosures of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

Embodiments relate to computing clusters, cloud computing, and general purpose computing based on graphic processor units. Embodiments also relate to massive computing power offered on a subscription basis. Embodiments additionally relate to profiling massively parallel programs on a variety of cluster configurations.

2. Description of the Related Art

Massive computing capability has traditionally been provided by highly specialized and very expensive supercomputers. As technology advances, however, inexpensive desktop and server hardware has steadily supplanted expensive high end systems. More recently, inexpensive hardware has been gathered together to form computing clusters. The individual computers in a compute cluster are typically not as expensive or reliable as their supercomputer and mainframe forebears but overcome those limitations with sheer numbers.

The drawback of compute clusters is that they are difficult to maintain and to program. In order to harness the power of a compute cluster, a program must be split into a great number of pieces and the multitudinous results later reconciled and reassembled. Furthermore, the program itself must be fault tolerant because there is a risk of individual failures amongst the great number of inexpensive computers.

Desktop and gaming computers often conserve central processing unit (CPU) resources by employing a graphics subsystem dedicated to driving one or more computer displays. A graphics processing unit (GPU) is at the heart of the graphics subsystem. The CPU is a general purpose processor designed to efficiently run a great variety of algorithms. Graphics processing, however, consists of a limited and well known set of algorithms. GPUs are specialized processors that are very good at graphics processing but not necessarily good at other tasks.

Another recent development is the identification of algorithms, other than graphics algorithms, that are well suited for GPUs. These algorithms currently require expert programming in order to put them into a form that a GPU can run. Further optimization is required for a GPU to run the algorithm well. The effort is often worthwhile because the resulting speedup can be orders of magnitude. Unfortunately, properly configured computing systems having the software tools required for developing algorithms to run on GPUs are rare. As such, expertise in the required programming techniques is rare and difficult to develop.

Systems and methods for providing GPU powered compute clusters and for deploying non-graphics applications to efficiently run on those GPU powered compute clusters are needed.

SUMMARY OF THE PRESENTLY CLAIMED INVENTION

The following summary is provided to facilitate an understanding of some of the innovative features unique to the embodiments and is not intended to be a full description. A full appreciation of the various aspects of the embodiments can be gained by taking the entire specification, claims, drawings, and abstract as a whole.

It is therefore an aspect of the embodiments to provide a service granting remote users access to a general purpose GPU (GPGPU) based compute cluster. The GPGPU cluster consists of a number of GPGPU units. Each GPGPU unit is a self-contained computer having an enclosure, a CPU, a cooling fan, a GPU, memory for the CPU and GPU, and a communications interface.

It is another aspect of the embodiments to provide a subscription server module. A user accesses the subscription server module through the user's own computer. The subscription server module governs the user's access to the GPGPU units, related hardware, and related software tools.

The user provides a GPGPU application to be run on the GPGPU cluster. The GPGPU application can be developed on the user's computer or on the GPGPU cluster itself. The user can obtain the application development tools from the GPGPU cluster, from the entity providing access to the GPGPU cluster, or from another source.

The GPGPU application can be designed to run on a specific configuration of GPGPU units or can otherwise specify a configuration. The GPGPU application has GPU instructions and application data. The GPUs in the GPGPU units can operate on the application data while executing the GPU instructions. Furthermore, the GPGPU cluster can be interconnected in accordance with the configuration, and the GPGPU application can then be run.

It is a further aspect of the embodiments to provide a profiling module. The profiling module tracks the GPGPU cluster resources consumed by the GPGPU application. The resources can include the number of GPGPU units, the amount of memory, the amount of processing time, the number of GPU cores, and similar information that the user can interpret to optimize the GPGPU application. The GPGPU application can be optimized by altering the control flow of the instructions, the flow of the data, or the configuration of the GPGPU cluster.
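
The embodiments do not fix the form of such a report. As a minimal sketch, assuming a Python-based tooling layer (all field names below are illustrative, not part of the disclosure), the quantities listed above could be carried in a simple record that is forwarded both to the user and to the subscription server module:

    from dataclasses import dataclass, asdict

    @dataclass
    class ProfilingReport:
        """Hypothetical per-run resource summary produced by the profiling module."""
        gpgpu_units: int         # number of GPGPU units the run occupied
        gpu_cores: int           # total GPU cores engaged across those units
        memory_gb: float         # peak memory consumed (CPU plus GPU)
        processing_hours: float  # accumulated processing time

    report = ProfilingReport(gpgpu_units=2, gpu_cores=960, memory_gb=24.0,
                             processing_hours=3.5)
    print(asdict(report))  # e.g. serialized and sent to the user and the subscription server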

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying figures, in which like reference numerals refer to identical or functionally similar elements throughout the separate views and which are incorporated in and form a part of the specification, further illustrate aspects of the embodiments and, together with the background, brief summary, and detailed description serve to explain the principles of the embodiments.

FIG. 1 illustrates a subscription based service by which a user can test an algorithm, application, or utility upon a number of different GPGPU configurations in accordance with aspects of the embodiments;

FIG. 2 illustrates one possible GPGPU configuration in accordance with aspects of the embodiments; and

FIG. 3 illustrates a GPGPU configuration having numerous GPGPU units in accordance with aspects of the embodiments.

DETAILED DESCRIPTION

The particular values and configurations discussed in these non-limiting examples can be varied and are cited merely to illustrate at least one embodiment and are not intended to limit the scope thereof. In general, the figures are not to scale.

Graphics processing units (GPUs) deployed in general purpose GPU (GPGPU) units are combined into a GPGPU cluster. Access to the GPGPU cluster is then offered as a service to users who can use their own computers to communicate with the GPGPU cluster. The users develop applications to be run on the cluster and a profiling module tracks the applications' resource utilization and can report it to the user and to a subscription server. The user can examine the report to thereby optimize the application or the cluster's configuration. The subscription server can interpret the report to thereby invoice the user or otherwise govern the users' access to the cluster.

FIG. 1 illustrates a subscription based service by which a user 101 can test an algorithm, application, or utility upon a number of different GPGPU configurations 105, 106, 107. The user 101 can access the user's computer 102 to develop, compile, and otherwise prepare a GPGPU application. A service provider can provide the user with access to a number of different GPGPU configurations such as GPGPU configuration 1 105, GPGPU configuration 2 106, and GPGPU configuration 3 107. The user 101 can download the application to a suitably configured GPGPU cluster and run it. A data storage array 108 can store data for the user such that the data is available to the user's application. A profiling module 104 can track the number of processors, amount of processing time, amount of memory, and other resources utilized by the application and report those utilizations back to the user.
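
The disclosure does not tie the profiling module 104 to any particular instrumentation. As one hedged sketch, assuming NVIDIA GPUs in the GPGPU units and the NVML Python bindings (pynvml), a per-unit agent could periodically sample GPU memory use and core utilization and forward the readings for aggregation into the utilization report:

    import time

    import pynvml  # NVML bindings; one possible data source, not required by the embodiments

    def sample_gpu_utilization(samples: int = 5, interval_s: float = 1.0):
        """Collect utilization snapshots for every GPU visible on this GPGPU unit."""
        pynvml.nvmlInit()
        try:
            readings = []
            gpu_count = pynvml.nvmlDeviceGetCount()
            for _ in range(samples):
                for idx in range(gpu_count):
                    handle = pynvml.nvmlDeviceGetHandleByIndex(idx)
                    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
                    util = pynvml.nvmlDeviceGetUtilizationRates(handle)
                    readings.append({
                        "gpu": idx,
                        "mem_used_mb": mem.used // (1024 * 1024),
                        "gpu_util_pct": util.gpu,
                    })
                time.sleep(interval_s)
            return readings
        finally:
            pynvml.nvmlShutdown()

Aggregated across units and over the run's wall-clock time, such samples yield the per-resource totals that the report to the user and to the subscription server would carry.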

The user's computer 102 connects to the service using a communications network. As illustrated, a second communications network can interconnect the configurations, modules, and data storage array 108. For example, the user's computer might communicate over the Internet, whereas the GPGPU cluster communicates internally using InfiniBand or some other very high speed interconnect. The various networks also include whatever network hardware is required (not shown), such as routers and switches.

A subscription module 103 can control the user's access to the GPGPU configurations such that only certain users have access. The subscription module 103 can also limit the amount of resources consumed by the user such as how much data can be stored in the data storage array 108 or how much total GPU time can be consumed by the user. Alternatively, the subscription module can track the user's resource consumption such that the user 101 can be invoiced after the fact or on a pay-as-you-go basis.
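
Neither the limit-based nor the pay-as-you-go behavior is tied to a particular implementation. The following listing is a minimal sketch, assuming the profiling reports can be reduced to GPU hours and stored gigabytes; the rates and field names are invented for illustration:

    from dataclasses import dataclass

    @dataclass
    class UsageLimits:
        """Hypothetical per-user quotas the subscription module might enforce."""
        max_storage_gb: float
        max_gpu_hours: float

    @dataclass
    class UsageMeter:
        """Running totals used either to cut off access or to invoice after the fact."""
        storage_gb: float = 0.0
        gpu_hours: float = 0.0

        def within(self, limits: UsageLimits) -> bool:
            return (self.storage_gb <= limits.max_storage_gb
                    and self.gpu_hours <= limits.max_gpu_hours)

        def invoice(self, rate_per_gpu_hour: float, rate_per_gb: float) -> float:
            # Pay-as-you-go: charge for what the profiling reports say was consumed.
            return self.gpu_hours * rate_per_gpu_hour + self.storage_gb * rate_per_gb

    meter = UsageMeter(storage_gb=120.0, gpu_hours=36.5)
    print(meter.within(UsageLimits(max_storage_gb=500.0, max_gpu_hours=100.0)))  # True
    print(round(meter.invoice(rate_per_gpu_hour=2.50, rate_per_gb=0.10), 2))     # 103.25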

The user's application can include a specification of the GPGPU cluster configuration. In this case, the user can produce multiple applications that are substantially similar with the exception that each specifies a different configuration. Testing and profiling the different applications provides the user with information leading to the selection of a preferred GPGPU cluster configuration for running the application. As such, the cluster configuration can be tuned to run an application such as a molecular dynamics simulator. Alternatively, the application can be tuned for the configuration.
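
The embodiments do not mandate a machine-readable format for such an embedded configuration specification. Purely as an illustrative sketch, assuming a Python-based submission client (all type and field names below are hypothetical), an application bundle carrying GPU instructions, a reference to application data, and a requested configuration might look like the following listing:

    from dataclasses import dataclass

    @dataclass
    class ClusterConfigSpec:
        """Hypothetical description of the GPGPU cluster layout an application requests."""
        gpgpu_units: int          # number of GPGPU units to interconnect
        gpus_per_unit: int        # GPUs installed in each unit
        memory_per_unit_gb: int   # host memory available to each unit
        interconnect: str         # e.g. "infiniband" or "ethernet"

    @dataclass
    class GPGPUApplication:
        """Hypothetical bundle a user submits to the GPGPU cluster service."""
        name: str
        gpu_kernels: bytes                   # compiled GPU instructions
        data_uri: str                        # where the application data is staged
        requested_config: ClusterConfigSpec  # configuration the kernels were tuned for

    # Example: a molecular dynamics application tuned for two single-GPU units.
    app = GPGPUApplication(
        name="md-simulator",
        gpu_kernels=b"",  # placeholder; real kernel binaries would go here
        data_uri="storage://array/md-inputs",
        requested_config=ClusterConfigSpec(gpgpu_units=2, gpus_per_unit=1,
                                           memory_per_unit_gb=32,
                                           interconnect="infiniband"),
    )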

A service provider can provide access to a number of different cluster configurations. A user accessing the service can submit an application that is then run and profiled on each of the available configurations or on a subset of the available configurations. This embodiment eases the user's burden of generating numerous cluster configuration specifications because those specifications are available from the service provider.
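
Whichever party supplies the configuration specifications, selecting the preferred configuration reduces to running the same application against each candidate and comparing the profiling reports. A minimal sketch, assuming a hypothetical run_and_profile call standing in for the service's actual submission interface:

    def pick_preferred_configuration(application, configurations, run_and_profile):
        """Run the same application on each candidate configuration and keep the one
        whose profiling report shows the lowest cost; run_and_profile is a stand-in
        for whatever submit-and-profile call the service exposes."""
        best_config, best_cost = None, float("inf")
        for config in configurations:
            report = run_and_profile(application, config)
            cost = report["gpu_hours"]  # could equally be wall-clock time or invoice total
            if cost < best_cost:
                best_config, best_cost = config, cost
        return best_config, best_cost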

FIG. 2 illustrates one possible GPGPU configuration. GPGPU configuration A 201 has a CPU 202, memory 203, a network interface 204, and three GPUs 205. In GPGPU configuration A 201, a single computer holds all the processing capability. Note that GPGPU configuration A 201 can be deployed as a unit within a much larger configuration that contains numerous computers. However, should GPGPU configuration A 201 encompass all of the available resources, then the subscription server module and the profiling module can run as application programs on the single computer.

FIG. 3 illustrates a GPGPU configuration having numerous GPGPU units. GPGPU configuration B 301 has a control computer 302, GPGPU unit 1 303, and GPGPU unit 2 304 interconnected by a communications network 306. Note that each of the GPGPU units has a single GPU 205 and the control computer 302 has none. As such, this is a non-limiting example because a controller can contain multiple GPUs, as can each of the GPGPU units. The communications network can be a single technology such as InfiniBand or Ethernet. Alternatively, the communications network can be a combination of technologies. In any case, the communications module 305 in each computer has the hardware, firmware, and software required for operation with the communications network 306. The control computer 302 can run the subscription server module and the profiling module as application programs.
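
Neither figure prescribes a machine-readable description of these layouts. As a hedged illustration, the two configurations could be captured as data that a scheduler or the subscription server module might consume (roles and field names are illustrative only):

    # FIG. 2: a single computer holding all of the processing capability.
    configuration_a = {
        "interconnect": None,  # no inter-unit network required
        "nodes": [
            {"role": "all-in-one", "cpus": 1, "gpus": 3,
             "runs": ["subscription server", "profiling", "application"]},
        ],
    }

    # FIG. 3: a control computer plus two single-GPU GPGPU units on a shared network.
    configuration_b = {
        "interconnect": "infiniband or ethernet",
        "nodes": [
            {"role": "control", "cpus": 1, "gpus": 0,
             "runs": ["subscription server", "profiling"]},
            {"role": "gpgpu-unit", "cpus": 1, "gpus": 1, "runs": ["application"]},
            {"role": "gpgpu-unit", "cpus": 1, "gpus": 1, "runs": ["application"]},
        ],
    }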

It will be appreciated that variations of the above-disclosed and other features and functions, or alternatives thereof, may be desirably combined into many other different systems or applications. It will also be appreciated that various presently unforeseen or unanticipated alternatives, modifications, variations, or improvements therein may be subsequently made by those skilled in the art, which are also intended to be encompassed by the following claims.

Claims

1. (canceled)

2. A method for offering access to a general purpose graphics processing unit (GPGPU) compute cluster, the method comprising:

communicating with a user computer seeking access to the GPGPU compute cluster to control access by the user computer to the GPGPU compute cluster;
determining that the user computer is presently subscribed to and has the requisite permissions to access one or more GPGPU units in the GPGPU compute cluster;
receiving a specification for submission to the GPGPU compute cluster, the specification received from the user computer seeking access to the GPGPU compute cluster;
executing the specification at the GPGPU compute cluster to produce one or more computational results as defined by the specification;
tracking resource utilization data during execution of the specification by one or more GPGPU units in the GPGPU compute cluster; and
controlling utilization of one or more GPGPU units in the GPGPU compute cluster during execution of the specification and responsive to the resource utilization data.

3. The method of claim 2, further comprising storing resource utilization data in a data array communicatively coupled to the GPGPU compute cluster for subsequent control of one or more units in the GPGPU compute cluster during execution of a later received specification.

4. The method of claim 2, further comprising invoicing a user of the user computer based on the resource utilization data.

5. A method for offering access to a general purpose graphics processing unit (GPGPU) compute cluster, the method comprising:

communicating with a user computer seeking access to the GPGPU compute cluster to control access by the user computer to the GPGPU compute cluster;
determining that the user computer is presently subscribed to and has the requisite permissions to access one or more GPGPU units in the GPGPU compute cluster;
receiving a specification for submission to the GPGPU compute cluster, the specification received from the user computer seeking access to the GPGPU compute cluster;
configuring one or more units in the GPGPU compute cluster in accordance with the specification;
producing one or more computational results as defined by the specification, the computational results generated by the GPGPU compute cluster following configuration as defined by the specification;
tracking resource utilization data during execution of the specification by one or more GPGPU units in the GPGPU compute cluster; and
controlling utilization of one or more GPGPU units in the GPGPU compute cluster during execution of the specification and responsive to the resource utilization data.

6. The method of claim 5, further comprising:

alternatively configuring one or more units in the GPGPU compute cluster in a manner not set forth in the specification;
producing one or more computational results as defined by the specification, the computational results generated by the alternatively configured GPGPU units in parallel with the computational results generated by the one or more units of the GPGPU compute cluster as defined by the specification; and
tracking resource utilization data during execution of the specification by the one or more GPGPU units in the alternatively configured GPGPU compute cluster.

7. The method of claim 6, further comprising identifying the more optimal GPGPU compute cluster configuration for execution of the specification and subsequently executing the specification in accordance with the more optimal configuration.

Patent History
Publication number: 20150042665
Type: Application
Filed: Jul 18, 2014
Publication Date: Feb 12, 2015
Inventors: Greg Scantlen (Albuquerque, NM), Gary Scantlen (Albuquerque, NM)
Application Number: 14/335,105
Classifications
Current U.S. Class: Parallel Processors (e.g., Identical Processors) (345/505)
International Classification: G06T 1/20 (20060101);