Patents by Inventor Thomas M. Gooding
Thomas M. Gooding has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7954012Abstract: Embodiments of the invention are generally related to retrieving debug data from a plurality of nodes of a parallel computer system. To retrieve debug data, a message may be broadcast from a service node of the system to each of the plurality of nodes via a first network, the message indicating a debug operation that is to be performed. A node of the plurality of nodes may transfer an interrupt signal to the rest of the plurality of nodes via a second network. Upon receiving the interrupt signal, the plurality of nodes may perform the debug operation comprising transferring the debug data to the service node via a third network.Type: GrantFiled: October 27, 2008Date of Patent: May 31, 2011Assignee: International Business Machines CorporationInventor: Thomas M. Gooding
-
Patent number: 7953957Abstract: Methods, apparatus, and products for distributing parallel algorithms of a parallel application among compute nodes of an operational group in a parallel computer are disclosed that include establishing a hardware profile, the hardware profile describing thermal characteristics of each compute node in the operational group; establishing a hardware independent application profile, the application profile describing thermal characteristics of each parallel algorithm of the parallel application; and mapping, in dependence upon the hardware profile and application profile, each parallel algorithm of the parallel application to a compute node in the operational group.Type: GrantFiled: February 11, 2008Date of Patent: May 31, 2011Assignee: International Business Machines CorporationInventors: Thomas M. Gooding, Brant L. Knudson, Cory Lappi, Ruth J. Poole, Andrew T. Tauferner
-
Publication number: 20110119445Abstract: A method and system for providing a memory access check on a processor including the steps of detecting accesses to a memory device including level-1 cache using a wakeup unit. The method includes invalidating level-1 cache ranges corresponding to a guard page, and configuring a plurality of wakeup address compare (WAC) registers to allow access to selected WAC registers. The method selects one of the plurality of WAC registers, and sets up a WAC register related to the guard page. The method configures the wakeup unit to interrupt on access of the selected WAC register. The method detects access of the memory device using the wakeup unit when a guard page is violated. The method generates an interrupt to the core using the wakeup unit, and determines the source of the interrupt. The method detects the activated WAC registers assigned to the violated guard page, and initiates a response.Type: ApplicationFiled: January 29, 2010Publication date: May 19, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Thomas M. Gooding, David L. Satterfield, Burkhard Steinmacher-Burow
-
Publication number: 20110119521Abstract: Fixing a problem is usually greatly aided if the problem is reproducible. To ensure reproducibility of a multiprocessor system, the following aspects are proposed: a deterministic system start state, a single system clock, phase alignment of clocks in the system, system-wide synchronization events, reproducible execution of system components, deterministic chip interfaces, zero-impact communication with the system, precise stop of the system and a scan of the system state.Type: ApplicationFiled: May 5, 2010Publication date: May 19, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Ralph A. Bellofatto, Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Thomas A. Liebsch, Martin Ohmacht, Don D. Reed, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara
-
Publication number: 20110119475Abstract: A circuit generates a global clock signal with a pulse width modification to synchronize processors in a parallel computing system. The circuit may include a hardware module and a clock splitter. The hardware module may generate a clock signal and performs a pulse width modification on the clock signal. The pulse width modification changes a pulse width within a clock period in the clock signal. The clock splitter may distribute the pulse width modified clock signal to a plurality of processors in the parallel computing system.Type: ApplicationFiled: January 29, 2010Publication date: May 19, 2011Applicant: International Business Machines CorporationInventors: Dong Chen, Matthew R. Ellavsky, Ross L. Franke, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Mark J. Jeanson, Gerard V. Kopcsay, Thomas A. Liebsch, Daniel Littrell, Martin Ohmacht, Don D. Reed, Brandon E. Schenck, Richard A. Swetz
-
Patent number: 7941681Abstract: Proactive power management in a parallel computer, the parallel computer including a service node and a plurality of compute nodes, the service node connected to the compute nodes through an out-of-band service network, each compute node including a computer processor and a computer memory operatively coupled to the computer processor. Embodiments include receiving, by the service node, a user instruction to initiate a job on an operational group of compute nodes in the parallel computer, the instruction including power management attributes for the compute nodes; setting, by the service node in accordance with the power management attributes for the compute nodes of the operational group, power consumption ratios for each compute node of the operational group including a computer processor power consumption ratio and a computer memory power consumption ratio; and initiating, by the service node, the job on the compute nodes of the operational group of the parallel computer.Type: GrantFiled: August 17, 2007Date of Patent: May 10, 2011Assignee: International Business Machines CorporationInventors: Thomas M. Gooding, Todd A. Inglett, Thomas A. Liebsch, Thomas E. Musta, Don D. Reed
-
Publication number: 20110078249Abstract: A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.Type: ApplicationFiled: September 28, 2009Publication date: March 31, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Michael Blocksome, Gabor Dozsa, Thomas M. Gooding, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Douglas Miller
-
Patent number: 7877620Abstract: Managing power in a parallel computer, the parallel computer including a power supply and a plurality of compute nodes, the plurality of compute nodes powered by the power supply through a plurality of DC-DC converters, each DC-DC converter supplying current to an assigned group of compute nodes, each DC-DC converter having a current sensor. Embodiments include monitoring, by the current sensor, an amount of current supplied by that DC-DC converter to its assigned group of compute nodes; determining, by at least one DC-DC converter, that the amount of current supplied is greater than a predefined threshold value; sending, by the at least one DC-DC converter to the plurality of compute nodes, a global interrupt, including notifying the plurality of compute nodes to reduce power consumption; and reducing, by the plurality of compute nodes in accordance with power consumption ratios, power consumption of the compute nodes.Type: GrantFiled: August 17, 2007Date of Patent: January 25, 2011Assignee: International Business Machines CorporationInventors: Alan Gara, Thomas M. Gooding, Todd A. Inglett, Thomas A. Liebsch, Thomas E. Musta
-
Patent number: 7716407Abstract: Executing application function calls in response to an interrupt including creating a thread; receiving an interrupt having an interrupt type; determining whether a value of a semaphore represents that interrupts are disabled; if the value of the semaphore represents that interrupts are not disabled: calling, by the thread, one or more preconfigured functions in dependence upon the interrupt type of the interrupt; yielding the thread; and if the value of the semaphore represents that interrupts are disabled: setting the value of the semaphore to represent to a kernel that interrupts are hard-disabled; and hard-disabling interrupts at the kernel.Type: GrantFiled: January 3, 2008Date of Patent: May 11, 2010Assignee: International Business Machines CorporationInventors: Gheorghe Almasi, Charles J. Archer, Mark E. Giampapa, Thomas M. Gooding, Philip Heidelberger, Jeffrey J. Parker
-
Publication number: 20100107012Abstract: Embodiments of the invention are generally related to retrieving debug data from a plurality of nodes of a parallel computer system. To retrieve debug data, a message may be broadcast from a service node of the system to each of the plurality of nodes via a first network, the message indicating a debug operation that is to be performed. A node of the plurality of nodes may transfer an interrupt signal to the rest of the plurality of nodes via a second network. Upon receiving the interrupt signal, the plurality of nodes may perform the debug operation comprising transferring the debug data to the service node via a third network.Type: ApplicationFiled: October 27, 2008Publication date: April 29, 2010Inventor: Thomas M. Gooding
-
Publication number: 20100088705Abstract: Call stack protection, including executing at least one application program on the one or more computer processors, including initializing threads of execution, each thread having a call stack, each call stack characterized by a separate guard area defining a maximum extent of the call stack, dispatching one of the threads of the process, including loading a guard area specification for the dispatched thread's call stack guard area from thread context storage into address comparison registers of a processor; determining by use of address comparison logic in dependence upon a guard area specification for the dispatched thread whether each access of memory by the dispatched thread is a precluded access of memory in the dispatched thread's call stack's guard area; and effecting by the address comparison logic an address comparison interrupt for each access of memory that is a precluded access of memory in the dispatched thread's guard area.Type: ApplicationFiled: October 8, 2008Publication date: April 8, 2010Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: John E. Attinella, Mark E. Giampapa, Thomas M. Gooding
-
Publication number: 20100017655Abstract: Methods, apparatus, and products are disclosed for error recovery during execution of an application on a parallel computer that includes a plurality of compute nodes. Such error recovery includes: storing, by the application during execution on the nodes, application restore data in a restore buffer at predetermined points during execution of the application, the restore data specifying an execution state of the application at one or more points during application execution; encountering, by at least one of the nodes executing the application, a recoverable error during application execution; determining, by the application, the nodes affected by the recoverable error; restarting, by each of the affected nodes, execution of the application; retrieving, by the restarted application executing on each of the affected nodes, the restore data from the restore buffer; and continuing, by each affected node, execution of the application with the execution state specified by the retrieved restore data.Type: ApplicationFiled: July 16, 2008Publication date: January 21, 2010Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Thomas M. GOODING, Patrick J. MCCARTHY, Michael B. MUNDY
-
Publication number: 20090271608Abstract: A method of managing a process relocation operation in a computing system is provided and includes determining respective operating temperatures of first, second and additional nodes of the system, where the first node has an elevated operating temperature and the second node has a normal operating temperature, notifying first and second kernels respectively associated with the first and second nodes, of a swapping condition, initially managing the first and second kernels to swap an application between the first and the second nodes while the swapping condition is in effect, and secondarily managing the first and second kernels to perform a barrier operation to end the swapping condition.Type: ApplicationFiled: April 25, 2008Publication date: October 29, 2009Inventors: Thomas M. Gooding, Brant L. Knudson, Cory Lappi, Ruth J. Pooie, Andrew T. Tauferner
-
Publication number: 20090204789Abstract: Methods, apparatus, and products for distributing parallel algorithms of a parallel application among compute nodes of an operational group in a parallel computer are disclosed that include establishing a hardware profile, the hardware profile describing thermal characteristics of each compute node in the operational group; establishing a hardware independent application profile, the application profile describing thermal characteristics of each parallel algorithm of the parallel application; and mapping, in dependence upon the hardware profile and application profile, each parallel algorithm of the parallel application to a compute node in the operational group.Type: ApplicationFiled: February 11, 2008Publication date: August 13, 2009Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: THOMAS M. GOODING, BRANT L. KNUDSON, CORY LAPPI, RUTH J. POOLE, ANDREW T. TAUFERNER
-
Publication number: 20090177828Abstract: Executing application function calls in response to an interrupt including creating a thread; receiving an interrupt having an interrupt type; determining whether a value of a semaphore represents that interrupts are disabled; if the value of the semaphore represents that interrupts are not disabled: calling, by the thread, one or more preconfigured functions in dependence upon the interrupt type of the interrupt; yielding the thread; and if the value of the semaphore represents that interrupts are disabled: setting the value of the semaphore to represent to a kernel that interrupts are hard-disabled; and hard-disabling interrupts at the kernel.Type: ApplicationFiled: January 3, 2008Publication date: July 9, 2009Inventors: Gheorghe Almasi, Charles J. Archer, Mark E. Giampapa, Thomas M. Gooding, Philip Heidelberger, Jeffrey J. Parker
-
Publication number: 20090049313Abstract: Proactive power management in a parallel computer, the parallel computer including a service node and a plurality of compute nodes, the service node connected to the compute nodes through an out-of-band service network, each compute node including a computer processor and a computer memory operatively coupled to the computer processor. Embodiments include receiving, by the service node, a user instruction to initiate a job on an operational group of compute nodes in the parallel computer, the instruction including power management attributes for the compute nodes; setting, by the service node in accordance with the power management attributes for the compute nodes of the operational group, power consumption ratios for each compute node of the operational group including a computer processor power consumption ratio and a computer memory power consumption ratio; and initiating, by the service node, the job on the compute nodes of the operational group of the parallel computer.Type: ApplicationFiled: August 17, 2007Publication date: February 19, 2009Inventors: Thomas M Gooding, Todd A. Inglett, Thomas A. Liebsch, Thomas E. Musta, Don D. Reed
-
Publication number: 20090049317Abstract: Managing power in a parallel computer, the parallel computer including a power supply and a plurality of compute nodes, the plurality of compute nodes powered by the power supply through a plurality of DC-DC converters, each DC-DC converter supplying current to an assigned group of compute nodes, each DC-DC converter having a current sensor. Embodiments include monitoring, by the current sensor, an amount of current supplied by that DC-DC converter to its assigned group of compute nodes; determining, by at least one DC-DC converter, that the amount of current supplied is greater than a predefined threshold value; sending, by the at least one DC-DC converter to the plurality of compute nodes, a global interrupt, including notifying the plurality of compute nodes to reduce power consumption; and reducing, by the plurality of compute nodes in accordance with power consumption ratios, power consumption of the compute nodes.Type: ApplicationFiled: August 17, 2007Publication date: February 19, 2009Inventors: Alan Gara, Thomas M. Gooding, Todd A. Inglett, Thomas A. Liebsch, Thomas E. Musta
-
Publication number: 20090006873Abstract: An apparatus and method for controlling power usage in a computer includes a plurality of computers communicating with a local control device, and a power source supplying power to the local control device and the computer. A plurality of sensors communicate with the computer for ascertaining power usage of the computer, and a system control device communicates with the computer for controlling power usage of the computer.Type: ApplicationFiled: June 26, 2007Publication date: January 1, 2009Applicant: International Business Machines CorporationInventors: Ralph E. Bellofatto, Paul W. Coteus, Paul G. Crumley, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf Haring, Mark G. Megerian, Martin Ohmacht, Don D. Reed, Richard A. Swetz, Todd Takken
-
Publication number: 20090006894Abstract: An apparatus and method for evaluating a state of an electronic or integrated circuit (IC), each IC including one or more processor elements for controlling operations of IC sub-units, and each the IC supporting multiple frequency clock domains. The method comprises: generating a synchronized set of enable signals in correspondence with one or more IC sub-units for starting operation of one or more IC sub-units according to a determined timing configuration; counting, in response to one signal of the synchronized set of enable signals, a number of main processor IC clock cycles; and, upon attaining a desired clock cycle number, generating a stop signal for each unique frequency clock domain to synchronously stop a functional clock for each respective frequency clock domain; and, upon synchronously stopping all on-chip functional clocks on all frequency clock domains in a deterministic fashion, scanning out data values at a desired IC chip state.Type: ApplicationFiled: June 26, 2007Publication date: January 1, 2009Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Ralph E. Bellofatto, Matthew R. Ellavsky, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Lance G. Hehenberger, Martin Ohmacht