Patents by Inventor Mark E. Giampapa

Mark E. Giampapa has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Simplifying and speeding the management of intra-node cache coherence

Patent number: 8161248

Abstract: A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activated required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

Type: Grant

Filed: November 24, 2010

Date of Patent: April 17, 2012

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Phillip Heidelberger, Dirk Hoenicke, Martin Ohmacht
Method and apparatus to debug an integrated circuit chip via synchronous clock stop and scan

Patent number: 8140925

Abstract: An apparatus and method for evaluating a state of an electronic or integrated circuit (IC), each IC including one or more processor elements for controlling operations of IC sub-units, and each the IC supporting multiple frequency clock domains. The method comprises: generating a synchronized set of enable signals in correspondence with one or more IC sub-units for starting operation of one or more IC sub-units according to a determined timing configuration; counting, in response to one signal of the synchronized set of enable signals, a number of main processor IC clock cycles; and, upon attaining a desired clock cycle number, generating a stop signal for each unique frequency clock domain to synchronously stop a functional clock for each respective frequency clock domain; and, upon synchronously stopping all on-chip functional clocks on all frequency clock domains in a deterministic fashion, scanning out data values at a desired IC chip state.

Type: Grant

Filed: June 26, 2007

Date of Patent: March 20, 2012

Assignee: International Business Machines Corporation

Inventors: Ralph E. Bellofatto, Matthew R. Ellavsky, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Lance G. Hehenberger, Martin Ohmacht
Managing coherence via put/get windows

Patent number: 8122197

Abstract: A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activated required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

Type: Grant

Filed: August 19, 2009

Date of Patent: February 21, 2012

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht
Local rollback for fault-tolerance in parallel computing systems

Patent number: 8103910

Abstract: A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.

Type: Grant

Filed: January 29, 2010

Date of Patent: January 24, 2012

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Martin Ohmacht, Burkhard Steinmacher-Burow, Krishnan Sugavanam
Snoop filtering system in a multiprocessor system

Patent number: 8103836

Abstract: A system and method for supporting cache coherency in a computing environment having multiple processing units, each unit having an associated cache memory system operatively coupled therewith. The system includes a plurality of interconnected snoop filter units, each snoop filter unit corresponding to and in communication with a respective processing unit, with each snoop filter unit comprising a plurality of devices for receiving asynchronous snoop requests from respective memory writing sources in the computing environment; and a point-to-point interconnect comprising communication links for directly connecting memory writing sources to corresponding receiving devices; and, a plurality of parallel operating filter devices coupled in one-to-one correspondence with each receiving device for processing snoop requests received thereat and one of forwarding requests or preventing forwarding of requests to its associated processing unit.

Type: Grant

Filed: May 23, 2008

Date of Patent: January 24, 2012

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
Efficient implementation of multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

Patent number: 8095585

Abstract: The present in invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via “all-to-all” distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates eff

Type: Grant

Filed: October 31, 2007

Date of Patent: January 10, 2012

Assignee: International Business Machines Corporation

Inventors: Gyan V. Bhanot, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
Message passing with a limited number of DMA byte counters

Patent number: 8032892

Abstract: A method for passing messages in a parallel computer system constructed as a plurality of compute nodes interconnected as a network where each compute node includes a DMA engine but includes only a limited number of byte counters for tracking a number of bytes that are sent or received by the DMA engine, where the byte counters may be used in shared counter or exclusive counter modes of operation. The method includes using rendezvous protocol, a source compute node deterministically sending a request to send (RTS) message with a single RTS descriptor using an exclusive injection counter to track both the RTS message and message data to be sent in association with the RTS message, to a destination compute node such that the RTS descriptor indicates to the destination compute node that the message data will be adaptively routed to the destination node.

Type: Grant

Filed: June 26, 2007

Date of Patent: October 4, 2011

Assignee: International Business Machines Corporation

Inventors: Michael Blocksome, Dong Chen, Mark E. Giampapa, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker
MULTI-PETASCALE HIGHLY EFFICIENT PARALLEL SUPERCOMPUTER

Publication number: 20110219208

Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC).

Type: Application

Filed: January 10, 2011

Publication date: September 8, 2011

Applicant: International Business Machines Corporation

Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
COLLECTIVE NETWORK FOR COMPUTER STRUCTURES

Publication number: 20110219280

Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.

Type: Application

Filed: May 5, 2011

Publication date: September 8, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
SPECULATIVE THREAD EXECUTION WITH HARDWARE TRANSACTIONAL MEMORY

Publication number: 20110209155

Abstract: In an embodiment, if a self thread has more than one conflict, a transaction of the self thread is aborted and restarted. If the self thread has only one conflict and an enemy thread of the self thread has more than one conflict, the transaction of the self thread is committed. If the self thread only conflicts with the enemy thread and the enemy thread only conflicts with the self thread and the self thread has a key that has a higher priority than a key of the enemy thread, the transaction of the self thread is committed. If the self thread only conflicts with the enemy thread, the enemy thread only conflicts with the self thread, and the self thread has a key that has a lower priority than the key of the enemy thread, the transaction of the self thread is aborted.

Type: Application

Filed: February 24, 2010

Publication date: August 25, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Mark E. Giampapa, Thomas M. Gooding, Raul E. Silvera, Kai-Ting Amy Wang, Peng Wu, Xiaotong Zhuang
Collective network for computer structures

Patent number: 8001280

Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices ate included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.

Type: Grant

Filed: July 18, 2005

Date of Patent: August 16, 2011

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
Power throttling of collections of computing elements

Patent number: 8001401

Abstract: An apparatus and method for controlling power usage in a computer includes a plurality of computers communicating with a local control device, and a power source supplying power to the local control device and the computer. A plurality of sensors communicate with the computer for ascertaining power usage of the computer, and a system control device communicates with the computer for controlling power usage of the computer.

Type: Grant

Filed: June 26, 2007

Date of Patent: August 16, 2011

Assignee: International Business Machines Corporation

Inventors: Ralph E. Bellofatto, Paul W. Coteus, Paul G. Crumley, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Mark G. Megerian, Martin Ohmacht, Don D. Reed, Richard A. Swetz, Todd Takken
TLB EXCLUSION RANGE

Publication number: 20110173411

Abstract: A system and method for accessing memory are provided. The system comprises a lookup buffer for storing one or more page table entries, wherein each of the one or more page table entries comprises at least a virtual page number and a physical page number; a logic circuit for receiving a virtual address from said processor, said logic circuit for matching the virtual address to the virtual page number in one of the page table entries to select the physical page number in the same page table entry, said page table entry having one or more bits set to exclude a memory range from a page.

Type: Application

Filed: January 8, 2010

Publication date: July 14, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Jon K. Kriegel, Martin Ohmacht, Burkhard Steinmacher-Burow
LOCAL ROLLBACK FOR FAULT-TOLERANCE IN PARALLEL COMPUTING SYSTEMS

Publication number: 20110119526

Abstract: A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.

Type: Application

Filed: January 29, 2010

Publication date: May 19, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Martin Ohmacht, Burkhard Steinmacher-Burow, Krishnan Sugavanam
MANAGING COHERENCE VIA PUT/GET WINDOWS

Publication number: 20110072219

Abstract: A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activated required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

Type: Application

Filed: November 24, 2010

Publication date: March 24, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Phillip Heidelberger, Dirk Hoenicke, Martin Ohmacht
Optimized collectives using a DMA on a parallel computer

Patent number: 7886084

Abstract: Optimizing collective operations using direct memory access controller on a parallel computer, in one aspect, may comprise establishing a byte counter associated with a direct memory access controller for each submessage in a message. The byte counter includes at least a base address of memory and a byte count associated with a submessage. A byte counter associated with a submessage is monitored to determine whether at least a block of data of the submessage has been received. The block of data has a predetermined size, for example, a number of bytes. The block is processed when the block has been fully received, for example, when the byte count indicates all bytes of the block have been received. The monitoring and processing may continue for all blocks in all submessages in the message.

Type: Grant

Filed: June 26, 2007

Date of Patent: February 8, 2011

Assignee: International Business Machines Corporation

Inventors: Dong Chen, Dozsa Gabor, Mark E. Giampapa, Phillip Heidelberger
Managing coherence via put/get windows

Patent number: 7870343

Abstract: A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activated required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

Type: Grant

Filed: February 25, 2002

Date of Patent: January 11, 2011

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht
Low latency memory access and synchronization

Patent number: 7818514

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Bach processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed.

Type: Grant

Filed: August 22, 2008

Date of Patent: October 19, 2010

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
DMA engine for repeating communication patterns

Patent number: 7802025

Abstract: A parallel computer system is constructed as a network of interconnected compute nodes to operate a global message-passing application for performing communications across the network. Each of the compute nodes includes one or more individual processors with memories which run local instances of the global message-passing application operating at each compute node to carry out local processing operations independent of processing operations carried out at other compute nodes. Each compute node also includes a DMA engine constructed to interact with the application via Injection FIFO Metadata describing multiple Injection FIFOs where each Injection FIFO may containing an arbitrary number of message descriptors in order to process messages with a fixed processing overhead irrespective of the number of message descriptors included in the Injection FIFO.

Type: Grant

Filed: June 26, 2007

Date of Patent: September 21, 2010

Assignee: International Business Machines Corporation

Inventors: Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard Steinmacher-Burow, Pavlos Vranas
Configurable memory system and method for providing atomic counting operations in a memory device

Patent number: 7797503

Abstract: A memory system and method for providing atomic memory-based counter operations to operating systems and applications that make most efficient use of counter-backing memory and virtual and physical address space, while simplifying operating system memory management, and enabling the counter-backing memory to be used for purposes other than counter-backing storage when desired. The encoding and address decoding enabled by the invention provides all this functionality through a combination of software and hardware.

Type: Grant

Filed: June 26, 2007

Date of Patent: September 14, 2010

Assignee: International Business Machines Corporation

Inventors: Ralph E. Bellofatto, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht

prev 1 2 3 4 5 6 next