Patents by Inventor Amith R. Mamidala

Amith R. Mamidala has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PROCESSING POSTED RECEIVE COMMANDS IN A PARALLEL COMPUTER

Publication number: 20130312011

Abstract: Processing posted receive commands in a parallel computer, including: posting, by a parallel process of a compute node, a receive command, the receive command including a set of parameters excluding the receive command from being directed among parallel posted receive queues; flattening the parallel unexpected message queues into a single unexpected message queue; determining whether the posted receive command is satisfied by an entry in the single unexpected message queue; if the posted receive command is satisfied by an entry in the single unexpected message queue, processing the posted receive command; if the posted receive command is not satisfied by an entry in the single unexpected message queue: flattening the parallel posted receive queues into a single posted receive queue; and storing the posted receive command in the single posted receive queue.

Type: Application

Filed: November 16, 2012

Publication date: November 21, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: SAMEER KUMAR, AMITH R. MAMIDALA, JOSEPH D. RATTERMAN, BRIAN E. SMITH
Fencing direct memory access data transfers in a parallel active messaging interface of a parallel computer

Patent number: 8527672

Abstract: Fencing direct memory access (‘DMA’) data transfers in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two en

Type: Grant

Filed: November 5, 2010

Date of Patent: September 3, 2013

Assignee: International Business Machines Corporation

Inventors: Michael A. Blocksome, Amith R. Mamidala
Mechanism of supporting sub-communicator collectives with O(64) counters as opposed to one counter for each sub-communicator

Patent number: 8527740

Abstract: A system and method for enhancing barrier collective synchronization on a computer system comprises a computer system including a data storage device. The computer system includes a program stored in the data storage device and steps of the program being executed by a processor. The system includes providing a plurality of communicators for storing state information for a bather algorithm. Each communicator designates a master core in a multi-processor environment of the computer system. The system allocates or designates one counter for each of a plurality of threads. The system configures a table with a number of entries equal to the maximum number of threads. The system sets a table entry with an ID associated with a communicator when a process thread initiates a collective. The system determines an allocated or designated counter by searching entries in the table.

Type: Grant

Filed: January 29, 2010

Date of Patent: September 3, 2013

Assignee: International Business Machines Corporation

Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Michael Blocksome, Douglas Miller
MECHANISMS FOR EFFICIENT INTRA-DIE/INTRA-CHIP COLLECTIVE MESSAGING

Publication number: 20130007378

Abstract: Mechanism of efficient intra-die collective processing across the nodelets with separate shared memory coherency domains is provided. An integrated circuit die may include a hardware collective unit implemented on the integrated circuit die. A plurality of cores on the integrated circuit die is grouped into a plurality of shared memory coherence domains. Each of the plurality of shared memory coherence domains is connected to the collective unit for performing collective operations between the plurality of shared memory coherence domains.

Type: Application

Filed: September 12, 2012

Publication date: January 3, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Amith R. Mamidala, Valentina Salapura, Robert W. Wisniewski
MECHANISMS FOR EFFICIENT INTRA-DIE/INTRA-CHIP COLLECTIVE MESSAGING

Publication number: 20120179879

Abstract: Mechanism of efficient intra-die collective processing across the nodelets with separate shared memory coherency domains is provided. An integrated circuit die may include a hardware collective unit implemented on the integrated circuit die. A plurality of cores on the integrated circuit die is grouped into a plurality of shared memory coherence domains. Each of the plurality of shared memory coherence domains is connected to the collective unit for performing collective operations between the plurality of shared memory coherence domains.

Type: Application

Filed: January 7, 2011

Publication date: July 12, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Amith R. Mamidala, Valentina Salapura, Robert W. Wisniewski
Fencing Direct Memory Access Data Transfers In A Parallel Active Messaging Interface Of A Parallel Computer

Publication number: 20120117281

Abstract: Fencing direct memory access (‘DMA’) data transfers in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two en

Type: Application

Filed: November 5, 2010

Publication date: May 10, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael A. Blocksome, Amith R. Mamidala
Fencing Data Transfers In A Parallel Active Messaging Interface Of A Parallel Computer

Publication number: 20120117137

Abstract: Fencing data transfers in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI including data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task; the compute nodes coupled for data communications through the PAMI and through data communications resources including at least one segment of shared random access memory; including initiating execution through the PAMI of an ordered sequence of active SEND instructions for SEND data transfers between two endpoints, effecting deterministic SEND data transfers through a segment of shared memory; and executing through the PAMI, with no FENCE accounting for SEND data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all SEND instructions initiated prior to execution of the FENCE instruction for SEND data transfers between the two endp

Type: Application

Filed: November 5, 2010

Publication date: May 10, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael A. Blocksome, Amith R. Mamidala
Fencing Data Transfers In A Parallel Active Messaging Interface Of A Parallel Computer

Publication number: 20120117211

Abstract: Fencing data transfers in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI including data communications endpoints, each endpoint comprising a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI and through data communications resources including a deterministic data communications network, including initiating execution through the PAMI of an ordered sequence of active SEND instructions for SEND data transfers between two endpoints, effecting deterministic SEND data transfers; and executing through the PAMI, with no FENCE accounting for SEND data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all SEND instructions initiated prior to execution of the FENCE instruction for SEND data transfers between the two endpoints.

Type: Application

Filed: November 5, 2010

Publication date: May 10, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael A. Blocksome, Amith R. Mamidala
Fencing Network Direct Memory Access Data Transfers In A Parallel Active Messaging Interface Of A Parallel Computer

Publication number: 20120117138

Abstract: Fencing direct memory access (‘DMA’) data transfers in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to a deterministic data communications network through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and the deterministic data communications network; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data trans

Type: Application

Filed: November 5, 2010

Publication date: May 10, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael A. Blocksome, Amith R. Mamidala
MULTI-PETASCALE HIGHLY EFFICIENT PARALLEL SUPERCOMPUTER

Publication number: 20110219208

Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC).

Type: Application

Filed: January 10, 2011

Publication date: September 8, 2011

Applicant: International Business Machines Corporation

Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
MECHANISM OF SUPPORTING SUB-COMMUNICATOR COLLECTIVES WITH O(64) COUNTERS AS OPPOSED TO ONE COUNTER FOR EACH SUB-COMMUNICATOR

Publication number: 20110119468

Abstract: A system and method for enhancing barrier collective synchronization on a computer system comprises a computer system including a data storage device. The computer system includes a program stored in the data storage device and steps of the program being executed by a processor. The system includes providing a plurality of communicators for storing state information for a bather algorithm. Each communicator designates a master core in a multi-processor environment of the computer system. The system allocates or designates one counter for each of a plurality of threads. The system configures a table with a number of entries equal to the maximum number of threads. The system sets a table entry with an ID associated with a communicator when a process thread initiates a collective. The system determines an allocated or designated counter by searching entries in the table.

Type: Application

Filed: January 29, 2010

Publication date: May 19, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Michael Blocksome, Douglas Miller
SHARED ADDRESS COLLECTIVES USING COUNTER MECHANISMS

Publication number: 20110078249

Abstract: A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.

Type: Application

Filed: September 28, 2009

Publication date: March 31, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael Blocksome, Gabor Dozsa, Thomas M. Gooding, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Douglas Miller

prev 1 2