Patents by Inventor Sameer Kumar

Sameer Kumar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8745123
    Abstract: Completion processing of data communications instructions in a distributed computing environment, including receiving, in an active messaging interface (‘AMI’) data communications instructions, at least one instruction specifying a callback function; injecting into an injection FIFO buffer of a data communication adapter, an injection descriptor, each slot in the injection FIFO buffer having a corresponding slot in a pending callback list; listing in the pending callback list any callback function specified by an instruction, incrementing a pending callback counter for each listed callback function; transferring payload data as per each injection descriptor, incrementing a transfer counter upon completion of each transfer; determining from counter values whether the pending callback list presently includes callback functions whose data transfers have been completed; calling by the AMI any such callback functions from the pending callback list, decrementing the pending callback counter for each callback functi
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: June 3, 2014
    Assignee: International Business Machines Corporation
    Inventors: Michael A. Blocksome, Sameer Kumar, Jeffrey J. Parker
  • Patent number: 8732229
    Abstract: Completion processing of data communications instructions in a distributed computing environment, including receiving, in an active messaging interface (‘AMI’) data communications instructions, at least one instruction specifying a callback function; injecting into an injection FIFO buffer of a data communication adapter, an injection descriptor, each slot in the injection FIFO buffer having a corresponding slot in a pending callback list; listing in the pending callback list any callback function specified by an instruction, incrementing a pending callback counter for each listed callback function; transferring payload data as per each injection descriptor, incrementing a transfer counter upon completion of each transfer; determining from counter values whether the pending callback list presently includes callback functions whose data transfers have been completed; calling by the AMI any such callback functions from the pending callback list, decrementing the pending callback counter for each callback functi
    Type: Grant
    Filed: January 6, 2011
    Date of Patent: May 20, 2014
    Assignee: International Business Machines Corporation
    Inventors: Michael A. Blocksome, Sameer Kumar, Jeffrey J. Parker
  • Patent number: 8655962
    Abstract: A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.
    Type: Grant
    Filed: September 28, 2009
    Date of Patent: February 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Michael Blocksome, Gabor Dozsa, Thomas M. Gooding, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Douglas Miller
  • Publication number: 20130346997
    Abstract: A system and method for enhancing barrier collective synchronization on a computer system comprises a computer system including a data storage device. The computer system includes a program stored in the data storage device and steps of the program being executed by a processor. The system includes providing a plurality of communicators for storing state information for a barrier algorithm. Each communicator designates a master core in a multi-processor environment of the computer system. The system allocates or designates one counter for each of a plurality of threads. The system configures a table with a number of entries equal to the maximum number of threads. The system sets a table entry with an ID associated with a communicator when a process thread initiates a collective. The system determines an allocated or designated counter by searching entries in the table.
    Type: Application
    Filed: August 30, 2013
    Publication date: December 26, 2013
    Applicant: International Business Machines
    Inventors: Michael Blocksome, Sameer Kumar, Amith R. Mamidala, Douglas Miller, Joseph D. Ratterman
  • Publication number: 20130339506
    Abstract: Methods and arrangements for performing synchronized collective operations. Communication calls are accepted from at least two distinct processor groups. Edge disjoint spanning paths are created over a collective comprising the processor groups, and the spanning paths are assigned to the processor groups to facilitate communication within each processor group.
    Type: Application
    Filed: August 31, 2012
    Publication date: December 19, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Thomas George, Nikhil Jain, Sameer Kumar, Anshul Mittal, Yogish Sabharwal
  • Publication number: 20130339499
    Abstract: Methods and arrangements for performing synchronized collective operations. Communication calls are accepted from at least two distinct processor groups. Edge disjoint spanning paths are created over a collective comprising the processor groups, and the spanning paths are assigned to the processor groups to facilitate communication within each processor group.
    Type: Application
    Filed: June 13, 2012
    Publication date: December 19, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Thomas George, Nikhil Jain, Sameer Kumar, Anshul Mittal, Yogish Sabharwal
  • Publication number: 20130312010
    Abstract: Processing posted receive commands in a parallel computer, including: posting, by a parallel process of a compute node, a receive command, the receive command including a set of parameters excluding the receive command from being directed among parallel posted receive queues; flattening the parallel unexpected message queues into a single unexpected message queue; determining whether the posted receive command is satisfied by an entry in the single unexpected message queue; if the posted receive command is satisfied by an entry in the single unexpected message queue, processing the posted receive command; if the posted receive command is not satisfied by an entry in the single unexpected message queue: flattening the parallel posted receive queues into a single posted receive queue; and storing the posted receive command in the single posted receive queue.
    Type: Application
    Filed: May 21, 2012
    Publication date: November 21, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20130312011
    Abstract: Processing posted receive commands in a parallel computer, including: posting, by a parallel process of a compute node, a receive command, the receive command including a set of parameters excluding the receive command from being directed among parallel posted receive queues; flattening the parallel unexpected message queues into a single unexpected message queue; determining whether the posted receive command is satisfied by an entry in the single unexpected message queue; if the posted receive command is satisfied by an entry in the single unexpected message queue, processing the posted receive command; if the posted receive command is not satisfied by an entry in the single unexpected message queue: flattening the parallel posted receive queues into a single posted receive queue; and storing the posted receive command in the single posted receive queue.
    Type: Application
    Filed: November 16, 2012
    Publication date: November 21, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: SAMEER KUMAR, AMITH R. MAMIDALA, JOSEPH D. RATTERMAN, BRIAN E. SMITH
  • Patent number: 8543722
    Abstract: In an embodiment, a send thread receives an identifier that identifies a destination node and a pointer to data. The send thread creates a first send request in response to the receipt of the identifier and the data pointer. The send thread selects a selected channel from among a plurality of channels. The selected channel comprises a selected hand-off queue and an identification of a selected message unit. Each of the channels identifies a different message unit. The selected hand-off queue is randomly accessible. If the selected hand-off queue contains an available entry, the send thread adds the first send request to the selected hand-off queue. If the selected hand-off queue does not contain an available entry, the send thread removes a second send request from the selected hand-off queue and sends the second send request to the selected message unit.
    Type: Grant
    Filed: March 30, 2010
    Date of Patent: September 24, 2013
    Assignee: International Business Machines Corporation
    Inventors: Gabor J. Dozsa, Philip Heidelberger, Sameer Kumar, Joseph D. Ratterman, Burkhard Steinmacher-Burow, Robert W. Wisniewski
  • Patent number: 8527740
    Abstract: A system and method for enhancing barrier collective synchronization on a computer system comprises a computer system including a data storage device. The computer system includes a program stored in the data storage device and steps of the program being executed by a processor. The system includes providing a plurality of communicators for storing state information for a bather algorithm. Each communicator designates a master core in a multi-processor environment of the computer system. The system allocates or designates one counter for each of a plurality of threads. The system configures a table with a number of entries equal to the maximum number of threads. The system sets a table entry with an ID associated with a communicator when a process thread initiates a collective. The system determines an allocated or designated counter by searching entries in the table.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: September 3, 2013
    Assignee: International Business Machines Corporation
    Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Michael Blocksome, Douglas Miller
  • Patent number: 8407376
    Abstract: A parallel computer system includes a plurality of compute nodes. Each of the compute nodes includes at least one processor, at least one memory, and a direct memory address engine coupled to the at least one processor and the at least one memory. The system also includes a network interconnecting the plurality of compute nodes. The network operates a global message-passing application for performing communications across the network. Local instances of the global message-passing application operate at each of the compute nodes to carry out local processing operations independent of processing operations carried out at another one of the compute nodes. The direct memory address engines are configured to interact with the local instances of the global message-passing application via injection FIFO metadata describing an injection FIFO in a corresponding one of the memories.
    Type: Grant
    Filed: July 10, 2009
    Date of Patent: March 26, 2013
    Assignee: International Business Machines Corporation
    Inventors: Philip Heidelberger, Sameer Kumar
  • Patent number: 8381230
    Abstract: In an embodiment, a reception thread receives a source node identifier, a type, and a data pointer from an application and, in response, creates a receive request. If the source node identifier specifies a source node, the reception thread adds the receive request to a fast-post queue. If a message received from a network does not match a receive request on a posted queue, a polling thread adds a receive request that represents the message to an unexpected queue. If the fast-post queue contains the receive request, the polling thread removes the receive request from the fast-post queue. If the receive request that was removed from the fast-post queue does not match the receive request on the unexpected queue, the polling thread adds the receive request that was removed from the fast-post queue to the posted queue. The reception thread and the polling thread execute asynchronously from each other.
    Type: Grant
    Filed: April 21, 2010
    Date of Patent: February 19, 2013
    Inventors: Gabor J. Dozsa, Philip Heidelberger, Sameer Kumar, Joseph D. Ratterman, Burkhard Steinmacher-Burow
  • Patent number: 8359404
    Abstract: A system for routing data in a network comprising a network logic device at a sending node for determining a path between the sending node and a receiving node, wherein the network logic device sets one or more selection bits and one or more hint bits within the data packet, a control register for storing one or more masks, wherein the network logic device uses the one or more selection bits to select a mask from the control register and the network logic device applies the selected mask to the hint bits to restrict routing of the data packet to one or more routing directions for the data packet within the network and selects one of the restricted routing directions from the one or more routing directions and sends the data packet along a link in the selected routing direction toward the receiving node.
    Type: Grant
    Filed: January 8, 2010
    Date of Patent: January 22, 2013
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Philip Heidelberger, Sameer Kumar
  • Publication number: 20120179736
    Abstract: Completion processing of data communications instructions in a distributed computing environment, including receiving, in an active messaging interface (AMI) data communications instructions, at least one instruction specifying a callback function; injecting into an injection FIFO buffer of a data communication adapter, an injection descriptor, each slot in the injection FIFO buffer having a corresponding slot in a pending callback list; listing in the pending callback list any callback function specified by an instruction, incrementing a pending callback counter for each listed callback function; transferring payload data as per each injection descriptor, incrementing a transfer counter upon completion of each transfer; determining from counter values whether the pending callback list presently includes callback functions whose data transfers have been completed; calling by the AMI any such callback functions from the pending callback list, decrementing the pending callback counter for each callback function
    Type: Application
    Filed: January 6, 2011
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael A. Blocksome, Sameer Kumar, Jeffrey J. Parker
  • Publication number: 20120179760
    Abstract: Completion processing of data communications instructions in a distributed computing environment with computers coupled for data communications through communications adapters and an active messaging interface (‘AMI’), injecting for data communications instructions into slots in an injection FIFO buffer a transfer descriptor, at least some of the instructions specifying callback functions; injecting a completion descriptor for each instruction that specifies a callback function into an injection FIFO buffer slot having a corresponding slot in a pending callback list; listing in the pending callback list callback functions specified by data communications instructions; processing each descriptor in the injection FIFO buffer, setting a bit in a completion bit mask corresponding to the slot in the FIFO where the completion descriptor was injected; and calling by the AMI any callback functions in the pending callback list as indicated by set bits in the completion bit mask.
    Type: Application
    Filed: January 6, 2011
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael A. Blocksome, Sameer Kumar, Jeffrey J. Parker
  • Publication number: 20110265098
    Abstract: In an embodiment, a reception thread receives a source node identifier, a type, and a data pointer from an application and, in response, creates a receive request. If the source node identifier specifies a source node, the reception thread adds the receive request to a fast-post queue. If a message received from a network does not match a receive request on a posted queue, a polling thread adds a receive request that represents the message to an unexpected queue. If the fast-post queue contains the receive request, the polling thread removes the receive request from the fast-post queue. If the receive request that was removed from the fast-post queue does not match the receive request on the unexpected queue, the polling thread adds the receive request that was removed from the fast-post queue to the posted queue. The reception thread and the polling thread execute asynchronously from each other.
    Type: Application
    Filed: April 21, 2010
    Publication date: October 27, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gabor J. Dozsa, Philip Heidelberger, Sameer Kumar, Joseph D. Ratterman, Burkhard Steinmacher-Burow
  • Patent number: 8037213
    Abstract: Methods, apparatus, and products are disclosed for replenishing data descriptors in a Direct Memory Access (‘DMA’) injection first-in-first-out (‘FIFO’) buffer that include: determining, by a messaging module on an origin compute node, whether a number of data descriptors in a DMA injection FIFO buffer exceeds a predetermined threshold, each data descriptor specifying an application message for transmission to a target compute node; queuing, by the messaging module, a plurality of new data descriptors in a pending descriptor queue if the number of the data descriptors in the DMA injection FIFO buffer exceeds the predetermined threshold; establishing, by the messaging module, interrupt criteria that specify when to replenish the injection FIFO buffer with the plurality of new data descriptors in the pending descriptor queue; and injecting, by the messaging module, the plurality of new data descriptors into the injection FIFO buffer in dependence upon the interrupt criteria.
    Type: Grant
    Filed: May 30, 2007
    Date of Patent: October 11, 2011
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Bob R. Cernohous, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker
  • Publication number: 20110246582
    Abstract: In an embodiment, a send thread receives an identifier that identifies a destination node and a pointer to data. The send thread creates a first send request in response to the receipt of the identifier and the data pointer. The send thread selects a selected channel from among a plurality of channels. The selected channel comprises a selected hand-off queue and an identification of a selected message unit. Each of the channels identifies a different message unit. The selected hand-off queue is randomly accessible. If the selected hand-off queue contains an available entry, the send thread adds the first send request to the selected hand-off queue. If the selected hand-off queue does not contain an available entry, the send thread removes a second send request from the selected hand-off queue and sends the second send request to the selected message unit.
    Type: Application
    Filed: March 30, 2010
    Publication date: October 6, 2011
    Applicant: International Business Machines Corporation
    Inventors: Gabor J. Dozsa, Philip Heidelberger, Sameer Kumar, Joseph D. Ratterman, Burkhard Steinmacher-Burow, Robert W. Wisniewski
  • Patent number: 8032892
    Abstract: A method for passing messages in a parallel computer system constructed as a plurality of compute nodes interconnected as a network where each compute node includes a DMA engine but includes only a limited number of byte counters for tracking a number of bytes that are sent or received by the DMA engine, where the byte counters may be used in shared counter or exclusive counter modes of operation. The method includes using rendezvous protocol, a source compute node deterministically sending a request to send (RTS) message with a single RTS descriptor using an exclusive injection counter to track both the RTS message and message data to be sent in association with the RTS message, to a destination compute node such that the RTS descriptor indicates to the destination compute node that the message data will be adaptively routed to the destination node.
    Type: Grant
    Filed: June 26, 2007
    Date of Patent: October 4, 2011
    Assignee: International Business Machines Corporation
    Inventors: Michael Blocksome, Dong Chen, Mark E. Giampapa, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker
  • Publication number: 20110219208
    Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC).
    Type: Application
    Filed: January 10, 2011
    Publication date: September 8, 2011
    Applicant: International Business Machines Corporation
    Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu