Patents by Inventor Sameer Kumar

Sameer Kumar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Message passing with a limited number of DMA byte counters

Patent number: 8032892

Abstract: A method for passing messages in a parallel computer system constructed as a plurality of compute nodes interconnected as a network where each compute node includes a DMA engine but includes only a limited number of byte counters for tracking a number of bytes that are sent or received by the DMA engine, where the byte counters may be used in shared counter or exclusive counter modes of operation. The method includes using rendezvous protocol, a source compute node deterministically sending a request to send (RTS) message with a single RTS descriptor using an exclusive injection counter to track both the RTS message and message data to be sent in association with the RTS message, to a destination compute node such that the RTS descriptor indicates to the destination compute node that the message data will be adaptively routed to the destination node.

Type: Grant

Filed: June 26, 2007

Date of Patent: October 4, 2011

Assignee: International Business Machines Corporation

Inventors: Michael Blocksome, Dong Chen, Mark E. Giampapa, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker

MULTI-PETASCALE HIGHLY EFFICIENT PARALLEL SUPERCOMPUTER

Publication number: 20110219208

Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC).

Type: Application

Filed: January 10, 2011

Publication date: September 8, 2011

Applicant: International Business Machines Corporation

Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

STORE-OPERATE-COHERENCE-ON-VALUE

Publication number: 20110179229

Abstract: A system, method and computer program product for performing various store-operate instructions in a parallel computing environment that includes a plurality of processors and at least one cache memory device. A queue in the system receives, from a processor, a store-operate instruction that specifies under which condition a cache coherence operation is to be invoked. A hardware unit in the system runs the received store-operate instruction. The hardware unit evaluates whether a result of the running the received store-operate instruction satisfies the condition. The hardware unit invokes a cache coherence operation on a cache memory address associated with the received store-operate instruction if the result satisfies the condition. Otherwise, the hardware unit does not invoke the cache coherence operation on the cache memory device.

Type: Application

Filed: January 7, 2011

Publication date: July 21, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dong Chen, Philip Heidelberger, Sameer Kumar, Martin Ohmacht, Burkhard Steinmacher-Burow

Mechanism to support generic collective communication across a variety of programming models

Patent number: 7984448

Abstract: A system and method for supporting collective communications on a plurality of processors that use different parallel programming paradigms, in one aspect, may comprise a schedule defining one or more tasks in a collective operation, an executor that executes the task, a multisend module to perform one or more data transfer functions associated with the tasks, and a connection manager that controls one or more connections and identifies an available connection. The multisend module uses the available connection in performing the one or more data transfer functions. A plurality of processors that use different parallel programming paradigms can use a common implementation of the schedule module, the executor module, the connection manager and the multisend module via a language adaptor specific to a parallel programming paradigm implemented on a processor.

Type: Grant

Filed: June 26, 2007

Date of Patent: July 19, 2011

Assignee: International Business Machines Corporation

Inventors: Gheorghe Almasi, Gabor Dozsa, Sameer Kumar

ZONE ROUTING IN A TORUS NETWORK

Publication number: 20110173343

Abstract: A system for routing data in a network comprising a network logic device at a sending node for determining a path between the sending node and a receiving node, wherein the network logic device sets one or more selection bits and one or more hint bits within the data packet, a control register for storing one or more masks, wherein the network logic device uses the one or more selection bits to select a mask from the control register and the network logic device applies the selected mask to the hint bits to restrict routing of the data packet to one or more routing directions for the data packet within the network and selects one of the restricted routing directions from the one or more routing directions and sends the data packet along a link in the selected routing direction toward the receiving node.

Type: Application

Filed: January 8, 2010

Publication date: July 14, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dong Chen, Philip Heidelberger, Sameer Kumar

Heuristic status polling

Patent number: 7958274

Abstract: Methods, compute nodes, and computer program products are provided for heuristic status polling of a component in a computing system. Embodiments include receiving, by a polling module from a requesting application, a status request requesting status of a component; determining, by the polling module, whether an activity history for the component satisfies heuristic polling criteria; polling, by the polling module, the component for status if the activity history for the component satisfies the heuristic polling criteria; and not polling, by the polling module, the component for status if the activity history for the component does not satisfy the heuristic criteria.

Type: Grant

Filed: June 18, 2007

Date of Patent: June 7, 2011

Assignee: International Business Machines Corporation

Inventors: Charles J. Archer, Michael A. Blocksome, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker, Joseph D. Ratterman
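
The poll-or-skip decision described in the abstract above can be sketched roughly as follows. The abstract does not say what the heuristic polling criteria are, so the rule used here (poll while the history is short, or while at least one of the last few polls found activity) is an assumption, as are the class, method and parameter names.

```python
class PollingModule:
    """Polls a component only when its activity history suggests it is worthwhile."""

    def __init__(self, poll_fn, window=4, min_recent_activity=1):
        self.poll_fn = poll_fn                      # callable that actually polls the component
        self.window = window                        # how many past polls to consider
        self.min_recent_activity = min_recent_activity
        self.history = []                           # True where a past poll found activity

    def satisfies_criteria(self):
        """Assumed heuristic: short history, or enough recent activity."""
        recent = self.history[-self.window:]
        return len(recent) < self.window or sum(recent) >= self.min_recent_activity

    def status(self):
        """Handle a status request: poll if the criteria hold, otherwise skip."""
        if not self.satisfies_criteria():
            return None                             # poll skipped
        active = bool(self.poll_fn())
        self.history.append(active)
        return active
```

With this particular rule a component that stays idle for a full window is never polled again, so a practical version would also need some way to resume polling. The abstract does not describe one, and none is sketched here.
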

MECHANISM OF SUPPORTING SUB-COMMUNICATOR COLLECTIVES WITH O(64) COUNTERS AS OPPOSED TO ONE COUNTER FOR EACH SUB-COMMUNICATOR

Publication number: 20110119468

Abstract: A system and method for enhancing barrier collective synchronization on a computer system comprises a computer system including a data storage device. The computer system includes a program stored in the data storage device and steps of the program being executed by a processor. The system includes providing a plurality of communicators for storing state information for a barrier algorithm. Each communicator designates a master core in a multi-processor environment of the computer system. The system allocates or designates one counter for each of a plurality of threads. The system configures a table with a number of entries equal to the maximum number of threads. The system sets a table entry with an ID associated with a communicator when a process thread initiates a collective. The system determines an allocated or designated counter by searching entries in the table.

Type: Application

Filed: January 29, 2010

Publication date: May 19, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Michael Blocksome, Douglas Miller

SHARED ADDRESS COLLECTIVES USING COUNTER MECHANISMS

Publication number: 20110078249

Abstract: A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.

Type: Application

Filed: September 28, 2009

Publication date: March 31, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael Blocksome, Gabor Dozsa, Thomas M. Gooding, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Douglas Miller

FAST CONCURRENT ARRAY-BASED STACKS, QUEUES AND DEQUES USING FETCH-AND-INCREMENT-BOUNDED AND A TICKET LOCK PER ELEMENT

Publication number: 20110072241

Abstract: Implementation primitives for concurrent array-based stacks, queues, double-ended queues (deques) and wrapped deques are provided. In one aspect, each element of the stack, queue, deque or wrapped deque data structure has its own ticket lock, allowing multiple threads to concurrently use multiple elements of the data structure and thus achieving high performance. In another aspect, new synchronization primitives FetchAndIncrementBounded (Counter, Bound) and FetchAndDecrementBounded (Counter, Bound) are implemented. These primitives can be implemented in hardware and thus promise a very fast throughput for queues, stacks and double-ended queues.

Type: Application

Filed: September 22, 2009

Publication date: March 24, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dong Chen, Alana Gara, Philip Heidelberger, Sameer Kumar, Martin Ohmacht, Burkhard Steinmacher-Burow, Robert Wisniewski
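
The abstract above names the primitives FetchAndIncrementBounded and FetchAndDecrementBounded but does not spell out their semantics. The sketch below assumes one plausible reading: the counter is atomically incremented only while it is below the bound (or decremented only while above it), the previous value is returned on success, and None is returned when the bound has been reached. A software lock stands in for the hardware atomicity the abstract mentions, and the class name is hypothetical.

```python
import threading


class BoundedCounter:
    """Software emulation of bounded fetch-and-increment/decrement (assumed semantics)."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()   # stands in for a hardware atomic operation

    def fetch_and_increment_bounded(self, bound):
        """Return the old value and add 1 if value < bound, else return None."""
        with self._lock:
            if self._value >= bound:
                return None
            old = self._value
            self._value += 1
            return old

    def fetch_and_decrement_bounded(self, bound):
        """Return the old value and subtract 1 if value > bound, else return None."""
        with self._lock:
            if self._value <= bound:
                return None
            old = self._value
            self._value -= 1
            return old
```

Under this reading, a thread pushing onto an array-based stack of capacity N could call fetch_and_increment_bounded(N) to reserve a slot index and treat None as "stack full". The per-element ticket locks described in the abstract are not modelled here.
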

N2 based plasma treatment for enhanced sidewall smoothing and pore sealing of porous low-k dielectric films

Patent number: 7910936

Abstract: A method of forming a semiconductor device including forming a low-k dielectric material over a substrate, depositing a liner on a portion of the low-k dielectric material, and exposing the liner to a plasma. The method also includes depositing a layer over the liner.

Type: Grant

Filed: December 9, 2008

Date of Patent: March 22, 2011

Assignee: Texas Instruments Incorporated

Inventors: Sameer Kumar Ajmera, Patricia Beauregard Smith, Changming Jin

Method for route optimization with dual mobile IPv4 node in IPv6-only network

Patent number: 7899055

Abstract: A method for route optimization with a dual mobile IPv4 node in an IPv6-only network is provided. The method includes the operations of: receiving a visited IPv6 address from a router when the dual mobile node is connected to the IPv6-only network; updating a home agent with the IPv6 address; deregistering a binding update with a correspondent node via the home agent; updating the correspondent node with an IPv6 address; checking the reachability of packets directly to the correspondent node using its IPv6 address; the mobile node starting sending, to the CN, data packets tunneled in an IPv6 packet once the reachability is verified; and the correspondent node sending tunneled data packets directly to an IPv6 address of the mobile node.

Type: Grant

Filed: December 27, 2006

Date of Patent: March 1, 2011

Assignee: Samsung Electronics Co., Ltd.

Inventors: Kishore Mundra, Lakshmi Praba Gurusamy, Sameer Kumar, Ranjitsinh Udaysinh Wable

Recording A Communication Pattern and Replaying Messages in a Parallel Computing System

Publication number: 20110010471

Abstract: A parallel computer system includes a plurality of compute nodes. Each of the compute nodes includes at least one processor, at least one memory, and a direct memory address engine coupled to the at least one processor and the at least one memory. The system also includes a network interconnecting the plurality of compute nodes. The network operates a global message-passing application for performing communications across the network. Local instances of the global message-passing application operate at each of the compute nodes to carry out local processing operations independent of processing operations carried out at another one of the compute nodes. The direct memory address engines are configured to interact with the local instances of the global message-passing application via injection FIFO metadata describing an injection FIFO in a corresponding one of the memories.

Type: Application

Filed: July 10, 2009

Publication date: January 13, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Philip Heidelberger, Sameer Kumar

Replenishing Data Descriptors in a DMA Injection FIFO Buffer

Publication number: 20100268852

Abstract: Methods, apparatus, and products are disclosed for replenishing data descriptors in a Direct Memory Access (‘DMA’) injection first-in-first-out (‘FIFO’) buffer that include: determining, by a messaging module on an origin compute node, whether a number of data descriptors in a DMA injection FIFO buffer exceeds a predetermined threshold, each data descriptor specifying an application message for transmission to a target compute node; queuing, by the messaging module, a plurality of new data descriptors in a pending descriptor queue if the number of the data descriptors in the DMA injection FIFO buffer exceeds the predetermined threshold; establishing, by the messaging module, interrupt criteria that specify when to replenish the injection FIFO buffer with the plurality of new data descriptors in the pending descriptor queue; and injecting, by the messaging module, the plurality of new data descriptors into the injection FIFO buffer in dependence upon the interrupt criteria.

Type: Application

Filed: May 30, 2007

Publication date: October 21, 2010

Inventors: Charles J Archer, Michael A. Blocksome, Bob R. Cernohous, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker
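
The threshold-and-replenish flow in the abstract above can be sketched as a small simulation. The abstract does not define the interrupt criteria, so this sketch assumes that the FIFO is replenished once it drains to the threshold or below. The class and method names are hypothetical, and a real DMA engine would consume descriptors in hardware rather than through a method call.

```python
from collections import deque


class InjectionFifo:
    """Toy model of a DMA injection FIFO with a pending descriptor queue."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.fifo = deque()      # descriptors visible to the (simulated) DMA engine
        self.pending = deque()   # descriptors waiting to be injected later

    def submit(self, descriptors):
        """Inject directly, or queue as pending if the FIFO exceeds the threshold."""
        if len(self.fifo) > self.threshold:
            self.pending.extend(descriptors)
        else:
            self.fifo.extend(descriptors)

    def transmit_one(self):
        """Simulate the DMA engine consuming one descriptor, then check the criteria."""
        desc = self.fifo.popleft() if self.fifo else None
        # Assumed interrupt criterion: FIFO occupancy is at or below the threshold.
        if len(self.fifo) <= self.threshold and self.pending:
            self.fifo.extend(self.pending)
            self.pending.clear()
        return desc
```

For example, with a threshold of 2, submitting three descriptors and then two more leaves the second batch in the pending queue until one descriptor has been transmitted.
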

Method of fabrication of on-chip heat pipes and ancillary heat transfer components

Patent number: 7781884

Abstract: The density of components in integrated circuits (ICs) is increasing with time. The density of heat generated by the components is similarly increasing. Maintaining the temperature of the components at reliable operating levels requires increased thermal transfer rates from the components to the IC package exterior. Dielectric materials used in interconnect regions have lower thermal conductivity than silicon dioxide. This invention comprises a heat pipe located in the interconnect region of an IC to transfer heat generated by components in the IC substrate to metal plugs located on the top surface of the IC, where the heat is easily conducted to the exterior of the IC package. Refinements such as a wicking liner or reticulated inner surface will increase the thermal transfer efficiency of the heat pipe. Strengthening elements in the interior of the heat pipe will provide robustness to mechanical stress during IC manufacture.

Type: Grant

Filed: September 28, 2007

Date of Patent: August 24, 2010

Assignee: Texas Instruments Incorporated

Inventors: Sameer Kumar Ajmera, Phillip D. Matz, Stephan Grunow, Satyavolu Srinivas Papa Rao

Direct memory access transfer completion notification

Patent number: 7765337

Abstract: Methods, compute nodes, and computer program products are provided for direct memory access (‘DMA’) transfer completion notification. Embodiments include determining, by an origin DMA engine on an origin compute node, whether a data descriptor for an application message to be sent to a target compute node is currently in an injection first-in-first-out (‘FIFO’) buffer in dependence upon a sequence number previously associated with the data descriptor, the total number of descriptors currently in the injection FIFO buffer, and the current sequence number for the newest data descriptor stored in the injection FIFO buffer; and notifying a processor core on the origin DMA engine that the message has been sent if the data descriptor for the message is not currently in the injection FIFO buffer.

Type: Grant

Filed: June 5, 2007

Date of Patent: July 27, 2010

Assignee: International Business Machines Corporation

Inventors: Dong Chen, Mark E. Giampapa, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker, Burkhard D. Steinmacher-Burow, Pavlos Vranas
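
The membership test in the abstract above uses three quantities: the sequence number assigned to a descriptor, the number of descriptors currently in the injection FIFO, and the sequence number of the newest descriptor. One way to combine them is sketched below, assuming that sequence numbers increase by one per descriptor, never wrap around, and that descriptors leave the FIFO in order. The function and parameter names are hypothetical.

```python
def descriptor_in_fifo(seq, count_in_fifo, newest_seq):
    """Return True if the descriptor numbered `seq` is still in the injection FIFO.

    Assumes the FIFO holds the `count_in_fifo` most recent descriptors, i.e.
    sequence numbers newest_seq - count_in_fifo + 1 through newest_seq.
    """
    oldest_seq = newest_seq - count_in_fifo + 1
    return count_in_fifo > 0 and oldest_seq <= seq <= newest_seq


def message_sent(seq, count_in_fifo, newest_seq):
    """A message counts as sent once its descriptor has left the FIFO.

    Only meaningful for a `seq` that was already assigned (seq <= newest_seq).
    """
    return not descriptor_in_fifo(seq, count_in_fifo, newest_seq)
```

For example, if the newest descriptor has sequence number 10 and the FIFO holds 3 descriptors, then descriptors 8 through 10 are still queued and descriptor 7 has already been sent.
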

Asynchronous broadcast for ordered delivery between compute nodes in a parallel computing system where packet header space is limited

Patent number: 7738443

Abstract: Disclosed is a mechanism on receiving processors in a parallel computing system for providing order to data packets received from a broadcast call and to distinguish data packets received at nodes from several incoming asynchronous broadcast messages where header space is limited. In the present invention, processors at lower leafs of a tree do not need to obtain a broadcast message by directly accessing the data in a root processor's buffer. Instead, each subsequent intermediate node's rank id information is squeezed into the software header of packet headers. In turn, the entire broadcast message is not transferred from the root processor to each processor in a communicator but instead is replicated on several intermediate nodes which then replicate the message to nodes in lower leafs. Hence, the intermediate compute nodes become “virtual root compute nodes” for the purpose of replicating the broadcast message to lower levels of a tree.

Type: Grant

Filed: June 26, 2007

Date of Patent: June 15, 2010

Assignee: International Business Machines Corporation

Inventor: Sameer Kumar

METHOD OF APPLYING FAST MOBILE IPV6 FOR MOBILE NODES IN MOBILE NETWORKS, MOBILE ROUTER THEREFOR, AND MOBILE NETWORK THEREFOR

Publication number: 20090147751

Abstract: Provided are a method of applying fast mobile IPv6 (FMIPv6) to a mobile node in order to prevent loss of packets transmitted during a handover of a mobile router from a first access router to a second access router, and a mobile router and a mobile network therefor. Specifically, the mobile router receives, from the first access router, a message containing a prefix corresponding to the second access router, transmits, to the mobile node, a message containing the prefix and information indicating that the prefix is received from the first access router, transmits, to the first access router, a message for FMIPv6, and transmits, to the mobile node, a message to set a zero lifetime for a prefix corresponding to the first access router. Furthermore, the mobile node transmits a message for FMIPv6 to the first access router when the mobile node receives the message containing the prefix.

Type: Application

Filed: August 4, 2006

Publication date: June 11, 2009

Inventors: Lakshmi Prabha Gurusamy, Sameer Kumar, Ranjitsinh Udaysinh Wable, Kishore Mundra, Syam Madanapalli

N2 BASED PLASMA TREATMENT FOR ENHANCED SIDEWALL SMOOTHING AND PORE SEALING OF POROUS LOW-K DIELECTRIC FILMS

Publication number: 20090115030

Abstract: A method of forming a semiconductor device including forming a low-k dielectric material over a substrate, depositing a liner on a portion of the low-k dielectric material, and exposing the liner to a plasma. The method also includes depositing a layer over the liner.

Type: Application

Filed: December 9, 2008

Publication date: May 7, 2009

Applicant: Texas Instruments Incorporated

Inventors: Sameer Kumar Ajmera, Patricia Beauregard Smith, Changming Jin

Method of Fabrication of On-Chip Heat Pipes and Ancillary Heat Transfer Components

Publication number: 20090085197

Abstract: The density of components in integrated circuits (ICs) is increasing with time. The density of heat generated by the components is similarly increasing. Maintaining the temperature of the components at reliable operating levels requires increased thermal transfer rates from the components to the IC package exterior. Dielectric materials used in interconnect regions have lower thermal conductivity than silicon dioxide. This invention comprises a heat pipe located in the interconnect region of an IC to transfer heat generated by components in the IC substrate to metal plugs located on the top surface of the IC, where the heat is easily conducted to the exterior of the IC package. Refinements such as a wicking liner or reticulated inner surface will increase the thermal transfer efficiency of the heat pipe. Strengthening elements in the interior of the heat pipe will provide robustness to mechanical stress during IC manufacture.

Type: Application

Filed: September 28, 2007

Publication date: April 2, 2009

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Sameer Kumar Ajmera, Phillip D. Matz, Stephan Grunow, Satyavolu Srinivas Papa Rao

N2 based plasma treatment for enhanced sidewall smoothing and pore sealing porous low-k dielectric films

Patent number: 7476602

Abstract: A method of forming a semiconductor device including forming a low-k dielectric material over a substrate, depositing a liner on a portion of the low-k dielectric material, and exposing the liner to a plasma. The method also includes depositing a layer over the liner.

Type: Grant

Filed: January 31, 2005

Date of Patent: January 13, 2009

Assignee: Texas Instruments Incorporated

Inventors: Sameer Kumar Ajmera, Patricia Beauregard Smith, Changming Jin