Patents by Inventor Sameer Kumar
Sameer Kumar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8032892Abstract: A method for passing messages in a parallel computer system constructed as a plurality of compute nodes interconnected as a network where each compute node includes a DMA engine but includes only a limited number of byte counters for tracking a number of bytes that are sent or received by the DMA engine, where the byte counters may be used in shared counter or exclusive counter modes of operation. The method includes using rendezvous protocol, a source compute node deterministically sending a request to send (RTS) message with a single RTS descriptor using an exclusive injection counter to track both the RTS message and message data to be sent in association with the RTS message, to a destination compute node such that the RTS descriptor indicates to the destination compute node that the message data will be adaptively routed to the destination node.Type: GrantFiled: June 26, 2007Date of Patent: October 4, 2011Assignee: International Business Machines CorporationInventors: Michael Blocksome, Dong Chen, Mark E. Giampapa, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker
-
Publication number: 20110219208Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC).Type: ApplicationFiled: January 10, 2011Publication date: September 8, 2011Applicant: International Business Machines CorporationInventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
-
Publication number: 20110179229Abstract: A system, method and computer program product for performing various store-operate instructions in a parallel computing environment that includes a plurality of processors and at least one cache memory device. A queue in the system receives, from a processor, a store-operate instruction that specifies under which condition a cache coherence operation is to be invoked. A hardware unit in the system runs the received store-operate instruction. The hardware unit evaluates whether a result of the running the received store-operate instruction satisfies the condition. The hardware unit invokes a cache coherence operation on a cache memory address associated with the received store-operate instruction if the result satisfies the condition. Otherwise, the hardware unit does not invoke the cache coherence operation on the cache memory device.Type: ApplicationFiled: January 7, 2011Publication date: July 21, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Dong Chen, Philip Heidelberger, Sameer Kumar, Martin Ohmacht, Burkhard Steinmacher-Burow
-
Patent number: 7984448Abstract: A system and method for supporting collective communications on a plurality of processors that use different parallel programming paradigms, in one aspect, may comprise a schedule defining one or more tasks in a collective operation, an executor that executes the task, a multisend module to perform one or more data transfer functions associated with the tasks, and a connection manager that controls one or more connections and identifies an available connection. The multisend module uses the available connection in performing the one or more data transfer functions. A plurality of processors that use different parallel programming paradigms can use a common implementation of the schedule module, the executor module, the connection manager and the multisend module via a language adaptor specific to a parallel programming paradigm implemented on a processor.Type: GrantFiled: June 26, 2007Date of Patent: July 19, 2011Assignee: International Business Machines CorporationInventors: Gheorghe Almasi, Gabor Dozsa, Sameer Kumar
-
Publication number: 20110173343Abstract: A system for routing data in a network comprising a network logic device at a sending node for determining a path between the sending node and a receiving node, wherein the network logic device sets one or more selection bits and one or more hint bits within the data packet, a control register for storing one or more masks, wherein the network logic device uses the one or more selection bits to select a mask from the control register and the network logic device applies the selected mask to the hint bits to restrict routing of the data packet to one or more routing directions for the data packet within the network and selects one of the restricted routing directions from the one or more routing directions and sends the data packet along a link in the selected routing direction toward the receiving node.Type: ApplicationFiled: January 8, 2010Publication date: July 14, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Dong Chen, Philip Heidelberger, Sameer Kumar
-
Patent number: 7958274Abstract: Methods, compute nodes, and computer program products are provided for heuristic status polling of a component in a computing system. Embodiments include receiving, by a polling module from a requesting application, a status request requesting status of a component; determining, by the polling module, whether an activity history for the component satisfies heuristic polling criteria; polling, by the polling module, the component for status if the activity history for the component satisfies the heuristic polling criteria; and not polling, by the polling module, the component for status if the activity history for the component does not satisfy the heuristic criteria.Type: GrantFiled: June 18, 2007Date of Patent: June 7, 2011Assignee: International Business Machines CorporationInventors: Charles J. Archer, Michael A. Blocksome, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker, Joseph D. Ratterman
-
Publication number: 20110119468Abstract: A system and method for enhancing barrier collective synchronization on a computer system comprises a computer system including a data storage device. The computer system includes a program stored in the data storage device and steps of the program being executed by a processor. The system includes providing a plurality of communicators for storing state information for a bather algorithm. Each communicator designates a master core in a multi-processor environment of the computer system. The system allocates or designates one counter for each of a plurality of threads. The system configures a table with a number of entries equal to the maximum number of threads. The system sets a table entry with an ID associated with a communicator when a process thread initiates a collective. The system determines an allocated or designated counter by searching entries in the table.Type: ApplicationFiled: January 29, 2010Publication date: May 19, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Michael Blocksome, Douglas Miller
-
Publication number: 20110078249Abstract: A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.Type: ApplicationFiled: September 28, 2009Publication date: March 31, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Michael Blocksome, Gabor Dozsa, Thomas M. Gooding, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Douglas Miller
-
Publication number: 20110072241Abstract: Implementation primitives for concurrent array-based stacks, queues, double-ended queues (deques) and wrapped deques are provided. In one aspect, each element of the stack, queue, deque or wrapped deque data structure has its own ticket lock, allowing multiple threads to concurrently use multiple elements of the data structure and thus achieving high performance. In another aspect, new synchronization primitives FetchAndIncrementBounded (Counter, Bound) and FetchAndDecrementBounded (Counter, Bound) are implemented. These primitives can be implemented in hardware and thus promise a very fast throughput for queues, stacks and double-ended queues.Type: ApplicationFiled: September 22, 2009Publication date: March 24, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Dong Chen, Alana Gara, Philip Heidelberger, Sameer Kumar, Martin Ohmacht, Burkhard Steinmacher-Burow, Robert Wisniewski
-
Patent number: 7910936Abstract: A method of forming a semiconductor device including forming a low-k dielectric material over a substrate, depositing a liner on a portion of the low-k dielectric material, and exposing the liner to a plasma. The method also includes depositing a layer over the liner.Type: GrantFiled: December 9, 2008Date of Patent: March 22, 2011Assignee: Texas Instruments IncorporatedInventors: Sameer Kumar Ajmera, Patricia Beauregard Smith, Changming Jin
-
Patent number: 7899055Abstract: A method for route optimization with a dual mobile IPv4 node in an IPv6-only network is provided. The method includes the operations of: receiving a visited IPv6 address from a router when the dual mobile node is connected to the IPv6-only network; updating a home agent with the IPv6 address; deregistering a binding update with a correspondent node via the home agent; updating the correspondent node with an IPv6 address; checking the reachability of packets directly to the correspondent node using its IPv6 address; the mobile node starting sending, to the CN, data packets tunneled in an IPv6 packet once the reachability is verified; and the correspondent node sending tunneled data packets directly to an IPv6 address of the mobile node.Type: GrantFiled: December 27, 2006Date of Patent: March 1, 2011Assignee: Samsung Electronics Co., Ltd.Inventors: Kishore Mundra, Lakshmi Praba Gurusamy, Sameer Kumar, Ranjitsinh Udaysinh Wable
-
Publication number: 20110010471Abstract: A parallel computer system includes a plurality of compute nodes. Each of the compute nodes includes at least one processor, at least one memory, and a direct memory address engine coupled to the at least one processor and the at least one memory. The system also includes a network interconnecting the plurality of compute nodes. The network operates a global message-passing application for performing communications across the network. Local instances of the global message-passing application operate at each of the compute nodes to carry out local processing operations independent of processing operations carried out at another one of the compute nodes. The direct memory address engines are configured to interact with the local instances of the global message-passing application via injection FIFO metadata describing an injection FIFO in a corresponding one of the memories.Type: ApplicationFiled: July 10, 2009Publication date: January 13, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Philip Heidelberger, Sameer Kumar
-
Publication number: 20100268852Abstract: Methods, apparatus, and products are disclosed for replenishing data descriptors in a Direct Memory Access (‘DMA’) injection first-in-first-out (‘FIFO’) buffer that include: determining, by a messaging module on an origin compute node, whether a number of data descriptors in a DMA injection FIFO buffer exceeds a predetermined threshold, each data descriptor specifying an application message for transmission to a target compute node; queuing, by the messaging module, a plurality of new data descriptors in a pending descriptor queue if the number of the data descriptors in the DMA injection FIFO buffer exceeds the predetermined threshold; establishing, by the messaging module, interrupt criteria that specify when to replenish the injection FIFO buffer with the plurality of new data descriptors in the pending descriptor queue; and injecting, by the messaging module, the plurality of new data descriptors into the injection FIFO buffer in dependence upon the interrupt criteria.Type: ApplicationFiled: May 30, 2007Publication date: October 21, 2010Inventors: Charles J Archer, Michael A. Blocksome, Bob R. Cernohous, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker
-
Patent number: 7781884Abstract: The density of components in integrated circuits (ICs) is increasing with time. The density of heat generated by the components is similarly increasing. Maintaining the temperature of the components at reliable operating levels requires increased thermal transfer rates from the components to the IC package exterior. Dielectric materials used in interconnect regions have lower thermal conductivity than silicon dioxide. This invention comprises a heat pipe located in the interconnect region of an IC to transfer heat generated by components in the IC substrate to metal plugs located on the top surface of the IC, where the heat is easily conducted to the exterior of the IC package. Refinements such as a wicking liner or reticulated inner surface will increase the thermal transfer efficiency of the heat pipe. Strengthening elements in the interior of the heat pipe will provide robustness to mechanical stress during IC manufacture.Type: GrantFiled: September 28, 2007Date of Patent: August 24, 2010Assignee: Texas Instruments IncorporatedInventors: Sameer Kumar Ajmera, Phillip D. Matz, Stephan Grunow, Satyavolu Srinivas Papa Rao
-
Patent number: 7765337Abstract: Methods, compute nodes, and computer program products are provided for direct memory access (‘DMA’) transfer completion notification. Embodiments include determining, by an origin DMA engine on an origin compute node, whether a data descriptor for an application message to be sent to a target compute node is currently in an injection first-in-first-out (‘FIFO’) buffer in dependence upon a sequence number previously associated with the data descriptor, the total number of descriptors currently in the injection FIFO buffer, and the current sequence number for the newest data descriptor stored in the injection FIFO buffer; and notifying a processor core on the origin DMA engine that the message has been sent if the data descriptor for the message is not currently in the injection FIFO buffer.Type: GrantFiled: June 5, 2007Date of Patent: July 27, 2010Assignee: International Business Machines CorporationInventors: Dong Chen, Mark E. Giampapa, Philip Heidelberger, Sameer Kumar, Jeffrey J. Parker, Burkhard D. Steinmacher-Burow, Pavlos Vranas
-
Patent number: 7738443Abstract: Disclosed is a mechanism on receiving processors in a parallel computing system for providing order to data packets received from a broadcast call and to distinguish data packets received at nodes from several incoming asynchronous broadcast messages where header space is limited. In the present invention, processors at lower leafs of a tree do not need to obtain a broadcast message by directly accessing the data in a root processor's buffer. Instead, each subsequent intermediate node's rank id information is squeezed into the software header of packet headers. In turn, the entire broadcast message is not transferred from the root processor to each processor in a communicator but instead is replicated on several intermediate nodes which then replicated the message to nodes in lower leafs. Hence, the intermediate compute nodes become “virtual root compute nodes” for the purpose of replicating the broadcast message to lower levels of a tree.Type: GrantFiled: June 26, 2007Date of Patent: June 15, 2010Assignee: International Business Machines CorporationInventor: Sameer Kumar
-
Publication number: 20090147751Abstract: Provided are a method of applying fast mobile IPv6 (FMIPv6) to a mobile node in order to prevent loss of packets transmitted during a handover of a mobile router from a first access router to a second access router, and a mobile router and a mobile network therefor. Specifically, the mobile router receives, from the first access router, a message containing a prefix corresponding to the second access router, transmits, to the mobile node, a message containing the prefix and information indicating that the prefix is received from the first access router, transmits, to the first access router, a message for FMIPv6, and transmits, to the mobile node, a message to set a zero lifetime for a prefix corresponding to the first access router. Furthermore, the mobile node transmits a message for FMIPv6 to the first access router when the mobile node receives the message containing the prefix.Type: ApplicationFiled: August 4, 2006Publication date: June 11, 2009Inventors: Lakshmi Prabha Gurusamy, Sameer Kumar, Ranjitsinh Udaysinh Wable, Kishore Mundra, Syam Madanapalli
-
Publication number: 20090115030Abstract: A method of forming a semiconductor device including forming a low-k dielectric material over a substrate, depositing a liner on a portion of the low-k dielectric material, and exposing the liner to a plasma. The method also includes depositing a layer over the liner.Type: ApplicationFiled: December 9, 2008Publication date: May 7, 2009Applicant: Texas Instruments IncorporatedInventors: Sameer Kumar Ajmera, Patricia Beauregard Smith, Changming Jin
-
Publication number: 20090085197Abstract: The density of components in integrated circuits (ICs) is increasing with time. The density of heat generated by the components is similarly increasing. Maintaining the temperature of the components at reliable operating levels requires increased thermal transfer rates from the components to the IC package exterior. Dielectric materials used in interconnect regions have lower thermal conductivity than silicon dioxide. This invention comprises a heat pipe located in the interconnect region of an IC to transfer heat generated by components in the IC substrate to metal plugs located on the top surface of the IC, where the heat is easily conducted to the exterior of the IC package. Refinements such as a wicking liner or reticulated inner surface will increase the thermal transfer efficiency of the heat pipe. Strengthening elements in the interior of the heat pipe will provide robustness to mechanical stress during IC manufacture.Type: ApplicationFiled: September 28, 2007Publication date: April 2, 2009Applicant: TEXAS INSTRUMENTS INCORPORATEDInventors: Sameer Kumar Ajmera, Phillip D. Matz, Stephan Grunow, Satyavolu Srinivas Papa Rao
-
Patent number: 7476602Abstract: A method of forming a semiconductor device including forming a low-k dielectric material over a substrate, depositing a liner on a portion of the low-k dielectric material, and exposing the liner to a plasma. The method also includes depositing a layer over the liner.Type: GrantFiled: January 31, 2005Date of Patent: January 13, 2009Assignee: Texas Instruments IncorporatedInventors: Sameer Kumar Ajmera, Patricia Beauregard Smith, Changming Jin