Patents by Inventor Lakshminarayana B. Arimilli

Lakshminarayana B. Arimilli has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Hardware-assisted interthread push communication

Patent number: 9286148

Abstract: In a data processing system, a switch of the data processing system receives a request to push a message referenced by an instruction of a sending thread to a receiving thread. In response to receiving the request, the switch determines whether the sending thread is authorized to push the message to the receiving thread by attempting to access an entry of a data structure of the switch utilizing a key derived from at least one identifier of the sending thread. In response to access to the entry being successful, content of the entry is utilized to determine an address of a mailbox of the receiving thread, and the switch pushes the message to the mailbox of the receiving thread. In response to access to the entry not being successful, the switch refrains from pushing the message to the mailbox of the receiving thread.

Type: Grant

Filed: December 22, 2014

Date of Patent: March 15, 2016

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, John D. Irish, Charles F. Marino, William J. Starke
Techniques for cache injection in a processor system

Patent number: 9110885

Abstract: A technique for performing cache injection includes monitoring addresses on a bus. Ownership of input/output data on the bus is acquired by a cache when an address on the bus (that is associated with the input/output data) corresponds to an address of a data block stored in the cache.

Type: Grant

Filed: September 18, 2008

Date of Patent: August 18, 2015

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Balaram Sinharoy
Performing setup operations for receiving different amounts of data while processors are performing message passing interface tasks

Patent number: 8893148

Abstract: A system and method are provided for performing setup operations for receiving a different amount of data while processors are performing message passing interface (MPI) tasks. Mechanisms for adjusting the balance of processing workloads of the processors are provided so as to minimize wait periods for waiting for all of the processors to call a synchronization operation. An MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly negatively affecting the overall operation of the parallel execution system. As a result, setup operations may be performed while processors are performing MPI tasks to prepare for receiving different sized portions of data in a subsequent computation cycle based on the history.

Type: Grant

Filed: June 15, 2012

Date of Patent: November 18, 2014

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight
Binding a process to a special purpose processing element having characteristics of a processor

Patent number: 8893126

Abstract: A heterogeneous processing element model is provided where I/O devices look and act like processors. In order to be treated like a processor, an I/O processing element, or other special purpose processing element, must follow some rules and have some characteristics of a processor, such as address translation, security, interrupt handling, and exception processing, for example. The heterogeneous processing element model puts special purpose processing elements on the same playing field as processors, from a programming perspective, operating system perspective, and power perspective. The operating system can get work to a security engine, for example, in the same way it does to a processor.

Type: Grant

Filed: February 1, 2008

Date of Patent: November 18, 2014

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Guy L. Guthrie, Charles F. Marino, William J. Starke
Collective acceleration unit tree structure

Patent number: 8756270

Abstract: A mechanism is provided in a collective acceleration unit for performing a collective operation to distribute or collect data among a plurality of participant nodes. The mechanism receives an input collective packet for a collective operation from a neighbor node within a collective tree. The input collective packet comprises a tree identifier and an input data field and wherein the collective tree comprises a plurality of sub trees. The mechanism maps the tree identifier to an index within the collective acceleration unit. The index identifies a portion of resources within the collective acceleration unit and is associated with a set of neighbor nodes in a given sub tree within the collective tree. For each neighbor node the collective acceleration unit stores destination information. The collective acceleration unit performs an operation on the input data field using the portion of resources to effect the collective operation.

Type: Grant

Filed: April 24, 2012

Date of Patent: June 17, 2014

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Paul F. Lecocq, Hanhong Xue
Collective acceleration unit tree structure

Patent number: 8751655

Abstract: A mechanism is provided in a collective acceleration unit for performing a collective operation to distribute or collect data among a plurality of participant nodes. The mechanism receives an input collective packet for a collective operation from a neighbor node within a collective tree. The input collective packet comprises a tree identifier and an input data field and wherein the collective tree comprises a plurality of sub trees. The mechanism maps the tree identifier to an index within the collective acceleration unit. The index identifies a portion of resources within the collective acceleration unit and is associated with a set of neighbor nodes in a given sub tree within the collective tree. For each neighbor node the collective acceleration unit stores destination information. The collective acceleration unit performs an operation on the input data field using the portion of resources to effect the collective operation.

Type: Grant

Filed: March 29, 2010

Date of Patent: June 10, 2014

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Paul F. Lecocq, Hanhong Xue
Host fabric interface (HFI) to perform global shared memory (GSM) operations

Patent number: 8484307

Abstract: A data processing system enables global shared memory (GSM) operations across multiple nodes with a distributed EA-to-RA mapping of physical memory. Each node has a host fabric interface (HFI), which includes HFI windows that are assigned to at most one locally-executing task of a parallel job. The tasks perform parallel job execution, but map only a portion of the effective addresses (EAs) of the global address space to the local, real memory of the task's respective node. The HFI window tags all outgoing GSM operations (of the local task) with the job ID, and embeds the target node and HFI window IDs of the node at which the EA is memory mapped. The HFI window also enables processing of received GSM operations with valid EAs that are homed to the local real memory of the receiving node, while preventing processing of other received operations without a valid EA-to-RA local mapping.

Type: Grant

Filed: February 1, 2008

Date of Patent: July 9, 2013

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, William J. Starke, Hanhong Xue
Collective acceleration unit tree flow control and retransmit

Patent number: 8417778

Abstract: A mechanism is provided for collective acceleration unit tree flow control forms a logical tree (sub-network) among those processors and transfers “collective” packets on this tree. The system supports many collective trees, and each collective acceleration unit (CAU) includes resources to support a subset of the trees. Each CAU has limited buffer space, and the connection between two CAUs is not completely reliable. Therefore, to address the challenge of collective packets traversing on the tree without colliding with each other for buffer space and guaranteeing the end-to-end packet delivery, each CAU in the system effectively flow controls the packets, detects packet loss, and retransmits lost packets.

Type: Grant

Filed: December 17, 2009

Date of Patent: April 9, 2013

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Jody B. Joyner, Paul F. Lecocq, Hanhong Xue
Collective Acceleration Unit Tree Structure

Publication number: 20120296915

Abstract: A mechanism is provided in a collective acceleration unit for performing a collective operation to distribute or collect data among a plurality of participant nodes. The mechanism receives an input collective packet for a collective operation from a neighbor node within a collective tree. The input collective packet comprises a tree identifier and an input data field and wherein the collective tree comprises a plurality of sub trees. The mechanism maps the tree identifier to an index within the collective acceleration unit. The index identifies a portion of resources within the collective acceleration unit and is associated with a set of neighbor nodes in a given sub tree within the collective tree. For each neighbor node the collective acceleration unit stores destination information. The collective acceleration unit performs an operation on the input data field using the portion of resources to effect the collective operation.

Type: Application

Filed: April 24, 2012

Publication date: November 22, 2012

Applicant: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Paul F. Lecocq, Hanhong Xue
Hardware based dynamic load balancing of message passing interface tasks by modifying tasks

Patent number: 8312464

Abstract: Mechanisms are provided for providing hardware based dynamic load balancing of message passing interface (MPI) tasks by modifying tasks. Mechanisms for adjusting the balance of processing workloads of the processors executing tasks of an MPI job are provided so as to minimize wait periods for waiting for all of the processors to call a synchronization operation. Each processor has an associated hardware implemented MPI load balancing controller. The MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly negatively affecting the overall operation of the parallel execution system. Thus, operations may be performed to shift workloads from the slowest processor to one or more of the faster processors.

Type: Grant

Filed: August 28, 2007

Date of Patent: November 13, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight
Performing Setup Operations for Receiving Different Amounts of Data While Processors are Performing Message Passing Interface Tasks

Publication number: 20120266180

Abstract: A system and method are provided for performing setup operations for receiving a different amount of data while processors are performing message passing interface (MPI) tasks. Mechanisms for adjusting the balance of processing workloads of the processors are provided so as to minimize wait periods for waiting for all of the processors to call a synchronization operation. An MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly negatively affecting the overall operation of the parallel execution system. As a result, setup operations may be performed while processors are performing MPI tasks to prepare for receiving different sized portions of data in a subsequent computation cycle based on the history.

Type: Application

Filed: June 15, 2012

Publication date: October 18, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight
Mechanism to prevent illegal access to task address space by unauthorized tasks

Patent number: 8275947

Abstract: A method and data processing system for tracking global shared memory (GSM) operations to and from a local node configured with a host fabric interface (HFI) coupled to a network fabric. During task/job initialization, the system OS assigns HFI window(s) to handle the GSM packet generation and GSM packet receipt and processing for each local task. HFI processing logic automatically tags each GSM packet generated by the HFI window with a global job identifier (ID) of the job to which the local task is affiliated. The job ID is embedded within each GSM packet placed on the network fabric. On receipt of a GSM packet from the network fabric, the HFI logic retrieves the embedded job ID and compares the embedded job ID with the ID within the HFI window(s). GSM packets are forwarded to an HFI window only when the embedded job ID matches the HFI window's job ID.

Type: Grant

Filed: February 1, 2008

Date of Patent: September 25, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
Notification to task of completion of GSM operations by initiator node

Patent number: 8255913

Abstract: In a global shared memory (GSM) environment, a method provides local notification of completion of a global shared memory (GSM) operation processed by a first task executing at a local node of the distributed system. The system includes multiple nodes on which different tasks of a single job execute and perform GSM operations that are received from a second task via a via host fabric interface (HFI) and associated HFR window assigned to the first tasks. The local task initiates execution of a GSM operation on the local node. The task then monitors for and detects a completion of the execution of the GSM operation on the local node. When the task detects completion of the execution of the GSM operation, the task issues an internal notification to inform the locally-executing tasks of the completion of the GSM operation.

Type: Grant

Filed: February 1, 2008

Date of Patent: August 28, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Gheorghe C. Cascaval, Ramakrishnan Rajamony
Claiming coherency ownership of a partial cache line of data

Patent number: 8255635

Abstract: According to method of data processing in a multiprocessor data processing system, in response to a processor request to modify a target granule of a target cache line of data containing multiple granules, a processing unit originates on an interconnect of the multiprocessor data processing system a data-claim-partial request that requests permission to promote only the target granule of the target cache line to a unique copy with an intent to modify the target granule. In response to a combined response to the data-claim-partial request indicating success (the combined response representing a system-wide response to the data-claim-partial-request), the processing unit promotes only the target granule of the target cache line to a unique copy by updating a coherency state of the target granule and retaining a coherency state of at least one other granule of the target cache line.

Type: Grant

Filed: February 1, 2008

Date of Patent: August 28, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Jerry D. Lewis, Warren E. Maule
Notification by task of completion of GSM operations at target node

Patent number: 8239879

Abstract: A method for providing global notification of completion of a global shared memory (GSM) operation during processing by a target task executing at a target node of a distributed system. The distributed system has at least one other node on which an initiating task that generated the GSM operation is homed. The target task receives the GSM operation from the initiating task, via a host fabric interface (HFI) window assigned to the target task. The task initiates execution of the GSM operation on the target node. The task detects completion of the execution of the GSM operation on the target node, and issues a global notification to at least the initiating task. The global notification indicates the completion of the execution of the GSM operation to one or more tasks of a single job distributed across multiple processing nodes.

Type: Grant

Filed: February 1, 2008

Date of Patent: August 7, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Gheorghe C. Cascaval, Ramakrishnan Rajamony
Performing setup operations for receiving different amounts of data while processors are performing message passing interface tasks

Patent number: 8234652

Abstract: Mechanisms are provided for performing setup operations for receiving a different amount of data while processors are performing message passing interface (MPI) tasks. Mechanisms for adjusting the balance of processing workloads of the processors are provided so us to minimize wait periods for waiting for all of the processors to call a synchronization operation. An MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly negatively affecting the overall operation of the parallel execution system. As a result, setup operations may be performed while processors are performing MPI tasks to prepare for receiving different sized portions of data in a subsequent computation cycle based on the history.

Type: Grant

Filed: August 28, 2007

Date of Patent: July 31, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight
User level message broadcast mechanism in distributed computing environment

Patent number: 8214424

Abstract: A data processing system is programmed to provide a method for enabling user-level one-to-all message/messaging (OTAM) broadcast within a distributed parallel computing environment in which multiple threads of a single job execute on different processing nodes across a network. The method comprises: generating one or more messages for transmission to at least one other processing node accessible via a network, where the messages are generated by/for a first thread executing at the data processing system (first processing node) and the other processing node executes one or more second threads of a same parallel job as the first thread. An OTAM broadcast is transmitting via a host fabric interface (HFI) of the data processing system as a one-to-all broadcast on the network, whereby the messages are transmitted to a cluster of processing nodes across the network that execute threads of the same parallel job as the first thread.

Type: Grant

Filed: April 16, 2009

Date of Patent: July 3, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Robert S. Blackmore
Method and apparatus for handling multiple memory requests within a multiprocessor system

Patent number: 8214603

Abstract: A method for handling multiple memory requests within a multi-processor system is disclosed. A lock control section is initially assigned to a data block within a system memory. In response to a request for accessing the data block by a processing unit, a determination is made whether or not the lock control section of the data block has been set. If the lock control section has been set, another determination is made whether or not the requesting processing unit is located beyond a predetermined distance from a memory controller. If the requesting processing unit is located beyond a predetermined distance from the memory controller, the requesting processing unit is invited to perform other functions; otherwise, the number of the requesting processing unit is placed in a queue table. However, if the lock control section has not been set, the lock control section of the data block is set, and the access request is allowed.

Type: Grant

Filed: February 1, 2008

Date of Patent: July 3, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Guy L. Guthrie, William J. Starke
Generating and issuing global shared memory operations via a send FIFO

Patent number: 8200910

Abstract: A method for issuing global shared memory (GSM) operations from an originating task on a first node coupled to a network fabric of a distributed network via a host fabric interface (HFI). The originating task generates a GSM command within an effective address (EA) space. The task then places the GSM command within a send FIFO. The send FIFO is a portion of real memory having real addresses (RA) that are memory mapped to EAs of a globally executing job. The originating task maintains a local EA-to-RA mapping of only a portion of the real address space of the globally executing job. The task enables the HFI to retrieve the GSM command from the send FIFO into an HFI window allocated to the originating task. The HFI window generates a corresponding GSM packet containing GSM operations and/or data, and the HFI window issues the GSM packet to the network fabric.

Type: Grant

Filed: February 1, 2008

Date of Patent: June 12, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
Method for data processing using a multi-tiered full-graph interconnect architecture

Patent number: 8185896

Abstract: A method is provided for implementing a multi-tiered full-graph interconnect architecture. In order to implement a multi-tiered full-graph interconnect architecture, a plurality of processors are coupled to one another to create a plurality of processor books. The plurality of processor books are coupled together to create a plurality of supernodes. Then, the plurality of supernodes are coupled together to create the multi-tiered full-graph interconnect architecture. Data is then transmitted from one processor to another within the multi-tiered full-graph interconnect architecture based on an addressing scheme that specifies at least a supernode and a processor book associated with a target processor to which the data is to be transmitted.

Type: Grant

Filed: August 27, 2007

Date of Patent: May 22, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, Edward J. Seminaro, William E. Speight

prev 1 2 3 4 5 6 7 next