Patents by Inventor Ravi K. Arimilli

Ravi K. Arimilli has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8250307
    Abstract: According to a method of data processing, a memory controller receives a prefetch load request from a processor core of a data processing system. The prefetch load request specifies a requested line of data. In response to receipt of the prefetch load request, the memory controller determines by reference to a stream of demand requests how much data is to be supplied to the processor core in response to the prefetch load request. In response to the memory controller determining to provide less than all of the requested line of data, the memory controller provides less than all of the requested line of data to the processor core.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: August 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Patent number: 8250396
    Abstract: A hardware wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism recognizes a programming idiom that indicates that a thread is waiting for an event. The wake-and-go mechanism updates a wake-and-go array with a target address associated with the event. The thread then goes to sleep until the event occurs. The wake-and-go array may be a content addressable memory (CAM). When a transaction appears on the symmetric multiprocessing (SMP) fabric that modifies the value at a target address in the CAM, the CAM returns a list of storage addresses at which the target address is stored. The wake-and-go mechanism associates these storage addresses with the threads waiting for an even at the target addresses, and may wake the one or more threads waiting for the event.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: August 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8245004
    Abstract: A data processing system includes a set of architected registers within which the processor places state and other information to communicate with the asynchronous memory mover in order to initiate and control an AMM operation. The asynchronous memory mover performs an asynchronous memory move (AMM) operation in response to receiving a set of parameters within the architected registers, which parameters are associated with an AMM store instruction executed by the processor to initiates a move of data in virtual space before placing the information in the architected registers. The architected registers are processor architected registers, defined on a per thread basis by a compiler, or memory mapped architected registers allocated for communicating with the asynchronous memory mover during a bind and subsequent execution of an application. The architected registers are also utilized to store state information to enable a restore to a point before execution of the AMM operation.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: August 14, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Patent number: 8234652
    Abstract: Mechanisms are provided for performing setup operations for receiving a different amount of data while processors are performing message passing interface (MPI) tasks. Mechanisms for adjusting the balance of processing workloads of the processors are provided so us to minimize wait periods for waiting for all of the processors to call a synchronization operation. An MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly negatively affecting the overall operation of the parallel execution system. As a result, setup operations may be performed while processors are performing MPI tasks to prepare for receiving different sized portions of data in a subsequent computation cycle based on the history.
    Type: Grant
    Filed: August 28, 2007
    Date of Patent: July 31, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight
  • Publication number: 20120191674
    Abstract: Mechanisms are provided for processing streaming data at high sustained data rates. These mechanisms receive a plurality of data elements over a plurality of non-sequential communication channels and write the plurality of data elements directly to the file system of the data processing system in an unassembled manner. The mechanisms determining whether to perform a data scrubbing operation or not based on history information indicative of whether data elements in the plurality of data elements are being received in a substantially sequential manner. The mechanisms perform a data scrubbing operation, in response to a determination to perform data scrubbing, to identify any missing data elements in the plurality of data elements written to the tile system and assemble the plurality of data elements into a plurality of data streams in response to results of the data scrubbing indicating that there are no missing data elements.
    Type: Application
    Filed: April 3, 2012
    Publication date: July 26, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ravi K. Arimilli, Piyush Chaudhary
  • Patent number: 8230201
    Abstract: A wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism detects a thread running on a first processing unit within a plurality of processing units that is waiting for an event that modifies a data value associated with a target address. The wake-and-go mechanism creates a wake-and-go instance for the thread by populating a wake-and-go storage array with the target address. The operating system places the thread in a sleep state. Responsive to detecting the event that modifies the data value associated with the target address, the wake-and-go mechanism assigns the wake-and-go instance to a second processing unit within the plurality of processing units. The operating system on the second processing unit places the thread in a non-sleep state.
    Type: Grant
    Filed: April 16, 2009
    Date of Patent: July 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8225120
    Abstract: Snoop response logic on a system bus is configured to detect on the system bus requests to access data at a target address with data exclusivity from at least one of a plurality of wake-and-go engines. The snoop response logic is further configured to determine a winning wake-and-go engine from the at least one wake-and-go engine that obtains a lock on the target address and generate a combined snoop response. The combined snoop response identifies the winning wake-and-go engine. The snoop response logic sends the combined snoop response to the at least one wake-and-go engine on the system bus. Each remaining wake-and-go engine within the at least one wake-and-go engine places an entry in its respective wake-and-go storage array to spin on a lock for the target address.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: July 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8214424
    Abstract: A data processing system is programmed to provide a method for enabling user-level one-to-all message/messaging (OTAM) broadcast within a distributed parallel computing environment in which multiple threads of a single job execute on different processing nodes across a network. The method comprises: generating one or more messages for transmission to at least one other processing node accessible via a network, where the messages are generated by/for a first thread executing at the data processing system (first processing node) and the other processing node executes one or more second threads of a same parallel job as the first thread. An OTAM broadcast is transmitting via a host fabric interface (HFI) of the data processing system as a one-to-all broadcast on the network, whereby the messages are transmitted to a cluster of processing nodes across the network that execute threads of the same parallel job as the first thread.
    Type: Grant
    Filed: April 16, 2009
    Date of Patent: July 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Robert S. Blackmore
  • Patent number: 8214592
    Abstract: Disclosed are a method, a system and a computer program product for operating a cache system. The cache system can include multiple cache lines, and a first cache line of the multiple of cache lines can include multiple cache cells, and a bus coupled to the multiple cache cells. In one or more embodiments, the bus can include a switch that is operable to receive a first control signal and to split the bus into first and second portions or aggregate the bus into a whole based on the first control signal. When the bus is split, a first cache cell and a second cache cell of the multiple cache cells are coupled to respective first and second portions of the bus. Data from the first and second cache cells can be selected through respective portions of the bus and outputted through a port of the cache system.
    Type: Grant
    Filed: April 15, 2009
    Date of Patent: July 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Donald W. Plass, William John Starke
  • Patent number: 8214603
    Abstract: A method for handling multiple memory requests within a multi-processor system is disclosed. A lock control section is initially assigned to a data block within a system memory. In response to a request for accessing the data block by a processing unit, a determination is made whether or not the lock control section of the data block has been set. If the lock control section has been set, another determination is made whether or not the requesting processing unit is located beyond a predetermined distance from a memory controller. If the requesting processing unit is located beyond a predetermined distance from the memory controller, the requesting processing unit is invited to perform other functions; otherwise, the number of the requesting processing unit is placed in a queue table. However, if the lock control section has not been set, the lock control section of the data block is set, and the access request is allowed.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: July 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Guy L. Guthrie, William J. Starke
  • Patent number: 8209488
    Abstract: A technique for data prefetching using indirect addressing includes monitoring data pointer values, associated with an array, in an access stream to a memory. The technique determines whether a pattern exists in the data pointer values. A prefetch table is then populated with respective entries that correspond to respective array address/data pointer pairs based on a predicted pattern in the data pointer values. Respective data blocks (e.g., respective cache lines) are then prefetched (e.g., from the memory or another memory) based on the respective entries in the prefetch table.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: June 26, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Publication number: 20120159126
    Abstract: A programming language may include hint instructions that may notify a programming idiom accelerator that a programming idiom is coming. An idiom begin hint exposes the programming idiom to the programming idiom accelerator. Thus, the programming idiom accelerator need not perform pattern matching or other forms of analysis to recognize a sequence of instructions. Rather, the programmer may insert idiom hint instructions, such as an idiom begin hint, to expose the idiom to the programming idiom accelerator. Similarly, an idiom end hint may mark the end of the programming idiom.
    Type: Application
    Filed: February 1, 2008
    Publication date: June 21, 2012
    Inventors: Ravi K Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8185896
    Abstract: A method is provided for implementing a multi-tiered full-graph interconnect architecture. In order to implement a multi-tiered full-graph interconnect architecture, a plurality of processors are coupled to one another to create a plurality of processor books. The plurality of processor books are coupled together to create a plurality of supernodes. Then, the plurality of supernodes are coupled together to create the multi-tiered full-graph interconnect architecture. Data is then transmitted from one processor to another within the multi-tiered full-graph interconnect architecture based on an addressing scheme that specifies at least a supernode and a processor book associated with a target processor to which the data is to be transmitted.
    Type: Grant
    Filed: August 27, 2007
    Date of Patent: May 22, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, Edward J. Seminaro, William E. Speight
  • Patent number: 8176026
    Abstract: Mechanisms for performing a backend operation in a file system are provided. A backend operation on a portion of the file system is initiated. At least one indirect transition table data structure is created for performing the backend operation. Metadata corresponding to the portion of the file system is linked to the at least one indirect transition table data structure. The backend operation is performed on data in a sub-portion of the portion of the file system and the at least one indirect transition table data structure is updated with pointers to new locations of the data in the sub-portion as transitions of the data are completed. At least one data access operation is performed to the portion of the file system at substantially a same time as performing the backend operation on the data in the sub-portion of the portion of the file system.
    Type: Grant
    Filed: April 14, 2009
    Date of Patent: May 8, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Piyush Chaudhary
  • Patent number: 8171476
    Abstract: A hardware private array is a thread state storage that is embedded within the processor or within logic associated with a bus or wake-and-go logic. The hardware private array and/or wake-and-go array may have a limited storage area. Therefore, each thread may have an associated priority. If there is insufficient space in the hardware private array, then the wake-and-go mechanism may compare the priority of the thread to the priorities of the threads already stored in the hardware private array and wake-and-go array. If the thread has a higher priority than at least one thread already stored in the hardware private array and wake-and-go array, then the wake-and-go mechanism may remove a lowest priority thread, meaning the thread is removed from hardware private array and wake-and-go array and converted to a flee model.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: May 1, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8166277
    Abstract: A technique for performing indirect data prefetching includes determining a first memory address of a pointer associated with a data prefetch instruction. Content of a memory at the first memory address is then fetched. A second memory address is determined from the content of the memory at the first memory address. Finally, a data block (e.g., a cache line) including data at the second memory address is fetched (e.g., from the memory or another memory).
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: April 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Patent number: 8161263
    Abstract: A processor includes a first address translation engine, a second address translation engine, and a prefetch engine. The first address translation engine is configured to determine a first memory address of a pointer associated with a data prefetch instruction. The prefetch engine is coupled to the first translation engine and is configured to fetch content, included in a first data block (e.g., a first cache line) of a memory, at the first memory address. The second address translation engine is coupled to the prefetch engine and is configured to determine a second memory address based on the content of the memory at the first memory address. The prefetch engine is also configured to fetch (e.g., from the memory or another memory) a second data block (e.g., a second cache line) that includes data at the second memory address.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: April 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Patent number: 8161264
    Abstract: A technique for performing data prefetching using indirect addressing includes determining a first memory address of a pointer associated with a data prefetch instruction. Content, that is included in a first data block (e.g., a first cache line) of a memory, at the first memory address is then fetched. An offset is then added to the content of the memory at the first memory address to provide a first offset memory address. A second memory address is then determined based on the first offset memory address. A second data block (e.g., a second cache line) that includes data at the second memory address is then fetched (e.g., from the memory or another memory). A data prefetch instruction may be indicated by a unique operational code (opcode), a unique extended opcode, or a field (including one or more bits) in an instruction.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: April 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Patent number: 8161265
    Abstract: A technique for performing data prefetching using multi-level indirect data prefetching includes determining a first memory address of a pointer associated with a data prefetch instruction. Content that is included in a first data block (e.g., a first cache line of a memory) at the first memory address is then fetched. A second memory address is then determined based on the content at the first memory address. Content that is included in a second data block (e.g., a second cache line) at the second memory address is then fetched (e.g., from the memory or another memory). A third memory address is then determined based on the content at the second memory address. Finally, a third data block (e.g., a third cache line) that includes another pointer or data at the third memory address is fetched (e.g., from the memory or the another memory).
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: April 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Patent number: 8145723
    Abstract: A remote update programming idiom accelerator is configured to detect a complex remote update programming idiom in an instruction sequence of a thread. The complex remote update programming idiom includes a read operation for reading data from a storage location at a remote node, a sequence of instructions for performing an update operation on the data to form result data, and a write operation for writing the result data to the storage location at the remote node. The remote update programming idiom accelerator is configured to determine whether the sequence of instructions is longer than an instruction size threshold and responsive to a determination that the sequence of instructions is not longer than the instruction size threshold, transmit the complex remote update programming idiom to the remote node to perform the update operation on the data at the remote node.
    Type: Grant
    Filed: April 16, 2009
    Date of Patent: March 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg