Patents by Inventor Ravi Kumar Arimilli

Ravi Kumar Arimilli has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20040111575
    Abstract: A method for enabling concurrent, overlapping data moves associated with separate data clone operations of different memory cloners. A first data is being moved from its source to a destination. The first data is tagged with the address of the first destination to identify the data, and the data is sent over the fabric with the destination tag. A second data is concurrently (or subsequently) routed over the fabric to a next destination, while the first data is still on the fabric. The second data is also tagged with its specific destination tag, which is different from the destination tag of the first data routed. Thus, the two sets of data overlap on the fabric but are each uniquely identified by their respective destination tags. Both the first and second data may also be tagged with a respective unique identifier (ID) associated with the memory cloner that initiated the particular clone operation.
    Type: Application
    Filed: December 5, 2002
    Publication date: June 10, 2004
    Applicant: International Business Machines Corp.
    Inventors: Ravi Kumar Arimilli, Benjiman Lee Goodman, Jody Bern Joyner
  • Patent number: 6748518
    Abstract: Disclosed is a processor that reduces the issuing of unnecessary barrier operations during instruction processing. The processor comprises an instruction sequencing unit and a load store unit (LSU) that issues a group of memory access requests that precede a barrier instruction in an instruction sequence. The processor also includes a controller that, in response to a determination that all of the memory access requests hit in a cache affiliated with the processor, withholds issuing on an interconnect a barrier operation associated with the barrier instruction. The controller further directs the load store unit to ignore the barrier instruction and complete processing of a next group of memory access requests following the barrier instruction in the instruction sequence without receiving an acknowledgment.
    Type: Grant
    Filed: June 6, 2000
    Date of Patent: June 8, 2004
    Assignee: International Business Machines Corporation
    Inventors: Guy Lynn Guthrie, Ravi Kumar Arimilli, John Steven Dodson, Derek Edward Williams
  • Patent number: 6748501
    Abstract: A method of storing values in a sliced cache by providing separate, but coordinated, reservation units for each cache slice. When a load-with-reserve (larx) operation is issued from the processor core as part of an atomic read-modify-write sequence, a message is broadcast to each of the cache slices to clear reservation flags in the slices; a reservation flag is also set in the target cache slice, and a memory address associated with the load-with-reserve operation is loaded into a reservation unit of the target cache slice. When a conditional store operation is issued from the core to complete the atomic read-modify-write sequence, a second message is broadcast to any non-target cache slice of the processing unit to clear reservation flags in the non-target cache slice(s).
    Type: Grant
    Filed: December 30, 2000
    Date of Patent: June 8, 2004
    Assignee: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Robert Alan Cargnoni, Guy Lynn Guthrie, Derek E. Williams
  • Patent number: 6728873
    Abstract: Disclosed is a method of operation within a processor that enhances speculative branch processing. A speculative execution path contains an instruction sequence that includes a barrier instruction followed by a load instruction. While a barrier operation associated with the barrier instruction is pending, a load request associated with the load instruction is speculatively issued to memory. A flag is set for the load request when it is speculatively issued and reset when an acknowledgment is received for the barrier operation. Data which is returned by the speculatively issued load request is temporarily held and forwarded to a register or execution unit of the data processing system after the acknowledgment is received. All process results, including data returned by the speculatively issued load instructions, are discarded when the speculative execution path is determined to be incorrect.
    Type: Grant
    Filed: June 6, 2000
    Date of Patent: April 27, 2004
    Assignee: International Business Machines Corporation
    Inventors: Guy Lynn Guthrie, Ravi Kumar Arimilli, John Steven Dodson, Derek Edward Williams
  • Patent number: 6725340
    Abstract: Disclosed is a processor that reduces barrier operations during instruction processing. An instruction sequence includes a first barrier instruction and a second barrier instruction with a store instruction in between the first and second barrier instructions. A store request associated with the store instruction is issued prior to a barrier operation associated with the first barrier instruction. A determination is made of when the store request completes before the first barrier instruction has issued. In response, only a single barrier operation is issued for both the first and second barrier instructions. The single barrier operation is issued after the store request has been issued and at the time the second barrier operation is scheduled to be issued.
    Type: Grant
    Filed: June 6, 2000
    Date of Patent: April 20, 2004
    Assignee: International Business Machines Corporation
    Inventors: Guy Lynn Guthrie, Ravi Kumar Arimilli, John Steven Dodson, Derek Edward Williams
  • Patent number: 6725304
    Abstract: An apparatus for connecting circuit modules is disclosed. The apparatus receives an input and an output signal at one circuit module and uses a transmitter/receiver to transmit data to and receive data from a second circuit module. Each transmitter/receiver is selectable between a bidirectional mode that transmits and simultaneously receives via two transmission lines, and a unidirectional mode that transmits on a first transmission line and receives from a second transmission line.
    Type: Grant
    Filed: December 19, 2000
    Date of Patent: April 20, 2004
    Assignee: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Daniel Mark Dreps
  • Publication number: 20040073757
    Abstract: A multiprocessor data processing system includes first and second processors coupled to an interconnect and to a global promotion facility containing a plurality of promotion bit fields. The first processor executes a single acquisition instruction to concurrently acquire a plurality of promotion bit fields exclusive of at least the second processor. In response to execution of the acquisition instruction, the first processor receives an indication of success or failure of the acquisition instruction, wherein the indication indicates success of the acquisition instruction if all of the plurality of promotion bit fields were concurrently acquired by the first processor and indicates failure of the acquisition instruction if fewer than all of the plurality of promotion bit fields were acquired by the first processor.
    Type: Application
    Filed: October 10, 2002
    Publication date: April 15, 2004
    Applicant: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Derek Edward Williams
  • Publication number: 20040073759
    Abstract: A data processing system includes a global promotion facility and a plurality of processors coupled by an interconnect. In response to execution of an acquisition instruction by a first processor among the plurality of processors, the first processor transmits an address-only operation on the interconnect to acquire a promotion bit field within the global promotion facility exclusive of at least a second processor among the plurality of processors. In response to receipt of a combined response for the address-only operation representing a collective response of others of the plurality of processors to the address-only operation, the first processor determines whether or not acquisition of the promotion bit field was successful by reference to the combined response.
    Type: Application
    Filed: October 10, 2002
    Publication date: April 15, 2004
    Applicant: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Derek Edward Williams
  • Publication number: 20040073760
    Abstract: A data processing system includes a global promotion facility and a plurality of processing units coupled by an interconnect. At least one processing unit among the plurality of processing units includes one or more second caches having cache arrays in which instructions and operand data are cached, an instruction sequencing unit, an execution unit that executes an acquisition instruction to acquire a promotion bit field within the global promotion facility exclusive of at least one other processing unit, and a promotion cache separate from the one or more second caches. In response to acquisition of the promotion bit field by the first processor, the promotion cache of the first processor stores the promotion bit field separately from instructions and operand data.
    Type: Application
    Filed: October 10, 2002
    Publication date: April 15, 2004
    Applicant: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Derek Edward Williams
  • Publication number: 20040073765
    Abstract: A processor contains a move engine and a memory controller contains a mapping engine that, together, transparently reconfigure physical memory to accomplish addition, subtraction, or replacement of a memory module. A mapping engine register stores current and new real addresses that enable the engines to virtualize the physical address of the memory module being reconfigured and provide the reconfiguration in real-time through the use of hardware functionality and not software. Using the current and new real addresses to select a source and a target, the move engine copies the contents of the memory module to be removed or reconfigured into the remaining or inserted memory modules. Then, the real address associated with the reconfigured memory module is re-assigned to the memory module receiving the copied contents, thereby creating a virtualized physical mapping from the addressable real address space being utilized by the operating system into a virtual physical address space.
    Type: Application
    Filed: October 10, 2002
    Publication date: April 15, 2004
    Applicant: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, John Steven Dodson, Sanjeev Ghai, Kenneth Lee Wright
  • Publication number: 20040073909
    Abstract: A multiprocessor data processing system includes a plurality of processors coupled to an interconnect and to a global promotion facility containing at least one promotion bit field. A first processor executes a high speed instruction sequence including a load-type instruction to acquire a promotion bit field within the global promotion facility exclusive of at least a second processor. The request may be made visible to all processors coupled to the interconnect. In response to execution of the load-type instruction, a register of the first processor receives a register bit field indicating whether or not the promotion bit field was acquired by execution of the load-type instruction. While the first processor holds the promotion bit field exclusive of the second processor, the second processor is permitted to initiate a request on the interconnect.
    Type: Application
    Filed: October 10, 2002
    Publication date: April 15, 2004
    Applicant: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Derek Edward Williams
  • Publication number: 20040073742
    Abstract: A move engine and operating system transparently reconfigure physical memory to accomplish addition, subtraction, or replacement of a memory module. The operating system stores FROM and TO real addresses in unique fields in memory that are used to virtualize the physical address of the memory module being reconfigured and provide the reconfiguration in real-time through the use of hardware functionality and not software. Using the FROM and TO real addresses to select a source and a target, the move engine copies the contents of the memory module to be removed or reconfigured into the remaining or inserted memory module. Then, the real address associated with the reconfigured memory module is re-assigned to the memory module receiving the copied contents, thereby creating a virtualized physical mapping from the addressable real address space being utilized by the operating system into a virtual physical address space.
    Type: Application
    Filed: October 10, 2002
    Publication date: April 15, 2004
    Applicant: International Business Machines Corp.
    Inventors: Ravi Kumar Arimilli, John Steven Dodson, Sanjeev Ghai, Kenneth Lee Wright
  • Publication number: 20040073743
    Abstract: A processor contains a move engine and mapping engine that transparently reconfigure physical memory to accomplish addition, subtraction, or replacement of a memory module. A mapping engine register stores FROM and TO real addresses that enable the engines to virtualize the physical address of the memory module being reconfigured and provide the reconfiguration in real-time through the use of hardware functionality and not software. Using the FROM and TO real addresses to select a source and a target, the move engine copies the contents of the memory module to be removed or reconfigured into the remaining or inserted memory module. Then, the real address associated with the reconfigured memory module is re-assigned to the memory module receiving the copied contents, thereby creating a virtualized physical mapping from the addressable real address space being utilized by the operating system into a virtual physical address space.
    Type: Application
    Filed: October 10, 2002
    Publication date: April 15, 2004
    Applicant: International Business Machines Corp.
    Inventors: Ravi Kumar Arimilli, John Steven Dodson, Sanjeev Ghai, Kenneth Lee Wright
  • Publication number: 20040073766
    Abstract: Within a data processing system, a pool of facilities is allocated to an operating system, where each facility within the pool of facilities has an associated real address. The operating system allocates from the pool at least one bypass facility to a first process that the first process is permitted to directly access by its associated real address without first obtaining translation of a non-real address. The operating system also allocates from the pool at least one protected facility to a second process that the second process accesses only by translation of a non-real address to obtain the real address associated with the protected facility. Accesses to the facilities are either protected or unprotected based upon the state of a bypass field within a request address.
    Type: Application
    Filed: October 10, 2002
    Publication date: April 15, 2004
    Applicant: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Derek Edward Williams
  • Publication number: 20040073734
    Abstract: A multiprocessor data processing system includes first and second processors coupled to an interconnect and to a global promotion facility containing at least one promotion bit field. The first processor initiates execution of a branch-type instruction to request acquisition of a promotion bit field exclusive of at least the second processor. In response to the branch-type instruction, the first processor issues an access request to acquire the promotion bit field. After the access request, a register of the first processor receives a register bit indicating whether or not the promotion bit field was successfully acquired by the access request. As a part of executing the branch-type instruction, the first processor selects among a first execution path and a second execution path in response to the register bit.
    Type: Application
    Filed: October 10, 2002
    Publication date: April 15, 2004
    Applicant: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Derek Edward Williams
  • Publication number: 20040073756
    Abstract: A data processing system includes a global promotion facility containing a plurality of promotion bit fields, an interconnect, and a plurality of processing units coupled to the global promotion facility and to the interconnect. A first processing unit includes an instruction sequencing unit, an execution unit that executes an acquisition instruction to acquire a particular promotion bit field within the global promotion facility, and a promotion awareness facility. In response to the first processing unit snooping a request by a second processing unit for the particular promotion bit field, the first processing unit records an association between the second processing unit and the particular promotion bit field in the global promotion facility.
    Type: Application
    Filed: October 10, 2002
    Publication date: April 15, 2004
    Applicant: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Derek Edward Williams
  • Patent number: 6721853
    Abstract: A cache controller for a processor in a remote node of a system bus in a multiway multiprocessor link sends out a cache deallocate address transaction (CDAT) for a given cache line when that cache line is flushed and information from memory in a home node is no longer deemed valid for that cache line of that remote node processor. A local snoop of that CDAT transaction is then performed as a background function by other processors in the same remote node. If the snoop results indicate that the same information is valid in another cache, and that cache decides it is better to keep it valid in that remote node, then the information remains there. If the snoop results indicate that the information is not valid among caches in that remote node, or will be flushed due to the CDAT, the system memory directory in the home node of the multiprocessor link is notified and changes state in response to this.
    Type: Grant
    Filed: June 29, 2001
    Date of Patent: April 13, 2004
    Assignee: International Business Machines Corporation
    Inventors: Guy Lynn Guthrie, Ravi Kumar Arimilli, James Stephen Fields, Jr., John Steven Dodson
  • Publication number: 20040059871
    Abstract: A set of local invalidation buses for a highly scalable shared cache memory hierarchy is disclosed. A symmetric multiprocessor data processing system includes multiple processing units. Each of the processing units is associated with a level one cache memory. All the level one cache memories are associated with an imprecisely inclusive level two cache memory. In addition, a group of local invalidation buses is connected between all the level one cache memories and the level two cache memory. The imprecisely inclusive level two cache memory includes a tracking means for imprecisely tracking cache line inclusivity of the level one cache memories. Thus, the level two cache memory does not have dedicated inclusivity bits for tracking the cache line inclusivity of each of the associated level one cache memories. The tracking means includes a last_processor_to_store field and a more_than_two_loads field per cache line.
    Type: Application
    Filed: August 8, 2002
    Publication date: March 25, 2004
    Applicant: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Guy Lynn Guthrie
  • Patent number: 6711652
    Abstract: A non-uniform memory access (NUMA) computer system includes a remote node coupled by a node interconnect to a home node including a home system memory. The remote node includes a plurality of snoopers coupled to a local interconnect. The plurality of snoopers includes a cache that caches a cache line corresponding to but modified with respect to data resident in the home system memory. The cache has a cache controller that issues a deallocate operation on the local interconnect in response to deallocating the modified cache line. The remote node further includes a node controller, coupled between the local interconnect and the node interconnect, that transmits the deallocate operation to the home node with an indication of whether or not a copy of the cache line remains in the remote node following the deallocation. In this manner, the local memory directory associated with the home system memory can be updated to precisely reflect which nodes hold a copy of the cache line.
    Type: Grant
    Filed: June 21, 2001
    Date of Patent: March 23, 2004
    Assignee: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, John Steven Dodson, James Stephen Fields, Jr.
  • Patent number: 6704844
    Abstract: A method for increasing performance optimization in a multiprocessor data processing system. A number of predetermined thresholds are provided within a system controller logic and utilized to trigger specific bandwidth utilization responses. Both an address bus and data bus bandwidth utilization are monitored. Responsive to a fall of a percentage of data bus bandwidth utilization below a first predetermined threshold value, the system controller provides a particular response to a request for a cache line at a snooping processor having the cache line, where the response indicates to a requesting processor that the cache line will be provided. Conversely, if the percentage of data bus bandwidth utilization rises above a second predetermined threshold value, the system controller provides a next response to the request that indicates to any requesting processors that the requesting processor should utilize super-coherent data which is currently within its local cache.
    Type: Grant
    Filed: October 16, 2001
    Date of Patent: March 9, 2004
    Assignee: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Guy Lynn Guthrie, William J. Starke, Derek Edward Williams
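
Illustrative note on patent 6704844: the abstract describes a system controller that picks a snoop response according to where data bus bandwidth utilization sits relative to two thresholds. The sketch below is only a reading of that abstract, not the patented implementation. The names SnoopResponse and choose_response, the threshold values 0.30 and 0.70, and the DEFAULT outcome for utilization between the two thresholds are all assumptions made for illustration; the abstract does not specify them.

```python
from enum import Enum


class SnoopResponse(Enum):
    """Hypothetical response names; the abstract does not name them."""
    WILL_PROVIDE_LINE = "will_provide_line"      # snooper will supply the cache line
    USE_SUPER_COHERENT = "use_super_coherent"    # requester keeps using its local copy
    DEFAULT = "default"                          # assumed behavior between thresholds


def choose_response(data_bus_utilization, low_threshold=0.30, high_threshold=0.70):
    """Pick a response from data bus utilization (a fraction from 0.0 to 1.0).

    The threshold defaults are placeholders, not values from the patent.
    """
    if not 0.0 <= data_bus_utilization <= 1.0:
        raise ValueError("utilization must be a fraction between 0.0 and 1.0")
    if low_threshold > high_threshold:
        raise ValueError("low_threshold must not exceed high_threshold")

    # Below the first threshold there is spare data bandwidth,
    # so the cache line is provided to the requester.
    if data_bus_utilization < low_threshold:
        return SnoopResponse.WILL_PROVIDE_LINE

    # Above the second threshold the data bus is congested, so the
    # requester is told to use the super-coherent data in its local cache.
    if data_bus_utilization > high_threshold:
        return SnoopResponse.USE_SUPER_COHERENT

    # The abstract is silent on the middle band; this branch is an assumption.
    return SnoopResponse.DEFAULT
```

For example, choose_response(0.10) selects the response that provides the cache line, while choose_response(0.90) selects the response that directs the requester to its locally cached super-coherent data.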