Patents by Inventor Ravi Arimilli

Ravi Arimilli has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20080109816
    Abstract: A processor communication register (PCR) contained in each processor within a multiprocessor system provides enhanced processor communication. Each PCR stores identical processor communication information that is useful in pipelined or parallel multi-processing. Each processor has exclusive rights to store to a sector within each PCR and has continuous access to read the contents of its own PCR. Each processor updates its exclusive sector within all of the PCRs, instantly allowing all of the other processors to see the change within the PCR data, and bypassing the cache subsystem. Efficiency is enhanced within the multiprocessor system by providing processor communications to be immediately transferred into all processors without momentarily restricting access to the information or forcing all the processors to be continually contending for the same cache line, and thereby overwhelming the interconnect and memory system with an endless stream of load, store and invalidate commands.
    Type: Application
    Filed: January 10, 2008
    Publication date: May 8, 2008
    Inventors: RAVI ARIMILLI, Robert Cargnoni, Derek Williams, Kenneth Wright
  • Publication number: 20080091918
    Abstract: A processor communication register (PCR) contained within a multiprocessor cluster system provides enhanced processor communication. The PCR stores information that is useful in pipelined or parallel multi-processing. Each processor cluster has exclusive rights to store to a sector within the PCR and has continuous access to read its contents. Each processor cluster updates its exclusive sector within the PCR, instantly allowing all of the other processors within the cluster network to see the change within the PCR data, and bypassing the cache subsystem. Efficiency is enhanced within the processor cluster network by providing processor communications to be immediately networked and transferred into all processors without momentarily restricting access to the information or forcing all the processors to be continually contending for the same cache line, and thereby overwhelming the interconnect and memory system with an endless stream of load, store and invalidate commands.
    Type: Application
    Filed: December 7, 2007
    Publication date: April 17, 2008
    Inventors: Ravi Arimilli, Robert Cargnoni, Derek Williams, Kenneth Wright
  • Publication number: 20070250668
    Abstract: A data processing system includes a memory subsystem and an execution unit, coupled to the memory subsystem, which executes store instructions to determine target memory addresses of store operations to be performed by the memory subsystem. The data processing system further includes a mode field having a first setting indicating strong ordering between store operations and a second setting indicating weak ordering between store operations. Store operations accessing the memory subsystem are associated with either the first setting or the second setting. The data processing system also includes logic that, based upon settings of the mode field, inserts a synchronizing operation between a store operation associated with the first setting and a store operation associated with the second setting, such that all store operations preceding the synchronizing operation complete before store operations subsequent to the synchronizing operation.
    Type: Application
    Filed: April 25, 2006
    Publication date: October 25, 2007
    Inventors: Ravi Arimilli, Thomas Capasso, Guy Guthrie, Hugh Shen, William Starke
  • Publication number: 20070250669
    Abstract: A data processing system includes a processor core and a memory subsystem. The memory subsystem includes a store queue having a plurality of entries, where each entry includes an address field for holding the target address of store operation, a data field for holding data for the store operation, and a virtual sync field indicating a presence or absence of a synchronizing operation associated with the entry. The memory subsystem further includes a store queue controller that, responsive to receipt at the memory subsystem of a sequence of operations including a synchronizing operation and a particular store operation, places a target address and data of the particular store operation within the address field and data field, respectively, of an entry in the store queue and sets the virtual sync field of the entry to represent the synchronizing operation, such that a number of store queue entries utilized is reduced.
    Type: Application
    Filed: April 25, 2006
    Publication date: October 25, 2007
    Inventors: Ravi Arimilli, Thomas Capasso, Robert Cargnoni, Guy Guthrie, Hugh Shen, William Starke
  • Publication number: 20070226423
    Abstract: A data processing system includes at least first and second coherency domains, each including at least one processor core and a memory. In response to an initialization operation by a processor core that indicates a target memory block to be initialized, a cache memory in the first coherency domain determines a coherency state of the target memory block with respect to the cache memory. In response to the determination, the cache memory selects a scope of broadcast of an initialization request identifying the target memory block. A narrower scope including the first coherency domain and excluding the second coherency domain is selected in response to a determination of a first coherency state, and a broader scope including the first coherency domain and the second coherency domain is selected in response to a determination of a second coherency state. The cache memory then broadcasts an initialization request with the selected scope.
    Type: Application
    Filed: March 23, 2006
    Publication date: September 27, 2007
    Inventors: Ravi Arimilli, Guy Guthrie, William Starke, Derek Williams
  • Publication number: 20070150675
    Abstract: A system, method, and a computer readable for protecting content of a memory page are disclosed. The method includes determining a start of a semi-synchronous memory copy operation. A range of addresses is determined where the semi-synchronous memory copy operation is being performed. An issued instruction that removes a page table entry is detected. The method further includes determining whether the issued instruction is destined to remove a page table entry associated with at least one address in the range of addresses. In response to the issued instruction being destined to remove the page table entry, the execution of the issued instruction is stalled until the semi-synchronous memory copy operation is completed.
    Type: Application
    Filed: December 22, 2005
    Publication date: June 28, 2007
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ravi Arimilli, Rama Govindaraju, Peter Hochschild, Bruce Mealey, Satya Sharma, Balaram Sinharoy
  • Publication number: 20070150659
    Abstract: A system, method, and a computer readable for inserting data into a cache memory based on information in a semi-synchronous memory copy instruction are disclosed. The method comprises determining a start of a semi-synchronous memory copy operation. The semi-synchronous memory copy operation is checked for a given value in at least one cache injection bit. In response to the given value in the cache injection bit, a predefined number of lines of destination data is copied into at least one level of cache memory.
    Type: Application
    Filed: December 22, 2005
    Publication date: June 28, 2007
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ravi Arimilli, Rama Govindaraju, Peter Hochschild, Bruce Mealey, Satya Sharma, Balaram Sinharoy
  • Publication number: 20070150665
    Abstract: A method, processing node, and computer readable medium for propagating data using mirrored lock caches are disclosed. The method includes coupling a first mirrored lock cache associated with a first processing node to a bus that is communicatively coupled to at least a second mirrored lock cache associated with a second processing node in a multi-processing system. The method further includes receiving, by the first mirrored lock cache, data from a processing node. The data is then mirrored automatically so that the same data is available locally at the second mirrored lock cache for use by the second processing node.
    Type: Application
    Filed: December 22, 2005
    Publication date: June 28, 2007
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ravi Arimilli, Rama Govindaraju, Peter Hochschild, Bruce Mealey, Satya Sharma, Balaram Sinharoy
  • Publication number: 20070150676
    Abstract: A system, method, and computer program product for semi-synchronously copying data from a first portion of memory to a second portion of memory are disclosed. The method comprises receiving, in a processor, a call for a semi-synchronous memory copy operation. The semi-synchronous memory copy operation preserves temporal persistence of validity for a virtual source address corresponding to a source location in a memory and a virtual target address corresponding to a target location in the memory by setting a flag bit. The call includes at least the virtual source address, the virtual target address, and an indicator identifying a number of bytes to be copied. The memory copy operation is placed in a queue for execution by a memory controller. The queue is coupled to the memory controller. At least one subsequent instruction is continued to be executed as the subsequent instruction becomes available from an instruction pipeline.
    Type: Application
    Filed: December 22, 2005
    Publication date: June 28, 2007
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ravi Arimilli, Rama Govindaraju, Peter Hochschild, Bruce Mealey, Satya Sharma, Balaram Sinharoy
  • Publication number: 20070081516
    Abstract: A data processing system includes a first plane including a first plurality of processing nodes, each including multiple processing units, and a second plane including a second plurality of processing nodes, each including multiple processing units. The data processing system also includes a plurality of point-to-point first tier links. Each of the first plurality and second plurality of processing nodes includes one or more first tier links among the plurality of first tier links, where the first tier link(s) within each processing node connect a pair of processing units in the same processing node for communication. The data processing system further includes a plurality of point-to-point second tier links.
    Type: Application
    Filed: October 7, 2005
    Publication date: April 12, 2007
    Inventors: Ravi Arimilli, Benjiman Goodman, Guy Guthrie, Praveen Reddy, William Starke
  • Publication number: 20060265553
    Abstract: In response to receiving an initialization operation from an associated processor core that indicates a target memory block to be initialized, a cache memory determines a coherency state of the target memory block. In response to a determination that the target memory block has a data-invalid coherency state with respect to the cache memory, the cache memory issues on a interconnect a corresponding initialization request indicating the target memory block. In response to the initialization request, the target memory block is initialized within a memory of the data processing system to an initialization value. The target memory block may thus be initialized without the cache memory holding a valid copy of the target memory block.
    Type: Application
    Filed: May 17, 2005
    Publication date: November 23, 2006
    Applicant: International Business Machines Corporation
    Inventors: Ravi Arimilli, Derek Williams
  • Publication number: 20060251120
    Abstract: An Ethernet adapter is disclosed. The Ethernet adapter comprises a plurality of layers for allowing the adapter to receive and transmit packets from and to a processor. The plurality of layers include a demultiplexing mechanism to allow for partitioning of the processor. A Host Ethernet Adapter (HEA) is an integrated Ethernet adapter providing a new approach to Ethernet and TCP acceleration. A set of TCP/IP acceleration features have been introduced in a toolkit approach: Servers TCP/IP stacks use these accelerators when and as required. The interface between the server and the network interface controller has been streamlined by bypassing the PCI bus. The HEA supports network virtualization. The HEA can be shared by multiple OSs providing the essential isolation and protection without affecting its performance.
    Type: Application
    Filed: April 1, 2005
    Publication date: November 9, 2006
    Inventors: Ravi Arimilli, Claude Basso, Jean Calvignac, Chih-Jen Chang, Philippe Damon, Ronald Fuhs, Satya Sharma, Natarajan Vaidhyanathan, Fabrice Verplanken, Colin Verrilli, Scott Willenborg
  • Publication number: 20050251623
    Abstract: A method and processor system that substantially eliminates data bus operations when completing updates of an entire cache line with a full store queue entry. The store queue within a processor chip is designed with a series of AND gates connecting individual bits of the byte enable bits of a corresponding entry. The AND output is fed to the STQ controller and signals when the entry is full. When full entries are selected for dispatch to the RC machines, the RC machine is signaled that the entry updates the entire cache line. The RC machine obtains write permission to the line, and then the RC machine overwrites the entire cache line. Because the entire cache line is overwritten, the data of the cache line is not retrieved when the request for the cache line misses at the cache or when data goes state before write permission is obtained by the RC machine.
    Type: Application
    Filed: April 15, 2004
    Publication date: November 10, 2005
    Applicant: International Business Machines Corp.
    Inventors: Ravi Arimilli, Guy Guthrie, Hugh Shen, Derek Williams
  • Publication number: 20050251622
    Abstract: A method and processor system that substantially enhances the store gathering capabilities of a store queue entry to enable gathering of a maximum number of proximate-in-time store operations before the entry is selected for dispatch. A counter is provided for each entry to track a time since a last gather to the entry. When a new gather does not occur before the counter reaches a threshold saturation point, the entry is signaled ready for dispatch. By defining an optimum threshold saturation point before the counter expires, sufficient time is provided for the entry to gather a proximate-in-time store operation. The entry may be deemed eligible for selection when certain conditions occur, including the entry becoming full, issuance of a barrier operation, and saturation of the counter. The use of the counter increases the ability of a store queue entry to complete gathering of enough store operations to update an entire cache line before that entry is dispatched to an RC machine.
    Type: Application
    Filed: April 15, 2004
    Publication date: November 10, 2005
    Applicant: International Business Machines Corp.
    Inventors: Ravi Arimilli, Robert Cargnoni, Hugh Shen, Derek Williams
  • Publication number: 20050193174
    Abstract: A method for informing a processor of a selected order of transmission of data to the processor. The method comprises the steps of coupling system components via a data bus to the processor to effectuate data transfer, determining at the system component logic the order in which to transmit data to the processor, and issuing to the data bus a selected order bit concurrent with the data, wherein the selected order bit alerts the processor of the order and the data is transmitted in that order. In a preferred embodiment, the system component is the cache and the method may involve receiving at the cache a preference of ordering for a read address/request from the processor. The preference order logic of the cache controller or a preference order logic component evaluates the preference of ordering desired by comparing the processor preference with other preferences, including cache order preference.
    Type: Application
    Filed: January 22, 2005
    Publication date: September 1, 2005
    Inventors: Ravi Arimilli, Vicente Chung, Guy Guthrie, Jody Joyner
  • Publication number: 20050154861
    Abstract: According to a method of operating a data processing system, software communicates to a processing unit a classification each of at least one schedulable software entity that the processing unit executes. A resource manager within the processing unit dynamically allocates hardware resources within the processing unit to the schedulable software entity during execution in accordance with the classification of the at least one schedulable software entity. The classification may be retrieved by the software from in data storage, and operating system software may schedule the schedulable software entity for execution by reference to the classification. The processing unit may also monitor, in hardware, execution of each of a plurality of schedulable software entities within the processing unit in accordance with a monitoring parameter set among one or more monitoring parameter sets.
    Type: Application
    Filed: January 13, 2004
    Publication date: July 14, 2005
    Applicant: International Business Machines Corporation
    Inventors: Ravi Arimilli, Randal Swanberg
  • Publication number: 20050154860
    Abstract: According to a method of operating a data processing system, one or more monitoring parameter sets are established in a processing unit within the data processing system. The processing unit monitors, in hardware, execution of each of a plurality of schedulable software entities within the processing unit in accordance with a monitoring parameter set among the one or more monitoring parameter sets. The processing unit then reports to software executing in the data processing system utilization of hardware resources by each of the plurality of schedulable software entities. The hardware utilization information reported by the processing unit may be stored and utilized by software to schedulable execution of the schedulable software entities reported by the processing unit.
    Type: Application
    Filed: January 13, 2004
    Publication date: July 14, 2005
    Applicant: International Business Machines Corporation
    Inventors: Ravi Arimilli, Randal Swanberg
  • Publication number: 20050149660
    Abstract: A data interconnect and routing mechanism reduces data communication latency, supports dynamic route determination based upon processor activity level/traffic, and implements an architecture that supports scalable improvements in communication frequencies. In one implementation, a data processing system includes at least first through third processing units, data storage coupled to the plurality of processing units, and an interconnect fabric. The interconnect fabric includes at least a first data bus coupling the first processing unit to the second processing unit and a second data bus coupling the third processing unit to the second processing unit so that the first and third processing units can transmit data traffic to the second processing unit. The data processing system further includes a control channel coupling the first and third processing units.
    Type: Application
    Filed: January 7, 2004
    Publication date: July 7, 2005
    Applicant: International Business Machines Corporation
    Inventors: Ravi Arimilli, Jerry Lewis, Vicente Chung, Jody Joyner
  • Publication number: 20050149692
    Abstract: The data interconnect and routing mechanism reduces data communication latency, supports dynamic route determination based upon processor activity level/traffic, and implements an architecture that supports scalable improvements in communication frequencies. In one application, a data processing system includes first and second processing books, each including at least first and second processing units. Each of the first and second processing units has a respective first output data bus. The first output data bus of the first processing unit is coupled to the second processing unit, and the first output data bus of the second processing unit is coupled to the first processing unit. At least the first processing unit of the first processing book and the second processing unit of the second processing book each have a respective second output data bus.
    Type: Application
    Filed: January 7, 2004
    Publication date: July 7, 2005
    Applicant: International Business Machines Corp.
    Inventors: Ravi Arimilli, Jerry Lewis, Vicente Chung, Jody Joyner
  • Publication number: 20050132147
    Abstract: A data processing system includes one or more processing cores, a system memory having multiple rows of data storage, and a memory controller that controls access to the system memory and performs supplier-based memory speculation. The memory controller includes a memory speculation table that stores historical information regarding prior memory accesses. In response to a memory access request, the memory controller directs an access to a selected row in the system memory to service the memory access request. The memory controller speculatively directs that the selected row will continue to be energized following the access based upon the historical information in the memory speculation table, so that access latency of an immediately subsequent memory access is reduced.
    Type: Application
    Filed: December 10, 2003
    Publication date: June 16, 2005
    Applicant: International Business Machines Corporation
    Inventors: Ravi Arimilli, Sanjeev Ghai, Warren Maule