Patents by Inventor Ravi K. Arimilli

Ravi K. Arimilli has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20090199184
    Abstract: A wake-and-go mechanism is provided for a data processing system. When a thread is waiting for an event, rather than performing a series of get-and-compare sequences, the thread updates a wake-and-go array with a target address associated with the event. Software may save the state of the thread. The thread is then put to sleep. When the wake-and-go array snoops a kill at a given target address, logic associated with wake-and-go array may generate an exception, which may result in a switch to kernel mode, wherein the operating system performs some action before returning control to the originating process. In this case, the trap results in other software, such as the operating system or background sleeper thread, for example, to reload thread from thread state storage and to continue processing of the active threads on the processor.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Publication number: 20090198914
    Abstract: According to at least one embodiment, a method of data processing in a multiprocessor data processing system includes a requesting processing unit initiating an interconnect operation including a memory access request that indicates an acceptability of a variable amount of data to service the interconnect request for data. In response to snooping the memory access request on an interconnect, a snooper selects an amount of data to supply to the requesting processing unit and transmits the selected amount of data to the requesting processing unit. The requesting processing unit receives the selected amount of data and utilizes at least some of the selected amount of data to service a processor request.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: LAKSHMINARAYANA B. ARIMILLI, Ravi K. Arimilli, Jerry D. Lewis, Warren E. Maule
  • Publication number: 20090199029
    Abstract: A wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism recognizes a programming idiom, specialized instruction, operating system call, or application programming interface call that indicates that a thread is waiting for an event. The wake-and-go mechanism updates a wake-and-go array with a target address, expected data value, and comparison type associated with the event. The thread then goes to sleep until the event occurs. The wake-and-go array may be a content addressable memory (CAM). When a transaction appears on the symmetric multiprocessing (SMP) fabric that modifies the value at a target address in the CAM, logic associated with the CAM performs a comparison based on the data value being written, expected data value, and comparison type.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Publication number: 20090199183
    Abstract: A wake-and-go mechanism is provided for a data processing system. When a thread is waiting for an event, rather than performing a series of get-and-compare sequences, the thread updates a wake-and-go array with a target address associated with the event. The wake-and-go mechanism may save the state of the thread in a hardware private array. The hardware private array may comprise a plurality of memory cells embodied within the processor or pervasive logic associated with the bus, for example. Alternatively, the hardware private array may be embodied within logic associated with the wake-and-go storage array.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Publication number: 20090198865
    Abstract: In at least one embodiment, a method of data processing in a data processing system having a memory hierarchy includes a processor core executing a storage-modifying memory access instruction to determine a memory address. The processor core transmits to a cache memory within the memory hierarchy a storage-modifying memory access request including the memory address, an indication of a memory access type, and, if present, a partial cache line hint signaling access to less than all granules of a target cache line of data associated with the memory address. In response to the storage-modifying memory access request, the cache memory performs a storage-modifying access to all granules of the target cache line of data if the partial cache line hint is not present and performs a storage-modifying access to less than all granules of the target cache line of data if the partial cache line hint is present.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: RAVI K. ARIMILLI, GUY L. GUTHRIE, WILLIAM J. STARKE, DEREK E. WILLIAMS
  • Publication number: 20090198916
    Abstract: A method for supporting low-overhead memory locks within a multi-processor system is disclosed. A lock control section is initially assigned to a data block within a system memory of the multiprocessor system. In response to a request for accessing the data block by a processing unit within the multiprocessor system, a determination is made by a memory controller whether or not the lock control section of the data block has been set. If the lock control section of the data block has been set, the request for accessing the data block is ignored. Otherwise, if the lock control section of the data block has not been set, the lock control section of the data block is set, and the request for accessing the data block is allowed.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Guy L. Guthrie, Edward J. Seminaro, William J. Starke
  • Publication number: 20090199030
    Abstract: A hardware wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism recognizes a programming idiom that indicates that a thread is waiting for an event. The wake-and-go mechanism updates a wake-and-go array with a target address associated with the event. The thread then goes to sleep until the event occurs. The wake-and-go array may be a content addressable memory (CAM). When a transaction appears on the symmetric multiprocessing (SMP) fabric that modifies the value at a target address in the CAM, the CAM returns a list of storage addresses at which the target address is stored. The wake-and-go mechanism associates these storage addresses with the threads waiting for an even at the target addresses, and may wake the one or more threads waiting for the event.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Publication number: 20090198953
    Abstract: An addressing model is provided where devices, including I/O devices, are addressed with internet protocol (IP) addresses, which are considered part of the virtual address space. A task, such as an application, may be assigned an effective address range, which corresponds to addresses in the virtual address space. The virtual address space is expanded to include Internet protocol addresses. Thus, the page frame tables are also modified to include entries for IP addresses and additional properties for devices and I/O. Thus, a processing element, such as an I/O adapter or even a printer, for example, may also be addressed using IP addresses without the need for library calls, device drivers, pinning memory, and so forth. This addressing model also provides full virtualization of resources across an IP interconnect, allowing a process to access an I/O device across a network.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Claude Basso, Jean L. Calvignac, Piyush Chaudhary, Edward J. Seminaro
  • Publication number: 20090198910
    Abstract: According to method of data processing in a multiprocessor data processing system, in response to a processor touch request targeting a target granule of a cache line of data containing multiple granules, a processing unit originates on an interconnect of the multiprocessor data processing system a partial touch request that requests a copy of only the target granule for subsequent query access. In response to a combined response to the partial touch request indicating success, the combined response representing a system-wide response to the partial touch request, the processing unit receives the target granule of the target cache line and updates a coherency state of the target granule while retaining a coherency state of at least one other granule of the cache line.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: RAVI K. ARIMILLI, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Publication number: 20090199170
    Abstract: A set of helper thread binaries is created to retrieve data used by a set of main thread binaries. If executing a portion of the set of helper thread binaries results in the retrieval of data needed by the set of main thread binaries, then that retrieved data is utilized by the set of main thread binaries.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Juan C. Rubio, Balaram Sinharoy
  • Publication number: 20090198918
    Abstract: A data processing system enables global shared memory (GSM) operations across multiple nodes with a distributed EA-to-RA mapping of physical memory. Each node has a host fabric interface (HFI), which includes HFI windows that are assigned to at most one locally-executing task of a parallel job. The tasks perform parallel job execution, but map only a portion of the effective addresses (EAs) of the global address space to the local, real memory of the task's respective node. The HFI window tags all outgoing GSM operations (of the local task) with the job ID, and embeds the target node and HFI window IDs of the node at which the EA is memory mapped. The HFI window also enables processing of received GSM operations with valid EAs that are homed to the local real memory of the receiving node, while preventing processing of other received operations without a valid EA-to-RA local mapping.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, William J. Starke, Hanhong Xue
  • Publication number: 20090198965
    Abstract: According to a method of data processing, a memory controller receives a prefetch load request from a processor core of a data processing system. The prefetch load request specifies a requested line of data. In response to receipt of the prefetch load request, the memory controller determines by reference to a stream of demand requests how much data is to be supplied to the processor core in response to the prefetch load request. In response to the memory controller determining to provide less than all of the requested line of data, the memory controller provides less than all of the requested line of data to the processor core.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: RAVI K. ARIMILLI, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Publication number: 20090198955
    Abstract: A distributed data processing system includes: (1) a first node with a processor, a first memory, and asynchronous memory mover logic; and connection mechanism that connects (2) a second node having a second memory. The processor includes processing logic for completing a cross-node asynchronous memory move (AMM) operation, wherein the processor performs a move of data in virtual address space from a first effective address to a second effective address, and the asynchronous memory mover logic completes a physical move of the data from a first memory location in the first memory having a first real address to a second memory location in the second memory having a second real address. The data is transmitted via the connection mechanism connecting the two nodes independent of the processor.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20090198917
    Abstract: An instruction set architecture (ISA) includes an asynchronous memory move (AMM) synchronization (SYNC) instruction. When processor of a data processing system executes the AMM SYNC instruction, the processor prevents an AMM operation generated by a subsequently received/executed AMM ST instruction from proceeding with the data move portion of the AMM operation within the memory subsystem until completion of all ongoing memory access operations within the memory subsystem and fabric. The AMM operation does not wait for a normal barrier operation. The processor forwards the information relevant to initiate the AMM operation to an asynchronous memory mover logic, and signals the logic to not proceed with the AMM operation until signaled of the completion of the AMM SYNC.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20090199189
    Abstract: A wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism recognizes a programming idiom that indicates that a thread is spinning on a lock. The wake-and-go mechanism updates a wake-and-go array with a target address associated with the lock and sets a lock bit in the wake-and-go array. The thread then goes to sleep until the lock frees. The wake-and-go array may be a content addressable memory (CAM). When a transaction appears on the symmetric multiprocessing (SMP) fabric that modifies the value at a target address in the CAM, the CAM returns a list of storage addresses at which the target address is stored. The wake-and-go mechanism associates these storage addresses with the threads waiting for an even at the target addresses, and may wake the thread that is spinning on the lock.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Publication number: 20090198904
    Abstract: A technique for performing data prefetching using indirect addressing includes determining a first memory address of a pointer associated with a data prefetch instruction. Content, that is included in a first data block (e.g., a first cache line) of a memory, at the first memory address is then fetched. An offset is then added to the content of the memory at the first memory address to provide a first offset memory address. A second memory address is then determined based on the first offset memory address. A second data block (e.g., a second cache line) that includes data at the second memory address is then fetched (e.g., from the memory or another memory). A data prefetch instruction may be indicated by a unique operational code (opcode), a unique extended opcode, or a field (including one or more bits) in an instruction.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Publication number: 20090198936
    Abstract: A method performed in a data processing system initiates an asynchronous memory move (AMM) operation, whereby a processor performs a move of data in virtual address space from a first effective address to a second effective address and forwards parameters of the AMM operation to asynchronous memory mover logic for completion of the physical movement of data from a first memory location to a second memory location. The processor executes a second operation, which checks a status of the completion of the data move and returns a notification indicating the status. The notification indicates a status, which includes one of: data move in progress; data move totally done; data move partially done; data move cannot be performed; and occurrence of a translation look-aside buffer invalidate entry (TLBIE) operation. The processor initiates one or more actions in response to the notification received.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Ronald N. Kalla, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20090199181
    Abstract: A set of helper thread binaries is created from a set of main thread binaries. The helper thread monitors software or hardware ports for incoming data events. When the helper thread detects an incoming event, the helper thread asynchronously executes instructions that calculate incoming data needed by the main thread.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Juan C. Rubio, Balaram Sinharoy
  • Publication number: 20090198695
    Abstract: A locking mechanism for supporting distributed computing within a multiprocessor system is disclosed. A lock control section and a stage control section are assigned to a data block within a system memory. In response to a request for accessing the data block by a processing unit, a determination is made by a memory controller whether or not the lock control section of the data block has been set. If the lock control section of the data block has been set, the access request is denied. Otherwise, if the lock control section of the data block has not been set, another determination is made whether or not a current processing stage of the requesting processing unit matches a processing stage indicated by the stage control section. If the current processing stage of the requesting processing unit does not match the processing stage indicated by the stage control section, the access request is denied; otherwise, the access request is allowed.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Guy L. Guthrie, William J. Starke
  • Publication number: 20090198958
    Abstract: A system and method for performing dynamic request routing based on broadcast source request information are provided. Each processor chip in the system may use a synchronized heartbeat signal it generates to provide source request information to each of the other processor chips in the system. The source request information identifies the number of active source requests sent by the processor chip that originated the heartbeat signal. The source request information from each of the processor chips in the system may be used by the processor chips in determining optimal routing paths for data from a source processor chip to a destination processor chip. As a result, the congestion of data for processing at each of the processor chips along each possible routing path may be taken into account when selecting to which processor chip to forward data.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Bernard C. Drerup, Jody B. Joyner, Jerry D. Lewis