Patents by Inventor Hanhong Xue

Hanhong Xue has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7797588
    Abstract: In a global shared memory (GSM) environment, an initiating task at a first node with a host fabric interface (HFI) uses epochs to provide reliability of transmission of packets via a network fabric to a target task. The HFI generates a packet for the initiating task addressed to the target task, and automatically inserts a current epoch of the initiating task into the packet. A copy of the current epoch is maintained by the target task, which accepts for processing only packets having the correct epoch, unless the packet is tagged for guaranteed-once delivery. When a packet delivery is accepted, the target task sends a notification to the initiating task. If the initiating task does not receive the notification of delivery for the issued packet, the initiating task updates the epoch at both the target node and the initiating node and re-transmits the packet.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: September 14, 2010
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Hanhong Xue
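    Illustrative sketch (C, not part of the patent): a minimal model of the epoch check described in the abstract, assuming hypothetical gsm_packet and task_state structures. The target accepts a packet only when its epoch matches the target's current epoch, or when the packet is tagged for guaranteed-once delivery; when a delivery notification never arrives, the initiator advances the epoch on both sides and retransmits.

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      /* Hypothetical packet and task state; field names are illustrative only. */
      struct gsm_packet { uint32_t epoch; bool guaranteed_once; const char *payload; };
      struct task_state { uint32_t current_epoch; };

      /* Target side: accept only packets carrying the current epoch,
         unless the packet is tagged for guaranteed-once delivery. */
      static bool accept_packet(const struct task_state *target, const struct gsm_packet *p)
      {
          return p->guaranteed_once || p->epoch == target->current_epoch;
      }

      /* Initiator side: no delivery notification arrived, so advance the epoch
         (conceptually on both the initiating and target nodes) and retransmit. */
      static void handle_missing_notification(struct task_state *initiator,
                                              struct task_state *target,
                                              struct gsm_packet *p)
      {
          initiator->current_epoch++;
          target->current_epoch = initiator->current_epoch;  /* epoch update at target    */
          p->epoch = initiator->current_epoch;               /* retransmit with new epoch */
      }

      int main(void)
      {
          struct task_state initiator = { .current_epoch = 7 };
          struct task_state target    = { .current_epoch = 7 };
          struct gsm_packet p = { .epoch = 7, .guaranteed_once = false, .payload = "data" };

          printf("accepted: %d\n", accept_packet(&target, &p));   /* 1: epochs match */
          handle_missing_notification(&initiator, &target, &p);   /* simulate lost ack */
          printf("accepted after epoch bump: %d\n", accept_packet(&target, &p));
          return 0;
      }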
  • Patent number: 7774554
    Abstract: A system and method for injecting important data directly into a processor's cache when that processor has previously indicated interest in the data. The memory subsystem at a target processor determines whether the memory address of data to be written to a memory location associated with the target processor is found in a processor cache of the target processor. If the memory address is found in the target processor's cache, the data is written directly to that cache at the same time that the data is provided to a location in main memory.
    Type: Grant
    Filed: February 20, 2007
    Date of Patent: August 10, 2010
    Assignee: International Business Machines Corporation
    Inventors: Piyush Chaudhary, Rama K. Govindaraju, Jay Robert Herring, Peter Hochschild, Chulho Kim, Rajeev Sivaram, Hanhong Xue
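    Illustrative sketch (C, not part of the patent): a toy write path, assuming a hypothetical direct-mapped cache directory. If the target processor's cache already holds the address being written, the data is injected into that cache line at the same time main memory is updated; otherwise only memory is written.

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>

      #define CACHE_LINES 4
      #define MEM_WORDS   64

      /* Toy direct-mapped cache directory; the layout is illustrative only. */
      struct cache { uint64_t tag[CACHE_LINES]; bool valid[CACHE_LINES]; uint64_t data[CACHE_LINES]; };
      static uint64_t memory[MEM_WORDS];

      static bool cache_holds(const struct cache *c, uint64_t addr, int *line)
      {
          *line = (int)(addr % CACHE_LINES);
          return c->valid[*line] && c->tag[*line] == addr;
      }

      /* Write path: if the target processor's cache already holds the address,
         inject the data into that cache line while also updating main memory. */
      static void write_with_injection(struct cache *c, uint64_t addr, uint64_t value)
      {
          int line;
          memory[addr % MEM_WORDS] = value;          /* always update main memory      */
          if (cache_holds(c, addr, &line))
              c->data[line] = value;                 /* direct injection into the cache */
      }

      int main(void)
      {
          struct cache c;
          memset(&c, 0, sizeof c);
          c.valid[1] = true; c.tag[1] = 5;           /* processor previously touched addr 5 */
          write_with_injection(&c, 5, 0xABCD);
          printf("cache line 1 = 0x%llx, memory[5] = 0x%llx\n",
                 (unsigned long long)c.data[1], (unsigned long long)memory[5]);
          return 0;
      }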
  • Publication number: 20100153964
    Abstract: Load balancing of adapters on a multi-adapter node of a communications environment. A task executing on the node selects an adapter resource unit to be used as its primary port for communications. The selection is based on the task's identifier, and facilitates a balancing of the load among the adapter resource units. Using the task's identifier, an index is generated that is used to select a particular adapter resource unit from a list of adapter resource units assigned to the task. The generation of the index is efficient and predictable.
    Type: Application
    Filed: December 15, 2008
    Publication date: June 17, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hung Q. Thai, Hanhong Xue
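    Illustrative sketch (C, not part of the patent): one way the index generation could look. The modulo hash of the task identifier is an assumption; the abstract only states that the index derivation is efficient and predictable and balances load across the adapter resource units.

      #include <stdio.h>

      /* Hypothetical selection rule: derive an index from the task identifier and
         use it to pick a primary adapter port from the task's assigned list. */
      static int select_adapter(unsigned task_id, int num_adapters)
      {
          return (int)(task_id % (unsigned)num_adapters);
      }

      int main(void)
      {
          const char *adapters[] = { "sni0", "sni1", "sni2", "sni3" };
          for (unsigned task_id = 0; task_id < 8; task_id++)
              printf("task %u -> primary adapter %s\n",
                     task_id, adapters[select_adapter(task_id, 4)]);
          return 0;
      }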
  • Publication number: 20100100674
    Abstract: A method, system, and computer program product are provided for managing a cache. A region to be stored within the cache is received. The cache includes multiple regions and each of the regions is defined by memory ranges having a starting index and an ending index. The region that has been received is stored in the cache in accordance with a cache invariant. The cache invariant guarantees that at any given point in time the regions in the cache are stored in a given order and none of the regions are completely contained within any other of the regions.
    Type: Application
    Filed: October 21, 2008
    Publication date: April 22, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Uman Chan, Deryck X. Hong, Tsai-Yang Jea, Hanhong Xue
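    Illustrative sketch (C, not part of the patent): a minimal region cache that preserves the stated invariant, assuming regions are plain [start, end] index ranges. Regions stay ordered by starting index, a new region already covered by a cached region is not inserted, and cached regions completely contained in the new region are evicted.

      #include <stdbool.h>
      #include <stdio.h>

      #define MAX_REGIONS 16

      /* A cached region is a [start, end] index range; names are illustrative. */
      struct region { long start, end; };
      struct region_cache { struct region r[MAX_REGIONS]; int count; };

      static bool contains(struct region outer, struct region inner)
      {
          return outer.start <= inner.start && inner.end <= outer.end;
      }

      /* Insert while preserving the invariant: ordered by starting index, and no
         region is completely contained within another. */
      static void cache_insert(struct region_cache *c, struct region nr)
      {
          int i, w = 0;
          for (i = 0; i < c->count; i++) {
              if (contains(c->r[i], nr))
                  return;                     /* already covered: keep cache as-is */
              if (!contains(nr, c->r[i]))
                  c->r[w++] = c->r[i];        /* keep regions not swallowed by nr  */
          }
          c->count = w;
          /* insert nr at its ordered position by starting index */
          for (i = c->count; i > 0 && c->r[i - 1].start > nr.start; i--)
              c->r[i] = c->r[i - 1];
          c->r[i] = nr;
          c->count++;
      }

      int main(void)
      {
          struct region_cache c = { .count = 0 };
          cache_insert(&c, (struct region){ 100, 200 });
          cache_insert(&c, (struct region){ 120, 150 });   /* contained: ignored */
          cache_insert(&c, (struct region){  50, 300 });   /* swallows 100-200   */
          for (int i = 0; i < c.count; i++)
              printf("[%ld, %ld]\n", c.r[i].start, c.r[i].end);
          return 0;
      }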
  • Patent number: 7689992
    Abstract: Shared locks are employed to control a thread that extends across more than one protocol layer in a data processing system. A counter is used as part of a data structure that makes it possible to implement shared locks across multiple layers. The use of shared locks avoids the processing overhead usually associated with lock acquisition and release. The controlled thread may be initiated in either an upper-layer protocol or a lower-layer protocol.
    Type: Grant
    Filed: June 25, 2004
    Date of Patent: March 30, 2010
    Assignee: International Business Machines Corporation
    Inventors: Robert S. Blackmore, Su-Hsuan Huang, Chulho Kim, Richard R. Treumann, Hanhong Xue
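    Illustrative sketch (C, not part of the patent): a shared lock modeled as a single atomic counter. Code on any protocol layer increments the counter on entry and decrements it on exit, so a thread crossing layers avoids a full lock acquire/release at each crossing; the exclusive-user check is an assumption about how such a counter would typically be consumed.

      #include <stdatomic.h>
      #include <stdio.h>

      /* Hypothetical shared-lock structure: one atomic counter records how many
         shared holders are currently active across the protocol layers. */
      struct shared_lock { atomic_int holders; };

      static void shared_enter(struct shared_lock *l) { atomic_fetch_add(&l->holders, 1); }
      static void shared_exit (struct shared_lock *l) { atomic_fetch_sub(&l->holders, 1); }

      /* An exclusive user would wait until the counter drains to zero. */
      static int exclusive_can_proceed(struct shared_lock *l)
      {
          return atomic_load(&l->holders) == 0;
      }

      int main(void)
      {
          struct shared_lock l = { .holders = 0 };
          shared_enter(&l);                 /* upper-layer code enters             */
          shared_enter(&l);                 /* same logical thread crosses a layer */
          printf("exclusive allowed? %d\n", exclusive_can_proceed(&l));  /* 0 */
          shared_exit(&l);
          shared_exit(&l);
          printf("exclusive allowed? %d\n", exclusive_can_proceed(&l));  /* 1 */
          return 0;
      }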
  • Publication number: 20090198975
    Abstract: A data processing system has a processor, a memory, and an instruction set architecture (ISA) that includes: (1) an asynchronous memory mover (AMM) store (ST) instruction that initiates an asynchronous memory move operation, which moves data from a first memory location having a first real address to a second memory location having a second real address by: (a) first performing a move of the data in virtual address space utilizing a source effective address and a destination effective address; and (b) when the move is completed, completing a physical move of the data to the second memory location, independent of the processor. The ISA further provides (2) an AMM terminate ST instruction for stopping an ongoing AMM operation before completion of the AMM operation, and (3) a LD CMP instruction for checking the status of an AMM operation.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Ronald N. Kalla, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20090198963
    Abstract: A method within a data processing system by which a processor executes an asynchronous memory move (AMM) store (ST) instruction to complete a corresponding AMM operation in parallel with an ongoing (not yet completed), previously issued barrier operation. The processor receives the AMM ST instruction after executing the barrier operation (or SYNC instruction) and before the completion of the barrier operation or SYNC on the system fabric. The processor continues executing the AMM ST instruction, which performs a move in virtual address space and then triggers the generation of the AMM operation. The AMM operation proceeds while the barrier operation continues, independent of the processor. The processor stops further execution of all other memory access requests, excluding AMM ST instructions that are received after the barrier operation, but before completion of the barrier operation.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20090198891
    Abstract: A data processing system enables global shared memory (GSM) operations across multiple nodes with a distributed EA-to-RA mapping of physical memory. Each node has a host fabric interface (HFI), which includes HFI windows that are assigned to at most one locally-executing task of a parallel job. The tasks perform parallel job execution, but map only a portion of the effective addresses (EAs) of the global address space to the local, real memory of the task's respective node. The HFI window tags all outgoing GSM operations (of the local task) with the job ID, and embeds the target node and HFI window IDs of the node at which the EA is memory mapped. The HFI window also enables processing of received GSM operations with valid EAs that are homed to the local real memory of the receiving node, while preventing processing of other received operations without a valid EA-to-RA local mapping.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, William J. Starke, Hanhong Xue
  • Publication number: 20090199200
    Abstract: A method and data processing system for performing fence operations within a global shared memory (GSM) environment having a local task executing on a processor and providing GSM commands for processing by a host fabric interface (HFI) window that is allocated to the task. The HFI window has one or more registers for use during local fence operations. A first register tracks a first count of task-issued GSM commands, and a second register tracks a second count of GSM operations being processed by the HFI. The processing logic detects a locally-issued fence operation, and responds by performing a series of operations, including: automatically stopping the task from issuing additional GSM commands; monitoring for completion of all the task-issued GSM commands at the HFI; and triggering a resumption of issuance of GSM commands by the task when the completion of all previous task-issued GSM commands is registered by the HFI.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
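    Illustrative sketch (C, not part of the patent): the two HFI-window registers modeled as plain counters, one for task-issued GSM commands and one for GSM operations the HFI has finished processing. A fence stops new issues and waits until the two counts agree before the task resumes; the drain loop is a stand-in for the HFI completing outstanding work.

      #include <stdio.h>

      /* Hypothetical HFI-window registers, modeled as plain counters. */
      struct hfi_window { unsigned long issued; unsigned long completed; };

      static void task_issue(struct hfi_window *w)   { w->issued++; }
      static void hfi_complete(struct hfi_window *w) { w->completed++; }

      /* A local fence: stop issuing new commands, wait until everything issued
         so far has been processed, then allow issuing to resume. */
      static void fence(struct hfi_window *w)
      {
          while (w->completed < w->issued)
              hfi_complete(w);   /* stand-in for "HFI drains outstanding work" */
          /* completed == issued: the task may resume issuing GSM commands */
      }

      int main(void)
      {
          struct hfi_window w = { 0, 0 };
          task_issue(&w); task_issue(&w); task_issue(&w);
          fence(&w);
          printf("issued=%lu completed=%lu (fence done)\n", w.issued, w.completed);
          return 0;
      }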
  • Publication number: 20090198939
    Abstract: A data processing system has an asynchronous memory mover, which includes multiple sets of registers for storing addressing and control parameters utilized to generate one or more asynchronous memory move (AMM) operations. The memory mover detects receipt of a first set of parameters in a first set of registers from the processor. The processor forwards the parameters after it initiates a data move in virtual address space, utilizing a source effective address and a destination effective address. The memory mover responds to receiving the first set of parameters by generating and launching a first asynchronous memory move (AMM) operation. When the memory mover receives a second set of parameters in a second set of registers before the first AMM operation completes, the memory mover generates and launches a second AMM operation concurrently with the first AMM operation if no address conflicts exist.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20090198935
    Abstract: A data processing system with a processor and memory includes an instruction set architecture (ISA) that provides an asynchronous memory move (AMM) store (ST) instruction. When the processor executes the AMM ST instruction, the processor performs a series of functions, which initiates an asynchronous memory move (AMM) operation. The AMM ST instruction moves data from a first memory location having a first real address to a second memory location having a second real address by: (a) performing a move of the data in virtual address space utilizing a source effective address that is memory mapped to the first memory location and a destination effective address that is memory mapped to the second memory location. When the move is completed in the virtual address space, the AMM operation performs the physical move of the data to the second memory location outside the processor core, without processor involvement.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20090198934
    Abstract: A data processing system has a processor, a memory coupled to the processor, and an asynchronous memory mover coupled to the processor. The asynchronous memory mover has registers for receiving a set of parameters from the processor, which parameters are associated with an asynchronous memory move (AMM) operation initiated by the processor in virtual address space, utilizing a source effective address and a destination effective address. The asynchronous memory mover performs the AMM operation to move the data from a first physical memory location having a source real address corresponding to the source effective address to a second physical memory location having a destination real address corresponding to the destination effective address. The asynchronous memory mover has an associated off-chip translation mechanism. The AMM operation thus occurs independent of the processor, and the processor continues processing other operations independent of the AMM operation.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
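    Illustrative sketch (C, not part of the patent): the asynchronous memory mover modeled as a worker thread. The issuing thread plays the role of the processor: it hands the mover a parameter set (source, destination, length) and continues with other work while the copy proceeds independently; the join stands in for a completion check.

      #include <pthread.h>
      #include <stdio.h>
      #include <string.h>

      /* The parameter set the processor would place in the mover's registers;
         the struct and thread-based mover below are purely illustrative. */
      struct amm_params { const char *src; char *dst; size_t len; };

      /* The "asynchronous memory mover": performs the physical copy on its own,
         while the issuing thread (the "processor") keeps doing other work. */
      static void *amm_mover(void *arg)
      {
          struct amm_params *p = arg;
          memcpy(p->dst, p->src, p->len);
          return NULL;
      }

      int main(void)
      {
          static const char src[] = "asynchronously moved data";
          static char dst[sizeof src];
          struct amm_params params = { src, dst, sizeof src };

          pthread_t mover;
          pthread_create(&mover, NULL, amm_mover, &params);  /* launch AMM operation */

          /* ... the processor would continue with unrelated work here ... */

          pthread_join(mover, NULL);    /* completion check before using dst */
          printf("dst = \"%s\"\n", dst);
          return 0;
      }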
  • Publication number: 20090199201
    Abstract: In a global shared memory (GSM) environment, an initiating task at a first node with a host fabric interface (HFI) uses epochs to provide reliability of transmission of packets via a network fabric to a target task. The HFI generates a packet for the initiating task addressed to the target task, and automatically inserts a current epoch of the initiating task into the packet. A copy of the current epoch is maintained by the target task, which accepts for processing only packets having the correct epoch, unless the packet is tagged for guaranteed-once delivery. When a packet delivery is accepted, the target task sends a notification to the initiating task. If the initiating task does not receive the notification of delivery for the issued packet, the initiating task updates the epoch at both the target node and the initiating node and re-transmits the packet.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Hanhong Xue
  • Publication number: 20090198937
    Abstract: A data processing system includes a set of architected registers within which the processor places state and other information to communicate with the asynchronous memory mover in order to initiate and control an AMM operation. The asynchronous memory mover performs an asynchronous memory move (AMM) operation in response to receiving a set of parameters within the architected registers, which parameters are associated with an AMM store instruction executed by the processor to initiate a move of data in virtual space before placing the information in the architected registers. The architected registers are processor architected registers, defined on a per-thread basis by a compiler, or memory-mapped architected registers allocated for communicating with the asynchronous memory mover during a bind and subsequent execution of an application. The architected registers are also utilized to store state information to enable a restore to a point before execution of the AMM operation.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20090198908
    Abstract: While an AMM operation is ongoing, a prefetch request for data from the source effective address or the destination effective address triggers a cache injection by the AMM mover (or memory controller) of relevant data from the stream of data being moved in the physical memory. The memory controller forwards the first prefetched line to the prefetch engine and L1 cache. The memory controller also forwards the next cache lines in the sequence of data to the L2 cache and a subsequent set of cache lines to the L3 cache. The memory controller then forwards the remaining data to the destination memory location. Quick access to prefetched data is enabled by buffering the stream of data in the upper caches rather than placing all the moved data within the memory. The memory controller also avoids overrunning the upper caches by placing moved data into only a subset of the available cache lines of the upper-level caches.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20090199194
    Abstract: A method and data processing system for tracking global shared memory (GSM) operations to and from a local node configured with a host fabric interface (HFI) coupled to a network fabric. During task/job initialization, the system OS assigns HFI window(s) to handle the GSM packet generation and GSM packet receipt and processing for each local task. HFI processing logic automatically tags each GSM packet generated by the HFI window with a global job identifier (ID) of the job to which the local task is affiliated. The job ID is embedded within each GSM packet placed on the network fabric. On receipt of a GSM packet from the network fabric, the HFI logic retrieves the embedded job ID and compares the embedded job ID with the ID within the HFI window(s). GSM packets are forwarded to an HFI window only when the embedded job ID matches the HFI window's job ID.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
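    Illustrative sketch (C, not part of the patent): job-ID tagging and filtering, assuming hypothetical gsm_packet and hfi_window structures. Outgoing packets are stamped with the sending window's job ID, and an arriving packet is forwarded to a window only when the embedded ID matches.

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      /* Illustrative structures: every outgoing GSM packet is stamped with the
         global job ID of the sending task's job. */
      struct gsm_packet { uint32_t job_id; const char *payload; };
      struct hfi_window { uint32_t job_id; };

      static struct gsm_packet tag_packet(const struct hfi_window *w, const char *payload)
      {
          return (struct gsm_packet){ .job_id = w->job_id, .payload = payload };
      }

      /* On receipt, forward the packet to a window only when the IDs match. */
      static bool deliver(const struct hfi_window *w, const struct gsm_packet *p)
      {
          return p->job_id == w->job_id;   /* non-matching packets are rejected */
      }

      int main(void)
      {
          struct hfi_window sender = { .job_id = 42 }, receiver = { .job_id = 42 };
          struct gsm_packet p = tag_packet(&sender, "put 8 bytes @ EA 0x7f00");
          printf("delivered: %d\n", deliver(&receiver, &p));      /* 1 */
          receiver.job_id = 99;                                   /* different job */
          printf("delivered: %d\n", deliver(&receiver, &p));      /* 0 */
          return 0;
      }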
  • Publication number: 20090198938
    Abstract: A method within a data processing system in which a processor handles conflicts that occur while an asynchronous memory mover performs an asynchronous memory move (AMM) operation. The asynchronous memory mover performs the AMM operation, by which the actual data is moved from a source to a destination memory location, independent of the processor. The memory mover sets a flag bit to indicate that the asynchronous memory mover is currently performing an AMM operation at the memory. When the processor receives a memory access operation, the processor checks the value of the flag bit before issuing the new memory access operation, and checks the associated address of the AMM operation to determine possible address conflicts. The processor then evaluates and responds to address conflicts to prevent corruption of data during an AMM operation.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
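    Illustrative sketch (C, not part of the patent): the conflict check a processor might perform before issuing a new memory access, assuming a hypothetical amm_state record that holds the flag bit and the source/destination ranges of the in-flight AMM operation.

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      /* Illustrative AMM bookkeeping: a flag saying an AMM operation is in
         flight, plus the source and destination ranges it is touching. */
      struct amm_state {
          bool      in_progress;          /* the "flag bit" from the abstract */
          uintptr_t src, dst;
          size_t    len;
      };

      static bool overlaps(uintptr_t base, size_t len, uintptr_t addr)
      {
          return addr >= base && addr < base + len;
      }

      /* Before issuing a new memory access, check the flag and the AMM's
         addresses; on a conflict, the access must be stalled or otherwise
         resolved instead of being issued immediately. */
      static bool access_conflicts(const struct amm_state *amm, uintptr_t addr)
      {
          if (!amm->in_progress)
              return false;
          return overlaps(amm->src, amm->len, addr) || overlaps(amm->dst, amm->len, addr);
      }

      int main(void)
      {
          struct amm_state amm = { .in_progress = true, .src = 0x1000, .dst = 0x8000, .len = 256 };
          printf("store to 0x1040 conflicts? %d\n", access_conflicts(&amm, 0x1040)); /* 1 */
          printf("store to 0x4000 conflicts? %d\n", access_conflicts(&amm, 0x4000)); /* 0 */
          return 0;
      }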
  • Publication number: 20090198918
    Abstract: A data processing system enables global shared memory (GSM) operations across multiple nodes with a distributed EA-to-RA mapping of physical memory. Each node has a host fabric interface (HFI), which includes HFI windows that are assigned to at most one locally-executing task of a parallel job. The tasks perform parallel job execution, but map only a portion of the effective addresses (EAs) of the global address space to the local, real memory of the task's respective node. The HFI window tags all outgoing GSM operations (of the local task) with the job ID, and embeds the target node and HFI window IDs of the node at which the EA is memory mapped. The HFI window also enables processing of received GSM operations with valid EAs that are homed to the local real memory of the receiving node, while preventing processing of other received operations without a valid EA-to-RA local mapping.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, William J. Starke, Hanhong Xue
  • Publication number: 20090199046
    Abstract: A host fabric interface (HFI) enables debugging of global shared memory (GSM) operations received at a local node from a network fabric. The local node has a memory management unit (MMU), which provides an effective address to real address (EA-to-RA) translation table that is utilized by the HFI to evaluate when EAs of GSM operations/data from a received GSM packet are memory-mapped to RAs of the local memory. The HFI retrieves the EA associated with a GSM operation/data within a received GSM packet. The HFI forwards the EA to the MMU, which determines when the EA is mapped to RAs within the local memory for the local task. The HFI processing logic enables processing of the GSM packet only when the EA of the GSM operation/data within the GSM packet is an EA that has a local RA translation. Non-matching EAs result in an error condition that requires debugging.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
  • Publication number: 20090199195
    Abstract: A method for issuing global shared memory (GSM) operations from an originating task on a first node coupled to a network fabric of a distributed network via a host fabric interface (HFI). The originating task generates a GSM command within an effective address (EA) space. The task then places the GSM command within a send FIFO. The send FIFO is a portion of real memory having real addresses (RA) that are memory mapped to EAs of a globally executing job. The originating task maintains a local EA-to-RA mapping of only a portion of the real address space of the globally executing job. The task enables the HFI to retrieve the GSM command from the send FIFO into an HFI window allocated to the originating task. The HFI window generates a corresponding GSM packet containing GSM operations and/or data, and the HFI window issues the GSM packet to the network fabric.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
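    Illustrative sketch (C, not part of the patent): the send FIFO modeled as a small ring buffer of command strings; the task enqueues GSM commands and the HFI dequeues them into its window for packet generation. In the patent, the FIFO lives in real memory that is memory-mapped into the job's effective address space; the ring buffer and names below are assumptions.

      #include <stdio.h>

      #define FIFO_SLOTS 8

      /* Toy send FIFO shared between the task (producer) and the HFI (consumer). */
      struct send_fifo { const char *cmd[FIFO_SLOTS]; unsigned head, tail; };

      static int task_enqueue(struct send_fifo *f, const char *cmd)
      {
          if (f->tail - f->head == FIFO_SLOTS) return -1;      /* FIFO full */
          f->cmd[f->tail++ % FIFO_SLOTS] = cmd;
          return 0;
      }

      static const char *hfi_dequeue(struct send_fifo *f)
      {
          if (f->head == f->tail) return NULL;                 /* FIFO empty */
          return f->cmd[f->head++ % FIFO_SLOTS];
      }

      int main(void)
      {
          struct send_fifo f = { .head = 0, .tail = 0 };
          task_enqueue(&f, "GSM put: EA 0x1000, 64 bytes");
          task_enqueue(&f, "GSM get: EA 0x2000, 64 bytes");
          for (const char *c; (c = hfi_dequeue(&f)) != NULL; )
              printf("HFI builds packet for: %s\n", c);
          return 0;
      }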