Patents by Inventor Hanhong Xue

Hanhong Xue has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8135911
    Abstract: A method, system, and computer program product are provided for managing a cache. A region to be stored within the cache is received. The cache includes multiple regions, and each region is defined by a memory range having a starting index and an ending index. The region that has been received is stored in the cache in accordance with a cache invariant. The cache invariant guarantees that at any given point in time the regions in the cache are stored in a given order and none of the regions is completely contained within any other region.
    Type: Grant
    Filed: October 21, 2008
    Date of Patent: March 13, 2012
    Assignee: International Business Machines Corporation
    Inventors: Uman Chan, Deryck X. Hong, Tsai-Yang Jea, Hanhong Xue
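
The region invariant described in the abstract above lends itself to a short illustration. The following C sketch is illustrative only; the type and function names are hypothetical and not taken from the patent. It keeps regions sorted by starting index, rejects a candidate that is fully contained in an existing region, and evicts existing regions fully contained in the candidate:

```c
/* Hypothetical illustration of the cache invariant described above:
 * regions stay ordered by start index, and no region is completely
 * contained within another. Not taken from the patent text. */
#include <stdio.h>

typedef struct { long start, end; } region_t;

typedef struct {
    region_t r[64];   /* capacity checks omitted for brevity */
    int n;
} region_cache_t;

/* Insert a region while preserving the invariant. Returns 0 if the
 * candidate was already covered by an existing region. */
static int cache_insert(region_cache_t *c, region_t cand)
{
    int i, j = 0;

    /* Reject a candidate fully contained in an existing region. */
    for (i = 0; i < c->n; i++)
        if (c->r[i].start <= cand.start && cand.end <= c->r[i].end)
            return 0;

    /* Evict existing regions fully contained in the candidate. */
    for (i = 0; i < c->n; i++)
        if (!(cand.start <= c->r[i].start && c->r[i].end <= cand.end))
            c->r[j++] = c->r[i];
    c->n = j;

    /* Insert in start-index order so regions remain sorted. */
    for (i = c->n; i > 0 && c->r[i - 1].start > cand.start; i--)
        c->r[i] = c->r[i - 1];
    c->r[i] = cand;
    c->n++;
    return 1;
}

int main(void)
{
    region_cache_t c = { .n = 0 };
    cache_insert(&c, (region_t){ 100, 200 });
    cache_insert(&c, (region_t){ 120, 180 });   /* contained: rejected */
    cache_insert(&c, (region_t){  50, 300 });   /* evicts 100..200     */
    for (int i = 0; i < c.n; i++)
        printf("[%ld, %ld]\n", c.r[i].start, c.r[i].end);
    return 0;
}
```
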
  • Publication number: 20120023304
    Abstract: A message flow controller limits a process from passing a new message in a reliable message passing layer from a source node to at least one destination node while a total number of in-flight messages for the process meets a first level limit. The message flow controller limits the new message from passing from the source node to a particular destination node from among a plurality of destination nodes while a total number of in-flight messages to the particular destination node meets a second level limit. Responsive to the total number of in-flight messages to the particular destination node not meeting the second level limit, the message flow controller only sends a new packet from among at least one packet for the new message to the particular destination node while a total number of in-flight packets for the new message is less than a third level limit.
    Type: Application
    Filed: July 22, 2010
    Publication date: January 26, 2012
    Applicant: International Business Machines Corporation
    Inventors: Uman Chan, Deryck X. Hong, Tsai-Yang Jea, Chulho Kim, Zenon J. Piatek, Hung Q. Thai, Abhinav Vishnu, Hanhong Xue
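
The three-level limits described in the abstract above can be sketched as simple counters. The following C fragment is an assumption-laden illustration (limit values, struct layout, and names are invented here, not drawn from the application):

```c
/* Hypothetical sketch of the three-level limits described above: a
 * per-process limit on in-flight messages, a per-destination limit,
 * and a per-message limit on in-flight packets. */
#include <stdbool.h>
#include <stdio.h>

#define MAX_DESTS 16

enum {
    LVL1_MAX_MSGS_PER_PROCESS = 128,  /* first level limit  */
    LVL2_MAX_MSGS_PER_DEST    = 32,   /* second level limit */
    LVL3_MAX_PKTS_PER_MSG     = 8     /* third level limit  */
};

struct flow_ctl {
    int inflight_msgs_total;             /* per source process   */
    int inflight_msgs_to[MAX_DESTS];     /* per destination node */
};

struct message {
    int dest;
    int inflight_pkts;                   /* unacknowledged packets */
};

/* A new message may be injected only if both message-level limits hold. */
static bool may_start_message(const struct flow_ctl *fc, int dest)
{
    return fc->inflight_msgs_total < LVL1_MAX_MSGS_PER_PROCESS &&
           fc->inflight_msgs_to[dest] < LVL2_MAX_MSGS_PER_DEST;
}

/* A further packet of an admitted message may be sent only while the
 * message's in-flight packet count is below the third level limit. */
static bool may_send_packet(const struct message *m)
{
    return m->inflight_pkts < LVL3_MAX_PKTS_PER_MSG;
}

int main(void)
{
    struct flow_ctl fc = { 0 };
    struct message m = { .dest = 3, .inflight_pkts = 0 };

    if (may_start_message(&fc, m.dest)) {
        fc.inflight_msgs_total++;
        fc.inflight_msgs_to[m.dest]++;
        while (may_send_packet(&m))
            m.inflight_pkts++;           /* "send" until limit reached */
    }
    printf("packets in flight: %d\n", m.inflight_pkts);
    return 0;
}
```
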
  • Patent number: 8095758
    Abstract: A data processing system has a processor, a memory coupled to the processor, and an asynchronous memory mover coupled to the processor. The asynchronous memory mover has registers for receiving a set of parameters from the processor, which parameters are associated with an asynchronous memory move (AMM) operation initiated by the processor in virtual address space, utilizing a source effective address and a destination effective address. The asynchronous memory mover performs the AMM operation to move the data from a first physical memory location having a source real address corresponding to the source effective address to a second physical memory location having a destination real address corresponding to the destination effective address. The asynchronous memory mover has an associated off-chip translation mechanism. The AMM operation thus occurs independent of the processor, and the processor continues processing other operations independent of the AMM operation.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: January 10, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Patent number: 8056087
    Abstract: A barrier synchronization register, accessible to the nodes in a distributed data processing system, has portions thereof allotted to threads which are present in multiple groups. The barrier synchronization register portion allotted to a given thread has stored therein, over time, group identifier numbers. In this way the state space of a barrier synchronization register is shared over more than one group of process threads.
    Type: Grant
    Filed: September 25, 2006
    Date of Patent: November 8, 2011
    Assignee: International Business Machines Corporation
    Inventors: Piyush Chaudhary, Rama K. Govindaraju, Chulho Kim, Rajeev Sivaram, Hanhong Xue
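
One way to picture the shared state space described above is a register divided into per-thread slots, each holding the identifier of the group the thread is currently synchronizing with. The C sketch below is a software analogy with invented slot widths and names; the patented mechanism is a hardware register:

```c
/* Illustrative sketch (not the patented hardware) of sharing one
 * barrier synchronization register across groups: each thread owns an
 * 8-bit slot and writes the identifier of the group it is currently
 * synchronizing with; a group's barrier is complete when every member
 * thread's slot holds that group's identifier. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SLOT_BITS 8u

/* Thread 'tid' announces arrival at the barrier of group 'group_id'. */
static void bsr_arrive(uint64_t *bsr, unsigned tid, uint8_t group_id)
{
    uint64_t mask = 0xffull << (tid * SLOT_BITS);
    *bsr = (*bsr & ~mask) | ((uint64_t)group_id << (tid * SLOT_BITS));
}

/* The barrier for 'group_id' is complete when every member slot matches. */
static bool bsr_complete(uint64_t bsr, const unsigned *members, int n,
                         uint8_t group_id)
{
    for (int i = 0; i < n; i++) {
        uint8_t slot = (uint8_t)(bsr >> (members[i] * SLOT_BITS));
        if (slot != group_id)
            return false;
    }
    return true;
}

int main(void)
{
    uint64_t bsr = 0;
    unsigned group_a[] = { 0, 1, 2 };      /* threads in group 7 */

    bsr_arrive(&bsr, 0, 7);
    bsr_arrive(&bsr, 1, 7);
    printf("complete? %d\n", bsr_complete(bsr, group_a, 3, 7)); /* 0 */
    bsr_arrive(&bsr, 2, 7);
    printf("complete? %d\n", bsr_complete(bsr, group_a, 3, 7)); /* 1 */
    return 0;
}
```
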
  • Publication number: 20110238956
    Abstract: A mechanism is provided in a collective acceleration unit for performing a collective operation to distribute or collect data among a plurality of participant nodes. The mechanism receives an input collective packet for a collective operation from a neighbor node within a collective tree. The input collective packet comprises a tree identifier and an input data field, and the collective tree comprises a plurality of sub trees. The mechanism maps the tree identifier to an index within the collective acceleration unit. The index identifies a portion of resources within the collective acceleration unit and is associated with a set of neighbor nodes in a given sub tree within the collective tree. For each neighbor node, the collective acceleration unit stores destination information. The collective acceleration unit performs an operation on the input data field using the portion of resources to effect the collective operation.
    Type: Application
    Filed: March 29, 2010
    Publication date: September 29, 2011
    Applicant: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Paul F. Lecocq, Hanhong Xue
  • Patent number: 8015380
    Abstract: A data processing system has an asynchronous memory mover, which includes multiple sets of registers for storing addressing and control parameters utilized to generate one or more asynchronous memory move (AMM) operations. The memory mover detects a receipt of a first set of parameters in a first set of registers from the processor. The processor forwards the parameters after the processor initiates a data move in virtual address space, utilizing a source effective address and a destination effective address. The memory mover responds to receiving the first set of parameters by generating and launching a first asynchronous memory move (AMM) operation. When the memory mover receives a second set of parameters in a second set of registers before the first AMM operation completes, the memory mover generates and launches a second AMM operation concurrently with the first AMM operation if no address conflicts exist.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: September 6, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
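
The "no address conflicts" condition in the abstract above amounts to a range-overlap test between the two moves. The C sketch below is a software-level illustration with hypothetical names; the actual check is performed by the hardware memory mover:

```c
/* Hedged sketch of the address-conflict test implied above: a second
 * asynchronous move may be launched alongside a first one only if
 * neither move's destination range overlaps the other move's source
 * or destination range. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct amm_params {
    uintptr_t src;     /* source effective address      */
    uintptr_t dst;     /* destination effective address */
    size_t    len;     /* bytes to move                 */
};

static bool ranges_overlap(uintptr_t a, size_t alen, uintptr_t b, size_t blen)
{
    return a < b + blen && b < a + alen;
}

/* True if the two moves can proceed concurrently without corrupting data. */
static bool amm_no_conflict(const struct amm_params *x,
                            const struct amm_params *y)
{
    return !ranges_overlap(x->dst, x->len, y->src, y->len) &&
           !ranges_overlap(y->dst, y->len, x->src, x->len) &&
           !ranges_overlap(x->dst, x->len, y->dst, y->len);
}

int main(void)
{
    struct amm_params a = { 0x1000, 0x2000, 0x100 };
    struct amm_params b = { 0x3000, 0x4000, 0x100 };
    struct amm_params c = { 0x2000, 0x5000, 0x100 };  /* reads a's dest */

    printf("a,b concurrent ok: %d\n", amm_no_conflict(&a, &b));  /* 1 */
    printf("a,c concurrent ok: %d\n", amm_no_conflict(&a, &c));  /* 0 */
    return 0;
}
```
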
  • Patent number: 7991981
    Abstract: A method within a data processing system by which a processor executes an asynchronous memory move (AMM) store (ST) instruction to complete a corresponding AMM operation in parallel with an ongoing (not yet completed), previously issued barrier operation. The processor receives the AMM ST instruction after executing the barrier operation (or SYNC instruction) and before the completion of the barrier operation or SYNC on the system fabric. The processor continues executing the AMM ST instruction, which performs a move in virtual address space and then triggers the generation of the AMM operation. The AMM operation proceeds while the barrier operation continues, independent of the processor. The processor stops further execution of all other memory access requests, excluding AMM ST instructions that are received after the barrier operation, but before completion of the barrier operation.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: August 2, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20110173258
    Abstract: A mechanism is provided for collective acceleration unit tree flow control, which forms a logical tree (sub-network) among processors and transfers "collective" packets on this tree. The system supports many collective trees, and each collective acceleration unit (CAU) includes resources to support a subset of the trees. Each CAU has limited buffer space, and the connection between two CAUs is not completely reliable. Therefore, to address the challenge of collective packets traversing the tree without colliding with each other for buffer space, and of guaranteeing end-to-end packet delivery, each CAU in the system flow controls the packets, detects packet loss, and retransmits lost packets.
    Type: Application
    Filed: December 17, 2009
    Publication date: July 14, 2011
    Applicant: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Jody B. Joyner, Paul F. Lecocq, Hanhong Xue
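
The flow-control and retransmission behavior summarized above can be pictured with a very small credit/acknowledgement model. The C fragment below is an analogy only (a generic stop-and-wait style scheme with invented names), not the protocol claimed in the publication:

```c
/* A very small sketch of the idea described above: a CAU only forwards
 * a packet when the downstream neighbor has buffer credit, and resends
 * packets that are never acknowledged. Illustrative analogy only. */
#include <stdbool.h>
#include <stdio.h>

struct link_state {
    int credits;          /* free buffer slots advertised by the neighbor */
    unsigned next_seq;    /* sequence number of the next packet to send   */
    unsigned acked_seq;   /* highest sequence number acknowledged         */
};

/* Send (or later resend) the packet carrying sequence number 'seq'. */
static void transmit(struct link_state *ls, unsigned seq)
{
    printf("transmit packet %u (credits left: %d)\n", seq, ls->credits);
}

static bool try_send(struct link_state *ls)
{
    if (ls->credits <= 0)
        return false;              /* would collide with buffered packets */
    ls->credits--;
    transmit(ls, ls->next_seq++);
    return true;
}

/* Called when an acknowledgement (or a timeout) is observed. */
static void on_ack(struct link_state *ls, unsigned seq)
{
    ls->acked_seq = seq;
    ls->credits++;
}

static void on_timeout(struct link_state *ls)
{
    /* Loss detected: resend everything past the last acknowledgement. */
    for (unsigned s = ls->acked_seq + 1; s < ls->next_seq; s++)
        transmit(ls, s);
}

int main(void)
{
    struct link_state ls = { .credits = 2, .next_seq = 1, .acked_seq = 0 };

    try_send(&ls);                       /* packet 1 */
    try_send(&ls);                       /* packet 2 */
    if (!try_send(&ls))
        printf("no credit: packet 3 held back\n");
    on_ack(&ls, 1);                      /* packet 1 delivered, credit returned */
    on_timeout(&ls);                     /* packet 2 presumed lost, resent       */
    return 0;
}
```
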
  • Patent number: 7966454
    Abstract: A data processing system enables global shared memory (GSM) operations across multiple nodes with a distributed EA-to-RA mapping of physical memory. Each node has a host fabric interface (HFI), which includes HFI windows that are assigned to at most one locally-executing task of a parallel job. The tasks perform parallel job execution, but map only a portion of the effective addresses (EAs) of the global address space to the local, real memory of the task's respective node. The HFI window tags all outgoing GSM operations (of the local task) with the job ID, and embeds the target node and HFI window IDs of the node at which the EA is memory mapped. The HFI window also enables processing of received GSM operations with valid EAs that are homed to the local real memory of the receiving node, while preventing processing of other received operations without a valid EA-to-RA local mapping.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: June 21, 2011
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, William J. Starke, Hanhong Xue
  • Publication number: 20110145791
    Abstract: A technique for debugging code during runtime includes providing, from an outside process, a trigger to a daemon. In this case, the trigger is associated with a registered callback function. The trigger is then provided, from the daemon, to one or more designated tasks of a job. The registered callback function (that is associated with the trigger) is then executed by the one or more designated tasks. Execution results of the executed registered callback function are then returned (from the one or more designated tasks) to the daemon.
    Type: Application
    Filed: December 16, 2009
    Publication date: June 16, 2011
    Applicant: International Business Machines Corporation
    Inventors: Chulho Kim, Hanhong Xue, Tsai-Yang Jea, Hung Q. Thai
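
The trigger-to-callback flow in the abstract above is easy to sketch as a registry keyed by trigger name. The C example below is a minimal, hypothetical illustration; the registry layout, callback signature, and names are assumptions made here:

```c
/* Minimal sketch (hypothetical names) of the runtime-debug flow above:
 * callbacks are registered against a trigger identifier; when the
 * daemon delivers that trigger to a designated task, the task runs the
 * registered callback and hands its result back to the daemon. */
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef int (*debug_cb_t)(int task_id);

struct registration {
    const char *trigger;
    debug_cb_t  cb;
};

/* Example callback: report how much work this task has completed. */
static int dump_progress(int task_id)
{
    printf("task %d: dumping progress counters\n", task_id);
    return task_id * 10;   /* stand-in for a real result */
}

static struct registration registry[] = {
    { "dump-progress", dump_progress },
};

/* Daemon side: forward the trigger to each designated task and collect
 * the callback results that the tasks return. */
static void deliver_trigger(const char *trigger, const int *tasks, int n)
{
    for (size_t i = 0; i < sizeof registry / sizeof registry[0]; i++) {
        if (strcmp(registry[i].trigger, trigger) != 0)
            continue;
        for (int t = 0; t < n; t++) {
            int result = registry[i].cb(tasks[t]);
            printf("daemon received result %d from task %d\n",
                   result, tasks[t]);
        }
    }
}

int main(void)
{
    int designated[] = { 2, 5 };
    deliver_trigger("dump-progress", designated, 2);
    return 0;
}
```
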
  • Patent number: 7958327
    Abstract: A data processing system with a processor and memory includes an instruction set architecture (ISA) that provides an asynchronous memory move (AMM) store (ST) instruction. When the processor executes the AMM ST instruction, the processor performs a series of functions, which initiates an asynchronous memory move (AMM) operation. The AMM ST instruction moves data from a first memory location having a first real address to a second memory location having a second real address by: (a) performing a move of the data in virtual address space utilizing a source effective address that is memory mapped to the first memory location and a destination effective address that is memory mapped to the second memory location. When the move is completed in the virtual address space, the AMM operation performs the physical move of the data to the second memory location outside the processor core, without processor involvement.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: June 7, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Patent number: 7941627
    Abstract: An instruction set architecture (ISA) includes an asynchronous memory move (AMM) synchronization (SYNC) instruction. When processor of a data processing system executes the AMM SYNC instruction, the processor prevents an AMM operation generated by a subsequently received/executed AMM ST instruction from proceeding with the data move portion of the AMM operation within the memory subsystem until completion of all ongoing memory access operations within the memory subsystem and fabric. The AMM operation does not wait for a normal barrier operation. The processor forwards the information relevant to initiate the AMM operation to an asynchronous memory mover logic, and signals the logic to not proceed with the AMM operation until signaled of the completion of the AMM SYNC.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: May 10, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Patent number: 7937570
    Abstract: A data processing system has a processor, a memory, and an instruction set architecture (ISA) that includes: an asynchronous memory mover (AMM) store (ST) instruction that initiates an asynchronous memory move operation that moves data from a first memory location having a first real address to a second memory location having a second real address by: (a) first performing a move of the data in virtual address space utilizing a source effective address and a destination effective address; and (b) when the move is completed, completing a physical move of the data to the second memory location, independent of the processor. The ISA further provides an AMM terminate ST instruction for stopping an ongoing AMM operation before completion of the AMM operation, and a LD CMP instruction for checking a status of an AMM operation.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: May 3, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Ronald N. Kalla, Chulho Kim, Balaram Sinharoy, Hanhong Xue
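
The three instructions named above (start, terminate, status check) are hardware ISA features, but their interplay can be pictured as a small software state machine. The C sketch below is purely a software-level analogy with invented names, not the instruction semantics defined by the patent:

```c
/* Software analogy of the instruction trio described above: start an
 * asynchronous move, optionally terminate it early, and poll its
 * status. All names are hypothetical. */
#include <stdio.h>

enum amm_status { AMM_IDLE, AMM_IN_PROGRESS, AMM_DONE, AMM_TERMINATED };

struct amm_handle {
    enum amm_status status;
};

/* Analogue of the AMM ST instruction: kick off an asynchronous move. */
static void amm_st(struct amm_handle *h)   { h->status = AMM_IN_PROGRESS; }

/* Analogue of the AMM terminate ST instruction: stop an ongoing move. */
static void amm_terminate(struct amm_handle *h)
{
    if (h->status == AMM_IN_PROGRESS)
        h->status = AMM_TERMINATED;
}

/* Analogue of the LD CMP instruction: read back the operation status. */
static enum amm_status amm_ld_cmp(const struct amm_handle *h)
{
    return h->status;
}

int main(void)
{
    struct amm_handle h = { AMM_IDLE };
    amm_st(&h);
    printf("status after start: %d\n", amm_ld_cmp(&h));      /* in progress */
    amm_terminate(&h);
    printf("status after terminate: %d\n", amm_ld_cmp(&h));  /* terminated  */
    return 0;
}
```
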
  • Patent number: 7930504
    Abstract: A method within a data processing system by which a processor handles conflicts that occur while an asynchronous memory mover performs an asynchronous memory move (AMM) operation. The asynchronous memory mover moves the actual data from a source to a destination memory location, independent of the processor. The memory mover sets a flag bit to indicate that the asynchronous memory mover is currently performing an AMM operation at the memory. When the processor receives a memory access operation, the processor checks the value of the flag bit before issuing the new memory access operation, and checks the associated address of the AMM operation to determine possible address conflicts. The processor then evaluates and responds to address conflicts to prevent corruption of data during an AMM operation.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: April 19, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Patent number: 7921275
    Abstract: While an asynchronous memory move (AMM) operation is ongoing, a prefetch request for data from the source effective address or the destination effective address triggers cache injection by the AMM mover of relevant data from the stream of data being moved in the physical memory. The memory controller forwards the first prefetched line to the prefetch engine and L1 cache, the next cache lines in the sequence of data to the L2 cache, and a subsequent set of cache lines to the L3 cache. The memory controller then forwards the remaining data to the destination memory location. Quick access to prefetch data is enabled by buffering the stream of data in the upper caches rather than placing all the moved data within the memory. Also, the memory controller places moved data into only a subset of the available cache lines of the upper level cache.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: April 5, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Robert S. Blackmore, Chulho Kim, Balaram Sinharoy, Hanhong Xue
  • Publication number: 20110078410
    Abstract: Disclosed are a method of and system for multiple party communications in a processing system including multiple processing subsystems. Each of the processing subsystems includes a central processing unit and one or more network adapters for connecting said each processing subsystem to the other processing subsystems. A multitude of nodes are established or created, and each of these nodes is associated with one of the processing subsystems. A first aspect of the invention involves pipelined communication using RDMA among three nodes, where the first node breaks up a large communication into multiple parts and sends these parts one after the other to the second node using RDMA, and the second node in turn absorbs and forwards each of these parts to a third node before all parts of the communication arrive from the first node.
    Type: Application
    Filed: July 17, 2006
    Publication date: March 31, 2011
    Applicant: International Business Machines Corporation
    Inventors: Robert S. Blackmore, Rama K. Govindaraju, Peter H. Hochschild, Chulho Kim, Rajeev Sivaram, Richard R. Treumann, Hanhong Xue
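
The pipelined forwarding described in the first aspect above can be illustrated without any real RDMA machinery: the source splits the buffer into parts, and the intermediate node forwards each part as soon as it arrives. The toy C example below invents its chunk size and function names for illustration:

```c
/* A toy pipeline (no real RDMA) illustrating the first aspect above:
 * node 1 sends a large message to node 2 in parts, and node 2 absorbs
 * and forwards each part to node 3 before the whole message arrives. */
#include <stdio.h>
#include <string.h>

#define CHUNK 4   /* illustrative part size, in bytes */

/* Node 2: absorb one part and immediately forward it to node 3. */
static void node2_receive_and_forward(const char *part, size_t len)
{
    printf("node2 -> node3: forwarding %zu bytes: %.*s\n",
           len, (int)len, part);
}

/* Node 1: send the message to node 2 one part at a time. */
static void node1_send_pipelined(const char *msg)
{
    size_t total = strlen(msg);
    for (size_t off = 0; off < total; off += CHUNK) {
        size_t len = total - off < CHUNK ? total - off : CHUNK;
        node2_receive_and_forward(msg + off, len);   /* part "arrives" */
    }
}

int main(void)
{
    node1_send_pipelined("pipelined multi-party transfer");
    return 0;
}
```
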
  • Publication number: 20110061053
    Abstract: The present invention provides portable user-space release and reacquisition of adapter resources for a given job on a node, using information in a network resource table. The information in the network resource table is obtained when a user space application is loaded by a resource manager. The present invention provides a portable solution that will work for any interconnect where adapter resources need to be freed and reacquired, without having to write a specific function in the device driver. In the present invention, the preemption request is done on a job basis using a key or "job key" that was previously loaded when the user space application or job originally requested the adapter resources. This is done for each OS instance where the job is run.
    Type: Application
    Filed: April 7, 2008
    Publication date: March 10, 2011
    Applicant: International Business Machines Corporation
    Inventors: Richard J. Coppinger, Deryck X. Hong, Chulho Kim, Hanhong Xue
  • Patent number: 7877436
    Abstract: A method and a data processing system for completing checkpoint processing of a distributed job with local tasks communicating with other remote tasks via a host fabric interface (HFI) and assigned HFI window. Each HFI window has a send count and a receive count, which track GSM messages that are sent from and received at the HFI window. When a checkpoint is initiated by a master task, each local task forwards the send count and the receive count to the master task. The master task sums the respective counts and then compares the totals to each other. When the send count total is equal to the receive count total, the tasks are permitted to continue processing. However, when the send count total is not equal to the receive count total, the master task notifies each task of the job to roll back to a previous checkpoint or kill the job execution.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: January 25, 2011
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
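
The quiescence test in the abstract above reduces to summing and comparing two totals. The following C sketch is a minimal illustration with hypothetical names for the per-task counters:

```c
/* Sketch of the quiescence test described above: the master sums the
 * per-task send and receive counts; if the totals match, no message is
 * still in flight and the checkpoint may proceed, otherwise the job
 * rolls back (or is killed). */
#include <stdbool.h>
#include <stdio.h>

struct task_counts {
    long sent;       /* GSM messages sent from this task's HFI window   */
    long received;   /* GSM messages received at this task's HFI window */
};

/* Master task: compare the summed counts reported by all local tasks. */
static bool checkpoint_consistent(const struct task_counts *t, int ntasks)
{
    long sent_total = 0, recv_total = 0;
    for (int i = 0; i < ntasks; i++) {
        sent_total += t[i].sent;
        recv_total += t[i].received;
    }
    return sent_total == recv_total;
}

int main(void)
{
    struct task_counts tasks[] = { { 10, 12 }, { 7, 5 }, { 3, 3 } };
    if (checkpoint_consistent(tasks, 3))
        printf("checkpoint complete, tasks continue\n");
    else
        printf("counts differ, roll back to previous checkpoint\n");
    return 0;
}
```
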
  • Patent number: 7873879
    Abstract: A host fabric interface (HFI) enables debugging of global shared memory (GSM) operations received at a local node from a network fabric. The local node has a memory management unit (MMU), which provides an effective address to real address (EA-to-RA) translation table that is utilized by the HFI to evaluate when EAs of GSM operations/data from a received GSM packet are memory-mapped to RAs of the local memory. The HFI retrieves the EA associated with a GSM operation/data within a received GSM packet. The HFI forwards the EA to the MMU, which determines when the EA is mapped to RAs within the local memory for the local task. The HFI processing logic enables processing of the GSM packet only when the EA of the GSM operation/data within the GSM packet is an EA that has a local RA translation. Non-matching EAs result in an error condition that requires debugging.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: January 18, 2011
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Chulho Kim, Ramakrishnan Rajamony, Hanhong Xue
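
The validity check above boils down to an EA lookup in a local translation table, with unmapped EAs flagged as errors. The C sketch below assumes a simple table layout and names of its own invention; the real translation is performed by the MMU and HFI hardware:

```c
/* Illustrative sketch of the validity check described above: the EA
 * carried in an incoming GSM packet is looked up in the local
 * EA-to-RA translation table; packets whose EA has no local mapping
 * are rejected as an error condition. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct ea_ra_entry { uint64_t ea_base, ra_base, len; };

/* Translate an EA to an RA; returns false when no local mapping exists. */
static bool translate_ea(const struct ea_ra_entry *tbl, int n,
                         uint64_t ea, uint64_t *ra)
{
    for (int i = 0; i < n; i++) {
        if (ea >= tbl[i].ea_base && ea < tbl[i].ea_base + tbl[i].len) {
            *ra = tbl[i].ra_base + (ea - tbl[i].ea_base);
            return true;
        }
    }
    return false;
}

int main(void)
{
    struct ea_ra_entry mmu[] = { { 0x10000, 0x8000, 0x1000 } };
    uint64_t ra;

    if (translate_ea(mmu, 1, 0x10040, &ra))
        printf("GSM packet accepted, RA = 0x%llx\n", (unsigned long long)ra);
    if (!translate_ea(mmu, 1, 0x90000, &ra))
        printf("no local EA-to-RA mapping: flag error for debugging\n");
    return 0;
}
```
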
  • Patent number: 7840643
    Abstract: A method is provided for transferring data between first and second nodes of a network. Such method includes requesting first data to be transferred by a first upper layer protocol (ULP) operating on the first node of the network; and buffering second data for transfer to the second node by a protocol layer lower than the first ULP, the second data including an integral number of standard size units of data including the first data. The method further includes posting the second data to the network for delivery to the second node; receiving the second data at the second node; and from the received data, delivering the first data to a second ULP operating on the second node. The method is particularly applicable when transferring data in unit-size blocks is faster than transferring it in other sizes.
    Type: Grant
    Filed: October 6, 2004
    Date of Patent: November 23, 2010
    Assignee: International Business Machines Corporation
    Inventors: Rama K. Govindaraju, Chulho Kim, Hanhong Xue
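
The buffering step in the abstract above amounts to rounding the payload up to a whole number of standard-size units before posting it. The C sketch below uses an arbitrary unit size and invented names purely for illustration:

```c
/* Minimal sketch of the buffering step described above: the lower
 * protocol layer rounds the payload up to an integral number of
 * standard-size units before posting it, since unit-size transfers
 * are the fast path. */
#include <stdio.h>
#include <string.h>

#define UNIT 64   /* illustrative standard transfer unit, in bytes */

/* Return the padded length: the smallest multiple of UNIT >= nbytes. */
static size_t padded_length(size_t nbytes)
{
    return (nbytes + UNIT - 1) / UNIT * UNIT;
}

int main(void)
{
    const char first_data[] = "payload handed down by the upper layer protocol";
    size_t total = padded_length(sizeof first_data);

    /* "second data": the first data plus zero padding up to unit size. */
    char second_data[4 * UNIT] = { 0 };
    memcpy(second_data, first_data, sizeof first_data);

    printf("%zu bytes of first data posted as %zu bytes (%zu units)\n",
           sizeof first_data, total, total / UNIT);
    printf("padded payload starts with: %s\n", second_data);
    return 0;
}
```
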