Patents by Inventor Hugh Shen

Hugh Shen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11915045
    Abstract: In at least some embodiments, a store-type operation is received and buffered within a store queue entry of a store queue associated with a cache memory of a processor core capable of executing multiple simultaneous hardware threads. A thread identifier indicating a particular hardware thread among the multiple hardware threads that issued the store-type operation is recorded. An indication of whether the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the hardware thread is also maintained. While the indication indicates the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the particular hardware thread, the store queue extends a duration of a store gathering window applicable to the store queue entry. For example, the duration may be extended by decreasing a rate at which the store gathering window applicable to the store queue entry ends.
    Type: Grant
    Filed: June 18, 2021
    Date of Patent: February 27, 2024
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
  • Patent number: 11748267
    Abstract: A plurality of entries including address translation information are buffered in a data structure in a processor core. At least first and second translation entry invalidation requests specifying different first and second addresses are checked against all of the entries in the data structure. The checking includes accessing and checking at least a first entry in the data structure for an address match with the first address but not the second address, thereafter concurrently checking at least a second entry for an address match with both the first and second addresses, and thereafter completing checking for the first address and accessing and checking the first entry for an address match with the second address but not the first address. The processor core invalidates any entry in the data structure for which the checking detects an address match.
    Type: Grant
    Filed: August 4, 2022
    Date of Patent: September 5, 2023
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Luke Murray, Hugh Shen
  • Patent number: 11693776
    Abstract: A processing unit includes a processor core and an associated cache memory. The cache memory establishes a reservation of a hardware thread of the processor core for a store target address and services a store-conditional request of the processor core by conditionally updating the shared memory with store data based on the whether the hardware thread has a reservation for the store target address. The cache memory receives a hint associated with the store-conditional request indicating an intent of the store-conditional request. The cache memory protects the store target address against access by any conflicting memory access request during a protection window extension following servicing of the store-conditional request. The cache memory establishes a first duration for the protection window extension based on the hint having a first value and establishes a different second duration for the protection window extension based on the hint having a different second value.
    Type: Grant
    Filed: June 18, 2021
    Date of Patent: July 4, 2023
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli
  • Patent number: 11693788
    Abstract: An arbiter gathers translation invalidation requests assigned to state machines of a lower-level cache into a set for joint handling in a processor core. The gathering includes selection of one of the set of gathered translation invalidation requests as an end-of-sequence (EOS) request. The arbiter issues to the processor core a sequence of the gathered translation invalidation requests terminating with the EOS request. Based on receipt of each of the gathered requests, the processor core invalidates any translation entries providing translation for the addresses specified by the translation invalidation requests and marks memory-referent requests dependent on the invalidated translation entries. Based on receipt of the EOS request and in response to all of the marked memory-referent requests draining from the processor core, the processor core issues a completion request to the lower-level cache indicating completion of servicing by the processor core of the set of gathered translation invalidation requests.
    Type: Grant
    Filed: June 7, 2022
    Date of Patent: July 4, 2023
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Luke Murray, Hugh Shen
  • Publication number: 20230063992
    Abstract: A coherent data processing system includes a system fabric communicatively coupling a plurality of coherence participants and fabric control logic. The fabric control logic quantifies congestion on the system fabric based on coherence messages associated with commands issued on the system fabric. Based on the congestion on the system fabric, the fabric control logic determines a rate of request issuance applicable to a set of coherence participants among the plurality of coherence participants. The fabric control logic issues at least one rate command to set a rate of request issuance to the system fabric of the set of coherence participants.
    Type: Application
    Filed: August 18, 2021
    Publication date: March 2, 2023
    Inventors: HUGH SHEN, GUY L. GUTHRIE, JEFFREY A. STUECHELI, LUKE MURRAY, ALEXANDER MICHAEL TAFT, BERNARD C. DRERUP, DEREK E. WILLIAMS
  • Publication number: 20230041702
    Abstract: A data processing system includes a plurality of processor cores each supported by a respective one of a plurality of vertical cache hierarchies. Based on receiving on a system fabric a cache injection request requesting injection of a data into a cache line identified by a target real address, the data is written into a cache in a first vertical cache hierarchy among the plurality of vertical cache hierarchies. Based on a value in a field of the cache injection request, a distribute field is set in a directory entry of the first vertical cache hierarchy. Upon eviction of the cache line the first vertical cache hierarchy, a determination is made whether the distribute field is set. Based on determining the distribute field is set, a lateral castout of the cache line from the first vertical cache hierarchy to a second vertical cache hierarchy is performed.
    Type: Application
    Filed: August 4, 2021
    Publication date: February 9, 2023
    Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, Bernard C. Drerup, Hugh Shen, Alexander Michael Taft, Luke Murray, Richard Nicholas
  • Patent number: 11573902
    Abstract: A coherent data processing system includes a system fabric communicatively coupling a plurality of coherence participants and fabric control logic. The fabric control logic quantifies congestion on the system fabric based on coherence messages associated with commands issued on the system fabric. Based on the congestion on the system fabric, the fabric control logic determines a rate of request issuance applicable to a set of coherence participants among the plurality of coherence participants. The fabric control logic issues at least one rate command to set a rate of request issuance to the system fabric of the set of coherence participants.
    Type: Grant
    Filed: August 18, 2021
    Date of Patent: February 7, 2023
    Assignee: International Business Machines Corporation
    Inventors: Hugh Shen, Guy L. Guthrie, Jeffrey A. Stuecheli, Luke Murray, Alexander Michael Taft, Bernard C. Drerup, Derek E. Williams
  • Patent number: 11561901
    Abstract: A data processing system includes a plurality of processor cores each supported by a respective one of a plurality of vertical cache hierarchies. Based on receiving on a system fabric a cache injection request requesting injection of a data into a cache line identified by a target real address, the data is written into a cache in a first vertical cache hierarchy among the plurality of vertical cache hierarchies. Based on a value in a field of the cache injection request, a distribute field is set in a directory entry of the first vertical cache hierarchy. Upon eviction of the cache line the first vertical cache hierarchy, a determination is made whether the distribute field is set. Based on determining the distribute field is set, a lateral castout of the cache line from the first vertical cache hierarchy to a second vertical cache hierarchy is performed.
    Type: Grant
    Filed: August 4, 2021
    Date of Patent: January 24, 2023
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Bernard C. Drerup, Hugh Shen, Alexander Michael Taft, Luke Murray, Richard Nicholas
  • Patent number: 11537519
    Abstract: A memory-referent instruction is executed to calculate a target effective address (EA) of a corresponding memory-referent request. An array entry in an upper level cache is allocated, and the EA is specified in a corresponding EA directory entry. While in-flight, the memory-referent request is buffered in a queue in association with a pointer to the entry in the EA directory. Based on receiving a translation invalidation request requesting invalidation of an address translation in a translation structure, the processor core walks the EA directory, determines the EA in the entry matches an address range specified by the translation invalidation request, and, based on the match, precisely marks the memory-referent request using the pointer to the EA directory entry. Based on the marking, the translation invalidation request is permitted to complete with reference to the processor core only after the memory-referent request has drained from the processing unit.
    Type: Grant
    Filed: July 29, 2021
    Date of Patent: December 27, 2022
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, David Campbell, Bryan Lloyd, Samuel David Kirchhoff, Jeffrey A. Stuecheli
  • Publication number: 20220405202
    Abstract: A processing unit includes a processor core and an associated cache memory. The cache memory establishes a reservation of a hardware thread of the processor core for a store target address and services a store-conditional request of the processor core by conditionally updating the shared memory with store data based on the whether the hardware thread has a reservation for the store target address. The cache memory receives a hint associated with the store-conditional request indicating an intent of the store-conditional request. The cache memory protects the store target address against access by any conflicting memory access request during a protection window extension following servicing of the store-conditional request. The cache memory establishes a first duration for the protection window extension based on the hint having a first value and establishes a different second duration for the protection window extension based on the hint having a different second value.
    Type: Application
    Filed: June 18, 2021
    Publication date: December 22, 2022
    Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, HUGH SHEN, JEFFREY A. STUECHELI
  • Publication number: 20220405125
    Abstract: In at least some embodiments, a store-type operation is received and buffered within a store queue entry of a store queue associated with a cache memory of a processor core capable of executing multiple simultaneous hardware threads. A thread identifier indicating a particular hardware thread among the multiple hardware threads that issued the store-type operation is recorded. An indication of whether the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the hardware thread is also maintained. While the indication indicates the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the particular hardware thread, the store queue extends a duration of a store gathering window applicable to the store queue entry. For example, the duration may be extended by decreasing a rate at which the store gathering window applicable to the store queue entry ends.
    Type: Application
    Filed: June 18, 2021
    Publication date: December 22, 2022
    Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, HUGH SHEN
  • Patent number: 11354243
    Abstract: A processing unit for a data processing system includes a processor core that issues memory access requests and a cache memory coupled to the processor core. The cache memory includes a reservation circuit that tracks reservations established by the processor core via load-reserve requests and a plurality of read-claim (RC) state machines for servicing memory access requests of the processor core. The cache memory, responsive to receipt from the processor core of a store-conditional request specifying a store target address, allocates an RC state machine among the plurality of RC state machines to process the store-conditional request and transfers responsibility for tracking a reservation for the store target address from the reservation circuit to the RC state machine.
    Type: Grant
    Filed: November 17, 2020
    Date of Patent: June 7, 2022
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Sanjeev Ghai, Luke Murray
  • Publication number: 20220156194
    Abstract: A processing unit for a data processing system includes a processor core that issues memory access requests and a cache memory coupled to the processor core. The cache memory includes a reservation circuit that tracks reservations established by the processor core via load-reserve requests and a plurality of read-claim (RC) state machines for servicing memory access requests of the processor core. The cache memory, responsive to receipt from the processor core of a store-conditional request specifying a store target address, allocates an RC state machine among the plurality of RC state machines to process the store-conditional request and transfers responsibility for tracking a reservation for the store target address from the reservation circuit to the RC state machine.
    Type: Application
    Filed: November 17, 2020
    Publication date: May 19, 2022
    Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, HUGH SHEN, SANJEEV GHAI, LUKE MURRAY
  • Patent number: 11321145
    Abstract: A processing unit for a multiprocessor data processing system includes a processor core having an upper level cache and a lower level cache coupled to the processor core. The processor core is configured to, based on receipt of an interrupt, generate and issue a synchronization request prior to executing an interrupt handler and is configured to, based on receipt of a synchronization acknowledgment for the synchronization request, execute the interrupt handler. The lower level cache is configured to, based on receipt of the synchronization request, record which of its state machines are active processing a prior snooped request that can invalidate a cache line in the upper level cache, and is configured to, based on determining that each such state machine has completed processing of its respective prior snooped request, issue the synchronization acknowledgment to the processor core.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: May 3, 2022
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Hugh Shen, Guy L. Guthrie
  • Patent number: 11281582
    Abstract: A data processing system includes multiple processing units all having access to a shared memory system. A processing unit includes a lower level cache configured to serve as a point of systemwide coherency and a processor core coupled to the lower level cache. The processor core includes an upper level cache, an execution unit that executes a store-conditional instruction to generate a store-conditional request that specifies a store target address and store data, and a flag that, when set, indicates the store-conditional request can be completed early in the processor core. The processor core also includes completion logic configured to commit an update of the shared memory system with the store data specified by the store-conditional request based on whether the flag is set.
    Type: Grant
    Filed: January 14, 2020
    Date of Patent: March 22, 2022
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, William J. Starke, Hugh Shen
  • Patent number: 11176038
    Abstract: A data processing system includes multiple processing units coupled to a system interconnect including a broadcast address interconnect and a data interconnect. The processing unit includes a processor core that executes memory access instructions and a cache memory, coupled to the processor core, which is configured to store data for access by the processor core. The processing unit is configured to broadcast, on the address interconnect, a cache-inhibited write request and write data for a destination device coupled to the system interconnect. In various embodiments, the initial cache-inhibited request and the write data can be communicated in the same or different requests on the address interconnect.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: November 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
  • Publication number: 20210342275
    Abstract: An upper level cache receives from an associated processor core a plurality of memory access requests including at least first and second memory access requests of differing first and second classes. Based on class histories associated with the first and second classes of memory access requests, the upper level cache initiates, on the system interconnect fabric, a first interconnect transaction corresponding to the first memory access request without first issuing the first memory access request to the lower level cache via a private communication channel between the upper level cache and the lower level cache.
    Type: Application
    Filed: April 30, 2020
    Publication date: November 4, 2021
    Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, HUGH SHEN, LUKE MURRAY
  • Patent number: 11163700
    Abstract: An upper level cache receives from an associated processor core a plurality of memory access requests including at least first and second memory access requests of differing first and second classes. Based on class histories associated with the first and second classes of memory access requests, the upper level cache initiates, on the system interconnect fabric, a first interconnect transaction corresponding to the first memory access request without first issuing the first memory access request to the lower level cache via a private communication channel between the upper level cache and the lower level cache.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Luke Murray
  • Patent number: 11157408
    Abstract: A cache memory includes a data array, a directory of contents of the data array that specifies coherence state information, and snoop logic that processes operations snooped from a system fabric by reference to the data array and the directory. The snoop logic, responsive to snooping on the system fabric a request of a flush or clean memory access operation of an initiating coherence participant, determines whether the directory indicates the cache memory has coherence ownership of a target address of the request. Based on determining the directory indicates the cache memory has coherence ownership of the target address, the snoop logic provides a coherence response to the request that causes coherence ownership of the target address to be transferred to the initiating coherence participant, such that the initiating coherence participant can protect the target address against conflicting requests.
    Type: Grant
    Filed: December 17, 2019
    Date of Patent: October 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Luke Murray
  • Patent number: 11157409
    Abstract: A cache memory includes a data array, a directory of contents of the data array that specifies coherence state information, and snoop logic that processes operations snooped from a system fabric by reference to the data array and the directory. The snoop logic, responsive to snooping on the system fabric a request of a first flush/clean memory access operation that specifies a target address, determines whether or not the cache memory has coherence ownership of the target address. Based on determining the cache memory has coherence ownership of the target address, the snoop logic services the request and thereafter enters a referee mode. While in the referee mode, the snoop logic protects a memory block identified by the target address against conflicting memory access requests by the plurality of processor cores until conclusion of a second flush/clean memory access operation that specifies the target address.
    Type: Grant
    Filed: December 17, 2019
    Date of Patent: October 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Luke Murray