Patents by Inventor Hugh Shen

Hugh Shen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11119781
    Abstract: A data processing system includes multiple processing units all having access to a shared memory. A processing unit of the data processing system includes a processor core including an upper level cache, core reservation logic that records addresses in the shared memory for which the processor core has obtained reservations, and an execution unit that executes memory access instructions including a fronting load instruction. Execution of the fronting load instruction generates a load request that specifies a load target address. The processing unit further includes lower level cache that, responsive to receipt of the load request and based on the load request indicating an address match for the load target address in the core reservation logic, protects the load target address against access by any conflicting memory access request during a protection interval following servicing of the load request.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: September 14, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Sanjeev Ghai
  • Patent number: 11106608
    Abstract: A processing unit includes a processor core that executes a store-conditional instruction that generates a store-conditional request specifying a store target address. The processing unit further includes a reservation register that records shared memory addresses for which the processor core has obtained reservations and a cache that services the store-conditional request by conditionally updating the shared memory with the store data based on the reservation register indicating a reservation for the store target address. The cache includes a blocking state machine configured to protect the store target address against access by any conflicting memory access request snooped on a system interconnect during a protection window extension following servicing of the store-conditional request.
    Type: Grant
    Filed: June 22, 2020
    Date of Patent: August 31, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Sanjeev Ghai
  • Patent number: 11086672
    Abstract: A data processing system includes multiple processing units all having access to a shared memory. A processing unit includes a lower level cache memory and a processor core coupled to the lower level cache memory. The processor core includes an execution unit for executing instructions in a plurality of simultaneous hardware threads, an upper level cache memory, and a plurality of wait flags each associated with a respective one of the plurality of simultaneous hardware threads. The processor core is configured to set a wait flag among the plurality of wait flags to indicate the associated hardware thread is in a wait state in which the hardware thread suspends instruction execution and to exit the wait state based on the wait flag being reset.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: August 10, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Hugh Shen, Guy L. Guthrie
  • Publication number: 20210216457
    Abstract: A data processing system includes multiple processing units all having access to a shared memory system. A processing unit includes a lower level cache configured to serve as a point of systemwide coherency and a processor core coupled to the lower level cache. The processor core includes an upper level cache, an execution unit that executes a store-conditional instruction to generate a store-conditional request that specifies a store target address and store data, and a flag that, when set, indicates the store-conditional request can be completed early in the processor core. The processor core also includes completion logic configured to commit an update of the shared memory system with the store data specified by the store-conditional request based on whether the flag is set.
    Type: Application
    Filed: January 14, 2020
    Publication date: July 15, 2021
    Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, WILLIAM J. STARKE, HUGH SHEN
  • Publication number: 20210182197
    Abstract: A cache memory includes a data array, a directory of contents of the data array that specifies coherence state information, and snoop logic that processes operations snooped from a system fabric by reference to the data array and the directory. The snoop logic, responsive to snooping on the system fabric a request of a flush or clean memory access operation of an initiating coherence participant, determines whether the directory indicates the cache memory has coherence ownership of a target address of the request. Based on determining the directory indicates the cache memory has coherence ownership of the target address, the snoop logic provides a coherence response to the request that causes coherence ownership of the target address to be transferred to the initiating coherence participant, such that the initiating coherence participant can protect the target address against conflicting requests.
    Type: Application
    Filed: December 17, 2019
    Publication date: June 17, 2021
    Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, HUGH SHEN, LUKE MURRAY
  • Publication number: 20210182198
    Abstract: A cache memory includes a data array, a directory of contents of the data array that specifies coherence state information, and snoop logic that processes operations snooped from a system fabric by reference to the data array and the directory. The snoop logic, responsive to snooping on the system fabric a request of a first flush/clean memory access operation that specifies a target address, determines whether or not the cache memory has coherence ownership of the target address. Based on determining the cache memory has coherence ownership of the target address, the snoop logic services the request and thereafter enters a referee mode. While in the referee mode, the snoop logic protects a memory block identified by the target address against conflicting memory access requests by the plurality of processor cores until conclusion of a second flush/clean memory access operation that specifies the target address.
    Type: Application
    Filed: December 17, 2019
    Publication date: June 17, 2021
    Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, HUGH SHEN, LUKE MURRAY
  • Patent number: 10997075
    Abstract: Statistical data is used to enable or disable snooping on a bus of a processor. A command is received via a first bus or a second bus communicably coupling processor cores and caches of chiplets on the processor. Cache logic on a chiplet determines whether or not a local cache on the chiplet can satisfy a request for data specified in the command. In response to determining that the local cache can satisfy the request for data, the cache logic updates statistical data maintained on the chiplet. The statistical data indicates a probability that the local cache can satisfy a future request for data. Based at least in part on the statistical data, the cache logic determines whether to enable or disable snooping on the second bus by the local cache.
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: May 4, 2021
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hien M. Le, Hugh Shen, Derek E. Williams, Phillip G. Williams
  • Patent number: 10977183
    Abstract: A multiprocessor data processing system includes a processor core having a translation structure for buffering a plurality of translation entries. The processor core receives a sequence of a plurality of translation invalidation requests. In response to receipt of each of the plurality of translation invalidation requests, the processor core determines that each of the plurality of translation invalidation requests indicates that it does not require draining of memory referent instructions for which address translation has been performed by reference to a respective one of a plurality of translation entries to be invalidated. Based on the determination, the processor core invalidates the plurality of translation entries in the translation structure without regard to draining from the processor core of memory access requests for which address translation was performed by reference to the plurality of translation entries.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: April 13, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
  • Patent number: 10970215
    Abstract: A cache memory includes a data array, a directory of contents of the data array that specifies coherence state information, and snoop logic that processes operations snooped from a system fabric by reference to the data array and the directory. The snoop logic, responsive to snooping on the system fabric a request of a flush/clean memory access operation of one of a plurality of processor cores that specifies a target address, services the request and thereafter enters a referee mode. While in the referee mode, the snoop logic protects a memory block identified by the target address against conflicting memory access requests by the plurality of processor cores such that no other coherence participant is permitted to assume coherence ownership of the memory block.
    Type: Grant
    Filed: December 3, 2019
    Date of Patent: April 6, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Luke Murray
  • Publication number: 20210096990
    Abstract: A data processing system includes multiple processing units coupled to a system interconnect including a broadcast address interconnect and a data interconnect. The processing unit includes a processor core that executes memory access instructions and a cache memory, coupled to the processor core, which is configured to store data for access by the processor core. The processing unit is configured to broadcast, on the address interconnect, a cache-inhibited write request and write data for a destination device coupled to the system interconnect. In various embodiments, the initial cache-inhibited request and the write data can be communicated in the same or different requests on the address interconnect.
    Type: Application
    Filed: September 30, 2019
    Publication date: April 1, 2021
    Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, HUGH SHEN
  • Patent number: 10956070
    Abstract: A data processing system includes a plurality of processor cores each having a respective associated cache memory, a memory controller, and a system memory coupled to the memory controller. A zero request of a processor core among the plurality of processor cores is transmitted on an interconnect fabric of the data processing system. The zero request specifies a target address of a target memory block to be zeroed and has no associated data payload. The memory controller receives the zero request on the interconnect fabric and services the zero request by zeroing in the system memory the target memory block identified by the target address, such that the target memory block is zeroed without caching the zeroed target memory block in the cache memory of the processor core.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: March 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
  • Patent number: 10884740
    Abstract: A processing unit for a data processing system includes a cache memory having reservation logic and a processor core coupled to the cache memory. The processor includes an execution unit that executes instructions in a plurality of concurrent hardware threads of execution including at least first and second hardware threads. The instructions include, within the first hardware thread, a first load-reserve instruction that identifies a target address for which a reservation is requested. The processor core additionally includes a load unit that records the target address of the first load-reserve instruction and that, responsive to detecting, in the second hardware thread, a second load-reserve instruction identifying the target address recorded by the load unit, blocks the second load-reserve instruction from establishing a reservation for the target address in the reservation logic.
    Type: Grant
    Filed: November 8, 2018
    Date of Patent: January 5, 2021
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Kimberly M. Fernsler, Hugh Shen
  • Publication number: 20200409771
    Abstract: A processing unit for a multiprocessor data processing system includes a processor core having an upper level cache and a lower level cache coupled to the processor core. The processor core is configured to, based on receipt of an interrupt, generate and issue a synchronization request prior to executing an interrupt handler and is configured to, based on receipt of a synchronization acknowledgment for the synchronization request, execute the interrupt handler. The lower level cache is configured to, based on receipt of the synchronization request, record which of its state machines are active processing a prior snooped request that can invalidate a cache line in the upper level cache, and is configured to, based on determining that each such state machine has completed processing of its respective prior snooped request, issue the synchronization acknowledgment to the processor core.
    Type: Application
    Filed: June 27, 2019
    Publication date: December 31, 2020
    Inventors: DEREK E. WILLIAMS, HUGH SHEN, GUY L. GUTHRIE
  • Publication number: 20200356409
    Abstract: A data processing system includes multiple processing units all having access to a shared memory. A processing unit includes a lower level cache memory and a processor core coupled to the lower level cache memory. The processor core includes an execution unit for executing instructions in a plurality of simultaneous hardware threads, an upper level cache memory, and a plurality of wait flags each associated with a respective one of the plurality of simultaneous hardware threads. The processor core is configured to set a wait flag among the plurality of wait flags to indicate the associated hardware thread is in a wait state in which the hardware thread suspends instruction execution and to exit the wait state based on the wait flag being reset.
    Type: Application
    Filed: May 7, 2019
    Publication date: November 12, 2020
    Inventors: Derek E. WILLIAMS, Hugh SHEN, Guy L. GUTHRIE
  • Patent number: 10831607
    Abstract: In a processing unit, a processor core executes instructions in a plurality of simultaneous hardware threads, where multiple of the plurality of hardware threads concurrently execute memory transactions. A transactional memory circuit in the processing unit tracks transaction footprints of the memory transactions of the multiple hardware thread. In response to detecting failure of a given memory transaction of one of the plurality of multiple threads due to an overflow condition, the transactional memory circuit transitions to a throttled operating mode and reduces a number of hardware threads permitted to concurrently execute memory transactions.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Sanjeev Ghai, Hung Doan
  • Patent number: 10831660
    Abstract: A processing unit for a multiprocessor data processing system includes a processor core having an upper level cache and a lower level cache coupled to the processor core. The lower level cache includes one or more state machines for handling requests snooped from the system interconnect. The processing unit includes an interrupt unit configured to, based on receipt of an interrupt request while the processor core is in a powered up state, record which of the one or more state machines are active processing a prior snooped request that can invalidate a cache line in the upper level cache and present an interrupt to the processor core based on determining that each state machine that was active processing a prior snooped request that can invalidate a cache line in the upper level cache has completed processing of its respective prior snooped request.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
  • Patent number: 10824567
    Abstract: A data processing system includes a processor core having a shared store-through upper level cache and a store-in lower level cache. The processor core executes a plurality of simultaneous hardware threads of execution including at least a first thread and a second thread, and the shared store-through upper level cache stores a first cache line accessible to both the first thread and the second thread. The processor core executes in the first thread a store instruction that generates a store request specifying a target address of a storage location corresponding to the first cache line. Based on the target address hitting in the shared store-through upper level cache, the first cache line is temporarily marked, in the shared store-through upper level cache, as private to the first thread, such that any memory access request by the second thread targeting the storage location will miss in the shared store-through upper level cache.
    Type: Grant
    Filed: December 4, 2018
    Date of Patent: November 3, 2020
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Hugh Shen, Guy L. Guthrie, William J. Starke
  • Patent number: 10740239
    Abstract: A multiprocessor data processing system includes a processor core having a translation structure for buffering a plurality of translation entries. In response to receipt of a translation invalidation request, the processor core determines from the translation invalidation request that the translation invalidation request does not require draining of memory referent instructions for which address translation has been performed by reference to a translation entry to be invalidated. Based on the determination, the processor core invalidates the translation entry in the translation structure and confirms completion of invalidation of the translation entry without regard to draining from the processor core of memory access requests for which address translation was performed by reference to the translation entry.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
  • Patent number: 10725937
    Abstract: A data processing system includes multiple processing units all having access to a shared memory. A processing unit includes a processor core that executes memory access instructions including a store-conditional instruction that generates a store-conditional request specifying a store target address and store data. The processing unit further includes a reservation register that records shared memory addresses for which the processor core has obtained reservations and a cache that services the store-conditional request by conditionally updating the shared memory with the store data based on the reservation register indicating a reservation for the store target address. The processing unit additional includes a blocking state machine configured to protect the store target address against access by any conflicting memory access request during a protection window extension following servicing of the store-conditional request.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: July 28, 2020
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Sanjeev Ghai
  • Patent number: 10691605
    Abstract: In at least some embodiments, a processor core generates a store operation by executing a store instruction in an instruction sequence. The store operation is marked as a high priority store operation in response to the store instruction being marked as high priority and is not so marked otherwise. The store operation is buffered in a store queue associated with a cache memory of the processor core. Handling of the store operation in the store queue is expedited in response to the store operation being marked as a high priority store operation and not expedited otherwise.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: June 23, 2020
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli, Derek E. Williams