Patents by Inventor Hugh Shen
Hugh Shen is named as an inventor on the following patent filings. This listing includes pending patent applications as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12056050
Abstract: A data processing system includes a master, a central request agent, and a plurality of snoopers communicatively coupled to a system fabric for communicating requests subject to retry. The master issues on the system fabric a multicast request intended for the plurality of snoopers. The central request agent receives the multicast request on the system fabric, assigns the multicast request to a particular state machine among a plurality of state machines in the central request agent, and provides the master a coherence response indicating successful completion of the multicast request. The central request agent repetitively issues on the system fabric a multicast request in association with a machine identifier identifying the particular state machine until a coherence response indicates the multicast request is successfully received by all of the plurality of snoopers.
Type: Grant
Filed: December 21, 2022
Date of Patent: August 6, 2024
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Luke Murray, Guy L. Guthrie, Hugh Shen
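The flow described in this abstract can be pictured as a small retry loop. The sketch below is only an illustration of that idea, not IBM's implementation; the class names (`Snooper`, `CentralRequestAgent`), the 30% busy probability, and the string responses are all invented for the example. The machine identifier lets a snooper recognize a re-issued request it has already accepted.

```python
import random

class Snooper:
    def __init__(self, name):
        self.name = name
        self.accepted = set()              # machine IDs already accepted

    def snoop(self, machine_id, request):
        if machine_id in self.accepted:
            return "ack"                   # re-issue of a request already taken
        if random.random() < 0.3:          # model a snooper that is busy
            return "retry"
        self.accepted.add(machine_id)
        return "ack"

class CentralRequestAgent:
    def __init__(self, snoopers):
        self.snoopers = snoopers

    def handle_multicast(self, request, machine_id):
        # The master has already been told "done"; the agent now re-issues
        # the request until the combined response shows every snooper took it.
        while True:
            responses = [s.snoop(machine_id, request) for s in self.snoopers]
            if all(r == "ack" for r in responses):
                return "complete"

snoopers = [Snooper(f"S{i}") for i in range(4)]
agent = CentralRequestAgent(snoopers)
print(agent.handle_multicast("multicast-op", machine_id=7))   # -> complete
```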
-
Publication number: 20240211398
Abstract: A data processing system includes a master, a central request agent, and a plurality of snoopers communicatively coupled to a system fabric for communicating requests subject to retry. The master issues on the system fabric a multicast request intended for the plurality of snoopers. The central request agent receives the multicast request on the system fabric, assigns the multicast request to a particular state machine among a plurality of state machines in the central request agent, and provides the master a coherence response indicating successful completion of the multicast request. The central request agent repetitively issues on the system fabric a multicast request in association with a machine identifier identifying the particular state machine until a coherence response indicates the multicast request is successfully received by all of the plurality of snoopers.
Type: Application
Filed: December 21, 2022
Publication date: June 27, 2024
Inventors: Derek E. Williams, Luke Murray, Guy L. Guthrie, Hugh Shen
-
Patent number: 11915045
Abstract: In at least some embodiments, a store-type operation is received and buffered within a store queue entry of a store queue associated with a cache memory of a processor core capable of executing multiple simultaneous hardware threads. A thread identifier indicating a particular hardware thread among the multiple hardware threads that issued the store-type operation is recorded. An indication of whether the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the hardware thread is also maintained. While the indication indicates the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the particular hardware thread, the store queue extends a duration of a store gathering window applicable to the store queue entry. For example, the duration may be extended by decreasing a rate at which the store gathering window applicable to the store queue entry ends.
Type: Grant
Filed: June 18, 2021
Date of Patent: February 27, 2024
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
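A hedged sketch of the per-thread store-gathering idea in the abstract: the newest store-queue entry allocated by a hardware thread keeps its gathering window open longer (here, by decaying it more slowly), so later stores from that thread can still be gathered into it. The class names, field names, window length, and decay rates below are illustrative assumptions, not taken from the patent.

```python
class StoreQueueEntry:
    def __init__(self, thread_id, address):
        self.thread_id = thread_id
        self.address = address
        self.window = 8            # cycles the gathering window stays open
        self.most_recent = True    # newest entry for this thread?

class StoreQueue:
    def __init__(self):
        self.entries = []

    def allocate(self, thread_id, address):
        for e in self.entries:     # only one "most recent" entry per thread
            if e.thread_id == thread_id:
                e.most_recent = False
        self.entries.append(StoreQueueEntry(thread_id, address))

    def tick(self):
        for e in self.entries:
            # Extended window: the most recently allocated entry of a thread
            # closes at half the rate of older entries.
            e.window -= 1 if e.most_recent else 2
        self.entries = [e for e in self.entries if e.window > 0]

sq = StoreQueue()
sq.allocate(thread_id=0, address=0x1000)
sq.allocate(thread_id=0, address=0x2000)   # 0x1000 entry loses "most recent"
for _ in range(5):
    sq.tick()
print([(hex(e.address), e.window) for e in sq.entries])   # only 0x2000 survives
```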
-
Patent number: 11748267
Abstract: A plurality of entries including address translation information are buffered in a data structure in a processor core. At least first and second translation entry invalidation requests specifying different first and second addresses are checked against all of the entries in the data structure. The checking includes accessing and checking at least a first entry in the data structure for an address match with the first address but not the second address, thereafter concurrently checking at least a second entry for an address match with both the first and second addresses, and thereafter completing checking for the first address and accessing and checking the first entry for an address match with the second address but not the first address. The processor core invalidates any entry in the data structure for which the checking detects an address match.
Type: Grant
Filed: August 4, 2022
Date of Patent: September 5, 2023
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Luke Murray, Hugh Shen
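One way to read this abstract is as two sweeps over the same buffer, pipelined so that the sweep for the second invalidation address trails the sweep for the first. The snippet below is a loose software model of that overlap under my own assumptions; the function name, the one-entry offset, and the data are invented for illustration.

```python
def pipelined_invalidate(entries, addr1, addr2):
    n = len(entries)
    # The sweep for addr2 trails the sweep for addr1 by one entry, so in the
    # middle of the scan both addresses are being checked concurrently.
    for step in range(n + 1):
        if step < n and entries[step] is not None and entries[step] == addr1:
            entries[step] = None               # invalidate on match with addr1
        trailing = step - 1
        if 0 <= trailing < n and entries[trailing] is not None \
                and entries[trailing] == addr2:
            entries[trailing] = None           # invalidate on match with addr2
    return entries

buffer = [0x10, 0x20, 0x10, 0x30, 0x20]
print(pipelined_invalidate(buffer, addr1=0x10, addr2=0x20))
# -> [None, None, None, 48, None]  (only the non-matching 0x30 entry survives)
```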
-
Patent number: 11693776
Abstract: A processing unit includes a processor core and an associated cache memory. The cache memory establishes a reservation of a hardware thread of the processor core for a store target address and services a store-conditional request of the processor core by conditionally updating the shared memory with store data based on whether the hardware thread has a reservation for the store target address. The cache memory receives a hint associated with the store-conditional request indicating an intent of the store-conditional request. The cache memory protects the store target address against access by any conflicting memory access request during a protection window extension following servicing of the store-conditional request. The cache memory establishes a first duration for the protection window extension based on the hint having a first value and establishes a different second duration for the protection window extension based on the hint having a different second value.
Type: Grant
Filed: June 18, 2021
Date of Patent: July 4, 2023
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli
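A minimal sketch of the idea in this abstract, with invented names and durations: after a successful store-conditional, the cache keeps shielding the target address for a "protection window extension" whose length is selected by the hint supplied with the request.

```python
PROTECTION_CYCLES = {"short": 4, "long": 64}   # illustrative durations only

class ReservationCache:
    def __init__(self):
        self.protected_until = {}              # address -> cycle protection ends

    def store_conditional(self, address, has_reservation, hint, now):
        if not has_reservation:
            return False                       # fails without a reservation
        # Success: shield the line from conflicting accesses for a while,
        # with the hint choosing how long the extension lasts.
        self.protected_until[address] = now + PROTECTION_CYCLES[hint]
        return True

    def allows_conflicting_access(self, address, now):
        return now >= self.protected_until.get(address, 0)

cache = ReservationCache()
cache.store_conditional(0x80, has_reservation=True, hint="long", now=100)
print(cache.allows_conflicting_access(0x80, now=110))   # False, still protected
print(cache.allows_conflicting_access(0x80, now=200))   # True, window expired
```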
-
Patent number: 11693788
Abstract: An arbiter gathers translation invalidation requests assigned to state machines of a lower-level cache into a set for joint handling in a processor core. The gathering includes selection of one of the set of gathered translation invalidation requests as an end-of-sequence (EOS) request. The arbiter issues to the processor core a sequence of the gathered translation invalidation requests terminating with the EOS request. Based on receipt of each of the gathered requests, the processor core invalidates any translation entries providing translation for the addresses specified by the translation invalidation requests and marks memory-referent requests dependent on the invalidated translation entries. Based on receipt of the EOS request and in response to all of the marked memory-referent requests draining from the processor core, the processor core issues a completion request to the lower-level cache indicating completion of servicing by the processor core of the set of gathered translation invalidation requests.
Type: Grant
Filed: June 7, 2022
Date of Patent: July 4, 2023
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Luke Murray, Hugh Shen
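The sequence in this abstract can be summarized in a few lines of code. This is a hedged sketch under my own simplifications (the `Core` class, the address set, and the assumption that marked requests drain instantly at EOS are all invented): several invalidations are delivered in order, the last one is tagged end-of-sequence, and a single completion is reported only for that final request.

```python
class Core:
    def __init__(self):
        self.tlb = {0x1000: 0xA000, 0x2000: 0xB000, 0x3000: 0xC000}
        self.in_flight = {0x1000, 0x3000}      # requests using translations

    def receive_invalidation(self, vaddr, is_eos):
        self.tlb.pop(vaddr, None)              # drop the translation entry
        marked = vaddr in self.in_flight       # mark dependent requests
        if is_eos:
            self.in_flight.clear()             # model: marked requests drain
            return "completion"                # single ack for the whole set
        return "marked" if marked else "ok"

core = Core()
batch = [0x1000, 0x2000, 0x3000]
for i, vaddr in enumerate(batch):
    resp = core.receive_invalidation(vaddr, is_eos=(i == len(batch) - 1))
    print(hex(vaddr), resp)
# The completion is sent only for the final (EOS) request of the gathered set.
```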
-
Publication number: 20230063992
Abstract: A coherent data processing system includes a system fabric communicatively coupling a plurality of coherence participants and fabric control logic. The fabric control logic quantifies congestion on the system fabric based on coherence messages associated with commands issued on the system fabric. Based on the congestion on the system fabric, the fabric control logic determines a rate of request issuance applicable to a set of coherence participants among the plurality of coherence participants. The fabric control logic issues at least one rate command to set a rate of request issuance to the system fabric of the set of coherence participants.
Type: Application
Filed: August 18, 2021
Publication date: March 2, 2023
Inventors: Hugh Shen, Guy L. Guthrie, Jeffrey A. Stuecheli, Luke Murray, Alexander Michael Taft, Bernard C. Drerup, Derek E. Williams
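As a rough illustration of quantifying congestion from coherence messages and converting it into a rate command, the sketch below treats the fraction of retry responses as the congestion measure. The thresholds, rates, and sample responses are assumptions made only for this example, not values from the filing.

```python
def choose_issue_rate(retry_ratio):
    # Map the measured congestion onto a permitted requests-per-window rate.
    if retry_ratio > 0.50:
        return 1          # heavy congestion: throttle hard
    if retry_ratio > 0.20:
        return 4
    return 16             # fabric is quiet: allow full rate

responses = ["ack", "retry", "retry", "ack", "retry", "ack", "retry", "retry"]
retry_ratio = responses.count("retry") / len(responses)
print(f"retry ratio {retry_ratio:.2f} -> rate command {choose_issue_rate(retry_ratio)}")
```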
-
Publication number: 20230041702
Abstract: A data processing system includes a plurality of processor cores each supported by a respective one of a plurality of vertical cache hierarchies. Based on receiving on a system fabric a cache injection request requesting injection of data into a cache line identified by a target real address, the data is written into a cache in a first vertical cache hierarchy among the plurality of vertical cache hierarchies. Based on a value in a field of the cache injection request, a distribute field is set in a directory entry of the first vertical cache hierarchy. Upon eviction of the cache line from the first vertical cache hierarchy, a determination is made whether the distribute field is set. Based on determining the distribute field is set, a lateral castout of the cache line from the first vertical cache hierarchy to a second vertical cache hierarchy is performed.
Type: Application
Filed: August 4, 2021
Publication date: February 9, 2023
Inventors: Derek E. Williams, Guy L. Guthrie, Bernard C. Drerup, Hugh Shen, Alexander Michael Taft, Luke Murray, Richard Nicholas
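The eviction decision described here can be modeled simply: a per-line flag, set from a field of the injection request, selects a lateral castout to a peer hierarchy instead of a castout toward memory. The directory layout, class names, and the idea that the peer simply absorbs the line are assumptions made for this sketch.

```python
class CacheLine:
    def __init__(self, addr, data, distribute):
        self.addr = addr
        self.data = data
        self.distribute = distribute   # set from a field of the injection request

class VerticalCacheHierarchy:
    def __init__(self, name):
        self.name = name
        self.lines = {}

    def inject(self, addr, data, distribute_hint):
        self.lines[addr] = CacheLine(addr, data, distribute_hint)

    def evict(self, addr, peer):
        line = self.lines.pop(addr)
        if line.distribute:
            # Lateral castout: hand the line to a peer hierarchy instead of
            # writing it back toward system memory.
            peer.lines[addr] = line
            return f"lateral castout to {peer.name}"
        return "castout toward memory"

c0, c1 = VerticalCacheHierarchy("L2-core0"), VerticalCacheHierarchy("L2-core1")
c0.inject(0x40, data=b"payload", distribute_hint=True)
print(c0.evict(0x40, peer=c1))         # -> lateral castout to L2-core1
```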
-
Patent number: 11573902
Abstract: A coherent data processing system includes a system fabric communicatively coupling a plurality of coherence participants and fabric control logic. The fabric control logic quantifies congestion on the system fabric based on coherence messages associated with commands issued on the system fabric. Based on the congestion on the system fabric, the fabric control logic determines a rate of request issuance applicable to a set of coherence participants among the plurality of coherence participants. The fabric control logic issues at least one rate command to set a rate of request issuance to the system fabric of the set of coherence participants.
Type: Grant
Filed: August 18, 2021
Date of Patent: February 7, 2023
Assignee: International Business Machines Corporation
Inventors: Hugh Shen, Guy L. Guthrie, Jeffrey A. Stuecheli, Luke Murray, Alexander Michael Taft, Bernard C. Drerup, Derek E. Williams
-
Patent number: 11561901
Abstract: A data processing system includes a plurality of processor cores each supported by a respective one of a plurality of vertical cache hierarchies. Based on receiving on a system fabric a cache injection request requesting injection of data into a cache line identified by a target real address, the data is written into a cache in a first vertical cache hierarchy among the plurality of vertical cache hierarchies. Based on a value in a field of the cache injection request, a distribute field is set in a directory entry of the first vertical cache hierarchy. Upon eviction of the cache line from the first vertical cache hierarchy, a determination is made whether the distribute field is set. Based on determining the distribute field is set, a lateral castout of the cache line from the first vertical cache hierarchy to a second vertical cache hierarchy is performed.
Type: Grant
Filed: August 4, 2021
Date of Patent: January 24, 2023
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Bernard C. Drerup, Hugh Shen, Alexander Michael Taft, Luke Murray, Richard Nicholas
-
Patent number: 11537519
Abstract: A memory-referent instruction is executed to calculate a target effective address (EA) of a corresponding memory-referent request. An array entry in an upper level cache is allocated, and the EA is specified in a corresponding EA directory entry. While in-flight, the memory-referent request is buffered in a queue in association with a pointer to the entry in the EA directory. Based on receiving a translation invalidation request requesting invalidation of an address translation in a translation structure, the processor core walks the EA directory, determines the EA in the entry matches an address range specified by the translation invalidation request, and, based on the match, precisely marks the memory-referent request using the pointer to the EA directory entry. Based on the marking, the translation invalidation request is permitted to complete with reference to the processor core only after the memory-referent request has drained from the processing unit.
Type: Grant
Filed: July 29, 2021
Date of Patent: December 27, 2022
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, David Campbell, Bryan Lloyd, Samuel David Kirchhoff, Jeffrey A. Stuecheli
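A minimal sketch of the precise-marking idea, with invented structure names: each in-flight memory request carries a pointer into an effective-address (EA) directory, so a translation invalidation can mark exactly the requests whose EA falls in the invalidated range and wait only for those to drain.

```python
class EADirectory:
    def __init__(self):
        self.entries = []                       # index -> effective address

    def allocate(self, ea):
        self.entries.append(ea)
        return len(self.entries) - 1            # pointer kept by the request

class InFlightQueue:
    def __init__(self, ea_dir):
        self.ea_dir = ea_dir
        self.requests = []                      # [name, ea_dir_pointer, marked]

    def enqueue(self, name, ea):
        self.requests.append([name, self.ea_dir.allocate(ea), False])

    def mark_for_invalidation(self, lo, hi):
        for req in self.requests:
            if lo <= self.ea_dir.entries[req[1]] < hi:
                req[2] = True                   # precise mark via the pointer

    def invalidation_may_complete(self):
        return not any(marked for _, _, marked in self.requests)

q = InFlightQueue(EADirectory())
q.enqueue("load A", ea=0x1000)
q.enqueue("store B", ea=0x9000)
q.mark_for_invalidation(lo=0x1000, hi=0x2000)   # only "load A" is marked
print(q.invalidation_may_complete())            # False until "load A" drains
```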
-
Publication number: 20220405125
Abstract: In at least some embodiments, a store-type operation is received and buffered within a store queue entry of a store queue associated with a cache memory of a processor core capable of executing multiple simultaneous hardware threads. A thread identifier indicating a particular hardware thread among the multiple hardware threads that issued the store-type operation is recorded. An indication of whether the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the hardware thread is also maintained. While the indication indicates the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the particular hardware thread, the store queue extends a duration of a store gathering window applicable to the store queue entry. For example, the duration may be extended by decreasing a rate at which the store gathering window applicable to the store queue entry ends.
Type: Application
Filed: June 18, 2021
Publication date: December 22, 2022
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
-
Publication number: 20220405202
Abstract: A processing unit includes a processor core and an associated cache memory. The cache memory establishes a reservation of a hardware thread of the processor core for a store target address and services a store-conditional request of the processor core by conditionally updating the shared memory with store data based on whether the hardware thread has a reservation for the store target address. The cache memory receives a hint associated with the store-conditional request indicating an intent of the store-conditional request. The cache memory protects the store target address against access by any conflicting memory access request during a protection window extension following servicing of the store-conditional request. The cache memory establishes a first duration for the protection window extension based on the hint having a first value and establishes a different second duration for the protection window extension based on the hint having a different second value.
Type: Application
Filed: June 18, 2021
Publication date: December 22, 2022
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli
-
Patent number: 11354243
Abstract: A processing unit for a data processing system includes a processor core that issues memory access requests and a cache memory coupled to the processor core. The cache memory includes a reservation circuit that tracks reservations established by the processor core via load-reserve requests and a plurality of read-claim (RC) state machines for servicing memory access requests of the processor core. The cache memory, responsive to receipt from the processor core of a store-conditional request specifying a store target address, allocates an RC state machine among the plurality of RC state machines to process the store-conditional request and transfers responsibility for tracking a reservation for the store target address from the reservation circuit to the RC state machine.
Type: Grant
Filed: November 17, 2020
Date of Patent: June 7, 2022
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Sanjeev Ghai, Luke Murray
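The hand-off described in this abstract is easy to picture in code. The sketch below is an assumption-laden model (the `ReservationCircuit` and `RCMachine` classes and the success criterion are invented): a reservation established by load-reserve lives in a reservation circuit until an RC state machine is dispatched for the matching store-conditional, at which point tracking responsibility moves to that machine.

```python
class ReservationCircuit:
    def __init__(self):
        self.reservations = {}        # thread -> reserved address

    def load_reserve(self, thread, addr):
        self.reservations[thread] = addr

    def transfer(self, thread):
        return self.reservations.pop(thread, None)

class RCMachine:
    def __init__(self, machine_id):
        self.machine_id = machine_id
        self.tracked_reservation = None

    def dispatch_store_conditional(self, thread, addr, circuit):
        reserved = circuit.transfer(thread)     # responsibility moves here
        self.tracked_reservation = reserved
        return reserved == addr                 # store succeeds iff it matches

circuit = ReservationCircuit()
circuit.load_reserve(thread=2, addr=0x100)
rc = RCMachine(machine_id=5)
print(rc.dispatch_store_conditional(thread=2, addr=0x100, circuit=circuit))  # True
```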
-
Publication number: 20220156194
Abstract: A processing unit for a data processing system includes a processor core that issues memory access requests and a cache memory coupled to the processor core. The cache memory includes a reservation circuit that tracks reservations established by the processor core via load-reserve requests and a plurality of read-claim (RC) state machines for servicing memory access requests of the processor core. The cache memory, responsive to receipt from the processor core of a store-conditional request specifying a store target address, allocates an RC state machine among the plurality of RC state machines to process the store-conditional request and transfers responsibility for tracking a reservation for the store target address from the reservation circuit to the RC state machine.
Type: Application
Filed: November 17, 2020
Publication date: May 19, 2022
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Sanjeev Ghai, Luke Murray
-
Patent number: 11321145
Abstract: A processing unit for a multiprocessor data processing system includes a processor core having an upper level cache and a lower level cache coupled to the processor core. The processor core is configured to, based on receipt of an interrupt, generate and issue a synchronization request prior to executing an interrupt handler and is configured to, based on receipt of a synchronization acknowledgment for the synchronization request, execute the interrupt handler. The lower level cache is configured to, based on receipt of the synchronization request, record which of its state machines are active processing a prior snooped request that can invalidate a cache line in the upper level cache, and is configured to, based on determining that each such state machine has completed processing of its respective prior snooped request, issue the synchronization acknowledgment to the processor core.
Type: Grant
Filed: June 27, 2019
Date of Patent: May 3, 2022
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Hugh Shen, Guy L. Guthrie
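A hedged sketch of the interrupt-entry synchronization in this abstract, with classes and machine names invented for the example: the core does not start the interrupt handler until the lower-level cache confirms that every snoop machine that was busy invalidating upper-level cache lines at the time of the sync request has finished.

```python
class LowerLevelCache:
    def __init__(self):
        self.active_invalidating_snoops = set()

    def synchronize(self):
        # Record which snoop machines were active at the sync request, then
        # acknowledge once that recorded set has drained.
        recorded = set(self.active_invalidating_snoops)
        while recorded:
            finished = recorded.pop()           # model each machine completing
            self.active_invalidating_snoops.discard(finished)
        return "sync_ack"

class Core:
    def handle_interrupt(self, l2):
        ack = l2.synchronize()                  # issued before the handler runs
        assert ack == "sync_ack"
        return "interrupt handler executed"

l2 = LowerLevelCache()
l2.active_invalidating_snoops = {"SN0", "SN3"}
print(Core().handle_interrupt(l2))
```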
-
Patent number: 11281582
Abstract: A data processing system includes multiple processing units all having access to a shared memory system. A processing unit includes a lower level cache configured to serve as a point of systemwide coherency and a processor core coupled to the lower level cache. The processor core includes an upper level cache, an execution unit that executes a store-conditional instruction to generate a store-conditional request that specifies a store target address and store data, and a flag that, when set, indicates the store-conditional request can be completed early in the processor core. The processor core also includes completion logic configured to commit an update of the shared memory system with the store data specified by the store-conditional request based on whether the flag is set.
Type: Grant
Filed: January 14, 2020
Date of Patent: March 22, 2022
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, William J. Starke, Hugh Shen
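As a simplified illustration of the core-level flag described here (names and the two-path split are my assumptions): when the flag is set, the completion logic commits the store-conditional in the core without waiting on the lower-level cache; otherwise the decision is deferred to the lower-level cache, which serves as the point of systemwide coherency.

```python
class CoreCompletionLogic:
    def __init__(self):
        self.early_complete_flag = False

    def commit_store_conditional(self, addr, data, lower_level_cache):
        if self.early_complete_flag:
            return ("completed-in-core", addr, data)       # early path
        # Normal path: defer the commit decision to the lower-level cache.
        return lower_level_cache(addr, data)

def l2_commit(addr, data):
    return ("completed-at-L2", addr, data)

logic = CoreCompletionLogic()
print(logic.commit_store_conditional(0x40, 7, l2_commit))  # L2 path
logic.early_complete_flag = True
print(logic.commit_store_conditional(0x40, 8, l2_commit))  # early path
```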
-
Patent number: 11176038
Abstract: A data processing system includes multiple processing units coupled to a system interconnect including a broadcast address interconnect and a data interconnect. The processing unit includes a processor core that executes memory access instructions and a cache memory, coupled to the processor core, which is configured to store data for access by the processor core. The processing unit is configured to broadcast, on the address interconnect, a cache-inhibited write request and write data for a destination device coupled to the system interconnect. In various embodiments, the initial cache-inhibited request and the write data can be communicated in the same or different requests on the address interconnect.
Type: Grant
Filed: September 30, 2019
Date of Patent: November 16, 2021
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
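A small sketch of the two variants mentioned at the end of the abstract, with message formats invented for the example: a cache-inhibited write can ride the broadcast address interconnect either with its data folded into the same request or as a separate follow-on request.

```python
def build_ci_write(addr, data, combined=True):
    if combined:
        # Single broadcast request carrying both the address and the data.
        return [{"type": "ci_write", "addr": addr, "data": data}]
    # Two broadcasts: the write request first, the data in a follow-on request.
    return [{"type": "ci_write", "addr": addr},
            {"type": "ci_write_data", "addr": addr, "data": data}]

for msg in build_ci_write(0xFE00_0000, b"\x01\x02", combined=False):
    print(msg)
```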
-
Publication number: 20210342275
Abstract: An upper level cache receives from an associated processor core a plurality of memory access requests including at least first and second memory access requests of differing first and second classes. Based on class histories associated with the first and second classes of memory access requests, the upper level cache initiates, on the system interconnect fabric, a first interconnect transaction corresponding to the first memory access request without first issuing the first memory access request to the lower level cache via a private communication channel between the upper level cache and the lower level cache.
Type: Application
Filed: April 30, 2020
Publication date: November 4, 2021
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Luke Murray
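The class-history mechanism can be sketched as simple per-class bookkeeping; the history depth, the 25% threshold, and the class names below are assumptions made for illustration only. If recent requests of a given class have mostly missed in the lower-level cache, the upper-level cache skips the private channel and launches the interconnect transaction directly.

```python
from collections import defaultdict, deque

class ClassHistory:
    def __init__(self, depth=8):
        self.history = defaultdict(lambda: deque(maxlen=depth))

    def record(self, req_class, hit_in_lower_cache):
        self.history[req_class].append(hit_in_lower_cache)

    def should_bypass_lower_cache(self, req_class):
        h = self.history[req_class]
        # Bypass when most recent requests of this class missed below.
        return len(h) > 0 and sum(h) / len(h) < 0.25

hist = ClassHistory()
for hit in [False, False, False, True, False]:
    hist.record("dma-read", hit)
print(hist.should_bypass_lower_cache("dma-read"))   # True  -> go straight to fabric
print(hist.should_bypass_lower_cache("load"))       # False -> use the private channel
```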
-
Patent number: 11163700
Abstract: An upper level cache receives from an associated processor core a plurality of memory access requests including at least first and second memory access requests of differing first and second classes. Based on class histories associated with the first and second classes of memory access requests, the upper level cache initiates, on the system interconnect fabric, a first interconnect transaction corresponding to the first memory access request without first issuing the first memory access request to the lower level cache via a private communication channel between the upper level cache and the lower level cache.
Type: Grant
Filed: April 30, 2020
Date of Patent: November 2, 2021
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Luke Murray