Patents by Inventor Guy L. Guthrie

Guy L. Guthrie has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10379856
    Abstract: A data processing system implementing a weak memory model includes a plurality of processing units coupled to an interconnect fabric. In response to execution of a multicopy atomic store instruction, an initiating processing unit broadcasts a store request on the interconnect fabric to obtain coherence ownership of a target cache line. The initiating processing unit posts a kill request to at least one of the plurality of processing units to request invalidation of a copy of the target cache line. In response to successful posting of the kill request, the initiating processing unit broadcasts a store complete request on the interconnect fabric to enforce completion of the invalidation of the copy of the target cache line. In response to the store complete request receiving a coherence response indicating success, the initiating processing unit permits an update to the target cache line requested by the multicopy atomic store instruction to be atomically visible.
    Type: Grant
    Filed: June 4, 2017
    Date of Patent: August 13, 2019
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Derek E. Williams
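
The ordering of steps in the abstract for patent 10379856 can be pictured with a small software model. This is a hypothetical sketch only: the patent describes hardware, and `ProcessingUnit`, `multicopy_atomic_store`, and the `log` list are names assumed for this illustration.

```python
class ProcessingUnit:
    """Hypothetical model of one processing unit with a tiny private cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}          # address -> value for valid cached copies
        self.pending_kills = []  # kill requests posted but not yet applied

    def post_kill(self, addr):
        # Posting only queues the invalidation; it is applied later.
        self.pending_kills.append(addr)
        return True

    def drain_kills(self):
        # Apply every queued invalidation.
        for addr in self.pending_kills:
            self.cache.pop(addr, None)
        self.pending_kills.clear()


def multicopy_atomic_store(initiator, others, memory, addr, value, log):
    # Step 1: broadcast a store request to obtain coherence ownership.
    log.append("store_request")
    # Step 2: post kill requests to the units that hold a copy of the line.
    posted = all(u.post_kill(addr) for u in others if addr in u.cache)
    if not posted:
        return False
    log.append("kills_posted")
    # Step 3: the store-complete request enforces completion of the kills.
    for u in others:
        u.drain_kills()
    log.append("store_complete")
    # Step 4: only now is the update allowed to become visible.
    initiator.cache[addr] = value
    memory[addr] = value
    log.append("visible")
    return True
```

In this model the new value is written only after every stale copy has been invalidated, which is the property the abstract calls atomic visibility.
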
  • Patent number: 10380031
    Abstract: Ensuring forward progress for nested translations in a memory management unit (MMU) including receiving a plurality of nested translation requests, wherein each of the plurality of nested translation requests requires at least one congruence class lock; detecting, using a congruence class scoreboard, a collision of the plurality of nested translation requests based on the required congruence class locks; quiescing, in response to detecting the collision of the plurality of nested translation requests, a translation pipeline in the MMU including switching operation of the translation pipeline from a multi-thread mode to a single-thread mode and marking a first subset of the plurality of nested translation requests as high-priority nested translation requests; and servicing the high-priority nested translation requests through the translation pipeline in the single-thread mode.
    Type: Grant
    Filed: November 27, 2017
    Date of Patent: August 13, 2019
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Jody B. Joyner, Jon K. Kriegel, Bradley Nelson, Charles D. Wait
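
A rough software analogy for the collision detection described in the abstract for patent 10380031 follows. It is a sketch under stated assumptions: the scoreboard is modeled as a counter of lock demands per congruence class, and the rule for choosing the high-priority subset (every request touching a contested class) is an assumption, since the abstract does not specify it. The function names are hypothetical.

```python
from collections import Counter


def detect_collision(requests):
    """Return the congruence classes required by more than one request.

    `requests` maps a request id to the set of congruence class locks it needs.
    """
    scoreboard = Counter()
    for locks in requests.values():
        scoreboard.update(locks)
    return {cc for cc, count in scoreboard.items() if count > 1}


def schedule(requests):
    """Pick a pipeline mode and the requests to mark as high priority."""
    contested = detect_collision(requests)
    if not contested:
        # No collision: the pipeline stays in multi-thread mode.
        return "multi-thread", []
    # Assumption: the high-priority subset is every request that needs
    # at least one contested congruence class lock.
    high_priority = sorted(r for r, locks in requests.items() if locks & contested)
    return "single-thread", high_priority
```

For example, `schedule({"r0": {1, 3}, "r1": {3}, "r2": {5}})` reports a collision on class 3 and selects `r0` and `r1` for single-thread servicing.
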
  • Publication number: 20190220409
    Abstract: A cache coherent data processing system includes at least non-overlapping first, second, and third coherency domains. A master in the first coherency domain of the cache coherent data processing system selects a scope of an initial broadcast of an interconnect operation from among a set of scopes including (1) a remote scope including both the first coherency domain and the second coherency domain, but excluding the third coherency domain that is a peer of the first coherency domain, and (2) a local scope including only the first coherency domain. The master then performs an initial broadcast of the interconnect operation within the cache coherent data processing system utilizing the selected scope, where performing the initial broadcast includes the master initiating broadcast of the interconnect operation within the first coherency domain.
    Type: Application
    Filed: January 17, 2018
    Publication date: July 18, 2019
    Inventors: Guy L. Guthrie, Michael S. Siegel, William J. Starke, Jeffrey A. Stuecheli, Derek E. Williams
  • Patent number: 10331563
    Abstract: Statistical data is used to enable or disable snooping on a bus of a processor. A command is received via a first bus or a second bus communicably coupling processor cores and caches of chiplets on the processor. Cache logic on a chiplet determines whether or not a local cache on the chiplet can satisfy a request for data specified in the command. In response to determining that the local cache can satisfy the request for data, the cache logic updates statistical data maintained on the chiplet. The statistical data indicates a probability that the local cache can satisfy a future request for data. Based at least in part on the statistical data, the cache logic determines whether to enable or disable snooping on the second bus by the local cache.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: June 25, 2019
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hien M. Le, Hugh Shen, Derek E. Williams, Phillip G. Williams
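
One way to picture the statistics-driven decision in the abstract for patent 10331563 is a sliding-window hit rate. The sketch below is illustrative only: the window size, the threshold, the recording of misses alongside hits, and the `SnoopFilter` class itself are assumptions, not details taken from the patent.

```python
class SnoopFilter:
    """Hypothetical per-chiplet statistic that gates snooping on the second bus."""

    def __init__(self, window=8, threshold=0.25):
        self.window = window        # assumed number of recent commands tracked
        self.threshold = threshold  # assumed minimum hit rate to keep snooping
        self.history = []           # 1 for a local hit, 0 for a miss

    def record(self, local_hit):
        # Update the statistic after each command is checked against the cache.
        self.history.append(1 if local_hit else 0)
        if len(self.history) > self.window:
            self.history.pop(0)

    def hit_rate(self):
        if not self.history:
            return 1.0  # no data yet: assume snooping is worthwhile
        return sum(self.history) / len(self.history)

    def snoop_enabled(self):
        # Snooping stays on only while the estimated hit probability is high enough.
        return self.hit_rate() >= self.threshold
```

A cache that rarely satisfies requests drives the hit rate below the threshold, at which point `snoop_enabled()` returns `False` until hits accumulate again.
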
  • Patent number: 10331373
    Abstract: A data processing system includes at least one processor core each having an associated store-through upper level cache and an associated store-in lower level cache. In response to execution of a memory move instruction sequence including a plurality of copy-type instructions and a plurality of paste-type instructions, the at least one processor core transmits a corresponding plurality of copy-type and paste-type requests to its associated lower level cache, where each copy-type request specifies a source real address and each paste-type request specifies a destination real address. In response to receipt of each copy-type request, the associated lower level cache copies a respective data granule from a respective storage location specified by the source real address of that copy-type request into a non-architected buffer.
    Type: Grant
    Filed: August 22, 2016
    Date of Patent: June 25, 2019
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, William J. Starke, Derek E. Williams
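
The copy-then-paste flow in the abstract for patent 10331373 resembles a buffered memory move. The following is a minimal sketch with assumed details: the 4-byte granule, the `LowerLevelCache` class, and the behavior of `paste` (writing the buffered granule to the destination) are illustrative choices, because the abstract itself only describes the copy side.

```python
GRANULE = 4  # assumed granule size in bytes, chosen only for the example


class LowerLevelCache:
    """Hypothetical model of a lower level cache servicing copy/paste requests."""

    def __init__(self, memory):
        self.memory = memory  # bytearray standing in for real memory
        self.buffer = None    # non-architected buffer holding one granule

    def copy(self, src_addr):
        # Copy-type request: read one granule from the source into the buffer.
        self.buffer = bytes(self.memory[src_addr:src_addr + GRANULE])

    def paste(self, dst_addr):
        # Paste-type request (assumed behavior): write the buffer to the destination.
        if self.buffer is None:
            raise RuntimeError("paste without a preceding copy")
        self.memory[dst_addr:dst_addr + GRANULE] = self.buffer


def memory_move(cache, src, dst, length):
    # A memory move is a sequence of copy/paste pairs, one per granule.
    for offset in range(0, length, GRANULE):
        cache.copy(src + offset)
        cache.paste(dst + offset)
```

The buffer is private to the cache model, mirroring the abstract's point that the intermediate storage is not architecturally visible.
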
  • Publication number: 20190188138
    Abstract: A data processing system includes first and second processing nodes and response logic coupled by an interconnect fabric. A first coherence participant in the first processing node is configured to issue a memory access request specifying a target memory block, and a second coherence participant in the second processing node is configured to issue a probe request regarding a memory region tracked in a memory coherence directory. The first coherence participant is configured to, responsive to receiving the probe request after the memory access request and before receiving a systemwide coherence response for the memory access request, detect an address collision between the probe request and the memory access request and, responsive thereto, transmit a speculative coherence response. The response logic is configured to, responsive to the speculative coherence response, provide a systemwide coherence response for the probe request that prevents the probe request from succeeding.
    Type: Application
    Filed: December 19, 2017
    Publication date: June 20, 2019
    Inventors: Guy L. Guthrie, David J. Krolak, Michael S. Siegel, Derek E. Williams
  • Patent number: 10318432
    Abstract: A technique for operating a lower level cache memory of a data processing system includes receiving an operation that is associated with a first thread. Logical partition (LPAR) information for the operation is used to limit dependencies in a dependency data structure of a store queue of the lower level cache memory that are set and to remove dependencies that are otherwise unnecessary.
    Type: Grant
    Filed: July 9, 2018
    Date of Patent: June 11, 2019
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Derek E. Williams
  • Patent number: 10318435
    Abstract: Ensuring forward progress for nested translations in a memory management unit (MMU) including receiving a plurality of nested translation requests, wherein each of the plurality of nested translation requests requires at least one congruence class lock; detecting, using a congruence class scoreboard, a collision of the plurality of nested translation requests based on the required congruence class locks; quiescing, in response to detecting the collision of the plurality of nested translation requests, a translation pipeline in the MMU including switching operation of the translation pipeline from a multi-thread mode to a single-thread mode and marking a first subset of the plurality of nested translation requests as high-priority nested translation requests; and servicing the high-priority nested translation requests through the translation pipeline in the single-thread mode.
    Type: Grant
    Filed: August 22, 2017
    Date of Patent: June 11, 2019
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Jody B. Joyner, Jon K. Kriegel, Bradley Nelson, Charles D. Wait
  • Publication number: 20190163633
    Abstract: A claw-back request, received from an accelerator, is issued for an address line. While waiting for a response to the claw-back request, a cast-out push request with a matching address line is received. The cast-out push request is associated with a cache having a modified copy of the address line. A combined-response, associated with the cast-out push request, is received from a bus. Data associated with the modified copy of the address line is received from the cache. A claw-back response, with the data associated with the modified copy of the address line, is issued to the accelerator.
    Type: Application
    Filed: November 30, 2017
    Publication date: May 30, 2019
    Inventors: Kenneth M. Valk, Guy L. Guthrie, Derek E. Williams, Michael S. Siegel, John D. Irish
  • Publication number: 20190138630
    Abstract: A technique for operating a data processing system that implements a split transaction coherency protocol that has an address tenure and a data tenure includes receiving, at a data source, a command (that includes an address tenure for requested data) that is issued from a data sink. The data source issues a response that indicates data associated with the address tenure is available to be transferred to the data sink during a data tenure. In response to determining that the data is available subsequent to issuing the response, the data source issues a first data packet to the data sink that includes the data during the data tenure. In response to determining that the data is not available subsequent to issuing the response, the data source issues a second data packet to the data sink that includes a data header that indicates the data is unavailable.
    Type: Application
    Filed: November 9, 2017
    Publication date: May 9, 2019
    Inventors: Bernard C. Drerup, Guy L. Guthrie, Michael S. Siegel, Jeffrey A. Stuecheli
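
The two outcomes of the data tenure in the abstract for publication 20190138630 can be modeled as one packet type with a validity flag in its header. This is a hypothetical sketch: `DataPacket`, `DataSource`, the `data_valid` field, and the response string are names assumed for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataPacket:
    address: int
    data_valid: bool        # header flag: False means the data is unavailable
    data: Optional[bytes]   # payload, present only when data_valid is True


class DataSource:
    """Hypothetical data source in a split transaction protocol."""

    def __init__(self, store):
        self.store = store  # address -> bytes currently held by the source

    def address_tenure(self, address):
        # Early response: the source indicates that data will be sent
        # during a later data tenure.
        return "data_will_follow"

    def data_tenure(self, address):
        # The data may have become unavailable after the early response.
        data = self.store.get(address)
        if data is not None:
            return DataPacket(address, True, data)
        return DataPacket(address, False, None)
```

The data sink can then inspect `data_valid` and retry the command when the flag is `False`, rather than waiting indefinitely for a payload.
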
  • Publication number: 20190121760
    Abstract: A processing unit connected via a system fabric to multiple processing units calls a first single command in a bus protocol that allows sampling, over the system fabric, of the capability of snoopers distributed across the processing units to handle an interrupt. In response to detecting at least one first selection of snoopers with the capability to handle the interrupt, the processing unit calls a second single command in the bus protocol to poll the first selection of snoopers over the system fabric for an availability status. In response to detecting that at least one second selection of snoopers responds with an availability status indicating availability to handle the interrupt, the processing unit assigns a single snooper from among the second selection of snoopers to handle the interrupt by calling a third single command in the bus protocol.
    Type: Application
    Filed: October 25, 2017
    Publication date: April 25, 2019
    Inventors: Richard L. Arndt, Florian Auernhammer, Wayne M. Barrett, Robert A. Drehmel, Guy L. Guthrie, Michael S. Siegel, William J. Starke
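
The three-command sequence in the abstract for publication 20190121760 maps naturally onto three filtering steps. The sketch below is an assumption-laden illustration: the `Snooper` record, the `route_interrupt` function, and the rule of assigning the first available snooper are hypothetical, since the abstract does not say how the single snooper is chosen.

```python
from dataclasses import dataclass


@dataclass
class Snooper:
    name: str
    capable: bool           # can this snooper handle the interrupt at all?
    available: bool         # is it currently free to take the interrupt?
    assigned: bool = False


def route_interrupt(snoopers):
    """Return the name of the snooper assigned to the interrupt, or None."""
    # Command 1: sample which snoopers are capable of handling the interrupt.
    capable = [s for s in snoopers if s.capable]
    if not capable:
        return None
    # Command 2: poll only the capable snoopers for their availability status.
    available = [s for s in capable if s.available]
    if not available:
        return None
    # Command 3: assign exactly one snooper (assumption: the first available).
    chosen = available[0]
    chosen.assigned = True
    return chosen.name
```

Each step narrows the candidate set, so the final assignment command targets a snooper already known to be both capable and available.
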
  • Patent number: 10241945
    Abstract: In a data processing system implementing a weak memory model, a lower level cache receives, from a processor core, a plurality of copy-type requests and a plurality of paste-type requests that together indicate a memory move to be performed. The lower level cache also receives, from the processor core, a barrier request that requests enforcement of ordering of memory access requests prior to the barrier request with respect to memory access requests after the barrier request. Prior to completion of processing of the barrier request by the lower level cache, the lower level cache speculatively issues a request on the interconnect fabric to obtain a copy of a data granule specified by a memory access request among the pluralities of requests that follows the barrier request in program order.
    Type: Grant
    Filed: August 22, 2016
    Date of Patent: March 26, 2019
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Derek E. Williams
  • Patent number: 10235215
    Abstract: A memory lock mechanism within a multi-processor system is disclosed. A lock control section is initially assigned to a data block within a system memory of the multiprocessor system. In response to a request for accessing the data block by a processing unit within the multiprocessor system, a determination is made by a memory controller whether or not the lock control section of the data block has been set. If the lock control section of the data block has been set, the request for accessing the data block is denied. Otherwise, if the lock control section of the data block has not been set, the lock control section of the data block is set, and the request for accessing the data block is allowed.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: March 19, 2019
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Guy L. Guthrie, William J. Starke
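
The decision made by the memory controller in the abstract for patent 10235215 is a test-and-set on a per-block lock flag. A minimal sketch follows; the `MemoryController` class is a hypothetical model, and the `release` method is an assumption added so the example can show a lock being cleared, which the abstract does not describe.

```python
class MemoryController:
    """Hypothetical controller that tracks a lock control section per data block."""

    def __init__(self):
        self.locked = set()  # block ids whose lock control section is set

    def request_access(self, block):
        # If the lock control section is already set, deny the request.
        if block in self.locked:
            return False
        # Otherwise set the lock control section and allow the access.
        self.locked.add(block)
        return True

    def release(self, block):
        # Assumption: some mechanism clears the lock when the holder is done.
        self.locked.discard(block)
```

A second request for the same block is denied until the lock is cleared, while requests for other blocks proceed independently.
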
  • Publication number: 20190065380
    Abstract: Reducing translation latency within a memory management unit (MMU) using external caching structures including requesting, by the MMU on a node, page table entry (PTE) data and coherent ownership of the PTE data from a page table in memory; receiving, by the MMU, the PTE data, a source flag, and an indication that the MMU has coherent ownership of the PTE data, wherein the source flag identifies a source location of the PTE data; performing a lateral cast out to a local high-level cache on the node in response to determining that the source flag indicates that the source location of the PTE data is external to the node; and directing at least one subsequent request for the PTE data to the local high-level cache.
    Type: Application
    Filed: November 21, 2017
    Publication date: February 28, 2019
    Inventors: Guy L. Guthrie, Jody B. Joyner, Ronald N. Kalla, Michael S. Siegel, Jeffrey A. Stuecheli, Charles D. Wait, Frederick J. Ziegler
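
The source-flag check in the abstract for publication 20190065380 can be sketched as a lookup that caches remotely sourced entries locally. The code below is illustrative only: `fetch_pte`, the dictionary-based page table, and the returned origin strings are assumptions made for this example, and coherence ownership is not modeled.

```python
def fetch_pte(mmu_node, vaddr, page_table, local_cache):
    """Return (pte, origin) for a virtual address.

    `page_table` maps vaddr -> (pte, source_node), where source_node plays the
    role of the source flag. `local_cache` stands in for the local high-level
    cache on the MMU's node.
    """
    # Later requests are directed to the local cache when the entry is there.
    if vaddr in local_cache:
        return local_cache[vaddr], "local-cache"
    pte, source_node = page_table[vaddr]
    # Lateral cast-out: keep a local copy only when the data came from off-node.
    if source_node != mmu_node:
        local_cache[vaddr] = pte
    return pte, "page-table"
```

After the first off-node fetch, repeated translations of the same address are served from the local cache, which is the latency reduction the abstract describes.
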
  • Publication number: 20190065379
    Abstract: Reducing translation latency within a memory management unit (MMU) using external caching structures including requesting, by the MMU on a node, page table entry (PTE) data and coherent ownership of the PTE data from a page table in memory; receiving, by the MMU, the PTE data, a source flag, and an indication that the MMU has coherent ownership of the PTE data, wherein the source flag identifies a source location of the PTE data; performing a lateral cast out to a local high-level cache on the node in response to determining that the source flag indicates that the source location of the PTE data is external to the node; and directing at least one subsequent request for the PTE data to the local high-level cache.
    Type: Application
    Filed: August 22, 2017
    Publication date: February 28, 2019
    Inventors: Guy L. Guthrie, Jody B. Joyner, Ronald N. Kalla, Michael S. Siegel, Jeffrey A. Stuecheli, Charles D. Wait, Frederick J. Ziegler
  • Publication number: 20190065399
    Abstract: Ensuring forward progress for nested translations in a memory management unit (MMU) including receiving a plurality of nested translation requests, wherein each of the plurality of nested translation requests requires at least one congruence class lock; detecting, using a congruence class scoreboard, a collision of the plurality of nested translation requests based on the required congruence class locks; quiescing, in response to detecting the collision of the plurality of nested translation requests, a translation pipeline in the MMU including switching operation of the translation pipeline from a multi-thread mode to a single-thread mode and marking a first subset of the plurality of nested translation requests as high-priority nested translation requests; and servicing the high-priority nested translation requests through the translation pipeline in the single-thread mode.
    Type: Application
    Filed: November 27, 2017
    Publication date: February 28, 2019
    Inventors: Guy L. Guthrie, Jody B. Joyner, Jon K. Kriegel, Bradley Nelson, Charles D. Wait
  • Publication number: 20190065398
    Abstract: Ensuring forward progress for nested translations in a memory management unit (MMU) including receiving a plurality of nested translation requests, wherein each of the plurality of nested translation requests requires at least one congruence class lock; detecting, using a congruence class scoreboard, a collision of the plurality of nested translation requests based on the required congruence class locks; quiescing, in response to detecting the collision of the plurality of nested translation requests, a translation pipeline in the MMU including switching operation of the translation pipeline from a multi-thread mode to a single-thread mode and marking a first subset of the plurality of nested translation requests as high-priority nested translation requests; and servicing the high-priority nested translation requests through the translation pipeline in the single-thread mode.
    Type: Application
    Filed: August 22, 2017
    Publication date: February 28, 2019
    Inventors: Guy L. Guthrie, Jody B. Joyner, Jon K. Kriegel, Bradley Nelson, Charles D. Wait
  • Patent number: 10216519
    Abstract: A data processing system implementing a weak memory model includes a plurality of processing units coupled to an interconnect fabric. In response to execution of a multicopy atomic store instruction, an initiating processing unit broadcasts a store request on the interconnect fabric to obtain coherence ownership of a target cache line. The initiating processing unit posts a kill request to at least one of the plurality of processing units to request invalidation of a copy of the target cache line. In response to successful posting of the kill request, the initiating processing unit broadcasts a store complete request on the interconnect fabric to enforce completion of the invalidation of the copy of the target cache line. In response to the store complete request receiving a coherence response indicating success, the initiating processing unit permits an update to the target cache line requested by the multicopy atomic store instruction to be atomically visible.
    Type: Grant
    Filed: November 29, 2017
    Date of Patent: February 26, 2019
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Derek E. Williams
  • Publication number: 20190042486
    Abstract: A technique for operating a data processing system includes determining, by an arbiter of a processing unit of the data processing system, whether an over-commit has occurred. In response to determining that the over-commit has occurred, the arbiter selects a broadcast command to be dropped based on a number of hops traversed through the data processing system by the broadcast command.
    Type: Application
    Filed: August 2, 2017
    Publication date: February 7, 2019
    Inventors: Guy L. Guthrie, Charles Marino, Praveen S. Reddy
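
The arbiter decision in the abstract for publication 20190042486 can be sketched as a selection keyed on hop count. The abstract does not say whether commands with more or fewer hops are preferred for dropping, so the policy below (drop the command that has traversed the fewest hops) is purely an assumption, as are the `select_drop` name and the capacity-based over-commit test.

```python
def select_drop(commands, capacity):
    """Return the id of the broadcast command to drop, or None.

    `commands` is a list of (command_id, hops_traversed) pairs. An over-commit
    is assumed to occur when more commands are pending than `capacity` allows.
    """
    if len(commands) <= capacity:
        return None  # no over-commit, nothing is dropped
    # Assumed policy: drop the command with the fewest hops traversed, on the
    # reasoning that the least fabric work is wasted. Ties go to the earliest.
    return min(commands, key=lambda c: c[1])[0]
```

Swapping `min` for `max` would model the opposite policy; either choice is consistent with the wording of the abstract.
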
  • Publication number: 20190042428
    Abstract: A technique for operating a data processing system includes transitioning, by a cache, to a highest point of coherency (HPC) for a cache line in a required state without receiving data for one or more segments of the cache line that are needed. The cache issues a command to a lowest point of coherency (LPC) that requests data for the one or more segments of the cache line that were not received and are needed. The cache receives the data for the one or more segments of the cache line from the LPC that were not previously received and were needed.
    Type: Application
    Filed: August 2, 2017
    Publication date: February 7, 2019
    Inventors: Guy L. Guthrie, Michael S. Siegel, William J. Starke, Jeffrey A. Stuecheli
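
The segment fill described in the abstract for publication 20190042428 can be illustrated with a cache line held as a list of segments. This is a hypothetical sketch: `fill_missing_segments`, the use of `None` for a segment that was not received, and the list standing in for the LPC are assumptions for the example.

```python
def fill_missing_segments(line, needed, lpc_line):
    """Fetch needed segments that the cache did not receive.

    `line` is the cache's copy of the line as a list of segments, with None
    marking a segment that was never received. `needed` is the set of segment
    indices the cache requires. `lpc_line` is the full line held at the LPC.
    Returns the list of segment indices requested from the LPC.
    """
    # Only segments that are both needed and absent are requested.
    missing = [i for i in sorted(needed) if line[i] is None]
    for i in missing:
        line[i] = lpc_line[i]
    return missing
```

Segments that are absent but not needed stay empty, so the request to the LPC covers only the data the cache actually requires.
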