Patents by Inventor Guy L. Guthrie

Guy L. Guthrie has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10102130
    Abstract: A multiprocessor data processing system includes multiple vertical cache hierarchies supporting a plurality of processor cores, a system memory, and a system interconnect coupled to the system memory and the multiple vertical cache hierarchies. A first cache memory in a first vertical cache hierarchy issues on the system interconnect a request for a target cache line. Responsive to the request and prior to receiving a systemwide coherence response for the request, the first cache memory receives from a second cache memory in a second vertical cache hierarchy by cache-to-cache intervention the target cache line and an early indication of the systemwide coherence response for the request. In response to the early indication of the systemwide coherence response and prior to receiving the systemwide coherence response, the first cache memory initiates processing to install the target cache line in the first cache memory.
    Type: Grant
    Filed: April 11, 2016
    Date of Patent: October 16, 2018
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Jonathan R. Jackson, Michael S. Siegel, William J. Starke, Jeffrey A. Stuecheli, Derek E. Williams
  • Patent number: 10067713
    Abstract: In a data processing system implementing a weak memory model, a lower level cache receives, from a processor core, a plurality of copy-type requests and a plurality of paste-type requests that together indicate a memory move to be performed. The lower level cache also receives, from the processor core, a barrier request that requests enforcement of ordering of memory access requests prior to the barrier request with respect to memory access requests after the barrier request. In response to the barrier request, the lower level cache enforces a barrier indicated by the barrier request with respect to a final paste-type request ending the memory move but not with respect to other copy-type requests and paste-type requests in the memory move.
    Type: Grant
    Filed: August 22, 2016
    Date of Patent: September 4, 2018
    Assignee: International Business Machines Corporation
    Inventors: Bradly G. Frey, Guy L. Guthrie, Cathy May, William J. Starke, Derek E. Williams
  • Patent number: 10042580
    Abstract: A lower level cache receives, from a processor core, a plurality of copy-type requests and a plurality of paste-type requests that together indicate a memory move to be performed, as well as a barrier request that requests ordering of memory access requests prior to and after the barrier request. The barrier request precedes a copy-type request and a paste-type request of the memory move in program order. Prior to completion of processing of the barrier request, the lower level cache allocates first and second state machines to service the copy-type and paste-type requests. The first state machine speculatively reads a data granule identified by a source real address of the copy-type request into a non-architected buffer. After processing of the barrier request is complete, the second state machine writes the data granule from the non-architected buffer to a storage location identified by a destination real address of the paste-type request.
    Type: Grant
    Filed: August 22, 2016
    Date of Patent: August 7, 2018
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Derek E. Williams
  • Patent number: 10019374
    Abstract: A technique for operating a lower level cache memory of a data processing system includes receiving an operation that is associated with a first thread. Logical partition (LPAR) information for the operation is used to limit dependencies in a dependency data structure of a store queue of the lower level cache memory that are set and to remove dependencies that are otherwise unnecessary.
    Type: Grant
    Filed: April 28, 2016
    Date of Patent: July 10, 2018
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Derek E. Williams
  • Publication number: 20180165392
    Abstract: In a data processing system, a processor creating level qualifying logic within instrumentation of a hardware description language (HDL) simulation model of a design. The level qualifying logic is configured to generate a first event of a first type for a first simulation level and to generate a second event of second type for a second simulation level. The processor simulates the design utilizing the HDL simulation model, where the simulation includes generating the first event of the first type responsive to the simulating being performed at the first simulation level and generating the second event of the second type responsive to the simulating being performed at the second simulation level. Responsive to the simulating, the processor records, within data storage, at least one occurrence of an event from a set including the first event and the second event.
    Type: Application
    Filed: December 12, 2016
    Publication date: June 14, 2018
    Inventors: GUY L. GUTHRIE, HUGH SHEN, DEREK E. WILLIAMS
  • Patent number: 9996298
    Abstract: A processor core of a data processing system, in response to a first instruction, generates a copy-type request specifying a source real address and transmits it to a lower level cache. In response to a second instruction, the processor core generates a paste-type request specifying a destination real address associated with a memory-mapped device and transmits it to the lower level cache. In response to receipt of the copy-type request, the lower level cache copies a data granule from a storage location specified by the source real address into a non-architected buffer. In response to receipt of the paste-type request, the lower level cache issues a command to write the data granule from the non-architected buffer to the memory-mapped device. In response to receipt from the memory-mapped device of a busy response, the processor core abandons the memory move instruction sequence and performs alternative processing.
    Type: Grant
    Filed: August 22, 2016
    Date of Patent: June 12, 2018
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Guy L. Guthrie, William J. Starke, Jeffrey A. Stuecheli, Derek E. Williams
  • Patent number: 9959213
    Abstract: A technique for operating a lower level cache memory of a data processing system includes receiving, by a store queue controller, an operation that is associated with a first thread. The store queue controller uses level one (L1) cache memory miss information for the operation to limit dependencies in a dependency data structure of a store queue of the lower level cache memory that are set and to remove dependencies that are otherwise unnecessary.
    Type: Grant
    Filed: April 28, 2016
    Date of Patent: May 1, 2018
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Derek E. Williams
  • Patent number: 9928119
    Abstract: In a multithreaded data processing system including a plurality of processor cores, storage-modifying requests of a plurality of concurrently executing hardware threads are received in a shared queue. The storage-modifying requests include a translation invalidation request of an initiating hardware thread. The translation invalidation request is removed from the shared queue and buffered in sidecar logic in one of a plurality of sidecars each associated with a respective one of the plurality of hardware threads. While the translation invalidation request is buffered in the sidecar, the sidecar logic broadcasts the translation invalidation request so that it is received and processed by the plurality of processor cores. In response to confirmation of completion of processing of the translation invalidation request by the initiating processor core, the sidecar logic removes the translation invalidation request from the sidecar.
    Type: Grant
    Filed: March 28, 2016
    Date of Patent: March 27, 2018
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Derek E. Williams
  • Patent number: 9916245
    Abstract: Accessing partial cachelines in a data cache including storing a first portion of a cacheline in a cache entry of the data cache; relaunching a load instruction targeting a second portion of the cacheline, wherein the second portion of the cacheline is not stored in the data cache; determining that the load instruction targets a portion of the cacheline not stored in the cache entry; storing the second portion of the cacheline in the data cache; and reading the second portion of the cacheline from the data cache according to the load instruction.
    Type: Grant
    Filed: May 23, 2016
    Date of Patent: March 13, 2018
    Assignee: International Business Machines Corporation
    Inventors: Richard J. Eickemeyer, Kimberly M. Fernsler, Guy L. Guthrie, David A. Hrusecky, Elizabeth A. McGlone
  • Patent number: 9910782
    Abstract: In at least some embodiments, a processor core generates a store operation by executing a store instruction in an instruction sequence. The store operation is marked as a high priority store operation in response to the store instruction being marked as high priority and is not so marked otherwise. The store operation is buffered in a store queue associated with a cache memory of the processor core. Handling of the store operation in the store queue is expedited in response to the store operation being marked as a high priority store operation and not expedited otherwise.
    Type: Grant
    Filed: August 28, 2015
    Date of Patent: March 6, 2018
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli, Derek E. Williams
  • Publication number: 20180052788
    Abstract: In a data processing system implementing a weak memory model, a lower level cache receives, from a processor core, a plurality of copy-type requests and a plurality of paste-type requests that together indicate a memory move to be performed. The lower level cache also receives, from the processor core, a barrier request that requests enforcement of ordering of memory access requests prior to the barrier request with respect to memory access requests after the barrier request. Prior to completion of processing of the barrier request by the lower level cache, the lower level cache speculatively issues a request on the interconnect fabric to obtain a copy of a data granule specified by a memory access request among the pluralities of requests that follows the barrier request in program order.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Inventors: GUY L. GUTHRIE, DEREK E. WILLIAMS
  • Publication number: 20180052687
    Abstract: A processor core has a store-through upper level cache and a store-in lower level cache. In response to execution of a memory move instruction sequence including a plurality of copy-type instruction and a plurality of paste-type instructions, the processor core transmits a corresponding plurality of copy-type and paste-type requests to the lower level cache, where each copy-type request specifies a source real address and each paste-type request specifies a destination real address. In response to receipt of each copy-type request, the lower level cache copies a respective one of a plurality of data granules from a respective storage location specified by the source real address of that copy-type request into a non-architected buffer. In response to receipt of each paste-type request, the lower level cache writes a respective one of the plurality of data granules from the non-architected buffer to a respective storage location specified by the destination real address of that paste-type request.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Inventors: BRADLY G. FREY, SANJEEV GHAI, GUY L. GUTHRIE, CATHY MAY, WILLIAM J. STARKE, DEREK E. WILLIAMS
  • Publication number: 20180052599
    Abstract: A data processing system includes a processor core having a store-in lower level cache, a memory controller, a memory-mapped device, and an interconnect fabric communicatively coupling the lower level cache and the memory-mapped device. In response to a first instruction in the processor core, a copy-type request specifying a source real address is transmitted to the lower level cache. In response to a second instruction in the processor core, a paste-type request specifying a destination real address associated with the memory-mapped device is transmitted to the lower level cache. In response to receipt of the copy-type request, the lower level cache copies a data granule from a storage location specified by the source real address into a non-architected buffer. In response to receipt of the paste-type request, the lower level cache issues on the interconnect fabric a command that writes the data granule from the non-architected buffer to the memory-mapped device.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Inventors: LAKSHMINARAYANA B. ARIMILLI, GUY L. GUTHRIE, WILLIAM J. STARKE, JEFFREY A. STUECHELI, DEREK E. WILLIAMS
  • Publication number: 20180052605
    Abstract: A data processing system includes a processor core having a store-through upper level cache and a store-in lower level cache. In response to a first instruction, the processor core generates a copy-type request and transmits the copy-type request to the lower level cache, where the copy-type request specifies a source real address. In response to a second instruction, the processor core generates a paste-type request and transmits the paste-type request to the lower level cache, where the paste-type request specifies a destination real address. In response to receipt of the copy-type request, the lower level cache copies a data granule from a storage location specified by the source real address into a non-architected buffer, and in response to receipt of the paste-type request, the lower level cache writes the data granule from the non-architected buffer to a storage location specified by the destination real address.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Inventors: DEREK E. WILLIAMS, GUY L. GUTHRIE, SANJEEV GHAI, WILLIAM J. STARKE
  • Publication number: 20180052771
    Abstract: Statistical data is used to enable or disable snooping on a bus of a processor. A command is received via a first bus or a second bus communicably coupling processor cores and caches of chiplets on the processor. Cache logic on a chiplet determines whether or not a local cache on the chiplet can satisfy a request for data specified in the command. In response to determining that the local cache can satisfy the request for data, the cache logic updates statistical data maintained on the chiplet. The statistical data indicates a probability that the local cache can satisfy a future request for data. Based at least in part on the statistical data, the cache logic determines whether to enable or disable snooping on the second bus by the local cache.
    Type: Application
    Filed: October 27, 2017
    Publication date: February 22, 2018
    Inventors: Guy L. Guthrie, Hien M. Le, Hugh Shen, Derek E. Williams, Phillip G. Williams
  • Publication number: 20180052607
    Abstract: A data processing system includes at least one processor core each having an associated store-through upper level cache and an associated store-in lower level cache. In response to execution of a memory move instruction sequence including a plurality of copy-type instructions and a plurality of paste-type instructions, the at least one processor core transmits a corresponding plurality of copy-type and paste-type requests to its associated lower level cache, where each copy-type request specifies a source real address and each paste-type request specifies a destination real address. In response to receipt of each copy-type request, the associated lower level cache copies a respective data granule from a respective storage location specified by the source real address of that copy-type request into a non-architected buffer.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Inventors: GUY L. GUTHRIE, WILLIAM J. STARKE, DEREK E. WILLIAMS
  • Publication number: 20180052606
    Abstract: In a data processing system implementing a weak memory model, a lower level cache receives, from a processor core, a plurality of copy-type requests and a plurality of paste-type requests that together indicate a memory move to be performed. The lower level cache also receives, from the processor core, a barrier request that requests enforcement of ordering of memory access requests prior to the barrier request with respect to memory access requests after the barrier request. In response to the barrier request, the lower level cache enforces a barrier indicated by the barrier request with respect to a final paste-type request ending the memory move but not with respect to other copy-type requests and paste-type requests in the memory move.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Inventors: BRADLY G. FREY, GUY L. GUTHRIE, CATHY MAY, WILLIAM J. STARKE, DEREK E. WILLIAMS
  • Publication number: 20180052609
    Abstract: A lower level cache receives, from a processor core, a plurality of copy-type requests and a plurality of paste-type requests that together indicate a memory move to be performed, as well as a barrier request that requests ordering of memory access requests prior to and after the barrier request. The barrier request precedes a copy-type request and a paste-type request of the memory move in program order. Prior to completion of processing of the barrier request, the lower level cache allocates first and second state machines to service the copy-type and paste-type requests. The first state machine speculatively reads a data granule identified by a source real address of the copy-type request into a non-architected buffer. After processing of the barrier request is complete, the second state machine writes the data granule from the non-architected buffer to a storage location identified by a destination real address of the paste-type request.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Inventors: GUY L. GUTHRIE, DEREK E. WILLIAMS
  • Publication number: 20180052608
    Abstract: A processor core of a data processing system, in response to a first instruction, generates a copy-type request specifying a source real address and transmits it to a lower level cache. In response to a second instruction, the processor core generates a paste-type request specifying a destination real address associated with a memory-mapped device and transmits it to the lower level cache. In response to receipt of the copy-type request, the lower level cache copies a data granule from a storage location specified by the source real address into a non-architected buffer. In response to receipt of the paste-type request, the lower level cache issues a command to write the data granule from the non-architected buffer to the memory-mapped device. In response to receipt from the memory-mapped device of a busy response, the processor core abandons the memory move instruction sequence and performs alternative processing.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Inventors: LAKSHMINARAYANA B. ARIMILLI, GUY L. GUTHRIE, WILLIAM J. STARKE, JEFFREY A. STUECHELI, DEREK E. WILLIAMS
  • Patent number: 9898416
    Abstract: In a multithreaded data processing system including a plurality of processor cores and a system fabric, translation entries can be invalidated without deadlock. A processing unit forwards translation invalidation request(s) received on the system fabric to a processor core via a non-blocking channel. Each of the translation invalidation requests specifies a respective target address and requests invalidation of any translation entry in the processor core that translates its respective target address. Responsive to a translation snoop machine of the processing unit snooping broadcast of a synchronization request on the system fabric of the data processing system, the translation synchronization request is presented to the processor core, and the translation snoop machine remains in an active state until a signal confirming completion of processing of the one or more translation invalidation requests and the synchronization request at the processor core is received and thereafter returns to an inactive state.
    Type: Grant
    Filed: December 22, 2015
    Date of Patent: February 20, 2018
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Derek E. Williams