Patents by Inventor Jeffrey A. Stuecheli

Jeffrey A. Stuecheli has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20180060070
    Abstract: An array processor includes a managing element having a load streaming unit coupled to multiple processing elements. The load streaming unit provides input data portions to each of a first subset of the processing elements and also receives output data from each of a second subset of the processing elements based on a comparatively sorted combination of the input data portions provided to the first subset of processing elements. Furthermore, each of the processing elements is configurable by the managing element to compare input data portions received from either the load streaming unit or two or more of the other processing elements, wherein the input data portions are stored for processing in respective queues. Each processing element is further configurable to select an input data portion to be output data based on the comparison, and in response to selecting the input data portion, remove a queue entry corresponding to the selected input data portion.
    Type: Application
    Filed: October 31, 2017
    Publication date: March 1, 2018
    Inventors: Ganesh Balakrishnan, Bartholomew Blaner, John J. Reilly, Jeffrey A. Stuecheli
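    A minimal Python sketch, not taken from the patent (the class shape and queue layout are assumed), of the compare-and-select behavior the abstract describes: a processing element compares the heads of its input queues, emits the smaller value as output, and removes the corresponding queue entry.
      from collections import deque

      class ProcessingElement:
          """Hypothetical model of one processing element with two input queues."""
          def __init__(self):
              self.queues = (deque(), deque())

          def push(self, port, value):
              self.queues[port].append(value)

          def select(self):
              """Compare the queue heads, output the smaller, and drop its entry."""
              a, b = self.queues
              if not a and not b:
                  return None
              if not a:
                  return b.popleft()
              if not b:
                  return a.popleft()
              return a.popleft() if a[0] <= b[0] else b.popleft()

      # Merge two sorted input streams the way a single element would.
      pe = ProcessingElement()
      for v in (1, 4, 7):
          pe.push(0, v)
      for v in (2, 3, 9):
          pe.push(1, v)
      print([pe.select() for _ in range(6)])   # [1, 2, 3, 4, 7, 9]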
  • Publication number: 20180052599
    Abstract: A data processing system includes a processor core having a store-in lower level cache, a memory controller, a memory-mapped device, and an interconnect fabric communicatively coupling the lower level cache and the memory-mapped device. In response to a first instruction in the processor core, a copy-type request specifying a source real address is transmitted to the lower level cache. In response to a second instruction in the processor core, a paste-type request specifying a destination real address associated with the memory-mapped device is transmitted to the lower level cache. In response to receipt of the copy-type request, the lower level cache copies a data granule from a storage location specified by the source real address into a non-architected buffer. In response to receipt of the paste-type request, the lower level cache issues on the interconnect fabric a command that writes the data granule from the non-architected buffer to the memory-mapped device.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Inventors: Lakshminarayana B. Arimilli, Guy L. Guthrie, William J. Starke, Jeffrey A. Stuecheli, Derek E. Williams
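    A rough behavioral sketch of the copy/paste flow described above; the granule size, class, and method names are assumptions rather than anything architected. A copy-type request captures a data granule from the source address into a non-architected buffer, and a paste-type request writes that buffer to the destination, with a dictionary standing in for the memory-mapped device.
      GRANULE = 128  # assumed granule size in bytes

      class LowerLevelCacheModel:
          """Toy model: 'memory' is a bytearray, 'device' is keyed by address."""
          def __init__(self, memory):
              self.memory = memory
              self.buffer = None          # non-architected copy buffer
              self.device = {}            # stand-in for a memory-mapped device

          def copy(self, src_addr):
              self.buffer = bytes(self.memory[src_addr:src_addr + GRANULE])

          def paste(self, dst_addr):
              if self.buffer is None:
                  raise RuntimeError("paste without a preceding copy")
              self.device[dst_addr] = self.buffer
              self.buffer = None

      mem = bytearray(range(256))
      cache = LowerLevelCacheModel(mem)
      cache.copy(0x10)        # capture the granule at the source real address
      cache.paste(0x1000)     # write it to the memory-mapped device address
      print(cache.device[0x1000][:4])   # b'\x10\x11\x12\x13'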
  • Publication number: 20180052608
    Abstract: A processor core of a data processing system, in response to a first instruction, generates a copy-type request specifying a source real address and transmits it to a lower level cache. In response to a second instruction, the processor core generates a paste-type request specifying a destination real address associated with a memory-mapped device and transmits it to the lower level cache. In response to receipt of the copy-type request, the lower level cache copies a data granule from a storage location specified by the source real address into a non-architected buffer. In response to receipt of the paste-type request, the lower level cache issues a command to write the data granule from the non-architected buffer to the memory-mapped device. In response to receipt from the memory-mapped device of a busy response, the processor core abandons the memory move instruction sequence and performs alternative processing.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Inventors: Lakshminarayana B. Arimilli, Guy L. Guthrie, William J. Starke, Jeffrey A. Stuecheli, Derek E. Williams
  • Patent number: 9891912
    Abstract: An array processor includes a managing element having a load streaming unit coupled to multiple processing elements. The load streaming unit provides input data portions to each of a first subset of processing elements and receives output data from each of a second subset of the processing elements based on a comparatively sorted combination of the input data portions. Each processing element is configurable by the managing element to compare input data portions received from the load streaming unit or two or more of the other processing elements. Each processing element can further select an input data portion to be output data based on the comparison, and in response to selecting the input data portion, remove a queue entry corresponding to the selected input data portion. Each processing element can provide the selected output data portion to the managing element or as an input to one of the processing elements.
    Type: Grant
    Filed: October 31, 2014
    Date of Patent: February 13, 2018
    Assignee: International Business Machines Corporation
    Inventors: Ganesh Balakrishnan, Bartholomew Blaner, John J. Reilly, Jeffrey A. Stuecheli
  • Patent number: 9892066
    Abstract: A memory system comprises a memory device coupled to a memory controller, the memory controller receiving one or more memory requests from one or more core devices via an interconnect bus. The memory controller tracks utilization of the interconnect bus by tracking, during a time window, the memory requests whose fetched data from the memory device is waiting to be scheduled for return on the interconnect bus. Responsive to detecting that utilization of the interconnect bus during the time window reaches a memory utilization threshold, the memory controller dynamically selects a reduced read data size for the fetched data to be returned with at least one read request from among those memory requests. The reduced read data size is chosen from at least two read data size options for that read request: a maximum read data size and the reduced read data size, which is less than the maximum.
    Type: Grant
    Filed: October 31, 2016
    Date of Patent: February 13, 2018
    Assignee: International Business Machines Corporation
    Inventors: John S. Dodson, Didier R. Louis, Eric E. Retter, Jeffrey A. Stuecheli
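    A hedged sketch of the size-selection policy in this abstract; the sizes and threshold below are assumed values, not figures from the patent. Once measured bus utilization in the time window reaches the threshold, the controller returns the reduced read data size instead of the maximum.
      MAX_READ_BYTES = 128       # assumed maximum read data size
      REDUCED_READ_BYTES = 64    # assumed reduced read data size
      UTILIZATION_THRESHOLD = 0.75

      def read_data_size(busy_cycles, window_cycles):
          """Pick the read data size for the next read based on bus utilization."""
          utilization = busy_cycles / window_cycles
          return REDUCED_READ_BYTES if utilization >= UTILIZATION_THRESHOLD else MAX_READ_BYTES

      print(read_data_size(60, 100))   # 128: below threshold, full-size reads
      print(read_data_size(80, 100))   # 64: bus congested, return less data per read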
  • Patent number: 9824014
    Abstract: In at least some embodiments, a processor core generates a store operation by executing a store instruction in an instruction sequence. The store operation is marked as a high priority store operation in response to detecting, in the instruction sequence, a barrier instruction that precedes the store instruction in program order and that includes a field set to indicate the store operation should be accorded high priority and is not so marked otherwise. The store operation is buffered in a store queue associated with a cache memory of the processor core. Handling of the store operation in the store queue is expedited in response to the store operation being marked as a high priority store operation and not expedited otherwise.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: November 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli, Derek E. Williams
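    A simplified software analogy, not a hardware description, of the expedited handling: stores flagged high priority by the preceding barrier's hint drain from the store queue ahead of ordinary stores.
      from collections import deque

      class StoreQueueModel:
          """Toy store queue that expedites entries marked high priority."""
          def __init__(self):
              self.entries = deque()

          def enqueue(self, address, data, high_priority=False):
              self.entries.append((address, data, high_priority))

          def dispatch(self):
              """Pick the oldest high-priority store if any, else the oldest store."""
              for i, entry in enumerate(self.entries):
                  if entry[2]:
                      del self.entries[i]
                      return entry
              return self.entries.popleft() if self.entries else None

      sq = StoreQueueModel()
      sq.enqueue(0x100, b"a")
      sq.enqueue(0x200, b"b", high_priority=True)   # marked via the barrier hint
      print(sq.dispatch())   # (512, b'b', True): the marked store drains first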
  • Patent number: 9811466
    Abstract: In at least some embodiments, a processor core generates a store operation by executing a store instruction in an instruction sequence. The store operation is marked as a high priority store operation in response to detecting, in the instruction sequence, a barrier instruction that precedes the store instruction in program order and that includes a field set to indicate the store operation should be accorded high priority and is not so marked otherwise. The store operation is buffered in a store queue associated with a cache memory of the processor core. Handling of the store operation in the store queue is expedited in response to the store operation being marked as a high priority store operation and not expedited otherwise.
    Type: Grant
    Filed: August 28, 2015
    Date of Patent: November 7, 2017
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli, Derek E. Williams
  • Publication number: 20170293558
    Abstract: A multiprocessor data processing system includes multiple vertical cache hierarchies supporting a plurality of processor cores, a system memory, and a system interconnect. In response to a load-and-reserve request from a first processor core, a first cache memory supporting the first processor core issues on the system interconnect a memory access request for a target cache line of the load-and-reserve request. Responsive to the memory access request and prior to receiving a systemwide coherence response for the memory access request, the first cache memory receives from a second cache memory in a second vertical cache hierarchy by cache-to-cache intervention the target cache line and an early indication of the systemwide coherence response for the memory access request. In response to the early indication and prior to receiving the systemwide coherence response, the first cache memory initiates processing to update the target cache line in the first cache memory.
    Type: Application
    Filed: April 11, 2016
    Publication date: October 12, 2017
    Inventors: Guy L. Guthrie, Jonathan R. Jackson, William J. Starke, Jeffrey A. Stuecheli, Derek E. Williams
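    A rough sketch, with assumed method names, of the timing optimization: the requesting cache begins updating the line as soon as the intervening cache delivers the data with an early "good" indication, and only later reconciles with the systemwide coherence response.
      class RequestingCacheModel:
          """Toy model: start processing on the early indication, confirm later."""
          def __init__(self):
              self.line = None
              self.confirmed = False

          def receive_intervention(self, data, early_good):
              if early_good:
                  self.line = data       # begin installing/updating the line now

          def receive_systemwide_response(self, good):
              if good:
                  self.confirmed = True
              else:
                  self.line = None       # simplified stand-in for the retry path

      cache = RequestingCacheModel()
      cache.receive_intervention(b"target cache line", early_good=True)
      print(cache.line is not None)      # True: work began before the final response
      cache.receive_systemwide_response(good=True)
      print(cache.confirmed)             # True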
  • Publication number: 20170293557
    Abstract: A multiprocessor data processing system includes multiple vertical cache hierarchies supporting a plurality of processor cores, a system memory, and a system interconnect coupled to the system memory and the multiple vertical cache hierarchies. A first cache memory in a first vertical cache hierarchy issues on the system interconnect a request for a target cache line. Responsive to the request and prior to receiving a systemwide coherence response for the request, the first cache memory receives from a second cache memory in a second vertical cache hierarchy by cache-to-cache intervention the target cache line and an early indication of the systemwide coherence response for the request. In response to the early indication of the systemwide coherence response and prior to receiving the systemwide coherence response, the first cache memory initiates processing to install the target cache line in the first cache memory.
    Type: Application
    Filed: April 11, 2016
    Publication date: October 12, 2017
    Inventors: Guy L. Guthrie, Jonathan R. Jackson, Michael S. Siegel, William J. Starke, Jeffrey A. Stuecheli, Derek E. Williams
  • Patent number: 9778933
    Abstract: In at least some embodiments, a processor core executes a sending thread including a first push instruction and a second push instruction subsequent to the first push instruction in a program order. Each of the first and second push instructions requests that a respective message payload be pushed to a mailbox of a receiving thread. In response to executing the first and second push instructions, the processor core transmits respective first and second co-processor requests to a switch in the data processing system via an interconnect fabric of the data processing system. The processor core transmits the second co-processor request to the switch without regard to acceptance of the first co-processor request by the switch.
    Type: Grant
    Filed: June 8, 2015
    Date of Patent: October 3, 2017
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Guy L. Guthrie, John D. Irish, William J. Starke, Jeffrey A. Stuecheli
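    A hedged software analogy of the fire-and-forget behavior: the sending thread issues the second push without waiting to learn whether the first was accepted; a bounded queue stands in for the switch's request buffer.
      from queue import Queue, Full

      switch = Queue(maxsize=2)     # stand-in for the switch's request buffer

      def push_payload(payload):
          """Send a co-processor request; do not block on acceptance."""
          try:
              switch.put_nowait(payload)
              return "sent"
          except Full:
              return "not accepted"  # acceptance is not guaranteed

      # The sending thread issues both pushes back to back.
      print(push_payload(b"message 1"))   # sent
      print(push_payload(b"message 2"))   # sent, without waiting on the first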
  • Patent number: 9766890
    Abstract: In at least some embodiments, a processor core executes a sending thread including a first push instruction and a second push instruction subsequent to the first push instruction in a program order. Each of the first and second push instructions requests that a respective message payload be pushed to a mailbox of a receiving thread. In response to executing the first and second push instructions, the processor core transmits respective first and second co-processor requests to a switch in the data processing system via an interconnect fabric of the data processing system. The processor core transmits the second co-processor request to the switch without regard to acceptance of the first co-processor request by the switch.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: September 19, 2017
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Guy L. Guthrie, John D. Irish, William J. Starke, Jeffrey A. Stuecheli
  • Patent number: 9753862
    Abstract: A data processing system includes an upper level cache memory and a lower level cache memory employing different replacement policies. The lower level cache memory provides a respective one of a plurality of counters for each of a plurality of cache lines in a particular congruence class. The lower level cache memory initializes a counter value for a cache line in the particular congruence class that was castout from the upper level cache memory based on an indication of whether the cache line was accessed in the upper level cache memory following installation in the upper level cache memory. The lower level cache memory selects a victim cache line from among the plurality of cache lines in the particular congruence class for eviction from the lower level cache memory by reference to counter values of the plurality of counters.
    Type: Grant
    Filed: October 25, 2016
    Date of Patent: September 5, 2017
    Assignee: International Business Machines Corporation
    Inventors: Bernard C. Drerup, Guy L. Guthrie, Jeffrey A. Stuecheli, Phillip G. Williams
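    A simplified sketch of the replacement idea, with assumed initial counter values: a castout line that was re-accessed in the upper-level cache starts with a higher counter than one that was never touched, and the entry with the lowest counter is chosen as victim.
      ACCESSED_INIT = 3      # assumed start value for lines re-used in the upper cache
      UNTOUCHED_INIT = 0     # assumed start value for lines never re-used there

      class CongruenceClassModel:
          """Toy lower-level congruence class mapping address -> counter."""
          def __init__(self):
              self.counters = {}

          def install_castout(self, address, accessed_in_upper):
              self.counters[address] = ACCESSED_INIT if accessed_in_upper else UNTOUCHED_INIT

          def choose_victim(self):
              return min(self.counters, key=self.counters.get)

      cc = CongruenceClassModel()
      cc.install_castout(0xA000, accessed_in_upper=True)
      cc.install_castout(0xB000, accessed_in_upper=False)
      print(hex(cc.choose_victim()))   # 0xb000: the never-reused line is evicted first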
  • Patent number: 9740629
    Abstract: According to embodiments of the present disclosure, a method for invalidating an address translation entry in an effective address to real address translation table (ERAT) for a computer memory can include receiving a first invalidation request. According to some embodiments, the method may also include determining that a first entry in the ERAT corresponds with the first invalidation request, wherein the ERAT has a plurality of entries, each entry in the plurality of entries having an indicator. In particular embodiments, the method may then determine that a first indicator associated with the first entry indicates that the first entry is not being used by any of a plurality of memory access entities (MAE), wherein a first MAE can concurrently use a same entry as a second MAE. The first entry may then be invalidated in response to determining that the first entry is not being used.
    Type: Grant
    Filed: December 19, 2014
    Date of Patent: August 22, 2017
    Assignee: International Business Machines Corporation
    Inventors: Bartholomew Blaner, Jay G. Heaslip, Kenneth A. Lauricella, Jeffrey A. Stuecheli
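    A rough sketch, with assumed field names, of the gating check: an ERAT entry is invalidated only once its per-entry indicator shows that no memory access entity is still using it; otherwise the invalidation is deferred.
      class EratEntryModel:
          """Toy ERAT entry with a per-entry in-use count as the indicator."""
          def __init__(self, effective, real):
              self.effective = effective
              self.real = real
              self.users = 0            # how many MAEs currently use this entry
              self.valid = True

      erat = {0x4000: EratEntryModel(0x4000, 0x9000)}

      def invalidate(effective):
          entry = erat.get(effective)
          if entry is None:
              return "no matching entry"
          if entry.users == 0:          # indicator shows no MAE is using it
              entry.valid = False
              return "invalidated"
          return "deferred"             # still in use; handled later (simplified)

      print(invalidate(0x4000))         # invalidated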
  • Patent number: 9727483
    Abstract: According to embodiments of the present disclosure, a method for invalidating an address translation entry in an effective address to real address translation table (ERAT) for a computer memory can include receiving a first invalidation request. According to some embodiments, the method may also include determining that a first entry in the ERAT corresponds with the first invalidation request, wherein the ERAT has a plurality of entries, each entry in the plurality of entries having an indicator. In particular embodiments, the method may then determine that a first indicator associated with the first entry indicates that the first entry is not being used by any of a plurality of memory access entities (MAE), wherein a first MAE can concurrently use a same entry as a second MAE. The first entry may then be invalidated in response to determining that the first entry is not being used.
    Type: Grant
    Filed: June 1, 2015
    Date of Patent: August 8, 2017
    Assignee: International Business Machines Corporation
    Inventors: Bartholomew Blaner, Jay G. Heaslip, Kenneth A. Lauricella, Jeffrey A. Stuecheli
  • Patent number: 9727488
    Abstract: A set-associative cache memory includes a plurality of congruence classes each including multiple entries for storing cache lines of data. A respective one of a plurality of counters is maintained for each cache line stored in the multiple entries. In response to a memory access request, the cache memory selects a victim cache line stored in a particular entry of a particular congruence class for eviction from the cache memory by reference to at least a counter value of the victim cache line. The cache memory also receives a new cache line of data for insertion into the particular entry and an indication of a coherence state of the new cache line at a data source from which the cache memory received the new cache line. The cache memory installs the new cache line in the particular entry and sets an initial counter value of the counter for the new cache line based on the received indication of the coherence state at the data source.
    Type: Grant
    Filed: October 7, 2016
    Date of Patent: August 8, 2017
    Assignee: International Business Machines Corporation
    Inventors: Bernard C. Drerup, Guy L. Guthrie, William J. Starke, Jeffrey A. Stuecheli
  • Patent number: 9727489
    Abstract: A set-associative cache memory includes a plurality of congruence classes each including multiple entries for storing cache lines of data. A respective one of a plurality of counters is maintained for each cache line stored in the multiple entries. In response to a memory access request, the cache memory selects a victim cache line stored in a particular entry of a particular congruence class for eviction from the cache memory by reference to at least a counter value of the victim cache line. The cache memory also receives a new cache line of data for insertion into the particular entry and an indication of a distance from the cache memory to a data source from which the cache memory received the new cache line. The cache memory installs the new cache line in the particular entry and sets an initial counter value of the counter for the new cache line based on the received indication of the distance.
    Type: Grant
    Filed: October 7, 2016
    Date of Patent: August 8, 2017
    Assignee: International Business Machines Corporation
    Inventors: Bernard C. Drerup, Guy L. Guthrie, William J. Starke, Jeffrey A. Stuecheli
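    A brief sketch of the distance-based initialization, using an assumed mapping from data-source distance to starting counter value; the coherence-state variant in the preceding patent would use an analogous table keyed by the source's coherence state.
      # Assumed mapping: farther sources are costlier to refetch, so lines they
      # supply start with higher counters and survive longer before eviction.
      DISTANCE_TO_INITIAL_COUNTER = {
          "local_cache": 0,
          "on_chip_cache": 1,
          "off_chip_cache": 2,
          "memory": 3,
      }

      def initial_counter(distance_class):
          """Set the new line's starting counter from the intervention distance."""
          return DISTANCE_TO_INITIAL_COUNTER.get(distance_class, 3)

      print(initial_counter("on_chip_cache"))   # 1
      print(initial_counter("memory"))          # 3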
  • Patent number: 9684461
    Abstract: A memory system comprises memory devices coupled to a memory controller via a memory interface bus, the memory controller receiving one or more memory requests via an interconnect. The memory controller tracks utilization of the memory interface bus for reads from the memory devices for a selection of the one or more memory requests during a time window. Responsive to detecting that utilization of the memory interface bus for reads during the time window reaches a memory utilization threshold, the memory controller dynamically selects a reduced read data size for the data to be accessed from the memory devices by at least one read operation. The reduced read data size is chosen from at least two read data size options for that read operation: a maximum read data size and the reduced read data size, which is less than the maximum.
    Type: Grant
    Filed: October 31, 2016
    Date of Patent: June 20, 2017
    Assignee: International Business Machines Corporation
    Inventors: John S. Dodson, Stephen J. Powell, Eric E. Retter, Jeffrey A. Stuecheli
  • Patent number: 9665346
    Abstract: Mechanisms are provided for performing a floating point arithmetic operation in a data processing system. A plurality of floating point operands of the floating point arithmetic operation are received and bits in a mantissa of at least one floating point operand of the plurality of floating point operands are shifted. One or more bits of the mantissa that are shifted outside a range of bits of the mantissa of at least one floating point operand are stored and a vector value is generated based on the stored one or more bits of the mantissa that are shifted outside of the range of bits of the mantissa of the at least one floating point operand. A resultant value is generated for the floating point arithmetic operation based on the vector value and the plurality of floating point operands.
    Type: Grant
    Filed: October 28, 2014
    Date of Patent: May 30, 2017
    Assignee: International Business Machines Corporation
    Inventors: John B. Carter, Bruce G. Mealey, Karthick Rajamani, Eric E. Retter, Jeffrey A. Stuecheli
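    A heavily simplified sketch of the underlying idea: the bits shifted out of a mantissa during alignment are captured rather than discarded, then folded back into the rounding of the result; the operand widths and the rounding rule here are assumptions.
      def shift_right_capture(mantissa, shift):
          """Shift a mantissa right, returning the kept bits and the bits shifted out."""
          if shift <= 0:
              return mantissa, 0
          shifted_out = mantissa & ((1 << shift) - 1)   # bits that fall off the end
          return mantissa >> shift, shifted_out

      def add_with_capture(m_big, m_small, shift):
          kept, lost = shift_right_capture(m_small, shift)
          total = m_big + kept
          # Simplified rounding: round up when the captured bits are at least half a ULP.
          if shift > 0 and lost >= (1 << (shift - 1)):
              total += 1
          return total

      # Align the smaller operand by 3 bits and keep the shifted-out bits for rounding.
      print(add_with_capture(0b1000000, 0b0000111, 3))   # 65: rounded up using the lost bits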
  • Patent number: 9652399
    Abstract: In at least some embodiments, a processor core generates one or more store operations by executing one or more store instructions in an instruction sequence. The one or more store operations are marked as high priority store operations in response to detecting, in the instruction sequence, a window opening instruction and a window closing instruction bounding the one or more store instructions and are not so marked otherwise. The one or more store operations are buffered in a store queue associated with a cache memory of the processor core. Handling of the one or more store operations in the store queue is expedited in response to the one or more store operations being marked as high priority store operations and not expedited otherwise.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: May 16, 2017
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli, Derek E. Williams
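    A brief sketch, with assumed instruction-like method names, of the window marking: stores issued between a window-open and window-close pair are flagged high priority, feeding the same kind of expedited store-queue handling sketched earlier in this listing.
      class CoreModel:
          """Toy model: stores between open_window()/close_window() are high priority."""
          def __init__(self):
              self.window_open = False
              self.store_queue = []

          def open_window(self):
              self.window_open = True

          def close_window(self):
              self.window_open = False

          def store(self, address, data):
              self.store_queue.append((address, data, self.window_open))

      core = CoreModel()
      core.store(0x100, b"a")          # ordinary store
      core.open_window()
      core.store(0x200, b"b")          # marked high priority
      core.close_window()
      print(core.store_queue)          # [(256, b'a', False), (512, b'b', True)]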
  • Patent number: 9645937
    Abstract: In at least some embodiments, a processor core generates one or more store operations by executing one or more store instructions in an instruction sequence. The one or more store operations are marked as high priority store operations in response to detecting, in the instruction sequence, a window opening instruction and a window closing instruction bounding the one or more store instructions and are not so marked otherwise. The one or more store operations are buffered in a store queue associated with a cache memory of the processor core. Handling of the one or more store operations in the store queue is expedited in response to the one or more store operations being marked as high priority store operations and not expedited otherwise.
    Type: Grant
    Filed: August 28, 2015
    Date of Patent: May 9, 2017
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli, Derek E. Williams