Patents by Inventor Jeffrey A. Stuecheli

Jeffrey A. Stuecheli has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9355035
    Abstract: A set associative cache is managed by a memory controller which places writeback instructions for modified (dirty) cache lines into a virtual write queue, determines when the number of the sets containing a modified cache line is greater than a high water mark, and elevates a priority of the writeback instructions over read operations. The controller can return the priority to normal when the number of modified sets is less than a low water mark. In an embodiment wherein the system memory device includes rank groups, the congruence classes can be mapped based on the rank groups. The number of writes pending in a rank group exceeding a different threshold can additionally be a requirement to trigger elevation of writeback priority. A dirty vector can be used to provide an indication that corresponding sets contain a modified cache line, particularly in least-recently used segments of the corresponding sets.
    Type: Grant
    Filed: November 18, 2013
    Date of Patent: May 31, 2016
    Assignee: GLOBALFOUNDRIES Inc.
    Inventors: Benjiman L. Goodman, Jody B. Joyner, Stephen J. Powell, William J. Starke, Jeffrey A. Stuecheli
  • Publication number: 20160124900
    Abstract: A method for sorting data in an array processor. Each of a first tier of processing elements in the array processor receives data inputs from a load streaming unit. Each of the first tier processing elements compares input data portions received from the load streaming unit, wherein the input data portions are stored for processing in respective queues. The first tier processing elements select one of the input data portions to be an output data portion based on the comparison, and in response to the selection, remove a corresponding queue entry and request next input data from the load streaming unit. Each of the first tier processing elements further provides the output data portion as an input data portion to a second tier processing element that generates output data based on a comparison of output data received from at least two first tier processing elements.
    Type: Application
    Filed: June 3, 2015
    Publication date: May 5, 2016
    Inventors: Ganesh Balakrishnan, Bartholomew Blaner, John J. Reilly, Jeffrey A. Stuecheli
  • Publication number: 20160124755
    Abstract: An array processor includes a managing element having a load streaming unit coupled to multiple processing elements. The load streaming unit provides input data portions to each of a first subset of processing elements and receives output data from each of a second subset of the processing elements based on a comparatively sorted combination of the input data portions. Each processing element is configurable by the managing element to compare input data portions received from the load streaming unit or two or more of the other processing elements. Each processing unit can further select an input data portion to be output data based on the comparison, and in response to selecting the input data portion, remove a queue entry corresponding to the selected input data portion. Each processing element can provide the selected output data portion to the managing element or as an input to one of the processing elements.
    Type: Application
    Filed: October 31, 2014
    Publication date: May 5, 2016
    Inventors: Ganesh Balakrishnan, Bartholomew Blaner, John J. Reilly, Jeffrey A. Stuecheli
  • Publication number: 20160085721
    Abstract: Various implementations of a method, system, and computer program product for pattern matching using a reconfigurable array processor are disclosed. In one embodiment, a processor array manager of the reconfigurable array processor receives an input data stream for pattern matching and generates a tokenized input data stream from the input data stream. A different portion of the tokenized input data stream is provided to each of a plurality of processing elements of the reconfigurable array processor. Each processing element can compare the received portion of the tokenized input data stream against one or more reference patterns to generate an intermediate result that indicates whether the portion of the tokenized input data stream matches a reference pattern. The processor array manager can combine the intermediate results received from each processing element to yield a final result that indicates whether the input data stream includes a reference pattern.
    Type: Application
    Filed: January 21, 2015
    Publication date: March 24, 2016
    Inventors: Bulent Abali, Ganesh Balakrishnan, Bartholomew Blaner, Peter A. Sandon, Jeffrey A. Stuecheli
  • Publication number: 20160085720
    Abstract: Various implementations of a method, system, and computer program product for pattern matching using a reconfigurable array processor are disclosed. In one embodiment, a processor array manager of the reconfigurable array processor receives an input data stream for pattern matching and generates a tokenized input data stream from the input data stream. A different portion of the tokenized input data stream is provided to each of a plurality of processing elements of the reconfigurable array processor. Each processing element can compare the received portion of the tokenized input data stream against one or more reference patterns to generate an intermediate result that indicates whether the portion of the tokenized input data stream matches a reference pattern. The processor array manager can combine the intermediate results received from each processing element to yield a final result that indicates whether the input data stream includes a reference pattern.
    Type: Application
    Filed: September 22, 2014
    Publication date: March 24, 2016
    Inventors: Bulent Abali, Ganesh Balakrishnan, Bartholomew Blaner, Peter A. Sandon, Jeffrey A. Stuecheli
  • Patent number: 9286220
    Abstract: In response to snooping a read-type memory access request of a requestor on a system fabric of a data processing system, a memory channel interface forwards the request to a memory buffer and starts a timer. In response to the forwarded request, the memory buffer performs a lookup of a target address of the request in a memory controller cache. In response to the target address hitting in a coherence state permitting provision of early data, the memory buffer provides a response indicating early data and provides a copy of a target memory block of the request to the memory channel interface. The memory channel interface, responsive to receipt prior to expiration of the timer of the response indicating early data, transmits the copy of the target memory block to the requestor via the system fabric prior to receiving a combined response of the data processing system to the request.
    Type: Grant
    Filed: September 25, 2013
    Date of Patent: March 15, 2016
    Assignee: International Business Machines Corporation
    Inventors: John T. Hollaway, Jr., Charles F. Marino, Eric E. Retter, Jeffrey A. Stuecheli
  • Patent number: 9256537
    Abstract: A coherent attached processor proxy (CAPP) of a primary coherent system receives a memory access request specifying a target address in the primary coherent system from an attached processor (AP) external to the primary coherent system. The CAPP includes a CAPP directory of contents of a cache memory in the AP that holds copies of memory blocks belonging to a coherent address space of the primary coherent system. In response to the memory access request, the CAPP performs a first determination of a coherence state for the target address and allocates a master machine to service the memory access request in accordance with the first determination. Thereafter, during allocation of the master machine, the CAPP updates the coherence state and performs a second determination of the coherence state. The master machine services the memory access request in accordance with the second determination.
    Type: Grant
    Filed: February 14, 2013
    Date of Patent: February 9, 2016
    Assignee: International Business Machines Corporation
    Inventors: Bartholomew Blaner, David W. Cummings, Michael S. Siegel, Jeffrey A. Stuecheli
  • Publication number: 20160034400
    Abstract: A technique for data prefetching for a multi-core chip includes determining memory utilization of the multi-core chip. In response to the memory utilization of the multi-core chip exceeding a first level, data prefetching for the multi-core chip is modified from a first data prefetching arrangement to a second data prefetching arrangement to minimize unused prefetched cache lines. In response to the memory utilization of the multi-core chip not exceeding the first level, the first data prefetching arrangement is maintained. The first and second data prefetching arrangements are different.
    Type: Application
    Filed: July 29, 2014
    Publication date: February 4, 2016
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: JASON NATHANIEL DALE, MILES R. DOOLEY, RICHARD J. EICKEMEYER, JR., JOHN BARRY GRISWELL, JR., FRANCIS PATRICK O'CONNELL, JEFFREY A. STUECHELI
  • Patent number: 9231618
    Abstract: A bypass mechanism allows a memory controller to transmit requested data to an interconnect before the data's error code has been decoded, e.g., a cyclical redundancy check (CRC). The tag, tag CRC, data, and data CRC are pipelined from DRAM in four frames, each having multiple clock cycles. The tag includes a bypass bit indicating whether data transmission to the interconnect should begin before CRC decoding. After receiving the tag CRC, the controller decodes it and reserves a request machine which sends a transmit request signal to inform the interconnect that data is available. Once the transmit request is granted by the interconnect, the controller can immediately start sending the data, before decoding the data CRC. So long as no error is found, the controller completes transmission of the data to the interconnect, including providing an indication that the data as transmitted is error-free.
    Type: Grant
    Filed: December 6, 2013
    Date of Patent: January 5, 2016
    Assignee: International Business Machines Corporation
    Inventors: Benjiman L. Goodman, Harrison M. McCreary, Stephen J. Powell, William J. Starke, Jeffrey A. Stuecheli
  • Patent number: 9218292
    Abstract: A technique for scheduling cache cleaning operations maintains a clean distance between a set of least-recently-used (LRU) clean lines and the LRU dirty (modified) line for each congruence class in the cache. The technique is generally employed at a victim cache at the highest-order level of the cache memory hierarchy, so that write-backs to system memory are scheduled to avoid having to generate a write-back in response to a cache miss in the next lower-order level of the cache memory hierarchy. The clean distance can be determined by counting all of the LRU clean lines in each congruence class that have a reference count that is less than or equal to the reference count of the LRU dirty line.
    Type: Grant
    Filed: June 18, 2013
    Date of Patent: December 22, 2015
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Benjiman L. Goodman, Jody B. Joyner, Stephen J. Powell, Aaron C. Sawdey, Jeffrey A. Stuecheli
  • Publication number: 20150363317
    Abstract: A technique for operating a memory system for a node includes interrogating, by a cache, an associated cache directory to determine whether a shared cache line to be installed in the cache is associated with an invalid global state in the cache. The invalid global state specifies that a version of the shared cache line has been intervened off-node. In response to the shared cache line being in the invalid global state the cache spawns a castout invalid global command for the shared cache line. The shared cache line is installed in the cache. A coherence state for the shared cache line is updated in the associated cache directory to indicate the shared cache line is shared.
    Type: Application
    Filed: June 17, 2014
    Publication date: December 17, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: GUY L. GUTHRIE, HIEN MINH LE, JEFFREY A. STUECHELI, PHILLIP G. WILLIAMS
  • Publication number: 20150363316
    Abstract: A technique for operating a memory system for a node includes interrogating, by a cache, an associated cache directory to determine whether a shared cache line to be installed in the cache is associated with an invalid global state in the cache. The invalid global state specifies that a version of the shared cache line has been intervened off-node. In response to the shared cache line being in the invalid global state the cache spawns a castout invalid global command for the shared cache line. The shared cache line is installed in the cache. A coherence state for the shared cache line is updated in the associated cache directory to indicate the shared cache line is shared.
    Type: Application
    Filed: June 10, 2015
    Publication date: December 17, 2015
    Inventors: GUY L. GUTHRIE, HIEN MINH LE, JEFFREY A. STUECHELI, PHILLIP G. WILLIAMS
  • Patent number: 9213647
    Abstract: A technique for scheduling cache cleaning operations maintains a clean distance between a set of least-recently-used (LRU) clean lines and the LRU dirty (modified) line for each congruence class in the cache. The technique is generally employed at a victim cache at the highest-order level of the cache memory hierarchy, so that write-backs to system memory are scheduled to avoid having to generate a write-back in response to a cache miss in the next lower-order level of the cache memory hierarchy. The clean distance can be determined by counting all of the LRU clean lines in each congruence class that have a reference count that is less than or equal to the reference count of the LRU dirty line.
    Type: Grant
    Filed: September 23, 2013
    Date of Patent: December 15, 2015
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Benjiman L. Goodman, Jody B. Joyner, Stephen J. Powell, Aaron C. Sawdey, Jeffrey A. Stuecheli
  • Patent number: 9208091
    Abstract: A coherent attached processor proxy (CAPP) includes transport logic having a first interface configured to support communication with a system fabric of a primary coherent system and a second interface configured to support communication with an attached processor (AP) that is external to the primary coherent system and that includes a cache memory that holds copies of memory blocks belonging to a coherent address space of the primary coherent system. The CAPP further includes one or more master machines that initiate memory access requests on the system fabric of the primary coherent system on behalf of the AP, one or more snoop machines that service requests snooped on the system fabric, and a CAPP directory having a precise directory having a plurality of entries each associated with a smaller data granule and a coarse directory having a plurality of entries each associated with a larger data granule.
    Type: Grant
    Filed: June 19, 2013
    Date of Patent: December 8, 2015
    Assignee: GLOBALFOUNDRIES INC.
    Inventors: Bartholomew Blaner, Michael S. Siegel, Jeffrey A. Stuecheli, Charles F. Marino
  • Patent number: 9208092
    Abstract: A coherent attached processor proxy (CAPP) includes transport logic having a first interface configured to support communication with a system fabric of a primary coherent system and a second interface configured to support communication with an attached processor (AP) that is external to the primary coherent system and that includes a cache memory that holds copies of memory blocks belonging to a coherent address space of the primary coherent system. The CAPP further includes one or more master machines that initiate memory access requests on the system fabric of the primary coherent system on behalf of the AP, one or more snoop machines that service requests snooped on the system fabric, and a CAPP directory having a precise directory having a plurality of entries each associated with a smaller data granule and a coarse directory having a plurality of entries each associated with a larger data granule.
    Type: Grant
    Filed: September 24, 2013
    Date of Patent: December 8, 2015
    Assignee: GLOBALFOUNDRIES INC.
    Inventors: Bartholomew Blaner, Michael S. Siegel, Jeffrey A. Stuecheli, Charles F. Marino
  • Publication number: 20150347334
    Abstract: A request to send a first message from a first component to a second component is received at an arbiter. The first component is located in a first time zone and the second component is located in a second time zone. The arbiter determines that the second component is located in the second time zone. It is determined that the second time zone can be communicated with via one or more communications channels in a first direction. It is determined whether bandwidth is available on the one or more communications channels in the first direction. If bandwidth is available on the one or more communications channels in the first direction, a data path between the first component and the one or more communications channels in the first direction is created and the request is granted. Otherwise, the grant of the request is delayed.
    Type: Application
    Filed: May 30, 2014
    Publication date: December 3, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert C. Dixon, Lonny J. Lambrecht, Charles F. Marino, Jeffrey A. Stuecheli
  • Publication number: 20150347343
    Abstract: A request to send a first message from a first component to a second component is received at an arbiter. The first component is located in a first time zone and the second component is located in a second time zone. The arbiter determines that the second component is located in the second time zone. It is determined that the second time zone can be communicated with via one or more communications channels in a first direction. It is determined whether bandwidth is available on the one or more communications channels in the first direction. If bandwidth is available on the one or more communications channels in the first direction, a data path between the first component and the one or more communications channels in the first direction is created and the request is granted. Otherwise, the grant of the request is delayed.
    Type: Application
    Filed: September 12, 2014
    Publication date: December 3, 2015
    Inventors: Robert C. Dixon, Lonny J. Lambrecht, Charles F. Marino, Jeffrey A. Stuecheli
  • Publication number: 20150347333
    Abstract: A request to send a message from a first component, located on a first processor, to a second component, located on a second processor, is received. It is determined that the second processor can be communicated with via a first bidirectional communication path. It is determined that bandwidth is available on the first bidirectional communication path. It is determined that bandwidth is available on a second bidirectional communication path. In response to a determination that bandwidth is available on the second bidirectional communication path, a data path is created between the first component and the second bidirectional communication path and the request to send the message to the second component is granted. In response to a determination that bandwidth is not available on the first bidirectional communication path or on the second bidirectional communication path, the grant of the request to send the message to the second component is delayed.
    Type: Application
    Filed: May 30, 2014
    Publication date: December 3, 2015
    Applicant: International Business Machines Corporation
    Inventors: Robert C. Dixon, Lonny J. Lambrecht, Charles F. Marino, Jeffrey A. Stuecheli
  • Publication number: 20150347340
    Abstract: A request to send a message from a first component, located on a first processor, to a second component, located on a second processor, is received. It is determined that the second processor can be communicated with via a first bidirectional communication path. It is determined that bandwidth is available on the first bidirectional communication path. It is determined that bandwidth is available on a second bidirectional communication path. In response to a determination that bandwidth is available on the second bidirectional communication path, a data path is created between the first component and the second bidirectional communication path and the request to send the message to the second component is granted. In response to a determination that bandwidth is not available on the first bidirectional communication path or on the second bidirectional communication path, the grant of the request to send the message to the second component is delayed.
    Type: Application
    Filed: September 12, 2014
    Publication date: December 3, 2015
    Inventors: Robert C. Dixon, Lonny J. Lambrecht, Charles F. Marino, Jeffrey A. Stuecheli
  • Patent number: 9189403
    Abstract: A data processing system includes first and second processing units and a system memory. The first processing unit has first upper and first lower level caches, and the second processing unit has second upper and lower level caches. In response to a data request, a victim cache line to be castout from the first lower level cache is selected, and the first lower level cache selects between performing a lateral castout (LCO) of the victim cache line to the second lower level cache and a castout of the victim cache line to the system memory based upon a confidence indicator associated with the victim cache line. In response to selecting an LCO, the first processing unit issues an LCO command on the interconnect fabric and removes the victim cache line from the first lower level cache, and the second lower level cache holds the victim cache line.
    Type: Grant
    Filed: December 30, 2009
    Date of Patent: November 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, William J. Starke, Jeffrey Stuecheli, Derek E. Williams, Thomas R. Puzak