Patents by Inventor Bernard C. Drerup
Bernard C. Drerup has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20180341592Abstract: Techniques are disclosed for identifying data streams in a processor that are likely to and not likely to benefit from data prefetching. A prefetcher receives at least a first request in a plurality of requests to pre-fetch data from a stream in a plurality of streams. The prefetcher assigns a confidence level to the first request based on an amount of confirmations observed in the stream. The request is in a confident state if the confidence level exceeds a specified value. The first request is in a non-confident state if the confidence level does not exceed the specified value. Requests to prefetch data in the plurality of requests that are associated with respective streams with a low prefetch utilization are deprioritized. Doing so allows a memory controller to determine whether to drop the at least the first request based on the confidence level, prefetch utilization, and memory resource utilization.Type: ApplicationFiled: November 13, 2017Publication date: November 29, 2018Inventors: Bernard C. Drerup, Richard J. Eickemeyer, Guy L. Guthrie, Mohit Karve, George W. Rohrbaugh, III, Brian W. Thompto
-
Publication number: 20180341591Abstract: Techniques are disclosed for identifying data streams in a processor that are likely to and not likely to benefit from data prefetching. A prefetcher receives at least a first request in a plurality of requests to pre-fetch data from a stream in a plurality of streams. The prefetcher assigns a confidence level to the first request based on an amount of confirmations observed in the stream. The request is in a confident state if the confidence level exceeds a specified value. The first request is in a non-confident state if the confidence level does not exceed the specified value. Requests to prefetch data in the plurality of requests that are associated with respective streams with a low prefetch utilization are deprioritized. Doing so allows a memory controller to determine whether to drop the at least the first request based on the confidence level, prefetch utilization, and memory resource utilization.Type: ApplicationFiled: May 26, 2017Publication date: November 29, 2018Inventors: Bernard C. Drerup, Richard J. Eickemeyer, Guy L. Guthrie, Mohit Karve, George W. Rohrbaugh, III, Brian W. Thompto
-
Publication number: 20180101478Abstract: In one embodiment, a set-associative cache memory has a plurality of congruence classes each including multiple entries for storing cache lines of data. The cache memory includes a bank of counters, which includes a respective one of a plurality of counters for each cache line stored in the plurality of congruence classes. The cache memory selects victim cache lines for eviction from the cache memory by reference to counter values of counters within the bank of counters. A dynamic distribution of counter values of counters within the bank of counters is determined. In response, an amount counter values of counters within the bank of counters are adjusted on a cache miss is adjusted based on the dynamic distribution of the counter values.Type: ApplicationFiled: October 7, 2016Publication date: April 12, 2018Inventors: BERNARD C. DRERUP, RAM RAGHAVAN, SAHIL SABHARWAL, JEFFREY A. STUECHELI
-
Publication number: 20180101476Abstract: A set-associative cache memory includes a bank of counters including a respective one of a plurality of counters for each cache line stored in a plurality of congruence classes of the cache memory. Prior to receiving a memory access request that maps to a particular congruence class of the cache memory, the cache memory pre-selects a first victim cache line stored in a particular entry of a particular congruence class for eviction based on at least a counter value of the victim cache line. In response to receiving a memory access request that maps to the particular congruence class and that misses, the cache memory evicts the pre-selected first victim cache line from the particular entry, installs a new cache line in the particular entry, and pre-selects a second victim cache line from the particular congruence class based on at least a counter value of the second victim cache line.Type: ApplicationFiled: October 7, 2016Publication date: April 12, 2018Inventors: BERNARD C. DRERUP, RAM RAGHAVAN, SAHIL SABHARWAL, JEFFREY A. STUECHELI
-
Patent number: 9940239Abstract: A set-associative cache memory includes a bank of counters including a respective one of a plurality of counters for each cache line stored in a plurality of congruence classes of the cache memory. Prior to receiving a memory access request that maps to a particular congruence class of the cache memory, the cache memory pre-selects a first victim cache line stored in a particular entry of a particular congruence class for eviction based on at least a counter value of the victim cache line. In response to receiving a memory access request that maps to the particular congruence class and that misses, the cache memory evicts the pre-selected first victim cache line from the particular entry, installs a new cache line in the particular entry, and pre-selects a second victim cache line from the particular congruence class based on at least a counter value of the second victim cache line.Type: GrantFiled: October 7, 2016Date of Patent: April 10, 2018Assignee: International Business Machines CorporationInventors: Bernard C. Drerup, Ram Raghavan, Sahil Sabharwal, Jeffrey A. Stuecheli
-
Patent number: 9940246Abstract: In one embodiment, a set-associative cache memory has a plurality of congruence classes each including multiple entries for storing cache lines of data. The cache memory includes a bank of counters, which includes a respective one of a plurality of counters for each cache line stored in the plurality of congruence classes. The cache memory selects victim cache lines for eviction from the cache memory by reference to counter values of counters within the bank of counters. A dynamic distribution of counter values of counters within the bank of counters is determined. In response, an amount counter values of counters within the bank of counters are adjusted on a cache miss is adjusted based on the dynamic distribution of the counter values.Type: GrantFiled: October 7, 2016Date of Patent: April 10, 2018Assignee: International Business Machines CorporationInventors: Bernard C. Drerup, Ram Raghavan, Sahil Sabharwal, Jeffrey A. Stuecheli
-
Patent number: 9778933Abstract: In at least some embodiments, a processor core executes a sending thread including a first push instruction and a second push instruction subsequent to the first push instruction in a program order. Each of the first and second push instructions requests that a respective message payload be pushed to a mailbox of a receiving thread. In response to executing the first and second push instructions, the processor core transmits respective first and second co-processor requests to a switch in the data processing system via an interconnect fabric of the data processing system. The processor core transmits the second co-processor request to the switch without regard to acceptance of the first co-processor request by the switch.Type: GrantFiled: June 8, 2015Date of Patent: October 3, 2017Assignee: International Business Machines CorporationInventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Guy L. Guthrie, John D. Irish, William J. Starke, Jeffrey A. Stuecheli
-
Patent number: 9766890Abstract: In at least some embodiments, a processor core executes a sending thread including a first push instruction and a second push instruction subsequent to the first push instruction in a program order. Each of the first and second push instructions requests that a respective message payload be pushed to a mailbox of a receiving thread. In response to executing the first and second push instructions, the processor core transmits respective first and second co-processor requests to a switch in the data processing system via an interconnect fabric of the data processing system. The processor core transmits the second co-processor request to the switch without regard to acceptance of the first co-processor request by the switch.Type: GrantFiled: December 23, 2014Date of Patent: September 19, 2017Assignee: International Business Machines CorporationInventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Guy L. Guthrie, John D. Irish, William J. Starke, Jeffrey A. Stuecheli
-
Patent number: 9753862Abstract: A data processing system includes an upper level cache memory and a lower level cache memory employing different replacement policies. The lower level cache memory provides a respective one of a plurality of counters for each of a plurality of cache lines in a particular congruence class. The lower level cache memory initializes a counter value for a cache line in the particular congruence class that was castout from the upper level cache memory based on an indication of whether the cache line was accessed in the upper level cache memory following installation in the upper level cache memory. The lower level cache memory selects a victim cache line from among the plurality of cache lines in the particular congruence class for eviction from the lower level cache memory by reference to counter values of the plurality of counters.Type: GrantFiled: October 25, 2016Date of Patent: September 5, 2017Assignee: International Business Machines CorporationInventors: Bernard C. Drerup, Guy L. Guthrie, Jeffrey A. Stuecheli, Phillip G. Williams
-
Patent number: 9727489Abstract: A set-associative cache memory includes a plurality of congruence classes each including multiple entries for storing cache lines of data. A respective one of a plurality of counters is maintained for each cache line stored in the multiple entries. In response to a memory access request, the cache memory selects a victim cache line stored in a particular entry of a particular congruence class for eviction from the cache memory by reference to at least a counter value of the victim cache line. The cache memory also receives a new cache line of data for insertion into the particular entry and an indication of a distance from the cache memory to a data source from which the cache memory received the new cache line. The cache memory installs the new cache line in the particular entry and sets an initial counter value of the counter for the new cache line based on the received indication of the distance.Type: GrantFiled: October 7, 2016Date of Patent: August 8, 2017Assignee: International Business Machines CorporationInventors: Bernard C. Drerup, Guy L. Guthrie, William J. Starke, Jeffrey A. Stuecheli
-
Patent number: 9727488Abstract: A set-associative cache memory includes a plurality of congruence classes each including multiple entries for storing cache lines of data. A respective one of a plurality of counters is maintained for each cache line stored in the multiple entries. In response to a memory access request, the cache memory selects a victim cache line stored in a particular entry of a particular congruence class for eviction from the cache memory by reference to at least a counter value of the victim cache line. The cache memory also receives a new cache line of data for insertion into the particular entry and an indication of a coherence state of the new cache line at a data source from which the cache memory received the new cache line. The cache memory installs the new cache line in the particular entry and sets an initial counter value of the counter for the new cache line based on the received indication of the coherence state at the data source.Type: GrantFiled: October 7, 2016Date of Patent: August 8, 2017Assignee: International Business Machines CorporationInventors: Bernard C. Drerup, Guy L. Guthrie, William J. Starke, Jeffrey A. Stuecheli
-
Patent number: 9665297Abstract: A processor core is supported by an upper level cache and a lower level cache that receives, from an interconnect fabric, a write injection request requesting injection of a partial cache line of data into a target cache line identified by a target real address. In response to receipt of the write injection request, a determination is made that the upper level cache is a highest point of coherency for the target real address. In response to the determination, the upper level cache and lower level cache collaborate to transfer the target cache line from the upper level cache to the lower level cache. The lower level cache updates the target cache line by merging the partial cache of data into the target cache line and storing the updated target cache line in the lower level cache.Type: GrantFiled: October 25, 2016Date of Patent: May 30, 2017Assignee: International Business Machines CorporationInventors: Luis E. De La Torre, Bernard C. Drerup, Sanjeev Ghai, Guy L. Guthrie, Alexander M. Taft, Derek E. Williams
-
Patent number: 9575825Abstract: A processor core of a data processing system receives a push instruction of a sending thread that requests that a message payload identified by at least one operand of the push instruction be pushed to a mailbox of a receiving thread. In response to receiving the push instruction, the processor core executes the push instruction of the sending thread. In response to executing the push instruction, the processor core initiates transmission of the message payload to the mailbox of the receiving thread. In one embodiment, the processor core initiates transmission of the message payload by transmitting a co-processor request to a switch of the data processing system via an interconnect fabric.Type: GrantFiled: December 23, 2014Date of Patent: February 21, 2017Assignee: International Business Machines CorporationInventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Bradly G. Frey, Guy L. Guthrie, John D. Irish, William J. Starke, Jeffrey A. Stuecheli
-
Patent number: 9569293Abstract: A processor core of a data processing system receives a push instruction of a sending thread that requests that a message payload identified by at least one operand of the push instruction be pushed to a mailbox of a receiving thread. In response to receiving the push instruction, the processor core executes the push instruction of the sending thread. In response to executing the push instruction, the processor core initiates transmission of the message payload to the mailbox of the receiving thread. In one embodiment, the processor core initiates transmission of the message payload by transmitting a co-processor request to a switch of the data processing system via an interconnect fabric.Type: GrantFiled: June 8, 2015Date of Patent: February 14, 2017Assignee: International Business Machines CorporationInventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, Bradly G. Frey, Guy L. Guthrie, John D. Irish, William J. Starke, Jeffrey A. Stuecheli
-
Publication number: 20160179593Abstract: A processor core of a data processing system receives a push instruction of a sending thread that requests that a message payload identified by at least one operand of the push instruction be pushed to a mailbox of a receiving thread. In response to receiving the push instruction, the processor core executes the push instruction of the sending thread. In response to executing the push instruction, the processor core initiates transmission of the message payload to the mailbox of the receiving thread. In one embodiment, the processor core initiates transmission of the message payload by transmitting a co-processor request to a switch of the data processing system via an interconnect fabric.Type: ApplicationFiled: June 8, 2015Publication date: June 23, 2016Inventors: LAKSHMINARAYANA B. ARIMILLI, BERNARD C. DRERUP, BRADLY G. FREY, GUY L. GUTHRIE, JOHN D. IRISH, WILLIAM J. STARKE, JEFFREY A. STUECHELI
-
Publication number: 20160179517Abstract: In at least some embodiments, a processor core executes a sending thread including a first push instruction and a second push instruction subsequent to the first push instruction in a program order. Each of the first and second push instructions requests that a respective message payload be pushed to a mailbox of a receiving thread. In response to executing the first and second push instructions, the processor core transmits respective first and second co-processor requests to a switch in the data processing system via an interconnect fabric of the data processing system. The processor core transmits the second co-processor request to the switch without regard to acceptance of the first co-processor request by the switch.Type: ApplicationFiled: December 23, 2014Publication date: June 23, 2016Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: LAKSHMINARAYANA B. ARIMILLI, BERNARD C. DRERUP, GUY L. GUTHRIE, JOHN D. IRISH, WILLIAM J. STARKE, JEFFREY A. STUECHELI
-
Publication number: 20160179591Abstract: A processor core of a data processing system receives a push instruction of a sending thread that requests that a message payload identified by at least one operand of the push instruction be pushed to a mailbox of a receiving thread. In response to receiving the push instruction, the processor core executes the push instruction of the sending thread. In response to executing the push instruction, the processor core initiates transmission of the message payload to the mailbox of the receiving thread. In one embodiment, the processor core initiates transmission of the message payload by transmitting a co-processor request to a switch of the data processing system via an interconnect fabric.Type: ApplicationFiled: December 23, 2014Publication date: June 23, 2016Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: LAKSHMINARAYANA B. ARIMILLI, BERNARD C. DRERUP, BRADLY G. FREY, GUY L. GUTHRIE, JOHN D. IRISH, WILLIAM J. STARKE, JEFFREY A. STUECHELI
-
Publication number: 20160179518Abstract: In at least some embodiments, a processor core executes a sending thread including a first push instruction and a second push instruction subsequent to the first push instruction in a program order. Each of the first and second push instructions requests that a respective message payload be pushed to a mailbox of a receiving thread. In response to executing the first and second push instructions, the processor core transmits respective first and second co-processor requests to a switch in the data processing system via an interconnect fabric of the data processing system. The processor core transmits the second co-processor request to the switch without regard to acceptance of the first co-processor request by the switch.Type: ApplicationFiled: June 8, 2015Publication date: June 23, 2016Inventors: LAKSHMINARAYANA B. ARIMILLI, BERNARD C. DRERUP, GUY L. GUTHRIE, JOHN D. IRISH, WILLIAM J. STARKE, JEFFREY A. STUECHELI
-
Patent number: 9342387Abstract: In a data processing system, a switch of the data processing system receives a request to push a message referenced by an instruction of a sending thread to a receiving thread. In response to receiving the request, the switch determines whether the sending thread is authorized to push the message to the receiving thread by attempting to access an entry of a data structure of the switch utilizing a key derived from at least one identifier of the sending thread. In response to access to the entry being successful, content of the entry is utilized to determine an address of a mailbox of the receiving thread, and the switch pushes the message to the mailbox of the receiving thread. In response to access to the entry not being successful, the switch refrains from pushing the message to the mailbox of the receiving thread.Type: GrantFiled: June 8, 2015Date of Patent: May 17, 2016Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, John D. Irish, Charles F. Marino, William J. Starke
-
Patent number: 9286148Abstract: In a data processing system, a switch of the data processing system receives a request to push a message referenced by an instruction of a sending thread to a receiving thread. In response to receiving the request, the switch determines whether the sending thread is authorized to push the message to the receiving thread by attempting to access an entry of a data structure of the switch utilizing a key derived from at least one identifier of the sending thread. In response to access to the entry being successful, content of the entry is utilized to determine an address of a mailbox of the receiving thread, and the switch pushes the message to the mailbox of the receiving thread. In response to access to the entry not being successful, the switch refrains from pushing the message to the mailbox of the receiving thread.Type: GrantFiled: December 22, 2014Date of Patent: March 15, 2016Assignee: International Business Machines CorporationInventors: Lakshminarayana B. Arimilli, Bernard C. Drerup, John D. Irish, Charles F. Marino, William J. Starke