Patents by Inventor Harold Wade Cain, III

Harold Wade Cain, III has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11567767
    Abstract: A system for processing gather and scatter instructions can implement a front-end subsystem, a back-end subsystem, or both. The front-end subsystem includes a prediction unit configured to determine a predicted quantity of coalesced memory access operations required by an instruction. A decode unit converts the instruction into a plurality of access operations based on the predicted quantity, and transmits the plurality of access operations and an indication of the predicted quantity to an issue queue. The back-end subsystem includes a load-store unit that receives a plurality of access operations corresponding to an instruction, determines a subset of the plurality of access operations that can be coalesced, and forms a coalesced memory access operation from the subset. A queue stores multiple memory addresses for a given load-store entry to provide for execution of coalesced memory accesses.
    Type: Grant
    Filed: July 30, 2020
    Date of Patent: January 31, 2023
    Assignees: Marvell Asia Pte, Ltd., Cray Inc.
    Inventors: Harold Wade Cain, III, Rabin Andrew Sugumar, Nagesh Bangalore Lakshminarayana, Daniel Jonathan Ernst, Sanyam Mehta
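    Illustrative sketch (not part of the patent record): the back-end coalescing step described in the abstract above groups a gather/scatter instruction's per-element accesses by cache line, so elements that share a line are served by one memory operation. A minimal Python model of that grouping, assuming a 64-byte line and hypothetical names:
        LINE_SIZE = 64  # bytes; assumed cache-line size

        def coalesce(addresses):
            """Map each distinct cache line to the element addresses it covers;
            each key corresponds to one coalesced memory access operation."""
            lines = {}
            for addr in addresses:
                line_base = addr - (addr % LINE_SIZE)
                lines.setdefault(line_base, []).append(addr)
            return lines

        # A gather touching 8 elements that span only 2 lines needs 2 accesses:
        ops = coalesce([0x1000, 0x1008, 0x1010, 0x1018, 0x2000, 0x2008, 0x2010, 0x2018])
        assert len(ops) == 2
    In this model, the front-end prediction unit from the abstract would predict len(ops) before the element addresses are known, so the decode unit can emit the right number of access operations to the issue queue.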
  • Patent number: 11567771
    Abstract: A system for processing gather and scatter instructions can implement a front-end subsystem, a back-end subsystem, or both. The front-end subsystem includes a prediction unit configured to determine a predicted quantity of coalesced memory access operations required by an instruction. A decode unit converts the instruction into a plurality of access operations based on the predicted quantity, and transmits the plurality of access operations and an indication of the predicted quantity to an issue queue. The back-end subsystem includes a load-store unit that receives a plurality of access operations corresponding to an instruction, determines a subset of the plurality of access operations that can be coalesced, and forms a coalesced memory access operation from the subset. A queue stores multiple memory addresses for a given load-store entry to provide for execution of coalesced memory accesses.
    Type: Grant
    Filed: July 30, 2020
    Date of Patent: January 31, 2023
    Assignees: Marvell Asia Pte, Ltd., Cray Inc.
    Inventors: Harold Wade Cain, III, Nagesh Bangalore Lakshminarayana, Daniel Jonathan Ernst, Sanyam Mehta
  • Patent number: 11550723
    Abstract: An apparatus, method, and system for memory bandwidth aware data prefetching is presented. The method may comprise monitoring a number of request responses received in an interval at a current prefetch request generation rate, comparing the number of request responses received in the interval to at least a first threshold, and adjusting the current prefetch request generation rate to an updated prefetch request generation rate by selecting the updated prefetch request generation rate from a plurality of prefetch request generation rates, based on the comparison. The request responses may be NACK or RETRY responses. The method may further comprise either retaining a current prefetch request generation rate or selecting a maximum prefetch request generation rate as the updated prefetch request generation rate in response to an indication that prefetching is accurate.
    Type: Grant
    Filed: August 27, 2018
    Date of Patent: January 10, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Niket Choudhary, David Scott Ray, Thomas Philip Speier, Eric Robinson, Harold Wade Cain, III, Nikhil Narendradev Sharma, Joseph Gerald McDonald, Brian Michael Stempel, Garrett Michael Drapala
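    Illustrative sketch (not part of the patent record): the rate-adjustment loop in the abstract above can be modeled as stepping through a table of prefetch request generation rates, driven by how many NACK/RETRY responses arrived in the last interval. The rates, thresholds, and names below are assumed:
        RATES = [0, 1, 2, 4, 8]    # candidate generation rates, slowest to fastest (assumed units)
        HIGH_THRESHOLD = 16        # many NACK/RETRY responses: throttle down
        LOW_THRESHOLD = 4          # few NACK/RETRY responses: ramp up

        def adjust_rate(rate_index, nack_retry_count, prefetch_accurate):
            """Pick the next index into RATES from this interval's request responses."""
            if prefetch_accurate:
                # Accurate prefetching: keep the current rate, or jump to the
                # maximum rate when the memory system shows no pushback.
                return len(RATES) - 1 if nack_retry_count < LOW_THRESHOLD else rate_index
            if nack_retry_count > HIGH_THRESHOLD and rate_index > 0:
                return rate_index - 1   # memory bandwidth is saturated
            if nack_retry_count < LOW_THRESHOLD and rate_index < len(RATES) - 1:
                return rate_index + 1   # headroom available
            return rate_index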
  • Publication number: 20220035632
    Abstract: A system for processing gather and scatter instructions can implement a front-end subsystem, a back-end subsystem, or both. The front-end subsystem includes a prediction unit configured to determine a predicted quantity of coalesced memory access operations required by an instruction. A decode unit converts the instruction into a plurality of access operations based on the predicted quantity, and transmits the plurality of access operations and an indication of the predicted quantity to an issue queue. The back-end subsystem includes a load-store unit that receives a plurality of access operations corresponding to an instruction, determines a subset of the plurality of access operations that can be coalesced, and forms a coalesced memory access operation from the subset. A queue stores multiple memory addresses for a given load-store entry to provide for execution of coalesced memory accesses.
    Type: Application
    Filed: July 30, 2020
    Publication date: February 3, 2022
    Inventors: Harold Wade Cain, III, Rabin Andrew Sugumar, Nagesh Bangalore Lakshminarayana, Daniel Jonathan Ernst, Sanyam Mehta
  • Publication number: 20220035633
    Abstract: A system for processing gather and scatter instructions can implement a front-end subsystem, a back-end subsystem, or both. The front-end subsystem includes a prediction unit configured to determine a predicted quantity of coalesced memory access operations required by an instruction. A decode unit converts the instruction into a plurality of access operations based on the predicted quantity, and transmits the plurality of access operations and an indication of the predicted quantity to an issue queue. The back-end subsystem includes a load-store unit that receives a plurality of access operations corresponding to an instruction, determines a subset of the plurality of access operations that can be coalesced, and forms a coalesced memory access operation from the subset. A queue stores multiple memory addresses for a given load-store entry to provide for execution of coalesced memory accesses.
    Type: Application
    Filed: July 30, 2020
    Publication date: February 3, 2022
    Inventors: Harold Wade Cain, III, Nagesh Bangalore Lakshminarayana, Daniel Jonathan Ernst, Sanyam Mehta
  • Patent number: 11221971
    Abstract: Systems and methods are directed to managing access to a shared memory. A request received at a memory controller, for access to the shared memory from a client of one or more clients configured to access the shared memory, is placed in at least one queue in the memory controller. A series of one or more timeout values is assigned to the request, based, at least in part, on a priority associated with the client which generated the request. The priority may be fixed or based on a Quality-of-Service (QoS) class of the client. A timer is incremented while the request remains in the first queue. As the timer traverses each one of the one or more timeout values in the series, a criticality level of the request is incremented. A request with a higher criticality level may be prioritized for servicing over a request with a lower criticality level.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: January 11, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Derek Hower, Harold Wade Cain, III, Carl Alan Waldspurger
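    Illustrative sketch (not part of the patent record): per the abstract above, each queued request carries a series of timeouts derived from its client's priority or QoS class; as the request ages past each timeout, its criticality level rises, and the controller services the most critical request first. A toy Python model with assumed timeout values:
        class Request:
            def __init__(self, client_priority):
                # Higher-priority clients get shorter timeouts, so their requests
                # escalate in criticality sooner (base values assumed).
                base = {"high": 10, "medium": 25, "low": 50}[client_priority]
                self.timeouts = [base, base * 2, base * 4]  # series of timeout values
                self.timer = 0
                self.criticality = 0

            def tick(self):
                """Increment the timer; raise criticality as each timeout is passed."""
                self.timer += 1
                while (self.criticality < len(self.timeouts)
                        and self.timer >= self.timeouts[self.criticality]):
                    self.criticality += 1

        def pick_next(queue):
            """Prioritize the request with the highest criticality level."""
            return max(queue, key=lambda r: r.criticality)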
  • Patent number: 10885015
    Abstract: Systems, methods, and products for database system transaction management are provided herein. One aspect provides for annotating via a computing device at least one data object residing on the computing device utilizing at least one transaction tag, the at least one transaction tag being configured to indicate a status of an associated data object; processing at least one database transaction utilizing a transactional memory process, wherein access to the at least one data object is determined based on the status of the at least one data object; and updating the status of the at least one data object responsive to an attempted access of the at least one data object by the at least one database transaction. Other embodiments and aspects are also described herein.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: January 5, 2021
    Assignee: International Business Machines Corporation
    Inventors: Harold Wade Cain, III, Donna N. Dillenberger, Michel H. T. Hack, Hong Min, Gong Su, James Zu-Chia Teng
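    Illustrative sketch (not part of the patent record): per the abstract above, each data object carries a transaction tag indicating its status; a transaction checks the tag before touching the object, and the tag is updated on access. A minimal Python model, with the statuses and names assumed:
        class TaggedObject:
            def __init__(self, value):
                self.value = value
                self.tag = "FREE"     # transaction tag: "FREE" or "IN_USE" (assumed statuses)
                self.owner = None

            def try_access(self, txn_id):
                """Grant access only if the tag permits it, updating the tag."""
                if self.tag == "FREE" or self.owner == txn_id:
                    self.tag, self.owner = "IN_USE", txn_id
                    return True
                return False          # conflicting transaction: caller retries or aborts

            def release(self):
                self.tag, self.owner = "FREE", None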
  • Patent number: 10831254
    Abstract: Allocating power between multiple central processing units (CPUs) in a multi-CPU processor based on total current availability and individual CPU quality-of-service (QoS) requirements is disclosed. Current from a power rail is allocated to CPUs by a global current manager (GCM) circuit according to performance criteria set by the CPUs. The CPUs can request increased current allocation from the GCM circuit, such as in response to executing a higher performance task. If the increased current allocation request keeps total current on the power rail within its maximum rail current limit, the GCM circuit approves the request to allow the CPU increased current allocation. This can allow CPUs executing higher performance tasks to have a larger current allocation than CPUs executing lower performance tasks without the maximum rail current limit being exceeded, and without necessarily having to lower voltage of the power rail, which could unnecessarily lower performance of all CPUs.
    Type: Grant
    Filed: September 12, 2018
    Date of Patent: November 10, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Shivam Priyadarshi, SeyedMajid Zahedi, Derek Robert Hower, Carl Alan Waldspurger, Jeffrey Todd Bridges, Sanjay Bhikhubhai Patel, Gabriel Martel Tarr, Chih Kang Lin, Ryan Donovan Wells, Harold Wade Cain, III
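    Illustrative sketch (not part of the patent record): the global current manager (GCM) tracks each CPU's allocation and approves an increase only if the power rail's total would stay within its maximum current limit. A toy Python model with assumed names and milliampere units:
        class GlobalCurrentManager:
            def __init__(self, max_rail_current_ma, num_cpus, initial_ma):
                self.limit = max_rail_current_ma
                self.alloc = [initial_ma] * num_cpus   # per-CPU current allocation (mA)

            def request_increase(self, cpu, extra_ma):
                """Approve only if the rail's total current stays within budget."""
                if sum(self.alloc) + extra_ma <= self.limit:
                    self.alloc[cpu] += extra_ma
                    return True    # the CPU may raise its performance level
                return False       # denied: the rail limit would be exceeded

        gcm = GlobalCurrentManager(max_rail_current_ma=4000, num_cpus=4, initial_ma=800)
        assert gcm.request_increase(0, 500)        # total 3700 mA: approved
        assert not gcm.request_increase(1, 1000)   # would reach 4700 mA: denied
    This budget check is what lets a CPU running a demanding task draw more current than its peers without lowering the shared rail voltage for all of them.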
  • Publication number: 20200065247
    Abstract: An apparatus, method, and system for memory bandwidth aware data prefetching is presented. The method may comprise monitoring a number of request responses received in an interval at a current prefetch request generation rate, comparing the number of request responses received in the interval to at least a first threshold, and adjusting the current prefetch request generation rate to an updated prefetch request generation rate by selecting the updated prefetch request generation rate from a plurality of prefetch request generation rates, based on the comparison. The request responses may be NACK or RETRY responses. The method may further comprise either retaining a current prefetch request generation rate or selecting a maximum prefetch request generation rate as the updated prefetch request generation rate in response to an indication that prefetching is accurate.
    Type: Application
    Filed: August 27, 2018
    Publication date: February 27, 2020
    Inventors: Niket Choudhary, David Scott Ray, Thomas Philip Speier, Eric Robinson, Harold Wade Cain, III, Nikhil Narendradev Sharma, Joseph Gerald McDonald, Brian Michael Stempel, Garrett Michael Drapala
  • Patent number: 10255074
    Abstract: Selective flushing of instructions in an instruction pipeline in a processor back to an execution-determined target address in response to a precise interrupt is disclosed. A selective instruction pipeline flush controller determines if a precise interrupt has occurred for an executed instruction in the instruction pipeline. The selective instruction pipeline flush controller determines if an instruction at the correct resolved target address of the instruction that caused the precise interrupt is contained in the instruction pipeline. If so, the selective instruction pipeline flush controller can selectively flush instructions back to the instruction in the pipeline that contains the correct resolved target address to reduce the amount of new instruction fetching.
    Type: Grant
    Filed: September 11, 2015
    Date of Patent: April 9, 2019
    Assignee: Qualcomm Incorporated
    Inventors: Vignyan Reddy Kothinti Naresh, Rami Mohammad Al Sheikh, Harold Wade Cain, III
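    Illustrative sketch (not part of the patent record): on a precise interrupt, the controller checks whether an instruction fetched from the resolved target address is already in the pipeline; if so, only the wrong-path instructions between the faulting instruction and that target are flushed, avoiding a full refetch. A toy Python model (names assumed):
        from collections import namedtuple

        Instr = namedtuple("Instr", ["pc"])

        def selective_flush(pipeline, fault_index, target_pc):
            """Keep the tail starting at the instruction fetched from target_pc
            if it is already in the pipeline; otherwise flush everything younger
            than the faulting instruction and refetch from target_pc."""
            for i in range(fault_index + 1, len(pipeline)):
                if pipeline[i].pc == target_pc:
                    return pipeline[:fault_index + 1] + pipeline[i:]
            return pipeline[:fault_index + 1]

        pipe = [Instr(0x100), Instr(0x104), Instr(0x108), Instr(0x10C), Instr(0x200)]
        # The instruction at index 1 takes a precise interrupt whose resolved
        # target 0x200 is already in the pipeline; only 0x108 and 0x10C (the
        # wrong-path gap) are flushed.
        kept = selective_flush(pipe, fault_index=1, target_pc=0x200)
        assert [i.pc for i in kept] == [0x100, 0x104, 0x200]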
  • Publication number: 20190086982
    Abstract: Allocating power between multiple central processing units (CPUs) in a multi-CPU processor based on total current availability and individual CPU quality-of-service (QoS) requirements is disclosed. Current from a power rail is allocated to CPUs by a global current manager (GCM) circuit according to performance criteria set by the CPUs. The CPUs can request increased current allocation from the GCM circuit, such as in response to executing a higher performance task. If the increased current allocation request keeps total current on the power rail within its maximum rail current limit, the GCM circuit approves the request to allow the CPU increased current allocation. This can allow CPUs executing higher performance tasks to have a larger current allocation than CPUs executing lower performance tasks without the maximum rail current limit being exceeded, and without necessarily having to lower voltage of the power rail, which could unnecessarily lower performance of all CPUs.
    Type: Application
    Filed: September 12, 2018
    Publication date: March 21, 2019
    Inventors: Shivam Priyadarshi, SeyedMajid Zahedi, Derek Robert Hower, Carl Alan Waldspurger, Jeffrey Todd Bridges, Sanjay Bhikhubhai Patel, Gabriel Martel Tarr, Chih Kang Lin, Ryan Donovan Wells, Harold Wade Cain, III
  • Patent number: 10223278
    Abstract: Systems and methods are directed to selectively bypassing allocation of cache lines in a cache. A bypass predictor table is provided with reuse counters to track reuse characteristics of cache lines, based on memory regions to which the cache lines belong in memory. A contender reuse counter provides an indication of a likelihood of reuse of a contender cache line in the cache pursuant to a miss in the cache for the contender cache line, and a victim reuse counter provides an indication of a likelihood of reuse for a victim cache line that will be evicted if the contender cache line is allocated in the cache. A decision whether to allocate the contender cache line in the cache or bypass allocation of the contender cache line in the cache is based on the contender reuse counter value and the victim reuse counter value.
    Type: Grant
    Filed: September 22, 2016
    Date of Patent: March 5, 2019
    Assignee: Qualcomm Incorporated
    Inventors: Shivam Priyadarshi, Brandon Harley Anthony Dwiel, Rami Mohammad A. Al Sheikh, Harold Wade Cain, III
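    Illustrative sketch (not part of the patent record): on a cache miss, the predictor compares a reuse counter for the contender line's memory region against one for the victim line's region, and bypasses allocation when the victim looks more likely to be reused. A toy Python model with an assumed region size and table layout:
        from collections import defaultdict

        REGION_BITS = 12               # assumed 4 KiB memory regions
        reuse = defaultdict(int)       # bypass predictor table: region -> reuse counter

        def region(addr):
            return addr >> REGION_BITS

        def on_hit(addr):
            reuse[region(addr)] += 1   # lines from this region are being reused

        def on_unused_eviction(addr):
            reuse[region(addr)] = max(0, reuse[region(addr)] - 1)

        def should_allocate(contender_addr, victim_addr):
            """Allocate the contender only if its region's reuse counter is at
            least the victim region's; otherwise bypass and keep the victim."""
            return reuse[region(contender_addr)] >= reuse[region(victim_addr)]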
  • Patent number: 10185668
    Abstract: Systems and methods relate to cost-aware cache management policies. In a cost-aware least recently used (LRU) replacement policy, temporal locality as well as miss cost is taken into account in selecting a cache line for replacement, wherein the miss cost is based on an associated operation type including instruction cache read, data cache read, data cache write, prefetch, and write back. In a cost-aware dynamic re-reference interval prediction (DRRIP) based cache management policy, miss costs associated with operation types pertaining to a cache line are considered for assigning re-reference interval prediction values (RRPV) for inserting the cache line, pursuant to a cache miss and for updating the RRPV upon a hit for the cache line. The operation types comprise instruction cache access, data cache access, prefetch, and write back. These policies improve victim selection, while minimizing cache thrashing and scans.
    Type: Grant
    Filed: September 20, 2016
    Date of Patent: January 22, 2019
    Assignee: Qualcomm Incorporated
    Inventors: Rami Mohammad A. Al Sheikh, Shivam Priyadarshi, Harold Wade Cain, III
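    Illustrative sketch (not part of the patent record): a cost-aware LRU victim search weighs each line's recency against the cost of losing it, where the miss cost depends on the operation type that brought the line in. A toy Python scoring function with assumed relative costs:
        # Assumed relative miss costs per operation type.
        MISS_COST = {
            "icache_read": 4,   # a missing instruction stalls the front end
            "dcache_read": 3,
            "dcache_write": 2,
            "prefetch": 1,      # cheapest to lose: it can simply be prefetched again
            "write_back": 2,
        }

        def pick_victim(ways):
            """ways: list of (lru_position, op_type), where position 0 is the
            most recently used. Prefer old lines, but among similarly old
            lines prefer the ones that are cheap to miss on again."""
            return max(ways, key=lambda w: w[0] - MISS_COST[w[1]])

        ways = [(3, "icache_read"), (2, "prefetch"), (1, "dcache_read")]
        # Plain LRU would evict the oldest line (the icache read at position 3);
        # the cost-aware policy evicts the prefetch because its miss cost is lower.
        assert pick_victim(ways) == (2, "prefetch")
    The DRRIP variant in the abstract applies the same cost weighting to RRPV assignment on insertion and on hits, rather than to a strict LRU stack.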
  • Publication number: 20190018798
    Abstract: Systems and methods relate to cost-aware cache management policies. In a cost-aware least recently used (LRU) replacement policy, temporal locality as well as miss cost is taken into account in selecting a cache line for replacement, wherein the miss cost is based on an associated operation type including instruction cache read, data cache read, data cache write, prefetch, and write back. In a cost-aware dynamic re-reference interval prediction (DRRIP) based cache management policy, miss costs associated with operation types pertaining to a cache line are considered for assigning re-reference interval prediction values (RRPV) for inserting the cache line, pursuant to a cache miss and for updating the RRPV upon a hit for the cache line. The operation types comprise instruction cache access, data cache access, prefetch, and write back. These policies improve victim selection, while minimizing cache thrashing and scans.
    Type: Application
    Filed: September 18, 2018
    Publication date: January 17, 2019
    Inventors: Rami Mohammad Al Sheikh, Shivam Priyadarshi, Harold Wade Cain, III
  • Publication number: 20190013062
    Abstract: Systems and methods for selective refresh of a cache, such as a last-level cache implemented as an embedded DRAM (eDRAM). A refresh bit and a reuse bit are associated with each way of at least one set of the cache. A least recently used (LRU) stack tracks positions of the ways, with positions towards a most recently used position of a threshold comprising more recently used positions and positions towards a least recently used position of the threshold comprising less recently used positions. A line in a way is selectively refreshed if the position of the way is one of the more recently used positions and the refresh bit associated with the way is set, or if the position of the way is one of the less recently used positions and the refresh bit and the reuse bit associated with the way are both set.
    Type: Application
    Filed: July 7, 2017
    Publication date: January 10, 2019
    Inventors: Francois Ibrahim Atallah, Gregory Michael Wright, Shivam Priyadarshi, Garrett Michael Drapala, Harold Wade Cain, III, Erik Hedberg
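    Illustrative sketch (not part of the patent record): the refresh rule in the abstract above combines a way's LRU-stack position with its refresh and reuse bits; a line in a less recently used position must also have shown reuse to earn a refresh. A direct Python transcription of that rule (names assumed):
        def should_refresh(lru_position, threshold, refresh_bit, reuse_bit):
            """Decide whether this eDRAM way's line is refreshed. Position 0 is
            most recently used; positions below `threshold` count as more
            recently used, the rest as less recently used."""
            if lru_position < threshold:       # more recently used positions
                return refresh_bit
            return refresh_bit and reuse_bit   # less recently used positions

        # With a threshold of 4 in an 8-way set: a hot line needs only its
        # refresh bit, while a cold line must also have shown reuse.
        assert should_refresh(1, 4, refresh_bit=True, reuse_bit=False)
        assert not should_refresh(6, 4, refresh_bit=True, reuse_bit=False)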
  • Patent number: 10169240
    Abstract: Systems and methods for managing memory access bandwidth include a spatial locality predictor. The spatial locality predictor includes a memory region table with prediction counters associated with memory regions of a memory. When cache lines are evicted from a cache, the sizes of the cache lines which were accessed by a processor are used for updating the prediction counters. Depending on values of the prediction counters, the sizes of cache lines which are likely to be used by the processor are predicted for the corresponding memory regions. Correspondingly, the memory access bandwidth between the processor and the memory may be reduced to fetch a smaller size data (e.g., half cache line) than a full cache line if the size of the cache line likely to be used is predicted to be less than that of the full cache line. Prediction counters may be incremented or decremented by different amounts depending on access bits corresponding to portions of a cache line.
    Type: Grant
    Filed: September 20, 2016
    Date of Patent: January 1, 2019
    Assignee: Qualcomm Incorporated
    Inventors: Brandon Harley Anthony Dwiel, Harold Wade Cain, III, Shivam Priyadarshi
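    Illustrative sketch (not part of the patent record): the spatial locality predictor trains a per-region counter at eviction time from which parts of the line were actually touched, and consults it at miss time to choose between a half-line and a full-line fetch. A toy Python model with assumed region size, increments, and names:
        from collections import defaultdict

        REGION_BITS = 12              # assumed 4 KiB memory regions
        counters = defaultdict(int)   # memory region table of prediction counters

        def on_eviction(addr, low_half_used, high_half_used):
            """Train the region's counter from the evicted line's access bits;
            the adjustment differs by how much of the line was touched."""
            r = addr >> REGION_BITS
            if low_half_used and high_half_used:
                counters[r] += 2      # the whole line was useful
            else:
                counters[r] -= 1      # only half the line was ever touched

        def fetch_size(addr, full_line=64):
            """Predict how many bytes to fetch on a miss in this region."""
            return full_line if counters[addr >> REGION_BITS] >= 0 else full_line // 2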
  • Patent number: 10102375
    Abstract: Techniques for preventing side-channel attacks on a cache are provided. A method according to these techniques includes executing a software instruction indicating that a portion of software requiring data protection is about to be executed, setting the cache to operate in a randomized mode to de-correlate cache timing and cache miss behavior from data being processed by the portion of software requiring data protection responsive to the instruction indicating that the portion of software requiring data protection is about to be executed, executing the portion of software requiring data protection, storing the data being processed by the portion of software requiring data protection, and setting the cache to operate in a standard operating mode responsive to an instruction indicating that execution of the portion of software requiring data protection has completed.
    Type: Grant
    Filed: August 11, 2016
    Date of Patent: October 16, 2018
    Assignee: Qualcomm Incorporated
    Inventors: Rosario Cammarota, Roberto Avanzi, Ramesh Chandra Chauhan, Harold Wade Cain, III, Darren Lasko
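    Illustrative sketch (not part of the patent record): the mechanism brackets security-sensitive code with instructions that switch the cache into a randomized mode, de-correlating timing and miss behavior from the data being processed. In software-model form, that is a mode flag consulted by the set-index function, with a secret permutation standing in for whatever randomization the hardware applies; everything below is assumed:
        import secrets

        class CacheModel:
            def __init__(self, num_sets=256):
                self.num_sets = num_sets
                self.randomized = False
                self.perm = list(range(num_sets))

            def enter_protected_region(self):
                """Modeled effect of the 'data protection begins' instruction."""
                self.randomized = True
                # Draw a fresh secret permutation (Fisher-Yates shuffle).
                for i in range(self.num_sets - 1, 0, -1):
                    j = secrets.randbelow(i + 1)
                    self.perm[i], self.perm[j] = self.perm[j], self.perm[i]

            def exit_protected_region(self):
                """Modeled effect of the 'data protection ends' instruction."""
                self.randomized = False

            def set_index(self, addr, line_size=64):
                s = (addr // line_size) % self.num_sets
                return self.perm[s] if self.randomized else s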
  • Publication number: 20180113895
    Abstract: Systems, methods, and products for database system transaction management are provided herein. One aspect provides for annotating via a computing device at least one data object residing on the computing device utilizing at least one transaction tag, the at least one transaction tag being configured to indicate a status of an associated data object; processing at least one database transaction utilizing a transactional memory process, wherein access to the at least one data object is determined based on the status of the at least one data object; and updating the status of the at least one data object responsive to an attempted access of the at least one data object by the at least one database transaction. Other embodiments and aspects are also described herein.
    Type: Application
    Filed: December 21, 2017
    Publication date: April 26, 2018
    Inventors: Harold Wade Cain, III, Donna N. Dillenberger, Michel H. T. Hack, Hong Min, Gong Su, James Zu-Chia Teng
  • Publication number: 20180081811
    Abstract: Systems and methods for dynamically partitioning a shared cache, include dynamically determining a probability to be associated with each one of two or more processors configured to access the shared cache. Based on the probability for a processor, a first cache line of the processor is inserted in a most recently used (MRU) position of a least recently used (LRU) stack associated with the shared cache, pursuant to a miss in the shared cache for the first cache line. Based on the probability for the processor, a second cache line is promoted to the MRU position of the LRU stack, pursuant to a hit in the shared cache for the second cache line. The probability for the processor is determined based on hill-climbing, wherein fluctuations in the probability are reduced, local maxima are prevented, and the probability is prevented from falling below a threshold.
    Type: Application
    Filed: September 20, 2016
    Publication date: March 22, 2018
    Inventors: Rami Mohammad A. Al Sheikh, Harold Wade Cain, III
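    Illustrative sketch (not part of the patent record): each processor sharing the cache gets a probability that gates MRU insertion on a miss and MRU promotion on a hit, and a hill-climbing loop nudges that probability toward better hit rates while clamping it above a floor. A toy Python model covering the damping and the floor (step size and floor value are assumed):
        import random

        PROB_FLOOR = 0.05   # the probability is never allowed below this threshold
        STEP = 0.02         # small hill-climbing step to damp fluctuations (assumed)

        prob = {0: 0.5, 1: 0.5}    # per-processor probability

        def on_miss(cpu, lru_stack, line):
            """Insert at MRU with probability prob[cpu], else at the LRU end."""
            if random.random() < prob[cpu]:
                lru_stack.insert(0, line)
            else:
                lru_stack.append(line)

        def on_hit(cpu, lru_stack, line):
            """Promote the hitting line to MRU with probability prob[cpu]."""
            if random.random() < prob[cpu]:
                lru_stack.remove(line)   # the line is present, since this was a hit
                lru_stack.insert(0, line)

        def hill_climb(cpu, hit_rate_delta):
            """Move the probability in the direction that improved the hit rate,
            clamped to [PROB_FLOOR, 1.0]."""
            step = STEP if hit_rate_delta > 0 else -STEP
            prob[cpu] = min(1.0, max(PROB_FLOOR, prob[cpu] + step))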
  • Publication number: 20180046808
    Abstract: Techniques for preventing side-channel attacks on a cache are provided. A method according to these techniques includes executing a software instruction indicating that a portion of software requiring data protection is about to be executed, setting the cache to operate in a randomized mode to de-correlate cache timing and cache miss behavior from data being processed by the portion of software requiring data protection responsive to the instruction indicating that the portion of software requiring data protection is about to be executed, executing the portion of software requiring data protection, storing the data being processed by the portion of software requiring data protection, and setting the cache to operate in a standard operating mode responsive to an instruction indicating that execution of the portion of software requiring data protection has completed.
    Type: Application
    Filed: August 11, 2016
    Publication date: February 15, 2018
    Inventors: Rosario Cammarota, Roberto Avanzi, Ramesh Chandra Chauhan, Harold Wade Cain, III, Darren Lasko