Patents by Inventor Tyler J. Huberty

Tyler J. Huberty has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240095037
    Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
    Type: Application
    Filed: July 28, 2023
    Publication date: March 21, 2024
    Inventors: Brandon H. Dwiel, Andrew J. Beaumont-Smith, Eric J. Furbish, John D. Pape, Stephen G. Meier, Tyler J. Huberty
  • Patent number: 11921640
    Abstract: A cache may store critical cache lines and non-critical cache lines, and may attempt to retain critical cache lines in the cache by, for example, favoring the critical cache lines in replacement data updates, retaining the critical cache lines with a certain probability when victim cache blocks are being selected, etc. Criticality values may be retained at various levels of the cache hierarchy. Additionally, accelerated eviction may be employed if the threads previously accessing the critical cache blocks are viewed as dead.
    Type: Grant
    Filed: April 22, 2022
    Date of Patent: March 5, 2024
    Assignee: Apple Inc.
    Inventors: Tyler J. Huberty, Vivek Venkatraman, Sandeep Gupta, Eric J. Furbish, Srinivasa Rangan Sridharan, Stephan G. Meier
  • Patent number: 11886354
    Abstract: Techniques are disclosed relating to cache thrash detection. In some embodiments, cache controller circuitry is configured to monitor and track performance metrics across multiple levels of a cache hierarchy, detect cache thrashing based on one or more performance metrics, and modify a cache insertion policy to mitigate cache thrashing. Disclosed techniques may advantageously detect and reduce or avoid cache thrashing, which may increase processor performance, decrease power consumption for a given workload, or both, relative to traditional techniques.
    Type: Grant
    Filed: May 20, 2022
    Date of Patent: January 30, 2024
    Assignee: Apple Inc.
    Inventors: Anwar Q. Rohillah, Tyler J. Huberty
  • Patent number: 11822480
    Abstract: A cache may store critical cache lines and non-critical cache lines, and may attempt to retain critical cache lines in the cache by, for example, favoring the critical cache lines in replacement data updates, retaining the critical cache lines with a certain probability when victim cache blocks are being selected, etc. Criticality values may be retained at various levels of the cache hierarchy. Additionally, accelerated eviction may be employed if the threads previously accessing the critical cache blocks are viewed as dead.
    Type: Grant
    Filed: April 22, 2022
    Date of Patent: November 21, 2023
    Assignee: Apple Inc.
    Inventors: Tyler J. Huberty, Vivek Venkatraman, Sandeep Gupta, Eric J. Furbish, Srinivasa Rangan Sridharan, Stephan G. Meier
  • Patent number: 11768690
    Abstract: A system may include a plurality of processors and a coprocessor. A plurality of coprocessor context priority registers corresponding to a plurality of contexts supported by the coprocessor may be included. The plurality of processors may use the plurality of contexts, and may program the coprocessor context priority register corresponding to a context with a value specifying a priority of the context relative to other contexts. An arbiter may arbitrate among instructions issued by the plurality of processors based on the priorities in the plurality of coprocessor context priority registers. In one embodiment, real-time threads may be assigned higher priorities than bulk processing tasks, improving bandwidth allocated to the real-time threads as compared to the bulk tasks.
    Type: Grant
    Filed: November 22, 2021
    Date of Patent: September 26, 2023
    Assignee: Apple Inc.
    Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Brian P. Lilly, James Vash, Jason M. Kassoff, Krishna C. Potnuru, Rajdeep L. Bhuyar, Ran A. Chachick, Tyler J. Huberty, Derek R. Kumar
  • Patent number: 11755333
    Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
    Type: Grant
    Filed: December 10, 2021
    Date of Patent: September 12, 2023
    Assignee: Apple Inc.
    Inventors: Brandon H. Dwiel, Andrew J. Beaumont-Smith, Eric J. Furbish, John D. Pape, Stephen G. Meier, Tyler J. Huberty
  • Publication number: 20230092898
    Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
    Type: Application
    Filed: December 10, 2021
    Publication date: March 23, 2023
    Inventors: Brandon H. Dwiel, Andrew J. Beaumont-Smith, Eric J. Furbish, John D. Pape, Stephen G. Meier, Tyler J. Huberty
  • Publication number: 20230066236
    Abstract: A cache may store critical cache lines and non-critical cache lines, and may attempt to retain critical cache lines in the cache by, for example, favoring the critical cache lines in replacement data updates, retaining the critical cache lines with a certain probability when victim cache blocks are being selected, etc. Criticality values may be retained at various levels of the cache hierarchy. Additionally, accelerated eviction may be employed if the threads previously accessing the critical cache blocks are viewed as dead.
    Type: Application
    Filed: April 22, 2022
    Publication date: March 2, 2023
    Inventors: Tyler J. Huberty, Vivek Venkatraman, Sandeep Gupta, Eric J. Furbish, Srinivasa Rangan Sridharan, Stephan G. Meier
  • Publication number: 20230060225
    Abstract: A cache may store critical cache lines and non-critical cache lines, and may attempt to retain critical cache lines in the cache by, for example, favoring the critical cache lines in replacement data updates, retaining the critical cache lines with a certain probability when victim cache blocks are being selected, etc. Criticality values may be retained at various levels of the cache hierarchy. Additionally, accelerated eviction may be employed if the threads previously accessing the critical cache blocks are viewed as dead.
    Type: Application
    Filed: April 22, 2022
    Publication date: March 2, 2023
    Inventors: Tyler J. Huberty, Vivek Venkatraman, Sandeep Gupta, Eric J. Furbish, Srinivasa Rangan Sridharan, Stephan G. Meier
  • Publication number: 20220083343
    Abstract: A system may include a plurality of processors and a coprocessor. A plurality of coprocessor context priority registers corresponding to a plurality of contexts supported by the coprocessor may be included. The plurality of processors may use the plurality of contexts, and may program the coprocessor context priority register corresponding to a context with a value specifying a priority of the context relative to other contexts. An arbiter may arbitrate among instructions issued by the plurality of processors based on the priorities in the plurality of coprocessor context priority registers. In one embodiment, real-time threads may be assigned higher priorities than bulk processing tasks, improving bandwidth allocated to the real-time threads as compared to the bulk tasks.
    Type: Application
    Filed: November 22, 2021
    Publication date: March 17, 2022
    Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Brian P. Lilly, James Vash, Jason M. Kassoff, Krishna C. Potnuru, Rajdeep L. Bhuyar, Ran A. Chachick, Tyler J. Huberty, Derek R. Kumar
  • Patent number: 11210104
    Abstract: A system may include a plurality of processors and a coprocessor. A plurality of coprocessor context priority registers corresponding to a plurality of contexts supported by the coprocessor may be included. The plurality of processors may use the plurality of contexts, and may program the coprocessor context priority register corresponding to a context with a value specifying a priority of the context relative to other contexts. An arbiter may arbitrate among instructions issued by the plurality of processors based on the priorities in the plurality of coprocessor context priority registers. In one embodiment, real-time threads may be assigned higher priorities than bulk processing tasks, improving bandwidth allocated to the real-time threads as compared to the bulk tasks.
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: December 28, 2021
    Assignee: Apple Inc.
    Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Brian P. Lilly, James Vash, Jason M. Kassoff, Krishna C. Potnuru, Rajdeep L. Bhuyar, Ran A. Chachick, Tyler J. Huberty, Derek R. Kumar
  • Patent number: 11176045
    Abstract: In an embodiment, a processor includes a plurality of prefetch circuits configured to prefetch data into a data cache. A primary prefetch circuit may be configured to generate first prefetch requests in response to a demand access, and may be configured to invoke a second prefetch circuit in response to the demand access. The second prefetch circuit may implement a different prefetch mechanism than the first prefetch circuit. If the second prefetch circuit reaches a threshold confidence level in prefetching for the demand access, the second prefetch circuit may communicate an indication to the primary prefetch circuit. The primary prefetch circuit may reduce a number of prefetch requests generated for the demand access responsive to the communication from the second prefetch circuit.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: November 16, 2021
    Assignee: Apple Inc.
    Inventors: Stephan G. Meier, Tyler J. Huberty, Nikhil Gupta
  • Publication number: 20210303471
    Abstract: In an embodiment, a processor includes a plurality of prefetch circuits configured to prefetch data into a data cache. A primary prefetch circuit may be configured to generate first prefetch requests in response to a demand access, and may be configured to invoke a second prefetch circuit in response to the demand access. The second prefetch circuit may implement a different prefetch mechanism than the first prefetch circuit. If the second prefetch circuit reaches a threshold confidence level in prefetching for the demand access, the second prefetch circuit may communicate an indication to the primary prefetch circuit. The primary prefetch circuit may reduce a number of prefetch requests generated for the demand access responsive to the communication from the second prefetch circuit.
    Type: Application
    Filed: March 27, 2020
    Publication date: September 30, 2021
    Inventors: Stephan G. Meier, Tyler J. Huberty, Nikhil Gupta
  • Patent number: 10776521
    Abstract: Techniques are disclosed for obtaining data using memory timing characteristics. In some embodiments, a physical unclonable function is used to obtain the data. In various embodiments, a computer system programs a timing parameter of a memory accessible by the computer system to a value that is outside of a specified operable range for the timing parameter. In various embodiments, the computer system performs one or more memory operations to a least a portion of the memory and detects a pattern of errors in the portion of the memory. In some embodiments, the computer system generates a response dependent on the pattern of errors. The response may be used to identify the computer system.
    Type: Grant
    Filed: August 16, 2017
    Date of Patent: September 15, 2020
    Assignee: Apple Inc.
    Inventors: Jeremie S. Kim, Minesh H. Patel, Stephan G. Meier, Tyler J. Huberty, Onur Mutlu
  • Patent number: 10621100
    Abstract: In an embodiment, a processor may implement an access map-pattern match (AMPM)-based prefetch circuit for a multi-level cache system. The access patterns that are matched to the access maps may include prefetches for different cache levels. Centralizing the generation of prefetches into one prefetch circuit may provide better observability and controllability of prefetching at various levels of the cache hierarchy, in an embodiment. Prefetches at different levels may be controlled individually based on the accuracy of those prefetches, in an embodiment. Additionally, in an embodiment, access patterns that are longer that a given threshold may have the granularity of the prefetches change so that more data is prefetched and the prefetches occur farther in advance, in some embodiments.
    Type: Grant
    Filed: December 5, 2018
    Date of Patent: April 14, 2020
    Assignee: Apple Inc.
    Inventors: Stephan G. Meier, Tyler J. Huberty, Gerard R. Williams, III, Pradeep Kanapathipillai
  • Patent number: 10331567
    Abstract: A prefetch circuit may include a memory, each entry of which may store an address and other prefetch data used to generate prefetch requests. For each entry, there may be at least one “quality factor” (QF) that may control prefetch request generation for that entry. A global quality factor (GQF) may control generation of prefetch requests across the plurality of entries. The prefetch circuit may include one or more additional prefetch mechanisms. For example, a stride-based prefetch circuit may be included that may generate prefetch requests for strided access patterns having strides larger than a certain stride size. Another example is a spatial memory streaming (SMS)-based mechanism in which prefetch data from multiple evictions from the memory in the prefetch circuit is captured and used for SMS prefetching based on how well the prefetch data appears to match a spatial memory streaming pattern.
    Type: Grant
    Filed: February 17, 2017
    Date of Patent: June 25, 2019
    Assignee: Apple Inc.
    Inventors: Stephan G. Meier, Tyler J. Huberty, Nikhil Gupta, Francesco Spadini, Gideon Levinsky
  • Patent number: 10180905
    Abstract: In an embodiment, a processor may implement an access map-pattern match (AMPM)-based prefetch circuit for a multi-level cache system. The access patterns that are matched to the access maps may include prefetches for different cache levels. Centralizing the generation of prefetches into one prefetch circuit may provide better observability and controllability of prefetching at various levels of the cache hierarchy, in an embodiment. Prefetches at different levels may be controlled individually based on the accuracy of those prefetches, in an embodiment. Additionally, in an embodiment, access patterns that are longer that a given threshold may have the granularity of the prefetches change so that more data is prefetched and the prefetches occur farther in advance, in some embodiments.
    Type: Grant
    Filed: April 7, 2016
    Date of Patent: January 15, 2019
    Assignee: Apple Inc.
    Inventors: Stephan G. Meier, Tyler J. Huberty, Gerard R. Williams, III, Pradeep Kanapathipillai
  • Publication number: 20180307862
    Abstract: Techniques are disclosed for obtaining data using memory timing characteristics. In some embodiments, a physical unclonable function is used to obtain the data. In various embodiments, a computer system programs a timing parameter of a memory accessible by the computer system to a value that is outside of a specified operable range for the timing parameter. In various embodiments, the computer system performs one or more memory operations to a least a portion of the memory and detects a pattern of errors in the portion of the memory. In some embodiments, the computer system generates a response dependent on the pattern of errors. The response may be used to identify the computer system.
    Type: Application
    Filed: August 16, 2017
    Publication date: October 25, 2018
    Inventors: Jeremie S. Kim, Minesh H. Patel, Stephan G. Meier, Tyler J. Huberty, Onur Mutlu
  • Patent number: 9904624
    Abstract: In an embodiment, a system may include multiple processors and a cache coupled to the processors. Each processor includes a data cache and a prefetch circuit that may be configured to generate prefetch requests. Each processor may also generate memory operations responsive to cache misses in the data cache. Each processor may transmit the prefetch requests and memory operations to the cache. The cache may queue the memory operations and prefetch requests, and may be configured to detect, on a per-processor basis, occupancy in the queue of memory requests and low confidence prefetch requests from the processor. The cache may determine if the per-processor occupancies exceed one or more thresholds, and may generate a throttle control to the processors responsive to the occupancies. In an embodiment, the cache may generate the throttle control responsive to a history of the last N samples of the occupancies.
    Type: Grant
    Filed: April 7, 2016
    Date of Patent: February 27, 2018
    Assignee: Apple Inc.
    Inventors: Tyler J. Huberty, Stephan G. Meier, Khubaib Khubaib
  • Patent number: 9886385
    Abstract: In a content-directed prefetcher, a pointer detection circuit identifies a given memory pointer candidate within a data cache line fill from a lower level cache (LLC), where the LLC is at a lower level of a memory hierarchy relative to the data cache. A pointer filter circuit initiates a prefetch request to the LLC candidate dependent on determining that a given counter in a quality factor (QF) table satisfies QF counter threshold value. The QF table is indexed dependent upon a program counter address and relative cache line offset of the candidate. Upon initiation of the prefetch request, the given counter is updated to reflect a prefetch cost. In response to determining that a subsequent data cache line fill arriving from the LLC corresponds to the prefetch request for the given memory pointer candidate, a particular counter of the QF table may be updated to reflect a successful prefetch credit.
    Type: Grant
    Filed: August 25, 2016
    Date of Patent: February 6, 2018
    Assignee: Apple Inc.
    Inventors: Tyler J. Huberty, Stephan G. Meier, Mridul Agarwal