Patents by Inventor Terence M. Potter

Terence M. Potter has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11947462
    Abstract: Techniques are disclosed relating to cache footprint management. In some embodiments, execution circuitry is configured to perform operations for instructions from multiple threads in parallel. Cache circuitry may store information operated on by threads executed by the execution circuitry. Scheduling circuitry may arbitrate among threads to schedule threads for execution by the execution circuitry. Tracking circuitry may determine one or more performance metrics for the cache circuitry. Control circuitry may, based on the one or more performance metrics meeting a threshold, reduce a limit on a number of threads considered for arbitration by the scheduling circuitry, to control a footprint of information stored by the cache circuitry. Disclosed techniques may advantageously reduce or avoid cache thrashing for certain processor workloads.
    Type: Grant
    Filed: March 3, 2022
    Date of Patent: April 2, 2024
    Assignee: Apple Inc.
    Inventors: Yoong Chert Foo, Terence M. Potter, Donald R. DeSota, Benjiman L. Goodman, Aroun Demeure, Cheng Li, Winnie W. Yeung
  • Patent number: 11941742
    Abstract: Techniques are disclosed relating to processor communications fabrics. In some embodiments, a processor includes multiple client circuitry and fabric circuitry that includes at least first and second instances of a tile. The tile may include: client inputs configured to interface with client circuits, tile inputs configured to interface with one or more other tile instances, and communication resources assignable to the client inputs and tile inputs. The communications resources may include: multiple internal links, client outputs configured to interface with client circuits, and tile outputs configured to interface with one or more other tile instances. Control circuitry may, in a given cycle, assign communication resources of a given tile instance to at least a portion of the client inputs and tile inputs for a next cycle, based on priority information. The control circuitry may update priority information based on assignment results over multiple cycles.
    Type: Grant
    Filed: June 23, 2022
    Date of Patent: March 26, 2024
    Assignee: Apple Inc.
    Inventors: Adam J. Smith, Sergio V. Tota, Christopher G. Martin, Yoong Chert Foo, Terence M. Potter, Max J. Batley
  • Publication number: 20240095065
    Abstract: Techniques are disclosed relating to multi-stage thread scheduling. In some embodiments, processor circuitry includes multiple channel pipelines for multiple channels and multiple execution pipelines shared by the channel pipelines and configured to perform different types of operations provided by the channel pipelines. First scheduler circuitry may arbitrate among threads to assign threads to channels. Second scheduler circuitry may arbitrate among channels to assign an operation from a given channel to a given execution pipeline. The execution pipelines may provide backpressure information to the first scheduler circuitry based on execution status and the first scheduler circuitry may adjust priority of a thread for assignment to a channel based on the backpressure information. Disclosed techniques may reduce channel conflicts and starvation for execution resources.
    Type: Application
    Filed: November 10, 2022
    Publication date: March 21, 2024
    Inventors: Benjiman L. Goodman, Anjana Rajendran, Sheenam Jayaswal, Terence M. Potter, Yoong Chert Foo
  • Publication number: 20240095176
    Abstract: Techniques are disclosed relating to thread preemption in the context of memory-backed registers. In some embodiments, a memory hierarchy includes one or more cache levels and one or more memory circuits. Execution circuitry may operate on operands in architectural registers to execute instructions of threads, where data for the architectural registers is stored and backed by the memory hierarchy. Control circuitry may, in response to a context switch indication for a given thread: flush and invalidate a set of architectural register data from a first cache level and store memory page information (e.g., a page catalog base address) associated with the set of architectural register data.
    Type: Application
    Filed: November 10, 2022
    Publication date: March 21, 2024
    Inventors: Benjiman L. Goodman, Yoong Chert Foo, Karl D. Mann, Terence M. Potter, Frank W. Liljeros, Jeffrey T. Brady
  • Publication number: 20240045808
    Abstract: Techniques are disclosed relating to dynamically allocating and mapping private memory for requesting circuitry. Disclosed circuitry may receive a private address and translate the private address to a virtual address (which an MMU may then translate to physical address to actually access a storage element). In some embodiments, private memory allocation circuitry is configured to generate page table information and map private memory pages for requests if the page table information is not already setup. In various embodiments, this may advantageously allow dynamic private memory allocation, e.g., to efficiently allocate memory for graphics shaders with different types of workloads. Disclosed caching techniques for page table information may improve performance relative to traditional techniques. Further, disclosed embodiments may facilitate memory consolidation across a device such as a graphics processor.
    Type: Application
    Filed: October 19, 2023
    Publication date: February 8, 2024
    Inventors: Justin A. Hensley, Karl D. Mann, Yoong Chert Foo, Terence M. Potter, Frank W. Liljeros, Ralph C. Taylor
  • Publication number: 20230419585
    Abstract: Techniques are disclosed relating to processor communications fabrics. In some embodiments, a processor includes multiple client circuitry and fabric circuitry that includes at least first and second instances of a tile. The tile may include: client inputs configured to interface with client circuits, tile inputs configured to interface with one or more other tile instances, and communication resources assignable to the client inputs and tile inputs. The communications resources may include: multiple internal links, client outputs configured to interface with client circuits, and tile outputs configured to interface with one or more other tile instances. Control circuitry may, in a given cycle, assign communication resources of a given tile instance to at least a portion of the client inputs and tile inputs for a next cycle, based on priority information. The control circuitry may update priority information based on assignment results over multiple cycles.
    Type: Application
    Filed: June 23, 2022
    Publication date: December 28, 2023
    Inventors: Adam J. Smith, Sergio V. Tota, Christopher G. Martin, Yoong Chert Foo, Terence M. Potter, Max J. Batley
  • Publication number: 20230385201
    Abstract: Techniques are disclosed relating to private memory management using a mapping thread, which may be persistent. In some embodiments, a graphics processor is configured to generate a pool of private memory pages for a set of graphics work that includes multiple threads. The processor may maintain a translation table configured to map private memory addresses to virtual addresses based on identifiers of the threads. The processor may execute a mapping thread to receive a request to allocate a private memory page for a requesting thread, select a private memory page from the pool in response to the request, and map the selected page in the translation table for the requesting. The processor may then execute one or more instructions of the requesting thread to access a private memory space, wherein the execution includes translation of a private memory address to a virtual address based on the mapped page in the translation table.
    Type: Application
    Filed: July 31, 2023
    Publication date: November 30, 2023
    Inventors: Benjiman L. Goodman, Terence M. Potter, Anjana Rajendran, Mark I. Luffel, William V. Miller
  • Patent number: 11829298
    Abstract: Techniques are disclosed relating to dynamically allocating and mapping private memory for requesting circuitry. Disclosed circuitry may receive a private address and translate the private address to a virtual address (which an MMU may then translate to physical address to actually access a storage element). In some embodiments, private memory allocation circuitry is configured to generate page table information and map private memory pages for requests if the page table information is not already setup. In various embodiments, this may advantageously allow dynamic private memory allocation, e.g., to efficiently allocate memory for graphics shaders with different types of workloads. Disclosed caching techniques for page table information may improve performance relative to traditional techniques. Further, disclosed embodiments may facilitate memory consolidation across a device such as a graphics processor.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: November 28, 2023
    Assignee: Apple Inc.
    Inventors: Justin A. Hensley, Karl D. Mann, Yoong Chert Foo, Terence M. Potter, Frank W. Liljeros, Ralph C. Taylor
  • Publication number: 20230325196
    Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.
    Type: Application
    Filed: April 12, 2023
    Publication date: October 12, 2023
    Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
  • Patent number: 11714759
    Abstract: Techniques are disclosed relating to private memory management using a mapping thread, which may be persistent. In some embodiments, a graphics processor is configured to generate a pool of private memory pages for a set of graphics work that includes multiple threads. The processor may maintain a translation table configured to map private memory addresses to virtual addresses based on identifiers of the threads. The processor may execute a mapping thread to receive a request to allocate a private memory page for a requesting thread, select a private memory page from the pool in response to the request, and map the selected page in the translation table for the requesting. The processor may then execute one or more instructions of the requesting thread to access a private memory space, wherein the execution includes translation of a private memory address to a virtual address based on the mapped page in the translation table.
    Type: Grant
    Filed: August 17, 2020
    Date of Patent: August 1, 2023
    Assignee: Apple Inc.
    Inventors: Benjiman L. Goodman, Terence M. Potter, Anjana Rajendran, Mark I. Luffel, William V. Miller
  • Patent number: 11645084
    Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.
    Type: Grant
    Filed: September 9, 2021
    Date of Patent: May 9, 2023
    Assignee: Apple Inc.
    Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
  • Patent number: 11521343
    Abstract: Disclosed techniques relate to memory space management for graphics processing. In some embodiments, first and second graphics cores are configured to execute instructions for multiple threadgroups. In some embodiments, the threads groups include a first threadgroup with multiple single-instruction multiple-data (SIMD) groups configured to execute a first shader program and a second threadgroup with multiple SIMD groups configured to execute a second, different shader program. Control circuitry may be configured to provide access to data stored in memory circuitry according to a shader memory space. The shader memory space may be accessible to threadgroups executed by the first graphics shader core, including the first and second threadgroups, but is not accessible to threadgroups executed by the second graphics shader core. Disclosed techniques may reduce latency, increase bandwidth available to the shader, reduce coherency cost, or any combination thereof.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: December 6, 2022
    Assignee: Apple Inc.
    Inventors: Terence M. Potter, Yoong Chert Foo, Ali Rabbani Rankouhi, Justin A. Hensley, Jonathan M. Redshaw
  • Patent number: 11430174
    Abstract: Techniques are disclosed relating to specifying memory consistency constraints. In some embodiments, an instruction may specify, for a memory operation, a type of memory consistency and a scope at which to enforce the type of consistency. For example, these fields may specify whether to sequence memory accesses relative to the operation at one or more of multiple different cache levels based on the type of memory consistency and the scope.
    Type: Grant
    Filed: January 15, 2021
    Date of Patent: August 30, 2022
    Assignee: Apple Inc.
    Inventors: Terence M. Potter, Richard W. Schreyer, James J. Ding, Alexander K. Kan, Michael Imbrogno
  • Patent number: 11360780
    Abstract: Techniques are disclosed relating to context switching in a SIMD processor. In some embodiments, an apparatus includes pipeline circuitry configured to execute graphics instructions included in threads of a group of single-instruction multiple-data (SIMD) threads in a thread group. In some embodiments, context switch circuitry is configured to atomically: save, for the SIMD group, a program counter and information that indicates whether threads in the SIMD group are active using one or more context switch registers, set all threads to an active state for the SIMD group, and branch to handler code for the SIMD group. In some embodiments, the pipeline circuitry is configured to execute the handler code to save context information for the SIMD group and subsequently execute threads of another thread group. Disclosed techniques may allow instruction-level context switching even when some SIMD threads are non-active.
    Type: Grant
    Filed: January 22, 2020
    Date of Patent: June 14, 2022
    Assignee: Apple Inc.
    Inventors: Benjiman L. Goodman, Terence M. Potter, Anjana Rajendran, Jeffrey T. Brady, Brian K. Reynolds, Jeffrey A. Lohman
  • Publication number: 20220148249
    Abstract: Disclosed techniques relate to memory space management for graphics processing. In some embodiments, first and second graphics cores are configured to execute instructions for multiple threadgroups. In some embodiments, the threads groups include a first threadgroup with multiple single-instruction multiple-data (SIMD) groups configured to execute a first shader program and a second threadgroup with multiple SIMD groups configured to execute a second, different shader program. Control circuitry may be configured to provide access to data stored in memory circuitry according to a shader memory space. The shader memory space may be accessible to threadgroups executed by the first graphics shader core, including the first and second threadgroups, but is not accessible to threadgroups executed by the second graphics shader core. Disclosed techniques may reduce latency, increase bandwidth available to the shader, reduce coherency cost, or any combination thereof.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 12, 2022
    Inventors: Terence M. Potter, Yoong Chert Foo, Ali Rabbani Rankouhi, Justin A. Hensley, Jonathan M. Redshaw
  • Patent number: 11294672
    Abstract: Techniques are disclosed relating to routing circuitry configured to perform permute operations for operands of threads in a single-instruction multiple-data group. In some embodiments, an apparatus includes hierarchical operand routing circuitry configured to route operands between a set of single-instruction multiple-data (SIMD) pipelines based on a permute instruction. In some embodiments, the routing circuitry includes a first level and a second level. The first level may include a set of multiple crossbar circuits each configured to receive operands from a respective subset of the pipelines and output one or more of the received operands on multiple output lines based on the permute instruction, where the crossbar circuits support full permutation within a respective subset.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: April 5, 2022
    Assignee: Apple Inc.
    Inventors: Robert D. Kenney, Liang-Kai Wang, Terence M. Potter
  • Patent number: 11256518
    Abstract: Techniques are disclosed relating to sharing operands among SIMD threads for a larger arithmetic operation. In some embodiments, a set of multiple hardware pipelines is configured to execute single-instruction multiple-data (SIMD) instructions for multiple threads in parallel, where ones of the hardware pipelines include execution circuitry configured to perform floating-point operations using one or more pipeline stages of the pipeline and first routing circuitry configured to select, from among thread-specific operands stored for the hardware pipeline and from one or more other pipelines in the set, a first input operand for an operation by the execution circuitry. In some embodiments, a device is configured to perform a mathematical operation on source input data structures stored across thread-specific storage for the set of hardware pipelines, by executing multiple SIMD floating-point operations using the execution circuitry and the first routing circuitry.
    Type: Grant
    Filed: October 9, 2019
    Date of Patent: February 22, 2022
    Assignee: Apple Inc.
    Inventors: Liang-Kai Wang, Robert D. Kenney, Terence M. Potter, Vinod Reddy Nalamalapu, Sivayya V. Ayinala
  • Publication number: 20220050790
    Abstract: Techniques are disclosed relating to private memory management using a mapping thread, which may be persistent. In some embodiments, a graphics processor is configured to generate a pool of private memory pages for a set of graphics work that includes multiple threads. The processor may maintain a translation table configured to map private memory addresses to virtual addresses based on identifiers of the threads. The processor may execute a mapping thread to receive a request to allocate a private memory page for a requesting thread, select a private memory page from the pool in response to the request, and map the selected page in the translation table for the requesting. The processor may then execute one or more instructions of the requesting thread to access a private memory space, wherein the execution includes translation of a private memory address to a virtual address based on the mapped page in the translation table.
    Type: Application
    Filed: August 17, 2020
    Publication date: February 17, 2022
    Inventors: Benjiman L. Goodman, Terence M. Potter, Anjana Rajendran, Mark I. Luffel, William V. Miller
  • Publication number: 20210406031
    Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.
    Type: Application
    Filed: September 9, 2021
    Publication date: December 30, 2021
    Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
  • Patent number: 11204774
    Abstract: Techniques are disclosed relating to a thread-group-scoped gate instruction. In some embodiments, graphics processor circuitry is configured to execute, for multiple SIMD groups of a thread group, a graphics program that includes a gate instruction. During execution of the gate instruction for a first SIMD group, the processor accesses state information to determine that a threshold number of other SIMD groups in the thread group have not yet executed the gate instruction. Based on the determination, the processor executes a particular set of instructions of the graphics program for the first SIMD group (that is not executed by one or more other SIMD groups that reach the gate instruction after the first SIMD group). For example, the particular set of instructions may be a utility program that performs one or more operations for the entire thread group but is only executed by a subset of the SIMD groups.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: December 21, 2021
    Assignee: Apple Inc.
    Inventors: Benjiman L. Goodman, Anjana Rajendran, Jeffrey A. Lohman, Terence M. Potter