Patents by Inventor Terence M. Potter

Terence M. Potter has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Cache footprint management

Patent number: 11947462

Abstract: Techniques are disclosed relating to cache footprint management. In some embodiments, execution circuitry is configured to perform operations for instructions from multiple threads in parallel. Cache circuitry may store information operated on by threads executed by the execution circuitry. Scheduling circuitry may arbitrate among threads to schedule threads for execution by the execution circuitry. Tracking circuitry may determine one or more performance metrics for the cache circuitry. Control circuitry may, based on the one or more performance metrics meeting a threshold, reduce a limit on a number of threads considered for arbitration by the scheduling circuitry, to control a footprint of information stored by the cache circuitry. Disclosed techniques may advantageously reduce or avoid cache thrashing for certain processor workloads.

Type: Grant

Filed: March 3, 2022

Date of Patent: April 2, 2024

Assignee: Apple Inc.

Inventors: Yoong Chert Foo, Terence M. Potter, Donald R. DeSota, Benjiman L. Goodman, Aroun Demeure, Cheng Li, Winnie W. Yeung
Tiled processor communication fabric

Patent number: 11941742

Abstract: Techniques are disclosed relating to processor communications fabrics. In some embodiments, a processor includes multiple client circuitry and fabric circuitry that includes at least first and second instances of a tile. The tile may include: client inputs configured to interface with client circuits, tile inputs configured to interface with one or more other tile instances, and communication resources assignable to the client inputs and tile inputs. The communications resources may include: multiple internal links, client outputs configured to interface with client circuits, and tile outputs configured to interface with one or more other tile instances. Control circuitry may, in a given cycle, assign communication resources of a given tile instance to at least a portion of the client inputs and tile inputs for a next cycle, based on priority information. The control circuitry may update priority information based on assignment results over multiple cycles.

Type: Grant

Filed: June 23, 2022

Date of Patent: March 26, 2024

Assignee: Apple Inc.

Inventors: Adam J. Smith, Sergio V. Tota, Christopher G. Martin, Yoong Chert Foo, Terence M. Potter, Max J. Batley
Multi-stage Thread Scheduling

Publication number: 20240095065

Abstract: Techniques are disclosed relating to multi-stage thread scheduling. In some embodiments, processor circuitry includes multiple channel pipelines for multiple channels and multiple execution pipelines shared by the channel pipelines and configured to perform different types of operations provided by the channel pipelines. First scheduler circuitry may arbitrate among threads to assign threads to channels. Second scheduler circuitry may arbitrate among channels to assign an operation from a given channel to a given execution pipeline. The execution pipelines may provide backpressure information to the first scheduler circuitry based on execution status and the first scheduler circuitry may adjust priority of a thread for assignment to a channel based on the backpressure information. Disclosed techniques may reduce channel conflicts and starvation for execution resources.

Type: Application

Filed: November 10, 2022

Publication date: March 21, 2024

Inventors: Benjiman L. Goodman, Anjana Rajendran, Sheenam Jayaswal, Terence M. Potter, Yoong Chert Foo
Preemption Techniques for Memory-Backed Registers

Publication number: 20240095176

Abstract: Techniques are disclosed relating to thread preemption in the context of memory-backed registers. In some embodiments, a memory hierarchy includes one or more cache levels and one or more memory circuits. Execution circuitry may operate on operands in architectural registers to execute instructions of threads, where data for the architectural registers is stored and backed by the memory hierarchy. Control circuitry may, in response to a context switch indication for a given thread: flush and invalidate a set of architectural register data from a first cache level and store memory page information (e.g., a page catalog base address) associated with the set of architectural register data.

Type: Application

Filed: November 10, 2022

Publication date: March 21, 2024

Inventors: Benjiman L. Goodman, Yoong Chert Foo, Karl D. Mann, Terence M. Potter, Frank W. Liljeros, Jeffrey T. Brady
On-demand Memory Allocation

Publication number: 20240045808

Abstract: Techniques are disclosed relating to dynamically allocating and mapping private memory for requesting circuitry. Disclosed circuitry may receive a private address and translate the private address to a virtual address (which an MMU may then translate to physical address to actually access a storage element). In some embodiments, private memory allocation circuitry is configured to generate page table information and map private memory pages for requests if the page table information is not already setup. In various embodiments, this may advantageously allow dynamic private memory allocation, e.g., to efficiently allocate memory for graphics shaders with different types of workloads. Disclosed caching techniques for page table information may improve performance relative to traditional techniques. Further, disclosed embodiments may facilitate memory consolidation across a device such as a graphics processor.

Type: Application

Filed: October 19, 2023

Publication date: February 8, 2024

Inventors: Justin A. Hensley, Karl D. Mann, Yoong Chert Foo, Terence M. Potter, Frank W. Liljeros, Ralph C. Taylor
Tiled Processor Communication Fabric

Publication number: 20230419585

Abstract: Techniques are disclosed relating to processor communications fabrics. In some embodiments, a processor includes multiple client circuitry and fabric circuitry that includes at least first and second instances of a tile. The tile may include: client inputs configured to interface with client circuits, tile inputs configured to interface with one or more other tile instances, and communication resources assignable to the client inputs and tile inputs. The communications resources may include: multiple internal links, client outputs configured to interface with client circuits, and tile outputs configured to interface with one or more other tile instances. Control circuitry may, in a given cycle, assign communication resources of a given tile instance to at least a portion of the client inputs and tile inputs for a next cycle, based on priority information. The control circuitry may update priority information based on assignment results over multiple cycles.

Type: Application

Filed: June 23, 2022

Publication date: December 28, 2023

Inventors: Adam J. Smith, Sergio V. Tota, Christopher G. Martin, Yoong Chert Foo, Terence M. Potter, Max J. Batley
Private Memory Management using Utility Thread

Publication number: 20230385201

Abstract: Techniques are disclosed relating to private memory management using a mapping thread, which may be persistent. In some embodiments, a graphics processor is configured to generate a pool of private memory pages for a set of graphics work that includes multiple threads. The processor may maintain a translation table configured to map private memory addresses to virtual addresses based on identifiers of the threads. The processor may execute a mapping thread to receive a request to allocate a private memory page for a requesting thread, select a private memory page from the pool in response to the request, and map the selected page in the translation table for the requesting. The processor may then execute one or more instructions of the requesting thread to access a private memory space, wherein the execution includes translation of a private memory address to a virtual address based on the mapped page in the translation table.

Type: Application

Filed: July 31, 2023

Publication date: November 30, 2023

Inventors: Benjiman L. Goodman, Terence M. Potter, Anjana Rajendran, Mark I. Luffel, William V. Miller
On-demand memory allocation

Patent number: 11829298

Abstract: Techniques are disclosed relating to dynamically allocating and mapping private memory for requesting circuitry. Disclosed circuitry may receive a private address and translate the private address to a virtual address (which an MMU may then translate to physical address to actually access a storage element). In some embodiments, private memory allocation circuitry is configured to generate page table information and map private memory pages for requests if the page table information is not already setup. In various embodiments, this may advantageously allow dynamic private memory allocation, e.g., to efficiently allocate memory for graphics shaders with different types of workloads. Disclosed caching techniques for page table information may improve performance relative to traditional techniques. Further, disclosed embodiments may facilitate memory consolidation across a device such as a graphics processor.

Type: Grant

Filed: February 28, 2020

Date of Patent: November 28, 2023

Assignee: Apple Inc.

Inventors: Justin A. Hensley, Karl D. Mann, Yoong Chert Foo, Terence M. Potter, Frank W. Liljeros, Ralph C. Taylor
SIMD Operand Permutation with Selection from among Multiple Registers

Publication number: 20230325196

Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.

Type: Application

Filed: April 12, 2023

Publication date: October 12, 2023

Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
Private memory management using utility thread

Patent number: 11714759

Abstract: Techniques are disclosed relating to private memory management using a mapping thread, which may be persistent. In some embodiments, a graphics processor is configured to generate a pool of private memory pages for a set of graphics work that includes multiple threads. The processor may maintain a translation table configured to map private memory addresses to virtual addresses based on identifiers of the threads. The processor may execute a mapping thread to receive a request to allocate a private memory page for a requesting thread, select a private memory page from the pool in response to the request, and map the selected page in the translation table for the requesting. The processor may then execute one or more instructions of the requesting thread to access a private memory space, wherein the execution includes translation of a private memory address to a virtual address based on the mapped page in the translation table.

Type: Grant

Filed: August 17, 2020

Date of Patent: August 1, 2023

Assignee: Apple Inc.

Inventors: Benjiman L. Goodman, Terence M. Potter, Anjana Rajendran, Mark I. Luffel, William V. Miller
SIMD operand permutation with selection from among multiple registers

Patent number: 11645084

Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.

Type: Grant

Filed: September 9, 2021

Date of Patent: May 9, 2023

Assignee: Apple Inc.

Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
Graphics memory space for shader core

Patent number: 11521343

Abstract: Disclosed techniques relate to memory space management for graphics processing. In some embodiments, first and second graphics cores are configured to execute instructions for multiple threadgroups. In some embodiments, the threads groups include a first threadgroup with multiple single-instruction multiple-data (SIMD) groups configured to execute a first shader program and a second threadgroup with multiple SIMD groups configured to execute a second, different shader program. Control circuitry may be configured to provide access to data stored in memory circuitry according to a shader memory space. The shader memory space may be accessible to threadgroups executed by the first graphics shader core, including the first and second threadgroups, but is not accessible to threadgroups executed by the second graphics shader core. Disclosed techniques may reduce latency, increase bandwidth available to the shader, reduce coherency cost, or any combination thereof.

Type: Grant

Filed: November 24, 2020

Date of Patent: December 6, 2022

Assignee: Apple Inc.

Inventors: Terence M. Potter, Yoong Chert Foo, Ali Rabbani Rankouhi, Justin A. Hensley, Jonathan M. Redshaw
Memory consistency in memory hierarchy with relaxed ordering

Patent number: 11430174

Abstract: Techniques are disclosed relating to specifying memory consistency constraints. In some embodiments, an instruction may specify, for a memory operation, a type of memory consistency and a scope at which to enforce the type of consistency. For example, these fields may specify whether to sequence memory accesses relative to the operation at one or more of multiple different cache levels based on the type of memory consistency and the scope.

Type: Grant

Filed: January 15, 2021

Date of Patent: August 30, 2022

Assignee: Apple Inc.

Inventors: Terence M. Potter, Richard W. Schreyer, James J. Ding, Alexander K. Kan, Michael Imbrogno
Instruction-level context switch in SIMD processor

Patent number: 11360780

Abstract: Techniques are disclosed relating to context switching in a SIMD processor. In some embodiments, an apparatus includes pipeline circuitry configured to execute graphics instructions included in threads of a group of single-instruction multiple-data (SIMD) threads in a thread group. In some embodiments, context switch circuitry is configured to atomically: save, for the SIMD group, a program counter and information that indicates whether threads in the SIMD group are active using one or more context switch registers, set all threads to an active state for the SIMD group, and branch to handler code for the SIMD group. In some embodiments, the pipeline circuitry is configured to execute the handler code to save context information for the SIMD group and subsequently execute threads of another thread group. Disclosed techniques may allow instruction-level context switching even when some SIMD threads are non-active.

Type: Grant

Filed: January 22, 2020

Date of Patent: June 14, 2022

Assignee: Apple Inc.

Inventors: Benjiman L. Goodman, Terence M. Potter, Anjana Rajendran, Jeffrey T. Brady, Brian K. Reynolds, Jeffrey A. Lohman
Graphics Memory Space for Shader Core

Publication number: 20220148249

Abstract: Disclosed techniques relate to memory space management for graphics processing. In some embodiments, first and second graphics cores are configured to execute instructions for multiple threadgroups. In some embodiments, the threads groups include a first threadgroup with multiple single-instruction multiple-data (SIMD) groups configured to execute a first shader program and a second threadgroup with multiple SIMD groups configured to execute a second, different shader program. Control circuitry may be configured to provide access to data stored in memory circuitry according to a shader memory space. The shader memory space may be accessible to threadgroups executed by the first graphics shader core, including the first and second threadgroups, but is not accessible to threadgroups executed by the second graphics shader core. Disclosed techniques may reduce latency, increase bandwidth available to the shader, reduce coherency cost, or any combination thereof.

Type: Application

Filed: November 24, 2020

Publication date: May 12, 2022

Inventors: Terence M. Potter, Yoong Chert Foo, Ali Rabbani Rankouhi, Justin A. Hensley, Jonathan M. Redshaw
Routing circuitry for permutation of single-instruction multiple-data operands

Patent number: 11294672

Abstract: Techniques are disclosed relating to routing circuitry configured to perform permute operations for operands of threads in a single-instruction multiple-data group. In some embodiments, an apparatus includes hierarchical operand routing circuitry configured to route operands between a set of single-instruction multiple-data (SIMD) pipelines based on a permute instruction. In some embodiments, the routing circuitry includes a first level and a second level. The first level may include a set of multiple crossbar circuits each configured to receive operands from a respective subset of the pipelines and output one or more of the received operands on multiple output lines based on the permute instruction, where the crossbar circuits support full permutation within a respective subset.

Type: Grant

Filed: August 22, 2019

Date of Patent: April 5, 2022

Assignee: Apple Inc.

Inventors: Robert D. Kenney, Liang-Kai Wang, Terence M. Potter
Datapath circuitry for math operations using SIMD pipelines

Patent number: 11256518

Abstract: Techniques are disclosed relating to sharing operands among SIMD threads for a larger arithmetic operation. In some embodiments, a set of multiple hardware pipelines is configured to execute single-instruction multiple-data (SIMD) instructions for multiple threads in parallel, where ones of the hardware pipelines include execution circuitry configured to perform floating-point operations using one or more pipeline stages of the pipeline and first routing circuitry configured to select, from among thread-specific operands stored for the hardware pipeline and from one or more other pipelines in the set, a first input operand for an operation by the execution circuitry. In some embodiments, a device is configured to perform a mathematical operation on source input data structures stored across thread-specific storage for the set of hardware pipelines, by executing multiple SIMD floating-point operations using the execution circuitry and the first routing circuitry.

Type: Grant

Filed: October 9, 2019

Date of Patent: February 22, 2022

Assignee: Apple Inc.

Inventors: Liang-Kai Wang, Robert D. Kenney, Terence M. Potter, Vinod Reddy Nalamalapu, Sivayya V. Ayinala
Private Memory Management using Utility Thread

Publication number: 20220050790

Abstract: Techniques are disclosed relating to private memory management using a mapping thread, which may be persistent. In some embodiments, a graphics processor is configured to generate a pool of private memory pages for a set of graphics work that includes multiple threads. The processor may maintain a translation table configured to map private memory addresses to virtual addresses based on identifiers of the threads. The processor may execute a mapping thread to receive a request to allocate a private memory page for a requesting thread, select a private memory page from the pool in response to the request, and map the selected page in the translation table for the requesting. The processor may then execute one or more instructions of the requesting thread to access a private memory space, wherein the execution includes translation of a private memory address to a virtual address based on the mapped page in the translation table.

Type: Application

Filed: August 17, 2020

Publication date: February 17, 2022

Inventors: Benjiman L. Goodman, Terence M. Potter, Anjana Rajendran, Mark I. Luffel, William V. Miller
SIMD Operand Permutation with Selection from among Multiple Registers

Publication number: 20210406031

Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.

Type: Application

Filed: September 9, 2021

Publication date: December 30, 2021

Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
Thread-group-scoped gate instruction

Patent number: 11204774

Abstract: Techniques are disclosed relating to a thread-group-scoped gate instruction. In some embodiments, graphics processor circuitry is configured to execute, for multiple SIMD groups of a thread group, a graphics program that includes a gate instruction. During execution of the gate instruction for a first SIMD group, the processor accesses state information to determine that a threshold number of other SIMD groups in the thread group have not yet executed the gate instruction. Based on the determination, the processor executes a particular set of instructions of the graphics program for the first SIMD group (that is not executed by one or more other SIMD groups that reach the gate instruction after the first SIMD group). For example, the particular set of instructions may be a utility program that performs one or more operations for the entire thread group but is only executed by a subset of the SIMD groups.

Type: Grant

Filed: August 31, 2020

Date of Patent: December 21, 2021

Assignee: Apple Inc.

Inventors: Benjiman L. Goodman, Anjana Rajendran, Jeffrey A. Lohman, Terence M. Potter

1 2 3 4 5 … next