Patents by Inventor Terence M. Potter

Terence M. Potter has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Graphics Hardware Driven Pause for Quality of Service Adjustment

Publication number: 20200104180

Abstract: In general, embodiments are disclosed for tracking and allocating graphics processor hardware resources. More particularly, a graphics hardware resource allocation system is able to generate a priority list for a plurality of data masters for a graphics processor based on a comparison between current utilizations for the data masters and target utilizations for the data masters. The graphics hardware resource allocation system designates, based on the priority list, a first data master with a higher priority to submit work to the graphics processor compared to a second data master. The graphics hardware resource allocation system determines a stall counter value for the data master and generates a notification to pause work for the second data master based on the stall counter value.

Type: Application

Filed: September 28, 2018

Publication date: April 2, 2020

Inventors: Kutty Banerjee, Benjamin Bowman, Terence M. Potter, Tatsuya Iwamoto, Gokhan Avkarogullari
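
The ranking-and-pause flow in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the claimed implementation: the `DataMaster` class, the data master names, the "furthest below target ranks first" rule, the meaning of `stall_count`, and the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DataMaster:
    name: str
    current_utilization: float  # share of the GPU used recently (assumed 0..1)
    target_utilization: float   # share the data master is meant to receive
    stall_count: int = 0        # hypothetical: intervals spent holding up others


def build_priority_list(masters):
    """Rank data masters so the one furthest below its target comes first."""
    return sorted(masters, key=lambda m: m.current_utilization - m.target_utilization)


def pause_notifications(masters, stall_threshold):
    """Return names of lower-priority masters whose stall counter hit the threshold."""
    ranked = build_priority_list(masters)
    # ranked[0] is the master designated to submit work; the rest may be paused.
    return [m.name for m in ranked[1:] if m.stall_count >= stall_threshold]


masters = [
    DataMaster("vertex", 0.50, 0.25, stall_count=12),
    DataMaster("pixel", 0.25, 0.50),
    DataMaster("compute", 0.25, 0.25, stall_count=3),
]

print([m.name for m in build_priority_list(masters)])  # ['pixel', 'compute', 'vertex']
print(pause_notifications(masters, stall_threshold=10))  # ['vertex']
```

Here the under-served "pixel" master is ranked first, and only "vertex", which is over its target and has a high stall counter, would receive a pause notification.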
Pipelined Allocation for Operand Cache

Publication number: 20200065104

Abstract: Techniques are disclosed relating to controlling an operand cache in a pipelined fashion. An operand cache may cache operands fetched from the register file or generated by previous instructions to improve performance and/or reduce power consumption. In some embodiments, instructions are pipelined and separate tag information is maintained to indicate allocation of an operand cache entry and ownership of the operand cache entry. In some embodiments, this may allow an operand to remain in the operand cache (and potentially be retrieved or modified) during an interval between allocation of the entry for another operand and ownership of the entry by the other operand. This may improve operand cache efficiency by allowing the entry to be used while retrieving the other operand from the register file, for example.

Type: Application

Filed: August 24, 2018

Publication date: February 27, 2020

Inventors: Robert D. Kenney, Terence M. Potter, Andrew M. Havlir, Sivayya V. Ayinala
GPU resource priorities based on hardware utilization

Patent number: 10503546

Abstract: In general, techniques are disclosed for tracking and allocating graphics processor hardware over specified periods of time. More particularly, hardware sensors may be used to determine the utilization of graphics processor hardware after each of a number of specified intervals (referred to as “sample intervals”). The utilization values so captured may be combined after a first number of sample intervals (the combined interval referred to as an “epoch interval”) and used to determine a normalized utilization of the graphics processor's hardware resources. Normalized epoch utilization values have been adjusted to account for resources used by concurrently executing processes. In some embodiments, a lower priority process that obtains and fails to release resources that should be allocated to one or more higher priority processes may be detected, paused, and its hardware resources given to the higher priority processes.

Type: Grant

Filed: June 6, 2017

Date of Patent: December 10, 2019

Assignee: Apple Inc.

Inventors: Tatsuya Iwamoto, Kutty Banerjee, Benjiman L. Goodman, Terence M. Potter
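
As a rough illustration of the sample-interval and epoch-interval idea in this abstract, the sketch below averages per-process samples over one epoch and then normalizes each process against everything running concurrently. The averaging rule, the normalization by total, the process names, and the `budgets` table are assumptions for the example; the abstract does not specify them.

```python
from statistics import mean


def epoch_utilization(samples):
    """Combine the utilization samples of one epoch interval (assumed: plain mean)."""
    return mean(samples)


def normalized_epoch_utilization(per_process_samples):
    """Express each process's epoch utilization as a share of all concurrent use."""
    epoch = {name: epoch_utilization(s) for name, s in per_process_samples.items()}
    total = sum(epoch.values())
    if total == 0:
        return {name: 0.0 for name in epoch}
    return {name: value / total for name, value in epoch.items()}


def over_budget(normalized, budgets):
    """Hypothetical check: processes using more than their allowed share."""
    return sorted(name for name, share in normalized.items() if share > budgets[name])


# Four sample intervals per epoch for two concurrently executing processes.
samples = {
    "background": [0.5, 1.0, 0.75, 0.75],
    "ui": [0.25, 0.25, 0.25, 0.25],
}
shares = normalized_epoch_utilization(samples)
print(shares)  # {'background': 0.75, 'ui': 0.25}
print(over_budget(shares, {"background": 0.5, "ui": 0.5}))  # ['background']
```

A scheduler built along these lines could treat the over-budget, lower-priority process as a candidate to pause so that its resources go to higher-priority work.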
Resource synchronization for graphics processing

Patent number: 10504270

Abstract: Techniques are disclosed relating to synchronizing access to pixel resources. Examples of pixel resources include color attachments, a stencil buffer, and a depth buffer. In some embodiments, hardware registers are used to track status of assigned pixel resources and pixel wait and pixel release instructions are used to synchronize access to the pixel resources. In some embodiments, other accesses to the pixel resources may occur out of program order. Relative to tracking and ordering pass groups, this weak ordering and explicit synchronization may improve performance and reduce power consumption. Disclosed techniques may also facilitate coordination between fragment rendering threads and auxiliary mid-render compute tasks.

Type: Grant

Filed: December 22, 2016

Date of Patent: December 10, 2019

Assignee: Apple Inc.

Inventors: Terence M. Potter, Richard W. Schreyer, James J. Ding, Alexander K. Kan, Michael Imbrogno
Local image blocks for graphics processing

Patent number: 10445852

Abstract: Techniques are disclosed relating to a hardware-supported flexible data structure for graphics processing. In some embodiments, dimensions of the data structure are configurable in an X direction, a Y direction, a number of samples per pixel, and an amount of data per sample. In some embodiments, these attributes are configurable using hardware registers. In some embodiments, the data structure is persistent across a tile being processed such that local memory context is accessible to both rendering threads of a render pass and mid-render compute threads.

Type: Grant

Filed: December 22, 2016

Date of Patent: October 15, 2019

Assignee: Apple Inc.

Inventors: Terence M. Potter, Robert Kenney, Aaftab A. Munshi, Justin A. Hensley, Richard W. Schreyer
Processing circuitry for encoded fields of related threads

Patent number: 10387119

Abstract: Techniques are disclosed relating to performing arithmetic operations to generate values for different related threads. In some embodiments, the threads are graphics threads and the values are operand locations. In some embodiments, an apparatus includes circuitry configured to generate results for multiple threads by performing a plurality of arithmetic operations indicated by an instruction. In some embodiments, the instruction specifies: an input value that is common to the multiple threads and, for at least one of the multiple threads, a type value that indicates whether to generate a result for the thread by performing an arithmetic operation based on a first input that is a result of an arithmetic operation from another thread of the multiple threads or to generate a result for the thread using the input value that is common to the multiple threads.

Type: Grant

Filed: September 28, 2018

Date of Patent: August 20, 2019

Assignee: Apple Inc.

Inventors: Liang-Kai Wang, Terence M. Potter, Brian K. Reynolds, Justin Friesenhahn
Clause chaining for clause-based instruction execution

Patent number: 10353711

Abstract: Techniques are disclosed relating to clause-based execution of program instructions, which may be single-instruction multiple data (SIMD) computer instructions. In some embodiments, an apparatus includes execution circuitry configured to receive clauses of instructions and SIMD groups of input data to be operated on by the clauses. In some embodiments, the apparatus further includes one or more storage elements configured to store state information for clauses processed by the execution circuitry. In some embodiments, the apparatus further includes scheduling circuitry configured to send instructions of a first clause and corresponding input data for execution by the execution circuitry and indicate, prior to sending instructions and input data of a second clause to the execution circuitry for execution, whether the second clause and the first clause are assigned to operate on groups of input data corresponding to the same instruction stream.

Type: Grant

Filed: September 6, 2016

Date of Patent: July 16, 2019

Assignee: Apple Inc.

Inventors: Andrew M. Havlir, Brian K. Reynolds, Liang Xia, Terence M. Potter
Providing instruction characteristics to graphics scheduling circuitry based on decoded instructions

Patent number: 10324726

Abstract: Techniques are disclosed relating to scheduling graphics instructions for execution on different types of execution units based on characteristics of decoded and cached graphics instructions. In some embodiments, a graphics unit includes multiple different types of execution units that are configured to execute different types of instructions (e.g., different units for datapath, sample, load/store, etc.). In some embodiments, the graphics unit stores decoded instructions in an instruction cache in at least one cache level, along with information specifying characteristics of the instructions. The characteristics may be stored at clause granularity and may indicate the type of instructions in each clause (e.g., corresponding to which type of execution unit is configured to execute the instructions).

Type: Grant

Filed: February 10, 2017

Date of Patent: June 18, 2019

Assignee: Apple Inc.

Inventors: Michael A. Geary, Brian K. Reynolds, Terence M. Potter
Memory consistency in graphics memory hierarchy with relaxed ordering

Patent number: 10324844

Abstract: Techniques are disclosed relating to memory consistency in a memory hierarchy with relaxed ordering. In some embodiments, an apparatus includes a first level cache that is shared by a plurality of shader processing elements and a second level cache that is shared by the shader processing elements and at least a texture processing unit. In some embodiments, the apparatus is configured to execute operations specified by graphics instructions that include (1) an attribute of the operation that specifies a type of memory consistency to be imposed for the operation and (2) scope information for the attribute that specifies whether the memory consistency specified by the attribute should be enforced at the first level cache or the second level cache. In some embodiments, the apparatus is configured to determine whether to sequence memory accesses at the first level cache and the second level cache based on the attribute and the scope.

Type: Grant

Filed: December 22, 2016

Date of Patent: June 18, 2019

Assignee: Apple Inc.

Inventors: Anthony P. DeLaurier, Owen C. Anderson, Michael J. Swift, Aaftab A. Munshi, Terence M. Potter
Floating-point multiply-add with down-conversion

Patent number: 10282169

Abstract: Techniques are disclosed relating to floating-point operations with down-conversion. In some embodiments, a floating-point unit is configured to perform fused multiply-addition operations based on first and second different instruction types. In some embodiments, the first instruction type specifies a result in a first floating-point format and the second instruction type specifies fused multiply-addition of input operands in the first floating-point format to generate a result in a second, lower-precision floating-point format. For example, the first format may be a 32-bit format and the second format may be a 16-bit format. In some embodiments, the floating-point unit includes rounding circuitry, exponent circuitry, and/or increment circuitry configured to generate signals for the second instruction type in the same pipeline stage as for the first instruction type. In some embodiments, disclosed techniques may reduce the number of pipeline stages included in the floating-point circuitry.

Type: Grant

Filed: April 6, 2016

Date of Patent: May 7, 2019

Assignee: Apple Inc.

Inventors: Liang-Kai Wang, Terence M. Potter, Andrew M. Havlir, Yu Sun, Nicolas X. Pena, Xiao-Long Wu, Christopher A. Burns
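
One numerical consequence of producing the 16-bit result directly, which the abstract itself does not discuss, is that the exact value of a*b + c is rounded once rather than twice. The sketch below emulates both paths in software using IEEE 754 binary32 and binary16 encodings from Python's `struct` module. The helper names and the input values are chosen for this illustration, and the code is a software model, not a description of the patented circuitry.

```python
import struct
from fractions import Fraction


def round_to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_to_f16(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Inputs chosen to be exactly representable in the 32-bit format.
a = 1.0 + 2.0**-15
b = 1.0 + 2.0**-15
c = 7.0 * 2.0**-14

# For these inputs the binary64 arithmetic below is exact (checked with Fraction),
# so `exact` stands in for the unrounded fused multiply-add result.
exact = a * b + c
assert Fraction(exact) == Fraction(a) * Fraction(b) + Fraction(c)

# Path 1: fused multiply-add that rounds once, straight to the 16-bit format.
fused_f16 = round_to_f16(exact)

# Path 2: fused multiply-add to the 32-bit format, then a separate down-conversion.
two_step_f16 = round_to_f16(round_to_f32(exact))

print(fused_f16)     # 1.0009765625
print(two_step_f16)  # 1.0
```

The exact value is 1 + 2^-11 + 2^-30, just above the midpoint between two adjacent 16-bit values. Rounding to 32 bits first discards the 2^-30 term and leaves an exact tie, which then rounds to even (1.0), whereas a single rounding lands on 1.0009765625.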
Mid-render compute for graphics processing

Patent number: 10223822

Abstract: Techniques are disclosed relating to performing mid-render auxiliary compute tasks for graphics processing. In some embodiments, auxiliary compute tasks are performed during a render pass, using at least a portion of a memory context of the render pass, without accessing a shared memory during the render pass. Relative to flushing render data to shared memory to perform compute tasks, this may reduce memory accesses and/or cache thrashing, which may in turn increase performance and/or reduce power consumption.

Type: Grant

Filed: December 22, 2016

Date of Patent: March 5, 2019

Assignee: Apple Inc.

Inventors: Terence M. Potter, Ralph C. Taylor, Richard W. Schreyer, Aaftab A. Munshi, Justin A. Hensley
HARDWARE RESOURCE ALLOCATION SYSTEM

Publication number: 20190042312

Abstract: In various embodiments, a resource allocation management circuit may allocate a plurality of different types of hardware resources (e.g., different types of registers) to a plurality of threads. The different types of hardware resources may correspond to a plurality of hardware resource allocation circuits. The resource allocation management circuit may track allocation of the hardware resources to the threads using state identification values of the threads. In response to determining that fewer than a respective requested number of one or more types of the hardware resources are available, the resource allocation management circuit may identify one or more threads for deallocation. As a result, the hardware resource allocation system may allocate hardware resources to threads more efficiently (e.g.

Type: Application

Filed: August 4, 2017

Publication date: February 7, 2019

Inventors: Mark D. Earl, Dimitri Tan, Christopher L. Spencer, Jeffrey T. Brady, Ralph C. Taylor, Terence M. Potter
Processing Circuitry for Encoded Fields of Related Threads

Publication number: 20190034166

Abstract: Techniques are disclosed relating to performing arithmetic operations to generate values for different related threads. In some embodiments, the threads are graphics threads and the values are operand locations. In some embodiments, an apparatus includes circuitry configured to generate results for multiple threads by performing a plurality of arithmetic operations indicated by an instruction. In some embodiments, the instruction specifies: an input value that is common to the multiple threads and, for at least one of the multiple threads, a type value that indicates whether to generate a result for the thread by performing an arithmetic operation based on a first input that is a result of an arithmetic operation from another thread of the multiple threads or to generate a result for the thread using the input value that is common to the multiple threads.

Type: Application

Filed: September 28, 2018

Publication date: January 31, 2019

Inventors: Liang-Kai Wang, Terence M. Potter, Brian K. Reynolds, Justin Friesenhahn
GPU Resource Tracking

Publication number: 20180349146

Abstract: In general, techniques are disclosed for tracking and allocating graphics processor hardware over specified periods of time. More particularly, hardware sensors may be used to determine the utilization of graphics processor hardware after each of a number of specified intervals (referred to as “sample intervals”). The utilization values so captured may be combined after a first number of sample intervals (the combined interval referred to as an “epoch interval”) and used to determine a normalized utilization of the graphics processor's hardware resources. Normalized epoch utilization values have been adjusted to account for resources used by concurrently executing processes. In some embodiments, a lower priority process that obtains and fails to release resources that should be allocated to one or more higher priority processes may be detected, paused, and its hardware resources given to the higher priority processes.

Type: Application

Filed: June 6, 2017

Publication date: December 6, 2018

Inventors: Tatsuya Iwamoto, Kutty Banerjee, Benjiman L. Goodman, Terence M. Potter
Parallel processing circuitry for encoded fields of related threads

Patent number: 10089077

Abstract: Techniques are disclosed relating to performing arithmetic operations to generate values for different related threads. In some embodiments, the threads are graphics threads and the values are operand locations. In some embodiments, an apparatus performs an arithmetic operation using first circuitry, on type value inputs for different threads that are encoded to represent values to be operated on by the first circuitry. In some embodiments, second arithmetic circuitry is configured to perform an arithmetic operation on an output of the first circuitry and an input (e.g., address information such as a base and an offset) that is common to the different threads and has a greater number of bits than the output of the first circuitry. In various embodiments, disclosed techniques may allow decoding of encoded values for different threads (which may reduce memory requirements relative to non-encoded values) with a shorter critical path and lower power consumption, e.g., relative to sequential decoding.

Type: Grant

Filed: January 10, 2017

Date of Patent: October 2, 2018

Assignee: Apple Inc.

Inventors: Liang-Kai Wang, Terence M. Potter, Brian K. Reynolds, Justin Friesenhahn
Local Image Blocks for Graphics Processing

Publication number: 20180182058

Abstract: Techniques are disclosed relating to a hardware-supported flexible data structure for graphics processing. In some embodiments, dimensions of the data structure are configurable in an X direction, a Y direction, a number of samples per pixel, and an amount of data per sample. In some embodiments, these attributes are configurable using hardware registers. In some embodiments, the data structure is persistent across a tile being processed such that local memory context is accessible to both rendering threads of a render pass and mid-render compute threads.

Type: Application

Filed: December 22, 2016

Publication date: June 28, 2018

Inventors: Terence M. Potter, Robert Kenney, Aaftab A. Munshi, Justin A. Hensley, Richard W. Schreyer
Resource Synchronization for Graphics Processing

Publication number: 20180182154

Abstract: Techniques are disclosed relating to synchronizing access to pixel resources. Examples of pixel resources include color attachments, a stencil buffer, and a depth buffer. In some embodiments, hardware registers are used to track status of assigned pixel resources and pixel wait and pixel release instructions are used to synchronize access to the pixel resources. In some embodiments, other accesses to the pixel resources may occur out of program order. Relative to tracking and ordering pass groups, this weak ordering and explicit synchronization may improve performance and reduce power consumption. Disclosed techniques may also facilitate coordination between fragment rendering threads and auxiliary mid-render compute tasks.

Type: Application

Filed: December 22, 2016

Publication date: June 28, 2018

Inventors: Terence M. Potter, Richard W. Schreyer, James J. Ding, Alexander K. Kan, Michael Imbrogno
Memory Consistency in Graphics Memory Hierarchy with Relaxed Ordering

Publication number: 20180181489

Abstract: Techniques are disclosed relating to memory consistency in a memory hierarchy with relaxed ordering. In some embodiments, an apparatus includes a first level cache that is shared by a plurality of shader processing elements and a second level cache that is shared by the shader processing elements and at least a texture processing unit. In some embodiments, the apparatus is configured to execute operations specified by graphics instructions that include (1) an attribute of the operation that specifies a type of memory consistency to be imposed for the operation and (2) scope information for the attribute that specifies whether the memory consistency specified by the attribute should be enforced at the first level cache or the second level cache. In some embodiments, the apparatus is configured to determine whether to sequence memory accesses at the first level cache and the second level cache based on the attribute and the scope.

Type: Application

Filed: December 22, 2016

Publication date: June 28, 2018

Inventors: Anthony P. DeLaurier, Owen C. Anderson, Michael J. Swift, Aaftab A. Munshi, Terence M. Potter
Mid-Render Compute for Graphics Processing

Publication number: 20180182153

Abstract: Techniques are disclosed relating to performing mid-render auxiliary compute tasks for graphics processing. In some embodiments, auxiliary compute tasks are performed during a render pass, using at least a portion of a memory context of the render pass, without accessing a shared memory during the render pass. Relative to flushing render data to shared memory to perform compute tasks, this may reduce memory accesses and/or cache thrashing, which may in turn increase performance and/or reduce power consumption.

Type: Application

Filed: December 22, 2016

Publication date: June 28, 2018

Inventors: Terence M. Potter, Ralph C. Taylor, Richard W. Schreyer, Aaftab A. Munshi, Justin A. Hensley
PROCESSING CIRCUIT HARDWARE RESOURCE ALLOCATION SYSTEM

Publication number: 20180173560

Abstract: In various embodiments, hardware resources of a processing circuit may be allocated to a plurality of processes based on priorities of the processes. A hardware resource utilization sensor may detect a current utilization of the hardware resources by a process. A utilization accumulation circuit may determine a utilization of the hardware resources by the process over a particular amount of time. A target utilization of the hardware resources for the process may be determined based on the utilization of the hardware resources over the particular amount of time. A comparator circuit may compare the current utilization to the target utilization. A process priority adjustment circuit may adjust a priority of the process based on the comparison. Based on the adjusted priority, a different amount of hardware resources may be allocated to the processes.

Type: Application

Filed: December 21, 2016

Publication date: June 21, 2018

Inventors: Gokhan Avkarogullari, Terence M. Potter, Benjiman L. Goodman, Ralph C. Taylor, Kutty Banerjee
