Patents by Inventor Robert D. Kenney

Robert D. Kenney has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SIMD operand permutation with selection from among multiple registers

Patent number: 12008377

Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.

Type: Grant

Filed: April 12, 2023

Date of Patent: June 11, 2024

Assignee: Apple Inc.

Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
SIMD Operand Permutation with Selection from among Multiple Registers

Publication number: 20230325196

Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.

Type: Application

Filed: April 12, 2023

Publication date: October 12, 2023

Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
SIMD operand permutation with selection from among multiple registers

Patent number: 11645084

Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.

Type: Grant

Filed: September 9, 2021

Date of Patent: May 9, 2023

Assignee: Apple Inc.

Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
Multi-channel data path circuitry

Patent number: 11422822

Abstract: Techniques are disclosed relating to sharing datapath circuitry among multiple SIMD groups. In some embodiments, pipeline circuitry is configured to perform operations specified by instructions of first and second assigned SIMD groups. The pipeline circuitry may include first and second front-end circuitry configured to decode instructions of the respective SIMD groups. The pipeline circuitry may include shared execution circuitry configured to perform operations specified by the first and second assigned SIMD groups and arbitration circuitry configured to select an instruction from among at least the first and second front-end circuitry for assignment to the shared execution circuitry in a current cycle. The arbitration circuitry may select an instruction based on one or more of: stall counts, whether available instructions are being speculatively executed, whether ones of available instructions target a particular portion of the shared execution circuitry, numbers of execution cycles, and SIMD group ages.

Type: Grant

Filed: May 8, 2020

Date of Patent: August 23, 2022

Assignee: Apple Inc.

Inventors: Robert D. Kenney, Jason N. Dale
Routing circuitry for permutation of single-instruction multiple-data operands

Patent number: 11294672

Abstract: Techniques are disclosed relating to routing circuitry configured to perform permute operations for operands of threads in a single-instruction multiple-data group. In some embodiments, an apparatus includes hierarchical operand routing circuitry configured to route operands between a set of single-instruction multiple-data (SIMD) pipelines based on a permute instruction. In some embodiments, the routing circuitry includes a first level and a second level. The first level may include a set of multiple crossbar circuits each configured to receive operands from a respective subset of the pipelines and output one or more of the received operands on multiple output lines based on the permute instruction, where the crossbar circuits support full permutation within a respective subset.

Type: Grant

Filed: August 22, 2019

Date of Patent: April 5, 2022

Assignee: Apple Inc.

Inventors: Robert D. Kenney, Liang-Kai Wang, Terence M. Potter
Datapath circuitry for math operations using SIMD pipelines

Patent number: 11256518

Abstract: Techniques are disclosed relating to sharing operands among SIMD threads for a larger arithmetic operation. In some embodiments, a set of multiple hardware pipelines is configured to execute single-instruction multiple-data (SIMD) instructions for multiple threads in parallel, where ones of the hardware pipelines include execution circuitry configured to perform floating-point operations using one or more pipeline stages of the pipeline and first routing circuitry configured to select, from among thread-specific operands stored for the hardware pipeline and from one or more other pipelines in the set, a first input operand for an operation by the execution circuitry. In some embodiments, a device is configured to perform a mathematical operation on source input data structures stored across thread-specific storage for the set of hardware pipelines, by executing multiple SIMD floating-point operations using the execution circuitry and the first routing circuitry.

Type: Grant

Filed: October 9, 2019

Date of Patent: February 22, 2022

Assignee: Apple Inc.

Inventors: Liang-Kai Wang, Robert D. Kenney, Terence M. Potter, Vinod Reddy Nalamalapu, Sivayya V. Ayinala
SIMD Operand Permutation with Selection from among Multiple Registers

Publication number: 20210406031

Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.

Type: Application

Filed: September 9, 2021

Publication date: December 30, 2021

Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
Multi-channel Data Path Circuitry

Publication number: 20210349725

Abstract: Techniques are disclosed relating to sharing datapath circuitry among multiple SIMD groups. In some embodiments, pipeline circuitry is configured to perform operations specified by instructions of first and second assigned SIMD groups. The pipeline circuitry may include first and second front-end circuitry configured to decode instructions of the respective SIMD groups. The pipeline circuitry may include shared execution circuitry configured to perform operations specified by the first and second assigned SIMD groups and arbitration circuitry configured to select an instruction from among at least the first and second front-end circuitry for assignment to the shared execution circuitry in a current cycle. The arbitration circuitry may select an instruction based on one or more of: stall counts, whether available instructions are being speculatively executed, whether ones of available instructions target a particular portion of the shared execution circuitry, numbers of execution cycles, and SIMD group ages.

Type: Application

Filed: May 8, 2020

Publication date: November 11, 2021

Inventors: Robert D. Kenney, Jason N. Dale
SIMD operand permutation with selection from among multiple registers

Patent number: 11126439

Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.

Type: Grant

Filed: November 15, 2019

Date of Patent: September 21, 2021

Assignee: Apple Inc.

Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
Register file arbitration

Patent number: 11080055

Abstract: Techniques are disclosed relating to arbitration among register file accesses. In some embodiments, an apparatus includes a register file configured to store operands for multiple client circuits and arbitration circuitry configured to select from among multiple received requests to access the register file. In some embodiments, the apparatus includes first interface circuitry configured to provide access requests from a first client circuit to the arbitration circuitry and supplemental interface circuitry configured to receive unsuccessful requests from the first client circuit and provide the received unsuccessful requests to the arbitration circuitry. The supplemental interface circuitry may provide additional catch-up bandwidth to clients that lose arbitration, which may result in fairness during bandwidth shortages.

Type: Grant

Filed: August 22, 2019

Date of Patent: August 3, 2021

Assignee: Apple Inc.

Inventors: Robert D. Kenney, Terence M. Potter
SIMD Operand Permutation with Selection from among Multiple Registers

Publication number: 20210149679

Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.

Type: Application

Filed: November 15, 2019

Publication date: May 20, 2021

Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
Datapath Circuitry for Math Operations using SIMD Pipelines

Publication number: 20210109761

Abstract: Techniques are disclosed relating to sharing operands among SIMD threads for a larger arithmetic operation. In some embodiments, a set of multiple hardware pipelines is configured to execute single-instruction multiple-data (SIMD) instructions for multiple threads in parallel, where ones of the hardware pipelines include execution circuitry configured to perform floating-point operations using one or more pipeline stages of the pipeline and first routing circuitry configured to select, from among thread-specific operands stored for the hardware pipeline and from one or more other pipelines in the set, a first input operand for an operation by the execution circuitry. In some embodiments, a device is configured to perform a mathematical operation on source input data structures stored across thread-specific storage for the set of hardware pipelines, by executing multiple SIMD floating-point operations using the execution circuitry and the first routing circuitry.

Type: Application

Filed: October 9, 2019

Publication date: April 15, 2021

Inventors: Liang-Kai Wang, Robert D. Kenney, Terence M. Potter, Vinod Reddy Nalamalapu, Sivayya V. Ayinala
Register File Arbitration

Publication number: 20210055929

Abstract: Techniques are disclosed relating to arbitration among register file accesses. In some embodiments, an apparatus includes a register file configured to store operands for multiple client circuits and arbitration circuitry configured to select from among multiple received requests to access the register file. In some embodiments, the apparatus includes first interface circuitry configured to provide access requests from a first client circuit to the arbitration circuitry and supplemental interface circuitry configured to receive unsuccessful requests from the first client circuit and provide the received unsuccessful requests to the arbitration circuitry. The supplemental interface circuitry may provide additional catch-up bandwidth to clients that lose arbitration, which may result in fairness during bandwidth shortages.

Type: Application

Filed: August 22, 2019

Publication date: February 25, 2021

Inventors: Robert D. Kenney, Terence M. Potter
Routing Circuitry for Permutation of Single-Instruction Multiple-Data Operands

Publication number: 20210055931

Abstract: Techniques are disclosed relating to routing circuitry configured to perform permute operations for operands of threads in a single-instruction multiple-data group. In some embodiments, an apparatus includes hierarchical operand routing circuitry configured to route operands between a set of single-instruction multiple-data (SIMD) pipelines based on a permute instruction. In some embodiments, the routing circuitry includes a first level and a second level. The first level may include a set of multiple crossbar circuits each configured to receive operands from a respective subset of the pipelines and output one or more of the received operands on multiple output lines based on the permute instruction, where the crossbar circuits support full permutation within a respective subset.

Type: Application

Filed: August 22, 2019

Publication date: February 25, 2021

Inventors: Robert D. Kenney, Liang-Kai Wang, Terence M. Potter
GPU task scheduling

Patent number: 10902545

Abstract: Techniques are disclosed relating to scheduling tasks for graphics processing. In one embodiment, a graphics unit is configured to render a frame of graphics data using a plurality of pass groups and the frame of graphics data includes a plurality of frame portions. In this embodiment, the graphics unit includes scheduling circuitry configured to receive a plurality of tasks, maintain pass group information for each of the plurality of tasks, and maintain relative age information for the plurality of frame portions. In this embodiment, the scheduling circuitry is configured to select a task for execution based on the pass group information and the age information. In some embodiments, the scheduling circuitry is configured to select tasks from an oldest frame portion and current pass group before selecting other tasks. This scheduling approach may result in efficient execution of various different types of graphics workloads.

Type: Grant

Filed: December 17, 2014

Date of Patent: January 26, 2021

Assignee: Apple Inc.

Inventors: Robert D. Kenney, Benjiman L. Goodman, Terence M. Potter
Data alignment and formatting for graphics processing unit

Patent number: 10769746

Abstract: A data queuing and format apparatus is disclosed. A first selection circuit may be configured to selectively couple a first subset of data to a first plurality of data lines dependent upon control information, and a second selection circuit may be configured to selectively couple a second subset of data to a second plurality of data lines dependent upon the control information. A storage array may include multiple storage units, and each storage unit may be configured to receive data from one or more data lines of either the first or second plurality of data lines dependent upon the control information.

Type: Grant

Filed: September 25, 2014

Date of Patent: September 8, 2020

Assignee: Apple Inc.

Inventors: Liang Xia, Robert D. Kenney, Benjiman L. Goodman, Terence M. Potter
Techniques for ALU sharing between threads

Patent number: 10699366

Abstract: Techniques are disclosed relating to sharing an arithmetic logic unit (ALU) between multiple threads. In some embodiments, the threads also have dedicated ALUs for other types of operations. In some embodiments, arbitration circuitry is configured to receive operations to be performed by the shared arithmetic logic unit from the set of threads and issue the received operations to the shared arithmetic logic unit. In some embodiments, the arbitration circuitry is configured to switch to a different one of the set of threads for each instruction issued to the shared arithmetic logic unit. In some embodiments, the shared ALU is configured to perform 32-bit operations and the dedicated ALUs are configured to perform the same operations using 16-bit precision. In some embodiments, the shared ALU is shared between two threads and is physically located adjacent to other datapath circuitry for the two threads.

Type: Grant

Filed: August 7, 2018

Date of Patent: June 30, 2020

Assignee: Apple Inc.

Inventor: Robert D. Kenney
Pipelined allocation for operand cache

Patent number: 10678548

Abstract: Techniques are disclosed relating to controlling an operand cache in a pipelined fashion. An operand cache may cache operands fetched from the register file or generated by previous instructions to improve performance and/or reduce power consumption. In some embodiments, instructions are pipelined and separate tag information is maintained to indicate allocation of an operand cache entry and ownership of the operand cache entry. In some embodiments, this may allow an operand to remain in the operand cache (and potentially be retrieved or modified) during an interval between allocation of the entry for another operand and ownership of the entry by the other operand. This may improve operand cache efficiency by allowing the entry to be used while to retrieving the other operand from the register file, for example.

Type: Grant

Filed: August 24, 2018

Date of Patent: June 9, 2020

Assignee: Apple Inc.

Inventors: Robert D. Kenney, Terence M. Potter, Andrew M. Havlir, Sivayya V. Ayinala
Pipelined Allocation for Operand Cache

Publication number: 20200065104

Abstract: Techniques are disclosed relating to controlling an operand cache in a pipelined fashion. An operand cache may cache operands fetched from the register file or generated by previous instructions to improve performance and/or reduce power consumption. In some embodiments, instructions are pipelined and separate tag information is maintained to indicate allocation of an operand cache entry and ownership of the operand cache entry. In some embodiments, this may allow an operand to remain in the operand cache (and potentially be retrieved or modified) during an interval between allocation of the entry for another operand and ownership of the entry by the other operand. This may improve operand cache efficiency by allowing the entry to be used while to retrieving the other operand from the register file, for example.

Type: Application

Filed: August 24, 2018

Publication date: February 27, 2020

Inventors: Robert D. Kenney, Terence M. Potter, Andrew M. Havlir, Sivayya V. Ayinala
Hints for shared store pipeline and multi-rate targets

Patent number: 10452401

Abstract: Techniques are disclosed relating to selecting store instructions for dispatch to a shared pipeline. In some embodiments, the shared pipeline processes instructions for different target clients with different data rate capabilities. Therefore, in some embodiments, the pipeline is configured to generate state information that is based on a determined amount of work in the pipeline that targets at least one slower target. In some embodiments, the state information indicates whether the amount of work is above a threshold for the particular target. In some embodiments, scheduling circuitry is configured to select instructions for dispatch to the pipeline based on the state information. For example, the scheduling circuitry may refrain from selecting instructions with a slower target when the slower target is above its threshold amount of work in the pipeline.

Type: Grant

Filed: March 20, 2017

Date of Patent: October 22, 2019

Assignee: Apple Inc.

Inventor: Robert D. Kenney

1 2 next