Patents by Inventor Robert D. Kenney
Robert D. Kenney has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12008377Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.Type: GrantFiled: April 12, 2023Date of Patent: June 11, 2024Assignee: Apple Inc.Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
-
Publication number: 20230325196Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.Type: ApplicationFiled: April 12, 2023Publication date: October 12, 2023Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
-
Patent number: 11645084Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.Type: GrantFiled: September 9, 2021Date of Patent: May 9, 2023Assignee: Apple Inc.Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
-
Patent number: 11422822Abstract: Techniques are disclosed relating to sharing datapath circuitry among multiple SIMD groups. In some embodiments, pipeline circuitry is configured to perform operations specified by instructions of first and second assigned SIMD groups. The pipeline circuitry may include first and second front-end circuitry configured to decode instructions of the respective SIMD groups. The pipeline circuitry may include shared execution circuitry configured to perform operations specified by the first and second assigned SIMD groups and arbitration circuitry configured to select an instruction from among at least the first and second front-end circuitry for assignment to the shared execution circuitry in a current cycle. The arbitration circuitry may select an instruction based on one or more of: stall counts, whether available instructions are being speculatively executed, whether ones of available instructions target a particular portion of the shared execution circuitry, numbers of execution cycles, and SIMD group ages.Type: GrantFiled: May 8, 2020Date of Patent: August 23, 2022Assignee: Apple Inc.Inventors: Robert D. Kenney, Jason N. Dale
-
Patent number: 11294672Abstract: Techniques are disclosed relating to routing circuitry configured to perform permute operations for operands of threads in a single-instruction multiple-data group. In some embodiments, an apparatus includes hierarchical operand routing circuitry configured to route operands between a set of single-instruction multiple-data (SIMD) pipelines based on a permute instruction. In some embodiments, the routing circuitry includes a first level and a second level. The first level may include a set of multiple crossbar circuits each configured to receive operands from a respective subset of the pipelines and output one or more of the received operands on multiple output lines based on the permute instruction, where the crossbar circuits support full permutation within a respective subset.Type: GrantFiled: August 22, 2019Date of Patent: April 5, 2022Assignee: Apple Inc.Inventors: Robert D. Kenney, Liang-Kai Wang, Terence M. Potter
-
Patent number: 11256518Abstract: Techniques are disclosed relating to sharing operands among SIMD threads for a larger arithmetic operation. In some embodiments, a set of multiple hardware pipelines is configured to execute single-instruction multiple-data (SIMD) instructions for multiple threads in parallel, where ones of the hardware pipelines include execution circuitry configured to perform floating-point operations using one or more pipeline stages of the pipeline and first routing circuitry configured to select, from among thread-specific operands stored for the hardware pipeline and from one or more other pipelines in the set, a first input operand for an operation by the execution circuitry. In some embodiments, a device is configured to perform a mathematical operation on source input data structures stored across thread-specific storage for the set of hardware pipelines, by executing multiple SIMD floating-point operations using the execution circuitry and the first routing circuitry.Type: GrantFiled: October 9, 2019Date of Patent: February 22, 2022Assignee: Apple Inc.Inventors: Liang-Kai Wang, Robert D. Kenney, Terence M. Potter, Vinod Reddy Nalamalapu, Sivayya V. Ayinala
-
Publication number: 20210406031Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.Type: ApplicationFiled: September 9, 2021Publication date: December 30, 2021Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
-
Publication number: 20210349725Abstract: Techniques are disclosed relating to sharing datapath circuitry among multiple SIMD groups. In some embodiments, pipeline circuitry is configured to perform operations specified by instructions of first and second assigned SIMD groups. The pipeline circuitry may include first and second front-end circuitry configured to decode instructions of the respective SIMD groups. The pipeline circuitry may include shared execution circuitry configured to perform operations specified by the first and second assigned SIMD groups and arbitration circuitry configured to select an instruction from among at least the first and second front-end circuitry for assignment to the shared execution circuitry in a current cycle. The arbitration circuitry may select an instruction based on one or more of: stall counts, whether available instructions are being speculatively executed, whether ones of available instructions target a particular portion of the shared execution circuitry, numbers of execution cycles, and SIMD group ages.Type: ApplicationFiled: May 8, 2020Publication date: November 11, 2021Inventors: Robert D. Kenney, Jason N. Dale
-
Patent number: 11126439Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.Type: GrantFiled: November 15, 2019Date of Patent: September 21, 2021Assignee: Apple Inc.Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
-
Patent number: 11080055Abstract: Techniques are disclosed relating to arbitration among register file accesses. In some embodiments, an apparatus includes a register file configured to store operands for multiple client circuits and arbitration circuitry configured to select from among multiple received requests to access the register file. In some embodiments, the apparatus includes first interface circuitry configured to provide access requests from a first client circuit to the arbitration circuitry and supplemental interface circuitry configured to receive unsuccessful requests from the first client circuit and provide the received unsuccessful requests to the arbitration circuitry. The supplemental interface circuitry may provide additional catch-up bandwidth to clients that lose arbitration, which may result in fairness during bandwidth shortages.Type: GrantFiled: August 22, 2019Date of Patent: August 3, 2021Assignee: Apple Inc.Inventors: Robert D. Kenney, Terence M. Potter
-
Publication number: 20210149679Abstract: Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline.Type: ApplicationFiled: November 15, 2019Publication date: May 20, 2021Inventors: Christopher A. Burns, Liang-Kai Wang, Robert D. Kenney, Terence M. Potter
-
Publication number: 20210109761Abstract: Techniques are disclosed relating to sharing operands among SIMD threads for a larger arithmetic operation. In some embodiments, a set of multiple hardware pipelines is configured to execute single-instruction multiple-data (SIMD) instructions for multiple threads in parallel, where ones of the hardware pipelines include execution circuitry configured to perform floating-point operations using one or more pipeline stages of the pipeline and first routing circuitry configured to select, from among thread-specific operands stored for the hardware pipeline and from one or more other pipelines in the set, a first input operand for an operation by the execution circuitry. In some embodiments, a device is configured to perform a mathematical operation on source input data structures stored across thread-specific storage for the set of hardware pipelines, by executing multiple SIMD floating-point operations using the execution circuitry and the first routing circuitry.Type: ApplicationFiled: October 9, 2019Publication date: April 15, 2021Inventors: Liang-Kai Wang, Robert D. Kenney, Terence M. Potter, Vinod Reddy Nalamalapu, Sivayya V. Ayinala
-
Publication number: 20210055929Abstract: Techniques are disclosed relating to arbitration among register file accesses. In some embodiments, an apparatus includes a register file configured to store operands for multiple client circuits and arbitration circuitry configured to select from among multiple received requests to access the register file. In some embodiments, the apparatus includes first interface circuitry configured to provide access requests from a first client circuit to the arbitration circuitry and supplemental interface circuitry configured to receive unsuccessful requests from the first client circuit and provide the received unsuccessful requests to the arbitration circuitry. The supplemental interface circuitry may provide additional catch-up bandwidth to clients that lose arbitration, which may result in fairness during bandwidth shortages.Type: ApplicationFiled: August 22, 2019Publication date: February 25, 2021Inventors: Robert D. Kenney, Terence M. Potter
-
Publication number: 20210055931Abstract: Techniques are disclosed relating to routing circuitry configured to perform permute operations for operands of threads in a single-instruction multiple-data group. In some embodiments, an apparatus includes hierarchical operand routing circuitry configured to route operands between a set of single-instruction multiple-data (SIMD) pipelines based on a permute instruction. In some embodiments, the routing circuitry includes a first level and a second level. The first level may include a set of multiple crossbar circuits each configured to receive operands from a respective subset of the pipelines and output one or more of the received operands on multiple output lines based on the permute instruction, where the crossbar circuits support full permutation within a respective subset.Type: ApplicationFiled: August 22, 2019Publication date: February 25, 2021Inventors: Robert D. Kenney, Liang-Kai Wang, Terence M. Potter
-
Patent number: 10902545Abstract: Techniques are disclosed relating to scheduling tasks for graphics processing. In one embodiment, a graphics unit is configured to render a frame of graphics data using a plurality of pass groups and the frame of graphics data includes a plurality of frame portions. In this embodiment, the graphics unit includes scheduling circuitry configured to receive a plurality of tasks, maintain pass group information for each of the plurality of tasks, and maintain relative age information for the plurality of frame portions. In this embodiment, the scheduling circuitry is configured to select a task for execution based on the pass group information and the age information. In some embodiments, the scheduling circuitry is configured to select tasks from an oldest frame portion and current pass group before selecting other tasks. This scheduling approach may result in efficient execution of various different types of graphics workloads.Type: GrantFiled: December 17, 2014Date of Patent: January 26, 2021Assignee: Apple Inc.Inventors: Robert D. Kenney, Benjiman L. Goodman, Terence M. Potter
-
Patent number: 10769746Abstract: A data queuing and format apparatus is disclosed. A first selection circuit may be configured to selectively couple a first subset of data to a first plurality of data lines dependent upon control information, and a second selection circuit may be configured to selectively couple a second subset of data to a second plurality of data lines dependent upon the control information. A storage array may include multiple storage units, and each storage unit may be configured to receive data from one or more data lines of either the first or second plurality of data lines dependent upon the control information.Type: GrantFiled: September 25, 2014Date of Patent: September 8, 2020Assignee: Apple Inc.Inventors: Liang Xia, Robert D. Kenney, Benjiman L. Goodman, Terence M. Potter
-
Patent number: 10699366Abstract: Techniques are disclosed relating to sharing an arithmetic logic unit (ALU) between multiple threads. In some embodiments, the threads also have dedicated ALUs for other types of operations. In some embodiments, arbitration circuitry is configured to receive operations to be performed by the shared arithmetic logic unit from the set of threads and issue the received operations to the shared arithmetic logic unit. In some embodiments, the arbitration circuitry is configured to switch to a different one of the set of threads for each instruction issued to the shared arithmetic logic unit. In some embodiments, the shared ALU is configured to perform 32-bit operations and the dedicated ALUs are configured to perform the same operations using 16-bit precision. In some embodiments, the shared ALU is shared between two threads and is physically located adjacent to other datapath circuitry for the two threads.Type: GrantFiled: August 7, 2018Date of Patent: June 30, 2020Assignee: Apple Inc.Inventor: Robert D. Kenney
-
Patent number: 10678548Abstract: Techniques are disclosed relating to controlling an operand cache in a pipelined fashion. An operand cache may cache operands fetched from the register file or generated by previous instructions to improve performance and/or reduce power consumption. In some embodiments, instructions are pipelined and separate tag information is maintained to indicate allocation of an operand cache entry and ownership of the operand cache entry. In some embodiments, this may allow an operand to remain in the operand cache (and potentially be retrieved or modified) during an interval between allocation of the entry for another operand and ownership of the entry by the other operand. This may improve operand cache efficiency by allowing the entry to be used while to retrieving the other operand from the register file, for example.Type: GrantFiled: August 24, 2018Date of Patent: June 9, 2020Assignee: Apple Inc.Inventors: Robert D. Kenney, Terence M. Potter, Andrew M. Havlir, Sivayya V. Ayinala
-
Publication number: 20200065104Abstract: Techniques are disclosed relating to controlling an operand cache in a pipelined fashion. An operand cache may cache operands fetched from the register file or generated by previous instructions to improve performance and/or reduce power consumption. In some embodiments, instructions are pipelined and separate tag information is maintained to indicate allocation of an operand cache entry and ownership of the operand cache entry. In some embodiments, this may allow an operand to remain in the operand cache (and potentially be retrieved or modified) during an interval between allocation of the entry for another operand and ownership of the entry by the other operand. This may improve operand cache efficiency by allowing the entry to be used while to retrieving the other operand from the register file, for example.Type: ApplicationFiled: August 24, 2018Publication date: February 27, 2020Inventors: Robert D. Kenney, Terence M. Potter, Andrew M. Havlir, Sivayya V. Ayinala
-
Patent number: 10452401Abstract: Techniques are disclosed relating to selecting store instructions for dispatch to a shared pipeline. In some embodiments, the shared pipeline processes instructions for different target clients with different data rate capabilities. Therefore, in some embodiments, the pipeline is configured to generate state information that is based on a determined amount of work in the pipeline that targets at least one slower target. In some embodiments, the state information indicates whether the amount of work is above a threshold for the particular target. In some embodiments, scheduling circuitry is configured to select instructions for dispatch to the pipeline based on the state information. For example, the scheduling circuitry may refrain from selecting instructions with a slower target when the slower target is above its threshold amount of work in the pipeline.Type: GrantFiled: March 20, 2017Date of Patent: October 22, 2019Assignee: Apple Inc.Inventor: Robert D. Kenney