Patents by Inventor Kevin Hurd

Kevin Hurd has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

CONVERSION OPERATIONS AND SPECIAL VALUE USE CASES SUPPORTING 8-BIT FLOATING POINT FORMAT IN A GRAPHICS ARCHITECTURE

Publication number: 20250110733

Abstract: An apparatus to facilitate conversion operations and special value use cases supporting 8-bit floating point format in a graphics architecture is disclosed. The apparatus includes a processor comprising a decoder to decode an instruction fetched for execution into a decoded instruction, wherein the decoded instruction is to cause the processor to perform a conversion operation corresponding to an 8-bit floating point format operand; a scheduler to schedule the decoded instruction and provide input data for an input operand of the conversion operation indicated by the decoded instruction; and conversion circuitry to execute the decoded instruction to perform the conversion operation to convert the input operand to an output operand in accordance with the 8-bit floating point format operand, the conversion circuitry comprising hardware circuitry to rescale, normalize, and convert the input operand to the output operand.

Type: Application

Filed: September 29, 2023

Publication date: April 3, 2025

Applicant: Intel Corporation

Inventors: Jorge Eduardo Parra Osorio, Fangwen Fu, Guei-Yuan Lueh, Jiasheng Chen, Naveen K. Mellempudi, Kevin Hurd, Alexandre Hadj-Chaib, Elliot Taylor, Marius Cornea-Hasegan
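
As a rough illustration of what converting to and from an 8-bit floating point value can look like in software, the sketch below assumes the E5M2 layout (1 sign bit, 5 exponent bits, 2 mantissa bits), which matches the top byte of an IEEE 754 half-precision value. The abstract for publication 20250110733 does not name a specific 8-bit layout, and the function names here are hypothetical; this is not the conversion circuitry described in the application.

```python
import struct


def fp8_e5m2_to_float(byte):
    """Decode one byte, assumed to be E5M2, by widening it to binary16."""
    # E5M2 shares its sign and exponent fields with binary16, so the byte is
    # placed in the high byte of a 16-bit half and the low byte is zero.
    return struct.unpack("<e", bytes([0, byte & 0xFF]))[0]


def float_to_fp8_e5m2_truncate(value):
    """Encode by keeping the top 8 bits of the binary16 encoding."""
    # struct first rounds the value to binary16; dropping the low byte then
    # discards the remaining mantissa bits. This is truncation, not
    # round-to-nearest, and very large inputs raise OverflowError.
    return struct.pack("<e", value)[1]
```

For example, `fp8_e5m2_to_float(0x3C)` decodes to 1.0, and encoding 1.3 with the truncating encoder gives the byte for 1.25, the nearest representable value below it.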
SUPPORTING 8-BIT FLOATING POINT FORMAT FOR PARALLEL COMPUTING AND STOCHASTIC ROUNDING OPERATIONS IN A GRAPHICS ARCHITECTURE

Publication number: 20250110741

Abstract: An apparatus to facilitate supporting 8-bit floating point format for parallel computing and stochastic rounding operations in a graphics architecture is disclosed. The apparatus includes a processor comprising: a decoder to decode an instruction fetched for execution into a decoded instruction, wherein the decoded instruction is a matrix instruction that is to operate on 8-bit floating point operands to perform a parallel dot product operation; a scheduler to schedule the decoded instruction and provide input data for the 8-bit floating point operands in accordance with an 8-bit floating data format indicated by the decoded instruction; and circuitry to execute the decoded instruction to perform 32-way dot-product using 8-bit wide dot-product layers, each 8-bit wide dot-product layer comprises one or more sets of interconnected multipliers, shifters, and adders, wherein each set of multipliers, shifters, and adders is to generate a dot product of the 8-bit floating point operands.

Type: Application

Filed: September 29, 2023

Publication date: April 3, 2025

Applicant: Intel Corporation

Inventors: Jorge Eduardo Parra Osorio, Fangwen Fu, Guei-Yuan Lueh, Hong Jiang, Jiasheng Chen, Naveen K. Mellempudi, Kevin Hurd, Chunhui Mei, Alexandre Hadj-Chaib, Elliot Taylor, Shuai Mu
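
The title of publication 20250110741 mentions stochastic rounding. A minimal sketch of the general idea follows, assuming a uniform grid of spacing `step` rather than the non-uniform grid of a real 8-bit floating point format; the function names and the caller-supplied `draw` parameter are illustrative assumptions, not details taken from the application.

```python
import math
import random


def stochastic_round(value, step, draw):
    """Round value to a multiple of step, rounding up with probability
    equal to the fractional distance above the lower grid point.

    draw is a caller-supplied uniform sample in [0, 1).
    """
    lower = math.floor(value / step) * step
    fraction = (value - lower) / step
    return lower + step if draw < fraction else lower


def mean_of_rounds(value, step, trials, seed=0):
    """Average many stochastic roundings; the mean approaches value."""
    rng = random.Random(seed)
    total = sum(stochastic_round(value, step, rng.random()) for _ in range(trials))
    return total / trials
```

Unlike round-to-nearest, the rounding error averages out over many operations, which is the usual motivation for the technique when accumulating in low precision.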
FLOATING-POINT CONVERSION VIA AN INTEGER UNIT

Publication number: 20250036361

Abstract: Described herein is a graphics processor comprising a memory interface and a graphics processing cluster coupled with the memory interface. The graphics processing cluster includes a multi-lane parallel floating-point unit and a multi-lane parallel integer unit. The multi-lane parallel integer unit includes an integer pipeline including a plurality of parallel integer logic units configured to perform integer compute operations on a plurality of input data elements and a format conversion pipeline including a plurality of parallel format conversion units configured to convert a plurality of input data elements from a first one of a plurality of datatype formats to a second one of the plurality of datatype formats, the plurality of datatype formats including integer and floating-point formats.

Type: Application

Filed: July 25, 2023

Publication date: January 30, 2025

Applicant: Intel Corporation

Inventors: Supratim Pal, Jiasheng Chen, Kevin Hurd, Jorge E. Parra Osorio, Christopher Spencer, Guei-Yuan Lueh, Pradeep K. Golconda, Fangwen Fu, Wei Xiong, Hongzheng Li, James Valerio, Mukundan Swaminathan, Nicholas Murphy, Shuai Mu, Clifford Gibson, Buqi Cheng
AVOIDING THE USE OF A RESULT CROSSBAR WHEN DOWN CONVERTING TO PACKED REGISTER FORMATS

Publication number: 20250036412

Abstract: Described herein is a graphics processor comprising a memory interface and a graphics processing cluster coupled with the memory interface. The graphics processing cluster includes a plurality of processing resources. A processing resource of the plurality of processing resources includes a source crossbar communicatively coupled with a register file, the source crossbar to reorder data elements of a source operand and a format conversion pipeline to convert a plurality of input data elements specified by the source operand from a first format of a plurality of datatype formats to a second format of the plurality of datatype formats, the plurality of datatype formats including integer and floating-point formats.

Type: Application

Filed: July 25, 2023

Publication date: January 30, 2025

Applicant: Intel Corporation

Inventors: Supratim Pal, Jiasheng Chen, Christopher Spencer, Jorge E. Parra Osorio, Kevin Hurd, Guei-Yuan Lueh, Pradeep K. Golconda, Fangwen Fu, Wei Xiong, Hongzheng Li, James Valerio, Mukundan Swaminathan, Nicholas Murphy, Shuai Mu, Clifford Gibson, Buqi Cheng
32-BIT CHANNEL-ALIGNED INTEGER MULTIPLICATION VIA MULTIPLE MULTIPLIERS PER-CHANNEL

Publication number: 20250037347

Abstract: Described herein is a graphics processor comprising an instruction cache and a plurality of processing elements coupled with the instruction cache. The plurality of processing elements include functional units configured to provide an integer pipeline to execute instructions to perform operations on integer data elements. The integer pipeline including a first multiplier and a second multiplier, the first multiplier and the second multiplier configured to execute operations for a single instruction.

Type: Application

Filed: July 25, 2023

Publication date: January 30, 2025

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Supratim Pal, Kevin Hurd, Jorge E. Parra Osorio, Christopher Spencer, Takashi Nakagawa, Guei-Yuan Lueh, Pradeep K. Golconda, James Valerio, Mukundan Swaminathan, Nicholas Murphy, Clifford Gibson, Li-An Tang, Fangwen Fu, Kaiyu Chen, Buqi Cheng
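
One way two multipliers could cooperate on a single 32-bit multiply is to split one operand into 16-bit halves and combine the partial products. The abstract for publication 20250037347 does not say how the work is divided between the first and second multiplier, so the decomposition and names below are assumptions made purely for illustration.

```python
MASK32 = 0xFFFFFFFF


def mul32_low_via_two_multipliers(a, b):
    """Return the low 32 bits of a * b using two 32x16 partial products."""
    a &= MASK32
    b &= MASK32
    b_low = b & 0xFFFF   # lower 16 bits of the second operand
    b_high = b >> 16     # upper 16 bits of the second operand
    partial_low = a * b_low    # hypothetical first multiplier
    partial_high = a * b_high  # hypothetical second multiplier
    # The high partial product is weighted by 2**16 before the final add.
    return (partial_low + (partial_high << 16)) & MASK32
```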
CONFIGURABLE PROCESSING RESOURCE EVENT FILTER FOR GPU HARDWARE-BASED PERFORMANCE MONITORING

Publication number: 20240419447

Abstract: Described herein is a graphics processor comprising a plurality of processing elements associated with performance monitoring circuitry. The performance monitoring circuitry is configurable to generate performance data for multiple concurrently executed workloads via flexible event filtering hardware that can isolate a data stream of performance events and display performance monitoring data that is specific to each of the multiple concurrently executed workloads. In one embodiment, performance monitoring for the separate workloads can be configured, for example, by filtering based on the respective contexts used to execute the workloads, the specific instructions executed respectively by the workloads, or the datatypes used respectively by the workloads.

Type: Application

Filed: June 16, 2023

Publication date: December 19, 2024

Applicant: Intel Corporation

Inventors: Prashant D. Chaudhari, Kevin Hurd
HARDWARE ENHANCEMENTS FOR DOUBLE PRECISION SYSTOLIC SUPPORT

Publication number: 20240111826

Abstract: An apparatus to facilitate hardware enhancements for double precision systolic support is disclosed. The apparatus includes matrix acceleration hardware having double-precision (DP) matrix multiplication circuitry including multiplier circuits to multiply pairs of input source operands in a DP floating-point format; adders to receive multiplier outputs from the multiplier circuits and accumulate the multiplier outputs in a high precision intermediate format; an accumulator circuit to accumulate adder outputs from the adders with at least one of a third global source operand on a first pass of the DP matrix multiplication circuitry or an intermediate result from the first pass on a second pass of the DP matrix multiplication circuitry, wherein the accumulator circuit is to generate an accumulator output in the high precision intermediate format; and a down conversion and rounding circuit to down convert and round an output of the second pass as a final result in the DP floating-point format.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Kevin Hurd, Changwon Rhee, Jorge Parra, Fangwen Fu, Theo Drane, William Zorn, Peter Caday, Gregory Henry, Guei-Yuan Lueh, Farzad Chehrazi, Amit Karande, Turbo Majumder, Xinmin Tian, Milind Girkar, Hong Jiang
SINGLE PRECISION SUPPORT FOR SYSTOLIC PIPELINE IN A GRAPHICS ENVIRONMENT

Publication number: 20240111825

Abstract: An apparatus to facilitate single precision support for systolic pipeline in a graphics environment is disclosed. The apparatus includes a processor comprising systolic array hardware including a plurality of data processing units, wherein the systolic array hardware is to: receive data for performance of a matrix multiplication operation in a first precision format; convert an original value of the data into two split values with a second precision format having a lower precision than the first precision format; perform the matrix multiplication operation using the two split values in the second precision format, the matrix multiplication operation comprising a split-term operation that utilizes two passes through the systolic array hardware with feedback wiring and local reduction; and generate an emulated result for the matrix multiplication operation in the first precision format.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Changwon Rhee, Kevin Hurd, Gregory Henry, Peter Caday, Kristopher Wong
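
The split-value idea in the abstract for publication 20240111825 can be sketched in software. The example below assumes the first format is IEEE 754 single precision and the second is bfloat16 obtained by truncation; the abstract names neither format, and the helper names are hypothetical. The cross terms are summed in ordinary Python floats rather than in any particular hardware accumulator.

```python
import struct


def to_float32(value):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def truncate_to_bfloat16(value):
    """Keep the sign, exponent, and top 7 mantissa bits of the binary32 form."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


def split(value):
    """Split a value into a high part and a low correction part."""
    high = truncate_to_bfloat16(value)
    low = truncate_to_bfloat16(value - high)
    return high, low


def emulated_product(a, b):
    """Approximate a * b from the split terms, dropping the low*low term."""
    a_high, a_low = split(a)
    b_high, b_low = split(b)
    return a_high * b_high + (a_high * b_low + a_low * b_high)
```

Using only the high parts loses most of the single-precision mantissa; adding the cross terms recovers much of it, which is the intuition behind emulating a higher-precision result from lower-precision operands.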
SUPPORTING VECTOR MULTIPLY ADD WITH DOUBLE ACCUMULATOR ACCESS IN A GRAPHICS ENVIRONMENT

Publication number: 20240103810

Abstract: An apparatus to facilitate supporting vector multiply add with double accumulator access in a graphics environment is disclosed. The apparatus includes a processor comprising processing resources, the processing resources comprising multiplier circuitry to: receive operands for a matrix multiplication operation, wherein the operands comprise two source matrices to be multiplied as part of the matrix multiplication operation; and issue a multiply and add vector (MADV) instruction for the multiplication operation utilizing a double accumulator access output, wherein the MADV instruction is to multiply two vectors of the two source matrices in a single floating point (FP) pipeline of the processor.

Type: Application

Filed: September 27, 2022

Publication date: March 28, 2024

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Supratim Pal, Changwon Rhee, Hong Jiang, Kevin Hurd, Shuai Mu
SYSTEM AND METHOD FOR HANDLING ERRORS IN A VEHICLE NEURAL NETWORK PROCESSOR

Publication number: 20240086270

Abstract: A system for handling errors in a neural network includes a neural network processor for executing a neural network associated with use of a vehicle. The neural network processor includes an error detector configured to detect a data error associated with execution of the neural network and a neural network controller configured to receive a report of the data error from the error detector. In response to receiving the report, the neural network controller is further configured to signal that a pending result of the neural network is tainted, without terminating execution of the neural network.

Type: Application

Filed: August 18, 2023

Publication date: March 14, 2024

Inventors: Christopher Hsiong, Emil Talpes, Debjit Das Sarma, Peter Bannon, Kevin Hurd, Benjamin Floering
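
The behavior described in the abstract for publication 20240086270, flagging a pending result as tainted while letting execution continue, can be modeled in a few lines. This is a software sketch under assumed names (`Result`, `NeuralNetworkController`, `error_detector`), not the hardware design; layers are modeled as plain callables.

```python
from dataclasses import dataclass


@dataclass
class Result:
    value: float
    tainted: bool = False


class NeuralNetworkController:
    """Hypothetical controller that records error reports without aborting."""

    def __init__(self):
        self._tainted = False

    def report_error(self):
        # Mark the pending result as tainted; execution is not terminated.
        self._tainted = True

    def run(self, layers, value, error_detector):
        for index, layer in enumerate(layers):
            value = layer(value)
            if error_detector(index, value):
                self.report_error()
        result = Result(value, self._tainted)
        self._tainted = False  # reset for the next inference
        return result
```

A consumer of `Result` can then decide whether to discard or reuse a tainted output, instead of losing the whole run to a single detected error.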
High power voltage regulator module with different plane orientations

Patent number: 11894770

Abstract: A Voltage Regulator Module (VRM) includes a first voltage rail circuit board oriented in a first plane having formed therein a first plurality of conductors and configured to produce a first rail voltage, a second voltage rail circuit board oriented in a second plane that is substantially parallel to the first plane having formed therein a second plurality of conductors and configured to produce a second rail voltage. The VRM also includes a first capacitor circuit board oriented in a third plane that is substantially perpendicular to the first plane and a second capacitor circuit board oriented in a fourth plane that is substantially parallel to the third plane. The VRM includes a plurality of conductors intercoupling the first voltage rail circuit board, the first capacitor circuit board, the second voltage rail circuit board, and the second capacitor circuit board.

Type: Grant

Filed: November 7, 2018

Date of Patent: February 6, 2024

Assignee: Tesla, Inc.

Inventors: Shishuang Sun, Kevin Hurd, Satyan Chandra
INTEGRATED CIRCUITS WITH MACHINE LEARNING EXTENSIONS

Publication number: 20230342111

Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.

Type: Application

Filed: June 30, 2023

Publication date: October 26, 2023

Inventors: Martin Langhammer, Dongdong Chen, Kevin Hurd
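
The data flow in the abstract for publication 20230342111 (multiply and add at a first precision, optionally cast up, then combine with another input at a second precision) can be sketched as follows. The choice of half precision for the first precision and single precision for the second is an assumption, as are the function names and the decision to round each product individually.

```python
import struct


def round_to_half(value):
    """Round to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def round_to_single(value):
    """Round to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def dot2_then_accumulate(a, b, accumulator=0.0):
    """Two-term dot product at low precision, accumulated at higher precision."""
    # Multiplier path: products kept at the first (lower) precision.
    products = [round_to_half(x * y) for x, y in zip(a, b)]
    # Floating-point adder of the first precision.
    low_precision_sum = round_to_half(products[0] + products[1])
    # Cast to the second precision, then add zero or a general-purpose input.
    widened = round_to_single(low_precision_sum)
    return round_to_single(widened + accumulator)
```

Calling `dot2_then_accumulate((1.5, 2.0), (2.0, 0.25), 10.0)` multiplies and adds the two terms in half precision and then adds the accumulator input in single precision.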
System and method for handling errors in a vehicle neural network processor

Patent number: 11734095

Abstract: A system for handling errors in a neural network includes a neural network processor for executing a neural network associated with use of a vehicle. The neural network processor includes an error detector configured to detect a data error associated with execution of the neural network and a neural network controller configured to receive a report of the data error from the error detector. In response to receiving the report, the neural network controller is further configured to signal that a pending result of the neural network is tainted without terminating execution of the neural network.

Type: Grant

Filed: September 23, 2021

Date of Patent: August 22, 2023

Assignee: Tesla, Inc.

Inventors: Christopher Hsiong, Emil Talpes, Debjit Das Sarma, Peter Bannon, Kevin Hurd, Benjamin Floering
Integrated circuits with machine learning extensions

Patent number: 11726744

Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.

Type: Grant

Filed: March 26, 2021

Date of Patent: August 15, 2023

Assignee: Intel Corporation

Inventors: Martin Langhammer, Dongdong Chen, Kevin Hurd
SYSTEM AND METHOD FOR HANDLING ERRORS IN A VEHICLE NEURAL NETWORK PROCESSOR

Publication number: 20220083412

Abstract: A system for handling errors in a neural network includes a neural network processor for executing a neural network associated with use of a vehicle. The neural network processor includes an error detector configured to detect a data error associated with execution of the neural network and a neural network controller configured to receive a report of the data error from the error detector. In response to receiving the report, the neural network controller is further configured to signal that a pending result of the neural network is tainted without terminating execution of the neural network.

Type: Application

Filed: September 23, 2021

Publication date: March 17, 2022

Inventors: Christopher Hsiong, Emil Talpes, Debjit Das Sarma, Peter Bannon, Kevin Hurd, Benjamin Floering
INTEGRATED CIRCUITS WITH MACHINE LEARNING EXTENSIONS

Publication number: 20220012015

Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.

Type: Application

Filed: September 24, 2021

Publication date: January 13, 2022

Inventors: Martin Langhammer, Dongdong Chen, Kevin Hurd
System and method for handling errors in a vehicle neural network processor

Patent number: 11132245

Abstract: A system for handling errors in a neural network includes a neural network processor for executing a neural network associated with use of a vehicle. The neural network processor includes an error detector configured to detect a data error associated with execution of the neural network and a neural network controller configured to receive a report of the data error from the error detector. In response to receiving the report, the neural network controller is further configured to signal that a pending result of the neural network is tainted without terminating execution of the neural network.

Type: Grant

Filed: March 30, 2020

Date of Patent: September 28, 2021

Assignee: Tesla, Inc.

Inventors: Christopher Hsiong, Emil Talpes, Debjit Das Sarma, Peter Bannon, Kevin Hurd, Benjamin Floering
INTEGRATED CIRCUITS WITH MACHINE LEARNING EXTENSIONS

Publication number: 20210240440

Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.

Type: Application

Filed: March 26, 2021

Publication date: August 5, 2021

Inventors: Martin Langhammer, Dongdong Chen, Kevin Hurd
Integrated circuits with machine learning extensions

Patent number: 10970042

Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.

Type: Grant

Filed: September 27, 2018

Date of Patent: April 6, 2021

Assignee: Intel Corporation

Inventors: Martin Langhammer, Dongdong Chen, Kevin Hurd
SYSTEM AND METHOD FOR HANDLING ERRORS IN A VEHICLE NEURAL NETWORK PROCESSOR

Publication number: 20200394095

Abstract: A system for handling errors in a neural network includes a neural network processor for executing a neural network associated with use of a vehicle. The neural network processor includes an error detector configured to detect a data error associated with execution of the neural network and a neural network controller configured to receive a report of the data error from the error detector. In response to receiving the report, the neural network controller is further configured to signal that a pending result of the neural network is tainted without terminating execution of the neural network.

Type: Application

Filed: March 30, 2020

Publication date: December 17, 2020

Inventors: Christopher Hsiong, Emil Talpes, Debjit Das Sarma, Peter Bannon, Kevin Hurd, Benjamin Floering
