Patents by Inventor Lance Hacking

Lance Hacking has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11907827
    Abstract: Methods and systems include a neural network system that includes a neural network accelerator. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the neural network system. The neural network accelerator also includes schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: February 20, 2024
    Assignee: Intel Corporation
    Inventors: Gautham Chinya, Huichu Liu, Arnab Raha, Debabrata Mohapatra, Cormac Brick, Lance Hacking
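The load / extract / reorganize / store flow in the abstract above can be sketched in software. This is a toy model, not the patented circuitry: the engine count, the round-robin tiling, and the stand-in per-engine operation are all illustrative assumptions.

```python
# Hypothetical sketch of schedule-aware tensor distribution: load tiles
# into engines, run them, then reorganize results back into memory order.
def distribute_and_collect(tensor, num_engines):
    # Load phase: round-robin elements into per-engine input queues.
    queues = [[] for _ in range(num_engines)]
    for i, value in enumerate(tensor):
        queues[i % num_engines].append(value)

    # Each engine performs its arithmetic (here: a stand-in doubling op).
    outputs = [[2 * v for v in q] for q in queues]

    # Extraction phase: pull results back and reorganize them into the
    # original element order before storing to "memory".
    memory = [None] * len(tensor)
    for engine, out in enumerate(outputs):
        for j, value in enumerate(out):
            memory[engine + j * num_engines] = value
    return memory

print(distribute_and_collect([1, 2, 3, 4, 5, 6, 7], 3))  # [2, 4, 6, 8, 10, 12, 14]
```

The reorganization step is the point of interest: output order at the engines differs from memory order, so an explicit reshuffle is needed before the store.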
  • Publication number: 20240005135
    Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format, and a sparsity hardware circuit to reduce computing on zero values at the multiply-add circuit, wherein the processor is to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.
    Type: Application
    Filed: April 18, 2023
    Publication date: January 4, 2024
    Applicant: Intel Corporation
    Inventors: Avishaii Abuhatzera, Om Ji Omer, Ritwika Chowdhury, Lance Hacking
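The re-encoding idea above can be illustrated numerically: one higher-precision signed integer is split into two lower-precision signed pieces, and the multiply works piecewise, skipping the high piece when it is zero (the "sparsity in higher order bits"). The 16-bit/8-bit widths and the split rule below are assumptions for illustration, not the claimed circuit.

```python
# Split a signed 16-bit value into (hi, lo) with x == hi * 256 + lo,
# where lo is a signed 8-bit value in [-128, 127].
def split_signed16(x):
    lo = ((x + 128) % 256) - 128
    hi = (x - lo) // 256
    return hi, lo

# Multiply using the two 8-bit pieces of x; the hi-piece product is
# skipped entirely when hi == 0 (sparsity in the higher-order bits).
def sparse_mul(x, y):
    hi, lo = split_signed16(x)
    acc = lo * y
    if hi != 0:              # small-magnitude x costs only one product
        acc += hi * 256 * y
    return acc

print(split_signed16(300))       # (1, 44): 1*256 + 44 == 300
print(sparse_mul(300, 7))        # 2100 == 300 * 7
```

Small-magnitude activations and weights are common in quantized models, so the hi piece is frequently zero and the second product can often be skipped.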
  • Patent number: 11714998
    Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format, and a sparsity hardware circuit to reduce computing on zero values at the multiply-add circuit, wherein the processor is to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.
    Type: Grant
    Filed: June 23, 2020
    Date of Patent: August 1, 2023
    Assignee: Intel Corporation
    Inventors: Avishaii Abuhatzera, Om Ji Omer, Ritwika Chowdhury, Lance Hacking
  • Publication number: 20230004430
    Abstract: Technology for estimating neural network (NN) power profiles includes obtaining a plurality of workloads for a compiled NN model, the plurality of workloads determined for a hardware execution device, determining a hardware efficiency factor for the compiled NN model, and generating, based on the hardware efficiency factor, a power profile for the compiled NN model on one or more of a per-layer basis or a per-workload basis. The hardware efficiency factor can be determined based on a hardware efficiency measurement and a hardware utilization measurement, and can be determined on a per-workload basis. A configuration file can be provided for generating the power profile, and an output visualization of the power profile can be generated. Further, feedback information can be generated to perform one or more of selecting a hardware device, optimizing a breakdown of workloads, optimizing a scheduling of tasks, or confirming a hardware device design.
    Type: Application
    Filed: July 2, 2022
    Publication date: January 5, 2023
    Inventors: Richard Richmond, Eric Luk, Lingdan Zeng, Lance Hacking, Alessandro Palla, Mohamed Elmalaki, Sara Almalih
  • Patent number: 11347828
    Abstract: A disclosed apparatus to multiply matrices includes a compute engine. The compute engine includes multipliers in a two dimensional array that has a plurality of array locations defined by columns and rows. The apparatus also includes a plurality of adders in columns. A broadcast interconnect between a cache and the multipliers broadcasts a first set of operand data elements to multipliers in the rows of the array. A unicast interconnect unicasts a second set of operands between a data buffer and the multipliers. The multipliers multiply the operands to generate a plurality of outputs, and the adders add the outputs generated by the multipliers.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: May 31, 2022
    Assignee: Intel Corporation
    Inventors: Biji George, Om Ji Omer, Dipan Kumar Mandal, Cormac Brick, Lance Hacking, Sreenivas Subramoney, Belliappa Kuttanna
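The broadcast/unicast dataflow in the abstract above maps naturally onto a small software model: one operand is broadcast across a row of multipliers, the other is unicast to each multiplier individually, and column adders reduce each column of products. The grid sizes and data are illustrative.

```python
# Toy model of the disclosed structure for a 1 x K row times K x N matrix:
# a_row[k] is broadcast to every multiplier in array row k, b[k][n] is
# unicast to multiplier (k, n), and a column adder sums each column.
def grid_matmul(a_row, b):
    rows, cols = len(b), len(b[0])
    # Multiplier array: products[k][n] = broadcast operand * unicast operand.
    products = [[a_row[k] * b[k][n] for n in range(cols)] for k in range(rows)]
    # Column adders reduce each column of products into one output.
    return [sum(products[k][n] for k in range(rows)) for n in range(cols)]

print(grid_matmul([1, 2], [[3, 4], [5, 6]]))  # [13, 16]
```

The payoff of the broadcast interconnect is reuse: each element of the first operand is fetched once from the cache and consumed by an entire row of multipliers.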
  • Publication number: 20200410327
    Abstract: Methods and systems include a neural network system that includes a neural network accelerator. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the neural network system. The neural network accelerator also includes schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.
    Type: Application
    Filed: June 28, 2019
    Publication date: December 31, 2020
    Inventors: Gautham Chinya, Huichu Liu, Arnab Raha, Debabrata Mohapatra, Cormac Brick, Lance Hacking
  • Publication number: 20200320375
    Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format, and a sparsity hardware circuit to reduce computing on zero values at the multiply-add circuit, wherein the processor is to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.
    Type: Application
    Filed: June 23, 2020
    Publication date: October 8, 2020
    Applicant: Intel Corporation
    Inventors: Avishaii Abuhatzera, Om Ji Omer, Ritwika Chowdhury, Lance Hacking
  • Publication number: 20200226203
    Abstract: A disclosed apparatus to multiply matrices includes a compute engine. The compute engine includes multipliers in a two dimensional array that has a plurality of array locations defined by columns and rows. The apparatus also includes a plurality of adders in columns. A broadcast interconnect between a cache and the multipliers broadcasts a first set of operand data elements to multipliers in the rows of the array. A unicast interconnect unicasts a second set of operands between a data buffer and the multipliers. The multipliers multiply the operands to generate a plurality of outputs, and the adders add the outputs generated by the multipliers.
    Type: Application
    Filed: March 27, 2020
    Publication date: July 16, 2020
    Inventors: Biji George, Om Ji Omer, Dipan Kumar Mandal, Cormac Brick, Lance Hacking, Sreenivas Subramoney, Belliappa Kuttanna
  • Publication number: 20200134417
    Abstract: Example apparatus disclosed herein include an array of processor elements, the array including rows each having a first number of processor elements and columns each having a second number of processor elements. Disclosed example apparatus also include configuration registers to store descriptors to configure the array to implement a layer of a convolutional neural network based on a dataflow schedule corresponding to one of multiple tensor processing templates, ones of the processor elements to be configured based on the descriptors to implement the one of the tensor processing templates to operate on input activation data and filter data associated with the layer of the convolutional neural network to produce output activation data associated with the layer of the convolutional neural network. Disclosed example apparatus further include memory to store the input activation data, the filter data and the output activation data associated with the layer of the convolutional neural network.
    Type: Application
    Filed: December 24, 2019
    Publication date: April 30, 2020
    Inventors: Debabrata Mohapatra, Arnab Raha, Gautham Chinya, Huichu Liu, Cormac Brick, Lance Hacking
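The descriptor-driven configuration described above can be sketched as a dispatch on a template field: a register-style descriptor picks one of several dataflow templates, and the array executes the layer under that template. The template names ("output_stationary", "weight_stationary") and the 1-D convolution stand-in are assumptions for illustration, not the patented templates.

```python
# Hypothetical sketch: both templates compute the same 1-D valid
# convolution; they differ only in loop order, which is the essence of a
# dataflow template (what stays resident while loops iterate).
def run_layer(descriptor, activations, filt):
    n = len(activations) - len(filt) + 1
    if descriptor["template"] == "output_stationary":
        # Each output is fully accumulated before moving to the next.
        return [sum(activations[i + k] * filt[k] for k in range(len(filt)))
                for i in range(n)]
    elif descriptor["template"] == "weight_stationary":
        # Each filter tap is applied to every output before the next tap.
        out = [0] * n
        for k, w in enumerate(filt):
            for i in range(n):
                out[i] += activations[i + k] * w
        return out
    raise ValueError("unknown template")

acts, filt = [1, 2, 3, 4], [1, 1]
print(run_layer({"template": "output_stationary"}, acts, filt))  # [3, 5, 7]
```

Both templates produce identical results; on hardware, the choice changes which operands are reused in place and hence the memory traffic, not the answer.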
  • Patent number: 9372768
    Abstract: Techniques of debugging a computing system are described herein. The techniques may include generating debug data at agents in the computing system. The techniques may include recording the debug data at a storage element, wherein the storage element is disposed in a non-core portion of the circuit interconnect accessible to the agents.
    Type: Grant
    Filed: December 26, 2013
    Date of Patent: June 21, 2016
    Assignee: Intel Corporation
    Inventors: Jeremy Conner, Sabar Souag, Karunakara Kotary, Victor Ruybalid, Noel Eck, Ramana Rachakonda, Sankaran Menon, Lance Hacking
  • Patent number: 9189302
    Abstract: A technique to monitor events within a computer system or integrated circuit. In one embodiment, a software-accessible event monitoring storage and hardware-specific monitoring logic are selectable and their corresponding outputs may be monitored by accessing a counter to count events corresponding to each of software-accessible storage and hardware-specific monitoring logic.
    Type: Grant
    Filed: January 29, 2014
    Date of Patent: November 17, 2015
    Assignee: Intel Corporation
    Inventor: Lance Hacking
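The selectable-source monitoring described above can be modeled as a counter gated by a select field: two event sources (a software-visible one and a hardware-specific one) feed the counter, and only the selected source increments it. The class and field names are illustrative.

```python
# Toy model of a selectable event monitor: a select field chooses which
# event source ("sw" or "hw") the counter tracks.
class EventMonitor:
    def __init__(self):
        self.counters = {"sw": 0, "hw": 0}
        self.select = "sw"          # which source is currently monitored

    def event(self, source):
        # Count only events arriving from the selected source.
        if source == self.select:
            self.counters[source] += 1

    def read(self):
        return self.counters[self.select]

mon = EventMonitor()
mon.event("sw"); mon.event("hw"); mon.event("sw")
print(mon.read())  # 2: the "hw" event was not selected, so not counted
```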
  • Publication number: 20150186232
    Abstract: Techniques of debugging a computing system are described herein. The techniques may include generating debug data at agents in the computing system. The techniques may include recording the debug data at a storage element, wherein the storage element is disposed in a non-core portion of the circuit interconnect accessible to the agents.
    Type: Application
    Filed: December 26, 2013
    Publication date: July 2, 2015
    Inventors: Jeremy Conner, Sabar Souag, Karunakara Kotary, Victor Ruybalid, Noel Eck, Ramana Rachakonda, Sankaran Menon, Lance Hacking
  • Publication number: 20140149999
    Abstract: A technique to monitor events within a computer system or integrated circuit. In one embodiment, a software-accessible event monitoring storage and hardware-specific monitoring logic are selectable and their corresponding outputs may be monitored by accessing a counter to count events corresponding to each of software-accessible storage and hardware-specific monitoring logic.
    Type: Application
    Filed: January 29, 2014
    Publication date: May 29, 2014
    Inventor: Lance Hacking
  • Publication number: 20140108684
    Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable. Characteristics such as performance, thermal considerations, and average power are adjustable using the interconnect bandwidth throttler.
    Type: Application
    Filed: October 11, 2012
    Publication date: April 17, 2014
    Inventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
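The throttling rule in the abstract above is easy to model: count transactions per fixed window and refuse further ones once the adjustable maximum is reached, until the next window opens. The specific limit and window length below are illustrative.

```python
# Sketch of a window-based interconnect throttler: at most max_txns
# transactions are admitted per `window` cycles; the rest are refused
# ("interconnect off") until the window rolls over.
class Throttler:
    def __init__(self, max_txns, window):
        self.max_txns = max_txns    # adjustable transaction budget
        self.window = window        # adjustable window length (cycles)
        self.count = 0
        self.window_start = 0

    def allow(self, cycle):
        # A new window begins: reset the transaction count.
        if cycle - self.window_start >= self.window:
            self.window_start = cycle
            self.count = 0
        if self.count < self.max_txns:
            self.count += 1
            return True
        return False                # throttled for the rest of the window

t = Throttler(max_txns=2, window=4)
print([t.allow(c) for c in range(6)])
# [True, True, False, False, True, True]
```

Tuning `max_txns` and `window` trades bandwidth against the thermal and average-power characteristics the abstract mentions: a smaller budget or longer window caps sustained interconnect activity.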
  • Patent number: 8392728
    Abstract: A method to reduce idle leakage power in I/O pins of an integrated circuit using external circuitry. Initially, I/O pins on a package are subdivided into those that will also remain powered up and those that will power down during idle state. When a system enters a low power mode, a signal is sent to the external circuitry. The signal notifies the I/O pins that always remain powered up to notify the external circuitry to power down the other set of I/O pins.
    Type: Grant
    Filed: December 22, 2006
    Date of Patent: March 5, 2013
    Assignee: Intel Corporation
    Inventors: Lance Hacking, Belliappa Kuttanna, Rajesh Patel, Ashish Choubal, Terry Fletcher, Steven S. Varnum, Binta Patel
  • Patent number: 8312309
    Abstract: A technique to promote determinism among multiple clocking domains within a computer system or integrated circuit. In one embodiment, one or more execution units are placed in a deterministic state with respect to multiple clocks within a processor system having a number of different clocking domains.
    Type: Grant
    Filed: March 5, 2008
    Date of Patent: November 13, 2012
    Assignee: Intel Corporation
    Inventors: Eric L. Hendrickson, Sanjoy Mondal, Larry Thatcher, William Hodges, Lance Hacking, Sankaran Menon
  • Patent number: 8289850
    Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable. Characteristics such as performance, thermal considerations, and average power are adjustable using the interconnect bandwidth throttler.
    Type: Grant
    Filed: September 23, 2011
    Date of Patent: October 16, 2012
    Assignee: Intel Corporation
    Inventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
  • Publication number: 20120054387
    Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable. Characteristics such as performance, thermal considerations, and average power are adjustable using the interconnect bandwidth throttler.
    Type: Application
    Filed: September 23, 2011
    Publication date: March 1, 2012
    Inventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
  • Patent number: 8050177
    Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: November 1, 2011
    Assignee: Intel Corporation
    Inventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
  • Patent number: 7707350
    Abstract: A front side bus swizzle mechanism modifies the front side (address and data) bus on a chip so that, when the chip is positioned on one side of a printed circuit board, connection to a second chip located on the opposite side of the printed circuit board is simplified. The simplified connection may result in less complexity and minimize the consumption of additional printed circuit board real estate.
    Type: Grant
    Filed: March 28, 2008
    Date of Patent: April 27, 2010
    Assignee: Intel Corporation
    Inventors: Michael E. Altenburg, Binta M. Patel, Lance Hacking, David K. Dean