Patents by Inventor Lance Hacking

Lance Hacking has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11907827
    Abstract: Methods and systems include a neural network system that includes a neural network accelerator. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the neural network system. The neural network accelerator also includes schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: February 20, 2024
    Assignee: Intel Corporation
    Inventors: Gautham Chinya, Huichu Liu, Arnab Raha, Debabrata Mohapatra, Cormac Brick, Lance Hacking
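The load / extract / reorganize / store flow in the abstract above can be sketched in software. This is a toy model, not the patented circuitry: the engine count, the round-robin tiling, and the stand-in per-engine operation are all illustrative assumptions.

```python
# Hypothetical sketch of schedule-aware tensor distribution: load tiles
# into engines, run them, then reorganize results back into memory order.
def distribute_and_collect(tensor, num_engines):
    # Load phase: round-robin elements into per-engine input queues.
    queues = [[] for _ in range(num_engines)]
    for i, value in enumerate(tensor):
        queues[i % num_engines].append(value)

    # Each engine performs its arithmetic (here: a stand-in doubling op).
    outputs = [[2 * v for v in q] for q in queues]

    # Extraction phase: pull results back and reorganize them into the
    # original element order before storing to "memory".
    memory = [None] * len(tensor)
    for engine, out in enumerate(outputs):
        for j, value in enumerate(out):
            memory[engine + j * num_engines] = value
    return memory

print(distribute_and_collect([1, 2, 3, 4, 5, 6, 7], 3))  # [2, 4, 6, 8, 10, 12, 14]
```

The reorganization step is the point of interest: output order at the engines differs from memory order, so an explicit reshuffle is needed before the store.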
  • Publication number: 20240005135
    Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format, and a sparsity hardware circuit to reduce computing on zero values at the multiply-add circuit, wherein the processor is to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.
    Type: Application
    Filed: April 18, 2023
    Publication date: January 4, 2024
    Applicant: Intel Corporation
    Inventors: Avishaii Abuhatzera, Om Ji Omer, Ritwika Chowdhury, Lance Hacking
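The re-encoding idea above can be illustrated numerically: one higher-precision signed integer is split into two lower-precision signed pieces, and the multiply works piecewise, skipping the high piece when it is zero (the "sparsity in higher order bits"). The 16-bit/8-bit widths and the split rule below are assumptions for illustration, not the claimed circuit.

```python
# Split a signed 16-bit value into (hi, lo) with x == hi * 256 + lo,
# where lo is a signed 8-bit value in [-128, 127].
def split_signed16(x):
    lo = ((x + 128) % 256) - 128
    hi = (x - lo) // 256
    return hi, lo

# Multiply using the two 8-bit pieces of x; the hi-piece product is
# skipped entirely when hi == 0 (sparsity in the higher-order bits).
def sparse_mul(x, y):
    hi, lo = split_signed16(x)
    acc = lo * y
    if hi != 0:              # small-magnitude x costs only one product
        acc += hi * 256 * y
    return acc

print(split_signed16(300))       # (1, 44): 1*256 + 44 == 300
print(sparse_mul(300, 7))        # 2100 == 300 * 7
```

Small-magnitude activations and weights are common in quantized models, so the hi piece is frequently zero and the second product can often be skipped.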
  • Patent number: 11714998
    Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format, and a sparsity hardware circuit to reduce computing on zero values at the multiply-add circuit, wherein the processor is to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.
    Type: Grant
    Filed: June 23, 2020
    Date of Patent: August 1, 2023
    Assignee: Intel Corporation
    Inventors: Avishaii Abuhatzera, Om Ji Omer, Ritwika Chowdhury, Lance Hacking
  • Publication number: 20230004430
    Abstract: Technology for estimating neural network (NN) power profiles includes obtaining a plurality of workloads for a compiled NN model, the plurality of workloads determined for a hardware execution device, determining a hardware efficiency factor for the compiled NN model, and generating, based on the hardware efficiency factor, a power profile for the compiled NN model on one or more of a per-layer basis or a per-workload basis. The hardware efficiency factor can be determined based on a hardware efficiency measurement and a hardware utilization measurement, and can be determined on a per-workload basis. A configuration file can be provided for generating the power profile, and an output visualization of the power profile can be generated. Further, feedback information can be generated to perform one or more of selecting a hardware device, optimizing a breakdown of workloads, optimizing a scheduling of tasks, or confirming a hardware device design.
    Type: Application
    Filed: July 2, 2022
    Publication date: January 5, 2023
    Inventors: Richard Richmond, Eric Luk, Lingdan Zeng, Lance Hacking, Alessandro Palla, Mohamed Elmalaki, Sara Almalih
  • Patent number: 11347828
    Abstract: A disclosed apparatus to multiply matrices includes a compute engine. The compute engine includes multipliers in a two dimensional array that has a plurality of array locations defined by columns and rows. The apparatus also includes a plurality of adders in columns. A broadcast interconnect between a cache and the multipliers broadcasts a first set of operand data elements to multipliers in the rows of the array. A unicast interconnect unicasts a second set of operands between a data buffer and the multipliers. The multipliers multiply the operands to generate a plurality of outputs, and the adders add the outputs generated by the multipliers.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: May 31, 2022
    Assignee: Intel Corporation
    Inventors: Biji George, Om Ji Omer, Dipan Kumar Mandal, Cormac Brick, Lance Hacking, Sreenivas Subramoney, Belliappa Kuttanna
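The broadcast/unicast dataflow in the abstract above maps naturally onto a small software model: one operand is broadcast across a row of multipliers, the other is unicast to each multiplier individually, and column adders reduce each column of products. The grid sizes and data are illustrative.

```python
# Toy model of the disclosed structure for a 1 x K row times K x N matrix:
# a_row[k] is broadcast to every multiplier in array row k, b[k][n] is
# unicast to multiplier (k, n), and a column adder sums each column.
def grid_matmul(a_row, b):
    rows, cols = len(b), len(b[0])
    # Multiplier array: products[k][n] = broadcast operand * unicast operand.
    products = [[a_row[k] * b[k][n] for n in range(cols)] for k in range(rows)]
    # Column adders reduce each column of products into one output.
    return [sum(products[k][n] for k in range(rows)) for n in range(cols)]

print(grid_matmul([1, 2], [[3, 4], [5, 6]]))  # [13, 16]
```

The payoff of the broadcast interconnect is reuse: each element of the first operand is fetched once from the cache and consumed by an entire row of multipliers.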
  • Publication number: 20200410327
    Abstract: Methods and systems include a neural network system that includes a neural network accelerator. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the neural network system. The neural network accelerator also includes schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.
    Type: Application
    Filed: June 28, 2019
    Publication date: December 31, 2020
    Inventors: Gautham Chinya, Huichu Liu, Arnab Raha, Debabrata Mohapatra, Cormac Brick, Lance Hacking
  • Publication number: 20200320375
    Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format, and a sparsity hardware circuit to reduce computing on zero values at the multiply-add circuit, wherein the processor is to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.
    Type: Application
    Filed: June 23, 2020
    Publication date: October 8, 2020
    Applicant: Intel Corporation
    Inventors: Avishaii Abuhatzera, Om Ji Omer, Ritwika Chowdhury, Lance Hacking
  • Publication number: 20200226203
    Abstract: A disclosed apparatus to multiply matrices includes a compute engine. The compute engine includes multipliers in a two dimensional array that has a plurality of array locations defined by columns and rows. The apparatus also includes a plurality of adders in columns. A broadcast interconnect between a cache and the multipliers broadcasts a first set of operand data elements to multipliers in the rows of the array. A unicast interconnect unicasts a second set of operands between a data buffer and the multipliers. The multipliers multiply the operands to generate a plurality of outputs, and the adders add the outputs generated by the multipliers.
    Type: Application
    Filed: March 27, 2020
    Publication date: July 16, 2020
    Inventors: Biji George, Om Ji Omer, Dipan Kumar Mandal, Cormac Brick, Lance Hacking, Sreenivas Subramoney, Belliappa Kuttanna
  • Publication number: 20200134417
    Abstract: Example apparatus disclosed herein include an array of processor elements, the array including rows each having a first number of processor elements and columns each having a second number of processor elements. Disclosed example apparatus also include configuration registers to store descriptors to configure the array to implement a layer of a convolutional neural network based on a dataflow schedule corresponding to one of multiple tensor processing templates, ones of the processor elements to be configured based on the descriptors to implement the one of the tensor processing templates to operate on input activation data and filter data associated with the layer of the convolutional neural network to produce output activation data associated with the layer of the convolutional neural network. Disclosed example apparatus further include memory to store the input activation data, the filter data and the output activation data associated with the layer of the convolutional neural network.
    Type: Application
    Filed: December 24, 2019
    Publication date: April 30, 2020
    Inventors: Debabrata Mohapatra, Arnab Raha, Gautham Chinya, Huichu Liu, Cormac Brick, Lance Hacking
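The descriptor-driven configuration described above can be sketched as a dispatch on a template field: a register-style descriptor picks one of several dataflow templates, and the array executes the layer under that template. The template names ("output_stationary", "weight_stationary") and the 1-D convolution stand-in are assumptions for illustration, not the patented templates.

```python
# Hypothetical sketch: both templates compute the same 1-D valid
# convolution; they differ only in loop order, which is the essence of a
# dataflow template (what stays resident while loops iterate).
def run_layer(descriptor, activations, filt):
    n = len(activations) - len(filt) + 1
    if descriptor["template"] == "output_stationary":
        # Each output is fully accumulated before moving to the next.
        return [sum(activations[i + k] * filt[k] for k in range(len(filt)))
                for i in range(n)]
    elif descriptor["template"] == "weight_stationary":
        # Each filter tap is applied to every output before the next tap.
        out = [0] * n
        for k, w in enumerate(filt):
            for i in range(n):
                out[i] += activations[i + k] * w
        return out
    raise ValueError("unknown template")

acts, filt = [1, 2, 3, 4], [1, 1]
print(run_layer({"template": "output_stationary"}, acts, filt))  # [3, 5, 7]
```

Both templates produce identical results; on hardware, the choice changes which operands are reused in place and hence the memory traffic, not the answer.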
  • Patent number: 9372768
    Abstract: Techniques of debugging a computing system are described herein. The techniques may include generating debug data at agents in the computing system. The techniques may include recording the debug data at a storage element, wherein the storage element is disposed in a non-core portion of the circuit interconnect accessible to the agents.
    Type: Grant
    Filed: December 26, 2013
    Date of Patent: June 21, 2016
    Assignee: Intel Corporation
    Inventors: Jeremy Conner, Sabar Souag, Karunakara Kotary, Victor Ruybalid, Noel Eck, Ramana Rachakonda, Sankaran Menon, Lance Hacking
  • Patent number: 9189302
    Abstract: A technique to monitor events within a computer system or integrated circuit. In one embodiment, a software-accessible event monitoring storage and hardware-specific monitoring logic are selectable and their corresponding outputs may be monitored by accessing a counter to count events corresponding to each of software-accessible storage and hardware-specific monitoring logic.
    Type: Grant
    Filed: January 29, 2014
    Date of Patent: November 17, 2015
    Assignee: Intel Corporation
    Inventor: Lance Hacking
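The selectable-source monitoring described above can be modeled as a counter gated by a select field: two event sources (a software-visible one and a hardware-specific one) feed the counter, and only the selected source increments it. The class and field names are illustrative.

```python
# Toy model of a selectable event monitor: a select field chooses which
# event source ("sw" or "hw") the counter tracks.
class EventMonitor:
    def __init__(self):
        self.counters = {"sw": 0, "hw": 0}
        self.select = "sw"          # which source is currently monitored

    def event(self, source):
        # Count only events arriving from the selected source.
        if source == self.select:
            self.counters[source] += 1

    def read(self):
        return self.counters[self.select]

mon = EventMonitor()
mon.event("sw"); mon.event("hw"); mon.event("sw")
print(mon.read())  # 2: the "hw" event was not selected, so not counted
```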
  • Publication number: 20150186232
    Abstract: Techniques of debugging a computing system are described herein. The techniques may include generating debug data at agents in the computing system. The techniques may include recording the debug data at a storage element, wherein the storage element is disposed in a non-core portion of the circuit interconnect accessible to the agents.
    Type: Application
    Filed: December 26, 2013
    Publication date: July 2, 2015
    Inventors: Jeremy Conner, Sabar Souag, Karunakara Kotary, Victor Ruybalid, Noel Eck, Ramana Rachakonda, Sankaran Menon, Lance Hacking
  • Publication number: 20140149999
    Abstract: A technique to monitor events within a computer system or integrated circuit. In one embodiment, a software-accessible event monitoring storage and hardware-specific monitoring logic are selectable and their corresponding outputs may be monitored by accessing a counter to count events corresponding to each of software-accessible storage and hardware-specific monitoring logic.
    Type: Application
    Filed: January 29, 2014
    Publication date: May 29, 2014
    Inventor: Lance Hacking
  • Publication number: 20140108684
    Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable. Characteristics such as performance, thermal considerations, and average power are adjustable using the interconnect bandwidth throttler.
    Type: Application
    Filed: October 11, 2012
    Publication date: April 17, 2014
    Inventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
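The throttling rule in the abstract above is easy to model: count transactions per fixed window and refuse further ones once the adjustable maximum is reached, until the next window opens. The specific limit and window length below are illustrative.

```python
# Sketch of a window-based interconnect throttler: at most max_txns
# transactions are admitted per `window` cycles; the rest are refused
# ("interconnect off") until the window rolls over.
class Throttler:
    def __init__(self, max_txns, window):
        self.max_txns = max_txns    # adjustable transaction budget
        self.window = window        # adjustable window length (cycles)
        self.count = 0
        self.window_start = 0

    def allow(self, cycle):
        # A new window begins: reset the transaction count.
        if cycle - self.window_start >= self.window:
            self.window_start = cycle
            self.count = 0
        if self.count < self.max_txns:
            self.count += 1
            return True
        return False                # throttled for the rest of the window

t = Throttler(max_txns=2, window=4)
print([t.allow(c) for c in range(6)])
# [True, True, False, False, True, True]
```

Tuning `max_txns` and `window` trades bandwidth against the thermal and average-power characteristics the abstract mentions: a smaller budget or longer window caps sustained interconnect activity.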
  • Patent number: 8392728
    Abstract: A method to reduce idle leakage power in I/O pins of an integrated circuit using external circuitry. Initially, I/O pins on a package are subdivided into those that will also remain powered up and those that will power down during idle state. When a system enters a low power mode, a signal is sent to the external circuitry. The signal notifies the I/O pins that always remain powered up to notify the external circuitry to power down the other set of I/O pins.
    Type: Grant
    Filed: December 22, 2006
    Date of Patent: March 5, 2013
    Assignee: Intel Corporation
    Inventors: Lance Hacking, Belliappa Kuttanna, Rajesh Patel, Ashish Choubal, Terry Fletcher, Steven S. Varnum, Binta Patel
  • Patent number: 8312309
    Abstract: A technique to promote determinism among multiple clocking domains within a computer system or integrated circuit. In one embodiment, one or more execution units are placed in a deterministic state with respect to multiple clocks within a processor system having a number of different clocking domains.
    Type: Grant
    Filed: March 5, 2008
    Date of Patent: November 13, 2012
    Assignee: Intel Corporation
    Inventors: Eric L. Hendrickson, Sanjoy Mondal, Larry Thatcher, William Hodges, Lance Hacking, Sankaran Menon
  • Patent number: 8289850
    Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable. Characteristics such as performance, thermal considerations, and average power are adjustable using the interconnect bandwidth throttler.
    Type: Grant
    Filed: September 23, 2011
    Date of Patent: October 16, 2012
    Assignee: Intel Corporation
    Inventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
  • Publication number: 20120054387
    Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable. Characteristics such as performance, thermal considerations, and average power are adjustable using the interconnect bandwidth throttler.
    Type: Application
    Filed: September 23, 2011
    Publication date: March 1, 2012
    Inventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
  • Patent number: 8050177
    Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: November 1, 2011
    Assignee: Intel Corporation
    Inventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
  • Patent number: 7707350
    Abstract: A front side bus swizzle mechanism modifies the front side (address and data) bus on a chip so that, when the chip is positioned on one side of a printed circuit board, connection to a second chip located on the opposite side of the printed circuit board is simplified. The simplified connection may result in less complexity and minimize the consumption of additional printed circuit board real estate.
    Type: Grant
    Filed: March 28, 2008
    Date of Patent: April 27, 2010
    Assignee: Intel Corporation
    Inventors: Michael E. Altenburg, Binta M. Patel, Lance Hacking, David K. Dean