Patents by Inventor Lance Hacking
Lance Hacking has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240220785Abstract: Methods and systems include a neural network system that includes a neural network accelerator comprising. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the deep neural network system. The neural network accelerator also includes a schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.Type: ApplicationFiled: January 10, 2024Publication date: July 4, 2024Applicant: Intel CorporationInventors: Gautham Chinya, Huichu Liu, Arnab Raha, Debabrata Mohapatra, Cormac Brick, Lance Hacking
-
Patent number: 11907827Abstract: Methods and systems include a neural network system that includes a neural network accelerator. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the deep neural network system. The neural network accelerator also includes a schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.Type: GrantFiled: June 28, 2019Date of Patent: February 20, 2024Assignee: Intel CorporationInventors: Gautham Chinya, Huichu Liu, Arnab Raha, Debabrata Mohapatra, Cormac Brick, Lance Hacking
-
Publication number: 20240005135Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format; and a sparsity hardware circuit to reduce computing on zero values at the multiply-add circuit, wherein the processor to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.Type: ApplicationFiled: April 18, 2023Publication date: January 4, 2024Applicant: Intel CorporationInventors: Avishaii Abuhatzera, Om Ji Omer, Ritwika Chowdhury, Lance Hacking
-
Patent number: 11714998Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format; and a sparsity hardware circuit to reduce computing on zero values at the multiply-add circuit, wherein the processor to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.Type: GrantFiled: June 23, 2020Date of Patent: August 1, 2023Assignee: INTEL CORPORATIONInventors: Avishaii Abuhatzera, Om Ji Omer, Ritwika Chowdhury, Lance Hacking
-
Publication number: 20230004430Abstract: Technology for estimating neural network (NN) power profiles includes obtaining a plurality of workloads for a compiled NN model, the plurality of workloads determined for a hardware execution device, determining a hardware efficiency factor for the compiled NN model, and generating, based on the hardware efficiency factor, a power profile for the compiled NN model on one or more of a per-layer basis or a per-workload basis. The hardware efficiency factor can be determined on based on a hardware efficiency measurement and a hardware utilization measurement, and can be determined on a per-workload basis. A configuration file can be provided for generating the power profile, and an output visualization of the power profile can be generated. Further, feedback information can be generated to perform one or more of selecting a hardware device, optimizing a breakdown of workloads, optimizing a scheduling of tasks, or confirming a hardware device design.Type: ApplicationFiled: July 2, 2022Publication date: January 5, 2023Inventors: Richard Richmond, Eric Luk, Lingdan Zeng, Lance Hacking, Alessandro Palla, Mohamed Elmalaki, Sara Almalih
-
Patent number: 11347828Abstract: A disclosed apparatus to multiply matrices includes a compute engine. The compute engine includes multipliers in a two dimensional array that has a plurality of array locations defined by columns and rows. The apparatus also includes a plurality of adders in columns. A broadcast interconnect between a cache and the multipliers broadcasts a first set of operand data elements to multipliers in the rows of the array. A unicast interconnect unicasts a second set of operands between a data buffer and the multipliers. The multipliers multiply the operands to generate a plurality of outputs, and the adders add the outputs generated by the multipliers.Type: GrantFiled: March 27, 2020Date of Patent: May 31, 2022Assignee: Intel CorporationInventors: Biji George, Om Ji Omer, Dipan Kumar Mandal, Cormac Brick, Lance Hacking, Sreenivas Subramoney, Belliappa Kuttanna
-
Publication number: 20200410327Abstract: Methods and systems include a neural network system that includes a neural network accelerator comprising. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the deep neural network system. The neural network accelerator also includes a schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.Type: ApplicationFiled: June 28, 2019Publication date: December 31, 2020Inventors: Gautham Chinya, Huichu Liu, Arnab Raha, Debabrata Mohapatra, Cormac Brick, Lance Hacking
-
Publication number: 20200320375Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format; and a sparsity hardware circuit to reduce computing on zero values at the multiply-add circuit, wherein the processor to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.Type: ApplicationFiled: June 23, 2020Publication date: October 8, 2020Applicant: Intel CorporationInventors: Avishaii Abuhatzera, Om Ji Omer, Ritwika Chowdhury, Lance Hacking
-
Publication number: 20200226203Abstract: A disclosed apparatus to multiply matrices includes a compute engine. The compute engine includes multipliers in a two dimensional array that has a plurality of array locations defined by columns and rows. The apparatus also includes a plurality of adders in columns. A broadcast interconnect between a cache and the multipliers broadcasts a first set of operand data elements to multipliers in the rows of the array. A unicast interconnect unicasts a second set of operands between a data buffer and the multipliers. The multipliers multiply the operands to generate a plurality of outputs, and the adders add the outputs generated by the multipliers.Type: ApplicationFiled: March 27, 2020Publication date: July 16, 2020Inventors: Biji George, Om Ji Omer, Dipan Kumar Mandal, Cormac Brick, Lance Hacking, Sreenivas Subramoney, Belliappa Kuttanna
-
Publication number: 20200134417Abstract: Example apparatus disclosed herein include an array of processor elements, the array including rows each having a first number of processor elements and columns each having a second number of processor elements. Disclosed example apparatus also include configuration registers to store descriptors to configure the array to implement a layer of a convolutional neural network based on a dataflow schedule corresponding to one of multiple tensor processing templates, ones of the processor elements to be configured based on the descriptors to implement the one of the tensor processing templates to operate on input activation data and filter data associated with the layer of the convolutional neural network to produce output activation data associated with the layer of the convolutional neural network. Disclosed example apparatus further include memory to store the input activation data, the filter data and the output activation data associated with the layer of the convolutional neural network.Type: ApplicationFiled: December 24, 2019Publication date: April 30, 2020Inventors: Debabrata Mohapatra, Arnab Raha, Gautham Chinya, Huichu Liu, Cormac Brick, Lance Hacking
-
Patent number: 9372768Abstract: Techniques of debugging a computing system are described herein. The techniques may include generating debug data at agents in the computing system. The techniques may include recording the debug data at a storage element, wherein the storage element is disposed in a non-core portion of the circuit interconnect accessible to the agents.Type: GrantFiled: December 26, 2013Date of Patent: June 21, 2016Assignee: Intel CorporationInventors: Jeremy Conner, Sabar Souag, Karunakara Kotary, Victor Ruybalid, Noel Eck, Ramana Rachakonda, Sankaran Menon, Lance Hacking
-
Patent number: 9189302Abstract: A technique to monitor events within a computer system or integrated circuit. In one embodiment, a software-accessible event monitoring storage and hardware-specific monitoring logic are selectable and their corresponding outputs may be monitored by accessing a counter to count events corresponding to each of software-accessible storage and hardware-specific monitoring logic.Type: GrantFiled: January 29, 2014Date of Patent: November 17, 2015Assignee: Intel CorporationInventor: Lance Hacking
-
Publication number: 20150186232Abstract: Techniques of debugging a computing system are described herein. The techniques may include generating debug data at agents in the computing system. The techniques may include recording the debug data at a storage element, wherein the storage element is disposed in a non-core portion of the circuit interconnect accessible to the agents.Type: ApplicationFiled: December 26, 2013Publication date: July 2, 2015Inventors: Jeremy Conner, Sabar Souag, Karunakara Kotary, Victor Ruybalid, Noel Eck, Ramana Rachakonda, Sankaran Menon, Lance Hacking
-
Publication number: 20140149999Abstract: A technique to monitor events within a computer system or integrated circuit. In one embodiment, a software-accessible event monitoring storage and hardware-specific monitoring logic are selectable and their corresponding outputs may be monitored by accessing a counter to count events corresponding to each of software-accessible storage and hardware-specific monitoring logic.Type: ApplicationFiled: January 29, 2014Publication date: May 29, 2014Inventor: Lance Hacking
-
Publication number: 20140108684Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable. Characteristics such as performance, thermal considerations, and average power are adjustable using the interconnect bandwidth throttler.Type: ApplicationFiled: October 11, 2012Publication date: April 17, 2014Inventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
-
Patent number: 8392728Abstract: A method to reduce idle leakage power in I/O pins of an integrated circuit using external circuitry. Initially, I/O pins on a package are subdivided into those that will also remain powered up and those that will power down during idle state. When a system enters a low power mode, a signal is sent to the external circuitry. The signal notifies the I/O pins that always remain powered up to notify the external circuitry to power down the other set of I/O pins.Type: GrantFiled: December 22, 2006Date of Patent: March 5, 2013Assignee: Intel CorporationInventors: Lance Hacking, Belliappa Kuttanna, Rajesh Patel, Ashish Choubal, Terry Fletcher, Steven S. Varnum, Binta Patel
-
Patent number: 8312309Abstract: A technique to promote determinism among multiple clocking domains within a computer system or integrated circuit, In one embodiment, one or more execution units are placed in a deterministic state with respect to multiple clocks within a processor system having a number of different clocking domains.Type: GrantFiled: March 5, 2008Date of Patent: November 13, 2012Assignee: Intel CorporationInventors: Eric L. Hendrickson, Sanjoy Mondal, Larry Thatcher, William Hodges, Lance Hacking, Sankaran Menon
-
Patent number: 8289850Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable. Characteristics such as performance, thermal considerations, and average power are adjustable using the interconnect bandwidth throttler.Type: GrantFiled: September 23, 2011Date of Patent: October 16, 2012Assignee: Intel CorporationInventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
-
Publication number: 20120054387Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has taken place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable. Characteristics such as performance, thermal considerations, and average power are adjustable using the interconnect bandwidth throttler.Type: ApplicationFiled: September 23, 2011Publication date: March 1, 2012Inventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel
-
Patent number: 8050177Abstract: An interconnect bandwidth throttler is disclosed. The interconnect bandwidth throttler turns off the interconnect, based on whether a maximum number of transactions has take place within a predetermined throttle window. Both the maximum number of transactions and the throttle window are adjustable.Type: GrantFiled: March 31, 2008Date of Patent: November 1, 2011Assignee: Intel CorporationInventors: Lance Hacking, Ramana Rachakonda, Belliappa Kuttanna, Rajesh Patel