Patents by Inventor Blake Alan Hechtman
Blake Alan Hechtman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250148357
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for compressing a machine learning model having a plurality of parameters. In one aspect, one of the methods includes obtaining trained values of a set of parameters for at least a portion of a machine learning model, the trained values being represented in a first format; identifying one or more dense ranges for the trained values; determining a least number of bits required to represent each trained value within the one or more dense ranges; identifying a second format having a range that is smaller than a range of the first format; and generating a compressed version of the at least a portion of the machine learning model.
Type: Application
Filed: November 7, 2023
Publication date: May 8, 2025
Inventors: Aditya Binodkumar Agrawal, Blake Alan Hechtman, Matthew Leever Hedlund, David Alexander Majnemer, Marissa Karen Ikonomidis
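To make the quantization idea concrete, here is a minimal NumPy sketch of one way the abstract's steps could fit together; it is an illustration under assumptions, not the claimed method, and every name in it (compress, coverage, the codebook/outlier split) is hypothetical. It treats the central mass of a 1-D weight array as the dense range, derives the least bit width needed to index the distinct values inside it, and stores those values in the narrowest standard integer type (the "second format"), keeping the rare outliers in the original format.

```python
import numpy as np

def compress(weights, coverage=0.999):
    """Hypothetical sketch: re-encode the dense range of `weights` (1-D) in a
    narrower format, keeping outliers in the original format."""
    # Dense range: the interval holding the central `coverage` fraction of values.
    lo, hi = np.quantile(weights, [(1 - coverage) / 2, (1 + coverage) / 2])
    dense = (weights >= lo) & (weights <= hi)

    # Codebook of the distinct dense values; the least number of bits required
    # is the width needed to index that codebook.
    codebook, codes = np.unique(weights[dense], return_inverse=True)
    bits = max(1, int(np.ceil(np.log2(len(codebook)))))

    # The "second format": the narrowest standard integer type with enough range.
    fmt = next(t for t in (np.uint8, np.uint16, np.uint32, np.uint64)
               if np.iinfo(t).bits >= bits)

    return {
        "codes": codes.astype(fmt),          # dense values in the narrow format
        "codebook": codebook,                # maps codes back to the first format
        "outlier_values": weights[~dense],   # rare values kept in the first format
        "outlier_positions": np.flatnonzero(~dense),
    }
```

A real compressor would also have to choose among multiple dense ranges and round values within a range; this sketch sidesteps both by exact codebook lookup.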
-
Publication number: 20240378416
Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors for distributed training of a neural network. One of the methods includes receiving, at each of a plurality of devices, a respective batch; performing, by each device, a forward pass comprising, for each batch normalization layer: generating, by each of the devices, a respective output of the corresponding other layer for each training example in the batch; determining, by each of the devices, a per-replica mean and a per-replica variance; determining, for each sub-group, a distributed mean and a distributed variance from the per-replica means and the per-replica variances for the devices in the sub-group; and applying, by each device, batch normalization to the respective outputs of the corresponding other layer generated by the device using the distributed mean and the distributed variance for the sub-group to which the device belongs.
Type: Application
Filed: February 16, 2024
Publication date: November 14, 2024
Inventors: Blake Alan Hechtman, Sameer Kumar
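The sub-group statistics in this abstract combine exactly when each device's batch has the same size, because the pooled second moment can be averaged across replicas. A minimal NumPy simulation of that arithmetic follows; the plain Python loop stands in for the cross-replica communication a real system would use (e.g., an all-reduce restricted to the sub-group), and all names are hypothetical.

```python
import numpy as np

def subgroup_batch_norm(per_device_acts, group_size, eps=1e-5):
    """Batch norm whose statistics are shared within sub-groups of devices.

    per_device_acts: one [batch, features] array per device; devices i and j
    share statistics iff i // group_size == j // group_size.
    """
    # Per-replica statistics, computed independently on each device.
    means = [a.mean(axis=0) for a in per_device_acts]
    variances = [a.var(axis=0) for a in per_device_acts]

    out = []
    for g in range(0, len(per_device_acts), group_size):
        m = np.stack(means[g:g + group_size])
        v = np.stack(variances[g:g + group_size])
        # Distributed statistics for the sub-group (equal batch sizes assumed):
        # averaging E[x^2] = var + mean^2 and subtracting the squared distributed
        # mean matches pooling the sub-group's batches into one big batch.
        dist_mean = m.mean(axis=0)
        dist_var = (v + m ** 2).mean(axis=0) - dist_mean ** 2
        for a in per_device_acts[g:g + group_size]:
            out.append((a - dist_mean) / np.sqrt(dist_var + eps))
    return out
```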
-
Publication number: 20240232598
Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter, and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that, when executed by the hardware circuit, cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
Type: Application
Filed: September 18, 2023
Publication date: July 11, 2024
Inventors: David Alexander Majnemer, Blake Alan Hechtman, Bjarke Hammersholt Roune
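The consistency check in this abstract can be pictured with a toy software model; the sketch below is not the hardware circuit, only a NumPy analogy with hypothetical names. Main memory holds the whole feature map, the "scratchpad" caches a tile of rows, and a window's scratchpad view is treated as consistent with the main-memory view exactly when the whole window falls inside the cached tile; otherwise fresh data is transferred first.

```python
import numpy as np

def sum_windows(feature, window, tile_rows):
    """Slide a window over `feature`, reading only from a cached row tile."""
    h, w = feature.shape
    tile, tile_start = None, 0
    outputs = []
    for r in range(h - window + 1):
        for c in range(w - window + 1):
            # The scratchpad view of the current subset is consistent with the
            # main-memory view only if the whole window lies inside the tile.
            consistent = (tile is not None
                          and tile_start <= r
                          and r + window <= tile_start + tile_rows)
            if not consistent:
                # Inconsistent: transfer fresh feature tensor data into the tile.
                tile_start = r
                tile = feature[tile_start:tile_start + tile_rows]
            win = tile[r - tile_start:r - tile_start + window, c:c + window]
            outputs.append(win.sum())  # stands in for the matrix-unit compute
    return np.array(outputs).reshape(h - window + 1, w - window + 1)
```

For example, sum_windows(np.arange(36.).reshape(6, 6), window=3, tile_rows=4) performs only two tile transfers for its sixteen windows.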
-
Patent number: 11907825
Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors for distributed training of a neural network. One of the methods includes receiving, at each of a plurality of devices, a respective batch; performing, by each device, a forward pass comprising, for each batch normalization layer: generating, by each of the devices, a respective output of the corresponding other layer for each training example in the batch; determining, by each of the devices, a per-replica mean and a per-replica variance; determining, for each sub-group, a distributed mean and a distributed variance from the per-replica means and the per-replica variances for the devices in the sub-group; and applying, by each device, batch normalization to the respective outputs of the corresponding other layer generated by the device using the distributed mean and the distributed variance for the sub-group to which the device belongs.
Type: Grant
Filed: October 21, 2019
Date of Patent: February 20, 2024
Assignee: Google LLC
Inventors: Blake Alan Hechtman, Sameer Kumar
-
Publication number: 20230418797
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a kNN computation using a hardware accelerator. One of the methods includes obtaining a set of one or more query vectors; obtaining a set of database vectors; and performing, on a hardware accelerator and for each query vector in the set, a search for the k most similar database vectors to the query vector, comprising: computing, by circuitry of the hardware accelerator and for each query vector, a respective similarity value between the query vector and each database vector; and for each query vector, identifying, by the hardware accelerator and for each bin, (i) an index of the most similar database vector within the bin and (ii) the respective similarity value for the most similar database vector within the bin.
Type: Application
Filed: June 26, 2023
Publication date: December 28, 2023
Inventors: Felix Ren-Chyan Chern, Blake Alan Hechtman, Andrew Thomas Davis, Ruiqi Guo, Sanjiv Kumar, David Alexander Majnemer
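A small NumPy sketch of the two-stage search in this abstract follows, with hypothetical names throughout: similarities are dot products, stage one keeps each bin's single best index and value (the per-bin identification the abstract assigns to the accelerator), and stage two takes a cheap top-k over the one-candidate-per-bin survivors. The result matches exact kNN only when the true k nearest vectors land in k distinct bins, the usual trade-off of this kind of binning.

```python
import numpy as np

def binned_knn(queries, database, k, num_bins):
    """Two-stage kNN: per-bin argmax, then top-k over bin winners (k <= num_bins)."""
    # Similarity of every query to every database vector (dot product).
    sims = queries @ database.T                                   # [nq, ndb]
    bins = np.array_split(np.arange(database.shape[0]), num_bins)

    # Stage 1: for each bin, (i) the index and (ii) the similarity value of
    # the most similar database vector within the bin.
    best_idx = np.stack([b[np.argmax(sims[:, b], axis=1)] for b in bins], axis=1)
    best_sim = np.stack([np.max(sims[:, b], axis=1) for b in bins], axis=1)

    # Stage 2: top-k over one candidate per bin, far cheaper than over all ndb.
    order = np.argsort(-best_sim, axis=1)[:, :k]
    return (np.take_along_axis(best_idx, order, axis=1),
            np.take_along_axis(best_sim, order, axis=1))
```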
-
Patent number: 11763142
Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter, and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that, when executed by the hardware circuit, cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
Type: Grant
Filed: September 2, 2022
Date of Patent: September 19, 2023
Assignee: Google LLC
Inventors: David Alexander Majnemer, Blake Alan Hechtman, Bjarke Hammersholt Roune
-
Publication number: 20230206126
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for transforming patterns of operations on tensors in a computational graph to reduce the memory burden incurred when reshape operations are performed, in particular when deployed to hardware platforms that have vector instructions or vector memory requiring alignment of operands.
Type: Application
Filed: December 23, 2022
Publication date: June 29, 2023
Inventor: Blake Alan Hechtman
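One family of rewrites that fits this description moves reshapes past elementwise operations so that adjacent reshapes collapse into one, saving a materialized, alignment-padded copy on vector hardware. The toy rewriter below is a sketch of that pattern only, not the patented transformation; graphs are nested tuples and all names are made up.

```python
# Nodes: ("reshape", shape, input) or ("add", lhs, rhs); leaves are strings.

def merge_reshapes(node):
    """Collapse reshape patterns: hoist matching reshapes above an elementwise
    add, and fuse back-to-back reshapes into a single one."""
    if isinstance(node, tuple) and node[0] == "add":
        lhs, rhs = merge_reshapes(node[1]), merge_reshapes(node[2])
        # reshape(a, s) + reshape(b, s)  ->  reshape(a + b, s): one reshape
        # (one aligned copy) where there used to be two.
        if (isinstance(lhs, tuple) and lhs[0] == "reshape"
                and isinstance(rhs, tuple) and rhs[0] == "reshape"
                and lhs[1] == rhs[1]):
            return ("reshape", lhs[1], ("add", lhs[2], rhs[2]))
        return ("add", lhs, rhs)
    if isinstance(node, tuple) and node[0] == "reshape":
        inner = merge_reshapes(node[2])
        # reshape(reshape(x, s1), s2)  ->  reshape(x, s2)
        if isinstance(inner, tuple) and inner[0] == "reshape":
            return ("reshape", node[1], inner[2])
        return ("reshape", node[1], inner)
    return node

# merge_reshapes(("add", ("reshape", (2, 6), "x"), ("reshape", (2, 6), "y")))
# -> ("reshape", (2, 6), ("add", "x", "y"))
```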
-
Publication number: 20220414441
Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter, and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that, when executed by the hardware circuit, cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
Type: Application
Filed: September 2, 2022
Publication date: December 29, 2022
Inventors: David Alexander Majnemer, Blake Alan Hechtman, Bjarke Hammersholt Roune
-
Patent number: 11537939
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for transforming patterns of operations on tensors in a computational graph to reduce the memory burden incurred when reshape operations are performed, in particular when deployed to hardware platforms that have vector instructions or vector memory requiring alignment of operands.
Type: Grant
Filed: May 3, 2019
Date of Patent: December 27, 2022
Assignee: Google LLC
Inventor: Blake Alan Hechtman
-
Patent number: 11500959
Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors and similarly structured data that are generated in parallel, for example, on nodes organized in a mesh or torus topology defined by connections in at least two dimensions between the nodes. The methods provide parallel computation and communication between nodes in the topology.
Type: Grant
Filed: August 16, 2019
Date of Patent: November 15, 2022
Assignee: Google LLC
Inventors: David Alexander Majnemer, Blake Alan Hechtman
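The dimension-at-a-time structure the abstract hints at can be simulated in a few lines of NumPy; each np.sum below stands in for a ring reduction running over one mesh dimension's links, so the sketch shows only the data flow, not the patented communication schedule, and its names are hypothetical.

```python
import numpy as np

def torus_all_reduce(grads):
    """All-reduce on a 2-D mesh/torus, one dimension at a time.

    grads: [rows, cols, n], one gradient vector per node.
    """
    # Phase 1: reduce each column, so every node in a column holds its sum.
    after_cols = np.broadcast_to(grads.sum(axis=0, keepdims=True), grads.shape)
    # Phase 2: reduce each row of the column sums; every node now holds the
    # global sum while only ever exchanging data with its mesh neighbors.
    return np.broadcast_to(after_cols.sum(axis=1, keepdims=True), grads.shape)
```

In a bandwidth-optimal version each phase would itself be a reduce-scatter followed by an all-gather, so different chunks of the gradient vector travel over different links at the same time, which is one way to realize the parallel computation and communication the abstract mentions.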
-
Patent number: 11449739
Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter, and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that, when executed by the hardware circuit, cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
Type: Grant
Filed: August 22, 2019
Date of Patent: September 20, 2022
Assignee: Google LLC
Inventors: David Alexander Majnemer, Blake Alan Hechtman, Bjarke Hammersholt Roune
-
Publication number: 20210390410
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using a computer vision neural network that has one or more local self-attention layers. Each local self-attention layer is configured to apply one or more local self-attention mechanisms to the layer input to the local self-attention layer.
Type: Application
Filed: June 14, 2021
Publication date: December 16, 2021
Inventors: Ashish Teku Vaswani, Prajit Ramachandran, Aravind Srinivas Lakshminarayanan, Blake Alan Hechtman, Niki J. Parmar
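As a concrete reference point, here is a minimal NumPy version of one local self-attention mechanism: each spatial position attends only to a window-by-window neighborhood of the layer input rather than to the whole image. It is a sketch under simplifying assumptions (float input, values equal keys, and the learned query/key/value projections and multiple heads of a real layer are omitted), not the patented layer.

```python
import numpy as np

def local_self_attention(x, window=3):
    """Each position of x ([height, width, channels]) attends to its
    `window` x `window` neighborhood via scaled dot-product attention."""
    h, w, c = x.shape
    pad = window // 2
    padded = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    out = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            q = x[i, j]                                   # query: this position
            keys = padded[i:i + window, j:j + window].reshape(-1, c)
            scores = keys @ q / np.sqrt(c)                # scaled dot products
            weights = np.exp(scores - scores.max())       # stable softmax
            weights /= weights.sum()
            out[i, j] = weights @ keys                    # values == keys here
    return out
```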
-
Publication number: 20210056396
Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter, and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that, when executed by the hardware circuit, cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
Type: Application
Filed: August 22, 2019
Publication date: February 25, 2021
Inventors: David Alexander Majnemer, Blake Alan Hechtman, Bjarke Hammersholt Roune
-
Publication number: 20210049231
Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors and similarly structured data that are generated in parallel, for example, on nodes organized in a mesh or torus topology defined by connections in at least two dimensions between the nodes. The methods provide parallel computation and communication between nodes in the topology.
Type: Application
Filed: August 16, 2019
Publication date: February 18, 2021
Inventors: David Alexander Majnemer, Blake Alan Hechtman
-
Publication number: 20200349465
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for transforming patterns of operations on tensors in a computational graph to reduce the memory burden incurred when reshape operations are performed, in particular when deployed to hardware platforms that have vector instructions or vector memory requiring alignment of operands.
Type: Application
Filed: May 3, 2019
Publication date: November 5, 2020
Inventor: Blake Alan Hechtman
-
Publication number: 20200125949
Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors for distributed training of a neural network. One of the methods includes receiving, at each of a plurality of devices, a respective batch; performing, by each device, a forward pass comprising, for each batch normalization layer: generating, by each of the devices, a respective output of the corresponding other layer for each training example in the batch; determining, by each of the devices, a per-replica mean and a per-replica variance; determining, for each sub-group, a distributed mean and a distributed variance from the per-replica means and the per-replica variances for the devices in the sub-group; and applying, by each device, batch normalization to the respective outputs of the corresponding other layer generated by the device using the distributed mean and the distributed variance for the sub-group to which the device belongs.
Type: Application
Filed: October 21, 2019
Publication date: April 23, 2020
Inventors: Blake Alan Hechtman, Sameer Kumar