Patents by Inventor Yiren Zhao

Yiren Zhao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12277502
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: January 17, 2024
    Date of Patent: April 15, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
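The compression scheme this abstract describes can be illustrated with a short sketch: all activations in a block share one exponent, and each mantissa is truncated to a few bits (the lossy step). The function names, the round-to-nearest truncation, and the 4-bit width below are illustrative assumptions, not the patented implementation.

```python
import numpy as np

def to_block_floating_point(values, mantissa_bits):
    """Quantize a block of values to a shared-exponent (block floating-point)
    format with a truncated, lossy mantissa."""
    # Shared exponent: smallest power of two covering the largest magnitude.
    max_abs = np.max(np.abs(values))
    shared_exp = int(np.ceil(np.log2(max_abs))) if max_abs > 0 else 0
    scale = 2.0 ** (shared_exp - mantissa_bits)
    # Lossy mantissa: round to the nearest representable step, then clamp
    # to the signed range expressible in `mantissa_bits` bits.
    mantissas = np.clip(np.round(values / scale),
                        -(2 ** mantissa_bits), 2 ** mantissa_bits - 1)
    return mantissas.astype(np.int32), shared_exp

def from_block_floating_point(mantissas, shared_exp, mantissa_bits):
    """Reconstruct approximate values from the compressed representation."""
    return mantissas * 2.0 ** (shared_exp - mantissa_bits)
```

In this sketch the forward-pass activations would be stored as the small integer mantissas plus one exponent per block, and decompressed on demand during back propagation.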
  • Publication number: 20250061320
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
    Type: Application
    Filed: November 4, 2024
    Publication date: February 20, 2025
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
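The format-selection idea in this abstract, choosing a second floating-point format based on a performance metric, can be sketched as a search over candidate mantissa widths. Relative reconstruction error stands in here for the abstract's unspecified metric; the function name and tolerance are hypothetical.

```python
import numpy as np

def select_mantissa_width(activations, candidate_bits, max_rel_error):
    """Return the narrowest mantissa width whose quantization error is
    within tolerance; relative error plays the role of the performance
    metric used to pick the second floating-point format."""
    for bits in sorted(candidate_bits):
        # Quantize to a shared scale derived from the largest magnitude.
        scale = np.max(np.abs(activations)) / (2 ** bits)
        quantized = np.round(activations / scale) * scale
        rel_error = (np.linalg.norm(quantized - activations)
                     / np.linalg.norm(activations))
        if rel_error <= max_rel_error:
            return bits
    return max(candidate_bits)  # no narrow format qualified; keep the widest
```

A real system could also vary the exponent format or switch to non-uniform mantissas, as the abstract notes; this sketch only varies mantissa width.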
  • Patent number: 12165038
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, the exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
    Type: Grant
    Filed: February 14, 2019
    Date of Patent: December 10, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
  • Patent number: 12067495
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: January 3, 2023
    Date of Patent: August 20, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Patent number: 12045724
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats having outlier values are disclosed, and in particular for storing activation values from a neural network in a compressed format for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by a compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. Outlier values, comprising additional bits of mantissa and/or exponent, are stored in ancillary storage for a subset of the activation values. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: July 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao, Ritchie Zhao
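The outlier mechanism in this abstract can be sketched as follows: the largest-magnitude activations are kept at full precision in an ancillary store, so the shared scale of the narrow format is not stretched to cover them. The fraction of outliers, the dict-based ancillary store, and the function names are illustrative assumptions.

```python
import numpy as np

def compress_with_outliers(values, mantissa_bits, outlier_fraction=0.05):
    """Compress values to a narrow shared-scale format, keeping the
    largest-magnitude entries ("outliers") at full precision in an
    ancillary store."""
    n_outliers = max(1, int(len(values) * outlier_fraction))
    # The largest-magnitude values become outliers.
    outlier_idx = np.argsort(np.abs(values))[-n_outliers:]
    ancillary = {int(i): float(values[i]) for i in outlier_idx}
    # The shared scale is derived from the remaining (inlier) values,
    # so the narrow format's range is not wasted on the outliers.
    inliers = np.delete(values, outlier_idx)
    max_abs = np.max(np.abs(inliers)) if inliers.size else 1.0
    scale = max_abs / (2 ** mantissa_bits) if max_abs > 0 else 1.0
    quantized = np.round(values / scale).astype(np.int32)
    return quantized, scale, ancillary

def decompress_with_outliers(quantized, scale, ancillary):
    values = quantized.astype(np.float64) * scale
    for i, v in ancillary.items():  # restore full-precision outliers
        values[i] = v
    return values
```

During back propagation, decompression restores the inliers at reduced precision and the outliers exactly, approximating the scheme the abstract describes.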
  • Publication number: 20240152758
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: January 17, 2024
    Publication date: May 9, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Publication number: 20230140185
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: January 3, 2023
    Publication date: May 4, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Patent number: 11562247
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: January 24, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Publication number: 20200264876
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, the exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
    Type: Application
    Filed: February 14, 2019
    Publication date: August 20, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
  • Publication number: 20200242474
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: January 24, 2019
    Publication date: July 30, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Publication number: 20200210838
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: December 31, 2018
    Publication date: July 2, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao, Ritchie Zhao
  • Publication number: 20200210839
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats having outlier values are disclosed, and in particular for storing activation values from a neural network in a compressed format for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by a compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. Outlier values, comprising additional bits of mantissa and/or exponent, are stored in ancillary storage for a subset of the activation values. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: December 31, 2018
    Publication date: July 2, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao, Ritchie Zhao