Patents by Inventor Yiren Zhao

Yiren Zhao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12277502
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: January 17, 2024
    Date of Patent: April 15, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
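The compression scheme this abstract describes can be illustrated with a short sketch: all activations in a block share one exponent, and each mantissa is truncated to a few bits (the lossy step). The function names, the round-to-nearest truncation, and the 4-bit width below are illustrative assumptions, not the patented implementation.

```python
import numpy as np

def to_block_floating_point(values, mantissa_bits):
    """Quantize a block of values to a shared-exponent (block floating-point)
    format with a truncated, lossy mantissa."""
    # Shared exponent: smallest power of two covering the largest magnitude.
    max_abs = np.max(np.abs(values))
    shared_exp = int(np.ceil(np.log2(max_abs))) if max_abs > 0 else 0
    scale = 2.0 ** (shared_exp - mantissa_bits)
    # Lossy mantissa: round to the nearest representable step, then clamp
    # to the signed range expressible in `mantissa_bits` bits.
    mantissas = np.clip(np.round(values / scale),
                        -(2 ** mantissa_bits), 2 ** mantissa_bits - 1)
    return mantissas.astype(np.int32), shared_exp

def from_block_floating_point(mantissas, shared_exp, mantissa_bits):
    """Reconstruct approximate values from the compressed representation."""
    return mantissas * 2.0 ** (shared_exp - mantissa_bits)
```

In this sketch the forward-pass activations would be stored as the small integer mantissas plus one exponent per block, and decompressed on demand during back propagation.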
  • Publication number: 20250061320
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
    Type: Application
    Filed: November 4, 2024
    Publication date: February 20, 2025
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
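The format-selection idea in this abstract, choosing a second floating-point format based on a performance metric, can be sketched as a search over candidate mantissa widths. Relative reconstruction error stands in here for the abstract's unspecified metric; the function name and tolerance are hypothetical.

```python
import numpy as np

def select_mantissa_width(activations, candidate_bits, max_rel_error):
    """Return the narrowest mantissa width whose quantization error is
    within tolerance; relative error plays the role of the performance
    metric used to pick the second floating-point format."""
    for bits in sorted(candidate_bits):
        # Quantize to a shared scale derived from the largest magnitude.
        scale = np.max(np.abs(activations)) / (2 ** bits)
        quantized = np.round(activations / scale) * scale
        rel_error = (np.linalg.norm(quantized - activations)
                     / np.linalg.norm(activations))
        if rel_error <= max_rel_error:
            return bits
    return max(candidate_bits)  # no narrow format qualified; keep the widest
```

A real system could also vary the exponent format or switch to non-uniform mantissas, as the abstract notes; this sketch only varies mantissa width.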
  • Patent number: 12165038
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, the exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
    Type: Grant
    Filed: February 14, 2019
    Date of Patent: December 10, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
  • Patent number: 12067495
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: January 3, 2023
    Date of Patent: August 20, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Patent number: 12045724
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats having outlier values are disclosed, and in particular for storing activation values from a neural network in a compressed format for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by a compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. Outlier values, comprising additional bits of mantissa and/or exponent, are stored in ancillary storage for a subset of the activation values. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: July 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao, Ritchie Zhao
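The outlier mechanism in this abstract can be sketched as follows: the largest-magnitude activations are kept at full precision in an ancillary store, so the shared scale of the narrow format is not stretched to cover them. The fraction of outliers, the dict-based ancillary store, and the function names are illustrative assumptions.

```python
import numpy as np

def compress_with_outliers(values, mantissa_bits, outlier_fraction=0.05):
    """Compress values to a narrow shared-scale format, keeping the
    largest-magnitude entries ("outliers") at full precision in an
    ancillary store."""
    n_outliers = max(1, int(len(values) * outlier_fraction))
    # The largest-magnitude values become outliers.
    outlier_idx = np.argsort(np.abs(values))[-n_outliers:]
    ancillary = {int(i): float(values[i]) for i in outlier_idx}
    # The shared scale is derived from the remaining (inlier) values,
    # so the narrow format's range is not wasted on the outliers.
    inliers = np.delete(values, outlier_idx)
    max_abs = np.max(np.abs(inliers)) if inliers.size else 1.0
    scale = max_abs / (2 ** mantissa_bits) if max_abs > 0 else 1.0
    quantized = np.round(values / scale).astype(np.int32)
    return quantized, scale, ancillary

def decompress_with_outliers(quantized, scale, ancillary):
    values = quantized.astype(np.float64) * scale
    for i, v in ancillary.items():  # restore full-precision outliers
        values[i] = v
    return values
```

During back propagation, decompression restores the inliers at reduced precision and the outliers exactly, approximating the scheme the abstract describes.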
  • Publication number: 20240152758
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: January 17, 2024
    Publication date: May 9, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Publication number: 20230140185
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: January 3, 2023
    Publication date: May 4, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Patent number: 11562247
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: January 24, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Publication number: 20200264876
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, the exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
    Type: Application
    Filed: February 14, 2019
    Publication date: August 20, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
  • Publication number: 20200242474
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: January 24, 2019
    Publication date: July 30, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Publication number: 20200210838
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: December 31, 2018
    Publication date: July 2, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao, Ritchie Zhao
  • Publication number: 20200210839
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats having outlier values are disclosed, and in particular for storing activation values from a neural network in a compressed format for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by a compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. Outlier values, comprising additional bits of mantissa and/or exponent, are stored in ancillary storage for a subset of the activation values. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: December 31, 2018
    Publication date: July 2, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao, Ritchie Zhao