Patents by Inventor Daniel Lo

Daniel Lo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12277502
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: January 17, 2024
    Date of Patent: April 15, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
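
As an illustration of the shared-exponent ("block floating-point") storage this family of filings describes, here is a minimal numpy sketch of compressing a block of activation values to narrow mantissas and reconstructing them lossily. The function names, the 4-bit mantissa width, and the per-block exponent choice are assumptions for illustration, not the patented method.

```python
import numpy as np

def to_block_fp(values, mantissa_bits=4):
    # One shared exponent for the whole block, sized to the largest magnitude.
    max_mag = np.max(np.abs(values))
    shared_exp = int(np.ceil(np.log2(max_mag))) if max_mag > 0 else 0
    # Scale so each value becomes a small signed integer mantissa.
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    lo, hi = -(2 ** (mantissa_bits - 1)), 2 ** (mantissa_bits - 1) - 1
    mantissas = np.clip(np.round(values / scale), lo, hi).astype(np.int8)
    return shared_exp, mantissas

def from_block_fp(shared_exp, mantissas, mantissa_bits=4):
    # Lossy reconstruction, e.g. for retrieval during back propagation.
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return mantissas.astype(np.float64) * scale

activations = np.array([0.91, -0.13, 0.42, 0.05])
exp, mant = to_block_fp(activations)
print(from_block_fp(exp, mant))   # [ 0.875 -0.125  0.375  0.   ]
```
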
  • Patent number: 12260910
    Abstract: The present disclosure is directed to a sense amplifier architecture for a memory device having a plurality of memory cells. Groups of non-volatile memory cells store respective codewords formed by the stored logic states, logic high or logic low, of the memory cells of the group. The sense amplifier architecture has a plurality of sense amplifier reading branches, each sense amplifier reading branch coupled to a respective memory cell and configured to provide an output signal, which is indicative of the cell current flowing through that memory cell; a comparison stage, to perform a comparison between the cell currents of memory cells of a group; and a logic stage, to determine, based on comparison results provided by the comparison stage, a read codeword corresponding to the group of memory cells.
    Type: Grant
    Filed: December 29, 2022
    Date of Patent: March 25, 2025
    Assignee: STMICROELECTRONICS S.r.l.
    Inventors: Fabio Enrico Carlo Disegni, Marcella Carissimi, Alessandro Tomasoni, Daniele Lo Iacono
  • Publication number: 20250061320
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
    Type: Application
    Filed: November 4, 2024
    Publication date: February 20, 2025
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
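
The selection of a second format "based on a performance metric" could, under one plausible reading, amount to sweeping candidate mantissa widths and keeping the narrowest one whose quantization error stays under a target. A hedged numpy sketch; the names, candidate widths, and error threshold are all illustrative assumptions:

```python
import numpy as np

def quantize_mantissa(x, bits):
    # Crudely emulate a narrower mantissa: round to `bits` significant
    # bits of the largest power of two present in x (illustrative only).
    scale = 2.0 ** (np.floor(np.log2(np.max(np.abs(x)))) - bits + 1)
    return np.round(x / scale) * scale

def select_format(activations, candidate_bits=(2, 3, 4, 6, 8),
                  max_rel_err=0.01):
    # Keep the narrowest mantissa width whose error stays under target.
    for bits in candidate_bits:
        q = quantize_mantissa(activations, bits)
        rel_err = (np.linalg.norm(activations - q)
                   / np.linalg.norm(activations))
        if rel_err <= max_rel_err:
            return bits, rel_err
    return candidate_bits[-1], rel_err

acts = np.random.default_rng(0).normal(size=1024)
bits, err = select_format(acts)
print(f"selected {bits}-bit mantissas, relative error {err:.4f}")
```
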
  • Patent number: 12165038
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
    Type: Grant
    Filed: February 14, 2019
    Date of Patent: December 10, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
  • Patent number: 12067495
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: January 3, 2023
    Date of Patent: August 20, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Publication number: 20240273287
    Abstract: The performance of a text parser implemented with a state machine is improved by reducing a critical dependence path. In one aspect, all possible current states for a given text input are read from a state table circuit, and the correct next state and output are then selected therefrom by an output multiplexer based on the current state, removing dependence on the current state from the table read, and allowing the read(s) to be pipelined. Further, multiple input units are configured to operate on multiple text characters in parallel, with each input unit propagating outputs for its state table circuit to the next downstream input unit. Each downstream input unit is configured to use the propagated states to provide the proper outputs to the appropriate multiplexer inputs. The number of possible output states may be dynamically reduced, thereby reducing the size of the output multiplexer needed to select the next state.
    Type: Application
    Filed: April 8, 2024
    Publication date: August 15, 2024
    Inventors: Daniel LO, Blake D. PELTON
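
The core trick in the abstract above, reading next states for every possible current state so the table read no longer depends on the running state, can be sketched in Python on a toy three-state parser. The table, the character classes, and the two-phase split are illustrative assumptions, not the claimed circuit:

```python
# Toy three-state parser over two character classes.  TABLE[s][c] is the
# next state from state s on character class c.
TABLE = {
    0: {"char": 1, "comma": 0},
    1: {"char": 1, "comma": 2},
    2: {"char": 1, "comma": 0},
}

def parse(text, start_state=0):
    # Phase 1 (pipelinable, no dependence on the running state): for each
    # input, fetch the candidate next state for EVERY possible current state.
    candidates = []
    for ch in text:
        cls = "comma" if ch == "," else "char"
        candidates.append({s: TABLE[s][cls] for s in TABLE})
    # Phase 2 (the short dependence chain): a "multiplexer" pass that
    # selects the real next state from each candidate set.
    state = start_state
    for cand in candidates:
        state = cand[state]
    return state

print(parse("ab,cd"))   # 1
```
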
  • Patent number: 12045724
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats having outlier values are disclosed, and in particular for storing activation values from a neural network in a compressed format for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by a compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. Outlier values, comprising additional bits of mantissa and/or exponent, are stored in ancillary storage for a subset of the activation values. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: July 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao, Ritchie Zhao
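
A rough sketch of the outlier mechanism described above: quantize most activations to a narrow shared-exponent format, but park the few largest values at full precision in ancillary storage. The function names and the 5% outlier fraction are assumptions for illustration:

```python
import numpy as np

def compress_with_outliers(values, mantissa_bits=4, outlier_frac=0.05):
    # Keep the largest-magnitude values at full precision ("ancillary").
    n_outliers = max(1, int(len(values) * outlier_frac))
    outlier_idx = np.argsort(np.abs(values))[-n_outliers:]
    ancillary = {int(i): float(values[i]) for i in outlier_idx}
    body = values.astype(np.float64).copy()
    body[outlier_idx] = 0.0
    # Narrow shared-exponent quantization of everything else.
    max_mag = float(np.max(np.abs(body))) or 1.0
    scale = 2.0 ** (np.ceil(np.log2(max_mag)) - mantissa_bits + 1)
    lo, hi = -(2 ** (mantissa_bits - 1)), 2 ** (mantissa_bits - 1) - 1
    mantissas = np.clip(np.round(body / scale), lo, hi).astype(np.int8)
    return scale, mantissas, ancillary

def decompress(scale, mantissas, ancillary):
    out = mantissas.astype(np.float64) * scale
    for i, v in ancillary.items():      # restore outliers exactly
        out[i] = v
    return out

acts = np.array([0.02, -0.31, 0.08, 9.7, -0.12])   # 9.7 is an outlier
print(decompress(*compress_with_outliers(acts)))
```
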
  • Patent number: 11989508
    Abstract: The performance of a text parser implemented with a state machine is improved by reducing a critical dependence path. In one aspect, all possible current states for a given text input are read from a state table circuit, and the correct next state and output are then selected therefrom by an output multiplexer based on the current state, removing dependence on the current state from the table read, and allowing the read(s) to be pipelined. Further, multiple input units are configured to operate on multiple text characters in parallel, with each input unit propagating outputs for its state table circuit to the next downstream input unit. Each downstream input unit is configured to use the propagated states to provide the proper outputs to the appropriate multiplexer inputs. The number of possible output states may be dynamically reduced, thereby reducing the size of the output multiplexer needed to select the next state.
    Type: Grant
    Filed: February 17, 2021
    Date of Patent: May 21, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Daniel Lo, Blake D. Pelton
  • Publication number: 20240152758
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: January 17, 2024
    Publication date: May 9, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Patent number: 11863182
    Abstract: A table-based state machine is improved by reducing the critical dependence path. In one aspect, all current states for a given input are read from a state table circuit, and the next state and output are then selected therefrom by an output multiplexer based on the current state, removing dependence on the current state from the table read, and allowing the read(s) to be pipelined. In a further aspect, multiple input units are configured to operate on multiple inputs in parallel, with each input unit propagating the outputs of its state table circuit for its current input to the next downstream input unit. Each downstream input unit is configured to use the propagated outputs to provide the state table circuit reads to the proper output multiplexer input. The number of possible output states for a given input may be dynamically reduced, reducing the size of the output multiplexer selecting the next state.
    Type: Grant
    Filed: June 3, 2022
    Date of Patent: January 2, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Daniel Lo, Blake D. Pelton
  • Patent number: 11853897
    Abstract: Bounding box quantization can reduce the quantity of bits utilized to express numerical values prior to the multiplication of matrices comprised of such numerical values, thereby reducing both memory consumption and processor utilization. Stochastic rounding can provide sufficient precision to enable the storage of weight values in reduced-precision formats without having to separately store weight values in a full-precision format. Alternatively, other rounding mechanisms, such as round to nearest, can be utilized to exchange weight values in reduced-precision formats, while also storing weight values in full-precision formats for subsequent updating. To facilitate conversion, reduced-precision formats such as brain floating-point format can be utilized.
    Type: Grant
    Filed: December 9, 2022
    Date of Patent: December 26, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Taesik Na, Daniel Lo, Haishan Zhu, Eric Sen Chung
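
Of the two techniques named above, stochastic rounding is the easier to illustrate: round up with probability equal to the fractional part, so the rounding error is unbiased in expectation, which is what lets reduced-precision weights still absorb tiny gradient updates. A minimal numpy sketch; the bounding-box and brain-floating-point details are omitted:

```python
import numpy as np

def stochastic_round(x, rng=None):
    # Round down or up with probability equal to the fractional part,
    # so E[stochastic_round(x)] == x.
    rng = rng or np.random.default_rng()
    floor = np.floor(x)
    return floor + (rng.random(x.shape) < (x - floor))

# stochastic_round(0.1) is 1.0 with probability 0.1, else 0.0.
vals = np.full(100_000, 0.1)
print(stochastic_round(vals).mean())   # close to 0.1
```
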
  • Publication number: 20230403028
    Abstract: Methods and systems are provided for decoding variable-length codes in a parallel process. A stream of variable-length code words is divided into fixed length words. A plurality of parallel sets of decoder circuits each receive, in parallel, a current fixed length word and a prior fixed length word. Each decoder circuit has a respective fixed leftover bit-count. Each decoder circuit generates a respective output that may include a decoded symbol and a new leftover bit-count. Each respective output is determined based on the respective current fixed length word, the respective prior fixed length word, and the respective fixed leftover bit-count. A set of selected decoder circuit outputs is generated for each set of the parallel sets of decoder circuits based on a set of first leftover bit-counts. One output from each set of selected decoder circuit outputs is selected as a final output based on a second prior leftover bit-count.
    Type: Application
    Filed: July 27, 2021
    Publication date: December 14, 2023
    Inventors: Daniel LO, Blake D. PELTON
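
The parallel decoding scheme above can be sketched with a toy prefix code ("0" → A, "10" → B, "11" → C): every fixed-length word is decoded under every possible leftover bit-count, and only the final selection is serial. The code, the word length, and the simplifying assumption that a leftover bit must be "1" are illustrative, not from the application:

```python
# Toy prefix code: "0" -> A, "10" -> B, "11" -> C (max length 2 bits).
CODE = {"0": "A", "10": "B", "11": "C"}

def decode_word(word, leftover):
    # With this code, a bit pending from the previous word can only be
    # a '1' ("0" is always a complete codeword on its own).
    buf = "1" * leftover + word
    symbols, i = [], 0
    while i < len(buf):
        if buf[i] == "0":
            symbols.append(CODE["0"]); i += 1
        elif i + 1 < len(buf):
            symbols.append(CODE[buf[i:i + 2]]); i += 2
        else:
            return symbols, 1          # a lone '1' spills into the next word
    return symbols, 0

def parallel_decode(stream, word_len=8):
    words = [stream[i:i + word_len] for i in range(0, len(stream), word_len)]
    # Parallel phase: decode every word under every possible leftover count.
    table = [{lo: decode_word(w, lo) for lo in (0, 1)} for w in words]
    # Cheap serial phase: thread the true leftover count through the table.
    out, leftover = [], 0
    for entry in table:
        symbols, leftover = entry[leftover]
        out += symbols
    return "".join(out)

print(parallel_decode("0101101100"))   # ABCACAA
```
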
  • Patent number: 11741362
    Abstract: A system for training a neural network receives training data and performs lower precision format training calculations using lower precision format data at one or more training phases. One or more results from the lower precision format training calculations are converted to higher precision format data, and higher precision format training calculations are performed using the higher precision format data at one or more additional training phases. The neural network is modified using the results from the one or more additional training phases. The mixed precision format training calculations train the neural network more efficiently, while maintaining overall accuracy.
    Type: Grant
    Filed: May 8, 2018
    Date of Patent: August 29, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Eric Sen Chung, Bita Darvish Rouhani
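
A toy rendering of the phase split described above: early epochs of gradient descent run in float16, the result is promoted, and the final epochs run in float32. The one-parameter model, epoch counts, and learning rate are arbitrary illustration choices, not the system's actual configuration:

```python
import numpy as np

def train(x, y, epochs_low=80, epochs_high=20, lr=0.1):
    w = np.float16(0.0)
    for _ in range(epochs_low):        # lower-precision training phase
        grad = np.float16(2.0) * np.mean((w * x - y) * x, dtype=np.float16)
        w = np.float16(w - np.float16(lr) * grad)
    w = np.float32(w)                  # convert result to higher precision
    x32, y32 = x.astype(np.float32), y.astype(np.float32)
    for _ in range(epochs_high):       # higher-precision training phase
        w = np.float32(w - lr * 2.0 * np.mean((w * x32 - y32) * x32))
    return w

rng = np.random.default_rng(1)
x = rng.normal(size=256).astype(np.float16)
y = (3.0 * x).astype(np.float16)       # true weight is 3.0
print(train(x, y))                     # converges near 3.0
```
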
  • Publication number: 20230267319
    Abstract: Technology related to training a neural network accelerator using mixed precision data formats is disclosed. In one example of the disclosed technology, a neural network accelerator is configured to accelerate a given layer of a multi-layer neural network. An input tensor for the given layer can be converted from a normal-precision floating-point format to a quantized-precision floating-point format, such as a block floating-point format. A tensor operation can be performed using the converted input tensor. A result of the tensor operation can be converted from the block floating-point format back to the normal-precision floating-point format. The converted result can be used to generate an output tensor of the layer of the neural network, where the output tensor is in normal-precision floating-point format.
    Type: Application
    Filed: April 28, 2023
    Publication date: August 24, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Bita Darvish Rouhani, Taesik Na, Eric S. Chung, Daniel Lo, Douglas C. Burger
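
The per-layer convert/compute/convert-back pattern in the abstract above might look like the following numpy sketch, where "block floating-point" is simplified to one shared exponent per tensor (real BFP often shares per row or tile). Function names and the 5-bit mantissa are assumptions:

```python
import numpy as np

def quantize_bfp(t, mantissa_bits=5):
    # Normal precision -> quantized: one shared exponent for the whole
    # tensor, integer-valued mantissas.
    exp = np.floor(np.log2(np.max(np.abs(t)) + 1e-30))
    scale = 2.0 ** (exp - mantissa_bits + 1)
    return np.round(t / scale), scale

def accelerated_layer(x, w):
    # Convert the input tensor, run the tensor op in the quantized
    # domain, then convert the result back to normal precision.
    qx, sx = quantize_bfp(x)
    qw, sw = quantize_bfp(w)
    acc = qx @ qw                      # integer-like matmul
    return acc * (sx * sw)             # back to normal-precision floats

x = np.random.default_rng(2).normal(size=(4, 8))
w = np.random.default_rng(3).normal(size=(8, 3))
print(np.max(np.abs(accelerated_layer(x, w) - x @ w)))   # small gap
```
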
  • Publication number: 20230245699
    Abstract: The present disclosure is directed to a sense amplifier architecture for a memory device having a plurality of memory cells. Groups of non-volatile memory cells store respective codewords formed by the stored logic states, logic high or logic low, of the memory cells of the group. The sense amplifier architecture has a plurality of sense amplifier reading branches, each sense amplifier reading branch coupled to a respective memory cell and configured to provide an output signal, which is indicative of the cell current flowing through that memory cell; a comparison stage, to perform a comparison between the cell currents of memory cells of a group; and a logic stage, to determine, based on comparison results provided by the comparison stage, a read codeword corresponding to the group of memory cells.
    Type: Application
    Filed: December 29, 2022
    Publication date: August 3, 2023
    Applicant: STMICROELECTRONICS S.r.l.
    Inventors: Fabio Enrico Carlo DISEGNI, Marcella CARISSIMI, Alessandro TOMASONI, Daniele LO IACONO
  • Publication number: 20230223079
    Abstract: The present disclosure is directed to a method for storing information in a coded manner in non-volatile memory cells. The method includes providing a group of non-volatile memory cells of a non-volatile memory. The memory cells are of a type in which a stored logic state, which can be logic high or logic low, can be changed through application of a current to the cell, and the state of the memory cell is read by reading a current provided by the cell. The group of non-volatile memory cells includes a determined number of non-volatile memory cells, which is greater than two. The group of non-volatile memory cells stores a codeword formed by the values of the stored states of the cells of the group, taken according to a given order. Given a set of codewords obtainable from the stored values in the determined number of non-volatile memory cells in a group, the method includes storing the information in at least two subsets of said set of codewords, each subset comprising at least one codeword.
    Type: Application
    Filed: December 29, 2022
    Publication date: July 13, 2023
    Applicant: STMICROELECTRONICS S.r.l.
    Inventors: Alessandro TOMASONI, Fabio Enrico Carlo DISEGNI, Marcella CARISSIMI, Daniele LO IACONO
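
Purely as a software analogy to the two ST entries above (the actual subject matter is analog circuitry), one can model a comparison stage that orders cell currents pairwise and a logic stage that picks the unique valid codeword consistent with those orderings. The three-cell code set and the current model are invented for illustration:

```python
# Assumed 3-cell code set; a '1' cell is taken to draw more current
# than a '0' cell.
VALID_CODEWORDS = {(0, 1, 1), (1, 0, 1), (1, 1, 0)}

def comparison_stage(currents):
    # Pairwise current comparisons (the "comparison stage").
    return [(i, j, currents[i] > currents[j])
            for i in range(len(currents))
            for j in range(i + 1, len(currents))]

def logic_stage(comparisons):
    # Pick the valid codeword consistent with every comparison; cells
    # storing equal bits may differ slightly in current, so equal bits
    # are consistent with either comparison outcome.
    for cw in VALID_CODEWORDS:
        if all(cw[i] == cw[j] or (cw[i] > cw[j]) == greater
               for i, j, greater in comparisons):
            return cw
    raise ValueError("no valid codeword matches the comparison results")

print(logic_stage(comparison_stage([0.2, 1.0, 1.1])))   # (0, 1, 1)
```
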
  • Publication number: 20230196085
    Abstract: Methods and apparatus are disclosed for providing emulation of quantized precision operations in a neural network. In some examples, the quantized precision operations are performed in a block floating-point format where values of a tensor share a common exponent. Techniques for selecting higher precision or lower precision can be used based on a variety of input metrics. When converting to a quantized tensor, a residual tensor is produced. In one embodiment, an error value associated with converting from a normal-precision floating-point number to the quantized tensor is used to determine whether to use the residual tensor in a dot product calculation. Using the residual tensor increases the precision of an output from a node. Selection of whether to use the residual tensor can depend on various input metrics including the error value, the layer number, the exponent value, the layer type, etc.
    Type: Application
    Filed: February 16, 2023
    Publication date: June 22, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Eric S. Chung, Daniel Lo, Jialiang Zhang, Ritchie Zhao
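
One plausible reading of the error-driven residual selection above, sketched in numpy: keep the quantization residual alongside the quantized tensor, and spend an extra dot-product pass on it only when the measured error is too large. The names, mantissa width, and threshold are assumptions:

```python
import numpy as np

def quantize_with_residual(t, mantissa_bits=4):
    # Quantize and keep the residual (what the narrow format lost),
    # plus a scalar error metric used for the decision below.
    scale = 2.0 ** (np.floor(np.log2(np.max(np.abs(t)))) - mantissa_bits + 1)
    q = np.round(t / scale) * scale
    residual = t - q
    err = np.linalg.norm(residual) / np.linalg.norm(t)
    return q, residual, err

def dot(x, w, err_threshold=0.02):
    # Use the residual term only when quantization lost too much.
    qx, rx, err = quantize_with_residual(x)
    y = qx @ w
    if err > err_threshold:
        y = y + rx @ w                 # residual restores lost precision
    return y

rng = np.random.default_rng(5)
x, w = rng.normal(size=16), rng.normal(size=(16, 4))
print(np.max(np.abs(dot(x, w) - x @ w)))   # tiny when the residual fires
```
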
  • Patent number: 11676003
    Abstract: Technology related to training a neural network accelerator using mixed precision data formats is disclosed. In one example of the disclosed technology, a neural network accelerator is configured to accelerate a given layer of a multi-layer neural network. An input tensor for the given layer can be converted from a normal-precision floating-point format to a quantized-precision floating-point format, such as a block floating-point format. A tensor operation can be performed using the converted input tensor. A result of the tensor operation can be converted from the block floating-point format back to the normal-precision floating-point format. The converted result can be used to generate an output tensor of the layer of the neural network, where the output tensor is in normal-precision floating-point format.
    Type: Grant
    Filed: December 18, 2018
    Date of Patent: June 13, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Bita Darvish Rouhani, Taesik Na, Eric S. Chung, Daniel Lo, Douglas C. Burger
  • Publication number: 20230180214
    Abstract: The Internet generally provides anonymity to the online activities of visitors to web sites and other online resources. This prevents the operators of web sites and others from identifying visitors who do not wish to be identified. Accordingly, embodiments generate mappings between entities (e.g., IP addresses, domains, cookies, or devices) and accounts (e.g., companies) to de-anonymize online activities. In an embodiment, summary mappings are generated based on activity data. Each summary mapping may comprise an entity, potential account identifier, and an activity vector that measures observations of an association between the entity and potential account identifier from an activity source for multiple summary periods. A model may be applied to the summary mappings to compute signal strengths for a plurality of candidate mappings. A winning mapping may then be selected for each entity in the candidate mappings, and used to associate the entity with an account in one or more downstream functions.
    Type: Application
    Filed: December 2, 2022
    Publication date: June 8, 2023
    Inventors: Chihi Tai, Daniel Lo, Tai Vo, Yulia Tyutina, Kenneth Golonka, Viral Bajaria
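
A highly simplified sketch of the mapping pipeline described above: build per-(entity, account) activity vectors over summary periods, score them with a stand-in model, and keep the winning mapping per entity. The scoring function, the data, and the account names are invented for illustration; the application's actual model is not specified here:

```python
from collections import defaultdict

# Invented observations: (entity, candidate account, summary period).
observations = [
    ("203.0.113.7", "acme", 1), ("203.0.113.7", "acme", 2),
    ("203.0.113.7", "globex", 2), ("198.51.100.9", "globex", 1),
]

# Summary mappings: one activity vector of per-period counts per pair.
N_PERIODS = 3
vectors = defaultdict(lambda: [0] * N_PERIODS)
for entity, account, period in observations:
    vectors[(entity, account)][period] += 1

def signal_strength(vec):
    # Stand-in scoring model: weight recent periods more heavily.
    return sum(count * (i + 1) for i, count in enumerate(vec))

# Select the winning mapping per entity: highest signal strength.
best = {}
for (entity, account), vec in vectors.items():
    score = signal_strength(vec)
    if entity not in best or score > best[entity][1]:
        best[entity] = (account, score)

for entity, (account, score) in best.items():
    print(f"{entity} -> {account} (signal {score})")
```
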
  • Patent number: 11645493
    Abstract: Methods and apparatus are disclosed supporting a design flow for developing quantized neural networks. In one example of the disclosed technology, a method includes quantizing a normal-precision floating-point neural network model into a quantized format. For example, the quantized format can be a block floating-point format, where two or more elements of tensors in the neural network share a common exponent. A set of test inputs is applied to the normal-precision floating-point model and to the corresponding quantized model, and the respective output tensors are compared. Based on this comparison, hyperparameters or other attributes of the neural networks can be adjusted. Further, quantization parameters determining the bit widths of data and the selection of shared exponents for the block floating-point format can be chosen. An adjusted, quantized neural network is retrained and programmed into a hardware accelerator.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: May 9, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Douglas C. Burger, Eric S. Chung, Bita Darvish Rouhani, Daniel Lo, Ritchie Zhao
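
The compare-and-adjust design flow above can be sketched as a loop that runs test inputs through the normal-precision and quantized models and widens the mantissas until the outputs agree. The toy layer, candidate widths, and tolerance are assumptions for illustration, not the patented flow:

```python
import numpy as np

def bfp_quantize(t, mantissa_bits):
    # One shared exponent per tensor; a simplification of real BFP.
    scale = 2.0 ** (np.floor(np.log2(np.max(np.abs(t)))) - mantissa_bits + 1)
    return np.round(t / scale) * scale

def layer(x, w, mantissa_bits=None):
    # Toy ReLU layer, optionally run in the quantized format.
    if mantissa_bits is not None:
        x, w = bfp_quantize(x, mantissa_bits), bfp_quantize(w, mantissa_bits)
    return np.maximum(x @ w, 0.0)

rng = np.random.default_rng(4)
x_test = rng.normal(size=(32, 16))     # the "set of test inputs"
w = rng.normal(size=(16, 8))
reference = layer(x_test, w)           # normal-precision model output

for bits in (2, 3, 4, 6, 8):           # widen until the outputs agree
    candidate = layer(x_test, w, mantissa_bits=bits)
    gap = np.linalg.norm(reference - candidate) / np.linalg.norm(reference)
    if gap < 0.01:
        print(f"selected {bits}-bit mantissas (output gap {gap:.4f})")
        break
else:
    print("no candidate width met the tolerance")
```
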