Patents by Inventor Daniel Lo

Daniel Lo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12277502
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: January 17, 2024
    Date of Patent: April 15, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
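
As an illustration of the shared-exponent ("block floating-point") storage this family of filings describes, here is a minimal numpy sketch of compressing a block of activation values to narrow mantissas and reconstructing them lossily. The function names, the 4-bit mantissa width, and the per-block exponent choice are assumptions for illustration, not the patented method.

```python
import numpy as np

def to_block_fp(values, mantissa_bits=4):
    # One shared exponent for the whole block, sized to the largest magnitude.
    max_mag = np.max(np.abs(values))
    shared_exp = int(np.ceil(np.log2(max_mag))) if max_mag > 0 else 0
    # Scale so each value becomes a small signed integer mantissa.
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    lo, hi = -(2 ** (mantissa_bits - 1)), 2 ** (mantissa_bits - 1) - 1
    mantissas = np.clip(np.round(values / scale), lo, hi).astype(np.int8)
    return shared_exp, mantissas

def from_block_fp(shared_exp, mantissas, mantissa_bits=4):
    # Lossy reconstruction, e.g. for retrieval during back propagation.
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return mantissas.astype(np.float64) * scale

activations = np.array([0.91, -0.13, 0.42, 0.05])
exp, mant = to_block_fp(activations)
print(from_block_fp(exp, mant))   # [ 0.875 -0.125  0.375  0.   ]
```
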
  • Patent number: 12260910
    Abstract: The present disclosure is directed to a sense amplifier architecture for a memory device having a plurality of memory cells. Groups of non-volatile memory cells store respective codewords formed by the stored logic states, logic high or logic low, of the memory cells of the group. The sense amplifier architecture has a plurality of sense amplifier reading branches, each sense amplifier reading branch coupled to a respective memory cell and configured to provide an output signal, which is indicative of the cell current flowing through that memory cell; a comparison stage, to perform a comparison between the cell currents of memory cells of a group; and a logic stage, to determine, based on comparison results provided by the comparison stage, a read codeword corresponding to the group of memory cells.
    Type: Grant
    Filed: December 29, 2022
    Date of Patent: March 25, 2025
    Assignee: STMICROELECTRONICS S.r.l.
    Inventors: Fabio Enrico Carlo Disegni, Marcella Carissimi, Alessandro Tomasoni, Daniele Lo Iacono
  • Publication number: 20250061320
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
    Type: Application
    Filed: November 4, 2024
    Publication date: February 20, 2025
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
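
The selection of a second format "based on a performance metric" could, under one plausible reading, amount to sweeping candidate mantissa widths and keeping the narrowest one whose quantization error stays under a target. A hedged numpy sketch; the names, candidate widths, and error threshold are all illustrative assumptions:

```python
import numpy as np

def quantize_mantissa(x, bits):
    # Crudely emulate a narrower mantissa: round to `bits` significant
    # bits of the largest power of two present in x (illustrative only).
    scale = 2.0 ** (np.floor(np.log2(np.max(np.abs(x)))) - bits + 1)
    return np.round(x / scale) * scale

def select_format(activations, candidate_bits=(2, 3, 4, 6, 8),
                  max_rel_err=0.01):
    # Keep the narrowest mantissa width whose error stays under target.
    for bits in candidate_bits:
        q = quantize_mantissa(activations, bits)
        rel_err = (np.linalg.norm(activations - q)
                   / np.linalg.norm(activations))
        if rel_err <= max_rel_err:
            return bits, rel_err
    return candidate_bits[-1], rel_err

acts = np.random.default_rng(0).normal(size=1024)
bits, err = select_format(acts)
print(f"selected {bits}-bit mantissas, relative error {err:.4f}")
```
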
  • Patent number: 12165038
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and, in particular, for adjusting floating-point formats used to store activation values during training. In certain examples of the disclosed technology, a computing system includes processors, memory, and a floating-point compressor in communication with the memory. The computing system is configured to produce a neural network comprising activation values expressed in a first floating-point format, select a second floating-point format for the neural network based on a performance metric, convert at least one of the activation values to the second floating-point format, and store the compressed activation values in the memory. Aspects of the second floating-point format that can be adjusted include the number of bits used to express mantissas, exponent format, use of non-uniform mantissas, and/or use of outlier values to express some of the mantissas.
    Type: Grant
    Filed: February 14, 2019
    Date of Patent: December 10, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Bita Darvish Rouhani, Eric S. Chung, Yiren Zhao, Amar Phanishayee, Ritchie Zhao
  • Patent number: 12067495
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: January 3, 2023
    Date of Patent: August 20, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Publication number: 20240273287
    Abstract: The performance of a text parser implemented with a state machine is improved by reducing a critical dependence path. In one aspect, all possible current states for a given text input are read from a state table circuit, and the correct next state and output are then selected therefrom by an output multiplexer based on the current state, removing dependence on the current state from the table read, and allowing the read(s) to be pipelined. Further, multiple input units are configured to operate on multiple text characters in parallel, with each input unit propagating outputs for its state table circuit to the next downstream input unit. Each downstream input unit is configured to use the propagated states to provide the proper outputs to the appropriate multiplexer inputs. The number of possible output states may be dynamically reduced, thereby reducing the size of the output multiplexer needed to select the next state.
    Type: Application
    Filed: April 8, 2024
    Publication date: August 15, 2024
    Inventors: Daniel LO, Blake D. PELTON
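
The core trick in the abstract above, reading next states for every possible current state so the table read no longer depends on the running state, can be sketched in Python on a toy three-state parser. The table, the character classes, and the two-phase split are illustrative assumptions, not the claimed circuit:

```python
# Toy three-state parser over two character classes.  TABLE[s][c] is the
# next state from state s on character class c.
TABLE = {
    0: {"char": 1, "comma": 0},
    1: {"char": 1, "comma": 2},
    2: {"char": 1, "comma": 0},
}

def parse(text, start_state=0):
    # Phase 1 (pipelinable, no dependence on the running state): for each
    # input, fetch the candidate next state for EVERY possible current state.
    candidates = []
    for ch in text:
        cls = "comma" if ch == "," else "char"
        candidates.append({s: TABLE[s][cls] for s in TABLE})
    # Phase 2 (the short dependence chain): a "multiplexer" pass that
    # selects the real next state from each candidate set.
    state = start_state
    for cand in candidates:
        state = cand[state]
    return state

print(parse("ab,cd"))   # 1
```
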
  • Patent number: 12045724
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats having outlier values are disclosed, and in particular for storing activation values from a neural network in a compressed format for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by a compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. Outlier values, comprising additional bits of mantissa and/or exponent, are stored in ancillary storage for a subset of the activation values. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: July 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao, Ritchie Zhao
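
A rough sketch of the outlier mechanism described above: quantize most activations to a narrow shared-exponent format, but park the few largest values at full precision in ancillary storage. The function names and the 5% outlier fraction are assumptions for illustration:

```python
import numpy as np

def compress_with_outliers(values, mantissa_bits=4, outlier_frac=0.05):
    # Keep the largest-magnitude values at full precision ("ancillary").
    n_outliers = max(1, int(len(values) * outlier_frac))
    outlier_idx = np.argsort(np.abs(values))[-n_outliers:]
    ancillary = {int(i): float(values[i]) for i in outlier_idx}
    body = values.astype(np.float64).copy()
    body[outlier_idx] = 0.0
    # Narrow shared-exponent quantization of everything else.
    max_mag = float(np.max(np.abs(body))) or 1.0
    scale = 2.0 ** (np.ceil(np.log2(max_mag)) - mantissa_bits + 1)
    lo, hi = -(2 ** (mantissa_bits - 1)), 2 ** (mantissa_bits - 1) - 1
    mantissas = np.clip(np.round(body / scale), lo, hi).astype(np.int8)
    return scale, mantissas, ancillary

def decompress(scale, mantissas, ancillary):
    out = mantissas.astype(np.float64) * scale
    for i, v in ancillary.items():      # restore outliers exactly
        out[i] = v
    return out

acts = np.array([0.02, -0.31, 0.08, 9.7, -0.12])   # 9.7 is an outlier
print(decompress(*compress_with_outliers(acts)))
```
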
  • Patent number: 11989508
    Abstract: The performance of a text parser implemented with a state machine is improved by reducing a critical dependence path. In one aspect, all possible current states for a given text input are read from a state table circuit, and the correct next state and output are then selected therefrom by an output multiplexer based on the current state, removing dependence on the current state from the table read, and allowing the read(s) to be pipelined. Further, multiple input units are configured to operate on multiple text characters in parallel, with each input unit propagating outputs for its state table circuit to the next downstream input unit. Each downstream input unit is configured to use the propagated states to provide the proper outputs to the appropriate multiplexer inputs. The number of possible output states may be dynamically reduced, thereby reducing the size of the output multiplexer needed to select the next state.
    Type: Grant
    Filed: February 17, 2021
    Date of Patent: May 21, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Daniel Lo, Blake D. Pelton
  • Publication number: 20240152758
    Abstract: Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
    Type: Application
    Filed: January 17, 2024
    Publication date: May 9, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Amar Phanishayee, Eric S. Chung, Yiren Zhao
  • Patent number: 11863182
    Abstract: A table-based state machine is improved by reducing the critical dependence path. In one aspect, all current states for a given input are read from a state table circuit, and the next state and output are then selected therefrom by an output multiplexer based on the current state, removing dependence on the current state from the table read, and allowing the read(s) to be pipelined. In a further aspect, multiple input units are configured to operate on multiple inputs in parallel, with each input unit propagating the outputs of its state table circuit for its current input to the next downstream input unit. Each downstream input unit is configured to use the propagated outputs to provide the state table circuit reads to the proper output multiplexer input. The number of possible output states for a given input may be dynamically reduced, reducing the size of the output multiplexer selecting the next state.
    Type: Grant
    Filed: June 3, 2022
    Date of Patent: January 2, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Daniel Lo, Blake D. Pelton
  • Patent number: 11853897
    Abstract: Bounding box quantization can reduce the quantity of bits utilized to express numerical values prior to the multiplication of matrices comprised of such numerical values, thereby reducing both memory consumption and processor utilization. Stochastic rounding can provide sufficient precision to enable the storage of weight values in reduced-precision formats without having to separately store weight values in a full-precision format. Alternatively, other rounding mechanisms, such as round to nearest, can be utilized to exchange weight values in reduced-precision formats, while also storing weight values in full-precision formats for subsequent updating. To facilitate conversion, reduced-precision formats such as brain floating-point format can be utilized.
    Type: Grant
    Filed: December 9, 2022
    Date of Patent: December 26, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Taesik Na, Daniel Lo, Haishan Zhu, Eric Sen Chung
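
Of the two techniques named above, stochastic rounding is the easier to illustrate: round up with probability equal to the fractional part, so the rounding error is unbiased in expectation, which is what lets reduced-precision weights still absorb tiny gradient updates. A minimal numpy sketch; the bounding-box and brain-floating-point details are omitted:

```python
import numpy as np

def stochastic_round(x, rng=None):
    # Round down or up with probability equal to the fractional part,
    # so E[stochastic_round(x)] == x.
    rng = rng or np.random.default_rng()
    floor = np.floor(x)
    return floor + (rng.random(x.shape) < (x - floor))

# stochastic_round(0.1) is 1.0 with probability 0.1, else 0.0.
vals = np.full(100_000, 0.1)
print(stochastic_round(vals).mean())   # close to 0.1
```
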
  • Publication number: 20230403028
    Abstract: Methods and systems are provided for decoding variable-length codes in a parallel process. A stream of variable-length code words is divided into fixed length words. A plurality of parallel sets of decoder circuits each receive, in parallel, a current fixed length word and a prior fixed length word. Each decoder circuit has a respective fixed leftover bit-count. Each decoder circuit generates a respective output that may include a decoded symbol and a new leftover bit-count. Each respective output is determined based on the respective current fixed length word, the respective prior fixed length word, and the respective fixed leftover bit-count. A set of selected decoder circuit outputs is generated for each set of the parallel sets of decoder circuits based on a set of first leftover bit-counts. One output from each set of selected decoder circuit outputs is selected as a final output based on a second prior leftover bit-count.
    Type: Application
    Filed: July 27, 2021
    Publication date: December 14, 2023
    Inventors: Daniel LO, Blake D. PELTON
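
The parallel decoding scheme above can be sketched with a toy prefix code ("0" → A, "10" → B, "11" → C): every fixed-length word is decoded under every possible leftover bit-count, and only the final selection is serial. The code, the word length, and the simplifying assumption that a leftover bit must be "1" are illustrative, not from the application:

```python
# Toy prefix code: "0" -> A, "10" -> B, "11" -> C (max length 2 bits).
CODE = {"0": "A", "10": "B", "11": "C"}

def decode_word(word, leftover):
    # With this code, a bit pending from the previous word can only be
    # a '1' ("0" is always a complete codeword on its own).
    buf = "1" * leftover + word
    symbols, i = [], 0
    while i < len(buf):
        if buf[i] == "0":
            symbols.append(CODE["0"]); i += 1
        elif i + 1 < len(buf):
            symbols.append(CODE[buf[i:i + 2]]); i += 2
        else:
            return symbols, 1          # a lone '1' spills into the next word
    return symbols, 0

def parallel_decode(stream, word_len=8):
    words = [stream[i:i + word_len] for i in range(0, len(stream), word_len)]
    # Parallel phase: decode every word under every possible leftover count.
    table = [{lo: decode_word(w, lo) for lo in (0, 1)} for w in words]
    # Cheap serial phase: thread the true leftover count through the table.
    out, leftover = [], 0
    for entry in table:
        symbols, leftover = entry[leftover]
        out += symbols
    return "".join(out)

print(parallel_decode("0101101100"))   # ABCACAA
```
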
  • Patent number: 11741362
    Abstract: A system for training a neural network receives training data and performs lower precision format training calculations using lower precision format data at one or more training phases. One or more results from the lower precision format training calculations are converted to higher precision format data, and higher precision format training calculations are performed using the higher precision format data at one or more additional training phases. The neural network is modified using the results from the one or more additional training phases. The mixed precision format training calculations train the neural network more efficiently, while maintaining overall accuracy.
    Type: Grant
    Filed: May 8, 2018
    Date of Patent: August 29, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daniel Lo, Eric Sen Chung, Bita Darvish Rouhani
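
A toy rendering of the phase split described above: early epochs of gradient descent run in float16, the result is promoted, and the final epochs run in float32. The one-parameter model, epoch counts, and learning rate are arbitrary illustration choices, not the system's actual configuration:

```python
import numpy as np

def train(x, y, epochs_low=80, epochs_high=20, lr=0.1):
    w = np.float16(0.0)
    for _ in range(epochs_low):        # lower-precision training phase
        grad = np.float16(2.0) * np.mean((w * x - y) * x, dtype=np.float16)
        w = np.float16(w - np.float16(lr) * grad)
    w = np.float32(w)                  # convert result to higher precision
    x32, y32 = x.astype(np.float32), y.astype(np.float32)
    for _ in range(epochs_high):       # higher-precision training phase
        w = np.float32(w - lr * 2.0 * np.mean((w * x32 - y32) * x32))
    return w

rng = np.random.default_rng(1)
x = rng.normal(size=256).astype(np.float16)
y = (3.0 * x).astype(np.float16)       # true weight is 3.0
print(train(x, y))                     # converges near 3.0
```
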
  • Publication number: 20230267319
    Abstract: Technology related to training a neural network accelerator using mixed precision data formats is disclosed. In one example of the disclosed technology, a neural network accelerator is configured to accelerate a given layer of a multi-layer neural network. An input tensor for the given layer can be converted from a normal-precision floating-point format to a quantized-precision floating-point format, such as a block floating-point format. A tensor operation can be performed using the converted input tensor. A result of the tensor operation can be converted from the block floating-point format back to the normal-precision floating-point format. The converted result can be used to generate an output tensor of the layer of the neural network, where the output tensor is in normal-precision floating-point format.
    Type: Application
    Filed: April 28, 2023
    Publication date: August 24, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Bita Darvish Rouhani, Taesik Na, Eric S. Chung, Daniel Lo, Douglas C. Burger
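
The per-layer convert/compute/convert-back pattern in the abstract above might look like the following numpy sketch, where "block floating-point" is simplified to one shared exponent per tensor (real BFP often shares per row or tile). Function names and the 5-bit mantissa are assumptions:

```python
import numpy as np

def quantize_bfp(t, mantissa_bits=5):
    # Normal precision -> quantized: one shared exponent for the whole
    # tensor, integer-valued mantissas.
    exp = np.floor(np.log2(np.max(np.abs(t)) + 1e-30))
    scale = 2.0 ** (exp - mantissa_bits + 1)
    return np.round(t / scale), scale

def accelerated_layer(x, w):
    # Convert the input tensor, run the tensor op in the quantized
    # domain, then convert the result back to normal precision.
    qx, sx = quantize_bfp(x)
    qw, sw = quantize_bfp(w)
    acc = qx @ qw                      # integer-like matmul
    return acc * (sx * sw)             # back to normal-precision floats

x = np.random.default_rng(2).normal(size=(4, 8))
w = np.random.default_rng(3).normal(size=(8, 3))
print(np.max(np.abs(accelerated_layer(x, w) - x @ w)))   # small gap
```
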
  • Publication number: 20230245699
    Abstract: The present disclosure is directed to a sense amplifier architecture for a memory device having a plurality of memory cells. Groups of non-volatile memory cells store respective codewords formed by the stored logic states, logic high or logic low, of the memory cells of the group. The sense amplifier architecture has a plurality of sense amplifier reading branches, each sense amplifier reading branch coupled to a respective memory cell and configured to provide an output signal, which is indicative of the cell current flowing through that memory cell; a comparison stage, to perform a comparison between the cell currents of memory cells of a group; and a logic stage, to determine, based on comparison results provided by the comparison stage, a read codeword corresponding to the group of memory cells.
    Type: Application
    Filed: December 29, 2022
    Publication date: August 3, 2023
    Applicant: STMICROELECTRONICS S.r.l.
    Inventors: Fabio Enrico Carlo DISEGNI, Marcella CARISSIMI, Alessandro TOMASONI, Daniele LO IACONO
  • Publication number: 20230223079
    Abstract: The present disclosure is directed to a method for storing information in a coded manner in non-volatile memory cells. The method includes providing a group of non-volatile memory cells of a non-volatile memory. The memory cells are of a type in which a stored logic state, which can be logic high or logic low, can be changed through application of a current to the cell, and the state of the memory cell is read by reading a current provided by the cell. The group of non-volatile memory cells includes a determined number of non-volatile memory cells, which is greater than two. The group of non-volatile memory cells stores a codeword formed by the values of the stored states of the cells of the group, taken according to a given order. Given a set of codewords obtainable from the stored values in the determined number of non-volatile memory cells in a group, the method includes storing the information in at least two subsets of said set of codewords, each subset comprising at least one codeword.
    Type: Application
    Filed: December 29, 2022
    Publication date: July 13, 2023
    Applicant: STMICROELECTRONICS S.r.l.
    Inventors: Alessandro TOMASONI, Fabio Enrico Carlo DISEGNI, Marcella CARISSIMI, Daniele LO IACONO
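
Purely as a software analogy to the two ST entries above (the actual subject matter is analog circuitry), one can model a comparison stage that orders cell currents pairwise and a logic stage that picks the unique valid codeword consistent with those orderings. The three-cell code set and the current model are invented for illustration:

```python
# Assumed 3-cell code set; a '1' cell is taken to draw more current
# than a '0' cell.
VALID_CODEWORDS = {(0, 1, 1), (1, 0, 1), (1, 1, 0)}

def comparison_stage(currents):
    # Pairwise current comparisons (the "comparison stage").
    return [(i, j, currents[i] > currents[j])
            for i in range(len(currents))
            for j in range(i + 1, len(currents))]

def logic_stage(comparisons):
    # Pick the valid codeword consistent with every comparison; cells
    # storing equal bits may differ slightly in current, so equal bits
    # are consistent with either comparison outcome.
    for cw in VALID_CODEWORDS:
        if all(cw[i] == cw[j] or (cw[i] > cw[j]) == greater
               for i, j, greater in comparisons):
            return cw
    raise ValueError("no valid codeword matches the comparison results")

print(logic_stage(comparison_stage([0.2, 1.0, 1.1])))   # (0, 1, 1)
```
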
  • Publication number: 20230196085
    Abstract: Methods and apparatus are disclosed for providing emulation of quantized precision operations in a neural network. In some examples, the quantized precision operations are performed in a block floating-point format where values of a tensor share a common exponent. Techniques for selecting higher precision or lower precision can be used based on a variety of input metrics. When converting to a quantized tensor, a residual tensor is produced. In one embodiment, an error value associated with converting from a normal-precision floating-point number to the quantized tensor is used to determine whether to use the residual tensor in a dot product calculation. Using the residual tensor increases the precision of an output from a node. Selection of whether to use the residual tensor can depend on various input metrics including the error value, the layer number, the exponent value, the layer type, etc.
    Type: Application
    Filed: February 16, 2023
    Publication date: June 22, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Eric S. Chung, Daniel Lo, Jialiang Zhang, Ritchie Zhao
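
One plausible reading of the error-driven residual selection above, sketched in numpy: keep the quantization residual alongside the quantized tensor, and spend an extra dot-product pass on it only when the measured error is too large. The names, mantissa width, and threshold are assumptions:

```python
import numpy as np

def quantize_with_residual(t, mantissa_bits=4):
    # Quantize and keep the residual (what the narrow format lost),
    # plus a scalar error metric used for the decision below.
    scale = 2.0 ** (np.floor(np.log2(np.max(np.abs(t)))) - mantissa_bits + 1)
    q = np.round(t / scale) * scale
    residual = t - q
    err = np.linalg.norm(residual) / np.linalg.norm(t)
    return q, residual, err

def dot(x, w, err_threshold=0.02):
    # Use the residual term only when quantization lost too much.
    qx, rx, err = quantize_with_residual(x)
    y = qx @ w
    if err > err_threshold:
        y = y + rx @ w                 # residual restores lost precision
    return y

rng = np.random.default_rng(5)
x, w = rng.normal(size=16), rng.normal(size=(16, 4))
print(np.max(np.abs(dot(x, w) - x @ w)))   # tiny when the residual fires
```
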
  • Patent number: 11676003
    Abstract: Technology related to training a neural network accelerator using mixed precision data formats is disclosed. In one example of the disclosed technology, a neural network accelerator is configured to accelerate a given layer of a multi-layer neural network. An input tensor for the given layer can be converted from a normal-precision floating-point format to a quantized-precision floating-point format, such as a block floating-point format. A tensor operation can be performed using the converted input tensor. A result of the tensor operation can be converted from the block floating-point format back to the normal-precision floating-point format. The converted result can be used to generate an output tensor of the layer of the neural network, where the output tensor is in normal-precision floating-point format.
    Type: Grant
    Filed: December 18, 2018
    Date of Patent: June 13, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Bita Darvish Rouhani, Taesik Na, Eric S. Chung, Daniel Lo, Douglas C. Burger
  • Publication number: 20230180214
    Abstract: The Internet generally provides anonymity to the online activities of visitors to web sites and other online resources. This prevents the operators of web sites and others from identifying visitors who do not wish to be identified. Accordingly, embodiments generate mappings between entities (e.g., IP addresses, domains, cookies, or devices) and accounts (e.g., companies) to de-anonymize online activities. In an embodiment, summary mappings are generated based on activity data. Each summary mapping may comprise an entity, potential account identifier, and an activity vector that measures observations of an association between the entity and potential account identifier from an activity source for multiple summary periods. A model may be applied to the summary mappings to compute signal strengths for a plurality of candidate mappings. A winning mapping may then be selected for each entity in the candidate mappings, and used to associate the entity with an account in one or more downstream functions.
    Type: Application
    Filed: December 2, 2022
    Publication date: June 8, 2023
    Inventors: Chihi Tai, Daniel Lo, Tai Vo, Yulia Tyutina, Kenneth Golonka, Viral Bajaria
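
A highly simplified sketch of the mapping pipeline described above: build per-(entity, account) activity vectors over summary periods, score them with a stand-in model, and keep the winning mapping per entity. The scoring function, the data, and the account names are invented for illustration; the application's actual model is not specified here:

```python
from collections import defaultdict

# Invented observations: (entity, candidate account, summary period).
observations = [
    ("203.0.113.7", "acme", 1), ("203.0.113.7", "acme", 2),
    ("203.0.113.7", "globex", 2), ("198.51.100.9", "globex", 1),
]

# Summary mappings: one activity vector of per-period counts per pair.
N_PERIODS = 3
vectors = defaultdict(lambda: [0] * N_PERIODS)
for entity, account, period in observations:
    vectors[(entity, account)][period] += 1

def signal_strength(vec):
    # Stand-in scoring model: weight recent periods more heavily.
    return sum(count * (i + 1) for i, count in enumerate(vec))

# Select the winning mapping per entity: highest signal strength.
best = {}
for (entity, account), vec in vectors.items():
    score = signal_strength(vec)
    if entity not in best or score > best[entity][1]:
        best[entity] = (account, score)

for entity, (account, score) in best.items():
    print(f"{entity} -> {account} (signal {score})")
```
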
  • Patent number: 11645493
    Abstract: Methods and apparatus are disclosed supporting a design flow for developing quantized neural networks. In one example of the disclosed technology, a method includes quantizing a normal-precision floating-point neural network model into a quantized format. For example, the quantized format can be a block floating-point format, where two or more elements of tensors in the neural network share a common exponent. A set of test inputs is applied to the normal-precision floating-point model and to the corresponding quantized model, and the respective output tensors are compared. Based on this comparison, hyperparameters or other attributes of the neural networks can be adjusted. Further, quantization parameters determining the bit widths of data and the selection of shared exponents for the block floating-point format can be chosen. An adjusted, quantized neural network is retrained and programmed into a hardware accelerator.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: May 9, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Douglas C. Burger, Eric S. Chung, Bita Darvish Rouhani, Daniel Lo, Ritchie Zhao
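
The compare-and-adjust design flow above can be sketched as a loop that runs test inputs through the normal-precision and quantized models and widens the mantissas until the outputs agree. The toy layer, candidate widths, and tolerance are assumptions for illustration, not the patented flow:

```python
import numpy as np

def bfp_quantize(t, mantissa_bits):
    # One shared exponent per tensor; a simplification of real BFP.
    scale = 2.0 ** (np.floor(np.log2(np.max(np.abs(t)))) - mantissa_bits + 1)
    return np.round(t / scale) * scale

def layer(x, w, mantissa_bits=None):
    # Toy ReLU layer, optionally run in the quantized format.
    if mantissa_bits is not None:
        x, w = bfp_quantize(x, mantissa_bits), bfp_quantize(w, mantissa_bits)
    return np.maximum(x @ w, 0.0)

rng = np.random.default_rng(4)
x_test = rng.normal(size=(32, 16))     # the "set of test inputs"
w = rng.normal(size=(16, 8))
reference = layer(x_test, w)           # normal-precision model output

for bits in (2, 3, 4, 6, 8):           # widen until the outputs agree
    candidate = layer(x_test, w, mantissa_bits=bits)
    gap = np.linalg.norm(reference - candidate) / np.linalg.norm(reference)
    if gap < 0.01:
        print(f"selected {bits}-bit mantissas (output gap {gap:.4f})")
        break
else:
    print("no candidate width met the tolerance")
```
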