Patents by Inventor Andy Wagner
Andy Wagner has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250088858
Abstract: A first wireless communication device having an application processor configured to generate, for transmission to a second wireless communication device, a first identity resolving key (IRK) that is unique to the second wireless communication device, wherein the first IRK indicates the second wireless communication device is allowed to perform find location operations with the first wireless communication device; a Bluetooth controller configured to perform Bluetooth scanning operations to receive a Bluetooth advertisement having a payload comprising an IRK; and an always on processor (AOP) configured to compare the IRK to the first IRK.
Type: Application
Filed: September 5, 2024
Publication date: March 13, 2025
Inventors: Yann LY-GAGNON, Andy WAGNER, Anjali S. SANDESARA, Bryant LIU, Michelle J. NG, Yilok L. WONG
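A minimal sketch of the comparison step this abstract describes, assuming 16-byte Bluetooth LE identity resolving keys; the function name and its callers are hypothetical, not from the filing:

```python
import hmac

# Hypothetical sketch of the AOP-side check: compare the IRK carried in a
# received advertisement payload against the stored IRK that authorizes
# find location operations. A constant-time comparison avoids leaking key
# material through timing (BLE IRKs are 16 bytes).
def irk_matches(stored_irk: bytes, advertised_irk: bytes) -> bool:
    if len(stored_irk) != 16 or len(advertised_irk) != 16:
        return False
    return hmac.compare_digest(stored_irk, advertised_irk)
```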
-
Publication number: 20250088859
Abstract: An apparatus configured to process an indication of a wireless communication device with which the apparatus is allowed to perform find location operations, wherein the indication comprises a first identity resolving key (IRK) that is unique to the wireless communication device; process a payload of an advertisement comprising an IRK; and compare the IRK to the first IRK received by the always on processor.
Type: Application
Filed: September 5, 2024
Publication date: March 13, 2025
Inventors: Yann LY-GAGNON, Andy WAGNER, Anjali S. SANDESARA, Bryant LIU, Michelle J. NG, Yilok L. WONG
-
Publication number: 20250088834
Abstract: An apparatus configured to initiate an operation to locate a target device; generate, for transmission to the target device, Bluetooth discovery signals for detecting a proximity of the target device; generate, for transmission to the target device at substantially the same time as the Bluetooth discovery signals, a discovery message via a network connection; detect a Bluetooth discovery response or a response to the discovery message from the target device; and trigger an ultra-wideband (UWB) ranging operation based on detection of the Bluetooth discovery response or the response to the discovery message.
Type: Application
Filed: September 6, 2024
Publication date: March 13, 2025
Inventors: Vignesh Babu MOORTHY, Andy WAGNER, Michelle Julia NG, Qiang CHEN, Richard ONG, Robert W. BRUMLEY, Robert GOLSHAN, Yann LY-GAGNON
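A simplified sketch of the parallel-discovery flow the abstract describes. The transport coroutines and the UWB trigger are placeholder parameters, not APIs from the filing; real radio scheduling is omitted:

```python
import asyncio

# Hypothetical sketch: issue a Bluetooth discovery and a network discovery
# message at roughly the same time, then trigger UWB ranging as soon as
# either path reports a response from the target device.
async def discover_then_range(send_bt_discovery, send_network_discovery,
                              start_uwb_ranging):
    bt_task = asyncio.ensure_future(send_bt_discovery())
    net_task = asyncio.ensure_future(send_network_discovery())
    done, pending = await asyncio.wait(
        {bt_task, net_task}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # one response is enough to proceed
    if any(task.result() for task in done):
        await start_uwb_ranging()  # fine-grained ranging once detected
```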
-
Patent number: 12182716
Abstract: Embodiments of the present disclosure include systems and methods for compressing and decompressing data generated by sub-blocks in a neural network. In some embodiments, an input matrix is received at a compression block in the neural network. The compression block compresses the input matrix into a compressed matrix and outputs the compressed matrix. The compressed matrix has a reduced dimensionality relative to the dimensionality of the input matrix. A decompression block retrieves the compressed matrix, decompresses it into a decompressed matrix, and outputs the decompressed matrix. The decompressed matrix has the same dimensionality as the input matrix. The compression and decompression blocks are optimized based on feedback received from the neural network.
Type: Grant
Filed: August 25, 2020
Date of Patent: December 31, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Andy Wagner, Tiyasa Mitra, Marc Tremblay
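A minimal sketch, not the patented implementation, of learned compression and decompression blocks around a sub-block's activations: a down-projection shrinks the hidden dimension and an up-projection restores it, with both trained end to end so the network's loss supplies the optimizing feedback:

```python
import torch
import torch.nn as nn

class CompressionBlock(nn.Module):
    def __init__(self, dim: int, compressed_dim: int):
        super().__init__()
        self.down = nn.Linear(dim, compressed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(x)  # reduced dimensionality

class DecompressionBlock(nn.Module):
    def __init__(self, compressed_dim: int, dim: int):
        super().__init__()
        self.up = nn.Linear(compressed_dim, dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.up(z)  # back to the input's dimensionality
```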
-
Publication number: 20240176953
Abstract: Embodiments of the present disclosure include systems and methods for packing tokens to train sequence models. In some embodiments, a plurality of datasets for training a sequence model is received. Each dataset in the plurality of datasets includes a sequence of correlated tokens. A set of training data is generated that includes a subset of a sequence of tokens from a first dataset in the plurality of datasets and a subset of a sequence of tokens from a second, different dataset in the plurality of datasets. The sequence model is trained using the set of training data.
Type: Application
Filed: February 2, 2024
Publication date: May 30, 2024
Inventors: Andy WAGNER, Tiyasa MITRA, Marc TREMBLAY
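An illustrative sketch of the packing idea under stated assumptions: fill each fixed-length training example with token subsequences drawn across dataset boundaries, so little capacity is wasted on padding. Separator and attention-mask handling are omitted:

```python
def pack_sequences(datasets, max_len):
    packed, current = [], []
    for sequence in datasets:          # each dataset yields correlated tokens
        for token in sequence:
            current.append(token)
            if len(current) == max_len:
                packed.append(current)
                current = []
    if current:
        packed.append(current)         # final, possibly short, pack
    return packed

example = pack_sequences([[1, 2, 3], [4, 5], [6, 7, 8, 9]], max_len=4)
# [[1, 2, 3, 4], [5, 6, 7, 8], [9]]
```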
-
Patent number: 11954448
Abstract: Embodiments of the present disclosure include systems and methods for determining position values for training data that is used to train transformer models. In some embodiments, a set of input data for training a transformer model is received. The set of input data comprises a set of tokens. Based on an offset value, a set of successive position values for the set of tokens is determined. Each position value in the set of successive position values represents a position of a token in the set of tokens relative to other tokens in the set of tokens. A set of training data is generated to comprise the set of tokens and the set of successive position values. The transformer model is trained using the set of training data.
Type: Grant
Filed: July 21, 2020
Date of Patent: April 9, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Andy Wagner, Tiyasa Mitra, Marc Tremblay
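An illustrative sketch of one reading of the abstract: number token positions from an offset rather than always from zero, so relative order is preserved while absolute positions vary. A random offset is an assumption here, not a claim from the patent:

```python
import random

def offset_positions(tokens, max_offset):
    # Successive position values starting at an offset; each value still
    # encodes the token's position relative to the other tokens.
    offset = random.randint(0, max_offset)
    return [offset + i for i in range(len(tokens))]

positions = offset_positions(["the", "cat", "sat"], max_offset=100)
# e.g. [37, 38, 39] -- relative order is unchanged
```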
-
Patent number: 11928429
Abstract: Embodiments of the present disclosure include systems and methods for packing tokens to train sequence models. In some embodiments, a plurality of datasets for training a sequence model is received. Each dataset in the plurality of datasets includes a sequence of correlated tokens. A set of training data is generated that includes a subset of a sequence of tokens from a first dataset in the plurality of datasets and a subset of a sequence of tokens from a second, different dataset in the plurality of datasets. The sequence model is trained using the set of training data.
Type: Grant
Filed: May 22, 2020
Date of Patent: March 12, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Andy Wagner, Tiyasa Mitra, Marc Tremblay
-
Patent number: 11893469
Abstract: Embodiments of the present disclosure include systems and methods for training transformer models using position masking. In some embodiments, a set of data for training a transformer model is received. The set of data includes a sequence of tokens and a set of position values. Each position value in the set of position values represents a position of a token in the sequence of tokens relative to other tokens in the sequence of tokens. A subset of the set of position values in the set of data is selected. Each position value in the subset of the set of position values is replaced with a second defined value to form a second set of defined values. The transformer model is trained using the set of data.
Type: Grant
Filed: May 22, 2020
Date of Patent: February 6, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Andy Wagner, Tiyasa Mitra, Marc Tremblay
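An illustrative sketch of position masking as the abstract states it: replace a random subset of position values with a reserved mask value so the model must also infer where tokens sit. The proportion and mask value are assumptions for the example:

```python
import random

def mask_positions(positions, mask_value, proportion=0.15):
    # Replace a randomly selected subset of position values with the
    # defined mask value; the tokens themselves are left untouched.
    masked = list(positions)
    k = max(1, int(len(positions) * proportion))
    for i in random.sample(range(len(positions)), k):
        masked[i] = mask_value
    return masked

print(mask_positions([0, 1, 2, 3, 4, 5], mask_value=-1))
# e.g. [0, 1, -1, 3, 4, 5]
```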
-
Patent number: 11886983
Abstract: Embodiments of the present disclosure include systems and methods for reducing hardware resource utilization by residual neural networks. In some embodiments, a first matrix is received at a layer included in a neural network. The first matrix is compressed to produce a second matrix. The second matrix has a reduced dimensionality relative to the dimensionality of the first matrix. The second matrix is processed through a network block in the layer included in the neural network. The processed second matrix is expanded to produce a third matrix. The third matrix has a dimensionality equal to the dimensionality of the first matrix. The third matrix is added to the first matrix to produce a fourth matrix.
Type: Grant
Filed: August 25, 2020
Date of Patent: January 30, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Andy Wagner, Tiyasa Mitra, Marc Tremblay
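A minimal sketch, under stated assumptions rather than the claimed implementation, of the compress-process-expand-add pattern: the network block runs at a reduced width, cutting compute and memory, while the skip connection still operates at full width:

```python
import torch
import torch.nn as nn

class CompressedResidualBlock(nn.Module):
    def __init__(self, dim: int, compressed_dim: int):
        super().__init__()
        self.compress = nn.Linear(dim, compressed_dim)
        self.block = nn.Sequential(
            nn.Linear(compressed_dim, compressed_dim), nn.ReLU()
        )
        self.expand = nn.Linear(compressed_dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        second = self.compress(x)        # reduced dimensionality
        processed = self.block(second)   # network block at the low width
        third = self.expand(processed)   # back to the input width
        return x + third                 # residual add produces the output
```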
-
Patent number: 11663444
Abstract: Systems and methods for pipelined neural network processing with continuous and asynchronous updates are described. A method for processing a neural network comprising L layers, where L is an integer greater than two, includes partitioning the L layers among a set of computing resources configured to process forward passes and backward passes associated with each of the L layers. The method further includes initiating processing of the forward passes and the backward passes using the set of computing resources. The method further includes, upon completion of a first set of forward passes and a first set of backward passes associated with a first layer of the L layers, initiating update of parameters associated with the first layer when gradients are available for updating the parameters associated with the first layer, without waiting to calculate gradients associated with any of the remaining L layers.
Type: Grant
Filed: September 27, 2019
Date of Patent: May 30, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Andy Wagner, Tiyasa Mitra, Saurabh M. Kulkarni, Marc Tremblay, Sujeeth S. Bharadwaj
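A highly simplified, single-device sketch of the update rule: a layer's parameters are updated as soon as its own gradients are ready, without waiting for the gradients of the remaining layers. The `forward`, `backward`, `loss_grad`, and `apply_update` hooks are hypothetical interfaces; real pipeline scheduling across computing resources is omitted:

```python
def train_step(layers, batch, lr):
    activations = [batch]
    for layer in layers:                                   # forward passes
        activations.append(layer.forward(activations[-1]))
    grad = layers[-1].loss_grad(activations[-1])
    for layer, act in zip(reversed(layers), reversed(activations[:-1])):
        grad = layer.backward(act, grad)  # this layer's gradients are ready
        layer.apply_update(lr)            # update immediately: continuous,
                                          # asynchronous, no global wait
```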
-
Patent number: 11610120
Abstract: Embodiments of the present disclosure include systems and methods for training neural networks. In one embodiment, a neural network may receive input data and produce output results in response to the input data and weights of the neural network. An error is determined at an output of the neural network based on the output results. The error is propagated in a reverse direction through the neural network from the output and one or more intermediate outputs to adjust the weights.
Type: Grant
Filed: May 8, 2020
Date of Patent: March 21, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Andy Wagner, Tiyasa Mitra, Marc Tremblay
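A minimal sketch, assuming the "intermediate outputs" are auxiliary heads (an interpretation, not the patent's wording): errors are computed at the final output and at an intermediate output, summed, and propagated in reverse to adjust the weights:

```python
import torch
import torch.nn as nn

trunk = nn.Linear(16, 16)
aux_head = nn.Linear(16, 4)      # intermediate output
final_head = nn.Linear(16, 4)    # network output
params = (list(trunk.parameters()) + list(aux_head.parameters())
          + list(final_head.parameters()))
opt = torch.optim.SGD(params, lr=0.1)

x, target = torch.randn(8, 16), torch.randn(8, 4)
hidden = torch.relu(trunk(x))
loss = (nn.functional.mse_loss(final_head(hidden), target)
        + nn.functional.mse_loss(aux_head(hidden), target))
opt.zero_grad()
loss.backward()   # error flows in reverse from both outputs
opt.step()        # weights adjusted
```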
-
Patent number: 11537890
Abstract: Embodiments of the present disclosure include systems and methods for compressing weights for distributed neural networks. In some embodiments, a first network comprising a first set of weights is trained using a set of training data. A second network comprising a second set of weights is trained using the set of training data. A number of weights in the first set of weights is greater than a number of weights in the second set of weights. The first set of weights is adjusted based on a first loss determined by the first network and a second loss determined by the second network. The second set of weights is adjusted based on the first loss determined by the first network and the second loss determined by the second network. Values of the second set of weights are sent to a computing system.
Type: Grant
Filed: September 9, 2020
Date of Patent: December 27, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Andy Wagner, Tiyasa Mitra, Marc Tremblay
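A minimal sketch under one reading of the abstract: a large and a small network train on the same data, an agreement term couples them so each set of weights is adjusted based on both losses, and only the small network's (fewer) weights are sent on. The agreement term is an assumption for illustration:

```python
import torch
import torch.nn as nn

large = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))
small = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 4))
opt = torch.optim.SGD(list(large.parameters()) + list(small.parameters()),
                      lr=0.1)

x, y = torch.randn(32, 16), torch.randn(32, 4)
large_out, small_out = large(x), small(x)
loss_large = nn.functional.mse_loss(large_out, y)          # first loss
loss_small = nn.functional.mse_loss(small_out, y)          # second loss
agreement = nn.functional.mse_loss(small_out, large_out)   # couples the two
opt.zero_grad()
(loss_large + loss_small + agreement).backward()
opt.step()

payload = {k: v.detach() for k, v in small.state_dict().items()}
# 'payload' holds the compressed (smaller) weight set to transmit
```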
-
Publication number: 20220382978
Abstract: Embodiments of the present disclosure include systems and methods for training masked language models based on partial sequences of tokens. A sequence of tokens for training a transformer model is received. A defined proportion of the sequence of tokens is selected. Each value of the defined proportion of the sequence of tokens is replaced with a defined value.
Type: Application
Filed: May 28, 2021
Publication date: December 1, 2022
Inventors: Andy WAGNER, Tiyasa MITRA, Fanny NINA PARAVECINO
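An illustrative sketch of the masking step as stated: select a defined proportion of the token sequence, replace each selected token with a defined mask value, and keep the originals as prediction targets. `MASK_ID` and the proportion are placeholder assumptions:

```python
import random

MASK_ID = 103  # placeholder mask token id

def mask_for_mlm(tokens, proportion=0.15):
    # Replace a defined proportion of tokens with MASK_ID, remembering
    # the originals as targets for the masked-language-model loss.
    masked, targets = list(tokens), {}
    k = max(1, int(len(tokens) * proportion))
    for i in random.sample(range(len(tokens)), k):
        targets[i] = masked[i]
        masked[i] = MASK_ID
    return masked, targets

masked, targets = mask_for_mlm([7, 12, 99, 4, 18, 31, 2])
# e.g. masked = [7, 12, 103, 4, 18, 31, 2], targets = {2: 99}
```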
-
Publication number: 20220222521
Abstract: Weights may be updated during training of a neural network artificial intelligence model. Certain techniques split the training data into mini-batches, process each mini-batch in a pipeline, and then apply the weight updates after processing of the mini-batch completes. However, waiting for the mini-batch to complete before applying the weight updates causes significant delays: during a ramp-down period, the data must be flushed out of the pipeline, and then again during a ramp-up period, the pipeline must be refilled with data from the next mini-batch. The present disclosure avoids such delays and improves performance by applying the weight updates at specific intervals, without splitting the data into mini-batches. The updated weights may be applied during steady-state operation of the pipeline.
Type: Application
Filed: January 13, 2021
Publication date: July 14, 2022
Inventors: Andy WAGNER, Tiyasa MITRA
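A simplified sketch of the scheduling idea: keep one continuous stream of samples flowing and apply accumulated weight updates every `interval` steps, so the pipeline never drains between mini-batches. `accumulate_grads` and `apply_updates` are hypothetical stand-ins for the real pipeline operations:

```python
def train(stream, model, interval):
    for step_idx, sample in enumerate(stream, start=1):
        model.accumulate_grads(sample)   # pipeline stays full
        if step_idx % interval == 0:
            model.apply_updates()        # update during steady state; no
                                         # ramp-down/ramp-up flush needed
```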
-
Publication number: 20220108162
Abstract: Embodiments of the present disclosure include systems and methods for decimating hidden layers for training transformer models. In some embodiments, input data for training a transformer model is received at a transformer layer included in the transformer model. The transformer layer comprises a hidden layer. The hidden layer comprises a set of neurons configured to process training data. A subset of the set of neurons of the hidden layer is selected. Only the subset of the set of neurons of the hidden layer is used to train the transformer model with the input data.
Type: Application
Filed: October 1, 2020
Publication date: April 7, 2022
Inventors: Andy WAGNER, Tiyasa MITRA, Marc TREMBLAY
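A minimal sketch of decimating a hidden layer: select a subset of hidden neurons and train with only those active, here by zeroing the rest. Sampling the keep mask per step is an assumption; the filing's selection policy may differ:

```python
import torch
import torch.nn as nn

def decimated_forward(hidden_layer: nn.Linear, x: torch.Tensor,
                      keep_fraction: float) -> torch.Tensor:
    h = torch.relu(hidden_layer(x))
    # Select a subset of neurons; only these contribute to training.
    keep = (torch.rand(h.shape[-1]) < keep_fraction).float()
    return h * keep

layer = nn.Linear(16, 32)
out = decimated_forward(layer, torch.randn(4, 16), keep_fraction=0.5)
```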
-
Publication number: 20220076112
Abstract: Embodiments of the present disclosure include systems and methods for compressing weights for distributed neural networks. In some embodiments, a first network comprising a first set of weights is trained using a set of training data. A second network comprising a second set of weights is trained using the set of training data. A number of weights in the first set of weights is greater than a number of weights in the second set of weights. The first set of weights is adjusted based on a first loss determined by the first network and a second loss determined by the second network. The second set of weights is adjusted based on the first loss determined by the first network and the second loss determined by the second network. Values of the second set of weights are sent to a computing system.
Type: Application
Filed: September 9, 2020
Publication date: March 10, 2022
Inventors: Andy WAGNER, Tiyasa MITRA, Marc TREMBLAY
-
Publication number: 20220076127
Abstract: Embodiments of the present disclosure include systems and methods for forcing weights of transformer model layers when training a transformer model. In some embodiments, input data is received at a first layer included in a transformer model. The input data is processed through the first layer of the transformer model to produce a first output data. The first output data is processed through the first layer of the transformer model to produce a second output data. The first output data is processed through a second layer included in the transformer model to produce a third output data. A difference is calculated between the second output data and the third output data. Weights included in the first layer of the transformer model are adjusted based on the calculated difference.
Type: Application
Filed: September 9, 2020
Publication date: March 10, 2022
Inventors: Andy WAGNER, Tiyasa MITRA, Marc TREMBLAY
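A minimal sketch of the comparison the abstract walks through: feed the first layer's output both back through the first layer and through the second layer, then adjust the first layer's weights from the difference. Using an MSE penalty is an assumption for illustration:

```python
import torch
import torch.nn as nn

layer1, layer2 = nn.Linear(16, 16), nn.Linear(16, 16)
opt = torch.optim.SGD(layer1.parameters(), lr=0.01)

x = torch.randn(8, 16)
first = layer1(x)        # first output data
second = layer1(first)   # layer 1 applied to its own output
third = layer2(first)    # layer 2 applied to the same output
diff = nn.functional.mse_loss(second, third.detach())  # the difference
opt.zero_grad()
diff.backward()          # adjust layer 1's weights from the difference
opt.step()
```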
-
Publication number: 20220067490
Abstract: Embodiments of the present disclosure include systems and methods for reducing hardware resource utilization by residual neural networks. In some embodiments, a first matrix is received at a layer included in a neural network. The first matrix is compressed to produce a second matrix. The second matrix has a reduced dimensionality relative to the dimensionality of the first matrix. The second matrix is processed through a network block in the layer included in the neural network. The processed second matrix is expanded to produce a third matrix. The third matrix has a dimensionality equal to the dimensionality of the first matrix. The third matrix is added to the first matrix to produce a fourth matrix.
Type: Application
Filed: August 25, 2020
Publication date: March 3, 2022
Inventors: Andy WAGNER, Tiyasa MITRA, Marc TREMBLAY
-
Publication number: 20220067529
Abstract: Embodiments of the present disclosure include systems and methods for compressing and decompressing data generated by sub-blocks in a neural network. In some embodiments, an input matrix is received at a compression block in the neural network. The compression block compresses the input matrix into a compressed matrix and outputs the compressed matrix. The compressed matrix has a reduced dimensionality relative to the dimensionality of the input matrix. A decompression block retrieves the compressed matrix, decompresses it into a decompressed matrix, and outputs the decompressed matrix. The decompressed matrix has the same dimensionality as the input matrix. The compression and decompression blocks are optimized based on feedback received from the neural network.
Type: Application
Filed: August 25, 2020
Publication date: March 3, 2022
Inventors: Andy WAGNER, Tiyasa MITRA, Marc TREMBLAY
-
Publication number: 20220067280
Abstract: Embodiments of the present disclosure include systems and methods for training transformer models. In some embodiments, a set of input data is received. The input data comprises a plurality of tokens including masked tokens. The plurality of tokens is processed in an embedding layer. The embedding layer is coupled to a transformer layer. The plurality of tokens is processed in the transformer layer, which is coupled to a classifier layer. The plurality of tokens is processed in the classifier layer. The classifier layer is coupled to a loss layer. At least one of the embedding layer and the classifier layer combines masked tokens at a current position with tokens at one or more of a previous position and a subsequent position.
Type: Application
Filed: August 25, 2020
Publication date: March 3, 2022
Inventors: Andy Wagner, Tiyasa Mitra, Marc Tremblay
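A minimal sketch of one way the embedding-layer combining could look: where a token is masked, mix its embedding with the embeddings at the previous and subsequent positions so nearby context is folded in. The averaging scheme and `MASK_ID` are assumptions, not the claimed method (edge positions wrap around in this sketch):

```python
import torch
import torch.nn as nn

MASK_ID = 103  # placeholder mask token id

def embed_with_neighbors(embedding: nn.Embedding,
                         token_ids: torch.Tensor) -> torch.Tensor:
    e = embedding(token_ids)                   # [seq_len, dim]
    prev_e = torch.roll(e, shifts=1, dims=0)   # previous position
    next_e = torch.roll(e, shifts=-1, dims=0)  # subsequent position
    combined = (e + prev_e + next_e) / 3.0
    # Combine only at masked positions; other tokens keep their embedding.
    is_masked = (token_ids == MASK_ID).unsqueeze(-1).float()
    return is_masked * combined + (1 - is_masked) * e

emb = nn.Embedding(1000, 32)
out = embed_with_neighbors(emb, torch.tensor([7, MASK_ID, 99, 4]))
```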