Patents by Inventor Sergey Ioffe
Sergey Ioffe has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20260154548
Abstract: Aspects of the present disclosure are directed to novel activation functions which enable improved reproducibility and accuracy tradeoffs in neural networks. In particular, the present disclosure provides a family of activation functions that, on one hand, are smooth with continuous gradient and optionally monotonic but, on the other hand, also mimic the mathematical behavior of a Rectified Linear Unit (ReLU). As examples, the activation functions described herein include a smooth rectified linear unit function and also a leaky version of such function. In various implementations, the proposed functions can provide both a complete stop region and a constant positive gradient (e.g., that can be 1) pass region like a ReLU, thereby matching accuracy performance of a ReLU. Additional implementations include a leaky version and/or functions that feature different constant gradients in the pass region.
Type: Application
Filed: January 23, 2026
Publication date: June 4, 2026
Inventors: Gil Shamir, Dong Lin, Sergey Ioffe
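The abstract above describes the properties of the activation family without giving a formula. As a minimal sketch, the piecewise function below is one example with those properties: a complete stop region, a gradient-1 pass region, and a quadratic transition that keeps the gradient continuous. The quadratic form, the names `smelu` and `smelu_grad`, and the half-width parameter `beta` are assumptions for illustration, not the claimed functions.

```python
def smelu(x: float, beta: float = 1.0) -> float:
    """Smooth ReLU-like activation (illustrative form, not taken from the patent text).

    beta > 0 is the half-width of the smooth transition around zero.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0  # stop region: output is exactly zero
    if x >= beta:
        return float(x)  # pass region: identity, so the gradient is 1
    # Quadratic transition; matches value and slope at both boundaries.
    return (x + beta) ** 2 / (4.0 * beta)


def smelu_grad(x: float, beta: float = 1.0) -> float:
    """Derivative of smelu; rises linearly from 0 to 1 across [-beta, beta]."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return 1.0
    return (x + beta) / (2.0 * beta)
```

With `beta = 1.0`, `smelu(0.0)` evaluates to 0.25 and `smelu_grad(0.0)` to 0.5, halfway between the stop and pass regions.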
-
Patent number: 12536426
Abstract: Aspects of the present disclosure are directed to novel activation functions which enable improved reproducibility and accuracy tradeoffs in neural networks. In particular, the present disclosure provides a family of activation functions that, on one hand, are smooth with continuous gradient and optionally monotonic but, on the other hand, also mimic the mathematical behavior of a Rectified Linear Unit (ReLU). As examples, the activation functions described herein include a smooth rectified linear unit function and also a leaky version of such function. In various implementations, the proposed functions can provide both a complete stop region and a constant positive gradient (e.g., that can be 1) pass region like a ReLU, thereby matching accuracy performance of a ReLU. Additional implementations include a leaky version and/or functions that feature different constant gradients in the pass region.
Type: Grant
Filed: June 16, 2020
Date of Patent: January 27, 2026
Assignee: GOOGLE LLC
Inventors: Gil Shamir, Dong Lin, Sergey Ioffe
-
Publication number: 20250013864
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using a neural network system that includes a batch normalization layer. One of the methods includes receiving a respective first layer output for each training example in the batch; computing a plurality of normalization statistics for the batch from the first layer outputs; normalizing each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch; generating a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and providing the batch normalization layer output as an input to the second neural network layer.
Type: Application
Filed: June 11, 2024
Publication date: January 9, 2025
Inventors: Sergey Ioffe, Corinna Cortes
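The steps listed in the abstract above can be sketched as a training-time forward pass. This is a minimal illustration under stated assumptions: the normalization statistics are taken to be the per-component batch mean and variance, and the final output is taken to be a learned scale and shift of the normalized value. The names `batch_norm`, `gamma`, `beta`, and `eps` are hypothetical and do not come from the listing.

```python
from typing import List, Sequence


def batch_norm(
    batch: Sequence[Sequence[float]],
    gamma: Sequence[float],
    beta: Sequence[float],
    eps: float = 1e-5,
) -> List[List[float]]:
    """Normalize each component of each first-layer output over the batch.

    batch: one row per training example, one column per component.
    gamma, beta: assumed per-component scale and shift parameters.
    """
    n = len(batch)
    dims = len(batch[0])
    # Normalization statistics computed from the first-layer outputs.
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dims)
    ]
    outputs = []
    for row in batch:
        normalized = [
            (row[j] - means[j]) / (variances[j] + eps) ** 0.5 for j in range(dims)
        ]
        # Batch normalization layer output, passed on to the second layer.
        outputs.append([gamma[j] * normalized[j] + beta[j] for j in range(dims)])
    return outputs
```

For example, `batch_norm([[1.0, 10.0], [3.0, 30.0]], [1.0, 1.0], [0.0, 0.0])` returns values close to -1 and 1 in both columns, even though the two columns start on very different scales.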
-
Patent number: 12125257
Abstract: A neural network system that includes: multiple subnetworks that include: a first subnetwork including multiple first modules, each first module including: a pass-through convolutional layer configured to process the subnetwork input for the first subnetwork to generate a pass-through output; an average pooling stack of neural network layers that collectively processes the subnetwork input for the first subnetwork to generate an average pooling output; a first stack of convolutional neural network layers configured to collectively process the subnetwork input for the first subnetwork to generate a first stack output; a second stack of convolutional neural network layers that are configured to collectively process the subnetwork input for the first subnetwork to generate a second stack output; and a concatenation layer configured to concatenate the pass-through output, the average pooling output, the first stack output, and the second stack output to generate a first module output for the first module.
Type: Grant
Filed: July 9, 2021
Date of Patent: October 22, 2024
Assignee: Google LLC
Inventors: Vincent O. Vanhoucke, Christian Szegedy, Sergey Ioffe
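The data flow of one such module, four branches reading the same input and a concatenation layer joining their outputs, might be sketched as follows. For brevity the feature map is one-dimensional (a list of positions, each holding a list of channel values), and the branches `scale` and `avg_pool3` are trivial stand-ins rather than real convolutional stacks. All names here are hypothetical.

```python
from typing import Callable, List

FeatureMap = List[List[float]]  # one inner list of channel values per position
Branch = Callable[[FeatureMap], FeatureMap]


def concat_channels(outputs: List[FeatureMap]) -> FeatureMap:
    """Concatenate branch outputs along the channel axis, position by position."""
    positions = len(outputs[0])
    if any(len(o) != positions for o in outputs):
        raise ValueError("all branches must keep the same number of positions")
    return [[c for o in outputs for c in o[i]] for i in range(positions)]


def module_forward(
    x: FeatureMap,
    pass_through: Branch,
    avg_pool_stack: Branch,
    first_stack: Branch,
    second_stack: Branch,
) -> FeatureMap:
    """Every branch processes the same module input; outputs are then concatenated."""
    return concat_channels(
        [pass_through(x), avg_pool_stack(x), first_stack(x), second_stack(x)]
    )


def scale(factor: float) -> Branch:
    """Stand-in for a convolutional layer: multiplies every channel by a constant."""
    return lambda x: [[factor * c for c in pos] for pos in x]


def avg_pool3(x: FeatureMap) -> FeatureMap:
    """Stand-in average pooling: window of 3, stride 1, shorter window at the edges."""
    out = []
    for i in range(len(x)):
        window = x[max(0, i - 1): i + 2]
        out.append(
            [sum(p[ch] for p in window) / len(window) for ch in range(len(x[0]))]
        )
    return out
```

The channel count of the module output is the sum of the channel counts of the four branches, while the number of positions is unchanged.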
-
Publication number: 20240265253
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage medium, for training a neural network, wherein the neural network is configured to receive an input data item and to process the input data item to generate a respective score for each label in a predetermined set of multiple labels. The method includes actions of obtaining a set of training data that includes a plurality of training items, wherein each training item is associated with a respective label from the predetermined set of multiple labels; and modifying the training data to generate regularizing training data, comprising: for each training item, determining whether to modify the label associated with the training item, and changing the label associated with the training item to a different label from the predetermined set of labels, and training the neural network on the regularizing data.
Type: Application
Filed: February 20, 2024
Publication date: August 8, 2024
Inventor: Sergey Ioffe
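The label-modification step in the abstract above could look like the sketch below. The abstract does not say how the decision to modify a label is made or how the replacement is chosen; here both are assumed to be random, with a fixed change probability and a uniform draw over the other labels. The function name `regularize_labels` and its parameters are hypothetical.

```python
import random
from typing import List, Sequence


def regularize_labels(
    labels: Sequence[int],
    num_labels: int,
    change_prob: float,
    seed: int = 0,
) -> List[int]:
    """Return a copy of labels in which some entries are swapped for a different label.

    Assumptions: labels are integers in [0, num_labels), each label is changed
    independently with probability change_prob, and the replacement is drawn
    uniformly from the remaining labels.
    """
    if num_labels < 2:
        raise ValueError("need at least two labels to pick a different one")
    rng = random.Random(seed)
    regularized = []
    for label in labels:
        if rng.random() < change_prob:
            # Draw from the num_labels - 1 other labels, skipping the current one.
            other = rng.randrange(num_labels - 1)
            if other >= label:
                other += 1
            regularized.append(other)
        else:
            regularized.append(label)
    return regularized
```

The network would then be trained on the training items paired with the returned labels instead of the originals.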
-
Publication number: 20240249138
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images or features of images using an image classification system that includes a batch normalization layer. One of the systems includes a convolutional neural network configured to receive an input comprising an image or image features of the image and to generate a network output that includes respective scores for each object category in a set of object categories, the score for each object category representing a likelihood that the image contains an image of an object belonging to the category, and the convolutional neural network comprising: a plurality of neural network layers, the plurality of neural network layers comprising a first convolutional neural network layer and a second neural network layer; and a batch normalization layer between the first convolutional neural network layer and the second neural network layer.
Type: Application
Filed: December 22, 2023
Publication date: July 25, 2024
Inventors: Sergey Ioffe, Corinna Cortes
-
Patent number: 12033073
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using a neural network system that includes a batch normalization layer. One of the methods includes receiving a respective first layer output for each training example in the batch; computing a plurality of normalization statistics for the batch from the first layer outputs; normalizing each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch; generating a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and providing the batch normalization layer output as an input to the second neural network layer.
Type: Grant
Filed: January 22, 2021
Date of Patent: July 9, 2024
Assignee: Google LLC
Inventors: Sergey Ioffe, Corinna Cortes
-
Patent number: 11934956
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage medium, for training a neural network, wherein the neural network is configured to receive an input data item and to process the input data item to generate a respective score for each label in a predetermined set of multiple labels. The method includes actions of obtaining a set of training data that includes a plurality of training items, wherein each training item is associated with a respective label from the predetermined set of multiple labels; and modifying the training data to generate regularizing training data, comprising: for each training item, determining whether to modify the label associated with the training item, and changing the label associated with the training item to a different label from the predetermined set of labels, and training the neural network on the regularizing data.
Type: Grant
Filed: November 30, 2022
Date of Patent: March 19, 2024
Assignee: Google LLC
Inventor: Sergey Ioffe
-
Patent number: 11893485
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using a neural network system that includes a batch normalization layer. One of the methods includes receiving a respective first layer output for each training example in the batch; computing a plurality of normalization statistics for the batch from the first layer outputs; normalizing each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch; generating a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and providing the batch normalization layer output as an input to the second neural network layer.
Type: Grant
Filed: January 22, 2021
Date of Patent: February 6, 2024
Assignee: Google LLC
Inventors: Sergey Ioffe, Corinna Cortes
-
Patent number: 11887004
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing a neural network. In one aspect, the neural network includes a batch renormalization layer between a first neural network layer and a second neural network layer. The first neural network layer generates first layer outputs having multiple components. The batch renormalization layer is configured to, during training of the neural network on a current batch of training examples, obtain respective current moving normalization statistics for each of the multiple components and determine respective affine transform parameters for each of the multiple components from the current moving normalization statistics. The batch renormalization layer receives a respective first layer output for each training example in the current batch and applies the affine transform to each component of a normalized layer output to generate a renormalized layer output for the training example.
Type: Grant
Filed: April 21, 2020
Date of Patent: January 30, 2024
Assignee: Google LLC
Inventor: Sergey Ioffe
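A minimal sketch of the renormalization step for a single component follows, assuming the moving statistics are a moving mean and a moving standard deviation and that the affine transform is a scale `r` and an offset `d` derived from them. The clipping bounds `r_max` and `d_max` are an additional assumption that the abstract does not mention. All names are hypothetical.

```python
from typing import List, Sequence


def _clip(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def batch_renorm(
    values: Sequence[float],
    moving_mean: float,
    moving_std: float,
    r_max: float = 3.0,
    d_max: float = 5.0,
    eps: float = 1e-5,
) -> List[float]:
    """Renormalize one component of the first-layer outputs over the current batch.

    values: that component's value for each training example in the batch.
    moving_mean, moving_std: assumed moving normalization statistics.
    """
    n = len(values)
    batch_mean = sum(values) / n
    batch_std = (sum((v - batch_mean) ** 2 for v in values) / n + eps) ** 0.5
    # Affine transform parameters determined from the moving statistics.
    r = _clip(batch_std / moving_std, 1.0 / r_max, r_max)
    d = _clip((batch_mean - moving_mean) / moving_std, -d_max, d_max)
    # Normalize with the batch statistics, then apply the affine transform.
    return [(v - batch_mean) / batch_std * r + d for v in values]
```

When neither bound is hit, the result equals `(v - moving_mean) / moving_std`, so the layer behaves as if it had normalized with the moving statistics while the batch statistics are still what is computed from the current batch.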
-
Patent number: 11853885
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images or features of images using an image classification system that includes a batch normalization layer. One of the systems includes a convolutional neural network configured to receive an input comprising an image or image features of the image and to generate a network output that includes respective scores for each object category in a set of object categories, the score for each object category representing a likelihood that the image contains an image of an object belonging to the category, and the convolutional neural network comprising: a plurality of neural network layers, the plurality of neural network layers comprising a first convolutional neural network layer and a second neural network layer; and a batch normalization layer between the first convolutional neural network layer and the second neural network layer.
Type: Grant
Filed: April 18, 2022
Date of Patent: December 26, 2023
Assignee: Google LLC
Inventors: Sergey Ioffe, Corinna Cortes
-
Publication number: 20230093469
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage medium, for training a neural network, wherein the neural network is configured to receive an input data item and to process the input data item to generate a respective score for each label in a predetermined set of multiple labels. The method includes actions of obtaining a set of training data that includes a plurality of training items, wherein each training item is associated with a respective label from the predetermined set of multiple labels; and modifying the training data to generate regularizing training data, comprising: for each training item, determining whether to modify the label associated with the training item, and changing the label associated with the training item to a different label from the predetermined set of labels, and training the neural network on the regularizing data.
Type: Application
Filed: November 30, 2022
Publication date: March 23, 2023
Inventor: Sergey Ioffe
-
Patent number: 11531874
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage medium, for training a neural network, wherein the neural network is configured to receive an input data item and to process the input data item to generate a respective score for each label in a predetermined set of multiple labels. The method includes actions of obtaining a set of training data that includes a plurality of training items, wherein each training item is associated with a respective label from the predetermined set of multiple labels; and modifying the training data to generate regularizing training data, comprising: for each training item, determining whether to modify the label associated with the training item, and changing the label associated with the training item to a different label from the predetermined set of labels, and training the neural network on the regularizing data.
Type: Grant
Filed: November 4, 2016
Date of Patent: December 20, 2022
Assignee: Google LLC
Inventor: Sergey Ioffe
-
Publication number: 20220237462
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images or features of images using an image classification system that includes a batch normalization layer. One of the systems includes a convolutional neural network configured to receive an input comprising an image or image features of the image and to generate a network output that includes respective scores for each object category in a set of object categories, the score for each object category representing a likelihood that the image contains an image of an object belonging to the category, and the convolutional neural network comprising: a plurality of neural network layers, the plurality of neural network layers comprising a first convolutional neural network layer and a second neural network layer; and a batch normalization layer between the first convolutional neural network layer and the second neural network layer.
Type: Application
Filed: April 18, 2022
Publication date: July 28, 2022
Inventors: Sergey Ioffe, Corinna Cortes
-
Patent number: 11308394
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images or features of images using an image classification system that includes a batch normalization layer. One of the systems includes a convolutional neural network configured to receive an input comprising an image or image features of the image and to generate a network output that includes respective scores for each object category in a set of object categories, the score for each object category representing a likelihood that the image contains an image of an object belonging to the category, and the convolutional neural network comprising: a plurality of neural network layers, the plurality of neural network layers comprising a first convolutional neural network layer and a second neural network layer; and a batch normalization layer between the first convolutional neural network layer and the second neural network layer.
Type: Grant
Filed: April 1, 2020
Date of Patent: April 19, 2022
Assignee: Google LLC
Inventors: Sergey Ioffe, Corinna Cortes
-
Patent number: 11281973
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using a neural network system that includes a batch normalization layer. One of the methods includes receiving a respective first layer output for each training example in the batch; computing a plurality of normalization statistics for the batch from the first layer outputs; normalizing each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch; generating a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and providing the batch normalization layer output as an input to the second neural network layer.
Type: Grant
Filed: July 30, 2021
Date of Patent: March 22, 2022
Assignee: Google LLC
Inventors: Sergey Ioffe, Corinna Cortes
-
Publication number: 20210357756
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using a neural network system that includes a batch normalization layer. One of the methods includes receiving a respective first layer output for each training example in the batch; computing a plurality of normalization statistics for the batch from the first layer outputs; normalizing each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch; generating a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and providing the batch normalization layer output as an input to the second neural network layer.
Type: Application
Filed: July 30, 2021
Publication date: November 18, 2021
Inventors: Sergey Ioffe, Corinna Cortes
-
Publication number: 20210334605
Abstract: A neural network system that includes: multiple subnetworks that include: a first subnetwork including multiple first modules, each first module including: a pass-through convolutional layer configured to process the subnetwork input for the first subnetwork to generate a pass-through output; an average pooling stack of neural network layers that collectively processes the subnetwork input for the first subnetwork to generate an average pooling output; a first stack of convolutional neural network layers configured to collectively process the subnetwork input for the first subnetwork to generate a first stack output; a second stack of convolutional neural network layers that are configured to collectively process the subnetwork input for the first subnetwork to generate a second stack output; and a concatenation layer configured to concatenate the pass-through output, the average pooling output, the first stack output, and the second stack output to generate a first module output for the first module.
Type: Application
Filed: July 9, 2021
Publication date: October 28, 2021
Inventors: Vincent O. Vanhoucke, Christian Szegedy, Sergey Ioffe
-
Publication number: 20210224653
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using a neural network system that includes a batch normalization layer. One of the methods includes receiving a respective first layer output for each training example in the batch; computing a plurality of normalization statistics for the batch from the first layer outputs; normalizing each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch; generating a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and providing the batch normalization layer output as an input to the second neural network layer.
Type: Application
Filed: January 22, 2021
Publication date: July 22, 2021
Inventors: Sergey Ioffe, Corinna Cortes
-
Publication number: 20210216870
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using a neural network system that includes a batch normalization layer. One of the methods includes receiving a respective first layer output for each training example in the batch; computing a plurality of normalization statistics for the batch from the first layer outputs; normalizing each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch; generating a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and providing the batch normalization layer output as an input to the second neural network layer.
Type: Application
Filed: January 22, 2021
Publication date: July 15, 2021
Inventors: Sergey Ioffe, Corinna Cortes