Patents by Inventor Oleg Rybakov

Oleg Rybakov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

METHOD FOR SPEECH-TO-SPEECH CONVERSION

Publication number: 20240135117

Abstract: The present disclosure relates to a streaming speech-to-speech conversion model, where an encoder runs in real time while a user is speaking, then after the speaking stops, a decoder generates output audio in real time. A streaming-based approach produces an acceptable delay with minimal loss in conversion quality when compared to other non-streaming server-based models. A hybrid model approach combines look-ahead in the encoder and a non-causal stacker with non-causal self-attention.

Type: Application

Filed: October 23, 2023

Publication date: April 25, 2024

Applicant: GOOGLE LLC

Inventors: Oleg RYBAKOV, Fadi BIADSY

Distributed model training

Patent number: 11853391

Abstract: Exemplary embodiments provide distributed parallel training of a machine learning model. Multiple processors may be used to train a machine learning model to reduce training time. To synchronize trained model data between the processors, data is communicated between the processors after some number of training cycles. To improve the communication efficiency, exemplary embodiments synchronize data among a set of processors after a predetermined number of training cycles, and synchronize data between one or more processors of each set of the processors after a predetermined number of training cycles. During the first synchronization among a set of processors, compressed model gradient data generated after performing the training cycles may be communicated. During the second synchronization between the set of processors, trained models or full model gradient data generated after performing the training cycles may be communicated.

Type: Grant

Filed: September 24, 2018

Date of Patent: December 26, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Pranav Prashant Ladkat, Oleg Rybakov, Nikko Strom, Sri Venkata Surya Siva Rama Krishna Garimella, Sree Hari Krishnan Parthasarathi

Streaming Speech-to-speech Model With Automatic Speaker Turn Detection

Publication number: 20230395061

Abstract: A method for turn detection in a speech-to-speech model includes receiving, as input to the speech-to-speech (S2S) model, a sequence of acoustic frames corresponding to an utterance. The method further includes, at each of a plurality of output steps, generating, by an audio encoder of the S2S model, a higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames, and determining, by a turn detector of the S2S model, based on the higher order feature representation generated by the audio encoder at the corresponding output step, whether the utterance is at a breakpoint at the corresponding output step. When the turn detector determines that the utterance is at the breakpoint, the method includes synthesizing a sequence of output audio frames output by a speech decoder of the S2S model into a time-domain audio waveform of synthesized speech representing the utterance spoken by the user.

Type: Application

Filed: May 17, 2023

Publication date: December 7, 2023

Applicant: Google LLC

Inventors: Fadi Biadsy, Oleg Rybakov

4-bit Conformer with Accurate Quantization Training for Speech Recognition

Publication number: 20230298569

Abstract: A method for training a model includes obtaining a plurality of training samples. Each respective training sample of the plurality of training samples includes a respective speech utterance and a respective textual utterance representing a transcription of the respective speech utterance. The method includes training, using quantization aware training with native integer operations, an automatic speech recognition (ASR) model on the plurality of training samples. The method also includes quantizing the trained ASR model to an integer target fixed-bit width. The quantized trained ASR model includes a plurality of weights. Each weight of the plurality of weights includes an integer with the target fixed-bit width. The method includes providing the quantized trained ASR model to a user device.

Type: Application

Filed: March 20, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Shaojin Ding, Oleg Rybakov, Phoenix Meadowlark, Shivani Agrawal, Yanzhang He, Lukasz Lew

Scalable Model Specialization Framework for Speech Model Personalization

Publication number: 20230298574

Abstract: A method for speech conversion includes obtaining a speech conversion model configured to convert input utterances of human speech directly into corresponding output utterances of synthesized speech. The method further includes receiving a speech conversion request including input audio data corresponding to an utterance spoken by a target speaker associated with atypical speech and a speaker identifier uniquely identifying the target speaker. The method includes activating, using the speaker identifier, a particular sub-model for biasing the speech conversion model to recognize a type of the atypical speech associated with the target speaker identified by the speaker identifier.

Type: Application

Filed: March 15, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Fadi Biadsy, Youzheng Chen, Xia Zhang, Oleg Rybakov, Andrew M. Rosenberg, Pedro J. Moreno Mengibar

Streaming Vocoder

Publication number: 20230267949

Abstract: A method includes receiving a current spectrogram frame and reconstructing a phase of the current spectrogram frame by, for each corresponding committed spectrogram frame in a sequence of M number of committed spectrogram frames preceding the current spectrogram frame, obtaining a value of a committed phase of the corresponding committed spectrogram frame and estimating the phase of the current spectrogram frame based on a magnitude of the current spectrogram frame and the value of the committed phase of each corresponding committed spectrogram frame in the sequence of M number of committed spectrogram frames preceding the current spectrogram frame. The method also includes synthesizing, for the current spectrogram frame, a new time-domain audio waveform frame based on the estimated phase of the current spectrogram frame.

Type: Application

Filed: February 2, 2023

Publication date: August 24, 2023

Applicant: Google LLC

Inventors: Oleg Rybakov, Liyang Jiang, Fadi Biadsy

SYSTEMS AND METHODS FOR AUTOMATIC IMAGE CAPTURE ON A MOBILE DEVICE

Publication number: 20230060395

Abstract: Real-time evaluation and enhancement of image quality prior to capturing an image of a document on a mobile device is provided. An image capture process is initiated on a mobile device during which a user of the mobile device prepares to capture the image of the document, utilizing hardware and software on the mobile device to measure and achieve optimal parameters for image capture. Feedback may be provided to a user of the mobile device to instruct the user on how to manually optimize certain parameters relating to image quality, such as the angle, motion and distance of the mobile device from the document. When the optimal parameters for image capture of the document are achieved, at least one image of the document is automatically captured by the mobile device.

Type: Application

Filed: November 9, 2022

Publication date: March 2, 2023

Inventors: John J. ROACH, Grigori NEPOMNIACHTCHI, Robert COUCH, Oleg RYBAKOV, Michael GILLEN, Kevin Andrew BELL

GENERATING AUDIO WAVEFORMS USING ENCODER AND DECODER NEURAL NETWORKS

Publication number: 20230013370

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing an input audio waveform using a generator neural network to generate an output audio waveform. In one aspect, a method comprises: receiving an input audio waveform; processing the input audio waveform using an encoder neural network to generate a set of feature vectors representing the input audio waveform; and processing the set of feature vectors representing the input audio waveform using a decoder neural network to generate an output audio waveform that comprises a respective output audio sample for each of a plurality of output time steps.

Type: Application

Filed: July 1, 2022

Publication date: January 19, 2023

Inventors: Yunpeng Li, Marco Tagliasacchi, Dominik Roblek, Félix de Chaumont Quitry, Beat Gfeller, Hannah Raphaelle Muckenhirn, Victor Ungureanu, Oleg Rybakov, Karolis Misiunas, Zalán Borsos

Systems and methods for automatic image capture on a mobile device

Patent number: 11539848

Abstract: Real-time evaluation and enhancement of image quality prior to capturing an image of a document on a mobile device is provided. An image capture process is initiated on a mobile device during which a user of the mobile device prepares to capture the image of the document, utilizing hardware and software on the mobile device to measure and achieve optimal parameters for image capture. Feedback may be provided to a user of the mobile device to instruct the user on how to manually optimize certain parameters relating to image quality, such as the angle, motion and distance of the mobile device from the document. When the optimal parameters for image capture of the document are achieved, at least one image of the document is automatically captured by the mobile device.

Type: Grant

Filed: June 1, 2020

Date of Patent: December 27, 2022

Assignee: MITEK SYSTEMS, INC.

Inventors: John J. Roach, Grigori Nepomniachtchi, Robert Couch, Oleg Rybakov, Michael Gillen, Kevin Andrew Bell

Image similarity-based group browsing

Patent number: 11423076

Abstract: Various approaches discussed herein enable browsing groups of visually similar items to an item of interest, wherein the item of interest may be identified in a query image, for example. One or more visual attributes associated with the item of interest are identified, and the visually similar items matching at least one of the visual attributes are grouped together, wherein the group is ranked according to the visually similar items' overall visual similarity to the item of interest, for example by using a visual similarity score and/or metric.

Type: Grant

Filed: April 8, 2019

Date of Patent: August 23, 2022

Assignee: A9.COM, INC.

Inventors: Rahul Bhotika, Lixin Duan, Oleg Rybakov, Jian Dong

Neural network with re-ranking using engagement metrics

Patent number: 10997500

Abstract: The present disclosure is directed to generating neural network (NN) output using input data representing various types of events, such as input representing a certain type of event and also an engagement metric that may be representative of a property of the event or representative of a related but different type of event. For example, the output values generated using the NN may be associated with the likelihood that certain future events will occur, given the occurrence of certain past or current events. The output can then be modified (e.g., re-ranked, adjusted, etc.) based on the occurrence of certain other past or current events.

Type: Grant

Filed: May 23, 2017

Date of Patent: May 4, 2021

Assignee: Amazon Technologies, Inc.

Inventors: FNU Vishnu Narayanan, Oleg Rybakov, Siddharth Singh

Encodings for reversible sparse dimensionality reduction

Patent number: 10970629

Abstract: The present disclosure is directed to reducing model size of a machine learning model with encoding. The input to a machine learning model may be encoded using a probabilistic data structure with a plurality of mapping functions into a lower dimensional space. Encoding the input to the machine learning model results in a compact machine learning model with a reduced model size. The compact machine learning model can output an encoded representation of a higher-dimensional space. Use of such a machine learning model can include decoding the output of the machine learning model into the higher dimensional space of the non-encoded input.

Type: Grant

Filed: February 24, 2017

Date of Patent: April 6, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Leo Parker Dirac, Oleg Rybakov, Vijai Mohan

Recommendation system using improved neural network

Patent number: 10896459

Abstract: Some aspects of the present disclosure relate to generating and training a neural network by separating historical item interaction data into both inputs and outputs. This may be done, for example, based on date. For example, a neural network machine learning technique may be used to generate a prediction model using a set of inputs that includes both a number of items purchased by a number of users before a certain date as well as some or all attributes of those items, and a set of outputs that includes the items purchased after that date. The items purchased before that date and the associated attributes can be subjected to a time-decay function.

Type: Grant

Filed: April 7, 2020

Date of Patent: January 19, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Rejith George Joseph, Oleg Rybakov

Temporal ensemble of machine learning models trained during different time intervals

Patent number: 10824940

Abstract: The present disclosure is directed to training, and providing recommendations via, a temporal ensemble of neural networks. The neural networks in the temporal ensemble can be trained at different times. For example, a neural network can be periodically trained using current item interaction data, for example once per day using purchase histories of users of an electronic commerce system. The item interaction data can be split into a more recent group and a less recent group, for example the last two weeks of data and the remainder of the last two years of data. The periodic training of neural networks, using updated data and the sliding windows created by the date split, results in a number of different models for predicting item interaction events. Using a collection of these neural networks together as a temporal ensemble can increase recommendation accuracy without requiring additional hardware for training.

Type: Grant

Filed: November 30, 2016

Date of Patent: November 3, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Oleg Rybakov, Siddharth Singh

SYSTEMS AND METHODS FOR AUTOMATIC IMAGE CAPTURE ON A MOBILE DEVICE

Publication number: 20200304650

Abstract: Real-time evaluation and enhancement of image quality prior to capturing an image of a document on a mobile device is provided. An image capture process is initiated on a mobile device during which a user of the mobile device prepares to capture the image of the document, utilizing hardware and software on the mobile device to measure and achieve optimal parameters for image capture. Feedback may be provided to a user of the mobile device to instruct the user on how to manually optimize certain parameters relating to image quality, such as the angle, motion and distance of the mobile device from the document. When the optimal parameters for image capture of the document are achieved, at least one image of the document is automatically captured by the mobile device.

Type: Application

Filed: June 1, 2020

Publication date: September 24, 2020

Inventors: John J. ROACH, Grigori NEPOMNIACHTCHI, Robert COUCH, Oleg RYBAKOV, Michael GILLEN, Kevin Andrew BELL

Recommendation system using improved neural network

Patent number: 10650432

Abstract: Some aspects of the present disclosure relate to generating and training a neural network by separating historical item interaction data into both inputs and outputs. This may be done, for example, based on date. For example, a neural network machine learning technique may be used to generate a prediction model using a set of inputs that includes both a number of items purchased by a number of users before a certain date as well as some or all attributes of those items, and a set of outputs that includes the items purchased after that date. The items purchased before that date and the associated attributes can be subjected to a time-decay function.

Type: Grant

Filed: November 28, 2016

Date of Patent: May 12, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Rejith George Joseph, Oleg Rybakov

Recommendation system using improved neural network

Patent number: 10635973

Abstract: Techniques described herein are directed to improved artificial neural network machine learning techniques that may be employed with a recommendation system to provide predictions with improved accuracy. In some embodiments, item consumption events may be identified for a plurality of users. From these item consumption events, a set of inputs and a set of outputs may be generated according to a data split. In some embodiments, the set of outputs (and potentially the set of inputs) may include item consumption events that are weighted according to a time-decay function. Once a set of inputs and a set of outputs are identified, they may be used to train a prediction model using an artificial neural network. The prediction model may then be used to identify predictions for a specific user based on user-specific item consumption event data.

Type: Grant

Filed: June 28, 2016

Date of Patent: April 28, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Leo Parker Dirac, Rejith George Joseph, Vijai Mohan, Oleg Rybakov

Object recognition

Patent number: 10380461

Abstract: Approaches introduce a pre-processing and post-processing framework to a neural network-based approach to identify items represented in an image. For example, a classifier that is trained on several categories can be provided. An image that includes a representation of an item of interest is obtained. Rotated versions of the image are generated and each of a subset of the rotated images is analyzed to determine a probability that a respective image includes an instance of a particular category. The probabilities can be used to determine a probability distribution of output category data, and the data can be analyzed to select an image of the rotated versions of the image. Thereafter, a categorization tree can be utilized, whereby for the item of interest represented in the image, the category of the item can be determined. The determined category can be provided to an item retrieval algorithm to determine primary content for the item of interest.

Type: Grant

Filed: October 20, 2017

Date of Patent: August 13, 2019

Assignee: A9.COM, INC.

Inventors: Avinash Aghoram Ravichandran, Matias Omar Gregorio Benitez, Rahul Bhotika, Scott Daniel Helmer, Anshul Kumar Jain, Junxiong Jia, Rakesh Madhavan Nambiar, Oleg Rybakov

IMAGE SIMILARITY-BASED GROUP BROWSING

Publication number: 20190236098

Abstract: Various approaches discussed herein enable browsing groups of visually similar items to an item of interest, wherein the item of interest may be identified in a query image, for example. One or more visual attributes associated with the item of interest are identified, and the visually similar items matching at least one of the visual attributes are grouped together, wherein the group is ranked according to the visually similar items' overall visual similarity to the item of interest, for example by using a visual similarity score and/or metric.

Type: Application

Filed: April 8, 2019

Publication date: August 1, 2019

Inventors: Rahul Bhotika, Lixin Duan, Oleg Rybakov, Jian Dong

Image similarity-based group browsing

Patent number: 10282431

Abstract: Various approaches discussed herein enable browsing groups of visually similar items to an item of interest, wherein the item of interest may be identified in a query image, for example. One or more visual attributes associated with the item of interest are identified, and the visually similar items matching at least one of the visual attributes are grouped together, wherein the group is ranked according to the visually similar items' overall visual similarity to the item of interest, for example by using a visual similarity score and/or metric.

Type: Grant

Filed: December 18, 2015

Date of Patent: May 7, 2019

Assignee: A9.COM, INC.

Inventors: Rahul Bhotika, Lixin Duan, Oleg Rybakov, Jian Dong
