Patents by Inventor Oleg Rybakov

Oleg Rybakov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240232546
    Abstract: The present disclosure relates to a streaming speech-to-speech conversion model, where an encoder runs in real time while a user is speaking, then after the speaking stops, a decoder generates output audio in real time. A streaming-based approach produces an acceptable delay with minimal loss in conversion quality when compared to other non-streaming server-based models. A hybrid model approach combines look-ahead in the encoder and a non-causal stacker with non-causal self-attention.
    Type: Application
    Filed: October 24, 2023
    Publication date: July 11, 2024
    Applicant: GOOGLE LLC
    Inventors: Oleg RYBAKOV, Fadi BIADSY
  • Publication number: 20240153514
    Abstract: Apparatus and methods related to enhancement of audio content are provided. An example method includes receiving, by a computing device and via a communications network interface, a compressed audio data frame, wherein the compressed audio data frame is received after transmission over a communications network. The method further includes decompressing the compressed audio data frame to extract an audio waveform. The method also includes predicting, by applying a neural network to the audio waveform, an enhanced version of the audio waveform, wherein the neural network has been trained on (i) a ground truth sample comprising unencoded audio waveforms prior to compression by an audio encoder, and (ii) a training dataset comprising decoded audio waveforms after compression of the unencoded audio waveforms by the audio encoder. The method additionally includes providing, by an audio output component of the computing device, the enhanced version of the audio waveform.
    Type: Application
    Filed: March 5, 2021
    Publication date: May 9, 2024
    Inventors: Omer Ahmed Siddig Osman, Dominik Roblek, Yunpeng Li, Marco Tagliasacchi, Oleg Rybakov, Victor Ungureanu, Eric Giguere
  • Publication number: 20240135117
    Abstract: The present disclosure relates to a streaming speech-to-speech conversion model, where an encoder runs in real time while a user is speaking, then after the speaking stops, a decoder generates output audio in real time. A streaming-based approach produces an acceptable delay with minimal loss in conversion quality when compared to other non-streaming server-based models. A hybrid model approach combines look-ahead in the encoder and a non-causal stacker with non-causal self-attention.
    Type: Application
    Filed: October 23, 2023
    Publication date: April 25, 2024
    Applicant: GOOGLE LLC
    Inventors: Oleg RYBAKOV, Fadi BIADSY
  • Patent number: 11853391
    Abstract: Exemplary embodiments provide distributed parallel training of a machine learning model. Multiple processors may be used to train a machine learning model to reduce training time. To synchronize trained model data between the processors, data is communicated between the processors after some number of training cycles. To improve the communication efficiency, exemplary embodiments synchronize data among a set of processors after a predetermined number of training cycles, and synchronize data between one or more processors of each set of the processors after a predetermined number of training cycles. During the first synchronization among a set of processors, compressed model gradient data generated after performing the training cycles may be communicated. During the second synchronization between the set of processors, trained models or full model gradient data generated after performing the training cycles may be communicated.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: December 26, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Pranav Prashant Ladkat, Oleg Rybakov, Nikko Strom, Sri Venkata Surya Siva Rama Krishna Garimella, Sree Hari Krishnan Parthasarathi
  • Publication number: 20230395061
    Abstract: A method for turn detection in a speech-to-speech model includes receiving, as input to the speech-to-speech (S2S) model, a sequence of acoustic frames corresponding to an utterance. The method further includes, at each of a plurality of output steps, generating, by an audio encoder of the S2S model, a higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames, and determining, by a turn detector of the S2S model, based on the higher order feature representation generated by the audio encoder at the corresponding output step, whether the utterance is at a breakpoint at the corresponding output step. When the turn detector determines that the utterance is at the breakpoint, the method includes synthesizing a sequence of output audio frames output by a speech decoder of the S2S model into a time-domain audio waveform of synthesized speech representing the utterance spoken by the user.
    Type: Application
    Filed: May 17, 2023
    Publication date: December 7, 2023
    Applicant: Google LLC
    Inventors: Fadi Biadsy, Oleg Rybakov
  • Publication number: 20230298574
    Abstract: A method for speech conversion includes obtaining a speech conversion model configured to convert input utterances of human speech directly into corresponding output utterances of synthesized speech. The method further includes receiving a speech conversion request including input audio data corresponding to an utterance spoken by a target speaker associated with atypical speech and a speaker identifier uniquely identifying the target speaker. The method includes activating, using the speaker identifier, a particular sub-model for biasing the speech conversion model to recognize a type of the atypical speech associated with the target speaker identified by the speaker identifier.
    Type: Application
    Filed: March 15, 2023
    Publication date: September 21, 2023
    Applicant: Google LLC
    Inventors: Fadi Biadsy, Youzheng Chen, Xia Zhang, Oleg Rybakov, Andrew M. Rosenberg, Pedro J. Moreno Mengibar
  • Publication number: 20230298569
    Abstract: A method for training a model includes obtaining a plurality of training samples. Each respective training sample of the plurality of training samples includes a respective speech utterance and a respective textual utterance representing a transcription of the respective speech utterance. The method includes training, using quantization aware training with native integer operations, an automatic speech recognition (ASR) model on the plurality of training samples. The method also includes quantizing the trained ASR model to an integer target fixed-bit width. The quantized trained ASR model includes a plurality of weights. Each weight of the plurality of weights includes an integer with the target fixed-bit width. The method includes providing the quantized trained ASR model to a user device.
    Type: Application
    Filed: March 20, 2023
    Publication date: September 21, 2023
    Applicant: Google LLC
    Inventors: Shaojin Ding, Oleg Rybakov, Phoenix Meadowlark, Shivani Agrawal, Yanzhang He, Lukasz Lew
  • Publication number: 20230267949
    Abstract: A method includes receiving a current spectrogram frame and reconstructing a phase of the current spectrogram frame by, for each corresponding committed spectrogram frame in a sequence of M number of committed spectrogram frames preceding the current spectrogram frame, obtaining a value of a committed phase of the corresponding committed spectrogram frame and estimating the phase of the current spectrogram frame based on a magnitude of the current spectrogram frame and the value of the committed phase of each corresponding committed spectrogram frame in the sequence of M number of committed spectrogram frames preceding the current spectrogram frame. The method also includes synthesizing, for the current spectrogram frame, a new time-domain audio waveform frame based on the estimated phase of the current spectrogram frame.
    Type: Application
    Filed: February 2, 2023
    Publication date: August 24, 2023
    Applicant: Google LLC
    Inventors: Oleg Rybakov, Liyang Jiang, Fadi Biadsy
  • Publication number: 20230060395
    Abstract: Real-time evaluation and enhancement of image quality prior to capturing an image of a document on a mobile device is provided. An image capture process is initiated on a mobile device during which a user of the mobile device prepares to capture the image of the document, utilizing hardware and software on the mobile device to measure and achieve optimal parameters for image capture. Feedback may be provided to a user of the mobile device to instruct the user on how to manually optimize certain parameters relating to image quality, such as the angle, motion and distance of the mobile device from the document. When the optimal parameters for image capture of the document are achieved, at least one image of the document is automatically captured by the mobile device.
    Type: Application
    Filed: November 9, 2022
    Publication date: March 2, 2023
    Inventors: John J. ROACH, Grigori NEPOMNIACHTCHI, Robert COUCH, Oleg RYBAKOV, Michael GILLEN, Kevin Andrew BELL
  • Publication number: 20230013370
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing an input audio waveform using a generator neural network to generate an output audio waveform. In one aspect, a method comprises: receiving an input audio waveform; processing the input audio waveform using an encoder neural network to generate a set of feature vectors representing the input audio waveform; and processing the set of feature vectors representing the input audio waveform using a decoder neural network to generate an output audio waveform that comprises a respective output audio sample for each of a plurality of output time steps.
    Type: Application
    Filed: July 1, 2022
    Publication date: January 19, 2023
    Inventors: Yunpeng Li, Marco Tagliasacchi, Dominik Roblek, Félix de Chaumont Quitry, Beat Gfeller, Hannah Raphaelle Muckenhirn, Victor Ungureanu, Oleg Rybakov, Karolis Misiunas, Zalán Borsos
  • Patent number: 11539848
    Abstract: Real-time evaluation and enhancement of image quality prior to capturing an image of a document on a mobile device is provided. An image capture process is initiated on a mobile device during which a user of the mobile device prepares to capture the image of the document, utilizing hardware and software on the mobile device to measure and achieve optimal parameters for image capture. Feedback may be provided to a user of the mobile device to instruct the user on how to manually optimize certain parameters relating to image quality, such as the angle, motion and distance of the mobile device from the document. When the optimal parameters for image capture of the document are achieved, at least one image of the document is automatically captured by the mobile device.
    Type: Grant
    Filed: June 1, 2020
    Date of Patent: December 27, 2022
    Assignee: MITEK SYSTEMS, INC.
    Inventors: John J. Roach, Grigori Nepomniachtchi, Robert Couch, Oleg Rybakov, Michael Gillen, Kevin Andrew Bell
  • Patent number: 11423076
    Abstract: Various approaches discussed herein enable browsing groups of visually similar items to an item of interest, wherein the item of interest may be identified in a query image, for example. One or more visual attributes associated with the item of interest are identified, and the visually similar items matching at least one of the visual attributes are grouped together, wherein the group is ranked according to the visually similar items' overall visual similarity to the item of interest, for example by using a visual similarity score and/or metric.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: August 23, 2022
    Assignee: A9.COM, INC.
    Inventors: Rahul Bhotika, Lixin Duan, Oleg Rybakov, Jian Dong
  • Patent number: 10997500
    Abstract: The present disclosure is directed to generating neural network (NN) output using input data representing various types of events, such as input representing a certain type of event and also an engagement metric that may be representative of a property of the event or representative of a related but different type of event. For example, the output values generated using the NN may be associated with the likelihood that certain future events will occur, given the occurrence of certain past or current events. The output can then be modified (e.g., re-ranked, adjusted, etc.) based on the occurrence of certain other past or current events.
    Type: Grant
    Filed: May 23, 2017
    Date of Patent: May 4, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: FNU Vishnu Narayanan, Oleg Rybakov, Siddharth Singh
  • Patent number: 10970629
    Abstract: The present disclosure is directed to reducing model size of a machine learning model with encoding. The input to a machine learning model may be encoded using a probabilistic data structure with a plurality of mapping functions into a lower dimensional space. Encoding the input to the machine learning model results in a compact machine learning model with a reduced model size. The compact machine learning model can output an encoded representation of a higher-dimensional space. Use of such a machine learning model can include decoding the output of the machine learning model into the higher dimensional space of the non-encoded input.
    Type: Grant
    Filed: February 24, 2017
    Date of Patent: April 6, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Leo Parker Dirac, Oleg Rybakov, Vijai Mohan
  • Patent number: 10896459
    Abstract: Some aspects of the present disclosure relate to generating and training a neural network by separating historical item interaction data into both inputs and outputs. This may be done, for example, based on date. For example, a neural network machine learning technique may be used to generate a prediction model using a set of inputs that includes both a number of items purchased by a number of users before a certain date as well as some or all attributes of those items, and a set of outputs that includes the items purchased after that date. The items purchased before that date and the associated attributes can be subjected to a time-decay function.
    Type: Grant
    Filed: April 7, 2020
    Date of Patent: January 19, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Rejith George Joseph, Oleg Rybakov
  • Patent number: 10824940
    Abstract: The present disclosure is directed to training, and providing recommendations via, a temporal ensemble of neural networks. The neural networks in the temporal ensemble can be trained at different times. For example, a neural network can be periodically trained using current item interaction data, for example once per day using purchase histories of users of an electronic commerce system. The item interaction data can be split into a more recent group and a less recent group, for example the last two weeks of data and the remainder of the last two years of data. The periodic training of neural networks, using updated data and the sliding windows created by the date split, results in a number of different models for predicting item interaction events. Using a collection of these neural networks together as a temporal ensemble can increase recommendation accuracy without requiring additional hardware for training.
    Type: Grant
    Filed: November 30, 2016
    Date of Patent: November 3, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Oleg Rybakov, Siddharth Singh
  • Publication number: 20200304650
    Abstract: Real-time evaluation and enhancement of image quality prior to capturing an image of a document on a mobile device is provided. An image capture process is initiated on a mobile device during which a user of the mobile device prepares to capture the image of the document, utilizing hardware and software on the mobile device to measure and achieve optimal parameters for image capture. Feedback may be provided to a user of the mobile device to instruct the user on how to manually optimize certain parameters relating to image quality, such as the angle, motion and distance of the mobile device from the document. When the optimal parameters for image capture of the document are achieved, at least one image of the document is automatically captured by the mobile device.
    Type: Application
    Filed: June 1, 2020
    Publication date: September 24, 2020
    Inventors: John J. ROACH, Grigori NEPOMNIACHTCHI, Robert COUCH, Oleg RYBAKOV, Michael GILLEN, Kevin Andrew BELL
  • Patent number: 10650432
    Abstract: Some aspects of the present disclosure relate to generating and training a neural network by separating historical item interaction data into both inputs and outputs. This may be done, for example, based on date. For example, a neural network machine learning technique may be used to generate a prediction model using a set of inputs that includes both a number of items purchased by a number of users before a certain date as well as some or all attributes of those items, and a set of outputs that includes the items purchased after that date. The items purchased before that date and the associated attributes can be subjected to a time-decay function.
    Type: Grant
    Filed: November 28, 2016
    Date of Patent: May 12, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Rejith George Joseph, Oleg Rybakov
  • Patent number: 10635973
    Abstract: Techniques described herein are directed to improved artificial neural network machine learning techniques that may be employed with a recommendation system to provide predictions with improved accuracy. In some embodiments, item consumption events may be identified for a plurality of users. From these item consumption events, a set of inputs and a set of outputs may be generated according to a data split. In some embodiments, the set of outputs (and potentially the set of inputs) may include item consumption events that are weighted according to a time-decay function. Once a set of inputs and a set of outputs are identified, they may be used to train a prediction model using an artificial neural network. The prediction model may then be used to identify predictions for a specific user based on user-specific item consumption event data.
    Type: Grant
    Filed: June 28, 2016
    Date of Patent: April 28, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Leo Parker Dirac, Rejith George Joseph, Vijai Mohan, Oleg Rybakov
  • Patent number: 10380461
    Abstract: Approaches introduce a pre-processing and post-processing framework to a neural network-based approach to identify items represented in an image. For example, a classifier that is trained on several categories can be provided. An image that includes a representation of an item of interest is obtained. Rotated versions of the image are generated and each of a subset of the rotated images is analyzed to determine a probability that a respective image includes an instance of a particular category. The probabilities can be used to determine a probability distribution of output category data, and the data can be analyzed to select an image of the rotated versions of the image. Thereafter, a categorization tree can be utilized, whereby for the item of interest represented in the image, the category of the item can be determined. The determined category can be provided to an item retrieval algorithm to determine primary content for the item of interest.
    Type: Grant
    Filed: October 20, 2017
    Date of Patent: August 13, 2019
    Assignee: A9.COM, INC.
    Inventors: Avinash Aghoram Ravichandran, Matias Omar Gregorio Benitez, Rahul Bhotika, Scott Daniel Helmer, Anshul Kumar Jain, Junxiong Jia, Rakesh Madhavan Nambiar, Oleg Rybakov