Patents by Inventor Luis Carlos Cobo Rus

Luis Carlos Cobo Rus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11915682
    Abstract: Techniques are disclosed that enable generation of an audio waveform representing synthesized speech based on a difference signal determined using an autoregressive model. Various implementations include using a distribution of the difference signal values to represent sounds frequently found in human speech with a higher level of granularity than sounds not frequently found in human speech. Additional or alternative implementations include using one or more speakers of a client device to render the generated audio waveform. (An illustrative code sketch of this approach appears after the listing.)
    Type: Grant
    Filed: May 20, 2019
    Date of Patent: February 27, 2024
    Assignee: DeepMind Technologies Limited
    Inventors: Luis Carlos Cobo Rus, Nal Kalchbrenner, Erich Elsen, Chenjie Gu
  • Publication number: 20230395069
    Abstract: Speaker diarization techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of the audio data that isolates utterance(s) of a single human speaker by generating a speaker embedding for that speaker and processing the audio data using a trained generative model, with the speaker embedding used to determine activations for hidden layers of the trained generative model during the processing. The output generated by the trained generative model based on the processing is the refined version of the audio data. (An illustrative code sketch of this approach appears after the listing.)
    Type: Application
    Filed: August 21, 2023
    Publication date: December 7, 2023
    Inventors: Ignacio Lopez Moreno, Luis Carlos Cobo Rus
  • Publication number: 20230308542
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status. (An illustrative code sketch of this flow appears after the listing.)
    Type: Application
    Filed: May 19, 2023
    Publication date: September 28, 2023
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Patent number: 11735176
    Abstract: Speaker diarization techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of the audio data that isolates utterance(s) of a single human speaker by generating a speaker embedding for that speaker and processing the audio data using a trained generative model, with the speaker embedding used to determine activations for hidden layers of the trained generative model during the processing. The output generated by the trained generative model based on the processing is the refined version of the audio data.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: August 22, 2023
    Assignee: Google LLC
    Inventors: Ignacio Lopez Moreno, Luis Carlos Cobo Rus
  • Patent number: 11677871
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Grant
    Filed: May 12, 2022
    Date of Patent: June 13, 2023
    Assignee: Google LLC
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Publication number: 20220272191
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Application
    Filed: May 12, 2022
    Publication date: August 25, 2022
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Publication number: 20220254330
    Abstract: Techniques are disclosed that enable generation of an audio waveform representing synthesized speech based on a difference signal determined using an autoregressive model. Various implementations include using a distribution of the difference signal values to represent sounds frequently found in human speech with a higher level of granularity than sounds not frequently found in human speech. Additional or alternative implementations include using one or more speakers of a client device to render the generated audio waveform.
    Type: Application
    Filed: May 20, 2019
    Publication date: August 11, 2022
    Inventors: Luis Carlos Cobo Rus, Nal Kalchbrenner, Erich Elsen, Chenjie Gu
  • Patent number: 11336767
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Grant
    Filed: December 14, 2020
    Date of Patent: May 17, 2022
    Assignee: Google LLC
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Publication number: 20210217411
    Abstract: Speaker diarization techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of the audio data that isolates utterance(s) of a single human speaker by generating a speaker embedding for that speaker and processing the audio data using a trained generative model, with the speaker embedding used to determine activations for hidden layers of the trained generative model during the processing. The output generated by the trained generative model based on the processing is the refined version of the audio data.
    Type: Application
    Filed: March 29, 2021
    Publication date: July 15, 2021
    Inventors: Ignacio Lopez Moreno, Luis Carlos Cobo Rus
  • Patent number: 10978059
    Abstract: Speaker diarization techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of the audio data that isolates utterance(s) of a single human speaker by generating a speaker embedding for that speaker and processing the audio data using a trained generative model, with the speaker embedding used to determine activations for hidden layers of the trained generative model during the processing. The output generated by the trained generative model based on the processing is the refined version of the audio data.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: April 13, 2021
    Assignee: Google LLC
    Inventors: Ignacio Lopez Moreno, Luis Carlos Cobo Rus
  • Publication number: 20210099575
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Application
    Filed: December 14, 2020
    Publication date: April 1, 2021
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Publication number: 20210089909
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output audio examples using a generative neural network. One of the methods includes obtaining a training conditioning text input; processing a training generative input that includes the training conditioning text input, using a feedforward generative neural network, to generate a training audio output; processing the training audio output using each of a plurality of discriminators, where the plurality of discriminators includes one or more conditional discriminators and one or more unconditional discriminators; determining a first combined prediction by combining the respective predictions of the plurality of discriminators; and determining an update to current values of a plurality of generative parameters of the feedforward generative neural network to increase a first error in the first combined prediction. (An illustrative code sketch of this training scheme appears after the listing.)
    Type: Application
    Filed: September 25, 2020
    Publication date: March 25, 2021
    Inventors: Mikolaj Binkowski, Karen Simonyan, Jeffrey Donahue, Aidan Clark, Sander Etienne Lea Dieleman, Erich Konrad Elsen, Luis Carlos Cobo Rus, Norman Casagrande
  • Publication number: 20210073638
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a machine learning model that has been trained through reinforcement learning to select a content item. One of the methods includes receiving first data characterizing a first context in which a first content item may be presented to a first user in a presentation environment, and providing the first data as input to a long-term engagement machine learning model. The model has been trained through reinforcement learning to receive a plurality of inputs and to process each input to generate a respective engagement score that represents a predicted, time-adjusted total number of selections by the respective user of future content items presented to that user in the presentation environment if the respective content item is presented in the respective context. (An illustrative code sketch of this selection approach appears after the listing.)
    Type: Application
    Filed: November 16, 2020
    Publication date: March 11, 2021
    Inventors: Benjamin Kenneth Coppin, Mustafa Suleyman, Thomas Chadwick Walters, Timothy Mann, Chia-Yueh Carlton Chu, Martin Szummer, Luis Carlos Cobo Rus, Jean-Francois Crespo
  • Patent number: 10897535
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Grant
    Filed: June 28, 2018
    Date of Patent: January 19, 2021
    Assignee: Google LLC
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Patent number: 10839310
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a machine learning model that has been trained through reinforcement learning to select a content item. One of the methods includes receiving first data characterizing a first context in which a first content item may be presented to a first user in a presentation environment, and providing the first data as input to a long-term engagement machine learning model. The model has been trained through reinforcement learning to receive a plurality of inputs and to process each input to generate a respective engagement score that represents a predicted, time-adjusted total number of selections by the respective user of future content items presented to that user in the presentation environment if the respective content item is presented in the respective context.
    Type: Grant
    Filed: July 15, 2016
    Date of Patent: November 17, 2020
    Assignee: Google LLC
    Inventors: Benjamin Kenneth Coppin, Mustafa Suleyman, Thomas Chadwick Walters, Timothy Mann, Chia-Yueh Carlton Chu, Martin Szummer, Luis Carlos Cobo Rus, Jean-Francois Crespo
  • Publication number: 20200342857
    Abstract: Speaker diarization techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of the audio data that isolates utterance(s) of a single human speaker by generating a speaker embedding for that speaker and processing the audio data using a trained generative model, with the speaker embedding used to determine activations for hidden layers of the trained generative model during the processing. The output generated by the trained generative model based on the processing is the refined version of the audio data.
    Type: Application
    Filed: September 25, 2018
    Publication date: October 29, 2020
    Inventors: Ignacio Lopez Moreno, Luis Carlos Cobo Rus
  • Publication number: 20200344351
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Application
    Filed: June 28, 2018
    Publication date: October 29, 2020
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Publication number: 20180018580
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a machine learning model that has been trained through reinforcement learning to select a content item. One of the methods includes receiving first data characterizing a first context in which a first content item may be presented to a first user in a presentation environment, and providing the first data as input to a long-term engagement machine learning model. The model has been trained through reinforcement learning to receive a plurality of inputs and to process each input to generate a respective engagement score that represents a predicted, time-adjusted total number of selections by the respective user of future content items presented to that user in the presentation environment if the respective content item is presented in the respective context.
    Type: Application
    Filed: July 15, 2016
    Publication date: January 18, 2018
    Inventors: Benjamin Kenneth Coppin, Mustafa Suleyman, Thomas Chadwick Walters, Timothy Mann, Chia-Yueh Carlton Chu, Martin Szummer, Luis Carlos Cobo Rus, Jean-Francois Crespo
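
The entries for patent 11915682 and publication 20220254330 describe synthesizing speech by autoregressively predicting a difference signal and representing common sample-to-sample differences with finer granularity than rare ones. The sketch below illustrates that idea only, under stated assumptions: the non-uniform bin layout, the `dummy_difference_model` placeholder, the 64-sample context, and all numeric constants are hypothetical stand-ins for the trained autoregressive model the patent contemplates.

```python
# Minimal, hypothetical sketch: autoregressive synthesis of a waveform from
# sampled difference values. Only the overall scheme (predict a distribution
# over differences, sample one, integrate it into the waveform) follows the
# abstract; every name and constant here is an illustrative assumption.
import numpy as np

def build_difference_bins(num_bins: int = 256, max_delta: float = 1.0) -> np.ndarray:
    """Non-uniform grid of difference values: dense near zero (small changes
    dominate speech), sparse toward the extremes."""
    u = np.linspace(-1.0, 1.0, num_bins)
    return max_delta * np.sign(u) * np.expm1(np.abs(u) * np.log(256.0)) / 255.0

def dummy_difference_model(context: np.ndarray, num_bins: int) -> np.ndarray:
    """Placeholder for the trained autoregressive model: returns a categorical
    distribution over difference bins given recent samples."""
    logits = -np.abs(np.arange(num_bins) - num_bins / 2) / 8.0 - 0.01 * context[-1]
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def synthesize(num_samples: int = 16000, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    bins = build_difference_bins()
    waveform = np.zeros(num_samples, dtype=np.float32)
    for t in range(1, num_samples):
        context = waveform[max(0, t - 64):t]               # short autoregressive context
        probs = dummy_difference_model(context, len(bins))
        delta = bins[rng.choice(len(bins), p=probs)]       # sample a difference value
        waveform[t] = np.clip(waveform[t - 1] + delta, -1.0, 1.0)  # integrate the difference
    return waveform

audio = synthesize()
print(audio.shape, float(audio.min()), float(audio.max()))
```

With random placeholder logits this produces noise rather than speech; the point is only the shape of the loop, in which each output sample is the previous sample plus a difference drawn from a model-predicted, non-uniformly quantized distribution.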
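
The speaker diarization entries (patents 10978059 and 11735176 and their related publications) describe generating a speaker embedding and using it to determine hidden-layer activations of a trained generative model that emits a refined, single-speaker version of the audio. The sketch below is a hypothetical illustration under stated assumptions: the layer sizes, the random "trained" weights, the additive conditioning, the soft mask, and the helper names (`speaker_embedding`, `refine`) are all invented for the example.

```python
# Hypothetical sketch of embedding-conditioned refinement: a speaker embedding
# shifts the hidden activations of a small network that maps mixed audio to a
# version isolating one speaker. Weights are random stand-ins for a trained model.
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, HIDDEN, FRAME = 64, 128, 256

W_in = rng.standard_normal((FRAME, HIDDEN)) * 0.05      # frame -> hidden layer
W_cond = rng.standard_normal((EMB_DIM, HIDDEN)) * 0.05  # projects the speaker embedding
W_out = rng.standard_normal((HIDDEN, FRAME)) * 0.05     # hidden layer -> per-sample mask

def speaker_embedding(enrollment_audio: np.ndarray) -> np.ndarray:
    """Placeholder speaker encoder: summarizes enrollment audio as a fixed vector."""
    usable = len(enrollment_audio) // EMB_DIM * EMB_DIM
    return enrollment_audio[:usable].reshape(-1, EMB_DIM).mean(axis=0)

def refine(mixed_audio: np.ndarray, embedding: np.ndarray) -> np.ndarray:
    """Process audio frame by frame; the embedding enters the hidden layer."""
    cond = embedding @ W_cond                             # conditioning term
    out = []
    for start in range(0, len(mixed_audio) - FRAME + 1, FRAME):
        frame = mixed_audio[start:start + FRAME]
        hidden = np.tanh(frame @ W_in + cond)             # embedding shapes the activations
        mask = 1.0 / (1.0 + np.exp(-(hidden @ W_out)))    # per-sample soft mask
        out.append(frame * mask)                          # keep the target speaker's energy
    return np.concatenate(out) if out else np.zeros(0)

enrollment = rng.standard_normal(4096)   # audio assumed to contain only the target speaker
mixture = rng.standard_normal(16384)     # audio with overlapping speakers
refined = refine(mixture, speaker_embedding(enrollment))
print(refined.shape)
```

With random weights the mask is meaningless; in the patented approach the generative model is trained so that the embedding-conditioned output isolates the target speaker's utterances.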
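
The hold-monitoring entries (patents 10897535, 11336767, and 11677871 and their related publications) describe a control flow: watch the incoming audio for a candidate end of the hold, inject a response solicitation signal into the outgoing audio, and confirm the end of the hold from any response before notifying the caller. The sketch below models only that flow as a small state machine; the chunk labels, the `HoldMonitor` class, and the solicitation text are assumptions standing in for real audio classification and telephony integration.

```python
# Hypothetical state-machine sketch of the hold-monitoring flow described in the
# abstract: candidate end -> inject solicitation -> confirm via response -> notify.
from enum import Enum, auto

class HoldState(Enum):
    ON_HOLD = auto()
    CANDIDATE_END = auto()   # something other than hold music or silence was heard
    CONFIRMED_END = auto()

class HoldMonitor:
    SOLICITATION = "Hello, are you there?"   # injected into the outgoing audio stream

    def __init__(self):
        self.state = HoldState.ON_HOLD
        self.outgoing = []   # audio injected toward the far end of the call

    def process_chunk(self, chunk_label: str) -> HoldState:
        """`chunk_label` stands in for a classifier over incoming audio:
        'hold_music', 'silence', 'speech', or 'responsive_speech'."""
        if self.state is HoldState.ON_HOLD and chunk_label == "speech":
            self.state = HoldState.CANDIDATE_END
            self.outgoing.append(self.SOLICITATION)        # solicit a response
        elif self.state is HoldState.CANDIDATE_END:
            if chunk_label == "responsive_speech":         # a human answered the prompt
                self.state = HoldState.CONFIRMED_END
                self.notify_caller()
            elif chunk_label in ("hold_music", "silence"):
                self.state = HoldState.ON_HOLD             # false alarm; still on hold
        return self.state

    def notify_caller(self):
        print("Rendering user interface output: the call is no longer on hold.")

monitor = HoldMonitor()
for label in ["hold_music", "hold_music", "speech", "responsive_speech"]:
    monitor.process_chunk(label)
```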
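
Publication 20210089909 describes training a feedforward generative network for text-conditioned audio against several discriminators, some conditional (they also see the text) and some unconditional (audio only), combining their predictions and updating the generator to increase the error of the combined prediction. The sketch below keeps only that structure; the tiny fully connected networks, the hinge-style losses, the averaging of discriminator scores, the dimensions, and the learning rates are assumptions, and PyTorch is used purely for convenience.

```python
# Hypothetical sketch of adversarial training with one conditional and one
# unconditional discriminator whose predictions are combined. All architectures
# and hyperparameters are illustrative assumptions.
import torch
from torch import nn

TEXT_DIM, NOISE_DIM, AUDIO_LEN = 32, 16, 128

generator = nn.Sequential(nn.Linear(TEXT_DIM + NOISE_DIM, 256), nn.ReLU(), nn.Linear(256, AUDIO_LEN))
cond_disc = nn.Sequential(nn.Linear(AUDIO_LEN + TEXT_DIM, 128), nn.ReLU(), nn.Linear(128, 1))  # sees the text
uncond_disc = nn.Sequential(nn.Linear(AUDIO_LEN, 128), nn.ReLU(), nn.Linear(128, 1))           # audio only

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(list(cond_disc.parameters()) + list(uncond_disc.parameters()), lr=1e-4)

def combined_prediction(audio, text):
    """Combine the conditional and unconditional discriminator scores (here: average)."""
    return 0.5 * (cond_disc(torch.cat([audio, text], dim=-1)) + uncond_disc(audio))

def training_step(text, real_audio):
    noise = torch.randn(text.shape[0], NOISE_DIM)
    fake_audio = generator(torch.cat([text, noise], dim=-1))

    # Discriminator step: score real audio high and generated audio low (hinge loss).
    d_loss = (torch.relu(1.0 - combined_prediction(real_audio, text)).mean()
              + torch.relu(1.0 + combined_prediction(fake_audio.detach(), text)).mean())
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: update the generative parameters to increase the error of
    # the combined prediction on generated audio.
    g_loss = -combined_prediction(fake_audio, text).mean()
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return float(d_loss), float(g_loss)

text = torch.randn(8, TEXT_DIM)         # placeholder conditioning text embeddings
real_audio = torch.randn(8, AUDIO_LEN)  # placeholder ground-truth audio windows
print(training_step(text, real_audio))
```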
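
Patent 10839310 and publications 20180018580 and 20210073638 describe a model, trained through reinforcement learning, that maps a (context, candidate content item) input to an engagement score predicting a time-adjusted total number of future selections, with the scores used to select a content item. The sketch below is a toy illustration under stated assumptions: the linear scorer, the feature dimensions, and the one-step temporal-difference update are invented for the example and are not the patented training procedure.

```python
# Hypothetical sketch of long-term engagement scoring and selection. The linear
# scorer, the feature layout, and the TD-style update are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
CONTEXT_DIM, ITEM_DIM = 8, 4
GAMMA = 0.95   # discount factor implementing the "time-adjusted" total

def engagement_score(weights: np.ndarray, context: np.ndarray, item: np.ndarray) -> float:
    """Predicted discounted number of future selections if `item` is shown in `context`."""
    return float(weights @ np.concatenate([context, item]))

def select_item(weights: np.ndarray, context: np.ndarray, candidates: list) -> int:
    """Pick the candidate content item with the highest engagement score."""
    return int(np.argmax([engagement_score(weights, context, c) for c in candidates]))

def td_update(weights, context, item, was_selected, next_context, next_candidates, lr=0.01):
    """One-step temporal-difference update; reward is 1 if the user selected the item."""
    reward = 1.0 if was_selected else 0.0
    best_next = max(engagement_score(weights, next_context, c) for c in next_candidates)
    td_error = reward + GAMMA * best_next - engagement_score(weights, context, item)
    return weights + lr * td_error * np.concatenate([context, item])

weights = rng.standard_normal(CONTEXT_DIM + ITEM_DIM) * 0.1   # "trained" parameters (random here)
context = rng.standard_normal(CONTEXT_DIM)
candidates = [rng.standard_normal(ITEM_DIM) for _ in range(5)]
chosen = select_item(weights, context, candidates)
weights = td_update(weights, context, candidates[chosen], True,
                    rng.standard_normal(CONTEXT_DIM), candidates)
print("chosen candidate:", chosen)
```

The discount factor plays the role of the "time-adjusted" weighting in the abstract: selections further in the future contribute less to the predicted total.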