Patents by Inventor Luis Carlos Cobo Rus

Luis Carlos Cobo Rus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11915682
    Abstract: Techniques are disclosed that enable generation of an audio waveform representing synthesized speech based on a difference signal determined using an autoregressive model. Various implementations include using a distribution of the difference signal values to represent sounds frequently found in human speech with a higher level of granularity than sounds not frequently found in human speech. Additional or alternative implementations include using one or more speakers of a client device to render the generated audio waveform. (An illustrative code sketch of this approach appears after the listing.)
    Type: Grant
    Filed: May 20, 2019
    Date of Patent: February 27, 2024
    Assignee: DeepMind Technologies Limited
    Inventors: Luis Carlos Cobo Rus, Nal Kalchbrenner, Erich Elsen, Chenjie Gu
  • Publication number: 20230395069
    Abstract: Speaker diarization techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of the audio data that isolates utterance(s) of a single human speaker by generating a speaker embedding for that speaker and processing the audio data using a trained generative model, with the speaker embedding used to determine activations for hidden layers of the trained generative model during the processing. The output generated by the trained generative model based on the processing is the refined version of the audio data. (An illustrative code sketch of this approach appears after the listing.)
    Type: Application
    Filed: August 21, 2023
    Publication date: December 7, 2023
    Inventors: Ignacio Lopez Moreno, Luis Carlos Cobo Rus
  • Publication number: 20230308542
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status. (An illustrative code sketch of this flow appears after the listing.)
    Type: Application
    Filed: May 19, 2023
    Publication date: September 28, 2023
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Patent number: 11735176
    Abstract: Speaker diarization techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of the audio data that isolates utterance(s) of a single human speaker by generating a speaker embedding for that speaker and processing the audio data using a trained generative model, with the speaker embedding used to determine activations for hidden layers of the trained generative model during the processing. The output generated by the trained generative model based on the processing is the refined version of the audio data.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: August 22, 2023
    Assignee: Google LLC
    Inventors: Ignacio Lopez Moreno, Luis Carlos Cobo Rus
  • Patent number: 11677871
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Grant
    Filed: May 12, 2022
    Date of Patent: June 13, 2023
    Assignee: Google LLC
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Publication number: 20220272191
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Application
    Filed: May 12, 2022
    Publication date: August 25, 2022
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Publication number: 20220254330
    Abstract: Techniques are disclosed that enable generation of an audio waveform representing synthesized speech based on a difference signal determined using an autoregressive model. Various implementations include using a distribution of the difference signal values to represent sounds frequently found in human speech with a higher level of granularity than sounds not frequently found in human speech. Additional or alternative implementations include using one or more speakers of a client device to render the generated audio waveform.
    Type: Application
    Filed: May 20, 2019
    Publication date: August 11, 2022
    Inventors: Luis Carlos Cobo Rus, Nal Kalchbrenner, Erich Elsen, Chenjie Gu
  • Patent number: 11336767
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Grant
    Filed: December 14, 2020
    Date of Patent: May 17, 2022
    Assignee: Google LLC
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Publication number: 20210217411
    Abstract: Speaker diarization techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of the audio data that isolates utterance(s) of a single human speaker by generating a speaker embedding for that speaker and processing the audio data using a trained generative model, with the speaker embedding used to determine activations for hidden layers of the trained generative model during the processing. The output generated by the trained generative model based on the processing is the refined version of the audio data.
    Type: Application
    Filed: March 29, 2021
    Publication date: July 15, 2021
    Inventors: Ignacio Lopez Moreno, Luis Carlos Cobo Rus
  • Patent number: 10978059
    Abstract: Speaker diarization techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of the audio data that isolates utterance(s) of a single human speaker by generating a speaker embedding for that speaker and processing the audio data using a trained generative model, with the speaker embedding used to determine activations for hidden layers of the trained generative model during the processing. The output generated by the trained generative model based on the processing is the refined version of the audio data.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: April 13, 2021
    Assignee: Google LLC
    Inventors: Ignacio Lopez Moreno, Luis Carlos Cobo Rus
  • Publication number: 20210099575
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Application
    Filed: December 14, 2020
    Publication date: April 1, 2021
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Publication number: 20210089909
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output audio examples using a generative neural network. One of the methods includes obtaining a training conditioning text input; processing a training generative input that includes the training conditioning text input, using a feedforward generative neural network, to generate a training audio output; processing the training audio output using each of a plurality of discriminators, where the plurality of discriminators includes one or more conditional discriminators and one or more unconditional discriminators; determining a first combined prediction by combining the respective predictions of the plurality of discriminators; and determining an update to current values of a plurality of generative parameters of the feedforward generative neural network to increase a first error in the first combined prediction. (An illustrative code sketch of this training scheme appears after the listing.)
    Type: Application
    Filed: September 25, 2020
    Publication date: March 25, 2021
    Inventors: Mikolaj Binkowski, Karen Simonyan, Jeffrey Donahue, Aidan Clark, Sander Etienne Lea Dieleman, Erich Konrad Elsen, Luis Carlos Cobo Rus, Norman Casagrande
  • Publication number: 20210073638
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a machine learning model that has been trained through reinforcement learning to select a content item. One of the methods includes receiving first data characterizing a first context in which a first content item may be presented to a first user in a presentation environment, and providing the first data as input to a long-term engagement machine learning model. The model has been trained through reinforcement learning to receive a plurality of inputs and to process each input to generate a respective engagement score that represents a predicted, time-adjusted total number of selections by the respective user of future content items presented to that user in the presentation environment if the respective content item is presented in the respective context. (An illustrative code sketch of this selection approach appears after the listing.)
    Type: Application
    Filed: November 16, 2020
    Publication date: March 11, 2021
    Inventors: Benjamin Kenneth Coppin, Mustafa Suleyman, Thomas Chadwick Walters, Timothy Mann, Chia-Yueh Carlton Chu, Martin Szummer, Luis Carlos Cobo Rus, Jean-Francois Crespo
  • Patent number: 10897535
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Grant
    Filed: June 28, 2018
    Date of Patent: January 19, 2021
    Assignee: Google LLC
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Patent number: 10839310
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a machine learning model that has been trained through reinforcement learning to select a content item. One of the methods includes receiving first data characterizing a first context in which a first content item may be presented to a first user in a presentation environment, and providing the first data as input to a long-term engagement machine learning model. The model has been trained through reinforcement learning to receive a plurality of inputs and to process each input to generate a respective engagement score that represents a predicted, time-adjusted total number of selections by the respective user of future content items presented to that user in the presentation environment if the respective content item is presented in the respective context.
    Type: Grant
    Filed: July 15, 2016
    Date of Patent: November 17, 2020
    Assignee: Google LLC
    Inventors: Benjamin Kenneth Coppin, Mustafa Suleyman, Thomas Chadwick Walters, Timothy Mann, Chia-Yueh Carlton Chu, Martin Szummer, Luis Carlos Cobo Rus, Jean-Francois Crespo
  • Publication number: 20200342857
    Abstract: Speaker diarization techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of the audio data that isolates utterance(s) of a single human speaker by generating a speaker embedding for that speaker and processing the audio data using a trained generative model, with the speaker embedding used to determine activations for hidden layers of the trained generative model during the processing. The output generated by the trained generative model based on the processing is the refined version of the audio data.
    Type: Application
    Filed: September 25, 2018
    Publication date: October 29, 2020
    Inventors: Ignacio Lopez Moreno, Luis Carlos Cobo Rus
  • Publication number: 20200344351
    Abstract: Techniques are disclosed for automated monitoring of a voice communication session, while the session is in an on hold status, to determine when the session is no longer in the on hold status. When it is determined that the session is no longer on hold, user interface output is rendered that is perceptible to the calling user who initiated the session and that indicates that the on hold status of the session has ceased. In some implementations, an audio stream of the session can be monitored to determine, based on processing of the audio stream, a candidate end of the on hold status. In response, a response solicitation signal is injected into an outgoing portion of the audio stream. The audio stream can be further monitored for a response (if any) to the response solicitation signal, and the response (if any) can be processed to determine whether the candidate end of the on hold status is an actual end of the on hold status.
    Type: Application
    Filed: June 28, 2018
    Publication date: October 29, 2020
    Inventors: Cassandra Xia, Luis Carlos Cobo Rus
  • Publication number: 20180018580
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a machine learning model that has been trained through reinforcement learning to select a content item. One of the methods includes receiving first data characterizing a first context in which a first content item may be presented to a first user in a presentation environment, and providing the first data as input to a long-term engagement machine learning model. The model has been trained through reinforcement learning to receive a plurality of inputs and to process each input to generate a respective engagement score that represents a predicted, time-adjusted total number of selections by the respective user of future content items presented to that user in the presentation environment if the respective content item is presented in the respective context.
    Type: Application
    Filed: July 15, 2016
    Publication date: January 18, 2018
    Inventors: Benjamin Kenneth Coppin, Mustafa Suleyman, Thomas Chadwick Walters, Timothy Mann, Chia-Yueh Carlton Chu, Martin Szummer, Luis Carlos Cobo Rus, Jean-Francois Crespo
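
The entries for patent 11915682 and publication 20220254330 describe synthesizing speech by autoregressively predicting a difference signal and representing common sample-to-sample differences with finer granularity than rare ones. The sketch below illustrates that idea only, under stated assumptions: the non-uniform bin layout, the `dummy_difference_model` placeholder, the 64-sample context, and all numeric constants are hypothetical stand-ins for the trained autoregressive model the patent contemplates.

```python
# Minimal, hypothetical sketch: autoregressive synthesis of a waveform from
# sampled difference values. Only the overall scheme (predict a distribution
# over differences, sample one, integrate it into the waveform) follows the
# abstract; every name and constant here is an illustrative assumption.
import numpy as np

def build_difference_bins(num_bins: int = 256, max_delta: float = 1.0) -> np.ndarray:
    """Non-uniform grid of difference values: dense near zero (small changes
    dominate speech), sparse toward the extremes."""
    u = np.linspace(-1.0, 1.0, num_bins)
    return max_delta * np.sign(u) * np.expm1(np.abs(u) * np.log(256.0)) / 255.0

def dummy_difference_model(context: np.ndarray, num_bins: int) -> np.ndarray:
    """Placeholder for the trained autoregressive model: returns a categorical
    distribution over difference bins given recent samples."""
    logits = -np.abs(np.arange(num_bins) - num_bins / 2) / 8.0 - 0.01 * context[-1]
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def synthesize(num_samples: int = 16000, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    bins = build_difference_bins()
    waveform = np.zeros(num_samples, dtype=np.float32)
    for t in range(1, num_samples):
        context = waveform[max(0, t - 64):t]               # short autoregressive context
        probs = dummy_difference_model(context, len(bins))
        delta = bins[rng.choice(len(bins), p=probs)]       # sample a difference value
        waveform[t] = np.clip(waveform[t - 1] + delta, -1.0, 1.0)  # integrate the difference
    return waveform

audio = synthesize()
print(audio.shape, float(audio.min()), float(audio.max()))
```

With random placeholder logits this produces noise rather than speech; the point is only the shape of the loop, in which each output sample is the previous sample plus a difference drawn from a model-predicted, non-uniformly quantized distribution.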
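
The speaker diarization entries (patents 10978059 and 11735176 and their related publications) describe generating a speaker embedding and using it to determine hidden-layer activations of a trained generative model that emits a refined, single-speaker version of the audio. The sketch below is a hypothetical illustration under stated assumptions: the layer sizes, the random "trained" weights, the additive conditioning, the soft mask, and the helper names (`speaker_embedding`, `refine`) are all invented for the example.

```python
# Hypothetical sketch of embedding-conditioned refinement: a speaker embedding
# shifts the hidden activations of a small network that maps mixed audio to a
# version isolating one speaker. Weights are random stand-ins for a trained model.
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, HIDDEN, FRAME = 64, 128, 256

W_in = rng.standard_normal((FRAME, HIDDEN)) * 0.05      # frame -> hidden layer
W_cond = rng.standard_normal((EMB_DIM, HIDDEN)) * 0.05  # projects the speaker embedding
W_out = rng.standard_normal((HIDDEN, FRAME)) * 0.05     # hidden layer -> per-sample mask

def speaker_embedding(enrollment_audio: np.ndarray) -> np.ndarray:
    """Placeholder speaker encoder: summarizes enrollment audio as a fixed vector."""
    usable = len(enrollment_audio) // EMB_DIM * EMB_DIM
    return enrollment_audio[:usable].reshape(-1, EMB_DIM).mean(axis=0)

def refine(mixed_audio: np.ndarray, embedding: np.ndarray) -> np.ndarray:
    """Process audio frame by frame; the embedding enters the hidden layer."""
    cond = embedding @ W_cond                             # conditioning term
    out = []
    for start in range(0, len(mixed_audio) - FRAME + 1, FRAME):
        frame = mixed_audio[start:start + FRAME]
        hidden = np.tanh(frame @ W_in + cond)             # embedding shapes the activations
        mask = 1.0 / (1.0 + np.exp(-(hidden @ W_out)))    # per-sample soft mask
        out.append(frame * mask)                          # keep the target speaker's energy
    return np.concatenate(out) if out else np.zeros(0)

enrollment = rng.standard_normal(4096)   # audio assumed to contain only the target speaker
mixture = rng.standard_normal(16384)     # audio with overlapping speakers
refined = refine(mixture, speaker_embedding(enrollment))
print(refined.shape)
```

With random weights the mask is meaningless; in the patented approach the generative model is trained so that the embedding-conditioned output isolates the target speaker's utterances.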
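
The hold-monitoring entries (patents 10897535, 11336767, and 11677871 and their related publications) describe a control flow: watch the incoming audio for a candidate end of the hold, inject a response solicitation signal into the outgoing audio, and confirm the end of the hold from any response before notifying the caller. The sketch below models only that flow as a small state machine; the chunk labels, the `HoldMonitor` class, and the solicitation text are assumptions standing in for real audio classification and telephony integration.

```python
# Hypothetical state-machine sketch of the hold-monitoring flow described in the
# abstract: candidate end -> inject solicitation -> confirm via response -> notify.
from enum import Enum, auto

class HoldState(Enum):
    ON_HOLD = auto()
    CANDIDATE_END = auto()   # something other than hold music or silence was heard
    CONFIRMED_END = auto()

class HoldMonitor:
    SOLICITATION = "Hello, are you there?"   # injected into the outgoing audio stream

    def __init__(self):
        self.state = HoldState.ON_HOLD
        self.outgoing = []   # audio injected toward the far end of the call

    def process_chunk(self, chunk_label: str) -> HoldState:
        """`chunk_label` stands in for a classifier over incoming audio:
        'hold_music', 'silence', 'speech', or 'responsive_speech'."""
        if self.state is HoldState.ON_HOLD and chunk_label == "speech":
            self.state = HoldState.CANDIDATE_END
            self.outgoing.append(self.SOLICITATION)        # solicit a response
        elif self.state is HoldState.CANDIDATE_END:
            if chunk_label == "responsive_speech":         # a human answered the prompt
                self.state = HoldState.CONFIRMED_END
                self.notify_caller()
            elif chunk_label in ("hold_music", "silence"):
                self.state = HoldState.ON_HOLD             # false alarm; still on hold
        return self.state

    def notify_caller(self):
        print("Rendering user interface output: the call is no longer on hold.")

monitor = HoldMonitor()
for label in ["hold_music", "hold_music", "speech", "responsive_speech"]:
    monitor.process_chunk(label)
```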
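
Publication 20210089909 describes training a feedforward generative network for text-conditioned audio against several discriminators, some conditional (they also see the text) and some unconditional (audio only), combining their predictions and updating the generator to increase the error of the combined prediction. The sketch below keeps only that structure; the tiny fully connected networks, the hinge-style losses, the averaging of discriminator scores, the dimensions, and the learning rates are assumptions, and PyTorch is used purely for convenience.

```python
# Hypothetical sketch of adversarial training with one conditional and one
# unconditional discriminator whose predictions are combined. All architectures
# and hyperparameters are illustrative assumptions.
import torch
from torch import nn

TEXT_DIM, NOISE_DIM, AUDIO_LEN = 32, 16, 128

generator = nn.Sequential(nn.Linear(TEXT_DIM + NOISE_DIM, 256), nn.ReLU(), nn.Linear(256, AUDIO_LEN))
cond_disc = nn.Sequential(nn.Linear(AUDIO_LEN + TEXT_DIM, 128), nn.ReLU(), nn.Linear(128, 1))  # sees the text
uncond_disc = nn.Sequential(nn.Linear(AUDIO_LEN, 128), nn.ReLU(), nn.Linear(128, 1))           # audio only

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(list(cond_disc.parameters()) + list(uncond_disc.parameters()), lr=1e-4)

def combined_prediction(audio, text):
    """Combine the conditional and unconditional discriminator scores (here: average)."""
    return 0.5 * (cond_disc(torch.cat([audio, text], dim=-1)) + uncond_disc(audio))

def training_step(text, real_audio):
    noise = torch.randn(text.shape[0], NOISE_DIM)
    fake_audio = generator(torch.cat([text, noise], dim=-1))

    # Discriminator step: score real audio high and generated audio low (hinge loss).
    d_loss = (torch.relu(1.0 - combined_prediction(real_audio, text)).mean()
              + torch.relu(1.0 + combined_prediction(fake_audio.detach(), text)).mean())
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: update the generative parameters to increase the error of
    # the combined prediction on generated audio.
    g_loss = -combined_prediction(fake_audio, text).mean()
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return float(d_loss), float(g_loss)

text = torch.randn(8, TEXT_DIM)         # placeholder conditioning text embeddings
real_audio = torch.randn(8, AUDIO_LEN)  # placeholder ground-truth audio windows
print(training_step(text, real_audio))
```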
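
Patent 10839310 and publications 20180018580 and 20210073638 describe a model, trained through reinforcement learning, that maps a (context, candidate content item) input to an engagement score predicting a time-adjusted total number of future selections, with the scores used to select a content item. The sketch below is a toy illustration under stated assumptions: the linear scorer, the feature dimensions, and the one-step temporal-difference update are invented for the example and are not the patented training procedure.

```python
# Hypothetical sketch of long-term engagement scoring and selection. The linear
# scorer, the feature layout, and the TD-style update are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
CONTEXT_DIM, ITEM_DIM = 8, 4
GAMMA = 0.95   # discount factor implementing the "time-adjusted" total

def engagement_score(weights: np.ndarray, context: np.ndarray, item: np.ndarray) -> float:
    """Predicted discounted number of future selections if `item` is shown in `context`."""
    return float(weights @ np.concatenate([context, item]))

def select_item(weights: np.ndarray, context: np.ndarray, candidates: list) -> int:
    """Pick the candidate content item with the highest engagement score."""
    return int(np.argmax([engagement_score(weights, context, c) for c in candidates]))

def td_update(weights, context, item, was_selected, next_context, next_candidates, lr=0.01):
    """One-step temporal-difference update; reward is 1 if the user selected the item."""
    reward = 1.0 if was_selected else 0.0
    best_next = max(engagement_score(weights, next_context, c) for c in next_candidates)
    td_error = reward + GAMMA * best_next - engagement_score(weights, context, item)
    return weights + lr * td_error * np.concatenate([context, item])

weights = rng.standard_normal(CONTEXT_DIM + ITEM_DIM) * 0.1   # "trained" parameters (random here)
context = rng.standard_normal(CONTEXT_DIM)
candidates = [rng.standard_normal(ITEM_DIM) for _ in range(5)]
chosen = select_item(weights, context, candidates)
weights = td_update(weights, context, candidates[chosen], True,
                    rng.standard_normal(CONTEXT_DIM), candidates)
print("chosen candidate:", chosen)
```

The discount factor plays the role of the "time-adjusted" weighting in the abstract: selections further in the future contribute less to the predicted total.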