Patents by Inventor Joel Shor

Joel Shor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method For Detecting And Classifying Coughs Or Other Non-Semantic Sounds Using Audio Feature Set Learned From Speech

Publication number: 20250182780

Abstract: A method of detecting a cough in an audio stream includes a step of performing one or more pre-processing steps on the audio stream to generate an input audio sequence comprising a plurality of time-separated audio segments. An embedding is generated by a self-supervised triplet loss embedding model for each of the segments of the input audio sequence using an audio feature set, the embedding model having been trained to learn the audio feature set in a self-supervised triplet loss manner from a plurality of speech audio clips from a speech dataset. The embedding for each of the segments is provided to a model performing cough detection inference. This model generates a probability that each of the segments of the input audio sequence includes a cough episode. The method includes generating cough metrics for each of the cough episodes detected in the input audio sequence.
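The post-inference stage described in this abstract (per-segment probabilities turned into cough episodes and metrics) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not taken from the patent: the helper names `segment`, `cough_episodes`, and `cough_metrics`, the 0.5 threshold, and the rule that consecutive above-threshold segments form one episode. The embedding model and the cough-detection model are omitted; their output is represented by a plain list of probabilities.

```python
from typing import Dict, List, Tuple


def segment(samples: List[float], seg_len: int, hop: int) -> List[List[float]]:
    """Split an audio stream into fixed-length segments spaced `hop` samples apart."""
    return [samples[i:i + seg_len] for i in range(0, len(samples) - seg_len + 1, hop)]


def cough_episodes(probs: List[float], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Group consecutive segments with probability >= threshold (assumed rule)
    into (start_index, end_index_exclusive) episodes."""
    episodes: List[Tuple[int, int]] = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # an episode begins
        elif p < threshold and start is not None:
            episodes.append((start, i))  # the episode ends before segment i
            start = None
    if start is not None:
        episodes.append((start, len(probs)))  # episode runs to the end of the stream
    return episodes


def cough_metrics(episodes: List[Tuple[int, int]], hop_seconds: float) -> Dict[str, float]:
    """Compute simple, hypothetical metrics: episode count and total duration."""
    durations = [(end - start) * hop_seconds for start, end in episodes]
    return {"count": len(episodes), "total_seconds": sum(durations)}


# Probabilities as they might come from a per-segment cough classifier.
probs = [0.1, 0.8, 0.9, 0.2, 0.7]
episodes = cough_episodes(probs)
metrics = cough_metrics(episodes, hop_seconds=0.5)
```

In a fuller pipeline, each segment returned by `segment` would first be passed through the speech-trained embedding model, and the resulting embeddings through the cough classifier, to produce `probs`.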

Type: Application

Filed: February 6, 2025

Publication date: June 5, 2025

Inventors: Jacob Garrison, Jacob Scott Peplinski, Joel Shor

Method for detecting and classifying coughs or other non-semantic sounds using audio feature set learned from speech

Patent number: 12249346

Abstract: A method of detecting a cough in an audio stream includes a step of performing one or more pre-processing steps on the audio stream to generate an input audio sequence comprising a plurality of time-separated audio segments. An embedding is generated by a self-supervised triplet loss embedding model for each of the segments of the input audio sequence using an audio feature set, the embedding model having been trained to learn the audio feature set in a self-supervised triplet loss manner from a plurality of speech audio clips from a speech dataset. The embedding for each of the segments is provided to a model performing cough detection inference. This model generates a probability that each of the segments of the input audio sequence includes a cough episode. The method includes generating cough metrics for each of the cough episodes detected in the input audio sequence.

Type: Grant

Filed: November 15, 2023

Date of Patent: March 11, 2025

Assignee: Google LLC

Inventors: Jacob Garrison, Jacob Scott Peplinski, Joel Shor

Self-supervised speech representations for fake audio detection

Patent number: 12198718

Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio features vectors each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.
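The scoring and threshold steps in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the form of the shallow discriminator, so a single logistic layer over mean-pooled feature vectors stands in for it, and "satisfies the threshold" is assumed to mean "greater than or equal to". The names `shallow_score` and `is_synthetic`, the weights, and the 0.5 threshold are all hypothetical. The feature vectors are taken as given, since the trained self-supervised model is outside the scope of the sketch.

```python
import math
from typing import List, Sequence


def shallow_score(feature_vectors: Sequence[Sequence[float]],
                  weights: Sequence[float],
                  bias: float) -> float:
    """Mean-pool the per-portion feature vectors, then apply one logistic layer.
    Both choices are assumptions standing in for the shallow discriminator."""
    n = len(feature_vectors)
    pooled: List[float] = [
        sum(vec[d] for vec in feature_vectors) / n for d in range(len(weights))
    ]
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def is_synthetic(score: float, threshold: float = 0.5) -> bool:
    """Treat a score at or above the threshold as synthetic speech (assumed convention)."""
    return score >= threshold


# Two toy 2-dimensional feature vectors, one per portion of the audio.
features = [[1.0, 0.0], [3.0, 0.0]]
score = shallow_score(features, weights=[1.0, 0.0], bias=-2.0)
flagged = is_synthetic(score)
```

Keeping the discriminator this small reflects the abstract's division of labor: the self-supervised model supplies the representation, and only a lightweight scorer sits on top of it.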

Type: Grant

Filed: August 9, 2023

Date of Patent: January 14, 2025

Assignee: Google LLC

Inventors: Joel Shor, Alanna Foster Slocum

Automatic Speech Recognition System Contextually Biased for Medical Speech

Publication number: 20240257805

Abstract: Methods and systems of generating text representation of spoken medical speech are presented herein. Some methods may include the steps of providing a pre-trained automatic speech recognition (ASR) system stored in memory and executed on a processor; receiving, by the pre-trained ASR system, spoken medical speech; and generating text of the spoken medical speech by biasing the pre-trained ASR system using a contextual language model, where the contextual language model may include medical terminology that is not included in a vocabulary used to train the pre-trained ASR system.

Type: Application

Filed: January 26, 2024

Publication date: August 1, 2024

Inventor: Joel Shor

Methods and systems for implementing on-device non-semantic representation fine-tuning for speech classification

Patent number: 11996116

Abstract: Examples relate to on-device non-semantic representation fine-tuning for speech classification. A computing system may obtain audio data having a speech portion and train a neural network to learn a non-semantic speech representation based on the speech portion of the audio data. The computing system may evaluate performance of the non-semantic speech representation based on a set of benchmark tasks corresponding to a speech domain and perform a fine-tuning process on the non-semantic speech representation based on one or more downstream tasks. The computing system may further generate a model based on the non-semantic representation and provide the model to a mobile computing device. The model is configured to operate locally on the mobile computing device.

Type: Grant

Filed: August 24, 2020

Date of Patent: May 28, 2024

Assignee: Google LLC

Inventors: Joel Shor, Ronnie Maor, Oran Lang, Omry Tuval, Marco Tagliasacchi, Ira Shavitt, Felix de Chaumont Quitry, Dotan Emanuel, Aren Jansen

Method for Detecting and Classifying Coughs or Other Non-Semantic Sounds Using Audio Feature Set Learned from Speech

Publication number: 20240161769

Abstract: A method of detecting a cough in an audio stream includes a step of performing one or more pre-processing steps on the audio stream to generate an input audio sequence comprising a plurality of time-separated audio segments. An embedding is generated by a self-supervised triplet loss embedding model for each of the segments of the input audio sequence using an audio feature set, the embedding model having been trained to learn the audio feature set in a self-supervised triplet loss manner from a plurality of speech audio clips from a speech dataset. The embedding for each of the segments is provided to a model performing cough detection inference. This model generates a probability that each of the segments of the input audio sequence includes a cough episode. The method includes generating cough metrics for each of the cough episodes detected in the input audio sequence.

Type: Application

Filed: November 15, 2023

Publication date: May 16, 2024

Inventors: Jacob Garrison, Jacob Scott Peplinski, Joel Shor

Combined Compression and Feature Extraction Models for Storing and Analyzing Medical Videos

Publication number: 20240129515

Abstract: A method of compressing and detecting target features of a medical video is presented herein. In some embodiments, the method may include receiving an uncompressed medical video comprising at least one target feature, compressing the uncompressed medical video to generate a compressed medical video based on a predicted location of the at least one target feature using a first pretrained machine learning model, and detecting the location of the at least one target feature of the compressed medical video using a second pretrained machine learning model. In some embodiments, the first pretrained machine learning model and the second pretrained machine learning model may be trained in tandem using domain-specific medical videos.

Type: Application

Filed: August 31, 2023

Publication date: April 18, 2024

Inventor: Joel Shor

Method for detecting and classifying coughs or other non-semantic sounds using audio feature set learned from speech

Patent number: 11862188

Abstract: A method of detecting a cough in an audio stream includes a step of performing one or more pre-processing steps on the audio stream to generate an input audio sequence comprising a plurality of time-separated audio segments. An embedding is generated by a self-supervised triplet loss embedding model for each of the segments of the input audio sequence using an audio feature set, the embedding model having been trained to learn the audio feature set in a self-supervised triplet loss manner from a plurality of speech audio clips from a speech dataset. The embedding for each of the segments is provided to a model performing cough detection inference. This model generates a probability that each of the segments of the input audio sequence includes a cough episode. The method includes generating cough metrics for each of the cough episodes detected in the input audio sequence.

Type: Grant

Filed: October 21, 2021

Date of Patent: January 2, 2024

Assignee: Google LLC

Inventors: Jacob Garrison, Jacob Scott Peplinski, Joel Shor

Self-Supervised Speech Representations for Fake Audio Detection

Publication number: 20230386506

Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio features vectors each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.

Type: Application

Filed: August 9, 2023

Publication date: November 30, 2023

Applicant: Google LLC

Inventors: Joel Shor, Alanna Foster Slocum

Self-supervised speech representations for fake audio detection

Patent number: 11756572

Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio features vectors each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.

Type: Grant

Filed: December 2, 2020

Date of Patent: September 12, 2023

Assignee: Google LLC

Inventors: Joel Shor, Alanna Foster Slocum

Computing systems with modularized infrastructure for training generative adversarial networks

Patent number: 11710300

Abstract: Computing systems that provide a modularized infrastructure for training Generative Adversarial Networks (GANs) are provided herein. For example, the modularized infrastructure can include a lightweight library designed to make it easy to train and evaluate GANs. A user can interact with and/or build upon the modularized infrastructure to easily train GANs. The modularized infrastructure can include a number of distinct sets of code that handle various stages of and operations within the GAN training process. The sets of code can be modular. That is, the sets of code can be designed to exist independently yet be easily and intuitively combinable. Thus, the user can employ some or all of the sets of code or can replace a certain set of code with a set of custom-code while still generating a workable combination.

Type: Grant

Filed: October 12, 2018

Date of Patent: July 25, 2023

Assignee: GOOGLE LLC

Inventors: Joel Shor, Sergio Guadarrama Cotado

Self-Supervised Speech Representations for Fake Audio Detection

Publication number: 20220172739

Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio features vectors each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.

Type: Application

Filed: December 2, 2020

Publication date: June 2, 2022

Applicant: Google LLC

Inventors: Joel Shor, Alanna Foster Slocum

Method for Detecting and Classifying Coughs or Other Non-Semantic Sounds Using Audio Feature Set Learned from Speech

Publication number: 20220130415

Abstract: A method of detecting a cough in an audio stream includes a step of performing one or more pre-processing steps on the audio stream to generate an input audio sequence comprising a plurality of time-separated audio segments. An embedding is generated by a self-supervised triplet loss embedding model for each of the segments of the input audio sequence using an audio feature set, the embedding model having been trained to learn the audio feature set in a self-supervised triplet loss manner from a plurality of speech audio clips from a speech dataset. The embedding for each of the segments is provided to a model performing cough detection inference. This model generates a probability that each of the segments of the input audio sequence includes a cough episode. The method includes generating cough metrics for each of the cough episodes detected in the input audio sequence.

Type: Application

Filed: October 21, 2021

Publication date: April 28, 2022

Inventors: Jacob Garrison, Jacob Scott Peplinski, Joel Shor

Methods and Systems for Implementing On-Device Non-Semantic Representation Fine-Tuning for Speech Classification

Publication number: 20220059117

Abstract: Examples relate to on-device non-semantic representation fine-tuning for speech classification. A computing system may obtain audio data having a speech portion and train a neural network to learn a non-semantic speech representation based on the speech portion of the audio data. The computing system may evaluate performance of the non-semantic speech representation based on a set of benchmark tasks corresponding to a speech domain and perform a fine-tuning process on the non-semantic speech representation based on one or more downstream tasks. The computing system may further generate a model based on the non-semantic representation and provide the model to a mobile computing device. The model is configured to operate locally on the mobile computing device.

Type: Application

Filed: August 24, 2020

Publication date: February 24, 2022

Inventors: Joel Shor, Ronnie Maor, Oran Lang, Omry Tuval, Marco Tagliasacchi, Ira Shavitt, Felix de Chaumont Quitry, Dotan Emanuel, Aren Jansen

Image compression with recurrent neural networks

Patent number: 10713818

Abstract: Methods, and systems, including computer programs encoded on computer storage media for compressing data items with variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.

Type: Grant

Filed: January 28, 2019

Date of Patent: July 14, 2020

Assignee: Google LLC

Inventors: George Dan Toderici, Sean O'Malley, Rahul Sukthankar, Sung Jin Hwang, Damien Vincent, Nicholas Johnston, David Charles Minnen, Joel Shor, Michele Covell

Computing Systems with Modularized Infrastructure for Training Generative Adversarial Networks

Publication number: 20190138847

Abstract: Example aspects of the present disclosure are directed to computing systems that provide a modularized infrastructure for training Generative Adversarial Networks (GANs). For example, the modularized infrastructure can include a lightweight library designed to make it easy to train and evaluate GANs. A user can interact with and/or build upon the modularized infrastructure to easily train GANs. According to one aspect of the present disclosure, the modularized infrastructure can include a number of distinct sets of code that handle various stages of and operations within the GAN training process. The sets of code can be modular. That is, the sets of code can be designed to exist independently yet be easily and intuitively combinable. Thus, the user can employ some or all of the sets of code or can replace a certain set of code with a set of custom-code while still generating a workable combination.

Type: Application

Filed: October 12, 2018

Publication date: May 9, 2019

Inventors: Joel Shor, Sergio Guadarrama Cotado

Image compression with recurrent neural networks

Patent number: 10192327

Abstract: Methods, and systems, including computer programs encoded on computer storage media for compressing data items with variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.

Type: Grant

Filed: February 3, 2017

Date of Patent: January 29, 2019

Assignee: Google LLC

Inventors: George Dan Toderici, Sean O'Malley, Rahul Sukthankar, Sung Jin Hwang, Damien Vincent, Nicholas Johnston, David Charles Minnen, Joel Shor, Michele Covell