Patents by Inventor Yuxuan Wang

Yuxuan Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

END-TO-END TEXT-TO-SPEECH CONVERSION

Publication number: 20210366463

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

Type: Application

Filed: August 2, 2021

Publication date: November 25, 2021

Inventors: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao

SYNTHESIZING SPEECH FROM TEXT USING NEURAL NETWORKS

Publication number: 20210295858

Abstract: Methods, systems, and computer program products for generating, from an input character sequence, an output sequence of audio data representing the input character sequence. The output sequence of audio data includes a respective audio output sample for each of a number of time steps. One example method includes, for each of the time steps: generating a mel-frequency spectrogram for the time step by processing a representation of a respective portion of the input character sequence using a decoder neural network; generating a probability distribution over a plurality of possible audio output samples for the time step by processing the mel-frequency spectrogram for the time step using a vocoder neural network; and selecting the audio output sample for the time step from the possible audio output samples in accordance with the probability distribution.

Type: Application

Filed: April 5, 2021

Publication date: September 23, 2021

Inventors: Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Michael Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, Russell John Wyatt Skerry-Ryan, Ryan M. Rifkin, Ioannis Agiomyrgiannakis
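The per-time-step selection described in the abstract above can be illustrated with a minimal Python sketch. It covers only the last two steps (a vocoder producing a probability distribution, then a sample drawn from it) and treats the spectrogram frames as given. The names `toy_vocoder`, `select_sample`, and `synthesize` are hypothetical, and the toy vocoder is a simple normalization standing in for a neural network; none of this is taken from the patent itself.

```python
import random


def select_sample(probabilities, rng):
    """Pick one audio sample index in accordance with a probability distribution."""
    indices = range(len(probabilities))
    return rng.choices(indices, weights=probabilities, k=1)[0]


def toy_vocoder(frame):
    """Hypothetical stand-in for a vocoder network.

    Normalizes non-negative frame values into a distribution over
    len(frame) possible audio sample values.
    """
    total = sum(frame)
    return [value / total for value in frame]


def synthesize(frames, vocoder, rng):
    """For each time step, map a spectrogram frame to one sampled audio value."""
    samples = []
    for frame in frames:
        distribution = vocoder(frame)  # probabilities over possible samples
        samples.append(select_sample(distribution, rng))
    return samples


rng = random.Random(0)  # seeded so repeated runs give the same samples
frames = [
    [1.0, 0.0, 0.0, 0.0],      # all mass on sample value 0
    [0.0, 0.0, 1.0, 0.0],      # all mass on sample value 2
    [0.25, 0.25, 0.25, 0.25],  # uniform: any of the four values may be drawn
]
audio = synthesize(frames, toy_vocoder, rng)
```

Sampling from the distribution, rather than always taking the most probable value, is what "in accordance with the probability distribution" suggests here; an argmax would be a different selection rule.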
End-to-end text-to-speech conversion

Patent number: 11107457

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

Type: Grant

Filed: November 26, 2019

Date of Patent: August 31, 2021

Assignee: Google LLC

Inventors: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao

Synthesizing speech from text using neural networks

Patent number: 10971170

Abstract: Methods, systems, and computer program products for generating, from an input character sequence, an output sequence of audio data representing the input character sequence. The output sequence of audio data includes a respective audio output sample for each of a number of time steps. One example method includes, for each of the time steps: generating a mel-frequency spectrogram for the time step by processing a representation of a respective portion of the input character sequence using a decoder neural network; generating a probability distribution over a plurality of possible audio output samples for the time step by processing the mel-frequency spectrogram for the time step using a vocoder neural network; and selecting the audio output sample for the time step from the possible audio output samples in accordance with the probability distribution.

Type: Grant

Filed: August 8, 2018

Date of Patent: April 6, 2021

Assignee: Google LLC

Inventors: Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Michael Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, Russell John Wyatt Skerry-Ryan, Ryan M. Rifkin, Ioannis Agiomyrgiannakis

METHODS AND MATERIALS FOR ASSESSING AND TREATING CANCER

Publication number: 20200377956

Abstract: Provided herein are methods and materials for detecting and/or treating a subject (e.g., a human) having cancer. In some embodiments, methods and materials for identifying a subject as having cancer (e.g., a localized cancer) are provided in which the presence of member(s) of two or more classes of biomarkers is detected. In some embodiments, methods and materials for identifying a subject as having cancer (e.g., a localized cancer) are provided in which the presence of member(s) of at least one class of biomarkers and the presence of aneuploidy are detected. In some embodiments, methods described herein provide increased sensitivity and/or specificity in the detection of cancer in a subject (e.g., a human).

Type: Application

Filed: August 7, 2018

Publication date: December 3, 2020

Inventors: Bert Vogelstein, Kenneth W. Kinzler, Joshua Cohen, Nickolas Papadopoulos, Anne Marie Lennon, Cristian Tomasetti, Yuxuan Wang, Georges Jabboure Netto, Rachel Karchin, Chris Douville, Samir Hanash, Simeon Springer, Arthur P Grollman, Kathleen Dickman

Systems and methods for recognizing user speech

Patent number: 10672387

Abstract: The various implementations described herein include methods, devices, and systems for recognizing speech, such as user commands. In one aspect, a method includes: (1) receiving audio input data via the one or more microphones; (2) generating a plurality of energy channels for the audio input data; (3) generating a feature vector by performing a per-channel normalization to each channel of the plurality of energy channels; and (4) obtaining recognized speech from the audio input utilizing the feature vector.

Type: Grant

Filed: December 12, 2017

Date of Patent: June 2, 2020

Assignee: Google LLC

Inventors: Richard Lyon, Christopher Hughes, Yuxuan Wang, Ryan Rifkin, Pascal Getreuer
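As a rough illustration of step (3) in the abstract above, the following Python sketch normalizes each energy channel independently and then assembles one feature vector per frame. The abstract does not say which normalization is used, so the per-channel mean and standard deviation scaling here is purely an assumption, as are the function names `normalize_channel` and `feature_vectors`. The energy channels are taken as already computed.

```python
import math


def normalize_channel(energies, eps=1e-8):
    """Scale one channel to zero mean and unit variance (assumed scheme)."""
    mean = sum(energies) / len(energies)
    variance = sum((e - mean) ** 2 for e in energies) / len(energies)
    scale = math.sqrt(variance) + eps  # eps guards against a constant channel
    return [(e - mean) / scale for e in energies]


def feature_vectors(channels):
    """Normalize every channel on its own, then emit one vector per frame."""
    normalized = [normalize_channel(channel) for channel in channels]
    # Transpose from channel-major to frame-major layout.
    return [list(frame) for frame in zip(*normalized)]


# Two channels with the same shape but very different energy levels.
channels = [
    [1.0, 2.0, 3.0],
    [100.0, 200.0, 300.0],
]
features = feature_vectors(channels)
```

Because each channel is scaled by its own statistics, the loud and quiet channels in this example end up with nearly identical normalized values, which is the practical point of normalizing per channel rather than across the whole input.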
END-TO-END TEXT-TO-SPEECH CONVERSION

Publication number: 20200098350

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

Type: Application

Filed: November 26, 2019

Publication date: March 26, 2020

Inventors: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao

End-to-end text-to-speech conversion

Patent number: 10573293

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

Type: Grant

Filed: June 20, 2019

Date of Patent: February 25, 2020

Assignee: Google LLC

Inventors: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao

END-TO-END TEXT-TO-SPEECH CONVERSION

Publication number: 20190311708

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

Type: Application

Filed: June 20, 2019

Publication date: October 10, 2019

Inventors: Samy Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao

METHODS AND MATERIALS FOR ASSESSING AND TREATING CANCER

Publication number: 20190256924

Abstract: Provided herein are methods and materials for detecting and/or treating a subject (e.g., a human) having cancer. In some embodiments, methods and materials for identifying a subject as having cancer (e.g., a localized cancer) are provided in which the presence of member(s) of two or more classes of biomarkers is detected. In some embodiments, methods and materials for identifying a subject as having cancer (e.g., a localized cancer) are provided in which the presence of member(s) of at least one class of biomarkers and the presence of aneuploidy are detected. In some embodiments, methods described herein provide increased sensitivity and/or specificity in the detection of cancer in a subject (e.g., a human).

Type: Application

Filed: January 17, 2019

Publication date: August 22, 2019

Inventors: Bert Vogelstein, Kenneth W. Kinzler, Joshua Cohen, Nickolas Papadopoulos, Anne Marie Lennon, Cristian Tomasetti, Yuxuan Wang, Georges Jabboure Netto, Rachel Karchin, Chris Douville, Samir Hanash, Simeon Springer, Arthur Grollman, Kathleen Dickman

DETECTION OF TUMOR-DERIVED DNA IN CEREBROSPINAL FLUID

Publication number: 20190002987

Abstract: As cell-free DNA from brain and spinal cord tumors cannot usually be detected in the blood, we assessed the cerebrospinal fluid (CSF) that bathes the CNS for tumor DNA, here termed CSF-tDNA. The results suggest that CSF-tDNA could be useful for the management of patients with primary tumors of the brain or spinal cord.

Type: Application

Filed: July 12, 2016

Publication date: January 3, 2019

Inventors: Chetan Bettegowda, Kenneth W. Kinzler, Bert Vogelstein, Yuxuan Wang, Luis Diaz, Nickolas Papadopoulos

ASSAYING OVARIAN CYST FLUID

Publication number: 20180258490

Abstract: A diagnostic test for ovarian cysts is based on the detection of mutations characteristic of the most common neoplasms giving rise to these lesions. With this test, tumor-specific mutations were detected in the cyst fluids of 19 of 24 (79%) borderline tumors and 28 of 31 (90%) malignant ovarian cancers. In contrast, we detected no mutations in the cyst fluids from 10 non-neoplastic cysts and 12 benign tumors. When categorized by the need for exploratory surgery (i.e., presence of a borderline tumor or malignant cancer), the sensitivity of this test was 85% and the specificity was 100%. These tests could inform the diagnosis of ovarian cysts and improve the clinical management of the large number of women with these lesions.

Type: Application

Filed: August 11, 2016

Publication date: September 13, 2018

Inventors: Yuxuan Wang, Bert Vogelstein, Kenneth W. Kinzler, Luis Diaz, Nickolas Papadopoulos, Karin Sundfeldt, Bjorg Kristjansdottir
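The sensitivity and specificity quoted in the abstract above follow directly from the counts it reports. The short Python sketch below reproduces that arithmetic; the variable names are illustrative, and the grouping of borderline and malignant tumors as positives, with non-neoplastic cysts and benign tumors as negatives, follows the abstract's "need for exploratory surgery" categorization.

```python
def percent(hits, total):
    """Return hits / total as a percentage rounded to the nearest integer."""
    return round(100 * hits / total)


# Counts reported in the abstract.
borderline_detected, borderline_total = 19, 24
malignant_detected, malignant_total = 28, 31
non_neoplastic_total, benign_total = 10, 12
false_positives = 0  # no mutations detected in non-neoplastic or benign cases

# Positives: cases that would need exploratory surgery.
true_positives = borderline_detected + malignant_detected  # 47
positives = borderline_total + malignant_total             # 55

# Negatives: non-neoplastic cysts and benign tumors.
negatives = non_neoplastic_total + benign_total            # 22
true_negatives = negatives - false_positives               # 22

borderline_rate = percent(borderline_detected, borderline_total)  # 79
malignant_rate = percent(malignant_detected, malignant_total)     # 90
sensitivity = percent(true_positives, positives)                  # 85
specificity = percent(true_negatives, negatives)                  # 100
```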
Systems and Methods for Recognizing User Speech

Publication number: 20180197533

Abstract: The various implementations described herein include methods, devices, and systems for recognizing speech, such as user commands. In one aspect, a method includes: (1) receiving audio input data via the one or more microphones; (2) generating a plurality of energy channels for the audio input data; (3) generating a feature vector by performing a per-channel normalization to each channel of the plurality of energy channels; and (4) obtaining recognized speech from the audio input utilizing the feature vector.

Type: Application

Filed: December 12, 2017

Publication date: July 12, 2018

Inventors: Richard Lyon, Christopher Hughes, Yuxuan Wang, Ryan Rifkin, Pascal Getreuer

HEAD AND NECK SQUAMOUS CELL CARCINOMA ASSAYS

Publication number: 20180171413

Abstract: We queried DNA from saliva or plasma of 93 HNSCC patients, searching for somatic mutations or human papillomavirus genes, collectively referred to as tumor DNA. When both plasma and saliva were tested, tumor DNA was detected in 96% (95% CI, 84% to 99%) of 47 patients. The fractions of patients with detectable tumor DNA in early- and late-stage disease were 100% (n=10) and 95% (n=37), respectively. Saliva is preferentially enriched for tumor DNA from the oral cavity, whereas plasma is preferentially enriched for tumor DNA from the other sites. Tumor DNA in the saliva and plasma is a valuable biomarker for detection of HNSCC.

Type: Application

Filed: June 16, 2016

Publication date: June 21, 2018

Inventors: Bert Vogelstein, Kenneth W. Kinzler, Luis Diaz, Nickolas Papadopoulos, Nishant Agrawal, Yuxuan Wang, Simeon Springer

Monaural speech filter

Patent number: 9524730

Abstract: A system receives monaural sound which includes speech and background noises. The received sound is divided by frequency and time into time-frequency units (TFUs). Each TFU is classified as speech or non-speech by a processing unit. The processing unit for each frequency range includes at least one of a deep neural network (DNN) or a linear support vector machine (LSVM). The DNN extracts and classifies the features of the TFU and includes a pre-trained stack of Restricted Boltzmann Machines (RBM), and each RBM includes a visible and a hidden layer. The LSVM classifies each TFU based on extracted features from the DNN, including those from the visible layer of the first RBM, and those from the hidden layer of the last RBM in the stack. The LSVM and DNN include training with a plurality of training noises. Each TFU classified as speech is output.

Type: Grant

Filed: March 29, 2013

Date of Patent: December 20, 2016

Assignee: OHIO STATE INNOVATION FOUNDATION

Inventors: DeLiang Wang, Yuxuan Wang
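The overall flow in the abstract above (split the sound into time-frequency units, classify each unit per frequency range, and output only the units classified as speech) can be sketched in Python as below. This is only a structural illustration: a hypothetical per-channel energy threshold stands in for the DNN and LSVM classifiers, and the names `classify_unit` and `filter_speech` are not from the patent.

```python
def classify_unit(energy, threshold):
    """Stand-in classifier: treat a unit as speech if its energy is high enough.

    The patent abstract describes a DNN and/or linear SVM here; a threshold is
    used only to keep the sketch self-contained.
    """
    return energy >= threshold


def filter_speech(tf_units, thresholds):
    """Keep units classified as speech and zero out the rest.

    tf_units[c][t] is the energy of the unit in frequency channel c at time
    frame t; thresholds[c] is the classifier setting for that channel.
    """
    output = []
    for channel, threshold in zip(tf_units, thresholds):
        kept = [e if classify_unit(e, threshold) else 0.0 for e in channel]
        output.append(kept)
    return output


tf_units = [
    [0.9, 0.1, 0.8],  # low-frequency channel
    [0.2, 0.7, 0.3],  # high-frequency channel
]
thresholds = [0.5, 0.5]  # one classifier setting per frequency range
filtered = filter_speech(tf_units, thresholds)
```

Keeping one classifier setting per channel mirrors the abstract's statement that there is a processing unit for each frequency range.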
PAPANICOLAOU TEST FOR OVARIAN AND ENDOMETRIAL CANCERS

Publication number: 20150292027

Abstract: The recently developed liquid-based Papanicolaou (Pap) smear allows not only cytologic evaluation but also collection of DNA for detection of HPV, the causative agent of cervical cancer. We tested these samples to detect somatic mutations present in rare tumor cells that might accumulate in the cervix once shed from endometrial and ovarian cancers. A panel of commonly mutated genes in endometrial and ovarian cancers was assembled and used to identify mutations in all 46 endometrial or ovarian cancer tissue samples. We were also able to identify the same mutations in the DNA from liquid Pap smears in 100% of endometrial cancers (24 of 24) and in 41% of ovarian cancers (9 of 22). We developed a sequence-based method to query mutations in 12 genes in a single liquid Pap smear without prior knowledge of the tumor's genotype.

Type: Application

Filed: October 17, 2013

Publication date: October 15, 2015

Applicant: THE JOHNS HOPKINS UNIVERSITY

Inventors: Isaac Kinde, Kenneth W. Kinzler, Bert Vogelstein, Nickolas Papadopoulos, Luis Diaz, Chetan Bettegowda, Yuxuan Wang

MONAURAL SPEECH FILTER

Publication number: 20150066499

Abstract: A system receives monaural sound which includes speech and background noises. The received sound is divided by frequency and time into time-frequency units (TFUs). Each TFU is classified as speech or non-speech by a processing unit. The processing unit for each frequency range includes at least one of a deep neural network (DNN) or a linear support vector machine (LSVM). The DNN extracts and classifies the features of the TFU and includes a pre-trained stack of Restricted Boltzmann Machines (RBM), and each RBM includes a visible and a hidden layer. The LSVM classifies each TFU based on extracted features from the DNN, including those from the visible layer of the first RBM, and those from the hidden layer of the last RBM in the stack. The LSVM and DNN include training with a plurality of training noises. Each TFU classified as speech is output.

Type: Application

Filed: March 29, 2013

Publication date: March 5, 2015

Inventors: DeLiang Wang, Yuxuan Wang
