Patents by Inventor Chengzhu YU
Chengzhu YU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220317613
Abstract: The present disclosure provides a consumable chip, a consumable, and a communication method between an image forming apparatus and the consumable chip. The consumable chip can be installed on the consumable, and the consumable can be detachably installed on the image forming apparatus. The consumable chip includes a first storage unit and a chip control unit. The first storage unit is configured to store first fixed data representing attribute information of the consumable. The chip control unit is configured to receive an authentication request sent by the image forming apparatus, obtain the first fixed data and first variable data representing consumption information of the consumable, generate authentication data by performing a calculation on the first fixed data and the first variable data according to a first preset algorithm, and send the authentication data to the image forming apparatus for determining whether the consumable meets expectations.
Type: Application
Filed: March 25, 2022
Publication date: October 6, 2022
Inventors: Chengzhu YU, Dan NING
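The challenge–response exchange described above can be sketched in a few lines. The patent does not disclose the "first preset algorithm"; the keyed SHA-256 hash, the shared key, and the example field values below are all stand-ins chosen only to illustrate the chip/apparatus interface.

```python
import hashlib

def make_auth_data(fixed_data: bytes, variable_data: bytes, key: bytes) -> bytes:
    # Illustrative "first preset algorithm": hash the consumable's fixed
    # attribute data together with its variable consumption data.
    return hashlib.sha256(key + fixed_data + variable_data).digest()

def verify_consumable(fixed_data: bytes, variable_data: bytes,
                      key: bytes, auth_data: bytes) -> bool:
    # The image forming apparatus recomputes the value and compares it
    # to what the chip sent back.
    return make_auth_data(fixed_data, variable_data, key) == auth_data

# Chip side: respond to an authentication request.
chip_key = b"shared-secret"               # hypothetical shared key
fixed = b"model=X100;capacity=2600"       # attribute information (first fixed data)
variable = b"pages_printed=1042"          # consumption information (first variable data)
token = make_auth_data(fixed, variable, chip_key)

# Apparatus side: decide whether the consumable meets expectations.
print(verify_consumable(fixed, variable, chip_key, token))  # True
```

A real chip would use whatever algorithm and key material the vendor provisions; the point here is only that authentication binds both the fixed and the variable data.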
-
Patent number: 11430431
Abstract: A method, computer program, and computer system are provided for converting a singing voice of a first person, associated with a first speaker, to a singing voice of a second person using a speaking voice of the second person, associated with a second speaker. A context associated with one or more phonemes corresponding to the singing voice of the first person is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes, the target acoustic frames, and a sample of the speaking voice of the second person. A sample corresponding to the singing voice of the first person is converted to a sample corresponding to the singing voice of the second person using the generated mel-spectrogram features.
Type: Grant
Filed: February 6, 2020
Date of Patent: August 30, 2022
Assignee: TENCENT AMERICA LLC
Inventors: Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu
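The encode → align → recursively generate pipeline in this abstract can be sketched with toy scalars. Every function body below is a placeholder (the patent's encoder, aligner, and generator are neural networks); only the data flow — each mel frame depending on the previous frame, the aligned phoneme context, and the target speaker's sample — follows the abstract.

```python
def encode_context(phonemes):
    # Stand-in encoder: each phoneme id becomes a 1-D "embedding".
    return [p * 0.1 for p in phonemes]

def align(encoded, n_frames):
    # Stand-in alignment: spread the encoded phonemes evenly over
    # the target acoustic frames.
    return [encoded[i * len(encoded) // n_frames] for i in range(n_frames)]

def generate_mels(aligned, speaker_sample):
    # Recursive (autoregressive) generation: each frame depends on the
    # previous frame, the aligned phoneme, and the target speaker sample.
    mels, prev = [], 0.0
    for a in aligned:
        frame = 0.5 * prev + a + speaker_sample
        mels.append(frame)
        prev = frame
    return mels

phonemes = [3, 7, 5]     # singing voice of the first person (toy phoneme ids)
speaker_sample = 0.05    # speaking-voice feature of the second person (toy scalar)
aligned = align(encode_context(phonemes), n_frames=6)
mels = generate_mels(aligned, speaker_sample)
print(len(mels))  # 6
```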
-
Publication number: 20220189495
Abstract: An apparatus and a method include receiving an input audio signal to be processed by a multi-band synchronized neural vocoder. The input audio signal is separated into a plurality of frequency bands. A plurality of audio signals corresponding to the plurality of frequency bands is obtained. Each of the audio signals is downsampled and processed by the multi-band synchronized neural vocoder. An audio output signal is generated.
Type: Application
Filed: March 4, 2022
Publication date: June 16, 2022
Applicant: TENCENT AMERICA LLC
Inventors: Chengzhu YU, Meng YU, Heng LU, Dong YU
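The band-split-then-downsample front end can be illustrated with a crude two-band example. The moving-average low-pass filter below is a stand-in for a real analysis filter bank, and the neural vocoder stage is omitted entirely; the sketch only shows why each band can run at a lower sample rate.

```python
import math

def lowpass(x, taps=8):
    # Crude FIR low-pass: a moving average (stand-in for a real filter bank).
    return [sum(x[max(0, i - taps + 1): i + 1]) / min(taps, i + 1)
            for i in range(len(x))]

def split_bands(x):
    low = lowpass(x)
    high = [a - b for a, b in zip(x, low)]  # complementary high band
    return low, high

def downsample(x, factor=2):
    # Each band carries less bandwidth, so it can run at a lower rate --
    # this is what lets the per-band vocoders operate cheaply in sync.
    return x[::factor]

sr = 16000
signal = [math.sin(2 * math.pi * 200 * n / sr) +        # low-frequency component
          0.3 * math.sin(2 * math.pi * 6000 * n / sr)   # high-frequency component
          for n in range(512)]
low, high = split_bands(signal)
bands = [downsample(low), downsample(high)]
print(len(bands), len(bands[0]))  # 2 256
```

In a real multi-band vocoder each downsampled band would be synthesized by its own (or a shared) neural vocoder and the bands recombined by an upsampling synthesis filter bank.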
-
Publication number: 20220180856
Abstract: A method, computer program, and computer system are provided for synthesizing speech at one or more speeds. A context associated with one or more phonemes corresponding to a speaking voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a voice sample corresponding to the speaking voice is synthesized using the generated mel-spectrogram features.
Type: Application
Filed: February 24, 2022
Publication date: June 9, 2022
Applicant: TENCENT AMERICA LLC
Inventors: Chengzhu Yu, Dong Yu
-
Publication number: 20220115005
Abstract: Methods and apparatuses are provided for sequence-to-sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
Type: Application
Filed: December 22, 2021
Publication date: April 14, 2022
Applicant: TENCENT AMERICA LLC
Inventors: Jia CUI, Chao WENG, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
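The CTC branch of this training scheme can be made concrete on a toy input. The sketch below computes the CTC loss by brute-force enumeration of alignment paths (real systems use the forward-backward recursion; enumeration only works for tiny inputs), then combines it with a placeholder attention-branch loss. The 0.5/0.5 interpolation weight and the logits are illustrative, not values from the patent.

```python
import itertools, math

def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def collapse(path, blank=0):
    # CTC collapsing rule: merge consecutive repeats, then drop blanks.
    out, prev = [], None
    for p in path:
        if p != prev and p != blank:
            out.append(p)
        prev = p
    return out

def ctc_loss_brute_force(logits, target, blank=0):
    # Negative log-probability of `target` under CTC, summing the
    # probability of every frame-level path that collapses to it.
    probs = [softmax(frame) for frame in logits]
    vocab = range(len(logits[0]))
    total = 0.0
    for path in itertools.product(vocab, repeat=len(logits)):
        if collapse(path, blank) == target:
            p = 1.0
            for t, k in enumerate(path):
                p *= probs[t][k]
            total += p
    return -math.log(total)

# 4 frames, vocabulary {blank=0, 1, 2}; target label sequence [1, 2].
logits = [[0.1, 2.0, 0.1],
          [0.1, 2.0, 0.1],
          [0.1, 0.1, 2.0],
          [2.0, 0.1, 0.1]]
ctc = ctc_loss_brute_force(logits, target=[1, 2])
attention = 0.7                        # placeholder attention-branch loss
joint = 0.5 * ctc + 0.5 * attention    # branches trained independently, then combined
print(round(ctc, 3))
```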
-
Patent number: 11302301
Abstract: A method, computer program, and computer system are provided for synthesizing speech at one or more speeds. A context associated with one or more phonemes corresponding to a speaking voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a voice sample corresponding to the speaking voice is synthesized using the generated mel-spectrogram features.
Type: Grant
Filed: March 3, 2020
Date of Patent: April 12, 2022
Assignee: TENCENT AMERICA LLC
Inventors: Chengzhu Yu, Dong Yu
-
Patent number: 11295751
Abstract: An apparatus and a method include receiving an input audio signal to be processed by a multi-band synchronized neural vocoder. The input audio signal is separated into a plurality of frequency bands. A plurality of audio signals corresponding to the plurality of frequency bands is obtained. Each of the audio signals is downsampled and processed by the multi-band synchronized neural vocoder. An audio output signal is generated.
Type: Grant
Filed: September 20, 2019
Date of Patent: April 5, 2022
Assignee: TENCENT AMERICA LLC
Inventors: Chengzhu Yu, Meng Yu, Heng Lu, Dong Yu
-
Patent number: 11257481
Abstract: Methods and apparatuses are provided for sequence-to-sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
Type: Grant
Filed: October 24, 2018
Date of Patent: February 22, 2022
Assignee: TENCENT AMERICA LLC
Inventors: Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
-
Patent number: 11257480
Abstract: A method, a computer readable medium, and a computer system are provided for singing voice conversion. Data corresponding to a singing voice is received. One or more features and pitch data are extracted from the received data using one or more adversarial neural networks. One or more audio samples are generated based on the extracted pitch data and the one or more features.
Type: Grant
Filed: March 3, 2020
Date of Patent: February 22, 2022
Assignee: TENCENT AMERICA LLC
Inventors: Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu
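Pitch extraction is the step of this pipeline that is easiest to demonstrate. The classical autocorrelation method below is only a stand-in for the adversarially trained extraction the abstract describes; it recovers the fundamental frequency of a synthetic tone.

```python
import math

def estimate_pitch(x, sr, fmin=50, fmax=500):
    # Toy pitch extractor: pick the autocorrelation peak within the
    # lag range corresponding to [fmin, fmax] Hz.
    best_lag, best_corr = 0, float("-inf")
    for lag in range(int(sr / fmax), int(sr / fmin) + 1):
        corr = sum(x[i] * x[i - lag] for i in range(lag, len(x)))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return sr / best_lag

sr = 16000
f0 = 220.0  # A3, a plausible sung pitch
frame = [math.sin(2 * math.pi * f0 * n / sr) for n in range(1024)]
print(round(estimate_pitch(frame, sr)))  # close to 220
```

In the patented system the extracted pitch contour and timbre features would then condition a generator that renders the converted audio samples.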
-
Publication number: 20220036874
Abstract: A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.
Type: Application
Filed: October 14, 2021
Publication date: February 3, 2022
Applicant: TENCENT AMERICA LLC
Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU
-
Publication number: 20220027567
Abstract: A method and apparatus are provided for automatically predicting lexical sememes using a lexical dictionary, comprising inputting a word, retrieving the word's semantic definition and the sememes corresponding to the word from an online dictionary, setting each of the retrieved sememes as a candidate sememe, inputting the word's semantic definition and the candidate sememe, and estimating the probability that the candidate sememe can be inferred from the word's semantic definition.
Type: Application
Filed: September 8, 2021
Publication date: January 27, 2022
Applicant: TENCENT AMERICA LLC
Inventors: Kun XU, Chao WENG, Chengzhu YU, Dong YU
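The definition-plus-candidate-sememe interface can be sketched with a toy scorer. The patent estimates this probability with a trained model; the word-overlap score, the example definition, and the related-terms list below are all illustrative placeholders for that interface.

```python
def sememe_probability(definition: str, candidate_sememe: str, related_terms=()):
    # Toy estimator: fraction of the candidate sememe's associated terms
    # that appear in the word's dictionary definition. A real system would
    # learn this score from (definition, sememe) pairs.
    terms = {candidate_sememe, *related_terms}
    def_words = set(definition.lower().split())
    return len(terms & def_words) / len(terms)

# Word: "apple"; definition as retrieved from a (hypothetical) online dictionary.
definition = "the round fruit of a tree of the rose family"
score = sememe_probability(definition, "fruit", ["tree", "edible"])
print(score)
```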
-
Publication number: 20210375259
Abstract: A method and apparatus include receiving a text input that includes a sequence of text components. Respective temporal durations of the text components are determined using a duration model. A spectrogram frame is generated based on the duration model. An audio waveform is generated based on the spectrogram frame. Video information is generated based on the audio waveform. The audio waveform is provided as an output along with a corresponding video.
Type: Application
Filed: August 6, 2021
Publication date: December 2, 2021
Applicant: TENCENT AMERICA LLC
Inventors: Heng LU, Chengzhu Yu, Dong Yu
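The duration-model step — expanding each text component to the number of spectrogram frames it should occupy — is commonly implemented as a "length regulator". The sketch below uses scalar encodings and hand-picked durations purely for illustration; the patented system predicts durations with a trained model.

```python
def length_regulate(encodings, durations):
    # Expand each text-component encoding by its predicted duration
    # (in frames) so the frame-level generator sees one entry per frame.
    frames = []
    for enc, dur in zip(encodings, durations):
        frames.extend([enc] * dur)
    return frames

# Two text components with predicted durations of 3 and 5 frames.
encodings = [0.2, 0.9]
durations = [3, 5]
spectrogram_frames = length_regulate(encodings, durations)
print(len(spectrogram_frames))  # 8
```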
-
Patent number: 11183168
Abstract: A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.
Type: Grant
Filed: February 13, 2020
Date of Patent: November 23, 2021
Assignee: TENCENT AMERICA LLC
Inventors: Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu
-
Patent number: 11170167
Abstract: A method and apparatus are provided for automatically predicting lexical sememes using a lexical dictionary, comprising inputting a word, retrieving the word's semantic definition and the sememes corresponding to the word from an online dictionary, setting each of the retrieved sememes as a candidate sememe, inputting the word's semantic definition and the candidate sememe, and estimating the probability that the candidate sememe can be inferred from the word's semantic definition.
Type: Grant
Filed: March 26, 2019
Date of Patent: November 9, 2021
Assignee: TENCENT AMERICA LLC
Inventors: Kun Xu, Chao Weng, Chengzhu Yu, Dong Yu
-
Patent number: 11151979
Abstract: A method and apparatus include receiving a text input that includes a sequence of text components. Respective temporal durations of the text components are determined using a duration model. A spectrogram frame is generated based on the duration model. An audio waveform is generated based on the spectrogram frame. Video information is generated based on the audio waveform. The audio waveform is provided as an output along with a corresponding video.
Type: Grant
Filed: August 23, 2019
Date of Patent: October 19, 2021
Assignee: TENCENT AMERICA LLC
Inventors: Heng Lu, Chengzhu Yu, Dong Yu
-
Patent number: 11138966
Abstract: A method for generating an automatic speech recognition (ASR) model using unsupervised learning includes obtaining, by a device, text information. The method includes determining, by the device, a set of phoneme sequences associated with the text information. The method includes obtaining, by the device, speech waveform data. The method includes determining, by the device, a set of phoneme boundaries associated with the speech waveform data. The method includes generating, by the device, the ASR model using an output distribution matching (ODM) technique based on determining the set of phoneme sequences associated with the text information and based on determining the set of phoneme boundaries associated with the speech waveform data.
Type: Grant
Filed: February 7, 2019
Date of Patent: October 5, 2021
Assignee: TENCENT AMERICA LLC
Inventors: Jianshu Chen, Chengzhu Yu, Dong Yu, Chih-Kuan Yeh
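The core idea of output distribution matching can be shown with a minimal divergence penalty: the model's phoneme posteriors, averaged over unlabeled speech segments, should match the phoneme distribution derived from text alone, so no paired transcripts are needed. The three-phoneme prior and the posterior values below are toy numbers, and a KL divergence is used as one plausible matching criterion; this is a sketch of the principle, not the patented training procedure.

```python
import math

def kl_divergence(p, q):
    # KL(p || q) for discrete distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def odm_loss(model_outputs, prior):
    # Average the model's per-segment phoneme posteriors over the speech
    # data and penalize divergence from the text-derived phoneme prior.
    n = len(model_outputs)
    avg = [sum(frame[k] for frame in model_outputs) / n
           for k in range(len(prior))]
    return kl_divergence(avg, prior)

# Phoneme prior from text (3 phonemes) vs. posteriors on 4 speech segments.
prior = [0.5, 0.3, 0.2]
outputs = [[0.6, 0.3, 0.1],
           [0.5, 0.2, 0.3],
           [0.4, 0.4, 0.2],
           [0.5, 0.3, 0.2]]
print(round(odm_loss(outputs, prior), 4))  # ~0.0: averaged outputs match the prior
```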
-
Publication number: 20210280165
Abstract: A method, a computer readable medium, and a computer system are provided for singing voice conversion. Data corresponding to a singing voice is received. One or more features and pitch data are extracted from the received data using one or more adversarial neural networks. One or more audio samples are generated based on the extracted pitch data and the one or more features.
Type: Application
Filed: March 3, 2020
Publication date: September 9, 2021
Applicant: TENCENT AMERICA LLC
Inventors: Chengzhu YU, Heng Lu, Chao Weng, Dong Yu
-
Publication number: 20210280164
Abstract: A method, computer program, and computer system are provided for synthesizing speech at one or more speeds. A context associated with one or more phonemes corresponding to a speaking voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a voice sample corresponding to the speaking voice is synthesized using the generated mel-spectrogram features.
Type: Application
Filed: March 3, 2020
Publication date: September 9, 2021
Applicant: TENCENT AMERICA LLC
Inventors: Chengzhu YU, Dong YU
-
Publication number: 20210256958
Abstract: A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.
Type: Application
Filed: February 13, 2020
Publication date: August 19, 2021
Applicant: TENCENT AMERICA LLC
Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU
-
Publication number: 20210248997
Abstract: A method, computer program, and computer system are provided for converting a singing voice of a first person, associated with a first speaker, to a singing voice of a second person using a speaking voice of the second person, associated with a second speaker. A context associated with one or more phonemes corresponding to the singing voice of the first person is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes, the target acoustic frames, and a sample of the speaking voice of the second person. A sample corresponding to the singing voice of the first person is converted to a sample corresponding to the singing voice of the second person using the generated mel-spectrogram features.
Type: Application
Filed: February 6, 2020
Publication date: August 12, 2021
Applicant: TENCENT AMERICA LLC
Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU