Patents by Inventor Chao Weng

Chao Weng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Target-image acquisition method, photographing device, and unmanned aerial vehicle

Patent number: 11328188

Abstract: The present disclosure provides a target-image acquisition method. The target-image acquisition method includes acquiring a visible-light image and an infrared (IR) image of a target, captured at a same time point by a photographing device; weighting and fusing the visible-light image and the IR image to obtain a fused image; and obtaining an image of the target according to the fused image. The present disclosure also provides a photographing device and an unmanned aerial vehicle (UAV) using the method above.

Type: Grant

Filed: July 14, 2020

Date of Patent: May 10, 2022

Assignee: SZ DJI TECHNOLOGY CO., LTD.

Inventors: Chao Weng, Lei Yan
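
Illustrative sketch (not taken from the patent text): the abstract above describes weighting and fusing a visible-light image with an IR image captured at the same time. One simple reading of that step is a per-pixel weighted average. The function name `fuse_weighted`, the `ir_weight` parameter, and the use of a single global weight are assumptions made for illustration; the patent may weight pixels or regions differently.

```python
def fuse_weighted(visible, infrared, ir_weight=0.5):
    """Blend two equally sized grayscale images (lists of rows) pixel by pixel.

    ir_weight is a hypothetical global weight in [0, 1]; the visible-light
    image receives the complementary weight 1 - ir_weight.
    """
    if not 0.0 <= ir_weight <= 1.0:
        raise ValueError("ir_weight must lie in [0, 1]")
    same_shape = len(visible) == len(infrared) and all(
        len(v_row) == len(i_row) for v_row, i_row in zip(visible, infrared)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")

    vis_weight = 1.0 - ir_weight
    return [
        [round(vis_weight * v + ir_weight * i) for v, i in zip(v_row, i_row)]
        for v_row, i_row in zip(visible, infrared)
    ]


# Toy 2x2 images with 8-bit intensities.
visible = [[100, 200], [50, 0]]
infrared = [[200, 100], [150, 255]]
fused = fuse_weighted(visible, infrared, ir_weight=0.25)
print(fused)
```

A larger `ir_weight` lets thermal detail dominate the fused result, while a smaller one preserves more of the visible-light texture.
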
IMAGE PROCESSING AND PRESENTATION

Publication number: 20220124246

Abstract: An image presentation method includes obtaining a first image and a second image having same contents; size-processing the first image according to at least one of a target resolution, an aspect ratio of the first image, or an aspect ratio of the second image to generate a size-processed first image having the target resolution; generating a presenting image at least by combining the size-processed first image and the second image; and encoding the presenting image in a code stream and transmitting the encoded image to the display device that requires the preset resolution for display. The first and second images include a visible-light image and an infrared image. The presenting image has a preset resolution no less than a sum of the target resolution and a resolution of the second image. The size-processed first image and the second image are arranged in the presenting image without partially blocking each other.

Type: Application

Filed: December 27, 2021

Publication date: April 21, 2022

Inventors: Chao WENG, Hongjing CHEN
MULTI-TASK TRAINING ARCHITECTURE AND STRATEGY FOR ATTENTION-BASED SPEECH RECOGNITION SYSTEM

Publication number: 20220115005

Abstract: Methods and apparatuses are provided for performing sequence to sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.

Type: Application

Filed: December 22, 2021

Publication date: April 14, 2022

Applicant: TENCENT AMERICA LLC

Inventors: Jia CUI, Chao WENG, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
Unsupervised singing voice conversion with pitch adversarial network

Patent number: 11257480

Abstract: A method, a computer readable medium, and a computer system are provided for singing voice conversion. Data corresponding to a singing voice is received. One or more features and pitch data are extracted from the received data using one or more adversarial neural networks. One or more audio samples are generated based on the extracted pitch data and the one or more features.

Type: Grant

Filed: March 3, 2020

Date of Patent: February 22, 2022

Assignee: TENCENT AMERICA LLC

Inventors: Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu
Multi-task training architecture and strategy for attention-based speech recognition system

Patent number: 11257481

Abstract: Methods and apparatuses are provided for performing sequence to sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.

Type: Grant

Filed: October 24, 2018

Date of Patent: February 22, 2022

Assignee: TENCENT AMERICA LLC

Inventors: Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
SINGING VOICE CONVERSION

Publication number: 20220036874

Abstract: A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.

Type: Application

Filed: October 14, 2021

Publication date: February 3, 2022

Applicant: TENCENT AMERICA LLC

Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU
MAXIMUM TEMPERATURE POINT TRACKING METHOD, DEVICE AND UNMANNED AERIAL VEHICLE

Publication number: 20220038633

Abstract: A method for tracking a maximum temperature point includes acquiring a first pair of coordinates of a maximum temperature point in a current frame of image sensed by an infrared camera, determining a rotation angle of a gimbal equipped with the infrared camera according to the first pair of coordinates of the maximum temperature point in the current frame of image and a pair of coordinates of a target position of the maximum temperature point in a subsequent frame of image, and controlling the gimbal to rotate according to the rotation angle, so as to adjust the maximum temperature point in the subsequent frame of image captured by the infrared camera to be located at the target position.

Type: Application

Filed: October 18, 2021

Publication date: February 3, 2022

Inventors: Chao WENG, Mingxi WANG, Wei ZHANG
AUTOMATIC LEXICAL SEMEME PREDICTION SYSTEM USING LEXICAL DICTIONARIES

Publication number: 20220027567

Abstract: Method and apparatus for automatically predicting lexical sememes using a lexical dictionary, comprising inputting a word, retrieving the word's semantic definition and sememes corresponding to the word from an online dictionary, setting each of the retrieved sememes as a candidate sememe, inputting the word's semantic definition and candidate sememe, and estimating the probability that the candidate sememe can be inferred from the word's semantic definition.

Type: Application

Filed: September 8, 2021

Publication date: January 27, 2022

Applicant: TENCENT AMERICA LLC

Inventors: Kun XU, Chao WENG, Chengzhu YU, Dong YU
MULTI-TAP MINIMUM VARIANCE DISTORTIONLESS RESPONSE BEAMFORMER WITH NEURAL NETWORKS FOR TARGET SPEECH SEPARATION

Publication number: 20220013123

Abstract: A method, computer system, and computer readable medium are provided for automatic speech recognition. Video data and audio data corresponding to one or more speakers are received. A minimum variance distortionless response function is applied to the received audio and video data. A predicted target waveform corresponding to a target speaker from among the one or more speakers is generated based on back-propagating the output of the applied minimum variance distortionless response function.

Type: Application

Filed: July 10, 2020

Publication date: January 13, 2022

Applicant: TENCENT AMERICA LLC

Inventors: Yong XU, Meng YU, Shi-Xiong ZHANG, Chao WENG, Jianming LIU, Dong YU
Image processing and presentation

Patent number: 11212436

Abstract: An image presentation method includes obtaining a first image captured by a first image sensor and a second image captured by a second image sensor; size-processing the first image according to at least one of a target resolution, an aspect ratio of the first image, or an aspect ratio of the second image to generate a size-processed first image having the target resolution; and generating a presenting image at least by combining the size-processed first image and the second image. The presenting image has a preset resolution that is not less than a sum of the target resolution and a resolution of the second image.

Type: Grant

Filed: May 8, 2020

Date of Patent: December 28, 2021

Assignee: SZ DJI TECHNOLOGY CO., LTD.

Inventors: Chao Weng, Hongjing Chen
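
Illustrative sketch (not taken from the patent text): the abstract above describes size-processing a first image and combining it with a second image in a presenting image whose resolution is not less than the sum of the two. A minimal layout that satisfies this places the two images side by side after scaling the first to the second's height. The function name `side_by_side_layout`, the side-by-side arrangement, and the rule of preserving the first image's aspect ratio are assumptions made for illustration.

```python
def side_by_side_layout(first_size, second_size):
    """Compute a non-overlapping side-by-side layout for two images.

    Sizes are (width, height) in pixels. The first image is scaled to the
    second image's height while keeping its own aspect ratio (an assumed
    size-processing rule), then placed to the left of the second image.
    """
    w1, h1 = first_size
    w2, h2 = second_size
    scaled_w = round(w1 * h2 / h1)  # keep the first image's aspect ratio
    return {
        "first_scaled": (scaled_w, h2),
        "first_offset": (0, 0),
        "second_offset": (scaled_w, 0),  # starts where the first image ends
        "canvas": (scaled_w + w2, h2),
    }


# Example: a 640x512 thermal frame next to a 1920x1080 visible-light frame.
layout = side_by_side_layout((640, 512), (1920, 1080))
print(layout)
```

Because the canvas width is the sum of the two widths at a shared height, its pixel count equals the sum of the two image resolutions, which meets the "not less than" condition with no padding.
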
Singing voice conversion

Patent number: 11183168

Abstract: A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.

Type: Grant

Filed: February 13, 2020

Date of Patent: November 23, 2021

Assignee: TENCENT AMERICA LLC

Inventors: Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu
IMAGE CONTROL METHOD AND DEVICE, AND MOBILE PLATFORM

Publication number: 20210360164

Abstract: An image control method includes receiving, by a camera, a photographing instruction transmitted by an image display device. The camera includes a first image sensor and a second image sensor. The method further includes controlling the second image sensor to perform photographing according to the photographing instruction to obtain a display code stream and transmitting the display code stream to the image display device. The photographing instruction is used to instruct the second image sensor to photograph for a partial area of a first image using a focal length to obtain a second image. The first image is obtained by the first image sensor and displayed in a main display window of the image display device. The display code stream includes a code stream corresponding to the second image sensor.

Type: Application

Filed: July 22, 2021

Publication date: November 18, 2021

Inventors: Chao WENG, Qi ZHOU, Li QIU
Automatic lexical sememe prediction system using lexical dictionaries

Patent number: 11170167

Abstract: Method and apparatus for automatically predicting lexical sememes using a lexical dictionary, comprising inputting a word, retrieving the word's semantic definition and sememes corresponding to the word from an online dictionary, setting each of the retrieved sememes as a candidate sememe, inputting the word's semantic definition and candidate sememe, and estimating the probability that the candidate sememe can be inferred from the word's semantic definition.

Type: Grant

Filed: March 26, 2019

Date of Patent: November 9, 2021

Assignee: TENCENT AMERICA LLC

Inventors: Kun Xu, Chao Weng, Chengzhu Yu, Dong Yu
PHOTOGRAPHY CONTROL METHOD AND MOBILE PLATFORM

Publication number: 20210341924

Abstract: A photography control method includes controlling a mobile platform to move, when the mobile platform is at a predetermined photography point, obtaining a pre-stored sample image of the predetermined photography point, and adjusting, according to a real-time photography picture captured by a camera device of the mobile platform and the sample image, a control parameter of the camera device to cause the camera device to photograph a target object.

Type: Application

Filed: July 14, 2021

Publication date: November 4, 2021

Inventors: Chao WENG, Li QIU, Qi ZHOU
Maximum temperature point tracking method, device and unmanned aerial vehicle

Patent number: 11153494

Abstract: In the method, a first pair of coordinates of a maximum temperature point in an image sensed by an infrared camera is acquired, where the image is an image captured by the infrared camera. A rotation angle of a gimbal equipped with the infrared camera is then determined according to the first pair of coordinates of the maximum temperature point and a pair of coordinates of a target position in the image. The gimbal is then controlled to rotate according to the determined rotation angle, so as to adjust the maximum temperature point in the image captured by the infrared camera to be located at the target position.

Type: Grant

Filed: December 27, 2019

Date of Patent: October 19, 2021

Assignee: SZ DJI TECHNOLOGY CO., LTD.

Inventors: Chao Weng, Mingxi Wang, Wei Zhang
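
Illustrative sketch (not taken from the patent text): the abstract above describes determining a gimbal rotation angle from the pixel coordinates of the maximum temperature point and of a target position. One common way to turn a pixel offset into an angle is a pinhole camera model. The function names, the horizontal field-of-view parameter `hfov_deg`, the square-pixel assumption, and the sign conventions (positive yaw turns the camera toward larger x, positive pitch toward larger y) are all assumptions made for illustration.

```python
import math


def focal_length_px(image_width, hfov_deg):
    """Focal length in pixels for a pinhole camera with the given horizontal FOV."""
    return (image_width / 2) / math.tan(math.radians(hfov_deg) / 2)


def gimbal_rotation(hot_xy, target_xy, image_size, hfov_deg):
    """Return (yaw_deg, pitch_deg) that moves the hot point toward the target.

    Each axis uses the difference between the viewing angle of the hot point
    and the viewing angle of the target position, measured from the optical
    axis. Square pixels are assumed, so one focal length serves both axes.
    """
    width, height = image_size
    f = focal_length_px(width, hfov_deg)
    cx, cy = width / 2, height / 2
    yaw = math.atan((hot_xy[0] - cx) / f) - math.atan((target_xy[0] - cx) / f)
    pitch = math.atan((hot_xy[1] - cy) / f) - math.atan((target_xy[1] - cy) / f)
    return math.degrees(yaw), math.degrees(pitch)


# Hot point at the right edge of a 640x512 frame, target at the image center.
yaw_deg, pitch_deg = gimbal_rotation((640, 256), (320, 256), (640, 512), hfov_deg=90)
print(yaw_deg, pitch_deg)
```

Treating the two axes independently is an approximation that works best for small offsets; a real controller would typically re-measure the hot point in the next frame and correct any residual error.
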
UNSUPERVISED SINGING VOICE CONVERSION WITH PITCH ADVERSARIAL NETWORK

Publication number: 20210280165

Abstract: A method, a computer readable medium, and a computer system are provided for singing voice conversion. Data corresponding to a singing voice is received. One or more features and pitch data are extracted from the received data using one or more adversarial neural networks. One or more audio samples are generated based on the extracted pitch data and the one or more features.

Type: Application

Filed: March 3, 2020

Publication date: September 9, 2021

Applicant: TENCENT AMERICA LLC

Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU
Photovoltaic panel recognition method, ground station, control apparatus, and unmanned aerial vehicle

Patent number: 11113529

Abstract: A method for identifying a photovoltaic panel includes: acquiring a grayscale image of an infrared image captured by a camera mounted on a UAV, the grayscale image including an image of a photovoltaic panel; performing edge extraction processing on an image in the grayscale image to obtain a monochrome image including a plurality of horizontal lines and a plurality of vertical lines, the horizontal lines being lines in a first direction, an average length of the lines in the first direction being greater than a preset length, the vertical lines being lines in a second direction, and an average length of the lines in the second direction being less than the preset length; and identifying the photovoltaic panel in the monochrome image based on a relative positional relationship between the horizontal lines and the vertical lines in the monochrome image.

Type: Grant

Filed: December 23, 2019

Date of Patent: September 7, 2021

Assignee: SZ DJI TECHNOLOGY CO., LTD.

Inventors: Zefei Li, Chao Weng
TOKEN-WISE TRAINING FOR ATTENTION BASED END-TO-END SPEECH RECOGNITION

Publication number: 20210264901

Abstract: A method of attention-based end-to-end (A-E2E) automatic speech recognition (ASR) training, includes performing cross-entropy training of a model, based on one or more input features of a speech signal, determining a posterior probability vector at a time of a first wrong token among one or more output tokens of the model of which the cross-entropy training is performed, and determining a loss of the first wrong token at the time, based on the determined posterior probability vector. The method further includes determining a total loss of a training set of the model of which the cross-entropy training is performed, based on the determined loss of the first wrong token, and updating the model of which the cross-entropy training is performed, based on the determined total loss of the training set.

Type: Application

Filed: May 11, 2021

Publication date: August 26, 2021

Applicant: TENCENT AMERICA LLC

Inventors: Peidong WANG, Jia CUI, Chao WENG, Dong YU
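
Illustrative sketch (not taken from the patent text): the abstract above describes locating the first wrong output token, taking the posterior probability vector at that time step, and deriving a loss from it that feeds a total loss over the training set. The sketch below uses the negative log-probability of the reference token as that loss and a plain sum as the total; both choices, and the function names, are assumptions made for illustration.

```python
import math


def first_wrong_token_loss(posteriors, reference, eps=1e-12):
    """Loss at the first step where the most probable token differs from the reference.

    posteriors: one probability vector per output step.
    reference:  the correct token index for each step.
    Returns 0.0 when every step is predicted correctly.
    """
    for probs, ref in zip(posteriors, reference):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted != ref:
            # Negative log-likelihood of the reference token (assumed loss form);
            # eps guards against log(0).
            return -math.log(max(probs[ref], eps))
    return 0.0


def total_loss(batch):
    """Sum the first-wrong-token losses over (posteriors, reference) pairs."""
    return sum(first_wrong_token_loss(p, r) for p, r in batch)


# Three output steps over a three-token vocabulary; step 1 is the first error.
posteriors = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.3, 0.5]]
reference = [0, 2, 2]
loss = first_wrong_token_loss(posteriors, reference)
print(loss)
```

Only the first error contributes, so later mistakes in the same utterance do not affect the loss in this sketch.
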
SINGING VOICE CONVERSION

Publication number: 20210256958

Abstract: A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.

Type: Application

Filed: February 13, 2020

Publication date: August 19, 2021

Applicant: TENCENT AMERICA LLC

Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU
LEARNING SINGING FROM SPEECH

Publication number: 20210248997

Abstract: A method, computer program, and computer system is provided for converting a singing voice of a first person associated with a first speaker to a singing voice of a second person using a speaking voice of the second person associated with a second speaker. A context associated with one or more phonemes corresponding to the singing voice of a first person is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes, the target acoustic frames, and a sample of the speaking voice of the second person. A sample corresponding to the singing voice of a first person is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.

Type: Application

Filed: February 6, 2020

Publication date: August 12, 2021

Applicant: TENCENT AMERICA LLC

Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU