Patents by Inventor Chao Weng

Chao Weng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11328188
    Abstract: The present disclosure provides a target-image acquisition method. The target-image acquisition method includes acquiring a visible-light image and an infrared (IR) image of a target, captured at a same time point by a photographing device; weighting and fusing the visible-light image and the IR image to obtain a fused image; and obtaining an image of the target according to the fused image. The present disclosure also provides a photographing device and an unmanned aerial vehicle (UAV) using the method above.
    Type: Grant
    Filed: July 14, 2020
    Date of Patent: May 10, 2022
    Assignee: SZ DJI TECHNOLOGY CO., LTD.
    Inventors: Chao Weng, Lei Yan
  • Publication number: 20220124246
    Abstract: An image presentation method includes obtaining a first image and a second image having same contents; size-processing the first image according to at least one of a target resolution, an aspect ratio of the first image, or an aspect ratio of the second image to generate a size-processed first image having the target resolution; generating a presenting image at least by combining the size-processed first image and the second image; and encoding the presenting image in a code stream and transmitting the encoded image to the display device that requires the preset resolution for display. The first and second images include a visible-light image and an infrared image. The presenting image has a preset resolution no less than a sum of the target resolution and a resolution of the second image. The size-processed first image and the second image are arranged in the presenting image without partially blocking each other.
    Type: Application
    Filed: December 27, 2021
    Publication date: April 21, 2022
    Inventors: Chao WENG, Hongjing CHEN
  • Publication number: 20220115005
    Abstract: Methods and apparatuses are provided for sequence-to-sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
    Type: Application
    Filed: December 22, 2021
    Publication date: April 14, 2022
    Applicant: TENCENT AMERICA LLC
    Inventors: Jia CUI, Chao WENG, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
  • Patent number: 11257480
    Abstract: A method, a computer readable medium, and a computer system are provided for singing voice conversion. Data corresponding to a singing voice is received. One or more features and pitch data are extracted from the received data using one or more adversarial neural networks. One or more audio samples are generated based on the extracted pitch data and the one or more features.
    Type: Grant
    Filed: March 3, 2020
    Date of Patent: February 22, 2022
    Assignee: TENCENT AMERICA LLC
    Inventors: Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu
  • Patent number: 11257481
    Abstract: Methods and apparatuses are provided for sequence-to-sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
    Type: Grant
    Filed: October 24, 2018
    Date of Patent: February 22, 2022
    Assignee: TENCENT AMERICA LLC
    Inventors: Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
  • Publication number: 20220036874
    Abstract: A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.
    Type: Application
    Filed: October 14, 2021
    Publication date: February 3, 2022
    Applicant: TENCENT AMERICA LLC
    Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU
  • Publication number: 20220038633
    Abstract: A method for tracking a maximum temperature point includes acquiring a first pair of coordinates of a maximum temperature point in a current frame of image sensed by an infrared camera, determining a rotation angle of a gimbal equipped with the infrared camera according to the first pair of coordinates of the maximum temperature point in the current frame of image and a pair of coordinates of a target position of the maximum temperature point in a subsequent frame of image, and controlling the gimbal to rotate according to the rotation angle, so as to adjust the maximum temperature point in the subsequent frame of image captured by the infrared camera to be located at the target position.
    Type: Application
    Filed: October 18, 2021
    Publication date: February 3, 2022
    Inventors: Chao WENG, Mingxi WANG, Wei ZHANG
  • Publication number: 20220027567
    Abstract: Method and apparatus for automatically predicting lexical sememes using a lexical dictionary, comprising inputting a word, retrieving the word's semantic definition and sememes corresponding to the word from an online dictionary, setting each of the retrieved sememes as a candidate sememe, inputting the word's semantic definition and candidate sememe, and estimating the probability that the candidate sememe can be inferred from the word's semantic definition.
    Type: Application
    Filed: September 8, 2021
    Publication date: January 27, 2022
    Applicant: TENCENT AMERICA LLC
    Inventors: Kun XU, Chao WENG, Chengzhu YU, Dong YU
  • Publication number: 20220013123
    Abstract: A method, computer system, and computer readable medium are provided for automatic speech recognition. Video data and audio data corresponding to one or more speakers are received. A minimum variance distortionless response function is applied to the received audio and video data. A predicted target waveform corresponding to a target speaker from among the one or more speakers is generated based on back-propagating the output of the applied minimum variance distortionless response function.
    Type: Application
    Filed: July 10, 2020
    Publication date: January 13, 2022
    Applicant: TENCENT AMERICA LLC
    Inventors: Yong XU, Meng YU, Shi-Xiong ZHANG, Chao WENG, Jianming LIU, Dong YU
  • Patent number: 11212436
    Abstract: An image presentation method includes obtaining a first image captured by a first image sensor and a second image captured by a second image sensor; size-processing the first image according to at least one of a target resolution, an aspect ratio of the first image, or an aspect ratio of the second image to generate a size-processed first image having the target resolution; and generating a presenting image at least by combining the size-processed first image and the second image. The presenting image has a preset resolution that is not less than a sum of the target resolution and a resolution of the second image.
    Type: Grant
    Filed: May 8, 2020
    Date of Patent: December 28, 2021
    Assignee: SZ DJI TECHNOLOGY CO., LTD.
    Inventors: Chao Weng, Hongjing Chen
  • Patent number: 11183168
    Abstract: A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: November 23, 2021
    Assignee: TENCENT AMERICA LLC
    Inventors: Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu
  • Publication number: 20210360164
    Abstract: An image control method includes receiving, by a camera, a photographing instruction transmitted by an image display device. The camera includes a first image sensor and a second image sensor. The method further includes controlling the second image sensor to perform photographing according to the photographing instruction to obtain a display code stream and transmitting the display code stream to the image display device. The photographing instruction is used to instruct the second image sensor to photograph for a partial area of a first image using a focal length to obtain a second image. The first image is obtained by the first image sensor and displayed in a main display window of the image display device. The display code stream includes a code stream corresponding to the second image sensor.
    Type: Application
    Filed: July 22, 2021
    Publication date: November 18, 2021
    Inventors: Chao WENG, Qi ZHOU, Li QIU
  • Patent number: 11170167
    Abstract: Method and apparatus for automatically predicting lexical sememes using a lexical dictionary, comprising inputting a word, retrieving the word's semantic definition and sememes corresponding to the word from an online dictionary, setting each of the retrieved sememes as a candidate sememe, inputting the word's semantic definition and candidate sememe, and estimating the probability that the candidate sememe can be inferred from the word's semantic definition.
    Type: Grant
    Filed: March 26, 2019
    Date of Patent: November 9, 2021
    Assignee: TENCENT AMERICA LLC
    Inventors: Kun Xu, Chao Weng, Chengzhu Yu, Dong Yu
  • Publication number: 20210341924
    Abstract: A photography control method includes controlling a mobile platform to move, when the mobile platform is at a predetermined photography point, obtaining a pre-stored sample image of the predetermined photography point, and adjusting, according to a real-time photography picture captured by a camera device of the mobile platform and the sample image, a control parameter of the camera device to cause the camera device to photograph a target object.
    Type: Application
    Filed: July 14, 2021
    Publication date: November 4, 2021
    Inventors: Chao WENG, Li QIU, Qi ZHOU
  • Patent number: 11153494
    Abstract: In the method, a first pair of coordinates of a maximum temperature point in an image sensed by an infrared camera is acquired, where the image is an image captured by the infrared camera. A rotation angle of a gimbal equipped with the infrared camera is then determined according to the first pair of coordinates of the maximum temperature point and a pair of coordinates of a target position in the image. The gimbal is then controlled to rotate according to the determined rotation angle, so as to adjust the maximum temperature point in the image captured by the infrared camera to be located at the target position.
    Type: Grant
    Filed: December 27, 2019
    Date of Patent: October 19, 2021
    Assignee: SZ DJI TECHNOLOGY CO., LTD.
    Inventors: Chao Weng, Mingxi Wang, Wei Zhang
  • Publication number: 20210280165
    Abstract: A method, a computer readable medium, and a computer system are provided for singing voice conversion. Data corresponding to a singing voice is received. One or more features and pitch data are extracted from the received data using one or more adversarial neural networks. One or more audio samples are generated based on the extracted pitch data and the one or more features.
    Type: Application
    Filed: March 3, 2020
    Publication date: September 9, 2021
    Applicant: TENCENT AMERICA LLC
    Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU
  • Patent number: 11113529
    Abstract: A method for identifying a photovoltaic panel includes: acquiring a grayscale image of an infrared image captured by a camera mounted on a UAV, the grayscale image including an image of a photovoltaic panel; performing edge extraction processing on an image in the grayscale image to obtain a monochrome image including a plurality of horizontal lines and a plurality of vertical lines, the horizontal lines being lines in a first direction, an average length of the lines in the first direction being greater than a preset length, the vertical lines being lines in a second direction, and an average length of the lines in the second direction being less than the preset length; and identifying the photovoltaic panel in the monochrome image based on a relative positional relationship between the horizontal lines and the vertical lines in the monochrome image.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: September 7, 2021
    Assignee: SZ DJI TECHNOLOGY CO., LTD.
    Inventors: Zefei Li, Chao Weng
  • Publication number: 20210264901
    Abstract: A method of attention-based end-to-end (A-E2E) automatic speech recognition (ASR) training includes performing cross-entropy training of a model, based on one or more input features of a speech signal, determining a posterior probability vector at a time of a first wrong token among one or more output tokens of the model of which the cross-entropy training is performed, and determining a loss of the first wrong token at the time, based on the determined posterior probability vector. The method further includes determining a total loss of a training set of the model of which the cross-entropy training is performed, based on the determined loss of the first wrong token, and updating the model of which the cross-entropy training is performed, based on the determined total loss of the training set.
    Type: Application
    Filed: May 11, 2021
    Publication date: August 26, 2021
    Applicant: TENCENT AMERICA LLC
    Inventors: Peidong WANG, Jia CUI, Chao WENG, Dong YU
  • Publication number: 20210256958
    Abstract: A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.
    Type: Application
    Filed: February 13, 2020
    Publication date: August 19, 2021
    Applicant: TENCENT AMERICA LLC
    Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU
  • Publication number: 20210248997
    Abstract: A method, computer program, and computer system are provided for converting a singing voice of a first person associated with a first speaker to a singing voice of a second person using a speaking voice of the second person associated with a second speaker. A context associated with one or more phonemes corresponding to the singing voice of the first person is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes, the target acoustic frames, and a sample of the speaking voice of the second person. A sample corresponding to the singing voice of the first person is converted to a sample corresponding to the singing voice of the second person using the generated mel-spectrogram features.
    Type: Application
    Filed: February 6, 2020
    Publication date: August 12, 2021
    Applicant: TENCENT AMERICA LLC
    Inventors: Chengzhu YU, Heng LU, Chao WENG, Dong YU
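
To illustrate the "weighting and fusing" step named in the abstract of patent 11328188, here is a minimal sketch. It is not the patented method: the listing does not say how the weights are chosen, so the sketch assumes a single global weight applied to every pixel, and it assumes both images are equally sized grayscale images stored as nested lists. The function name `fuse_images` and the parameter `ir_weight` are hypothetical.

```python
def fuse_images(visible, infrared, ir_weight=0.5):
    """Pixel-wise weighted average of two equally sized grayscale images.

    Assumption: one global weight for the whole image; the patent's
    actual weighting scheme is not described in this listing.
    """
    if not 0.0 <= ir_weight <= 1.0:
        raise ValueError("ir_weight must be in [0, 1]")
    if len(visible) != len(infrared) or any(
        len(v_row) != len(i_row) for v_row, i_row in zip(visible, infrared)
    ):
        raise ValueError("images must have the same dimensions")

    vis_weight = 1.0 - ir_weight
    # Blend each visible-light pixel with the IR pixel at the same position.
    return [
        [round(vis_weight * v + ir_weight * i) for v, i in zip(v_row, i_row)]
        for v_row, i_row in zip(visible, infrared)
    ]


# Hypothetical 2x2 images with 8-bit intensity values.
visible = [[100, 200], [50, 0]]
infrared = [[0, 100], [250, 40]]
fused = fuse_images(visible, infrared, ir_weight=0.25)
print(fused)  # [[75, 175], [100, 10]]
```

A larger `ir_weight` makes the thermal signature dominate the fused image, while a smaller one preserves more visible-light detail.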