Patents by Inventor Qiyong Liu

Qiyong Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Dynamic noise and speech removal

Patent number: 12380908

Abstract: Online audio and video conference applications can utilize a noise removal module to eliminate unwanted audio from a participant's speech. A noise removal module can rely on differentiating human speech from other audio to filter out noise. However, in some conference environments, both participant and non-participant human speech can be present. Artificial intelligence models can be trained to detect both noise and non-participant audio, based on a variety of factors. The models can label captured audio and various noise removal modules can filter noise based on the output of the models.

Type: Grant

Filed: March 22, 2022

Date of Patent: August 5, 2025

Assignee: Zoom Communications, Inc.

Inventors: Jiachuan Deng, Cheng-Lun Hu, Zhaofeng Jia, Qiyong Liu, Qi Yang

Ad hoc client audio device support for virtual conferences

Patent number: 12348899

Abstract: One example method for ad hoc client audio device support for virtual conferences includes transmitting, via a first communication channel by a virtual conference device connected to a virtual conference hosted by a virtual conference provider, a first signal, the first signal including connection information to enable a network connection between a remote client device and the virtual conference device; receiving, by the virtual conference device via a second communication channel, a request to connect from the remote client device; connecting to the remote client device using the second communication channel; receiving, via the connection to the remote client device, an audio stream from the remote client device, the audio stream captured by a microphone of the remote client device; and providing the audio stream to the virtual conference.

Type: Grant

Filed: November 9, 2022

Date of Patent: July 1, 2025

Assignee: Zoom Communications, Inc.

Inventors: Qiang Gao, Zhaofeng Jia, Qiyong Liu, Xinyu Yao, Shaoming Ye, Xiangming Zhu

ACOUSTIC FENCE

Publication number: 20250210026

Abstract: Systems and methods for audio management are disclosed. A conference device receives a plurality of audio signals comprising a first subset of audio signals originated inside an acoustic fence and a second subset of audio signals originated outside the acoustic fence. The conference device generates an in-beam signal based on enhancing the first subset of audio signals and suppressing the second subset of audio signals. The conference device generates a reference signal based on enhancing the second subset of audio signals and suppressing the first subset of audio signals. The conference device generates one or more masks based on the plurality of audio signals, the acoustic fence, and the reference signal. The conference device applies the one or more masks to the in-beam signal to suppress the second subset of audio signals originated outside the acoustic fence to obtain an output audio signal.

Type: Application

Filed: March 7, 2025

Publication date: June 26, 2025

Applicant: Zoom Communications, Inc.

Inventors: Zhenghang Gu, Zhaofeng Jia, Qiyong Liu, Ye Wang, Zexian Wu, Chunyu Zhang

SPATIAL AUDIO IN VIRTUAL CONFERENCE MINGLING

Publication number: 20250193625

Abstract: One example method includes presenting, by a client device, a view of a virtual conference hosted by a virtual conference provider, the virtual conference including a plurality of participants, the client device associated with a participant of the plurality of participants, the view including a plurality of groupings of participants within a virtual conference area, each grouping associated with a different meeting or sub-meeting of the virtual conference; assigning a location within the virtual conference area to the participant; receiving, at the client device from the conference provider, one or more audio streams associated with one or more audio sources within the plurality of groupings, the one or more audio streams provided by one or more remote client devices; determining a first location within the virtual conference area of a first audio source of the one or more audio sources; generating a plurality of spatialized audio streams based on the first location of the first audio source, the location of th

Type: Application

Filed: February 14, 2025

Publication date: June 12, 2025

Applicant: Zoom Communications, Inc.

Inventors: Zhaofeng Jia, Qiyong Liu, Mengfan Zhang

TARGET SPEAKER MODE

Publication number: 20250182765

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media relate to a method for target speaker extraction. A target speaker extraction system receives an audio frame of an audio signal. A multi-speaker detection model analyzes the audio frame to determine whether the audio frame includes only a single speaker or multiple speakers. When the audio frame includes only a single speaker, the system inputs the audio frame to a target speaker VAD model to suppress speech in the audio frame from a non-target speaker based on comparing the audio frame to a voiceprint of a target speaker. When the audio frame includes multiple speakers, the system inputs the audio frame to a speech separation model to separate the voice of the target speaker from a voice mixture in the audio frame.

Type: Application

Filed: February 3, 2025

Publication date: June 5, 2025

Applicant: Zoom Communications, Inc.

Inventors: Yuhui Chen, Qiyong Liu, Zhengwei Wei, Yangbin Zeng
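The per-frame routing described in the abstract above can be sketched roughly as follows. This is an illustrative sketch only, not the implementation claimed in the application: the function name and the three callables (`is_multi_speaker`, `target_vad`, `separate`) are hypothetical stand-ins for the multi-speaker detection model, the target speaker VAD model, and the speech separation model.

```python
from typing import Callable, Sequence

Frame = Sequence[float]
Voiceprint = Sequence[float]


def extract_target_speaker(
    frame: Frame,
    voiceprint: Voiceprint,
    is_multi_speaker: Callable[[Frame], bool],
    target_vad: Callable[[Frame, Voiceprint], Frame],
    separate: Callable[[Frame, Voiceprint], Frame],
) -> Frame:
    """Route one audio frame to the model that matches its speaker count.

    All three callables are hypothetical placeholders for trained models.
    """
    if is_multi_speaker(frame):
        # Overlapping voices: pull the target voice out of the mixture.
        return separate(frame, voiceprint)
    # Single speaker: keep or suppress the frame by comparing it
    # against the target speaker's voiceprint.
    return target_vad(frame, voiceprint)
```

Passing the models in as callables keeps the sketch independent of any particular model framework.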
ENHANCING GROUP SOUND REACTIONS

Publication number: 20250150537

Abstract: Systems and methods for enhancing group sound during a networked conference are provided. A computer device accesses audio data, detects a first group sound in the audio data, and generates a first group sound identifier that identifies the first group sound. The computer device is one of a plurality of computer devices connected to the networked conference. The computer device transmits the first group sound identifier to a network server and receives a control signal from the network server. The network server receives multiple group sound identifiers from the plurality of computer devices and generates the control signal based on the multiple group sound identifiers. The multiple group sound identifiers include the first group sound identifier and a second group sound identifier. The control signal includes the second group sound identifier. The computer device reproduces a second group sound based on the second group sound identifier.

Type: Application

Filed: January 13, 2025

Publication date: May 8, 2025

Applicant: Zoom Video Communications, Inc.

Inventors: Oded Gal, Lin Han, Qiyong Liu

Acoustic fence

Patent number: 12272345

Abstract: For online audio/video conferencing applications deployed in an open office environment, using shared conference devices, it can be advantageous to define an acoustic fence. Non-participant audio received from outside the acoustic fence can be considered noise and filtered out before transmission of an audio signal to a far end recipient. Three suppression stages are used to filter the non-participant audio. The first suppression stage uses beamformers for suppression. The second suppression stage is mask-based, and the third suppression stage is reference-based. The three suppression stages filter out non-participant audio signals having a wide range of frequencies.

Type: Grant

Filed: August 29, 2022

Date of Patent: April 8, 2025

Assignee: Zoom Communications, Inc.

Inventors: Zhenghang Gu, Zhaofeng Jia, Qiyong Liu, Ye Wang, Zexian Wu, Chunyu Zhang

Conference Musical Audio Enhancement

Publication number: 20250078852

Abstract: Audio enhancement of musical content is performed by a device coupled to a network. The device receives an audio signal to be transmitted over the network, and detects when musical content is present in the audio signal based on a content probability threshold. The device disables noise suppression for the audio signal and applies a linear filter to cancel echo for the audio signal. The device disables gain control for the audio signal and encodes the audio signal using a codec designed for music.

Type: Application

Filed: November 18, 2024

Publication date: March 6, 2025

Inventors: Qiyong Liu, Jiachuan Deng, Yuhui Chen, Oded Gal
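As a rough illustration of the switching behavior in the abstract above, the sketch below picks processing settings from a music probability. It rests on several assumptions that are not in the abstract: the names `AudioPipelineConfig` and `select_config`, the default threshold of 0.5, the use of `>=` for the comparison, and the whole non-music branch, which the abstract does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioPipelineConfig:
    noise_suppression: bool
    gain_control: bool
    echo_canceller: str  # descriptive label only
    codec_profile: str   # descriptive label only


def select_config(music_probability: float,
                  threshold: float = 0.5) -> AudioPipelineConfig:
    """Choose processing settings for an outgoing audio signal.

    The threshold value and the speech defaults are assumptions
    made for illustration.
    """
    if music_probability >= threshold:
        # Music detected: disable noise suppression and gain control,
        # cancel echo with a linear filter, use a music-oriented codec.
        return AudioPipelineConfig(
            noise_suppression=False,
            gain_control=False,
            echo_canceller="linear",
            codec_profile="music",
        )
    # Assumed speech defaults; not taken from the abstract.
    return AudioPipelineConfig(
        noise_suppression=True,
        gain_control=True,
        echo_canceller="default",
        codec_profile="voice",
    )
```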
Spatial audio in virtual conference mingling

Patent number: 12231869

Abstract: One example method includes presenting, by a client device, a view of a virtual conference hosted by a virtual conference provider, the virtual conference including a plurality of participants, the client device associated with a participant of the plurality of participants, the view including a plurality of groupings of participants within a virtual conference area, each grouping associated with a different meeting or sub-meeting of the virtual conference; assigning a location within the virtual conference area to the participant; receiving, at the client device from the conference provider, one or more audio streams associated with one or more audio sources within the plurality of groupings, the one or more audio streams provided by one or more remote client devices; determining a first location within the virtual conference area of a first audio source of the one or more audio sources; generating a plurality of spatialized audio streams based on the first location of the first audio source, the location of th

Type: Grant

Filed: October 28, 2022

Date of Patent: February 18, 2025

Assignee: Zoom Video Communications, Inc.

Inventors: Zhaofeng Jia, Qiyong Liu, Mengfan Zhang

Enhancing group sound reactions

Patent number: 12219098

Abstract: Systems and methods for enhancing group sound during a networked conference are provided. A server computer establishes a networked conference among a plurality of computer devices. The server computer receives one or more group sound indicators from one or more computer devices of the plurality of computer devices within a selected time interval. In response to determining that the total number of the one or more computer devices corresponding to the one or more group sound indicators is equal to or greater than a selected threshold, the server computer transmits to the plurality of computer devices a control signal identifying a group sound corresponding to the one or more group sound indicators. The server computer causes the plurality of computer devices to reproduce the group sound identified in the control signal.

Type: Grant

Filed: October 24, 2023

Date of Patent: February 4, 2025

Assignee: Zoom Video Communications, Inc.

Inventors: Oded Gal, Lin Han, Qiyong Liu
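The server-side threshold check described in the abstract above might look roughly like the sketch below. It is a simplified illustration, not the patented method: the `(device_id, sound_id, timestamp)` tuple layout, the function name, and the choice to count distinct devices separately per sound are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical indicator layout: (device_id, sound_id, timestamp_seconds)
Indicator = Tuple[str, str, float]


def pick_group_sound(
    indicators: Iterable[Indicator],
    window_start: float,
    window_length: float,
    min_devices: int,
) -> Optional[str]:
    """Return the sound to broadcast, or None if no sound qualifies.

    A sound qualifies when at least `min_devices` distinct devices
    reported it inside the time window.
    """
    window_end = window_start + window_length
    devices_by_sound: Dict[str, Set[str]] = defaultdict(set)
    for device_id, sound_id, timestamp in indicators:
        if window_start <= timestamp < window_end:
            # A set ignores repeated reports from the same device.
            devices_by_sound[sound_id].add(device_id)

    best_sound: Optional[str] = None
    best_count = 0
    for sound_id, devices in devices_by_sound.items():
        if len(devices) >= min_devices and len(devices) > best_count:
            best_sound, best_count = sound_id, len(devices)
    return best_sound
```

In this sketch the returned identifier would be what the server places in the control signal sent to every connected device.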
Target speaker mode

Patent number: 12217761

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media relate to a method for target speaker extraction. A target speaker extraction system receives an audio frame of an audio signal. A multi-speaker detection model analyzes the audio frame to determine whether the audio frame includes only a single speaker or multiple speakers. When the audio frame includes only a single speaker, the system inputs the audio frame to a target speaker VAD model to suppress speech in the audio frame from a non-target speaker based on comparing the audio frame to a voiceprint of a target speaker. When the audio frame includes multiple speakers, the system inputs the audio frame to a speech separation model to separate the voice of the target speaker from a voice mixture in the audio frame.

Type: Grant

Filed: October 31, 2021

Date of Patent: February 4, 2025

Assignee: Zoom Video Communications, Inc.

Inventors: Yuhui Chen, Qiyong Liu, Zhengwei Wei, Yangbin Zeng

Enhancing musical sound during a networked conference

Patent number: 12183357

Abstract: Dynamic adjustment of audio characteristics for enhancing musical sound during a networked conference is disclosed. In an embodiment, a method is provided for sound enhancement performed by a device coupled to a network. The method includes receiving an audio signal to be transmitted over the network, detecting when musical content is present in the audio signal, processing the audio signal to enhance voice characteristics to generate an enhanced audio signal when the musical content is not detected, processing the audio signal to enhance music characteristics to generate the enhanced audio signal when the musical content is detected, and transmitting the enhanced audio signal over the network.

Type: Grant

Filed: December 16, 2022

Date of Patent: December 31, 2024

Assignee: Zoom Video Communications, Inc.

Inventors: Qiyong Liu, Jiachuan Deng, Yuhui Chen, Oded Gal

HYBRID DIGITAL SIGNAL PROCESSING-ARTIFICIAL INTELLIGENCE ACOUSTIC ECHO CANCELLATION FOR VIRTUAL CONFERENCES

Publication number: 20240251039

Abstract: Example methods and systems provide hybrid DSP-AI acoustic echo cancellation for virtual conferences. A digital signal processing (DSP)-based linear acoustic echo cancellation (AEC) can be performed on an input audio signal to filter out linear echo present in the input audio signal and generate a first filtered audio signal. A level of nonlinear echo present in the first filtered audio signal can then be determined. When the level of nonlinear echo satisfies a threshold, an artificial intelligence (AI)-based nonlinear AEC can be performed on the first filtered audio signal to generate an AI-filtered audio signal. When the level of nonlinear echo does not satisfy the threshold, a DSP-based nonlinear AEC can be performed on the first filtered audio signal to generate a second filtered audio signal.

Type: Application

Filed: January 20, 2023

Publication date: July 25, 2024

Applicant: Zoom Video Communications, Inc.

Inventors: Jiachuan Deng, Cheng Lun Hu, Zhaofeng Jia, Qiyong Liu, Wei Wang, Yueguan Wang
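The two-stage flow in the abstract above can be outlined as follows. This is a sketch under stated assumptions, not the implementation in the application: every callable is a hypothetical placeholder, and "satisfies a threshold" is interpreted here as "greater than or equal to", which the abstract does not specify.

```python
from typing import Callable, Sequence

Signal = Sequence[float]


def hybrid_aec(
    mic: Signal,
    far_end: Signal,
    linear_aec: Callable[[Signal, Signal], Signal],
    estimate_nonlinear_echo: Callable[[Signal], float],
    ai_nonlinear_aec: Callable[[Signal], Signal],
    dsp_nonlinear_aec: Callable[[Signal], Signal],
    threshold: float,
) -> Signal:
    """Run linear AEC, then choose the nonlinear stage by echo level."""
    # Stage 1: DSP-based linear echo cancellation.
    first_filtered = linear_aec(mic, far_end)
    # Stage 2: measure the residual nonlinear echo.
    level = estimate_nonlinear_echo(first_filtered)
    if level >= threshold:  # assumed meaning of "satisfies"
        return ai_nonlinear_aec(first_filtered)
    return dsp_nonlinear_aec(first_filtered)
```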
MANUAL-ENROLLMENT-FREE PERSONALIZED DENOISE

Publication number: 20240212702

Abstract: Various embodiments of an apparatus, method(s), system(s) and computer program product(s) described herein are directed to a Denoise Engine. The Denoise Engine collects segments of voice content of a first user account from audio data associated with a virtual meeting. The audio data further includes additional types of audio content. The Denoise Engine identifies an audio embedding model. The Denoise Engine receives a speaker embedding generated by the audio embedding model. The speaker embedding is based on the collected segments of voice content. The Denoise Engine generates personalized denoised voice content of the first user account for the virtual meeting by applying the speaker embedding to the audio data associated with the virtual meeting.

Type: Application

Filed: December 23, 2022

Publication date: June 27, 2024

Inventors: Jiachuan Deng, Cheng Lun Hu, Zhaofeng Jia, Qiyong Liu, Zhengwei Wei, Da-Yi Wu

Scrolling Motion Detection Within A Video Stream

Publication number: 20240195530

Abstract: Scrolling motion is detected within a video stream to output an indication of a scrolling motion vector for use in encoding a current picture of the video stream. A first line of pixels within a motion region of the current picture is identified. A second line of pixels matching the first line of pixels is identified within a last played picture of the video stream. The scrolling motion vector is determined based on a comparison of lines of pixels nearby the second line of pixels within the last played picture. The indication of the scrolling motion vector is then output for use in encoding the current picture.

Type: Application

Filed: December 7, 2023

Publication date: June 13, 2024

Inventors: Jing Wu, Zhaofeng Jia, Bo Ling, Qiyong Liu
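A much simplified sketch of the line-matching idea in the abstract above is shown below. It assumes vertical-only scrolling, exact pixel equality, and frames stored as lists of pixel rows. The function name, the `probe_row` and `neighbors` parameters, and the sign convention are hypothetical and not taken from the application.

```python
from typing import List, Optional, Sequence

Row = Sequence[int]


def detect_vertical_scroll(
    current: List[Row],
    previous: List[Row],
    probe_row: int,
    neighbors: int = 2,
) -> Optional[int]:
    """Estimate the vertical scroll offset between two pictures.

    Returns dy such that current[y] matches previous[y - dy] near
    `probe_row` (negative dy means content moved up), or None when
    no consistent match is found.
    """
    target = list(current[probe_row])
    for candidate, row in enumerate(previous):
        if list(row) != target:
            continue
        # Confirm the candidate by comparing nearby lines as well.
        consistent = True
        for k in range(-neighbors, neighbors + 1):
            cur_y, prev_y = probe_row + k, candidate + k
            if not (0 <= cur_y < len(current)
                    and 0 <= prev_y < len(previous)):
                continue  # skip lines that fall outside either picture
            if list(current[cur_y]) != list(previous[prev_y]):
                consistent = False
                break
        if consistent:
            return probe_row - candidate
    return None
```

An encoder could then treat the returned offset as a candidate motion vector for the region instead of searching for one block by block.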
SPATIAL AUDIO IN VIRTUAL CONFERENCE MINGLING

Publication number: 20240147177

Abstract: One example method includes presenting, by a client device, a view of a virtual conference hosted by a virtual conference provider, the virtual conference including a plurality of participants, the client device associated with a participant of the plurality of participants, the view including a plurality of groupings of participants within a virtual conference area, each grouping associated with a different meeting or sub-meeting of the virtual conference; assigning a location within the virtual conference area to the participant; receiving, at the client device from the conference provider, one or more audio streams associated with one or more audio sources within the plurality of groupings, the one or more audio streams provided by one or more remote client devices; determining a first location within the virtual conference area of a first audio source of the one or more audio sources; generating a plurality of spatialized audio streams based on the first location of the first audio source, the location of th

Type: Application

Filed: October 28, 2022

Publication date: May 2, 2024

Inventors: Zhaofeng Jia, Qiyong Liu, Mengfan Zhang

MUSIC COLLABORATION USING VIRTUAL CONFERENCING

Publication number: 20240129685

Abstract: One example method for music collaboration using virtual conferencing includes receiving, by a client device, audio streams associated with a plurality of musicians in a virtual conference, each musician assigned to a virtual position within a virtual space established by the virtual conference, the client device associated with a participant in the virtual conference, the participant having a participant virtual position within the virtual space; determining relative virtual positions of each musician of at least a subset of the plurality of musicians in the virtual conference with respect to the participant virtual position; generating a plurality of spatialized audio streams based on the relative virtual positions of the respective musicians and the respective audio streams; and outputting the spatialized audio streams.

Type: Application

Filed: October 17, 2022

Publication date: April 18, 2024

Applicant: Zoom Video Communications, Inc.

Inventors: Zhaofeng Jia, Qiyong Liu, Mengfan Zhang, Xiangming Zhu
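One simple way to turn relative virtual positions into spatialized audio, as described in the abstract above, is constant-power stereo panning with distance attenuation. The sketch below uses that swapped-in technique purely for illustration. The application may use a different spatialization method, and the function name, the 2D coordinates, the assumption that the listener faces the +y direction, and the attenuation rule are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def spatialize(
    samples: Sequence[float],
    source_xy: Tuple[float, float],
    listener_xy: Tuple[float, float],
) -> Tuple[List[float], List[float]]:
    """Return (left, right) channels for one mono source.

    Assumes the listener faces +y, so +x is to the listener's right.
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    distance = math.hypot(dx, dy)
    # Azimuth of the source: 0 is straight ahead, positive is right.
    azimuth = math.atan2(dx, dy)
    # sin(azimuth) maps to a pan value in [-1, 1]; sources behind
    # the listener fold onto the front half-plane.
    pan = max(-1.0, min(1.0, math.sin(azimuth)))
    angle = (pan + 1.0) * math.pi / 4.0  # 0 (full left) .. pi/2 (full right)
    # Assumed attenuation: unity gain within one unit, then 1/distance.
    attenuation = 1.0 / max(1.0, distance)
    left_gain = math.cos(angle) * attenuation
    right_gain = math.sin(angle) * attenuation
    left = [s * left_gain for s in samples]
    right = [s * right_gain for s in samples]
    return left, right
```

Calling this once per musician and summing the resulting left and right channels would give one stereo mix for the participant.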
ONE-SHOT ACOUSTIC ECHO GENERATION NETWORK

Publication number: 20240087556

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for generating echo recordings. The system receives, by an autoencoder, an audio signal representation that represents an audio signal and a target echo embedding that comprises information about a target room. The autoencoder comprises an encoder and a decoder. The system generates, by the encoder, a content embedding and an estimated echo embedding. The system generates, by the decoder, an echo recording representation based on the content embedding and the target echo embedding.

Type: Application

Filed: November 13, 2023

Publication date: March 14, 2024

Applicant: Zoom Video Communications, Inc.

Inventors: Zhaofeng Jia, Yang Liu, Qiyong Liu

ACOUSTIC FENCE

Publication number: 20240071356

Abstract: For online audio/video conferencing applications deployed in an open office environment, using shared conference devices, it can be advantageous to define an acoustic fence. Non-participant audio received from outside the acoustic fence can be considered noise and filtered out before transmission of an audio signal to a far end recipient. Three suppression stages are used to filter the non-participant audio. The first suppression stage uses beamformers for suppression. The second suppression stage is mask-based, and the third suppression stage is reference-based. The three suppression stages filter out non-participant audio signals having a wide range of frequencies.

Type: Application

Filed: August 29, 2022

Publication date: February 29, 2024

Inventors: Zhenghang Gu, Zhaofeng Jia, Qiyong Liu, Ye Wang, Zexian Wu, Chunyu Zhang

ENHANCING GROUP SOUND REACTIONS

Publication number: 20240056529

Abstract: Systems and methods for enhancing group sound during a networked conference are provided. A server computer establishes a networked conference among a plurality of computer devices. The server computer receives one or more group sound indicators from one or more computer devices of the plurality of computer devices within a selected time interval. In response to determining that the total number of the one or more computer devices corresponding to the one or more group sound indicators is equal to or greater than a selected threshold, the server computer transmits to the plurality of computer devices a control signal identifying a group sound corresponding to the one or more group sound indicators. The server computer causes the plurality of computer devices to reproduce the group sound identified in the control signal.

Type: Application

Filed: October 24, 2023

Publication date: February 15, 2024

Applicant: Zoom Video Communications, Inc.

Inventors: Oded Gal, Lin Han, Qiyong Liu
