Patents by Inventor Wai Chung Chu

Wai Chung Chu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11950062
    Abstract: A system configured to improve sound source localization (SSL) processing by reducing a number of direction vectors and grouping the direction vectors into direction cells is provided. The system performs clustering to generate a smaller set of direction vectors included in a delay-direction codebook, reducing a size of the codebook to the number of unique delay vectors. In addition, the system groups the direction vectors into direction cells having a regular structure (e.g., predetermined uniformity and/or symmetry), which simplifies SSL processing and results in a substantial reduction in computational cost. The system may also select between multiple codebooks and/or dynamically adjust the codebook to compensate for changes to the microphone array. For example, a device with a microphone array fixed to a display that can tilt may adjust the codebook based on a tilt angle of the display to improve accuracy.
    Type: Grant
    Filed: March 31, 2022
    Date of Patent: April 2, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Wai Chung Chu, Carlo Murgia
  • Patent number: 11915698
    Abstract: A system configured to improve track selection while performing audio type detection using sound source localization (SSL) data is provided. A device processes audio data representing sounds from multiple sound sources to determine SSL data that distinguishes between each of the sound sources. The system detects an acoustic event and performs SSL track selection to select the sound source that corresponds to the acoustic event based on input features. To improve SSL track selection, the system detects current conditions of the environment and determines adaptive weight values that vary based on the current conditions, such as a noise level of the environment, whether playback is detected, whether the device is located near one or more walls, etc. By adjusting the adaptive weight values, the system improves an accuracy of the SSL track selection by prioritizing the input features that are most predictive during the current conditions.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: February 27, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Borham Lee, Wai Chung Chu
  • Patent number: 11749294
    Abstract: A system configured to perform directional speech separation. The system may dynamically associate direction-of-arrivals with one or more audio sources in order to generate output audio data that separates each of the audio sources. The system identifies a target direction for each audio source, dynamically determines directions that are correlated with the target direction, and generates output signals for each audio source. The system may associate individual frequency bands with specific directions based on a time delay detected by two or more microphones. The system may determine a cross-correlation between each direction and the target direction and select directions with strong correlation. The system may generate time-frequency mask data indicating frequency bands corresponding to the directions associated with a particular audio source. Using the mask data, the system generates output audio data specific to the audio source, resulting in directional speech separation between different audio sources.
    Type: Grant
    Filed: August 21, 2020
    Date of Patent: September 5, 2023
    Assignee: Amazon Technologies, Inc.
    Inventor: Wai Chung Chu
  • Patent number: 11545172
    Abstract: A system configured to perform sound source localization (SSL) using reflection classification is provided. A device processes audio data representing sounds from multiple sound sources to generate sound track data that includes an individual sound track for each of the sound sources. To detect reflections, the device determines whether a pair of sound tracks are strongly correlated. For example, the device may calculate a correlation value for each pairwise combination of the sound tracks and determine whether the correlation value exceeds a threshold value. When the correlation value exceeds the threshold, the device invokes a reflection classifier trained to distinguish between direct sound sources and reflected sound sources. For example, the device extracts feature data from the pair of sound tracks and processes the feature data using a trained model to determine which of the sound tracks corresponds to the direct sound source.
    Type: Grant
    Filed: March 9, 2021
    Date of Patent: January 3, 2023
    Assignee: Amazon Technologies, Inc.
    Inventor: Wai Chung Chu
  • Patent number: 11386911
    Abstract: A system configured to improve audio processing by performing dereverberation and noise reduction during a communication session. The system may apply a two-channel dereverberation algorithm by calculating coherence-to-diffuse ratio (CDR) values and calculating dereverberation (DER) gain values based on the CDR values. While the DER gain values may be calculated at a first stage within the pipeline, the device may apply the DER gain values at a second stage within the pipeline. For example, the device may calculate the DER gain values prior to performing residual echo suppression (RES) processing but may apply the DER gain values after performing RES processing, in order to avoid excessive attenuation of the local speech. In addition to removing reverberation, the DER gain values also remove diffuse noise components, reducing an amount of noise reduction required. Thus, the device may soften noise reduction when the DER gain values are applied.
    Type: Grant
    Filed: June 29, 2020
    Date of Patent: July 12, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Kanthasamy Chelliah, Wai Chung Chu, Andreas Schwarz, Berkant Tacer, Carlo Murgia
  • Patent number: 11317201
    Abstract: A system efficiently selects at least one device from multiple devices based on received audio signals. In some instances, the system receives audio signals from devices that each comprise at least one microphone. A respective audio signal of the audio signals includes a representation of a sound originating from a location. The system then determines a device to be used to respond to the sound. In some instances, the system analyzes times in which the received audio signals that represent the sound are generated and/or volumes of the sound as represented by the received audio signals. The system can then select the device based on the analysis.
    Type: Grant
    Filed: January 30, 2017
    Date of Patent: April 26, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Samuel Henry Chang, Wai Chung Chu
  • Patent number: 11259117
    Abstract: A system configured to improve audio processing by performing dereverberation and noise reduction during a communication session. The system may apply a two-channel dereverberation algorithm by calculating coherence-to-diffuse ratio (CDR) values and calculating dereverberation (DER) gain values based on the CDR values. While the device calculates the DER gain values prior to performing acoustic echo cancellation (AEC) processing, the device applies the DER gain values after performing residual echo suppression (RES) processing in order to avoid excessive attenuation of the local speech. To improve output speech quality, the device does not apply the DER gain values for nonreverberant signals, when a signal-to-noise ratio (SNR) value is too low, and/or when far-end talk (e.g., remote speech) is present. Dereverberation processing is further improved by using frequency dependent parameters to calculate the DER gain values and by adjusting other gain values when the DER gain values are applied.
    Type: Grant
    Filed: September 29, 2020
    Date of Patent: February 22, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Kanthasamy Chelliah, Wai Chung Chu, Andreas Schwarz, Carlos Renato Nakagawa, Berkant Tacer, Carlo Murgia
  • Patent number: 11217235
    Abstract: A device capable of autonomous motion may move in response to a user speaking an utterance, such as a command. Before moving, the device processes audio data received from a microphone array to identify different audio signals arriving at the device from different directions. Based on properties of the audio signals, the device determines which of the audio signals are merely reflections of other audio.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: January 4, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Wai Chung Chu, Anshuman Ganguly, Carlo Murgia
  • Patent number: 11107492
    Abstract: A system configured to perform directional speech separation using three or more microphones. The system may dynamically associate direction-of-arrivals with one or more audio sources in order to generate output audio data that separates each of the audio sources. Using three or more microphones, the system may separate audio sources covering 360 degrees surrounding the microphone array, whereas a two-microphone implementation is limited to 180 degrees. The system identifies a target direction for each audio source, dynamically determines directions that are correlated with the target direction, and generates output signals for each audio source. The system may associate individual frequency bands with specific directions based on a phase difference detected by two or more microphones.
    Type: Grant
    Filed: September 18, 2019
    Date of Patent: August 31, 2021
    Assignee: Amazon Technologies, Inc.
    Inventor: Wai Chung Chu
  • Patent number: 10867617
    Abstract: This disclosure describes, in part, techniques for processing audio data. For instance, an electronic device may include an automatic gain controller (AGC) that determines AGC gains for amplifying or attenuating audio data. To determine the AGC gains, the AGC uses information from a residual echo suppressor (RES) and/or a noise reductor (NR). The information may indicate RES gains applied to the audio data by the RES and/or NR gains applied to the audio data by the NR. In some instances, to determine the AGC gain, the AGC determines time-constant parameter(s) using the information. The AGC then uses the time-constant parameter(s) to determine an input signal level for the audio data and/or the AGC gain. In some instances, to determine the AGC gain, the AGC operates in an attack mode or a release mode based on the information.
    Type: Grant
    Filed: December 10, 2018
    Date of Patent: December 15, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Carlos Renato Nakagawa, Carlo Murgia, Wai Chung Chu, Kuan-Chieh Yen
  • Publication number: 20200381002
    Abstract: A system configured to perform directional speech separation. The system may dynamically associate direction-of-arrivals with one or more audio sources in order to generate output audio data that separates each of the audio sources. The system identifies a target direction for each audio source, dynamically determines directions that are correlated with the target direction, and generates output signals for each audio source. The system may associate individual frequency bands with specific directions based on a time delay detected by two or more microphones. The system may determine a cross-correlation between each direction and the target direction and select directions with strong correlation. The system may generate time-frequency mask data indicating frequency bands corresponding to the directions associated with a particular audio source. Using the mask data, the system generates output audio data specific to the audio source, resulting in directional speech separation between different audio sources.
    Type: Application
    Filed: August 21, 2020
    Publication date: December 3, 2020
    Inventor: Wai Chung Chu
  • Patent number: 10755727
    Abstract: A system configured to perform directional speech separation. The system may dynamically associate direction-of-arrivals with one or more audio sources in order to generate output audio data that separates each of the audio sources. The system identifies a target direction for each audio source, dynamically determines directions that are correlated with the target direction, and generates output signals for each audio source. The system may associate individual frequency bands with specific directions based on a time delay detected by two or more microphones. The system may determine a cross-correlation between each direction and the target direction and select directions with strong correlation. The system may generate time-frequency mask data indicating frequency bands corresponding to the directions associated with a particular audio source. Using the mask data, the system generates output audio data specific to the audio source, resulting in directional speech separation between different audio sources.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: August 25, 2020
    Assignee: Amazon Technologies, Inc.
    Inventor: Wai Chung Chu
  • Patent number: 10600432
    Abstract: A system configured to perform power normalization for voice enhancement. The system may identify active intervals corresponding to voice activity and may selectively amplify the active intervals in order to generate output audio data at a near uniform loudness. The system may determine a variable gain for each of the active intervals based on a desired output loudness and a flatness value, which indicates how much a signal envelope is to be modified. For example, a low flatness value corresponds to no modification, with peak active interval values corresponding to the desired output loudness and lower active intervals being lower than the desired output loudness. In contrast, a high flatness value corresponds to extensive modification, with peak active interval values and lower active interval values both corresponding to the desired output loudness. Thus, individual words may share the same peak power level.
    Type: Grant
    Filed: March 28, 2017
    Date of Patent: March 24, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Wai Chung Chu, Carlo Murgia, Hyeong Cheol Kim
  • Patent number: 10115411
    Abstract: A system configured to improve speech quality by performing residual echo suppression (RES). The system may detect when double-talk conditions are present in individual frequency bands during a voice conversation and may determine gain values for the individual frequency bands. The system may determine whether double-talk conditions are present based on a normalized cross power spectral density function in a frequency domain. If double-talk conditions are present in a frequency band or far end energy is low, the system may determine a gain value that passes audio data in the frequency band, whereas if double-talk conditions are not present, the system may determine a gain value that attenuates audio data in the frequency band. The system may determine binary gain values using a decision threshold value or continuous gain values using a mapping function. The system may control an amount of suppression by selecting different mapping functions and/or parameters.
    Type: Grant
    Filed: November 27, 2017
    Date of Patent: October 30, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Wai Chung Chu, Carlo Murgia, Hyeong Cheol Kim
  • Patent number: 9818425
    Abstract: An echo cancellation system that generates multiple output paths, enabling Automatic Speech Recognition (ASR) processing in parallel with voice communication. For single direction AEC (e.g., ASR processing), the system prioritizes speech from a single user and ignores other speech by selecting a single directional output from a plurality of directional outputs as a first output path. For multi-directional AEC (e.g., voice communication), the system includes all speech by combining the plurality of directional outputs as a second output path. The system may use a weighted sum technique, such that each directional output is represented in the combined output based on a corresponding signal metric, or an equal weighting technique, such that a first group of directional outputs having a higher signal metric may be equally weighted using a first weight while a second group of directional outputs having a lower signal metric may be equally weighted using a second weight.
    Type: Grant
    Filed: June 17, 2016
    Date of Patent: November 14, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Robert Ayrapetian, Philip Ryan Hilmes, Wai Chung Chu, Hyeong Cheol Kim, Yuwen Su
  • Patent number: 9753119
    Abstract: A system may utilize sound localization techniques, such as time-difference-of-arrival techniques, to estimate an audio-based sound source position from which a sound originates. An optical image or depth map of an area containing the sound source location may then be captured and analyzed to detect an object that is known or expected to have produced the sound. The position of the object may also be determined based on the analysis of the optical image or depth map. The position of the sound source may then be determined based at least in part on the position of the detected object or on a combination of the audio-based sound source position and the determined position of the object.
    Type: Grant
    Filed: January 29, 2014
    Date of Patent: September 5, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Kavitha Velusamy, Ning Yao, Wai Chung Chu, Sowmya Gopalan, Qiang Liu, Rahul Agrawal, Manika Puri
  • Patent number: 9621984
    Abstract: Devices, systems, and methods provide direction finding of an acoustic signal source with respect to a voice-controlled device. The direction can be found without using elevation data, instead determining the horizontal location based on power values of the received signal. A large number of candidate vectors having values for azimuth, elevation, and power may be generated by a steered response power algorithm. The large number of vectors is reduced to a small number of reference azimuths spanning an azimuth range by associating the vectors with the closest reference azimuth and then calculating an average and/or maximum power of the associated vectors at each reference azimuth. The reference azimuth with the highest average (or maximum) power may be set as the direction of the signal source. Alternatively, each reference azimuth having an average (or maximum) power exceeding a threshold may be considered a direction of one of multiple sources.
    Type: Grant
    Filed: October 14, 2015
    Date of Patent: April 11, 2017
    Assignee: Amazon Technologies, Inc.
    Inventor: Wai Chung Chu
  • Patent number: 9390723
    Abstract: Features are disclosed for performing efficient dereverberation of speech signals captured with single- and multi-channel sensors in networked audio systems. Such features could be used in applications requiring automatic recognition of speech captured with sensors. Dereverberation is performed in the sub-band domain, and hence provides improved dereverberation performance in terms of signal quality, algorithmic delay, computational efficiency, and speed of convergence.
    Type: Grant
    Filed: December 11, 2014
    Date of Patent: July 12, 2016
    Inventors: John Walter McDonough, Jr., Wai Chung Chu, Amit Singh Chhetri, Robert Ayrapetian
  • Patent number: 9363616
    Abstract: A test system for testing directional capabilities of an audio device includes a horizontal linear actuator and a vertical linear actuator. The horizontal linear actuator supports a rotary actuator upon which the audio device is placed. The vertical linear actuator supports a sound source. To test the device, the actuators are controlled to establish multiple relative positions of the audio device and the sound source. A test sound is emitted at each of the relative positions. The audio device is configured to provide data generated in response to the test sound at each of the relative positions. The test system receives and records the data along with coordinates indicating the relative positions to which the data corresponds.
    Type: Grant
    Filed: April 18, 2014
    Date of Patent: June 7, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Wai Chung Chu, Steve Anthony Quento, Robert Gregory Deacon, Colter Earl Cederlof, Steve Gil Gonzalez, Philip Ryan Hilmes
  • Patent number: 9319787
    Abstract: The accuracy and computational efficiency of estimating time difference (or delay) of arrival (TDOA) data are improved for localization of a sound. In one aspect, for each acoustic source event, multiple sets of TDOA data are generated, where each set uses a different sensor or microphone to be the reference. One of the microphones is ultimately selected to be the reference microphone based, in part, on correlation functions of the various sets of TDOA data. The selected reference microphone is then used in sound source localization or other signal processing applications. The direction of the sound source is found using a VMRL finding algorithm as a function of a channel vector containing information of the selected channels, the reference channel and a TDOA vector.
    Type: Grant
    Filed: December 19, 2013
    Date of Patent: April 19, 2016
    Assignee: Amazon Technologies, Inc.
    Inventor: Wai Chung Chu
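
Illustrative sketch (not taken from any of the patents listed above): the abstract of patent 9621984 describes reducing many candidate vectors (azimuth, elevation, power) to a few reference azimuths by assigning each vector to its closest reference azimuth and then comparing average or maximum power. The Python below is one possible reading of that description. The function names, the use of degrees, the circular distance measure, and the sample values are all assumptions made for illustration; the steered response power step that would produce the candidates is not shown.

```python
from collections import defaultdict


def nearest_reference(azimuth_deg, reference_azimuths):
    """Return the reference azimuth closest to azimuth_deg on a 360-degree circle."""
    def circular_distance(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)

    return min(reference_azimuths, key=lambda ref: circular_distance(azimuth_deg, ref))


def score_reference_azimuths(candidates, reference_azimuths, use_max=False):
    """Group candidate (azimuth, elevation, power) vectors by nearest reference
    azimuth and return {reference_azimuth: average or maximum power}.
    Elevation is ignored, matching the abstract's statement that the direction
    is found without using elevation data."""
    grouped = defaultdict(list)
    for azimuth, _elevation, power in candidates:
        grouped[nearest_reference(azimuth, reference_azimuths)].append(power)
    return {
        ref: (max(powers) if use_max else sum(powers) / len(powers))
        for ref, powers in grouped.items()
    }


def estimate_direction(candidates, reference_azimuths, use_max=False):
    """Single-source case: the reference azimuth with the highest score."""
    scores = score_reference_azimuths(candidates, reference_azimuths, use_max)
    return max(scores, key=scores.get)


def directions_above_threshold(candidates, reference_azimuths, threshold, use_max=False):
    """Multi-source case: every reference azimuth whose score exceeds threshold."""
    scores = score_reference_azimuths(candidates, reference_azimuths, use_max)
    return sorted(ref for ref, score in scores.items() if score > threshold)


# Hypothetical candidate vectors: (azimuth in degrees, elevation in degrees, power).
candidates = [(10, 5, 1.0), (20, 30, 3.0), (95, 0, 5.0), (350, 10, 2.0)]
reference_azimuths = [0, 90, 180, 270]

best = estimate_direction(candidates, reference_azimuths)
sources = directions_above_threshold(candidates, reference_azimuths, threshold=1.5)
```

In this sketch a reference azimuth that receives no candidate vectors is simply absent from the scores, so it can never be selected; a real system might treat empty bins differently.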