Patents by Inventor Xuejing Sun

Xuejing Sun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20170127089
    Abstract: A method of processing data comprises processing first frequency-domain audio or video data using signal processing of a first type, and transforming the processed first frequency-domain audio or video data to processed time-domain audio or video data using a transform which is the inverse of the first transform, and transforming the processed time-domain audio or video data using a second transform which is matched to a second type of signal processing. The method further comprises identifying time-domain audio or video data for which signal processing of the first type, after transformation using the second transform, would yield satisfactory results. The method further comprises transforming the identified time-domain audio or video data to frequency-domain identified audio or video data using the second transform, instead of using the first transform, and processing the identified frequency-domain audio or video data using signal processing of the first type.
    Type: Application
    Filed: September 27, 2016
    Publication date: May 4, 2017
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Dong Shi, Xuejing Sun, Glenn N. Dickins, David Gunawan
  • Publication number: 20170118142
    Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
    Type: Application
    Filed: January 4, 2017
    Publication date: April 27, 2017
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Glenn N. DICKINS, Xuejing SUN, Brendon COSTA
  • Patent number: 9626970
    Abstract: Embodiments of the present invention relate to speaker identification using spatial information. A method of speaker identification for audio content being of a format based on multiple channels is disclosed. The method comprises extracting, from a first audio clip in the format, a plurality of spatial acoustic features across the multiple channels and location information, the first audio clip containing voices from a speaker, and constructing a first model for the speaker based on the spatial acoustic features and the location information, the first model indicating a characteristic of the voices from the speaker. The method further comprises identifying whether the audio content contains voices from the speaker based on the first model. Corresponding system and computer program product are also disclosed.
    Type: Grant
    Filed: December 16, 2015
    Date of Patent: April 18, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Shen Huang, Xuejing Sun
  • Publication number: 20170103761
    Abstract: A method of encoding audio information for forward error correction reconstruction of a transmitted audio stream over a lossy packet switched network, the method including the steps of: (a) dividing the audio stream into audio frames; (b) determining a series of corresponding audio frequency bands for the audio frames; (c) determining a series of power envelopes for the frequency bands; (d) encoding the envelopes as a low bit rate version of the audio frame in a redundant transmission frame.
    Type: Application
    Filed: October 7, 2016
    Publication date: April 13, 2017
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Xuejing Sun, Kai Li, Mark S. Vinton, Shen Huang
  • Publication number: 20170104552
    Abstract: A method of determining a near optimal forward error correction scheme for the transmission of audio data over a lossy packet switched network having preallocated estimated bandwidth, delay and packet losses, between at least a first and second communications devices, the method including the steps of: determining a first coding rate for the audio data; determining a peak redundancy coding rate for redundant versions of the audio data; determining an average redundancy coding rate over a period of time for redundant versions of the audio data; determining an objective function which maximizes a bitrate-perceptual audio quality mapping of the transmitted audio data including a playout function formulation; and optimising the objective function to produce a forward error correction scheme providing a high bitrate perceptual audio quality.
    Type: Application
    Filed: October 7, 2016
    Publication date: April 13, 2017
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Xuejing Sun, Dong Shi
  • Patent number: 9602943
    Abstract: An audio processing method and apparatus are described. In one embodiment, at least one first sub-band of a first audio signal is suppressed to obtain a reduced first audio signal with reserved sub-bands; at least one second sub-band of at least one second audio signal is suppressed to obtain at least one reduced second audio signal with reserved sub-bands; and the reduced first audio signal and the at least one reduced second audio signal are mixed. Alternatively, a first spatial auditory property is assigned to a first audio signal so that the first audio signal may be perceived as originating from a first position. Alternatively, rhythmic similarity between at least two audio signals is detected, and time scaling is applied to an audio signal in response to relatively high rhythmic similarity between the audio signal and the other audio signal(s); and then at least two audio signals are mixed.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: March 21, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Huiqun Deng, Xuejing Sun
  • Patent number: 9584087
    Abstract: A method, an apparatus, and logic to post-process raw gains determined by input processing to generate post-processed gains, comprising using one or both of delta gain smoothing and decision-directed gain smoothing. The delta gain smoothing comprises applying a smoothing filter to the raw gain with a smoothing factor that depends on the gain delta: the absolute value of the difference between the raw gain for the current frame and the post-processed gain for a previous frame. The decision-directed gain smoothing comprises converting the raw gain to a signal-to-noise ratio, applying a smoothing filter with a smoothing factor to the signal-to-noise ratio to calculate a smoothed signal-to-noise ratio, and converting the smoothed signal-to-noise ratio to determine the second smoothed gain, with smoothing factor possibly dependent on the gain delta.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: February 28, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Xuejing Sun, Glenn N. Dickins
  • Patent number: 9571425
    Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: February 14, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
  • Publication number: 20170034026
    Abstract: Some implementations involve analyzing audio packets received during a time interval that corresponds with a conversation analysis segment to determine network jitter dynamics data and conversational interactivity data. The network jitter dynamics data may provide an indication of jitter in a network that relays the audio data packets. The conversational interactivity data may provide an indication of interactivity between participants of a conversation represented by the audio data. A jitter buffer size may be controlled according to the network jitter dynamics data and the conversational interactivity data. The time interval may include a plurality of talkspurts.
    Type: Application
    Filed: April 9, 2015
    Publication date: February 2, 2017
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Kai LI, Xuejing SUN, Gary SPITTLE
  • Patent number: 9558744
    Abstract: An audio processing apparatus and an audio processing method are described. In one embodiment, the audio processing apparatus includes an audio masker separator for separating from a first audio signal an audio material comprising a sound other than stationary noise and utterance meaningful in semantics, as an audio masker candidate. The apparatus also includes a first context analyzer for obtaining statistics regarding contextual information of detected audio masker candidates, and a masker library builder for building a masker library or updating an existing masker library by adding, based on the statistics, at least one audio masker candidate as an audio masker into the masker library, wherein audio maskers in the masker library are inserted into a target position in a second audio signal to conceal defects in the second audio signal.
    Type: Grant
    Filed: November 27, 2013
    Date of Patent: January 31, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Xuejing Sun, Shen Huang, Poppy Crum, Hannes Muesch, Glenn N. Dickins, Michael Eckert
  • Publication number: 20170026298
    Abstract: Some implementations involve controlling a jitter buffer size during a teleconference according to a jitter buffer size estimation algorithm based, at least in part, on a cumulative distribution function (CDF). The CDF may be based, at least in part, on a network jitter parameter. The CDF may be initialized according to a parametric model. At least one parameter of the parametric model may be based, at least in part, on legacy network jitter information.
    Type: Application
    Filed: April 8, 2015
    Publication date: January 26, 2017
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: JiaQuan HUO, Xuejing SUN, Kai LI
  • Patent number: 9552827
    Abstract: A method (800) for determining an estimate (215, 261) of an echo path property of an electronic device (200, 250, 300, 600). The electronic device is configured to render a total audio signal using a loudspeaker (102), and the electronic device is configured to record an echo of the rendered audio signal using a microphone (103), thereby yielding a recorded audio signal (112). The method comprises inserting (801), in an inaudible manner, an auxiliary audio signal (212) into the total audio signal to be rendered; wherein the auxiliary audio signal (212) comprises a tonal audio signal at a first frequency; isolating (803) the echo of the auxiliary audio signal (212) from the recorded audio signal (112); and determining (804) the estimate (215, 261) of the echo path property based on the inserted auxiliary audio signal (212) and based on the isolated echo of the auxiliary audio signal (212).
    Type: Grant
    Filed: June 27, 2013
    Date of Patent: January 24, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Craig Johnston, Dong Shi, Xuejing Sun, Glenn N. Dickins
  • Patent number: 9548063
    Abstract: Embodiments of method and apparatus for acoustic echo control are described. According to the method, an echo energy-based doubletalk detection is performed to determine whether there is a doubletalk in a microphone signal with reference to a loudspeaker signal. A spectral similarity between spectra of the microphone signal and the loudspeaker signal is calculated. It is determined that there is no doubletalk in the microphone signal if the spectral similarity is higher than a threshold level. Adaption of an adaptive filter for applying acoustic echo cancellation or acoustic echo suppression on the microphone signal is enabled if it is determined that there is no doubletalk in the microphone signal through the echo energy-based doubletalk detection, or there is no doubletalk through the spectral similarity-based doubletalk detection.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: January 17, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Dong Shi, JiaQuan Huo, Xuejing Sun, Glenn N. Dickins
  • Patent number: 9525845
    Abstract: Embodiments of client device and method for audio or video conferencing are described. An embodiment includes an offset detecting unit, a configuring unit, an estimator and an output unit. The offset detecting unit detects an offset of speech input to the client device. The configuring unit determines a voice latency from the client device to every far end. The estimator estimates a time when a user at the far end perceives the offset based on the voice latency. The output unit outputs a perceivable signal indicating that a user at the far end perceives the offset based on the time estimated for the far end. The perceivable signal is helpful to avoid collision between parties.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: December 20, 2016
    Assignees: Dolby Laboratories Licensing Corporation, Dolby International AB
    Inventors: Dong Shi, Xuejing Sun, Kai Li, Shen Huang, Harald Mundt, Heiko Purnhagen, Glenn Dickins
  • Publication number: 20160359943
    Abstract: A service request for communication services for communication clients is received. In response, a communication service network is set up to support the communication services. Routing metadata is generated for each of the communication clients. The routing metadata is to be used by each of the communication clients for sharing service quality information with a respective peer communication client over a light-weight peer-to-peer (P2P) network. The routing metadata is downloaded to each of the communication clients. A communication client may exchange service signaling packets or service data packets over the communication service network. When the communication client determines that there is a problematic region in a bitstream received from the communication server, the communication client can request a peer communication client for a service quality information portion related to the problematic region.
    Type: Application
    Filed: June 1, 2016
    Publication date: December 8, 2016
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Shen HUANG, Doh-Suk KIM, Xuejing SUN
  • Patent number: 9514755
    Abstract: The present document relates to audio signal processing in general, and to the concealment of artifacts that result from loss of audio packets during audio transmission over a packet-switched network, in particular. A method (200) for concealing one or more consecutive lost packets is described. A lost packet is a packet which is deemed to be lost by a transform-based audio decoder. Each of the one or more lost packets comprises a set of transform coefficients. A set of transform coefficients is used by the transform-based audio decoder to generate a corresponding frame of a time domain audio signal. The method (200) comprises determining (205) for a current lost packet of the one or more lost packets a number of preceding lost packets from the one or more lost packets; wherein the determined number is referred to as a loss position.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: December 6, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Shen Huang, Xuejing Sun
  • Publication number: 20160337510
    Abstract: In a conference call having a plurality of participants interacting in a conference exchange of information in a digital transmission environment, the interaction being across a variable network transmission resource, a method of allocating the level of transmission resource, the method including the steps of: (a) monitoring predetermined aspects of the participants' behavior during the conference call; (b) determining a divergence of participants' behavior from normative values; (c) utilising any divergence as an indicator of aberrant operation of the participants; and (d) allocating the resource determinative on the divergence of participants' behavior from normative values.
    Type: Application
    Filed: January 6, 2015
    Publication date: November 17, 2016
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Kai LI, Glenn N. DICKINS, Xuejing SUN
  • Publication number: 20160180852
    Abstract: Embodiments of the present invention relate to speaker identification using spatial information. A method of speaker identification for audio content being of a format based on multiple channels is disclosed. The method comprises extracting, from a first audio clip in the format, a plurality of spatial acoustic features across the multiple channels and location information, the first audio clip containing voices from a speaker, and constructing a first model for the speaker based on the spatial acoustic features and the location information, the first model indicating a characteristic of the voices from the speaker. The method further comprises identifying whether the audio content contains voices from the speaker based on the first model. Corresponding system and computer program product are also disclosed.
    Type: Application
    Filed: December 16, 2015
    Publication date: June 23, 2016
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Shen HUANG, Xuejing SUN
  • Patent number: 9373343
    Abstract: An audio signal with a temporal sequence of blocks or frames is received or accessed. Features are determined as characterizing aggregately the sequential audio blocks/frames that have been processed recently, relative to current time. The feature determination exceeds a specificity criterion and is delayed, relative to the recently processed audio blocks/frames. Voice activity indication is detected in the audio signal. VAD is based on a decision that exceeds a preset sensitivity threshold and is computed over a brief time period, relative to blocks/frames duration, and relates to current block/frame features. The VAD and the recent feature determination are combined with state related information, which is based on a history of previous feature determinations that are compiled from multiple features, determined over a time prior to the recent feature determination time period. Decisions to commence or terminate the audio signal, or related gains, are outputted based on the combination.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: June 21, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Glenn N. Dickins, Zhiwei Shuang, David Gunawan, Xuejing Sun
  • Publication number: 20160148618
    Abstract: The present application relates to packet loss concealment apparatus and method, and audio processing system. According to an embodiment, the packet loss concealment apparatus is provided for concealing packet losses in a stream of audio packets, each audio packet comprising at least one audio frame in transmission format comprising at least one monaural component and at least one spatial component. The packet loss concealment apparatus may comprise a first concealment unit for creating the at least one monaural component for a lost frame in a lost packet and a second concealment unit for creating the at least one spatial component for the lost frame. According to the embodiment, spatial artifacts such as incorrect angle and diffuseness may be avoided as far as possible in PLC for multi-channel spatial or sound field encoded audio signals.
    Type: Application
    Filed: July 2, 2014
    Publication date: May 26, 2016
    Applicants: Dolby Laboratories Licensing Corporation, DOLBY INTERNATIONAL AB
    Inventors: Shen HUANG, Xuejing SUN, Heiko PURNHAGEN
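
Illustrative note on patent 9584087 (delta gain smoothing). The abstract describes smoothing a raw gain with a factor that depends on the gain delta, the absolute difference between the current frame's raw gain and the previous frame's post-processed gain. The sketch below is one possible reading, not the patented implementation: the first-order recursive filter, the linear mapping from delta to smoothing factor, and the names `delta_gain_smooth`, `base`, and `slope` are all assumptions made for illustration.

```python
def delta_gain_smooth(raw_gains, base=0.2, slope=1.0):
    """Smooth per-frame gains with a delta-dependent factor.

    Assumed form (not taken from the patent text):
        delta = |raw[t] - out[t-1]|
        alpha = min(1, base + slope * delta)
        out[t] = out[t-1] + alpha * (raw[t] - out[t-1])
    """
    smoothed = []
    prev = None
    for g in raw_gains:
        if prev is None:
            # First frame: nothing to smooth against, pass the raw gain through.
            prev = g
        else:
            delta = abs(g - prev)                   # the "gain delta"
            alpha = min(1.0, base + slope * delta)  # larger jumps are tracked faster
            prev = prev + alpha * (g - prev)
        smoothed.append(prev)
    return smoothed


if __name__ == "__main__":
    # A step from 0 to 1 followed by small fluctuations around 1.
    print(delta_gain_smooth([0.0, 1.0, 0.95, 1.0, 0.9]))
```

Under these assumptions, a large jump in the raw gain is followed almost immediately, while small frame-to-frame fluctuations are smoothed more heavily. Setting `slope=0.0` reduces the sketch to a fixed-factor smoother.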