Correlation Function Patents (Class 704/216)
  • Patent number: 10937449
    Abstract: An apparatus for determining a pitch information on the basis of an audio signal. The apparatus is configured to obtain a similarity value being associated with a given pair of portions of the audio signal having a given time shift, wherein the apparatus is configured to choose a length of signal portions of the audio signal used to obtain the similarity value for the given time shift in dependence on the given time shift and where the apparatus is configured to choose the length of the signal portions to be linearly dependent on the given time shift, within a tolerance of ±1 sample.
    Type: Grant
    Filed: April 4, 2019
    Date of Patent: March 2, 2021
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jérémie Lecomte, Adrian Tomasek
  • Patent number: 10878801
    Abstract: A speech synthesis device of an embodiment includes a memory unit, a creating unit, a deciding unit, a generating unit and a waveform generating unit. The memory unit stores, as statistical model information of a statistical model, an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution. The creating unit creates a statistical model sequence from context information and the statistical model information. The deciding unit decides a pitch-cycle waveform count of each state using a duration based on the duration distribution of each state of each statistical model in the statistical model sequence, and pitch information based on the output distribution of the pitch feature parameters. The generating unit generates an output distribution sequence based on the pitch-cycle waveform count, and acoustic feature parameters based on the output distribution sequence.
    Type: Grant
    Filed: February 14, 2018
    Date of Patent: December 29, 2020
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune Tamura, Masahiro Morita
  • Patent number: 10796684
    Abstract: Audio data describing an audio signal may be received and used to determine a set of frames of the audio signal. One or more potential music events may be determined in the audio signal using a spectral analysis of the set of frames. The audio signal may be analyzed for one or more potential noise or tone events. One or more music states of the audio signal may be determined based on the one or more potential music events and a presence or absence of the one or more noise or tone events. Audio enhancement of the audio signal may be modified based on the one or more determined states of the audio signal.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: October 6, 2020
    Assignee: DIALPAD, INC.
    Inventors: Qian-Yu Tang, John Rector
  • Patent number: 10667056
    Abstract: A low power, digital audio interface includes support for variable length coding depending on content of the audio data sent from the interface. A particularized coding system is implemented that uses techniques of silence detection, dynamic scaling, and periodic encoding to reduce sent data to a minimum. Other techniques include variable packet scaling based on an audio sample rate. Differential signaling techniques are also used. The digital audio interface may be used in a headphone interface to drive digital headphones. A detector in the interface may detect whether digital or analog headphones are coupled to a headphone jack and drive the headphone jack accordingly.
    Type: Grant
    Filed: May 18, 2017
    Date of Patent: May 26, 2020
    Assignee: AVNERA CORPORATION
    Inventors: Chris O'Connor, Xudong Zhao
  • Patent number: 10643633
    Abstract: An observation feature value vector is calculated based on observation signals recorded at different positions in a situation in which target sound sources and background noise are present in a mixed manner; masks associated with the target sound sources and a mask associated with the background noise are estimated; a spatial correlation matrix of the target sound sources that includes the background noise is calculated based on the masks associated with the observation signals and the target sound sources; a spatial correlation matrix of the background noise is calculated based on the masks associated with the observation signals and the background noise; and a spatial correlation matrix of the target sound sources is estimated based on the matrix obtained by weighting each of the spatial correlation matrices by predetermined coefficients.
    Type: Grant
    Filed: December 1, 2016
    Date of Patent: May 5, 2020
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Tomohiro Nakatani, Nobutaka Ito, Takuya Higuchi, Shoko Araki, Takuya Yoshioka
  • Patent number: 10629194
    Abstract: The present disclosure provides a speech recognition method and device based on artificial intelligence. The method includes: collecting signals of an array of microphones to obtain a plurality of first speech signals; filtering out a reverberation signal in each first speech signal to obtain a plurality of second speech signals, and obtaining a third speech signal based on the plurality of second speech signals; performing noise extraction on each first speech signal based on the third speech signal to obtain a plurality of first noise signals; and filtering and adding the plurality of first noise signals to obtain a second noise signal, and subtracting the second noise signal from the third speech signal to obtain a target speech signal.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: April 21, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventor: Hui Song
  • Patent number: 10607617
    Abstract: A coding apparatus, including a memory and a processor that, when executing instructions stored in the memory, performs operations including encoding low-band transform coefficients in a first band and calculating, for each extension-band subband obtained by splitting an extension band, a threshold amplitude based on an analysis of statistics on extension-band transform coefficients included in the subband.
    Type: Grant
    Filed: November 19, 2018
    Date of Patent: March 31, 2020
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Takuya Kawashima, Masahiro Oshikiri
  • Patent number: 10585941
    Abstract: An audio file is transformed into a Gabor spectrogram. This is used to compare the audio file to a database of audio files, each represented as a Gabor spectrogram. Before two spectrograms are compared, they are aligned. The spectrograms are broken into blocks and individual Gabor vectors in the blocks are compared. Similarities are stored and an aggregate similarity value is derived for the block. After a series of such comparisons and shifting of the secondary spectrogram block, essentially a running window, an offset value is determined. This offset is used to align the two spectrograms, at which stage the spectrograms can be compared in a more effective and meaningful manner. A set of observables is derived from the comparisons and the primary spectrogram is classified in a way suitable for the application environment.
    Type: Grant
    Filed: July 30, 2014
    Date of Patent: March 10, 2020
    Assignee: ACE METRIX, INC.
    Inventor: Douglas C. Garrett
  • Patent number: 10516956
    Abstract: A failure detection device for detecting a failure of a sound generating device outputting a sound based on sound data from a speaker includes: an electronic watermark signal generating unit configured to generate an electronic watermark signal including collation data used for collation of whether or not a sound is output from the speaker; the speaker configured to output the electronic watermark signal as a sound; a microphone configured to collect the sound output from the speaker; a collation data detection unit configured to detect the collation data from the electronic watermark signal included in the sound collected by the microphone; and a failure determination unit configured to determine the presence or absence of the failure of the sound generating device by collating the collation data detected by the collation data detection unit with the collation data included in the electronic watermark signal.
    Type: Grant
    Filed: April 17, 2019
    Date of Patent: December 24, 2019
    Assignee: ALPINE ELECTRONICS, INC.
    Inventors: Taku Sugai, Nozomu Saito, Jyoji Yamada, Isamu Takaku
  • Patent number: 10440431
    Abstract: Systems and methods are provided for facilitating automated video scripting. Video frames may be analyzed to determine scores indicative of the association between a characteristic of the video frame and an attribute of a theme associated with a particular person. Then, video frames with particular scores can be added together to automatically create a video script. Neural networks can be used to determine the scores. The neural network may also be trained using training data, and updated based on the interaction of a person with a video script.
    Type: Grant
    Filed: November 28, 2016
    Date of Patent: October 8, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Prakash Bulusu, Pragyana K. Mishra
  • Patent number: 10424321
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for analyzing an audio sample to determine whether the audio sample includes music audio data. One or more detectors, including a spectral fluctuation detector, a peak repetition detector, and a beat pitch detector, may analyze the audio sample and generate a score that represents whether the audio sample includes music audio data. One or more of the scores may be combined to determine whether the audio sample includes music audio data or non-music audio data.
    Type: Grant
    Filed: July 1, 2013
    Date of Patent: September 24, 2019
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Dominik Roblek
  • Patent number: 10283113
    Abstract: The disclosure concerns a method for recognizing driving noise in a sound signal that is acquired by a microphone disposed in a vehicle. The sound signal originates from the surface structure of the road. According to the disclosure, a segment of the road lying ahead of the vehicle in the direction of travel is observed with a sensor installed in or on the vehicle. Using the observation data obtained, the start and duration of driving noise originating from the surface structure of the road are predicted.
    Type: Grant
    Filed: December 29, 2016
    Date of Patent: May 7, 2019
    Assignee: FORD GLOBAL TECHNOLOGIES, LLC
    Inventors: Christoph Arndt, Mohsen Lakehal-Ayat
  • Patent number: 10284712
    Abstract: In a voice quality evaluation method, apparatus, and system, an obtained voice data packet is parsed, and a frame content characteristic of the data packet is determined according to a parse result; for example, the frame content characteristic is a silence frame or a voice frame. Then, a voice sequence is divided into statements according to the determined frame content characteristic, and the statements are divided into multiple frame loss events; after non-voice parameters are extracted according to the frame loss events, voice quality of each statement is evaluated according to a preset voice quality evaluation model and according to the non-voice parameters. Finally, voice quality of the entire voice sequence is evaluated according to the voice quality of each statement. By using this solution, prediction precision can be improved significantly, and accuracy of an evaluation result can be improved.
    Type: Grant
    Filed: August 26, 2016
    Date of Patent: May 7, 2019
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Fuzheng Yang, Xuemin Li, Liangliang Jiang, Wei Xiao
  • Patent number: 10262671
    Abstract: An audio coding method and a related apparatus are disclosed. The audio coding method includes: estimating reference linear prediction efficiency of a current audio frame; determining an audio coding scheme that matches the reference linear prediction efficiency of the foregoing current audio frame; and performing audio coding on the foregoing current audio frame according to the audio coding scheme that matches the reference linear prediction efficiency of the foregoing current audio frame. The technical solutions provided in embodiments of the present disclosure help reduce overheads of audio coding.
    Type: Grant
    Filed: October 28, 2016
    Date of Patent: April 16, 2019
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Zhe Wang
  • Patent number: 10216726
    Abstract: An apparatus for determining a translation word includes a word vector generator configured to generate a word vector corresponding to an input word of a first language with reference to a first word vector space that is related to the first language, a word vector determiner configured to determine a word vector of a second language, wherein the determined word vector of the second language corresponds to the generated word vector, using a matching model, and a translation word selector configured to select a translation word of the second language, wherein the selected translation word corresponds to the input word of the first language, based on the determined word vector of the second language.
    Type: Grant
    Filed: June 21, 2016
    Date of Patent: February 26, 2019
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ho Dong Lee, Sang Hyun Yoo
  • Patent number: 10134410
    Abstract: A coding apparatus, including a memory and a processor that, when executing instructions stored in the memory, performs operations including encoding low-band transform coefficients in a first band and calculating, for each extension-band subband obtained by splitting an extension band, a threshold amplitude based on an analysis of statistics on extension-band transform coefficients included in the subband.
    Type: Grant
    Filed: September 13, 2016
    Date of Patent: November 20, 2018
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Takuya Kawashima, Masahiro Oshikiri
  • Patent number: 10134419
    Abstract: An autocorrelation calculating part calculates autocorrelation Ro(i) from an input signal. A predictive coefficient calculating part performs linear predictive analysis using modified autocorrelation R'o(i) obtained by multiplying the autocorrelation Ro(i) by a coefficient wo(i). Here, it is assumed that a case where, for at least part of each order i, the coefficient wo(i) corresponding to each order i monotonically increases as a value having negative correlation with a fundamental frequency of an input signal in a current frame or a past frame increases and a case where the coefficient wo(i) monotonically decreases as a value having positive correlation with a pitch gain in a current frame or a past frame increases, are included.
    Type: Grant
    Filed: February 6, 2018
    Date of Patent: November 20, 2018
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Yutaka Kamamoto, Takehiro Moriya, Noboru Harada
  • Patent number: 9741358
    Abstract: A method of interference suppression is provided that includes receiving a first audio signal from a first audio capture device and a second audio signal from a second audio capture device wherein the first audio signal includes a first combination of desired audio content and interference and the second audio signal includes a second combination of the desired audio content and the interference, performing blind source separation using the first audio signal and the second audio signal to generate an output interference signal and an output audio signal including the desired audio content with the interference suppressed, estimating interference remaining in the output audio signal using the output interference signal, and subtracting the estimated interference from the output audio signal to generate a final output audio signal with the interference further suppressed.
    Type: Grant
    Filed: June 10, 2014
    Date of Patent: August 22, 2017
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Devangi N Parikh, Muhammad Zubair Ikram
  • Patent number: 9645994
    Abstract: The technical solution under the present disclosure automatically analyzes conversations between users by receiving a training dataset having a text sequence including sentences of a conversation between the users; extracting feature(s) from the training dataset based on features; providing equation(s) for a plurality of tasks, the equation(s) being a mathematical function for calculating value of a parameter for each of the tasks based on the extracted feature; determining value of the parameter for tasks by processing the equation(s); assigning label(s) to each of the sentences based on the determined value of the parameter, a first label being selected from a plurality of first labels, and a second label being selected from a number of second labels; and storing and maintaining with the database a pre-defined value of the parameter, first labels, conversations, second labels, a test dataset, equation(s), and pre-defined features.
    Type: Grant
    Filed: December 9, 2014
    Date of Patent: May 9, 2017
    Assignee: Conduent Business Services, LLC
    Inventors: Arvind Agarwal, Saurabh Kataria, Tong Sun, Sumit Bhatia
  • Patent number: 9373324
    Abstract: Systems and methods for applying feature-space maximum likelihood linear regression (fMLLR) to correlated features are provided. A method for applying fMLLR to correlated features, comprises mapping the correlated features into an uncorrelated feature space, applying fMLLR in the uncorrelated feature space to obtain fMLLR transformed features, and mapping the fMLLR transformed features back to a correlated feature space.
    Type: Grant
    Filed: June 25, 2014
    Date of Patent: June 21, 2016
    Assignee: International Business Machines Corporation
    Inventors: Tara N. Sainath, George A. Saon
  • Patent number: 9236062
    Abstract: A signal manipulator for manipulating an audio signal having a transient event may have a transient remover, a signal processor and a signal inserter for inserting a time portion in a processed audio signal at a signal location where the transient event was removed before processing by the transient remover, so that a manipulated audio signal has a transient event not influenced by the processing, whereby the vertical coherence of the transient event is maintained instead of any processing performed in the signal processor, which would destroy the vertical coherence of a transient.
    Type: Grant
    Filed: May 7, 2012
    Date of Patent: January 12, 2016
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Frederik Nagel, Nikolaus Rettelbach, Markus Multrus, Guillaume Fuchs
  • Patent number: 9047865
    Abstract: A system and method for processing of audio and speech signals is disclosed, which provide compatibility over a range of communication devices operating at different sampling frequencies and/or bit rates. The analyzer of the system divides the input signal in different portions, at least one of which carries information sufficient to provide intelligible reconstruction of the input signal. The analyzer also encodes separate information about other portions of the signal in an embedded manner, so that a smooth transition can be achieved from low bit-rate to high bit-rate applications. Accordingly, communication devices operating at different sampling rates and/or bit-rates can extract corresponding information from the output bit stream of the analyzer. In the present invention embedded information generally relates to separate parameters of the input signal, or to additional resolution in the transmission of original signal parameters.
    Type: Grant
    Filed: August 10, 2007
    Date of Patent: June 2, 2015
    Assignee: Alcatel Lucent
    Inventors: Joseph Gerard Aguilar, David A. Campana, Juin-Hwey Chen, Robert B. Dunn, Robert J. McAulay, Xiaoquin Sun, Wei Wang, Craig Watkins, Robert W. Zopf
  • Patent number: 9048869
    Abstract: An encoder and decoder using LDPC-CC which avoid lowering the transmission efficiency of information while not deteriorating error correction performance, even at termination; and an encoding method of the same. A termination sequence length determining unit determines the sequence length of a termination sequence transmitted added to the end of an information sequence, according to the information length (information size) and encoding rate of the information sequence. A parity calculation unit carries out LDPC-CC coding on the information sequence and the known-information sequence necessary for generating a termination sequence of the determined termination sequence length, and calculates a parity sequence.
    Type: Grant
    Filed: June 26, 2014
    Date of Patent: June 2, 2015
    Assignee: Panasonic Corporation
    Inventors: Yutaka Murakami, Hisao Koga, Nobutaka Kodama
  • Patent number: 9043216
    Abstract: An audio signal decoder has a time warp contour calculator, a time warp contour data rescaler and a warp decoder. The time warp contour calculator is configured to generate time warp contour data repeatedly restarting from a predetermined time warp contour start value, based on time warp contour evolution information describing a temporal evolution of the time warp contour. The time warp contour data rescaler is configured to rescale at least a portion of the time warp contour data such that a discontinuity at a restart is avoided, reduced or eliminated in a rescaled version of the time warp contour. The warp decoder is configured to provide the decoded audio signal representation, based on an encoded audio signal representation and using the rescaled version of the time warp contour.
    Type: Grant
    Filed: July 1, 2009
    Date of Patent: May 26, 2015
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Stefan Bayer, Sascha Disch, Ralf Geiger, Guillaume Fuchs, Max Neuendorf, Gerald Schuller, Bernd Edler
  • Patent number: 9026435
    Abstract: The invention provides a method for estimating a fundamental frequency of a speech signal comprising the steps of receiving a signal spectrum of the speech signal, filtering the signal spectrum to obtain a refined signal spectrum, determining a cross-power spectral density using the refined signal spectrum and the signal spectrum, transforming the cross-power spectral density into the time domain to obtain a cross-correlation function, and estimating the fundamental frequency of the speech signal based on the cross-correlation function.
    Type: Grant
    Filed: May 3, 2010
    Date of Patent: May 5, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Mohamed Krini, Gerhard Schmidt
  • Patent number: 9026437
    Abstract: A location determination system includes a first mobile terminal and a second mobile terminal. The first mobile terminal includes a first processor to acquire a first sound signal, analyze the first sound signal to obtain a first analysis result, and transmit the first analysis result. The second mobile terminal includes a second processor to acquire a second sound signal, analyze the second sound signal to obtain a second analysis result, receive the first analysis result from the first mobile terminal, compare the second analysis result with the first analysis result to obtain a comparison result, and determine whether the first mobile terminal is located in an area in which the second mobile terminal is located, based on the comparison result.
    Type: Grant
    Filed: March 26, 2012
    Date of Patent: May 5, 2015
    Assignee: Fujitsu Limited
    Inventor: Eiji Hasegawa
  • Patent number: 9015039
    Abstract: System and method embodiments for dual modes pitch coding are provided. The system and method embodiments are configured to adaptively code pitch lags of a voiced speech signal using one of two pitch coding modes according to a pitch length, stability, or both. The two pitch coding modes include a first pitch coding mode with relatively high precision and reduced dynamic range, and a second pitch coding mode with relatively large dynamic range and reduced precision. The first pitch coding mode is used upon determining that the voiced speech signal has a relatively short or substantially stable pitch. The second pitch coding mode is used upon determining that the voiced speech signal has a relatively long or less stable pitch or is a substantially noisy signal.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: April 21, 2015
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
  • Patent number: 8996389
    Abstract: Various techniques are disclosed for reducing artifacts generated by time compression by adapting the time compression based on the state of the received audio. The amount of time compression may be bounded based on audio characteristics. Another feature provides a way of determining the most correlated portions of segments of audio. Voiced speech may be distinguished from unvoiced speech. Another feature provides a way of distinguishing between silence, voiced speech, and unvoiced speech. Time compression may be adapted during periods of lengthy silence. Another feature allows for reducing time compression during sensitive portions of the received audio. One or more of these features may be present in different embodiments.
    Type: Grant
    Filed: June 14, 2011
    Date of Patent: March 31, 2015
    Assignee: Polycom, Inc.
    Inventor: Eric David Elias
  • Patent number: 8990094
    Abstract: An electronic device for coding a transient frame is described. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor. The electronic device obtains a current transient frame. The electronic device also obtains a residual signal based on the current transient frame. Additionally, the electronic device determines a set of peak locations based on the residual signal. The electronic device further determines whether to use a first coding mode or a second coding mode for coding the current transient frame based on at least the set of peak locations. The electronic device also synthesizes an excitation based on the first coding mode if the first coding mode is determined. The electronic device also synthesizes an excitation based on the second coding mode if the second coding mode is determined.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: March 24, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipalai Kandhadai
  • Patent number: 8983830
    Abstract: An encoding device can achieve both highly effective encoding/decoding and high-quality decoded audio when executing scalable stereo audio encoding by using MDCT and ICP. In the encoding device, an MDCT converter executes an MDCT conversion on a residual signal of left channel/right channel subjected to window processing. An MDCT converter executes an MDCT conversion on the monaural residual signal which has been subjected to the window processing. An ICP analyzer executes an ICP analysis by using the correlation between a frequency coefficient of a high-band portion of the left channel/right channel and a frequency coefficient of a high-band portion of the monaural residual signal so as to generate an ICP parameter of the left channel/right channel residual signal. An ICP parameter quantizer quantizes each of the ICP parameters. A low-band encoding unit executes highly-accurate encoding on the frequency coefficient of the low-band portion of the left channel/right channel residual signal.
    Type: Grant
    Filed: March 28, 2008
    Date of Patent: March 17, 2015
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventors: Jiong Zhou, Kok Seng Chong, Koji Yoshida
  • Patent number: 8954324
    Abstract: Voice activity detection using multiple microphones can be based on a relationship between an energy at each of a speech reference microphone and a noise reference microphone. The energy output from each of the speech reference microphone and the noise reference microphone can be determined. A speech to noise energy ratio can be determined and compared to a predetermined voice activity threshold. In another embodiment, the absolute values of the autocorrelations of the speech and noise reference signals are determined and a ratio based on the autocorrelation values is determined. Ratios that exceed the predetermined threshold can indicate the presence of a voice signal. The speech and noise energies or autocorrelations can be determined using a weighted average or over a discrete frame size.
    Type: Grant
    Filed: September 28, 2007
    Date of Patent: February 10, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Song Wang, Samir Kumar Gupta, Eddie L. T. Choy
  • Patent number: 8935164
    Abstract: A non-spatial speech detection system includes a plurality of microphones whose output is supplied to a fixed beamformer. An adaptive beamformer is used for receiving the output of the plurality of microphones and one or more processors are used for processing an output from the fixed beamformer and identifying speech from noise through the use of an algorithm utilizing a covariance matrix.
    Type: Grant
    Filed: May 2, 2012
    Date of Patent: January 13, 2015
    Assignee: Gentex Corporation
    Inventors: Robert R. Turnbull, Michael A. Bryson
  • Patent number: 8930200
    Abstract: A vector joint encoding/decoding method and a vector joint encoder/decoder are provided. More than two vectors are jointly encoded, and an encoding index of at least one vector is split and then combined between different vectors, so that encoding idle spaces of different vectors can be recombined, thereby facilitating saving of encoding bits. Because an encoding index of a vector is split and the shorter split indexes are then recombined, the requirements for the bit width of operating parts in encoding/decoding calculation are also reduced.
    Type: Grant
    Filed: July 24, 2013
    Date of Patent: January 6, 2015
    Assignee: Huawei Technologies Co., Ltd
    Inventors: Fuwei Ma, Dejun Zhang, Lei Miao, Fengyan Qi
  • Patent number: 8930184
    Abstract: A signal bandwidth extending apparatus including: a bandwidth extending section configured to extend a frequency bandwidth of a target signal, the target signal included in an input signal; a calculating section configured to calculate a degree of the target signal included in the input signal; and a controller configured to change a method of extending the frequency bandwidth by the bandwidth extending section according to a result of the calculating section.
    Type: Grant
    Filed: September 14, 2009
    Date of Patent: January 6, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takashi Sudo, Masataka Osada
  • Patent number: 8924200
    Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: December 30, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Patent number: 8903721
    Abstract: A mute setting is automatically set based on a speech detection result for acoustic signals received by a device. A device detects the speech based on a variety of cues from acoustic signals received using one or more microphones. If speech is detected within one or more frames, a mute setting may be automatically turned off. If speech is not detected, a mute setting may be automatically turned on. A mute setting may remain on as long as speech is not detected within the received acoustic signals. A varying delay may be implemented to help avoid false detections. The delay may be utilized during a mute-on state, and gradually removed during a transition from a mute-on state to a mute-off state.
    Type: Grant
    Filed: October 20, 2010
    Date of Patent: December 2, 2014
    Assignee: Audience, Inc.
    Inventor: Matthew Cowan
  • Patent number: 8903720
    Abstract: Provided is an encoding apparatus for integrally encoding and decoding a speech signal and an audio signal. The apparatus may include: an input signal analyzer to analyze a characteristic of an input signal; a stereo encoder to down-mix the input signal to a mono signal when the input signal is a stereo signal, and to extract stereo sound image information; a frequency band expander to expand a frequency band of the input signal; a sampling rate converter to convert a sampling rate; a speech signal encoder to encode the input signal using a speech encoding module when the input signal is a speech characteristic signal; an audio signal encoder to encode the input signal using an audio encoding module when the input signal is an audio characteristic signal; and a bitstream generator to generate a bitstream.
    Type: Grant
    Filed: July 14, 2009
    Date of Patent: December 2, 2014
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Tae Jin Lee, Seung-Kwon Baek, Min Je Kim, Dae Young Jang, Jeongil Seo, Kyeongok Kang, Jin-Woo Hong, Hochong Park, Young-Cheol Park
  • Patent number: 8868432
    Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: October 21, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Patent number: 8831958
    Abstract: An apparatus for processing an audio signal and method thereof are disclosed.
    Type: Grant
    Filed: September 25, 2009
    Date of Patent: September 9, 2014
    Assignee: LG Electronics Inc.
    Inventors: Hyun Kook Lee, Dong Soo Kim, Sung Yong Yoon, Hee Suk Pang, Jae Hyun Lim
  • Patent number: 8793125
    Abstract: A device (1) for converting a first number (M) of input audio channels into a second, larger number (N) of output audio channels comprises: decorrelation units (3) for decomposing the input audio channels into a set of decorrelated auxiliary channels, at least one upmix unit (4) for combining the decorrelated auxiliary channels into the output audio channels, and at least one pre-processing unit (2) for pre-processing the input audio channels and feeding the pre-processed input audio channels to the decorrelation units (3). The pre-processing unit (2) and the upmix unit (4) are preferably controlled by audio parameters.
    Type: Grant
    Filed: July 11, 2005
    Date of Patent: July 29, 2014
    Assignees: Koninklijke Philips Electronics N.V., Coding Technologies AB
    Inventors: Dirk Jeroen Breebaart, Erik Gosuinus Petrus Schuijers, Heiko Purnhagen, Karl Jonas Rödén
  • Patent number: 8762158
    Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
    Type: Grant
    Filed: August 5, 2011
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
  • Patent number: 8738367
    Abstract: A speech signal processing device is equipped with a power acquisition unit, a probability distribution acquisition unit, and a correspondence degree determination unit. The power acquisition unit accepts an inputted speech signal and, based on the accepted speech signal, acquires power representing the intensity of a speech sound represented by the speech signal. The probability distribution acquisition unit acquires a probability distribution using the intensity of the power acquired by the power acquisition unit as a random variable. The correspondence degree determination unit determines whether a correspondence degree representing a degree that power acquired by the power acquisition unit in a case that a predetermined reference speech signal is inputted into the power acquisition unit corresponds with predetermined reference power is higher than a predetermined reference correspondence degree, based on the probability distribution acquired by the probability distribution acquisition unit.
    Type: Grant
    Filed: February 18, 2010
    Date of Patent: May 27, 2014
    Assignee: NEC Corporation
    Inventor: Tadashi Emori
  • Patent number: 8731913
    Abstract: A method for overlap-adding signals useful for performing frame loss concealment (FLC) in an audio decoder as well as in other applications. The method uses a dynamic mix of windows to overlap two signals whose normalized cross-correlation may vary from zero to one. If the overlapping signals are decomposed into a correlated component and an uncorrelated component, they are overlap-added separately using the appropriate window, and then added together. If the overlapping signals are not decomposed, a weighted mix of windows is used. The mix is determined by a measure estimating the amount of cross-correlation between overlapping signals, or the relative amount of correlated to uncorrelated signals.
    Type: Grant
    Filed: April 13, 2007
    Date of Patent: May 20, 2014
    Assignee: Broadcom Corporation
    Inventors: Robert W. Zopf, Juin-Hwey Chen
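
The non-decomposed case in the abstract above (a weighted mix of windows driven by a cross-correlation measure) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names, the choice of a linear (amplitude-complementary) window for the correlated case, a sine/cosine (power-complementary) window for the uncorrelated case, the zero-lag correlation measure, and the clamping of negative correlation to zero.

```python
import math


def normalized_xcorr(a, b):
    """Zero-lag normalized cross-correlation, clamped to [0, 1] (assumed measure)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if den == 0.0:
        return 0.0
    return min(1.0, max(0.0, num / den))


def overlap_add(fade_out, fade_in):
    """Cross-fade two equal-length segments using a correlation-weighted window mix."""
    n = len(fade_out)
    if n != len(fade_in):
        raise ValueError("segments must have equal length")
    c = normalized_xcorr(fade_out, fade_in)
    out = []
    for i in range(n):
        t = (i + 0.5) / n
        # Amplitude-complementary pair (weights sum to 1): suits correlated signals.
        lin_in, lin_out = t, 1.0 - t
        # Power-complementary pair (squares sum to 1): suits uncorrelated signals.
        pw_in = math.sin(0.5 * math.pi * t)
        pw_out = math.cos(0.5 * math.pi * t)
        # Blend the two window shapes by the estimated correlation c.
        w_in = c * lin_in + (1.0 - c) * pw_in
        w_out = c * lin_out + (1.0 - c) * pw_out
        out.append(w_out * fade_out[i] + w_in * fade_in[i])
    return out
```

With two identical segments the correlation is 1, so only the linear window is used and the signal passes through unchanged; with orthogonal segments the correlation is 0 and the power-complementary window keeps the output energy roughly constant across the fade.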
  • Patent number: 8706488
    Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
    Type: Grant
    Filed: February 27, 2013
    Date of Patent: April 22, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
  • Patent number: 8694309
    Abstract: A method, a computer readable medium, and a system for automatic speech recognition tuning management, comprising: collecting an utterance, analyzing the utterance, correlating the collected utterance to the utterance analysis, and fetching at least one of the collected utterance, the utterance analysis, and the correlation of the collected utterance to the utterance analysis.
    Type: Grant
    Filed: February 12, 2007
    Date of Patent: April 8, 2014
    Assignee: West Corporation
    Inventors: Aaron Scott Fisher, Prashanta Pradhan
  • Patent number: 8666752
    Abstract: Provided are an encoding apparatus and a decoding apparatus of a multi-channel signal. The encoding apparatus of the multi-channel signal may process a phase parameter associated with phase information between a plurality of channels constituting the multi-channel signal, based on a characteristic of the multi-channel signal. The encoding apparatus may generate an encoded bitstream with respect to the multi-channel signal using the processed phase parameter and a mono signal extracted from the multi-channel signal.
    Type: Grant
    Filed: March 17, 2010
    Date of Patent: March 4, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung-Hoe Kim, Eun Mi Oh
  • Patent number: 8666734
    Abstract: An apparatus includes a function module, a strength module, and a filter module. The function module compares an input signal, which has a component, to a first delayed version of the input signal and a second delayed version of the input signal to produce a multi-dimensional model. The strength module calculates a strength of each extremum from a plurality of extrema of the multi-dimensional model based on a value of at least one opposite extremum of the multi-dimensional model. The strength module then identifies a first extremum from the plurality of extrema, which is associated with a pitch of the component of the input signal, that has the strength greater than the strength of the remaining extrema. The filter module extracts the pitch of the component from the input signal based on the strength of the first extremum.
    Type: Grant
    Filed: September 23, 2010
    Date of Patent: March 4, 2014
    Assignee: University of Maryland, College Park
    Inventors: Carol Espy-Wilson, Srikanth Vishnubhotla
  • Patent number: 8660842
    Abstract: A speech recognition device uses visual information to narrow down the range of likely adaptation parameters even before a speaker makes an utterance. Images of the speaker and/or the environment are collected using an image capturing device, and then processed to extract biometric features and environmental features. The extracted biometric features and environmental features are then used to estimate adaptation parameters. A voice sample may also be collected to refine the adaptation parameters for more accurate speech recognition.
    Type: Grant
    Filed: March 9, 2010
    Date of Patent: February 25, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventor: Antoine R. Raux
  • Patent number: 8655655
    Abstract: A sound event detecting module detects whether a sound event with a repeating characteristic is generated. A sound end recognizing unit recognizes the ends of sounds according to a sound signal to generate sound sections and corresponding multiple sets of feature vectors of the sound sections. A storage unit stores at least M sets of feature vectors. A similarity comparing unit compares the at least M sets of feature vectors with each other and generates a similarity score matrix, which stores the similarity scores of every pair among the at least M sound sections. A correlation arbitrating unit determines the number of sound sections with high correlations to each other according to the similarity score matrix. When the number is greater than a threshold value, the correlation arbitrating unit indicates that the sound event with the repeating characteristic is generated.
    Type: Grant
    Filed: December 30, 2010
    Date of Patent: February 18, 2014
    Assignee: Industrial Technology Research Institute
    Inventors: Yuh-Ching Wang, Kuo-Yuan Li
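
The similarity-matrix and arbitration steps described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented method: each sound section is reduced to a single feature vector, cosine similarity stands in for the unspecified similarity score, and the function names and both threshold defaults are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (assumed similarity score)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(sections):
    """Pairwise similarity scores for all stored sound sections."""
    m = len(sections)
    return [[cosine_similarity(sections[i], sections[j]) for j in range(m)]
            for i in range(m)]


def count_correlated(matrix, score_threshold):
    """Count sections that score above the threshold against at least one other section."""
    m = len(matrix)
    return sum(
        1 for i in range(m)
        if any(j != i and matrix[i][j] > score_threshold for j in range(m))
    )


def is_repeating_event(sections, score_threshold=0.9, count_threshold=2):
    """Flag a repeating sound event when enough sections are highly correlated."""
    matrix = similarity_matrix(sections)
    return count_correlated(matrix, score_threshold) > count_threshold
```

For example, three near-identical sections plus one unrelated section yield a count of three, which exceeds the default count threshold of two, so the event is flagged as repeating.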
  • Patent number: 8645128
    Abstract: A first pitch-metric function based on a first audio sample and a second pitch-metric function based on a second audio sample may be determined. The first and second pitch-metric functions may have either local minima or local maxima that correspond to candidate pitch values of the first and the second audio samples, respectively. The first and the second pitch-metric functions may be transformed to generate a first and a second transformed pitch-metric function, respectively. A correlation function based on a correlation between the first and the second transformed pitch-metric function may also be determined. A lower-dimensionality representation of the correlation function may further be determined. The lower-dimensionality representation may convey information indicative of pitch dynamics between the first and second audio sample. A computing device having a processor and a memory may perform an action based on the information indicative of the pitch dynamics.
    Type: Grant
    Filed: October 2, 2012
    Date of Patent: February 4, 2014
    Assignee: Google Inc.
    Inventor: Ioannis Agiomyrgiannakis
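
As a rough illustration of the correlation step in the abstract above, the sketch below cross-correlates two already-transformed pitch-metric functions and reduces the result to a single number, the lag of the correlation peak. The abstract does not specify the transform or the lower-dimensionality representation, so the peak-lag summary, the function names, and the `max_lag` parameter are all assumptions made for illustration.

```python
def cross_correlation(f, g, max_lag):
    """Return {lag: sum over i of f[i] * g[i + lag]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(len(f)):
            j = i + lag
            if 0 <= j < len(g):
                total += f[i] * g[j]
        result[lag] = total
    return result


def peak_lag(corr):
    """One-number summary (assumed): the lag at which the correlation is largest."""
    return max(corr, key=corr.get)
```

Under these assumptions, a peak at lag 0 suggests the candidate pitch did not move between the two audio samples, while a nonzero peak lag suggests a shift in the corresponding direction along the transformed axis.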