Orthogonal Functions Patents (Class 704/204)
  • Patent number: 11810582
    Abstract: The invention provides methods and devices for stereo encoding and decoding using complex prediction in the frequency domain. In one embodiment, a decoding method, for obtaining an output stereo signal from an input stereo signal encoded by complex prediction coding and comprising first frequency-domain representations of two input channels, comprises the upmixing steps of: (i) computing a second frequency-domain representation of a first input channel; and (ii) computing an output channel on the basis of the first and second frequency-domain representations of the first input channel, the first frequency-domain representation of the second input channel and a complex prediction coefficient. The upmixing can be suspended responsive to control data.
    Type: Grant
    Filed: December 23, 2021
    Date of Patent: November 7, 2023
    Assignee: DOLBY INTERNATIONAL AB
    Inventors: Heiko Purnhagen, Pontus Carlsson, Lars Villemoes
  • Patent number: 11550770
    Abstract: Time-series data indicating a temporal variation of an index, which indicates a usage state of each of resources that are used by multiple processes, is acquired, and an operation-data matrix including vectors is generated based on the time-series data such that each of the vectors indicates the time-series data at a predetermined time interval and includes as an element the index indicating the usage state of one of the resources at the predetermined time interval. A basis matrix including a predetermined number of basis vectors is generated by performing nonnegative matrix factorization on the operation-data matrix. Component values, which respectively correspond to the resources, indicated by each of the predetermined number of the basis vectors are extracted, and information on the extracted component values is output as usage states of the resources that are used by each of the multiple processes.
    Type: Grant
    Filed: September 23, 2019
    Date of Patent: January 10, 2023
    Assignee: FUJITSU LIMITED
    Inventors: Yuji Saito, Tetsuya Uchiumi, Yukihiro Watanabe
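
The factorization step in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: it assumes the operation-data matrix has one row per resource and one column per time interval, and it uses the standard Lee-Seung multiplicative updates for the squared-error objective, which the abstract does not specify. The names `nmf`, `matmul`, `transpose`, `frobenius_error`, and the sample `usage` data are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius distance between v and the product w @ h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix v (resources x time) into w @ h.

    w (resources x rank) holds the basis vectors as columns; h (rank x time)
    holds their activations. Assumption: Lee-Seung multiplicative updates.
    """
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[r][c] * num[r][c] / (den[r][c] + eps) for c in range(cols)]
             for r in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[r][c] * num[r][c] / (den[r][c] + eps) for c in range(rank)]
             for r in range(rows)]
    return w, h


# Hypothetical usage indices: rows are resources, columns are time intervals.
usage = [[4.0, 4.0, 0.0, 0.0],
         [2.0, 2.0, 0.0, 0.0],
         [0.0, 0.0, 3.0, 3.0]]
basis, activations = nmf(usage, rank=2)
```

Each column of `basis` then gives per-resource component values for one inferred process, which corresponds to the information the abstract describes as being output.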
  • Patent number: 11482235
    Abstract: A speech enhancement method and a speech enhancement system are provided. The speech enhancement method performs two-stage noise suppression using digital signal processing and a neural network approach. The first-stage noise suppression generates artifact signals by reducing stationary noise in the digital audio signals. The second-stage noise suppression performs voice activity detection and further reduces non-stationary noise in the artifact signals. The result of the voice activity detection is fed back to establish or update a noise model used in the first-stage noise suppression.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: October 25, 2022
    Assignee: QNAP SYSTEMS, INC.
    Inventor: Wei Wei Hsiung
  • Patent number: 10993061
    Abstract: An audio system provides for soundstage-conserving channel summation. The system includes circuitry that generates a first rotated component and a second rotated component by rotating a pair of audio signal components. The circuitry generates left quadrature components that are out of phase with each other using the first rotated component and generates right quadrature components that are out of phase with each other using the second rotated component. The circuitry generates orthogonal correlation transform (OCT) components based on the left and right quadrature components. Each OCT component includes a weighted combination of a left quadrature component and a right quadrature component. The circuitry generates a mono output channel using one or more of the OCT components.
    Type: Grant
    Filed: January 10, 2020
    Date of Patent: April 27, 2021
    Assignee: Boomcloud 360, Inc.
    Inventors: Joseph Mariglio, III, Zachary Seidess
  • Patent number: 10984805
    Abstract: An apparatus for decoding an encoded signal includes: an audio decoder for decoding an encoded representation of a first set of first spectral portions to obtain a decoded first set of first spectral portions; a parametric decoder for decoding an encoded parametric representation of a second set of second spectral portions to obtain a decoded representation of the parametric representation, wherein the parametric information includes, for each target frequency tile, a source region identification as a matching information; and a frequency regenerator for regenerating a target frequency tile using a source region from the first set of first spectral portions identified by the matching information.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: April 20, 2021
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Christian Neukam, Sascha Disch, Frederik Nagel, Andreas Niedermeier, Konstantin Schmidt, Balaji Nagendran Thoshkahna
  • Patent number: 10599361
    Abstract: A backup agent for orchestrating backups of production hosts includes a persistent storage that stores backup policies and a backup manager that obtains a backup analysis request for a virtual machine hosted by the production hosts; generates a dependency graph based on: backups associated with the virtual machines, and the backup policies associated with the backups; and displays a graphical user interface, using the dependency graph, including user interactive markers based on the backups and dependency indicators interconnecting the user interactive markers. While the graphical user interface is displayed, the backup manager obtains a potential backup policy update based on a user interaction with one of the user interactive markers. After obtaining the potential backup policy update, the backup manager updates the graphical user interface to reflect the potential backup policy update. After updating the graphical user interface, the backup manager initiates generation of the backup.
    Type: Grant
    Filed: June 28, 2018
    Date of Patent: March 24, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Asif Khan, Shelesh Chopra, Matthew Dickey Buchman, Bharat Bhushan, Krishnendu Bagchi
  • Patent number: 10396868
    Abstract: A codebook generation system and associated methods are generally described herein.
    Type: Grant
    Filed: October 10, 2012
    Date of Patent: August 27, 2019
    Assignee: INTEL CORPORATION
    Inventors: Xintian E. Lin, Qinghua Li
  • Patent number: 10389415
    Abstract: A codebook generation system and associated methods are generally described herein.
    Type: Grant
    Filed: October 10, 2012
    Date of Patent: August 20, 2019
    Assignee: INTEL CORPORATION
    Inventors: Xintian E. Lin, Qinghua Li
  • Patent number: 10271104
    Abstract: A video play-based information processing method is realized with a multimedia information processing system, a first client and server for video play, a computer storage medium, and a server used for video play. The method includes: receiving a video content request transmitted by a first client, obtaining a video content requested by the first client, and transmitting the video content to the first client; and during transmission of the video content, transmitting first information associated with the video content and render information of the first information to the first client, for the first client to display the first information associated with the video content via a play interface during playing of the video content. The render information of the first information includes a display start time point, a display duration and display position information of the first information.
    Type: Grant
    Filed: June 28, 2017
    Date of Patent: April 23, 2019
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Ao Peng
  • Patent number: 10235126
    Abstract: A method and a system (20) of audio source separation are described. The method comprises: receiving (10) an audio mixture and at least one text query associated with the audio mixture; retrieving (11) at least one audio sample from an auxiliary audio database; evaluating (12) the retrieved audio samples; and separating (13) the audio mixture into a plurality of audio sources using the audio samples. The corresponding system (20) comprises a receiving unit (21) and a processor (22) configured to implement the method.
    Type: Grant
    Filed: May 11, 2015
    Date of Patent: March 19, 2019
    Assignee: INTERDIGITAL CE PATENT HOLDINGS
    Inventors: Quang Khanh Ngoc Duong, Alexey Ozerov, Dalia Elbadawy
  • Patent number: 9870781
    Abstract: The present disclosure relates to a device and method for reducing quantization noise in a sound signal contained in a time-domain excitation decoded by a time-domain decoder. A future frame time-domain excitation is evaluated based on the decoded time-domain excitation. A concatenated time-domain excitation is produced from the decoded time-domain excitation and the time-domain excitation of the future frame and is converted into a frequency-domain excitation. A weighting mask is produced for retrieving spectral information lost in the quantization noise. The frequency-domain excitation is modified to increase spectral dynamics by application of the weighting mask. The modified frequency-domain excitation is converted into a modified time-domain excitation. The latter conversion is delay-less. In an embodiment, the weighting mask may be produced using time averaging or frequency averaging or a combination of time and frequency averaging of the frequency-domain excitation.
    Type: Grant
    Filed: June 20, 2016
    Date of Patent: January 16, 2018
    Assignee: VOICEAGE CORPORATION
    Inventors: Tommy Vaillancourt, Milan Jelinek
  • Patent number: 9837088
    Abstract: The present disclosure provides a signal processing method and device. Spectral coefficients of a current frame of a frequency-domain audio signal are divided into N sub-bands. N is a positive integer greater than 1. According to an energy attribute and a spectral attribute of a first subset of the N sub-bands, whether to modify original envelope values of sub-bands in the first subset is determined. A frequency range of each of the M sub-bands in the first subset is lower than a frequency range of each of the K sub-bands. Based on a determination that the original envelope values of the M sub-bands need to be modified, the original envelope values of the M sub-bands are modified individually to obtain modified envelope values of the M sub-bands. Encoding bits are allocated to each of the N sub-bands according to the modified envelope values of the M sub-bands and original envelope values of the K sub-bands.
    Type: Grant
    Filed: October 27, 2016
    Date of Patent: December 5, 2017
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Bin Wang, Lei Miao, Zexin Liu
  • Patent number: 9734840
    Abstract: A signal-processing device includes a determination section that compares a frequency spectrum and a floor spectrum of an input audio signal to each other for each frequency bin and determines whether the input audio signal should be subjected to noise reduction processing or not for each of the frequency bins; and a noise reduction-processing section that subtracts a noise frequency spectrum from the frequency spectrum of the input audio signal for each of the frequency bins on the basis of the result determined by the determination section for each of the frequency bins.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: August 15, 2017
    Assignee: NIKON CORPORATION
    Inventors: Yoko Yoshizuka, Mitsuhiro Okazaki, Kosuke Okano
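
The per-bin decision and subtraction described in the abstract above can be sketched as follows. This is a hedged illustration only: the abstract does not say which comparison outcome triggers noise reduction, so the sketch assumes a bin is processed when its magnitude exceeds the floor, and it clamps the result at zero. The function name `reduce_noise` and the sample values are hypothetical.

```python
def reduce_noise(spectrum, floor, noise):
    """Subtract a noise spectrum from selected frequency bins.

    spectrum, floor, noise: per-bin magnitudes of equal length.
    Assumption: a bin is noise-reduced only if it exceeds the floor.
    """
    out = []
    for s, f, n in zip(spectrum, floor, noise):
        if s > f:
            # Bin selected for noise reduction; clamp so magnitude stays >= 0.
            out.append(max(s - n, 0.0))
        else:
            # Bin left untouched.
            out.append(s)
    return out


cleaned = reduce_noise([5.0, 1.0, 3.0], [2.0, 2.0, 2.0], [1.0, 1.0, 4.0])
```

Making the decision per bin, rather than per frame, leaves bins at or below the floor unchanged while the remaining bins are cleaned.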
  • Patent number: 9711127
    Abstract: Systems, methods, and apparatus for facilitating multi-sensor signal optimization for speech communication are presented herein. A sensor component including acoustic sensors can be configured to detect sound and generate, based on the sound, first sound information associated with a first sensor of the acoustic sensors and second sound information associated with a second sensor of the acoustic sensors. Further, an audio processing component can be configured to generate filtered sound information based on the first sound information, the second sound information, and a spatial filter associated with the acoustic sensors; determine noise levels for the first sound information, the second sound information, and the filtered sound information; and generate output sound information based on a selection of one of the noise levels or a weighted combination of the noise levels.
    Type: Grant
    Filed: September 17, 2012
    Date of Patent: July 18, 2017
    Assignee: BITWAVE PTE LTD.
    Inventors: Siew Kok Hui, Eng Sui Tan
  • Patent number: 9632982
    Abstract: An orthogonal transform apparatus includes: an interchanging unit which interchanges MDCT coefficients contained in a first half of a prescribed interval with MDCT coefficients contained in a second half thereof; an inverting unit which inverts the sign of the MDCT coefficients contained in the second half of the prescribed interval after the interchange; an inverse cosine transform unit which computes the real components of QMF coefficients by applying an IMDCT using FFT to the MDCT coefficients contained in the first half and the sign-inverted MDCT coefficients contained in the second half; an inverse sine transform unit which computes the imaginary components of the QMF coefficients by applying an IMDST using FFT to the MDCT coefficients contained in the first half and the sign-inverted MDCT coefficients contained in the second half; and a coefficient adjusting unit which computes the QMF coefficients by combining the real components with the imaginary components.
    Type: Grant
    Filed: March 28, 2014
    Date of Patent: April 25, 2017
    Assignee: FUJITSU LIMITED
    Inventors: Yohei Kishi, Akira Kamano, Shunsuke Takeuchi, Takeshi Otani
  • Patent number: 9594511
    Abstract: A method for performing a write to a volume x in a cascaded architecture is described. In one embodiment, such a method includes determining whether the volume x has a child volume, wherein each of the volume x and the child volume have a target bit map (TBM) associated therewith. The method then determines whether the TBMs of both the volume x and the child volume are set. If the TBMs are set, the method finds a higher source (HS) volume from which to copy the desired data to the child volume. Finding the HS volume includes comparing ages of mapping relationships upstream from the volume x in order to determine a source of the data. Once the HS volume is found, the method copies the data from the HS volume to the child volume and performs the write to the volume x. A method for performing a read is also disclosed herein.
    Type: Grant
    Filed: April 13, 2015
    Date of Patent: March 14, 2017
    Assignee: International Business Machines Corporation
    Inventors: Michael T. Benhase, Theresa M. Brown, Lokesh M. Gupta, Carol S. Mellgren
  • Patent number: 9514098
    Abstract: Methods and apparatus related to determining coreference resolution using distributed word representations. Distributed word representations, indicative of syntactic and semantic features, may be identified for one or more noun phrases. For each of the one or more noun phrases, a referring feature representation and an antecedent feature representation may be determined, where the referring feature representation includes the distributed word representation, and the antecedent feature representation includes the distributed word representation augmented by one or more antecedent features. In some implementations the referring feature representation may be augmented by one or more referring features. Coreference embeddings of the referring and antecedent feature representations of the one or more noun phrases may be learned. Distance measures between two noun phrases may be determined based on the coreference embeddings.
    Type: Grant
    Filed: December 26, 2013
    Date of Patent: December 6, 2016
    Assignee: Google Inc.
    Inventors: Amarnag Subramanya, Jingyi Liu, Fernando Carlos das Neves Pereira, Kai Chen, Jay Ponte, Rami Al-Rfou′
  • Patent number: 9426564
    Abstract: An audio processing device including a factorization unit which factorizes frequency information obtained by performing time-frequency transformation on an audio signal of a plurality of channels into a channel matrix representing characteristics of a channel direction, a frequency matrix representing characteristics of a frequency direction, and a time matrix representing characteristics of a time direction; and an extraction unit which extracts the frequency information of audio from an arbitrary designated direction based on the channel matrix, the frequency matrix, and the time matrix.
    Type: Grant
    Filed: November 5, 2013
    Date of Patent: August 23, 2016
    Assignees: SONY CORPORATION, Institut de Recherche et Coordination Acoustique/Musique
    Inventors: Yuhki Mitsufuji, Axel Roebel
  • Patent number: 9424856
    Abstract: A spectrum filler for filling non-coded residual sub-vectors of a transform coded audio signal includes a sub-vector compressor (42) configured to compress actually coded residual sub-vectors. A sub-vector rejecter (44) is configured to reject compressed residual sub-vectors that do not fulfill a predetermined sparseness criterion. A sub-vector collector (46) is configured to concatenate the remaining compressed residual sub-vectors to form a first virtual codebook (VC1). A coefficient combiner (48) is configured to combine pairs of coefficients of the first virtual codebook (VC1) to form a second virtual codebook (VC2). A sub-vector filler (50) is configured to fill non-coded residual sub-vectors below a predetermined frequency with coefficients from the first virtual codebook (VC1), and to fill non-coded residual sub-vectors above the predetermined frequency with coefficients from the second virtual codebook (VC2).
    Type: Grant
    Filed: September 14, 2011
    Date of Patent: August 23, 2016
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Volodya Grancharov, Sebastian Näslund, Sigurdur Sverrisson
  • Patent number: 9380398
    Abstract: Disclosed is a sound processing apparatus including a factorization unit and an extraction unit. The factorization unit is configured to factorize frequency information obtained by performing time-frequency transformation on sound signals of a plurality of channels into a channel matrix expressing properties in a channel direction, a frequency matrix expressing properties in a frequency direction, and a time matrix expressing properties in a time direction. The extraction unit is configured to compare the channel matrix with a threshold and extract components specified by a result of the comparison from the channel matrix, the frequency matrix, and the time matrix to generate the frequency information on a sound from a desired sound source.
    Type: Grant
    Filed: April 10, 2014
    Date of Patent: June 28, 2016
    Assignee: Sony Corporation
    Inventor: Yuhki Mitsufuji
  • Patent number: 9306524
    Abstract: Methods of, apparatuses for, and non-transitory computer readable media having instructions thereon that when executed cause carrying out methods of determining and modifying the perceived loudness of a frequency domain audio signal where the frequency resolution, and corresponding temporal coverage of the frequency domain information is not constant. The frequency (and thus temporal) resolution of the perceived loudness processing is maintained constant at the longest block size. One method includes a block combiner and a loudness modification interpolator.
    Type: Grant
    Filed: October 19, 2014
    Date of Patent: April 5, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Michael J. Smithers
  • Patent number: 8977541
    Abstract: The present invention relates to a speech processing apparatus, a speech processing method and a program which, when multichannel audio signals are downmixed and coded, prevent delay and an increase in the computation amount upon decoding of the audio signals. An inverse multiplexing unit (101) acquires coded data on which a BC parameter is multiplexed. An uncorrelated frequency-time transform unit (102) performs IMDCT transform and IMDST transform of frequency spectrum coefficients of a monaural signal (XM) obtained from this coded data to generate the monaural signal (XM) which is a time domain signal and a signal (XD′) which is substantially uncorrelated with this monaural signal (XM). The stereo synthesis unit (103) generates a stereo signal by synthesizing the monaural signal (XM) and the signal (XD′) using the BC parameter. The present invention is applicable to, for example, a speech processing apparatus which decodes a downmixed and coded stereo signal.
    Type: Grant
    Filed: March 8, 2011
    Date of Patent: March 10, 2015
    Assignee: Sony Corporation
    Inventors: Yasuhiro Toguri, Shiro Suzuki, Jun Matsumoto, Yuuji Maeda, Yuuki Matsumura
  • Publication number: 20150066487
    Abstract: A voice processing apparatus includes: a dividing unit which divides a voice signal into frames in such a manner that any two successive frames overlap each other by a predetermined amount; a first windowing unit which multiplies each frame by a first windowing function that attenuates a signal at both ends of the frame; an orthogonal transform unit which computes a frequency spectrum for each frame multiplied by the first windowing function; a frequency signal processing unit which computes a corrected frequency spectrum; an inverse orthogonal transform unit which computes a corrected frame by applying an inverse orthogonal transform to the corrected frequency spectrum; a second windowing unit which multiplies each corrected frame by a second windowing function that attenuates a signal at both ends of the corrected frame; and an addition unit which adds up the corrected frames multiplied by the second windowing function, sequentially in time order.
    Type: Application
    Filed: July 3, 2014
    Publication date: March 5, 2015
    Inventor: Naoshi MATSUO
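
The framing, double windowing, and overlap-add chain in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the apparatus itself: both windows are taken to be square-root Hann windows with 50% overlap (the abstract names no specific window or overlap), and the transform, spectrum correction, and inverse transform are collapsed into a hypothetical `process` callback that defaults to the identity. The names `sqrt_hann` and `overlap_add` are also hypothetical.

```python
import math


def sqrt_hann(n):
    """Square root of a periodic Hann window; tapers toward both frame ends."""
    return [math.sqrt(0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
            for i in range(n)]


def overlap_add(signal, frame_len=8, process=lambda frame: frame):
    """Window, process, window again, and overlap-add frames in time order.

    process stands in for transform -> spectrum correction -> inverse
    transform. Assumption: 50% overlap between successive frames.
    """
    hop = frame_len // 2
    win = sqrt_hann(frame_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, hop):
        # First windowing function, applied before processing.
        frame = [signal[start + i] * win[i] for i in range(frame_len)]
        frame = process(frame)
        # Second windowing function, applied to the corrected frame,
        # followed by accumulation into the output.
        for i in range(frame_len):
            out[start + i] += frame[i] * win[i]
    return out


signal = [float(i) for i in range(32)]
rebuilt = overlap_add(signal)
```

With these particular windows the two overlapping squared windows sum to one, so samples covered by two frames are reconstructed exactly when `process` is the identity. The first and last half-frames are covered by a single frame and are therefore attenuated.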
  • Patent number: 8924199
    Abstract: A voice correction device includes a detector that detects a response from a user, a calculator that calculates an acoustic characteristic amount of an input voice signal, an analyzer that outputs an acoustic characteristic amount of a predetermined amount when having acquired a response signal due to the response from the detector, a storage unit that stores the acoustic characteristic amount output by the analyzer, a controller that calculates a correction amount of the voice signal on the basis of a result of a comparison between the acoustic characteristic amount calculated by the calculator and the acoustic characteristic amount stored in the storage unit, and a correction unit that corrects the voice signal on the basis of the correction amount calculated by the controller.
    Type: Grant
    Filed: December 20, 2011
    Date of Patent: December 30, 2014
    Assignee: Fujitsu Limited
    Inventors: Chisato Ishikawa, Takeshi Otani, Taro Togawa, Masanao Suzuki, Masakiyo Tanaka
  • Patent number: 8924200
    Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: December 30, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Publication number: 20140337017
    Abstract: A method converts source speech to target speech by first mapping the source speech to sparse weights using a compressive sensing technique, and then transforming, using transformation parameters, the sparse weights to the target speech.
    Type: Application
    Filed: May 9, 2013
    Publication date: November 13, 2014
    Applicant: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Shinji Watanabe, John R. Hershey
  • Patent number: 8838441
    Abstract: A representation of an audio signal having a first, a second and a third frame is derived by estimating first warp information for the first and second frames and second warp information for the second and third frames, the warp information describing pitch information of the audio signal. First or second spectral coefficients for first and second frames or second and third frames are derived using first or second warp information and a first or second weighted representation of the first and second frames or second and third frames, the first or second weighted representation derived by applying a first or second window function to the first and second frames or second and third frames, wherein the first or second window function depends on the first or second warp information. The representation of the audio signal is generated including the first and the second spectral coefficients.
    Type: Grant
    Filed: February 14, 2013
    Date of Patent: September 16, 2014
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Patent number: 8775166
    Abstract: An encoding method includes: extracting core layer characteristic parameters and enhancement layer characteristic parameters of a background noise signal, and encoding the core layer characteristic parameters and enhancement layer characteristic parameters to obtain a core layer codestream and an enhancement layer codestream. The disclosure also provides an encoding device, a decoding device and method, an encapsulating method, a reconstructing method, an encoding-decoding system and an encoding-decoding method. By describing the background noise signal with the enhancement layer characteristic parameters, the background noise signal can be processed by using a more accurate encoding and decoding method, so as to improve the quality of encoding and decoding the background noise signal.
    Type: Grant
    Filed: August 14, 2009
    Date of Patent: July 8, 2014
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Hualin Wan, Libin Zhang
  • Patent number: 8761410
    Abstract: The present technology provides robust, high quality dereverberation of an acoustic signal which can overcome or substantially alleviate the problems associated with the diverse and dynamic nature of the surrounding acoustic environment. The present technology utilizes acoustic signals received from a plurality of microphones to carry out a multi-faceted analysis which accurately identifies reverberation based on the correlation between the acoustic signals. Due to the spatial distance between the microphones and the variation in reflection paths present in the surrounding acoustic environment, the correlation between the acoustic signals can be used to accurately determine whether portions of one or more of the acoustic signals contain desired speech or undesired reverberation. These correlation characteristics are then used to generate signal modifications applied to one or more of the received acoustic signals to preserve speech and reduce reverberation.
    Type: Grant
    Filed: December 8, 2010
    Date of Patent: June 24, 2014
    Assignee: Audience, Inc.
    Inventors: Carlos Avendano, Carlo Murgia
  • Patent number: 8762158
    Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
    Type: Grant
    Filed: August 5, 2011
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
  • Patent number: 8738371
    Abstract: A response storage unit stores a response, a watching degree relative to a display unit, and an output form of the response to a speaker and the display unit. An extracting unit extracts a request from a speech recognition result. A response determining unit determines a response based on the extracted request. A direction detector detects a viewing direction based on sensing information received from a transmitter mounted on a user. A watching-degree determining unit determines a watching degree based on the viewing direction. An output controller obtains an output form corresponding to the response and the determined watching degree from the response storage unit, and outputs the response to the speaker and the display unit according to the obtained output form.
    Type: Grant
    Filed: September 13, 2007
    Date of Patent: May 27, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Kazuo Sumita
  • Patent number: 8731910
    Abstract: The invention provides a compensation method for audio frame loss in a MDCT domain, the method comprising: when a frame currently lost is a Pth frame, obtaining a set of frequencies to be predicted, and for each frequency in the set, using phases and amplitudes of a plurality of frames before a (P-1)th frame in a MDCT-MDST domain to predict a phase and an amplitude of the Pth frame, and using the predicted phase and amplitude to obtain a MDCT coefficient of the Pth frame at each corresponding frequency; for a frequency outside the set, using MDCT coefficients of a plurality of frames before the Pth frame to calculate a MDCT coefficient value of the Pth frame at the frequency; performing an IMDCT for the MDCT coefficients of the Pth frame to obtain a time domain signal of the Pth frame.
    Type: Grant
    Filed: February 25, 2010
    Date of Patent: May 20, 2014
    Assignee: ZTE Corporation
    Inventors: Ming Wu, Zhibin Lin, Ke Peng, Zheng Deng, Jing Lu, Xiaojun Qiu, Jiali Li, Guoming Chen, Hao Yuan, Kaiwen Liu
  • Patent number: 8700388
    Abstract: A processed representation of an audio signal having a sequence of frames is generated by sampling the audio signal within first and second frames of the sequence of frames, the second frame following the first frame, the sampling using information on a pitch contour of the first and second frames to derive a first sampled representation. The audio signal is sampled within the second and third frames, the third frame following the second frame in the sequence of frames. The sampling uses the information on the pitch contour of the second frame and information on a pitch contour of the third frame to derive a second sampled representation. A first scaling window is derived for the first sampled representation, and a second scaling window is derived for the second sampled representation, the scaling windows depending on the samplings applied to derive the first sampled representation or the second sampled representation.
    Type: Grant
    Filed: March 23, 2009
    Date of Patent: April 15, 2014
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Bernd Edler, Sascha Disch, Ralf Geiger, Stefan Bayer, Ulrich Kraemer, Guillaume Fuchs, Max Neuendorf, Markus Multrus, Gerald Schuller, Harald Popp
  • Patent number: 8682658
    Abstract: The equipment comprises two microphones, sampling means, and de-noising means. The de-noising means are non-frequency noise reduction means comprising a combiner having an adaptive filter performing an iterative search seeking to cancel the noise picked up by one of the microphones on the basis of a noise reference given by the other microphone sensor. The adaptive filter is a fractional delay filter modeling a delay that is shorter than the sampling period. The equipment also has voice activity detector means delivering a signal representative of the presence or the absence of speech from the user of the equipment. The adaptive filter receives this signal as input so as to enable it to act selectively: i) either to perform an adaptive search for the parameters of the filter in the absence of speech; ii) or else to “freeze” those parameters of the filter in the presence of speech.
    Type: Grant
    Filed: May 18, 2012
    Date of Patent: March 25, 2014
    Assignee: Parrot
    Inventors: Guillaume Vitte, Michael Herve
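    The voice-activity gating described in this abstract (adapt the filter only when no speech is present, freeze it otherwise) can be illustrated with a minimal sketch. It uses a generic normalized LMS (NLMS) FIR canceller as a stand-in, not the fractional delay filter of the patent; the function name `nlms_cancel`, its parameters, and the synthetic test signal are all assumptions made for illustration.

```python
import random


def nlms_cancel(primary, reference, speech_active, taps=4, mu=0.5, eps=1e-8):
    """Cancel noise in `primary` using `reference` as the noise reference.

    The filter weights are adapted only on samples where `speech_active`
    is False; on speech samples they are frozen and merely applied.
    Returns (cleaned_signal, final_weights).
    """
    w = [0.0] * taps
    cleaned = []
    for n in range(len(primary)):
        # Most recent `taps` reference samples, zero-padded at the start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        noise_estimate = sum(wk * xk for wk, xk in zip(w, x))
        e = primary[n] - noise_estimate
        cleaned.append(e)
        if not speech_active[n]:
            # NLMS update, normalized by the input energy.
            norm = sum(xk * xk for xk in x) + eps
            w = [wk + mu * e * xk / norm for wk, xk in zip(w, x)]
    return cleaned, w


# Synthetic check: the primary microphone picks up the reference noise
# delayed by one sample and scaled by 0.5 (no speech present).
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
primary = [0.0] + [0.5 * r for r in reference[:-1]]

adapted_out, adapted_w = nlms_cancel(primary, reference, [False] * 2000)
frozen_out, frozen_w = nlms_cancel(primary, reference, [True] * 2000)
```

    With adaptation enabled throughout, the weights should approach a single tap of 0.5 at a delay of one sample; with the detector reporting speech throughout, the weights stay at their initial values and the input passes through unchanged.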
  • Patent number: 8615391
    Abstract: A method and apparatus to extract an audio signal having an important spectral component (ISC) and a low bit-rate audio signal coding/decoding method using the method and apparatus to extract the ISC. The method of extracting the ISC includes calculating perceptual importance including an SMR (signal-to-mask ratio) value of transformed spectral audio signals by using a psychoacoustic model, selecting spectral signals having a masking threshold value smaller than that of the spectral audio signals using the SMR value as first ISCs, and extracting a spectral peak from the audio signals selected as the ISCs according to a predetermined weighting factor to select second ISCs. Accordingly, the perceptually important spectral components can be efficiently coded so as to obtain high sound quality at a low bit-rate.
    Type: Grant
    Filed: July 6, 2006
    Date of Patent: December 24, 2013
    Assignee: SAMSUNG Electronics Co., Ltd.
    Inventors: Junghoe Kim, Eunmi Oh, Konstantin Osipov, Boris Kudryashov
  • Patent number: 8606573
    Abstract: VoIP phones according to the present invention include a microphone, which may be internal or external, and allow the user to communicate unobtrusively, check voice mail and conduct other activities in an environment which can be noisy in general and extremely noisy sometimes. Speech recognition functionality may also be used to generate and send touch tone or DTMF tones such as in response to call trees or voice recognition functionality used by airlines, credit card companies, voice mail systems, and other applications. A system and method of audio processing which provides enhanced speech recognition is provided. Audio input is received at the microphone which is processed by adaptive noise cancellation to generate an enhanced audio signal. The operation of the speech recognition engine and the adaptive noise canceller may be advantageously controlled based on Voice Activity Detection (VAD).
    Type: Grant
    Filed: October 31, 2012
    Date of Patent: December 10, 2013
    Inventor: Alon Konchitsky
  • Publication number: 20130297297
    Abstract: A system performs local feature extraction. The system includes a processing device that performs a Short Time Fourier Transform to obtain a spectrogram for a discrete-time speech signal sample. The spectrogram is subdivided based on natural divisions of frequency to humans. Time-frequency-energy is then quantized using information obtained from the spectrogram. And, feature vectors are determined based on the quantized time-frequency-energy information.
    Type: Application
    Filed: April 8, 2013
    Publication date: November 7, 2013
    Inventor: Erhan GUVEN
  • Patent number: 8577045
    Abstract: An encoding apparatus comprises a frame processor (105) which receives a multi channel audio signal comprising at least a first audio signal from a first microphone (101) and a second audio signal from a second microphone (103). An ITD processor (107) then determines an inter time difference between the first audio signal and the second audio signal and a set of delays (109, 111) generates a compensated multi channel audio signal from the multi channel audio signal by delaying at least one of the first and second audio signals in response to the inter time difference. A combiner (113) then generates a mono signal by combining channels of the compensated multi channel audio signal and a mono signal encoder (115) encodes the mono signal. The inter time difference may specifically be determined by an algorithm based on determining cross correlations between the first and second audio signals.
    Type: Grant
    Filed: September 9, 2008
    Date of Patent: November 5, 2013
    Assignee: Motorola Mobility LLC
    Inventor: Jonathan A. Gibbs
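    The processing chain in this abstract (estimate the inter time difference by cross-correlation, delay one channel to compensate, then combine into a mono signal) can be sketched roughly as follows. This is an illustrative sketch only, not the patented algorithm: the function names, the integer-sample lag search, and the equal-weight downmix are assumptions.

```python
def estimate_itd(left, right, max_lag):
    """Return the lag in samples that maximizes the cross-correlation.

    A positive result means `right` is delayed relative to `left`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(left)):
            j = i + lag
            if 0 <= j < len(right):
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay(signal, d):
    """Delay `signal` by d >= 0 samples, zero-padding and keeping its length."""
    if d <= 0:
        return list(signal)
    return [0.0] * d + list(signal[: len(signal) - d])


def compensated_mono(left, right, max_lag):
    """Align the two channels using the estimated ITD and average them."""
    lag = estimate_itd(left, right, max_lag)
    if lag > 0:
        # Right channel lags: delay the left channel to match it.
        left_c, right_c = delay(left, lag), list(right)
    else:
        # Left channel lags (or no offset): delay the right channel.
        left_c, right_c = list(left), delay(right, -lag)
    mono = [0.5 * (a + b) for a, b in zip(left_c, right_c)]
    return lag, mono


# Example: the right channel is the left channel delayed by two samples.
left = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.0, 1.0, 0.5, 0.0, 0.0]
lag, mono = compensated_mono(left, right, max_lag=3)
```

    Delaying the earlier channel rather than advancing the later one keeps the sketch causal, which matches the abstract's description of compensating by delay.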
  • Publication number: 20130262097
    Abstract: Systems and methods utilize individually selected modulation spectral features for speech and speaker characterization. The method involves construction of a sparse feature space and a method of finding the approximately best feature subset for attributing a specific characteristic of speech or speaker. The current selection method is based on the Kolmogorov-Smirnov statistical test applied to individual features. The characterization task can be defined empirically and no a-priori theory is necessary to explain characteristic attribution processes. Experimental results indicate that employment of selected modulation spectral features works better than the current state-of-the-art at least in some instances of speech characterization task, e.g. prediction of speaker personality traits, as it is evident from the official results of Interspeech'2012 Speaker Personality Recognition Challenge.
    Type: Application
    Filed: March 29, 2013
    Publication date: October 3, 2013
    Inventor: Aliaksei Ivanou
  • Publication number: 20130253920
    Abstract: A method of processing a speech signal comprises converting the speech signal to digital signals, converting the digital speech signal into short-time frames, applying a Fast Fourier Transform to each of the short-time frames to obtain an original spectrum, deriving a varied spectrum based on the original spectrum, applying discrete cosine transform to compute original cepstrum coefficients for the original spectrum and varied cepstrum coefficients for the varied spectrum and generating a set of frontend feature vectors for each of the short-time frames.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 26, 2013
    Inventor: Qiguang LIN
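    The front-end steps named in this abstract (short-time framing, a Fourier transform per frame, then a discrete cosine transform to obtain cepstrum coefficients) can be sketched as below. Several details are assumptions rather than content of the application: a naive DFT stands in for the FFT to keep the example dependency-free, a logarithm is taken before the DCT as in conventional cepstral analysis, and the derivation of the "varied spectrum" is not shown because the abstract does not specify it. All function names are hypothetical.

```python
import cmath
import math


def frames(signal, size, hop):
    """Split `signal` into overlapping short-time frames of `size` samples."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, hop)]


def magnitude_spectrum(frame):
    """Magnitude of the DFT for bins 0..N/2 (naive O(N^2) DFT, not an FFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def dct2(values, count):
    """First `count` coefficients of the unnormalized DCT-II of `values`."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(count)
    ]


def cepstrum(frame, count, floor=1e-10):
    """Cepstrum coefficients: DCT of the log magnitude spectrum of one frame."""
    # The floor avoids log(0) for empty spectral bins (an assumption here).
    log_spec = [math.log(max(m, floor)) for m in magnitude_spectrum(frame)]
    return dct2(log_spec, count)


# Example: one feature vector per short-time frame of a toy signal.
signal = [math.sin(2 * math.pi * 3 * t / 32) for t in range(64)]
features = [cepstrum(f, count=4) for f in frames(signal, size=16, hop=8)]
```

    In this sketch each frame yields one vector of coefficients; a varied spectrum, once defined, could be passed through the same log and DCT steps to produce the second set of coefficients the abstract mentions.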
  • Patent number: 8494845
    Abstract: Provided is a signal distortion elimination apparatus comprising: an inverse filter application means that outputs the signal obtained by applying an inverse filter to an observed signal as a restored signal when a predetermined iteration termination condition is met and outputs the signal obtained by applying the inverse filter to the observed signal as an ad-hoc signal when the predetermined iteration termination condition is not met; a prediction error filter calculation means that segments the ad-hoc signal into frames and outputs a prediction error filter of each frame obtained by performing linear prediction analysis of the ad-hoc signal of each frame; an inverse filter calculation means that calculates an inverse filter such that a concatenation of innovation estimates of the respective frames becomes mutually independent among their samples, where the innovation estimate of a single frame (an innovation estimate) is the signal obtained by applying the prediction error filter of the corresponding frame
    Type: Grant
    Filed: February 16, 2007
    Date of Patent: July 23, 2013
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Takuya Yoshioka, Takafumi Hikichi, Masato Miyoshi
  • Patent number: 8452587
    Abstract: Provided is an encoder which can decode a high-quality stereo signal while keeping the amount of information in the bit allocation information to a minimum when a scalable coding technique is used for a stereo signal. In the encoder, a principal component analysis (PCA) converter converts the left signal and the right signal of the stereo signal and generates the main signal of the first layer and the sub-signal of the first layer. In the first layer to the M-th layer (where M is a natural number, 2 or greater), an adaptive residual encoder compares the importance of the main signal of the m-th layer, where m is a natural number from 1 to M, and the importance of the sub-signal of the m-th layer, selects the signal having the higher importance, encodes the selected signal, and generates the encoded data of the m-th layer.
    Type: Grant
    Filed: May 29, 2009
    Date of Patent: May 28, 2013
    Assignee: Panasonic Corporation
    Inventors: Zongxian Liu, Kok Seng Chong
  • Patent number: 8452588
    Abstract: It is possible to improve quality of a decoding signal in a band spread for estimating a high band from a low band of a decoding signal. A first layer encoder encodes a lower band portion below a predetermined frequency of an input signal so as to generate first layer encoded information. A first layer decoder decodes the first layer encoded information so as to generate a first layer demodulated signal. A second layer encoder divides a high band portion, higher than a predetermined frequency, of an input signal into a plurality of sub-bands and estimates each of the sub-bands from the input signal or the first layer decoded signal by using the estimation result of the sub-band adjacent to the lower band side so as to generate second encoded information including the estimation results of the sub-bands.
    Type: Grant
    Filed: March 13, 2009
    Date of Patent: May 28, 2013
    Assignee: Panasonic Corporation
    Inventors: Tomofumi Yamanashi, Masahiro Oshikiri
  • Patent number: 8447023
    Abstract: During a conference, a multipoint control unit (MCU) designates priority and non-priority endpoints. The MCU forms priority audio from the priority endpoint and sends that audio to the other endpoints at a normal level. However, the MCU forms non-priority audio from the non-priority endpoint and handles that audio based on whether the input audio from the priority endpoint is from speaking or not. If the priority endpoint's audio indicates a participant at that endpoint is speaking, then the MCU sends the non-priority audio to the other endpoints at a reduced level. Designation of which endpoint has priority can be based on which endpoint has a current duration of audio indicative of speech that is longer than other endpoints. Alternatively, the designation can be based on which endpoint is currently presenting content during the conference or based on a mix of speech audio and content presentation.
    Type: Grant
    Filed: February 1, 2010
    Date of Patent: May 21, 2013
    Assignee: Polycom, Inc.
    Inventors: Alain Nimri, Rick Flott
  • Patent number: 8447591
    Abstract: An audio encoder/decoder uses a combination of an overlap windowing transform and block transform that have reversible implementations to provide a reversible, integer-integer form of a lapped transform. The reversible lapped transform permits both lossy and lossless transform domain coding of an audio signal having variable subframe sizes.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: May 21, 2013
    Assignee: Microsoft Corporation
    Inventor: Sanjeev Mehrotra
  • Patent number: 8412518
    Abstract: A representation of an audio signal having a first frame, a second frame following the first frame, and a third frame following the second frame, is derived by estimating first warp information for the first and the second frame and second warp information for the second frame and the third frame, the warp information describing a pitch information of the audio signal. First spectral coefficients for the first and the second frame are derived using the first warp information and a first weighted representation of the first and the second frame, the first weighted representation derived by applying a first window function to the first and the second frames, wherein the first window function depends on the first warp information.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: April 2, 2013
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Patent number: 8396230
    Abstract: A speech enhancement device and a method for the same are included. The device includes a down-converter, a speech enhancement processor, and an up-converter. The method includes steps of down-converting audio signals to generate down-converted audio signals; performing speech enhancement on the down-converted audio signals to generate speech-enhanced audio signals; and up-converting the speech-enhanced audio signals to generate up-converted audio signals.
    Type: Grant
    Filed: October 29, 2008
    Date of Patent: March 12, 2013
    Assignee: MStar Semiconductor, Inc.
    Inventors: Jung Kuei Chang, Dau Ning Guo, Shang Yi Huang, Huang Hsiang Lin, Shao Shi Chen
  • Patent number: 8392200
    Abstract: A complex analysis filterbank is implemented by obtaining an input audio signal as a plurality of N time-domain input samples. Pair-wise additions and subtractions of the time-domain input samples are performed to obtain first and second groups of intermediate samples, each group having N/2 intermediate samples. The signs of odd-indexed intermediate samples in the second group are then inverted. A first transform is applied to the first group of intermediate samples to obtain a first group of output coefficients in the frequency domain. A second transform is applied to the second group of intermediate samples to obtain an intermediate second group of output coefficients in the frequency domain. The order of coefficients in the intermediate second group of output coefficients is then reversed to obtain a second group of output coefficients. The first and second groups of output coefficients may be stored and/or transmitted as a frequency domain representation of the audio signal.
    Type: Grant
    Filed: April 13, 2010
    Date of Patent: March 5, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Ravi Kiran Chivukula, Yuriy Reznik
  • Publication number: 20130006618
    Abstract: The present invention relates to a speech processing apparatus, a speech processing method and a program which, when multichannel audio signals are downmixed and coded, prevent delay and an increase in the computation amount upon decoding of the audio signals. An inverse multiplexing unit (101) acquires coded data on which a BC parameter is multiplexed. An uncorrelated frequency-time transform unit (102) performs IMDCT transform and IMDST transform of frequency spectrum coefficients of a monaural signal (XM) obtained from this coded data to generate the monaural signal (XM) which is a time domain signal and a signal (XD′) which is substantially uncorrelated with this monaural signal (XM). The stereo synthesis unit (103) generates a stereo signal by synthesizing the monaural signal (XM) and the signal (XD′) using the BC parameter. The present invention is applicable to, for example, a speech processing apparatus which decodes a downmixed and coded stereo signal.
    Type: Application
    Filed: March 8, 2011
    Publication date: January 3, 2013
    Inventors: Yasuhiro Toguri, Shiro Suzuki, Jun Matsumoto, Yuuji Maeda, Yuuki Matsumura
  • Patent number: 8340943
    Abstract: Provided is an apparatus of separating a musical sound source, which may re-construct mixed signals into target sound sources and other sound sources directly using sound source information performed using a predetermined musical instrument when the sound source information is present, thereby more effectively separating sound sources included in the mixed signal. The apparatus may include a Nonnegative Matrix Partial Co-Factorization (NMPCF) analysis unit to perform an NMPCF analysis on a mixed signal and a predetermined sound source signal using a sound source separation model, and to obtain a plurality of entity matrices based on the analysis result, and a target instrument signal separating unit to separate, from the mixed signal, a target instrument signal corresponding to the predetermined sound source signal by calculating an inner product between the plurality of entity matrices.
    Type: Grant
    Filed: August 12, 2010
    Date of Patent: December 25, 2012
    Assignees: Electronics and Telecommunications Research Institute, Postech Academy-Industry Foundation
    Inventors: Min Je Kim, Seungjin Choi, Jiho Yoo, Kyeongok Kang, Inseon Jang, Jin-Woo Hong