Time Patents (Class 704/211)
  • Patent number: 8676584
    Abstract: The invention relates to a digital signal processing technique that changes the length of an audio signal and, thus, effectively its play-out speed. This is used for frame rate conversion or sound effects in music production. Time scaling may further be used for fast forward or slow-motion audio play-out. According to said method the waveform similarity overlap add approach is modified such that a maximized similarity is determined among similarity measures of sub-sequence pairs each comprising a sub-sequence to-be-matched from an input window and a matching sub-sequence from a search window wherein said sub-sequence pairs comprise at least two sub-sequence pairs of which a first pair comprises a first sub-sequence to-be-matched and a second pair comprises a different second sub-sequence to-be-matched. The input window allows for finding sub-sequence pairs with higher similarity than with a WSOLA approach based on a single sub-sequence to-be-matched. This results in less perceivable artefacts.
    Type: Grant
    Filed: June 22, 2009
    Date of Patent: March 18, 2014
    Assignee: Thomson Licensing
    Inventor: Markus Schlosser
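The similarity search described in the abstract of patent 8676584 can be illustrated with a minimal sketch. It is not the patented method: the function names, the unnormalized cross-correlation used as the similarity measure, and the exhaustive search order are all assumptions made for illustration.

```python
def cross_correlation(a, b):
    """Unnormalized similarity between two equal-length sequences (assumed measure)."""
    return sum(x * y for x, y in zip(a, b))


def best_pair(input_window, search_window, length):
    """Return (similarity, input_offset, search_offset) of the most similar pair.

    Every sub-sequence of `length` samples in `input_window` is tried as the
    sub-sequence to be matched. A single-template WSOLA search would instead
    keep input_offset fixed at one value.
    """
    best = None
    for i in range(len(input_window) - length + 1):
        template = input_window[i:i + length]
        for j in range(len(search_window) - length + 1):
            candidate = search_window[j:j + length]
            score = cross_correlation(template, candidate)
            # Strict comparison keeps the first pair found on ties.
            if best is None or score > best[0]:
                best = (score, i, j)
    return best
```

With `input_window = [0, 1, 0, -1, 0]`, `search_window = [0, 0, 3, 0, -3, 0]` and `length = 3`, the best pair uses the second template (offset 1) rather than the first, which is the situation the abstract describes as giving higher similarity than a single sub-sequence to be matched.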
  • Patent number: 8666752
    Abstract: Provided are an encoding apparatus and a decoding apparatus of a multi-channel signal. The encoding apparatus of the multi-channel signal may process a phase parameter associated with phase information between a plurality of channels constituting the multi-channel signal, based on a characteristic of the multi-channel signal. The encoding apparatus may generate an encoded bitstream with respect to the multi-channel signal using the processed phase parameter and a mono signal extracted from the multi-channel signal.
    Type: Grant
    Filed: March 17, 2010
    Date of Patent: March 4, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung-Hoe Kim, Eun Mi Oh
  • Patent number: 8666734
    Abstract: An apparatus includes a function module, a strength module, and a filter module. The function module compares an input signal, which has a component, to a first delayed version of the input signal and a second delayed version of the input signal to produce a multi-dimensional model. The strength module calculates a strength of each extremum from a plurality of extrema of the multi-dimensional model based on a value of at least one opposite extremum of the multi-dimensional model. The strength module then identifies a first extremum from the plurality of extrema, which is associated with a pitch of the component of the input signal, that has the strength greater than the strength of the remaining extrema. The filter module extracts the pitch of the component from the input signal based on the strength of the first extremum.
    Type: Grant
    Filed: September 23, 2010
    Date of Patent: March 4, 2014
    Assignee: University of Maryland, College Park
    Inventors: Carol Espy-Wilson, Srikanth Vishnubhotla
  • Patent number: 8660842
    Abstract: Speech recognition device uses visual information to narrow down the range of likely adaptation parameters even before a speaker makes an utterance. Images of the speaker and/or the environment are collected using an image capturing device, and then processed to extract biometric features and environmental features. The extracted features and environmental features are then used to estimate adaptation parameters. A voice sample may also be collected to refine the adaptation parameters for more accurate speech recognition.
    Type: Grant
    Filed: March 9, 2010
    Date of Patent: February 25, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventor: Antoine R. Raux
  • Patent number: 8645145
    Abstract: An audio decoder includes an arithmetic decoder for providing a plurality of decoded spectral values on the basis of an arithmetically encoded representation of the spectral values, and a frequency-domain-to-time-domain converter for providing a time-domain audio representation using the decoded spectral values. The arithmetic decoder selects a mapping rule describing a mapping of a code value onto a symbol code in dependence on a context state described by a numeric current context value. The arithmetic decoder determines the numeric current context value in dependence on a plurality of previously decoded spectral values. The arithmetic decoder evaluates a hash table, entries of which define both significant state values and boundaries of intervals of numeric context values, in order to select the mapping rule. A mapping rule index value is individually associated to a numeric context value being a significant state value.
    Type: Grant
    Filed: July 12, 2012
    Date of Patent: February 4, 2014
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Vignesh Subbaraman, Guillaume Fuchs, Markus Multrus, Nikolaus Rettelbach, Marc Gayer, Oliver Weiss, Christian Griebel, Patrick Warmbold
  • Patent number: 8634708
    Abstract: The invention relates to a method for creating a new roundup of an audiovisual document previously recorded in a device. The document contains two parts, one being the roundup and the other composed of a plurality of reports. The roundup is itself divided into a plurality of parts. The device first searches for the associations between the roundup parts and the reports, and detects the reports that are not associated with roundup parts. Then, summaries are created for the reports not associated with the roundup, and incorporated into the initial roundup to create a new roundup. In this manner, the user can easily select any report from the roundup part associated with this report. The invention also relates to the receiver suitable for implementing the method.
    Type: Grant
    Filed: December 20, 2007
    Date of Patent: January 21, 2014
    Assignee: Thomson Licensing
    Inventors: Louis Chevallier, Claire-Helene Demarty, Lionel Oisel
  • Patent number: 8630863
    Abstract: Provided is a method of encoding an audio/speech signal, the method including determining a variable length of a frame, that is, a processing unit of an input signal in accordance with a position of an attack in the input signal; transforming each frame of the input signal to a frequency domain and dividing the frame into a plurality of sub frequency bands; and, if a signal of a sub frequency band is determined to be encoded in the frequency domain, encoding the signal of the sub frequency band in the frequency domain, and if the signal of the sub frequency band is determined to be encoded in a time domain, inverse transforming the signal of the sub frequency band to the time domain and encoding the inverse transformed signal in the time domain. According to the present invention, the audio/speech signal may be efficiently encoded by controlling time resolution and frequency resolution.
    Type: Grant
    Filed: October 15, 2007
    Date of Patent: January 14, 2014
    Assignee: SAMSUNG Electronics Co., Ltd.
    Inventors: Chang-yong Son, Eun-mi Oh, Jung-hoe Kim, Ho-sang Sung, Kang-eun Lee, Ki-hyun Choo
  • Patent number: 8620645
    Abstract: A decoder arrangement comprising a receiver input for parameters of frame-based coded signals and a decoder arranged to provide frames of decoded audio signals based on the parameters. The receiver input and/or the decoder is arranged to establish a time difference between the occasion when parameters of a first frame are available at the receiver input and the occasion when a decoded audio signal of the first frame is available at an output of the decoder, which time difference corresponds to at least one frame. A postfilter is connected to the output of the decoder and to the receiver input. The postfilter is arranged to provide a filtering of the frames of decoded audio signals into an output signal in response to parameters of a respective subsequent frame.
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: December 31, 2013
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventor: Stefan Bruhn
  • Patent number: 8620646
    Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.
    Type: Grant
    Filed: August 8, 2011
    Date of Patent: December 31, 2013
    Assignee: The Intellisis Corporation
    Inventors: David C. Bradley, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
  • Patent number: 8620643
    Abstract: A computer numerical processing method for representing audio information for use in conjunction with human hearing is described. The method comprises approximating an eigenfunction equation representing a model of human hearing, calculating the approximation to each of a plurality of eigenfunctions from at least one aspect of the eigenfunction equation, and storing the approximation to each of a plurality of eigenfunctions for use at a later time. The approximation to each of a plurality of eigenfunctions represents audio information. The model of human hearing includes a bandpass operation with a bandwidth having the frequency range of human hearing and a time-limiting operation approximating the time duration correlation window of human hearing.
    Type: Grant
    Filed: August 2, 2010
    Date of Patent: December 31, 2013
    Inventor: Lester F. Ludwig
  • Patent number: 8612222
    Abstract: A speech enhancement system improves the perceptual quality of a processed voice signal. The system improves the perceptual quality of a voice signal by removing unwanted noise components from a voice signal. The system removes undesirable signals that may result in the loss of information. The system receives and analyzes signals to determine whether an undesired random or persistent signal corresponds to one or more modeled noises. When one or more noise components are detected, the noise components are substantially removed or dampened from the signal to provide a less noisy voice signal.
    Type: Grant
    Filed: August 31, 2012
    Date of Patent: December 17, 2013
    Assignee: QNX Software Systems Limited
    Inventors: Phillip A. Hetherington, Shreyas A. Paranjpe
  • Patent number: 8606566
    Abstract: A system improves speech intelligibility by reconstructing speech segments. The system includes a low-frequency reconstruction controller programmed to select a predetermined portion of a time domain signal. The low-frequency reconstruction controller substantially blocks signals above and below the selected predetermined portion. A harmonic generator generates low-frequency harmonics in the time domain that lie within a frequency range controlled by a background noise modeler. A gain controller adjusts the low-frequency harmonics to substantially match the signal strength to the time domain original input signal.
    Type: Grant
    Filed: May 23, 2008
    Date of Patent: December 10, 2013
    Assignee: QNX Software Systems Limited
    Inventors: Xueman Li, Rajeev Nongpiur, Frank Linseisen, Phillip A. Hetherington
  • Patent number: 8595005
    Abstract: A computerized method, software, and system for recognizing emotions from a speech signal, wherein statistical and MFCC features are extracted from the speech signal, the MFCC features are sorted to provide a basis for comparison between the speech signal and reference samples, the statistical and MFCC features are compared between the speech signal and reference samples, a scoring system is used to compare relative correlation to different emotions, a probable emotional state is assigned to the speech signal based on the scoring system, and the probable emotional state is communicated to a user.
    Type: Grant
    Filed: April 22, 2011
    Date of Patent: November 26, 2013
    Assignee: Simple Emotion, Inc.
    Inventors: Akash Krishnan, Matthew Fernandez
  • Patent number: 8589166
    Abstract: Systems and methods are described for performing packet loss concealment (PLC) to mitigate the effect of one or more lost frames within a series of frames that represent a speech signal. In accordance with the exemplary systems and methods, PLC is performed by searching a codebook of speech-related parameter profiles to identify content that is being spoken and by selecting a profile associated with the identified content for use in predicting or estimating speech-related parameter information associated with one or more lost frames of a speech signal. The predicted/estimated speech-related parameter information is then used to synthesize one or more frames to replace the lost frame(s) of the speech signal.
    Type: Grant
    Filed: September 21, 2010
    Date of Patent: November 19, 2013
    Assignee: Broadcom Corporation
    Inventor: Robert W. Zopf
  • Patent number: 8583434
    Abstract: Computer-implemented methods and apparatus are provided to facilitate the recognition of the content of a body of speech data. In one embodiment, a method for analyzing verbal communication is provided, comprising acts of producing an electronic recording of a plurality of spoken words; processing the electronic recording to identify a plurality of word alternatives for each of the spoken words, each of the plurality of word alternatives being identified by comparing a portion of the electronic recording with a lexicon, and each of the plurality of word alternatives being assigned a probability of correctly identifying a spoken word; loading the word alternatives and the probabilities to a database for subsequent analysis; and examining the word alternatives and the probabilities to determine at least one characteristic of the plurality of spoken words.
    Type: Grant
    Filed: January 29, 2008
    Date of Patent: November 12, 2013
    Assignee: CallMiner, Inc.
    Inventor: Jeffrey A. Gallino
  • Publication number: 20130297299
    Abstract: The speech feature extraction algorithm is based on a hierarchical combination of auditory similarity and pooling functions. Computationally efficient features referred to as “Sparse Auditory Reproducing Kernel” (SPARK) coefficients are extracted under the hypothesis that the noise-robust information in speech signal is embedded in a reproducing kernel Hilbert space (RKHS) spanned by overcomplete, nonlinear, and time-shifted gammatone basis functions. The feature extraction algorithm first involves computing kernel based similarity between the speech signal and the time-shifted gammatone functions, followed by feature pruning using a simple pooling technique (“MAX” operation). Different hyper-parameters and kernel functions may be used to enhance the performance of a SPARK based speech recognizer.
    Type: Application
    Filed: March 7, 2013
    Publication date: November 7, 2013
    Applicant: Board of Trustees of Michigan State University
    Inventors: Shantanu Chakrabartty, Amin Fazeldehkordi
  • Patent number: 8577672
    Abstract: A method and apparatus of providing an audio output to a user in a communications system in which the audio to be output to a user, preferably an audio frame, is assessed before it is broadcast to the user, and then selectively changed on the basis of the assessment. The assessment may be carried out in the audio encoding process, in the audio decoding process and/or after the audio decoding process. The selective changing of the audio output may comprise selectively replacing the audio output and/or re-encoding of the audio output.
    Type: Grant
    Filed: February 27, 2008
    Date of Patent: November 5, 2013
    Assignee: Audax Radio Systems LLP
    Inventor: Graham Kinns
  • Patent number: 8571852
    Abstract: A scalable decoder device (50) for signals representing audio comprises a primary decoder (21) connected to an input (40). The primary decoder (21) is arranged to provide a primary decoded signal (23) based on received parameters (4). A primary postfilter (31) is connected to the primary decoder (21) to provide a primary postfiltered signal (32). A secondary enhancement decoder (45) is connected to the input (40) and arranged to provide a secondary decoded enhancement signal (44). The device further comprises a combiner arrangement (55), arranged for combining the primary postfiltered signal (32) and a signal (53) based on the secondary decoded enhancement signal (44) into an output signal (6) to be provided at an output (6). The combining is made with an adaptable strength relation between contributions from the two signals. A method for decoding coded signals representing audio operates in analogy with the scalable decoder device (50).
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: October 29, 2013
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventor: Stefan Bruhn
  • Patent number: 8566092
    Abstract: The present invention discloses a method and an apparatus for extracting a prosodic feature of a speech signal, the method including: dividing the speech signal into speech frames; transforming the speech frames from time domain to frequency domain; and extracting respective prosodic features for different frequency ranges. According to the above technical solution of the present invention, it is possible to effectively extract the prosodic feature which can combine with a traditional acoustics feature without any obstacle.
    Type: Grant
    Filed: August 16, 2010
    Date of Patent: October 22, 2013
    Assignee: Sony Corporation
    Inventors: Kun Liu, Weiguo Wu
  • Patent number: 8566084
    Abstract: A speech signal processing system which outputs a speech feature, divides an input speech signal into frames so that each pair of consecutive frames have a frame shift length equal to at least one period of the speech signal and have an overlap equal to at least a predetermined length, applies discrete Fourier transform to each of the frames, calculates a CSP coefficient for the pair, searches a predetermined search range in which a speech wave lags a period equal to at least one period to obtain the maximum value of the CSP coefficient for the pair, and generates time-series data of the maximum CSP coefficient values arranged in the order in which the frames appear. A method and a computer readable article of manufacture for implementing the same are also provided.
    Type: Grant
    Filed: June 1, 2011
    Date of Patent: October 22, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Osamu Ichikawa, Masafumi Nishimura
  • Patent number: 8566083
    Abstract: An audio signal may have a BL and an EL, wherein the EL represents additional information for enhancing the quality of the BL audio content. Decoding of such dual-layer signals usually comprises partial decoding of the BL data, wherein frequency bins of the BL are restored, mapping the restored frequency bins to the MDCT domain, adding them to the decoded EL and performing inverse Integer MDCT. A low-complexity method for decoding comprises reverse mapping of the decoded EL data, adding the reverse mapped EL data to the partially decoded BL data and filtering the sum, using the inverse BL filter bank.
    Type: Grant
    Filed: September 3, 2010
    Date of Patent: October 22, 2013
    Assignee: Thomson Licensing
    Inventors: Peter Jax, Sven Kordon
  • Publication number: 20130253921
    Abstract: An embodiment of the present invention is a method of presenting at least a portion of an audio or audio-visual work including: (a) retrieving an average speed contour or a democratic speed contour from a database apparatus; and (b) presenting the at least a portion at a playback apparatus using the retrieved average speed contour or democratic speed contour to provide presentation rates.
    Type: Application
    Filed: May 16, 2013
    Publication date: September 26, 2013
    Applicant: Enounce Incorporated
    Inventor: Donald J. Hejna, Jr.
  • Patent number: 8521520
    Abstract: Provided are methods and systems of managing handoffs in a wireless communication system having different types of vocoders. Some embodiments include translating state memory of a first vocoder to a second vocoder using a state memory transcoder. The state memory may be delayed to align differences in algorithmic delays between the first vocoder and the second vocoder. In one embodiment, a speech signal may be decoded from the first vocoder, delayed, and encoded to the second vocoder. Furthermore, for a period of time during and/or adjacent to the handoff, the first vocoder may output with decreasing amplitude while the second vocoder outputs with increasing amplitude. Such techniques may be used alone or in combination.
    Type: Grant
    Filed: February 3, 2010
    Date of Patent: August 27, 2013
    Assignee: General Electric Company
    Inventors: Richard Louis Zinser, Michael James Hartman, John Erik Hershey
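The amplitude hand-over mentioned in the abstract of patent 8521520 (first vocoder output decreasing while the second increases) can be sketched as a crossfade. The linear ramp, the function name, and the per-sample weighting below are assumptions; the abstract does not specify the ramp shape.

```python
def crossfade(old_samples, new_samples):
    """Blend two equal-length sample lists with a linear ramp (assumed shape).

    The weight of `old_samples` falls while the weight of `new_samples` rises,
    so the two weights always sum to 1.
    """
    n = len(old_samples)
    if n != len(new_samples):
        raise ValueError("both outputs must cover the same number of samples")
    blended = []
    for k in range(n):
        rising = (k + 1) / (n + 1)  # weight of the second vocoder's output
        blended.append((1 - rising) * old_samples[k] + rising * new_samples[k])
    return blended
```

For example, `crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0])` yields `[0.75, 0.5, 0.25]`.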
  • Patent number: 8503517
    Abstract: A system is provided for transmitting information through a speech codec (in-band) such as found in a wireless communication network. A modulator transforms the data into a spectrally noise-like signal based on the mapping of a shaped pulse to predetermined positions within a modulation frame, and the signal is efficiently encoded by a speech codec. A synchronization sequence provides modulation frame timing at the receiver and is detected based on analysis of a correlation peak pattern. A request/response protocol provides reliable transfer of data using message redundancy, retransmission, and/or robust modulation modes dependent on the communication channel conditions.
    Type: Grant
    Filed: June 3, 2009
    Date of Patent: August 6, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Pengjun Huang, Christian Pietsch, Christian Sgraja, Georg Frank, Christoph A. Joetten, Marc W. Werner, Wolfgang Granzow
  • Patent number: 8494845
    Abstract: Provided is a signal distortion elimination apparatus comprising: an inverse filter application means that outputs the signal obtained by applying an inverse filter to an observed signal as a restored signal when a predetermined iteration termination condition is met and outputs the signal obtained by applying the inverse filter to the observed signal as an ad-hoc signal when the predetermined iteration termination condition is not met; a prediction error filter calculation means that segments the ad-hoc signal into frames and outputs a prediction error filter of each frame obtained by performing linear prediction analysis of the ad-hoc signal of each frame; an inverse filter calculation means that calculates an inverse filter such that a concatenation of innovation estimates of the respective frames becomes mutually independent among their samples, where the innovation estimate of a single frame (an innovation estimate) is the signal obtained by applying the prediction error filter of the corresponding frame
    Type: Grant
    Filed: February 16, 2007
    Date of Patent: July 23, 2013
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Takuya Yoshioka, Takafumi Hikichi, Masato Miyoshi
  • Patent number: 8494849
    Abstract: A method of transmitting speech data to a remote device in a distributed speech recognition system, includes the steps of: dividing an input speech signal into frames; calculating, for each frame, a voice activity value representative of the presence of speech activity in the frame; grouping the frames into multiframes, each multiframe including a predetermined number of frames; calculating, for each multiframe, a voice activity marker representative of the number of frames in the multiframe representing speech activity; and selectively transmitting, on the basis of the voice activity marker associated with each multiframe, the multiframes to the remote device.
    Type: Grant
    Filed: June 20, 2005
    Date of Patent: July 23, 2013
    Assignee: Telecom Italia S.p.A.
    Inventors: Ivano Salvatore Collotta, Donato Ettorre, Maurizio Fodrini, Pierluigi Gallo, Roberto Spagnolo
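The steps listed in the abstract of patent 8494849 (per-frame voice activity value, grouping into multiframes, a per-multiframe marker, selective transmission) can be sketched as follows. The mean-energy activity test, the threshold parameters, and all names are hypothetical; the abstract does not say how the voice activity value is computed.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame (assumed voice activity value)."""
    return sum(s * s for s in frame) / len(frame)


def select_multiframes(frames, frames_per_multiframe, energy_threshold, min_active):
    """Return (multiframe_index, marker) for each multiframe worth transmitting.

    The marker counts frames whose energy exceeds `energy_threshold`.
    Trailing frames that do not fill a whole multiframe are ignored here.
    """
    selected = []
    last_start = len(frames) - frames_per_multiframe
    for start in range(0, last_start + 1, frames_per_multiframe):
        group = frames[start:start + frames_per_multiframe]
        marker = sum(1 for f in group if frame_energy(f) > energy_threshold)
        if marker >= min_active:
            selected.append((start // frames_per_multiframe, marker))
    return selected
```

With six two-sample frames of which only the second and third are non-silent, grouping by two and requiring at least one active frame selects the first two multiframes and drops the third.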
  • Patent number: 8489404
    Abstract: A method for detecting a transient in an audio signal that has been broken up into frames includes obtaining a time domain feature of the frames and comparing the time domain feature with a predetermined value. If the time domain feature is greater than the predetermined value, the frames are taken as transient and if the time domain feature is less than the predetermined value, the frames are taken as non-transient. The method has a low computational intensity and is thus very suitable for devices with limited processing resources.
    Type: Grant
    Filed: March 15, 2011
    Date of Patent: July 16, 2013
    Assignee: Freescale Semiconductor, Inc.
    Inventors: Zhongsong Lin, Shidong Shang, Shengjiu Wang
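The threshold test in the abstract of patent 8489404 can be sketched as below. The abstract does not name the time domain feature, so the crest factor (peak over mean absolute amplitude) used here is an assumption, as are the function names and the choice to treat a feature exactly equal to the threshold as non-transient.

```python
def crest_factor(frame):
    """Peak absolute amplitude divided by mean absolute amplitude (assumed feature)."""
    mean_abs = sum(abs(s) for s in frame) / len(frame)
    if mean_abs == 0:
        return 0.0  # an all-zero frame carries no transient
    return max(abs(s) for s in frame) / mean_abs


def is_transient(frame, threshold):
    """Classify a frame as transient when its feature exceeds the threshold."""
    return crest_factor(frame) > threshold
```

A frame such as `[0, 0, 0, 4]` has a crest factor of 4.0 and is flagged with a threshold of 3, whereas a steady frame such as `[1, -1, 1, -1]` has a crest factor of 1.0 and is not.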
  • Patent number: 8484018
    Abstract: An input frame data producing unit produces from data stored in an input buffer input frames each including a predetermined number of sub-frames of a first hopsize determined based on the first frame size and the overlapping rate. A frame processing unit executes a window function on the input frames and shifts the windowed input frames by the first hopsize and overlaps the shifted input frames, storing the overlapped frames in an output frame. An output buffer data producing frame unit stores data from the output frame to an output buffer including a predetermined number of sub-frames of a second hopsize. A CPU sets the first hopsize and overlapping rate in a slow-speed reproduction when the reproducing speed ratio is set lower than 1 different from in a high-speed reproduction when the reproducing speed ratio is set larger than 1.
    Type: Grant
    Filed: July 15, 2010
    Date of Patent: July 9, 2013
    Assignee: Casio Computer Co., Ltd.
    Inventor: Masaru Setoguchi
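The windowing, shifting, and overlapping described in the abstract of patent 8484018 resembles generic overlap-add time scaling, which can be sketched as follows. This is a textbook overlap-add loop, not the buffer and sub-frame scheme of the patent; the periodic Hann window, the function names, and the parameter names are assumptions.

```python
import math


def hann(n):
    """Periodic Hann window of length n (assumed window function)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]


def ola_stretch(signal, frame_size, analysis_hop, synthesis_hop):
    """Read frames every `analysis_hop` samples and overlap-add them every
    `synthesis_hop` samples. The speed ratio is analysis_hop / synthesis_hop,
    so a ratio below 1 slows playback and a ratio above 1 speeds it up.
    """
    window = hann(frame_size)
    starts = range(0, len(signal) - frame_size + 1, analysis_hop)
    if not starts:
        return []
    out = [0.0] * ((len(starts) - 1) * synthesis_hop + frame_size)
    for index, start in enumerate(starts):
        position = index * synthesis_hop
        for k in range(frame_size):
            out[position + k] += signal[start + k] * window[k]
    return out
```

With a synthesis hop of half the frame size, the periodic Hann windows sum to 1 in the fully overlapped region, so a constant input stays constant there while the output becomes longer or shorter than the input.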
  • Patent number: 8478599
    Abstract: An embodiment of the present invention is a method of presenting a media work which includes: detecting media work content properties in a portion of the media work; associating a presentation rate of the portion with the detected media work content properties; and presenting the portion at the presentation rate; wherein the media work content properties include one or more of: (a) indicia of a number of syllables in utterances; (b) indicia of a number of letters in a word; (c) indicia of the complexity of grammatical structures in portions of the media work; (d) indicia of arrival rate of newly presented objects; (e) indicia of temporal proximity between events in portions of the media work; or (f) indicia of number of phonemes per unit of time in portions of the media work.
    Type: Grant
    Filed: May 18, 2009
    Date of Patent: July 2, 2013
    Assignee: Enounce, Inc.
    Inventor: Donald J. Hejna, Jr.
  • Patent number: 8473298
    Abstract: A digital audio signal can be processed using continuously variable time-frequency resolution by selecting a portion of an input digital audio signal, resampling the selected portion of the input digital audio signal, generating a plurality of spectral characteristics associated with the resampled portion of the input digital audio signal, generating a portion of an output digital audio signal from the plurality of spectral characteristics, and resampling the portion of the output digital audio signal. Further, resampling the selected portion of the input digital audio signal can comprise determining a sampling ratio and resampling the selected portion of the input digital audio signal in accordance with the determined sampling ratio. Additionally, the portion of the output digital audio signal can be resampled in accordance with the inverse of the determined sampling ratio. The sampling ratio can be determined based on a time-frequency resolution requirement associated with an audio processing algorithm.
    Type: Grant
    Filed: November 1, 2005
    Date of Patent: June 25, 2013
    Assignee: Apple Inc.
    Inventor: Kevin Christopher Rogers
  • Patent number: 8468026
    Abstract: Provided are, among other things, systems, methods and techniques for decoding an audio signal from a frame-based bit stream. At least one frame includes processing information pertaining to the frame and entropy-encoded quantization indexes representing audio data within the frame. The processing information includes: (i) code book indexes, and (ii) code book application information specifying ranges of entropy-encoded quantization indexes to which the code books are to be applied. The entropy-encoded quantization indexes are decoded by applying the identified code books to the corresponding ranges of entropy-encoded quantization indexes.
    Type: Grant
    Filed: August 7, 2012
    Date of Patent: June 18, 2013
    Assignee: Digital Rise Technology Co., Ltd.
    Inventor: Yuli You
  • Patent number: 8457955
    Abstract: A voice reproduction apparatus includes an ambient sound analysis unit to analyze a characteristic of an ambient sound, a characteristic analysis unit to analyze an acoustic characteristic of a signal for reproduction, a reproduction timing adjusting unit to record the signal for reproduction and to read the signal for reproduction at a reproduction timing of follow-up reproduction, a reproduction speed changing unit to change a reproduction speed of the read signal for reproduction, and a control unit to control the reproduction timing adjusting unit so that the signal for reproduction is reproduced at the reproduction timing corresponding to an analysis result of the ambient sound analysis unit and to control the reproduction speed changing unit so that the signal for reproduction is reproduced at the reproduction speed corresponding to the analysis result of the ambient sound analysis unit and the acoustic characteristic obtained by the characteristic analysis unit.
    Type: Grant
    Filed: March 1, 2012
    Date of Patent: June 4, 2013
    Assignee: Fujitsu Limited
    Inventors: Taro Togawa, Takeshi Otani, Kaori Endo, Yasuji Ota
  • Patent number: 8452605
    Abstract: An embodiment of an apparatus for generating audio subband values in audio subband channels has an analysis windower for windowing a frame of time-domain audio input samples being in a time sequence extending from an early sample to a later sample using an analysis window function having a sequence of window coefficients to obtain windowed samples. The analysis window function has a first group of window coefficients and a second group of window coefficients. The first group of window coefficients is used for windowing later time-domain samples and the second group of window coefficients is used for windowing earlier time-domain samples. The apparatus further has a calculator for calculating the audio subband values using the windowed samples.
    Type: Grant
    Filed: October 23, 2007
    Date of Patent: May 28, 2013
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Markus Schnell, Manfred Lutzky, Markus Lohwasser, Markus Schmidt, Marc Gayer, Michael Mellar, Bernd Edler, Markus Multrus, Gerald Schuller, Ralf Geiger, Bernhard Grill
  • Patent number: 8452589
    Abstract: An embodiment of the present invention is a method of storing a speed contour for use in playback of at least a portion of an audio or audio-visual work including: (a) generating one or more speed contours and/or average speed contours and/or democratic speed contours for the audio or audio-visual work; (b) storing the one or more speed contours and/or average speed contours and/or democratic speed contours in a database; and (c) associating retrieval information with the one or more stored contours in the database.
    Type: Grant
    Filed: February 28, 2011
    Date of Patent: May 28, 2013
    Assignee: Enounce Incorporated
    Inventor: Donald J. Hejna, Jr.
  • Publication number: 20130124200
    Abstract: Noise robust template matching may be performed. First features of a first signal may be computed. Based at least on a portion of the first features, second features of a second signal may be computed. A new signal may be generated based on at least another portion of the first features and on at least a portion of the second features.
    Type: Application
    Filed: December 22, 2011
    Publication date: May 16, 2013
    Inventors: Gautham J. Mysore, Paris Smaragdis, Brian John King
  • Patent number: 8438016
    Abstract: A client for silence-based adaptive real-time voice and video (SAVV) transmission methods and systems detects the activity of a voice stream of conversational speech, aggressively transmits the corresponding video frames if silence in the sending or receiving voice stream has been detected, and adaptively generates and transmits key frames of the video stream according to characteristics of the conversational speech. In one aspect, a coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice encoder of the SAVV client and the user's instructions. In another aspect, the coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice decoder of the SAVV client and the user's instructions. In one example, the coordination management module adaptively generates a key video frame when silence is detected in the receiving voice stream.
    Type: Grant
    Filed: April 10, 2008
    Date of Patent: May 7, 2013
    Assignee: City University of Hong Kong
    Inventors: Weijia Jia, Lizhuo Zhang, Huan Li, Wenyan Lu
  • Patent number: 8438015
    Abstract: An embodiment of an apparatus for generating audio subband values in audio subband channels includes an analysis windower for windowing a frame of time-domain audio input samples being in a time sequence extending from an early sample to a later sample using an analysis window function including a sequence of window coefficients to obtain windowed samples. The analysis window function includes a first number of window coefficients derived from a larger window function including a sequence of a larger second number of window coefficients, wherein the window coefficients of the window function are derived by an interpolation of window coefficients of the larger window function. The apparatus further includes a calculator for calculating the audio subband values using the windowed samples.
    Type: Grant
    Filed: October 23, 2007
    Date of Patent: May 7, 2013
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Markus Schnell, Manfred Lutzky, Markus Lohwasser, Markus Schmidt, Marc Gayer, Michael Mellar, Bernd Edler, Markus Multrus, Gerald Schuller, Ralf Geiger, Bernhard Grill
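The abstract for patent 8438015 describes deriving a shorter analysis window by interpolating the coefficients of a larger window function. A minimal sketch of one simple possibility follows, assuming that each output coefficient is the plain average of a group of consecutive coefficients of the larger window. The function name `interpolate_window` and the `factor` parameter are hypothetical and are not taken from the patent, which may use a different interpolation scheme.

```python
def interpolate_window(large_window, factor=2):
    """Derive a shorter window from a larger one (illustrative only).

    Each output coefficient is the mean of `factor` consecutive
    coefficients of the larger window. Averaging adjacent coefficients
    is an assumption made for this sketch.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if len(large_window) % factor != 0:
        raise ValueError("window length must be a multiple of factor")
    return [
        sum(large_window[i:i + factor]) / factor
        for i in range(0, len(large_window), factor)
    ]


if __name__ == "__main__":
    # A four-coefficient window reduced to two coefficients.
    print(interpolate_window([0.0, 1.0, 2.0, 3.0]))
```

With `factor=2`, a window of N coefficients yields N/2 coefficients, each lying between the two coefficients of the larger window from which it was derived.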
  • Patent number: 8417532
    Abstract: The transient problem may be sufficiently addressed, and for this purpose, a further delay on the side of the decoding may be reduced if a new SBR frame class is used wherein the frame boundaries are not shifted, i.e. the grid boundaries are still synchronized with the frame boundaries, but wherein a transient position indication is additionally used as a syntax element so as to be used, on the encoder and/or decoder sides, within the frames of this new frame class for determining the grid boundaries within these frames.
    Type: Grant
    Filed: October 18, 2007
    Date of Patent: April 9, 2013
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Markus Schnell, Michael Schuldt, Manfred Lutzky, Manuel Jander
  • Patent number: 8417519
    Abstract: The present invention relates to signal modification before pitch period repetition for the synthesis of blocks lost on decoding digital audio signals. The effects of repetition of transitories, such as the plosives of a speech signal, are avoided by comparing the samples of a pitch period with those of the previous pitch period. The signal is modified preferentially by taking the minimum between a current sample (e(3)) of the last pitch period (Tj) and at least one sample (e(2−T0)) of approximately the same position in the previous pitch period (Tj−1).
    Type: Grant
    Filed: October 17, 2007
    Date of Patent: April 9, 2013
    Assignee: France Telecom
    Inventors: Balazs Kovesi, Stéphane Ragot
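The minimum-taking step described in the abstract for patent 8417519 can be sketched as follows. This is only an illustration under stated assumptions: the comparison is made on sample magnitudes, the reference is the largest magnitude within a small neighborhood of the same position in the previous pitch period, and the sign of the current sample is preserved. The function name `limit_last_period` and the `neighborhood` parameter are hypothetical and do not come from the patent.

```python
def limit_last_period(signal, period, neighborhood=1):
    """Limit each sample of the last pitch period (illustrative only).

    A sample whose magnitude exceeds the largest magnitude found around
    the same position one pitch period earlier is clipped to that
    magnitude, so that an isolated transient is not repeated.
    """
    n = len(signal)
    if period < 1 or n < 2 * period:
        raise ValueError("need at least two full pitch periods")
    out = list(signal)
    start = n - period  # first index of the last pitch period
    for i in range(start, n):
        j = i - period  # same position in the previous period
        lo = max(0, j - neighborhood)
        hi = min(n, j + neighborhood + 1)
        ref = max(abs(x) for x in signal[lo:hi])
        if abs(signal[i]) > ref:
            out[i] = ref if signal[i] >= 0 else -ref
    return out


if __name__ == "__main__":
    # The spike of 5.0 in the last period has no counterpart one period
    # earlier, so it is reduced to the neighboring reference magnitude.
    frame = [0.1, -0.2, 0.1, 0.0, 0.1, 5.0, 0.1, 0.0]
    print(limit_last_period(frame, period=4))
```

Only the last pitch period is modified; earlier samples pass through unchanged, and the limited period could then be repeated to synthesize a lost block.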
  • Patent number: 8417520
    Abstract: The invention proposes the synthesis of a signal consisting of consecutive blocks. It proposes more particularly, on receipt of such a signal, to replace, by synthesis, lost or erroneous blocks of this signal. To this end, it proposes an attenuation of the overvoicing during the generation of a signal synthesis. More particularly, a voiced excitation is generated on the basis of the pitch period (T) estimated or transmitted at the previous block, by optionally applying a correction of plus or minus a sample of the duration of this period (counted in terms of number of samples), by constituting groups (A′,B′,C′,D′) of at least two samples and inverting positions of samples in the groups, randomly (B′,C′) or in a forced manner. An over-harmonicity in the excitation generated is thus broken and the effect of overvoicing in the synthesis of the generated signal is thereby attenuated.
    Type: Grant
    Filed: October 17, 2007
    Date of Patent: April 9, 2013
    Assignee: France Telecom
    Inventors: David Virette, Balazs Kovesi
  • Patent number: 8417518
    Abstract: A voice recognition system comprises: a voice input unit that receives an input signal from a voice input element and outputs it; a voice detection unit that detects an utterance segment in the input signal; a voice recognition unit that performs voice recognition for the utterance segment; and a control unit that outputs a control signal to at least one of the voice input unit and the voice detection unit and suppresses a detection frequency if the detection frequency satisfies a predetermined condition.
    Type: Grant
    Filed: February 27, 2008
    Date of Patent: April 9, 2013
    Assignee: NEC Corporation
    Inventor: Toru Iwasawa
  • Publication number: 20130066626
    Abstract: A speech enhancement method is disclosed. The method includes the steps of: receiving a plurality of frames of sound signals by a microphone array; calculating an inter-aural time difference for each frequency band of each frame of the sound signals corresponding to at least one two-microphone set of the microphone array; calculating a plurality of values of cumulative histograms according to the calculated inter-aural time differences; determining a first inter-aural time difference threshold according to the calculated values of the cumulative histograms; and filtering the plurality of frames of sound signals according to the first inter-aural time difference threshold.
    Type: Application
    Filed: March 30, 2012
    Publication date: March 14, 2013
    Applicant: Industrial Technology Research Institute
    Inventor: Hsien Cheng Liao
  • Patent number: 8396704
    Abstract: Generally speaking, embodiments of the present invention relate to speech processing such as, for example, speech recognition. Speech processing according to one embodiment of the present invention can be performed based on the occurrence of events within the electrical signals representing speech. Such events need not be instantaneous; rather, an event may be an occurrence within the electrical signal spanning some period of time. Furthermore, the electrical signal can be analyzed based on the occurrence and location of these events so that less than all of the signal is analyzed. That is, the spoken sounds can be processed based on regions of the signal around and including the events but excluding other portions of the signal. For example, transition periods before the occurrence of the events may be excluded to eliminate noise or transients introduced at that part of the signal.
    Type: Grant
    Filed: October 23, 2008
    Date of Patent: March 12, 2013
    Assignee: Red Shift Company, LLC
    Inventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
  • Patent number: 8386244
    Abstract: The present invention provides a system and method for representing quasi-periodic (“qp”) waveforms, comprising representing a plurality of limited decompositions of the qp waveform, wherein each decomposition includes a first and second amplitude value and at least one time value. In some embodiments, each of the decompositions is phase adjusted such that the arithmetic sum of the plurality of limited decompositions reconstructs the qp waveform. These decompositions are stored into a data structure having a plurality of attributes. Optionally, these attributes are used to reconstruct the qp waveform, or patterns or features of the qp waveform can be determined by using various pattern-recognition techniques. Some embodiments provide a system that uses software, embedded hardware or firmware to carry out the above-described method. Some embodiments use a computer-readable medium to store the data structure and/or instructions to execute the method.
    Type: Grant
    Filed: August 29, 2011
    Date of Patent: February 26, 2013
    Assignee: Digital Intelligence, L.L.C.
    Inventors: Carlos A. Ricci, Vladimir V. Kovtun
  • Patent number: 8386246
    Abstract: A system is described that performs frame erasure concealment to generate frames of an output speech signal corresponding to a series of erased frames of an encoded bit-stream in a manner that conceals the quality-degrading effects of such erased frames. In one embodiment, responsive to the detection of a first erased frame in the series, a number of steps are performed. These steps include deriving long-term and short-term synthesis filters based on previously-generated portions of the output speech signal, calculating a ringing signal segment based on the long-term and short-term synthesis filters, and generating a frame of the output speech signal corresponding to the first erased frame by overlap adding the ringing signal segment to an extrapolated waveform. Deriving the long-term filter includes estimating a pitch period based on a previously-generated portion of the output speech signal by finding a lag that minimizes a sum of magnitude difference function.
    Type: Grant
    Filed: June 27, 2008
    Date of Patent: February 26, 2013
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
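The pitch estimation mentioned at the end of the abstract for patent 8386246 finds the lag that minimizes a sum of magnitude difference function. A minimal sketch of that idea is given below, assuming a plain exhaustive search over integer lags and a fixed analysis window at the end of the signal. The function name `estimate_pitch_smdf` and its parameters are hypothetical; the patented system may add decimation, refinement, or weighting that is not shown here.

```python
def estimate_pitch_smdf(signal, min_lag, max_lag, window):
    """Return the lag minimizing the sum of magnitude differences.

    For each candidate lag, the last `window` samples are compared with
    the samples `lag` positions earlier; the lag with the smallest
    summed absolute difference is taken as the pitch period estimate.
    """
    if min_lag < 1 or max_lag < min_lag or window < 1:
        raise ValueError("invalid lag range or window length")
    n = len(signal)
    if n < window + max_lag:
        raise ValueError("signal too short for the requested search")
    best_lag = min_lag
    best_cost = float("inf")
    for lag in range(min_lag, max_lag + 1):
        cost = sum(abs(signal[i] - signal[i - lag])
                   for i in range(n - window, n))
        if cost < best_cost:  # strict comparison keeps the smallest lag on ties
            best_lag, best_cost = lag, cost
    return best_lag


if __name__ == "__main__":
    # A toy waveform that repeats exactly every 8 samples.
    pattern = [0, 3, 5, 2, -1, -4, -2, 1]
    history = pattern * 8
    print(estimate_pitch_smdf(history, min_lag=3, max_lag=12, window=16))
```

Because the search uses only additions and absolute values, it avoids the multiplications needed by a correlation-based search, which is one common reason for choosing a magnitude difference measure.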
  • Patent number: 8380484
    Abstract: A method (50) of dynamically changing a sentence structure of a message can include the steps of receiving (51) a user request for information, retrieving (52) data based on the information requested, and altering (53) an intonation and/or the language conveying the information based on the context of the information to be presented. The intonation can optionally be altered by altering (54) a volume, a speed, and/or a pitch based on the information to be presented. The language can be altered by selecting (55) among a finite set of synonyms based on the information to be presented to the user or by selecting (56) among key verbs, adjectives or adverbs that vary along a continuum.
    Type: Grant
    Filed: August 10, 2004
    Date of Patent: February 19, 2013
    Assignee: International Business Machines Corporation
    Inventors: Brent L. Davis, Stephen W. Hanley, Vanessa V. Michelini, Melanie D. Polkosky
  • Patent number: 8370132
    Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.
    Type: Grant
    Filed: November 21, 2005
    Date of Patent: February 5, 2013
    Assignee: Verizon Services Corp.
    Inventor: Adrian E. Conway
  • Patent number: 8364472
    Abstract: Provided is an audio encoding device which can detect an optimal pitch pulse when using pitch pulse information as redundant information.
    Type: Grant
    Filed: February 29, 2008
    Date of Patent: January 29, 2013
    Assignee: Panasonic Corporation
    Inventor: Hiroyuki Ehara
  • Publication number: 20130010983
    Abstract: A signal manipulator for manipulating an audio signal having a transient event may have a transient remover, a signal processor and a signal inserter for inserting a time portion in a processed audio signal at a signal location where the transient event was removed before processing by the transient remover, so that a manipulated audio signal has a transient event not influenced by the processing, whereby the vertical coherence of the transient event is maintained rather than being subjected to the processing performed in the signal processor, which would destroy the vertical coherence of a transient.
    Type: Application
    Filed: May 7, 2012
    Publication date: January 10, 2013
    Inventors: Sascha Disch, Frederik Nagel, Nikolaus Rettelbach, Markus Multrus, Guillaume Fuchs
  • Patent number: 8352249
    Abstract: An encoding device improves the sound quality of a stereo signal while maintaining a low bit rate. The encoding device includes: an LP inverse filter which LP-inverse-filters a left signal L(n) by using an inverse quantization linear prediction coefficient AdM(z) of a monaural signal; a T/F conversion unit which converts the left sound source signal Le(n) from the time domain to the frequency domain; an inverse quantizer which inverse-quantizes encoded information Mqe; spectrum division units which divide a high-frequency component of the sound source signal Mde(f) and the left signal Le(f) into a plurality of bands; and scale factor calculation units which calculate scale factors αi and βi by using a monaural sound source signal Mdeh,i(f), a left sound source signal Leh,i(f), Mdeh,i(f), and right sound source signal Reh,i(f) of each divided band.
    Type: Grant
    Filed: November 4, 2008
    Date of Patent: January 8, 2013
    Assignee: Panasonic Corporation
    Inventors: Kok Seng Chong, Koji Yoshida, Masahiro Oshikiri