Time Patents (Class 704/211)
-
Patent number: 8676584
Abstract: The invention relates to a digital signal processing technique that changes the length of an audio signal and, thus, effectively its play-out speed. This is used for frame rate conversion or sound effects in music production. Time scaling may further be used for fast forward or slow-motion audio play-out. According to said method, the waveform similarity overlap add (WSOLA) approach is modified such that a maximized similarity is determined among similarity measures of sub-sequence pairs, each comprising a sub-sequence to-be-matched from an input window and a matching sub-sequence from a search window, wherein said sub-sequence pairs comprise at least two sub-sequence pairs of which a first pair comprises a first sub-sequence to-be-matched and a second pair comprises a different second sub-sequence to-be-matched. The input window allows for finding sub-sequence pairs with higher similarity than with a WSOLA approach based on a single sub-sequence to-be-matched. This results in less perceivable artefacts.
Type: Grant
Filed: June 22, 2009
Date of Patent: March 18, 2014
Assignee: Thomson Licensing
Inventor: Markus Schlosser
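The search described in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the normalized cross-correlation used as the similarity measure, the exhaustive search, and all names (`similarity`, `best_pair`, the window parameters) are assumptions chosen for illustration.

```python
def similarity(a, b):
    """Normalized cross-correlation of two equal-length sequences (assumed measure)."""
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return num / den if den > 0 else 0.0


def best_pair(signal, input_start, input_len, search_start, search_len, seg_len):
    """Return (similarity, template_offset, match_offset) with maximal similarity.

    Every sub-sequence of length seg_len inside the input window is tried as
    a sub-sequence to-be-matched, and each one is compared with every
    sub-sequence of the same length inside the search window.
    """
    best = (-2.0, None, None)  # below the minimum possible correlation of -1
    for t in range(input_start, input_start + input_len - seg_len + 1):
        template = signal[t:t + seg_len]
        for m in range(search_start, search_start + search_len - seg_len + 1):
            s = similarity(template, signal[m:m + seg_len])
            if s > best[0]:
                best = (s, t, m)
    return best


# Usage: a periodic test signal; several templates are tried, not just one.
signal = [0, 1, 0, -1] * 8
score, template_at, match_at = best_pair(signal, 0, 6, 8, 8, 4)
```

A plain WSOLA search corresponds to an input window that holds exactly one template (`input_len == seg_len`); widening the input window is what lets the search consider several sub-sequences to-be-matched.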
-
Patent number: 8666752
Abstract: Provided are an encoding apparatus and a decoding apparatus of a multi-channel signal. The encoding apparatus of the multi-channel signal may process a phase parameter associated with phase information between a plurality of channels constituting the multi-channel signal, based on a characteristic of the multi-channel signal. The encoding apparatus may generate an encoded bitstream with respect to the multi-channel signal using the processed phase parameter and a mono signal extracted from the multi-channel signal.
Type: Grant
Filed: March 17, 2010
Date of Patent: March 4, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Jung-Hoe Kim, Eun Mi Oh
-
Patent number: 8666734
Abstract: An apparatus includes a function module, a strength module, and a filter module. The function module compares an input signal, which has a component, to a first delayed version of the input signal and a second delayed version of the input signal to produce a multi-dimensional model. The strength module calculates a strength of each extremum from a plurality of extrema of the multi-dimensional model based on a value of at least one opposite extremum of the multi-dimensional model. The strength module then identifies a first extremum from the plurality of extrema, which is associated with a pitch of the component of the input signal, that has the strength greater than the strength of the remaining extrema. The filter module extracts the pitch of the component from the input signal based on the strength of the first extremum.
Type: Grant
Filed: September 23, 2010
Date of Patent: March 4, 2014
Assignee: University of Maryland, College Park
Inventors: Carol Espy-Wilson, Srikanth Vishnubhotla
-
Patent number: 8660842
Abstract: Speech recognition device uses visual information to narrow down the range of likely adaptation parameters even before a speaker makes an utterance. Images of the speaker and/or the environment are collected using an image capturing device, and then processed to extract biometric features and environmental features. The extracted features and environmental features are then used to estimate adaptation parameters. A voice sample may also be collected to refine the adaptation parameters for more accurate speech recognition.
Type: Grant
Filed: March 9, 2010
Date of Patent: February 25, 2014
Assignee: Honda Motor Co., Ltd.
Inventor: Antoine R. Raux
-
Patent number: 8645145
Abstract: An audio decoder includes an arithmetic decoder for providing a plurality of decoded spectral values on the basis of an arithmetically encoded representation of the spectral values, and a frequency-domain-to-time-domain converter for providing a time-domain audio representation using the decoded spectral values. The arithmetic decoder selects a mapping rule describing a mapping of a code value onto a symbol code in dependence on a context state described by a numeric current context value. The arithmetic decoder determines the numeric current context value in dependence on a plurality of previously decoded spectral values. The arithmetic decoder evaluates a hash table, entries of which define both significant state values and boundaries of intervals of numeric context values, in order to select the mapping rule. A mapping rule index value is individually associated to a numeric context value being a significant state value.
Type: Grant
Filed: July 12, 2012
Date of Patent: February 4, 2014
Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
Inventors: Vignesh Subbaraman, Guillaume Fuchs, Markus Multrus, Nikolaus Rettelbach, Marc Gayer, Oliver Weiss, Christian Griebel, Patrick Warmbold
-
Patent number: 8634708
Abstract: The invention relates to a method for creating a new roundup of an audiovisual document previously recorded in a device. The document contains two parts, one being the roundup and the other composed of a plurality of reports. The roundup is itself divided into a plurality of parts. The device first searches for the associations between the roundup parts and the reports, and detects the reports that are not associated with roundup parts. Then, summaries are created for the reports not associated with the roundup, and incorporated into the initial roundup to create a new roundup. In this manner, the user can easily select any report from the roundup part associated with this report. The invention also relates to the receiver suitable for implementing the method.
Type: Grant
Filed: December 20, 2007
Date of Patent: January 21, 2014
Assignee: Thomson Licensing
Inventors: Louis Chevallier, Claire-Helene Demarty, Lionel Oisel
-
Patent number: 8630863
Abstract: Provided is a method of encoding an audio/speech signal, the method including determining a variable length of a frame, that is, a processing unit of an input signal in accordance with a position of an attack in the input signal; transforming each frame of the input signal to a frequency domain and dividing the frame into a plurality of sub frequency bands; and, if a signal of a sub frequency band is determined to be encoded in the frequency domain, encoding the signal of the sub frequency band in the frequency domain, and if the signal of the sub frequency band is determined to be encoded in a time domain, inverse transforming the signal of the sub frequency band to the time domain and encoding the inverse transformed signal in the time domain. According to the present invention, the audio/speech signal may be efficiently encoded by controlling time resolution and frequency resolution.
Type: Grant
Filed: October 15, 2007
Date of Patent: January 14, 2014
Assignee: SAMSUNG Electronics Co., Ltd.
Inventors: Chang-yong Son, Eun-mi Oh, Jung-hoe Kim, Ho-sang Sung, Kang-eun Lee, Ki-hyun Choo
-
Patent number: 8620645
Abstract: A decoder arrangement comprising a receiver input for parameters of frame-based coded signals and a decoder arranged to provide frames of decoded audio signals based on the parameters. The receiver input and/or the decoder is arranged to establish a time difference between the occasion when parameters of a first frame are available at the receiver input and the occasion when a decoded audio signal of the first frame is available at an output of the decoder, which time difference corresponds to at least one frame. A postfilter is connected to the output of the decoder and to the receiver input. The postfilter is arranged to provide a filtering of the frames of decoded audio signals into an output signal in response to parameters of a respective subsequent frame.
Type: Grant
Filed: December 14, 2007
Date of Patent: December 31, 2013
Assignee: Telefonaktiebolaget L M Ericsson (publ)
Inventor: Stefan Bruhn
-
Patent number: 8620646
Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.
Type: Grant
Filed: August 8, 2011
Date of Patent: December 31, 2013
Assignee: The Intellisis Corporation
Inventors: David C. Bradley, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
-
Patent number: 8620643
Abstract: A computer numerical processing method for representing audio information for use in conjunction with human hearing is described. The method comprises approximating an eigenfunction equation representing a model of human hearing, calculating the approximation to each of a plurality of eigenfunctions from at least one aspect of the eigenfunction equation, and storing the approximation to each of a plurality of eigenfunctions for use at a later time. The approximation to each of a plurality of eigenfunctions represents audio information. The model of human hearing includes a bandpass operation with a bandwidth having the frequency range of human hearing and a time-limiting operation approximating the time duration correlation window of human hearing.
Type: Grant
Filed: August 2, 2010
Date of Patent: December 31, 2013
Inventor: Lester F. Ludwig
-
Patent number: 8612222
Abstract: A speech enhancement system improves the perceptual quality of a processed voice signal. The system improves the perceptual quality of a voice signal by removing unwanted noise components from a voice signal. The system removes undesirable signals that may result in the loss of information. The system receives and analyzes signals to determine whether an undesired random or persistent signal corresponds to one or more modeled noises. When one or more noise components are detected, the noise components are substantially removed or dampened from the signal to provide a less noisy voice signal.
Type: Grant
Filed: August 31, 2012
Date of Patent: December 17, 2013
Assignee: QNX Software Systems Limited
Inventors: Phillip A. Hetherington, Shreyas A. Paranjpe
-
Patent number: 8606566
Abstract: A system improves speech intelligibility by reconstructing speech segments. The system includes a low-frequency reconstruction controller programmed to select a predetermined portion of a time domain signal. The low-frequency reconstruction controller substantially blocks signals above and below the selected predetermined portion. A harmonic generator generates low-frequency harmonics in the time domain that lie within a frequency range controlled by a background noise modeler. A gain controller adjusts the low-frequency harmonics to substantially match the signal strength to the time domain original input signal.
Type: Grant
Filed: May 23, 2008
Date of Patent: December 10, 2013
Assignee: QNX Software Systems Limited
Inventors: Xueman Li, Rajeev Nongpiur, Frank Linseisen, Phillip A. Hetherington
-
Patent number: 8595005
Abstract: A computerized method, software, and system for recognizing emotions from a speech signal, wherein statistical and MFCC features are extracted from the speech signal, the MFCC features are sorted to provide a basis for comparison between the speech signal and reference samples, the statistical and MFCC features are compared between the speech signal and reference samples, a scoring system is used to compare relative correlation to different emotions, a probable emotional state is assigned to the speech signal based on the scoring system, and the probable emotional state is communicated to a user.
Type: Grant
Filed: April 22, 2011
Date of Patent: November 26, 2013
Assignee: Simple Emotion, Inc.
Inventors: Akash Krishnan, Matthew Fernandez
-
Patent number: 8589166
Abstract: Systems and methods are described for performing packet loss concealment (PLC) to mitigate the effect of one or more lost frames within a series of frames that represent a speech signal. In accordance with the exemplary systems and methods, PLC is performed by searching a codebook of speech-related parameter profiles to identify content that is being spoken and by selecting a profile associated with the identified content for use in predicting or estimating speech-related parameter information associated with one or more lost frames of a speech signal. The predicted/estimated speech-related parameter information is then used to synthesize one or more frames to replace the lost frame(s) of the speech signal.
Type: Grant
Filed: September 21, 2010
Date of Patent: November 19, 2013
Assignee: Broadcom Corporation
Inventor: Robert W. Zopf
-
Patent number: 8583434
Abstract: Computer-implemented methods and apparatus are provided to facilitate the recognition of the content of a body of speech data. In one embodiment, a method for analyzing verbal communication is provided, comprising acts of producing an electronic recording of a plurality of spoken words; processing the electronic recording to identify a plurality of word alternatives for each of the spoken words, each of the plurality of word alternatives being identified by comparing a portion of the electronic recording with a lexicon, and each of the plurality of word alternatives being assigned a probability of correctly identifying a spoken word; loading the word alternatives and the probabilities to a database for subsequent analysis; and examining the word alternatives and the probabilities to determine at least one characteristic of the plurality of spoken words.
Type: Grant
Filed: January 29, 2008
Date of Patent: November 12, 2013
Assignee: CallMiner, Inc.
Inventor: Jeffrey A. Gallino
-
Publication number: 20130297299
Abstract: The speech feature extraction algorithm is based on a hierarchical combination of auditory similarity and pooling functions. Computationally efficient features referred to as “Sparse Auditory Reproducing Kernel” (SPARK) coefficients are extracted under the hypothesis that the noise-robust information in speech signal is embedded in a reproducing kernel Hilbert space (RKHS) spanned by overcomplete, nonlinear, and time-shifted gammatone basis functions. The feature extraction algorithm first involves computing kernel based similarity between the speech signal and the time-shifted gammatone functions, followed by feature pruning using a simple pooling technique (“MAX” operation). Different hyper-parameters and kernel functions may be used to enhance the performance of a SPARK based speech recognizer.
Type: Application
Filed: March 7, 2013
Publication date: November 7, 2013
Applicant: Board of Trustees of Michigan State University
Inventors: Shantanu Chakrabartty, Amin Fazeldehkordi
-
Patent number: 8577672
Abstract: A method and apparatus of providing an audio output to a user in a communications system in which the audio to be output to a user, preferably an audio frame, is assessed before it is broadcast to the user, and then selectively changed on the basis of the assessment. The assessment may be carried out in the audio encoding process, in the audio decoding process and/or after the audio decoding process. The selective changing of the audio output may comprise selectively replacing the audio output and/or re-encoding of the audio output.
Type: Grant
Filed: February 27, 2008
Date of Patent: November 5, 2013
Assignee: Audax Radio Systems LLP
Inventor: Graham Kinns
-
Patent number: 8571852
Abstract: A scalable decoder device (50) for signals representing audio comprises a primary decoder (21) connected to an input (40). The primary decoder (21) is arranged to provide a primary decoded signal (23) based on received parameters (4). A primary postfilter (31) is connected to the primary decoder (23) to provide a primary postfiltered signal (32). A secondary enhancement decoder (45) is connected to the input (40) and arranged to provide a secondary decoded enhancement signal (44). The device further comprises a combiner arrangement (55), arranged for combining the primary postfiltered signal (32) and a signal (53) based on the secondary decoded enhancement signal (44) into an output signal (6) to be provided at an output (6). The combining is made with an adaptable strength relation between contributions from the two signals. A method for decoding coded signals representing audio operates in analogy with the scalable decoder device (50).
Type: Grant
Filed: December 14, 2007
Date of Patent: October 29, 2013
Assignee: Telefonaktiebolaget L M Ericsson (publ)
Inventor: Stefan Bruhn
-
Patent number: 8566092
Abstract: The present invention discloses a method and an apparatus for extracting a prosodic feature of a speech signal, the method including: dividing the speech signal into speech frames; transforming the speech frames from time domain to frequency domain; and extracting respective prosodic features for different frequency ranges. According to the above technical solution of the present invention, it is possible to effectively extract the prosodic feature which can combine with a traditional acoustics feature without any obstacle.
Type: Grant
Filed: August 16, 2010
Date of Patent: October 22, 2013
Assignee: Sony Corporation
Inventors: Kun Liu, Weiguo Wu
-
Patent number: 8566084
Abstract: A speech signal processing system which outputs a speech feature, divides an input speech signal into frames so that each pair of consecutive frames have a frame shift length equal to at least one period of the speech signal and have an overlap equal to at least a predetermined length, applies discrete Fourier transform to each of the frames, calculates a CSP coefficient for the pair, searches a predetermined search range in which a speech wave lags a period equal to at least one period to obtain the maximum value of the CSP coefficient for the pair, and generates time-series data of the maximum CSP coefficient values arranged in the order in which the frames appear. A method and a computer readable article of manufacture for implementing the same are also provided.
Type: Grant
Filed: June 1, 2011
Date of Patent: October 22, 2013
Assignee: Nuance Communications, Inc.
Inventors: Osamu Ichikawa, Masafumi Nishimura
-
Patent number: 8566083
Abstract: An audio signal may have a BL and an EL, wherein the EL represents additional information for enhancing the quality of the BL audio content. Decoding of such dual-layer signals usually comprises partial decoding of the BL data, wherein frequency bins of the BL are restored, mapping the restored frequency bins to the MDCT domain, adding them to the decoded EL and performing inverse Integer MDCT. A low-complexity method for decoding comprises reverse mapping of the decoded EL data, adding the reverse mapped EL data to the partially decoded BL data and filtering the sum, using the inverse BL filter bank.
Type: Grant
Filed: September 3, 2010
Date of Patent: October 22, 2013
Assignee: Thomson Licensing
Inventors: Peter Jax, Sven Kordon
-
Publication number: 20130253921
Abstract: An embodiment of the present invention is a method of presenting at least a portion of an audio or audio-visual work including: (a) retrieving an average speed contour or a democratic speed contour from a database apparatus; and (b) presenting the at least a portion at a playback apparatus using the retrieved average speed contour or democratic speed contour to provide presentation rates.
Type: Application
Filed: May 16, 2013
Publication date: September 26, 2013
Applicant: Enounce Incorporated
Inventor: Donald J. Hejna, Jr.
-
Patent number: 8521520
Abstract: Provided are methods and systems of managing handoffs in a wireless communication system having different types of vocoders. Some embodiments include translating state memory of a first vocoder to a second vocoder using a state memory transcoder. The state memory may be delayed to align differences in algorithmic delays between the first vocoder and the second vocoder. In one embodiment, a speech signal may be decoded from the first vocoder, delayed, and encoded to the second vocoder. Furthermore, for a period of time during and/or adjacent to the handoff, the first vocoder may output with decreasing amplitude while the second vocoder outputs with increasing amplitude. Such techniques may be used alone or in combination.
Type: Grant
Filed: February 3, 2010
Date of Patent: August 27, 2013
Assignee: General Electric Company
Inventors: Richard Louis Zinser, Michael James Hartman, John Erik Hershey
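The amplitude hand-over mentioned in the abstract above (first vocoder fading out while the second fades in) resembles an ordinary cross-fade. The sketch below assumes a linear ramp and sample-aligned outputs; the function name `crossfade` and the ramp shape are illustrative choices, not details taken from the patent.

```python
def crossfade(old_output, new_output):
    """Blend two sample-aligned decoder outputs over the handoff interval.

    The weight of old_output decreases linearly while the weight of
    new_output increases linearly (assumed ramp shape).
    """
    if len(old_output) != len(new_output):
        raise ValueError("outputs must cover the same handoff interval")
    n = len(old_output)
    blended = []
    for i, (old, new) in enumerate(zip(old_output, new_output)):
        w = (i + 1) / (n + 1)  # weight of the second vocoder, strictly between 0 and 1
        blended.append((1.0 - w) * old + w * new)
    return blended


# Usage: the first vocoder outputs a constant 1.0, the second a constant 0.0.
mixed = crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
```

The abstract also describes delaying signals to compensate for different algorithmic delays; the sketch assumes that alignment has already happened.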
-
Patent number: 8503517
Abstract: A system is provided for transmitting information through a speech codec (in-band) such as found in a wireless communication network. A modulator transforms the data into a spectrally noise-like signal based on the mapping of a shaped pulse to predetermined positions within a modulation frame, and the signal is efficiently encoded by a speech codec. A synchronization sequence provides modulation frame timing at the receiver and is detected based on analysis of a correlation peak pattern. A request/response protocol provides reliable transfer of data using message redundancy, retransmission, and/or robust modulation modes dependent on the communication channel conditions.
Type: Grant
Filed: June 3, 2009
Date of Patent: August 6, 2013
Assignee: QUALCOMM Incorporated
Inventors: Pengjun Huang, Christian Pietsch, Christian Sgraja, Georg Frank, Christoph A. Joetten, Marc W. Werner, Wolfgang Granzow
-
Patent number: 8494845
Abstract: Provided is a signal distortion elimination apparatus comprising: an inverse filter application means that outputs the signal obtained by applying an inverse filter to an observed signal as a restored signal when a predetermined iteration termination condition is met and outputs the signal obtained by applying the inverse filter to the observed signal as an ad-hoc signal when the predetermined iteration termination condition is not met; a prediction error filter calculation means that segments the ad-hoc signal into frames and outputs a prediction error filter of each frame obtained by performing linear prediction analysis of the ad-hoc signal of each frame; an inverse filter calculation means that calculates an inverse filter such that a concatenation of innovation estimates of the respective frames becomes mutually independent among their samples, where the innovation estimate of a single frame (an innovation estimate) is the signal obtained by applying the prediction error filter of the corresponding frame
Type: Grant
Filed: February 16, 2007
Date of Patent: July 23, 2013
Assignee: Nippon Telegraph and Telephone Corporation
Inventors: Takuya Yoshioka, Takafumi Hikichi, Masato Miyoshi
-
Patent number: 8494849
Abstract: A method of transmitting speech data to a remote device in a distributed speech recognition system, includes the steps of: dividing an input speech signal into frames; calculating, for each frame, a voice activity value representative of the presence of speech activity in the frame; grouping the frames into multiframes, each multiframe including a predetermined number of frames; calculating, for each multiframe, a voice activity marker representative of the number of frames in the multiframe representing speech activity; and selectively transmitting, on the basis of the voice activity marker associated with each multiframe, the multiframes to the remote device.
Type: Grant
Filed: June 20, 2005
Date of Patent: July 23, 2013
Assignee: Telecom Italia S.p.A.
Inventors: Ivano Salvatore Collotta, Donato Ettorre, Maurizio Fodrini, Pierluigi Gallo, Roberto Spagnolo
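The steps listed in the abstract above (per-frame activity value, grouping into multiframes, a per-multiframe marker, selective transmission) can be sketched as follows. The energy-based activity decision, the thresholds, and the names `frame_energy` and `select_multiframes` are assumptions for illustration; the abstract does not specify how the voice activity value is computed.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame (assumed voice activity value)."""
    return sum(x * x for x in frame) / len(frame)


def select_multiframes(frames, frames_per_multiframe, energy_threshold, min_active):
    """Return (multiframe_index, marker) for each multiframe worth transmitting.

    The marker counts the frames in a multiframe whose energy exceeds
    energy_threshold; a multiframe is selected when the marker reaches
    min_active. Trailing frames that do not fill a multiframe are ignored.
    """
    selected = []
    last_start = len(frames) - frames_per_multiframe
    for start in range(0, last_start + 1, frames_per_multiframe):
        group = frames[start:start + frames_per_multiframe]
        marker = sum(1 for f in group if frame_energy(f) > energy_threshold)
        if marker >= min_active:
            selected.append((start // frames_per_multiframe, marker))
    return selected


# Usage: six two-sample frames, three frames per multiframe.
frames = [[0, 0], [1, 1], [1, 1], [0, 0], [0, 0], [0, 0]]
to_send = select_multiframes(frames, 3, 0.5, 2)
```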
-
Patent number: 8489404
Abstract: A method for detecting a transient in an audio signal that has been broken up into frames includes obtaining a time domain feature of the frames and comparing the time domain feature with a predetermined value. If the time domain feature is greater than the predetermined value, the frames are taken as transient and if the time domain feature is less than the predetermined value, the frames are taken as non-transient. The method has a low computational intensity and is thus very suitable for devices with limited processing resources.
Type: Grant
Filed: March 15, 2011
Date of Patent: July 16, 2013
Assignee: Freescale Semiconductor, Inc.
Inventors: Zhongsong Lin, Shidong Shang, Shengjiu Wang
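As an illustration of the threshold test in the abstract above, the sketch below uses one possible time domain feature: the ratio of the largest sub-block energy to the mean sub-block energy. The abstract does not say which feature the patent uses, so the feature, the default threshold, and the names `time_domain_feature` and `is_transient` are all hypothetical.

```python
def time_domain_feature(frame, blocks=4):
    """Peak-to-mean ratio of sub-block energies (assumed feature).

    The frame is split into `blocks` equal parts; samples left over when the
    frame length is not a multiple of `blocks` are ignored.
    """
    size = len(frame) // blocks
    energies = [
        sum(x * x for x in frame[i * size:(i + 1) * size]) for i in range(blocks)
    ]
    mean = sum(energies) / blocks
    return max(energies) / mean if mean > 0 else 0.0


def is_transient(frame, threshold=2.0, blocks=4):
    """Classify a frame as transient when the feature exceeds the threshold."""
    return time_domain_feature(frame, blocks) > threshold


# Usage: a steady frame versus a frame with a sudden onset at the end.
steady = is_transient([1, 1, 1, 1, 1, 1, 1, 1])
onset = is_transient([0, 0, 0, 0, 0, 0, 4, 4])
```

Only additions, multiplications, and one division per frame are needed, which is in keeping with the low computational cost the abstract emphasizes.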
-
Patent number: 8484018
Abstract: An input frame data producing unit produces from data stored in an input buffer input frames each including a predetermined number of sub-frames of a first hopsize determined based on the first frame size and the overlapping rate. A frame processing unit executes a window function on the input frames and shifts the windowed input frames by the first hopsize and overlaps the shifted input frames, storing the overlapped frames in an output frame. An output buffer data producing frame unit stores data from the output frame to an output buffer including a predetermined number of sub-frames of a second hopsize. A CPU sets the first hopsize and overlapping rate in a slow-speed reproduction when the reproducing speed ratio is set lower than 1 different from in a high-speed reproduction when the reproducing speed ratio is set larger than 1.
Type: Grant
Filed: July 15, 2010
Date of Patent: July 9, 2013
Assignee: Casio Computer Co., Ltd
Inventor: Masaru Setoguchi
-
Patent number: 8478599
Abstract: An embodiment of the present invention is a method of presenting a media work which includes: detecting media work content properties in a portion of the media work; associating a presentation rate of the portion with the detected media work content properties; and presenting the portion at the presentation rate; wherein the media work content properties include one or more of: (a) indicia of a number of syllables in utterances; (b) indicia of a number of letters in a word; (c) indicia of the complexity of grammatical structures in portions of the media work; (d) indicia of arrival rate of newly presented objects; (e) indicia of temporal proximity between events in portions of the media work; or (f) indicia of number of phonemes per unit of time in portions of the media work.
Type: Grant
Filed: May 18, 2009
Date of Patent: July 2, 2013
Assignee: Enounce, Inc.
Inventor: Donald J. Hejna, Jr.
-
Patent number: 8473298
Abstract: A digital audio signal can be processed using continuously variable time-frequency resolution by selecting a portion of an input digital audio signal, resampling the selected portion of the input digital audio signal, generating a plurality of spectral characteristics associated with the resampled portion of the input digital audio signal, generating a portion of an output digital audio signal from the plurality of spectral characteristics, and resampling the portion of the output digital audio signal. Further, resampling the selected portion of the input digital audio signal can comprise determining a sampling ratio and resampling the selected portion of the input digital audio signal in accordance with the determined sampling ratio. Additionally, the portion of the output digital audio signal can be resampled in accordance with the inverse of the determined sampling ratio. The sampling ratio can be determined based on a time-frequency resolution requirement associated with an audio processing algorithm.
Type: Grant
Filed: November 1, 2005
Date of Patent: June 25, 2013
Assignee: Apple Inc.
Inventor: Kevin Christopher Rogers
-
Patent number: 8468026
Abstract: Provided are, among other things, systems, methods and techniques for decoding an audio signal from a frame-based bit stream. At least one frame includes processing information pertaining to the frame and entropy-encoded quantization indexes representing audio data within the frame. The processing information includes: (i) code book indexes, and (ii) code book application information specifying ranges of entropy-encoded quantization indexes to which the code books are to be applied. The entropy-encoded quantization indexes are decoded by applying the identified code books to the corresponding ranges of entropy-encoded quantization indexes.
Type: Grant
Filed: August 7, 2012
Date of Patent: June 18, 2013
Assignee: Digital Rise Technology Co., Ltd.
Inventor: Yuli You
-
Patent number: 8457955
Abstract: A voice reproduction apparatus includes an ambient sound analysis unit to analyze a characteristic of an ambient sound, a characteristic analysis unit to analyze an acoustic characteristic of a signal for reproduction, a reproduction timing adjusting unit to record the signal for reproduction and to read the signal for reproduction at a reproduction timing of follow-up reproduction, a reproduction speed changing unit to change a reproduction speed of the read signal for reproduction, and a control unit to control the reproduction timing adjusting unit so that the signal for reproduction is reproduced at the reproduction timing corresponding to an analysis result of the ambient sound analysis unit and to control the reproduction speed changing unit so that the signal for reproduction is reproduced at the reproduction speed corresponding to the analysis result of the ambient sound analysis unit and the acoustic characteristic obtained by the characteristic analysis unit.
Type: Grant
Filed: March 1, 2012
Date of Patent: June 4, 2013
Assignee: Fujitsu Limited
Inventors: Taro Togawa, Takeshi Otani, Kaori Endo, Yasuji Ota
-
Patent number: 8452605
Abstract: An embodiment of an apparatus for generating audio subband values in audio subband channels has an analysis windower for windowing a frame of time-domain audio input samples being in a time sequence extending from an early sample to a later sample using an analysis window function having a sequence of window coefficients to obtain windowed samples. The analysis window function has a first group of window coefficients and a second group of window coefficients. The first group of window coefficients is used for windowing later time-domain samples and the second group of window coefficients is used for windowing earlier time-domain samples. The apparatus further has a calculator for calculating the audio subband values using the windowed samples.
Type: Grant
Filed: October 23, 2007
Date of Patent: May 28, 2013
Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
Inventors: Markus Schnell, Manfred Lutzky, Markus Lohwasser, Markus Schmidt, Marc Gayer, Michael Mellar, Bernd Edler, Markus Multrus, Gerald Schuller, Ralf Geiger, Bernhard Grill
-
Patent number: 8452589
Abstract: An embodiment of the present invention is a method of storing a speed contour for use in playback of at least a portion of an audio or audio-visual work including: (a) generating one or more speed contours and/or average speed contours and/or democratic speed contours for the audio or audio-visual work; (b) storing the one or more speed contours and/or average speed contours and/or democratic speed contours in a database; and (c) associating retrieval information with the one or more stored contours in the database.
Type: Grant
Filed: February 28, 2011
Date of Patent: May 28, 2013
Assignee: Enounce Incorporated
Inventor: Donald J. Hejna, Jr.
-
Publication number: 20130124200
Abstract: Noise robust template matching may be performed. First features of a first signal may be computed. Based at least on a portion of the first features, second features of a second signal may be computed. A new signal may be generated based on at least another portion of the first features and on at least a portion of the second features.
Type: Application
Filed: December 22, 2011
Publication date: May 16, 2013
Inventors: Gautham J. Mysore, Paris Smaragdis, Brian John King
-
Patent number: 8438016
Abstract: A client for silence-based adaptive real-time voice and video (SAVV) transmission methods and systems, detects the activity of a voice stream of conversational speech and aggressively transmits the corresponding video frames if silence in the sending or receiving voice stream has been detected, and adaptively generates and transmits key frames of the video stream according to characteristics of the conversational speech. In one aspect, a coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice encoder of the SAVV client and the user's instructions. In another aspect, the coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice decoder of the SAVV client and the user's instructions. In one example, the coordination management module adaptively generates a key video frame when silence is detected in the receiving voice stream.
Type: Grant
Filed: April 10, 2008
Date of Patent: May 7, 2013
Assignee: City University of Hong Kong
Inventors: Weijia Jia, Lizhuo Zhang, Huan Li, Wenyan Lu
-
Patent number: 8438015Abstract: An embodiment of an apparatus for generating audio subband values in audio subband channels includes an analysis windower for windowing a frame of time-domain audio input samples being in a time sequence extending from an early sample to a later sample using an analysis window function including a sequence of window coefficients to obtain windowed samples. The analysis window function includes a first number of window coefficients derived from a larger window function including a sequence of a larger second number of window coefficients, wherein the window coefficients of the window function are derived by an interpolation of window coefficients of the larger window function. The apparatus further includes a calculator for calculating the audio subband values using the windowed samples.Type: GrantFiled: October 23, 2007Date of Patent: May 7, 2013Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.Inventors: Markus Schnell, Manfred Lutzky, Markus Lohwasser, Markus Schmidt, Marc Gayer, Michael Mellar, Bernd Edler, Markus Multrus, Gerald Schuller, Ralf Geiger, Bernhard Grill
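As a rough illustration of deriving a shorter analysis window from a longer one by interpolation, the sketch below halves a window by taking the midpoint between each pair of adjacent coefficients. The abstract does not say which interpolation scheme is used, so the pairwise midpoint rule and the function name `derive_smaller_window` are assumptions made for this example only.

```python
def derive_smaller_window(larger):
    """Derive a half-length window from `larger`.

    Assumption: each new coefficient is the linear interpolation
    (midpoint) between two adjacent coefficients of the larger window.
    The patent abstract does not specify the interpolation rule.
    """
    if len(larger) % 2 != 0:
        raise ValueError("larger window must have an even number of coefficients")
    return [0.5 * (larger[i] + larger[i + 1]) for i in range(0, len(larger), 2)]


# Usage: a four-coefficient ramp becomes a two-coefficient window.
smaller = derive_smaller_window([0.0, 1.0, 3.0, 5.0])
```

The result has half as many coefficients as the input, matching the abstract's description of a first number of coefficients derived from a larger second number.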
-
Patent number: 8417532Abstract: The transient problem may be sufficiently addressed, and for this purpose, a further delay on the side of the decoding may be reduced if a new SBR frame class is used wherein the frame boundaries are not shifted, i.e. the grid boundaries are still synchronized with the frame boundaries, but wherein a transient position indication is additionally used as a syntax element so as to be used, on the encoder and/or decoder sides, within the frames of this new frame class for determining the grid boundaries within these frames.Type: GrantFiled: October 18, 2007Date of Patent: April 9, 2013Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung E.V.Inventors: Markus Schnell, Michael Schuldt, Manfred Lutzky, Manuel Jander
-
Patent number: 8417519Abstract: The present invention relates to signal modification before pitch period repetition for the synthesis of blocks lost on decoding digital audio signals. The effects of repetition of transitories, such as the plosives of a speech signal, are avoided by comparing the samples of a pitch period with those of the previous pitch period. The signal is modified preferentially by taking the minimum between a current sample (e(3)) of the last pitch period (Tj) and at least one sample (e(2-T0)) of approximately the same position in the previous pitch period (Tj-1).Type: GrantFiled: October 17, 2007Date of Patent: April 9, 2013Assignee: France TelecomInventors: Balazs Kovesi, Stéphane Ragot
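The comparison between the last pitch period and the one before it can be sketched as follows. This is only one reading of the abstract: it assumes the "minimum" is taken over sample magnitudes with the sign of the current sample preserved, and it compares each sample with exactly one sample at the same position one period earlier. The function name and parameters are hypothetical.

```python
import math


def attenuate_transients(excitation, pitch_period):
    """Limit each sample of the last pitch period to the magnitude of
    the sample at the same position in the previous pitch period.

    Assumption: "minimum" means minimum magnitude, keeping the sign of
    the current sample; the abstract does not spell this out.
    """
    t0 = pitch_period
    if t0 <= 0 or len(excitation) < 2 * t0:
        raise ValueError("need at least two full pitch periods")
    out = list(excitation)
    for n in range(len(out) - t0, len(out)):
        limit = abs(excitation[n - t0])
        if abs(out[n]) > limit:
            out[n] = math.copysign(limit, out[n])
    return out


# Usage: the spike of -5 in the last period is cut back to magnitude 1,
# the value found at the same position one period earlier.
cleaned = attenuate_transients([1, -1, 1, -1, 1, -5, 1, -1], pitch_period=4)
```

Repeating the cleaned period instead of the original one would then avoid replaying the spike in every synthesized period.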
-
Patent number: 8417520Abstract: The invention proposes the synthesis of a signal consisting of consecutive blocks. It proposes more particularly, on receipt of such a signal, to replace, by synthesis, lost or erroneous blocks of this signal. To this end, it proposes an attenuation of the overvoicing during the generation of a signal synthesis. More particularly, a voiced excitation is generated on the basis of the pitch period (T) estimated or transmitted at the previous block, by optionally applying a correction of plus or minus a sample of the duration of this period (counted in terms of number of samples), by constituting groups (A′,B′,C′,D′) of at least two samples and inverting positions of samples in the groups, randomly (B′,C′) or in a forced manner. An over-harmonicity in the excitation generated is thus broken and the effect of overvoicing in the synthesis of the generated signal is thereby attenuated.Type: GrantFiled: October 17, 2007Date of Patent: April 9, 2013Assignee: France TelecomInventors: David Virette, Balazs Kovesi
-
Patent number: 8417518Abstract: A voice recognition system comprises: a voice input unit that receives an input signal from a voice input element and outputs it; a voice detection unit that detects an utterance segment in the input signal; a voice recognition unit that performs voice recognition for the utterance segment; and a control unit that outputs a control signal to at least one of the voice input unit and the voice detection unit and suppresses a detection frequency if the detection frequency satisfies a predetermined condition.Type: GrantFiled: February 27, 2008Date of Patent: April 9, 2013Assignee: NEC CorporationInventor: Toru Iwasawa
-
Publication number: 20130066626Abstract: A speech enhancement method is disclosed. The method includes the steps of: receiving a plurality of frames of sound signals by a microphone array; calculating an inter-aural time difference for each frequency band of each frame of the sound signals corresponding to at least one two-microphone set of the microphone array; calculating a plurality of values of cumulative histograms according to the calculated inter-aural time differences; determining a first inter-aural time difference threshold according to the calculated values of the cumulative histograms; and filtering the plurality of frames of sound signals according to the first inter-aural time difference threshold.Type: ApplicationFiled: March 30, 2012Publication date: March 14, 2013Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTEInventor: HSIEN CHENG LIAO
-
Patent number: 8396704Abstract: Generally speaking, embodiments of the present invention relate to speech processing such as, for example, speech recognition. Speech processing according to one embodiment of the present invention can be performed based on the occurrence of events within the electrical signals representing speech. Such events need not comprise instantaneous occurrences but rather may comprise an occurrence within the electrical signal spanning some period of time. Furthermore, the electrical signal can be analyzed based on the occurrence and location of these events so that less than all of the signal is analyzed. That is, the spoken sounds can be processed based on regions of the signal around and including the events but excluding other portions of the signal. For example, transition periods before the occurrence of the events may be excluded to eliminate noise or transients introduced at that part of the signal.Type: GrantFiled: October 23, 2008Date of Patent: March 12, 2013Assignee: Red Shift Company, LLCInventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
-
Patent number: 8386244Abstract: The present invention provides a system and method for representing quasi-periodic (“qp”) waveforms comprising, representing a plurality of limited decompositions of the qp waveform, wherein each decomposition includes a first and second amplitude value and at least one time value. In some embodiments, each of the decompositions is phase adjusted such that the arithmetic sum of the plurality of limited decompositions reconstructs the qp waveform. These decompositions are stored into a data structure having a plurality of attributes. Optionally, these attributes are used to reconstruct the qp waveform, or patterns or features of the qp wave can be determined by using various pattern-recognition techniques. Some embodiments provide a system that uses software, embedded hardware or firmware to carry out the above-described method. Some embodiments use a computer-readable medium to store the data structure and/or instructions to execute the method.Type: GrantFiled: August 29, 2011Date of Patent: February 26, 2013Assignee: Digital Intelligence, L.L.C.Inventors: Carlos A. Ricci, Vladimir V. Kovtun
-
Patent number: 8386246Abstract: A system is described that performs frame erasure concealment to generate frames of an output speech signal corresponding to a series of erased frames of an encoded bit-stream in a manner that conceals the quality-degrading effects of such erased frames. In one embodiment, responsive to the detection of a first erased frame in the series, a number of steps are performed. These steps include deriving long-term and short-term synthesis filters based on previously-generated portions of the output speech signal, calculating a ringing signal segment based on the long-term and short-term synthesis filters, and generating a frame of the output speech signal corresponding to the first erased frame by overlap adding the ringing signal segment to an extrapolated waveform. Deriving the long-term filter includes estimating a pitch period based on a previously-generated portion of the output speech signal by finding a lag that minimizes a sum of magnitude difference function.Type: GrantFiled: June 27, 2008Date of Patent: February 26, 2013Assignee: Broadcom CorporationInventor: Juin-Hwey Chen
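The pitch estimation step mentioned at the end of this abstract, finding the lag that minimizes a sum of magnitude difference function, can be sketched in a few lines. This is a generic minimal version, not the patented procedure: the window length, the lag range, and the name `estimate_pitch_period` are assumptions chosen for illustration.

```python
def estimate_pitch_period(signal, min_lag, max_lag, window):
    """Return the lag in [min_lag, max_lag] that minimizes the sum of
    magnitude differences between the last `window` samples of `signal`
    and the samples `lag` positions earlier.

    Assumption: a plain exhaustive search over integer lags; ties are
    resolved in favour of the smallest lag.
    """
    if min_lag < 1 or max_lag < min_lag:
        raise ValueError("invalid lag range")
    if len(signal) < window + max_lag:
        raise ValueError("need at least window + max_lag samples")
    start = len(signal) - window
    best_lag, best_cost = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        cost = sum(abs(signal[n] - signal[n - lag])
                   for n in range(start, len(signal)))
        if cost < best_cost:
            best_lag, best_cost = lag, cost
    return best_lag


# Usage: a sawtooth that repeats every 7 samples.
sawtooth = [n % 7 for n in range(100)]
period = estimate_pitch_period(sawtooth, min_lag=3, max_lag=10, window=20)
```

For an exactly periodic input the cost at the true period is zero, so the search returns that lag as long as it lies inside the search range.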
-
Patent number: 8380484Abstract: A method (50) of dynamically changing a sentence structure of a message can include the steps of receiving (51) a user request for information, retrieving (52) data based on the information requested, and altering (53) an intonation and/or the language conveying the information based on the context of the information to be presented. The intonation can optionally be altered by altering (54) a volume, a speed, and/or a pitch based on the information to be presented. The language can be altered by selecting (55) among a finite set of synonyms based on the information to be presented to the user or by selecting (56) among key verbs, adjectives or adverbs that vary along a continuum.Type: GrantFiled: August 10, 2004Date of Patent: February 19, 2013Assignee: International Business Machines CorporationInventors: Brent L. Davis, Stephen W. Hanley, Vanessa V. Michelini, Melanie D. Polkosky
-
Patent number: 8370132Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.Type: GrantFiled: November 21, 2005Date of Patent: February 5, 2013Assignee: Verizon Services Corp.Inventor: Adrian E. Conway
-
Patent number: 8364472Abstract: Provided is an audio encoding device which can detect an optimal pitch pulse when using pitch pulse information as redundant information.Type: GrantFiled: February 29, 2008Date of Patent: January 29, 2013Assignee: Panasonic CorporationInventor: Hiroyuki Ehara
-
Publication number: 20130010983Abstract: A signal manipulator for manipulating an audio signal having a transient event may have a transient remover, a signal processor and a signal inserter for inserting a time portion in a processed audio signal at a signal location where the transient event was removed before processing by the transient remover, so that a manipulated audio signal has a transient event not influenced by the processing, whereby the vertical coherence of the transient event is maintained instead of any processing performed in the signal processor, which would destroy the vertical coherence of a transient.Type: ApplicationFiled: May 7, 2012Publication date: January 10, 2013Inventors: Sascha DISCH, Frederik Nagel, Nikolaus Rettelbach, Markus Multrus, Guillaume Fuchs
-
Patent number: 8352249Abstract: An encoding device improves the sound quality of a stereo signal while maintaining a low bit rate. The encoding device includes: an LP inverse filter which LP-inverse-filters a left signal L(n) by using an inverse quantization linear prediction coefficient AdM(z) of a monaural signal; a T/F conversion unit which converts the left sound source signal Le(n) from a temporal region to a frequency region; an inverse quantizer which inverse-quantizes encoded information Mqe; spectrum division units which divide a high-frequency component of the sound source signal Mde(f) and the left signal Le(f) into a plurality of bands; and scale factor calculation units which calculate scale factors ai and ssi by using a monaural sound source signal Mdeh,i(f), a left sound source signal Leh,i(f), Mdeh,i(f), and right sound source signal Reh,i(f) of each divided band.Type: GrantFiled: November 4, 2008Date of Patent: January 8, 2013Assignee: Panasonic CorporationInventors: Kok Seng Chong, Koji Yoshida, Masahiro Oshikiri