Speech Signal Processing Patents (Class 704/200)

Device and method for computer-assisted learning

Patent number: 8348670

Abstract: A method for the computer-assisted learning of orthography includes executing the following steps by a data processing system: retrieving (11) a main set of words from a data storage; retrieving (12) an error data set associated with said main set of words from the data storage; and repeatedly executing the following steps: selecting (13) a word to prompt the user with, by computing, for each word from the error data set, a statistic measure related to the probability of an error occurring in the word, and selecting the word which has the maximum value of the statistic measure; prompting (14) the user with the word; accepting (15) a user input specifying a sequence of symbols; comparing (16) the user input with the word; and updating (17, 18) and storing the error data set.
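
The selection loop in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the `WordStats` record, the smoothed error-rate score in `error_score`, and the sample words are all hypothetical, since the abstract does not say which statistic measure is used.

```python
from dataclasses import dataclass


@dataclass
class WordStats:
    """Hypothetical per-word entry of the error data set."""
    attempts: int = 0
    errors: int = 0


def error_score(stats):
    # Assumed statistic: Laplace-smoothed error rate, so unseen words score 0.5.
    return (stats.errors + 1) / (stats.attempts + 2)


def select_word(error_data):
    # Pick the word whose statistic has the maximum value.
    return max(error_data, key=lambda word: error_score(error_data[word]))


def update(error_data, word, user_input):
    # Compare the user's input with the prompted word and update the record.
    stats = error_data[word]
    stats.attempts += 1
    if user_input != word:
        stats.errors += 1


error_data = {
    "rhythm": WordStats(attempts=4, errors=3),
    "cat": WordStats(attempts=5, errors=0),
    "necessary": WordStats(attempts=2, errors=1),
}
first = select_word(error_data)       # word with the highest score
update(error_data, first, "rhythm")   # the user spelled it correctly this time
```

A real system would also persist `error_data` after each update, as the abstract's final step describes.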

Type: Grant

Filed: July 25, 2008

Date of Patent: January 8, 2013

Assignee: Dybuster AG

Inventors: Christian Voegeli, Markus Gross

Spectro-temporal varying approach for speech enhancement

Patent number: 8352257

Abstract: The present system proposes a technique, called the spectro-temporal varying technique, to compute the suppression gain. This method is motivated by the perceptual properties of the human auditory system; specifically, that the human ear has higher frequency resolution in the lower frequency band and lower frequency resolution in the higher frequencies, and also that the important speech information in the high frequencies is carried by consonants, which usually have a random-noise spectral shape. A second property of the human auditory system is that the human ear has lower temporal resolution in the lower frequencies and higher temporal resolution in the higher frequencies. Based on that, the system uses a spectro-temporal varying method which introduces the concept of frequency-smoothing by modifying the estimation of the a posteriori SNR. In addition, the system also makes the a priori SNR time-smoothing factor depend on frequency.
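
To make the two ideas in this abstract concrete, here is a rough sketch. Everything specific in it is an assumption: the abstract gives no formulas, so the sketch borrows a standard decision-directed-style recursion for the a priori SNR, a linear ramp for the time-smoothing factor, a smoothing window that widens with bin index, and a Wiener-style gain. The function names and constants are hypothetical.

```python
def smooth_posteriori(gamma, max_half_width=3):
    """Average the a posteriori SNR over neighbouring bins.

    Assumption: the window is one bin wide at the lowest bin and grows
    linearly to +/- max_half_width bins at the highest bin.
    """
    n = len(gamma)
    out = []
    for k in range(n):
        half = (k * max_half_width) // max(n - 1, 1)
        lo, hi = max(0, k - half), min(n, k + half + 1)
        out.append(sum(gamma[lo:hi]) / (hi - lo))
    return out


def alpha_for_bin(k, n, alpha_low=0.98, alpha_high=0.90):
    """Time-smoothing factor per bin: heavier smoothing at low frequencies."""
    t = k / max(n - 1, 1)
    return alpha_low + t * (alpha_high - alpha_low)


def a_priori_snr(gamma, prev_xi):
    """Recursive a priori SNR estimate with a frequency-dependent alpha.

    prev_xi stands in for the previous frame's SNR estimate.
    """
    n = len(gamma)
    g = smooth_posteriori(gamma)
    xi = []
    for k in range(n):
        a = alpha_for_bin(k, n)
        xi.append(a * prev_xi[k] + (1.0 - a) * max(g[k] - 1.0, 0.0))
    return xi


def suppression_gain(xi):
    # Wiener-style gain, used here only as a familiar example.
    return [x / (1.0 + x) for x in xi]


xi = a_priori_snr([4.0] * 8, [1.0] * 8)
gains = suppression_gain(xi)
```

With a flat input, the high bins track the new observation faster than the low bins, which is the behaviour the frequency-dependent smoothing factor is meant to produce.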

Type: Grant

Filed: December 20, 2007

Date of Patent: January 8, 2013

Assignee: QNX Software Systems Limited

Inventors: Phil A. Hetherington, Xueman Li

Systems and methods for active listening/observing and event detection

Patent number: 8348839

Abstract: Embodiments of the present invention provide a method of monitoring an environment comprising: monitoring at least one data stream wherein the data stream is a data stream in the environment; detecting a specified event from the data stream; and triggering a response to the specified event. Embodiments of the present invention provide a system for monitoring an environment comprising: a receiver adapted to receive at least one input data stream wherein the input data stream is a data stream in the environment; an active listener/observer system adapted to monitor the data stream; and an interface adapted to express at least one output stream. Embodiments of the present invention provide a computer-readable medium having instructions comprising: an active listener/observer routine configured to monitor at least one data stream; a detection routine configured to find specified events in the data stream; and an output routine configured to express a response event.

Type: Grant

Filed: April 10, 2007

Date of Patent: January 8, 2013

Assignee: General Electric Company

Inventors: Pallav Sharda, Steven Eric Linthicum

INFORMATION RETRIEVING APPARATUS, INFORMATION RETRIEVING METHOD, AND COMPUTER PROGRAM PRODUCT

Publication number: 20130006616

Abstract: According to an embodiment, an information retrieving apparatus includes a housing; an input-output unit to perform dialogue processing with a user; a first detecting unit to detect means of transfer which indicates present means of transfer for the user; a second detecting unit to detect a holding status which indicates whether the user is holding the housing; a third detecting unit to detect a talking posture which indicates whether the housing is held near the face of the user; a selecting unit to select, from among a plurality of interaction modes that establish the dialogue processing, an interaction mode according to a combination of the means of transfer, the holding status, and the talking posture; a dialogue manager to control the dialogue processing according to the selected interaction mode; and an information retrieval unit to retrieve information using a keyword that is input during the dialogue processing.

Type: Application

Filed: July 2, 2012

Publication date: January 3, 2013

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Hiromi Wakaki, Kazuo Sumita, Hiroko Fujii, Masaru Suzuki, Michiaki Ariga

Audio encoding method and device

Patent number: 8340305

Abstract: Audio encoding method and device comprising the transmission, in addition to the data representing a frequency-limited signal, of information relating to a temporal filter that is to be applied to the entire enhanced signal, both in its transmitted low-frequency part and in its reconstituted high-frequency part. This filter is applied to reshape the reconstituted high-frequency part and to correct compression artefacts present in the transmitted low-frequency part. In this way, the application of the temporal filter, simple and inexpensive, to all or part of the reconstituted signal, makes it possible to obtain a signal of good perceived quality.

Type: Grant

Filed: December 28, 2007

Date of Patent: December 25, 2012

Assignee: Mobiclip

Inventor: Alexandre Delattre

Echo suppressing system, echo suppressing method, recording medium, echo suppressor, sound output device, audio system, navigation system and mobile object

Patent number: 8340963

Abstract: An echo suppressing system includes: a sound output device for outputting sound based on a sound signal, including a passing section for allowing passage of a component of a different frequency band, and a plurality of sound output sections, each of which outputs sound based on each of the plurality of sound signals passed through the passing section; a summer for summing the plurality of sound signals to generate a reference sound signal; a sound input device for converting input sound into a sound signal; and an echo suppressor for suppressing echo based on the sound output by the sound output device, including an input section to which a sound signal is input from the sound input device as an observation sound signal, and a correction section for correcting the observation sound signal so as to suppress echo included in the observation sound signal.

Type: Grant

Filed: April 8, 2010

Date of Patent: December 25, 2012

Assignee: Fujitsu Limited

Inventors: Naoshi Matsuo, Taisuke Itou

Data embedding device and data extraction device

Patent number: 8340973

Abstract: A data embedding device for embedding data in a speech code obtained by encoding a speech in accordance with a speech encoding method based on a voice generation process of a human being, includes an embedding judgment unit judging, for every speech code, whether or not data should be embedded in the speech code, and an embedding unit embedding data in two or more parameter codes of a plurality of parameter codes constituting the speech code for which it is judged by the embedding judgment unit that the data should be embedded.

Type: Grant

Filed: May 3, 2011

Date of Patent: December 25, 2012

Assignee: Fujitsu Limited

Inventors: Yoshiteru Tsuchinaga, Yasuji Ota, Masanao Suzuki, Masakiyo Tanaka, Joe Mizuno

Methods and apparatus for efficient vocoder implementations

Patent number: 8340960

Abstract: Techniques for implementing vocoders in parallel digital signal processors are described. A preferred approach is implemented in conjunction with the BOPS® Manifold Array (ManArray™) processing architecture so that in an array of N parallel processing elements, N channels of voice communication are processed in parallel. Techniques for forcing vocoder processing of one data-frame to take the same number of cycles are described. Improved throughput and lower clock rates can be achieved.

Type: Grant

Filed: June 16, 2009

Date of Patent: December 25, 2012

Assignee: Altera Corporation

Inventors: Ali Soheil Sadri, Navin Jaffer, Anissim A. Silivra, Bin Huang, Matthew Plonski

Method and system for separating musical sound source

Patent number: 8340943

Abstract: Provided is an apparatus of separating a musical sound source, which may re-construct mixed signals into target sound sources and other sound sources directly using sound source information performed using a predetermined musical instrument when the sound source information is present, thereby more effectively separating sound sources included in the mixed signal. The apparatus may include a Nonnegative Matrix Partial Co-Factorization (NMPCF) analysis unit to perform an NMPCF analysis on a mixed signal and a predetermined sound source signal using a sound source separation model, and to obtain a plurality of entity matrices based on the analysis result, and a target instrument signal separating unit to separate, from the mixed signal, a target instrument signal corresponding to the predetermined sound source signal by calculating an inner product between the plurality of entity matrices.

Type: Grant

Filed: August 12, 2010

Date of Patent: December 25, 2012

Assignees: Electronics and Telecommunications Research Institute, Postech Academy-Industry Foundation

Inventors: Min Je Kim, Seungjin Choi, Jiho Yoo, Kyeongok Kang, Inseon Jang, Jin-Woo Hong

AUTOMATED DEMOGRAPHIC ANALYSIS BY ANALYZING VOICE ACTIVITY

Publication number: 20120323566

Abstract: Methods, systems, and media for determining a response to be generated in an environment are provided. The methods, systems, and media monitor the environment for a voice activity of an individual. The voice activity of the individual is detected and analyzed. A content descriptor of the voice activity is determined based on the voice activity of the individual. A demographic descriptor of the individual is determined based on the voice activity of the individual. The content descriptor, the demographic descriptor, and known information are correlated to determine the response to be generated in the environment.

Type: Application

Filed: August 13, 2012

Publication date: December 20, 2012

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Michael Johnston, Hisao M. Chang, Harry E. Blanchard, Bernard S. Renger, Linda Roberts

Regeneration of wideband speech

Patent number: 8332210

Abstract: A system and method for processing a narrowband speech signal comprising speech samples in a first range of frequencies. The method comprises: generating from the narrowband speech signal a highband speech signal in a second range of frequencies above the first range of frequencies; determining a pitch of the highband speech signal; using the pitch to generate a pitch-dependent tonality measure from samples of the highband speech signal; and filtering the speech samples using a gain factor derived from the tonality measure and selected to reduce the amplitude of harmonics in the highband speech signal.

Type: Grant

Filed: June 10, 2009

Date of Patent: December 11, 2012

Assignee: Skype

Inventors: Mattias Nilsson, Soren Vang Andersen

Differential dynamic content delivery with text display in dependence upon simultaneous speech

Patent number: 8332220

Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.

Type: Grant

Filed: March 25, 2008

Date of Patent: December 11, 2012

Assignee: Nuance Communications, Inc.

Inventors: William K. Bodin, Michael J. Burkhart, Daniel G. Eisenhauer, Daniel M. Schumacher, Thomas J. Watson

System and method for low power stereo perceptual audio coding using adaptive masking threshold

Patent number: 8332216

Abstract: A method for stereo audio perceptual encoding of an input signal includes masking threshold estimation and bit allocation. The masking threshold estimation and bit allocation are performed once every two encoding processes. Another method for stereo audio perceptual encoding of an input signal includes performing a time-to-frequency transformation, performing a quantization, performing a bitstream formatting to produce an output stream, and performing a psychoacoustics analysis. The psychoacoustics analysis includes masking threshold estimation on a first of every two successive frames of the input signal.
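
The power saving described here comes from running the psychoacoustic analysis on only the first of every two frames and reusing its result for the second. A minimal sketch of that scheduling, assuming a placeholder threshold estimator (the `estimate_masking_threshold` function below is hypothetical and far simpler than a real psychoacoustic model):

```python
def estimate_masking_threshold(frame):
    # Placeholder model (assumption): a fixed fraction of the mean frame energy.
    energy = sum(x * x for x in frame) / len(frame)
    return 0.1 * energy


def encode_frames(frames):
    """Return (frame index, threshold used) pairs and the number of estimations."""
    estimations = 0
    threshold = None
    used = []
    for i, frame in enumerate(frames):
        if i % 2 == 0:
            # First frame of each pair: run the costly estimation.
            threshold = estimate_masking_threshold(frame)
            estimations += 1
        # Second frame of each pair: reuse the previous threshold.
        used.append((i, threshold))
    return used, estimations


used, estimations = encode_frames([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
```

In this run three frames trigger only two estimations, and frame 1 is quantised with the threshold computed for frame 0.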

Type: Grant

Filed: August 22, 2006

Date of Patent: December 11, 2012

Assignee: STMicroelectronics Asia Pacific PTE., Ltd.

Inventors: Evelyn Kurniawati, Sapna George

Method and arrangement for enhancing speech quality

Patent number: 8326607

Abstract: The present invention relates to a method and arrangement for improving quality of a voice transmission by extracting filter coefficient parameters with respect to a voice signal in a first speech transmission rate, and using the extracted filter coefficient parameters in a second transmission rate that is equal to or lower than the first transmission rate.

Type: Grant

Filed: January 11, 2010

Date of Patent: December 4, 2012

Assignee: Sony Ericsson Mobile Communications AB

Inventor: Martin Nyström

Non-speech section detecting method and non-speech section detecting device

Patent number: 8326612

Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.
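
The judging and counting parts of this device amount to run-length detection over per-frame bias values. The sketch below assumes the bias has already been computed for each frame and covers only the "greater than or equal to" variant; the function name and the sample numbers are hypothetical.

```python
def non_speech_sections(biases, threshold, min_frames):
    """Return (first_frame, last_frame) for each sufficiently long run.

    A run is a sequence of consecutive frames whose spectral bias is
    greater than or equal to the threshold.
    """
    sections = []
    run_start = None
    # The trailing None is a sentinel that closes a run ending at the last frame.
    for i, bias in enumerate(list(biases) + [None]):
        flagged = bias is not None and bias >= threshold
        if flagged and run_start is None:
            run_start = i
        elif not flagged and run_start is not None:
            if i - run_start >= min_frames:
                sections.append((run_start, i - 1))
            run_start = None
    return sections


sections = non_speech_sections(
    [0.1, 0.9, 0.8, 0.2, 0.9, 0.9, 0.9, 0.95], threshold=0.5, min_frames=3
)
```

Here the two-frame run at frames 1 to 2 is rejected as too short, while frames 4 to 7 are reported as one non-speech section.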

Type: Grant

Filed: April 5, 2010

Date of Patent: December 4, 2012

Assignee: Fujitsu Limited

Inventors: Nobuyuki Washio, Shoji Hayakawa

Transcoding method, apparatus, device and system

Patent number: 8326608

Abstract: A method, a device, and a system for transcoding between two embedded codecs are disclosed. The method includes: delaying a first encoded stream in input streams for an integer number of frames, where the first encoded stream includes a stream of at least one extension layer in the input streams obtained after input signals are encoded by using a first codec; using the first codec to decode other encoded streams in the input streams to obtain the first decoded signal; and performing delay aligning and adjusting to obtain an adjusted signal so as to reduce the transcoding complexity and enhance quality of the transcoded signals.

Type: Grant

Filed: January 26, 2012

Date of Patent: December 4, 2012

Assignee: Huawei Technologies Co., Ltd.

Inventors: Chen Hu, Lei Miao, Zexin Liu, Longyin Chen, Herve Marcel Taddei, Qing Zhang

Enhanced artificial intelligence language

Patent number: 8321371

Abstract: A method of determining an appropriate response to an input includes linking a plurality of attributes to a plurality of response templates using a plurality of Boolean expressions. Each attribute is associated with a set of patterns. Each pattern within the set of patterns is equivalent. The method also includes determining an appropriate response template from the plurality of response templates based on the input.
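
One way to picture the structure in this abstract: each attribute owns a set of interchangeable patterns, and Boolean expressions over the attributes select a response template. The sketch below is illustrative only; the attribute names, patterns, templates, and the first-match rule are assumptions, not details from the patent.

```python
import re

# Each attribute is associated with a set of equivalent patterns.
ATTRIBUTES = {
    "greeting": {"hello", "hi", "good morning"},
    "weather": {"weather", "forecast"},
}

# Boolean expressions over attributes, each linked to a response template.
TEMPLATES = [
    (lambda a: a["greeting"] and a["weather"], "Hello! Here is the forecast."),
    (lambda a: a["greeting"] and not a["weather"], "Hello! How can I help?"),
    (lambda a: a["weather"], "Here is the forecast."),
]


def active_attributes(text):
    lowered = text.lower()
    return {
        name: any(
            re.search(r"\b" + re.escape(pattern) + r"\b", lowered)
            for pattern in patterns
        )
        for name, patterns in ATTRIBUTES.items()
    }


def respond(text):
    attrs = active_attributes(text)
    for expression, template in TEMPLATES:
        if expression(attrs):
            return template  # first matching expression wins (assumption)
    return None
```

Word-boundary matching keeps "hi" from firing inside words such as "this".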

Type: Grant

Filed: December 4, 2007

Date of Patent: November 27, 2012

Assignee: Kurzweil Technologies, Inc.

Inventors: Matthew Bridges, Raymond C. Kurzweil

Apparatus, medium and method to encode and decode high frequency signal

Patent number: 8321229

Abstract: A method and apparatus for encoding or decoding an audio signal are provided. In the method and apparatus, a noise-floor level to use in encoding or decoding a high frequency signal is updated according to the degree of a voiced or unvoiced sound included in the signal.

Type: Grant

Filed: October 23, 2008

Date of Patent: November 27, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ki-hyun Choo, Eun-mi Oh, Ho-sang Sung, Jung-hoe Kim, Mi-young Kim

Transient detection and modification in audio signals

Patent number: 8321206

Abstract: A system and method are disclosed for transient detection and modification in audio signals. Digital signal processing techniques are used to detect transients and modify an audio signal to enhance or suppress such transients, as desired. A transient audio event is detected in a first portion of the audio signal. A graded response to the detected transient audio event is determined. The first portion of the audio signal is modified in accordance with the graded response. The extent of enhancement or suppression (as applicable) may be determined at least in part by a measure of the significance or magnitude of the transient.

Type: Grant

Filed: January 31, 2008

Date of Patent: November 27, 2012

Assignee: Creative Technology Ltd

Inventors: Michael Goodwin, Carlos Avendano, Martin Wolters, Ramkumar Sridharan

Method and device for the hierarchical coding of a source audio signal and corresponding decoding method and device, programs and signals

Patent number: 8321230

Abstract: Hierarchical coding of a source audio signal in the form of a data stream including a base level and at least two hierarchical enhancement levels, each of the levels being organized in successive frames. At least one frame of at least one enhancement level has a duration less than the duration of at least one frame of the base level. At least one indication representative of an order used for a set of enhancement level frames corresponding to the duration of at least one frame of the base level is inserted into the data stream.

Type: Grant

Filed: February 5, 2007

Date of Patent: November 27, 2012

Assignee: France Telecom

Inventors: Pierrick Philippe, Patrice Collen, Christophe Veaux

Efficient filtering with a complex modulated filterbank

Patent number: 8315859

Abstract: A filter apparatus for filtering a time domain input signal to obtain a time domain output signal, which is a representation of the time domain input signal filtered using a filter characteristic having a non-uniform amplitude/frequency characteristic, comprises a complex analysis filter bank for generating a plurality of complex subband signals from the time domain input signals, a plurality of intermediate filters, wherein at least one of the intermediate filters of the plurality of the intermediate filters has a non-uniform amplitude/frequency characteristic, wherein the plurality of intermediate filters have a shorter impulse response compared to an impulse response of a filter having the filter characteristic, and wherein the non-uniform amplitude/frequency characteristics of the plurality of intermediate filters together represent the non-uniform filter characteristic, and a complex synthesis filter bank for synthesizing the output of the intermediate filters to obtain the time domain output signal.

Type: Grant

Filed: March 17, 2010

Date of Patent: November 20, 2012

Assignee: Dolby International AB

Inventor: Lars Villemoes

Audio signal quality enhancement apparatus and method

Patent number: 8315862

Abstract: An audio signal quality enhancement apparatus and method. The apparatus includes a pitch calculating unit to extract a pitch period of an audio signal, a frequency domain transforming unit to transform the audio signal to a frequency domain, a frequency band dividing unit to classify the transformed audio signal into audio signals for each of the plurality of frequency bands based on the extracted pitch period, and a pitch enhancement unit to determine a gain based on a volume of the transformed audio signal, and to generate an output signal by multiplying each of the classified audio signals with respect to each of the plurality of frequency bands by the gain, thereby enhancing quality of the audio signal.

Type: Grant

Filed: June 5, 2009

Date of Patent: November 20, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventors: Jung Hoe Kim, Ho Chong Park, Eun Mi Oh

MDCT domain post-filtering apparatus and method for quality enhancement of speech

Patent number: 8315853

Abstract: A post-filtering apparatus and method for speech enhancement in a modified discrete cosine transform (MDCT) domain are disclosed. In the apparatus and method, previous and current MDCT coefficients are used for obtaining a speech spectrum coefficient similar to a real speech spectrum, and a convex function is used for transforming the speech spectrum coefficient and obtaining a post-filter coefficient so that difference can increase in the case where the speech spectrum coefficient is small but decrease in the case where the coefficient is large. Then, the post-filter coefficient is applied to the MDCT coefficient. With this configuration, both the current and previous MDCT values are used, so that it is possible to obtain a spectrum coefficient similar to the real speech spectrum and to obtain a more accurate filter coefficient. Further, the coefficient is adaptively transformed through the convex function, thereby enhancing speech quality.

Type: Grant

Filed: June 5, 2008

Date of Patent: November 20, 2012

Assignee: Electronics and Telecommunications Research Institute

Inventors: Hyun-woo Kim, Jong-mo Sung, Mi-suk Lee, Do-young Kim, Byung-sun Lee

Clock extraction circuit for use in a linearly expandable broadcast router

Patent number: 8315348

Abstract: A method is described for extracting selected time information from a stream of serialized AES digital audio data. A first transition indicative of a first preamble of said stream of serialized AES digital audio data is detected and, upon detection of the transition, a time count initiated. A second transition indicative of a subsequent preamble of said serialized AES digital audio data is subsequently detected and the time count halted. The time separating the first and second transitions is then determined. The separation time, which preferably is determined in the form of a fast clock pulse count, is then transferred to a decoding logic circuit for use in decoding the stream of serialized AES digital audio data.
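
The timing measurement in this abstract can be modelled in software as counting fast-clock ticks between two preamble detections. The sketch below assumes the preamble detector has already produced one boolean per fast-clock tick; that representation and the function name are hypothetical, and the actual patent describes a hardware circuit.

```python
def preamble_spacing(preamble_flags):
    """Count fast-clock ticks between the first two preamble detections.

    preamble_flags holds one boolean per fast-clock tick, True where a
    preamble transition was detected. Returns None if fewer than two
    preambles are seen.
    """
    count = None
    for flag in preamble_flags:
        if count is None:
            if flag:
                count = 0      # first preamble: start the time count
        else:
            count += 1
            if flag:
                return count   # subsequent preamble: halt the count
    return None


ticks = preamble_spacing(
    [False, False, True, False, False, False, True, False]
)
```

The resulting tick count is the value that would be handed to the decoding logic.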

Type: Grant

Filed: June 20, 2003

Date of Patent: November 20, 2012

Assignee: GVBB Holdings S.A.R.L.

Inventors: Carl L. Christensen, Lynn Howard Arbuckle

Systems and methods of providing modified media content

Patent number: 8312492

Abstract: A method and system of providing media content is disclosed. In a particular embodiment, the method includes receiving media content from a content source at a set-top box device. The media content includes video data having a first playback rate and audio data having the first playback rate. The method further includes transforming the audio data via a non-linear transformation to produce modified audio data having a second playback rate, modifying the video data to produce modified video data having the second playback rate, and synchronizing the modified audio data and the modified video data to produce modified media content having the second playback rate. A network-based media content storage device and associated logic to provide adjusted rate audio content are also disclosed.

Type: Grant

Filed: March 19, 2007

Date of Patent: November 13, 2012

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Ann Syrdal, Alistair Conkie

Speech dialog control based on signal pre-processing

Patent number: 8306815

Abstract: A speech dialog system interfaces a user to a computer. The system includes a signal pre-processor that processes a speech input to generate an enhanced signal and an analysis signal. A speech recognition unit may generate a recognition result based on the enhanced signal. A control unit may manage an output unit or an external device based on the information within the analysis signal.

Type: Grant

Filed: December 6, 2007

Date of Patent: November 6, 2012

Assignee: Nuance Communications, Inc.

Inventors: Lars König, Gerhard Uwe Schmidt, Andreas Löw

Speech coding

Patent number: 8301441

Abstract: A method of encoding one or more parent blocks of values, the number of values being the length of each block, the method comprising for each parent block: (a) determining a first sum of values in the parent block; (b) splitting the parent block into smaller subblocks; (c) for at least one of the subblocks, determining a second sum of the values in the subblock, selecting a likelihood table from the plurality of likelihood tables based on said first sum of values in the parent block and encoding the second sum using the likelihood table; (d) designating each subblock a parent block; (e) carrying out steps (a), (b), (c) and (d) until at least one parent block reaches a predetermined condition.
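
Steps (a) to (e) describe a recursive split in which each sub-block sum is coded in the context of its parent's sum. The sketch below shows only that recursion and leaves out the entropy coder: instead of choosing a likelihood table, it emits `(parent_sum, left_sum)` pairs, where `parent_sum` is the context that would select the table. It assumes non-negative integer values, halving as the split rule, and "length 1 or sum 0" as the stop condition; none of these specifics come from the abstract.

```python
def encode_block(values, out=None):
    """Emit (parent_sum, left_sum) pairs by recursive halving."""
    if out is None:
        out = []
    total = sum(values)                      # step (a): sum of the parent block
    if len(values) == 1 or total == 0:       # assumed stop condition
        return out
    mid = len(values) // 2                   # step (b): split into sub-blocks
    left, right = values[:mid], values[mid:]
    out.append((total, sum(left)))           # step (c): code the sub-block sum
    encode_block(left, out)                  # steps (d), (e): recurse
    encode_block(right, out)
    return out


def decode_block(total, length, symbols):
    """Rebuild the block from its total, its length and the emitted pairs."""
    if length == 1:
        return [total]
    if total == 0:
        return [0] * length
    parent_sum, left_sum = next(symbols)
    mid = length // 2
    left = decode_block(left_sum, mid, symbols)
    right = decode_block(parent_sum - left_sum, length - mid, symbols)
    return left + right


values = [0, 2, 1, 0, 0, 0, 3, 1]
symbols = encode_block(values)
decoded = decode_block(sum(values), len(values), iter(symbols))
```

Only the left sum needs coding at each split because the right sum follows from the parent sum, and all-zero blocks cost nothing further.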

Type: Grant

Filed: June 5, 2009

Date of Patent: October 30, 2012

Assignee: Skype

Inventor: Koen Bernard Vos

Context Based Voice Activity Detection Sensitivity

Publication number: 20120271634

Abstract: A speech dialog system is described that adjusts a voice activity detection threshold during a speech dialog prompt to reflect a context-based probability of user barge-in speech occurring. For example, the context-based probability may be based on the location of one or more transition relevance places in the speech dialog prompt.
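
As a rough illustration of the idea, the detector below lowers its threshold near transition relevance places (TRPs), where a user is more likely to barge in. The energy-based decision, the two threshold values, and the half-second window are all assumptions made for the sketch.

```python
def vad_threshold(t, trp_times, base=0.6, sensitive=0.3, window=0.5):
    """Threshold at prompt time t (seconds).

    Assumption: the detector is more sensitive (lower threshold) within
    `window` seconds of any transition relevance place.
    """
    near_trp = any(abs(t - trp) <= window for trp in trp_times)
    return sensitive if near_trp else base


def is_barge_in(energy, t, trp_times):
    # Simple energy-based decision against the context-dependent threshold.
    return energy >= vad_threshold(t, trp_times)


trps = [2.0, 5.0]                            # hypothetical TRP positions in the prompt
near = is_barge_in(0.4, 2.2, trps)           # close to a TRP: accepted
far = is_barge_in(0.4, 3.5, trps)            # mid-phrase: rejected
```

The same input energy is treated as barge-in speech near a TRP and ignored elsewhere in the prompt.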

Type: Application

Filed: March 26, 2010

Publication date: October 25, 2012

Applicant: NUANCE COMMUNICATIONS, INC.

Inventor: Nils Lenke

Apparatus and a method for calculating a number of spectral envelopes

Patent number: 8296159

Abstract: An apparatus calculates a number of spectral envelopes to be derived by a spectral band replication (SBR) encoder, wherein the SBR encoder is adapted to encode an audio signal using a plurality of sample values within a predetermined number of subsequent time portions in an SBR frame extending from an initial time to a final time, the predetermined number of subsequent time portions being arranged in a time sequence given by the audio signal. The apparatus has a decision value calculator for determining a decision value, the decision value measuring a deviation in spectral energy distributions of a pair of neighboring time portions. The apparatus further has a detector for detecting a violation of a threshold by the decision value and a processor for determining a first envelope border between the pair of neighboring time portions when the violation of the threshold is detected.
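
The decision logic can be sketched as follows, assuming each time portion is summarised by a list of per-band energies. The L1 distance between normalised energy distributions is a stand-in for the patent's decision value, which the abstract does not define, and the threshold of 0.5 is arbitrary.

```python
def normalise(energies):
    total = sum(energies)
    if total <= 0:
        return [0.0] * len(energies)
    return [e / total for e in energies]


def decision_value(portion_a, portion_b):
    """Assumed measure: L1 distance between normalised band-energy distributions."""
    pa, pb = normalise(portion_a), normalise(portion_b)
    return sum(abs(x - y) for x, y in zip(pa, pb))


def envelope_borders(portions, threshold=0.5):
    """Indices i such that a border lies between portion i - 1 and portion i."""
    return [
        i + 1
        for i in range(len(portions) - 1)
        if decision_value(portions[i], portions[i + 1]) > threshold
    ]


portions = [[4, 1, 1], [4, 1, 1], [1, 1, 4], [1, 1, 4]]
borders = envelope_borders(portions)
num_envelopes = len(borders) + 1
```

In this example the energy moves from the lowest band to the highest between the second and third portions, so one border is placed there and the frame is split into two envelopes.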

Type: Grant

Filed: January 11, 2011

Date of Patent: October 23, 2012

Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.

Inventors: Max Neuendorf, Bernhard Grill, Ulrich Kraemer, Markus Multrus, Harald Popp, Nikolaus Rettelbach, Frederik Nagel, Markus Lohwasser, Marc Gayer, Manuel Jander, Virgilio Bacigalupo

USER INTENT ANALYSIS EXTENT OF SPEAKER INTENT ANALYSIS SYSTEM

Publication number: 20120262296

Abstract: A speaker intent analysis system and method for validating the truthfulness and intent of a plurality of participants' responses to questions. A computer stores, retrieves, and transmits a series of questions to be answered audibly by participants. The participants' answers are received by a data processor. The data processor analyzes and records the participants' speech parameters for determining the likelihood of dishonesty. In addition to analyzing participants' speech parameters for distinguishing stress or other abnormality, the processor may be equipped with voice recognition software to screen responses that while not dishonest, are indicative of possible malfeasance on the part of the participants. Once the responses are analyzed, the processor produces an output that is indicative of the participant's credibility. The output may be sent to proper parties and/or devices such as a web page, computer, e-mail, PDA, pager, database, report, etc. for appropriate action.

Type: Application

Filed: June 12, 2012

Publication date: October 18, 2012

Inventor: DAVID BEZAR

Meeting visualization system

Patent number: 8290776

Abstract: Voice of plural participants during a meeting is obtained and dialogue situations of the participants that change every second are displayed in real time, so that it is possible to provide a meeting visualization system for triggering more positive discussions. Voice data collected from plural voice collecting units associated with plural participants is processed by a voice processing server to extract speech information. The speech information is sequentially input to an aggregation server. A query process is performed for the speech information by a stream data processing unit of the aggregation server, so that activity data such as the accumulation value of speeches of the participants in the meeting is generated. A display processing unit visualizes and displays dialogue situations of the participants by using the sizes of circles and the thicknesses of lines on the basis of the activity data.

Type: Grant

Filed: April 1, 2008

Date of Patent: October 16, 2012

Assignee: Hitachi, Ltd.

Inventors: Norihiko Moriwaki, Nobuo Sato, Tsuneyuki Imaki, Toshihiko Kashiyama, Itaru Nishizawa, Masashi Egi

Method and apparatus for sinusoidal audio coding

Patent number: 8290770

Abstract: Provided are a method and apparatus for sinusoidal audio coding, which employs a tracking method for further effective coding of sinusoids extracted in the process of a sinusoidal analysis of parametric coding. The sinusoidal audio coding method includes: extracting sinusoids of a current frame by performing a sinusoidal analysis on an input audio signal; with respect to each of the extracted sinusoids, setting a mode selected from a birth mode in which a sinusoid is newly generated irrespective of sinusoids of a previous frame, a continuation mode in which the sinusoid is only one sinusoid continued from one of the sinusoids of the previous frame, and a branch mode in which the sinusoid is one of a plurality of sinusoids continued from one of the sinusoids of the previous frame; and coding the extracted sinusoids according to the selected mode. Accordingly, a plurality of sinusoids that can be continued from one previous track component are set to the continuation mode or the branch mode.
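
The three modes can be illustrated with a simple frequency-matching rule. This is a sketch under assumptions: sinusoids are reduced to their frequencies, each current sinusoid is linked to the nearest previous-frame sinusoid within a fixed tolerance, and the mode follows directly from how many current sinusoids share that parent. The actual tracking criteria in the patent may differ.

```python
def classify_sinusoids(prev_freqs, cur_freqs, tol=20.0):
    """Label each current-frame sinusoid as birth, continuation or branch."""
    parents = []
    for f in cur_freqs:
        # Candidate parents: previous-frame sinusoids within the tolerance (Hz).
        candidates = [j for j, p in enumerate(prev_freqs) if abs(p - f) <= tol]
        if candidates:
            parents.append(min(candidates, key=lambda j: abs(prev_freqs[j] - f)))
        else:
            parents.append(None)

    modes = []
    for parent in parents:
        if parent is None:
            modes.append("birth")          # no link to the previous frame
        elif parents.count(parent) == 1:
            modes.append("continuation")   # the only sinusoid continued from its parent
        else:
            modes.append("branch")         # one of several continued from the same parent
    return modes


modes = classify_sinusoids([440.0, 1000.0], [435.0, 450.0, 1005.0, 3000.0])
```

Here the two sinusoids near 440 Hz share a parent and are labelled as branches, the one near 1000 Hz is a continuation, and the 3000 Hz component is a birth.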

Type: Grant

Filed: February 5, 2008

Date of Patent: October 16, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventors: Nam-suk Lee, Geon-hyoung Lee, Jae-one Oh, Chul-woo Lee, Jong-hoon Jeong

Circular frequency translation with noise blending

Patent number: 8285543

Abstract: An audio signal is conveyed more efficiently by transmitting or recording a baseband of the signal with an estimated spectral envelope and a noise-blending parameter derived from a measure of the signal's noise-like quality. The signal is reconstructed by translating spectral components of the baseband signal to frequencies outside the baseband, adjusting phase of the regenerated components to maintain phase coherency, adjusting spectral shape according to the estimated spectral envelope, and adding noise according to the noise-blending parameter. Preferably, the transmitted or recorded signal also includes an estimated temporal envelope that is used to adjust the temporal shape of the reconstructed signal.

Type: Grant

Filed: January 24, 2012

Date of Patent: October 9, 2012

Assignee: Dolby Laboratories Licensing Corporation

Inventors: Michael Mead Truman, Mark Stuart Vinton
System and method for multidimensional gesture analysis

Patent number: 8280732

Abstract: Hand gestures are translated by first detecting the hand gestures with an electronic sensor and converting the detected gestures into respective electrical transfer signals in a frequency band corresponding to that of speech. These transfer signals are inputted in the audible-sound frequency band into a speech-recognition system where they are analyzed.

Type: Grant

Filed: March 26, 2009

Date of Patent: October 2, 2012

Inventors: Wolfgang Richter, Roland Aubauer
Audio decoder, audio object encoder, method for decoding a multi-audio-object signal, multi-audio-object encoding method, and non-transitory computer-readable medium therefor

Patent number: 8280744

Abstract: An audio decoder for decoding a multi-audio-object signal having an audio signal of a first type and an audio signal of a second type encoded therein is described, the multi-audio-object signal having a downmix signal and side information, the side information having level information of the audio signals of the first and second types in a first predetermined time/frequency resolution, and a residual signal specifying residual level values in a second predetermined time/frequency resolution, the audio decoder having a processor for computing prediction coefficients based on the level information; and an up-mixer for up-mixing the downmix signal based on the prediction coefficients and the residual signal to obtain a first up-mix audio signal approximating the audio signal of the first type and/or a second up-mix audio signal approximating the audio signal of the second type.

Type: Grant

Filed: October 17, 2008

Date of Patent: October 2, 2012

Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.

Inventors: Oliver Hellmuth, Johannes Hilpert, Leonid Terentiev, Cornelia Falch, Andreas Hoelzer, Juergen Herre
System, method and software for a speech-enabled call routing application using an action-object matrix

Patent number: 8280013

Abstract: A system, method and software for facilitating a speech-enabled call routing application using an action-object matrix is disclosed. In operation, a natural language user utterance may be evaluated to identify an action and object available in an action-object matrix indicating transactions or operations available to a user. Depending upon the contents of the natural language user utterance, additional prompts and/or a disambiguation dialogue may be effected to elicit an available action-object combination selection from the user. Following identification of an action-object combination from the natural language user utterance, the action-object matrix may cooperate with a look-up table to identify an appropriate routing destination. Following identification of an appropriate routing destination, the user connection may be routed to a service agent or module configured to facilitate the user selected transaction as indicated by the action-object combination.

Type: Grant

Filed: July 15, 2008

Date of Patent: October 2, 2012

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Robert R. Bushey, John M. Martin, Benjamin A. Knott
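
The action-object routing flow in the abstract above can be sketched roughly as follows. The table contents, the destination names, the keyword-spotting step, and the function name `route` are hypothetical; the abstract does not describe how the utterance is parsed, so simple word matching stands in for natural language understanding here.

```python
# Hypothetical action-object matrix combined with its look-up table:
# each valid (action, object) pair maps to a routing destination.
ROUTING_TABLE = {
    ("pay", "bill"): "billing_agent",
    ("cancel", "service"): "retention_agent",
    ("change", "service"): "orders_module",
}
ACTIONS = {action for action, _ in ROUTING_TABLE}
OBJECTS = {obj for _, obj in ROUTING_TABLE}


def route(utterance):
    """Return ('route', destination), ('disambiguate', options) or ('prompt', text)."""
    words = utterance.lower().split()
    # Assumption: plain keyword spotting stands in for utterance evaluation.
    action = next((w for w in words if w in ACTIONS), None)
    obj = next((w for w in words if w in OBJECTS), None)

    if action is None or obj is None:
        # Not enough information: ask an additional prompt.
        return ("prompt", "Please say what you would like to do.")

    destination = ROUTING_TABLE.get((action, obj))
    if destination is None:
        # The combination is not available: offer the objects valid for this action.
        options = sorted(o for a, o in ROUTING_TABLE if a == action)
        return ("disambiguate", options)

    return ("route", destination)


if __name__ == "__main__":
    print(route("I want to pay my bill"))
    print(route("cancel my bill"))
```

Keeping the matrix and the look-up table in one dictionary is a simplification; a real system would likely store them separately so that destinations can change without touching the set of valid combinations.
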
Noise variance estimator for speech enhancement

Patent number: 8280731

Abstract: A speech enhancement method operative for devices having limited available memory is described. The method is appropriate for very noisy environments and is capable of estimating the relative strengths of speech and noise components during both the presence and the absence of speech.

Type: Grant

Filed: March 14, 2008

Date of Patent: October 2, 2012

Assignee: Dolby Laboratories Licensing Corporation

Inventor: Rongshan Yu
System and method for encoding and decoding pulse indices

Patent number: 8280729

Abstract: Methods and corresponding codec-containing devices are provided that have source coding schemes for encoding a component of an excitation. In some cases, the source coding scheme is an enumerative source coding scheme, while in other cases the source coding scheme is an arithmetic source coding scheme. In some cases, the source coding schemes are applied to encode a fixed codebook component of the excitation for a codec employing codebook excited linear prediction, for example an AMR-WB (Adaptive Multi-Rate-Wideband) speech codec.

Type: Grant

Filed: January 22, 2010

Date of Patent: October 2, 2012

Assignee: Research In Motion Limited

Inventors: Xiang Yu, Dake He, En-hui Yang
Sound signal generating method, sound signal generating device, and recording medium

Patent number: 8280737

Abstract: A sound signal generating method includes: generating, using a computer, a plurality of unit waveform signals by dividing the original sound signal having a periodic length of repeating similar waveforms by the length of the waveform; generating, using a computer, a repetitive waveform signal for each of the generated unit waveform signals by repeating the waveform of the unit waveform signal a given number of times; and generating, using a computer, an output sound signal by shifting each of the repetitive waveform signals in each length with a sequence in which the unit waveform signals form the original sound signal and then superimposing them on one another.

Type: Grant

Filed: February 10, 2010

Date of Patent: October 2, 2012

Assignee: Fujitsu Limited

Inventor: Kazuhiro Watanabe
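
The three steps in the sound signal generating abstract above (divide, repeat, shift and superimpose) can be sketched as below. This is one possible reading of the abstract: it assumes a fixed, known period in samples, assumes each repetitive waveform is shifted by one period per unit in original order, and drops any trailing partial period. The function name `repeat_and_superimpose` is hypothetical.

```python
def repeat_and_superimpose(signal, period, repeats):
    """Split signal into period-length units, repeat each, then overlap-add.

    Assumptions (not stated in the abstract): the period is constant and
    given in samples, and unit k is shifted by k * period samples.
    """
    # Step 1: divide the original signal into unit waveforms of one period.
    # A trailing partial period, if any, is dropped.
    units = [signal[i:i + period]
             for i in range(0, len(signal) - period + 1, period)]
    if not units:
        return []

    # Output spans the last unit's offset plus its repeated length.
    out = [0.0] * ((len(units) - 1) * period + repeats * period)

    for k, unit in enumerate(units):
        # Step 2: build the repetitive waveform for this unit.
        repetitive = list(unit) * repeats
        # Step 3: shift by the unit's position in the original and superimpose.
        offset = k * period
        for n, sample in enumerate(repetitive):
            out[offset + n] += sample
    return out


if __name__ == "__main__":
    print(repeat_and_superimpose([1, 2, 3, 4], period=2, repeats=2))
```

With `repeats=1` the sketch returns the original signal (up to the dropped partial period), which is a quick sanity check on the shifting logic.
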
Apparatus and a method for decoding an encoded audio signal

Patent number: 8275626

Abstract: An apparatus for decoding an encoded audio signal having first and second portions encoded in accordance with first and second encoding algorithms, respectively, BWE parameters for the first and second portions and a coding mode information indicating a first or a second decoding algorithm, includes first and second decoders, a BWE module and a controller. The decoders decode portions in accordance with decoding algorithms for time portions of the encoded signal to obtain decoded signals. The BWE module has a controllable crossover frequency and is configured for performing a bandwidth extension algorithm using the first decoded signal and the BWE parameters for the first portion, and for performing a bandwidth extension algorithm using the second decoded signal and the bandwidth extension parameter for the second portion. The controller controls the crossover frequency for the BWE module in accordance with the coding mode information.

Type: Grant

Filed: January 11, 2011

Date of Patent: September 25, 2012

Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.

Inventors: Max Neuendorf, Bernhard Grill, Ulrich Kraemer, Markus Multrus, Harald Popp, Nikolaus Rettelbach, Frederik Nagel, Markus Lohwasser, Marc Gayer, Manuel Jander, Virgilio Bacigalupo
Multi-modal, geo-tempo communications systems

Patent number: 8275834

Abstract: Disclosed is a flexible, multi-modal system useful in communications among users, capable of synchronizing real world and augmented reality, wherein the system is deployed in centralized and distributed computational platforms. The system comprises a plurality of input devices designed and configured to generate signals representing speech, gestures, pointing direction, and location of a user, and transmit the same to a multi-modal interface. Some of the signals generated represent a message from the user intended for dissemination to other users.

Type: Grant

Filed: September 14, 2009

Date of Patent: September 25, 2012

Assignee: Applied Research Associates, Inc.

Inventors: Roberto Aldunate, Gregg E Larson
Voice processing device and method, and program

Publication number: 20120239384

Abstract: A voice processing device includes a voice pitch converting unit that performs a voice pitch converting process with respect to an input voice signal and converts voice pitch of the input voice signal, an error detecting unit that detects an error between the number of samples of an output voice signal, which is expected, and the number of samples of the output voice signal, which is actually output, and a time length control unit that controls adjustment of the time length in such a manner that the time length of the output voice signal is corrected by the amount of the error.

Type: Application

Filed: March 9, 2012

Publication date: September 20, 2012

Inventors: Akihiro Mukai, Akira Inoue
Quantizing feature vectors in decision-making applications

Patent number: 8271278

Abstract: A system, method and computer program product for classification of an analog electrical signal using statistical models of training data. A technique is described to quantize the analog electrical signal in a manner which maximizes the compression of the signal while simultaneously minimizing the diminution in the ability to classify the compressed signal. These goals are achieved by utilizing a quantizer designed to minimize the loss in a power of the log-likelihood ratio. A further technique is described to enhance the quantization process by optimally allocating a number of bits for each dimension of the quantized feature vector subject to a maximum number of bits available across all dimensions.

Type: Grant

Filed: April 3, 2010

Date of Patent: September 18, 2012

Assignee: International Business Machines Corporation

Inventors: Upendra V. Chaudhari, Hsin I. Tseng, Deepak S. Turaga, Olivier Verscheure
Audio decoding using variable-length codebook application ranges

Patent number: 8271293

Abstract: Provided are, among other things, systems, methods and techniques for decoding an audio signal from a frame-based bit stream. Each frame includes processing information pertaining to the frame and entropy-encoded quantization indexes representing audio data within the frame. The processing information includes: (i) code book indexes, (ii) code book application information specifying ranges of entropy-encoded quantization indexes to which the code books are to be applied, and (iii) window information. The entropy-encoded quantization indexes are decoded by applying the identified code books to the corresponding ranges of entropy-encoded quantization indexes. Subband samples are then generated by dequantizing the decoded quantization indexes, and a sequence of different window functions that were applied within a single frame of the audio data is identified based on the window information.

Type: Grant

Filed: March 28, 2011

Date of Patent: September 18, 2012

Assignee: Digital Rise Technology Co., Ltd.

Inventor: Yuli You
Compatible multi-channel coding/decoding

Patent number: 8270618

Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder, which, in the case of a low-level decoder, decodes only the first and second downmix channels or, in the case of a high-level decoder, provides a full multi-channel audio signal based on the downmix channels and the channel side information.

Type: Grant

Filed: September 9, 2008

Date of Patent: September 18, 2012

Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.

Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hölzer, Claus Spenger
Scalable encoding device, and scalable encoding method

Patent number: 8271275

Abstract: A scalable encoding device capable of reducing an encoding rate to reduce a circuit scale while preventing sound quality deterioration of a decoded signal. An extension layer is coarsely divided into a system for processing a first channel and a system for processing a second channel. A sound source predictor for processing the first channel predicts a drive sound source signal of the first channel from a drive sound source signal of a monaural signal, and outputs the predicted drive sound source signal through a multiplier to a first CELP encoder. A sound source predictor for processing the second channel predicts the drive sound source signal of the second channel from the drive sound source signal of the monaural signal and the output from the first CELP encoder, and outputs the predicted drive sound source signal through a multiplier to a second CELP encoder.

Type: Grant

Filed: May 29, 2006

Date of Patent: September 18, 2012

Assignee: Panasonic Corporation

Inventors: Michiyo Goto, Koji Yoshida
Method for bias compensation for cepstro-temporal smoothing of spectral filter gains

Patent number: 8271271

Abstract: A method is provided for modifying a cepstro-temporally smoothed gain function, resulting in a bias-compensated spectral gain function. The cepstro-temporal smoothing increases the quality of an enhanced output signal, as it affects only spectral outliers caused by estimation errors, while the speech characteristics are well preserved. However, due to the cepstral transform, the temporal smoothing is done in the logarithmic domain rather than the linear domain, and hence results in a certain bias. Thus, the method provides a general bias compensation for cepstro-temporal smoothing of spectral filter gain functions that depends only on the lower limit of the spectral filter gain function.

Type: Grant

Filed: July 17, 2009

Date of Patent: September 18, 2012

Assignee: Siemens Medical Instruments Pte. Ltd.

Inventors: Colin Breithaupt, Timo Gerkmann, Rainer Martin
Digital telecommunication system

Patent number: 8265696

Abstract: A digital telecommunications system wherein the telecommunications centers of the calling and called terminal are arranged to perform handshaking concerning the speech codec used by the terminals. Depending on the link between the telecommunications centers, the telecommunications centers are arranged to connect call connections past a transcoder unit or to control the transcoder units to let encoded speech through without speech encoding operations in such a way that speech encoding and decoding are carried out only in the terminals. Handshaking between the telecommunications centers is carried out as outband signalling.

Type: Grant

Filed: October 19, 1999

Date of Patent: September 11, 2012

Assignee: Nokia Siemens Networks Oy

Inventor: Markku Verkama
Method and an apparatus for decoding an audio signal

Patent number: 8265941

Abstract: A method for decoding an audio signal comprises receiving a combined downmix, combined object information, and mix information, the combined downmix being generated using at least two downmix signals, the combined object information being made by combining at least two sets of object information; generating downmix processing information using the combined object information and the mix information; and processing the combined downmix using the downmix processing information. The method and apparatus for decoding an audio signal comprising the combined downmix and the combined object information can control object gain and output in, for example, a remote conference. By using the combined object information, the method and apparatus decode an audio signal containing multi-object signals quickly and efficiently, reducing processing time and computing resources and thereby relieving resource requirements such as wide bandwidth.

Type: Grant

Filed: December 6, 2007

Date of Patent: September 11, 2012

Assignee: LG Electronics Inc.

Inventors: Hyen O Oh, Yang Won Jung
Voice activated smart card

Patent number: 8266451

Abstract: A portable device including a biometric voice sensor configured to detect voice information and to take an action in response to speech spoken into the voice sensor. The device also includes a voice processor configured to process the voice sensor signal characteristics. The portable device may encrypt the detected signal and may compare the detected signal characteristics with voice characteristics that are stored in a memory of the portable device for applications such as voice enabled authentication, identification, command execution, encryption, and free speech recognition. The voice sensor may include a thin membrane portion that detects pressure waves caused by human speech. The portable device may be a contact-type smart card, a contactless smart card, or a hybrid smart card with contact and contactless interfaces. The device may be powered by an internal battery or by a host via contacts or by a power signal making use of the antenna in a contactless implementation.

Type: Grant

Filed: August 31, 2001

Date of Patent: September 11, 2012

Assignee: Gemalto SA

Inventors: Robert A. Leydier, Bertrand du Castel
