Speech Signal Processing Patents (Class 704/200)
  • Patent number: 8348670
    Abstract: A method for the computer-assisted learning of orthography includes executing the following steps by a data processing system: retrieving (11) a main set of words from a data storage; retrieving (12) an error data set associated with said main set of words from the data storage; repeatedly executing the following steps: selecting (13) a word to prompt the user with, by computing, for each word from the error data set, a statistic measure related to the probability of an error occurring in the word, and selecting the word which has the maximum value of the statistic measure; prompting (14) the user with the word; accepting (15) a user input specifying a sequence of symbols; comparing (16) the user input with the word; updating (17, 18) and storing the error data set.
    Type: Grant
    Filed: July 25, 2008
    Date of Patent: January 8, 2013
    Assignee: Dybuster AG
    Inventors: Christian Voegeli, Markus Gross
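The select/prompt/compare/update loop in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not specify the statistic, so a Laplace-smoothed error rate is assumed, and the names `ErrorData`, `select_word`, and `update` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ErrorData:
    # Hypothetical per-word record: attempts and errors observed so far.
    attempts: int = 0
    errors: int = 0

    def error_estimate(self) -> float:
        # Assumed statistic: Laplace-smoothed error rate.
        # The abstract only says "a statistic measure related to the
        # probability of an error occurring in the word".
        return (self.errors + 1) / (self.attempts + 2)


def select_word(error_data):
    # Pick the word whose statistic is largest.
    return max(error_data, key=lambda w: error_data[w].error_estimate())


def update(error_data, word, user_input):
    # Compare the user input with the prompted word and update the record.
    record = error_data[word]
    record.attempts += 1
    correct = user_input == word
    if not correct:
        record.errors += 1
    return correct


if __name__ == "__main__":
    data = {"rhythm": ErrorData(), "cat": ErrorData()}
    update(data, "cat", "cat")        # a correct answer lowers cat's estimate
    print(select_word(data))          # the untested word is now prompted first
```

With this assumed statistic, words that have been answered correctly drift toward the back of the queue, while words that are misspelled keep being selected.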
  • Patent number: 8352257
    Abstract: The present system proposes a technique, called the spectro-temporal varying technique, to compute the suppression gain. This method is motivated by the perceptual properties of the human auditory system; specifically, that the human ear has higher frequency resolution in the lower frequency band and lower frequency resolution in the higher frequencies, and also that the important speech information in the high frequencies is carried by consonants, which usually have a random-noise spectral shape. A second property of the human auditory system is that the human ear has lower temporal resolution in the lower frequencies and higher temporal resolution in the higher frequencies. Based on that, the system uses a spectro-temporal varying method which introduces the concept of frequency-smoothing by modifying the estimation of the a posteriori SNR. In addition, the system also makes the a priori SNR time-smoothing factor depend on frequency.
    Type: Grant
    Filed: December 20, 2007
    Date of Patent: January 8, 2013
    Assignee: QNX Software Systems Limited
    Inventors: Phil A. Hetherington, Xueman Li
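The last sentence of the abstract above, a time-smoothing factor for the a priori SNR that depends on frequency, can be illustrated with a decision-directed style recursion. The abstract gives no formula, so everything here is an assumption: the linear interpolation of the smoothing factor, the default values, and the function names `smoothing_factor` and `a_priori_snr` are hypothetical.

```python
def smoothing_factor(k, n_bins, alpha_low=0.98, alpha_high=0.90):
    # Assumed mapping: heavier time smoothing at low frequencies (where the
    # ear's temporal resolution is lower), lighter smoothing at high ones.
    if n_bins < 2:
        return alpha_low
    return alpha_low + (alpha_high - alpha_low) * k / (n_bins - 1)


def a_priori_snr(prev_estimate, post_snr):
    """Decision-directed style update, one value per frequency bin.

    prev_estimate[k]: previous frame's speech-to-noise power estimate.
    post_snr[k]:      current frame's a posteriori SNR.
    """
    n_bins = len(post_snr)
    result = []
    for k in range(n_bins):
        alpha = smoothing_factor(k, n_bins)
        instantaneous = max(post_snr[k] - 1.0, 0.0)
        result.append(alpha * prev_estimate[k] + (1.0 - alpha) * instantaneous)
    return result


if __name__ == "__main__":
    # Same input in both bins; the high-frequency bin reacts faster.
    print(a_priori_snr([0.0, 0.0], [11.0, 11.0]))
```

In this sketch a sudden rise in the a posteriori SNR shows up more quickly in the high bins than in the low ones, which is the qualitative behaviour the abstract describes.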
  • Patent number: 8348839
    Abstract: Embodiments of the present invention provide a method of monitoring an environment comprising: monitoring at least one data stream wherein the data stream is a data stream in the environment; detecting a specified event from the data stream; and triggering a response to the specified event. Embodiments of the present invention provide a system for monitoring an environment comprising: a receiver adapted to receive at least one input data stream wherein the input data stream is a data stream in the environment; an active listener/observer system adapted to monitor the data stream; and an interface adapted to express at least one output stream. Embodiments of the present invention provide a computer-readable medium having instructions comprising: an active listener/observer routine configured to monitor at least one data stream; a detection routine configured to find specified events in the data stream; and an output routine configured to express a response event.
    Type: Grant
    Filed: April 10, 2007
    Date of Patent: January 8, 2013
    Assignee: General Electric Company
    Inventors: Pallav Sharda, Steven Eric Linthicum
  • Publication number: 20130006616
    Abstract: According to an embodiment, an information retrieving apparatus includes a housing; an input-output unit to perform dialogue processing with a user; a first detecting unit to detect means of transfer which indicates present means of transfer for the user; a second detecting unit to detect a holding status which indicates whether the user is holding the housing; a third detecting unit to detect a talking posture which indicates whether the housing is held near the face of the user; a selecting unit to select, from among a plurality of interaction modes that establish the dialogue processing, an interaction mode according to a combination of the means of transfer, the holding status, and the talking posture; a dialogue manager to control the dialogue processing according to the selected interaction mode; and an information retrieval unit to retrieve information using a keyword that is input during the dialogue processing.
    Type: Application
    Filed: July 2, 2012
    Publication date: January 3, 2013
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Hiromi Wakaki, Kazuo Sumita, Hiroko Fujii, Masaru Suzuki, Michiaki Ariga
  • Patent number: 8340305
    Abstract: Audio encoding method and device comprising the transmission, in addition to the data representing a frequency-limited signal, of information relating to a temporal filter that is to be applied to the entire enhanced signal, both in its transmitted low-frequency part and in its reconstituted high-frequency part. The application of this filter allows the reshaping of the reconstituted high-frequency part and the correction of compression artefacts present in the transmitted low-frequency part. In this way, the application of the temporal filter, simple and inexpensive, to all or part of the reconstituted signal, makes it possible to obtain a signal of good perceived quality.
    Type: Grant
    Filed: December 28, 2007
    Date of Patent: December 25, 2012
    Assignee: Mobiclip
    Inventor: Alexandre Delattre
  • Patent number: 8340963
    Abstract: An echo suppressing system includes: a sound output device for outputting sound based on a sound signal, including a passing section for allowing passage of a component of a different frequency band, and a plurality of sound output sections, each of which outputs sound based on each of the plurality of sound signals passed through the passing section; a summer for summing the plurality of sound signals to generate a reference sound signal; a sound input device for converting input sound into a sound signal; and an echo suppressor for suppressing echo based on the sound output by the sound output device, including an input section to which a sound signal is input from the sound input device as an observation sound signal, and a correction section for correcting the observation sound signal so as to suppress echo included in the observation sound signal.
    Type: Grant
    Filed: April 8, 2010
    Date of Patent: December 25, 2012
    Assignee: Fujitsu Limited
    Inventors: Naoshi Matsuo, Taisuke Itou
  • Patent number: 8340973
    Abstract: A data embedding device for embedding data in a speech code obtained by encoding a speech in accordance with a speech encoding method based on a voice generation process of a human being, includes an embedding judgment unit, every speech code, judging whether or not data should be embedded in the speech code, and an embedding unit embedding data in two or more parameter codes of a plurality of parameter codes constituting the speech code for which it is judged by the embedding judgment unit that the data should be embedded.
    Type: Grant
    Filed: May 3, 2011
    Date of Patent: December 25, 2012
    Assignee: Fujitsu Limited
    Inventors: Yoshiteru Tsuchinaga, Yasuji Ota, Masanao Suzuki, Masakiyo Tanaka, Joe Mizuno
  • Patent number: 8340960
    Abstract: Techniques for implementing vocoders in parallel digital signal processors are described. A preferred approach is implemented in conjunction with the BOPS® Manifold Array (ManArray™) processing architecture so that in an array of N parallel processing elements, N channels of voice communication are processed in parallel. Techniques for forcing vocoder processing of one data-frame to take the same number of cycles are described. Improved throughput and lower clock rates can be achieved.
    Type: Grant
    Filed: June 16, 2009
    Date of Patent: December 25, 2012
    Assignee: Altera Corporation
    Inventors: Ali Soheil Sadri, Navin Jaffer, Anissim A. Silivra, Bin Huang, Matthew Plonski
  • Patent number: 8340943
    Abstract: Provided is an apparatus for separating a musical sound source, which may re-construct mixed signals into target sound sources and other sound sources directly using sound source information performed using a predetermined musical instrument when the sound source information is present, thereby more effectively separating sound sources included in the mixed signal. The apparatus may include a Nonnegative Matrix Partial Co-Factorization (NMPCF) analysis unit to perform an NMPCF analysis on a mixed signal and a predetermined sound source signal using a sound source separation model, and to obtain a plurality of entity matrices based on the analysis result, and a target instrument signal separating unit to separate, from the mixed signal, a target instrument signal corresponding to the predetermined sound source signal by calculating an inner product between the plurality of entity matrices.
    Type: Grant
    Filed: August 12, 2010
    Date of Patent: December 25, 2012
    Assignees: Electronics and Telecommunications Research Institute, Postech Academy-Industry Foundation
    Inventors: Min Je Kim, Seungjin Choi, Jiho Yoo, Kyeongok Kang, Inseon Jang, Jin-Woo Hong
  • Publication number: 20120323566
    Abstract: Methods, systems, and media for determining a response to be generated in an environment are provided. The methods, systems, and media monitor the environment for a voice activity of an individual. The voice activity of the individual is detected and analyzed. A content descriptor of the voice activity is determined based on the voice activity of the individual. A demographic descriptor of the individual is determined based on the voice activity of the individual. The content descriptor, the demographic descriptor, and known information are correlated to determine the response to be generated in the environment.
    Type: Application
    Filed: August 13, 2012
    Publication date: December 20, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Michael Johnston, Hisao M. Chang, Harry E. Blanchard, Bernard S. Renger, Linda Roberts
  • Patent number: 8332210
    Abstract: A system and method for processing a narrowband speech signal comprising speech samples in a first range of frequencies. The method comprises: generating from the narrowband speech signal a highband speech signal in a second range of frequencies above the first range of frequencies; determining a pitch of the highband speech signal; using the pitch to generate a pitch-dependent tonality measure from samples of the highband speech signal; and filtering the speech samples using a gain factor derived from the tonality measure and selected to reduce the amplitude of harmonics in the highband speech signal.
    Type: Grant
    Filed: June 10, 2009
    Date of Patent: December 11, 2012
    Assignee: Skype
    Inventors: Mattias Nilsson, Soren Vang Andersen
  • Patent number: 8332220
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: March 25, 2008
    Date of Patent: December 11, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: William K. Bodin, Michael J. Burkhart, Daniel G. Eisenhauer, Daniel M. Schumacher, Thomas J. Watson
  • Patent number: 8332216
    Abstract: A method for stereo audio perceptual encoding of an input signal includes masking threshold estimation and bit allocation. The masking threshold estimation and bit allocation are performed once every two encoding processes. Another method for stereo audio perceptual encoding of an input signal includes performing a time-to-frequency transformation, performing a quantization, performing a bitstream formatting to produce an output stream, and performing a psychoacoustics analysis. The psychoacoustics analysis includes masking threshold estimation on a first of every two successive frames of the input signal.
    Type: Grant
    Filed: August 22, 2006
    Date of Patent: December 11, 2012
    Assignee: STMicroelectronics Asia Pacific PTE., Ltd.
    Inventors: Evelyn Kurniawati, Sapna George
  • Patent number: 8326607
    Abstract: The present invention relates to a method and arrangement for improving quality of a voice transmission by extracting filter coefficient parameters with respect to a voice signal in a first speech transmission rate, and using the extracted filter coefficient parameters in a second transmission rate that is equal to or lower than the first transmission rate.
    Type: Grant
    Filed: January 11, 2010
    Date of Patent: December 4, 2012
    Assignee: Sony Ericsson Mobile Communications AB
    Inventor: Martin Nyström
  • Patent number: 8326612
    Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.
    Type: Grant
    Filed: April 5, 2010
    Date of Patent: December 4, 2012
    Assignee: Fujitsu Limited
    Inventors: Nobuyuki Washio, Shoji Hayakawa
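The calculate/judge/count pipeline in the abstract above can be sketched as a run-length check over per-frame spectra. This is only an illustration under stated assumptions: the abstract does not define the "bias of a spectrum", so the fraction of energy in the lower half of the bins is used as a stand-in, and `spectral_bias` and `non_speech_sections` are hypothetical names.

```python
def spectral_bias(magnitudes):
    # Assumed bias measure: share of spectral energy in the lower half of
    # the bins. The patent's actual measure is not given in the abstract.
    energy = [m * m for m in magnitudes]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    half = len(energy) // 2
    return sum(energy[:half]) / total


def non_speech_sections(frames, threshold, min_frames):
    """Return (start, end) frame indices, end exclusive, of every run of at
    least min_frames consecutive frames whose bias is >= threshold."""
    sections = []
    run_start = None
    for i, magnitudes in enumerate(frames):
        if spectral_bias(magnitudes) >= threshold:
            if run_start is None:
                run_start = i            # a new run of biased frames begins
        else:
            if run_start is not None and i - run_start >= min_frames:
                sections.append((run_start, i))
            run_start = None
    # Close a run that extends to the last frame.
    if run_start is not None and len(frames) - run_start >= min_frames:
        sections.append((run_start, len(frames)))
    return sections


if __name__ == "__main__":
    flat = [1.0, 1.0, 1.0, 1.0]          # energy spread evenly
    low = [1.0, 1.0, 0.0, 0.0]           # energy concentrated in low bins
    print(non_speech_sections([flat, low, low, low, flat, low], 0.9, 3))
```

The "smaller than or equal to" variant mentioned in the abstract would use the same loop with the comparison reversed.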
  • Patent number: 8326608
    Abstract: A method, a device, and a system for transcoding between two embedded codecs are disclosed. The method includes: delaying a first encoded stream in input streams for an integer number of frames, where the first encoded stream includes a stream of at least one extension layer in the input streams obtained after input signals are encoded by using a first codec; and using the first codec to decode other encoded streams in the input streams to obtain the first decoded signal; and performing delay aligning and adjusting to obtain an adjusted signal so as to reduce the transcoding complexity and enhance quality of the transcoded signals.
    Type: Grant
    Filed: January 26, 2012
    Date of Patent: December 4, 2012
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Chen Hu, Lei Miao, Zexin Liu, Longyin Chen, Herve Marcel Taddei, Qing Zhang
  • Patent number: 8321371
    Abstract: A method of determining an appropriate response to an input includes linking a plurality of attributes to a plurality of response templates using a plurality of Boolean expressions. Each attribute is associated with a set of patterns. Each pattern within the set of patterns is equivalent. The method also includes determining an appropriate response template from the plurality of response templates based on the input.
    Type: Grant
    Filed: December 4, 2007
    Date of Patent: November 27, 2012
    Assignee: Kurzweil Technologies, Inc.
    Inventors: Matthew Bridges, Raymond C. Kurzweil
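The linkage described in the abstract above (attributes with sets of equivalent patterns, joined to response templates by Boolean expressions) can be sketched as follows. The attribute names, patterns, templates, and the first-match rule order are all invented for illustration; the abstract does not say how expressions are represented or prioritised.

```python
import re

# Hypothetical attributes: each maps to a set of equivalent patterns.
ATTRIBUTES = {
    "greeting": {"hello", "hi", "good morning"},
    "hours": {"opening hours", "when are you open"},
}

# Hypothetical rules: a Boolean expression over detected attributes,
# paired with a response template. The first matching rule wins.
RULES = [
    (lambda a: "hours" in a, "We are open from {open} to {close}."),
    (lambda a: "greeting" in a and "hours" not in a, "Hello! How can I help?"),
]


def detect_attributes(text):
    # An attribute is present if any of its equivalent patterns occurs
    # as a whole word or phrase in the input.
    text = text.lower()
    found = set()
    for name, patterns in ATTRIBUTES.items():
        if any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in patterns):
            found.add(name)
    return found


def choose_template(text):
    attributes = detect_attributes(text)
    for expression, template in RULES:
        if expression(attributes):
            return template
    return None


if __name__ == "__main__":
    print(choose_template("Hi, when are you open?"))
```

Because every pattern in a set maps to the same attribute, the Boolean expressions stay small even when many phrasings are supported.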
  • Patent number: 8321229
    Abstract: A method and apparatus for encoding or decoding an audio signal are provided. In the method and apparatus, a noise-floor level to use in encoding or decoding a high frequency signal is updated according to the degree of a voiced or unvoiced sound included in the signal.
    Type: Grant
    Filed: October 23, 2008
    Date of Patent: November 27, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ki-hyun Choo, Eun-mi Oh, Ho-sang Sung, Jung-hoe Kim, Mi-young Kim
  • Patent number: 8321206
    Abstract: A system and method are disclosed for transient detection and modification in audio signals. Digital signal processing techniques are used to detect transients and modify an audio signal to enhance or suppress such transients, as desired. A transient audio event is detected in a first portion of the audio signal. A graded response to the detected transient audio event is determined. The first portion of the audio signal is modified in accordance with the graded response. The extent of enhancement or suppression (as applicable) may be determined at least in part by a measure of the significance or magnitude of the transient.
    Type: Grant
    Filed: January 31, 2008
    Date of Patent: November 27, 2012
    Assignee: Creative Technology Ltd
    Inventors: Michael Goodwin, Carlos Avendano, Martin Wolters, Ramkumar Sridharan
  • Patent number: 8321230
    Abstract: Hierarchical coding of a source audio signal in the form of a data stream including a base level and at least two hierarchical enhancement levels, each of the levels being organized in successive frames. At least one frame of at least one enhancement level has a duration less than the duration of at least one frame of the base level. At least one indication representative of an order used for a set of enhancement level frames corresponding to the duration of at least one frame of the base level is inserted into the data stream.
    Type: Grant
    Filed: February 5, 2007
    Date of Patent: November 27, 2012
    Assignee: France Telecom
    Inventors: Pierrick Philippe, Patrice Collen, Christophe Veaux
  • Patent number: 8315859
    Abstract: A filter apparatus for filtering a time domain input signal to obtain a time domain output signal, which is a representation of the time domain input signal filtered using a filter characteristic having a non-uniform amplitude/frequency characteristic, comprises a complex analysis filter bank for generating a plurality of complex subband signals from the time domain input signals, a plurality of intermediate filters, wherein at least one of the intermediate filters of the plurality of the intermediate filters has a non-uniform amplitude/frequency characteristic, wherein the plurality of intermediate filters have a shorter impulse response compared to an impulse response of a filter having the filter characteristic, and wherein the non-uniform amplitude/frequency characteristics of the plurality of intermediate filters together represent the non-uniform filter characteristic, and a complex synthesis filter bank for synthesizing the output of the intermediate filters to obtain the time domain output signal.
    Type: Grant
    Filed: March 17, 2010
    Date of Patent: November 20, 2012
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Patent number: 8315862
    Abstract: An audio signal quality enhancement apparatus and method. The apparatus includes a pitch calculating unit to extract a pitch period of an audio signal, a frequency domain transforming unit to transform the audio signal to a frequency domain, a frequency band dividing unit to classify the transformed audio signal into audio signals for each of the plurality of frequency bands based on the extracted pitch period, and a pitch enhancement unit to determine a gain based on a volume of the transformed audio signal, and to generate an output signal by multiplying each of the classified audio signals with respect to each of the plurality of frequency bands by the gain, thereby enhancing quality of the audio signal.
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: November 20, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung Hoe Kim, Ho Chong Park, Eun Mi Oh
  • Patent number: 8315853
    Abstract: A post-filtering apparatus and method for speech enhancement in a modified discrete cosine transform (MDCT) domain are disclosed. In the apparatus and method, previous and current MDCT coefficients are used for obtaining a speech spectrum coefficient similar to a real speech spectrum, and a convex function is used for transforming the speech spectrum coefficient and obtaining a post-filter coefficient so that difference can increase in the case where the speech spectrum coefficient is small but decrease in the case where the coefficient is large. Then, the post-filter coefficient is applied to the MDCT coefficient. With this configuration, both the current and previous MDCT values are used, so that it is possible to obtain a spectrum coefficient similar to the real speech spectrum and to obtain a more accurate filter coefficient. Further, the coefficient is adaptively transformed through the convex function, thereby enhancing speech quality.
    Type: Grant
    Filed: June 5, 2008
    Date of Patent: November 20, 2012
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Hyun-woo Kim, Jong-mo Sung, Mi-suk Lee, Do-young Kim, Byung-sun Lee
  • Patent number: 8315348
    Abstract: A method is described for extracting selected time information from a stream of serialized AES digital audio data. A first transition indicative of a first preamble of said stream of serialized AES digital audio data is detected and, upon detection of the transition, a time count initiated. A second transition indicative of a subsequent preamble of said serialized AES digital audio data is subsequently detected and the time count halted. The time separating the first and second transitions is then determined. The separation time, which preferably is determined in the form of a fast clock pulse count, is then transferred to a decoding logic circuit for use in decoding the stream of serialized AES digital audio data.
    Type: Grant
    Filed: June 20, 2003
    Date of Patent: November 20, 2012
    Assignee: GVBB Holdings S.A.R.L.
    Inventors: Carl L. Christensen, Lynn Howard Arbuckle
  • Patent number: 8312492
    Abstract: A method and system of providing media content is disclosed. In a particular embodiment, the method includes receiving media content from a content source at a set-top box device. The media content includes video data having a first playback rate and audio data having the first playback rate. The method further includes transforming the audio data via a non-linear transformation to produce modified audio data having a second playback rate, modifying the video data to produce modified video data having the second playback rate, and synchronizing the modified audio data and the modified video data to produce modified media content having the second playback rate. A network-based media content storage device and associated logic to provide adjusted rate audio content are also disclosed.
    Type: Grant
    Filed: March 19, 2007
    Date of Patent: November 13, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Ann Syrdal, Alistair Conkie
  • Patent number: 8306815
    Abstract: A speech dialog system interfaces a user to a computer. The system includes a signal pre-processor that processes a speech input to generate an enhanced signal and an analysis signal. A speech recognition unit may generate a recognition result based on the enhanced signal. A control unit may manage an output unit or an external device based on the information within the analysis signal.
    Type: Grant
    Filed: December 6, 2007
    Date of Patent: November 6, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Lars König, Gerhard Uwe Schmidt, Andreas Löw
  • Patent number: 8301441
    Abstract: A method of encoding one or more parent blocks of values, the number of values being the length of each block, the method comprising for each parent block: (a) determining a first sum of values in the parent block; (b) splitting the parent block into smaller subblocks; (c) for at least one of the subblocks, determining a second sum of the values in the subblock, selecting a likelihood table from the plurality of likelihood tables based on said first sum of values in the parent block and encoding the second sum using the likelihood table; (d) designating each subblock a parent block; (e) carrying out steps (a), (b), (c) and (d) until at least one parent block reaches a predetermined condition.
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: October 30, 2012
    Assignee: Skype
    Inventor: Koen Bernard Vos
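Steps (a) through (e) of the abstract above describe a recursive split in which each subblock sum is coded in the context of its parent's sum. The sketch below shows only that structure, under several assumptions: values are non-negative integers, blocks are halved, recursion stops at single values or all-zero blocks, and the entropy coder with its likelihood tables is left out. Instead, each emitted symbol is a `(parent_sum, left_sum)` pair, where `parent_sum` is the context that would select the table. The function names are hypothetical.

```python
def encode_block(values):
    """Return (total, symbols) for a block of non-negative integers.

    Each symbol is (parent_sum, left_sum): parent_sum is the context that
    would select a likelihood table, left_sum is the value to be coded.
    """
    symbols = []

    def split(block, total):
        # Assumed stopping condition: single value or nothing left to code.
        if len(block) <= 1 or total == 0:
            return
        mid = len(block) // 2
        left_sum = sum(block[:mid])
        symbols.append((total, left_sum))
        # Each subblock becomes a parent block in turn.
        split(block[:mid], left_sum)
        split(block[mid:], total - left_sum)   # right sum is implied

    total = sum(values)
    split(list(values), total)
    return total, symbols


def decode_block(length, total, symbols):
    """Rebuild the block from its length, total, and symbol list."""
    stream = iter(symbols)

    def build(n, block_total):
        if n == 0:
            return []
        if n == 1:
            return [block_total]
        if block_total == 0:
            return [0] * n
        mid = n // 2
        _, left_sum = next(stream)
        return build(mid, left_sum) + build(n - mid, block_total - left_sum)

    return build(length, total)


if __name__ == "__main__":
    block = [3, 0, 0, 1, 2, 0, 0, 0]
    total, symbols = encode_block(block)
    assert decode_block(len(block), total, symbols) == block
    print(total, symbols)
```

Only the left subblock's sum is emitted at each split, since the right one follows from the parent sum; all-zero regions cost no symbols at all.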
  • Publication number: 20120271634
    Abstract: A speech dialog system is described that adjusts a voice activity detection threshold during a speech dialog prompt to reflect a context-based probability of user barge-in speech occurring. For example, the context-based probability may be based on the location of one or more transition relevance places in the speech dialog prompt.
    Type: Application
    Filed: March 26, 2010
    Publication date: October 25, 2012
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventor: Nils Lenke
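The idea in the abstract above, lowering the voice activity detection threshold where barge-in is more likely, can be sketched per prompt frame. The abstract gives no mapping from probability to threshold, so the linear interpolation, the default values, the frame window around a transition relevance place (TRP), and the function names are all assumptions.

```python
def vad_threshold(barge_in_prob, lenient=0.3, strict=0.8):
    # Assumed mapping: interpolate linearly from a strict threshold
    # (barge-in unlikely) down to a lenient one (barge-in likely).
    if not 0.0 <= barge_in_prob <= 1.0:
        raise ValueError("barge_in_prob must be in [0, 1]")
    return strict - barge_in_prob * (strict - lenient)


def prompt_thresholds(n_frames, trp_frames, window=2, p_near=0.9, p_far=0.1):
    """Per-frame thresholds for a prompt of n_frames frames.

    Frames within `window` frames of a transition relevance place get the
    higher barge-in probability p_near; all others get p_far.
    """
    thresholds = []
    for i in range(n_frames):
        near_trp = any(abs(i - t) <= window for t in trp_frames)
        thresholds.append(vad_threshold(p_near if near_trp else p_far))
    return thresholds


if __name__ == "__main__":
    # One TRP at frame 5 of a 10-frame prompt.
    for i, th in enumerate(prompt_thresholds(10, [5])):
        print(i, round(th, 2))
```

Near a TRP the detector becomes easier to trigger, so a user who starts speaking at a natural turn boundary is picked up sooner, while mid-phrase noise is less likely to interrupt the prompt.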
  • Patent number: 8296159
    Abstract: An apparatus calculates a number of spectral envelopes to be derived by a spectral band replication (SBR) encoder, wherein the SBR encoder is adapted to encode an audio signal using a plurality of sample values within a predetermined number of subsequent time portions in an SBR frame extending from an initial time to a final time, the predetermined number of subsequent time portions being arranged in a time sequence given by the audio signal. The apparatus has a decision value calculator for determining a decision value, the decision value measuring a deviation in spectral energy distributions of a pair of neighboring time portions. The apparatus further has a detector for detecting a violation of a threshold by the decision value and a processor for determining a first envelope border between the pair of neighboring time portions when the violation of the threshold is detected.
    Type: Grant
    Filed: January 11, 2011
    Date of Patent: October 23, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Max Neuendorf, Bernhard Grill, Ulrich Kraemer, Markus Multrus, Harald Popp, Nikolaus Rettelbach, Frederik Nagel, Markus Lohwasser, Marc Gayer, Manuel Jander, Virgilio Bacigalupo
  • Publication number: 20120262296
    Abstract: A speaker intent analysis system and method for validating the truthfulness and intent of a plurality of participants' responses to questions. A computer stores, retrieves, and transmits a series of questions to be answered audibly by participants. The participants' answers are received by a data processor. The data processor analyzes and records the participants' speech parameters for determining the likelihood of dishonesty. In addition to analyzing participants' speech parameters for distinguishing stress or other abnormality, the processor may be equipped with voice recognition software to screen responses that while not dishonest, are indicative of possible malfeasance on the part of the participants. Once the responses are analyzed, the processor produces an output that is indicative of the participant's credibility. The output may be sent to proper parties and/or devices such as a web page, computer, e-mail, PDA, pager, database, report, etc. for appropriate action.
    Type: Application
    Filed: June 12, 2012
    Publication date: October 18, 2012
    Inventor: DAVID BEZAR
  • Patent number: 8290776
    Abstract: Voice of plural participants during a meeting is obtained and dialogue situations of the participants that change every second are displayed in real time, so that it is possible to provide a meeting visualization system for triggering more positive discussions. Voice data collected from plural voice collecting units associated with plural participants is processed by a voice processing server to extract speech information. The speech information is sequentially input to an aggregation server. A query process is performed for the speech information by a stream data processing unit of the aggregation server, so that activity data such as the accumulation value of speeches of the participants in the meeting is generated. A display processing unit visualizes and displays dialogue situations of the participants by using the sizes of circles and the thicknesses of lines on the basis of the activity data.
    Type: Grant
    Filed: April 1, 2008
    Date of Patent: October 16, 2012
    Assignee: Hitachi, Ltd.
    Inventors: Norihiko Moriwaki, Nobuo Sato, Tsuneyuki Imaki, Toshihiko Kashiyama, Itaru Nishizawa, Masashi Egi
  • Patent number: 8290770
    Abstract: Provided are a method and apparatus for sinusoidal audio coding, which employs a tracking method for further effective coding of sinusoids extracted in the process of a sinusoidal analysis of parametric coding. The sinusoidal audio coding method includes: extracting sinusoids of a current frame by performing a sinusoidal analysis on an input audio signal; with respect to each of the extracted sinusoids, setting a mode selected from a birth mode in which a sinusoid is newly generated irrespective of sinusoids of a previous frame, a continuation mode in which the sinusoid is only one sinusoid continued from one of the sinusoids of the previous frame, and a branch mode in which the sinusoid is one of a plurality of sinusoids continued from one of the sinusoids of the previous frame; and coding the extracted sinusoids according to the selected mode. Accordingly, a plurality of sinusoids that can be continued from one previous track component are set to the continuation mode or the branch mode.
    Type: Grant
    Filed: February 5, 2008
    Date of Patent: October 16, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Nam-suk Lee, Geon-hyoung Lee, Jae-one Oh, Chul-woo Lee, Jong-hoon Jeong
  • Patent number: 8285543
    Abstract: An audio signal is conveyed more efficiently by transmitting or recording a baseband of the signal with an estimated spectral envelope and a noise-blending parameter derived from a measure of the signal's noise-like quality. The signal is reconstructed by translating spectral components of the baseband signal to frequencies outside the baseband, adjusting phase of the regenerated components to maintain phase coherency, adjusting spectral shape according to the estimated spectral envelope, and adding noise according to the noise-blending parameter. Preferably, the transmitted or recorded signal also includes an estimated temporal envelope that is used to adjust the temporal shape of the reconstructed signal.
    Type: Grant
    Filed: January 24, 2012
    Date of Patent: October 9, 2012
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael Mead Truman, Mark Stuart Vinton
  • Patent number: 8280732
    Abstract: Hand gestures are translated by first detecting the hand gestures with an electronic sensor and converting the detected gestures into respective electrical transfer signals in a frequency band corresponding to that of speech. These transfer signals are inputted in the audible-sound frequency band into a speech-recognition system where they are analyzed.
    Type: Grant
    Filed: March 26, 2009
    Date of Patent: October 2, 2012
    Inventors: Wolfgang Richter, Roland Aubauer
  • Patent number: 8280744
    Abstract: An audio decoder for decoding a multi-audio-object signal having an audio signal of a first type and an audio signal of a second type encoded therein is described, the multi-audio-object signal having a downmix signal and side information, the side information having level information of the audio signals of the first and second types in a first predetermined time/frequency resolution, and a residual signal specifying residual level values in a second predetermined time/frequency resolution, the audio decoder having a processor for computing prediction coefficients based on the level information; and an up-mixer for up-mixing the downmix signal based on the prediction coefficients and the residual signal to obtain a first up-mix audio signal approximating the audio signal of the first type and/or a second up-mix audio signal approximating the audio signal of the second type.
    Type: Grant
    Filed: October 17, 2008
    Date of Patent: October 2, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Oliver Hellmuth, Johannes Hilpert, Leonid Terentiev, Cornelia Falch, Andreas Hoelzer, Juergen Herre
  • Patent number: 8280013
    Abstract: A system, method and software for facilitating a speech-enabled call routing application using an action-object matrix are disclosed. In operation, a natural language user utterance may be evaluated to identify an action and object available in an action-object matrix indicating transactions or operations available to a user. Depending upon the contents of the natural language user utterance, additional prompts and/or a disambiguation dialogue may be effected to elicit an available action-object combination selection from the user. Following identification of an action-object combination from the natural language user utterance, the action-object matrix may cooperate with a look-up table to identify an appropriate routing destination. Following identification of an appropriate routing destination, the user connection may be routed to a service agent or module configured to facilitate the user selected transaction as indicated by the action-object combination.
    Type: Grant
    Filed: July 15, 2008
    Date of Patent: October 2, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Robert R. Bushey, John M. Martin, Benjamin A. Knott
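    The routing flow in this abstract can be pictured with a small sketch. Everything named here (the matrix contents, the look-up table entries, the keyword-matching `identify` helper) is a hypothetical stand-in for illustration and is not taken from the patent; a real system would use a speech recognizer and natural language understanding rather than word matching.

```python
# Hypothetical action-object matrix: rows are actions, columns are objects.
# A non-None cell holds a key into the routing look-up table.
ACTION_OBJECT_MATRIX = {
    "pay": {"bill": "billing_pay", "plan": None},
    "change": {"bill": None, "plan": "plan_change"},
    "cancel": {"bill": None, "plan": "plan_cancel"},
}

# Hypothetical look-up table mapping matrix cells to routing destinations.
ROUTING_TABLE = {
    "billing_pay": "billing_agent_queue",
    "plan_change": "self_service_module",
    "plan_cancel": "retention_agent_queue",
}


def identify(utterance):
    """Return (action, object) found in the utterance; either may be None."""
    words = utterance.lower().split()
    objects = {obj for row in ACTION_OBJECT_MATRIX.values() for obj in row}
    action = next((w for w in words if w in ACTION_OBJECT_MATRIX), None)
    obj = next((w for w in words if w in objects), None)
    return action, obj


def route(utterance):
    """Return ("route", destination) or ("prompt", message) for disambiguation."""
    action, obj = identify(utterance)
    if action is None or obj is None:
        return ("prompt", "What would you like to do, and with what?")
    key = ACTION_OBJECT_MATRIX[action].get(obj)
    if key is None:
        return ("prompt", "That combination is not available.")
    return ("route", ROUTING_TABLE[key])
```

    An utterance that names both an available action and object resolves directly to a destination; anything else falls back to a prompt, which loosely mirrors the disambiguation dialogue the abstract describes.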
  • Patent number: 8280731
    Abstract: A speech enhancement method operative for devices having limited available memory is described. The method is appropriate for very noisy environments and is capable of estimating the relative strengths of speech and noise components during both the presence and the absence of speech.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: October 2, 2012
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Rongshan Yu
  • Patent number: 8280729
    Abstract: Methods and corresponding codec-containing devices are provided that have source coding schemes for encoding a component of an excitation. In some cases, the source coding scheme is an enumerative source coding scheme, while in other cases the source coding scheme is an arithmetic source coding scheme. In some cases, the source coding schemes are applied to encode a fixed codebook component of the excitation for a codec employing codebook excited linear prediction, for example an AMR-WB (Adaptive Multi-Rate-Wideband) speech codec.
    Type: Grant
    Filed: January 22, 2010
    Date of Patent: October 2, 2012
    Assignee: Research In Motion Limited
    Inventors: Xiang Yu, Dake He, En-hui Yang
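    As a rough illustration of what enumerative source coding means, the sketch below ranks a set of k distinct pulse positions as a single integer using the combinatorial number system. This is a textbook construction chosen for illustration, not the scheme claimed in the patent; the function names are hypothetical, and pulse signs and the actual AMR-WB codebook layout are ignored.

```python
from math import comb


def encode_positions(positions):
    """Rank k distinct non-negative positions as an integer in [0, C(n, k))."""
    return sum(comb(p, i + 1) for i, p in enumerate(sorted(positions)))


def decode_positions(index, k):
    """Invert encode_positions for a known number of pulses k."""
    positions = []
    for i in range(k, 0, -1):
        # Find the largest p such that C(p, i) <= index.
        p = i - 1
        while comb(p + 1, i) <= index:
            p += 1
        positions.append(p)
        index -= comb(p, i)
    return sorted(positions)
```

    Because every k-subset maps to a distinct integer below C(n, k), the index needs only about log2 C(n, k) bits, which is fewer than coding each position separately.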
  • Patent number: 8280737
    Abstract: A sound signal generating method includes: generating, using a computer, a plurality of unit waveform signals by dividing the original sound signal having a periodic length of repeating similar waveforms by the length of the waveform; generating, using a computer, a repetitive waveform signal for each of the generated unit waveform signals by repeating the waveform of the unit waveform signal a given number of times; and generating, using a computer, an output sound signal by shifting each of the repetitive waveform signals in each length with a sequence in which the unit waveform signals form the original sound signal and then superimposing them on one another.
    Type: Grant
    Filed: February 10, 2010
    Date of Patent: October 2, 2012
    Assignee: Fujitsu Limited
    Inventor: Kazuhiro Watanabe
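    The three steps in this abstract (divide, repeat, shift and superimpose) can be sketched as follows. The sketch assumes a fixed, known period in samples and reads "shifting" as offsetting each repeated unit by one period in original order; both are assumptions, and the function name is hypothetical. No windowing or gain normalization is applied.

```python
def superimpose_repeated_units(signal, period, repeats):
    """Divide, repeat, and overlap-add a periodic signal (illustrative only)."""
    # Step 1: split into unit waveforms of one period each.
    # A trailing partial period, if any, is dropped.
    units = [signal[i:i + period]
             for i in range(0, len(signal) - period + 1, period)]
    if not units:
        return []

    # Unit k starts at k * period and lasts repeats * period samples.
    out = [0.0] * ((len(units) - 1 + repeats) * period)
    for k, unit in enumerate(units):
        # Step 2: repeat the unit waveform the given number of times.
        repeated = list(unit) * repeats
        # Step 3: shift by k periods and superimpose on the output.
        start = k * period
        for j, sample in enumerate(repeated):
            out[start + j] += sample
    return out
```

    With `repeats` set to 1 the units do not overlap and the input is returned unchanged, which is a quick sanity check on the shifting logic.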
  • Patent number: 8275626
    Abstract: An apparatus for decoding an encoded audio signal having first and second portions encoded in accordance with first and second encoding algorithms, respectively, BWE parameters for the first and second portions and a coding mode information indicating a first or a second decoding algorithm, includes first and second decoders, a BWE module and a controller. The decoders decode portions in accordance with decoding algorithms for time portions of the encoded signal to obtain decoded signals. The BWE module has a controllable crossover frequency and is configured for performing a bandwidth extension algorithm using the first decoded signal and the BWE parameters for the first portion, and for performing a bandwidth extension algorithm using the second decoded signal and the bandwidth extension parameter for the second portion. The controller controls the crossover frequency for the BWE module in accordance with the coding mode information.
    Type: Grant
    Filed: January 11, 2011
    Date of Patent: September 25, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Max Neuendorf, Bernhard Grill, Ulrich Kraemer, Markus Multrus, Harald Popp, Nikolaus Rettelbach, Frederik Nagel, Markus Lohwasser, Marc Gayer, Manuel Jander, Virgilio Bacigalupo
  • Patent number: 8275834
    Abstract: Disclosed is a flexible, multi-modal system useful in communications among users, capable of synchronizing real world and augmented reality, wherein the system is deployed in centralized and distributed computational platforms. The system comprises a plurality of input devices designed and configured to generate signals representing speech, gestures, pointing direction, and location of a user, and transmit the same to a multi-modal interface. Some of the signals generated represent a message from the user intended for dissemination to other users.
    Type: Grant
    Filed: September 14, 2009
    Date of Patent: September 25, 2012
    Assignee: Applied Research Associates, Inc.
    Inventors: Roberto Aldunate, Gregg E Larson
  • Publication number: 20120239384
    Abstract: A voice processing device includes a voice pitch converting unit that performs a voice pitch converting process with respect to an input voice signal and converts voice pitch of the input voice signal, an error detecting unit that detects an error between the number of samples of an output voice signal, which is expected, and the number of samples of the output voice signal, which is actually output, and a time length control unit that controls adjustment of the time length in such a manner that the time length of the output voice signal is corrected by the amount of the error.
    Type: Application
    Filed: March 9, 2012
    Publication date: September 20, 2012
    Inventors: Akihiro Mukai, Akira Inoue
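    One way to picture the error detection and time-length control described above is a running sample-count error that is paid back frame by frame. This is a minimal sketch under stated assumptions: the per-frame limit `max_step` and the function name are hypothetical, and the abstract does not say how the correction is scheduled.

```python
def plan_corrections(expected_counts, actual_counts, max_step):
    """Return per-frame sample corrections and the residual error.

    A positive correction means "add samples"; negative means "remove".
    max_step caps the correction applied in any one frame (an assumption).
    """
    error = 0  # actual minus expected samples not yet corrected
    corrections = []
    for expected, actual in zip(expected_counts, actual_counts):
        error += actual - expected
        # Correct as much of the accumulated error as the cap allows.
        step = max(-max_step, min(max_step, -error))
        corrections.append(step)
        error += step
    return corrections, error
```

    For example, if three frames are each expected to produce 100 samples but produce 103, 100 and 99, a cap of 2 samples per frame yields corrections of -2, -1 and +1, leaving no residual error.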
  • Patent number: 8271278
    Abstract: A system, method and computer program product for classification of an analog electrical signal using statistical models of training data. A technique is described to quantize the analog electrical signal in a manner which maximizes the compression of the signal while simultaneously minimizing the diminution in the ability to classify the compressed signal. These goals are achieved by utilizing a quantizer designed to minimize the loss in a power of the log-likelihood ratio. A further technique is described to enhance the quantization process by optimally allocating a number of bits for each dimension of the quantized feature vector subject to a maximum number of bits available across all dimensions.
    Type: Grant
    Filed: April 3, 2010
    Date of Patent: September 18, 2012
    Assignee: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Hsin I. Tseng, Deepak S. Turaga, Olivier Verscheure
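    The idea of allocating bits per dimension under a total budget can be illustrated with a standard greedy scheme. Note the substitution: the sketch minimizes mean-squared error under the common high-resolution model D(b) = variance * 4^(-b), whereas the patent's criterion is loss in log-likelihood ratio. The function name and the distortion model are assumptions for illustration.

```python
def allocate_bits(variances, total_bits):
    """Greedily assign total_bits across dimensions, one bit at a time."""
    if not variances:
        return []
    bits = [0] * len(variances)
    for _ in range(total_bits):
        # Distortion drop from one more bit: D(b) - D(b + 1) = 0.75 * var * 4**-b.
        gains = [0.75 * v * 4.0 ** -b for v, b in zip(variances, bits)]
        # Give the bit to the dimension with the largest drop (first on ties).
        best = max(range(len(bits)), key=gains.__getitem__)
        bits[best] += 1
    return bits
```

    Dimensions with larger variance receive more bits, and the allocation always sums to the budget, which matches the "maximum number of bits available across all dimensions" constraint in spirit.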
  • Patent number: 8271293
    Abstract: Provided are, among other things, systems, methods and techniques for decoding an audio signal from a frame-based bit stream. Each frame includes processing information pertaining to the frame and entropy-encoded quantization indexes representing audio data within the frame. The processing information includes: (i) code book indexes, (ii) code book application information specifying ranges of entropy-encoded quantization indexes to which the code books are to be applied, and (iii) window information. The entropy-encoded quantization indexes are decoded by applying the identified code books to the corresponding ranges of entropy-encoded quantization indexes. Subband samples are then generated by dequantizing the decoded quantization indexes, and a sequence of different window functions that were applied within a single frame of the audio data is identified based on the window information.
    Type: Grant
    Filed: March 28, 2011
    Date of Patent: September 18, 2012
    Assignee: Digital Rise Technology Co., Ltd.
    Inventor: Yuli You
  • Patent number: 8270618
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder, which, in case of a low level decoder, only decodes the first and second downmix channels or, in case of a high level decoder, provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: September 9, 2008
    Date of Patent: September 18, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hölzer, Claus Spenger
  • Patent number: 8271275
    Abstract: A scalable encoding device capable of reducing an encoding rate to reduce a circuit scale while preventing sound quality deterioration of a decoded signal. An extension layer is coarsely divided into a system for processing a first channel and a system for processing a second channel. A sound source predictor for processing the first channel predicts a drive sound source signal of the first channel from a drive sound source signal of a monaural signal, and outputs the predicted drive sound source signal through a multiplier to a first CELP encoder. A sound source predictor for processing the second channel predicts the drive sound source signal of the second channel from the drive sound source signal of the monaural signal and the output from the first CELP encoder, and outputs the predicted drive sound source signal through a multiplier to a second CELP encoder.
    Type: Grant
    Filed: May 29, 2006
    Date of Patent: September 18, 2012
    Assignee: Panasonic Corporation
    Inventors: Michiyo Goto, Koji Yoshida
  • Patent number: 8271271
    Abstract: A method for modifying a cepstro-temporally smoothed gain function, resulting in a bias-compensated spectral gain function, is provided. The cepstro-temporal smoothing increases the quality of an enhanced output signal, as it affects only spectral outliers caused by estimation errors, while the speech characteristics are well preserved. However, due to the cepstral transform, the temporal smoothing is done in the logarithmic domain rather than the linear domain, and hence results in a certain bias. Thus, the method provides a general bias compensation for a cepstro-temporal smoothing of spectral filter gain functions that depends only on the lower limit of the spectral filter-gain function.
    Type: Grant
    Filed: July 17, 2009
    Date of Patent: September 18, 2012
    Assignee: Siemens Medical Instruments Pte. Ltd.
    Inventors: Colin Breithaupt, Timo Gerkmann, Rainer Martin
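    The bias mentioned in this abstract comes from averaging logarithms: a weighted mean taken in the log domain is a geometric mean, which never exceeds the arithmetic mean of the same values. The sketch below shows this with first-order recursive smoothing applied directly to log-gains. It is a simplification (no cepstral transform, one hypothetical smoothing constant `alpha`) and does not reproduce the patent's compensation rule.

```python
import math


def smooth(values, alpha):
    """First-order recursive smoothing: s[n] = alpha*s[n-1] + (1-alpha)*x[n]."""
    state = values[0]
    out = [state]
    for v in values[1:]:
        state = alpha * state + (1 - alpha) * v
        out.append(state)
    return out


def smooth_log_domain(gains, alpha):
    """Smooth log-gains, then map back to the linear domain."""
    return [math.exp(s) for s in smooth([math.log(g) for g in gains], alpha)]
```

    Comparing `smooth(gains, alpha)` with `smooth_log_domain(gains, alpha)` on a fluctuating gain track shows the log-domain result sitting at or below the linear one, which is the kind of systematic offset a bias compensation has to undo.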
  • Patent number: 8265696
    Abstract: A digital telecommunications system wherein the telecommunications centers of the calling and called terminal are arranged to perform handshaking concerning the speech codec used by the terminals. Depending on the link between the telecommunications centers, the telecommunications centers are arranged to connect call connections past a transcoder unit or to control the transcoder units to let encoded speech through without speech encoding operations in such a way that speech encoding and decoding are carried out only in the terminals. Handshaking between the telecommunications centers is carried out as outband signalling.
    Type: Grant
    Filed: October 19, 1999
    Date of Patent: September 11, 2012
    Assignee: Nokia Siemens Networks Oy
    Inventor: Markku Verkama
  • Patent number: 8265941
    Abstract: A method for decoding an audio signal comprises receiving a combined downmix, a combined object information, and a mix information, the combined downmix being generated using at least two downmix signals, the combined object information being made by combination of at least two sets of object information, generating a downmix processing information using the combined object information and the mix information, and processing the combined downmix using the downmix processing information. The method and an apparatus for decoding an audio signal comprising the combined downmix and the combined object information can control object gain and output in a remote conference and so on. The method and the apparatus for decoding an audio signal that contains multi-object signals are fast and efficient, reducing processing time and computing resources and thereby relieving resource requirements such as wide bandwidth by using the combined object information.
    Type: Grant
    Filed: December 6, 2007
    Date of Patent: September 11, 2012
    Assignee: LG Electronics Inc.
    Inventors: Hyen O Oh, Yang Won Jung
  • Patent number: 8266451
    Abstract: A portable device including a biometric voice sensor configured to detect voice information and to take an action in response to speech spoken into the voice sensor. The device also includes a voice processor configured to process the voice sensor signal characteristics. The portable device may encrypt the detected signal and may compare the detected signal characteristics with voice characteristics that are stored in a memory of the portable device for applications such as voice enabled authentication, identification, command execution, encryption, and free speech recognition. The voice sensor may include a thin membrane portion that detects pressure waves caused by human speech. The portable device may be a contact-type smart card, a contactless smart card, or a hybrid smart card with contact and contactless interfaces. The device may be powered by an internal battery or by a host via contacts or by a power signal making use of the antenna in a contactless implementation.
    Type: Grant
    Filed: August 31, 2001
    Date of Patent: September 11, 2012
    Assignee: Gemalto SA
    Inventors: Robert A. Leydier, Bertrand du Castel