Transformation Patents (Class 704/203)
  • Publication number: 20080040101
    Abstract: Sound signals from sound sources present in multiple directions are accepted as inputs of multiple channels, and the signal of each channel is transformed into a signal on a frequency axis. A phase component of the transformed signal is calculated for each identical frequency, and the phase difference between the multiple channels is calculated. An amplitude component of the transformed signal is calculated, and a noise component is estimated from the calculated amplitude component. An SN ratio for each frequency is calculated on the basis of the amplitude component and the estimated noise component, and frequencies at which the SN ratios are larger than a predetermined value are extracted. The difference between arrival distances is calculated on the basis of the phase difference at each selected frequency, and the arrival direction in which it is estimated that the target sound source is present is calculated.
    Type: Application
    Filed: July 20, 2007
    Publication date: February 14, 2008
    Applicant: FUJITSU LIMITED
    Inventor: Shoji Hayakawa
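As a rough illustration of the final step described in the abstract of publication 20080040101, the sketch below converts an inter-channel phase difference at one frequency into an arrival-distance difference and then into an angle, using only frequency bins whose SN ratio exceeds a threshold. The two-microphone geometry, the 343 m/s speed of sound, the averaging over bins, and all function names are assumptions made for illustration; none of them are taken from the application.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def arrival_angle(phase_diff_rad, freq_hz, mic_spacing_m):
    """Angle in radians from broadside for one frequency bin."""
    # Phase difference -> difference between arrival distances.
    path_diff = phase_diff_rad * SPEED_OF_SOUND / (2.0 * math.pi * freq_hz)
    # Clamp so that noise cannot push the ratio outside asin's domain.
    ratio = max(-1.0, min(1.0, path_diff / mic_spacing_m))
    return math.asin(ratio)


def estimate_direction(bins, mic_spacing_m, snr_threshold_db):
    """bins holds (freq_hz, phase_diff_rad, snr_db) tuples.

    Averages the per-bin angles over bins whose SN ratio exceeds the
    threshold; returns None when no bin qualifies.
    """
    angles = [
        arrival_angle(phase, freq, mic_spacing_m)
        for freq, phase, snr in bins
        if snr > snr_threshold_db
    ]
    if not angles:
        return None
    return sum(angles) / len(angles)
```

Averaging the per-bin angles is only one way to combine the selected frequencies; a weighted combination would fit the same outline.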
  • Patent number: 7328160
    Abstract: An encoding device includes a transforming unit operable to extract a part of an inputted audio signal at predetermined time intervals and to transform each extracted part to produce a plurality of windows composed of short blocks, and a judging unit operable to compare the windows with one another to judge whether there is a similarity of a predetermined degree and to replace a high frequency part of a first window, which is one of the produced windows, with values “0” when there is the similarity, wherein the first window and a second window share a high frequency part of the second window, which is also one of the produced windows. The encoding device also includes a first quantizing unit operable to quantize the produced windows after replacing operation; a first encoding unit operable to encode the quantized windows to produce encoded data; and a stream output unit operable to output the produced encoded data.
    Type: Grant
    Filed: November 1, 2002
    Date of Patent: February 5, 2008
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Kosuke Nishio, Takeshi Norimatsu, Mineo Tsushima, Naoya Tanaka
  • Publication number: 20080027714
    Abstract: The aim is to shorten output delay while maintaining high sound source separation performance when a sound separation process based on an ICA method is performed. A second Fourier transform process execution cycle t2 for obtaining a second frequency-domain signal S1 used as an input signal of a filter process is set shorter than a first Fourier transform process execution cycle t1 for obtaining a first frequency-domain signal used for a learning computation of a separating matrix. When the time length of a second time-domain signal S1 is set shorter than the time length of a first time-domain signal S0, a second separating matrix used for the filter process is set by aggregating matrix components of a first separating matrix, obtained through a learning calculation, for each of a plurality of groups.
    Type: Application
    Filed: June 26, 2007
    Publication date: January 31, 2008
    Inventors: Takashi Hiekata, Yohei Ikeda
  • Patent number: 7315822
    Abstract: A system and method for encoding and/or decoding a signal, such as an audio signal, employing a reversible transform obtained via matrix lifting. This reversible transform not only converts integer input to integer output, but also reconstructs the exact input from the output. It is one of the key modules for lossless and progressive to lossless audio codecs. The system and method of the invention produces smaller quantization noise and better compression performance of lossless and progressive to lossless codecs previously known. A number of embodiments employing RMDCT solutions are described. Matrix lifting is used to implement a reversible fast Fourier transform (FFT) and a reversible fractional-shifted FFT, respectively, which are further combined with reversible rotations to form a RMDCT. A progressive-to-lossless embedded audio codec (PLEAC) employing RMDCT is implemented with superior results for both lossless and lossy audio compression.
    Type: Grant
    Filed: February 20, 2004
    Date of Patent: January 1, 2008
    Assignee: Microsoft Corp.
    Inventor: Jin Li
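The abstract of patent 7315822 mentions reversible rotations that map integers to integers and can be inverted exactly. A minimal sketch of that idea follows, using the classic three-step scalar lifting factorization of a plane rotation with rounding after each step. This is not the matrix lifting described in the patent; the function names are hypothetical, and the angle must not be a multiple of pi because the factorization divides by sin(theta).

```python
import math


def _lifting_coefficients(theta):
    # Rotation = shear(c0) * shear(c1) * shear(c0); undefined when sin(theta) == 0.
    c0 = (math.cos(theta) - 1.0) / math.sin(theta)
    c1 = math.sin(theta)
    return c0, c1


def reversible_rotate(x, y, theta):
    """Integer approximation of rotating (x, y) by theta."""
    c0, c1 = _lifting_coefficients(theta)
    x = x + round(c0 * y)  # each step changes one value using the other
    y = y + round(c1 * x)
    x = x + round(c0 * y)
    return x, y


def inverse_rotate(x, y, theta):
    """Undo reversible_rotate exactly by reversing the three steps."""
    c0, c1 = _lifting_coefficients(theta)
    x = x - round(c0 * y)
    y = y - round(c1 * x)
    x = x - round(c0 * y)
    return x, y
```

Because every step adds a rounded function of the value it leaves unchanged, subtracting the same rounded quantity in reverse order recovers the input without any rounding loss.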
  • Patent number: 7310598
    Abstract: The invention relates to representation of one and multidimensional signal vectors in multiple nonorthogonal domains and design of Vector Quantizers that can be chosen among these representations. There is presented a Vector Quantization technique in multiple nonorthogonal domains for both waveform and model based signal characterization. An iterative codebook accuracy enhancement algorithm, applicable to both waveform and model based Vector Quantization in multiple nonorthogonal domains, which yields further improvement in signal coding performance, is disclosed. Further, Vector Quantization in multiple nonorthogonal domains is applied to speech and exhibits clear performance improvements of reconstruction quality for the same bit rate compared to existing single domain Vector Quantization techniques. The technique disclosed herein can be easily extended to several other one and multidimensional signal classes.
    Type: Grant
    Filed: April 11, 2003
    Date of Patent: December 18, 2007
    Assignee: University of Central Florida Research Foundation, Inc.
    Inventors: Wasfy Mikhael, Venkatesh Krishnan
  • Patent number: 7299175
    Abstract: An audio encoder regulates quality and bitrate with a control strategy. The strategy includes several features. First, an encoder regulates quantization using quality, minimum bit count, and maximum bit count parameters. Second, an encoder regulates quantization using a noise measure that indicates reliability of a complexity measure. Third, an encoder normalizes a control parameter value according to block size for a variable-size block. Fourth, an encoder uses a bit-count control loop de-linked from a quality control loop. Fifth, an encoder addresses non-monotonicity of quality measurement as a function of quantization level when selecting a quantization level. Sixth, an encoder uses particular interpolation rules to find a quantization level in a quality or bit-count control loop. Seventh, an encoder filters a control parameter value to smooth quality. Eighth, an encoder corrects model bias by adjusting a control parameter value in view of current buffer fullness.
    Type: Grant
    Filed: February 24, 2005
    Date of Patent: November 20, 2007
    Assignee: Microsoft Corporation
    Inventors: Wei-Ge Chen, Naveen Thumpudi, Ming-Chieh Lee
  • Patent number: 7295968
    Abstract: The invention concerns audio signal processing, comprising: a first processing of an audio source signal, using at least a mathematical transform applied on first sequences of samples obtained by applying first segmentation windows on the audio source signal; and a second audio processing applied on second sequences of samples obtained by applying second segmentation windows on the signal delivered by the first step; the two successive first windows and/or the two successive second windows overlapping, the overlaps being such that the segmentations are synchronous.
    Type: Grant
    Filed: May 15, 2002
    Date of Patent: November 13, 2007
    Assignee: Wavecom
    Inventors: Franck Bietrix, Hubert Cadusseau
  • Patent number: 7269549
    Abstract: An encoding method is characterised by a step of encoding parameters of a given sinusoidal component in encoded frames either differentially relative to other components in the same frame or directly, i.e. without differential encoding. Whether the encoding is differential or direct is decided algorithmically. A first type of algorithm produces an optimal result using a method derived from graph theory. An alternative algorithm, which is less computing intensive, provides an approximate result by an iterative greedy search algorithm.
    Type: Grant
    Filed: October 15, 2002
    Date of Patent: September 11, 2007
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Jesper Jensen, Richard Heusdens
  • Patent number: 7251596
    Abstract: The present invention provides a unique wave-trigon transformation (WTT) method for performing transformation process over a wave signal. The present invention also provides a pitch detecting method and apparatus for detecting pitch based on the WTT process as well as a sentence detecting method and apparatus for detecting a sentence in a sound signal based on the WTT process. The pitch detecting method and apparatus can effectively detect pitch in a sound signal. In the WTT process, an inputted wave signal (such as a sound signal) is transformed into a series of trigons, and an energy-width spectrum is formed using these trigons. For a sound signal containing voice, the distribution of trigons transformed from the sound signal has a certain pattern. By analyzing the pattern, whether a pitch is contained in the sound signal can be determined. In particular, existence of a pitch can be determined by determining and evaluating the periodicity of trigons in a candidate chained peak in the energy-width spectrum.
    Type: Grant
    Filed: December 23, 2002
    Date of Patent: July 31, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventors: Lianshan Zhu, Tao Yu
  • Patent number: 7245903
    Abstract: A telecommunication system serves individual programmed instruments by wireless operation, voice dialing, and prepaid operation. The central equipment involves a multiple port wireless platform along with computing capability for interfacing a multitude of mobile station instruments simultaneously for interactive audio communication. Depending on programmed operation, the central equipment may receive vocal instruction data from individual instruments, to bridge connections to other telephone terminals, incur financial transactions, as by credit card number, or to provide a geographic information service. Specific information on physical locations is addressed and fetched on the basis of area and target location characteristics.
    Type: Grant
    Filed: January 27, 2005
    Date of Patent: July 17, 2007
    Inventor: Byard G. Nilsson
  • Patent number: 7219055
    Abstract: The present invention relates to a speech recognition apparatus for recognizing speeches of a plurality of users with high accuracy. An adapting unit 12 detects a best transformation function for adapting an input speech to an acoustic model from at least one transformation function based on the transformation results which are obtained by transforming the input speech by at least one transformation function stored in a storing unit 13, and allocates the input speech to the best transformation function. Further, the adapting unit 12 updates the transformation function to which the new input speech is allocated by all the input speeches allocated to the transformation function. A selecting unit 14 selects the transformation function used for transforming the input speech from at least one transformation function stored in the storing unit 13. A transforming unit 5 transforms the input speech by the selected transformation function.
    Type: Grant
    Filed: June 7, 2002
    Date of Patent: May 15, 2007
    Assignee: Sony Corporation
    Inventor: Helmut Lucke
  • Patent number: 7197453
    Abstract: A system for, and method of optimizing an operation of an oversampled filter bank and an oversampled discrete Fourier transform (DFT) filter bank designed by the system or the method. In one embodiment, the system includes: (1) a null space generator configured to produce a basis of a null space of a perfect reconstruction condition matrix based on a first window of the oversampled filter bank and (2) an optimizer associated with the basis generator and configured to employ the null space and an optimization criterion to construct a second window of the oversampled filter bank.
    Type: Grant
    Filed: June 23, 2006
    Date of Patent: March 27, 2007
    Assignee: Texas Instruments Incorporated
    Inventor: Mohamed F. Mansour
  • Patent number: 7185240
    Abstract: An apparatus for testing codec software includes a processor unit operative to execute a test program to read input data from a memory unit, to transform the input data according to transformation conditions by referring to data of the transformation conditions stored in the memory unit, and to generate a plurality of transformed data sets for storage in the memory unit, operative to execute a codec program to perform a plurality of encoding processes in parallel with respect to the plurality of transformed data sets stored in the memory unit to generate a plurality of encoded data sets, and to perform a plurality of decoding processes in parallel with respect to the plurality of encoded data sets to generate a plurality of decoded data sets, and operative to execute the test program to evaluate quality of at least one of the encoded data sets and the decoded data sets.
    Type: Grant
    Filed: June 22, 2005
    Date of Patent: February 27, 2007
    Assignee: Fujitsu Limited
    Inventors: Kenji Saito, Kazuyoshi Oyama
  • Patent number: 7151802
    Abstract: In a method and device for recovering the high frequency content of a wideband signal previously down-sampled, and for injecting this high frequency content in an over-sampled synthesized version of the wideband signal to produce a full-spectrum synthesized wideband signal, a random noise generator produces a noise sequence having a given spectrum. A spectral shaping unit spectrally shapes the noise sequence in relation to linear prediction filter coefficients related to the down-sampled wideband signal. A signal injection circuit finally injects the spectrally-shaped noise sequence in the over-sampled synthesized signal version to thereby produce the full-spectrum synthesized wideband signal.
    Type: Grant
    Filed: October 27, 1999
    Date of Patent: December 19, 2006
    Assignee: Voiceage Corporation
    Inventors: Bruno Bessette, Redwan Salami, Roch Lefebvre
  • Patent number: 7143028
    Abstract: A simple and efficient method for producing an obfuscated speech signal which may be used to mask a stream of speech is disclosed. A speech signal representing the speech stream to be masked is obtained. The speech signal is then temporally partitioned into segments, preferably corresponding to phonemes within the speech stream. The segments are then stored in a memory, and some or all of the segments are subsequently selected, retrieved, and assembled into an obfuscated speech signal representing an unintelligible speech stream that, when combined with the speech signal or reproduced and combined with the speech stream, provides a masking effect. While the presently preferred embodiment finds application most readily in an open plan office, embodiments suitable for use in restaurants, classrooms, and in telecommunications systems are also disclosed.
    Type: Grant
    Filed: July 24, 2002
    Date of Patent: November 28, 2006
    Assignee: Applied Minds, Inc.
    Inventors: W. Daniel Hillis, Bran Ferren, Russel Howe, Brian Eno
  • Patent number: 7133699
    Abstract: A method of separating mixed wireless signals is provided. The method includes receiving, at an antenna comprising a first quantity of antenna elements, mixed signals comprising a mixture of source signals communicated from a second quantity of wireless signal sources, and separating the mixed signals to estimate the source signals. The second quantity is greater than the first quantity, and the source signals communicated from at least one of the wireless signal sources are received at the antenna as complex signals.
    Type: Grant
    Filed: April 22, 2003
    Date of Patent: November 7, 2006
    Assignee: Raytheon Company
    Inventors: Yuri Owechko, David B. Shu
  • Patent number: 7124075
    Abstract: Methods and apparatus for detecting periodicity and/or for determining the fundamental period of a signal such as speech. The methods include embedding a portion of a sampled digitized signal into an m-dimensional state space to obtain a sequence of m-dimensional vectors, selecting closest pairs of vectors in state space from a plurality of possible pairs of m-dimensional vectors in said sequence of m-dimensional vectors, accumulating total numbers of selected closest pairs of vectors having the same time separation values to produce a histogram of accumulated numbers, and locating at least a highest peak in a portion of said histogram to obtain a value indicating the fundamental period of the signal. Various embodiments are directed to speech and audio signal processing and other speech related applications. However, the methods have a general nature and can be applied to other types of periodic or quasi-periodic signals as well.
    Type: Grant
    Filed: May 7, 2002
    Date of Patent: October 17, 2006
    Inventor: Dmitry Edward Terez
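The steps listed in the abstract of patent 7124075 (embed the samples in an m-dimensional state space, select close pairs of vectors, histogram their time separations, take the highest peak) can be sketched as below. This is a simplified reading, not the patented method: "closest pairs" is approximated here by every pair lying within a fixed radius, and the parameter names and defaults (m, delay, radius, min_lag) are assumptions.

```python
import math
from collections import Counter


def embed(signal, m, delay=1):
    """Build m-dimensional delay vectors from a sampled signal."""
    count = len(signal) - (m - 1) * delay
    return [
        tuple(signal[i + k * delay] for k in range(m))
        for i in range(count)
    ]


def fundamental_period(signal, m=3, delay=1, radius=0.5, min_lag=2):
    """Return the time separation (in samples) shared by the most close pairs.

    Returns None when no pair of vectors falls within the radius.
    """
    vectors = embed(signal, m, delay)
    histogram = Counter()
    for i in range(len(vectors)):
        # min_lag skips trivially adjacent vectors.
        for j in range(i + min_lag, len(vectors)):
            if math.dist(vectors[i], vectors[j]) <= radius:
                histogram[j - i] += 1
    if not histogram:
        return None
    # The highest histogram peak marks the estimated period.
    return max(histogram, key=histogram.get)
```

For a periodic signal, vectors one period apart nearly coincide, so the lag equal to the period collects the most pairs; multiples of the period collect fewer because fewer pairs fit inside the analysed portion.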
  • Patent number: 7103539
    Abstract: According to the invention, a method for increasing quality of an enhanced output signal to approximate an undistorted sound signal is disclosed. In one step, a distorted input signal is received that includes an embedded corrupting signal. The embedded corrupting signal is statistically related to the undistorted sound signal. An enhancement signal is determined by finding a difference between the distorted input signal and the enhanced output signal. The enhancement signal attempts to offset the effect of the embedded corrupting signal. Based at least in part upon analyzing the enhancement signal, the enhanced output signal is produced.
    Type: Grant
    Filed: November 8, 2001
    Date of Patent: September 5, 2006
    Assignees: Global IP Sound Europe AB, Global IP Sound Inc.
    Inventor: W. Bastiaan Kleijn
  • Patent number: 7103101
    Abstract: A method, system, and computer program product for blind Karhunen-Loeve transform (KLT) coding is disclosed, wherein KLT coefficients and possibly additional information are transmitted without transmitting KLT basis vectors. Instead the KLT basis vectors are estimated at both the transmitter and the receiver. In particular, the KLT transform coefficients are calculated based on an original signal frame, quantized, encoded, and transmitted. The transmitted KLT coefficients are received, decoded, and transformed into a reconstructed original signal frame by the receiver. Exemplary embodiments include a KLT coder in which a search direction vector is initially set to an arbitrary vector, such as a white noise vector, a KLT coder which selects a search direction vector from a set of search direction vectors in a codebook and a KLT coder for an original signal frame with an unknown signal subspace dimension.
    Type: Grant
    Filed: October 15, 2001
    Date of Patent: September 5, 2006
    Assignee: Southern Methodist University
    Inventor: Carlos E. Davila
  • Patent number: 7089178
    Abstract: A distributed voice recognition system and method for obtaining acoustic features and speech activity at multiple frequencies by extracting high frequency components thereof on a device, such as a subscriber station and transmitting them to a network server having multiple stream processing capability, including cepstral feature processing, MLP nonlinear transformation processing, and multiband temporal pattern architecture processing. The features received at the network server are processed using all three streams, wherein each of the three streams provide benefits not available in the other two, thereby enhancing feature interpretation. Feature extraction and feature interpretation may operate at multiple frequencies, including but not limited to 8 kHz, 11 kHz, and 16 kHz.
    Type: Grant
    Filed: April 30, 2002
    Date of Patent: August 8, 2006
    Assignee: Qualcomm Inc.
    Inventors: Harinath Garudadri, Sunil Sivadas, Hynek Hermansky, Nelson H. Morgan, Charles C. Wooters, Andre Gustavo Adami, Maria Carmen Benitez Ortuzar, Lukas Burget, Stephane N. Dupont, Frantisek Grezl, Pratibha Jain, Sachin Kajarekar, Petr Motlicek
  • Patent number: 7089177
    Abstract: The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech.
    Type: Grant
    Filed: August 3, 2005
    Date of Patent: August 8, 2006
    Assignee: The Regents of the University of California
    Inventors: Greg C. Burnett, John F. Holzrichter, Lawrence C. Ng
  • Patent number: 7065485
    Abstract: The method and preprocessor enhance the intelligibility of narrowband speech without essentially lengthening the overall time duration of the signal. Both spectral enhancements and variable-rate time-scaling procedures are implemented to improve the salience of initial consonants, particularly the perceptually important formant transitions. Emphasis is transferred from the dominating vowel to the preceding consonant through adaptation of the phoneme timing structure. In a further embodiment, the technique is applied as a preprocessor to a speech coder.
    Type: Grant
    Filed: January 9, 2002
    Date of Patent: June 20, 2006
    Assignee: AT&T Corp
    Inventors: Nicola R. Chong-White, Richard Vandervoort Cox
  • Patent number: 7016831
    Abstract: Disclosed is a voice code conversion apparatus to which voice code obtained by a first voice encoding method is input for converting this voice code to voice code of a second voice encoding method. The apparatus includes a code separating unit for separating, from the voice code based upon the first voice encoding method, codes of a plurality of components necessary to reconstruct a voice signal, code converters for dequantizing the codes of each of the components and then quantizing the dequantized values by the second voice encoding method to thereby generate codes, and a code multiplexer for multiplexing the codes output from respective ones of the code converters and transmitting voice code based upon the second voice encoding method.
    Type: Grant
    Filed: March 27, 2001
    Date of Patent: March 21, 2006
    Assignee: Fujitsu Limited
    Inventors: Masanao Suzuki, Yasuji Ota, Yoshiteru Tsuchinaga
  • Patent number: 7013266
    Abstract: In a method for determining speech quality using an objective measure, in order to enhance prediction reliability of the evaluated quality parameters, distortions of the mean spectral envelope are extensively corrected with a weighting function WT(f) before comparing spectral properties. Additionally, the fixed band limits for integration of spectral power density are suppressed and other band limits are searched for instead in a predetermined optimization area in which the resulting spectral intensity representations of the voice signal to be evaluated and the reference voice signal have maximum similarity. The solutions described can supplement known methods and can be incorporated into their structures.
    Type: Grant
    Filed: August 14, 1999
    Date of Patent: March 14, 2006
    Assignee: Deutsche Telekom AG
    Inventor: Jens Berger
  • Patent number: 7010480
    Abstract: A method for preparing a speech signal for encoding comprises determining whether the spectral content of an input speech signal is representative of a defined spectral characteristic (e.g., a defined characteristic slope). A frequency specific filter component of a weighting filter is controlled based on the determination of the spectral content of the speech signal and/or its location in the encoder. A core weighting filter component of the weighting filter may be maintained regardless of the spectral content of the speech signal.
    Type: Grant
    Filed: September 13, 2001
    Date of Patent: March 7, 2006
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Yang Gao, Huan-Yu Su
  • Patent number: 6993477
    Abstract: A signal processing device utilizes a stochastic approximation of a gradient descent algorithm for updating a transform. The signal processing device is configured to implement the transform for producing a desired transformed output signal, and the transform is updated using the stochastic approximation of the gradient algorithm based on received data associated with the signal being processed. The transform is represented in a reduced-parameter form, such as a Givens parameterized form or a Householder form, such that the reduced-parameter form for an N×N transform comprises fewer than N² parameters. The updating process is implemented using computations involving the reduced-parameter form, and an adaptation of the transform is represented directly as one or more changes in the reduced-parameter form. The gradient algorithm may be configured to minimize a negative gradient of a pairwise energy compaction property of the transform.
    Type: Grant
    Filed: June 8, 2000
    Date of Patent: January 31, 2006
    Assignee: Lucent Technologies Inc.
    Inventor: Vivek K. Goyal
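To make the reduced-parameter idea in the abstract of patent 6993477 concrete, the sketch below builds an N×N orthogonal transform from N(N−1)/2 Givens rotation angles, which is fewer parameters than the N×N matrix has entries. It illustrates the parameterization only, not the patent's stochastic gradient update; the ordering of the rotation planes and the function name are assumptions.

```python
import math
from itertools import combinations


def givens_transform(n, angles):
    """Compose one plane rotation per index pair (p, q) with p < q.

    angles must hold n * (n - 1) // 2 values, one per rotation plane.
    Returns the resulting n x n orthogonal matrix as a list of rows.
    """
    pairs = list(combinations(range(n), 2))
    if len(angles) != len(pairs):
        raise ValueError("expected %d angles" % len(pairs))
    # Start from the identity and left-multiply by each rotation in turn.
    t = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for (p, q), theta in zip(pairs, angles):
        c, s = math.cos(theta), math.sin(theta)
        for col in range(n):
            tp, tq = t[p][col], t[q][col]
            t[p][col] = c * tp - s * tq
            t[q][col] = s * tp + c * tq
    return t
```

Adjusting a single angle changes the transform while keeping it orthogonal, which is what makes an update expressed directly in the angles convenient.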
  • Patent number: 6980933
    Abstract: Estimates of spectral magnitude and phase are obtained by an estimation process using spectral information from analysis filter banks such as the Modified Discrete Cosine Transform. The estimation process may be implemented by convolution-like operations with impulse responses. Portions of the impulse responses may be selected for use in the convolution-like operations to trade off between computational complexity and estimation accuracy. Mathematical derivations of analytical expressions for filter structures and impulse responses are disclosed.
    Type: Grant
    Filed: January 27, 2004
    Date of Patent: December 27, 2005
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Corey I. Cheng, Michael J. Smithers, David N. Lathrop
  • Patent number: 6963842
    Abstract: A memory-efficient system converting a signal from a first transform domain to a second transform domain. The system includes a first mechanism that obtains an input signal expressed via a first transform-domain signal representation. A second mechanism expresses the input signal via a second transform-domain signal representation without intermediate time-domain conversion. In the specific embodiment, the input signal is a Modified Discrete Cosine Transform (MDCT) signal. The second transform-domain signal representation is a Discrete Fourier Transform (DFT) signal. The second mechanism further includes a third mechanism that combines effects of an inverse MDCT, a synthesis window function, and an analysis window function, and provides a first signal in response thereto. A fourth mechanism converts the MDCT signal to the DFT signal based on the first signal.
    Type: Grant
    Filed: September 5, 2001
    Date of Patent: November 8, 2005
    Assignee: Creative Technology Ltd.
    Inventor: Michael M. Goodwin
  • Patent number: 6959277
    Abstract: In a conventional device for extracting voice features accurately without being influenced by noises, such as a voice recognition device, usually an input voice signal is processed first by a noise reduction system having the tap length N, and the result is FFT-processed by L-points, and then the power spectrum vector is calculated; accordingly, a one time operation requires N multiplications and (N−1) summations. The voice feature extraction device according to the invention receives a voice signal including noises from a microphone, which is processed by a window function operation unit, and thereafter FFT-processed by an FFT operation unit by L-points. A power calculation unit calculates a power spectrum vector of the input voice signal. However, a noise reduction system determines in advance a filter coefficient of this system and processes the coefficient to calculate a noise reduction coefficient, and the power spectrum vector is processed by this noise reduction system.
    Type: Grant
    Filed: June 26, 2001
    Date of Patent: October 25, 2005
    Assignee: Alpine Electronics, Inc.
    Inventors: Shingo Kiuchi, Toshiaki Asano, Nozomu Saito
  • Patent number: 6925116
    Abstract: The present invention proposes a new method and apparatus for the enhancement of source coding systems. The invention employs bandwidth reduction (101) prior to or in the encoder (103), followed by spectral-band replication (105) at the decoder (107). This is accomplished by the use of new transposition methods, in combination with spectral envelope adjustments. Reduced bitrate at a given perceptual quality or an improved perceptual quality at a given bitrate is offered. The invention is preferably integrated in a hardware or software codec, but can also be implemented as a separate processor in combination with a codec. The invention offers substantial improvements practically independent of codec type and technological progress.
    Type: Grant
    Filed: October 8, 2003
    Date of Patent: August 2, 2005
    Assignee: Coding Technologies AB
    Inventors: Lars Gustaf Liljeryd, Per Rune Albin Ekstrand, Lars Fredrik Henn, Hans Magnus Kristofer Kjorling
  • Patent number: 6912495
    Abstract: An improved speech model and methods for estimating the model parameters, synthesizing speech from the parameters, and quantizing the parameters are disclosed. The improved speech model allows a time and frequency dependent mixture of quasi-periodic, noise-like, and pulse-like signals. For pulsed parameter estimation, an error criterion with reduced sensitivity to time shifts is used to reduce computation and improve performance. Pulsed parameter estimation performance is further improved using the estimated voiced strength parameter to reduce the weighting of frequency bands which are strongly voiced when estimating the pulsed parameters. The voiced, unvoiced, and pulsed strength parameters are quantized using a weighted vector quantization method using a novel error criterion for obtaining high quality quantization. The fundamental frequency and pulse position parameters are efficiently quantized based on the quantized strength parameters.
    Type: Grant
    Filed: November 20, 2001
    Date of Patent: June 28, 2005
    Assignee: Digital Voice Systems, Inc.
    Inventors: Daniel W. Griffin, John C. Hardwick
  • Patent number: 6910006
    Abstract: In order to decode a sequence α=(α1, . . . , αi, . . . , αn), where αi is the received electrical signal corresponding to a transmitted signal ai representing the ith binary element vi of a word v=(v1, . . . , vn) chosen in a code C of words satisfying v·h^T=0, where h is a row n-tuplet on the set {0,1} whose number of 1s is denoted w, an item of extrinsic information ρext[A(i,h)]=P[ai=−1|A(i,h)]/P[ai=+1|A(i,h)] is determined on each of the elements vi covered by h, A(i,h) being the set of the received values αj covered by h, with the exception of αi, and P[ai|A(i,h)] being the probability that the ith signal transmitted was ai. This gives ρext[A(i,h)]=[S1(i)+S3(i)+ . . . ]/[1+S2(i)+S4(i)+ . . .
    Type: Grant
    Filed: June 28, 2001
    Date of Patent: June 21, 2005
    Assignee: Canon Kabushiki Kaisha
    Inventor: Philippe Piret
  • Patent number: 6895376
    Abstract: A reduced dimensionality eigenvoice analytical technique is used during training to develop context-dependent acoustic models for allophones. Re-estimation processes are performed to more strongly separate speaker-dependent and speaker-independent components of the speech model. The eigenvoice technique is also used during run time upon the speech of a new speaker. The technique removes individual speaker idiosyncrasies, to produce more universally applicable and robust allophone models. In one embodiment the eigenvoice technique is used to identify the centroid of each speaker, which may then be “subtracted out” of the recognition equation.
    Type: Grant
    Filed: May 4, 2001
    Date of Patent: May 17, 2005
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Florent Perronnin, Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua
  • Patent number: 6850884
    Abstract: In a coding procedure, coding parameters are selected for coding the speech signal to achieve enhanced perceptual quality of reproduced speech. At least one coding parameter value or preferential coding parameter value is selected to make a spectral response of the speech signal more uniform to compensate for spectral variations that might otherwise be imparted into the speech signal by a communications network associated with the signal processing system.
    Type: Grant
    Filed: February 14, 2001
    Date of Patent: February 1, 2005
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Yang Gao, Huan-Yu Su
  • Patent number: 6842733
    Abstract: A signal processing system is well suited for conditioning a speech signal prior to coding the speech signal to achieve enhanced perceptual quality of reproduced speech. The signal processing system may be incorporated into mobile or portable wireless communications devices, wireless infrastructure equipment, or both. The signal processing system includes a filtering arrangement for filtering an input speech signal to make a spectral response of the speech signal more uniform to compensate for spectral variations that might otherwise be imparted into the speech signal by a communications network associated with the signal processing system.
    Type: Grant
    Filed: February 12, 2001
    Date of Patent: January 11, 2005
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Yang Gao, Huan-Yu Su
  • Publication number: 20040243400
    Abstract: The invention relates to a speech extender, embodied for the estimation of a broadband speech signal from a narrowband speech signal. Furthermore, the speech extender is adaptive, so that it can adjust to the coding of a communication terminal and/or to a speaker.
    Type: Application
    Filed: March 29, 2004
    Publication date: December 2, 2004
    Inventor: Stefano Ambrosius Klinke
  • Patent number: 6826600
    Abstract: Mechanisms and techniques allow computer systems to create and exchange uniquely identified shared objects. Using this invention, a client computer system can operate client software to generate local object definitions in a local object specification. To assure that the local object definitions created by the client are uniquely identifiable by this client, as well as by a server and possibly other clients which may require access to such object definitions (e.g., other clients in a collaboration software system), the invention allows the client to send the local object specification to the server for unique identification of the object definitions. The server receives the local object specification containing the local object definitions created by the client and can convert each local object definition within the local object specification to a global object definition in a global object specification.
    Type: Grant
    Filed: November 2, 2000
    Date of Patent: November 30, 2004
    Assignee: Cisco Technology, Inc.
    Inventor: Paul J. Russell
  • Patent number: 6804651
    Abstract: Initially, voice signal components (4) are extracted from the audio signal (1) in a procedure for determining a measure of quality (2) of an audio signal (1). Based on this signal, a reference signal (6) is then generated by means of noise suppression (7) and interruption interpolation (8). This signal is compared with the voice signal (4) and an intrusive quality value (10) is determined in this way. A further quality value (15) is determined by establishing and evaluating (12, 14) codec-related signal distortions in the voice signal (4). Another quality value (17) is generated from the information relating to the detected signal interruptions (8). The measure of quality (2) is finally determined as a linear combination (16) of the various quality values (10, 15, 17, 18).
    Type: Grant
    Filed: March 19, 2002
    Date of Patent: October 12, 2004
    Assignee: Swissqual AG
    Inventors: Pero Juric, Bendicht Thomet
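    The final step of this abstract, combining several quality values linearly, can be sketched as follows. This is a minimal illustration only: the function name, the weights, and the optional offset are assumptions, since the abstract does not state how the combination is parameterized.

```python
def combined_quality(values, weights, offset=0.0):
    """Return offset + sum(weight * value) over the individual quality values.

    The weights and offset are hypothetical; the abstract only says that
    the measure of quality is a linear combination of the quality values.
    """
    if len(values) != len(weights):
        raise ValueError("need exactly one weight per quality value")
    return offset + sum(w * v for w, v in zip(weights, values))


# Four quality values (for example intrusive, codec-related and
# interruption-related scores) weighted equally.
overall = combined_quality([4.0, 3.0, 2.0, 1.0], [0.25, 0.25, 0.25, 0.25])
```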
  • Patent number: 6795804
    Abstract: A system and method for applying a linear transformation to classify an input event. In one aspect, a method for classification comprises the steps of capturing an input event; extracting an n-dimensional feature vector from the input event; applying a linear transformation to the feature vector to generate a pool of projections; utilizing different subsets from the pool of projections to classify the feature vector; and outputting a class identity of the classified feature vector. In another aspect, the step of utilizing different subsets from the pool of projections to classify the feature vector comprises the steps of, for each predefined class, selecting a subset from the pool of projections associated with the class; computing a score for the class based on the associated subset; and assigning, to the feature vector, the class having the highest computed score.
    Type: Grant
    Filed: November 1, 2000
    Date of Patent: September 21, 2004
    Assignee: International Business Machines Corporation
    Inventors: Nagendra Kumar Goel, Ramesh Ambat Gopinath
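    The classification steps in this abstract (project the feature vector, pick a class-specific subset of projections, score each class, keep the best) can be sketched roughly as below. The matrix, the class subsets, and the distance-based score are all assumptions made for illustration; the abstract does not specify how scores are computed.

```python
def project(matrix, features):
    """Apply a linear transformation: one projection per matrix row."""
    return [sum(w * x for w, x in zip(row, features)) for row in matrix]


def classify(features, matrix, class_models):
    """Score every class on its own subset of projections; return the best.

    class_models maps a class name to (subset_indices, reference_values).
    The score is the negative squared distance between the selected
    projections and the reference values (a hypothetical scoring rule).
    """
    pool = project(matrix, features)
    best_class, best_score = None, float("-inf")
    for name, (subset, reference) in class_models.items():
        score = -sum((pool[i] - r) ** 2 for i, r in zip(subset, reference))
        if score > best_score:
            best_class, best_score = name, score
    return best_class


# Hypothetical 3-projection pool over 2-dimensional features.
MATRIX = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
CLASS_MODELS = {
    "a": ([0, 2], [1.0, 1.0]),  # class "a" looks at projections 0 and 2
    "b": ([1, 2], [1.0, 1.0]),  # class "b" looks at projections 1 and 2
}
label = classify([1.0, 0.0], MATRIX, CLASS_MODELS)
```

    Both classes share one pool of projections, so the transformation is applied once per input and only the subset selection differs per class.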
  • Publication number: 20040176949
    Abstract: Method and apparatus for the classification of speech signals. Speech is classified into two broad classes of speech production: whispered speech and normally phonated speech. Speech classified in this manner will yield increased performance of automated speech processing systems, because the erroneous results that occur when typical automated speech processing systems encounter non-typical speech, such as whispered speech, will be avoided.
    Type: Application
    Filed: March 3, 2003
    Publication date: September 9, 2004
    Inventors: Stanley J. Wenndt, Edward J. Cupples
  • Publication number: 20040162721
    Abstract: A method of editing (4) relatively long frames with high sub-frame accuracy for editing in the context of sinusoidal coding is disclosed. In order to provide such a method for high accuracy editing, so-called transient positions can be applied where an edit point (EEP, SEP) is desired in a previously encoded signal (AS). The adding is done as some kind of post-processing, for example by an audio editing application. The advantage of using a transient position as an edit point is that the signal can then abruptly end or start at the transient position, in principle with sample resolution accuracy, whereas in prior art systems, one is limited to frame boundaries, which occur, for example, once per 100 ms.
    Type: Application
    Filed: December 4, 2003
    Publication date: August 19, 2004
    Inventors: Arnoldus Werner Johannes Oomen, Leon Maria Van de Kerkhof
  • Publication number: 20040158459
    Abstract: Systems and methods are described for a fast paired method of 1-D cyclic convolution. A method includes calculating a paired transform of a signal, grouping components of the paired transform to form a plurality of splitting-signals, shifting the plurality of splitting signals, multiplying the plurality of splitting signals by a plurality of corresponding Fourier transforms, and calculating an inverse paired transform of the plurality of splitting signals.
    Type: Application
    Filed: January 31, 2003
    Publication date: August 12, 2004
    Applicant: Board of Regents, The University of Texas System
    Inventor: Artyom M. Grigoryan
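    For orientation, the operation this abstract accelerates, 1-D cyclic convolution, can be written directly from its definition. The sketch below is the plain O(n²) definition, not the paired-transform method of the application; it is the kind of reference result a fast implementation would be checked against.

```python
def cyclic_convolution(x, h):
    """Direct 1-D cyclic convolution: y[k] = sum over m of x[m] * h[(k - m) mod n]."""
    n = len(x)
    if len(h) != n:
        raise ValueError("both sequences must have the same length")
    return [sum(x[m] * h[(k - m) % n] for m in range(n)) for k in range(n)]


# Convolving with a unit impulse delayed by one sample rotates the signal.
rotated = cyclic_convolution([1, 2, 3, 4], [0, 1, 0, 0])
```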
  • Patent number: 6757648
    Abstract: A transcoder reduces excess requantization error in quantization of spectral data. The transcoder phase shifts data decompressed by a decompressor. The phase shifting causes a change to corresponding spectral data produced in later transform coding of the decompressed data. When the spectral data is then quantized to reduce bitrate, the earlier phase shifting reduces excess requantization error. After transcoding, a second decompressor can compensate for the phase shifting by, for example, reverse shifting by the amount of the phase shift. Instead of phase shifting, the transcoder can reduce excess requantization error by, for example, adding random noise to the decompressed data or changing transform block sizes.
    Type: Grant
    Filed: June 28, 2001
    Date of Patent: June 29, 2004
    Assignee: Microsoft Corporation
    Inventors: Wei-Ge Chen, Ming-Chieh Lee
  • Patent number: 6754628
    Abstract: Methods and apparatus for facilitating speaker recognition, wherein, from target data that is provided relating to a target speaker and background data that is provided relating to at least one background speaker, a set of cohort data is selected from the background data that has at least one proximate characteristic with respect to the target data. The target data and the cohort data are then combined in a manner to produce at least one new cohort model for use in subsequent speaker verification. Similar methods and apparatus are contemplated for non-voice-based applications, such as verification through fingerprints.
    Type: Grant
    Filed: June 13, 2000
    Date of Patent: June 22, 2004
    Assignee: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Stephane H. Maes, Jiri Navratil
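    The cohort selection step in this abstract can be illustrated with a small sketch. It assumes each speaker is summarized by a numeric vector and that "proximate" means small Euclidean distance; both are assumptions for illustration, and the later step of combining target and cohort data into a new model is not shown.

```python
import math


def select_cohort(target, background, k):
    """Return the names of the k background speakers closest to the target.

    background maps a speaker name to a feature vector. Euclidean distance
    is a hypothetical stand-in for the proximity measure.
    """
    ranked = sorted(background, key=lambda name: math.dist(target, background[name]))
    return ranked[:k]


# Hypothetical two-dimensional speaker summaries.
BACKGROUND = {"s1": [0.0, 0.0], "s2": [5.0, 5.0], "s3": [1.0, 1.0]}
cohort = select_cohort([0.9, 0.9], BACKGROUND, k=2)
```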
  • Patent number: 6748355
    Abstract: A sound synthesis method for modeling and synthesizing dynamic, parameterized sounds. The sound synthesis method yields perceptually convincing sounds and provides flexibility through model parameterization. By manipulating model parameters, a variety of related, but perceptually different sounds can be generated. The result is subtle changes in sounds, in addition to synthesis of a variety of sounds, all from a small set of models. The sound models can change dynamically according to changes in the simulation environment. The method is applicable to both stochastic (impulse-based) and non-stochastic (pitched) sounds.
    Type: Grant
    Filed: January 28, 1998
    Date of Patent: June 8, 2004
    Assignee: Sandia Corporation
    Inventors: Nadine E. Miner, Thomas P. Caudell
  • Patent number: 6745162
    Abstract: The present invention relates to a system and method which serves as a refinement in the criteria used to improve the performance of audio signal processing systems. More specifically, the present invention provides a method by which the frequency and magnitude of artifacts added to audio signal data in an encoder device can be reduced. The encoding device through which the audio signal passes includes a filter bank for filtering source audio data to produce frequency sub-bands, a psycho-acoustic modeler for calculating signal to masking ratios from the frequency sub-bands of the source audio data, and a bit allocator for using the signal to masking ratios to assign a finite number of bits to represent the frequency sub-bands. In the absence of a significant event, the bit allocator performs a pre-bit allocation procedure to prevent artifacts or discontinuities in the encoded audio data.
    Type: Grant
    Filed: October 23, 2000
    Date of Patent: June 1, 2004
    Assignees: Sony Corporation, Sony Electronics, Inc.
    Inventor: Fengduo Hu
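    The general role of a bit allocator driven by signal to masking ratios can be sketched as a greedy loop. This is a generic illustration, not the pre-bit allocation procedure of the patent: the function name is hypothetical, and the figure of roughly 6.02 dB of noise reduction per added bit is a common rule of thumb used here as an assumption.

```python
def allocate_bits(smr_db, total_bits, db_per_bit=6.02):
    """Greedily give bits to the sub-band whose noise is least well masked.

    smr_db holds one signal-to-mask ratio (in dB) per sub-band. Each bit
    is assumed to lower that band's remaining ratio by db_per_bit.
    """
    bits = [0] * len(smr_db)
    remaining = list(smr_db)
    for _ in range(total_bits):
        band = max(range(len(remaining)), key=remaining.__getitem__)
        if remaining[band] <= 0:
            break  # quantization noise is already masked in every band
        bits[band] += 1
        remaining[band] -= db_per_bit
    return bits


# Three sub-bands sharing a budget of four bits.
allocation = allocate_bits([12.0, 3.0, -5.0], total_bits=4)
```

    The loop stops early when every band is already masked, so it may leave part of the budget unused.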
  • Patent number: 6741962
    Abstract: A speech recognition system for recognizing an input voice of a narrow frequency band. The speech recognition system includes: a frequency band converting unit for converting the input voice of the narrow frequency band into a pseudo voice of a wide frequency band which covers an entirety of the narrow frequency band and which is wider than the narrow frequency band.
    Type: Grant
    Filed: March 7, 2002
    Date of Patent: May 25, 2004
    Assignee: NEC Corporation
    Inventor: Kenichi Iso
  • Patent number: 6738739
    Abstract: Voiced speech preprocessing employs waveform interpolation or a harmonic model circuit to smooth a transition region and simplify speech coding. At low bit rates, the speech is coded by a system that maintains a high perceptual quality in the transition region from a voiced (quasi-periodic) portion of the speech signal to an unvoiced (non-periodic) portion of the speech signal. Similarly, the transition region from an unvoiced portion to a voiced portion is conditioned to maintain a high perceptual quality at a low bandwidth. The transition region from one type of voiced region to another type of voiced region is also smoothed. The transition region is smoothed to create a quasi-periodic speech signal.
    Type: Grant
    Filed: February 15, 2001
    Date of Patent: May 18, 2004
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
  • Publication number: 20040088160
    Abstract: A digital audio encoding method using an advanced psychoacoustic model is provided. The audio encoding method includes determining the type of a window according to the characteristic of an input audio signal; generating a complex modified discrete cosine transform (CMDCT) spectrum from the input audio signal according to the determined window type; generating a fast Fourier transform (FFT) spectrum from the input audio signal, by using the determined window type; and performing a psychoacoustic model analysis by using the generated CMDCT spectrum and FFT spectrum.
    Type: Application
    Filed: September 2, 2003
    Publication date: May 6, 2004
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Mathew Manu
  • Patent number: 6725190
    Abstract: A speech reconstruction method and system for converting a series of binned spectra or functions thereof, such as the Mel Frequency Cepstral Coefficients (MFCC), of an original digitized speech signal, into a reconstructed speech signal, where each binned spectrum has a respective pitch value and voicing decision. The binned spectra are derived from the original digitized speech signal at successive instances by multiplying each estimate of the spectral envelope by a predetermined set of frequency domain window functions and computing the integrals thereof. At each respective time instance, harmonic frequencies and weights are generated according to the respective pitch value and voicing decision. Basis functions having bounded supports on the frequency axis are each sampled at all said harmonic frequencies that lie within its support, and are multiplied by respective harmonic weights.
    Type: Grant
    Filed: November 2, 1999
    Date of Patent: April 20, 2004
    Assignee: International Business Machines Corporation
    Inventors: Dan Chazan, Gilad Cohen, Ron Hoory
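    The derivation of a binned spectrum described in this abstract (multiply the spectral envelope by each frequency-domain window, then integrate) can be sketched on a discrete frequency grid. The triangular window shape and the use of a plain sum in place of the integral are assumptions for illustration; the abstract only speaks of a predetermined set of window functions.

```python
def triangular_window(center, half_width, n_bins):
    """Triangular window on a discrete frequency grid (hypothetical shape)."""
    return [max(0.0, 1.0 - abs(k - center) / half_width) for k in range(n_bins)]


def binned_spectrum(envelope, windows):
    """Multiply the envelope by each window and sum, approximating the integral."""
    return [sum(w * e for w, e in zip(window, envelope)) for window in windows]


# A flat envelope seen through two overlapping windows.
ENVELOPE = [1.0] * 5
WINDOWS = [triangular_window(1, 2, 5), triangular_window(2, 2, 5)]
bins = binned_spectrum(ENVELOPE, WINDOWS)
```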