Voiced Or Unvoiced Patents (Class 704/208)
  • Patent number: 7050968
    Abstract: In a speech signal decoding method, information containing at least a sound source signal, gain, and filter coefficients is decoded from a received bit stream. Voiced speech and unvoiced speech of a speech signal are identified using the decoded information. Smoothing processing based on the decoded information is performed for at least either one of the decoded gain and decoded filter coefficients in the unvoiced speech. The speech signal is decoded by driving a filter having the decoded filter coefficients by an excitation signal obtained by multiplying the decoded sound source signal by the decoded gain using the result of the smoothing processing. A speech signal decoding apparatus is also disclosed.
    Type: Grant
    Filed: July 27, 2000
    Date of Patent: May 23, 2006
    Assignee: NEC Corporation
    Inventor: Atsushi Murashima
  • Patent number: 7043428
    Abstract: A method of initializing an ITU Recommendation G.729 Annex B compliant voice activity detection (VAD) device is disclosed, having the steps of (1) determining a first set of running average background noise characteristics in accordance with Recommendation G.729B; (2) determining a second set of running average background noise characteristics; and (3) substituting the second set of running average background noise characteristics for the first set when a specific event occurs. The specific event is a divergence between the first and second sets of running average background noise characteristics.
    Type: Grant
    Filed: August 3, 2001
    Date of Patent: May 9, 2006
    Assignee: Texas Instruments Incorporated
    Inventor: Dunling Li
  • Patent number: 7039581
    Abstract: Linear predictive speech coding system with classification of frames and a hybrid coder using both waveform coding and parametric coding for different classes of frames. Phase alignment for a parametric coder aligns synthesized speech frames with adjacent waveform coder synthesized frames. Zero phase alignment of speech prior to waveform coding aligns synthesized speech frames of a waveform coder with frames synthesized with a parametric coder. Inter-frame interpolation of LP coefficients suppresses artifacts in resultant synthesized speech frames.
    Type: Grant
    Filed: September 22, 2000
    Date of Patent: May 2, 2006
    Assignee: Texas Instruments Incorporated
    Inventors: Jacek Stachurski, Alan V. McCree
  • Patent number: 7035793
    Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.
    Type: Grant
    Filed: October 27, 2004
    Date of Patent: April 25, 2006
    Assignee: Microsoft Corporation
    Inventors: Hao Jiang, Hongjiang Zhang
  • Patent number: 7031912
    Abstract: A speech coding apparatus includes a frequency parameter generating unit that generates LSP coefficients of an input signal. When the input signal is a non-speech signal, it generates the LSP coefficients of the non-speech signal in such a manner that they approach the LSP coefficients of the speech signal. Thus, even when the input signal is the non-speech signal, its LSP coefficients are quantized by referring to the LSP quantization codebook which is specifically prepared for the speech signal. Although a conventional speech coding apparatus has a problem in that even when it transmits the non-speech signal in a good condition, a conventional speech decoding apparatus cannot always decode the non-speech signal correctly, the present speech coding apparatus can solve the problem even when the receiving side uses the conventional speech decoding apparatus.
    Type: Grant
    Filed: July 30, 2001
    Date of Patent: April 18, 2006
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Hisashi Yajima, Shigeaki Suzuki, Hideaki Ebisawa
  • Patent number: 7031916
    Abstract: A method of initializing an ITU Recommendation G.729 Annex B voice activity detection (VAD) device is disclosed, having the steps of (1) extracting a set of parameters from a signal that characterize the signal; (2) calculating an energy measure of the signal from the set of parameters; (3) comparing the energy measure with a reference value; (4) determining an initial value for an average of a noise characteristic of the signal; and (5) counting the number of times the energy measure equals or exceeds the reference level. Also disclosed is a method of converging an ITU Recommendation G.
    Type: Grant
    Filed: June 1, 2001
    Date of Patent: April 18, 2006
    Assignee: Texas Instruments Incorporated
    Inventors: Dunling Li, Daniel C. Thomas, Gokhan Sisli
  • Patent number: 7024353
    Abstract: In a distributed voice recognition system, a back-end pattern matching unit 27 can be informed of voice activity detection information as developed through use of a back-end voice activity detector 25. Although no specific voice activity detection information is developed or forwarded by the front-end of the system, precursor information as developed at the back-end can be used by the voice activity detector to nevertheless ascertain with relative accuracy the presence or absence of voice in a given set of corresponding voice recognition features as developed by the front-end of the system.
    Type: Grant
    Filed: August 9, 2002
    Date of Patent: April 4, 2006
    Assignee: Motorola, Inc.
    Inventor: Tenkasi Ramabadran
  • Patent number: 7016832
    Abstract: A voiced/unvoiced information estimation system uses input spectrum and synthetic spectrum to produce a voicing level spectrum. The estimation system uses a spectrum difference calculation unit to normalize a spectrum difference energy on a per-harmonic-band basis, and further uses a voicing level calculation unit to calculate a voicing level. The voicing level of each harmonic band has a continuous value between 0 and 1. The estimation system is effective in vector quantization of voiced/unvoiced information at a low bit rate. Because it is unnecessary to calculate a threshold for deciding voiced/unvoiced information, a decision anomaly occurring due to the threshold is eliminated, and the accuracy of a voicing level is improved. Furthermore, since a spectrum is represented by mixing a voiced element and an unvoiced element in a harmonic band, the estimation system improves the audio quality of a combined sound.
    Type: Grant
    Filed: July 3, 2001
    Date of Patent: March 21, 2006
    Assignee: LG Electronics, Inc.
    Inventor: Yong Soo Choi
  • Patent number: 6985857
    Abstract: A perceptually weighted speech coder system samples a speech signal and determines its pitch. The speech signal is characterized as fully voiced, partially voiced or weakly voiced. A Lloyd-Max quantizer is trained with the pitch values of those speech signals characterized as being substantially fully voiced. The quantizer quantizes the trained fully voiced pitch values and the pitch values of the non-fully voiced speech signals. The quantizer can also quantize gain values in a similar manner. Sampling is increased for fully voiced signals to improve coding accuracy. This limits application to non-real-time speech storage.
    Type: Grant
    Filed: September 27, 2001
    Date of Patent: January 10, 2006
    Assignee: Motorola, Inc.
    Inventor: Victor Adut
  • Patent number: 6983242
    Abstract: A method for robust speech classification in speech coding and, in particular, for robust classification in the presence of background noise is herein provided. A noise-free set of parameters is derived, thereby reducing the adverse effects of background noise on the classification process. The speech signal is identified as speech or non-speech. A set of basic parameters is derived for the speech frame, then the noise component of the parameters is estimated and removed. If the frame is non-speech, the noise estimations are updated. All the parameters are then compared against a predetermined set of thresholds. Because the background noise has been removed from the parameters, the set of thresholds is largely unaffected by any changes in the noise. The frame is classified into any number of classes, thereby emphasizing the perceptually important features by performing perceptual matching rather than waveform matching.
    Type: Grant
    Filed: August 21, 2000
    Date of Patent: January 3, 2006
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Jes Thyssen
  • Patent number: 6975984
    Abstract: A technique for separating an acoustic signal into a voiced (V) component corresponding to an electrolaryngeal source and an unvoiced (U) component corresponding to a turbulence source. The technique can be used to improve the quality of electrolaryngeal speech, and may be adapted for use in a special purpose telephone. A method according to the invention extracts a segment of consecutive values from the original stream of numerical values, and performs a discrete Fourier transform on this first group of values. Next, a second group of values is extracted from components of the discrete Fourier transform result which correspond to an electrolaryngeal fixed repetition rate, F0, and harmonics thereof. An inverse Fourier transform is applied to the second group of values, to produce a representation of a segment of the V component. Multiple V component segments are then concatenated to form a V component sample stream.
    Type: Grant
    Filed: February 7, 2001
    Date of Patent: December 13, 2005
    Assignee: Speech Technology and Applied Research Corporation
    Inventors: Joel M. MacAuslan, Venkatesh Chari, Richard Goldhor, Carol Espy-Wilson
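The harmonic extraction described in the abstract of patent 6975984 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names are hypothetical, a naive DFT stands in for whatever transform the inventors use, and it assumes that F0 falls exactly on an integer DFT bin (`f0_bin`) and that the segment length is a multiple of that bin index.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT, returning only the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def voiced_segment(x, f0_bin):
    """Keep only the DFT bins at F0 and its harmonics, then transform back.

    Assumption: F0 sits exactly on bin `f0_bin`. Mirrored (negative-frequency)
    bins are kept too so that the result stays real; the DC bin is dropped.
    """
    spectrum = dft(x)
    n = len(spectrum)
    kept = [spectrum[k]
            if k != 0 and (k % f0_bin == 0 or (n - k) % f0_bin == 0)
            else 0
            for k in range(n)]
    return idft(kept)
```

Concatenating the outputs of `voiced_segment` for successive segments would give a V component stream in the sense of the abstract; windowing and overlap handling are left out of this sketch.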
  • Patent number: 6970819
    Abstract: The principal object of this invention is to provide a suitable control method for closing length with respect to phonemes (such as unvoiced plosive consonants) having a closing interval, and as a result an improved rule-based speech synthesis device is provided. A phoneme type judgement part 201 judges whether the phoneme in question is a vowel or consonant and, in the case of a consonant, judges whether or not it is a consonant that anteriorly has a closing interval. As a result, it operates a vowel length estimation part 202 when it judges that the phoneme is a vowel and operates a consonant length estimation part 205 when it judges that the phoneme is a consonant, and when it has judged that this phoneme anteriorly has a closing interval, it operates a closing length estimation part 208, whereby the respective time lengths are estimated. After that, the estimated time lengths are set by vowel length setting part 203, consonant length setting part 206 and closing length setting part 209, respectively.
    Type: Grant
    Filed: October 27, 2000
    Date of Patent: November 29, 2005
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Yukio Tabei
  • Patent number: 6952669
    Abstract: A device is presented that includes an encoder. The encoder compresses a plurality of signals at variable frame rates based on a plurality of prioritized parameters to reduce signal bandwidth while preserving perceptual signal quality. Also presented is a device that includes a decoder. The decoder decompresses a plurality of compressed signals at variable rates based on a plurality of prioritized parameters to reduce signal bandwidth while preserving perceptual signal quality.
    Type: Grant
    Filed: January 12, 2001
    Date of Patent: October 4, 2005
    Assignee: Telecompression Technologies, Inc.
    Inventor: Sandra Hutchins
  • Patent number: 6915257
    Abstract: This invention presents a voicing determination algorithm for classification of a speech signal segment as voiced or unvoiced. The algorithm is based on a normalized autocorrelation where the length of the window is proportional to the pitch period. The speech segment to be classified is further divided into a number of sub-segments, and the normalized autocorrelation is calculated for each sub-segment. If a certain number of the normalized autocorrelation values is above a predetermined threshold, the speech segment is classified as voiced. To improve the performance of the voicing determination algorithm in unvoiced to voiced transients, the normalized autocorrelations of the last sub-segments are emphasized. The performance of the voicing decision algorithm can also be enhanced by utilizing the possible lookahead information.
    Type: Grant
    Filed: December 21, 2000
    Date of Patent: July 5, 2005
    Assignee: Nokia Mobile Phones Limited
    Inventors: Ari Heikkinen, Samuli Pietila, Vesa Ruoppila
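The sub-segment voting described in the abstract of patent 6915257 might be sketched as follows. This is an illustration under stated assumptions, not the patented algorithm: the function names, the number of sub-segments, the threshold of 0.6, and the vote count are all hypothetical, the window length is taken equal to the pitch period (a proportionality factor of 1), and the emphasis on the last sub-segments and the lookahead are omitted.

```python
import math

def normalized_autocorr(x, start, length, lag):
    """Normalized correlation between x[start:start+length] and the same
    window delayed by `lag` samples. Returns a value in [-1, 1]."""
    a = x[start:start + length]
    b = x[start - lag:start - lag + length]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0

def is_voiced(x, seg_start, seg_len, pitch_period,
              n_sub=4, threshold=0.6, min_count=3):
    """Classify a segment as voiced when at least `min_count` of its
    sub-segments have a normalized autocorrelation above `threshold`."""
    sub_len = seg_len // n_sub
    window = pitch_period  # assumed: window length equals one pitch period
    count = 0
    for i in range(n_sub):
        start = seg_start + i * sub_len
        # Skip sub-segments whose windows would fall outside the signal.
        if start - pitch_period < 0 or start + window > len(x):
            continue
        if normalized_autocorr(x, start, window, pitch_period) > threshold:
            count += 1
    return count >= min_count
```

A strictly periodic signal evaluated at its own period yields a correlation near 1 in every sub-segment, so the vote passes; a signal that flips sign every pitch period yields a correlation of -1 and is rejected.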
  • Patent number: 6915256
    Abstract: A system, method and computer readable medium for quantizing pitch information of audio is disclosed. The method includes capturing audio representing a numbered frame of a plurality of numbered frames. The method further includes calculating a class of the frame, wherein a class is any one of a voiced or unvoiced class. If the frame is a voiced class, a pitch is calculated for the frame. If the frame is an even numbered frame and a voiced class, a codeword of a first length is calculated by absolutely quantizing the frame pitch. If the frame is an odd numbered frame and a voiced class and a reliable frame is available, a codeword of a second length is calculated by differentially quantizing the frame pitch. If there is no reliable frame available, a codeword of the second length is calculated by absolutely quantizing the frame pitch.
    Type: Grant
    Filed: February 7, 2003
    Date of Patent: July 5, 2005
    Assignees: Motorola, Inc., International Business Machines Corporation
    Inventors: Tenkasi V. Ramabadran, Alexander Sorin
  • Patent number: 6912495
    Abstract: An improved speech model and methods for estimating the model parameters, synthesizing speech from the parameters, and quantizing the parameters are disclosed. The improved speech model allows a time and frequency dependent mixture of quasi-periodic, noise-like, and pulse-like signals. For pulsed parameter estimation, an error criterion with reduced sensitivity to time shifts is used to reduce computation and improve performance. Pulsed parameter estimation performance is further improved using the estimated voiced strength parameter to reduce the weighting of frequency bands which are strongly voiced when estimating the pulsed parameters. The voiced, unvoiced, and pulsed strength parameters are quantized using a weighted vector quantization method using a novel error criterion for obtaining high quality quantization. The fundamental frequency and pulse position parameters are efficiently quantized based on the quantized strength parameters.
    Type: Grant
    Filed: November 20, 2001
    Date of Patent: June 28, 2005
    Assignee: Digital Voice Systems, Inc.
    Inventors: Daniel W. Griffin, John C. Hardwick
  • Patent number: 6901362
    Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.
    Type: Grant
    Filed: April 19, 2000
    Date of Patent: May 31, 2005
    Assignee: Microsoft Corporation
    Inventors: Hao Jiang, Hongjiang Zhang
  • Patent number: 6889186
    Abstract: A system for processing a speech signal to enhance signal intelligibility identifies portions of the speech signal that include sounds that typically present intelligibility problems and modifies those portions in an appropriate manner. First, the speech signal is divided into a plurality of time-based frames. Each of the frames is then analyzed to determine a sound type associated with the frame. Selected frames are then modified based on the sound type associated with the frame or with surrounding frames. For example, the amplitude of frames determined to include unvoiced plosive sounds may be boosted as these sounds are known to be important to intelligibility and are typically harder to hear than other sounds in normal speech. In a similar manner, the amplitudes of frames preceding such unvoiced plosive sounds can be reduced to better accentuate the plosive. Such techniques will make these sounds easier to distinguish upon subsequent playback.
    Type: Grant
    Filed: June 1, 2000
    Date of Patent: May 3, 2005
    Assignee: Avaya Technology Corp.
    Inventor: Paul Roller Michaelis
  • Patent number: 6876953
    Abstract: A method to process narrowband signals includes dividing the signal into segments of length N, where N optimizes filter bandwidth, FFT size, processing and memory. Each N-length segment is processed sequentially by filtering, a FFT and a peak detector that identifies the N-length segment's K largest spectral components. The frequency, bandwidth and power for the K largest spectral components are stored sequentially as N-processed data. After processing multiple N-length segments, reconstructing individual frequency spectrums for J continuous segments of the N-processed data, mapping the J reconstructed spectrums to a single spectrum, and applying the peak detector to the composite spectrum to separately store the single spectrum's K largest frequencies, with powers and bandwidths, as (N×J)-processed data. The N-length data is processed in groups of J until all N-length data is reprocessed. J may have multiple values, generating multiple processed data sets.
    Type: Grant
    Filed: April 20, 2000
    Date of Patent: April 5, 2005
    Assignee: The United States of America as represented by the Secretary of the Navy
    Inventor: Scott D. Fisher
  • Patent number: 6865529
    Abstract: A method of estimating the pitch of a speech signal comprises the steps of dividing the speech signal into segments, calculating for each segment a conformity function, and detecting peaks in the conformity function. The method further comprises the steps of estimating an average distance between said peaks, and using the estimated average distance as an estimate of the pitch. In this way a method less complex than prior art methods, and thus suitable for small digital signal processors, is provided. The method also avoids the pitch halving situation. When the method is based on the fact that the identified peaks in the conformity function show a periodic behavior and that the true pitch period actually corresponds to the distance between the peaks, a simpler algorithm is achieved which provides the true pitch period independent of the occurrence of pitch halving, pitch doubling, etc. A similar device is also provided.
    Type: Grant
    Filed: April 5, 2001
    Date of Patent: March 8, 2005
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Cecilia Brandel, Henrik Johannisson
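The peak-spacing idea in the abstract of patent 6865529 can be shown with a short Python sketch. It assumes the conformity function has already been computed and is passed in as a plain list; the abstract does not define that function, so none is implemented here. The names `find_peaks`, `pitch_from_peaks`, and the `min_height` parameter are hypothetical.

```python
def find_peaks(values, min_height):
    """Indices of interior local maxima that reach at least `min_height`."""
    return [i for i in range(1, len(values) - 1)
            if values[i] > values[i - 1]
            and values[i] >= values[i + 1]
            and values[i] >= min_height]

def pitch_from_peaks(values, min_height):
    """Estimate the pitch period as the average distance between peaks.

    Returns None when fewer than two peaks are found.
    """
    peaks = find_peaks(values, min_height)
    if len(peaks) < 2:
        return None
    # The mean of consecutive peak distances telescopes to this expression.
    return (peaks[-1] - peaks[0]) / (len(peaks) - 1)
```

Because the estimate depends on the spacing between peaks rather than on the position of the single largest peak, picking a peak at twice or half the true lag does not by itself change the result, which is the property the abstract highlights.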
  • Publication number: 20040267525
    Abstract: Provided are an apparatus for and a method of determining a transmission rate in speech transcoding. An input frame is classified as speech or silence based on a first threshold value that is predetermined for at least one of a fixed code-book gain value, an adaptive code-book gain value, a noise to signal rate, and a pitch delay that correspond to an input parameter of a coded bit stream. An input frame classified as voiced is classified as stationary or non-stationary based on a third threshold value that is predetermined for the amount of change in the ACBG value or a difference between the minimum and maximum pitch delays. An input frame, classified as voiced by a voiced/unvoiced classifying portion, is classified as voiced or non-stationary based on a class of a previous frame.
    Type: Application
    Filed: December 4, 2003
    Publication date: December 30, 2004
    Inventors: Eung Don Lee, Hyun Woo Kim, Do Young Kim, Chang Dong Yoo, Seong Ho Seo, Dal Won Jang
  • Patent number: 6829578
    Abstract: Robust acoustic tone features are achieved first by the introduction of on-line, look-ahead trace back of the fundamental frequency (F0) contour with adaptive pruning; this fundamental frequency serves as the signal preprocessing front-end. The F0 contour is subsequently decomposed into lexical tone effect, phrase intonation effect, and random effect by means of a time-variant, weighted moving average (MA) filter in conjunction with weighted (placing more emphasis on vowels) least squares of the F0 contour. The intonation effect is removed by subtraction from the F0 contour under a superposition assumption. The acoustic tone features are defined in two parts. The first part is the coefficients of the second order weighted regression of the de-intonation of the F0 contour over neighbouring frames. The second part deals with the degree of the periodicity of the signal, which is given by the coefficients of the second order regression of the auto-correlation.
    Type: Grant
    Filed: July 9, 2001
    Date of Patent: December 7, 2004
    Assignee: Koninklijke Philips Electronics, N.V.
    Inventors: Chang-Han Huang, Frank Torsten Bernd Seide
  • Publication number: 20040243402
    Abstract: The spectrum parameter calculator circuit 100 divides a decoded reproduction speech signal into frames and computes a spectrum parameter for each frame. The coefficient calculator circuit 130 shifts a frequency of the spectrum parameter to higher one, and then determines a filter coefficient extended in frequency bandwidth to output it to the composition filter circuit 170. The adder 160 outputs a sound-source signal, which results from addition of a noise signal having a duration equal to the frame length and an adaptive code vector based on a past sound-source signal, to the composition filter circuit 170. The adder 190 uses a sound-source signal extended in frequency bandwidth and adds the signal to a signal resulting from conversion of the reproduction speech signal with a sampling frequency having a higher frequency component to reproduce and output a speech signal extended in frequency bandwidth.
    Type: Application
    Filed: July 26, 2004
    Publication date: December 2, 2004
    Inventor: Kazunori Ozawa
  • Publication number: 20040230422
    Abstract: A method and related apparatus for determining whether a voice signal is mixed with a vocal signal. When applied to a multi-channel system, the method includes: counting the number of zero-crossings of a sound signal of each channel within a given period; and, if the zero-crossing number of the sound signal of a first channel is lower than that of the sound signal of a second channel by a predetermined threshold, determining that the sound signal of the first channel is mixed with a vocal signal.
    Type: Application
    Filed: October 5, 2003
    Publication date: November 18, 2004
    Inventor: Gin-Der Wu
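The zero-crossing comparison in the abstract of publication 20040230422 is simple enough to sketch directly. The function names below are hypothetical, and two details are assumptions the abstract leaves open: a sample equal to zero is treated as non-negative, and "lower by a predetermined threshold" is read as a difference of at least `threshold`.

```python
def zero_crossings(samples):
    """Count sign changes between consecutive samples.

    Assumption: a sample equal to zero counts as non-negative.
    """
    return sum(1 for a, b in zip(samples, samples[1:])
               if (a >= 0) != (b >= 0))

def first_channel_has_vocal(ch1, ch2, threshold):
    """True when channel 1 has fewer zero crossings than channel 2
    by at least `threshold` over the same period."""
    return zero_crossings(ch2) - zero_crossings(ch1) >= threshold
```

Both channels should cover the same time period, as in the abstract, so that the counts are comparable.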
  • Patent number: 6820052
    Abstract: A low-bit-rate coding technique for unvoiced segments of speech includes the steps of extracting high-time-resolution energy coefficients from a frame of speech, quantizing the energy coefficients, generating a high-time-resolution energy envelope from the quantized energy coefficients, and reconstituting a residue signal by shaping a randomly generated noise vector with quantized values of the energy envelope. The energy envelope may be generated with a linear interpolation technique. A post-processing measure may be obtained and compared with a predefined threshold to determine whether the coding algorithm is performing adequately.
    Type: Grant
    Filed: July 17, 2002
    Date of Patent: November 16, 2004
    Assignee: Qualcomm Incorporated
    Inventors: Amitava Das, Sharath Manjunath
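The steps listed in the abstract of patent 6820052 (energy coefficients, quantization, interpolated envelope, noise shaping) can be sketched as follows. Everything specific here is an assumption for illustration: the function names, the use of mean sub-block energy as the "energy coefficient", the uniform quantizer, and the Gaussian noise source are not taken from the patent.

```python
import math
import random

def subblock_energies(frame, n_blocks):
    """Mean energy of each of `n_blocks` equal sub-blocks of the frame."""
    size = len(frame) // n_blocks
    return [sum(s * s for s in frame[i * size:(i + 1) * size]) / size
            for i in range(n_blocks)]

def quantize(values, step):
    """Uniform scalar quantizer (a stand-in for the codec's real quantizer)."""
    return [round(v / step) * step for v in values]

def interpolate_envelope(coeffs, length):
    """Linearly interpolate per-block energies to one value per sample,
    treating each coefficient as located at the centre of its block."""
    n = len(coeffs)
    size = length / n
    envelope = []
    for t in range(length):
        pos = (t + 0.5) / size - 0.5
        pos = min(max(pos, 0.0), n - 1.0)  # hold the end values flat
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        envelope.append((1 - frac) * coeffs[lo] + frac * coeffs[hi])
    return envelope

def shape_noise(envelope, rng):
    """Scale unit-variance Gaussian noise so its local power follows
    the energy envelope."""
    return [math.sqrt(e) * rng.gauss(0.0, 1.0) for e in envelope]
```

A decoder-side use would be `shape_noise(interpolate_envelope(quantized, frame_len), random.Random(seed))`, with `quantized` being the received coefficients; the post-processing measure mentioned in the abstract is not modeled.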
  • Patent number: 6816832
    Abstract: A comfort noise block that includes a hangover period and comfort noise parameters is transmitted in such a manner that it is not interrupted by other messages, such as FACCH messages. This is accomplished in a mobile station by a determination of whether any FACCH messages are required to be transmitted. If such FACCH messages exist, a further determination may be made as to which transmission can be made in the shortest time (i.e., the FACCH message or messages or the comfort noise parameters message), and this transmission is made first. In any event, the comfort noise parameters block is transmitted without interruption. In a further embodiment of this invention the comfort noise parameters message is transmitted by being concatenated with another message, such as a neighbor channel measurement results message, so as to reduce overhead, conserve bandwidth, and reduce power consumption.
    Type: Grant
    Filed: June 11, 2001
    Date of Patent: November 9, 2004
    Assignee: Nokia Corporation
    Inventors: Seppo Alanara, Pekka Kapanen
  • Patent number: 6810377
    Abstract: A lost frame recovery technique for LPC-based systems employs interpolation of parameters from previous and subsequent good frames, selective attenuation of frame energy when the energy of a subframe exceeds a threshold, and energy tapering in the presence of multiple successive lost frames.
    Type: Grant
    Filed: June 19, 1998
    Date of Patent: October 26, 2004
    Assignee: Comsat Corporation
    Inventors: Grant Ian Ho, Marion Baraniecki, Suat Yeldener
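The three mechanisms named in the abstract of patent 6810377 (interpolation across a gap, selective attenuation, energy tapering) can be illustrated with a minimal Python sketch. The function names, the linear interpolation weights, the attenuation factor, and the geometric decay are all hypothetical choices; the abstract does not specify them.

```python
def interpolate_params(prev_good, next_good, k, n_lost):
    """Parameters for the k-th (1-based) of `n_lost` consecutive lost frames,
    linearly interpolated between the surrounding good frames."""
    w = k / (n_lost + 1)
    return [(1 - w) * p + w * q for p, q in zip(prev_good, next_good)]

def attenuate_if_loud(subframe, threshold, factor=0.5):
    """Scale a subframe down only when its energy exceeds `threshold`."""
    energy = sum(s * s for s in subframe)
    if energy > threshold:
        return [s * factor for s in subframe]
    return list(subframe)

def taper_gain(k, decay=0.8):
    """Gain applied to the k-th successive lost frame (assumed geometric)."""
    return decay ** k
```

For a single lost frame, `interpolate_params(prev, nxt, 1, 1)` returns the midpoint of the two good frames; with longer gaps the weights move linearly from the previous frame toward the next one.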
  • Patent number: 6807526
    Abstract: At least one coded binary audio flux organized into frames is created from digital audio signals which were coded by transforming them from the time domain into the frequency domain. Transform coefficients of the signals in the frequency domain are quantized and coded according to a set of quantizers. The set is determined from a set of values extracted from the signals. The values make up selection parameters of the set of quantizers. The parameters are also present in the frames. A partial decoding state decodes then dequantizes transform coefficients produced by the coding based on a set of quantizers determined from the selection parameters contained in the frames of the coded binary audio flux or of each coded binary audio flux. The partially decoded frames are subjected to processing in the frequency domain. The thus-processed frames are then made available for use in a later utilization step.
    Type: Grant
    Filed: December 8, 2000
    Date of Patent: October 19, 2004
    Assignee: France Telecom S.A.
    Inventors: Abdellatif Benjelloun Touimi, Yannick Mahieux, Claude Lamblin
  • Patent number: 6804646
    Abstract: A method and an apparatus for processing a sound signal in which a useful signal and an interference signal are specified, the sound signal being transformed into the frequency domain and a change in the profile of the frequency being represented by an envelope for at least one frequency over a time. By segmenting the envelope, a maximum is obtained for each segment, the smallest maximum, weighted by a factor, being subtracted from the sound signal. It is also possible to take account of the minimum for the purpose of reducing the interference signal.
    Type: Grant
    Filed: September 19, 2000
    Date of Patent: October 12, 2004
    Assignee: Siemens Aktiengesellschaft
    Inventor: Tobias Schneider
  • Publication number: 20040167776
    Abstract: An apparatus and method for shaping the speech signal in consideration of its energy distribution. The shaping apparatus includes an encoder for receiving and encoding an unvoiced speech or background noise, dividing it into frequency bands according to its characteristics, performing comparison of energies of the frequency bands, and setting energy intensity flags according to the comparison result; and a decoder for shaping the data encoded by the encoder and the energy intensity flags. The present invention employs the shaping method in consideration of characteristics of the original input speech signal, and uses the shaping filter only using information about energy distribution without adding a large amount of bits to the signal that is difficult to synthesize, such as an unvoiced speech and background noise.
    Type: Application
    Filed: September 5, 2003
    Publication date: August 26, 2004
    Inventors: Eun-Kyoung Go, Dae-Hwan Hwang
  • Publication number: 20040167775
    Abstract: Estimating a speech signal pitch frequency by determining a speech signal frame line spectrum including spectral lines having respective line amplitudes and frequencies, selecting a predefined number of spectral lines having highest amplitudes, fewer than the total number of the spectral lines, calculating a preliminary utility function over a pitch frequency range to provide a preliminary utility function value for each pitch frequency in the range measuring the compatibility of the selected spectral lines with the pitch frequency, identifying a predefined number of preliminary pitch frequency candidates at least partly responsive to the preliminary utility function, where each candidate is a local maximum of the preliminary utility function, calculating a final utility score for each of the candidates, and selecting any of the candidates to be an estimated pitch frequency of the speech signal at least partly responsive to any of the final utility scores.
    Type: Application
    Filed: February 24, 2003
    Publication date: August 26, 2004
    Applicant: International Business Machines Corporation
    Inventor: Alexander Sorin
  • Patent number: 6775650
    Abstract: The invention concerns a method for conditioning a digital speech signal (s) processed by successive frames, which consists in carrying out a harmonic analysis to estimate the pitch on each frame where it has speech activity, and in oversampling at an oversampling frequency (fe) which is a multiple of the estimated pitch.
    Type: Grant
    Filed: June 2, 2000
    Date of Patent: August 10, 2004
    Assignee: Matra Nortel Communications
    Inventors: Philip Lockwood, Stéphane Lubiarz
  • Publication number: 20040153315
    Abstract: This invention relates to a non-intrusive speech quality assessment system.
    Type: Application
    Filed: January 15, 2004
    Publication date: August 5, 2004
    Applicant: PSYTECHNICS LIMITED
    Inventors: Richard Reynolds, Simon Broom, Paul Barrett
  • Patent number: 6754620
    Abstract: A system and method is provided for rendering data indicative of delays associated with enabling and/or disabling an analog-to-digital conversion system employed by a telephony communication network. The system of the present invention utilizes a display device and an interface manager. The interface manager receives data indicative of power levels at various frequencies and times of signals received by a transceiver that is communicating via the conventional telephony communication network. The interface manager then renders a graphical display via the display device based on the received data. The graphical display may include clusters, in which each of the clusters is associated with a particular range of power levels. By analyzing the clusters, a user can determine the delays associated with enabling and/or disabling the analog-to-digital conversion system. The graphical display may also include indicators that may be used to determine the foregoing delays.
    Type: Grant
    Filed: March 29, 2000
    Date of Patent: June 22, 2004
    Assignee: Agilent Technologies, Inc.
    Inventor: Samuel M Bauer
  • Patent number: 6738739
    Abstract: Voiced speech preprocessing employs waveform interpolation or a harmonic model circuit to smooth a transition region and simplify speech coding. At low bit rates, the speech is coded by a system that maintains a high perceptual quality in the transition region from a voiced (quasi-periodic) portion of the speech signal to an unvoiced (non-periodic) portion of the speech signal. Similarly, the transition region from an unvoiced portion to a voiced portion is conditioned to maintain a high perceptual quality at a low bandwidth. The transition region from one type of voiced region to another type of voiced region is also smoothed. The transition region is smoothed to create a quasi-periodic speech signal.
    Type: Grant
    Filed: February 15, 2001
    Date of Patent: May 18, 2004
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
  • Patent number: 6704702
    Abstract: A speech encoding method, apparatus and program wherein an input speech signal is divided into a plurality of frames each having a predetermined length, each of the frames is subdivided into a plurality of subframes, a predictive pitch period of a subframe in a to-be-encoded current frame is obtained by using pitch periods of at least two frames of the current frame and past and future frames with respect to the current frame; a pitch period of a subframe in the current frame is obtained by using the predictive pitch period, a relative pitch pattern codebook storing a plurality of relative pitch patterns representing fluctuations in pitch periods of a plurality of subframes is prepared, and a change in pitch period of plural subframes is expressed with one relative pitch pattern selected from the relative pitch pattern codebook.
    Type: Grant
    Filed: December 1, 2000
    Date of Patent: March 9, 2004
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masahiro Oshikiri, Kimio Miseki, Masami Akamine
  • Patent number: 6697776
    Abstract: A digitized signal detection system in which the encoding bit rate is changed dynamically to provide encoding for different signal types and formats at bit rates optimized to properly reconstruct the input signal, whether speech or non-speech, so that signals of different character can be transferred on a frame-by-frame basis. A change of encoding format can make the system a speech or music recognizer, depending on what is to be listened for. The system has three basic components: a recognizer, which categorizes the type of input signal; an evaluator, which evaluates the quality category of the reconstructed signal; and a recommender, which recommends, based on that quality, a change to an encoding standard that provides improved quality. The dynamic signal detector receives the input signal directly and extracts the parameters for evaluation. These parameters are tested, and a determination is made as to whether a switch of standards is required to improve the reconstructed signal.
    Type: Grant
    Filed: July 31, 2000
    Date of Patent: February 24, 2004
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Gilles G. Fayad, Huan-Yu Su
  • Patent number: 6691092
    Abstract: A system determines a voicing measure as a measure of the degree of signal periodicity and uses the determined voicing measure to quantize the spectral magnitude of the slowly evolving waveform (SEW) and the modeling of the SEW and rapidly evolving waveform (REW) phase spectra.
    Type: Grant
    Filed: April 4, 2000
    Date of Patent: February 10, 2004
    Assignee: Hughes Electronics Corporation
    Inventors: Bangalore R. Udaya Bhaskar, Srinivas Nandkumar, Kumar Swaminathan, Gaguk Zakaria
  • Patent number: 6691081
    Abstract: A digital signal processor for processing data including voice messaging data that may have both voiced and unvoiced speech components utilizes computer routines stored in a memory used by the digital signal processor. The computer routines provide for controlling at least a portion of a selective call receiver; receiving and decoding data received at the selective call receiver; comparing the addresses received at the selective call receiver with addresses stored in a memory location coupled to the digital signal processor; controlling voicing including both voiced and unvoiced speech components; and generating a pitch wave using an inverse discrete Fourier transform and resampling the pitch wave to provide a time-domain voiced speech component.
    Type: Grant
    Filed: April 28, 2000
    Date of Patent: February 10, 2004
    Assignee: Motorola, Inc.
    Inventors: Jian-Cheng Huang, Kenneth D. Finlon, Floyd D. Simpson
  • Patent number: 6691085
    Abstract: A method and system for encoding and decoding an input signal, wherein the input signal is divided into a higher frequency band and a lower frequency band in the encoding and decoding processes, and wherein the decoding of the higher frequency band is carried out by using an artificial signal along with speech related parameters obtained from the lower frequency band. In particular, the artificial signal is scaled before it is transformed into an artificial wideband signal containing colored noise in both the lower and the higher frequency band. Additionally, voice activity information is used to define speech periods and non-speech periods of the input signal. Based on the voice activity information, different weighting factors are used to scale the artificial signal in speech periods and non-speech periods.
    Type: Grant
    Filed: October 18, 2000
    Date of Patent: February 10, 2004
    Assignee: Nokia Mobile Phones Ltd.
    Inventors: Jani Rotola-Pukkila, Hannu Mikkola, Janne Vainio
  • Patent number: 6665637
    Abstract: The present invention relates to the concealment of errors in decoded acoustic signals caused by encoded data representing the acoustic signals being partially lost or damaged during transmission over a transmission medium. In case of lost data or received damaged data, a secondary reconstructed signal is produced on the basis of a primary reconstructed signal. This signal has a spectrally adjusted spectrum (Z4E), such that it deviates less with respect to spectral shape from a spectrum (Z3) of a previously reconstructed signal produced from previously received data than a spectrum (Z′4) of the primary reconstructed signal.
    Type: Grant
    Filed: October 19, 2001
    Date of Patent: December 16, 2003
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventor: Stefan Bruhn
  • Publication number: 20030216908
    Abstract: An estimate is made of the power of a speech portion of a speech signal that includes speech portions separated by non-speech portions, the power for the speech portion being estimated based on a power envelope that spans the speech portion. The gain of an automatic gain control is not adjusted during the speech portions.
    Type: Application
    Filed: May 16, 2002
    Publication date: November 20, 2003
    Inventors: Alexander Berestesky, David W. Duehren
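The gain-control idea in this abstract can be illustrated with a minimal Python sketch. It is not the applicants' implementation: the function name `agc_gains`, the per-frame power inputs, the externally supplied speech/non-speech flags, and the choice of the peak of the per-frame power envelope as the power estimate are all assumptions made for illustration. The one property taken from the abstract is that the gain stays fixed while a speech portion is in progress.

```python
def agc_gains(frame_powers, is_speech, target_power, initial_gain=1.0):
    """Return one gain per frame, holding the gain constant during speech.

    frame_powers: per-frame signal power (hypothetical input format).
    is_speech:    per-frame booleans from some external speech detector.
    target_power: desired power for a speech portion after gain is applied.
    """
    gain = initial_gain
    gains = []
    portion = []  # frame powers of the speech portion currently in progress
    for power, speech in zip(frame_powers, is_speech):
        if speech:
            # Collect the power envelope; the gain is not touched here.
            portion.append(power)
        elif portion:
            # The speech portion has just ended: estimate its power from the
            # envelope (assumption: use the envelope peak) and update the gain.
            estimate = max(portion)
            if estimate > 0:
                gain = (target_power / estimate) ** 0.5
            portion = []
        gains.append(gain)
    return gains
```

In this sketch a new gain takes effect only from the first non-speech frame after a speech portion, so every frame within a speech portion is scaled by the same factor.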
  • Patent number: 6647280
    Abstract: A signal processing method, preferably for extracting a fundamental period from a noisy, low-frequency signal, is disclosed. The signal processing method generally comprises calculating a numerical transform for a number of selected periods by multiplying signal data by discrete points of a sine and a cosine wave of varying period and summing the results. The period of the sine and cosine waves are preferably selected to have a period substantially equivalent to the period of interest when performing the transform.
    Type: Grant
    Filed: January 14, 2002
    Date of Patent: November 11, 2003
    Assignee: OB Scientific, Inc.
    Inventors: Dennis E. Bahr, James L. Reuss
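A minimal Python sketch of the transform described in this abstract follows. The abstract specifies multiplying the signal data by points of a sine and a cosine wave of a chosen period and summing the results; combining the two sums into a squared magnitude and picking the candidate period with the largest value are assumptions added here, as are the function names `period_power` and `fundamental_period`.

```python
import math


def period_power(signal, period):
    """Correlate the signal with a sine and a cosine of the given period.

    The period is expressed in samples. Returns the squared magnitude of the
    two sums (an assumed way of combining them).
    """
    sine_sum = sum(x * math.sin(2 * math.pi * n / period)
                   for n, x in enumerate(signal))
    cosine_sum = sum(x * math.cos(2 * math.pi * n / period)
                     for n, x in enumerate(signal))
    return sine_sum ** 2 + cosine_sum ** 2


def fundamental_period(signal, candidate_periods):
    """Return the candidate period (in samples) with the largest response."""
    return max(candidate_periods, key=lambda p: period_power(signal, p))
```

Because only the selected periods are evaluated, the cost grows with the number of candidates rather than with a full spectrum, which suits a search over a narrow band of plausible periods.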
  • Patent number: 6640209
    Abstract: A closed-loop, multimode, mixed-domain linear prediction (MDLP) speech coder includes a high-rate, time-domain coding mode, a low-rate, frequency-domain coding mode, and a closed-loop mode-selection mechanism for selecting a coding mode for the coder based upon the speech content of frames input to the coder. Transition speech (i.e., from unvoiced speech to voiced speech, or vice versa) frames are encoded with the high-rate, time-domain coding mode, which may be a CELP coding mode. Voiced speech frames are encoded with the low-rate, frequency-domain coding mode, which may be a harmonic coding mode. Phase parameters are not encoded by the frequency-domain coding mode, and are instead modeled in accordance with, e.g., a quadratic phase model. For each speech frame encoded with the frequency-domain coding mode, the initial phase value is taken to be the initial phase value of the immediately preceding speech frame encoded with the frequency-domain coding mode.
    Type: Grant
    Filed: February 26, 1999
    Date of Patent: October 28, 2003
    Assignee: Qualcomm Incorporated
    Inventor: Amitava Das
  • Publication number: 20030195744
    Abstract: Linear predictive coding (LPC) filter parameters are determined for use in encoding a voice signal. Samples of a speech signal using a z-transform function are pre-emphasized. The pre-emphasized samples are analyzed to produce LPC reflection coefficients. The LPC reflection coefficients are quantized by a voiced quantizer and by an unvoiced quantizer producing sets of quantized reflection coefficients. Each set is converted into respective spectral coefficients. The set which produces a smaller log-spectral distance is determined. The determined set is selected to encode the voice signal.
    Type: Application
    Filed: May 28, 2003
    Publication date: October 16, 2003
    Applicant: InterDigital Technology Corporation
    Inventors: Daniel Lin, Brian M. McCarthy
  • Publication number: 20030101048
    Abstract: A system for suppressing background noise in voice signals uses an adaptive filter based on the long-time and short-time statistical characteristics of the voice signals. Because these statistical characteristics vary over time, the coefficients of the filter are adjusted according to the variation of the voice signals to eliminate unwanted background noise. The signals then pass through a high-frequency booster, which compensates for their high-frequency attenuation so as to increase the brightness of the voice signals and obtain the best voice quality.
    Type: Application
    Filed: October 30, 2001
    Publication date: May 29, 2003
    Applicant: Chunghwa Telecom Co., Ltd.
    Inventor: Chia-Horng Liu
  • Publication number: 20030097257
    Abstract: A sound signal processing method includes emphasizing a first sound signal based on a plurality of sound signals produced by a plurality of microphones arranged at intervals, determining a frequency by an arrival direction of a second sound signal other than the first sound signal and the interval between the microphones, and removing a frequency band including the frequency determined, from the first sound signal emphasized.
    Type: Application
    Filed: November 22, 2002
    Publication date: May 22, 2003
    Inventors: Tadashi Amada, Takanori Yamamoto
  • Publication number: 20030093265
    Abstract: A method and system for Chinese speech pitch extraction is disclosed. The method and system for Chinese speech pitch extraction comprises: pre-computing an anti-bias auto-correlation of a Hamming window function; for at least one frame, saving a first candidate as an unvoiced candidate, and detecting other voiced candidates from the anti-bias auto-correlation function; and calculating a cost value for a pitch path according to a voiced/unvoiced intensity function based on the unvoiced and voiced candidates, saving a predetermined number of least-cost paths, and outputting at least a portion of contiguous frames with low time delay.
    Type: Application
    Filed: November 12, 2001
    Publication date: May 15, 2003
    Inventors: Bo Xu, Liang He, Wen Ke
  • Patent number: RE38889
    Abstract: A pitch period extracting apparatus includes a microcomputer which determines a sampling frequency for an A/D converter, and a range of delay times for calculating autocorrelative values on the basis of the sampling frequency. For example, the delay times are set within a range of 20 samples ≤ k ≤ 100 samples in a case of 8 kHz, and a range of 15 samples ≤ k ≤ 75 samples in a case of 6 kHz. The microcomputer calculates the autocorrelative values of speech signal data stored in a buffer memory, and outputs a delay time at which a maximum autocorrelative value is obtainable as a pitch period of an inputted speech signal.
    Type: Grant
    Filed: October 6, 2000
    Date of Patent: November 22, 2005
    Assignee: Sanyo Electric Co., Ltd.
    Inventor: Takeo Inoue
  • Patent number: RE38269
    Abstract: A speech coding system employs measurements of robust features of speech frames whose distributions are not strongly affected by noise levels to make voicing decisions for input speech occurring in a noisy environment. Linear programming analysis of the robust features and respective weights is used to determine an optimum linear combination of these features. The input speech vectors are matched to a vocabulary of codewords in order to select the corresponding, optimally matching codeword. Adaptive vector quantization is used in which a vocabulary of words obtained in a quiet environment is updated based upon a noise estimate of a noisy environment in which the input speech occurs, and the “noisy” vocabulary is then searched for the best match with an input speech vector. The corresponding clean codeword index is then selected for transmission and for synthesis at the receiver end.
    Type: Grant
    Filed: October 21, 1999
    Date of Patent: October 7, 2003
    Assignee: ITT Manufacturing Enterprises, Inc.
    Inventor: Yu-Jih Liu