Voiced Or Unvoiced Patents (Class 704/208)
-
Patent number: 7555310
Abstract: An electronic apparatus configured, on the assumption of its sharing by a plurality of users, to provide a plurality of functions, in which normal operations and interrupt operations based on a user's voice can be performed is provided. The apparatus comprises: a voice inputting unit for inputting a user's voice; a storing unit for storing at least a predetermined type of operation for each of a plurality of users; a voice decoding unit for decoding input voice information; a determining unit for determining if the decoded voice describes a type of operation or describes a type of operation and a word “interrupt”; a command outputting unit for outputting: a command which allows a processing involved in the type of operation to be executed, at least when the determining unit determines that the decoded voice describes a type of operation in the storing unit.
Type: Grant
Filed: November 9, 2006
Date of Patent: June 30, 2009
Assignee: Kyocera Mita Corporation
Inventors: Kentarou Sakuramoto, Hideki Hayashi
-
Patent number: 7542787
Abstract: The present invention provides an apparatus and method for providing hands-free operation of a device. A hands-free adapter is provided that communicates with a device and a headset. The hands-free adapter allows a user to use voice commands so that the user does not have to handle the device. The hands-free adapter receives voice commands from the headset and translates the voice commands to commands recognized by the device. The hands-free adapter also monitors the device to detect device events and provides notice of the events to the user via the headset.
Type: Grant
Filed: February 14, 2006
Date of Patent: June 2, 2009
Assignee: AT&T Intellectual Property I, L. P.
Inventors: Lan Zhang, Joseph E. Page, Jr., Barrett M. Kreiner
-
Patent number: 7536298
Abstract: An embodiment of the invention improves upon the International Telecommunication Union's ITU-T G.729 Annex B comfort noise generation algorithm by reducing the computational complexity of the comfort noise generation algorithm. The computational complexity is reduced by reusing pre-computed random Gaussian noise samples for each non active voice frame versus calculating new random Gaussian noise samples for each non active voice frame as described by Annex B.
Type: Grant
Filed: March 15, 2004
Date of Patent: May 19, 2009
Assignee: Intel Corporation
Inventors: Permachanahalli S Ramkumar, Shashi Shankar Hosur
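The reuse idea in this abstract can be illustrated with a minimal sketch. It is not the patented method or the Annex B reference code: the table size, the offset rule, and the names `make_noise_table` and `comfort_noise_frame` are assumptions chosen for illustration. The only fact taken from G.729 is the 80-sample frame (10 ms at 8 kHz).

```python
import random

FRAME_LEN = 80  # G.729 frame: 10 ms at 8 kHz


def make_noise_table(size, seed=0):
    """Pre-compute unit-variance Gaussian samples once, at start-up."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def comfort_noise_frame(table, frame_index, gain):
    """Build one excitation frame by reading the stored table.

    No new random numbers are drawn per frame. The read offset
    (frame_index * 17) is a hypothetical rule that merely avoids
    replaying the identical segment on consecutive frames.
    """
    start = (frame_index * 17) % len(table)
    return [gain * table[(start + n) % len(table)] for n in range(FRAME_LEN)]


table = make_noise_table(256)
frame0 = comfort_noise_frame(table, 0, 0.5)
frame1 = comfort_noise_frame(table, 1, 0.5)
```

The per-frame cost drops to one multiply per sample, at the price of a noise sequence that repeats with the table period.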
-
Publication number: 20090125301
Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
Type: Application
Filed: November 3, 2008
Publication date: May 14, 2009
Applicant: Melodis Inc.
Inventors: Aaron Master, Seyed Majid Emami
-
Patent number: 7529664
Abstract: An approach for improving quality of synthesized speech is presented. The input speech or residual is first separated into a voiced portion and a noise portion. The voice portion is coded using CELP methods. The noise portion of the input speech may be estimated at the decoder since it contains minimal voiced speech components. The separation is frequency dependent and is adaptive to the input speech. The separation may be accomplished using a lowpass/highpass filter combination. The information regarding bandwidth of the lowpass/highpass is presented to the decoder to facilitate reproduction of the noise portion of the speech.
Type: Grant
Filed: March 11, 2004
Date of Patent: May 5, 2009
Assignee: Mindspeed Technologies, Inc.
Inventor: Yang Gao
-
Patent number: 7505950
Abstract: Systems and methods are provided for performing soft alignment in Gaussian mixture model (GMM) based and other vector transformations. Soft alignment may assign alignment probabilities to source and target feature vector pairs. The vector pairs and associated probabilities may then be used to calculate a conversion function, for example, by computing GMM training parameters from the joint vectors and alignment probabilities to create a voice conversion function for converting speech sounds from a source speaker to a target speaker.
Type: Grant
Filed: April 26, 2006
Date of Patent: March 17, 2009
Assignee: Nokia Corporation
Inventors: Jilei Tian, Jani Nurminen, Victor Popa
-
Patent number: 7505594
Abstract: A method and apparatus for controlling a discontinuous transmission process. Audio information is digitized and provided to a vocoder. A voice activity level is determined from the digitized audio signal, and if voice activity is present, active vocoder frames are generated at a predetermined output rate. If voice activity is not detected, inactive vocoder frames are generated. During transitions between periods of speech activity and speech inactivity, transition frames are generated, the transition frames comprising background noise information.
Type: Grant
Filed: December 19, 2000
Date of Patent: March 17, 2009
Assignee: QUALCOMM Incorporated
Inventor: Anthony Mauro
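The three frame types named in this abstract can be sketched as a small state tracker. This is an illustrative assumption, not the patented control logic: the abstract does not say how many transition frames are sent or exactly where they fall, so the sketch emits a single transition frame at each change of the voice-activity flag. The function name `classify_frames` is hypothetical.

```python
def classify_frames(voice_flags):
    """Map per-frame voice-activity flags to vocoder frame types.

    Assumption: one "transition" frame replaces the first frame after
    every change between speech activity and inactivity.
    """
    frames = []
    previous = None
    for active in voice_flags:
        if previous is not None and active != previous:
            frames.append("transition")
        else:
            frames.append("active" if active else "inactive")
        previous = active
    return frames


frame_types = classify_frames([True, True, False, False, True])
```

In a real discontinuous transmission scheme the inactive frames would typically be sent at a lower rate or not at all; the sketch only labels them.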
-
Patent number: 7478041
Abstract: Provided is a method for canceling background noise of a sound source other than a target direction sound source in order to realize highly accurate speech recognition, and a system using the same. In terms of directional characteristics of a microphone array, due to a capability of approximating a power distribution of each angle of each of possible various sound source directions by use of a sum of coefficient multiples of a base form angle power distribution of a target sound source measured beforehand by base form angle by using a base form sound, and power distribution of a non-directional background sound by base form, only a component of the target sound source direction is extracted at a noise suppression part. In addition, when the target sound source direction is unknown, at a sound source localization part, a distribution for minimizing the approximate residual is selected from base form angle power distributions of various sound source directions to assume a target sound source direction.
Type: Grant
Filed: March 12, 2003
Date of Patent: January 13, 2009
Assignee: International Business Machines Corporation
Inventors: Osamu Ichikawa, Tetsuya Takiguchi, Masafumi Nishimura
-
Patent number: 7478040
Abstract: A method for adaptive long-term filtering of an audio signal, such as a decoded speech signal. The method includes measuring a smoothed periodicity of an audio signal segment, such as an audio frame, wherein the smoothed periodicity is measured by low-pass filtering an instantaneous periodicity of the audio signal segment. The periodicity of the audio signal segment is then increased in a manner that depends upon whether the smoothed periodicity is less than a predetermined threshold. By utilizing a smoothed periodicity measurement in this fashion, more accurate control of the post-filter is provided as compared to conventional solutions. Additionally, the method includes deriving filters by interpolating between filter responses of adjacent audio signal segments to minimize distortion at segment boundaries.
Type: Grant
Filed: October 20, 2004
Date of Patent: January 13, 2009
Assignee: Broadcom Corporation
Inventors: Jes Thyssen, Juin-Hwey Chen
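The smoothing and threshold steps in this abstract can be sketched as follows. The abstract specifies neither the low-pass filter nor the threshold, so the one-pole filter, the constant `alpha`, the threshold value, and the two strength values are assumptions for illustration only.

```python
def smooth_periodicity(instantaneous, alpha=0.9):
    """Low-pass filter a per-frame periodicity track.

    Uses a one-pole recursion, state = alpha*state + (1-alpha)*input,
    which is an assumed filter; the abstract only says "low-pass".
    """
    smoothed = []
    state = instantaneous[0] if instantaneous else 0.0
    for value in instantaneous:
        state = alpha * state + (1.0 - alpha) * value
        smoothed.append(state)
    return smoothed


def postfilter_strength(smoothed_value, threshold=0.5, weak=0.1, strong=0.4):
    """Pick how strongly to boost periodicity (hypothetical values)."""
    return weak if smoothed_value < threshold else strong


track = smooth_periodicity([0.2, 0.9, 0.1, 0.8, 0.85])
strengths = [postfilter_strength(value) for value in track]
```

Because the decision uses the smoothed track, a single frame with an outlying periodicity estimate does not flip the post-filter setting.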
-
Patent number: 7457746
Abstract: There is provided a pitch lag predictor for use by a speech decoder to generate a predicted pitch lag parameter. The pitch lag predictor comprises a summation calculator configured to generate a first summation based on a plurality of previous pitch lag parameters, and a second summation based on a plurality of previous pitch lag parameters and a position of each of the plurality of previous pitch lag parameters with respect to the predicted pitch lag parameter; a coefficient calculator configured to generate a first coefficient using a first equation based on the first summation and the second summation, and a second coefficient using a second equation based on the first summation and the second summation, wherein the first equation is different than the second equation; and a predictor configured to generate the predicted pitch lag parameter based on the first coefficient and the second coefficient.
Type: Grant
Filed: March 20, 2006
Date of Patent: November 25, 2008
Assignee: Mindspeed Technologies, Inc.
Inventor: Yang Gao
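One way to read the two summations and two coefficients in this abstract is as a least-squares straight-line fit through the previous pitch lags, extrapolated one step ahead. That reading is an assumption: the abstract does not give its equations, and the sketch below is not claimed to match them. The name `predict_pitch_lag` is hypothetical.

```python
def predict_pitch_lag(previous_lags):
    """Extrapolate the next pitch lag from a least-squares line.

    s1 is the plain sum of the lags; s2 weights each lag by its
    position index. The intercept a and slope b are both derived
    from s1 and s2, using two different formulas.
    """
    n = len(previous_lags)
    if n < 2:
        raise ValueError("need at least two previous pitch lags")
    s1 = sum(previous_lags)
    s2 = sum(i * lag for i, lag in enumerate(previous_lags))
    sx = n * (n - 1) / 2                  # sum of positions 0..n-1
    sxx = (n - 1) * n * (2 * n - 1) / 6   # sum of squared positions
    b = (n * s2 - sx * s1) / (n * sxx - sx * sx)
    a = (s1 - b * sx) / n
    return a + b * n                      # value of the line at position n


predicted = predict_pitch_lag([40, 42, 44, 46])
```

For the steadily rising lags in the usage line the fitted line continues the trend; for constant lags the slope is zero and the prediction equals the last lag.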
-
Patent number: 7424427
Abstract: An audio classification system classifies sounds in an audio stream as belonging to one of a relatively small number of classes. The audio classification system includes a signal analysis component [301] and a decoder [302]. The decoder [302] includes a number of models [310-316] for performing the audio classifications. In one implementation, the possible classifications include: vowels, fricatives, narrowband, wideband, coughing, gender, and silence. The classified audio may be used to enhance speech recognition of the audio stream.
Type: Grant
Filed: October 16, 2003
Date of Patent: September 9, 2008
Assignees: Verizon Corporate Services Group Inc., BBN Technologies Corp.
Inventors: Daben Liu, Francis G. Kubala
-
Patent number: 7412379
Abstract: Techniques utilising Time Scale Modification (TSM) of signals are described. The signal is analysed and divided into frames of similar signal types. Techniques specific to the signal type are then applied to the frames thereby optimising the modification process. The method of the present invention enables TSM of different audio signal parts to be realized using different methods, and a system for effecting said method is also described.
Type: Grant
Filed: April 2, 2002
Date of Patent: August 12, 2008
Assignee: Koninklijke Philips Electronics N.V.
Inventors: Rakesh Taori, Andreas Johannes Gerrits, Dzevdet Burazerovic
-
Publication number: 20080167863
Abstract: The present invention relates to an apparatus and method of improving intelligibility of a voice signal. A method of improving intelligibility of a voice signal according to an embodiment of the present invention includes analyzing a background noise signal on a call receiving side, classifying a received voice signal into a silence signal, an unvoiced sound signal, and a voiced sound signal, and intensifying the classified unvoiced sound signal and voiced sound signal on the basis of the analyzed background noise signal on the call receiving side.
Type: Application
Filed: November 16, 2007
Publication date: July 10, 2008
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Chang-kyu Choi, Kwang-il Hwang, Sun-gi Hong, Young-hun Sung, Yeun-bae Kim, Yong Kim, Sang-hoon Lee, Hong Jeong
-
Patent number: 7386444
Abstract: Hybrid linear predictive speech coding system with phase alignment predictive quantization. Zero phase alignment of speech prior to waveform coding aligns synthesized speech frames of a waveform coder with frames synthesized with a parametric coder. Inter-frame interpolation of LP coefficients suppresses artifacts in resultant synthesized speech frames.
Type: Grant
Filed: January 30, 2004
Date of Patent: June 10, 2008
Assignee: Texas Instruments Incorporated
Inventor: Jacek Stachurski
-
Patent number: 7376557
Abstract: A privacy apparatus adds a privacy sound based on a speaker's own voice into the environment, thereby confusing listeners as to which of the sounds is the real source. This permits disruption of the ability to understand the source speech of the user by eliminating segregation cues that the auditory system uses to interpret speech. The privacy apparatus minimizes segregation cues. The privacy apparatus is relatively quiet and thus easily acceptable in a typical open floor design office space. The privacy apparatus contains an A/D converter that converts the speech into a digital signal, a DSP that converts the digital signal into a privacy signal with pre-recorded speech fragments that are summed so that the speech fragments at least partly overlap one another, a D/A converter that converts the privacy signal into an output signal and one or more loudspeakers from which the output signal is emitted.
Type: Grant
Filed: January 4, 2006
Date of Patent: May 20, 2008
Assignee: Herman Miller, Inc.
Inventors: Jeffrey Specht, Daniel Mapes-Riordan, William DeKruif
-
Publication number: 20080109218
Abstract: A system and method for modeling speech in such a way that both voiced and unvoiced contributions can co-exist at certain frequencies. In various embodiments, three spectral bands (or bands of up to three different types) are used. In one embodiment, the lowest band or group of bands is completely voiced, the middle band or group of bands contains both voiced and unvoiced contributions, and the highest band or group of bands is completely unvoiced. The embodiments of the present invention may be used for speech coding and other speech processing applications.
Type: Application
Filed: September 13, 2007
Publication date: May 8, 2008
Inventors: Jani Nurminen, Sakari Himanen
-
Publication number: 20080109217
Abstract: An apparatus for providing control of voicing in processed speech includes a spectra approximation element and a comparing element. The spectra approximation element may be configured to compute a voiced contribution and an unvoiced contribution for each of a reference speech sample and a processed speech sample. The comparing element may be configured to compare indications of voiced and unvoiced contributions of the reference speech sample and indications of voiced and unvoiced contributions of the processed speech sample, and to determine whether to correct at least one of the voiced or unvoiced contributions of the processed speech sample based on the comparison.
Type: Application
Filed: November 8, 2006
Publication date: May 8, 2008
Inventor: Jani K. Nurminen
-
Patent number: 7366658
Abstract: An enhanced noise pre-processor in a speech codec smoothes channel energy estimate moving toward a first smoothing constant if a prior signal to noise ratio estimate for more than five channels are above a threshold and toward a second smaller smoothing constant otherwise. Forming a signal to noise ratio estimate for each channel includes conditionally boosting if a signal energy estimate is more than a predetermined factor of a noise energy estimate and signal to noise ratio estimates are above a threshold for more than five channels. The estimated signal to noise ratio is conditionally modified if two long term prediction coefficients are above a predetermined factor. The estimated signal to noise ratio is not modified and a voice metric is set greater than a voice metric threshold upon matching templates corresponding to the fricative and nasal speech sounds. An adaptive minimum channel gain is chosen based on a current signal to noise ratio estimate.
Type: Grant
Filed: December 11, 2006
Date of Patent: April 29, 2008
Assignee: Texas Instruments Incorporated
Inventors: Pratibha Moogi, Chanaveeragouda Virupaxagouda Goudar
-
Publication number: 20080077399
Abstract: There is provided a low-frequency-band voice reconstructing device. A voice signal from which a signal in a low-frequency band is removed is inputted to the device and the device reconstructs the signal in the low frequency band based on the input voice signal. The device comprises a first portion for extracting part of harmonic components of a pitch signal of voice from the input voice signal, a second portion for squaring a signal extracted by the first portion, a third portion for extracting a signal of a pitch frequency and harmonic signals of a lower limit frequency or below of the input voice signal, from the signal obtained by the second portion, and a fourth portion for correcting an amplitude level of the signal extracted by the third portion.
Type: Application
Filed: September 20, 2007
Publication date: March 27, 2008
Applicant: Sanyo Electric Co., Ltd.
Inventor: Masahiro Yoshida
-
Patent number: 7343284
Abstract: A method for discriminating noise from signal in a noise-contaminated signal involves decomposing a frame of samples of the signal into decorrelated components, and using a difference between probability distributions of the noise contributions and the signal contributions to identify signal and noise. A Gaussian distribution is used to determine whether the components are only noise whereas a Laplacian distribution is used to determine whether the components contain the signal. Such discrimination may be used in speech enhancement or voice activity detection apparatus.
Type: Grant
Filed: July 17, 2003
Date of Patent: March 11, 2008
Assignee: Nortel Networks Limited
Inventors: Saeed Gazor, Mohamed El-Hennawey
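The Gaussian-versus-Laplacian test in this abstract can be illustrated by comparing maximum log-likelihoods of the two zero-mean models on a set of component values. The decision rule, the use of maximum-likelihood parameters, and the function names are assumptions; the abstract does not state how the two distributions are compared, and the decorrelating decomposition is omitted here.

```python
import math


def gaussian_loglik(values):
    """Maximum log-likelihood under a zero-mean Gaussian."""
    n = len(values)
    variance = sum(v * v for v in values) / n
    return -0.5 * n * (math.log(2.0 * math.pi * variance) + 1.0)


def laplacian_loglik(values):
    """Maximum log-likelihood under a zero-mean Laplacian."""
    n = len(values)
    scale = sum(abs(v) for v in values) / n
    return -n * (math.log(2.0 * scale) + 1.0)


def looks_like_signal(values):
    """Assumed rule: call it signal when the Laplacian fits better."""
    if not any(values):
        return False  # an all-zero frame carries no evidence of signal
    return laplacian_loglik(values) > gaussian_loglik(values)


peaky = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0]
flat = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
decisions = (looks_like_signal(peaky), looks_like_signal(flat))
```

The heavy-tailed `peaky` values favor the Laplacian model, while the evenly spread `flat` values favor the Gaussian model.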
-
Patent number: 7337107
Abstract: Pitch estimation and classification into voiced, unvoiced and transitional speech were performed by a spectro-temporal auto-correlation technique. A peak picking formula was then employed. A weighting function was then applied to the power spectrum. The harmonics weighted power spectrum underwent mel-scaled band-pass filtering, and the log-energy of the filter's output was discrete cosine transformed to produce cepstral coefficients. A within-filter cubic-root amplitude compression was applied to reduce amplitude variation without compromise of the gain invariance properties.
Type: Grant
Filed: October 2, 2001
Date of Patent: February 26, 2008
Assignee: The Regents of the University of California
Inventors: Kenneth Rose, Liang Gu
-
Patent number: 7337108
Abstract: An adaptive “temporal audio scaler” is provided for automatically stretching and compressing frames of audio signals received across a packet-based network. Prior to stretching or compressing segments of a current frame, the temporal audio scaler first computes a pitch period for each frame for sizing signal templates used for matching operations in stretching and compressing segments. Further, the temporal audio scaler also determines the type or types of segments comprising each frame. These segment types include “voiced” segments, “unvoiced” segments, and “mixed” segments which include both voiced and unvoiced portions. The stretching or compression methods applied to segments of each frame are then dependent upon the type of segments comprising each frame. Further, the amount of stretching and compression applied to particular segments is automatically variable for minimizing signal artifacts while still ensuring that an overall target stretching or compression ratio is maintained for each frame.
Type: Grant
Filed: September 10, 2003
Date of Patent: February 26, 2008
Assignee: Microsoft Corporation
Inventors: Dinei Florencio, Philip Chou, Li-Wei He
-
Patent number: 7295982
Abstract: The present invention relates to a system and method for automatically verifying that a message received from a user is intelligible. In an exemplary embodiment, a message is received from the user. A speech level of the user's message may be measured and compared to a pre-determined speech level threshold to determine whether the measured speech level is below the pre-determined speech level threshold. A signal-to-noise ratio of the user's message may be measured and compared to a pre-determined signal-to-noise ratio threshold to determine whether the measured signal-to-noise ratio of the message is below the pre-determined signal-to-noise ratio threshold. An estimate of intelligibility for the user's message may be calculated and compared to an intelligibility threshold to determine whether the calculated estimate of intelligibility is below the intelligibility threshold.
Type: Grant
Filed: November 19, 2001
Date of Patent: November 13, 2007
Assignee: AT&T Corp.
Inventors: Harvey S. Cohen, Randy G. Goldberg, Kenneth H. Rosen
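The three threshold comparisons in this abstract can be sketched as one function. The threshold values, units, and the name `failed_checks` are hypothetical, and the measurements are taken as inputs because the abstract does not describe how the speech level, signal-to-noise ratio, or intelligibility estimate is computed.

```python
def failed_checks(speech_level_db, snr_db, intelligibility,
                  min_level_db=-30.0, min_snr_db=15.0,
                  min_intelligibility=0.7):
    """Return the names of the checks whose value is below threshold.

    All default thresholds are placeholder values for illustration.
    """
    failures = []
    if speech_level_db < min_level_db:
        failures.append("speech level")
    if snr_db < min_snr_db:
        failures.append("signal-to-noise ratio")
    if intelligibility < min_intelligibility:
        failures.append("intelligibility")
    return failures


problems = failed_checks(speech_level_db=-40.0, snr_db=10.0, intelligibility=0.9)
```

Returning the list of failed checks rather than a single flag lets a caller tell the user which condition to fix, for example by speaking louder or moving to a quieter place.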
-
Patent number: 7277916
Abstract: A system for emulating interaction with an interactive voice response unit is provided. The system comprises, a client node connected to the network, the client node soliciting interaction with the interactive voice response unit and a proxy server node connected to the network, the server node accessible to client node, the interactive voice response unit accessible to the server node. A connection is established between the client node and the proxy server node, the proxy server node accepts data from the client node and translates the data into a format for interacting with the interactive voice response unit whereupon the data is then propagated to the interactive voice response unit. Response data resulting from the input data at the interactive voice response unit is propagated to the proxy server node whereupon the response data is translated into a format for dissemination at the client node and propagated thereto.
Type: Grant
Filed: October 5, 2004
Date of Patent: October 2, 2007
Assignee: Genesys Telecommunications Laboratories, Inc.
Inventor: Marcialito Nuestro
-
Patent number: 7272551
Abstract: Estimating a speech signal pitch frequency by determining a speech signal frame line spectrum including spectral lines having respective line amplitudes and frequencies, selecting a predefined number of spectral lines having highest amplitudes, fewer than the total number of the spectral lines, calculating a preliminary utility function over a pitch frequency range to provide a preliminary utility function value for each pitch frequency in the range measuring the compatibility of the selected spectral lines with the pitch frequency, identifying a predefined number of preliminary pitch frequency candidates at least partly responsive to the preliminary utility function, where each candidate is a local maximum of the preliminary utility function, calculating a final utility score for each of the candidates, and selecting any of the candidates to be an estimated pitch frequency of the speech signal at least partly responsive to any of the final utility scores.
Type: Grant
Filed: February 24, 2003
Date of Patent: September 18, 2007
Assignee: International Business Machines Corporation
Inventor: Alexander Sorin
-
Patent number: 7266493
Abstract: There is provided a method of selecting a pitch lag value from a plurality of pitch lag candidates for coding a speech signal. The method comprises identifying the plurality of pitch lag candidates from a frame of the speech signal using correlation; classifying the speech signal to obtain a voice classification; determining whether one or more of the plurality of pitch lag candidates are in a temporal neighborhood of one or more previous pitch lag values; favoring the one or more of the plurality of pitch lag candidates determined to be in the temporal neighborhood of the one or more previous pitch lag values, by adaptive weighting, over other ones of the plurality of pitch lag candidates; and selecting the pitch lag value based on the voice classification and the one or more of the plurality of pitch lag candidates favored by the adaptive weighting.
Type: Grant
Filed: October 13, 2005
Date of Patent: September 4, 2007
Assignee: Mindspeed Technologies, Inc.
Inventors: Huan-Yu Su, Yang Gao
-
Publication number: 20070185709
Abstract: A method and apparatus of estimating a voicing for speech recognition by using local spectral information. The voicing estimation method for speech recognition includes performing a Fourier transform on input voice signals after performing pre-processing on the input voice signals. The method further includes detecting peaks in the input voice signals after smoothing the input voice signals. The method also includes computing every frequency bound associated with the detected peaks, and determining a class of a voicing according to each computed frequency bound.
Type: Application
Filed: January 25, 2007
Publication date: August 9, 2007
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Kwang Cheol Oh, Jae-Hoon Jeong
-
Patent number: 7246059
Abstract: The invention provides a method and system for dynamically estimating background noise. The system includes a portable communication device, a vocoder, and a voice activated detector. Based on information received by the portable communication device, the vocoder determines parameters related to incoming information including a voicing mode indicative of the periodicity of incoming information. The voice activated detector then compares the voicing mode to a threshold to determine whether a background noise estimate should be updated.
Type: Grant
Filed: July 24, 2003
Date of Patent: July 17, 2007
Assignee: Motorola, Inc.
Inventors: Ali Behboodian, Pratik Desai, Chin Pan Wong
-
Patent number: 7243062
Abstract: A method (200) and apparatus (100) for segmenting a sequence of audio samples into homogeneous segments (550 and 555) are disclosed. The method (200) forms a sequence of frames (701 to 704) along the sequence of audio samples, and extracts, for each frame, a data feature. The data features form a sequence of data features. Transition points in the sequence of data features are then detected by applying the Bayesian Information Criterion to the sequence of data features. The transition points define the homogeneous segments (550 and 555). Preferably the data feature is single-dimensional and a leptokurtic distribution is used as an event model in the Bayesian Information Criterion.
Type: Grant
Filed: October 25, 2002
Date of Patent: July 10, 2007
Assignee: Canon Kabushiki Kaisha
Inventor: Timothy John Wark
-
Patent number: 7233899
Abstract: Computer comparison of one or more dictionary entries with a sound record of a human utterance to determine whether and where each dictionary entry is contained within the sound record. The record is segmented, and for each vocalized segment a spectrogram is obtained, and for other segments symbolic and numeric data are obtained. The spectrogram of a vocalized segment is then processed using a method selected from a group consisting of a triple time transform, a triple frequency transform, a linear-piecewise-linear transform, and combinations thereof, to decrease noise and to eliminate variations in pronunciation. Each entry in the dictionary is then compared with every sequence of segments of substantially the same length in the sound record. The comparison takes into account the formant profiles within each vocalized segment and symbolic and numeric data for other segments are obtained in the record and in the dictionary entries.
Type: Grant
Filed: March 7, 2002
Date of Patent: June 19, 2007
Inventors: Vitaliy S. Fain, Samuel V. Fain
-
Patent number: 7233894
Abstract: A pitch estimation system including a low-frequency band noise detector (LBND) operative to detect the presence of low-frequency band noise in a first audio frame, a frequency-domain pitch estimator operative to calculate a pitch estimation of a second audio frame from at least one spectral peak in the second audio frame, and a pitch estimator controller operative to cause the pitch estimator to exclude from the spectrum of the second audio frame at least one low-frequency spectral peak below a predefined threshold where low-frequency band noise is present in the first audio frame.
Type: Grant
Filed: February 24, 2003
Date of Patent: June 19, 2007
Assignee: International Business Machines Corporation
Inventor: Alexander Sorin
-
Patent number: 7228271
Abstract: The telephone apparatus of the present invention comprises a first voice band expander for generating a voiced signal frequency component by shifting the frequency of the voice signal received, a second voice band expander for generating a voiceless signal frequency component by shifting the frequency of the voice signal received, and a voice composer for composing the voice signal received, the output of the first voice band expander, and the output of the second voice band expander, which is able to output clear voices in aural communication.
Type: Grant
Filed: December 23, 2002
Date of Patent: June 5, 2007
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Toshimichi Tokuda, Takashi Kimura
-
Patent number: 7222070
Abstract: Linear predictive speech coding system with classification of frames and a hybrid coder using both waveform coding and parametric coding for different classes of frames. Phase alignment for a parametric coder aligns synthesized speech frames with adjacent waveform coder synthesized frames. Zero phase alignment of speech prior to waveform coding aligns synthesized speech frames of a waveform coder with frames synthesized with a parametric coder. Inter-frame interpolation of LP coefficients suppresses artifacts in resultant synthesized speech frames.
Type: Grant
Filed: September 22, 2000
Date of Patent: May 22, 2007
Assignee: Texas Instruments Incorporated
Inventors: Jacek Stachurski, Alan V. McCree
-
Patent number: 7206739
Abstract: A method for searching an excitation (or fixed) codebook in a speech coding system. In a speech coding system including a synthesis filter for synthesizing a speech signal, a fixed codebook searcher according to the present invention segments a speech signal frame into a plurality of subframes to generate an excitation signal to be used in a synthesis filter, segments again each of the subframes into a plurality of subgroups, and searches the respective subframes each comprised of a plurality of pulse position/amplitude combinations for pulses. The fixed codebook searcher searches the respective subgroups for a predetermined number of pulses having non-zero amplitude, and generates the searched pulses as an initial vector. Next, the fixed codebook searcher selects a pulse combination including at least one pulse among the pulses of the initial vector, and then substitutes pulses of the selected pulse combination for pulses in other positions in the subgroups.
Type: Grant
Filed: May 23, 2002
Date of Patent: April 17, 2007
Assignee: Samsung Electronics Co., Ltd.
Inventor: Dae-Ryong Lee
-
Patent number: 7191123
Abstract: The gain smoothing method and device modify the amplitude of an innovative codevector in relation to background noise present in a previously sampled wideband signal. The gain smoothing device comprises a gain smoothing calculator for calculating a smoothing gain in response to a factor representative of voicing in the sampled wideband signal, a factor representative of the stability of a set of linear prediction filter coefficients, and an innovative codebook gain. The gain smoothing device also comprises an amplifier for amplifying the innovative codevector with the smoothing gain to thereby produce a gain-smoothed innovative codevector. The function of the gain-smoothing device improves the perceived synthesized signal when background noise is present in the sampled wideband signal.
Type: Grant
Filed: November 17, 2000
Date of Patent: March 13, 2007
Assignee: Voiceage Corporation
Inventors: Bruno Bessette, Redwan Salami, Roch Lefebvre
-
Patent number: 7191128
Abstract: The present invention relates to a method and system for distinguishing speech from music in a digital audio signal in real time.
Type: Grant
Filed: February 21, 2003
Date of Patent: March 13, 2007
Assignee: LG Electronics Inc.
Inventors: Mikhael A. Sall, Sergei N. Gramnitskiy, Alexandr L. Maiboroda, Victor V. Redkov, Anatoli I. Tikhotsky, Andrei B. Viktorov
-
Patent number: 7171357
Abstract: A voice activity detector (100) filters (204) out noise energy and then computes a high-frequency (2400 Hz to 4000 Hz) versus low-frequency (100 Hz to 2400 Hz) signal energy ratio (224), total voiceband (100 Hz to 4000 Hz) signal energy (214), and signal periodicity (208) on successive frames of signal samples. Signal periodicity is determined by estimating the pitch period (206) of the signal, determining a gain value of the signal over the pitch period as a function of the estimated pitch period, and estimating a periodicity of the signal over the pitch period as a function of the estimated pitch period and the gain value.
Type: Grant
Filed: March 21, 2001
Date of Patent: January 30, 2007
Assignee: Avaya Technology Corp.
Inventor: Simon Daniel Boland
-
Patent number: 7149683Abstract: The present invention relates to a method and device for quantizing linear prediction parameters in variable bit-rate sound signal coding, in which an input linear prediction parameter vector is received, a sound signal frame corresponding to the input linear prediction parameter vector is classified, a prediction vector is computed, the computed prediction vector is removed from the input linear prediction parameter vector to produce a prediction error vector, and the prediction error vector is quantized. Computation of the prediction vector comprises selecting one of a plurality of prediction schemes in relation to the classification of the sound signal frame, and processing the prediction error vector through the selected prediction scheme. The present invention further relates to a method and device for dequantizing linear prediction parameters in variable bit-rate sound signal decoding.Type: GrantFiled: January 19, 2005Date of Patent: December 12, 2006Assignee: Nokia CorporationInventor: Milan Jelinek
-
Patent number: 7146310Abstract: A low-bit-rate coding technique for unvoiced segments of speech includes the steps of extracting high-time-resolution energy coefficients from a frame of speech, quantizing the energy coefficients, generating a high-time-resolution energy envelope from the quantized energy coefficients, and reconstituting a residue signal by shaping a randomly generated noise vector with quantized values of the energy envelope. The energy envelope may be generated with a linear interpolation technique. A post-processing measure may be obtained and compared with a predefined threshold to determine whether the coding algorithm is performing adequately.Type: GrantFiled: September 29, 2004Date of Patent: December 5, 2006Assignee: Qualcomm, IncorporatedInventors: Amitava Das, Sharath Manjunath
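The steps in this abstract (extract energy coefficients, build an interpolated envelope, shape random noise) can be sketched as follows. This is an illustration under stated assumptions, not the patented coder: quantization is skipped, the envelope is interpolated between sub-frame centres, and all function names are hypothetical.

```python
import math
import random


def subframe_energies(frame, n_sub):
    """Mean energy of each of n_sub equal-length sub-frames."""
    size = len(frame) // n_sub
    return [
        sum(x * x for x in frame[i * size:(i + 1) * size]) / size
        for i in range(n_sub)
    ]


def interpolate_envelope(energies, frame_len):
    """Per-sample RMS envelope, linearly interpolated between sub-frame centres.

    Placing the anchor points at sub-frame centres is an assumption of this sketch.
    """
    n_sub = len(energies)
    size = frame_len // n_sub
    rms = [math.sqrt(e) for e in energies]
    centres = [(i + 0.5) * size for i in range(n_sub)]
    envelope = []
    for t in range(frame_len):
        if t <= centres[0]:
            envelope.append(rms[0])
        elif t >= centres[-1]:
            envelope.append(rms[-1])
        else:
            j = int((t - centres[0]) // size)
            frac = (t - centres[j]) / size
            envelope.append(rms[j] + frac * (rms[j + 1] - rms[j]))
    return envelope


def shape_noise(envelope, rng):
    """Scale unit-variance Gaussian noise by the envelope to rebuild a residue."""
    return [e * rng.gauss(0.0, 1.0) for e in envelope]


# Usage: a frame that is loud in its first half and quiet in its second.
frame = [1.0] * 80 + [0.1] * 80
energies = subframe_energies(frame, 4)
envelope = interpolate_envelope(energies, len(frame))
residue = shape_noise(envelope, random.Random(0))
```

In a real coder the energies would be quantized before the envelope is built, so that encoder and decoder derive the same envelope.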
-
Patent number: 7139700Abstract: Linear predictive speech coding system with classification of frames and a hybrid coder using both waveform coding and parametric coding for different classes of frames. Phase alignment for a parametric coder aligns synthesized speech frames with adjacent waveform coder synthesized frames. Zero phase alignment of speech prior to waveform coding aligns synthesized speech frames of a waveform coder with frames synthesized with a parametric coder. Inter-frame interpolation of LP coefficients suppresses artifacts in resultant synthesized speech frames.Type: GrantFiled: September 22, 2000Date of Patent: November 21, 2006Assignee: Texas Instruments IncorporatedInventors: Jacek Stachurski, Alan V. McCree
-
Patent number: 7127392Abstract: The present invention is a device for and method of detecting voice activity. First, the AM envelope of a segment of a signal of interest is determined. Next, the number of times the AM envelope crosses a user-definable threshold is determined. If there are no crossings, the segment is identified as non-speech. Next, the number of points on the AM envelope within a user-definable range is determined. If there are fewer than a user-definable number of points within the range, the segment is identified as non-speech. Next, the mean, variance, and power ratio of the normalized spectral content of the AM envelope is found and compared to the same for known speech and non-speech. The segment is identified as being of the same type as the known speech or non-speech to which it most closely compares. These steps are repeated for each signal segment of interest.Type: GrantFiled: February 12, 2003Date of Patent: October 24, 2006Assignee: The United States of America as represented by the National Security AgencyInventor: David C. Smith
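The two early-rejection tests in this abstract can be sketched in a few lines. The abstract does not say how the AM envelope is computed, so this sketch assumes full-wave rectification followed by a moving average; the spectral comparison stage is left out, and every name below is hypothetical.

```python
def am_envelope(segment, window=8):
    """Rough AM envelope: rectify, then smooth with a moving average (assumed method)."""
    rectified = [abs(x) for x in segment]
    envelope = []
    acc = 0.0
    for i, value in enumerate(rectified):
        acc += value
        if i >= window:
            acc -= rectified[i - window]
        envelope.append(acc / min(i + 1, window))
    return envelope


def count_crossings(envelope, threshold):
    """Number of times consecutive envelope points fall on opposite sides of threshold."""
    return sum(
        1
        for a, b in zip(envelope, envelope[1:])
        if (a < threshold) != (b < threshold)
    )


def count_in_range(envelope, lo, hi):
    """Number of envelope points with lo <= value <= hi."""
    return sum(1 for v in envelope if lo <= v <= hi)


def early_reject(segment, threshold, lo, hi, min_points):
    """True if the segment is labelled non-speech by either of the first two tests."""
    envelope = am_envelope(segment)
    if count_crossings(envelope, threshold) == 0:
        return True
    if count_in_range(envelope, lo, hi) < min_points:
        return True
    return False


# Usage: silence never crosses the threshold, so it is rejected immediately.
silence = [0.0] * 100
rejected = early_reject(silence, threshold=0.5, lo=0.2, hi=1.0, min_points=10)
```

Segments that survive both tests would then go on to the mean, variance, and power-ratio comparison described in the abstract.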
-
Patent number: 7120576Abstract: A method for detecting music in a speech signal having a plurality of frames. The method comprises defining a music threshold value for a first parameter extracted from a frame of the speech signal, defining a background noise threshold value for the first parameter, and defining an unsure threshold value for the first parameter. The unsure threshold value falls between the music threshold value and the background noise threshold value. If the first parameter falls between the music threshold value and the background noise threshold value, the speech signal is classified as music or background noise based on analyzing a plurality of first parameters extracted from the plurality of frames.Type: GrantFiled: November 4, 2004Date of Patent: October 10, 2006Assignee: Mindspeed Technologies, Inc.Inventor: Yang Gao
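The three-threshold decision in this abstract can be illustrated with a short sketch. Several details are assumptions, since the abstract does not give them: that larger parameter values indicate music, that the multi-frame analysis is a simple mean, and that the mean is compared against the unsure threshold. The function names are hypothetical.

```python
def classify_frame(value, music_thr, noise_thr):
    """Label one frame from a single parameter; assumes music_thr > noise_thr."""
    if value >= music_thr:
        return "music"
    if value <= noise_thr:
        return "noise"
    return "unsure"


def classify_signal(values, music_thr, noise_thr, unsure_thr):
    """Classify from the latest frame, falling back to a multi-frame mean when unsure.

    The fallback rule (mean compared against unsure_thr) is an assumption of this
    sketch, standing in for the multi-frame analysis mentioned in the abstract.
    """
    latest = classify_frame(values[-1], music_thr, noise_thr)
    if latest != "unsure":
        return latest
    mean = sum(values) / len(values)
    return "music" if mean >= unsure_thr else "noise"


# Usage: the last frame is ambiguous, so the history decides.
label = classify_signal([0.8, 0.7, 0.5], music_thr=0.75, noise_thr=0.25, unsure_thr=0.5)
```

The point of the middle threshold is that ambiguous frames are not forced into a class on their own evidence; the decision is deferred to the wider context.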
-
Patent number: 7120578Abstract: Speech coding systems include multi-rate speech codecs having an encoder and a decoder. Silence description coding for multi-rate speech coding systems that employ discontinued transmission is performed in either the encoder or the decoder of the multi-rate speech codec. It may also be performed in a distributed manner wherein it is performed partially in the encoder and partially in the decoder. The silence description coding is performed on a speech signal having a substantially non-speech-like characteristic. Voice activity detection classifies the speech signal as being either substantially speech-like or substantially non-speech-like. The silence description coding is selected from a plurality of coding modes. In certain embodiments of the invention, the silence description coding is a source coding mode that operates at a bit rate that fits within a bit rate budget as determined by all of the available source coding modes within the plurality of coding modes.Type: GrantFiled: April 24, 2001Date of Patent: October 10, 2006Assignee: Mindspeed Technologies, Inc.Inventors: Jes Thyssen, Huan-yu Su, Adil Benyassine, Eyal Shlomot
-
Patent number: 7117150Abstract: A first filter (2061 in FIG. 1) calculates a long-time average of first change quantities based on a difference between a line spectral frequency of an input voice signal and a long-time average thereof. A second filter (2062 in FIG. 1) calculates a long-time average of second change quantities based on a difference between a whole band energy of the input voice signal and a long-time average thereof. A third filter (2063 in FIG. 1) calculates a long-time average of third change quantities based on a difference between a low band energy of the input voice signal and a long-time average thereof. A fourth filter (2064 in FIG. 1) calculates a long-time average of fourth change quantities based on a difference between a zero cross number of the input voice signal and a long-time average thereof. A voice/non-voice determining circuit (1040 in FIG.Type: GrantFiled: May 31, 2001Date of Patent: October 3, 2006Assignee: NEC CorporationInventor: Atsushi Murashima
-
Patent number: 7113522Abstract: Wideband speech signals must be converted to narrowband speech signals if the transmission medium or the destination terminal is constructed with narrowband constraints. A typical wideband-to-narrowband conversion method is the elimination of frequencies above 3400 Hz using a low pass filter and a down sampler. However, this method produces a muffled speech sound since the resulting narrowband signal has a flat frequency response. Methods and apparatus are presented herein to enhance the acoustic quality of a wideband-to-narrowband converted signal. A bandwidth switching filter is used to emphasize a mid-range frequency portion of the wideband signal so that the resulting narrowband signal has a non-flat frequency spectrum.Type: GrantFiled: January 24, 2001Date of Patent: September 26, 2006Assignee: QUALCOMM, IncorporatedInventors: Khaled H. El-Maleh, Arasanipalai K. Ananthapadmanabhan, Andrew P. DeJaco
-
Patent number: 7103349Abstract: The invention is a system, a method of transmitting messages selectively as text or non-text from an entity (104) in a network (100 and 102), and an entity in a network. A system in accordance with the invention includes at least one terminal (16); a network containing the at least one terminal; an entity in the network which provides messages selectively as text or non-text to the network in a speech encoded form; and wherein the messages are transmitted in the speech encoded form by the network to the at least one terminal which reproduces the messages to a user thereof in either a text form or by a sound reproduction device of the at least one terminal.Type: GrantFiled: May 2, 2002Date of Patent: September 5, 2006Assignee: Nokia CorporationInventors: Teemu Himanen, Pasi Ylinen
-
Patent number: 7099823Abstract: A coded voice signal format converting apparatus is provided which is capable of converting a signal format of a coded voice signal with a reduced amount of computation. In the coded voice signal format converting apparatus, a second coding device employs a quantizing accuracy information converting section that receives first quantizing accuracy information output from a quantizing accuracy information decoding section in a first decoding device. A second mapping signal is quantized by a mapped signal coding section to produce a coded voice signal, and the first quantizing accuracy information is converted so that the mapped signal coding section can use it to determine second quantizing accuracy information.Type: GrantFiled: February 27, 2001Date of Patent: August 29, 2006Assignee: NEC CorporationInventor: Yuichiro Takamizawa
-
Patent number: 7089178Abstract: A distributed voice recognition system and method for obtaining acoustic features and speech activity at multiple frequencies by extracting high frequency components thereof on a device, such as a subscriber station, and transmitting them to a network server having multiple stream processing capability, including cepstral feature processing, MLP nonlinear transformation processing, and multiband temporal pattern architecture processing. The features received at the network server are processed using all three streams, wherein each of the three streams provides benefits not available in the other two, thereby enhancing feature interpretation. Feature extraction and feature interpretation may operate at multiple frequencies, including but not limited to 8 kHz, 11 kHz, and 16 kHz.Type: GrantFiled: April 30, 2002Date of Patent: August 8, 2006Assignee: Qualcomm Inc.Inventors: Harinath Garudadri, Sunil Sivadas, Hynek Hermansky, Nelson H. Morgan, Charles C. Wooters, Andre Gustavo Adami, Maria Carmen Benitez Ortuzar, Lukas Burget, Stephane N. Dupont, Frantisek Grezl, Pratibha Jain, Sachin Kajarekar, Petr Motlicek
-
Patent number: 7080017Abstract: A frequency compander for improving the frequency response of a telephone line when used for remote broadcasting. The inventive device comprises an encoder for compressing the frequency spectrum of an audio signal and a decoder for expanding the signal back to its original spectrum. Preferably the encoder comprises: an anti-aliasing filter; an A/D converter for digitizing incoming audio; a DSP for compressing the audio; and a D/A converter for outputting compressed audio to the phone line. The decoder comprises: an anti-aliasing filter; an A/D converter for digitizing the incoming compressed signal; a DSP for restoring the original audio; and a D/A converter for outputting program audio. In a preferred embodiment, encoding and decoding are performed in the frequency domain. In another preferred embodiment, encoding and decoding are performed in the time domain using trigonometric transformations.Type: GrantFiled: May 31, 2002Date of Patent: July 18, 2006Inventors: Ken Scott Fisher, Kevin Cotton Baxter, Fred H. Holmes
-
Patent number: 7065485Abstract: The method and preprocessor enhance the intelligibility of narrowband speech without essentially lengthening the overall time duration of the signal. Both spectral enhancements and variable-rate time-scaling procedures are implemented to improve the salience of initial consonants, particularly the perceptually important formant transitions. Emphasis is transferred from the dominating vowel to the preceding consonant through adaptation of the phoneme timing structure. In a further embodiment, the technique is applied as a preprocessor to a speech coder.Type: GrantFiled: January 9, 2002Date of Patent: June 20, 2006Assignee: AT&T CorpInventors: Nicola R. Chong-White, Richard Vandervoort Cox