Voiced Or Unvoiced Patents (Class 704/214)
  • Publication number: 20080097754
    Abstract: An automatic system for temporal alignment between a music audio signal and lyrics is provided. The automatic system can prevent accuracy for temporal alignment from being lowered due to the influence of non-vocal sections. Alignment means of the system is provided with a phone model for singing voice that estimates phonemes corresponding to temporal-alignment features or features available for temporal alignment. The alignment means receives temporal-alignment features outputted from temporal-alignment feature extraction means, information on the vocal and non-vocal sections outputted from vocal section estimation means, and a phoneme network, and performs an alignment operation on condition that no phoneme exists at least in non-vocal sections.
    Type: Application
    Filed: August 7, 2007
    Publication date: April 24, 2008
    Applicant: NATIONAL INSTITUTE OF ADVANCED INDUSTRIAL SCIENCE AND TECHNOLOGY
    Inventors: Masataka Goto, Hiromasa Fujihara, Hiroshi Okuno
  • Publication number: 20080082323
    Abstract: A system that integrates various intelligent classification techniques and preprocessing algorithms is provided. A feature extracting unit receives audio signals and extracts audio features for identification by using various descriptors; a preprocessing unit normalizes the data for consistency; and a classification unit classifies the audio signals into several categories according to the audio features.
    Type: Application
    Filed: November 3, 2006
    Publication date: April 3, 2008
    Inventors: Mingsian R. Bai, Meng-Chun Chen
  • Patent number: 7346502
    Abstract: There is provided a method of updating a noise state of a voice activity detector (VAD) for indicating an active voice mode and an inactive voice mode. The method comprises receiving an input signal having a plurality of frames, determining an elapsed time since the last update of the noise state, updating the noise state of the VAD if the elapsed time exceeds a predetermined time, determining an average minimum energy based on two or more of the plurality of frames, determining a current minimum energy based on a current frame of the plurality of frames, updating the noise state of the VAD if the average minimum energy is less than the current minimum energy, and updating the noise state of the VAD if the average minimum energy is greater than the current minimum energy plus a first predetermined value.
    Type: Grant
    Filed: January 26, 2006
    Date of Patent: March 18, 2008
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Yang Gao, Eyal Shlomot, Adil Benyassine
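    The three update conditions listed in this abstract can be read as a single predicate. The following is a minimal sketch under that reading only; the function name, the parameter names, and the idea of passing the elapsed time in as a number are assumptions made for illustration and are not taken from the patent.

```python
def should_update_noise_state(elapsed, max_elapsed,
                              avg_min_energy, cur_min_energy, margin):
    """Return True if any of the abstract's three update conditions holds.

    All names here are hypothetical; the abstract only speaks of an
    "elapsed time", a "predetermined time", an "average minimum energy",
    a "current minimum energy", and a "first predetermined value".
    """
    # Condition 1: too much time has passed since the last update.
    if elapsed > max_elapsed:
        return True
    # Condition 2: the average minimum energy is below the current minimum.
    if avg_min_energy < cur_min_energy:
        return True
    # Condition 3: the average minimum energy exceeds the current minimum
    # by more than the predetermined margin.
    if avg_min_energy > cur_min_energy + margin:
        return True
    return False
```

    Under this reading, the noise state is left alone only while the average minimum energy stays within the margin above the current minimum and the timer has not expired.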
  • Patent number: 7337108
    Abstract: An adaptive “temporal audio scaler” is provided for automatically stretching and compressing frames of audio signals received across a packet-based network. Prior to stretching or compressing segments of a current frame, the temporal audio scaler first computes a pitch period for each frame for sizing signal templates used for matching operations in stretching and compressing segments. Further, the temporal audio scaler also determines the type or types of segments comprising each frame. These segment types include “voiced” segments, “unvoiced” segments, and “mixed” segments which include both voiced and unvoiced portions. The stretching or compression methods applied to segments of each frame are then dependent upon the type of segments comprising each frame. Further, the amount of stretching and compression applied to particular segments is automatically variable for minimizing signal artifacts while still ensuring that an overall target stretching or compression ratio is maintained for each frame.
    Type: Grant
    Filed: September 10, 2003
    Date of Patent: February 26, 2008
    Assignee: Microsoft Corporation
    Inventors: Dinei Florencio, Philip Chou, Li-Wei He
  • Publication number: 20080027719
    Abstract: A method for modifying a window with a frame associated with an audio signal is described. A signal is received. The signal is partitioned into a plurality of frames. A determination is made if a frame within the plurality of frames is associated with a non-speech signal. A modified discrete cosine transform (MDCT) window function is applied to the frame to generate a first zero pad region and a second zero pad region if it was determined that the frame is associated with a non-speech signal. The frame is encoded. The decoder window is the same as the encoder window.
    Type: Application
    Filed: February 14, 2007
    Publication date: January 31, 2008
    Inventors: Venkatesh Kirshnan, Ananthapadmanabhan A. Kandhadai
  • Patent number: 7317945
    Abstract: The present invention provides a cochlear stimulation system and method for capturing and translating fine time structure (“FTS”) in incoming sounds and delivering this information spatially to the cochlea. The system comprises a FTS estimator/analyzer and a current navigator. An embodiment of the method comprises analyzing the incoming sounds within a time frequency band, extracting the slowly varying frequency components and estimating the FTS to obtain a more precise dominant FTS component within a frequency band. After adding the fine structure to the carrier to identify a precise dominant FTS component in each analysis frequency band (or stimulation channel), a stimulation current may be “steered” or directed, using the concept of virtual electrodes, to the precise spatial location (place) on the cochlea that corresponds to the dominant FTS component. This process is simultaneously repeated for each stimulation channel and each FTS component.
    Type: Grant
    Filed: November 13, 2003
    Date of Patent: January 8, 2008
    Assignee: Advanced Bionics Corporation
    Inventors: Leonid M. Litvak, David A. Krubsack, Edward H. Overstreet
  • Patent number: 7302385
    Abstract: Provided are a speech restoration system and method for concealing packet losses.
    Type: Grant
    Filed: July 7, 2003
    Date of Patent: November 27, 2007
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Ho Sang Sung, Dae Hwan Hwang, Moon Keun Lee, Ki Seung Lee, Young Cheol Park, Dae Hee Youn
  • Patent number: 7289791
    Abstract: The present invention relates to a mobile set integrating a memory efficient data storage system for the real time recording of voice conversations, data transmission and the like. The data recorder has the capacity to selectively choose the most relevant time frames of a conversation for recording, while discarding time frames that only occupy additional space in memory without holding any conversational data. The invention executes a series of logic steps on each signal including a voice activity detector step, frame comparison step, and sequential recording step. A mobile set having a modified architecture for performing the methods of the present invention is also disclosed.
    Type: Grant
    Filed: August 29, 2003
    Date of Patent: October 30, 2007
    Assignee: Broadcom Corporation
    Inventor: Fei Xie
  • Patent number: 7260541
    Abstract: A decoding device generates frequency spectral data from an inputted encoded audio data stream, and includes: a core decoding unit for decoding the inputted encoded data stream and generating lower frequency spectral data representing an audio signal; and an extended decoding unit for generating, based on the lower frequency spectral data, extended frequency spectral data indicating a harmonic structure, which is the same as an extension along the frequency axis of the harmonic structure indicated by the lower frequency spectral data, in a frequency region which is not represented by the encoded data stream.
    Type: Grant
    Filed: July 11, 2002
    Date of Patent: August 21, 2007
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Mineo Tsushima, Takeshi Norimatsu, Naoya Tanaka, Kosuke Nishio
  • Patent number: 7246058
    Abstract: Systems and methods are provided for detecting voiced and unvoiced speech in acoustic signals having varying levels of background noise. The systems receive acoustic signals at two microphones, and generate difference parameters between the acoustic signals received at each of the two microphones. The difference parameters are representative of the relative difference in signal gain between portions of the received acoustic signals. The systems identify information of the acoustic signals as unvoiced speech when the difference parameters exceed a first threshold, and identify information of the acoustic signals as voiced speech when the difference parameters exceed a second threshold. Further, embodiments of the systems include non-acoustic sensors that receive physiological information to aid in identifying voiced speech.
    Type: Grant
    Filed: May 30, 2002
    Date of Patent: July 17, 2007
    Assignee: Aliph, Inc.
    Inventor: Gregory C. Burnett
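    As a rough illustration of the two-threshold idea in this abstract, the sketch below computes a gain difference between two microphone frames and compares it with two thresholds. The dB energy ratio, the threshold values, the ordering of the thresholds, and the "noise" fallback label are all assumptions for the example; the abstract does not specify how the difference parameters are computed.

```python
import math

def gain_difference_db(frame_a, frame_b, eps=1e-12):
    """Energy ratio of two equally long frames, in dB (hypothetical metric)."""
    energy_a = sum(x * x for x in frame_a)
    energy_b = sum(x * x for x in frame_b)
    return 10.0 * math.log10((energy_a + eps) / (energy_b + eps))

def classify_frame(frame_a, frame_b, unvoiced_thr_db=3.0, voiced_thr_db=6.0):
    """Label a frame pair using two thresholds on the gain difference.

    Assumes the voiced threshold is the higher one, which the abstract
    does not state.
    """
    diff = gain_difference_db(frame_a, frame_b)
    if diff > voiced_thr_db:
        return "voiced"
    if diff > unvoiced_thr_db:
        return "unvoiced"
    return "noise"
```

    The non-acoustic sensors mentioned at the end of the abstract are not modeled here.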
  • Patent number: 7243062
    Abstract: A method (200) and apparatus (100) for segmenting a sequence of audio samples into homogeneous segments (550 and 555) are disclosed. The method (200) forms a sequence of frames (701 to 704) along the sequence of audio samples, and extracts, for each frame, a data feature. The data features form a sequence of data features. Transition points in the sequence of data features are then detected by applying the Bayesian Information Criterion to the sequence of data features. The transition points define the homogeneous segments (550 and 555). Preferably the data feature is single-dimensional and a leptokurtic distribution is used as an event model in the Bayesian Information Criterion.
    Type: Grant
    Filed: October 25, 2002
    Date of Patent: July 10, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventor: Timothy John Wark
  • Patent number: 7228271
    Abstract: The telephone apparatus of the present invention comprises a first voice band expander for generating a voiced signal frequency component by shifting the frequency of the voice signal received, a second voice band expander for generating a voiceless signal frequency component by shifting the frequency of the voice signal received, and a voice composer for composing the voice signal received, the output of the first voice band expander, and the output of the second voice band expander, which is able to output clear voices in aural communication.
    Type: Grant
    Filed: December 23, 2002
    Date of Patent: June 5, 2007
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Toshimichi Tokuda, Takashi Kimura
  • Patent number: 7219065
    Abstract: A sound processor including a microphone (1), a pre-amplifier (2), a bank of N parallel filters (3), means for detecting short-duration transitions in the envelope signal of each filter channel, and means for applying gain to the outputs of these filter channels in which the gain is related to a function of the second-order derivative of the slow-varying envelope signal in each filter channel, to assist in perception of low-intensity short-duration speech features in said signal.
    Type: Grant
    Filed: October 25, 2000
    Date of Patent: May 15, 2007
    Inventors: Andrew E. Vandali, Graeme M. Clark
  • Patent number: 7180892
    Abstract: A signal processing system which discriminates between voice signals and data signals modulated by a voiceband carrier. The signal processing system includes a voice exchange, a data exchange and a call discriminator. The voice exchange is capable of exchanging voice signals between a switched circuit network and a packet based network. The signal processing system also includes a data exchange capable of exchanging data signals modulated by a voiceband carrier on the switched circuit network with unmodulated data signal packets on the packet based network. The data exchange is performed by demodulating data signals from the switched circuit network for transmission on the packet based network, and modulating data signal packets from the packet based network for transmission on the switched circuit network. The call discriminator is used to selectively enable the voice exchange and data exchange.
    Type: Grant
    Filed: September 1, 2000
    Date of Patent: February 20, 2007
    Assignee: Broadcom Corporation
    Inventor: Onur Tackin
  • Patent number: 7177304
    Abstract: Devices, software, and methods for prioritizing between voice data packets for discard decision purposes. A perceptual importance of a voice data packet relative to the others is determined at encoding, preferably according to the content of the encoded sound. The relative importance is represented as a comparative discardability code in the packet. If a discard decision is made, it takes into account the comparative discardability code of the packet, thus preferring to discard the unimportant packets more frequently.
    Type: Grant
    Filed: January 3, 2002
    Date of Patent: February 13, 2007
    Assignee: Cisco Technology, Inc.
    Inventors: Ning Mo, Carlos Laux, Geethgayathri Ramachandran, Chunyan Li
  • Patent number: 7171357
    Abstract: A voice activity detector (100) filters (204) out noise energy and then computes a high-frequency (2400 Hz to 4000 Hz) versus low-frequency (100 Hz to 2400 Hz) signal energy ratio (224), total voiceband (100 Hz to 4000 Hz) signal energy (214), and signal periodicity (208) on successive frames of signal samples. Signal periodicity is determined by estimating the pitch period (206) of the signal, determining a gain value of the signal over the pitch period as a function of the estimated pitch period, and estimating a periodicity of the signal over the pitch period as a function of the estimated pitch period and the gain value.
    Type: Grant
    Filed: March 21, 2001
    Date of Patent: January 30, 2007
    Assignee: Avaya Technology Corp.
    Inventor: Simon Daniel Boland
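    The band-energy features named in this abstract (100 Hz to 2400 Hz, 2400 Hz to 4000 Hz, and their ratio) can be sketched with a plain DFT. This is an illustrative computation only: the 8 kHz sampling rate, the function names, and the direct DFT are assumptions, and the noise filtering and periodicity steps of the abstract are not shown.

```python
import cmath
import math

def band_energy(frame, fs, lo, hi):
    """Sum of squared DFT magnitudes for bins with lo <= frequency < hi."""
    n = len(frame)
    total = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if lo <= freq < hi:
            # Direct DFT of bin k; fine for short frames.
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(frame))
            total += abs(coeff) ** 2
    return total

def band_features(frame, fs=8000):
    """Return (high/low energy ratio, total voiceband energy)."""
    low = band_energy(frame, fs, 100.0, 2400.0)
    # The upper edge is exclusive, so the 4000 Hz Nyquist bin is left out.
    high = band_energy(frame, fs, 2400.0, 4000.0)
    ratio = high / (low + 1e-12)  # small constant avoids division by zero
    return ratio, low + high
```

    A 500 Hz tone yields a ratio near zero and a 3000 Hz tone a very large one, which is the kind of contrast the ratio is meant to expose.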
  • Patent number: 7161905
    Abstract: According to one embodiment of the invention, a method for managing time-sensitive packetized data streams at a receiver includes receiving a time-sensitive packet of a data stream, analyzing an energy level of a payload signal of the packet, and determining whether to drop the packet based on the energy level of the payload signal.
    Type: Grant
    Filed: May 3, 2001
    Date of Patent: January 9, 2007
    Assignee: Cisco Technology, Inc.
    Inventors: Paul S Hahn, Michael E Knappe, Richard A Dunlap, Luke K Surazski
  • Patent number: 7162417
    Abstract: An amplitude altering magnification (r) applied to sub-phoneme units of a voiced portion and an amplitude altering magnification (s) to be applied to sub-phoneme units of an unvoiced portion are determined based upon a target phoneme average power (p0) of synthesized speech and power (p) of a selected phoneme unit. Sub-phoneme units are extracted from a phoneme to be synthesized. From among the extracted sub-phoneme units, a sub-phoneme unit of the voiced portion is multiplied by the amplitude altering magnification (r), and a sub-phoneme unit of the unvoiced portion is multiplied by the amplitude altering magnification (s). Synthesized speech is obtained using the sub-phoneme units thus obtained. This makes it possible to realize power control in which any decline in the quality of synthesized speech is reduced.
    Type: Grant
    Filed: July 13, 2005
    Date of Patent: January 9, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventors: Masayuki Yamada, Yasuhiro Komori, Mitsuru Otsuka
  • Patent number: 7146314
    Abstract: Data handling dynamically responds to changing noise power conditions to separate valid data from noise. A reference power level acts as a threshold between dynamically assumed noise and valid data, and dynamically refers to the reference power level changing adaptively with the background noise. The introduction of dynamic noise control in VOX (Voice Activated Transmission) improves a VOX device operation in a noisy environment, even when the background noise profiles are changing. Processing is on a frame by frame basis for successive frames. The threshold is adaptively changed when a comparison of frame signal power to the threshold indicates speech or the absence of speech in the compared frame repeatedly and continuously for a period of time involving plural successive frames having no valid speech or noise above the threshold to correspondingly reduce or increase the threshold by changing the threshold to a value that is a function of the input signal power.
    Type: Grant
    Filed: December 20, 2001
    Date of Patent: December 5, 2006
    Assignee: Renesas Technology Corporation
    Inventor: Yunbiao Wang
  • Patent number: 7136812
    Abstract: A method and apparatus for the variable rate coding of a speech signal. An input speech signal is classified and an appropriate coding mode is selected based on this classification. For each classification, the coding mode that achieves the lowest bit rate with an acceptable quality of speech reproduction is selected. Low average bit rates are achieved by only employing high fidelity modes (i.e., high bit rate, broadly applicable to different types of speech) during portions of the speech where this fidelity is required for acceptable output. Lower bit rate modes are used during portions of speech where these modes produce acceptable output. Input speech signal is classified into active and inactive regions. Active regions are further classified into voiced, unvoiced, and transient regions. Various coding modes are applied to active speech, depending upon the required level of fidelity. Coding modes may be utilized according to the strengths and weaknesses of each particular mode.
    Type: Grant
    Filed: November 14, 2003
    Date of Patent: November 14, 2006
    Assignee: Qualcomm, Incorporated
    Inventors: Sharath Manjunath, William Gardner
  • Patent number: 7136630
    Abstract: The present invention relates to a mobile set integrating a memory efficient data storage system for the real time recording of voice conversations, data transmission and the like. The data recorder has the capacity to selectively choose the most relevant time frames of a conversation for recording, while discarding time frames that only occupy additional space in memory without holding any conversational data. The invention executes a series of logic steps on each signal including a voice activity detector step, frame comparison step, and sequential recording step. A mobile set having a modified architecture for performing the methods of the present invention is also disclosed.
    Type: Grant
    Filed: December 22, 2000
    Date of Patent: November 14, 2006
    Assignee: Broadcom Corporation
    Inventor: Fei Xie
  • Patent number: 7127392
    Abstract: The present invention is a device for and method of detecting voice activity. First, the AM envelope of a segment of a signal of interest is determined. Next, the number of times the AM envelope crosses a user-definable threshold is determined. If there are no crossings, the segment is identified as non-speech. Next, the number of points on the AM envelope within a user-definable range is determined. If there are fewer than a user-definable number of points within the range, the segment is identified as non-speech. Next, the mean, variance, and power ratio of the normalized spectral content of the AM envelope are found and compared to the same for known speech and non-speech. The segment is identified as being of the same type as the known speech or non-speech to which it most closely compares. These steps are repeated for each signal segment of interest.
    Type: Grant
    Filed: February 12, 2003
    Date of Patent: October 24, 2006
    Assignee: The United States of America as represented by the National Security Agency
    Inventor: David C. Smith
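    The first two screening steps of this abstract (threshold crossings, then points within a range) lend themselves to a short sketch. The function names and the convention of returning None when the segment survives both screens are assumptions for illustration; the spectral comparison against known speech and non-speech is not modeled.

```python
def count_threshold_crossings(envelope, threshold):
    """Count how often consecutive envelope samples straddle the threshold."""
    crossings = 0
    for prev, cur in zip(envelope, envelope[1:]):
        if (prev < threshold) != (cur < threshold):
            crossings += 1
    return crossings

def prescreen_segment(envelope, threshold, range_lo, range_hi, min_points):
    """Apply the two early rejections described in the abstract.

    Returns "non-speech" if either screen rejects the segment, or None
    if the segment would go on to the spectral comparison (hypothetical
    convention).
    """
    if count_threshold_crossings(envelope, threshold) == 0:
        return "non-speech"
    in_range = sum(1 for v in envelope if range_lo <= v <= range_hi)
    if in_range < min_points:
        return "non-speech"
    return None
```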
  • Patent number: 7120576
    Abstract: A method for detecting music in a speech signal having a plurality of frames. The method comprises defining a music threshold value for a first parameter extracted from a frame of the speech signal, defining a background noise threshold value for the first parameter, and defining an unsure threshold value for the first parameter. The unsure threshold value falls between the music threshold value and the background noise threshold value. If the first parameter falls between the music threshold value and the background noise threshold value, the speech signal is classified as music or background noise based on analyzing a plurality of first parameters extracted from the plurality of frames.
    Type: Grant
    Filed: November 4, 2004
    Date of Patent: October 10, 2006
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
  • Patent number: 7120575
    Abstract: A digitized speech signal (600) is input to an F0 (fundamental frequency) processor that computes (610) a continuous F0 data from the speech signal. By the criterion voicing state transition (voiced/unvoiced transitions) the speech signal is presegmented (620) into segments. For each segment (630) it is evaluated (640) whether F0 is defined or not defined i.e. whether F0 is ON or OFF. In case of F0=OFF a candidate segment boundary is assumed as described above and, starting from that boundary, prosodic features are computed (650). The feature values are input into a classification tree and each candidate segment is classified thereby revealing, as a result, the existence or non-existence of a semantic or syntactic speech unit.
    Type: Grant
    Filed: August 2, 2001
    Date of Patent: October 10, 2006
    Assignee: International Business Machines Corporation
    Inventors: Martin Haase, Werner Kriechbaum, Gerhard Stenzel
  • Patent number: 7117150
    Abstract: A first filter (2061 in FIG. 1) calculates a long-time average of first change quantities based on a difference between a line spectral frequency of an input voice signal and a long-time average thereof. A second filter (2062 in FIG. 1) calculates a long-time average of second change quantities based on a difference between a whole band energy of the input voice signal and a long-time average thereof. A third filter (2063 in FIG. 1) calculates a long-time average of third change quantities based on a difference between a low band energy of the input voice signal and a long-time average thereof. A fourth filter (2064 in FIG. 1) calculates a long-time average of fourth change quantities based on a difference between a zero cross number of the input voice signal and a long-time average thereof. A voice/non-voice determining circuit (1040 in FIG.
    Type: Grant
    Filed: May 31, 2001
    Date of Patent: October 3, 2006
    Assignee: NEC Corporation
    Inventor: Atsushi Murashima
  • Patent number: 7080008
    Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.
    Type: Grant
    Filed: May 11, 2004
    Date of Patent: July 18, 2006
    Assignee: Microsoft Corporation
    Inventors: Hao Jiang, Hong-Jiang Zhang
  • Patent number: 7072833
    Abstract: A system is provided for detecting the presence of speech within an input audio signal. The system includes a memory for storing a predetermined function which gives, for a given set of audio signal values, a probability density for parameters of a predetermined speech model which is assumed to have generated the set of audio signal values, the probability density defining, for a given set of model parameter values, the probability that the predetermined speech model has those parameter values given that the speech model is assumed to have generated the set of audio signal values. The system applies a current set of received signal values to the stored probability density function and then draws samples from it using a Gibbs sampler. The system then analyses the samples to determine a set of parameter values representative of the audio signal. The system then uses these parameter values to determine whether or not speech is present within the audio signals.
    Type: Grant
    Filed: May 30, 2001
    Date of Patent: July 4, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventor: Jebu Jacob Rajan
  • Patent number: 7065338
    Abstract: In coding and decoding an acoustic parameter, a weighted vector is generated by multiplying a code vector output in a past frame and a code vector selected in a present frame by weighting factors respectively selected from a factor code book and adding the products to each other.
    Type: Grant
    Filed: November 27, 2001
    Date of Patent: June 20, 2006
    Assignees: Nippon Telegraph and Telephone Corporation, Matsushita Electric Industrial Co., Ltd.
    Inventors: Kazunori Mano, Yusuke Hiwasaki, Hiroyuki Ehara, Kazutoshi Yasunaga
  • Patent number: 7058568
    Abstract: The type of audio stored in the payload of a data packet transmitted over a data network is identified as speech audio or non-speech audio through the use of a non-speech identifier included in a header in the data packet. Upon detection of a data packet containing non-speech audio, the receiver of the data packet may modify jitter buffer latency while the non-speech audio is being received. Modifying the jitter buffer latency while non-speech audio is being received minimizes the loss of spoken words during jitter buffer latency modification.
    Type: Grant
    Filed: January 18, 2000
    Date of Patent: June 6, 2006
    Assignee: Cisco Technology, Inc.
    Inventor: Gary M. Lewis
  • Patent number: 7043428
    Abstract: A method of initializing an ITU Recommendation G.729 Annex B compliant voice activity detection (VAD) device is disclosed, having the steps of (1) determining a first set of running average background noise characteristics in accordance with Recommendation G.729B; (2) determining a second set of running average background noise characteristics; and (3) substituting the second set of running average background noise characteristics for the first set when a specific event occurs. The specific event is a divergence between the first and second sets of running average background noise characteristics.
    Type: Grant
    Filed: August 3, 2001
    Date of Patent: May 9, 2006
    Assignee: Texas Instruments Incorporated
    Inventor: Dunling Li
  • Patent number: 7035793
    Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.
    Type: Grant
    Filed: October 27, 2004
    Date of Patent: April 25, 2006
    Assignee: Microsoft Corporation
    Inventors: Hao Jiang, Hongjiang Zhang
  • Patent number: 7024353
    Abstract: In a distributed voice recognition system, a back-end pattern matching unit 27 can be informed of voice activity detection information as developed through use of a back-end voice activity detector 25. Although no specific voice activity detection information is developed or forwarded by the front-end of the system, precursor information as developed at the back-end can be used by the voice activity detector to nevertheless ascertain with relative accuracy the presence or absence of voice in a given set of corresponding voice recognition features as developed by the front-end of the system.
    Type: Grant
    Filed: August 9, 2002
    Date of Patent: April 4, 2006
    Assignee: Motorola, Inc.
    Inventor: Tenkasi Ramabadran
  • Patent number: 7016834
    Abstract: In general, this invention concerns speech encoding and decoding used in digital radio systems and a method by which the processing capacity required can be reduced in a telecommunication system using discontinuous transmission between a transmitter and receiver. In particular, the method according to the invention is used to match two telecommunication systems using different encoding methods between the transmitter and receiver. In the method, the signals transmitted by the transmitter are made suitable for the receiver in the signal path so that in the first step, at least one information parameter comprising at least two content identifiers is formed for each data frame of the data parameters (101) received. In the next step, data corresponding to the original data is synthesized from the data parameters (101) of the received frames, after which the synthesized data is transmitted for recoding with an encoding method suitable for the receiver.
    Type: Grant
    Filed: July 14, 2000
    Date of Patent: March 21, 2006
    Assignee: Nokia Corporation
    Inventor: Ari Lakaniemi
  • Patent number: 7016832
    Abstract: A voiced/unvoiced information estimation system uses an input spectrum and a synthetic spectrum to produce a voicing level spectrum. The estimation system uses a spectrum difference calculation unit to normalize a spectrum difference energy for each harmonic band, and further uses a voicing level calculation unit to calculate a voicing level. The voicing level of each harmonic band has a continuous value between 1 and 0. The estimation system is effective in vector quantization of voiced/unvoiced information at a low bit rate. Because it is unnecessary to calculate a threshold for deciding voiced/unvoiced information, a decision anomaly occurring due to the threshold is eliminated, and the accuracy of a voicing level is improved. Furthermore, since a spectrum is represented by mixing a voiced element and an unvoiced element in a harmonic band, the estimation system improves the audio quality of a combined sound.
    Type: Grant
    Filed: July 3, 2001
    Date of Patent: March 21, 2006
    Assignee: LG Electronics, Inc.
    Inventor: Yong Soo Choi
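    One way to picture a continuous per-band voicing level is to normalize the squared difference between the input and synthetic spectra by the input energy in each harmonic band. The mapping used below (one minus the normalized error, clipped at zero) is an assumption chosen for the example, as are the function name and the band-edge representation; the abstract does not give the actual formula.

```python
def voicing_levels(input_mag, synth_mag, band_edges):
    """Per-band voicing level in [0, 1] from two magnitude spectra.

    band_edges is a list of (start, stop) bin index pairs, one per
    harmonic band (hypothetical representation).
    """
    levels = []
    for start, stop in band_edges:
        band_in = input_mag[start:stop]
        band_syn = synth_mag[start:stop]
        error = sum((s - t) ** 2 for s, t in zip(band_in, band_syn))
        energy = sum(s * s for s in band_in)
        # Normalized difference energy; an empty band counts as unvoiced.
        normalized = error / energy if energy > 0 else 1.0
        levels.append(max(0.0, 1.0 - normalized))
    return levels
```

    A band where the synthetic spectrum matches the input exactly gets level 1, and a band where the synthetic spectrum is empty gets level 0, with no threshold decision in between.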
  • Patent number: 6999920
    Abstract: Method for the reduction of echo and/or noise signals in TK systems for the transmission of useful acoustic signals, in which, when a silence interval is present, the distorted useful signal is modified by a time-dependent control signal ao(t) or by a control signal ao(k) cycled in the rhythm of a scan rate fT=1/T. The control signal ao(k) is varied in such manner that, during the presence of speech signals in the useful signals, the amplitude of the control signal ao(k) is set to a predetermined constant value co and, when a silence interval begins, the amplitude of the control signal ao(k) is reduced continuously from one sample value to the next in accordance with the recurrence formula ao(k+1)=ao(k).? with ?<1. After the end of the silence interval, ao(k) is again set equal to co.
    Type: Grant
    Filed: November 21, 2000
    Date of Patent: February 14, 2006
    Assignee: Alcatel
    Inventors: Hans-Jürgen Matt, Michael Walker, Michael Maurer
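The control-signal rule in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the function name `control_signal`, the parameter names `c0` and `decay`, and the per-sample speech flags are hypothetical. The abstract's attenuation factor (its symbol is lost in the text) is represented by `decay`, and the sketch assumes the first silent sample is already attenuated.

```python
def control_signal(speech_flags, c0=1.0, decay=0.9):
    """Return the control-signal amplitude for each sample.

    speech_flags: iterable of booleans, True where speech is present
                  (hypothetical input; the abstract does not say how
                  speech presence is detected).
    c0:           constant amplitude used while speech is present.
    decay:        multiplicative factor, must be < 1 (assumed name).
    """
    if not decay < 1.0:
        raise ValueError("decay must be smaller than 1")
    amplitude = c0
    out = []
    for is_speech in speech_flags:
        if is_speech:
            # Speech present (or silence just ended): reset to c0.
            amplitude = c0
        else:
            # Silence: apply the recurrence a(k+1) = a(k) * decay.
            amplitude = amplitude * decay
        out.append(amplitude)
    return out
```

For example, `control_signal([True, False, False, True], c0=1.0, decay=0.5)` yields an amplitude that halves on each silent sample and returns to `c0` when speech resumes.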
  • Patent number: 6947888
    Abstract: A low-bit-rate coding technique for unvoiced segments of speech, without loss of quality compared to the conventional Code Excited Linear Prediction (CELP) method operating at a much higher bit rate. A set of gains are derived from a residual signal after whitening the speech signal by a linear prediction filter. These gains are then quantized and applied to a randomly generated sparse excitation. The excitation is filtered, and its spectral characteristics are analyzed and compared to the spectral characteristics of the original residual signal. Based on this analysis, a filter is chosen to shape the spectral characteristics of the excitation to achieve optimal performance.
    Type: Grant
    Filed: October 17, 2000
    Date of Patent: September 20, 2005
    Assignee: Qualcomm Incorporated
    Inventor: Pengjun Huang
  • Patent number: 6915257
    Abstract: This invention presents a voicing determination algorithm for classification of a speech signal segment as voiced or unvoiced. The algorithm is based on a normalized autocorrelation where the length of the window is proportional to the pitch period. The speech segment to be classified is further divided into a number of sub-segments, and the normalized autocorrelation is calculated for each sub-segment. If a certain number of the normalized autocorrelation values are above a predetermined threshold, the speech segment is classified as voiced. To improve the performance of the voicing determination algorithm in unvoiced-to-voiced transients, the normalized autocorrelations of the last sub-segments are emphasized. The performance of the voicing decision algorithm can be further enhanced by also using lookahead information when it is available.
    Type: Grant
    Filed: December 21, 2000
    Date of Patent: July 5, 2005
    Assignee: Nokia Mobile Phones Limited
    Inventors: Ari Heikkinen, Samuli Pietila, Vesa Ruoppila
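A rough Python sketch of the voicing decision described in the abstract above follows. It is an illustration only, with several assumptions: the function names, the default values (`n_sub`, `window_factor`, `threshold`, `min_count`) and the even spacing of sub-segments are hypothetical, the pitch period is taken as already known, and the abstract's extra emphasis on the last sub-segments and its use of lookahead are omitted.

```python
import math


def normalized_autocorrelation(x, start, window, lag):
    """Normalized correlation between x[start:start+window] and the
    same-length window located `lag` samples earlier."""
    cur = x[start:start + window]
    past = x[start - lag:start - lag + window]
    num = sum(c * p for c, p in zip(cur, past))
    den = math.sqrt(sum(c * c for c in cur) * sum(p * p for p in past))
    return num / den if den > 0.0 else 0.0


def is_voiced(x, pitch_period, n_sub=4, window_factor=1.0,
              threshold=0.5, min_count=2):
    """Classify segment x as voiced if at least `min_count` sub-segments
    have a normalized autocorrelation above `threshold`."""
    # Window length proportional to the pitch period, as in the abstract.
    window = max(1, int(round(window_factor * pitch_period)))
    first = pitch_period          # earliest start with enough history
    last = len(x) - window        # latest start that still fits
    if last < first:
        raise ValueError("segment too short for this pitch period")
    # Assumption: sub-segments are spread evenly over the segment.
    step = (last - first) / max(1, n_sub - 1)
    starts = [first + int(i * step) for i in range(n_sub)]
    values = [normalized_autocorrelation(x, s, window, pitch_period)
              for s in starts]
    return sum(v > threshold for v in values) >= min_count
```

With a sine wave of period 40 samples, `is_voiced(signal, 40)` classifies the segment as voiced, while a mismatched lag of 20 samples gives a correlation near -1 and an unvoiced decision.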
  • Patent number: 6915256
    Abstract: A system, method and computer readable medium for quantizing pitch information of audio is disclosed. The method includes capturing audio representing a numbered frame of a plurality of numbered frames. The method further includes calculating a class of the frame, wherein a class is any one of a voiced or unvoiced class. If the frame is a voiced class, a pitch is calculated for the frame. If the frame is an even numbered frame and a voiced class, a codeword of a first length is calculated by absolutely quantizing the frame pitch. If the frame is an odd numbered frame and a voiced class and a reliable frame is available, a codeword of a second length is calculated by differentially quantizing the frame pitch. If there is no reliable frame available, a codeword of the second length is calculated by absolutely quantizing the frame pitch.
    Type: Grant
    Filed: February 7, 2003
    Date of Patent: July 5, 2005
    Assignees: Motorola, Inc., International Business Machines Corporation
    Inventors: Tenkasi V. Ramabadran, Alexander Sorin
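The codeword selection rules in the abstract above amount to a small decision table, sketched here in Python. The function name and the codeword lengths (`first_len=7`, `second_len=5`) are hypothetical placeholders, since the abstract gives no bit counts. Returning `None` for unvoiced frames is also an assumption, because the abstract does not say how they are coded.

```python
def pitch_codeword_plan(frame_number, is_voiced, reliable_frame_available,
                        first_len=7, second_len=5):
    """Return (quantization_mode, codeword_length) for a frame.

    first_len and second_len are assumed values for illustration only.
    """
    if not is_voiced:
        # Assumption: no pitch codeword is planned for unvoiced frames.
        return None
    if frame_number % 2 == 0:
        # Even-numbered voiced frame: absolute quantization, first length.
        return ("absolute", first_len)
    if reliable_frame_available:
        # Odd-numbered voiced frame with a reliable reference frame.
        return ("differential", second_len)
    # Odd-numbered voiced frame without a reliable reference frame.
    return ("absolute", second_len)
```

Keeping the decision separate from the quantizers themselves makes the three cases in the abstract easy to check one by one.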
  • Patent number: 6912495
    Abstract: An improved speech model and methods for estimating the model parameters, synthesizing speech from the parameters, and quantizing the parameters are disclosed. The improved speech model allows a time and frequency dependent mixture of quasi-periodic, noise-like, and pulse-like signals. For pulsed parameter estimation, an error criterion with reduced sensitivity to time shifts is used to reduce computation and improve performance. Pulsed parameter estimation performance is further improved using the estimated voiced strength parameter to reduce the weighting of frequency bands which are strongly voiced when estimating the pulsed parameters. The voiced, unvoiced, and pulsed strength parameters are quantized using a weighted vector quantization method using a novel error criterion for obtaining high quality quantization. The fundamental frequency and pulse position parameters are efficiently quantized based on the quantized strength parameters.
    Type: Grant
    Filed: November 20, 2001
    Date of Patent: June 28, 2005
    Assignee: Digital Voice Systems, Inc.
    Inventors: Daniel W. Griffin, John C. Hardwick
  • Patent number: 6901362
    Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.
    Type: Grant
    Filed: April 19, 2000
    Date of Patent: May 31, 2005
    Assignee: Microsoft Corporation
    Inventors: Hao Jiang, Hongjiang Zhang
  • Patent number: 6889186
    Abstract: A system for processing a speech signal to enhance signal intelligibility identifies portions of the speech signal that include sounds that typically present intelligibility problems and modifies those portions in an appropriate manner. First, the speech signal is divided into a plurality of time-based frames. Each of the frames is then analyzed to determine a sound type associated with the frame. Selected frames are then modified based on the sound type associated with the frame or with surrounding frames. For example, the amplitude of frames determined to include unvoiced plosive sounds may be boosted as these sounds are known to be important to intelligibility and are typically harder to hear than other sounds in normal speech. In a similar manner, the amplitudes of frames preceding such unvoiced plosive sounds can be reduced to better accentuate the plosive. Such techniques will make these sounds easier to distinguish upon subsequent playback.
    Type: Grant
    Filed: June 1, 2000
    Date of Patent: May 3, 2005
    Assignee: Avaya Technology Corp.
    Inventor: Paul Roller Michaelis
  • Patent number: 6850882
    Abstract: A method of and device for the diagnosis and treatment of speech dynamically measures the functioning of the velum in the control of nasality during speech. Various components of oral and nasal airflow are separated and selectively analyzed including (i) the fundamental frequency component of each airflow during voiced speech, (ii) a plurality of voice components that cover a frequency range encompassing at least the lowest vocal tract resonance (the first formant), and (iii) the subsonic and infrasonic components of at least the nasal airflow. By comparing the nasal and oral airflow components at the voice fundamental frequency, a nasalization measure for voiced speech sounds is formed which emulates methods that compare low frequency nasal and oral airflow during voiced speech, while eliminating or greatly reducing the problems associated with comparing these low frequency airflows, and which improves upon previous methods based on measuring and comparing nasal and oral radiated sound pressure.
    Type: Grant
    Filed: October 23, 2000
    Date of Patent: February 1, 2005
    Inventor: Martin Rothenberg
  • Patent number: 6826600
    Abstract: Mechanisms and techniques allow computer systems to create and exchange uniquely identified shared objects. Using this invention, a client computer system can operate client software to generate local object definitions in a local object specification. To assure that the local object definitions created by the client are uniquely identifiable by this client, as well as by a server and possibly other clients which may require access to such object definitions (e.g., other clients in a collaboration software system), the invention allows the client to send the local object specification to the server for unique identification of the object definitions. The server receives the local object specification containing the local object definitions created by the client and can convert each local object definition within the local object specification to a global object definition in a global object specification.
    Type: Grant
    Filed: November 2, 2000
    Date of Patent: November 30, 2004
    Assignee: Cisco Technology, Inc.
    Inventor: Paul J. Russell
  • Patent number: 6816832
    Abstract: A comfort noise block that includes a hangover period and comfort noise parameters is transmitted in such a manner that it is not interrupted by other messages, such as FACCH messages. This is accomplished in a mobile station by a determination of whether any FACCH messages are required to be transmitted. If such FACCH messages exist, a further determination may be made as to which transmission can be made in the shortest time (i.e., the FACCH message or messages or the comfort noise parameters message), and this transmission is made first. In any event, the comfort noise parameters block is transmitted without interruption. In a further embodiment of this invention, the comfort noise parameters message is transmitted by being concatenated with another message, such as a neighbor channel measurement results message, so as to reduce overhead, conserve bandwidth, and reduce power consumption.
    Type: Grant
    Filed: June 11, 2001
    Date of Patent: November 9, 2004
    Assignee: Nokia Corporation
    Inventors: Seppo Alanara, Pekka Kapanen
  • Publication number: 20040220803
    Abstract: A voice channel data processor 207 and corresponding method 600 operable in a wireless communications unit's 200 receiver and transmitter to facilitate data transmission on a voice channel includes an encoder 301 for encoding data traffic as a transmit voice frame having a predetermined vocoder parameter and inserting the transmit voice frame into a stream of transmit voice frames with voice traffic and further includes a decoder 303 for parsing a stream of received voice frames to obtain a vocoder parameter for each, comparing the vocoder parameter for each received frame to the predetermined vocoder parameter, routing the received voice frame for processing as data traffic when the comparison is favorable, and otherwise routing the received voice frame for processing as voice traffic.
    Type: Application
    Filed: April 30, 2003
    Publication date: November 4, 2004
    Applicant: MOTOROLA, INC.
    Inventors: Gordon W. Chiu, Daniel J. Landron, Vincent Vigna, Chin P. Wong, David R. Heeschen
  • Patent number: 6810377
    Abstract: A lost frame recovery technique for LPC-based systems employs interpolation of parameters from previous and subsequent good frames, selective attenuation of frame energy when the energy of a subframe exceeds a threshold, and energy tapering in the presence of multiple successive lost frames.
    Type: Grant
    Filed: June 19, 1998
    Date of Patent: October 26, 2004
    Assignee: Comsat Corporation
    Inventors: Grant Ian Ho, Marion Baraniecki, Suat Yeldener
  • Patent number: 6804646
    Abstract: A method and an apparatus for processing a sound signal in which a useful signal and an interference signal are specified, the sound signal being transformed into the frequency domain and a change in the profile of the frequency being represented by an envelope for at least one frequency over a time. By segmenting the envelope, a maximum is obtained for each segment, the smallest maximum, weighted by a factor, being subtracted from the sound signal. It is also possible to take account of the minimum for the purpose of reducing the interference signal.
    Type: Grant
    Filed: September 19, 2000
    Date of Patent: October 12, 2004
    Assignee: Siemens Aktiengesellschaft
    Inventor: Tobias Schneider
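The envelope step in the abstract above (segment the envelope, take the maximum of each segment, subtract the weighted smallest maximum) can be sketched in Python as follows. This is a simplified illustration, not the patented method: the function name is hypothetical, the sketch works on the time envelope of a single frequency, the last segment absorbs any remainder, and clamping the result at zero is an added assumption.

```python
def subtract_smallest_maximum(envelope, n_segments, weight):
    """Subtract weight * (smallest per-segment maximum) from an envelope.

    envelope:   magnitudes of one frequency over time (assumed input).
    n_segments: number of segments to split the envelope into.
    weight:     weighting factor applied to the smallest maximum.
    """
    size = len(envelope) // n_segments
    if n_segments < 1 or size < 1:
        raise ValueError("envelope too short for the requested segments")
    # Split into equal segments; the last one takes any remainder.
    segments = [envelope[i * size:(i + 1) * size]
                for i in range(n_segments - 1)]
    segments.append(envelope[(n_segments - 1) * size:])
    # The smallest of the segment maxima serves as the interference level.
    smallest_max = min(max(seg) for seg in segments)
    offset = weight * smallest_max
    # Assumption: magnitudes are clamped at zero after subtraction.
    return [max(0.0, v - offset) for v in envelope]
```

For the envelope `[1, 3, 2, 5, 4, 2]` split into three segments, the segment maxima are 3, 5 and 4, so with a weight of 0.5 the value 1.5 is subtracted from every sample.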
  • Patent number: 6799161
    Abstract: A speech coding apparatus having a speech input unit for receiving input speech, a speech coding rate selector for selecting an appropriate speech coding rate according to the power of the input speech, a speech analyzer for processing the input speech to estimate a transfer function of the speaker's oral cavity, and a speech coding unit forming a synthesis filter based on the transfer function of the oral cavity. The speech coding unit also codes an excitation signal of the synthesis filter on the basis of an estimation result supplied by the speech analyzer. A gain suppressor interposed between the speech input unit and the speech coding unit suppresses the gain of a signal supplied from the speech input unit to the speech coding unit during an unvoiced period according to information from the speech coding rate selector.
    Type: Grant
    Filed: January 15, 2002
    Date of Patent: September 28, 2004
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Atsushi Yokoyama
  • Patent number: 6792405
    Abstract: A feature extraction process for use in a wireless communication system provides automatic speech recognition based on both spectral envelope and voicing information. The shape of the spectral envelope is used to determine the LSPs of the incoming bitstream and the adaptive gain coefficients and fixed gain coefficients are used to generate the “voiced” and “unvoiced” feature parameter information.
    Type: Grant
    Filed: December 5, 2000
    Date of Patent: September 14, 2004
    Assignee: AT&T Corp.
    Inventors: Richard Vandervoort Cox, Hong Kook Kim
  • Publication number: 20040153316
    Abstract: First encoded voice bits are transcoded into second encoded voice bits by dividing the first encoded voice bits into one or more received frames, with each received frame containing multiple ones of the first encoded voice bits. First parameter bits for at least one of the received frames are generated by applying error control decoding to one or more of the encoded voice bits contained in the received frame, speech parameters are computed from the first parameter bits, and the speech parameters are quantized to produce second parameter bits. Finally, a transmission frame is formed by applying error control encoding to one or more of the second parameter bits, and the transmission frame is included in the second encoded voice bits.
    Type: Application
    Filed: January 30, 2003
    Publication date: August 5, 2004
    Inventor: John C. Hardwick