Voiced Or Unvoiced Patents (Class 704/214)
  • Patent number: 6754620
    Abstract: A system and method is provided for rendering data indicative of delays associated with enabling and/or disabling an analog-to-digital conversion system employed by a telephony communication network. The system of the present invention utilizes a display device and an interface manager. The interface manager receives data indicative of power levels at various frequencies and times of signals received by a transceiver that is communicating via the conventional telephony communication network. The interface manager then renders a graphical display via the display device based on the received data. The graphical display may include clusters, in which each of the clusters is associated with a particular range of power levels. By analyzing the clusters, a user can determine the delays associated with enabling and/or disabling the analog-to-digital conversion system. The graphical display may also include indicators that may be used to determine the foregoing delays.
    Type: Grant
    Filed: March 29, 2000
    Date of Patent: June 22, 2004
    Assignee: Agilent Technologies, Inc.
    Inventor: Samuel M Bauer
  • Patent number: 6707869
    Abstract: A filter to apply a window function to a digital signal is provided. The filter has a memory for storing a basic set of values representing a single window. An adapter can generate from this basic set a plurality of adapted sets of values, where the adapted sets of values define window functions having different window sizes. The adapter has an input for receiving a control signal that allows the adapter to select the proper adapted set to suit the digital signal being processed. The application of the window function is effected on successive frames of the digital signal by using the adapted set of values generated by the adapter in response to the control signal. The filter has VAD applications, among others.
    Type: Grant
    Filed: December 28, 2000
    Date of Patent: March 16, 2004
    Assignee: Nortel Networks Limited
    Inventor: Shude Zhang
  • Publication number: 20040039566
    Abstract: This disclosure is directed to techniques for condensed voice buffering, transmission and playback. The techniques may involve identification of encoded voice frames as either speech or a pause, and selective exclusion of a portion of the frames for storage, transmission or playback based on the identification. In this manner, the techniques are capable of condensing a series of encoded voice frames. When variable rate coding is employed, a pause frame may be identified, for example, based on a threshold comparison for the rate of the encoded frame. In some cases, the techniques may involve excluding only a portion of the identified frames from a consecutive sequence of the identified frames, thereby preserving a minimum number of the identified frames needed for intelligible conversation.
    Type: Application
    Filed: August 29, 2002
    Publication date: February 26, 2004
    Inventors: James A. Hutchison, Sun Tam
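The condensing idea in publication 20040039566 can be sketched as follows. This is a minimal illustration, not the applicants' implementation: the `(rate, payload)` frame layout, the `pause_rate_threshold` value, and the `keep_per_pause` parameter are all assumptions made for the example.

```python
def condense_frames(frames, pause_rate_threshold=1, keep_per_pause=2):
    """Drop surplus pause frames from a sequence of encoded voice frames.

    frames: list of (rate, payload) tuples (hypothetical layout).
    A frame whose rate is at or below pause_rate_threshold is treated as
    a pause. Only the first keep_per_pause frames of each consecutive
    pause run are kept, so short gaps between words survive.
    """
    kept = []
    run = 0  # length of the current consecutive run of pause frames
    for rate, payload in frames:
        if rate <= pause_rate_threshold:
            run += 1
            if run > keep_per_pause:
                continue  # surplus pause frame: exclude it
        else:
            run = 0  # a speech frame ends the pause run
        kept.append((rate, payload))
    return kept
```

Keeping a fixed number of frames per pause is one simple way to preserve a minimum gap; the abstract does not say how that minimum is chosen.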
  • Patent number: 6697776
    Abstract: A digitized signal detection system in which the encoding bit rate is changed dynamically to provide encoding for different signal types and formats at bit rates optimized to properly reconstruct the input signal, whether speech or non-speech, so that signals of different character can be transferred on a frame-by-frame basis. A change of encoding format can make the system a speech or music recognizer, depending on what is to be listened for. The system has three basic components: a recognizer, which categorizes the type of input signal; an evaluator, which evaluates the category of quality of the reconstructed signal; and a recommender, which, based on that quality, makes a recommendation to change standards so that the received signals are encoded pursuant to a standard that provides improved quality. The dynamic signal detector receives the input signal directly and extracts the parameters for evaluation. These parameters are tested, and a determination is made whether a switch of standards is required to improve the reconstructed signal.
    Type: Grant
    Filed: July 31, 2000
    Date of Patent: February 24, 2004
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Gilles G. Fayad, Huan-Yu Su
  • Patent number: 6691084
    Abstract: A method and apparatus for the variable rate coding of a speech signal. An input speech signal is classified and an appropriate coding mode is selected based on this classification. For each classification, the coding mode that achieves the lowest bit rate with an acceptable quality of speech reproduction is selected. Low average bit rates are achieved by only employing high fidelity modes (i.e., high bit rate, broadly applicable to different types of speech) during portions of the speech where this fidelity is required for acceptable output. Lower bit rate modes are used during portions of speech where these modes produce acceptable output. Input speech signal is classified into active and inactive regions. Active regions are further classified into voiced, unvoiced, and transient regions. Various coding modes are applied to active speech, depending upon the required level of fidelity. Coding modes may be utilized according to the strengths and weaknesses of each particular mode.
    Type: Grant
    Filed: December 21, 1998
    Date of Patent: February 10, 2004
    Assignee: Qualcomm Incorporated
    Inventors: Sharath Manjunath, William Gardner
  • Patent number: 6691085
    Abstract: A method and system for encoding and decoding an input signal, wherein the input signal is divided into a higher frequency band and a lower frequency band in the encoding and decoding processes, and wherein the decoding of the higher frequency band is carried out by using an artificial signal along with speech related parameters obtained from the lower frequency band. In particular, the artificial signal is scaled before it is transformed into an artificial wideband signal containing colored noise in both the lower and the higher frequency band. Additionally, voice activity information is used to define speech periods and non-speech periods of the input signal. Based on the voice activity information, different weighting factors are used to scale the artificial signal in speech periods and non-speech periods.
    Type: Grant
    Filed: October 18, 2000
    Date of Patent: February 10, 2004
    Assignee: Nokia Mobile Phones Ltd.
    Inventors: Jani Rotola-Pukkila, Hannu Mikkola, Janne Vainio
  • Patent number: 6691081
    Abstract: A digital signal processor for processing data including voice messaging data that may have both voiced and unvoiced speech components utilizes computer routines stored in a memory used by the digital signal processor. The computer routines programmed provide for control of at least a portion of a selective call receiver; receiving and decoding data received at the selective call receiver; comparing the addresses received at the selective call receiver with addresses stored in a memory location coupled to the digital signal processor; controlling voicing including both voiced and unvoiced speech components; and generating a pitch wave using an inverse discrete Fourier Transform and resample the pitch wave to provide a time domain voiced speech component.
    Type: Grant
    Filed: April 28, 2000
    Date of Patent: February 10, 2004
    Assignee: Motorola, Inc.
    Inventors: Jian-Cheng Huang, Kenneth D. Finlon, Floyd D. Simpson
  • Patent number: 6681202
    Abstract: The invention describes a system that generates a wide band signal (100-7000 Hz) from a telephony band (or narrow band: 300-3400 Hz) speech signal to obtain an extended band speech signal (100-3400 Hz). This technique is particularly advantageous since it increases signal naturalness and listening comfort while keeping compatibility with all current telephony systems. The described technique is inspired by Linear Predictive speech coders. The speech signal is thus split into a spectral envelope and a short-term residual signal. Both signals are extended separately and recombined to create an extended band signal.
    Type: Grant
    Filed: November 13, 2000
    Date of Patent: January 20, 2004
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Giles Miet, Andy Gerrits
  • Publication number: 20040006462
    Abstract: A system and method of determining whether a receiver in active (non-DTX) mode should remain in active (non-DTX) mode or switch to inactive (DTX) mode and vice versa. For switching from non-DTX to DTX mode in a receiver, a received AMR frame is subjected to a SID_FIRST marker comparison. If the results of the SID_FIRST marker comparison exceed a SID_FIRST threshold, then the received AMR frame is processed as a SID_FIRST frame and the receiver is switched to DTX mode. Otherwise, the received AMR frame is subjected to a SID_UPDATE marker comparison. If the results of the SID_UPDATE marker comparison exceed a SID_UPDATE threshold, then the received AMR frame is processed as a SID_UPDATE frame and the receiver is switched to DTX mode. Otherwise, the received AMR frame is processed as a voice frame in non-DTX mode. For switching from DTX to non-DTX mode in a receiver, a received AMR frame in DTX mode is subjected to an ONSET frame comparison.
    Type: Application
    Filed: July 3, 2002
    Publication date: January 8, 2004
    Inventor: Phillip Marc Johnson
  • Patent number: 6662153
    Abstract: A time-separated speech coder that codes a transitional signal of voiced/unvoiced sound through harmonic speech coding, the coder including a transitional excitation signal analyzer/synthesizer for coding the transitional signal by extracting the harmonic model parameters of both transitional analyzers after detecting a transitional point and generating sinusoidal waveforms according to a variable transitional point separating both transitional analyzers. By the transitional point at which energy varies abruptly and the time-separated coding based on the transitional point, more improved speech quality than in the general harmonic speech coder can be obtained using the time-separated speech coder by increasing the representation capability of the transitional signal with large energy variation, after adapting it to the variable transitional point.
    Type: Grant
    Filed: January 24, 2001
    Date of Patent: December 9, 2003
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Hyoung Jung Kim, In Sung Lee, Jong Hark Kim, Man Ho Park, Byung Sik Yoon, Song In Choi, Dae Sik Kim
  • Patent number: 6658378
    Abstract: In a low bitrate speech encoding system, encoded bits are strongly protected against errors produced on a transmission path. A decoding side checks the transmission errors using an error check code appended to a convolution decoded output and adjusts the decoding output depending on the results of check of transmission errors. At this time, it is necessary to maintain continuity of speech signals after speech decoding. To this end, a convolution decoder 16 convolution decodes the convolution coded output from the encoding device to provide a convolution decoded output of a crucial bit set with the appended error check code and a bit set excluding the crucial bit set. A CRC code comparator-frame masking unit 15 compares the CRC check code appended to the convolution decoded output from the convolution decoder 16 to the CRC check code computed from the bit group excluding the crucial bit set to adjust the convolution decoded output.
    Type: Grant
    Filed: June 16, 2000
    Date of Patent: December 2, 2003
    Assignee: Sony Corporation
    Inventor: Yuuji Maeda
  • Patent number: 6647280
    Abstract: A signal processing method, preferably for extracting a fundamental period from a noisy, low-frequency signal, is disclosed. The signal processing method generally comprises calculating a numerical transform for a number of selected periods by multiplying signal data by discrete points of a sine and a cosine wave of varying period and summing the results. The period of the sine and cosine waves are preferably selected to have a period substantially equivalent to the period of interest when performing the transform.
    Type: Grant
    Filed: January 14, 2002
    Date of Patent: November 11, 2003
    Assignee: OB Scientific, Inc.
    Inventors: Dennis E. Bahr, James L. Reuss
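The transform in patent 6647280 multiplies the signal by sine and cosine waves of a chosen period and sums the results. A rough sketch, assuming the two sums are combined by their magnitude and that the candidate with the largest magnitude is taken as the fundamental period (the function names and that selection rule are illustrative assumptions, not taken from the patent):

```python
import math


def period_strength(data, period):
    """Correlate data with a sine and a cosine wave of the given period."""
    s = sum(v * math.sin(2 * math.pi * i / period) for i, v in enumerate(data))
    c = sum(v * math.cos(2 * math.pi * i / period) for i, v in enumerate(data))
    return math.hypot(s, c)  # magnitude of the sine/cosine pair


def best_period(data, candidate_periods):
    """Return the candidate period with the strongest response."""
    return max(candidate_periods, key=lambda p: period_strength(data, p))
```

Because only the selected periods are evaluated, the cost grows with the number of candidates rather than with a full spectrum.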
  • Patent number: 6643619
    Abstract: A method for reducing interference in acoustic signals by using an adaptive filter method involving spectral subtraction. The inventive method enables a significant reduction of interference in acoustic signals, especially voice signals, without causing any substantial falsification of said signals such as echo or musical tones, and significantly reduces computational requirements in comparison with other methods known per se that are similarly designed to improve signal quality.
    Type: Grant
    Filed: June 20, 2000
    Date of Patent: November 4, 2003
    Inventors: Klaus Linhard, Tim Haulick
  • Patent number: 6640208
    Abstract: A voiced/unvoiced speech classifier (30) includes a speech segmentor (34) which segments an input digitized speech waveform into frames of speech and a band-pass filter (36) which filters the frames of speech. A relative energy generator (38) generates a relative energy value for each filtered frame of speech and a decision parameter generator (52) including an autocorrelation calculator (54) and a pitch calculator (56) generates a decision parameter based on an autocorrelation function and a pitch frequency index for the filtered frames of speech. A normalized energy calculator (46) adjusts the threshold and then normalizes the relative energy. A comparator (60) provides a signal indicative of whether a frame of speech is voiced speech or unvoiced speech depending on a comparison of the decision parameter and the normalized relative energy value for each filtered frame of speech.
    Type: Grant
    Filed: September 12, 2000
    Date of Patent: October 28, 2003
    Assignee: Motorola, Inc.
    Inventors: Yaxin Zhang, Jianming Song, Anton Madievski
  • Patent number: 6625574
    Abstract: An input digital audio signal is divided into sub-band signals in respective sub-bands. Scale factors of the respective sub-bands are determined on the basis of the sub-band signals for every frame. Calculation is made as to differences between the determined scale factors for a first frame and the determined scale factors for a second frame preceding the first frame. Absolute values of the calculated scale-factor differences are calculated, and data representative of the calculated absolute values are generated. The data representative of the calculated absolute values are encoded into data of a Huffman code. Sign bits are generated which represent signs of the calculated scale-factor differences. The sub-band signals are quantized in response to the determined scale factors for every frame to generate quantized samples of the sub-band signals. The Huffman-code data, the generated sign bits, and the quantized samples of the sub-band signals are combined into a bit stream.
    Type: Grant
    Filed: August 25, 2000
    Date of Patent: September 23, 2003
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Shohei Taniguchi, Yutaka Banba
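The scale-factor step in patent 6625574 separates each frame-to-frame difference into an absolute value and a sign bit. The sketch below covers only that split and its inverse. The Huffman coding of the absolute values is omitted, and the convention of 1 for a negative difference is an assumption.

```python
def split_scale_factor_deltas(current, previous):
    """Split per-sub-band scale-factor differences into magnitudes and signs.

    current, previous: scale factors of the current and preceding frame,
    one integer per sub-band. Returns (magnitudes, sign_bits), where a
    sign bit of 1 marks a negative difference (assumed convention).
    """
    diffs = [c - p for c, p in zip(current, previous)]
    magnitudes = [abs(d) for d in diffs]
    sign_bits = [1 if d < 0 else 0 for d in diffs]
    return magnitudes, sign_bits


def rebuild_scale_factors(previous, magnitudes, sign_bits):
    """Invert the split: recover the current frame's scale factors."""
    return [p - m if s else p + m
            for p, m, s in zip(previous, magnitudes, sign_bits)]
```

Coding magnitudes separately from signs lets a single code table serve both positive and negative differences.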
  • Publication number: 20030125937
    Abstract: An encoder and associated vector estimation method and system (1) for processing a sequence of input vectors (y0 to yT) each comprising a plurality of elements. The vector estimation system (1) has a digital filter (2) with a filter vector input (3) for receiving said sequence of input vectors (y0 to yT) and a predictor gain input (4) for controlling characteristics of the filter (2). The filter (2) is a Kalman filter and has both a current slowly evolving filter estimate output (6) and a previous slowly evolving filter estimate output (20).
    Type: Application
    Filed: December 28, 2001
    Publication date: July 3, 2003
    Inventor: Mark Thomson
  • Publication number: 20030115045
    Abstract: To address the need for reducing audio overhang in wireless communication systems (e.g., 100), the present invention provides for the deletion of silent frames before they are converted to audio by the listening devices. The present invention only provides for the deletion of a portion of the silent frames that make up a period of silence or low voice activity in the speaker's audio. Voice frames that make up periods of silence less than a given length of time are not deleted.
    Type: Application
    Filed: December 13, 2001
    Publication date: June 19, 2003
    Inventors: John M. Harris, Philip J. Fleming, Joseph Tobin
  • Publication number: 20030101049
    Abstract: Speech data frames for transmitting control signalling messages are selected in accordance with the relative subjective importance of the speech signal data content of the frame. Speech frames are classified into frame types with lower priority frame types, such as non-speech frames, being selected first for the control message data and higher priority frame types, such as onset and transient, being avoided for selection due to the higher subjective contribution to speech quality.
    Type: Application
    Filed: September 30, 2002
    Publication date: May 29, 2003
    Applicant: Nokia Corporation
    Inventors: Ari Lakaniemi, Janne Vainio
  • Patent number: 6564183
    Abstract: A speech encoding/decoding apparatus. A speech encoding apparatus has a coding portion for receiving input information related to an uncoded signal representative of an original speech signal, the coding portion including a fixed coding portion for receiving the input information and producing a first coded signal estimate, and an adaptive coding portion for receiving the input information and producing a second coded signal estimate. A controller is connected to the fixed coding portion and the adaptive coding portion for receiving information indicative of speech characteristics of the uncoded signal and generates a control signal; and a code modifier receives the first coded signal estimate from the fixed coding portion and the control signal from the controller and produces a modified signal estimate.
    Type: Grant
    Filed: December 22, 1999
    Date of Patent: May 13, 2003
    Assignee: Telefonaktiebolaget LM Ericsson (Publ)
    Inventors: Roar Hagen, Erik Ekudden
  • Patent number: 6556967
    Abstract: The present invention is a device for and method of detecting voice activity by receiving a signal; computing the absolute value of the signal; squaring the absolute value; low pass filtering the squared result; computing the mean of the filtered signal; subtracting the mean from the filtered result; padding the mean subtracted result with zeros to form a value that is a power of two if the result is not already a power of two; computing a DFFT of the power of two result; normalizing the DFFT result of the last step; computing a mean of the normalization; computing a variance of the normalization; computing a power ratio of the normalization; classifying the mean, variance and power ratio as speech or non-speech based on how this feature vector compares to similarly constructed feature vectors of known speech and non-speech. The voice activity detector includes an absolute value squarer; a low pass filter; a mean subtractor; a zero padder; a DFFT; a normalizer; and a classifier.
    Type: Grant
    Filed: March 12, 1999
    Date of Patent: April 29, 2003
    Assignee: The United States of America as represented by the National Security Agency
    Inventors: Douglas J. Nelson, David C. Smith, Jeffrey L. Townsend
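The processing chain in patent 6556967 can be traced step by step in a short sketch. Several details are assumptions, because the abstract does not specify them: the low-pass filter is a simple moving average, normalization divides by the sum of the spectral magnitudes, the "power ratio" is taken as the share of the lowest `lowband_bins` bins, and the classifier is a nearest-neighbour lookup against labelled feature vectors.

```python
import cmath
import math


def next_power_of_two(n):
    p = 1
    while p < n:
        p *= 2
    return p


def moving_average(x, width=4):
    """Crude low-pass filter (assumed; the abstract names no filter type)."""
    out, acc = [], 0.0
    for i, v in enumerate(x):
        acc += v
        if i >= width:
            acc -= x[i - width]
        out.append(acc / min(i + 1, width))
    return out


def dft_magnitudes(x):
    """Naive DFT magnitudes for bins 0..n/2 (stands in for the DFFT)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def vad_features(signal, lowband_bins=4):
    envelope = [abs(s) ** 2 for s in signal]           # absolute value, squared
    smooth = moving_average(envelope)                   # low-pass filter
    mean = sum(smooth) / len(smooth)
    centred = [v - mean for v in smooth]                # subtract the mean
    pad = next_power_of_two(len(centred)) - len(centred)
    mags = dft_magnitudes(centred + [0.0] * pad)        # zero-pad, transform
    total = sum(mags) or 1.0
    norm = [m / total for m in mags]                    # normalize
    m = sum(norm) / len(norm)
    var = sum((v - m) ** 2 for v in norm) / len(norm)
    ratio = sum(norm[:lowband_bins])                    # assumed power ratio
    return (m, var, ratio)


def classify(features, labelled_examples):
    """Return the label of the nearest known feature vector (assumed rule)."""
    nearest = min(labelled_examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]
```

A real implementation would use an FFT rather than the quadratic-time transform above, which is kept only to make the sketch dependency-free.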
  • Publication number: 20030078770
    Abstract: The invention relates to a method for determining voice activity in a signal section of an audio signal. The result, i.e. whether voice activity is present in the section of the signal thus observed, depends upon spectral and temporal stationarity of the signal section and/or prior signal sections. In a first step, the method determines whether there is spectral stationarity in the observed signal section. In a second step, the method determines whether there is temporal stationarity in the signal section in question. The final decision as to the presence of voice activity in the signal section observed depends upon the initial values of both steps.
    Type: Application
    Filed: October 25, 2002
    Publication date: April 24, 2003
    Inventors: Alexander Kyrill Fischer, Christoph Erdmann
  • Publication number: 20030033140
    Abstract: Techniques utilising Time Scale Modification (TSM) of signals are described. The signal is analysed and divided into frames of similar signal types. Techniques specific to the signal type are then applied to the frames thereby optimising the modification process. The method of the present invention enables TSM of different audio signal parts to be realized using different methods, and a system for effecting said method is also described.
    Type: Application
    Filed: April 2, 2002
    Publication date: February 13, 2003
    Inventors: Rakesh Taori, Andreas Johannes Gerrits, Dzevdet Burazerovic
  • Patent number: 6519279
    Abstract: Transceiver circuitry 1 comprises a first portion 10,20,30,41,50,100, having a first modulation means 41 operating at a first order of modulation, for transmitting and receiving voice signals; a second portion 20,30,42,50,100, having a second modulation means 42 operating at a second order of modulation, for transmitting and receiving digital signals at a higher data rate than is achievable by the first portion; and a data conversion means 20,30,100 operable to convert from or into voice signals intended for processing by the first portion into or from digital signals for processing by the second portion.
    Type: Grant
    Filed: January 5, 2000
    Date of Patent: February 11, 2003
    Assignee: Motorola, Inc.
    Inventors: Ouelid Abdesselem, Lydie Desperben
  • Publication number: 20020198704
    Abstract: A speech detection system is described which uses a time series noise model to represent audio signals corresponding to noise. The system compares incoming audio signals with the noise model and determines the beginning or end of speech in the audio signal depending on how well the input audio compares to the noise model.
    Type: Application
    Filed: May 31, 2002
    Publication date: December 26, 2002
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Jebu Jacob Rajan, Jason Peter Andrew Charlesworth
  • Publication number: 20020198705
    Abstract: Systems and methods are provided for detecting voiced and unvoiced speech in acoustic signals having varying levels of background noise. The systems receive acoustic signals at two microphones, and generate difference parameters between the acoustic signals received at each of the two microphones. The difference parameters are representative of the relative difference in signal gain between portions of the received acoustic signals. The systems identify information of the acoustic signals as unvoiced speech when the difference parameters exceed a first threshold, and identify information of the acoustic signals as voiced speech when the difference parameters exceed a second threshold. Further, embodiments of the systems include non-acoustic sensors that receive physiological information to aid in identifying voiced speech.
    Type: Application
    Filed: May 30, 2002
    Publication date: December 26, 2002
    Inventor: Gregory C. Burnett
  • Patent number: 6484138
    Abstract: It is an objective of the present invention to provide an optimized method of selection of the encoding mode that provides rate efficient coding of the input speech. It is a second objective of the present invention to identify and provide a means for generating a set of parameters ideally suited for this operational mode selection. Third, it is an objective of the present invention to provide identification of two separate conditions that allow low rate coding with minimal sacrifice to quality. The two conditions are the coding of unvoiced speech and the coding of temporally masked speech. It is a fourth objective of the present invention to provide a method for dynamically adjusting the average output data rate of the speech coder with minimal impact on speech quality.
    Type: Grant
    Filed: April 12, 2001
    Date of Patent: November 19, 2002
    Assignee: Qualcomm, Incorporated
    Inventor: Andrew P. DeJaco
  • Patent number: 6480823
    Abstract: The input signal is transformed into the frequency domain and then subdivided into bands corresponding to different frequency ranges. Adaptive thresholds are applied to the data from each frequency band separately. Thus the short-term band-limited energies are tested for the presence or absence of a speech signal. The adaptive threshold values are independently updated for each of the signal paths, using a histogram data structure to accumulate long-term data representing the mean and variance of energy within the respective frequency band. Endpoint detection is performed by a state machine that transitions from the speech absent state to the speech present state, and vice versa, depending on the results of the threshold comparisons. A partial speech detection system handles cases in which the input signal is truncated.
    Type: Grant
    Filed: March 24, 1998
    Date of Patent: November 12, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Yi Zhao, Jean-Claude Junqua
  • Patent number: 6475245
    Abstract: A method and apparatus for encoding speech for communication to a decoder for reproduction of the speech where the speech signal is classified into steady state voiced (harmonic), stationary unvoiced, and “transitory” or “transition” speech, and a particular type of coding scheme is used for each class. Harmonic coding is used for steady state voiced speech, “noise-like” coding is used for stationary unvoiced speech, and a special coding mode is used for transition speech, designed to capture the location, the structure, and the strength of the local time events that characterize the transition portions of the speech. The compression schemes can be applied to the speech signal or to the LP residual signal.
    Type: Grant
    Filed: February 5, 2001
    Date of Patent: November 5, 2002
    Assignee: The Regents of the University of California
    Inventors: Allen Gersho, Eyal Shlomot, Vladimir Cuperman, Chunyan Li
  • Publication number: 20020156620
    Abstract: This invention presents a voicing determination algorithm for classification of a speech signal segment as voiced or unvoiced. The algorithm is based on a normalized autocorrelation where the length of the window is proportional to the pitch period. The speech segment to be classified is further divided into a number of sub-segments, and the normalized autocorrelation is calculated for each sub-segment. If a certain number of the normalized autocorrelation values is above a predetermined threshold, the speech segment is classified as voiced. To improve the performance of the voicing determination algorithm in unvoiced to voiced transients, the normalized autocorrelations of the last sub-segments are emphasized. The performance of the voicing decision algorithm can be enhanced by utilizing also the possible lookahead information.
    Type: Application
    Filed: December 21, 2000
    Publication date: October 24, 2002
    Inventors: Ari Heikkinen, Samuli Pietila, Vesa Ruoppila
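The voicing decision in publication 20020156620 counts how many sub-segments have a normalized autocorrelation above a threshold. A minimal sketch follows, assuming a known pitch period in samples. The parameter names and default values are hypothetical, and the extra emphasis on the last sub-segments and the lookahead are left out.

```python
import math


def normalized_autocorr(x, start, length, lag):
    """Normalized correlation between x[start:start+length] and its lagged copy."""
    a = x[start:start + length]
    b = x[start + lag:start + lag + length]
    n = min(len(a), len(b))
    num = sum(a[i] * b[i] for i in range(n))
    den = math.sqrt(sum(v * v for v in a[:n]) * sum(v * v for v in b[:n]))
    return num / den if den > 0 else 0.0


def is_voiced(segment, pitch_period, n_sub=4, threshold=0.7,
              min_count=3, window_factor=1.0):
    """Classify a segment as voiced if enough sub-segments correlate strongly.

    The analysis window is proportional to the pitch period, as in the
    abstract; window_factor, n_sub, threshold and min_count are assumed.
    """
    window = max(1, int(window_factor * pitch_period))
    usable = len(segment) - pitch_period - window
    if usable < 0:
        return False  # segment too short for one lagged window
    step = usable // (n_sub - 1) if n_sub > 1 else 0
    count = 0
    for k in range(n_sub):
        r = normalized_autocorr(segment, k * step, window, pitch_period)
        if r > threshold:
            count += 1
    return count >= min_count
```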
  • Patent number: 6470311
    Abstract: In a speech processing system, an optimal filter frequency is determined and used to filter an unfiltered signal. The optimum filter is chosen by passing the largest voice area greater than 50 ms through multiple filters. The average energy output for each filter and differences between the filter averages (DeltaEnergy) are calculated. The first peak in DeltaEnergy above the average DeltaEnergy determines the optimal filter for filtering the signal. The filtered signal is divided into segments and voiced periods are determined. The unfiltered signal is divided into pitch synchronous frames based on the filtered signal.
    Type: Grant
    Filed: October 15, 1999
    Date of Patent: October 22, 2002
    Assignee: Fonix Corporation
    Inventor: Robert Brian Moncur
  • Patent number: 6453041
    Abstract: An improved voice activity detection system and method is provided for use in speakerphones and other voice activated systems. To facilitate switching between various operating modes, the voice activity detection scheme utilizes a new voice energy term which is based on an integral of the absolute value of a derivative of a speech signal. Voice activity is detected during a silence mode by comparing a first ratio of a current voice energy value to a background noise value with a voice activity threshold value. Voice activity is detected when the first ratio is greater than the voice activity threshold value. Another step involves identifying a direction of the voice activity during a transmit and receive mode by comparing a second ratio of a transmit path voice energy value to a receive path voice energy value with a transmit threshold value and a receive threshold value. When the second ratio is greater than the transmit threshold value, voice activity is present in the transmit path.
    Type: Grant
    Filed: October 15, 1998
    Date of Patent: September 17, 2002
    Assignee: Agere Systems Guardian Corp.
    Inventor: Erol Eryilmaz
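The voice energy term in patent 6453041 is an integral of the absolute value of the signal's derivative. In discrete time this can be approximated by summing absolute sample-to-sample differences, as sketched below. The threshold value and the handling of a zero noise estimate are assumptions, and only the silence-mode test is shown.

```python
def voice_energy(frame):
    """Discrete stand-in for the integral of |d/dt signal| over one frame."""
    return sum(abs(frame[i] - frame[i - 1]) for i in range(1, len(frame)))


def voice_active(frame, noise_energy, threshold=2.0):
    """Silence-mode test: is the energy-to-noise ratio above the threshold?"""
    energy = voice_energy(frame)
    if noise_energy <= 0:
        return energy > 0  # assumed fallback when no noise estimate exists
    return energy / noise_energy > threshold
```

The transmit/receive direction test described in the abstract would apply the same energy term to both paths and compare their ratio against two further thresholds.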
  • Patent number: 6385548
    Abstract: An apparatus and method to characterize an input communication signal as being a voice, tone or noise signal is provided. The apparatus and method involve measuring variations of pitch over time from a sampled input signal. A minimum value of Average Magnitude Difference Function (AMDF) over a pitch range and an average variation value of the AMDF over sampled intervals are used to determine whether the signal is a voice signal, a tone or noise. Historical data of these values is maintained in a dual buffer arrangement and is used in the determination of signal type by detecting transitions.
    Type: Grant
    Filed: December 12, 1997
    Date of Patent: May 7, 2002
    Assignee: Motorola, Inc.
    Inventors: Satish Ananthaiyer, Eric David Elias
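Patent 6385548 relies on the minimum of the Average Magnitude Difference Function (AMDF) over a pitch range. The sketch below computes only that quantity, using the standard AMDF definition. The voice/tone/noise decision, the variation measure, and the dual-buffer history are not reproduced, since the abstract does not give their details.

```python
def amdf(frame, lag):
    """Average magnitude difference between the frame and its lagged copy."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def amdf_minimum(frame, min_lag, max_lag):
    """Return (lag, value) of the smallest AMDF within the pitch range.

    max_lag must be smaller than len(frame).
    """
    best_lag = min(range(min_lag, max_lag + 1), key=lambda lag: amdf(frame, lag))
    return best_lag, amdf(frame, best_lag)
```

For a strongly periodic input the minimum approaches zero at the pitch lag, while noise yields a shallow minimum.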
  • Patent number: 6385570
    Abstract: An apparatus and method for detecting transitional parts of speech, and a method of synthesizing transitional parts of speech, are provided. This apparatus includes a residual signal preprocessor for emphasizing a period of a speech residual signal which includes a peak value, a relative peak value calculation unit for obtaining a peak value of a preprocessed residual signal and a relative peak value using a predetermined reference peak value, and a transitional part detector for detecting transitional parts of speech on the basis of the relative peak value.
    Type: Grant
    Filed: May 1, 2000
    Date of Patent: May 7, 2002
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Moo-young Kim
  • Patent number: 6370500
    Abstract: A technique is used in a speech encoder (107) that reduces non-speech activity of a low bit rate digital voice message. Speech model parameters that include quantized speech spectral parameter vectors are generated in a sequence of frames. A determination is made as to which frames of the sequence of frames are voiced frames and which frames are unvoiced frames. A consecutive sequence of frames of unvoiced frames is identified (2330) as an unvoiced burst when a length, NUV, of the consecutive sequence of frames exceeds a predetermined length, Ns. A non-speech activity portion of the unvoiced burst is identified (2335-2365) and removed.
    Type: Grant
    Filed: September 30, 1999
    Date of Patent: April 9, 2002
    Assignee: Motorola, Inc.
    Inventors: Jian-Cheng Huang, Sunil Satyamurti, Floyd Simpson, Kenneth Finlon
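
The burst identification step in the abstract above (a run of unvoiced frames longer than a predetermined length) can be sketched as follows. The function name `find_unvoiced_bursts`, the boolean-per-frame input, and the half-open index ranges are assumptions for illustration. How the non-speech portion of a burst is then located and removed is not described in the abstract and is not modeled.

```python
def find_unvoiced_bursts(voiced_flags, min_length):
    """Return (start, end) frame ranges, end exclusive, of unvoiced runs
    whose length exceeds min_length (hypothetical helper)."""
    bursts = []
    run_start = None
    for i, voiced in enumerate(voiced_flags):
        if not voiced:
            if run_start is None:
                run_start = i  # a new unvoiced run begins here
        elif run_start is not None:
            if i - run_start > min_length:
                bursts.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the message.
    if run_start is not None and len(voiced_flags) - run_start > min_length:
        bursts.append((run_start, len(voiced_flags)))
    return bursts

flags = [True, True, False, False, False, False, True, False, False, True]
bursts = find_unvoiced_bursts(flags, 3)
```

With a minimum length of 3, only the four-frame run at frames 2 to 5 qualifies; the two-frame run at frames 7 and 8 is left alone.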
  • Patent number: 6360199
    Abstract: A speech coding rate selector includes: a speech input unit for receiving an input speech; a short-term power arithmetic unit for computing the power of an input speech at a predetermined time unit; an ambient noise power estimating unit for estimating the power of an ambient noise superimposed on an input speech; a rate selection threshold value arithmetic unit for computing a group of power threshold values for selecting a speech coding rate; a power comparator for selecting one appropriate rate from among a plurality of speech coding rates; an ambient noise property inferring unit for inferring the property of an ambient noise superimposed on an input speech; and a comparison power corrector for correcting an output value of the short-term power arithmetic unit if an ambient noise, the property of which has been inferred by the ambient noise property inferring unit, proves to exhibit a considerable time-dependent change in power.
    Type: Grant
    Filed: June 8, 1999
    Date of Patent: March 19, 2002
    Inventor: Atsushi Yokoyama
  • Patent number: 6314391
    Abstract: In case codes of old and new standards are recorded on the same recording medium, it is desirable that the signals of the old standard can be reproduced by an old standard accommodating reproducing device, while both signals can be reproduced by the new standard accommodating reproducing device such as to avoid lowering of the signal quality. To this end, if multi-channel signals are recorded in terms of a frame the size of which cannot be controlled, a second encoding circuit encodes signals of a channel reproduced by the old standard accommodating reproducing device, while a first encoding circuit encodes the signals of a channel reproduced by an old standard accommodating reproducing device with a number of bits smaller than the maximum number of bits that can be allocated to that frame. A codestring generating circuit arrays a codestring encoded by the second encoding circuit in a void area of a frame provided by encoding in the first encoding circuit.
    Type: Grant
    Filed: February 18, 1998
    Date of Patent: November 6, 2001
    Assignee: Sony Corporation
    Inventors: Kyoya Tsutsui, Osamu Shimoyoshi
  • Patent number: 6304842
    Abstract: A method of encoding signal segments which represent unvoiced plosives. The signal segments to be encoded are contained within a speech signal divided into m = 1, ..., N frames. Each frame is subdivided into l = 1, ..., L subframes. The speech signal has a gain g_m(l) within each subframe. An energy measure e_m(l) representative of the signal segments' energy content is defined. An energy threshold e_th(l) representative of a sudden energy change characteristic of an unvoiced plosive is also defined. For each frame, the energy measure e_m(l) and the energy threshold e_th(l) are derived for each subframe within that frame. If e_m(l) ≤ e_th(l) for each subframe within a particular frame, then a plosive locator l_pl = 0 and a plosive index i_pl = 0 are assigned to that frame to indicate absence of a plosive within that frame.
    Type: Grant
    Filed: June 30, 1999
    Date of Patent: October 16, 2001
    Assignee: Glenayre Electronics, Inc.
    Inventors: Mohammad Aamir Husain, Bhaskar Bhattacharya
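
The per-subframe comparison in the abstract above can be sketched as follows. The abstract only specifies the "no plosive" case (locator 0 when every subframe energy is at or below its threshold). Returning the 1-based number of the first subframe that exceeds its threshold is an assumption made here for illustration, as is the function name `plosive_locator`. The plosive index is not modeled.

```python
def plosive_locator(energies, thresholds):
    """Return 0 if no subframe energy exceeds its threshold, otherwise the
    1-based number of the first subframe that does (assumed convention)."""
    for subframe, (energy, threshold) in enumerate(zip(energies, thresholds), start=1):
        if energy > threshold:
            return subframe
    return 0

quiet_frame = plosive_locator([1.0, 1.0, 1.5, 1.0], [2.0, 2.0, 2.0, 2.0])
burst_frame = plosive_locator([1.0, 1.0, 5.0, 1.0], [2.0, 2.0, 2.0, 2.0])
```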
  • Patent number: 6285979
    Abstract: Phoneme analysis is carried out in real time by detecting a voiced component in the range of 200 Hz to 1 kHz and simultaneously detecting voiceless components having frequencies greater than about 2.4 kHz and greater than about 3.4 kHz, respectively, to produce respective outputs which are logically combined to produce two-bit logic signals which can be used to control a speech processing device.
    Type: Grant
    Filed: February 22, 1999
    Date of Patent: September 4, 2001
    Assignee: AVR Communications Ltd.
    Inventors: Boris Ginzburg, Barak Dar
  • Patent number: 6275795
    Abstract: In an apparatus for extracting information from an input speech signal, a preprocessor, a buffer, a segmenter, an acoustic classifier and a feature estimator are provided. The preprocessor generates formant related information for consecutive time frames of the input speech signal. This formant related information is fed into the buffer, which can store signals representative of a plurality of frames. The segmenter monitors the signals representative of the incoming frames and identifies segments in the input speech signal during which variations in the formant related information remain within prespecified limits. The acoustic classifier then determines classification information for each segment identified by the segmenter, based on acoustic classes found in training data. The feature estimator then determines, for each segment, the information required, based on the input speech signal during that segment, training data and the classification information determined by the acoustic classifier.
    Type: Grant
    Filed: January 8, 1999
    Date of Patent: August 14, 2001
    Assignee: Canon Kabushiki Kaisha
    Inventor: Eli Tzirkel-Hancock
  • Patent number: 6260017
    Abstract: A multipulse interpolative coder for transition speech frames includes an extractor configured to represent a first frame of transitional speech samples by a subset of the samples of the frame. The coder also includes an interpolator configured to interpolate the subset of samples and a subset of samples extracted from an earlier-received frame to synthesize other samples of the first frame that are not included in the subset. The subset of samples is further simplified by selecting a set of pulses from the subset and assigning zero values to unselected pulses. In the alternative, a portion of the unselected pulses may be quantized. The set of pulses may be the pulses having the greatest absolute amplitudes in the subset. In the alternative, the set of pulses may be the most perceptually significant pulses of the subset.
    Type: Grant
    Filed: May 7, 1999
    Date of Patent: July 10, 2001
    Assignee: Qualcomm Inc.
    Inventors: Amitava Das, Sharath Manjunath
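
One of the pulse selection rules in the abstract above (keep the pulses with the greatest absolute amplitudes and assign zero to the rest) can be sketched as follows. The function name `keep_largest_pulses` and the list-of-floats representation are assumptions for illustration. The extraction, interpolation, and quantization steps of the coder are not shown.

```python
def keep_largest_pulses(samples, count):
    """Keep the `count` samples with the largest absolute amplitude and
    set every other sample to zero (hypothetical helper)."""
    # Indices ordered by decreasing magnitude; ties keep their original order.
    by_magnitude = sorted(range(len(samples)), key=lambda i: abs(samples[i]), reverse=True)
    keep = set(by_magnitude[:count])
    return [value if i in keep else 0.0 for i, value in enumerate(samples)]

pulses = keep_largest_pulses([0.1, -0.9, 0.3, 0.8, -0.2], 2)
```

The abstract also allows selecting the most perceptually significant pulses instead. That rule would need a perceptual weighting that the abstract does not define, so it is left out.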
  • Patent number: 6249757
    Abstract: A system for detection of voice activity in a communications signal, employing a nonlinear two filter voice detection algorithm, in which one filter has a low time constant (the fast filter) and one filter has a high time constant (the slow filter). The slow filter serves to provide a noise floor estimate for the incoming signal, and the fast filter serves to more closely represent the total energy in the signal. The absolute value of incoming data is presented to both filters, and the difference in filter outputs is integrated over each of a series of successive frames, thereby giving an indication of the energy level above the noise floor in each frame of the incoming signal. Voice activity is detected if the measured energy level for a frame exceeds a specified threshold level. Silence (e.g., leaving only noise) is detected if the measured energy level for each of a specified number of successive frames does not exceed a specified threshold level.
    Type: Grant
    Filed: February 16, 1999
    Date of Patent: June 19, 2001
    Assignee: 3Com Corporation
    Inventor: David G. Cason
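
The two-filter scheme in the abstract above can be illustrated with a minimal sketch. It substitutes plain one-pole smoothing filters for the patent's nonlinear filters, and the function name `two_filter_vad`, the frame length, the filter coefficients, the threshold, and the number of quiet frames required to declare silence are all assumed values chosen for illustration.

```python
def two_filter_vad(samples, frame_len=80, fast_alpha=0.2, slow_alpha=0.002,
                   threshold=5.0, hangover_frames=3):
    """Return one voice-activity decision per full frame (hypothetical helper)."""
    fast = slow = 0.0      # fast filter tracks signal energy, slow filter the noise floor
    active = False
    quiet_run = 0
    decisions = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        excess = 0.0
        for x in samples[start:start + frame_len]:
            mag = abs(x)   # both filters are fed the absolute value of the input
            fast += fast_alpha * (mag - fast)
            slow += slow_alpha * (mag - slow)
            excess += fast - slow   # integrate the difference over the frame
        if excess > threshold:
            active = True
            quiet_run = 0
        else:
            quiet_run += 1
            if quiet_run >= hangover_frames:
                active = False      # silence only after several quiet frames in a row
        decisions.append(active)
    return decisions

# 10 frames of low-level noise, 5 loud frames, then 20 frames of low-level noise.
quiet = [0.01 if i % 2 == 0 else -0.01 for i in range(800)]
loud = [1.0 if i % 2 == 0 else -1.0 for i in range(400)]
decisions = two_filter_vad(quiet + loud + quiet + quiet)
```

In this run the decision stays active for two frames after the loud segment ends, because silence is declared only on the third consecutive frame that fails the threshold.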
  • Patent number: 6249758
    Abstract: An audio signal encoding device is provided comprising an input for receiving a sub-frame of an audio signal, a voiced audio signal synthesis stage, an unvoiced audio signal synthesis stage, and a processing unit. The voiced audio signal synthesis stage is operative for producing a first synthetic audio signal approximating the sub-frame of an audio signal received at the input on the basis of a first set of parameters. The unvoiced audio signal synthesis stage is operative for producing a second synthetic audio signal approximating the sub-frame of an audio signal received at the input on the basis of a second set of parameters. The processing unit is operative for releasing a set of parameters allowing to generate a selected one of the first synthetic audio signal and the second synthetic audio signal.
    Type: Grant
    Filed: June 30, 1998
    Date of Patent: June 19, 2001
    Assignee: Nortel Networks Limited
    Inventor: Paul Mermelstein
  • Patent number: 6240381
    Abstract: The onset of a particular signal event is determined by first smoothing the signal containing the event, and then analyzing the smoothed waveform to determine onset. Smoothing is performed by analyzing the value of each point of data and modifying the value based on previous data point values in the waveform. The smoothed waveform is analyzed by iteratively stepping through the data points of the smoothed waveform and determining event onset based on change in data point values. The analysis uses the slope of the waveform to determine whether the data point values and slopes meet certain criteria indicating an event onset.
    Type: Grant
    Filed: February 17, 1998
    Date of Patent: May 29, 2001
    Assignee: Fonix Corporation
    Inventor: Michael W. Newson
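
The smooth-then-analyze approach in the abstract above can be sketched as follows. The abstract does not state the smoothing rule or the slope criteria. Exponential smoothing, a minimum slope, and a required run of consecutive rising steps are assumptions chosen for illustration, as are the names `smooth` and `find_onset`.

```python
def smooth(values, alpha=0.5):
    """Exponential smoothing: each output depends on the previous outputs."""
    out = []
    prev = values[0] if values else 0.0
    for v in values:
        prev = prev + alpha * (v - prev)
        out.append(prev)
    return out

def find_onset(values, alpha=0.5, min_slope=0.1, run_length=2):
    """Return the index where the smoothed waveform first rises faster than
    min_slope for run_length consecutive steps, or None if it never does."""
    smoothed = smooth(values, alpha)
    run = 0
    for i in range(1, len(smoothed)):
        if smoothed[i] - smoothed[i - 1] > min_slope:
            run += 1
            if run >= run_length:
                return i - run_length + 1   # first step of the qualifying run
        else:
            run = 0
    return None

step = [0.0] * 10 + [1.0] * 10
onset = find_onset(step)
```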
  • Patent number: 6233550
    Abstract: A method and apparatus for encoding speech for communication to a decoder for reproduction of the speech where the speech signal is classified into steady state voiced (harmonic), stationary unvoiced, and “transitory” or “transition” speech, and a particular type of coding scheme is used for each class. Harmonic coding is used for steady state voiced speech, “noise-like” coding is used for stationary unvoiced speech, and a special coding mode is used for transition speech, designed to capture the location, the structure, and the strength of the local time events that characterize the transition portions of the speech. The compression schemes can be applied to the speech signal or to the LP residual signal.
    Type: Grant
    Filed: August 28, 1998
    Date of Patent: May 15, 2001
    Assignee: The Regents of the University of California
    Inventors: Allen Gersho, Eyal Shlomot, Vladimir Cuperman, Chunyan Li
  • Patent number: 6226606
    Abstract: In a method for tracking pitch in a speech signal, first and second window vectors are created from samples taken across first and second windows of the speech signal. The first window is separated from the second window by a test pitch period. The energy of the speech signal in the first window is combined with the correlation between the first window vector and the second window vector to produce a predictable energy factor. The predictable energy factor is then used to determine a pitch score for the test pitch period. Based in part on the pitch score, a portion of the pitch track is identified.
    Type: Grant
    Filed: November 24, 1998
    Date of Patent: May 1, 2001
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, James G. Droppo, III
  • Patent number: 6188979
    Abstract: A method and apparatus for improved pitch period (τ) estimation in a compression system is disclosed. The system uses original estimates of integer lag (τ_0) and open-loop prediction gain (β_ol) as input to an adaptive filter parameter initialization block (304) which supplies inputs to a plurality of adaptive filter elements (306-308). Adaptive filter elements (306-308) provide information regarding the harmonics of the residual signal (ε(n)) to an adaptive filter parameter analysis block (310). Adaptive filter parameter analysis block (310) estimates the fundamental frequency of the residual signal based on the analysis of the harmonics and outputs a pitch period (τ) for eventual use in a delay contour computation.
    Type: Grant
    Filed: May 28, 1998
    Date of Patent: February 13, 2001
    Assignee: Motorola, Inc.
    Inventor: James Patrick Ashley
  • Patent number: 6182032
    Abstract: A communication system has a network and a number of terminals. The network and the terminals have multi-rate speech encoders and decoders. Two terminals may communicate with each other through two-way voice communication where voice paths of the two-way voice communication are acoustically coupled to each other. The two terminals may also communicate through at least one non-acoustically coupled path. If the two terminals communicate through at least one non-acoustically coupled path, multi-rate encoders and decoders assigned to the non-acoustically coupled path operate at a lower bit rate than in a situation in which the two terminals operate through two-way voice. Whether a communication between the two terminals is through at least one non-acoustically coupled path is established a priori or dynamically.
    Type: Grant
    Filed: September 10, 1998
    Date of Patent: January 30, 2001
    Assignee: U.S. Philips Corporation
    Inventor: Juha Rapeli
  • Patent number: 6173265
    Abstract: A voice recording and/or reproducing device includes a plurality of coders having different bit rates for coding voice to provide coded voice data, a voice recording mode change over switch for selecting one of the plurality of coders, and a system controller. The system controller stores coding selection data obtained by the change over of the voice recording mode and coded voice data obtained from the selected coder, to a storing medium, and reduces a deterioration of the voice due to the change over. The voice recording and/or reproducing device also includes a detector for detecting the coding selection data, and a plurality of decoders for decoding the coded voice data at the bit rate corresponding to the detected coding selection data.
    Type: Grant
    Filed: December 23, 1996
    Date of Patent: January 9, 2001
    Assignee: Olympus Optical Co., Ltd.
    Inventor: Hidetaka Takahashi
  • Patent number: 6157906
    Abstract: A digital signal processor (100) receives a digitally vocoded signal (102), and calculates a staggered average value (404) from the frame energy of each received frame, or the product of the frame energy and a voicing value. While the staggered average value is above a threshold voice indicator value, speech is declared present.
    Type: Grant
    Filed: July 31, 1998
    Date of Patent: December 5, 2000
    Assignee: Motorola, Inc.
    Inventors: Richard Brent Nicholls, Chin Pan Wong, Martin Thuo Karanja, Patrick Joseph Doran, David James Graham
  • Patent number: RE38269
    Abstract: A speech coding system employs measurements of robust features of speech frames whose distributions are not strongly affected by noise levels to make voicing decisions for input speech occurring in a noisy environment. Linear programming analysis of the robust features and respective weights is used to determine an optimum linear combination of these features. The input speech vectors are matched to a vocabulary of codewords in order to select the corresponding, optimally matching codeword. Adaptive vector quantization is used in which a vocabulary of words obtained in a quiet environment is updated based upon a noise estimate of a noisy environment in which the input speech occurs, and the “noisy” vocabulary is then searched for the best match with an input speech vector. The corresponding clean codeword index is then selected for transmission and for synthesis at the receiver end.
    Type: Grant
    Filed: October 21, 1999
    Date of Patent: October 7, 2003
    Assignee: ITT Manufacturing Enterprises, Inc.
    Inventor: Yu-Jih Liu