Voiced Or Unvoiced Patents (Class 704/214)
-
Patent number: 6754620
Abstract: A system and method is provided for rendering data indicative of delays associated with enabling and/or disabling an analog-to-digital conversion system employed by a telephony communication network. The system of the present invention utilizes a display device and an interface manager. The interface manager receives data indicative of power levels at various frequencies and times of signals received by a transceiver that is communicating via the conventional telephony communication network. The interface manager then renders a graphical display via the display device based on the received data. The graphical display may include clusters, in which each of the clusters is associated with a particular range of power levels. By analyzing the clusters, a user can determine the delays associated with enabling and/or disabling the analog-to-digital conversion system. The graphical display may also include indicators that may be used to determine the foregoing delays.
Type: Grant
Filed: March 29, 2000
Date of Patent: June 22, 2004
Assignee: Agilent Technologies, Inc.
Inventor: Samuel M Bauer
-
Patent number: 6707869
Abstract: A filter to apply a window function to a digital signal is provided. The filter has a memory for storing a basic set of values representing a single window. An adapter can generate from this basic set a plurality of adapted sets of values, where the adapted sets of values define window functions having different window sizes. The adapter has an input for receiving a control signal that allows the adapter to select the proper adapted set to suit the digital signal being processed. The application of the window function is effected on successive frames of the digital signal by using the adapted set of values generated by the adapter in response to the control signal. The filter has VAD applications, among others.
Type: Grant
Filed: December 28, 2000
Date of Patent: March 16, 2004
Assignee: Nortel Networks Limited
Inventor: Shude Zhang
-
Publication number: 20040039566
Abstract: This disclosure is directed to techniques for condensed voice buffering, transmission and playback. The techniques may involve identification of encoded voice frames as either speech or a pause, and selective exclusion of a portion of the frames for storage, transmission or playback based on the identification. In this manner, the techniques are capable of condensing a series of encoded voice frames. When variable rate coding is employed, a pause frame may be identified, for example, based on a threshold comparison for the rate of the encoded frame. In some cases, the techniques may involve excluding only a portion of the identified frames from a consecutive sequence of the identified frames, thereby preserving a minimum number of the identified frames needed for intelligible conversation.
Type: Application
Filed: August 29, 2002
Publication date: February 26, 2004
Inventors: James A. Hutchison, Sun Tam
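
The pause-frame exclusion described in the abstract of publication 20040039566 can be illustrated with a minimal Python sketch. It is not taken from the application: the function name, the integer rate values, and the defaults `pause_rate_threshold` and `min_pause_frames` are all hypothetical. A frame counts as a pause when its encoded rate is at or below the threshold, and only the first few pause frames of each consecutive run are kept.

```python
def condense_frames(frame_rates, pause_rate_threshold=1, min_pause_frames=2):
    """Return the indices of the frames to keep.

    frame_rates: one encoded rate per frame (hypothetical scale, e.g.
    1 = lowest rate, 8 = full rate). Frames at or below the threshold are
    treated as pauses; at most min_pause_frames are kept from each run.
    """
    kept = []
    pause_run = 0  # length of the current run of consecutive pause frames
    for index, rate in enumerate(frame_rates):
        if rate <= pause_rate_threshold:
            pause_run += 1
            if pause_run <= min_pause_frames:
                kept.append(index)  # preserve a short pause between words
        else:
            pause_run = 0
            kept.append(index)      # speech frames are always kept
    return kept


rates = [8, 8, 1, 1, 1, 1, 8, 1, 8]
print(condense_frames(rates))  # [0, 1, 2, 3, 6, 7, 8]
```

Keeping a fixed number of pause frames per run is only one way to preserve "a minimum number of the identified frames"; the abstract does not say how that minimum is chosen.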
-
Patent number: 6697776
Abstract: A digitized signal detection system in which the encoding bit rate is changed dynamically to encode different signal types and formats at bit rates optimized to properly reconstruct the input signal, whether speech or non-speech, so that signals of different character can be transferred on a frame-by-frame basis. A change of encoding format can make the system a speech or music recognizer, depending on what is to be listened for. The system has three basic components: a recognizer, which categorizes the type of input signal; an evaluator, which evaluates the quality category of the reconstructed signal; and a recommender, which makes a recommendation, based on that quality, to switch to an encoding standard that provides improved quality. The dynamic signal detector receives the input signal directly and extracts the parameters for evaluation. These parameters are tested, and a determination is made whether a switch of standards is required to improve the reconstructed signal.
Type: Grant
Filed: July 31, 2000
Date of Patent: February 24, 2004
Assignee: Mindspeed Technologies, Inc.
Inventors: Gilles G. Fayad, Huan-Yu Su
-
Patent number: 6691084
Abstract: A method and apparatus for the variable rate coding of a speech signal. An input speech signal is classified and an appropriate coding mode is selected based on this classification. For each classification, the coding mode that achieves the lowest bit rate with an acceptable quality of speech reproduction is selected. Low average bit rates are achieved by only employing high fidelity modes (i.e., high bit rate, broadly applicable to different types of speech) during portions of the speech where this fidelity is required for acceptable output. Lower bit rate modes are used during portions of speech where these modes produce acceptable output. Input speech signal is classified into active and inactive regions. Active regions are further classified into voiced, unvoiced, and transient regions. Various coding modes are applied to active speech, depending upon the required level of fidelity. Coding modes may be utilized according to the strengths and weaknesses of each particular mode.
Type: Grant
Filed: December 21, 1998
Date of Patent: February 10, 2004
Assignee: Qualcomm Incorporated
Inventors: Sharath Manjunath, William Gardner
-
Patent number: 6691085
Abstract: A method and system for encoding and decoding an input signal, wherein the input signal is divided into a higher frequency band and a lower frequency band in the encoding and decoding processes, and wherein the decoding of the higher frequency band is carried out by using an artificial signal along with speech related parameters obtained from the lower frequency band. In particular, the artificial signal is scaled before it is transformed into an artificial wideband signal containing colored noise in both the lower and the higher frequency band. Additionally, voice activity information is used to define speech periods and non-speech periods of the input signal. Based on the voice activity information, different weighting factors are used to scale the artificial signal in speech periods and non-speech periods.
Type: Grant
Filed: October 18, 2000
Date of Patent: February 10, 2004
Assignee: Nokia Mobile Phones Ltd.
Inventors: Jani Rotola-Pukkila, Hannu Mikkola, Janne Vainio
-
Patent number: 6691081
Abstract: A digital signal processor for processing data including voice messaging data that may have both voiced and unvoiced speech components utilizes computer routines stored in a memory used by the digital signal processor. The programmed computer routines provide for control of at least a portion of a selective call receiver; receiving and decoding data received at the selective call receiver; comparing the addresses received at the selective call receiver with addresses stored in a memory location coupled to the digital signal processor; controlling voicing including both voiced and unvoiced speech components; and generating a pitch wave using an inverse discrete Fourier Transform and resampling the pitch wave to provide a time domain voiced speech component.
Type: Grant
Filed: April 28, 2000
Date of Patent: February 10, 2004
Assignee: Motorola, Inc.
Inventors: Jian-Cheng Huang, Kenneth D. Finlon, Floyd D. Simpson
-
Patent number: 6681202
Abstract: The invention describes a system that generates a wide band signal (100-7000 Hz) from a telephony band (or narrow band: 300-3400 Hz) speech signal to obtain an extended band speech signal (100-3400 Hz). This technique is particularly advantageous since it increases signal naturalness and listening comfort while keeping compatibility with all current telephony systems. The described technique is inspired by Linear Predictive speech coders. The speech signal is thus split into a spectral envelope and a short-term residual signal. Both signals are extended separately and recombined to create an extended band signal.
Type: Grant
Filed: November 13, 2000
Date of Patent: January 20, 2004
Assignee: Koninklijke Philips Electronics N.V.
Inventors: Giles Miet, Andy Gerrits
-
Publication number: 20040006462
Abstract: A system and method of determining whether a receiver in active (non-DTX) mode should remain in active (non-DTX) mode or switch to inactive (DTX) mode and vice versa. For switching from non-DTX to DTX mode in a receiver, a received AMR frame is subjected to a SID_FIRST marker comparison. If the results of the SID_FIRST marker comparison exceed a SID_FIRST threshold, then the received AMR frame is processed as a SID_FIRST frame and the receiver is switched to DTX mode. Otherwise, the received AMR frame is subjected to a SID_UPDATE marker comparison. If the results of the SID_UPDATE marker comparison exceed a SID_UPDATE threshold, then the received AMR frame is processed as a SID_UPDATE frame and the receiver is switched to DTX mode. Otherwise, the received AMR frame is processed as a voice frame in non-DTX mode. For switching from DTX to non-DTX mode in a receiver, a received AMR frame in DTX mode is subjected to an ONSET frame comparison.
Type: Application
Filed: July 3, 2002
Publication date: January 8, 2004
Inventor: Phillip Marc Johnson
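
The non-DTX decision cascade in the abstract of publication 20040006462 can be written down as a short Python sketch. The comparison metric (a count of matching bits), the marker bit patterns, and the threshold values are assumptions made for illustration; the abstract only states that each comparison result is tested against its own threshold, and the actual AMR markers are not reproduced here.

```python
def marker_score(frame_bits, marker_bits):
    """Count positions where the received bits match the marker (assumed metric)."""
    return sum(1 for f, m in zip(frame_bits, marker_bits) if f == m)


def classify_active_mode_frame(frame_bits, sid_first_marker, sid_update_marker,
                               sid_first_threshold, sid_update_threshold):
    """Return (frame_type, next_mode) for a frame received in non-DTX mode."""
    # Step 1: SID_FIRST comparison is tried first.
    if marker_score(frame_bits, sid_first_marker) > sid_first_threshold:
        return "SID_FIRST", "DTX"
    # Step 2: otherwise fall through to the SID_UPDATE comparison.
    if marker_score(frame_bits, sid_update_marker) > sid_update_threshold:
        return "SID_UPDATE", "DTX"
    # Step 3: neither marker matched well enough, so treat it as voice.
    return "VOICE", "NON_DTX"


# Hypothetical 8-bit markers, chosen only to exercise the three branches.
SID_FIRST = [1, 1, 1, 1, 0, 0, 0, 0]
SID_UPDATE = [0, 0, 0, 0, 1, 1, 1, 1]
frame = [1, 1, 1, 1, 0, 0, 0, 1]  # one bit error relative to SID_FIRST
print(classify_active_mode_frame(frame, SID_FIRST, SID_UPDATE, 6, 6))
# ('SID_FIRST', 'DTX')
```

Using a score threshold rather than an exact match lets a marker be recognized despite a few bit errors, which appears to be the purpose of the thresholds in the abstract.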
-
Patent number: 6662153
Abstract: A time-separated speech coder that codes a transitional signal of voiced/unvoiced sound through harmonic speech coding, the coder including a transitional excitation signal analyzer/synthesizer for coding the transitional signal by extracting the harmonic model parameters of both transitional analyzers after detecting a transitional point and generating sinusoidal waveforms according to a variable transitional point separating both transitional analyzers. By the transitional point at which energy varies abruptly and the time-separated coding based on the transitional point, more improved speech quality than in the general harmonic speech coder can be obtained using the time-separated speech coder by increasing the representation capability of the transitional signal with large energy variation, after adapting it to the variable transitional point.
Type: Grant
Filed: January 24, 2001
Date of Patent: December 9, 2003
Assignee: Electronics and Telecommunications Research Institute
Inventors: Hyoung Jung Kim, In Sung Lee, Jong Hark Kim, Man Ho Park, Byung Sik Yoon, Song In Choi, Dae Sik Kim
-
Patent number: 6658378
Abstract: In a low bitrate speech encoding system, encoded bits are strongly protected against errors produced on a transmission path. A decoding side checks the transmission errors using an error check code appended to a convolution decoded output and adjusts the decoding output depending on the results of check of transmission errors. At this time, it is necessary to maintain continuity of speech signals after speech decoding. To this end, a convolution decoder 16 convolution decodes the convolution coded output from the encoding device to provide a convolution decoded output of a crucial bit set with the appended error check code and a bit set excluding the crucial bit set. A CRC code comparator-frame masking unit 15 compares the CRC check code appended to the convolution decoded output from the convolution decoder 16 to the CRC check code computed from the bit group excluding the crucial bit set to adjust the convolution decoded output.
Type: Grant
Filed: June 16, 2000
Date of Patent: December 2, 2003
Assignee: Sony Corporation
Inventor: Yuuji Maeda
-
Patent number: 6647280
Abstract: A signal processing method, preferably for extracting a fundamental period from a noisy, low-frequency signal, is disclosed. The signal processing method generally comprises calculating a numerical transform for a number of selected periods by multiplying signal data by discrete points of a sine and a cosine wave of varying period and summing the results. The period of the sine and cosine waves are preferably selected to have a period substantially equivalent to the period of interest when performing the transform.
Type: Grant
Filed: January 14, 2002
Date of Patent: November 11, 2003
Assignee: OB Scientific, Inc.
Inventors: Dennis E. Bahr, James L. Reuss
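
The transform described in the abstract of patent 6647280 (multiply the signal by sine and cosine waves of each candidate period, then sum) can be sketched in Python as follows. Choosing the candidate with the largest squared magnitude is an assumption of this sketch; the abstract does not say how the fundamental period is picked from the transform. The function names and the test signal are likewise hypothetical.

```python
import math


def period_power(samples, period):
    """Correlate the samples with a cosine and a sine of the given period."""
    re = sum(x * math.cos(2 * math.pi * n / period) for n, x in enumerate(samples))
    im = sum(x * math.sin(2 * math.pi * n / period) for n, x in enumerate(samples))
    return re * re + im * im  # squared magnitude at this period


def estimate_period(samples, candidate_periods):
    """Pick the candidate period with the strongest response (assumed rule)."""
    return max(candidate_periods, key=lambda p: period_power(samples, p))


# Test signal: period 25 samples plus a small deterministic disturbance.
signal = [math.sin(2 * math.pi * n / 25) + 0.1 * (((n * 37) % 11) - 5) / 5
          for n in range(200)]
print(estimate_period(signal, [15, 20, 25, 30, 40]))  # 25
```

Evaluating only a handful of periods of interest, rather than a full FFT, keeps the cost proportional to the number of candidates and allows periods that do not fall on FFT bin centers.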
-
Patent number: 6643619
Abstract: A method for reducing interference in acoustic signals by using an adaptive filter method involving spectral subtraction. The inventive method enables a significant reduction of interference in acoustic signals, especially voice signals, without causing any substantial falsification of said signals such as echo or musical tones, and significantly reduces computational requirements in comparison with other methods known per se that are similarly designed to improve signal quality.
Type: Grant
Filed: June 20, 2000
Date of Patent: November 4, 2003
Inventors: Klaus Linhard, Tim Haulick
-
Patent number: 6640208
Abstract: A voiced/unvoiced speech classifier (30) includes a speech segmentor (34) which segments an input digitized speech waveform into frames of speech and a band-pass filter (36) which filters the frames of speech. A relative energy generator (38) generates a relative energy value for each filtered frame of speech and a decision parameter generator (52) including an autocorrelation calculator (54) and a pitch calculator (56) generates a decision parameter based on an autocorrelation function and a pitch frequency index for the filtered frames of speech. A normalized energy calculator (46) adjusts the threshold and then normalizes the relative energy. A comparator (60) provides a signal indicative of whether a frame of speech is voiced speech or unvoiced speech depending on a comparison of the decision parameter and the normalized relative energy value for each filtered frame of speech.
Type: Grant
Filed: September 12, 2000
Date of Patent: October 28, 2003
Assignee: Motorola, Inc.
Inventors: Yaxin Zhang, Jianming Song, Anton Madievski
-
Patent number: 6625574
Abstract: An input digital audio signal is divided into sub-band signals in respective sub-bands. Scale factors of the respective sub-bands are determined on the basis of the sub-band signals for every frame. Calculation is made as to differences between the determined scale factors for a first frame and the determined scale factors for a second frame preceding the first frame. Absolute values of the calculated scale-factor differences are calculated, and data representative of the calculated absolute values are generated. The data representative of the calculated absolute values are encoded into data of a Huffman code. Sign bits are generated which represent signs of the calculated scale-factor differences. The sub-band signals are quantized in response to the determined scale factors for every frame to generate quantized samples of the sub-band signals. The Huffman-code data, the generated sign bits, and the quantized samples of the sub-band signals are combined into a bit stream.
Type: Grant
Filed: August 25, 2000
Date of Patent: September 23, 2003
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Shohei Taniguchi, Yutaka Banba
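
The split of inter-frame scale-factor differences into magnitudes and sign bits, as outlined in the abstract of patent 6625574, can be sketched in Python. The Huffman coding and sub-band quantization stages are left out. Emitting a sign bit only for nonzero differences is an assumption of this sketch, as are the function names and the convention that 1 means negative.

```python
def split_scale_factor_differences(current, previous):
    """Split per-sub-band differences into magnitudes and sign bits.

    Returns (magnitudes, sign_bits). The magnitudes would go on to the
    Huffman coder; sign bits (1 = negative) are emitted only for nonzero
    differences, which is an assumption of this sketch.
    """
    diffs = [c - p for c, p in zip(current, previous)]
    magnitudes = [abs(d) for d in diffs]
    sign_bits = [1 if d < 0 else 0 for d in diffs if d != 0]
    return magnitudes, sign_bits


def rebuild_scale_factors(previous, magnitudes, sign_bits):
    """Decoder-side inverse: recover the current frame's scale factors."""
    signs = iter(sign_bits)
    rebuilt = []
    for prev, mag in zip(previous, magnitudes):
        if mag == 0:
            rebuilt.append(prev)  # unchanged sub-band, no sign bit consumed
        else:
            rebuilt.append(prev - mag if next(signs) else prev + mag)
    return rebuilt


prev_frame = [10, 11, 11, 9]
cur_frame = [10, 12, 9, 9]
mags, signs = split_scale_factor_differences(cur_frame, prev_frame)
print(mags, signs)  # [0, 1, 2, 0] [0, 1]
print(rebuild_scale_factors(prev_frame, mags, signs))  # [10, 12, 9, 9]
```

Coding magnitudes separately from signs roughly halves the symbol alphabet the entropy coder has to handle, since +d and -d share one code word.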
-
Publication number: 20030125937
Abstract: An encoder and associated vector estimation method and system (1) for processing a sequence of input vectors (y0 to yT) each comprising a plurality of elements. The vector estimation system (1) has a digital filter (2) with a filter vector input (3) for receiving said sequence of input vectors (y0 to yT) and a predictor gain input (4) for controlling characteristics of the filter (2). The filter (2) is a Kalman filter and has both a current slowly evolving filter estimate output (6) and a previous slowly evolving filter estimate output (20).
Type: Application
Filed: December 28, 2001
Publication date: July 3, 2003
Inventor: Mark Thomson
-
Publication number: 20030115045
Abstract: To address the need for reducing audio overhang in wireless communication systems (e.g., 100), the present invention provides for the deletion of silent frames before they are converted to audio by the listening devices. The present invention only provides for the deletion of a portion of the silent frames that make up a period of silence or low voice activity in the speaker's audio. Voice frames that make up periods of silence less than a given length of time are not deleted.
Type: Application
Filed: December 13, 2001
Publication date: June 19, 2003
Inventors: John M. Harris, Philip J. Fleming, Joseph Tobin
-
Publication number: 20030101049
Abstract: Speech data frames for transmitting control signalling messages are selected in accordance with the relative subjective importance of the speech signal data content of the frame. Speech frames are classified into frame types with lower priority frame types, such as non-speech frames, being selected first for the control message data and higher priority frame types, such as onset and transient, being avoided for selection due to the higher subjective contribution to speech quality.
Type: Application
Filed: September 30, 2002
Publication date: May 29, 2003
Applicant: Nokia Corporation
Inventors: Ari Lakaniemi, Janne Vainio
-
Patent number: 6564183
Abstract: A speech encoding/decoding apparatus. A speech encoding apparatus has a coding portion for receiving input information related to an uncoded signal representative of an original speech signal, the coding portion including a fixed coding portion for receiving the input information and producing a first coded signal estimate, and an adaptive coding portion for receiving the input information and producing a second coded signal estimate. A controller is connected to the fixed coding portion and the adaptive coding portion for receiving information indicative of speech characteristics of the uncoded signal and generates a control signal; and a code modifier receives the first coded signal estimate from the fixed coding portion and the control signal from the controller and produces a modified signal estimate.
Type: Grant
Filed: December 22, 1999
Date of Patent: May 13, 2003
Assignee: Telefonaktiebolaget LM Ericsson (Publ)
Inventors: Roar Hagen, Erik Ekudden
-
Patent number: 6556967
Abstract: The present invention is a device for and method of detecting voice activity by receiving a signal; computing the absolute value of the signal; squaring the absolute value; low pass filtering the squared result; computing the mean of the filtered signal; subtracting the mean from the filtered result; padding the mean subtracted result with zeros to form a value that is a power of two if the result is not already a power of two; computing a DFFT of the power of two result; normalizing the DFFT result of the last step; computing a mean of the normalization; computing a variance of the normalization; computing a power ratio of the normalization; classifying the mean, variance and power ratio as speech or non-speech based on how this feature vector compares to similarly constructed feature vectors of known speech and non-speech. The voice activity detector includes an absolute value squarer; a low pass filter; a mean subtractor; a zero padder; a DFFT; a normalizer; and a classifier.
Type: Grant
Filed: March 12, 1999
Date of Patent: April 29, 2003
Assignee: The United States of America as represented by the National Security Agency
Inventors: Douglas J. Nelson, David C. Smith, Jeffrey L. Townsend
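
The feature-extraction chain listed in the abstract of patent 6556967 can be followed step by step in a small Python sketch. Several details are assumptions, because the abstract does not specify them: a moving average stands in for the low pass filter, the spectrum is normalized to unit sum, and the "power ratio" is taken as the share of spectral power in the lowest `low_bins` bins. A plain DFT is used instead of a fast transform for brevity, and the final classifier is omitted.

```python
import cmath


def next_power_of_two(n):
    size = 1
    while size < n:
        size *= 2
    return size


def moving_average(values, width):
    """Crude low pass filter (assumed; the abstract names no filter type)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def dft_magnitudes(values):
    """Magnitudes of the first half of a direct DFT (fine for short frames)."""
    n = len(values)
    return [abs(sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, v in enumerate(values)))
            for k in range(n // 2)]


def vad_features(signal, lpf_width=4, low_bins=4):
    envelope = [abs(x) ** 2 for x in signal]          # absolute value, squared
    smoothed = moving_average(envelope, lpf_width)    # low pass filter
    mean = sum(smoothed) / len(smoothed)
    centered = [s - mean for s in smoothed]           # mean subtraction
    pad = next_power_of_two(len(centered)) - len(centered)
    spectrum = dft_magnitudes(centered + [0.0] * pad)  # zero pad, transform
    total = sum(spectrum) or 1.0
    normalized = [m / total for m in spectrum]        # assumed normalization
    spec_mean = sum(normalized) / len(normalized)
    spec_var = sum((m - spec_mean) ** 2 for m in normalized) / len(normalized)
    power = [m * m for m in normalized]
    power_ratio = sum(power[:low_bins]) / (sum(power) or 1.0)  # assumed definition
    return spec_mean, spec_var, power_ratio
```

The resulting three-element vector would then be compared with feature vectors from known speech and non-speech, for example with a nearest-neighbor rule, to reach the final decision.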
-
Publication number: 20030078770
Abstract: The invention relates to a method for determining voice activity in a signal section of an audio signal. The result, i.e. whether voice activity is present in the section of the signal thus observed, depends upon spectral and temporal stationarity of the signal section and/or prior signal sections. In a first step, the method determines whether there is spectral stationarity in the observed signal section. In a second step, the method determines whether there is temporal stationarity in the signal section in question. The final decision as to the presence of voice activity in the signal section observed depends upon the initial values of both steps.
Type: Application
Filed: October 25, 2002
Publication date: April 24, 2003
Inventors: Alexander Kyrill Fischer, Christoph Erdmann
-
Publication number: 20030033140
Abstract: Techniques utilising Time Scale Modification (TSM) of signals are described. The signal is analysed and divided into frames of similar signal types. Techniques specific to the signal type are then applied to the frames thereby optimising the modification process. The method of the present invention enables TSM of different audio signal parts to be realized using different methods, and a system for effecting said method is also described.
Type: Application
Filed: April 2, 2002
Publication date: February 13, 2003
Inventors: Rakesh Taori, Andreas Johannes Gerrits, Dzevdet Burazerovic
-
Patent number: 6519279
Abstract: Transceiver circuitry 1 comprises a first portion 10,20,30,41,50,100, having a first modulation means 41 operating at a first order of modulation, for transmitting and receiving voice signals; a second portion 20,30,42,50,100, having a second modulation means 42 operating at a second order of modulation, for transmitting and receiving digital signals at a higher data rate than is achievable by the first portion; and a data conversion means 20,30,100 operable to convert from or into voice signals intended for processing by the first portion into or from digital signals for processing by the second portion.
Type: Grant
Filed: January 5, 2000
Date of Patent: February 11, 2003
Assignee: Motorola, Inc.
Inventors: Ouelid Abdesselem, Lydie Desperben
-
Publication number: 20020198704
Abstract: A speech detection system is described which uses a time series noise model to represent audio signals corresponding to noise. The system compares incoming audio signals with the noise model and determines the beginning or end of speech in the audio signal depending on how well the input audio compares to the noise model.
Type: Application
Filed: May 31, 2002
Publication date: December 26, 2002
Applicant: CANON KABUSHIKI KAISHA
Inventors: Jebu Jacob Rajan, Jason Peter Andrew Charlesworth
-
Publication number: 20020198705
Abstract: Systems and methods are provided for detecting voiced and unvoiced speech in acoustic signals having varying levels of background noise. The systems receive acoustic signals at two microphones, and generate difference parameters between the acoustic signals received at each of the two microphones. The difference parameters are representative of the relative difference in signal gain between portions of the received acoustic signals. The systems identify information of the acoustic signals as unvoiced speech when the difference parameters exceed a first threshold, and identify information of the acoustic signals as voiced speech when the difference parameters exceed a second threshold. Further, embodiments of the systems include non-acoustic sensors that receive physiological information to aid in identifying voiced speech.
Type: Application
Filed: May 30, 2002
Publication date: December 26, 2002
Inventor: Gregory C. Burnett
-
Patent number: 6484138
Abstract: It is an objective of the present invention to provide an optimized method of selection of the encoding mode that provides rate efficient coding of the input speech. It is a second objective of the present invention to identify and provide a means for generating a set of parameters ideally suited for this operational mode selection. Third, it is an objective of the present invention to provide identification of two separate conditions that allow low rate coding with minimal sacrifice to quality. The two conditions are the coding of unvoiced speech and the coding of temporally masked speech. It is a fourth objective of the present invention to provide a method for dynamically adjusting the average output data rate of the speech coder with minimal impact on speech quality.
Type: Grant
Filed: April 12, 2001
Date of Patent: November 19, 2002
Assignee: Qualcomm, Incorporated
Inventor: Andrew P. DeJaco
-
Patent number: 6480823
Abstract: The input signal is transformed into the frequency domain and then subdivided into bands corresponding to different frequency ranges. Adaptive thresholds are applied to the data from each frequency band separately. Thus the short-term band-limited energies are tested for the presence or absence of a speech signal. The adaptive threshold values are independently updated for each of the signal paths, using a histogram data structure to accumulate long-term data representing the mean and variance of energy within the respective frequency band. Endpoint detection is performed by a state machine that transitions from the speech absent state to the speech present state, and vice versa, depending on the results of the threshold comparisons. A partial speech detection system handles cases in which the input signal is truncated.
Type: Grant
Filed: March 24, 1998
Date of Patent: November 12, 2002
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Yi Zhao, Jean-Claude Junqua
-
Patent number: 6475245
Abstract: A method and apparatus for encoding speech for communication to a decoder for reproduction of the speech where the speech signal is classified into steady state voiced (harmonic), stationary unvoiced, and “transitory” or “transition” speech, and a particular type of coding scheme is used for each class. Harmonic coding is used for steady state voiced speech, “noise-like” coding is used for stationary unvoiced speech, and a special coding mode is used for transition speech, designed to capture the location, the structure, and the strength of the local time events that characterize the transition portions of the speech. The compression schemes can be applied to the speech signal or to the LP residual signal.
Type: Grant
Filed: February 5, 2001
Date of Patent: November 5, 2002
Assignee: The Regents of the University of California
Inventors: Allen Gersho, Eyal Shlomot, Vladimir Cuperman, Chunyan Li
-
Publication number: 20020156620
Abstract: This invention presents a voicing determination algorithm for classification of a speech signal segment as voiced or unvoiced. The algorithm is based on a normalized autocorrelation where the length of the window is proportional to the pitch period. The speech segment to be classified is further divided into a number of sub-segments, and the normalized autocorrelation is calculated for each sub-segment. If a certain number of the normalized autocorrelation values is above a predetermined threshold, the speech segment is classified as voiced. To improve the performance of the voicing determination algorithm in unvoiced to voiced transients, the normalized autocorrelations of the last sub-segments are emphasized. The performance of the voicing decision algorithm can be enhanced by utilizing also the possible lookahead information.
Type: Application
Filed: December 21, 2000
Publication date: October 24, 2002
Inventors: Ari Heikkinen, Samuli Pietila, Vesa Ruoppila
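
The sub-segment voting rule in the abstract of publication 20020156620 can be sketched in Python as below. The window is set proportional to the pitch period through `window_factor`, and the segment is called voiced when at least `min_hits` sub-segments have a normalized autocorrelation above `threshold`. All parameter names and default values are hypothetical, the pitch period is assumed to be known, and the emphasis on the last sub-segments and the lookahead are left out.

```python
import math


def normalized_autocorrelation(x, start, window, lag):
    """Normalized correlation between x[start:start+window] and its lagged copy."""
    num = e0 = e1 = 0.0
    for n in range(start, start + window):
        num += x[n] * x[n + lag]
        e0 += x[n] * x[n]
        e1 += x[n + lag] * x[n + lag]
    denom = math.sqrt(e0 * e1)
    return num / denom if denom > 0 else 0.0


def is_voiced(x, pitch_period, sub_segments=4, window_factor=1.0,
              threshold=0.7, min_hits=3):
    window = max(1, int(window_factor * pitch_period))  # proportional to pitch
    usable = len(x) - window - pitch_period             # last valid start index
    if usable < 0:
        return False  # segment too short for this pitch period
    hits = 0
    for i in range(sub_segments):
        # Spread the sub-segment start points evenly over the segment.
        start = (usable * i) // max(1, sub_segments - 1)
        if normalized_autocorrelation(x, start, window, pitch_period) > threshold:
            hits += 1
    return hits >= min_hits


periodic = [math.sin(2 * math.pi * n / 40) for n in range(240)]
print(is_voiced(periodic, pitch_period=40))  # True
```

Requiring several sub-segments to agree makes the decision less sensitive to a single window that happens to correlate well, which matters most in short transitions.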
-
Patent number: 6470311
Abstract: In a speech processing system, an optimal filter frequency is determined and used to filter an unfiltered signal. The optimum filter is chosen by passing the largest voice area greater than 50 ms through multiple filters. The average energy output for each filter and differences between the filter averages (DeltaEnergy) are calculated. The first peak in DeltaEnergy above the average DeltaEnergy determines the optimal filter for filtering the signal. The filtered signal is divided into segments and voiced periods are determined. The unfiltered signal is divided into pitch synchronous frames based on the filtered signal.
Type: Grant
Filed: October 15, 1999
Date of Patent: October 22, 2002
Assignee: Fonix Corporation
Inventor: Robert Brian Moncur
-
Patent number: 6453041
Abstract: An improved voice activity detection system and method is provided for use in speakerphones and other voice activated systems. To facilitate switching between various operating modes, the voice activity detection scheme utilizes a new voice energy term which is based on an integral of the absolute value of a derivative of a speech signal. Voice activity is detected during a silence mode by comparing a first ratio of a current voice energy value to a background noise value with a voice activity threshold value. Voice activity is detected when the first ratio is greater than the voice activity threshold value. Another step involves identifying a direction of the voice activity during a transmit and receive mode by comparing a second ratio of a transmit path voice energy value to a receive path voice energy value with a transmit threshold value and a receive threshold value. When the second ratio is greater than the transmit threshold value, voice activity is present in the transmit path.
Type: Grant
Filed: October 15, 1998
Date of Patent: September 17, 2002
Assignee: Agere Systems Guardian Corp.
Inventor: Erol Eryilmaz
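
The voice energy term in the abstract of patent 6453041 (an integral of the absolute value of the signal's derivative) has a simple discrete-time analogue: the sum of absolute sample-to-sample differences. The Python sketch below uses that analogue for the silence-mode test only. The function names, the default threshold, and the guard against a zero background estimate are assumptions of this sketch.

```python
def voice_energy(samples):
    """Discrete analogue of the integral of |dx/dt| over one frame."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))


def voice_active(frame, background_energy, activity_threshold=2.0):
    """Silence-mode test: is the energy-to-background ratio above the threshold?"""
    ratio = voice_energy(frame) / max(background_energy, 1e-12)  # avoid divide by zero
    return ratio > activity_threshold


print(voice_energy([0, 1, -1, 0]))        # 4
print(voice_active([0, 1, -1, 0], 1.0))   # True
print(voice_active([0, 0.1, 0, 0.1], 1.0))  # False
```

Because it is built from differences, this energy term ignores any constant offset in the signal and weights rapid changes more heavily than slow drift.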
-
Patent number: 6385548
Abstract: An apparatus and method to characterize an input communication signal as being a voice, tone or noise signal is provided. The apparatus and method involve measuring variations of pitch over time from a sampled input signal. A minimum value of Average Magnitude Difference Function (AMDF) over a pitch range and an average variation value of the AMDF over sampled intervals are used to determine whether the signal is a voice signal, a tone or noise. Historical data of these values is maintained in a dual buffer arrangement and is used in the determination of signal type by detecting transitions.
Type: Grant
Filed: December 12, 1997
Date of Patent: May 7, 2002
Assignee: Motorola, Inc.
Inventors: Satish Ananthaiyer, Eric David Elias
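
The two statistics named in the abstract of patent 6385548 can be computed with a few lines of Python. The AMDF itself is standard: the mean absolute difference between the signal and a lagged copy of itself. Reading the "average variation value" as the mean absolute change of the per-interval AMDF minimum is an interpretation made for this sketch, and the voice/tone/noise decision thresholds and the dual-buffer history are not shown.

```python
def amdf(samples, lag):
    """Average Magnitude Difference Function at one lag."""
    count = len(samples) - lag
    return sum(abs(samples[i] - samples[i + lag]) for i in range(count)) / count


def amdf_minimum(samples, min_lag, max_lag):
    """Return (lag, value) of the AMDF minimum over the pitch range."""
    values = {lag: amdf(samples, lag) for lag in range(min_lag, max_lag + 1)}
    best_lag = min(values, key=values.get)
    return best_lag, values[best_lag]


def average_variation(minima):
    """Mean absolute change between successive interval minima (assumed reading)."""
    steps = [abs(b - a) for a, b in zip(minima, minima[1:])]
    return sum(steps) / len(steps) if steps else 0.0


sawtooth = [n % 8 for n in range(64)]   # exactly periodic, period 8
print(amdf_minimum(sawtooth, 2, 12))    # (8, 0.0)
print(average_variation([1, 3, 2]))     # 1.5
```

Intuitively, a steady tone gives a deep AMDF minimum that barely changes between intervals, voice gives a clear minimum that drifts with pitch, and noise gives no pronounced minimum at all.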
-
Patent number: 6385570
Abstract: An apparatus and method for detecting transitional parts of speech, and a method of synthesizing transitional parts of speech, are provided. This apparatus includes a residual signal preprocessor for emphasizing a period of a speech residual signal which includes a peak value, a relative peak value calculation unit for obtaining a peak value of a preprocessed residual signal and a relative peak value using a predetermined reference peak value, and a transitional part detector for detecting transitional parts of speech on the basis of the relative peak value.
Type: Grant
Filed: May 1, 2000
Date of Patent: May 7, 2002
Assignee: Samsung Electronics Co., Ltd.
Inventor: Moo-young Kim
-
Patent number: 6370500
Abstract: A technique is used in a speech encoder (107) that reduces non-speech activity of a low bit rate digital voice message. Speech model parameters that include quantized speech spectral parameter vectors are generated in a sequence of frames. A determination is made as to which frames of the sequence of frames are voiced frames and which frames are unvoiced frames. A consecutive sequence of unvoiced frames is identified (2330) as an unvoiced burst when a length, N_UV, of the consecutive sequence of frames exceeds a predetermined length, N_s. A non-speech activity portion of the unvoiced burst is identified (2335-2365) and removed.
Type: Grant
Filed: September 30, 1999
Date of Patent: April 9, 2002
Assignee: Motorola, Inc.
Inventors: Jian-Cheng Huang, Sunil Satyamurti, Floyd Simpson, Kenneth Finlon
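
The unvoiced-burst test in the abstract of patent 6370500 amounts to a run-length scan over the per-frame voicing decisions. The Python sketch below finds every run of unvoiced frames whose length exceeds a predetermined length (`min_length` here, a hypothetical name). Identifying and removing the non-speech portion inside each burst is not shown.

```python
def find_unvoiced_bursts(voiced_flags, min_length):
    """Return (start, end) index pairs, end exclusive, of unvoiced runs
    whose length is strictly greater than min_length."""
    bursts = []
    run_start = None
    # A trailing True acts as a sentinel so a run at the end is closed too.
    for i, voiced in enumerate(list(voiced_flags) + [True]):
        if not voiced and run_start is None:
            run_start = i                      # a new unvoiced run begins
        elif voiced and run_start is not None:
            if i - run_start > min_length:     # run length exceeds the limit
                bursts.append((run_start, i))
            run_start = None
    return bursts


flags = [True, False, False, True, False, False, False, False, True, False]
print(find_unvoiced_bursts(flags, 2))  # [(4, 8)]
```

Only the four-frame run qualifies here: the two-frame run does not exceed the limit of 2, and the single trailing unvoiced frame is too short.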
-
Patent number: 6360199
Abstract: A speech coding rate selector includes: a speech input unit for receiving an input speech; a short-term power arithmetic unit for computing the power of an input speech at a predetermined time unit; an ambient noise power estimating unit for estimating the power of an ambient noise superimposed on an input speech; a rate selection threshold value arithmetic unit for computing a group of power threshold values for selecting a speech coding rate; a power comparator for selecting one appropriate rate from among a plurality of speech coding rates; an ambient noise property inferring unit for inferring the property of an ambient noise superimposed on an input speech; and a comparison power corrector for correcting an output value of the short-term power arithmetic unit if an ambient noise, the property of which has been inferred by the ambient noise property inferring unit, proves to exhibit a considerable time-dependent change in power.
Type: Grant
Filed: June 8, 1999
Date of Patent: March 19, 2002
Inventor: Atsushi Yokoyama
-
Patent number: 6314391Abstract: In case codes of old and new standards are recorded on the same recording medium, it is desirable that the signals of the old standard can be reproduced by an old standard accommodating reproducing device, while both signals can be reproduced by the new standard accommodating reproducing device such as to avoid lowering of the signal quality. To this end, if multi-channel signals are recorded in terms of a frame the size of which cannot be controlled, a second encoding circuit encodes signals of a channel reproduced by the old standard accommodating reproducing device, while a first encoding circuit encodes the signals of a channel reproduced by an old standard accommodating reproducing device with a number of bits smaller than the maximum number of bits that can be allocated to that frame. A codestring generating circuit arrays a codestring encoded by the second encoding circuit in a void area of a frame provided by encoding in the first encoding circuit.Type: GrantFiled: February 18, 1998Date of Patent: November 6, 2001Assignee: Sony CorporationInventors: Kyoya Tsutsui, Osamu Shimoyoshi
-
Patent number: 6304842Abstract: A method of encoding signal segments which represent unvoiced plosives. The signal segments to be encoded are contained within a speech signal divided into m = 1, ..., N frames. Each frame is subdivided into l = 1, ..., L subframes. The speech signal has a gain g_m(l) within each subframe. An energy measure e_m(l) representative of the signal segments' energy content is defined. An energy threshold e_th(l) representative of a sudden energy change characteristic of an unvoiced plosive is also defined. For each frame, the energy measure e_m(l) and the energy threshold e_th(l) are derived for each subframe within that frame. If e_m(l) ≤ e_th(l) for each subframe within a particular frame, then a plosive locator l_pl = 0 and a plosive index i_pl = 0 are assigned to that frame to indicate absence of a plosive within that frame.Type: GrantFiled: June 30, 1999Date of Patent: October 16, 2001Assignee: Glenayre Electronics, Inc.Inventors: Mohammad Aamir Husain, Bhaskar Bhattacharya
-
Patent number: 6285979Abstract: Phoneme analysis is carried out in real time by detecting a voiced component in the range of 200 Hz to 1 kHz and simultaneously detecting voiceless components having frequencies greater than about 2.4 kHz and greater than about 3.4 kHz, respectively, to produce respective outputs which are logically combined to produce two-bit logic signals which can be used to control a speech processing device.Type: GrantFiled: February 22, 1999Date of Patent: September 4, 2001Assignee: AVR Communications Ltd.Inventors: Boris Ginzburg, Barak Dar
-
Patent number: 6275795Abstract: In an apparatus for extracting information from an input speech signal, a preprocessor, a buffer, a segmenter, an acoustic classifier and a feature extractor are provided. The preprocessor generates formant related information for consecutive time frames of the input speech signal. This formant related information is fed into the buffer, which can store signals representative of a plurality of frames. The segmenter monitors the signals representative of the incoming frames and identifies segments in the input speech signal during which variations in the formant related information remain within prespecified limits. The acoustic classifier then determines classification information for each segment identified by the segmenter, based on acoustic classes found in training data. The feature estimator then determines, for each segment, the information required, based on the input speech signal during that segment, training data and the classification information determined by the acoustic classifier.Type: GrantFiled: January 8, 1999Date of Patent: August 14, 2001Assignee: Canon Kabushiki KaishaInventor: Eli Tzirkel-Hancock
-
Patent number: 6260017Abstract: A multipulse interpolative coder for transition speech frames includes an extractor configured to represent a first frame of transitional speech samples by a subset of the samples of the frame. The coder also includes an interpolator configured to interpolate the subset of samples and a subset of samples extracted from an earlier-received frame to synthesize other samples of the first frame that are not included in the subset. The subset of samples is further simplified by selecting a set of pulses from the subset and assigning zero values to unselected pulses. In the alternative, a portion of the unselected pulses may be quantized. The set of pulses may be the pulses having the greatest absolute amplitudes in the subset. In the alternative, the set of pulses may be the most perceptually significant pulses of the subset.Type: GrantFiled: May 7, 1999Date of Patent: July 10, 2001Assignee: Qualcomm Inc.Inventors: Amitava Das, Sharath Manjunath
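The pulse-selection variant that keeps the pulses with the greatest absolute amplitudes and zeroes the rest can be illustrated roughly as follows. The function name `keep_largest_pulses` and its parameters are hypothetical. The sketch covers only the selection step, not the interpolation, the quantization of unselected pulses, or the perceptual-significance alternative.

```python
def keep_largest_pulses(samples, num_pulses):
    """Zero every sample except the num_pulses with the greatest absolute amplitude.

    samples: list of numeric sample values (assumed format).
    num_pulses: hypothetical count of pulses to retain; values below 0 act as 0.
    """
    # Rank sample indices by magnitude, largest first. sorted() is stable,
    # so equal magnitudes keep their original order.
    ranked = sorted(range(len(samples)), key=lambda i: abs(samples[i]), reverse=True)
    keep = set(ranked[:max(num_pulses, 0)])
    # Selected pulses keep their signed values; all others become zero.
    return [x if i in keep else 0.0 for i, x in enumerate(samples)]
```

For example, `keep_largest_pulses([0.1, -0.9, 0.3, 0.7, -0.2], 2)` retains only the samples at positions 1 and 3, with their signs preserved.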
-
Patent number: 6249757Abstract: A system for detection of voice activity in a communications signal, employing a nonlinear two filter voice detection algorithm, in which one filter has a low time constant (the fast filter) and one filter has a high time constant (the slow filter). The slow filter serves to provide a noise floor estimate for the incoming signal, and the fast filter serves to more closely represent the total energy in the signal. The absolute value of incoming data is presented to both filters, and the difference in filter outputs is integrated over each of a series of successive frames, thereby giving an indication of the energy level above the noise floor in each frame of the incoming signal. Voice activity is detected if the measured energy level for a frame exceeds a specified threshold level. Silence (e.g., leaving only noise) is detected if the measured energy level for each of a specified number of successive frames does not exceed a specified threshold level.Type: GrantFiled: February 16, 1999Date of Patent: June 19, 2001Assignee: 3Com CorporationInventor: David G. Cason
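A rough sketch of the two-filter idea described in this abstract follows. It makes several simplifying assumptions: both filters are modeled as ordinary one-pole smoothers (the abstract calls the algorithm nonlinear and does not give the filter equations), and the function name `detect_voice`, the coefficients, the frame length, the threshold, and the hangover count are all hypothetical values chosen for illustration.

```python
def detect_voice(samples, frame_len=80, fast_alpha=0.5, slow_alpha=0.01,
                 threshold=1.0, hangover_frames=3):
    """Return one boolean per complete frame: True while voice is declared.

    All parameter names and defaults are illustrative assumptions.
    """
    fast = slow = 0.0          # filter states: fast tracks energy, slow tracks noise floor
    decisions = []
    voice = False
    quiet_run = 0              # consecutive frames at or below the threshold
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame_energy = 0.0
        for x in samples[start:start + frame_len]:
            mag = abs(x)                       # absolute value feeds both filters
            fast += fast_alpha * (mag - fast)  # low time constant
            slow += slow_alpha * (mag - slow)  # high time constant
            frame_energy += fast - slow        # integrate the difference over the frame
        if frame_energy > threshold:
            voice = True
            quiet_run = 0
        else:
            quiet_run += 1
            if quiet_run >= hangover_frames:   # silence only after several quiet frames
                voice = False
        decisions.append(voice)
    return decisions
```

Requiring several successive quiet frames before declaring silence mirrors the last sentence of the abstract and keeps short pauses inside a talk spurt from toggling the decision.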
-
Patent number: 6249758Abstract: An audio signal encoding device is provided comprising an input for receiving a sub-frame of an audio signal, a voiced audio signal synthesis stage, an unvoiced audio signal synthesis stage, and a processing unit. The voiced audio signal synthesis stage is operative for producing a first synthetic audio signal approximating the sub-frame of an audio signal received at the input on the basis of a first set of parameters. The unvoiced audio signal synthesis stage is operative for producing a second synthetic audio signal approximating the sub-frame of an audio signal received at the input on the basis of a second set of parameters. The processing unit is operative for releasing a set of parameters allowing to generate a selected one of the first synthetic audio signal and the second synthetic audio signal.Type: GrantFiled: June 30, 1998Date of Patent: June 19, 2001Assignee: Nortel Networks LimitedInventor: Paul Mermelstein
-
Patent number: 6240381Abstract: The onset of a particular signal event is determined by first smoothing the signal containing the event, and then analyzing the smoothed waveform to determine onset. Smoothing is performed by analyzing the value of each point of data and modifying the value based on previous data point values in the waveform. The smoothed waveform is analyzed by iteratively stepping through the data points of the smoothed waveform and determining event onset based on change in data point values. The analysis uses the slope of the waveform to determine whether the data point values and slopes meet certain criteria indicating an event onset.Type: GrantFiled: February 17, 1998Date of Patent: May 29, 2001Assignee: Fonix CorporationInventor: Michael W. Newson
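One way to picture the smooth-then-scan procedure in this abstract is the sketch below. The abstract does not specify the smoothing rule or the slope criteria, so both are assumptions here: smoothing is a simple blend of each point with the previous smoothed point, and an onset is declared where the smoothed slope stays above a minimum for a few consecutive points. The names `smooth`, `find_onset`, `weight`, `min_slope`, and `run_length` are hypothetical.

```python
def smooth(values, weight=0.5):
    """Blend each point with the previous smoothed point (an assumed causal smoother)."""
    out = []
    prev = None
    for v in values:
        prev = v if prev is None else weight * v + (1.0 - weight) * prev
        out.append(prev)
    return out


def find_onset(values, min_slope=0.1, run_length=2, weight=0.5):
    """Return the index where the smoothed slope first stays above min_slope.

    The slope must exceed min_slope for run_length consecutive steps.
    Returns None when no such run exists.
    """
    s = smooth(values, weight)
    run = 0
    for i in range(1, len(s)):
        if s[i] - s[i - 1] > min_slope:
            run += 1
            if run == run_length:
                return i - run_length + 1   # first point of the rising run
        else:
            run = 0                          # the rise was not sustained
    return None
```

With these defaults a single-sample spike is ignored because the slope turns negative on the next step, while a sustained step is reported at the point where the rise begins.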
-
Patent number: 6233550Abstract: A method and apparatus for encoding speech for communication to a decoder for reproduction of the speech where the speech signal is classified into steady state voiced (harmonic), stationary unvoiced, and “transitory” or “transition” speech, and a particular type of coding scheme is used for each class. Harmonic coding is used for steady state voiced speech, “noise-like” coding is used for stationary unvoiced speech, and a special coding mode is used for transition speech, designed to capture the location, the structure, and the strength of the local time events that characterize the transition portions of the speech. The compression schemes can be applied to the speech signal or to the LP residual signal.Type: GrantFiled: August 28, 1998Date of Patent: May 15, 2001Assignee: The Regents of the University of CaliforniaInventors: Allen Gersho, Eyal Shlomot, Vladimir Cuperman, Chunyan Li
-
Patent number: 6226606Abstract: In a method for tracking pitch in a speech signal, first and second window vectors are created from samples taken across first and second windows of the speech signal. The first window is separated from the second window by a test pitch period. The energy of the speech signal in the first window is combined with the correlation between the first window vector and the second window vector to produce a predictable energy factor. The predictable energy factor is then used to determine a pitch score for the test pitch period. Based in part on the pitch score, a portion of the pitch track is identified.Type: GrantFiled: November 24, 1998Date of Patent: May 1, 2001Assignee: Microsoft CorporationInventors: Alejandro Acero, James G. Droppo, III
-
Patent number: 6188979Abstract: A method and apparatus for improved pitch period (τ) estimation in a compression system is disclosed. The system uses original estimates of integer lag (τ_0) and open-loop prediction gain (β_ol) as input to an adaptive filter parameter initialization block (304) which supplies inputs to a plurality of adaptive filter elements (306-308). Adaptive filter elements (306-308) provide information regarding the harmonics of the residual signal (ε(n)) to an adaptive filter parameter analysis block (310). Adaptive filter parameter analysis block (310) estimates the fundamental frequency of the residual signal based on the analysis of the harmonics and outputs a pitch period (τ) for eventual use in a delay contour computation.Type: GrantFiled: May 28, 1998Date of Patent: February 13, 2001Assignee: Motorola, Inc.Inventor: James Patrick Ashley
-
Patent number: 6182032Abstract: A communication system has a network and a number of terminals. The network and the terminals have multi-rate speech encoders and decoders. Two terminals may communicate with each other through two-way voice communication where voice paths of the two-way voice communication are acoustically coupled to each other. The two terminals may also communicate through at least one non-acoustically coupled path. If the two terminals communicate through at least one non-acoustically coupled path, multi-rate encoders and decoders assigned to the non-acoustically coupled path operate at a lower bit rate than in a situation in which the two terminals operate through two-way voice. Whether a communication between the two terminals is through at least one non-acoustically coupled path is established a priori or dynamically.Type: GrantFiled: September 10, 1998Date of Patent: January 30, 2001Assignee: U.S. Philips CorporationInventor: Juha Rapeli
-
Patent number: 6173265Abstract: A voice recording and/or reproducing device includes a plurality of coders having different bit rates for coding voice to provide coded voice data, a voice recording mode change over switch for selecting one of the plurality of coders, and a system controller. The system controller stores coding selection data obtained by the change over of the voice recording mode and coded voice data obtained from the selected coder, to a storing medium, and reduces a deterioration of the voice due to the change over. The voice recording and/or reproducing device also includes a detector for detecting the coding selection data, and a plurality of decoders for decoding the coded voice data at the bit rate corresponding to the detected coding selection data.Type: GrantFiled: December 23, 1996Date of Patent: January 9, 2001Assignee: Olympus Optical Co., Ltd.Inventor: Hidetaka Takahashi
-
Patent number: 6157906Abstract: A digital signal processor (100) receives a digitally vocoded signal (102), and calculates a staggered average value (404) from the frame energy of each received frame, or the product of the frame energy and a voicing value. While the staggered average value is above a threshold voice indicator value, speech is declared present.Type: GrantFiled: July 31, 1998Date of Patent: December 5, 2000Assignee: Motorola, Inc.Inventors: Richard Brent Nicholls, Chin Pan Wong, Martin Thuo Karanja, Patrick Joseph Doran, David James Graham
-
Patent number: RE38269Abstract: A speech coding system employs measurements of robust features of speech frames whose distributions are not strongly affected by noise/levels to make voicing decisions for input speech occurring in a noisy environment. Linear programming analysis of the robust features and respective weights are used to determine an optimum linear combination of these features. The input speech vectors are matched to a vocabulary of codewords in order to select the corresponding, optimally matching codeword. Adaptive vector quantization is used in which a vocabulary of words obtained in a quiet environment is updated based upon a noise estimate of a noisy environment in which the input speech occurs, and the “noisy” vocabulary is then searched for the best match with an input speech vector. The corresponding clean codeword index is then selected for transmission and for synthesis at the receiver end.Type: GrantFiled: October 21, 1999Date of Patent: October 7, 2003Assignee: ITT Manufacturing Enterprises, Inc.Inventor: Yu-Jih Liu