Silence Decision Patents (Class 704/215)
  • Patent number: 7302388
    Abstract: Method and apparatus detect voice activity for spectrum or power efficiency purposes. The method determines and tracks the instant, minimum and maximum power levels of the input signal. The method selects a first range of signals to be considered as noise, and a second range of signals to be considered as voice. The method uses the selected voice, noise and power levels to calculate a log likelihood ratio (LLR). The method uses the LLR to determine a threshold, then uses the threshold for differentiating between noise and voice.
    Type: Grant
    Filed: February 17, 2004
    Date of Patent: November 27, 2007
    Assignee: Ciena Corporation
    Inventors: Song Zhang, Eric Verreault
  • Patent number: 7299173
    Abstract: Speech presence is detected by first bandpass filtering (141, 143, 145) the speech to split it into banks of sub-bands. A matrix of shift registers (150) stores each sub-band of speech. A power determining circuit (259) then determines individual power measurements of the speech stored in each shift register element. A variance combining circuit (160) combines the individual power measurements to provide a variance for the individual shift registers. A comparator circuit (170) finally compares the variance with at least one threshold to indicate whether speech is detected.
    Type: Grant
    Filed: January 30, 2002
    Date of Patent: November 20, 2007
    Assignee: Motorola Inc.
    Inventors: Changxue Ma, Mark Randolph
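The power, variance, and comparison stages of patent 7299173 can be sketched as follows. This is a minimal illustration, not the patented circuit: it assumes the sub-band samples are already available as lists (the bandpass filter bank is omitted), and the names `subband_variances` and `speech_detected`, as well as the rule "speech if any variance exceeds the threshold", are assumptions made for the example.

```python
from statistics import pvariance


def subband_variances(registers):
    """Return one variance per shift register.

    `registers` is a list of shift registers; each register is a list of
    stored samples for one sub-band (hypothetical data layout).
    """
    variances = []
    for samples in registers:
        powers = [s * s for s in samples]    # power of each register element
        variances.append(pvariance(powers))  # variance of those power values
    return variances


def speech_detected(registers, threshold):
    """Assumed decision rule: speech if any sub-band variance exceeds the threshold."""
    return any(v > threshold for v in subband_variances(registers))
```

A steady sub-band such as `[1, 1, 1, 1]` yields zero variance, while a fluctuating one such as `[0, 2, 0, 2]` yields a large variance, which is what the comparator stage reacts to.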
  • Publication number: 20070233471
    Abstract: A speech processing apparatus includes a sound input unit that receives an input of a sound including a voice of one of an operator and a person other than the operator; a designation-duration accepting unit that accepts a designation-duration designated by the operator as a time interval that is a target of a speech processing within the input sound; a voice-duration detecting unit that detects a voice-duration that is a time interval in which the voice is present from the input sound; a speaker determining unit that determines whether a speaker of the voice is the operator or the person based on the input sound; and a deciding unit that detects an overlapping period between the designation-duration and the voice-duration, and decides that the voice-duration including the overlapping period is a processing duration, when the overlapping period is detected and the speaker is determined to be the person.
    Type: Application
    Filed: October 17, 2006
    Publication date: October 4, 2007
    Applicant: Kabushiki Kaisha Toshiba
    Inventor: Masahide ARIU
  • Patent number: 7272552
    Abstract: The present invention is a system and method that improves upon voice activity detection by packetizing actual noise signals, typically background noise. In accordance with the present invention an access network receives an input voice signal (including noise) and converts the input voice signal into a packetized voice signal. The packetized voice signal is transmitted via a network to an egress network. The egress network receives the packetized voice signal, converts the packetized voice signal into an output voice signal, and outputs the output voice signal. The egress network also extracts and stores noise packets from the received packetized voice signal and converts the packetized noise signal into an output noise signal. When the access network ceases to receive the input voice signal while the call is still ongoing, the access network instructs the egress network to continually output the output noise signal.
    Type: Grant
    Filed: December 27, 2002
    Date of Patent: September 18, 2007
    Assignee: AT&T Corp.
    Inventors: James H James, Joshua Hal Rosenbluth
  • Patent number: 7243063
    Abstract: A method segments an audio signal including frames into non-speech and speech segments. First, high-dimensional spectral features are extracted from the audio signal. The high-dimensional features are then projected non-linearly to low-dimensional features that are subsequently averaged using a sliding window and weighted averages. A linear discriminant is applied to the averaged low-dimensional features to determine a threshold separating the low-dimensional features. The linear discriminant can be determined from a Gaussian mixture or a polynomial applied to a bimodal histogram distribution of the low-dimensional features. Then, the threshold can be used to classify the frames into either non-speech or speech segments. Speech segments having a very short duration can be discarded, and the longer speech segments can be further extended. In batch-mode or real-time the threshold can be updated continuously.
    Type: Grant
    Filed: July 17, 2002
    Date of Patent: July 10, 2007
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Bhiksha Ramakrishnan, Rita Singh
  • Patent number: 7233895
    Abstract: A low energy detector maintains a sorted list that defines to the circuit removing samples from the queue the location and energy of all samples that fall below the predetermined energy level. This allows a circuit removing samples to execute an algorithm that allows samples to be deleted in accordance with a predetermined pattern.
    Type: Grant
    Filed: May 30, 2002
    Date of Patent: June 19, 2007
    Assignee: Avaya Technology Corp.
    Inventor: Norman W. Petty
  • Patent number: 7228271
    Abstract: The telephone apparatus of the present invention comprises a first voice band expander for generating a voiced signal frequency component by shifting the frequency of the voice signal received, a second voice band expander for generating a voiceless signal frequency component by shifting the frequency of the voice signal received, and a voice composer for composing the voice signal received, the output of the first voice band expander, and the output of the second voice band expander, which is able to output clear voices in aural communication.
    Type: Grant
    Filed: December 23, 2002
    Date of Patent: June 5, 2007
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Toshimichi Tokuda, Takashi Kimura
  • Patent number: 7203640
    Abstract: An input signal is input via an input part. A plurality of signal section candidate detecting parts having different detection algorithms detect an intended signal section candidate and a noise signal section candidate from the input signal. A signal section classifying part is notified of detection results from the respective signal section candidate detecting parts, and classifies the respective signal section candidates based on a combination of the detection results.
    Type: Grant
    Filed: October 30, 2002
    Date of Patent: April 10, 2007
    Assignee: Fujitsu Limited
    Inventors: Kentaro Murase, Takuya Noda, Kazuhiro Watanabe
  • Patent number: 7203638
    Abstract: A source-controlled Variable bit-rate Multi-mode WideBand (VMR-WB) codec, having a mode of operation that is interoperable with the Adaptive Multi-Rate wideband (AMR-WB) codec, the codec comprising: at least one Interoperable full-rate (I-FR) mode, having a first bit allocation structure based on one of the AMR-WB codec coding types; and at least one comfort noise generator (CNG) coding type for encoding inactive speech frames, having a second bit allocation structure based on the AMR-WB SID_UPDATE coding type.
    Type: Grant
    Filed: January 19, 2005
    Date of Patent: April 10, 2007
    Assignee: Nokia Corporation
    Inventors: Milan Jelinek, Redwan Salami
  • Patent number: 7177810
    Abstract: A method and apparatus for finding endpoints in speech by utilizing information contained in speech prosody. Prosody denotes the way speakers modulate the timing, pitch and loudness of phones, words, and phrases to convey certain aspects of meaning; informally, prosody includes what is perceived as the “rhythm” and “melody” of speech. Because speakers use prosody to convey units of speech to listeners, the method and apparatus performs endpoint detection by extracting and interpreting the relevant prosodic properties of speech.
    Type: Grant
    Filed: April 10, 2001
    Date of Patent: February 13, 2007
    Assignee: SRI International
    Inventors: Elizabeth Shriberg, Harry Bratt, Mustafa K. Sonmez
  • Patent number: 7162418
    Abstract: A buffering process for real-time digital audio is provided to mitigate the effect of network “jitter” from inconsistent network packet delivery rates. The buffering algorithm is particularly useful for audio data including distinct bursts separated by silence, such as speech. The process holds incoming audio packets in a queue until either: (a) the buffer contents meet a predetermined threshold; or (b) the end packet of a burst is received. The result is that silent periods between bursts may expand or decrease relative to the original audio pattern, allowing cumulative jitter to be played out as silence. The threshold is sized such that the deviation in silence is unnoticeable by a listener. In an optional embodiment, the process periodically adjusts the threshold to adapt to network conditions.
    Type: Grant
    Filed: November 15, 2001
    Date of Patent: January 9, 2007
    Assignee: Microsoft Corporation
    Inventors: Ivan J. Leichtling, Ido Ben-Shachar
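The hold-and-release rule in patent 7162418 (release when the buffer reaches a threshold, or when the end packet of a burst arrives) can be sketched as below. This is a hedged illustration only: the class name `BurstBuffer`, the `end_of_burst` flag, and the choice to pass later packets of the same burst straight through once release has started are all assumptions, not details taken from the abstract.

```python
from collections import deque


class BurstBuffer:
    """Hold packets of a burst until a threshold is met or the burst ends."""

    def __init__(self, threshold):
        self.threshold = threshold  # packets to accumulate before playback starts
        self.queue = deque()
        self.releasing = False      # True once playback of the current burst began

    def push(self, packet, end_of_burst=False):
        """Queue a packet and return the list of packets released for playback."""
        self.queue.append(packet)
        # Condition (a): buffer contents meet the threshold.
        # Condition (b): the end packet of the burst has arrived.
        if len(self.queue) >= self.threshold or end_of_burst:
            self.releasing = True
        released = []
        if self.releasing:
            released = list(self.queue)
            self.queue.clear()
        if end_of_burst:
            self.releasing = False  # the next burst is buffered afresh
        return released
```

With a threshold of 3, the first two packets of a burst are held, the third releases all three, and a short burst of two packets is released as soon as its end packet arrives.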
  • Patent number: 7155385
    Abstract: An estimate is made of the power of a speech portion of a speech signal that includes speech portions separated by non-speech portions, the power for the speech portion being estimated based on a power envelope that spans the speech portion. The gain of an automatic gain control is not adjusted during the speech portions.
    Type: Grant
    Filed: May 16, 2002
    Date of Patent: December 26, 2006
    Assignee: Comerica Bank, as Administrative Agent
    Inventors: Alexander Berestesky, David E. Duehren
  • Patent number: 7136630
    Abstract: The present invention relates to a mobile set integrating a memory efficient data storage system for the real time recording of voice conversations, data transmission and the like. The data recorder has the capacity to selectively choose the most relevant time frames of a conversation for recording, while discarding time frames that only occupy additional space in memory without holding any conversational data. The invention executes a series of logic steps on each signal including a voice activity detector step, frame comparison step, and sequential recording step. A mobile set having a modified architecture for performing the methods of the present invention is also disclosed.
    Type: Grant
    Filed: December 22, 2000
    Date of Patent: November 14, 2006
    Assignee: Broadcom Corporation
    Inventor: Fei Xie
  • Patent number: 7130797
    Abstract: A method of locating a talker in a reverberant environment comprises receiving multiple audio signals from a microphone array that include direct path audio signal and reverberation signal components. The direct path audio signal components of the multiple audio signals are detected and are used to weight the multiple audio signals. A position estimate based on the weighted audio signals is then calculated. Periods of speech activity are detected and a final position estimate is generated during the periods of speech activity.
    Type: Grant
    Filed: August 15, 2002
    Date of Patent: October 31, 2006
    Assignee: Mitel Networks Corporation
    Inventors: Franck Beaucoup, Michael Tetelbaum
  • Patent number: 7130793
    Abstract: When it is determined that a sample queue exceeds a first predefined level, samples being received from an IP switched network are modified such that samples are removed within the voiced region of the samples by removing whole pitch periods of samples. If the sample queue is below a second predefined number, additional samples are placed into the queue by analyzing voiced samples from the IP switched network and generating additional pitch periods of samples.
    Type: Grant
    Filed: April 5, 2002
    Date of Patent: October 31, 2006
    Assignee: Avaya Technology Corp.
    Inventors: Norman C. Chan, Sharmistha Sarkar Das
  • Patent number: 7127392
    Abstract: The present invention is a device for and method of detecting voice activity. First, the AM envelope of a segment of a signal of interest is determined. Next, the number of times the AM envelope crosses a user-definable threshold is determined. If there are no crossings, the segment is identified as non-speech. Next, the number of points on the AM envelope within a user-definable range is determined. If there are fewer than a user-definable number of points within the range, the segment is identified as non-speech. Next, the mean, variance, and power ratio of the normalized spectral content of the AM envelope is found and compared to the same for known speech and non-speech. The segment is identified as being of the same type as the known speech or non-speech to which it most closely compares. These steps are repeated for each signal segment of interest.
    Type: Grant
    Filed: February 12, 2003
    Date of Patent: October 24, 2006
    Assignee: The United States of America as represented by the National Security Agency
    Inventor: David C. Smith
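The first two rejection tests of patent 7127392 (threshold crossings, then points within a range) lend themselves to a short sketch. It assumes the AM envelope has already been computed and is passed in as a list of numbers; the function names and parameter names are hypothetical, and the later spectral comparison step is not covered.

```python
def count_crossings(envelope, threshold):
    """Count how often consecutive envelope points fall on opposite sides of the threshold."""
    crossings = 0
    for prev, cur in zip(envelope, envelope[1:]):
        if (prev < threshold) != (cur < threshold):
            crossings += 1
    return crossings


def early_reject(envelope, threshold, low, high, min_points):
    """Return True when the segment can be labelled non-speech by the first two tests."""
    # Test 1: no threshold crossings means non-speech.
    if count_crossings(envelope, threshold) == 0:
        return True
    # Test 2: too few envelope points inside [low, high] means non-speech.
    in_range = sum(1 for v in envelope if low <= v <= high)
    return in_range < min_points
```

A segment that survives both tests would then go on to the spectral mean, variance, and power-ratio comparison described in the abstract.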
  • Patent number: 7092875
    Abstract: A first CN code (silence code) obtained by encoding a silence signal, which is contained in an input signal, by a silence compression function of a first speech encoding scheme is transcoded to a second CN code of a second speech encoding scheme without decoding the first CN code to a CN signal. For example, the first CN code is demultiplexed into a plurality of first element codes by a code demultiplexer, the first element codes are each transcoded to a plurality of second element codes that constitute the second CN code, and the second element codes obtained by this transcoding are multiplexed to output the second CN code.
    Type: Grant
    Filed: March 27, 2002
    Date of Patent: August 15, 2006
    Assignee: Fujitsu Limited
    Inventors: Yoshiteru Tsuchinaga, Yasuji Ota, Masanao Suzuki
  • Patent number: 7080007
    Abstract: An apparatus and a method for computing a Speech Absence Probability (SAP), and an apparatus and a method for removing noise by using the SAP computing device and method are provided.
    Type: Grant
    Filed: September 25, 2002
    Date of Patent: July 18, 2006
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Chang-yong Son, Vladimir Shin, Sang-ryong Kim
  • Patent number: 7075907
    Abstract: A method for operating a wireless communications system includes a step of signalling, between a mobile station and a network, that the mobile station or the network is temporarily ceasing transmission of circuit switched information (DTX), which could be voice frames or data frames. For the case of voice, the method further includes a step, executed in the network, of determining if a current uplink or downlink voice traffic channel that is assigned to the mobile station can be retained by the mobile station, or whether the current uplink or downlink voice traffic channel must be released by the mobile station. Only if it is determined that the current uplink or downlink voice traffic channel must be released by the mobile station does the network signal to the mobile station to release the channel.
    Type: Grant
    Filed: June 6, 2000
    Date of Patent: July 11, 2006
    Assignee: Nokia Corporation
    Inventor: Raino Lintulampi
  • Patent number: 7072831
    Abstract: Enhanced estimation of the noise component of a signal is accomplished by using a plurality of filters. Each filter provides an estimate of a minimum sample in a sample set that includes a plurality of signal samples. A comparator, coupled to the plurality of filters, successively compares the estimates among the plurality of filters, and selects the signal estimate having the lowest magnitude. The selected signal estimate represents an enhanced estimate of the noise component of the signal.
    Type: Grant
    Filed: June 30, 1998
    Date of Patent: July 4, 2006
    Assignee: Lucent Technologies Inc.
    Inventor: Walter Etter
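The filter-and-comparator arrangement of patent 7072831 can be illustrated with a small sketch. The abstract does not say how the sample sets are formed, so this example assumes each filter sees one contiguous block of the input and reports the smallest magnitude in it; `block_minima` and `noise_estimate` are hypothetical names.

```python
def block_minima(samples, num_filters):
    """Split samples into equal contiguous blocks and return each block's minimum magnitude.

    Trailing samples that do not fill a whole block are ignored (a simplification).
    """
    size = len(samples) // num_filters
    if size == 0:
        raise ValueError("need at least one sample per filter")
    return [
        min(abs(s) for s in samples[i * size:(i + 1) * size])
        for i in range(num_filters)
    ]


def noise_estimate(samples, num_filters):
    """Comparator stage: select the estimate with the lowest magnitude."""
    return min(block_minima(samples, num_filters))
```

For the input `[3, -1, 4, 2, -5, 6]` with three filters, the per-block minima are 1, 2, and 5, and the comparator selects 1 as the noise estimate.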
  • Patent number: 7072828
    Abstract: Problems of front-end clipping and excessively long holdover times in digitally encoded speech are resolved by the introduction of a queue at the transmitting end of a digital conversation. Samples are transmitted from the queue until an interval of low energy samples is encountered, at which time samples are not transmitted from the queue until energy samples are present.
    Type: Grant
    Filed: May 13, 2002
    Date of Patent: July 4, 2006
    Assignee: Avaya Technology Corp.
    Inventor: Norman W. Petty
  • Patent number: 7065486
    Abstract: Various time-domain noise suppression methods and devices for suppressing a noise signal in a speech signal are provided. For example, a time-domain noise suppression method comprises estimating a plurality of linear prediction coefficients for the speech signal, generating a prediction error estimate based on the plurality of prediction coefficients, generating an estimate of the speech signal based on the plurality of linear prediction coefficients, using a voice activity detector to determine voice activity in the speech signal, updating a plurality of noise parameters based on the prediction error and if the voice activity detector determines no voice activity in the speech signal, generating an estimate of the noise signal based on the plurality of noise parameters, and passing the speech signal through a filter derived from the estimate of the noise signal and the estimate of the speech signal to generate a clean speech signal estimate.
    Type: Grant
    Filed: April 11, 2002
    Date of Patent: June 20, 2006
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Jes Thyssen
  • Patent number: 7058889
    Abstract: A method of synchronizing visual information with audio playback includes the steps of selecting a desired audio file from a list stored in memory associated with a display device, sending a signal from the display device to a separate playback device to cause the separate playback device to start playing the desired audio file; and displaying visual information associated with the desired audio file on the display device in accordance with timestamp data such that the visual information is displayed synchronously with the playing of the desired audio file, wherein the commencement of playing the desired audio file and the commencement of the displaying step are a function of the signal from the display device.
    Type: Grant
    Filed: November 29, 2001
    Date of Patent: June 6, 2006
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Karen I. Trovato, Dongge Li, Muralidharan Ramaswamy
  • Patent number: 7024353
    Abstract: In a distributed voice recognition system, a back-end pattern matching unit 27 can be informed of voice activity detection information as developed through use of a back-end voice activity detector 25. Although no specific voice activity detection information is developed or forwarded by the front-end of the system, precursor information as developed at the back-end can be used by the voice activity detector to nevertheless ascertain with relative accuracy the presence or absence of voice in a given set of corresponding voice recognition features as developed by the front-end of the system.
    Type: Grant
    Filed: August 9, 2002
    Date of Patent: April 4, 2006
    Assignee: Motorola, Inc.
    Inventor: Tenkasi Ramabadran
  • Patent number: 7016834
    Abstract: In general, this invention concerns speech encoding and decoding used in digital radio systems and a method by which the processing capacity required can be reduced in a telecommunication system using discontinuous transmission between a transmitter and receiver. In particular, the method according to the invention is used to match two telecommunication systems using different encoding methods between the transmitter and receiver. In the method, the signals transmitted by the transmitter are made suitable for the receiver in the signal path so that in the first step, at least one information parameter comprising at least two content identifiers is formed for each data frame of the data parameters (101) received. In the next step, data corresponding to the original data is synthesized from the data parameters (101) of the received frames, after which the synthesized data is transmitted for recoding with an encoding method suitable for the receiver.
    Type: Grant
    Filed: July 14, 2000
    Date of Patent: March 21, 2006
    Assignee: Nokia Corporation
    Inventor: Ari Lakaniemi
  • Patent number: 7013267
    Abstract: A communication system includes a destination that receives voice samples and a voice parameter generated by a source. The destination uses the voice samples and voice parameter to reconstruct voice information in response to a packet loss. The destination may reconstruct voice information from multiple sources.
    Type: Grant
    Filed: July 30, 2001
    Date of Patent: March 14, 2006
    Assignee: Cisco Technology, Inc.
    Inventors: Pascal H. Huart, Luke K. Surazski
  • Patent number: 6999920
    Abstract: Method for the reduction of echo and/or noise signals in TK systems for the transmission of useful acoustic signals, in which, when a silence interval is present, the distorted useful signal is modified by a time-dependent control signal ao(t) or by a control signal ao(k) cycled in the rhythm of a scan rate fT=1/T. The control signal ao(k) is varied in such manner that, during the presence of speech signals in the useful signals, the amplitude of the control signal ao(k) is set to a predetermined constant value co and, when a silence interval begins, the amplitude of the control signal ao(k) is reduced continuously from one sample value to the next in accordance with a recurrence formula in which ao(k+1) equals ao(k) multiplied by a constant factor smaller than 1. After the end of the silence interval, ao(k) is again set equal to co.
    Type: Grant
    Filed: November 21, 2000
    Date of Patent: February 14, 2006
    Assignee: Alcatel
    Inventors: Hans-Jürgen Matt, Michael Walker, Michael Maurer
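The behaviour of the control signal ao(k) in patent 6999920 can be sketched per sample: hold the constant value co during speech, and multiply by a fixed factor below 1 on every silent sample. The function name `control_signal`, the per-sample speech flags, and the parameter name `decay` for the multiplicative factor are assumptions made for this illustration.

```python
def control_signal(speech_flags, c0, decay):
    """Return ao(k) for each sample, given per-sample speech/silence flags.

    During speech ao(k) is held at c0; during silence it shrinks by the
    factor `decay` (0 < decay < 1) from one sample to the next.
    """
    if not 0 < decay < 1:
        raise ValueError("decay must lie strictly between 0 and 1")
    a = c0
    out = []
    for is_speech in speech_flags:
        a = c0 if is_speech else a * decay
        out.append(a)
    return out
```

For example, `control_signal([True, False, False, True], 1.0, 0.5)` gives `[1.0, 0.5, 0.25, 1.0]`: the gain decays through the silence interval and snaps back to co when speech resumes.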
  • Patent number: 6999921
    Abstract: To address the need for reducing audio overhang in wireless communication systems (e.g., 100), the present invention provides for the deletion of silent frames before they are converted to audio by the listening devices. The present invention only provides for the deletion of a portion of the silent frames that make up a period of silence or low voice activity in the speaker's audio. Voice frames that make up periods of silence less than a given length of time are not deleted.
    Type: Grant
    Filed: December 13, 2001
    Date of Patent: February 14, 2006
    Assignee: Motorola, Inc.
    Inventors: John M. Harris, Philip J. Fleming, Joseph Tobin
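The selective deletion described in patent 6999921 (short silences untouched, only part of a longer silence removed) can be sketched as follows. The abstract does not state how many frames are deleted, so this example assumes a long run of silent frames is cut down to a fixed number of leading frames; `trim_silence`, `min_run`, and `keep` are hypothetical names.

```python
def trim_silence(frames, min_run, keep):
    """Shorten long runs of silent frames.

    `frames` is a list of (payload, is_silent) tuples. Runs of silence shorter
    than `min_run` frames are left intact; longer runs keep only their first
    `keep` frames (an assumed policy).
    """
    out = []
    run = []
    # A non-silent sentinel at the end flushes a trailing run of silence.
    for frame in frames + [(None, False)]:
        if frame[1]:
            run.append(frame)
            continue
        if len(run) >= min_run:
            run = run[:keep]
        out.extend(run)
        run = []
        out.append(frame)
    return out[:-1]  # drop the sentinel
```

Keeping a few silent frames from each long run preserves a natural pause while still reducing the audio overhang.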
  • Patent number: 6980950
    Abstract: An utterance detector for speech recognition is described. The detector consists of two components. The first component makes a speech/non-speech decision for each incoming speech frame. The decision is based on a frequency-selective autocorrelation function obtained by speech power spectrum estimation, frequency filtering, and inverse Fourier transform. The second component makes the utterance detection decision, using a state machine that describes the detection process in terms of the speech/non-speech decision made by the first component.
    Type: Grant
    Filed: September 21, 2000
    Date of Patent: December 27, 2005
    Assignee: Texas Instruments Incorporated
    Inventors: Yifan Gong, Yu-Hung Kao
  • Patent number: 6947412
    Abstract: A method of improving sound playback of digitized speech signals transmitted to a telecommunications terminal at the beginning of a telephone call set up over a communications network where the signals are transmitted in the form of packets, and in particular at the beginning of a VOIP call set up under Internet protocol, at the time said call is set up from a sending telecommunications terminal fitted with voice activity detection means so as to be capable of transmitting only those digitized signal packets that contain speech taken from a set of sound signals that are suitable for being transmitted in the form of packets after the sound has been digitized and encoded in the sending terminal. Signal packets are transmitted from the digitizing and encoding means during an initial call optimization stage without taking account of whether or not any speech signals are present. The invention also provides telecommunications hardware implementing the method.
    Type: Grant
    Filed: December 19, 2000
    Date of Patent: September 20, 2005
    Assignee: Alcatel
    Inventors: Luc Attimont, Jannick Bodin
  • Patent number: 6885987
    Abstract: At an audio source, pause information is added to audio data, the combination of which is subsequently packetized. The resulting packets are transmitted to an audio destination via a network in which different packets may be subjected to varying levels of delay. At the audio destination, the pause information may be used to insert pauses at appropriate times to accommodate the occurrence of delays in packet delivery. In one embodiment, pauses are inserted based on a hierarchy of pause types. During pauses, audio filler information may be injected. In this manner, the effects of variable network delays upon reconstructed audio may be mitigated.
    Type: Grant
    Filed: February 9, 2001
    Date of Patent: April 26, 2005
    Assignee: fastmobile, Inc.
    Inventors: Dale R. Buchholz, Bashar Jano, Ira Gerson
  • Patent number: 6856961
    Abstract: The invention provides a speech coding system with input signal transformation that may reduce or essentially eliminate “silence noise” from the input or speech signal. The speech coding system may comprise an encoder disposed to receive an input signal. The encoder ramps the input signal to a zero-level when a portion of the input signal comprises silence noise.
    Type: Grant
    Filed: February 13, 2001
    Date of Patent: February 15, 2005
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Jes Thyssen
  • Patent number: 6847930
    Abstract: Voice activity is detected by comparing an analog signal with two voltage thresholds and producing data representing the energy of the signal. The data, in binary form, is compared with thresholds to determine voice activity. In accordance with another aspect of the invention, the thresholds are adjusted based upon statistical information. In accordance with another aspect of the invention, the data can be weighted to provide an indication of the quasi-RMS energy of an input signal. The input signal itself is not converted into digital form yet the data derived from the input signal has high resolution.
    Type: Grant
    Filed: January 25, 2002
    Date of Patent: January 25, 2005
    Assignee: Acoustic Technologies, Inc.
    Inventors: Justin L. Allen, Steven M. Domer
  • Patent number: 6826600
    Abstract: Mechanisms and techniques allow computer systems to create and exchange uniquely identified shared objects. Using this invention, a client computer system can operate client software to generate local object definitions in a local object specification. To assure that the local object definitions created by the client are uniquely identifiable by this client, as well as by a server and possibly other clients which may require access to such object definitions (e.g., other clients in a collaboration software system), the invention allows the client to send the local object specification to the server for unique identification of the object definitions. The server receives the local object specification containing the local object definitions created by the client and can convert each local object definition within the local object specification to a global object definition in a global object specification.
    Type: Grant
    Filed: November 2, 2000
    Date of Patent: November 30, 2004
    Assignee: Cisco Technology, Inc.
    Inventor: Paul J. Russell
  • Patent number: 6826527
    Abstract: A decoder for code excited LP encoded frames with both adaptive and fixed codebooks; erased frame concealment uses muted repetitive excitation, threshold-adapted bandwidth expanded repetitive synthesis filter, and jittered repetitive pitch lag.
    Type: Grant
    Filed: November 3, 2000
    Date of Patent: November 30, 2004
    Assignee: Texas Instruments Incorporated
    Inventor: Takahiro Unno
  • Publication number: 20040225494
    Abstract: In a digital voice communication system, a transmitting radio (100/102) checks for activity on a channel. If activity is not detected on the channel, the transmitting radio transmits at least one voice frame on the channel. If activity is detected on the channel, the transmitting radio temporarily buffers the at least one voice frame and waits a period of time prior to re-checking the channel for activity. If activity is still detected on the channel, the transmitting radio repeats the step of waiting until activity is no longer detected on the channel; otherwise, the transmitting radio transmits the at least one voice frame that was temporarily buffered.
    Type: Application
    Filed: May 7, 2003
    Publication date: November 11, 2004
    Inventors: Kevin B. Mayginnes, Darrell J. Stogner
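The check, buffer, wait, and re-check loop of publication 20040225494 can be sketched in a few lines. The callables `channel_busy` and `transmit`, the `wait_s` interval, and the injectable `sleep` function are all hypothetical stand-ins for the radio's hardware interfaces, introduced only for this illustration.

```python
import time


def send_when_clear(frame, channel_busy, transmit, wait_s=0.01, sleep=time.sleep):
    """Transmit `frame` once the channel is idle; return how many waits were needed.

    While activity is detected, the frame simply stays buffered in the local
    variable and the function waits before re-checking the channel.
    """
    waits = 0
    while channel_busy():
        sleep(wait_s)  # wait a period of time before re-checking
        waits += 1
    transmit(frame)
    return waits
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a radio implementation would presumably use a hardware timer instead.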
  • Patent number: 6816832
    Abstract: A comfort noise block that includes a hangover period and comfort noise parameters is transmitted in such a manner that it is not interrupted by other messages, such as FACCH messages. This is accomplished in a mobile station by a determination of whether any FACCH messages are required to be transmitted. If such FACCH messages exist, a further determination may be made as to which transmission can be made in the shortest time (i.e., the FACCH message or messages or the comfort noise parameters message), and this transmission is made first. In any event the comfort noise parameters block is transmitted without interruption. In a further embodiment of this invention the comfort noise parameters message is transmitted by being concatenated with another message, such as a neighbor channel measurement results message, so as to reduce overhead, conserve bandwidth, and reduce power consumption.
    Type: Grant
    Filed: June 11, 2001
    Date of Patent: November 9, 2004
    Assignee: Nokia Corporation
    Inventors: Seppo Alanara, Pekka Kapanen
  • Patent number: 6807525
    Abstract: A method to reduce the amount of bandwidth used in the transmission of digitized voice packets is described. The method is used to reduce the number of transmitted packets by suspending transmission during periods of silence or when only noise is present. The system determines if a background noise update is warranted based on human auditory perception factors instead of an artificial limiter on excessive silence insertion descriptor packets. The system searches for characteristics in the perceptual changes of background noise instead of analyzing speech for improved audio compression. The invention weighs factors affecting the perception of sound including frequency masking, temporal masking, loudness perception based on tone, and auditory perception differential based on tone.
    Type: Grant
    Filed: October 31, 2000
    Date of Patent: October 19, 2004
    Assignee: Telogy Networks, Inc.
    Inventors: Dunling Li, Gokhan Sisli, Daniel Thomas
  • Patent number: 6801894
    Abstract: A speech synthesizer includes a data memory having a plurality of address areas, which stores a plurality of phases in the address areas and an address designating circuit designating one of the address areas based on the phase signal. Further, a speech synthesizer includes a speech synthesizing circuit generating a speech synthesizing signal corresponding to the phase, which is stored in the designated area, a digital/analog converter transforming the speech synthesizing signal to an analog signal having amplitude, and a counter setting a period of silence. Furthermore, a speech synthesizer includes a silence-input circuit being connected between the speech synthesizing circuit and the digital/analog converter, which supplies a predetermined voltage to the digital/analog converter for the period that is set by the counter.
    Type: Grant
    Filed: March 22, 2001
    Date of Patent: October 5, 2004
    Assignee: Oki Electric Industry Co., Ltd.
    Inventors: Yoshihisa Nakamura, Hiroaki Matsubara
  • Publication number: 20040193409
    Abstract: Systems and methods for dynamically analyzing temporality in an individual's speech in order to selectively categorize the speech fluency of the individual and/or to selectively provide speech training based on the results of the dynamic analysis. Temporal variables in one or more speech samples are dynamically quantified. The temporal variables in combination with a dynamic process, which is derived from analyses of temporality in the speech of native speakers and language learners, are used to provide a fluency score that identifies a proficiency of the individual. In some implementations, temporal variables are measured instantaneously.
    Type: Application
    Filed: December 11, 2003
    Publication date: September 30, 2004
    Inventors: Lynne Hansen, Joshua Rowe
  • Patent number: 6799161
    Abstract: A speech coding apparatus having a speech input unit for receiving input speech, a speech coding rate selector for selecting an appropriate speech coding rate according to the power of the input speech, a speech analyzer for processing the input speech to estimate a transfer function of the speaker's oral cavity, and a speech coding unit forming a synthesis filter based on the transfer function of the oral cavity. The speech coding unit also codes an excitation signal of the synthesis filter on the basis of an estimation result supplied by the speech analyzer. A gain suppressor interposed between the speech input unit and the speech coding unit suppresses the gain of a signal supplied from the speech input unit to the speech coding unit during an unvoiced period according to information from the speech coding rate selector.
    Type: Grant
    Filed: January 15, 2002
    Date of Patent: September 28, 2004
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Atsushi Yokoyama
  • Patent number: 6785644
    Abstract: With respect to data having periodicity to be compressed, windows of the same size are set for every two sections according to an interval of peaks appearing substantially periodically and processing for sorting sample data alternately among the set windows of the same size is sequentially performed, whereby a frequency of data having periodicity is replaced with an approximately half frequency without damaging reproducibility to original data at all to make it possible to apply compression processing to data of the replaced low frequency. If this sorting processing is applied to compression processing having a characteristic that a compression ratio is not increased in a high-frequency region, it becomes possible to improve a compression ratio without damaging a quality of reproduced data by decompression at all.
    Type: Grant
    Filed: December 16, 2002
    Date of Patent: August 31, 2004
    Assignee: Yasue Sakai
    Inventor: Yukio Koyanagi
  • Patent number: 6782363
    Abstract: A method and apparatus for performing real-time endpoint detection for use in automatic speech recognition. A filter is applied to the input speech signal and the filter output is then evaluated with use of a state transition diagram (i.e., a finite state machine). The filter is advantageously designed in light of several criteria in order to increase the accuracy and robustness of detection. The state transition diagram advantageously has three states. The endpoints which are detected may then be advantageously applied to the problem of energy normalization of the speech portion of the signal.
    Type: Grant
    Filed: May 4, 2001
    Date of Patent: August 24, 2004
    Assignee: Lucent Technologies Inc.
    Inventors: Chin-Hui Lee, Qi P. Li, Jinsong Zheng, Qiru Zhou
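    The filter-plus-state-machine idea in the abstract above can be illustrated with a minimal sketch. This is not the patented method: the state names, the two thresholds (`t_upper`, `t_lower`), and the `hang` frame count are all assumptions chosen for illustration, and the input is assumed to be a precomputed sequence of per-frame filter outputs.

```python
from enum import Enum


class State(Enum):
    SILENCE = 0
    IN_SPEECH = 1
    LEAVING_SPEECH = 2


def detect_endpoints(filter_out, t_upper=1.0, t_lower=-1.0, hang=3):
    """Return (start, end) frame-index pairs from per-frame filter outputs.

    Hypothetical parameters: t_upper marks a rising edge, t_lower a falling
    edge, and hang is how many frames must pass without a new rising edge
    before the falling edge is accepted as the end of speech.
    """
    state = State.SILENCE
    start = None
    candidate_end = None
    count = 0
    segments = []
    for i, v in enumerate(filter_out):
        if state is State.SILENCE:
            if v >= t_upper:              # rising edge: speech begins
                state = State.IN_SPEECH
                start = i
        elif state is State.IN_SPEECH:
            if v <= t_lower:              # falling edge: possible end
                state = State.LEAVING_SPEECH
                candidate_end = i
                count = 1
        else:                             # State.LEAVING_SPEECH
            if v >= t_upper:              # speech resumed, discard candidate
                state = State.IN_SPEECH
            else:
                count += 1
                if count >= hang:         # end confirmed
                    segments.append((start, candidate_end))
                    state = State.SILENCE
    # Close any segment still open when the input ends.
    if state is State.IN_SPEECH:
        segments.append((start, len(filter_out) - 1))
    elif state is State.LEAVING_SPEECH:
        segments.append((start, candidate_end))
    return segments
```

    The detected (start, end) pairs could then delimit the frames over which energy normalization is computed, as the abstract suggests.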
  • Patent number: 6766291
    Abstract: The invention relates to a method and apparatus for controlling the transition of a bypass capable codec between operative modes, based on a certain characteristic of the audio data signal processed by the codec. The apparatus relies on a control signal to determine when the codec will switch from one mode to another. This control signal reflects a characteristic of the audio data signal received at the apparatus, such as the type of speech activity or the format of the audio data signal. When in the active (non-bypass) mode, the apparatus relies on an additional control signal to switch to the inactive (bypass) mode. This additional control signal is received from a control unit at a remote codec that indicates that the remote codec is also bypass capable, hence the decoder at the first codec and the encoder at the remote codec can switch to the inactive mode to pass between them the compressed data frames.
    Type: Grant
    Filed: June 18, 1999
    Date of Patent: July 20, 2004
    Assignee: Nortel Networks Limited
    Inventors: Chung Cheung C. Chu, Rafi Rabipour, David G. Sloan
  • Publication number: 20040138880
    Abstract: Estimating a signal power in a compressed audio signal [A] is provided, the audio signal comprising blocks of quantized samples, a given block being provided with a set of scale factors. The estimating is performed by extracting the set of scale factors from the compressed audio signal, and estimating the signal power in the given block based on a combination of the scale factors. Advantageously, the extracting step and estimating step are performed on only a sub-set of the set of scale factors. The signal power estimation may be used in a silence detector (11) for use in a receiver (1).
    Type: Application
    Filed: November 6, 2003
    Publication date: July 15, 2004
    Inventors: Alessio Stella, Jan Alexis Daniel Nesvadba, Mauro Barbieri, Freddy Snijder
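    As a rough illustration of the abstract above, a block's power can be estimated from its scale factors alone, without decoding the samples. The sketch below assumes the scale factors are already extracted as linear amplitude values, combines them as a mean of squares, and uses an arbitrary threshold; the function names, the combination rule, and the threshold are assumptions and are not taken from the application.

```python
def estimate_block_power(scale_factors, subset=None):
    """Estimate block power as the mean squared scale factor.

    subset is an optional list of indices, so that only some of the
    scale factors are examined (hypothetical parameter).
    """
    if subset is None:
        chosen = list(scale_factors)
    else:
        chosen = [scale_factors[i] for i in subset]
    if not chosen:
        return 0.0
    return sum(s * s for s in chosen) / len(chosen)


def is_silent(scale_factors, threshold=1e-4, subset=None):
    """Declare a block silent when its estimated power is below threshold."""
    return estimate_block_power(scale_factors, subset) < threshold
```

    Restricting `subset` to a few sub-bands trades accuracy for less work per block, which is the motivation the abstract gives for using only part of the set.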
  • Publication number: 20040133420
    Abstract: Compressed signals contain amplitude data (for example, scale factors in an MPEG frame) which can be examined to enable a decision to be taken on whether the signal contains information or not (e.g. silence in the case of audio or no image in the case of video).
    Type: Application
    Filed: December 15, 2003
    Publication date: July 8, 2004
    Inventors: Gavin Robert Ferris, Michael Vincent Woodward
  • Publication number: 20040133421
    Abstract: Acoustic noise suppression is provided in multiple-microphone systems using Voice Activity Detectors (VAD). A host system receives acoustic signals via multiple microphones. The system also receives information on the vibration of human tissue associated with human voicing activity via the VAD. In response, the system generates a transfer function representative of the received acoustic signals upon determining that voicing information is absent from the received acoustic signals during at least one specified period of time. The system removes noise from the received acoustic signals using the transfer function, thereby producing a denoised acoustic data stream.
    Type: Application
    Filed: September 18, 2003
    Publication date: July 8, 2004
    Inventors: Gregory C. Burnett, Eric F. Breitfeller
  • Patent number: 6754620
    Abstract: A system and method is provided for rendering data indicative of delays associated with enabling and/or disabling an analog-to-digital conversion system employed by a telephony communication network. The system of the present invention utilizes a display device and an interface manager. The interface manager receives data indicative of power levels at various frequencies and times of signals received by a transceiver that is communicating via the conventional telephony communication network. The interface manager then renders a graphical display via the display device based on the received data. The graphical display may include clusters, in which each of the clusters is associated with a particular range of power levels. By analyzing the clusters, a user can determine the delays associated with enabling and/or disabling the analog-to-digital conversion system. The graphical display may also include indicators that may be used to determine the foregoing delays.
    Type: Grant
    Filed: March 29, 2000
    Date of Patent: June 22, 2004
    Assignee: Agilent Technologies, Inc.
    Inventor: Samuel M Bauer

  • Publication number: 20040107092
    Abstract: A digital line transmission unit can switch between speech codecs during the same call to balance effective use of the line against high sound quality, without the switching causing discomfort to the user. Its encoder includes a first speech codec (7) with a high sound quality and a high bit rate, and a second speech codec (8) with a reasonable sound quality but a low bit rate. It switches between these speech codecs in response to control information that an operation monitoring controller (4) obtains by making a decision as to the traffic volume of the bearer line (111). The switching between the speech codecs is made during a speech pause that a speech burst detector (31) in a signal detector (3) detects in the input speech signal.
    Type: Application
    Filed: October 6, 2003
    Publication date: June 3, 2004
    Inventor: Yoshihisa Harada
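    The deferred switching described in the abstract above can be sketched as follows: a switch request is held until the first frame in which no speech is detected. The class name, the codec labels, and the per-frame `speech_active` flag are assumptions for illustration; the application does not specify this interface.

```python
class CodecSwitcher:
    """Hold a requested codec change until a speech pause occurs."""

    def __init__(self, initial="high_rate"):
        self.active = initial     # codec currently in use
        self.pending = None       # codec waiting for a pause, if any

    def request(self, codec):
        """Record a switch request, e.g. after a traffic-volume decision."""
        self.pending = codec if codec != self.active else None

    def on_frame(self, speech_active):
        """Apply a pending switch only when the frame contains no speech.

        Returns the codec to use for this frame.
        """
        if self.pending is not None and not speech_active:
            self.active = self.pending
            self.pending = None
        return self.active
```

    For example, a request for the low-rate codec made in the middle of a speech burst takes effect only at the next frame flagged as a pause, so the change in sound quality does not occur mid-utterance.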
  • Publication number: 20040098253
    Abstract: A method and system for allowing a user to interface to an interactive voice response system via natural language commands.
    Type: Application
    Filed: October 14, 2003
    Publication date: May 20, 2004
    Inventors: Bruce Balentine, Rex Stringham, Ralph Melaragno, Justin Munroe