Silence Decision Patents (Class 704/215)
-
Patent number: 7302388Abstract: Method and apparatus detect voice activity for spectrum or power efficiency purposes. The method determines and tracks the instant, minimum and maximum power levels of the input signal. The method selects a first range of signals to be considered as noise, and a second range of signals to be considered as voice. The method uses the selected voice, noise and power levels to calculate a log likelihood ratio (LLR). The method uses the LLR to determine a threshold, then uses the threshold for differentiating between noise and voice.Type: GrantFiled: February 17, 2004Date of Patent: November 27, 2007Assignee: Ciena CorporationInventors: Song Zhang, Eric Verreault
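The power-tracking idea in this abstract can be illustrated with a minimal sketch. It tracks the instant, minimum and maximum frame power and places a decision threshold between the two extremes. It does not reproduce the log likelihood ratio computation of the patent; a plain interpolated threshold is swapped in, and the names `MinMaxVad`, `fraction`, `leak_db` and `min_range_db` are hypothetical.

```python
import math


def frame_power_db(frame):
    """Mean power of one frame in dB, floored to avoid log of zero."""
    mean_sq = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(mean_sq, 1e-12))


class MinMaxVad:
    """Illustrative voice activity detector based on min/max power tracking.

    Assumption: the threshold is a fixed fraction of the tracked power
    range, standing in for the LLR-derived threshold of the abstract.
    """

    def __init__(self, fraction=0.5, leak_db=0.05, min_range_db=6.0):
        self.fraction = fraction          # threshold position between min and max
        self.leak_db = leak_db            # slow drift of the trackers per frame
        self.min_range_db = min_range_db  # below this spread, everything is noise
        self.min_db = None
        self.max_db = None

    def is_voice(self, frame):
        p = frame_power_db(frame)
        if self.min_db is None:
            # The first frame only initialises the trackers.
            self.min_db = self.max_db = p
            return False
        # Each tracker jumps to a new extreme or drifts slowly toward the other.
        self.min_db = min(p, self.min_db + self.leak_db)
        self.max_db = max(p, self.max_db - self.leak_db)
        spread = self.max_db - self.min_db
        threshold = self.min_db + self.fraction * spread
        return spread > self.min_range_db and p > threshold
```

Feeding a few quiet frames followed by a loud one should flag only the loud frame as voice; the leak lets the trackers adapt if the noise floor changes.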
-
Patent number: 7299173 Abstract: Speech presence is detected by first bandpass filtering (141, 143, 145) the speech to split it into banks of sub-bands. A matrix of shift registers (150) stores each sub-band of speech. A power determining circuit (259) then determines individual power measurements of the speech stored in each shift register element. A variance combining circuit (160) combines the individual power measurements to provide a variance for the individual shift registers. A comparator circuit (170) finally compares the variance with at least one threshold to indicate whether speech is detected. Type: Grant Filed: January 30, 2002 Date of Patent: November 20, 2007 Assignee: Motorola Inc. Inventors: Changxue Ma, Mark Randolph
-
Publication number: 20070233471Abstract: A speech processing apparatus includes a sound input unit that receives an input of a sound including a voice of one of an operator and a person other than the operator; a designation-duration accepting unit that accepts a designation-duration designated by the operator as a time interval that is a target of a speech processing within the input sound; a voice-duration detecting unit that detects a voice-duration that is a time interval in which the voice is present from the input sound; a speaker determining unit that determines whether a speaker of the voice is the operator or the person based on the input sound; and a deciding unit that detects an overlapping period between the designation-duration and the voice-duration, and decides that the voice-duration including the overlapping period is a processing duration, when the overlapping period is detected and the speaker is determined to be the person.Type: ApplicationFiled: October 17, 2006Publication date: October 4, 2007Applicant: Kabushiki Kaisha ToshibaInventor: Masahide ARIU
-
Patent number: 7272552Abstract: The present invention is a system and method that improves upon voice activity detection by packetizing actual noise signals, typically background noise. In accordance with the present invention an access network receives an input voice signal (including noise) and converts the input voice signal into a packetized voice signal. The packetized voice signal is transmitted via a network to an egress network. The egress network receives the packetized voice signal, converts the packetized voice signal into an output voice signal, and outputs the output voice signal. The egress network also extracts and stores noise packets from the received packetized voice signal and converts the packetized noise signal into an output noise signal. When the access network ceases to receive the input voice signal while the call is still ongoing, the access network instructs the egress network to continually output the output noise signal.Type: GrantFiled: December 27, 2002Date of Patent: September 18, 2007Assignee: AT&T Corp.Inventors: James H James, Joshua Hal Rosenbluth
-
Patent number: 7243063 Abstract: A method segments an audio signal including frames into non-speech and speech segments. First, high-dimensional spectral features are extracted from the audio signal. The high-dimensional features are then projected non-linearly to low-dimensional features that are subsequently averaged using a sliding window and weighted averages. A linear discriminant is applied to the averaged low-dimensional features to determine a threshold separating the low-dimensional features. The linear discriminant can be determined from a Gaussian mixture or a polynomial applied to a bimodal histogram distribution of the low-dimensional features. Then, the threshold can be used to classify the frames into either non-speech or speech segments. Speech segments having a very short duration can be discarded, and the longer speech segments can be further extended. In batch-mode or real-time the threshold can be updated continuously. Type: Grant Filed: July 17, 2002 Date of Patent: July 10, 2007 Assignee: Mitsubishi Electric Research Laboratories, Inc. Inventors: Bhiksha Ramakrishnan, Rita Singh
-
Patent number: 7233895Abstract: A low energy detector maintains a sorted list that defines to the circuit removing samples from the queue the location and energy of all samples that fall below the predetermined energy level. This allows a circuit removing samples to execute an algorithm that allows samples to be deleted in accordance with a predetermined pattern.Type: GrantFiled: May 30, 2002Date of Patent: June 19, 2007Assignee: Avaya Technology Corp.Inventor: Norman W. Petty
-
Patent number: 7228271Abstract: The telephone apparatus of the present invention comprises a first voice band expander for generating a voiced signal frequency component by shifting the frequency of the voice signal received, a second voice band expander for generating a voiceless signal frequency component by shifting the frequency of the voice signal received, and a voice composer for composing the voice signal received, the output of the first voice band expander, and the output of the second voice band expander, which is able to output clear voices in aural communication.Type: GrantFiled: December 23, 2002Date of Patent: June 5, 2007Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Toshimichi Tokuda, Takashi Kimura
-
Patent number: 7203640Abstract: An input signal is input via an input part. A plurality of signal section candidate detecting parts having different detection algorithms detect an intended signal section candidate and a noise signal section candidate from the input signal. A signal section classifying part is notified of detection results from the respective signal section candidate detecting parts, and classifies the respective signal section candidates based on a combination of the detection results.Type: GrantFiled: October 30, 2002Date of Patent: April 10, 2007Assignee: Fujitsu LimitedInventors: Kentaro Murase, Takuya Noda, Kazuhiro Watanabe
-
Patent number: 7203638 Abstract: A source-controlled Variable bit-rate Multi-mode WideBand (VMR-WB) codec, having a mode of operation that is interoperable with the Adaptive Multi-Rate wideband (AMR-WB) codec, the codec comprising: at least one Interoperable full-rate (I-FR) mode, having a first bit allocation structure based on one of the AMR-WB codec coding types; and at least one comfort noise generator (CNG) coding type for encoding inactive speech frames, having a second bit allocation structure based on the AMR-WB SID_UPDATE coding type. Type: Grant Filed: January 19, 2005 Date of Patent: April 10, 2007 Assignee: Nokia Corporation Inventors: Milan Jelinek, Redwan Salami
-
Patent number: 7177810Abstract: A method and apparatus for finding endpoints in speech by utilizing information contained in speech prosody. Prosody denotes the way speakers modulate the timing, pitch and loudness of phones, words, and phrases to convey certain aspects of meaning; informally, prosody includes what is perceived as the “rhythm” and “melody” of speech. Because speakers use prosody to convey units of speech to listeners, the method and apparatus performs endpoint detection by extracting and interpreting the relevant prosodic properties of speech.Type: GrantFiled: April 10, 2001Date of Patent: February 13, 2007Assignee: SRI InternationalInventors: Elizabeth Shriberg, Harry Bratt, Mustafa K. Sonmez
-
Patent number: 7162418 Abstract: A buffering process for real-time digital audio is provided to mitigate the effect of network “jitter” from inconsistent network packet delivery rates. The buffering algorithm is particularly useful for audio data including distinct bursts separated by silence, such as speech. The process holds incoming audio packets in a queue until either: (a) the buffer contents meet a predetermined threshold; or (b) the end packet of a burst is received. The result is that silent periods between bursts may expand or decrease relative to the original audio pattern, allowing cumulative jitter to be played out as silence. The threshold is sized such that the deviation in silence is unnoticeable by a listener. In an optional embodiment, the process periodically adjusts the threshold to adapt to network conditions. Type: Grant Filed: November 15, 2001 Date of Patent: January 9, 2007 Assignee: Microsoft Corporation Inventors: Ivan J. Leichtling, Ido Ben-Shachar
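The hold-and-release rule in this abstract, releasing on (a) a fill threshold or (b) the end packet of a burst, can be sketched roughly as follows. The class name `BurstBuffer`, the `end_of_burst` flag, and the choice to resume holding once the queue drains are assumptions made for illustration, not details taken from the patent.

```python
from collections import deque


class BurstBuffer:
    """Illustrative playout buffer for bursty audio such as speech."""

    def __init__(self, threshold):
        self.threshold = threshold  # packets to accumulate before playout starts
        self.queue = deque()
        self.releasing = False

    def push(self, payload, end_of_burst=False):
        """Queue an incoming packet and decide whether playout may start."""
        self.queue.append(payload)
        # Condition (a): enough packets buffered; condition (b): burst ended.
        if end_of_burst or len(self.queue) >= self.threshold:
            self.releasing = True

    def pop(self):
        """Return the next packet to play, or None to play silence."""
        if not self.releasing:
            return None
        packet = self.queue.popleft()
        if not self.queue:
            # Assumption: once drained, hold again until a condition is met.
            self.releasing = False
        return packet
```

While `pop()` returns `None` the player emits silence, which is how accumulated jitter ends up stretching or shrinking the gaps between bursts rather than the speech itself.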
-
Patent number: 7155385Abstract: An estimate is made of the power of a speech portion of a speech signal that includes speech portions separated by non-speech portions, the power for the speech portion being estimated based on a power envelope that spans the speech portion. The gain of an automatic gain control is not adjusted during the speech portions.Type: GrantFiled: May 16, 2002Date of Patent: December 26, 2006Assignee: Comerica Bank, as Administrative AgentInventors: Alexander Berestesky, David E. Duehren
-
Patent number: 7136630Abstract: The present invention relates to a mobile set integrating a memory efficient data storage system for the real time recording of voice conversations, data transmission and the like. The data recorder has the capacity to selectively choose the most relevant time frames of a conversation for recording, while discarding time frames that only occupy additional space in memory without holding any conversational data. The invention executes a series of logic steps on each signal including a voice activity detector step, frame comparison step, and sequential recording step. A mobile set having a modified architecture for performing the methods of the present invention is also disclosed.Type: GrantFiled: December 22, 2000Date of Patent: November 14, 2006Assignee: Broadcom CorporationInventor: Fei Xie
-
Patent number: 7130797Abstract: A method of locating a talker in a reverberant environment comprises receiving multiple audio signals from a microphone array that include direct path audio signal and reverberation signal components. The direct path audio signal components of the multiple audio signals are detected and are used to weight the multiple audio signals. A position estimate based on the weighted audio signals is then calculated. Periods of speech activity are detected and a final position estimate is generated during the periods of speech activity.Type: GrantFiled: August 15, 2002Date of Patent: October 31, 2006Assignee: Mitel Networks CorporationInventors: Franck Beaucoup, Michael Tetelbaum
-
Patent number: 7130793 Abstract: When it is determined that a sample queue exceeds a first predefined level, samples being received from an IP switched network are modified such that samples are removed within the voiced region of the samples by removing whole pitch periods of samples. If the sample queue is below a second predefined number, additional samples are placed into the queue by analyzing voiced samples from the IP switched network and generating additional pitch periods of samples. Type: Grant Filed: April 5, 2002 Date of Patent: October 31, 2006 Assignee: Avaya Technology Corp. Inventors: Norman C. Chan, Sharmistha Sarkar Das
-
Patent number: 7127392 Abstract: The present invention is a device for and method of detecting voice activity. First, the AM envelope of a segment of a signal of interest is determined. Next, the number of times the AM envelope crosses a user-definable threshold is determined. If there are no crossings, the segment is identified as non-speech. Next, the number of points on the AM envelope within a user-definable range is determined. If there are less than a user-definable number of points within the range, the segment is identified as non-speech. Next, the mean, variance, and power ratio of the normalized spectral content of the AM envelope is found and compared to the same for known speech and non-speech. The segment is identified as being of the same type as the known speech or non-speech to which it most closely compares. These steps are repeated for each signal segment of interest. Type: Grant Filed: February 12, 2003 Date of Patent: October 24, 2006 Assignee: The United States of America as represented by the National Security Agency Inventor: David C. Smith
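The first two screening steps of this abstract (threshold crossings, then points within a range) can be sketched as below. The envelope here is a rectify-and-average approximation chosen for brevity, which is an assumption; the abstract does not say how the AM envelope is obtained. The final spectral comparison step is omitted, and all function names are hypothetical.

```python
def am_envelope(samples, window=8):
    """Approximate AM envelope: moving average of the rectified signal."""
    rectified = [abs(s) for s in samples]
    envelope = []
    acc = 0.0
    for i, value in enumerate(rectified):
        acc += value
        if i >= window:
            acc -= rectified[i - window]  # drop the sample leaving the window
        envelope.append(acc / min(i + 1, window))
    return envelope


def count_crossings(envelope, threshold):
    """Number of times the envelope moves across the threshold."""
    above = [e > threshold for e in envelope]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)


def prescreen(samples, threshold, low, high, min_points):
    """Return 'non-speech' if either screening step rejects the segment."""
    envelope = am_envelope(samples)
    if count_crossings(envelope, threshold) == 0:
        return "non-speech"
    in_range = sum(1 for e in envelope if low <= e <= high)
    if in_range < min_points:
        return "non-speech"
    # A full detector would go on to compare spectral statistics here.
    return "candidate"
```

A steady tone never crosses the threshold and is rejected immediately, while a signal that switches on and off produces crossings and passes on to the later stages.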
-
Patent number: 7092875Abstract: A first CN code (silence code) obtained by encoding a silence signal, which is contained in an input signal, by a silence compression function of a first speech encoding scheme is transcoded to a second CN code of a second speech encoding scheme without decoding the first CN code to a CN signal. For example, the first CN code is demultiplexed into a plurality of first element codes by a code demultiplexer, the first element codes are each transcoded to a plurality of second element codes that constitute the second CN code, and the second element codes obtained by this transcoding are multiplexed to output the second CN code.Type: GrantFiled: March 27, 2002Date of Patent: August 15, 2006Assignee: Fujitsu LimitedInventors: Yoshiteru Tsuchinaga, Yasuji Ota, Masanao Suzuki
-
Patent number: 7080007Abstract: An apparatus and a method for computing a Speech Absence Probability (SAP), and an apparatus and a method for removing noise by using the SAP computing device and method are provided.Type: GrantFiled: September 25, 2002Date of Patent: July 18, 2006Assignee: Samsung Electronics Co., Ltd.Inventors: Chang-yong Son, Vladimir Shin, Sang-ryong Kim
-
Patent number: 7075907 Abstract: A method for operating a wireless communications system includes a step of signalling, between a mobile station and a network, that the mobile station or the network is temporarily ceasing transmission of circuit switched information (DTX), which could be voice frames or data frames. For the case of voice, the method further includes a step, executed in the network, of determining if a current uplink or downlink voice traffic channel that is assigned to the mobile station can be retained by the mobile station, or whether the current uplink or downlink voice traffic channel must be released by the mobile station. Only if it is determined that the current uplink or downlink voice traffic channel must be released by the mobile station, does the network signal to the mobile station to release the channel. Type: Grant Filed: June 6, 2000 Date of Patent: July 11, 2006 Assignee: Nokia Corporation Inventor: Raino Lintulampi
-
Patent number: 7072831Abstract: Enhanced estimation of the noise component of a signal is accomplished by using a plurality of filters. Each filter provides an estimate of a minimum sample in a sample set that includes a plurality of signal samples. A comparator, coupled to the plurality of filters, successively compares the estimates among the plurality of filters, and selects the signal estimate having the lowest magnitude. The selected signal estimate represents an enhanced estimate of the noise component of the signal.Type: GrantFiled: June 30, 1998Date of Patent: July 4, 2006Assignee: Lucent Technologies Inc.Inventor: Walter Etter
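A rough sketch of the filter-and-comparator arrangement in this abstract: each "filter" reports the minimum magnitude over its own sample set, and a comparator keeps the lowest estimate. Splitting the input into equal consecutive blocks is an assumption made for illustration, and the names `block_minimum` and `noise_floor_estimate` are hypothetical.

```python
def block_minimum(samples):
    """One minimum filter: smallest magnitude within its sample set."""
    return min(abs(s) for s in samples)


def noise_floor_estimate(samples, num_filters=4):
    """Run several minimum filters and let a comparator pick the lowest.

    Assumption: the sample sets are equal-sized consecutive blocks.
    Returns the selected estimate and the per-filter estimates.
    """
    block = len(samples) // num_filters
    if block == 0:
        raise ValueError("need at least one sample per filter")
    estimates = [
        block_minimum(samples[i * block:(i + 1) * block])
        for i in range(num_filters)
    ]
    selected = estimates[0]
    for estimate in estimates[1:]:
        # Successive comparison, keeping the lowest magnitude seen so far.
        if estimate < selected:
            selected = estimate
    return selected, estimates
```

For example, `noise_floor_estimate([5, -3, 4, 2, -1, 6, 7, 3])` yields per-filter minima of 3, 2, 1 and 3, from which the comparator selects 1.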
-
Patent number: 7072828 Abstract: Problems of front-end clipping and excessively long holdover times in digitally encoded speech are resolved by the introduction of a queue at the transmitting end of a digital conversation. Samples are transmitted from the queue until an interval of low energy samples is encountered, at which time samples are not transmitted from the queue until energy samples are present. Type: Grant Filed: May 13, 2002 Date of Patent: July 4, 2006 Assignee: Avaya Technology Corp. Inventor: Norman W. Petty
-
Patent number: 7065486 Abstract: Various time-domain noise suppression methods and devices for suppressing a noise signal in a speech signal are provided. For example, a time-domain noise suppression method comprises estimating a plurality of linear prediction coefficients for the speech signal, generating a prediction error estimate based on the plurality of prediction coefficients, generating an estimate of the speech signal based on the plurality of linear prediction coefficients, using a voice activity detector to determine voice activity in the speech signal, updating a plurality of noise parameters based on the prediction error and if the voice activity detector determines no voice activity in the speech signal, generating an estimate of the noise signal based on the plurality of noise parameters, and passing the speech signal through a filter derived from the estimate of the noise signal and the estimate of the speech signal to generate a clean speech signal estimate. Type: Grant Filed: April 11, 2002 Date of Patent: June 20, 2006 Assignee: Mindspeed Technologies, Inc. Inventor: Jes Thyssen
-
Patent number: 7058889Abstract: A method of synchronizing visual information with audio playback includes the steps of selecting a desired audio file from a list stored in memory associated with a display device, sending a signal from the display device to a separate playback device to cause the separate playback device to start playing the desired audio file; and displaying visual information associated with the desired audio file on the display device in accordance with timestamp data such that the visual information is displayed synchronously with the playing of the desired audio file, wherein the commencement of playing the desired audio file and the commencement of the displaying step are a function of the signal from the display device.Type: GrantFiled: November 29, 2001Date of Patent: June 6, 2006Assignee: Koninklijke Philips Electronics N.V.Inventors: Karen I. Trovato, Dongge Li, Muralidharan Ramaswamy
-
Patent number: 7024353Abstract: In a distributed voice recognition system, a back-end pattern matching unit 27 can be informed of voice activity detection information as developed through use of a back-end voice activity detector 25. Although no specific voice activity detection information is developed or forwarded by the front-end of the system, precursor information as developed at the back-end can be used by the voice activity detector to nevertheless ascertain with relative accuracy the presence or absence of voice in a given set of corresponding voice recognition features as developed by the front-end of the system.Type: GrantFiled: August 9, 2002Date of Patent: April 4, 2006Assignee: Motorola, Inc.Inventor: Tenkasi Ramabadran
-
Patent number: 7016834Abstract: In general, this invention concerns speech encoding and decoding used in digital radio systems and a method by which the processing capacity required can be reduced in a telecommunication system using discontinuous transmission between a transmitter and receiver. In particular, the method according to the invention is used to match two telecommunication systems using different encoding methods between the transmitter and receiver. In the method, the signals transmitted by the transmitter are made suitable for the receiver in the signal path so that in the first step, at least one information parameter comprising at least two content identifiers is formed for each data frame of the data parameters (101) received. In the next step, data corresponding to the original data is synthesized from the data parameters (101) of the received frames, after which the synthesized data is transmitted for recoding with an encoding method suitable for the receiver.Type: GrantFiled: July 14, 2000Date of Patent: March 21, 2006Assignee: Nokia CorporationInventor: Ari Lakaniemi
-
Patent number: 7013267Abstract: A communication system includes a destination that receives voice samples and a voice parameter generated by a source. The destination uses the voice samples and voice parameter to reconstruct voice information in response to a packet loss. The destination may reconstruct voice information from multiple sources.Type: GrantFiled: July 30, 2001Date of Patent: March 14, 2006Assignee: Cisco Technology, Inc.Inventors: Pascal H. Huart, Luke K. Surazski
-
Patent number: 6999920 Abstract: Method for the reduction of echo and/or noise signals in telecommunications (TK) systems for the transmission of useful acoustic signals, in which, when a silence interval is present, the distorted useful signal is modified by a time-dependent control signal ao(t) or by a control signal ao(k) cycled in the rhythm of a scan rate fT=1/T. The control signal ao(k) is varied in such manner that, during the presence of speech signals in the useful signals, the amplitude of the control signal ao(k) is set to a predetermined constant value co and, when a silence interval begins, the amplitude of the control signal ao(k) is reduced continuously from one sample value to the next in accordance with a recurrence formula in which ao(k+1) equals ao(k) multiplied by a constant factor smaller than 1. After the end of the silence interval, ao(k) is again set equal to co. Type: Grant Filed: November 21, 2000 Date of Patent: February 14, 2006 Assignee: Alcatel Inventors: Hans-Jürgen Matt, Michael Walker, Michael Maurer
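The control-signal recurrence in this abstract is simple enough to sketch directly: the gain is held at the constant value during speech and multiplied by a factor below 1 on every sample of a silence interval. The function name and the parameter names `c0` and `decay` are hypothetical, and the per-sample speech/silence flags are assumed to come from some external detector.

```python
def control_signal(speech_flags, c0=1.0, decay=0.9):
    """Per-sample control gain: c0 during speech, geometric decay in silence.

    speech_flags: iterable of booleans, True where speech is present.
    decay: multiplicative factor, assumed to satisfy 0 < decay < 1.
    """
    if not 0.0 < decay < 1.0:
        raise ValueError("decay must lie strictly between 0 and 1")
    gain = c0
    gains = []
    for speech in speech_flags:
        if speech:
            gain = c0            # speech present: reset to the constant value
        else:
            gain = gain * decay  # silence: next value is the previous one times decay
        gains.append(gain)
    return gains
```

With `decay=0.5`, the flags `[True, False, False, True]` give gains of 1.0, 0.5, 0.25 and 1.0, showing both the smooth ramp-down and the immediate reset when speech returns.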
-
Patent number: 6999921Abstract: To address the need for reducing audio overhang in wireless communication systems (e.g., 100), the present invention provides for the deletion of silent frames before they are converted to audio by the listening devices. The present invention only provides for the deletion of a portion of the silent frames that make up a period of silence or low voice activity in the speaker's audio. Voice frames that make up periods of silence less than a given length of time are not deleted.Type: GrantFiled: December 13, 2001Date of Patent: February 14, 2006Assignee: Motorola, Inc.Inventors: John M. Harris, Philip J. Fleming, Joseph Tobin
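The selective deletion described in this abstract can be sketched as follows: only runs of silence at least a given length are shortened, and only part of each such run is removed. The function name `trim_silence` and the parameters `min_run` and `keep` are hypothetical, as is the choice to retain the leading frames of each long run.

```python
from itertools import groupby


def trim_silence(frames, is_silent, min_run=5, keep=3):
    """Shorten long silent runs while leaving short pauses untouched.

    frames: sequence of frames in playout order.
    is_silent: callable returning True for a silent frame.
    min_run: runs shorter than this are never modified.
    keep: number of silent frames retained from each long run (assumption).
    """
    output = []
    for silent, group in groupby(frames, key=is_silent):
        run = list(group)
        if silent and len(run) >= min_run:
            run = run[:keep]  # delete only a portion of the silent run
        output.extend(run)
    return output
```

Applied to the frame string `"vvsssssssvssv"` with `s` marking silence, the seven-frame pause shrinks to three frames while the two-frame pause is left alone.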
-
Patent number: 6980950 Abstract: An utterance detector for speech recognition is described. The detector consists of two components. The first component makes a speech/non-speech decision for each incoming speech frame. The decision is based on a frequency-selective autocorrelation function obtained by speech power spectrum estimation, frequency filter, and inverse Fourier transform. The second component makes the utterance detection decision, using a state machine that describes the detection process in terms of the speech/non-speech decision made by the first component. Type: Grant Filed: September 21, 2000 Date of Patent: December 27, 2005 Assignee: Texas Instruments Incorporated Inventors: Yifan Gong, Yu-Hung Kao
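The second component of this detector, a state machine driven by per-frame speech/non-speech decisions, might look roughly like the sketch below. The three states, the `min_speech` and `max_gap` parameters, and the function name are assumptions made for illustration; the abstract does not specify the states or transitions.

```python
def detect_utterances(decisions, min_speech=3, max_gap=2):
    """Turn per-frame speech decisions into (start, end) utterance spans.

    decisions: sequence of truthy/falsy frame decisions.
    min_speech: consecutive speech frames needed to confirm an utterance.
    max_gap: non-speech frames tolerated inside an utterance.
    The end index of each span is exclusive.
    """
    utterances = []
    state = "silence"
    start = run = gap = 0
    for i, speech in enumerate(decisions):
        if state == "silence":
            if speech:
                start, run, gap = i, 1, 0
                state = "utterance" if run >= min_speech else "onset"
        elif state == "onset":
            if speech:
                run += 1
                if run >= min_speech:
                    state = "utterance"
            else:
                state = "silence"  # too short: treat as a false start
        else:  # state == "utterance"
            if speech:
                gap = 0
            else:
                gap += 1
                if gap > max_gap:
                    # Close the utterance just after its last speech frame.
                    utterances.append((start, i - gap + 1))
                    state = "silence"
    if state == "utterance":
        utterances.append((start, len(decisions) - gap))
    return utterances
```

The onset state filters out isolated frames that the first component misclassifies as speech, and the gap counter keeps brief pauses from splitting one utterance into several.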
-
Patent number: 6947412Abstract: A method of improving sound playback of digitized speech signals transmitted to a telecommunications terminal at the beginning of a telephone call set up over a communications network where the signals are transmitted in the form of packets, and in particular at the beginning of a VOIP call set up under Internet protocol, at the time said call is set up from a sending telecommunications terminal fitted with voice activity detection means so as to be capable of transmitting only those digitized signal packets that contain speech taken from a set of sound signals that are suitable for being transmitted in the form of packets after the sound has been digitized and encoded in the sending terminal. Signal packets are transmitted from the digitizing and encoding means during an initial call optimization stage without taking account of whether or not any speech signals are present. The invention also provides telecommunications hardware implementing the method.Type: GrantFiled: December 19, 2000Date of Patent: September 20, 2005Assignee: AlcatelInventors: Luc Attimont, Jannick Bodin
-
Patent number: 6885987Abstract: At an audio source, pause information is added to audio data, the combination of which is subsequently packetized. The resulting packets are transmitted to an audio destination via a network in which different packets may be subjected to varying levels of delay. At the audio destination, the pause information may be used to insert pauses at appropriate times to accommodate the occurrence of delays in packet delivery. In one embodiment, pauses are inserted based on a hierarchy of pause types. During pauses, audio filler information may be injected. In this manner, the effects of variable network delays upon reconstructed audio may be mitigated.Type: GrantFiled: February 9, 2001Date of Patent: April 26, 2005Assignee: fastmobile, Inc.Inventors: Dale R. Buchholz, Bashar Jano, Ira Gerson
-
Patent number: 6856961Abstract: The invention provides a speech coding system with input signal transformation that may reduce or essentially eliminate “silence noise” from the input or speech signal. The speech coding system may comprise an encoder disposed to receive an input signal. The encoder ramps the input signal to a zero-level when a portion of the input signal comprises silence noise.Type: GrantFiled: February 13, 2001Date of Patent: February 15, 2005Assignee: Mindspeed Technologies, Inc.Inventor: Jes Thyssen
-
Patent number: 6847930Abstract: Voice activity is detected by comparing an analog signal with two voltage thresholds and producing data representing the energy of the signal. The data, in binary form, is compared with thresholds to determine voice activity. In accordance with another aspect of the invention, the thresholds are adjusted based upon statistical information. In accordance with another aspect of the invention, the data can be weighted to provide an indication of the quasi-RMS energy of an input signal. The input signal itself is not converted into digital form yet the data derived from the input signal has high resolution.Type: GrantFiled: January 25, 2002Date of Patent: January 25, 2005Assignee: Acoustic Technologies, Inc.Inventors: Justin L. Allen, Steven M. Domer
-
Patent number: 6826600Abstract: Mechanisms and techniques allow computer systems to create and exchange uniquely identified shared objects. Using this invention, a client computer system can operate client software to generate local object definitions in a local object specification. To assure that the local object definitions created by the client are uniquely identifiable by this client, as well as by a server and possibly other clients which may require access to such object definitions (e.g., other clients in a collaboration software system), the invention allows the client to send the local object specification to the server for unique identification of the object definitions. The server receives the local object specification containing the local object definitions created by the client and can convert each local object definition within the local object specification to a global object definition in a global object specification.Type: GrantFiled: November 2, 2000Date of Patent: November 30, 2004Assignee: Cisco Technology, Inc.Inventor: Paul J. Russell
-
Patent number: 6826527Abstract: A decoder for code excited LP encoded frames with both adaptive and fixed codebooks; erased frame concealment uses muted repetitive excitation, threshold-adapted bandwidth expanded repetitive synthesis filter, and jittered repetitive pitch lag.Type: GrantFiled: November 3, 2000Date of Patent: November 30, 2004Assignee: Texas Instruments IncorporatedInventor: Takahiro Unno
-
Publication number: 20040225494Abstract: In a digital voice communication system, a transmitting radio (100/102) checks for activity on a channel. If activity is not detected on the channel, the transmitting radio transmits at least one voice frame on the channel. If activity is detected on the channel, the transmitting radio temporarily buffers the at least one voice frame and waits a period of time prior to re-checking the channel for activity. If activity is still detected on the channel, the transmitting radio repeats the step of waiting until activity is no longer detected on the channel; otherwise, the transmitting radio transmits the at least one voice frame that was temporarily buffered.Type: ApplicationFiled: May 7, 2003Publication date: November 11, 2004Inventors: Kevin B. Mayginnes, Darrell J. Stogner
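The check, buffer, wait and re-check loop in this abstract can be sketched in a few lines. The callables `channel_busy` and `transmit`, the `wait_s` interval, and the injectable `sleep` function are hypothetical stand-ins for the radio hardware; note that this sketch waits indefinitely if the channel never clears.

```python
import time
from collections import deque


def send_when_clear(frames, channel_busy, transmit, wait_s=0.02, sleep=time.sleep):
    """Transmit voice frames only while the channel shows no activity.

    frames: voice frames in order; they stay buffered while the channel is busy.
    channel_busy: callable returning True if activity is detected.
    transmit: callable that sends one frame.
    """
    pending = deque(frames)  # temporary buffer for frames awaiting a clear channel
    while pending:
        if channel_busy():
            sleep(wait_s)    # wait a period of time, then re-check the channel
            continue
        transmit(pending.popleft())
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a real radio would also need a timeout or a buffer limit.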
-
Patent number: 6816832 Abstract: A comfort noise block, which includes a hangover period and comfort noise parameters, is transmitted in such a manner that it is not interrupted by other messages, such as FACCH messages. This is accomplished in a mobile station by a determination of whether any FACCH messages are required to be transmitted. If such FACCH messages exist, a further determination may be made as to which transmission can be made in the shortest time (i.e., the FACCH message or messages or the comfort noise parameters message), and this transmission is made first. In any event the comfort noise parameters block is transmitted without interruption. In a further embodiment of this invention the comfort noise parameters message is transmitted by being concatenated with another message, such as a neighbor channel measurement results message, so as to reduce overhead, conserve bandwidth, and reduce power consumption. Type: Grant Filed: June 11, 2001 Date of Patent: November 9, 2004 Assignee: Nokia Corporation Inventors: Seppo Alanara, Pekka Kapanen
-
Patent number: 6807525Abstract: A method to reduce the amount of bandwidth used in the transmission of digitized voice packets is described. The method is used to reduce the number of transmitted packets by suspending transmission during periods of silence or when only noise is present. The system determines if a background noise update is warranted based on human auditory perception factors instead of an artificial limiter on excessive silence insertion descriptor packets. The system searches for characteristics in the perceptual changes of background noise instead of analyzing speech for improved audio compression. The invention weighs factors affecting the perception of sound including frequency masking, temporal masking, loudness perception based on tone, and auditory perception differential based on tone.Type: GrantFiled: October 31, 2000Date of Patent: October 19, 2004Assignee: Telogy Networks, Inc.Inventors: Dunling Li, Gokhan Sisli, Daniel Thomas
-
Patent number: 6801894Abstract: A speech synthesizer includes a data memory having a plurality of address areas, which stores a plurality of phases in the address areas and an address designating circuit designating one of the address areas based on the phase signal. Further, a speech synthesizer includes a speech synthesizing circuit generating a speech synthesizing signal corresponding to the phase, which is stored in the designated area, a digital/analog converter transforming the speech synthesizing signal to an analog signal having amplitude, and a counter setting a period of silence. Furthermore, a speech synthesizer includes a silence-input circuit being connected between the speech synthesizing circuit and the digital/analog converter, which supplies a predetermined voltage to the digital/analog converter for the period that is set by the counter.Type: GrantFiled: March 22, 2001Date of Patent: October 5, 2004Assignee: Oki Electric Industry Co., Ltd.Inventors: Yoshihisa Nakamura, Hiroaki Matsubara
-
Publication number: 20040193409Abstract: Systems and methods for dynamically analyzing temporality in an individual's speech in order to selectively categorize the speech fluency of the individual and/or to selectively provide speech training based on the results of the dynamic analysis. Temporal variables in one or more speech samples are dynamically quantified. The temporal variables in combination with a dynamic process, which is derived from analyses of temporality in the speech of native speakers and language learners, are used to provide a fluency score that identifies a proficiency of the individual. In some implementations, temporal variables are measured instantaneously.Type: ApplicationFiled: December 11, 2003Publication date: September 30, 2004Inventors: Lynne Hansen, Joshua Rowe
-
Patent number: 6799161
Abstract: A speech coding apparatus having a speech input unit for receiving input speech, a speech coding rate selector for selecting an appropriate speech coding rate according to the power of the input speech, a speech analyzer for processing the input speech to estimate a transfer function of the speaker's oral cavity, and a speech coding unit forming a synthesis filter based on the transfer function of the oral cavity. The speech coding unit also codes an excitation signal of the synthesis filter on the basis of an estimation result supplied by the speech analyzer. A gain suppressor interposed between the speech input unit and the speech coding unit suppresses the gain of a signal supplied from the speech input unit to the speech coding unit during an unvoiced period according to information from the speech coding rate selector.
Type: Grant
Filed: January 15, 2002
Date of Patent: September 28, 2004
Assignee: Oki Electric Industry Co., Ltd.
Inventor: Atsushi Yokoyama
-
Patent number: 6785644
Abstract: With respect to data having periodicity to be compressed, windows of the same size are set for every two sections according to an interval of peaks appearing substantially periodically and processing for sorting sample data alternately among the set windows of the same size is sequentially performed, whereby a frequency of data having periodicity is replaced with an approximately half frequency without damaging reproducibility to original data at all to make it possible to apply compression processing to data of the replaced low frequency. If this sorting processing is applied to compression processing having a characteristic that a compression ratio is not increased in a high-frequency region, it becomes possible to improve a compression ratio without damaging a quality of reproduced data by decompression at all.
Type: Grant
Filed: December 16, 2002
Date of Patent: August 31, 2004
Assignee: Yasue Sakai
Inventor: Yukio Koyanagi
-
Patent number: 6782363
Abstract: A method and apparatus for performing real-time endpoint detection for use in automatic speech recognition. A filter is applied to the input speech signal and the filter output is then evaluated with use of a state transition diagram (i.e., a finite state machine). The filter is advantageously designed in light of several criteria in order to increase the accuracy and robustness of detection. The state transition diagram advantageously has three states. The endpoints which are detected may then be advantageously applied to the problem of energy normalization of the speech portion of the signal.
Type: Grant
Filed: May 4, 2001
Date of Patent: August 24, 2004
Assignee: Lucent Technologies Inc.
Inventors: Chin-Hui Lee, Qi P. Li, Jinsong Zheng, Qiru Zhou
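
Illustrative sketch (not taken from patent 6782363): the abstract describes filtering the input and evaluating the filter output with a three-state finite state machine. The Python below shows one way such a scheme could look. The state names, the moving-average filter, the two thresholds (`upper`, `lower`), and the `hangover` count are all assumptions made for illustration; the abstract does not specify the filter design or the transition rules.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state names; the abstract only says there are three states.
    SILENCE = 0
    IN_SPEECH = 1
    LEAVING_SPEECH = 2


def moving_average(values, width=3):
    """Stand-in smoothing filter (assumption, not the patent's filter)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def detect_endpoints(energies, upper=0.5, lower=0.2, hangover=2):
    """Return (start, end) frame-index pairs for detected speech segments."""
    filtered = moving_average(energies)
    state = State.SILENCE
    segments = []
    start = 0
    quiet = 0  # consecutive frames below `lower` while leaving speech
    for i, value in enumerate(filtered):
        if state is State.SILENCE:
            if value >= upper:
                state = State.IN_SPEECH
                start = i
        elif state is State.IN_SPEECH:
            if value < lower:
                state = State.LEAVING_SPEECH
                quiet = 1
        else:  # State.LEAVING_SPEECH
            if value >= lower:
                state = State.IN_SPEECH  # energy came back: still speech
            else:
                quiet += 1
                if quiet > hangover:
                    # The last speech frame precedes the quiet run.
                    segments.append((start, i - quiet))
                    state = State.SILENCE
    # Close a segment that is still open when the input ends.
    if state is State.IN_SPEECH:
        segments.append((start, len(filtered) - 1))
    elif state is State.LEAVING_SPEECH:
        segments.append((start, len(filtered) - 1 - quiet))
    return segments
```

The intermediate LEAVING_SPEECH state keeps short dips in energy from splitting one utterance into several segments, which is one common reason to use three states rather than two.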
-
Patent number: 6766291
Abstract: The invention relates to a method and apparatus for controlling the transition of a bypass capable codec between operative modes, based on a certain characteristic of the audio data signal processed by the codec. The apparatus relies on a control signal to determine when the codec will switch from one mode to another. This control signal reflects a characteristic of the audio data signal received at the apparatus, such as the type of speech activity or the format of the audio data signal. When in the active (non-bypass) mode, the apparatus relies on an additional control signal to switch to the inactive (bypass) mode. This additional control signal is received from a control unit at a remote codec that indicates that the remote codec is also bypass capable, hence the decoder at the first codec and the encoder at the remote codec can switch to the inactive mode to pass between them the compressed data frames.
Type: Grant
Filed: June 18, 1999
Date of Patent: July 20, 2004
Assignee: Nortel Networks Limited
Inventors: Chung Cheung C. Chu, Rafi Rabipour, David G. Sloan
-
Publication number: 20040138880
Abstract: Estimating a signal power in a compressed audio signal [A] is provided, the audio signal comprising blocks of quantized samples, a given block being provided with a set of scale factors. The estimating is performed by extracting the set of scale factors from the compressed audio signal, and estimating the signal power in the given block based on a combination of the scale factors. Advantageously, the extracting step and estimating step are performed on only a sub-set of the set of scale factors. The signal power estimation may be used in a silence detector (11) for use in a receiver (1).
Type: Application
Filed: November 6, 2003
Publication date: July 15, 2004
Inventors: Alessio Stella, Jan Alexis Daniel Nesvadba, Mauro Barbieri, Freddy Snijder
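
Illustrative sketch (not taken from publication 20040138880): the abstract says block power is estimated from "a combination of the scale factors", optionally using only a sub-set of them, and that the estimate may feed a silence detector. The Python below assumes the combination is a mean of squared scale factors and that the sub-set is every `subset_step`-th factor. Both choices, along with the function names and the threshold value, are hypothetical.

```python
def estimate_block_power(scale_factors, subset_step=1):
    """Estimate block power as the mean squared scale factor.

    Only every `subset_step`-th scale factor is used, mirroring the idea of
    working on a sub-set. The mean-of-squares combination is an assumption.
    """
    subset = scale_factors[::subset_step]
    if not subset:
        return 0.0
    return sum(sf * sf for sf in subset) / len(subset)


def is_silent(scale_factors, threshold=1e-4, subset_step=2):
    """Flag a block as silent when its estimated power is below `threshold`."""
    return estimate_block_power(scale_factors, subset_step) < threshold
```

Because the scale factors are read straight from the compressed stream, an estimate of this kind avoids decoding the quantized samples, which is the efficiency argument the abstract makes.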
-
Publication number: 20040133420
Abstract: Compressed signals contain amplitude data (for example, scale factors in an MPEG frame) which can be examined to enable a decision to be taken on whether the signal contains information or not (e.g. silence in the case of audio or no image in the case of video).
Type: Application
Filed: December 15, 2003
Publication date: July 8, 2004
Inventors: Gavin Robert Ferris, Michael Vincent Woodward
-
Publication number: 20040133421
Abstract: Acoustic noise suppression is provided in multiple-microphone systems using Voice Activity Detectors (VAD). A host system receives acoustic signals via multiple microphones. The system also receives information on the vibration of human tissue associated with human voicing activity via the VAD. In response, the system generates a transfer function representative of the received acoustic signals upon determining that voicing information is absent from the received acoustic signals during at least one specified period of time. The system removes noise from the received acoustic signals using the transfer function, thereby producing a denoised acoustic data stream.
Type: Application
Filed: September 18, 2003
Publication date: July 8, 2004
Inventors: Gregory C. Burnett, Eric F. Breitfeller
-
Patent number: 6754620
Abstract: A system and method is provided for rendering data indicative of delays associated with enabling and/or disabling an analog-to-digital conversion system employed by a telephony communication network. The system of the present invention utilizes a display device and an interface manager. The interface manager receives data indicative of power levels at various frequencies and times of signals received by a transceiver that is communicating via the conventional telephony communication network. The interface manager then renders a graphical display via the display device based on the received data. The graphical display may include clusters, in which each of the clusters is associated with a particular range of power levels. By analyzing the clusters, a user can determine the delays associated with enabling and/or disabling the analog-to-digital conversion system. The graphical display may also include indicators that may be used to determine the foregoing delays.
Type: Grant
Filed: March 29, 2000
Date of Patent: June 22, 2004
Assignee: Agilent Technologies, Inc.
Inventor: Samuel M Bauer
-
Publication number: 20040107092
Abstract: A digital line transmission unit can carry out switching between speech codecs during the same call to achieve balance between making effective use of a line and a high sound quality without bringing about a feeling of discomfort in a user by the switching. It includes in an encoder a first speech codec 7 with a high sound quality and a high bit rate, a second speech codec 8 with a reasonable sound quality but a low bit rate. It carries out switching between these speech codecs in response to the control information an operation monitoring controller 4 obtains by making a decision as to the traffic volume of the bearer line 111. The switching between the speech codecs is made during a speech pause a speech burst detector 31 in a signal detector 3 detects in an input speech signal.
Type: Application
Filed: October 6, 2003
Publication date: June 3, 2004
Inventor: Yoshihisa Harada
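
Illustrative sketch (not taken from publication 20040107092): the abstract describes choosing between a high-bit-rate and a low-bit-rate codec according to traffic volume, with the actual switch deferred to a speech pause. The Python below models that deferral per frame. The function name, the codec labels, and the boolean per-frame inputs are assumptions for illustration only.

```python
def choose_codec_per_frame(traffic_high, speech_active, initial="high_quality"):
    """Return the codec used for each frame.

    traffic_high[i]  -- True if the line is congested at frame i (hypothetical input)
    speech_active[i] -- True if a speech burst is detected at frame i
    A pending switch is applied only on a frame where no speech is active.
    """
    current = initial
    used = []
    for high, active in zip(traffic_high, speech_active):
        desired = "low_bitrate" if high else "high_quality"
        if desired != current and not active:
            current = desired  # switch only during a speech pause
        used.append(current)
    return used
```

Deferring the change to a pause means the listener never hears the codec change in the middle of a talk spurt, which matches the abstract's stated goal of avoiding discomfort from the switching.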
-
Publication number: 20040098253
Abstract: A method and system for allowing a user to interface to an interactive voice response system via natural language commands.
Type: Application
Filed: October 14, 2003
Publication date: May 20, 2004
Inventors: Bruce Balentine, Rex Stringham, Ralph Melaragno, Justin Munroe