Voiced Or Unvoiced Patents (Class 704/208)
  • Patent number: 8438014
    Abstract: According to one embodiment, in a speech processing device, an extractor windows a part of the speech signal and extracts a partial waveform. A calculator performs frequency analysis of the partial waveform to calculate a frequency spectrum. An estimator generates an artificial waveform that is a waveform according to an interval between the pitch marks for each harmonic component having a frequency that is a predetermined multiple of a fundamental frequency of the speech signal and estimates harmonic spectral features representing characteristics of the frequency spectrum of the harmonic component from each of the artificial waveforms. A separator separates the partial waveform into a periodic component produced from periodic vocal-fold vibration as an acoustic source and an aperiodic component produced from aperiodic acoustic sources other than the vocal-fold vibration by using the respective harmonic spectral features and the frequency spectrum of the partial waveform.
    Type: Grant
    Filed: January 26, 2012
    Date of Patent: May 7, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masahiro Morita, Javier Latorre, Takehiko Kagoshima
  • Patent number: 8438016
    Abstract: A client for silence-based adaptive real-time voice and video (SAVV) transmission methods and systems, detects the activity of a voice stream of conversational speech and aggressively transmits the corresponding video frames if silence in the sending or receiving voice stream has been detected, and adaptively generates and transmits key frames of the video stream according to characteristics of the conversational speech. In one aspect, a coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice encoder of the SAVV client and the user's instructions. In another aspect, the coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice decoder of the SAVV client and the user's instructions. In one example, the coordination management module adaptively generates a key video frame when silence is detected in the receiving voice stream.
    Type: Grant
    Filed: April 10, 2008
    Date of Patent: May 7, 2013
    Assignee: City University of Hong Kong
    Inventors: Weijia Jia, Lizhuo Zhang, Huan Li, Wenyan Lu
  • Patent number: 8417524
    Abstract: Analyzing an audio interaction is provided. At least one change in an emotion of a speaker in an audio interaction and at least one aspect of the audio interaction are identified. The at least one change in an emotion is analyzed in conjunction with the at least one aspect to determine a relationship between the at least one change in an emotion and the at least one aspect, and a result of the analysis is provided.
    Type: Grant
    Filed: February 11, 2010
    Date of Patent: April 9, 2013
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Chitra Dorai, Shailesh Joshi, Maureen E. Rzasa, Ashish Verma, Karthik Visweswariah, Gary J. Wright, Sai Zeng
  • Patent number: 8407044
    Abstract: A method for discriminating a telephony content signal into a first category or a second category is described. The method comprises a filtering procedure for obtaining from the telephony content signal a band signal set comprising one or more band signals, each band signal being associated with a respective frequency band at least one of said band signals being a sub-band signal (n) associated with a sub-band of an overall frequency band of the telephony content signal. Furthermore a determination procedure is provided for determining a band signal variation value (LLn) and a band signal strength value (TLn) for each band signal (n) of said band signal set. Finally, a discrimination procedure discriminates whether the telephony content signal is of the first category or of the second category.
    Type: Grant
    Filed: October 30, 2008
    Date of Patent: March 26, 2013
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventor: Arto Juhani Mahkonen
  • Patent number: 8392179
    Abstract: The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select from the codebooks codevectors and/or associated gain factors by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.
    Type: Grant
    Filed: March 12, 2009
    Date of Patent: March 5, 2013
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Rongshan Yu, Regunathan Radhakrishnan, Robert Andersen, Grant Davidson
  • Publication number: 20130041658
    Abstract: A system and method may be configured to process an audio signal. The system and method may track pitch, chirp rate, and/or harmonic envelope across the audio signal, may reconstruct sound represented in the audio signal, and/or may segment or classify the audio signal. A transform may be performed on the audio signal to place the audio signal in a frequency chirp domain that enhances the sound parameter tracking, reconstruction, and/or classification.
    Type: Application
    Filed: August 8, 2011
    Publication date: February 14, 2013
    Applicant: The Intellisis Corporation
    Inventors: David C. BRADLEY, Daniel S. GOLDIN, Robert N. HILTON, Nicholas K. FISHER, Rodney GATEAU, Derrick R. ROOS, Eric WIEWIORA
  • Patent number: 8374851
    Abstract: Acoustic echo control for hands-free phone has acoustic echo cancellation and echo suppression with a voice activity detection for the echo suppression based on near-end input power together with an estimate for acoustic echo cancellation gain.
    Type: Grant
    Filed: July 30, 2007
    Date of Patent: February 12, 2013
    Assignee: Texas Instruments Incorporated
    Inventors: Takahiro Unno, Jesper Gormsen Kragh, Fabien Ober, Ali Erdem Ertan
  • Patent number: 8370132
    Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.
    Type: Grant
    Filed: November 21, 2005
    Date of Patent: February 5, 2013
    Assignee: Verizon Services Corp.
    Inventor: Adrian E. Conway
  • Patent number: 8370138
    Abstract: A scalable encoding device is capable of improving quality of a decoded signal without increasing an encoding amount and compensating data with a sufficient quality upon data loss. An extension layer bit distribution calculator calculates a bit distribution of a quality improving encoding data and compensation encoding data in the extension layer according to an audio mode of the input signal. An extension layer encoder generates quality improving encoding data according to the specified number of bits. A compensation information encoder extracts a part of core layer encoding data and makes it as compensation encoding data for the core layer. An extension layer encoded data generator multiplexes the extension layer bit distribution information, the compensation encoding data, and the quality improving encoding data so as to obtain extension layer encoding data.
    Type: Grant
    Filed: March 15, 2007
    Date of Patent: February 5, 2013
    Assignee: Panasonic Corporation
    Inventors: Takuya Kawashima, Hiroyuki Ehara, Koji Yoshida
  • Patent number: 8370144
    Abstract: A method for identifying end of voiced speech within an audio stream of a noisy environment employs a speech discriminator. The discriminator analyzes each window of the audio stream, producing an output corresponding to the window. The output is used to classify the window in one of several classes, for example, (1) speech, (2) silence, or (3) noise. A state machine processes the window classifications, incrementing counters as each window is classified: speech counter for speech windows, silence counter for silence, and noise counter for noise. If the speech counter indicates a predefined number of windows, the state machine clears all counters. Otherwise, the state machine appropriately weights the values in the silence and noise counters, adds the weighted values, and compares the sum to a limit imposed on the number of non-voice windows. When the non-voice limit is reached, the state machine terminates processing of the audio stream.
    Type: Grant
    Filed: June 3, 2010
    Date of Patent: February 5, 2013
    Assignee: Applied Voice & Speech Technologies, Inc.
    Inventor: Karl D. Gierach
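
The counter logic in the abstract above can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the patented implementation: the label strings, the default weights, the reset count, and the non-voice limit are all assumed values, and the window classifier itself is taken as given.

```python
def find_end_of_speech(labels, speech_reset=3, silence_weight=1.0,
                       noise_weight=0.5, non_voice_limit=4.0):
    """Return the index of the window where processing stops, or None.

    labels: iterable of "speech", "silence", or "noise" (assumed names),
    one per analysis window, as produced by some speech discriminator.
    All numeric defaults are hypothetical.
    """
    speech = silence = noise = 0
    for i, label in enumerate(labels):
        # Increment the counter that matches this window's class.
        if label == "speech":
            speech += 1
        elif label == "silence":
            silence += 1
        else:
            noise += 1

        # Enough speech windows seen: clear all counters and keep listening.
        if speech >= speech_reset:
            speech = silence = noise = 0
            continue

        # Otherwise weight and add the non-voice counters, then check the limit.
        weighted = silence_weight * silence + noise_weight * noise
        if weighted >= non_voice_limit:
            return i
    return None
```

With these assumed defaults, three speech windows followed by four silence windows stop at the last silence window, while eight noise windows are needed to reach the same limit because noise is weighted at half.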
  • Patent number: 8364477
    Abstract: A method (400, 500) and apparatus (220) seek to improve the intelligibility of speech emitted into a noisy environment. Formants are identified (426) and a perceptual frequency scale band is selected (502) that includes at least one of the identified formants. The SNR in each band is compared (504) to a threshold and, if the SNR for that band is less than the threshold, the method increases a formant enhancement gain for that band. A set of high pass filter gains (338) is combined (516) with the formant enhancement gains yielding combined gains that are then clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532), and used to reconstruct (532, 534) an audio signal.
    Type: Grant
    Filed: August 30, 2012
    Date of Patent: January 29, 2013
    Assignee: Motorola Mobility LLC
    Inventors: Jianming J Song, John C Johnson
  • Patent number: 8363678
    Abstract: Method and apparatus to synchronize packet rate for audio information are described.
    Type: Grant
    Filed: July 28, 2008
    Date of Patent: January 29, 2013
    Assignee: Intel Corporation
    Inventor: Siu H. Lam
  • Publication number: 20130024192
    Abstract: Disclosed is an information display system provided with: a signal analyzing unit which analyzes the audio signals obtained from a predetermined location and which generates ambient sound information regarding the sound generated at the predetermined location; and an ambient expression selection unit which selects an ambient expression which expresses the content of what a person is feeling from the sound generated at the predetermined location on the basis of the ambient sound information.
    Type: Application
    Filed: March 28, 2011
    Publication date: January 24, 2013
    Applicant: NEC CORPORATION
    Inventors: Toshiyuki Nomura, Yuzo Senda, Kyota Higa, Takayuki Arakawa, Yasuyuki Mitsui
  • Patent number: 8346546
    Abstract: A packet loss concealment method and system is described that attempts to reduce or eliminate destructive interference that can occur when an extrapolated waveform representing a lost segment of a speech or audio signal is merged with a good segment after a packet loss. This is achieved by guiding a waveform extrapolation that is performed to replace the bad segment using a waveform available in the first good segment or segments after the packet loss. In another aspect of the invention, a selection is made between a packet loss concealment method that performs the aforementioned guided waveform extrapolation and one that does not. The selection may be made responsive to determining whether the first good segment or segments after the packet loss are available and also to whether a segment preceding the lost segment and the first good segment following the lost segment are deemed voiced.
    Type: Grant
    Filed: July 31, 2007
    Date of Patent: January 1, 2013
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
  • Patent number: 8346543
    Abstract: A VAD/SS system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: March 17, 2011
    Date of Patent: January 1, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8340078
    Abstract: In one embodiment, a method can include: (i) establishing an internet protocol (IP) connection; (ii) forming a buffered version of a plurality of voice frame slices from received audio packets; and (iii) when an erasure is detected, performing a packet loss concealment (PLC) to provide a synthesized speech signal for the erasure, where the PLC can include: (a) identifying first and second pitches from the buffered version of the plurality of voice frame slices; and (b) forming the synthesized speech signal by using the first and second pitches, and more if needed, followed by an overlay-add (OLA).
    Type: Grant
    Filed: December 21, 2006
    Date of Patent: December 25, 2012
    Assignee: Cisco Technology, Inc.
    Inventors: Duanpei Wu, Luke K. Surazski
  • Patent number: 8332228
    Abstract: In one embodiment, a method of generating a highband excitation signal includes generating a spectrally extended signal by extending the spectrum of a signal that is based on an encoded lowband excitation signal; and performing anti-sparseness filtering of a signal that is based on the encoded lowband excitation signal. In this method, the highband excitation signal is based on the spectrally extended signal, and the highband excitation signal is based on a result of the anti-sparseness filtering.
    Type: Grant
    Filed: April 3, 2006
    Date of Patent: December 11, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Koen Bernard Vos, Ananthapadmanabhan Aasanipalai Kandhadai
  • Patent number: 8326612
    Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.
    Type: Grant
    Filed: April 5, 2010
    Date of Patent: December 4, 2012
    Assignee: Fujitsu Limited
    Inventors: Nobuyuki Washio, Shoji Hayakawa
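
The judging and counting steps described above might look like the sketch below. The abstract does not define how the spectral bias is computed, so the per-frame bias values are assumed to be precomputed. Only the "greater than or equal to" variant is shown, and the function name and return format are hypothetical.

```python
def non_speech_sections(bias_per_frame, threshold, min_frames):
    """Return (first_frame, last_frame) pairs for runs of at least
    `min_frames` consecutive frames whose bias is >= `threshold`."""
    sections = []
    run_start = None
    for i, bias in enumerate(bias_per_frame):
        if bias >= threshold:
            # Judging part: this frame meets the bias condition.
            if run_start is None:
                run_start = i
        else:
            # Run ended: the count-judging part decides whether it was long enough.
            if run_start is not None and i - run_start >= min_frames:
                sections.append((run_start, i - 1))
            run_start = None
    # Handle a run that extends to the final frame.
    if run_start is not None and len(bias_per_frame) - run_start >= min_frames:
        sections.append((run_start, len(bias_per_frame) - 1))
    return sections
```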
  • Patent number: 8326613
    Abstract: The present invention relates to a method of synthesizing a signal comprising the steps of determining required pitch bell locations, mapping the required pitch bell locations onto the signal to provide first pitch bell locations, randomizing the first pitch bell locations to provide second pitch bell locations, windowing the signal on the second pitch bell locations to provide a pitch bell, repeating the aforementioned steps for all required pitch bell locations and performing an overlap and add operation with respect to the pitch bells in order to synthesize the signal.
    Type: Grant
    Filed: August 25, 2010
    Date of Patent: December 4, 2012
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Ercan Ferit Gigi
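
A minimal sketch of the pitch bell steps listed above, under several assumptions not stated in the abstract: the mapping onto the source signal is a simple clamp to the signal range, the randomization is a uniform integer jitter, and the window is a Hann window. All function and parameter names are hypothetical.

```python
import math
import random


def synthesize(signal, required_locations, bell_half_width, jitter, seed=0):
    """Overlap-add Hann-windowed pitch bells at the required locations."""
    rng = random.Random(seed)
    n = len(signal)
    out = [0.0] * (max(required_locations) + bell_half_width + 1)
    for target in required_locations:
        # Map the required location onto the source signal (assumed: clamp).
        first = min(max(target, 0), n - 1)
        # Randomize the mapped location (assumed: uniform integer jitter).
        second = first + rng.randint(-jitter, jitter)
        # Window the signal around the randomized location and overlap-add
        # the resulting pitch bell at the required output location.
        for offset in range(-bell_half_width, bell_half_width + 1):
            src = second + offset
            dst = target + offset
            if 0 <= src < n and 0 <= dst < len(out):
                w = 0.5 * (1.0 + math.cos(math.pi * offset / bell_half_width))
                out[dst] += w * signal[src]
    return out
```

With zero jitter and locations spaced one half-width apart, the Hann windows sum to one, so a constant input is reproduced in the interior of the output.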
  • Patent number: 8326610
    Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a first frame of the signal, the first frame comprising a voiced frame. One or more cords can be extracted from the voiced frame based on occurrence of one or more events within the frame. For example, the one or more events can comprise one or more glottal pulses. The one or more cords can collectively comprise less than all of the frame. For example, each of the cords can begin with onset of a glottal pulse and extend to a point prior to an onset of neighboring glottal pulse but may exclude a portion of the frame prior to the onset of the neighboring glottal pulse. A phoneme for the voiced frame can be determined based on at least one of the extracted cords.
    Type: Grant
    Filed: October 23, 2008
    Date of Patent: December 4, 2012
    Assignee: Red Shift Company, LLC
    Inventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
  • Patent number: 8326611
    Abstract: Acoustic Voice Activity Detection (AVAD) methods and systems are described. The AVAD methods and systems, including corresponding algorithms or programs, use microphones to generate virtual directional microphones which have very similar noise responses and very dissimilar speech responses. The ratio of the energies of the virtual microphones is then calculated over a given window size and the ratio can then be used with a variety of methods to generate a VAD signal. The virtual microphones can be constructed using either an adaptive or a fixed filter.
    Type: Grant
    Filed: October 26, 2009
    Date of Patent: December 4, 2012
    Assignee: AliphCom, Inc.
    Inventors: Nicolas Petit, Gregory Burnett, Zhinian Jing
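
The energy-ratio step in the abstract above could be sketched as below. The construction of the virtual directional microphones (adaptive or fixed filtering) is not shown, and the two virtual microphone signals are assumed to be given. The abstract only says the ratio "can be used with a variety of methods", so the fixed threshold here is one assumed choice, and the names are hypothetical.

```python
def energy_ratio_vad(v1, v2, window, threshold, eps=1e-12):
    """Return one boolean per non-overlapping window: True where the energy
    of virtual microphone v1 exceeds `threshold` times that of v2."""
    flags = []
    usable = min(len(v1), len(v2))
    for start in range(0, usable - window + 1, window):
        e1 = sum(x * x for x in v1[start:start + window])
        e2 = sum(x * x for x in v2[start:start + window])
        # eps guards against division by zero in silent windows.
        flags.append(e1 / (e2 + eps) > threshold)
    return flags
```

Because the virtual microphones are described as having similar noise responses, the ratio stays near one in noise and departs from one when speech is present, which is what the threshold test relies on.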
  • Patent number: 8321212
    Abstract: A device, computer program product and method for supporting multi-language of a mobile terminal comprising: receiving broadcast data; checking whether a selected broadcast channel supports multi-language based on additional information of the received broadcast data; and outputting an indication message when the broadcast channel supports the multi-language, whereby a user can flexibly set a broadcast language of his desired channel before or while the broadcast is output, and also a user interface environment can be improved so as to facilitate the setup or change of the broadcast language.
    Type: Grant
    Filed: March 10, 2008
    Date of Patent: November 27, 2012
    Assignee: LG Electronics Inc.
    Inventor: Mi-Sun Kim
  • Patent number: 8321213
    Abstract: Acoustic Voice Activity Detection (AVAD) methods and systems are described. The AVAD methods and systems, including corresponding algorithms or programs, use microphones to generate virtual directional microphones which have very similar noise responses and very dissimilar speech responses. The ratio of the energies of the virtual microphones is then calculated over a given window size and the ratio can then be used with a variety of methods to generate a VAD signal. The virtual microphones can be constructed using either an adaptive or a fixed filter.
    Type: Grant
    Filed: October 26, 2009
    Date of Patent: November 27, 2012
    Assignee: AliphCom, Inc.
    Inventors: Nicolas Petit, Gregory Burnett, Zhinian Jing
  • Patent number: 8321217
    Abstract: The present invention relates to a voice activity detector (VAD) comprising at least a first primary voice detector. The voice activity detector is configured to output a speech decision ‘vad_flag’ indicative of the presence of speech in an input signal based on at least a primary speech decision ‘vad_prim_A’ produced by said first primary voice detector. The voice activity detector further comprises a short term activity detector and the voice activity detector is further configured to produce a music decision ‘vad_music’ indicative of the presence of music in the input signal based on a short term primary activity signal ‘vad_act_prim_A’ produced by said short term activity detector based on the primary speech decision ‘vad_prim_A’ produced by the first voice detector. The short term primary activity signal ‘vad_act_prim_A’ is proportional to the presence of music in the input signal. The invention also relates to a node, e.g. a terminal, in a communication system comprising such a VAD.
    Type: Grant
    Filed: April 18, 2008
    Date of Patent: November 27, 2012
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventor: Martin Sehlstedt
  • Patent number: 8315862
    Abstract: An audio signal quality enhancement apparatus and method. The apparatus includes a pitch calculating unit to extract a pitch period of an audio signal, a frequency domain transforming unit to transform the audio signal to a frequency domain, a frequency band dividing unit to classify the transformed audio signal into audio signals for each of the plurality of frequency bands based on the extracted pitch period, and a pitch enhancement unit to determine a gain based on a volume of the transformed audio signal, and to generate an output signal by multiplying each of the classified audio signals with respect to each of the plurality of frequency bands by the gain, thereby enhancing quality of the audio signal.
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: November 20, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung Hoe Kim, Ho Chong Park, Eun Mi Oh
  • Patent number: 8315854
    Abstract: A method and an apparatus for detecting a pitch in input voice signals by using a spectral auto-correlation. The pitch detection method includes: performing a Fourier transform on the input voice signals after performing a pre-processing on the input voice signals, performing an interpolation on the transformed voice signals, calculating a spectral difference from a difference between spectrums of the interpolated voice signals, calculating a spectral auto-correlation by using the calculated spectral difference, determining a voicing region based on the calculated spectral auto-correlation, and extracting a pitch by using the spectral auto-correlation corresponding to the voicing region.
    Type: Grant
    Filed: November 27, 2006
    Date of Patent: November 20, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kwang Cheol Oh, Jae-Hoon Jeong
  • Patent number: 8315860
    Abstract: Encoding a sequence of digital speech samples into a bit stream includes dividing the digital speech samples into one or more frames and computing a set of model parameters for the frames. The set of model parameters includes at least a first parameter conveying pitch information. The voicing state of a frame is determined and the first parameter conveying pitch information is modified to designate the determined voicing state of the frame, if the determined voicing state of the frame is equal to one of a set of reserved voicing states. The model parameters are quantized to generate quantizer bits which are used to produce the bit stream.
    Type: Grant
    Filed: June 27, 2011
    Date of Patent: November 20, 2012
    Assignee: Digital Voice Systems, Inc.
    Inventor: John C. Hardwick
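
One way to picture the reserved-voicing-state idea above is to set aside a few codes of the quantized pitch field for voicing states. The field width, the reserved codes, and the state names below are purely hypothetical and are not taken from the patent or from any vocoder standard.

```python
# Hypothetical 7-bit pitch field: indices 0..125 carry pitch,
# 126 and 127 are reserved to signal a voicing state instead.
MAX_PITCH_INDEX = 125
RESERVED_STATES = {"silence": 126, "unvoiced": 127}


def encode_pitch_field(pitch_index, voicing_state):
    """Return the field value to quantize into the bit stream."""
    if voicing_state in RESERVED_STATES:
        # The pitch parameter is overridden to designate the voicing state.
        return RESERVED_STATES[voicing_state]
    if not 0 <= pitch_index <= MAX_PITCH_INDEX:
        raise ValueError("pitch index out of range")
    return pitch_index


def decode_pitch_field(field):
    """Return (pitch_index, reserved_state); exactly one of them is None."""
    for state, code in RESERVED_STATES.items():
        if field == code:
            return None, state
    return field, None
```

The design point is that frames in a reserved state have no meaningful pitch to transmit, so reusing the pitch field for them costs no additional bits.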
  • Patent number: 8311811
    Abstract: A method and an apparatus for detecting a pitch in input voice signals by using a subharmonic-to-harmonic ratio (SHR). The pitch detection method includes performing a Fourier transform on the input voice signals after performing a pre-processing on the input voice signals, performing an interpolation on the transformed voice signals, calculating a normalized local center of gravity (NLCG) on a spectrum of the interpolated voice signals, calculating a cumulated sum of the calculated NLCG, calculating an SHR from the spectrum based on the calculated cumulated sum, and extracting the pitch based on the calculated SHR.
    Type: Grant
    Filed: November 27, 2006
    Date of Patent: November 13, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kwang Cheol Oh, Jae-Hoo Jeong
  • Patent number: 8311842
    Abstract: A method and apparatus for expanding a bandwidth of an input narrowband voice signal is provided. The narrowband voice signal is analyzed separately for each frame, and a Degree of Voicing (DV) and a Degree of Stationary (DS) are calculated depending on the analysis. A Degree of Difficulty of Bandwidth Expansion (DDBWE) of the narrowband voice signal is calculated based on DV and DS. Bandwidth expansion is controlled according to DDBWE.
    Type: Grant
    Filed: March 3, 2008
    Date of Patent: November 13, 2012
    Assignee: Samsung Electronics Co., Ltd
    Inventors: Geun-Bae Song, Min-Sung Kim, Hee-Jin Oh, Austin Kim, Jae-Bum Kim
  • Patent number: 8306134
    Abstract: A high speed receiver is provided using two parallel processing paths to enable rapid variable gain control. The parallel processing paths include a first processing path using a high resolution Discrete Fourier Transform (DFT), and a second processing path using a reduced DFT requiring fewer samples than the high resolution DFT. An initial sample of the data is processed using the second processing path with the reduced DFT by comparing a Fourier transform of the initial sample with predetermined threshold values. As a result of the comparison of the Fourier transform of the initial sample with the predetermined threshold values, a gain determination block determines whether a requirement exists for gain ranging. If gain ranging is needed, the gain of the data signal is adjusted and the gain ranging process repeats.
    Type: Grant
    Filed: July 17, 2009
    Date of Patent: November 6, 2012
    Assignee: Anritsu Company
    Inventors: Jon S. Martens, Helen Chau, David A. Rangel-Guzman, Peter A. Kapetanic, Dan Levassuer
  • Patent number: 8296134
    Abstract: A spectrum modifying method and the like wherein the efficiencies of the signal estimation and prediction can be improved and the spectrum can be more efficiently encoded. According to this method, the pitch period is calculated from an original signal, which serves as a reference signal, and then a basic pitch frequency (f0) is calculated. Thereafter, the spectrum of a target signal, which is a target of spectrum modification, is divided into a plurality of partitions. It is specified here that the width of each partition be the basic pitch frequency. Then, the spectra of bands are interleaved such that a plurality of peaks having similar amplitudes are unified into a group. The basic pitch frequency is used as an interleave pitch.
    Type: Grant
    Filed: May 11, 2006
    Date of Patent: October 23, 2012
    Assignee: Panasonic Corporation
    Inventors: Chun Woei Teo, Sua Hong Neo, Koji Yoshida, Michiyo Goto
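
The interleaving described above can be illustrated with a short sketch. It assumes the basic pitch frequency has already been converted to a whole number of spectral bins (`pitch_bins`) and that bins at the same offset within each partition have similar amplitudes. The function names are hypothetical and the details of the patented method may differ.

```python
def interleave(spectrum, pitch_bins):
    """Group bins that share the same offset within each pitch-wide partition."""
    return [spectrum[k]
            for offset in range(pitch_bins)
            for k in range(offset, len(spectrum), pitch_bins)]


def deinterleave(interleaved, pitch_bins):
    """Invert interleave() to restore the original bin order."""
    n = len(interleaved)
    order = [k
             for offset in range(pitch_bins)
             for k in range(offset, n, pitch_bins)]
    restored = [None] * n
    for position, k in enumerate(order):
        restored[k] = interleaved[position]
    return restored
```

For a harmonic spectrum such as `[9, 1, 0, 8, 1, 0, 7, 1, 0]` with a pitch of 3 bins, the harmonic peaks 9, 8, and 7 end up adjacent, which is the grouping effect the abstract describes.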
  • Patent number: 8280724
    Abstract: A method for processing a speech signal includes dividing the speech signal into a succession of frames, identifying one or more of the frames as click frames, and extracting phase information from the click frames. The speech signal is encoded using the phase information. Methods are also provided for modeling phase spectra of voiced frames and click frames.
    Type: Grant
    Filed: January 31, 2005
    Date of Patent: October 2, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin
  • Patent number: 8280730
    Abstract: A method (400, 600, 700) and apparatus (220) for enhancing the intelligibility of speech emitted into a noisy environment. After filtering (408) ambient noise with a filter (304) that simulates the physical blocking of noise by at least a part of a voice communication device (102), a frequency dependent SNR of received voice audio relative to ambient noise is computed (424) on a perceptual (e.g. Bark) frequency scale. Formants are identified (426, 600, 700) and the SNR in bands including certain formants are modified (508, 510) with formant enhancement gain factors in order to improve intelligibility. A set of high pass filter gains (338) is combined (516) with the formant enhancement gain factors yielding combined gains which are clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532) and used to reconstruct (532, 534) an audio signal.
    Type: Grant
    Filed: May 25, 2005
    Date of Patent: October 2, 2012
    Assignee: Motorola Mobility LLC
    Inventors: Jianming J. Song, John C. Johnson
  • Patent number: 8280732
    Abstract: Hand gestures are translated by first detecting the hand gestures with an electronic sensor and converting the detected gestures into respective electrical transfer signals in a frequency band corresponding to that of speech. These transfer signals are inputted in the audible-sound frequency band into a speech-recognition system where they are analyzed.
    Type: Grant
    Filed: March 26, 2009
    Date of Patent: October 2, 2012
    Inventors: Wolfgang Richter, Roland Aubauer
  • Patent number: 8275611
    Abstract: An apparatus for adaptively suppressing noise in an input signal frequency spectrum derived from overlapping input frames is provided. The system includes a psychoacoustic power computation module configured to compute a noisy signal power in psychoacoustic bands, a voice activity scoring module configured to compute a probabilistic score for a presence of a speech, and a noise estimation module configured to estimate a noise power in the psychoacoustic bands based on information of past frames, the probabilistic score, and the computed noisy signal power. The system also includes a gain computation module configured to compute a gain for each frequency, based on a probabilistic heuristic, the probabilistic score and the information on the past frames, and a gain post-processing module configured to perform a gain time smoothing, a gain frequency smoothing, and a gain regulation for the computed gain.
    Type: Grant
    Filed: January 18, 2008
    Date of Patent: September 25, 2012
    Assignee: STMicroelectronics Asia Pacific Pte., Ltd.
    Inventors: Wenbo Zong, Yuan Wu, Sapna George
  • Publication number: 20120239389
    Abstract: Disclosed is an audio signal processing method comprising the steps of: receiving an audio signal containing current frame data; generating a first temporary output signal for the current frame when an error occurs in the current frame data, by carrying out frame error concealment with respect to the current frame data using a random codebook; generating a parameter by carrying out one or more of short-term prediction, long-term prediction and a fixed codebook search based on the first temporary output signal; and memory updating the parameter for the next frame; wherein the parameter comprises one or more of pitch gain, pitch delay, fixed codebook gain and a fixed codebook.
    Type: Application
    Filed: November 24, 2010
    Publication date: September 20, 2012
    Applicant: LG ELECTRONICS INC.
    Inventors: Hye Jeong Jeon, Dae Hwan Kim, Hong Goo Kang, Min Ki Lee, Byung Suk Lee, Gyu Hyeok Jeong
  • Patent number: 8271291
    Abstract: A method for identifying a frame type is disclosed. The method includes receiving current frame type information, obtaining previously received previous frame type information, generating frame identification information of a current frame using the current frame type information and the previous frame type information, and identifying the current frame using the frame identification information. In another aspect, the method includes receiving a backward type bit corresponding to current frame type information, obtaining a forward type bit corresponding to previous frame type information, and generating frame identification information of a current frame by placing the backward type bit at a first position and placing the forward type bit at a second position.
    Type: Grant
    Filed: May 8, 2009
    Date of Patent: September 18, 2012
    Assignee: LG Electronics Inc.
    Inventors: Sang Bae Chon, Lae Hoon Kim, Koeng Mo Sung
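The bit-placement step in the abstract of patent 8271291 can be illustrated with a minimal sketch. The abstract does not say which positions are used or how the forward bit is derived, so the code assumes that the "first position" is the high bit, the "second position" is the low bit, and the forward type bit of a frame equals the type bit received for the previous frame. The function names are hypothetical.

```python
def frame_identification(backward_bit: int, forward_bit: int) -> int:
    """Pack two type bits into a 2-bit identifier.

    Assumption: the backward type bit goes in the high bit (first position)
    and the forward type bit goes in the low bit (second position).
    """
    if backward_bit not in (0, 1) or forward_bit not in (0, 1):
        raise ValueError("type bits must be 0 or 1")
    return (backward_bit << 1) | forward_bit


def identify_stream(current_type_bits):
    """Return one identifier per frame.

    Assumption: the forward bit of frame n is the type bit received for
    frame n-1, and it is 0 before the first frame.
    """
    previous = 0
    identifiers = []
    for bit in current_type_bits:
        identifiers.append(frame_identification(bit, previous))
        previous = bit
    return identifiers


if __name__ == "__main__":
    print(identify_stream([1, 0, 0, 1]))
```

With two bits, each frame falls into one of four identifiers, which is enough to distinguish a frame by both its own type and the type of the frame before it.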
  • Patent number: 8260609
    Abstract: Speech encoders and methods of speech encoding are disclosed that encode inactive frames at different rates. Apparatus and methods for processing an encoded speech signal are disclosed that calculate a decoded frame based on a description of a spectral envelope over a first frequency band and the description of a spectral envelope over a second frequency band, in which the description for the first frequency band is based on information from a corresponding encoded frame and the description for the second frequency band is based on information from at least one preceding encoded frame. Calculation of the decoded frame may also be based on a description of temporal information for the second frequency band that is based on information from at least one preceding encoded frame.
    Type: Grant
    Filed: July 30, 2007
    Date of Patent: September 4, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Vivek Rajendran, Ananthapadmanabhan A. Kandhadai
  • Patent number: 8260613
    Abstract: A double talk detector for controlling the echo path estimation in a telecommunication system by indicating when a received coded speech signal is dominated by a non-echo signal; i.e., that so-called double talk exists. This is determined by extracting LSPs from a coded speech frame of the received coded speech signal when the signal power exceeds a first threshold value, converting each of said extracted LSPs into LSFs, and calculating the distance between each two adjacent LSFs. For each distance that is smaller than a second threshold, a spectral peak is located between the two LSFs, and it is determined whether said spectral peak is an echo or not. When a predetermined number of non-echo spectral peaks are located in the received speech signal, double talk will be indicated, and the echo path estimation may be disabled.
    Type: Grant
    Filed: February 21, 2007
    Date of Patent: September 4, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (Publ)
    Inventor: Tonu Trump
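The decision chain in the abstract of patent 8260613 can be sketched roughly as follows. Several details are assumptions and not taken from the patent: the LSPs are taken to be in the cosine domain (so each LSF is the arccosine of the LSP), the spectral peak is placed at the midpoint of a close LSF pair, and the echo test is left as a caller-supplied predicate `is_echo_peak`, since the abstract does not describe it. All names are hypothetical.

```python
import math


def lsp_to_lsf(lsps):
    """Convert LSPs to LSFs in radians, assuming cosine-domain LSPs."""
    return [math.acos(q) for q in lsps]


def find_spectral_peaks(lsfs, distance_threshold):
    """Return a peak location for each adjacent LSF pair closer than the threshold.

    Assumption: the peak is placed at the midpoint of the pair.
    """
    peaks = []
    for low, high in zip(lsfs, lsfs[1:]):
        if high - low < distance_threshold:
            peaks.append((low + high) / 2)
    return peaks


def double_talk(frame_power, lsps, power_threshold, distance_threshold,
                is_echo_peak, min_non_echo_peaks):
    """Indicate double talk when enough non-echo spectral peaks are found."""
    if frame_power <= power_threshold:
        return False  # frames below the first threshold are not analysed
    lsfs = sorted(lsp_to_lsf(lsps))
    peaks = find_spectral_peaks(lsfs, distance_threshold)
    non_echo = sum(1 for peak in peaks if not is_echo_peak(peak))
    return non_echo >= min_non_echo_peaks


if __name__ == "__main__":
    lsps = [math.cos(x) for x in (0.3, 0.35, 1.0, 1.6, 1.65, 2.5)]
    # Hypothetical echo test: treat peaks below 1.0 rad as echo.
    print(double_talk(1.0, lsps, 0.5, 0.1, lambda p: p < 1.0, 1))
```

When the function returns true, a caller could pause the echo path estimation for that frame, as the abstract suggests.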
  • Patent number: 8251924
    Abstract: A method and apparatus are provided for processing a set of communicated signals associated with a set of muscles, such as the muscles near the larynx of the person, or any other muscles the person uses to achieve a desired response. The method includes the steps of attaching a single integrated sensor, for example, near the throat of the person proximate to the larynx and detecting an electrical signal through the sensor. The method further includes the steps of extracting features from the detected electrical signal and continuously transforming them into speech sounds without the need for further modulation. The method also includes comparing the extracted features to a set of prototype features and selecting a prototype feature of the set of prototype features providing a smallest relative difference.
    Type: Grant
    Filed: July 9, 2007
    Date of Patent: August 28, 2012
    Assignee: Ambient Corporation
    Inventors: Michael Callahan, Thomas Coleman
  • Patent number: 8249863
    Abstract: An apparatus and method for estimating audio signal spectrum information are provided. The method includes the steps of performing a morphological operation on a received audio signal, extracting peaks by using various peak extraction methods, extracting a remainder signal region from the extracted peaks, and selecting a high-order peaks spectrum from the extracted remainder signal region. In addition, spectral envelopes are detected by performing an interpolation operation on the high-order peaks spectrum.
    Type: Grant
    Filed: December 13, 2007
    Date of Patent: August 21, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Hyun-Soo Kim
  • Patent number: 8244525
    Abstract: Embodiments of the invention provide a method and encoder for encoding a frame in a communication system. The method includes calculating a first set of parameters associated with the frame, wherein said first set of parameters comprises filter bank parameters. The method further includes selecting, in a first stage, one of a plurality of encoding methods based on the first set of parameters, calculating a second set of parameters associated with the frame, selecting, in a second stage, one of the plurality of encoding methods based on the result of the first stage selection and the second set of parameters, and encoding the frame using the encoding excitation method selected in the second stage.
    Type: Grant
    Filed: November 22, 2004
    Date of Patent: August 14, 2012
    Assignee: Nokia Corporation
    Inventor: Jari M. Makinen
  • Patent number: 8244528
    Abstract: In accordance with an example embodiment of the invention, there is provided an apparatus for detecting voice activity in an audio signal. The apparatus comprises a first voice activity detector for making a first voice activity detection decision based at least in part on the voice activity of a first audio signal received from a first microphone. The apparatus also comprises a second voice activity detector for making a second voice activity detection decision based at least in part on an estimate of a direction of the first audio signal and an estimate of a direction of a second audio signal received from a second microphone. The apparatus further comprises a classifier for making a third voice activity detection decision based at least in part on the first and second voice activity detection decisions.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: August 14, 2012
    Assignee: Nokia Corporation
    Inventors: Riitta Elina Niemistö, Päivi Marianna Valve
  • Patent number: 8237571
    Abstract: Disclosed are an alarm method and system based on voice events, and a method for building a behavior trajectory from such events. The system comprises a signal sensor, a voice-event detector, and a notice and alarm element. In the alarm method, voice signals are captured from a remote unit in an environment. The captured voice signals are classified into at least one voice event. An emergent-event notice is automatically transmitted if one of a set of predefined emergent events is detected. In the method for building a behavior trajectory, messages on voice events are continuously recorded. When the number of recorded voice events reaches a threshold, a behavior trajectory is constructed, in which a behavior consists of a single voice event or of two or more voice events.
    Type: Grant
    Filed: February 6, 2009
    Date of Patent: August 7, 2012
    Assignee: Industrial Technology Research Institute
    Inventors: Yuh-Ching Wang, Yu-Hsien Chiu, Gwo Lang Yan
  • Patent number: 8229078
    Abstract: Mechanisms are disclosed that allow for the use of a false background, or background tone, in voice input in a telephonic call. A voice input may have both voice from a user talking and a background noise input which may be every other noise received at the user's phone. A trigger to use a background tone is received, the background tone is selected and the associated background tone file is retrieved. Then, a combiner combines the voice input with the background tone file. A filter may be used to filter the actual background noise from the voice input. Additionally, background tones may be used in various telephonic networks, including traditional land-line and wireless cellular networks.
    Type: Grant
    Filed: April 19, 2007
    Date of Patent: July 24, 2012
    Assignee: AT&T Mobility II LLC
    Inventors: Joshua Scott Wright, Paul Humphries, Jeffrey C. Mikan
  • Publication number: 20120179459
    Abstract: A method of pre-processing an audio signal transmitted to a user terminal via a communication network and an apparatus using the method are provided. The method of pre-processing the audio signal may prevent deterioration of a sound quality of the audio signal transmitted to the user terminal by pre-processing the audio signal, and by enabling a codec module, encoding the audio signal, to determine the audio signal as a speech signal. Also, the method of pre-processing the audio signal may improve the probability that the codec module determines a corresponding audio signal to be speech when the audio signal is transmitted via the communication network by pre-processing the audio signal using a speech codec.
    Type: Application
    Filed: March 21, 2012
    Publication date: July 12, 2012
    Applicant: REALNETWORKS, INC.
    Inventors: Jae Woong Jeong, Seop Hyeong Park, Jong Kyu Ryu
  • Patent number: 8219390
    Abstract: A system and method are disclosed for modifying an audio signal. A pitch associated with the audio signal is detected. A portion of the audio signal that is associated with the detected pitch is modified. Controlling the modification of a primary audio signal is disclosed. The level of a secondary audio signal is monitored. Modification of the primary audio signal is enabled if the level of the secondary audio signal rises above a first prescribed threshold at a time when the primary audio signal is not being modified. Modification of the primary audio signal is disabled if the level of the secondary audio signal drops below a second prescribed threshold at a time when the primary audio signal is being modified.
    Type: Grant
    Filed: September 16, 2003
    Date of Patent: July 10, 2012
    Assignee: Creative Technology Ltd
    Inventor: Jean Laroche
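The two-threshold control described in the abstract of patent 8219390 is a form of hysteresis, and a minimal sketch of it follows. The class name, method name, and threshold values are hypothetical, and the code covers only the enable/disable decision, not the pitch detection or the modification itself.

```python
class ModificationGate:
    """Enable/disable gate driven by the level of a secondary audio signal."""

    def __init__(self, enable_threshold: float, disable_threshold: float):
        self.enable_threshold = enable_threshold    # "first prescribed threshold"
        self.disable_threshold = disable_threshold  # "second prescribed threshold"
        self.modifying = False

    def update(self, secondary_level: float) -> bool:
        """Update the gate with the latest level and return its state."""
        if not self.modifying and secondary_level > self.enable_threshold:
            self.modifying = True
        elif self.modifying and secondary_level < self.disable_threshold:
            self.modifying = False
        return self.modifying


if __name__ == "__main__":
    gate = ModificationGate(enable_threshold=0.5, disable_threshold=0.2)
    print([gate.update(level) for level in (0.1, 0.6, 0.3, 0.1, 0.4)])
```

Setting the disable threshold below the enable threshold keeps the gate from toggling rapidly when the secondary level hovers near a single cut-off value.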
  • Patent number: 8214200
    Abstract: Methods and apparatus are disclosed for approximating an MDCT coefficient of a block of a windowed sinusoid having a defined frequency, the block being multiplied by a window sequence and having a block length and a block index. A finite trigonometric series is employed to approximate the window sequence. A window summation table is pre-computed using the finite trigonometric series and the defined frequency of the sinusoid. A block phase is computed for each block with the defined frequency, the block length and the block index. An MDCT coefficient is approximated by the dot product of a phase vector computed using the block phase with a corresponding row of the window summation table.
    Type: Grant
    Filed: March 14, 2007
    Date of Patent: July 3, 2012
    Assignee: XFRM, Inc.
    Inventors: Richard C. Cabot, Matthew S. Ashman
  • Patent number: 8214201
    Abstract: A method of refining a pitch period estimation of a signal, the method comprising: for each of a plurality of portions of the signal, scanning over a predefined range of time offsets to find an estimate of the pitch period of the portion within the predefined range of time offsets; identifying the average pitch period of the estimated pitch periods of the portions; determining a refined range of time offsets in dependence on the average pitch period, the refined range of time offsets being narrower than the predefined range of time offsets; and for a subsequent portion of the signal, scanning over the refined range of time offsets to find an estimate of the pitch period of the subsequent portion.
    Type: Grant
    Filed: November 19, 2008
    Date of Patent: July 3, 2012
    Assignee: Cambridge Silicon Radio Limited
    Inventor: Xuejing Sun
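The range-narrowing idea in the abstract of patent 8214201 can be sketched as follows. The abstract does not name the per-portion estimator or say how the refined range is derived from the average, so this sketch assumes a normalized autocorrelation search and a refined range of the average plus or minus a fixed `margin`. Function names and parameter values are hypothetical.

```python
import math


def estimate_period(portion, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest normalized autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(portion) - lag
        if n <= 0:
            break
        num = sum(portion[i] * portion[i + lag] for i in range(n))
        e1 = sum(portion[i] ** 2 for i in range(n))
        e2 = sum(portion[i + lag] ** 2 for i in range(n))
        den = math.sqrt(e1 * e2)
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def refine_range(period_estimates, margin, min_lag, max_lag):
    """Narrow the search range around the average estimate (assumed rule)."""
    average = round(sum(period_estimates) / len(period_estimates))
    return max(min_lag, average - margin), min(max_lag, average + margin)


if __name__ == "__main__":
    signal = [math.sin(2 * math.pi * i / 40) for i in range(800)]
    portions = [signal[s:s + 200] for s in range(0, 800, 200)]
    estimates = [estimate_period(p, 20, 70) for p in portions[:3]]
    low, high = refine_range(estimates, margin=5, min_lag=20, max_lag=70)
    print(estimates, (low, high), estimate_period(portions[3], low, high))
```

Scanning the narrower range for later portions costs fewer correlation evaluations and reduces the chance of locking onto a multiple of the true period.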
  • Publication number: 20120158401
    Abstract: In one embodiment, a music detection (MD) module accumulates sets of one or more frames and performs FFT processing on each set to recover a set of coefficients, each corresponding to a different frequency k. For each frame, the module identifies candidate musical tones by searching for peak values in the set of coefficients. If a coefficient corresponds to a peak, then a variable TONE[k] corresponding to the coefficient is set equal to one. Otherwise, the variable is set equal to zero. For each variable TONE[k] having a value of one, a corresponding accumulator A[k] is increased. Candidate musical tones that are short in duration are filtered out by comparing each accumulator A[k] to a minimum duration threshold. A determination is made as to whether or not music is present based on a number of candidate musical tones and a sum of candidate musical tone durations using a state machine.
    Type: Application
    Filed: August 9, 2011
    Publication date: June 21, 2012
    Applicant: LSI Corporation
    Inventors: Ivan Leonidovich Mazurenko, Dmitry Nikolaevich Babin, Alexander Markovic, Denis Vladimirovich Parkhomenko, Alexander Alexandrovich Petyushko
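The tone-tracking logic in the abstract of publication 20120158401 can be sketched as follows. The sketch starts from magnitude spectra that are already computed, so the FFT step is omitted. Three details are assumptions, since the abstract does not specify them: a peak is a local maximum above a fixed `floor`, an accumulator A[k] is reset to zero when TONE[k] is zero, and the final decision uses two simple thresholds in place of the state machine. All names are hypothetical.

```python
def find_peaks(magnitudes, floor):
    """Set TONE[k] = 1 where bin k is a local maximum above `floor` (assumed peak rule)."""
    tone = [0] * len(magnitudes)
    for k in range(1, len(magnitudes) - 1):
        if (magnitudes[k] > floor
                and magnitudes[k] > magnitudes[k - 1]
                and magnitudes[k] > magnitudes[k + 1]):
            tone[k] = 1
    return tone


def update_accumulators(acc, tone):
    """Increase A[k] where TONE[k] == 1; reset it otherwise (assumption)."""
    return [a + 1 if t else 0 for a, t in zip(acc, tone)]


def music_present(acc, min_duration, min_tones, min_total_duration):
    """Threshold-based stand-in for the state machine mentioned in the abstract."""
    durations = [a for a in acc if a >= min_duration]  # drop short-lived tones
    return len(durations) >= min_tones and sum(durations) >= min_total_duration


if __name__ == "__main__":
    frames = [[0, 1, 5, 1, 0, 1, 6, 1]] * 4  # two steady tones over four frames
    acc = [0] * 8
    for magnitudes in frames:
        acc = update_accumulators(acc, find_peaks(magnitudes, floor=2))
    print(acc, music_present(acc, min_duration=3, min_tones=2, min_total_duration=8))
```

Tones that persist across frames build up large accumulator values, while short transients never reach the minimum duration and are ignored.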