Voiced Or Unvoiced Patents (Class 704/208)
  • Patent number: 8204743
    Abstract: An apparatus and method for concealing frame erasure and a voice decoding apparatus and method using the same. The frame erasure concealment apparatus includes: a parameter extraction unit determining whether there is an erased frame in a voice packet, and extracting an excitement signal parameter and a line spectrum pair parameter of a previous good frame; and an erasure frame concealment unit, if there is an erased frame, restoring the excitement signal and line spectrum pair parameter of the erased frame by using a regression analysis from the excitement signal and line spectrum pair parameter of the previous good frame. According to the method and apparatus, by predicting and restoring the parameter of the erased frame through the regression analysis, the quality of the restored voice signal can be enhanced and the algorithm can be simplified.
    Type: Grant
    Filed: May 4, 2006
    Date of Patent: June 19, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hosang Sung, Kangeun Lee, Seungho Choi
  • Patent number: 8200497
    Abstract: Synthesizing a set of digital speech samples corresponding to a selected voicing state includes dividing speech model parameters into frames, with a frame of speech model parameters including pitch information, voicing information determining the voicing state in one or more frequency regions, and spectral information. First and second digital filters are computed using, respectively, first and second frames of speech model parameters, with the frequency responses of the digital filters corresponding to the spectral information in frequency regions for which the voicing state equals the selected voicing state. A set of pulse locations are determined, and sets of first and second signal samples are produced using the pulse locations and, respectively, the first and second digital filters. Finally, the sets of first and second signal samples are combined to produce a set of digital speech samples corresponding to the selected voicing state.
    Type: Grant
    Filed: August 21, 2009
    Date of Patent: June 12, 2012
    Assignee: Digital Voice Systems, Inc.
    Inventor: John C. Hardwick
  • Patent number: 8195449
    Abstract: A non-intrusive signal quality assessment apparatus includes a feature vector calculator that determines parameters representing frames of a signal and extracts a collection of per-frame feature vectors (?;(n)) representing structural information of the signal from the parameters. A frame selector preferably selects only frames (?\with a feature vector (?;(n)) lying within a predetermined multi-dimensional window (?). Means determine a global feature set (?) over the collection of feature vectors (?;(n)) from statistical moments of selected feature vector components ((1^,02, . . . O11). A quality predictor predicts a signal quality measure (Qj from the global feature set (?)).
    Type: Grant
    Filed: January 30, 2007
    Date of Patent: June 5, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (Publ)
    Inventors: Stefan Bruhn, Volodya Grancharov, Willem Bastiaan Kleijn
  • Patent number: 8195451
    Abstract: In an information detecting apparatus (1), a speech kind discrimination unit (11) discriminates and classifies an audio signal at an information source into kind (category) such as music or speech, etc. on a predetermined time basis, and a memory unit/recording medium (13) records discrimination information thereof. A discrimination frequency calculating unit (15) calculates, on a predetermined time basis, discrimination frequency every kind at a predetermined time period longer than the time unit.
    Type: Grant
    Filed: February 10, 2004
    Date of Patent: June 5, 2012
    Assignee: Sony Corporation
    Inventor: Yasuhiro Toguri
  • Patent number: 8175868
    Abstract: A voice judging system including feature value extraction means that analyzes a sound signal input from a sound signal input device, and extracts a time series of the feature values, sub-word boundary score calculating means that calculates a time series of sub-word boundary scores, by having reference to sound models of voice stored in a voice model storage unit, temporal regularity analyzing means that analyzes temporal regularity of the sub-word boundary scores, and voice judgment means judges whether the input sound signal is voice or non-voice using of the temporal regularity of the sub-word boundary scores.
    Type: Grant
    Filed: October 10, 2006
    Date of Patent: May 8, 2012
    Assignee: NEC Corporation
    Inventor: Makoto Terao
  • Patent number: 8175869
    Abstract: A method, apparatus, and medium for classifying a speech signal and a method, apparatus, and medium for encoding the speech signal using the same are provided. The method for classifying a speech signal includes calculating classification parameters from an input signal having block units, calculating a plurality of classification criteria from the classification parameters, and classifying the level of the input signal using the plurality of classification criteria. The classification parameters include at least one of an energy parameter of the input signal, a cross-correlation parameter between a specific block of a present frame and the input signal, and an integrated cross-correlation parameter obtained by accumulating the cross-correlation parameter.
    Type: Grant
    Filed: July 5, 2006
    Date of Patent: May 8, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hosang Sung, Rakesh Taori, Kangeun Lee
  • Patent number: 8155953
    Abstract: A method and an apparatus are provided for discriminating between a voice region and a non-voice region in an environment in which diverse types of noises and voices exist.
    Type: Grant
    Filed: January 12, 2006
    Date of Patent: April 10, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ki-young Park, Chang-kyu Choi
  • Patent number: 8145476
    Abstract: A disclosed received voice playback apparatus includes a characteristic acquiring unit configured to acquire first frequency characteristic values obtained by resolving digital vocal signals that are based on received vocal signals into predetermined frequency bands, wherein each first frequency characteristic value corresponds to one of the predetermined frequency bands; a setting unit configured to obtain second frequency characteristic values, wherein each second frequency characteristic value is set for one of the predetermined frequency bands; a computing unit configured to compute a gain for each of the predetermined frequency bands based on a difference between the first frequency characteristic value and the second frequency characteristic value; and a characteristic changing unit configured to change the first frequency characteristic values of the digital vocal signals by multiplying the digital vocal signals by each of the gains corresponding to one of the predetermined frequency bands of the digit
    Type: Grant
    Filed: January 10, 2008
    Date of Patent: March 27, 2012
    Assignee: Ricoh Company, Ltd.
    Inventor: Yukihiro Imai
  • Patent number: 8135586
    Abstract: Disclosed is a method and an apparatus for estimating noise included in a sound signal during sound signal processing. The method includes estimating harmonics components in a frame of an input sound signal; using the estimated harmonics components, computing a Voice Presence Probability (VPP) on the frame of the input sound signal; determining a weight of an equation necessary to estimate a noise spectrum, depending on the computed VPP; and using the determined weight and the equation necessary to estimate a noise spectrum, estimating the noise spectrum, and updating the noise spectrum.
    Type: Grant
    Filed: March 21, 2008
    Date of Patent: March 13, 2012
    Assignees: Samsung Electronics Co., Ltd, Korea University Industrial & Academic Collaboration Foundation
    Inventors: Hyun-Soo Kim, Hanseok Ko, Sung-Joo Ahn, Jounghoon Beh, Hyun-Jin Yoon
  • Patent number: 8126705
    Abstract: A system and method for automatically adjusting floor controls for a conversation is provided. Audio streams are received, which each originate from an audio source. Floor controls for a current configuration including at least a portion of the audio streams are maintained. Conversational characteristics shared by two or more of the audio sources are determined. Possible configurations for the audio streams are identified based on the conversational characteristics. An analysis of the current configuration and the possible configurations is performed. A change threshold is applied to the analysis. When the analysis satisfies the change threshold, the floor controls are automatically adjusted. The audio streams are mixed into one or more outputs based on the adjusted floor controls.
    Type: Grant
    Filed: November 9, 2009
    Date of Patent: February 28, 2012
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Paul Masami Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison Gyle Woodruff
  • Patent number: 8126706
    Abstract: An audio signal is divided among exponentially related subband filters. The spectral flatness measure in each subband signal is determined and the measures are weighted and combined. The sum is compared with a threshold to determine the presence of music or noise. If music is detected, the noise estimation process in the noise reduction circuitry is turned off to avoid distorting the signal. If music is detected, residual echo suppression circuitry is also turned off to avoid inserting comfort noise.
    Type: Grant
    Filed: December 9, 2005
    Date of Patent: February 28, 2012
    Assignee: Acoustic Technologies, Inc.
    Inventor: Samuel Ponvarma Ebenezer
  • Patent number: 8121833
    Abstract: The exemplary embodiments of the invention provide at least a method and an apparatus to perform operations including dividing a sound signal into a series of successive frames, dividing each frame into a number of subframes, producing a residual signal by filtering the sound signal through a linear prediction analysis filter, locating a last pitch pulse of the sound signal of a previous frame from the residual signal, extracting a pitch pulse prototype of given length around a position of the last pitch pulse of the previous frame using the residual signal, and locating pitch pulses in a current frame using the pitch pulse prototype.
    Type: Grant
    Filed: October 21, 2008
    Date of Patent: February 21, 2012
    Assignee: Nokia Corporation
    Inventors: Mikko Tammi, Milan Jelinek, Claude LaFlamme, Vesa Ruoppila
  • Publication number: 20120022859
    Abstract: An automatic marking method for Karaoke vocal accompaniment is provided. In the method, pitch, beat position and volume of a singer are compared with the original pitch, beat position and volume of the theme of a song to generate a score of pitch, a score of beat and a score of emotion respectively, so as to obtain a weighted total score in a weighted marking method. By using the method, the pitch, beat position and volume error of each section of the song sung by the singer can be exactly worked out, and a pitch curve and a volume curve can be displayed, so that the singer can learn which part is sung incorrectly and which part needs to be enhanced. The present invention also has the advantages of dual effects of teaching and entertainment, high practicability and technical advancement.
    Type: Application
    Filed: April 7, 2009
    Publication date: January 26, 2012
    Inventor: Wen-Hsin Lin
  • Patent number: 8090577
    Abstract: Methods and apparatus are presented for determining the type of acoustic signal and the type of frequency spectrum exhibited by the acoustic signal in order to selectively delete parameter information before vector quantization. The bits that would otherwise be allocated to the deleted parameters can then be re-allocated to the quantization of the remaining parameters, which results in an improvement of the perceptual quality of the synthesized acoustic signal. Alternatively, the bits that would have been allocated to the deleted parameters are dropped, resulting in an overall bit-rate reduction.
    Type: Grant
    Filed: August 8, 2002
    Date of Patent: January 3, 2012
    Assignee: QUALCOMM Incorported
    Inventors: Khaled Helmi El-Maleh, Ananthapadmanabhan Arasanipalai Kandhadai, Sharath Manjunath
  • Patent number: 8078455
    Abstract: An apparatus, method, and medium for distinguishing a vocal sound. The apparatus includes: a framing unit dividing an input signal into frames, each frame having a predetermined length; a pitch extracting unit determining whether each frame is a voiced frame or an unvoiced frame and extracting a pitch contour from the voiced and unvoiced frames; a zero-cross rate calculator respectively calculating a zero-cross rate for each frame; a parameter calculator calculating parameters including a time length ratio of the voiced frame and the unvoiced frame determined by the pitch extracting unit, statistical information of the pitch contour, and spectral characteristics; and a classifier inputting the zero-cross rates and the parameters output from the parameter calculator and determining whether the input signal is a vocal sound.
    Type: Grant
    Filed: February 7, 2005
    Date of Patent: December 13, 2011
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yuan Yuan Shi, Yongbeom Lee, Jaewon Lee
  • Patent number: 8069034
    Abstract: A method for supporting an encoding of an audio signal is shown, wherein at least a first and a second coder mode are available for encoding a section of the audio signal. The first coder mode enables a coding based on two different coding models. A selection of a coding model is enabled by a selection rule which is based on signal characteristics which have been determined for a certain analysis window. In order to avoid a misclassification of a section after a switch to the first coder mode, it is proposed that the selection rule is activated only when sufficient sections for the analysis window have been received. The invention relates equally to a module in which this method is implemented, to a device and a system comprising such a module and to a software program product including a software code for realizing the proposed method.
    Type: Grant
    Filed: May 6, 2005
    Date of Patent: November 29, 2011
    Assignee: Nokia Corporation
    Inventors: Jari Mäkinen, Ari Lakaniemi, Pasi Ojala
  • Patent number: 8069039
    Abstract: In a sound signal processing apparatus, a frame information generation section generates frame information of each frame of a sound signal. A storage stores the frame information generated by the frame information generation section. A first interval determination section determines a first utterance interval in the sound signal. A second interval determination section determines a second utterance interval based on the frame information of the first utterance interval stored in the storage such that the second utterance interval is made shorter than the first utterance interval and confined within the first utterance interval by trimming frames from either of a start point or an end point of the first utterance interval.
    Type: Grant
    Filed: December 21, 2007
    Date of Patent: November 29, 2011
    Assignee: Yamaha Corporation
    Inventor: Yasuo Yoshioka
  • Patent number: 8063809
    Abstract: A transient signal encoding method and device, decoding method and device, and processing system, where the transient signal encoding method includes: obtaining a reference sub-frame where a maximal time envelope having a maximal amplitude value is located from time envelopes of all sub-frames of an input transient signal; adjusting an amplitude value of the time envelope of each sub-frame before the reference sub-frame in such a way that a first difference is greater than a preset first threshold, in which the first difference is a difference between the amplitude value of the time envelope of each sub-frame before the reference sub-frame and the amplitude value of the maximal time envelope; and writing the adjusted time envelope into bitstream.
    Type: Grant
    Filed: June 29, 2011
    Date of Patent: November 22, 2011
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Zexin Liu, Longyin Chen, Lei Miao, Chen Hu, Wei Xiao, Herve Marcel Taddei, Qing Zhang
  • Publication number: 20110282658
    Abstract: The present invention relates to co-channel audio source separation. In one embodiment a first frequency-related representation of plural regions of the acoustic signal is prepared over time, and a two-dimensional transform of plural two-dimensional localized regions of the first frequency-related representation, each less than an entire frequency range of the first frequency related representation, is obtained to provide a two-dimensional compressed frequency-related representation with respect to each two dimensional localized region. For each of the plural regions, at least one pitch is identified. The pitch from the plural regions is processed to provide multiple pitch estimates over time. In another embodiment, a mixed acoustic signal is processed by localizing multiple time-frequency regions of a spectrogram of the mixed acoustic signal to obtain one or more acoustic properties.
    Type: Application
    Filed: September 3, 2010
    Publication date: November 17, 2011
    Applicant: Massachusetts Institute of Technology
    Inventors: Tianyu Wang, Thomas R. Quatieri, JR.
  • Patent number: 8050254
    Abstract: A media over packet networking appliance provides a network interface, a voice transducer, and at least one integrated circuit assembly coupling the voice transducer to the network interface. The at least one integrated circuit assembly provides media over packet transmissions and holds bits defining reconstruction of a packet stream having a primary stage and a secondary stage. The secondary stage has one or more of linear predictive coding parameters, long term prediction lags, parity check, and adaptive and fixed codebook gains. The packet stream has an instance of single packet loss, and the reconstruction includes receiving a packet sequence represented by P(n)P(n?1)?, [Lost Packet], P(n+2)P(n+1)?, and P(n+3)P(n+2)?, obtaining as information from the secondary stage one or more of the linear predictive coding parameters, long term prediction lags, parity check, and adaptive and fixed codebook gains, and performing an excitation reconstruction utilizing said packet sequence thus received.
    Type: Grant
    Filed: September 16, 2010
    Date of Patent: November 1, 2011
    Assignee: Texas Instruments Incorporated
    Inventors: Krishnasamy Anandakumar, Vishu R. Viswanathan, Alan V. McCree
  • Patent number: 8050911
    Abstract: A system, method, apparatus, signal-bearing medium, and means for transmitting speech activity in a distributed voice recognition (VR) system. The distributed voice recognition system includes a local VR engine in a subscriber unit (102) and a server VR engine on a server (160). The local VR engine comprises a voice activity detection (VAD) module (106) that detects voice activity within a speech signal, and comprises an advanced feature extraction (AFE) module (104) that extracts features from a speech signal. The detected voice activity information is transmitted over a first wireless communication channel to the server (160). The feature extraction information is transmitted over a second wireless communication channel, separate from the first wireless communication channel, to the server (160). The server (160) processes the received information to determine a linguistic estimate of the electrical speech signal, and transmits the linguistic estimate to the subscriber unit (102).
    Type: Grant
    Filed: March 1, 2007
    Date of Patent: November 1, 2011
    Assignee: QUALCOMM Incorporated
    Inventor: Harinath Garudadri
  • Patent number: 8050922
    Abstract: Voice recognition methods and systems are disclosed. A voice signal is obtained for an utterance of a speaker. The speaker is categorized as a male, female, or child and the categorization is used as a basis for dynamically adjusting a maximum frequency fmax and a minimum frequency fmin of a filter bank used for processing the input utterance to produce an output. Corresponding gender or age specific acoustic models are used to perform voice recognition based on the filter bank output.
    Type: Grant
    Filed: July 21, 2010
    Date of Patent: November 1, 2011
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Ruxin Chen
  • Publication number: 20110264447
    Abstract: Implementations and applications are disclosed for detection of a transition in a voice activity state of an audio signal, based on a change in energy that is consistent in time across a range of frequencies of the signal.
    Type: Application
    Filed: April 22, 2011
    Publication date: October 27, 2011
    Applicant: QUALCOMM Incorporated
    Inventors: Erik Visser, Ian Ernan Liu, Jongwon Shin
  • Publication number: 20110257965
    Abstract: Encoding a sequence of digital speech samples into a bit stream includes dividing the digital speech samples into one or more frames and computing a set of model parameters for the frames. The set of model parameters includes at least a first parameter conveying pitch information. The voicing state of a frame is determined and the first parameter conveying pitch information is modified to designate the determined voicing state of the frame, if the determined voicing state of the frame is equal to one of a set of reserved voicing states. The model parameters are quantized to generate quantizer bits which are used to produce the bit stream.
    Type: Application
    Filed: June 27, 2011
    Publication date: October 20, 2011
    Applicant: DIGITAL VOICE SYSTEMS, INC.
    Inventor: John C. Hardwick
  • Patent number: 8036884
    Abstract: The present invention provides a method, a computer-software-product and an apparatus for enabling a determination of speech related audio data within a record of digital audio data. The method comprises steps for extracting audio features from the record of digital audio data, for classifying one or more subsections of the record of digital audio data, and for marking at least a part of the record of digital audio data classified as speech. The classification of the digital audio data record is performed on the basis of the extracted audio features and with respect to at least one predetermined audio class.
    Type: Grant
    Filed: February 24, 2005
    Date of Patent: October 11, 2011
    Assignee: Sony Deutschland GmbH
    Inventors: Yin Hay Lam, Josep Maria Sola I Caros
  • Patent number: 8019386
    Abstract: A method and system for enhancing speech intelligibility using wireless communication in portable, battery-powered and entirely user-supportable devices. The devices may be talker devices and receiver devices, where the audio signals input into the talker devices may be transmitted to the receiver devices to provide better quality audio to person using the receiver devices. The receiver devices may initiate and terminate communications with the talker devices. Additionally, the receiver devices may indicate to the talker devices the gain level the talker devices need to apply to the audio signals before sending them to the receiver devices.
    Type: Grant
    Filed: March 7, 2005
    Date of Patent: September 13, 2011
    Assignee: Etymotic Research, Inc.
    Inventors: William F. Dunn, Mead C. Killion, Andrew J. Haapapuro, Viorel Drambarean
  • Publication number: 20110218801
    Abstract: The invention relates to a method for outputting a speech signal. Speech signal frames are received and are used in a predetermined sequence in order to produce a speech signal to be output. If one speech signal frame to be received is not received, then a substitute speech signal frame is used in its place, which is produced as a function of a previously received speech signal frame. According to the invention, in the situation in which the previously received speech signal frame has a voiceless speech signal, the substitute speech signal frame is produced by means of a noise signal.
    Type: Application
    Filed: September 28, 2009
    Publication date: September 8, 2011
    Applicant: ROBERT BOSCH GMBH
    Inventors: Peter Vary, Frank Mertz
  • Patent number: 8015000
    Abstract: An audio decoding system performs frame loss concealment (FLC) when portions of a bit stream representing an audio signal are lost within the context of a digital communication system. The audio decoding system employs two different FLC methods: one designed to perform well for music, and the other designed to perform well for speech. When a frame is deemed lost, the audio decoding system analyzes a previously-decoded audio signal corresponding to previously-decoded frames of an audio bit-stream. Based on the results of the analysis, the lost frame is classified as either speech or music. Using this classification, other signal analysis, and knowledge of the employed FLC methods, the audio decoding system selects the appropriate FLC method which then performs FLC on the lost frame.
    Type: Grant
    Filed: April 13, 2007
    Date of Patent: September 6, 2011
    Assignee: Broadcom Corporation
    Inventors: Robert W. Zopf, Juin-Hwey Chen, Jes Thyssen
  • Patent number: 8010370
    Abstract: Techniques for generating a target digital media item based on a source digital media item are described. A digital media item may be a song, a video clip, an album, or any length of audio or video. When adjusting the bit count for a portion of the target digital media item, instead of using the same set of parameter values used in a perceptual model for each portion of the source media item, the set of parameter values may be modified to encode the portion of the source digital media item. In this way, how audio or video is perceived is taken into account when adjusting a proposed bit count for a given portion of the target digital media item. Thus, while maintaining the same statistical bitrate as before increased digital media quality is achieved.
    Type: Grant
    Filed: July 28, 2006
    Date of Patent: August 30, 2011
    Assignee: Apple Inc.
    Inventor: Frank M. Baumgarte
  • Patent number: 8010358
    Abstract: Methods and apparatus for voice recognition are disclosed. A voice signal is obtained and two or more voice recognition analyses are performed on the voice signal. Each voice recognition analysis uses a filter bank defined by a different maximum frequency and a different minimum frequency and wherein each voice recognition analysis produces a recognition probability ri of recognition of one or more speech units, whereby there are two or more recognition probabilities ri. The maximum frequency and the minimum frequency may be adjusted every time speech is windowed and analyzed. A final recognition probability Pf is determined based on the two or more recognition probabilities ri.
    Type: Grant
    Filed: February 21, 2006
    Date of Patent: August 30, 2011
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Ruxin Chen
  • Patent number: 7996215
    Abstract: A method and an apparatus for Voice Activity Detection (VAD) and an encoder are provided. The method for VAD includes: acquiring a fluctuant feature value of a background noise when an input signal is the background noise, in which the fluctuant feature value is used to represent fluctuation of the background noise; performing adaptive adjustment on a VAD decision criterion related parameter according to the fluctuant feature value; and performing VAD decision on the input signal by using the decision criterion related parameter on which the adaptive adjustment is performed. The method, the apparatus, and the encoder can be adaptive to fluctuation of the background noise to perform VAD decision, so as to enhance the VAD decision performance, save limited channel bandwidth resources, and use the channel bandwidth efficiently.
    Type: Grant
    Filed: April 13, 2011
    Date of Patent: August 9, 2011
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Zhe Wang, Qing Zhang
  • Patent number: 7987089
    Abstract: A method for modifying a window with a frame associated with an audio signal is described. A signal is received. The signal is partitioned into a plurality of frames. A determination is made if a frame within the plurality of frames is associated with a non-speech signal. A modified discrete cosine transform (MDCT) window function is applied to the frame to generate a first zero pad region, where the region has a length of (M?L)/2, where L is an arbitrary value, and a second zero pad region if it was determined that the frame is associated with a non-speech signal. The frame is encoded. The decoder window is the same as the encoder window.
    Type: Grant
    Filed: February 14, 2007
    Date of Patent: July 26, 2011
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatesh Krishnan, Ananthapadmanabhan A. Kandhadai
  • Patent number: 7979272
    Abstract: The present invention provides a frame erasure concealment device and method that is based on reestimating gain parameters for a code excited linear prediction (CELP) coder. During operation, when a frame in a stream of received data is detected as being erased, the coding parameters, especially an adaptive codebook gain gp and a fixed codebook gain gc, of the erased and subsequent frames can be reestimated by a gain matching procedure. By using this technique with the IS-641 speech coder, it has been found that the present invention improves the speech quality under various channel conditions, compared with a conventional extrapolation-based concealment algorithm.
    Type: Grant
    Filed: October 12, 2007
    Date of Patent: July 12, 2011
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Hong-Goo Kang, Hong Kook Kim
  • Patent number: 7970606
    Abstract: Encoding a sequence of digital speech samples into a bit stream includes dividing the digital speech samples into one or more frames and computing a set of model parameters for the frames. The set of model parameters includes at least a first parameter conveying pitch information. The voicing state of a frame is determined and the first parameter conveying pitch information is modified to designate the determined voicing state of the frame, if the determined voicing state of the frame is equal to one of a set of reserved voicing states. The model parameters are quantized to generate quantizer bits which are used to produce the bit stream.
    Type: Grant
    Filed: November 13, 2002
    Date of Patent: June 28, 2011
    Assignee: Digital Voice Systems, Inc.
    Inventor: John C. Hardwick
  • Publication number: 20110153317
    Abstract: An apparatus for wireless communications includes a processing system. The processing system is configured to receive an input sound stream of a user, split the input sound stream into a plurality of frames, classify each of the frames as one selected from the group consisting of a non-speech frame and a speech frame, determine a pitch of each of the frames in a subset of the speech frames, and identify a gender of the user from the determined pitch. To determine the pitch, the processing system is configured to filter the speech frames to compute an error signal, compute an autocorrelation of the error signal, find a maximum autocorrelation value, and set the pitch to an index of the maximum autocorrelation value.
    Type: Application
    Filed: December 23, 2009
    Publication date: June 23, 2011
    Applicant: QUALCOMM INCORPORATED
    Inventors: Yinian Mao, Gene Marsh
  • Publication number: 20110153318
    Abstract: There is provided a method or a device for extending a bandwidth of a first band speech signal to generate a second band speech signal wider than the first band speech signal and including the first band speech signal. The method comprises receiving a segment of the first band speech signal having a low cut off frequency and a high cut off frequency; determining the high cut off frequency of the segment; determining whether the segment is voiced or unvoiced; if the segment is voiced, applying a first bandwidth extension function to the segment to generate a first bandwidth extension in high frequencies; if the segment is unvoiced, applying a second bandwidth extension function to the segment to generate a second bandwidth extension in the high frequencies; using the first bandwidth extension and the second bandwidth extension to extend the first band speech signal beyond the high cut off frequency.
    Type: Application
    Filed: March 15, 2010
    Publication date: June 23, 2011
    Applicant: MINDSPEED TECHNOLOGIES, INC.
    Inventors: Norbert Rossello, Fabien Klein
  • Patent number: 7966179
    Abstract: A method and apparatus for distinguishing a voice region from a non-voice region in an environment where various types of noise and voice are mixed together are provided. The method includes the steps of converting an input voice signal into a frequency domain signal by preprocessing the input voice signal, performing sigmoid compression on the converted signal, transforming a spectrum vector generated by the sigmoid compression into a voice detection parameter in scalar form, and detecting the voice region using the parameter.
    Type: Grant
    Filed: January 27, 2006
    Date of Patent: June 21, 2011
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kwang-cheol Oh, Ki-young Park
  • Patent number: 7949520
    Abstract: An enhancement system extracts pitch from a processed speech signal. The system estimates the pitch of voiced speech by deriving filter coefficients of an adaptive filter and using the obtained filter coefficients to derive pitch. The pitch estimation may be enhanced by using various techniques to condition the input speech signal, such as spectral modification of the background noise and the speech signal, and/or reduction of the tonal noise from the speech signal.
    Type: Grant
    Filed: December 9, 2005
    Date of Patent: May 24, 2011
    Assignee: QNX Software Sytems Co.
    Inventors: Rajeev Nongpiur, Phillip A. Hetherington
  • Patent number: 7949518
    Abstract: A hierarchy encoding apparatus capable of calculating appropriate delay amounts and also capable of suppressing increase in the bit rate. In this apparatus, a first layer encoding part (101) encodes the input signal of the n-th frame to produce a first layer encoded code. A first layer decoding part (102) generates a first layer decoded signal from the first layer encoded code and applies it to a delay amount calculating part (103) and a second layer encoding part (105). The delay amount calculating part (103) uses the first layer decoded signal and input signal to calculate the delay amount to be added to the input signal, and applies the calculated delay amount to a delay part (104). The delay part (104) delays the input signal by the delay amount applied from the delay amount calculating part (103) and then applied it to a second layer encoding part (105). The second layer encoding part (105) uses the first layer decoded signal and the input signal from the delay part (104) for encoding.
    Type: Grant
    Filed: April 22, 2005
    Date of Patent: May 24, 2011
    Assignee: Panasonic Corporation
    Inventor: Masahiro Oshikiri
  • Patent number: 7945294
    Abstract: The present invention provides an apparatus and method for providing hands-free operation of a device. A hands-free adapter is provided that communicates with a device and a headset. The hands-free adapter allows a user to use voice commands so that the user does not have to handle the device. The hands-free adapter receives voice commands from the headset and translates the voice commands to commands recognized by the device. The hands-free adapter also monitors the device to detect device events and provides notice of the events to the user via the headset.
    Type: Grant
    Filed: April 27, 2009
    Date of Patent: May 17, 2011
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Lan Zhang, Joseph E. Page, Jr., Barrett M. Kreiner
  • Patent number: 7941313
    Abstract: A system and method for transmitting speech activity in a distributed voice recognition system. The distributed voice recognition system includes a local VR engine in a subscriber unit and a server VR engine on a server. The local VR engine comprises a feature extraction (FE) module that extracts features from a speech signal, and a voice activity detection module (VAD) that detects voice activity within a speech signal. Indications of voice activity are transmitted ahead of features from the subscriber unit to the server.
    Type: Grant
    Filed: December 14, 2001
    Date of Patent: May 10, 2011
    Assignee: QUALCOMM Incorporated
    Inventors: Harinath Garudadri, Michael Stuart Phillips
  • Publication number: 20110099006
    Abstract: In one embodiment, during participation in an online collaborative computing session, a computer process associated with the session may monitor an audio stream of the session for a predefined action-inducing phrase. In response to the phrase, a subsequent segment of the session is recorded, such that a report may be generated containing any recorded segments of the session (e.g., and dynamically sent to participants of the session).
    Type: Application
    Filed: October 27, 2009
    Publication date: April 28, 2011
    Applicant: Cisco Technology, Inc.
    Inventors: Sujatha Sundararaman, Sundar Hariharan, Anand Hariharan, Archana Karchalli Raju
  • Patent number: 7917356
    Abstract: A VAD/SS system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: September 16, 2004
    Date of Patent: March 29, 2011
    Assignee: AT&T Corporation
    Inventors: Bing Chen, James H. James
  • Patent number: 7912708
    Abstract: The present invention relates to a method of synthesizing of a speech signal, comprising: —assigning of a first identifier to a first class of intervals of an original speech signal and assigning of a second identifier to a second class of intervals of the original speech signal, —windowing the original speech signal to provide a number of pitch bells, —processing the pitch bells having the first identifier assigned thereto for modifying a duration of the speech signal, —performing an overlap and add operation on the processed pitch bells.
    Type: Grant
    Filed: August 5, 2003
    Date of Patent: March 22, 2011
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Ercan Ferit Gigi
  • Patent number: 7912712
    Abstract: An encoding method includes extracting background noise characteristic parameters within a hangover period; for a first superframe after the hangover period, performing background noise encoding based on the extracted background noise characteristic parameters; for superframes after the first superframe, performing background noise characteristic parameter extraction and DTX decision for each frame in the superframes after the first superframe; and for the superframes after the first superframe, performing background noise encoding based on extracted background noise characteristic parameters of the current superframe, background noise characteristic parameters of a plurality of superframes previous to the current superframe, and a final DTX decision. Also, a decoding method and apparatus and an encoding apparatus are disclosed.
    Type: Grant
    Filed: September 14, 2010
    Date of Patent: March 22, 2011
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Eyal Shlomot, Libin Zhang, Jinliang Dai
  • Publication number: 20110035213
    Abstract: A device and method for estimating a tonality of a sound signal comprise: calculating a current residual spectrum of the sound signal; detecting peaks in the current residual spectrum; calculating a correlation map between the current residual spectrum and a previous residual spectrum for each detected peak; and calculating a long-term correlation map based on the calculated correlation map, the long-term correlation map being indicative of a tonality in the sound signal.
    Type: Application
    Filed: June 20, 2008
    Publication date: February 10, 2011
    Inventors: Vladimir Malenovsky, Milan Jelinek, Tommmy Vaillancourt, Redwan Salami
  • Publication number: 20110029304
    Abstract: A hybrid instantaneous/differential encoding technique is described herein that may be used to reduce the bit rate required to encode a pitch period associated with a segment of a speech signal in a manner that will result in relatively little or no degradation of a decoded speech signal generated using the encoded pitch period. The hybrid instantaneous/differential encoding technique is advantageously applicable to any speech codec that encodes a pitch period associated with a segment of a speech signal.
    Type: Application
    Filed: July 30, 2010
    Publication date: February 3, 2011
    Applicant: BROADCOM CORPORATION
    Inventors: Juin-Hwey Chen, Hong-Goo Kang
  • Patent number: 7877253
    Abstract: In one configuration, erasure of a significant frame of a sustained voiced segment is detected. An adaptive codebook gain value for the erased frame is calculated based on the preceding frame. If the calculated value is less than (alternatively, not greater than) a threshold value, a higher adaptive codebook gain value is used for the erased frame. The higher value may be derived from the calculated value or selected from among one or more predefined values.
    Type: Grant
    Filed: October 5, 2007
    Date of Patent: January 25, 2011
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatesh Krishnan, Ananthapadmanbhan A. Kandhadai
  • Patent number: 7869990
    Abstract: There is provided a pitch lag predictor for use by a speech decoder to generate a predicted pitch lag parameter. The pitch lag predictor comprises a summation calculator configured to generate a first summation based on a plurality of previous pitch lag parameters, and a second summation based on a plurality of previous pitch lag parameters and a position of each of the plurality of previous pitch lag parameters with respect to the predicted pitch lag parameter; a coefficient calculator configured to generate a first coefficient using a first equation based on the first summation and the second summation, and a second coefficient using a second equation based on the first summation and the second summation, wherein the first equation is different than the second equation; and a predictor configured to generate the predicted pitch lag parameter based on the first coefficient and the second coefficient.
    Type: Grant
    Filed: October 8, 2008
    Date of Patent: January 11, 2011
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
  • Patent number: 7853447
    Abstract: A method for varying speech speed is provided. The method includes the following steps: receive an original speech signal; calculate a pitch period of the original speech signal; define search ranges according to the pitch period; find a maximum within each of the search ranges of the original speech signal; divide the original speech signal into speech sections according to the maxima; obtain a speed-varied speech signal by applying a speed-varying algorithm to each speech section of the original speed signal according to a speed-varying command; and eventually, output the speed-varied speech signal.
    Type: Grant
    Filed: February 16, 2007
    Date of Patent: December 14, 2010
    Assignee: Micro-Star Int'l Co., Ltd.
    Inventors: Ming Hsiang Yen, Jui Yu Yen, Kuang Chien Kao