Voiced Or Unvoiced Patents (Class 704/208)
-
Patent number: 8204743
Abstract: An apparatus and method for concealing frame erasure and a voice decoding apparatus and method using the same. The frame erasure concealment apparatus includes: a parameter extraction unit determining whether there is an erased frame in a voice packet, and extracting an excitement signal parameter and a line spectrum pair parameter of a previous good frame; and an erasure frame concealment unit, if there is an erased frame, restoring the excitement signal and line spectrum pair parameter of the erased frame by using a regression analysis from the excitement signal and line spectrum pair parameter of the previous good frame. According to the method and apparatus, by predicting and restoring the parameter of the erased frame through the regression analysis, the quality of the restored voice signal can be enhanced and the algorithm can be simplified.
Type: Grant
Filed: May 4, 2006
Date of Patent: June 19, 2012
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hosang Sung, Kangeun Lee, Seungho Choi
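The regression step in the abstract above can be illustrated with a minimal sketch. The abstract does not say which regression model is used, so a straight-line least-squares fit over the most recent good frames is assumed here; the names `predict_next` and `conceal_lsp` are hypothetical and do not come from the patent.

```python
def predict_next(values):
    """Fit y = a + b*t by least squares at t = 0..n-1 and predict t = n.

    Assumption: a linear model; the abstract only says "regression analysis".
    """
    n = len(values)
    if n == 0:
        raise ValueError("need at least one previous good frame")
    if n == 1:
        return values[0]  # nothing to fit, repeat the last value
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept + slope * n


def conceal_lsp(history):
    """Predict each line spectrum pair coefficient of the erased frame.

    history: list of LSP vectors from previous good frames, oldest first.
    """
    order = len(history[0])
    return [predict_next([frame[i] for frame in history]) for i in range(order)]
```

Calling `conceal_lsp` with the last few good LSP vectors returns one extrapolated vector for the erased frame; the same helper could be applied to an excitation gain track.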
-
Patent number: 8200497
Abstract: Synthesizing a set of digital speech samples corresponding to a selected voicing state includes dividing speech model parameters into frames, with a frame of speech model parameters including pitch information, voicing information determining the voicing state in one or more frequency regions, and spectral information. First and second digital filters are computed using, respectively, first and second frames of speech model parameters, with the frequency responses of the digital filters corresponding to the spectral information in frequency regions for which the voicing state equals the selected voicing state. A set of pulse locations are determined, and sets of first and second signal samples are produced using the pulse locations and, respectively, the first and second digital filters. Finally, the sets of first and second signal samples are combined to produce a set of digital speech samples corresponding to the selected voicing state.
Type: Grant
Filed: August 21, 2009
Date of Patent: June 12, 2012
Assignee: Digital Voice Systems, Inc.
Inventor: John C. Hardwick
-
Patent number: 8195449
Abstract: A non-intrusive signal quality assessment apparatus includes a feature vector calculator that determines parameters representing frames of a signal and extracts a collection of per-frame feature vectors representing structural information of the signal from the parameters. A frame selector preferably selects only frames with a feature vector lying within a predetermined multi-dimensional window. Means determine a global feature set over the collection of feature vectors from statistical moments of selected feature vector components. A quality predictor predicts a signal quality measure from the global feature set.
Type: Grant
Filed: January 30, 2007
Date of Patent: June 5, 2012
Assignee: Telefonaktiebolaget L M Ericsson (Publ)
Inventors: Stefan Bruhn, Volodya Grancharov, Willem Bastiaan Kleijn
-
Patent number: 8195451
Abstract: In an information detecting apparatus (1), a speech kind discrimination unit (11) discriminates and classifies an audio signal at an information source into kinds (categories) such as music or speech on a predetermined time basis, and a memory unit/recording medium (13) records the resulting discrimination information. A discrimination frequency calculating unit (15) calculates, on a predetermined time basis, the discrimination frequency of every kind over a predetermined time period longer than the time unit.
Type: Grant
Filed: February 10, 2004
Date of Patent: June 5, 2012
Assignee: Sony Corporation
Inventor: Yasuhiro Toguri
-
Patent number: 8175868
Abstract: A voice judging system including feature value extraction means that analyzes a sound signal input from a sound signal input device, and extracts a time series of the feature values, sub-word boundary score calculating means that calculates a time series of sub-word boundary scores, by having reference to sound models of voice stored in a voice model storage unit, temporal regularity analyzing means that analyzes temporal regularity of the sub-word boundary scores, and voice judgment means that judges whether the input sound signal is voice or non-voice using the temporal regularity of the sub-word boundary scores.
Type: Grant
Filed: October 10, 2006
Date of Patent: May 8, 2012
Assignee: NEC Corporation
Inventor: Makoto Terao
-
Patent number: 8175869
Abstract: A method, apparatus, and medium for classifying a speech signal and a method, apparatus, and medium for encoding the speech signal using the same are provided. The method for classifying a speech signal includes calculating classification parameters from an input signal having block units, calculating a plurality of classification criteria from the classification parameters, and classifying the level of the input signal using the plurality of classification criteria. The classification parameters include at least one of an energy parameter of the input signal, a cross-correlation parameter between a specific block of a present frame and the input signal, and an integrated cross-correlation parameter obtained by accumulating the cross-correlation parameter.
Type: Grant
Filed: July 5, 2006
Date of Patent: May 8, 2012
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hosang Sung, Rakesh Taori, Kangeun Lee
-
Patent number: 8155953
Abstract: A method and an apparatus are provided for discriminating between a voice region and a non-voice region in an environment in which diverse types of noises and voices exist.
Type: Grant
Filed: January 12, 2006
Date of Patent: April 10, 2012
Assignee: Samsung Electronics Co., Ltd.
Inventors: Ki-young Park, Chang-kyu Choi
-
Patent number: 8145476
Abstract: A disclosed received voice playback apparatus includes a characteristic acquiring unit configured to acquire first frequency characteristic values obtained by resolving digital vocal signals that are based on received vocal signals into predetermined frequency bands, wherein each first frequency characteristic value corresponds to one of the predetermined frequency bands; a setting unit configured to obtain second frequency characteristic values, wherein each second frequency characteristic value is set for one of the predetermined frequency bands; a computing unit configured to compute a gain for each of the predetermined frequency bands based on a difference between the first frequency characteristic value and the second frequency characteristic value; and a characteristic changing unit configured to change the first frequency characteristic values of the digital vocal signals by multiplying the digital vocal signals by each of the gains corresponding to one of the predetermined frequency bands of the digit
Type: Grant
Filed: January 10, 2008
Date of Patent: March 27, 2012
Assignee: Ricoh Company, Ltd.
Inventor: Yukihiro Imai
-
Patent number: 8135586
Abstract: Disclosed is a method and an apparatus for estimating noise included in a sound signal during sound signal processing. The method includes estimating harmonics components in a frame of an input sound signal; using the estimated harmonics components, computing a Voice Presence Probability (VPP) on the frame of the input sound signal; determining a weight of an equation necessary to estimate a noise spectrum, depending on the computed VPP; and using the determined weight and the equation necessary to estimate a noise spectrum, estimating the noise spectrum, and updating the noise spectrum.
Type: Grant
Filed: March 21, 2008
Date of Patent: March 13, 2012
Assignees: Samsung Electronics Co., Ltd, Korea University Industrial & Academic Collaboration Foundation
Inventors: Hyun-Soo Kim, Hanseok Ko, Sung-Joo Ahn, Jounghoon Beh, Hyun-Jin Yoon
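As an illustration of a VPP-dependent weight, the sketch below uses a common recursive-averaging form of noise update. The abstract does not give the actual equation or how the weight is derived, so both the update rule and the linear mapping from VPP to weight are assumptions, and `update_noise_spectrum` and `alpha_min` are hypothetical names.

```python
def update_noise_spectrum(noise, frame_power, vpp, alpha_min=0.85):
    """Recursively average the noise estimate, freezing it when voice is likely.

    noise, frame_power: per-bin power values of equal length.
    vpp: voice presence probability in [0, 1] for this frame.
    Assumption: weight = alpha_min + (1 - alpha_min) * vpp, so vpp = 1
    keeps the old estimate and vpp = 0 tracks the new frame fastest.
    """
    if not 0.0 <= vpp <= 1.0:
        raise ValueError("vpp must lie in [0, 1]")
    alpha = alpha_min + (1.0 - alpha_min) * vpp
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(noise, frame_power)]
```

With this choice the estimate stops adapting during confident speech, which is the usual motivation for tying the weight to a speech-presence probability.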
-
Patent number: 8126705
Abstract: A system and method for automatically adjusting floor controls for a conversation is provided. Audio streams are received, which each originate from an audio source. Floor controls for a current configuration including at least a portion of the audio streams are maintained. Conversational characteristics shared by two or more of the audio sources are determined. Possible configurations for the audio streams are identified based on the conversational characteristics. An analysis of the current configuration and the possible configurations is performed. A change threshold is applied to the analysis. When the analysis satisfies the change threshold, the floor controls are automatically adjusted. The audio streams are mixed into one or more outputs based on the adjusted floor controls.
Type: Grant
Filed: November 9, 2009
Date of Patent: February 28, 2012
Assignee: Palo Alto Research Center Incorporated
Inventors: Paul Masami Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison Gyle Woodruff
-
Patent number: 8126706
Abstract: An audio signal is divided among exponentially related subband filters. The spectral flatness measure in each subband signal is determined and the measures are weighted and combined. The sum is compared with a threshold to determine the presence of music or noise. If music is detected, the noise estimation process in the noise reduction circuitry is turned off to avoid distorting the signal. If music is detected, residual echo suppression circuitry is also turned off to avoid inserting comfort noise.
Type: Grant
Filed: December 9, 2005
Date of Patent: February 28, 2012
Assignee: Acoustic Technologies, Inc.
Inventor: Samuel Ponvarma Ebenezer
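The weighted spectral flatness test described above can be sketched as follows. Spectral flatness is taken in its standard form, the ratio of the geometric mean to the arithmetic mean of the power values. The abstract does not state the direction of the threshold comparison; this sketch assumes a low combined flatness (a tonal signal) indicates music. The names `spectral_flatness` and `music_detected` are hypothetical.

```python
import math


def spectral_flatness(power):
    """Geometric mean divided by arithmetic mean of positive power values."""
    if not power or any(p <= 0 for p in power):
        raise ValueError("power values must be positive and non-empty")
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def music_detected(subband_powers, weights, threshold):
    """Combine per-subband flatness with weights and compare to a threshold.

    Assumption: a combined score below the threshold is treated as music.
    """
    score = sum(w * spectral_flatness(p) for w, p in zip(weights, subband_powers))
    return score < threshold
```

A flat spectrum yields a flatness near 1, while a spectrum dominated by a few strong components yields a value near 0.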
-
Patent number: 8121833
Abstract: The exemplary embodiments of the invention provide at least a method and an apparatus to perform operations including dividing a sound signal into a series of successive frames, dividing each frame into a number of subframes, producing a residual signal by filtering the sound signal through a linear prediction analysis filter, locating a last pitch pulse of the sound signal of a previous frame from the residual signal, extracting a pitch pulse prototype of given length around a position of the last pitch pulse of the previous frame using the residual signal, and locating pitch pulses in a current frame using the pitch pulse prototype.
Type: Grant
Filed: October 21, 2008
Date of Patent: February 21, 2012
Assignee: Nokia Corporation
Inventors: Mikko Tammi, Milan Jelinek, Claude LaFlamme, Vesa Ruoppila
-
Publication number: 20120022859
Abstract: An automatic marking method for Karaoke vocal accompaniment is provided. In the method, pitch, beat position and volume of a singer are compared with the original pitch, beat position and volume of the theme of a song to generate a score of pitch, a score of beat and a score of emotion respectively, so as to obtain a weighted total score in a weighted marking method. By using the method, the pitch, beat position and volume error of each section of the song sung by the singer can be exactly worked out, and a pitch curve and a volume curve can be displayed, so that the singer can learn which part is sung incorrectly and which part needs to be enhanced. The present invention also has the advantages of dual effects of teaching and entertainment, high practicability and technical advancement.
Type: Application
Filed: April 7, 2009
Publication date: January 26, 2012
Inventor: Wen-Hsin Lin
-
Patent number: 8090577
Abstract: Methods and apparatus are presented for determining the type of acoustic signal and the type of frequency spectrum exhibited by the acoustic signal in order to selectively delete parameter information before vector quantization. The bits that would otherwise be allocated to the deleted parameters can then be re-allocated to the quantization of the remaining parameters, which results in an improvement of the perceptual quality of the synthesized acoustic signal. Alternatively, the bits that would have been allocated to the deleted parameters are dropped, resulting in an overall bit-rate reduction.
Type: Grant
Filed: August 8, 2002
Date of Patent: January 3, 2012
Assignee: QUALCOMM Incorporated
Inventors: Khaled Helmi El-Maleh, Ananthapadmanabhan Arasanipalai Kandhadai, Sharath Manjunath
-
Patent number: 8078455
Abstract: An apparatus, method, and medium for distinguishing a vocal sound. The apparatus includes: a framing unit dividing an input signal into frames, each frame having a predetermined length; a pitch extracting unit determining whether each frame is a voiced frame or an unvoiced frame and extracting a pitch contour from the voiced and unvoiced frames; a zero-cross rate calculator respectively calculating a zero-cross rate for each frame; a parameter calculator calculating parameters including a time length ratio of the voiced frame and the unvoiced frame determined by the pitch extracting unit, statistical information of the pitch contour, and spectral characteristics; and a classifier inputting the zero-cross rates and the parameters output from the parameter calculator and determining whether the input signal is a vocal sound.
Type: Grant
Filed: February 7, 2005
Date of Patent: December 13, 2011
Assignee: Samsung Electronics Co., Ltd.
Inventors: Yuan Yuan Shi, Yongbeom Lee, Jaewon Lee
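The framing and zero-cross rate stages named in the abstract above can be sketched as follows. The abstract gives no frame length or normalization, so non-overlapping frames and a rate normalized by the number of adjacent sample pairs are assumed; `split_frames` and `zero_cross_rate` are hypothetical names.

```python
def split_frames(signal, frame_len):
    """Cut a signal into non-overlapping frames, dropping a short tail."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def zero_cross_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)
```

Unvoiced, noise-like frames tend to give a high rate and voiced frames a low one, which is why the rate is a common input to voiced/unvoiced classifiers.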
-
Method and apparatus for encoding an audio signal using multiple coders with plural selection models
Patent number: 8069034
Abstract: A method for supporting an encoding of an audio signal is shown, wherein at least a first and a second coder mode are available for encoding a section of the audio signal. The first coder mode enables a coding based on two different coding models. A selection of a coding model is enabled by a selection rule which is based on signal characteristics which have been determined for a certain analysis window. In order to avoid a misclassification of a section after a switch to the first coder mode, it is proposed that the selection rule is activated only when sufficient sections for the analysis window have been received. The invention relates equally to a module in which this method is implemented, to a device and a system comprising such a module and to a software program product including a software code for realizing the proposed method.
Type: Grant
Filed: May 6, 2005
Date of Patent: November 29, 2011
Assignee: Nokia Corporation
Inventors: Jari Mäkinen, Ari Lakaniemi, Pasi Ojala
-
Patent number: 8069039
Abstract: In a sound signal processing apparatus, a frame information generation section generates frame information of each frame of a sound signal. A storage stores the frame information generated by the frame information generation section. A first interval determination section determines a first utterance interval in the sound signal. A second interval determination section determines a second utterance interval based on the frame information of the first utterance interval stored in the storage such that the second utterance interval is made shorter than the first utterance interval and confined within the first utterance interval by trimming frames from either of a start point or an end point of the first utterance interval.
Type: Grant
Filed: December 21, 2007
Date of Patent: November 29, 2011
Assignee: Yamaha Corporation
Inventor: Yasuo Yoshioka
-
Patent number: 8063809
Abstract: A transient signal encoding method and device, decoding method and device, and processing system, where the transient signal encoding method includes: obtaining a reference sub-frame where a maximal time envelope having a maximal amplitude value is located from time envelopes of all sub-frames of an input transient signal; adjusting an amplitude value of the time envelope of each sub-frame before the reference sub-frame in such a way that a first difference is greater than a preset first threshold, in which the first difference is a difference between the amplitude value of the time envelope of each sub-frame before the reference sub-frame and the amplitude value of the maximal time envelope; and writing the adjusted time envelope into bitstream.
Type: Grant
Filed: June 29, 2011
Date of Patent: November 22, 2011
Assignee: Huawei Technologies Co., Ltd.
Inventors: Zexin Liu, Longyin Chen, Lei Miao, Chen Hu, Wei Xiao, Herve Marcel Taddei, Qing Zhang
-
Publication number: 20110282658
Abstract: The present invention relates to co-channel audio source separation. In one embodiment a first frequency-related representation of plural regions of the acoustic signal is prepared over time, and a two-dimensional transform of plural two-dimensional localized regions of the first frequency-related representation, each less than an entire frequency range of the first frequency related representation, is obtained to provide a two-dimensional compressed frequency-related representation with respect to each two dimensional localized region. For each of the plural regions, at least one pitch is identified. The pitch from the plural regions is processed to provide multiple pitch estimates over time. In another embodiment, a mixed acoustic signal is processed by localizing multiple time-frequency regions of a spectrogram of the mixed acoustic signal to obtain one or more acoustic properties.
Type: Application
Filed: September 3, 2010
Publication date: November 17, 2011
Applicant: Massachusetts Institute of Technology
Inventors: Tianyu Wang, Thomas R. Quatieri, JR.
-
Patent number: 8050254
Abstract: A media over packet networking appliance provides a network interface, a voice transducer, and at least one integrated circuit assembly coupling the voice transducer to the network interface. The at least one integrated circuit assembly provides media over packet transmissions and holds bits defining reconstruction of a packet stream having a primary stage and a secondary stage. The secondary stage has one or more of linear predictive coding parameters, long term prediction lags, parity check, and adaptive and fixed codebook gains. The packet stream has an instance of single packet loss, and the reconstruction includes receiving a packet sequence represented by P(n)P(n−1)′, [Lost Packet], P(n+2)P(n+1)′, and P(n+3)P(n+2)′, obtaining as information from the secondary stage one or more of the linear predictive coding parameters, long term prediction lags, parity check, and adaptive and fixed codebook gains, and performing an excitation reconstruction utilizing said packet sequence thus received.
Type: Grant
Filed: September 16, 2010
Date of Patent: November 1, 2011
Assignee: Texas Instruments Incorporated
Inventors: Krishnasamy Anandakumar, Vishu R. Viswanathan, Alan V. McCree
-
Patent number: 8050911
Abstract: A system, method, apparatus, signal-bearing medium, and means for transmitting speech activity in a distributed voice recognition (VR) system. The distributed voice recognition system includes a local VR engine in a subscriber unit (102) and a server VR engine on a server (160). The local VR engine comprises a voice activity detection (VAD) module (106) that detects voice activity within a speech signal, and comprises an advanced feature extraction (AFE) module (104) that extracts features from a speech signal. The detected voice activity information is transmitted over a first wireless communication channel to the server (160). The feature extraction information is transmitted over a second wireless communication channel, separate from the first wireless communication channel, to the server (160). The server (160) processes the received information to determine a linguistic estimate of the electrical speech signal, and transmits the linguistic estimate to the subscriber unit (102).
Type: Grant
Filed: March 1, 2007
Date of Patent: November 1, 2011
Assignee: QUALCOMM Incorporated
Inventor: Harinath Garudadri
-
Patent number: 8050922
Abstract: Voice recognition methods and systems are disclosed. A voice signal is obtained for an utterance of a speaker. The speaker is categorized as a male, female, or child and the categorization is used as a basis for dynamically adjusting a maximum frequency fmax and a minimum frequency fmin of a filter bank used for processing the input utterance to produce an output. Corresponding gender or age specific acoustic models are used to perform voice recognition based on the filter bank output.
Type: Grant
Filed: July 21, 2010
Date of Patent: November 1, 2011
Assignee: Sony Computer Entertainment Inc.
Inventor: Ruxin Chen
-
Publication number: 20110264447
Abstract: Implementations and applications are disclosed for detection of a transition in a voice activity state of an audio signal, based on a change in energy that is consistent in time across a range of frequencies of the signal.
Type: Application
Filed: April 22, 2011
Publication date: October 27, 2011
Applicant: QUALCOMM Incorporated
Inventors: Erik Visser, Ian Ernan Liu, Jongwon Shin
-
Publication number: 20110257965
Abstract: Encoding a sequence of digital speech samples into a bit stream includes dividing the digital speech samples into one or more frames and computing a set of model parameters for the frames. The set of model parameters includes at least a first parameter conveying pitch information. The voicing state of a frame is determined and the first parameter conveying pitch information is modified to designate the determined voicing state of the frame, if the determined voicing state of the frame is equal to one of a set of reserved voicing states. The model parameters are quantized to generate quantizer bits which are used to produce the bit stream.
Type: Application
Filed: June 27, 2011
Publication date: October 20, 2011
Applicant: DIGITAL VOICE SYSTEMS, INC.
Inventor: John C. Hardwick
-
Patent number: 8036884
Abstract: The present invention provides a method, a computer-software-product and an apparatus for enabling a determination of speech related audio data within a record of digital audio data. The method comprises steps for extracting audio features from the record of digital audio data, for classifying one or more subsections of the record of digital audio data, and for marking at least a part of the record of digital audio data classified as speech. The classification of the digital audio data record is performed on the basis of the extracted audio features and with respect to at least one predetermined audio class.
Type: Grant
Filed: February 24, 2005
Date of Patent: October 11, 2011
Assignee: Sony Deutschland GmbH
Inventors: Yin Hay Lam, Josep Maria Sola I Caros
-
Patent number: 8019386
Abstract: A method and system for enhancing speech intelligibility using wireless communication in portable, battery-powered and entirely user-supportable devices. The devices may be talker devices and receiver devices, where the audio signals input into the talker devices may be transmitted to the receiver devices to provide better quality audio to a person using the receiver devices. The receiver devices may initiate and terminate communications with the talker devices. Additionally, the receiver devices may indicate to the talker devices the gain level the talker devices need to apply to the audio signals before sending them to the receiver devices.
Type: Grant
Filed: March 7, 2005
Date of Patent: September 13, 2011
Assignee: Etymotic Research, Inc.
Inventors: William F. Dunn, Mead C. Killion, Andrew J. Haapapuro, Viorel Drambarean
-
Publication number: 20110218801
Abstract: The invention relates to a method for outputting a speech signal. Speech signal frames are received and are used in a predetermined sequence in order to produce a speech signal to be output. If one speech signal frame to be received is not received, then a substitute speech signal frame is used in its place, which is produced as a function of a previously received speech signal frame. According to the invention, in the situation in which the previously received speech signal frame has a voiceless speech signal, the substitute speech signal frame is produced by means of a noise signal.
Type: Application
Filed: September 28, 2009
Publication date: September 8, 2011
Applicant: ROBERT BOSCH GMBH
Inventors: Peter Vary, Frank Mertz
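The substitution rule above can be sketched under stated assumptions. The abstract only says that a noise signal is used when the previous frame was voiceless; scaling Gaussian noise to the previous frame's RMS level and repeating the previous frame in the voiced case are illustrative choices made here, and `substitute_frame` is a hypothetical name.

```python
import math
import random


def substitute_frame(previous, previous_is_voiced, rng=None):
    """Build a replacement for a frame that was not received.

    Assumptions: a voiced predecessor is simply repeated; an unvoiced
    predecessor is replaced by Gaussian noise at the same RMS level.
    """
    if not previous:
        return []
    if previous_is_voiced:
        return list(previous)
    rng = rng or random.Random()
    rms = math.sqrt(sum(x * x for x in previous) / len(previous))
    return [rng.gauss(0.0, rms) for _ in previous]
```

Passing a seeded `random.Random` instance as `rng` makes the noise reproducible, which helps when comparing concealment variants.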
-
Patent number: 8015000
Abstract: An audio decoding system performs frame loss concealment (FLC) when portions of a bit stream representing an audio signal are lost within the context of a digital communication system. The audio decoding system employs two different FLC methods: one designed to perform well for music, and the other designed to perform well for speech. When a frame is deemed lost, the audio decoding system analyzes a previously-decoded audio signal corresponding to previously-decoded frames of an audio bit-stream. Based on the results of the analysis, the lost frame is classified as either speech or music. Using this classification, other signal analysis, and knowledge of the employed FLC methods, the audio decoding system selects the appropriate FLC method which then performs FLC on the lost frame.
Type: Grant
Filed: April 13, 2007
Date of Patent: September 6, 2011
Assignee: Broadcom Corporation
Inventors: Robert W. Zopf, Juin-Hwey Chen, Jes Thyssen
-
Patent number: 8010370
Abstract: Techniques for generating a target digital media item based on a source digital media item are described. A digital media item may be a song, a video clip, an album, or any length of audio or video. When adjusting the bit count for a portion of the target digital media item, instead of using the same set of parameter values used in a perceptual model for each portion of the source media item, the set of parameter values may be modified to encode the portion of the source digital media item. In this way, how audio or video is perceived is taken into account when adjusting a proposed bit count for a given portion of the target digital media item. Thus, while maintaining the same statistical bitrate as before, increased digital media quality is achieved.
Type: Grant
Filed: July 28, 2006
Date of Patent: August 30, 2011
Assignee: Apple Inc.
Inventor: Frank M. Baumgarte
-
Patent number: 8010358
Abstract: Methods and apparatus for voice recognition are disclosed. A voice signal is obtained and two or more voice recognition analyses are performed on the voice signal. Each voice recognition analysis uses a filter bank defined by a different maximum frequency and a different minimum frequency and wherein each voice recognition analysis produces a recognition probability ri of recognition of one or more speech units, whereby there are two or more recognition probabilities ri. The maximum frequency and the minimum frequency may be adjusted every time speech is windowed and analyzed. A final recognition probability Pf is determined based on the two or more recognition probabilities ri.
Type: Grant
Filed: February 21, 2006
Date of Patent: August 30, 2011
Assignee: Sony Computer Entertainment Inc.
Inventor: Ruxin Chen
-
Patent number: 7996215
Abstract: A method and an apparatus for Voice Activity Detection (VAD) and an encoder are provided. The method for VAD includes: acquiring a fluctuant feature value of a background noise when an input signal is the background noise, in which the fluctuant feature value is used to represent fluctuation of the background noise; performing adaptive adjustment on a VAD decision criterion related parameter according to the fluctuant feature value; and performing VAD decision on the input signal by using the decision criterion related parameter on which the adaptive adjustment is performed. The method, the apparatus, and the encoder can be adaptive to fluctuation of the background noise to perform VAD decision, so as to enhance the VAD decision performance, save limited channel bandwidth resources, and use the channel bandwidth efficiently.
Type: Grant
Filed: April 13, 2011
Date of Patent: August 9, 2011
Assignee: Huawei Technologies Co., Ltd.
Inventors: Zhe Wang, Qing Zhang
-
Patent number: 7987089
Abstract: A method for modifying a window with a frame associated with an audio signal is described. A signal is received. The signal is partitioned into a plurality of frames. A determination is made if a frame within the plurality of frames is associated with a non-speech signal. A modified discrete cosine transform (MDCT) window function is applied to the frame to generate a first zero pad region, where the region has a length of (M−L)/2, where L is an arbitrary value, and a second zero pad region if it was determined that the frame is associated with a non-speech signal. The frame is encoded. The decoder window is the same as the encoder window.
Type: Grant
Filed: February 14, 2007
Date of Patent: July 26, 2011
Assignee: QUALCOMM Incorporated
Inventors: Venkatesh Krishnan, Ananthapadmanabhan A. Kandhadai
-
Patent number: 7979272
Abstract: The present invention provides a frame erasure concealment device and method that is based on reestimating gain parameters for a code excited linear prediction (CELP) coder. During operation, when a frame in a stream of received data is detected as being erased, the coding parameters, especially an adaptive codebook gain gp and a fixed codebook gain gc, of the erased and subsequent frames can be reestimated by a gain matching procedure. By using this technique with the IS-641 speech coder, it has been found that the present invention improves the speech quality under various channel conditions, compared with a conventional extrapolation-based concealment algorithm.
Type: Grant
Filed: October 12, 2007
Date of Patent: July 12, 2011
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Hong-Goo Kang, Hong Kook Kim
-
Patent number: 7970606
Abstract: Encoding a sequence of digital speech samples into a bit stream includes dividing the digital speech samples into one or more frames and computing a set of model parameters for the frames. The set of model parameters includes at least a first parameter conveying pitch information. The voicing state of a frame is determined and the first parameter conveying pitch information is modified to designate the determined voicing state of the frame, if the determined voicing state of the frame is equal to one of a set of reserved voicing states. The model parameters are quantized to generate quantizer bits which are used to produce the bit stream.
Type: Grant
Filed: November 13, 2002
Date of Patent: June 28, 2011
Assignee: Digital Voice Systems, Inc.
Inventor: John C. Hardwick
-
Publication number: 20110153317
Abstract: An apparatus for wireless communications includes a processing system. The processing system is configured to receive an input sound stream of a user, split the input sound stream into a plurality of frames, classify each of the frames as one selected from the group consisting of a non-speech frame and a speech frame, determine a pitch of each of the frames in a subset of the speech frames, and identify a gender of the user from the determined pitch. To determine the pitch, the processing system is configured to filter the speech frames to compute an error signal, compute an autocorrelation of the error signal, find a maximum autocorrelation value, and set the pitch to an index of the maximum autocorrelation value.
Type: Application
Filed: December 23, 2009
Publication date: June 23, 2011
Applicant: QUALCOMM INCORPORATED
Inventors: Yinian Mao, Gene Marsh
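The autocorrelation peak search in the abstract above can be sketched as follows. The filtering that produces the error signal is not shown; the sketch takes an already computed error signal as input. The restriction of the search to a lag range and the conversion from lag to Hz are assumptions added for illustration, and `pitch_lag` and `pitch_hz` are hypothetical names.

```python
def pitch_lag(error, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation.

    error: the prediction error (residual) samples of one speech frame.
    Ties are resolved in favor of the smallest lag.
    """
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        val = sum(error[i] * error[i - lag] for i in range(lag, len(error)))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def pitch_hz(lag, sample_rate):
    """Convert a pitch lag in samples to a frequency in Hz."""
    return sample_rate / lag
```

For a pulse train with one pulse every 50 samples at an assumed 8000 Hz sample rate, the search returns a lag of 50, which corresponds to 160 Hz; a gender decision would then compare such a value with a threshold that the abstract does not specify.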
-
Publication number: 20110153318
Abstract: There is provided a method or a device for extending a bandwidth of a first band speech signal to generate a second band speech signal wider than the first band speech signal and including the first band speech signal. The method comprises receiving a segment of the first band speech signal having a low cut off frequency and a high cut off frequency; determining the high cut off frequency of the segment; determining whether the segment is voiced or unvoiced; if the segment is voiced, applying a first bandwidth extension function to the segment to generate a first bandwidth extension in high frequencies; if the segment is unvoiced, applying a second bandwidth extension function to the segment to generate a second bandwidth extension in the high frequencies; using the first bandwidth extension and the second bandwidth extension to extend the first band speech signal beyond the high cut off frequency.
Type: Application
Filed: March 15, 2010
Publication date: June 23, 2011
Applicant: MINDSPEED TECHNOLOGIES, INC.
Inventors: Norbert Rossello, Fabien Klein
-
Patent number: 7966179Abstract: A method and apparatus for distinguishing a voice region from a non-voice region in an environment where various types of noise and voice are mixed together are provided. The method includes the steps of converting an input voice signal into a frequency domain signal by preprocessing the input voice signal, performing sigmoid compression on the converted signal, transforming a spectrum vector generated by the sigmoid compression into a voice detection parameter in scalar form, and detecting the voice region using the parameter.Type: GrantFiled: January 27, 2006Date of Patent: June 21, 2011Assignee: Samsung Electronics Co., Ltd.Inventors: Kwang-cheol Oh, Ki-young Park
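The pipeline in this abstract (compress the spectrum with a sigmoid, reduce the vector to a scalar, threshold it) might be sketched as follows. The abstract does not say how the spectrum is scaled or how the scalar is formed, so the per-bin dB input, the mean as the scalar, and every numeric parameter below are assumptions made for illustration.

```python
import math


def sigmoid(x, slope, midpoint):
    """Logistic function that squashes x into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))


def voice_parameter(spectrum_db, slope=0.5, midpoint=10.0):
    """Compress each spectral bin, then average into one scalar.

    `spectrum_db` is assumed to hold per-bin levels in dB above a noise
    floor; using the mean as the scalar is a hypothetical choice.
    """
    compressed = [sigmoid(v, slope, midpoint) for v in spectrum_db]
    return sum(compressed) / len(compressed)


def is_voice(spectrum_db, threshold=0.5):
    """Flag the frame as voice when the scalar exceeds a threshold."""
    return voice_parameter(spectrum_db) > threshold


noise_frame = [0.0] * 8                  # every bin at the noise floor
speech_frame = [30.0] * 6 + [0.0] * 2    # most bins well above the floor
```

The compression bounds each bin's contribution, so a single very loud bin cannot dominate the scalar on its own.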
-
Patent number: 7949520Abstract: An enhancement system extracts pitch from a processed speech signal. The system estimates the pitch of voiced speech by deriving filter coefficients of an adaptive filter and using the obtained filter coefficients to derive pitch. The pitch estimation may be enhanced by using various techniques to condition the input speech signal, such as spectral modification of the background noise and the speech signal, and/or reduction of the tonal noise from the speech signal.Type: GrantFiled: December 9, 2005Date of Patent: May 24, 2011Assignee: QNX Software Systems Co.Inventors: Rajeev Nongpiur, Phillip A. Hetherington
-
Patent number: 7949518Abstract: A hierarchy encoding apparatus capable of calculating appropriate delay amounts and also capable of suppressing increase in the bit rate. In this apparatus, a first layer encoding part (101) encodes the input signal of the n-th frame to produce a first layer encoded code. A first layer decoding part (102) generates a first layer decoded signal from the first layer encoded code and applies it to a delay amount calculating part (103) and a second layer encoding part (105). The delay amount calculating part (103) uses the first layer decoded signal and input signal to calculate the delay amount to be added to the input signal, and applies the calculated delay amount to a delay part (104). The delay part (104) delays the input signal by the delay amount applied from the delay amount calculating part (103) and then applies it to a second layer encoding part (105). The second layer encoding part (105) uses the first layer decoded signal and the input signal from the delay part (104) for encoding.Type: GrantFiled: April 22, 2005Date of Patent: May 24, 2011Assignee: Panasonic CorporationInventor: Masahiro Oshikiri
-
Patent number: 7945294Abstract: The present invention provides an apparatus and method for providing hands-free operation of a device. A hands-free adapter is provided that communicates with a device and a headset. The hands-free adapter allows a user to use voice commands so that the user does not have to handle the device. The hands-free adapter receives voice commands from the headset and translates the voice commands to commands recognized by the device. The hands-free adapter also monitors the device to detect device events and provides notice of the events to the user via the headset.Type: GrantFiled: April 27, 2009Date of Patent: May 17, 2011Assignee: AT&T Intellectual Property I, L.P.Inventors: Lan Zhang, Joseph E. Page, Jr., Barrett M. Kreiner
-
Patent number: 7941313Abstract: A system and method for transmitting speech activity in a distributed voice recognition system. The distributed voice recognition system includes a local VR engine in a subscriber unit and a server VR engine on a server. The local VR engine comprises a feature extraction (FE) module that extracts features from a speech signal, and a voice activity detection module (VAD) that detects voice activity within a speech signal. Indications of voice activity are transmitted ahead of features from the subscriber unit to the server.Type: GrantFiled: December 14, 2001Date of Patent: May 10, 2011Assignee: QUALCOMM IncorporatedInventors: Harinath Garudadri, Michael Stuart Phillips
-
Publication number: 20110099006Abstract: In one embodiment, during participation in an online collaborative computing session, a computer process associated with the session may monitor an audio stream of the session for a predefined action-inducing phrase. In response to the phrase, a subsequent segment of the session is recorded, such that a report may be generated containing any recorded segments of the session (e.g., and dynamically sent to participants of the session).Type: ApplicationFiled: October 27, 2009Publication date: April 28, 2011Applicant: Cisco Technology, Inc.Inventors: Sujatha Sundararaman, Sundar Hariharan, Anand Hariharan, Archana Karchalli Raju
-
Patent number: 7917356Abstract: A VAD/SS system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.Type: GrantFiled: September 16, 2004Date of Patent: March 29, 2011Assignee: AT&T CorporationInventors: Bing Chen, James H. James
-
Patent number: 7912708Abstract: The present invention relates to a method of synthesizing a speech signal, comprising: assigning a first identifier to a first class of intervals of an original speech signal and assigning a second identifier to a second class of intervals of the original speech signal; windowing the original speech signal to provide a number of pitch bells; processing the pitch bells having the first identifier assigned thereto for modifying a duration of the speech signal; and performing an overlap and add operation on the processed pitch bells.Type: GrantFiled: August 5, 2003Date of Patent: March 22, 2011Assignee: Koninklijke Philips Electronics N.V.Inventor: Ercan Ferit Gigi
-
Patent number: 7912712Abstract: An encoding method includes extracting background noise characteristic parameters within a hangover period; for a first superframe after the hangover period, performing background noise encoding based on the extracted background noise characteristic parameters; for superframes after the first superframe, performing background noise characteristic parameter extraction and DTX decision for each frame in the superframes after the first superframe; and for the superframes after the first superframe, performing background noise encoding based on extracted background noise characteristic parameters of the current superframe, background noise characteristic parameters of a plurality of superframes previous to the current superframe, and a final DTX decision. Also, a decoding method and apparatus and an encoding apparatus are disclosed.Type: GrantFiled: September 14, 2010Date of Patent: March 22, 2011Assignee: Huawei Technologies Co., Ltd.Inventors: Eyal Shlomot, Libin Zhang, Jinliang Dai
-
Publication number: 20110035213Abstract: A device and method for estimating a tonality of a sound signal comprise: calculating a current residual spectrum of the sound signal; detecting peaks in the current residual spectrum; calculating a correlation map between the current residual spectrum and a previous residual spectrum for each detected peak; and calculating a long-term correlation map based on the calculated correlation map, the long-term correlation map being indicative of a tonality in the sound signal.Type: ApplicationFiled: June 20, 2008Publication date: February 10, 2011Inventors: Vladimir Malenovsky, Milan Jelinek, Tommy Vaillancourt, Redwan Salami
-
Publication number: 20110029304Abstract: A hybrid instantaneous/differential encoding technique is described herein that may be used to reduce the bit rate required to encode a pitch period associated with a segment of a speech signal in a manner that will result in relatively little or no degradation of a decoded speech signal generated using the encoded pitch period. The hybrid instantaneous/differential encoding technique is advantageously applicable to any speech codec that encodes a pitch period associated with a segment of a speech signal.Type: ApplicationFiled: July 30, 2010Publication date: February 3, 2011Applicant: BROADCOM CORPORATIONInventors: Juin-Hwey Chen, Hong-Goo Kang
-
Patent number: 7877253Abstract: In one configuration, erasure of a significant frame of a sustained voiced segment is detected. An adaptive codebook gain value for the erased frame is calculated based on the preceding frame. If the calculated value is less than (alternatively, not greater than) a threshold value, a higher adaptive codebook gain value is used for the erased frame. The higher value may be derived from the calculated value or selected from among one or more predefined values.Type: GrantFiled: October 5, 2007Date of Patent: January 25, 2011Assignee: QUALCOMM IncorporatedInventors: Venkatesh Krishnan, Ananthapadmanabhan A. Kandhadai
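The threshold rule in this abstract can be written as a short sketch. The decay factor used to derive the gain from the preceding frame, the threshold, and the predefined higher gain are all hypothetical values chosen for illustration; the abstract does not give any of them.

```python
def concealed_adaptive_gain(previous_gain, sustained_voiced,
                            threshold=0.6, decay=0.9, higher_gain=0.8):
    """Pick an adaptive codebook gain for an erased frame.

    The gain is first calculated from the preceding frame (here by a
    simple assumed decay). If the erased frame belongs to a sustained
    voiced segment and the calculated gain falls below the threshold,
    a predefined higher gain is used instead.
    """
    calculated = decay * previous_gain
    if sustained_voiced and calculated < threshold:
        return higher_gain
    return calculated
```

Under these assumed values, a preceding gain of 0.5 in a sustained voiced segment is replaced by 0.8, while the same preceding gain outside such a segment keeps its calculated value.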
-
Patent number: 7869990Abstract: There is provided a pitch lag predictor for use by a speech decoder to generate a predicted pitch lag parameter. The pitch lag predictor comprises a summation calculator configured to generate a first summation based on a plurality of previous pitch lag parameters, and a second summation based on a plurality of previous pitch lag parameters and a position of each of the plurality of previous pitch lag parameters with respect to the predicted pitch lag parameter; a coefficient calculator configured to generate a first coefficient using a first equation based on the first summation and the second summation, and a second coefficient using a second equation based on the first summation and the second summation, wherein the first equation is different than the second equation; and a predictor configured to generate the predicted pitch lag parameter based on the first coefficient and the second coefficient.Type: GrantFiled: October 8, 2008Date of Patent: January 11, 2011Assignee: Mindspeed Technologies, Inc.Inventor: Yang Gao
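One reading of the two summations and two coefficients in this abstract is a least-squares line fitted through the previous pitch lags and extrapolated one position ahead. The abstract does not give its equations, so the sketch below uses ordinary linear regression as an assumption; the function name and sample values are illustrative.

```python
def predict_pitch_lag(previous_lags):
    """Extrapolate the next pitch lag from a least-squares line.

    Positions 0..n-1 are assigned to the previous lags and the line is
    evaluated at position n. Needs at least two previous lags.
    """
    n = len(previous_lags)
    if n < 2:
        raise ValueError("need at least two previous pitch lags")
    # First summation: the lags alone.
    sum_lags = sum(previous_lags)
    # Second summation: each lag weighted by its position.
    sum_pos_lags = sum(i * lag for i, lag in enumerate(previous_lags))
    # Position sums have closed forms for 0..n-1.
    sum_pos = n * (n - 1) / 2
    sum_pos_sq = (n - 1) * n * (2 * n - 1) / 6
    # Two coefficients from two different equations: slope, then intercept.
    slope = (n * sum_pos_lags - sum_pos * sum_lags) / (n * sum_pos_sq - sum_pos ** 2)
    intercept = (sum_lags - slope * sum_pos) / n
    return intercept + slope * n


predicted = predict_pitch_lag([50, 52, 54, 56, 58])
```

With lags rising by two samples per frame, the fitted line continues the trend, so the prediction for the next frame is 60.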
-
Patent number: 7853447Abstract: A method for varying speech speed is provided. The method includes the following steps: receive an original speech signal; calculate a pitch period of the original speech signal; define search ranges according to the pitch period; find a maximum within each of the search ranges of the original speech signal; divide the original speech signal into speech sections according to the maxima; obtain a speed-varied speech signal by applying a speed-varying algorithm to each speech section of the original speech signal according to a speed-varying command; and finally, output the speed-varied speech signal.Type: GrantFiled: February 16, 2007Date of Patent: December 14, 2010Assignee: Micro-Star Int'l Co., Ltd.Inventors: Ming Hsiang Yen, Jui Yu Yen, Kuang Chien Kao