Voiced Or Unvoiced Patents (Class 704/208)
  • Patent number: 8619965
    Abstract: Certain embodiments of the present invention employ targeted speech detection as part of end-of-hold detection in an end-of-hold notification system. The targeted speech detector is configured to be particularly sensitive to specific words or phrases so as to increase the likelihood of detecting a correct end-of-hold condition while reducing the likelihood of false end-of-hold detection. Targeted speech detection may be used along with other detection mechanisms such as DTMF detection and/or background noise detection.
    Type: Grant
    Filed: April 28, 2011
    Date of Patent: December 31, 2013
    Assignee: Abraham & Son
    Inventors: Romek Figa, Michael A. Figa
  • Patent number: 8620646
    Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.
    Type: Grant
    Filed: August 8, 2011
    Date of Patent: December 31, 2013
    Assignee: The Intellisis Corporation
    Inventors: David C. Bradley, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
  • Publication number: 20130346071
    Abstract: A method of simultaneously transforming at least two input voice signals xi of a communications system (30), each input voice signal xi being received at a specific reception frequency Fi and corresponding to the voice of a remote party communicating with a user of the communications system (30). During an initialization stage, a transformation Ti is allocated to at least one reception frequency Fi of the input voice signals xi, and during a utilization stage, transformations Ti are applied simultaneously to the input voice signals xi as a function of the reception frequencies Fi, modifying at least one characteristic of each of the input voice signals xi. Thus, the voice of each remote party in communication with the user of the communications system (30) is modified artificially by a transformation Ti, thereby making it easier for the user to perceive and discriminate between simultaneous voices from the remote parties.
    Type: Application
    Filed: March 4, 2013
    Publication date: December 26, 2013
    Inventor: Jean-Pierre Baudry
  • Patent number: 8612216
    Abstract: To form an audio signal, frequency components of the audio signal which are allotted to a first subband are formed by means of a subband decoder using supplied fundamental period values which respectively indicate a fundamental period for the audio signal. Frequency components of the audio signal which are allotted to a second subband are formed by exciting an audio synthesis filter using an excitation signal which is specific to the second subband. To produce this excitation signal, an excitation signal generator derives a fundamental period parameter from the fundamental period values. The fundamental period parameter is used by the excitation signal generator to form pulses with a pulse shape which is dependent on the fundamental period parameter at an interval of time which is determined by the fundamental period parameter and to mix them with a noise signal.
    Type: Grant
    Filed: January 31, 2006
    Date of Patent: December 17, 2013
    Assignee: Siemens Enterprise Communications GmbH & Co. KG
    Inventors: Martin Gartner, Bernd Geiser, Peter Jax, Stefan Schandl, Herve Taddei, Peter Vary
  • Patent number: 8606569
    Abstract: The present invention relates to means and methods of classifying speech and music signals in voice communication systems, devices, telephones, and methods, and more specifically, to systems, devices, and methods that automate control when either speech or music is detected over communication links. The present invention provides a novel system and method for monitoring the audio signal, analyzing selected audio signal components, comparing the results of analysis with a pre-determined threshold value, and classifying the audio signal as either speech or music.
    Type: Grant
    Filed: November 12, 2012
    Date of Patent: December 10, 2013
    Inventor: Alon Konchitsky
  • Patent number: 8595002
    Abstract: Encoding a sequence of digital speech samples into a bit stream includes dividing the digital speech samples into one or more frames, computing model parameters for a frame, and quantizing the model parameters to produce pitch bits conveying pitch information, voicing bits conveying voicing information, and gain bits conveying signal level information. One or more of the pitch bits are combined with one or more of the voicing bits and one or more of the gain bits to create a first parameter codeword that is encoded with an error control code to produce a first FEC codeword that is included in a bit stream for the frame. The process may be reversed to decode the bit stream.
    Type: Grant
    Filed: January 18, 2013
    Date of Patent: November 26, 2013
    Assignee: Digital Voice Systems, Inc.
    Inventor: John C. Hardwick
  • Patent number: 8589152
    Abstract: To this end, a voice detection device includes a band-based power calculation unit that calculates a total of signal power values (sub-band power) of signals entered from the microphones from one preset frequency width (sub-band) to another. The voice detection device also includes a band-based noise estimation unit that estimates the sub-band based noise power, and a sub-band based SNR calculation unit. The sub-band based SNR calculation unit calculates a sub-band SNR from one sub-band to another to output the largest one of the sub-band SNRs as an SNR for a microphone of interest. The voice detection device further includes a voice/non-voice decision unit that determines the voice/non-voice using the SNR for the microphone of interest.
    Type: Grant
    Filed: May 26, 2009
    Date of Patent: November 19, 2013
    Assignee: NEC Corporation
    Inventors: Tadashi Emori, Masanori Tsujikawa
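
The sub-band SNR decision described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the fixed sub-band width, the 6 dB threshold, and the assumption that a per-band noise estimate is already available are all hypothetical choices made for the example.

```python
import math

def subband_powers(spectrum_power, band_width):
    """Sum per-bin power over consecutive sub-bands of band_width bins."""
    return [sum(spectrum_power[i:i + band_width])
            for i in range(0, len(spectrum_power), band_width)]

def max_subband_snr_db(signal_bands, noise_bands, floor=1e-12):
    """Return the largest per-sub-band SNR, in dB, for one microphone."""
    return max(10.0 * math.log10(max(s, floor) / max(n, floor))
               for s, n in zip(signal_bands, noise_bands))

def is_voice(spectrum_power, noise_bands, band_width=4, threshold_db=6.0):
    """Voice/non-voice decision from the largest sub-band SNR.

    noise_bands is assumed to come from a separate noise estimator;
    threshold_db is an illustrative value, not one taken from the patent.
    """
    bands = subband_powers(spectrum_power, band_width)
    return max_subband_snr_db(bands, noise_bands) >= threshold_db

# Usage: one loud sub-band is enough to trigger a voice decision.
noise = [4.0, 4.0]
print(is_voice([1.0] * 8, noise))                 # quiet frame
print(is_voice([1.0] * 4 + [10.0] * 4, noise))    # strong upper sub-band
```

Taking the maximum rather than the average means a narrow-band speech component is not diluted by noisy sub-bands, which appears to be the motivation stated in the abstract.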
  • Patent number: 8589334
    Abstract: Methods and systems are provided for developing decision information relating to a single system based on data received from a plurality of sensors. The method includes receiving first data from a first sensor that defines first information of a first type that is related to a system, receiving second data from a second sensor that defines second information of a second type that is related to said system, wherein the first type is different from the second type, generating a first decision model, a second decision model, and a third decision model, determining whether data is available from only the first sensor, only the second sensor, or both the first and second sensors, and selecting based on the determination of availability an additional model to apply the available data, wherein the additional model is selected from a plurality of additional decision models including the third decision model.
    Type: Grant
    Filed: January 18, 2011
    Date of Patent: November 19, 2013
    Assignee: Telcordia Technologies, Inc.
    Inventor: Akshay Vashist
  • Patent number: 8583427
    Abstract: A signal processing system which discriminates between voice signals and data signals modulated by a voiceband carrier. The signal processing system includes a voice exchange, a data exchange and a call discriminator. The voice exchange is capable of exchanging voice signals between a switched circuit network and a packet based network. The signal processing system also includes a data exchange capable of exchanging data signals modulated by a voiceband carrier on the switched circuit network with unmodulated data signal packets on the packet based network. The data exchange is performed by demodulating data signals from the switched circuit network for transmission on the packet based network, and modulating data signal packets from the packet based network for transmission on the switched circuit network. The call discriminator is used to selectively enable the voice exchange and data exchange.
    Type: Grant
    Filed: January 25, 2010
    Date of Patent: November 12, 2013
    Assignee: Broadcom Corporation
    Inventors: Onur Tackin, Scott Branden
  • Patent number: 8577673
    Abstract: In one embodiment, a method of receiving a decoded audio signal that has a transmitted pitch lag is disclosed. The method includes estimating pitch correlations of possible short pitch lags that are smaller than a minimum pitch limitation and have an approximated multiple relationship with the transmitted pitch lag, checking if one of the pitch correlations of the possible short pitch lags is large enough compared to a pitch correlation estimated with the transmitted pitch lag, and selecting a short pitch lag as a corrected pitch lag if a corresponding pitch correlation is large enough. The postprocessing is performed using the corrected pitch lag. In another embodiment, when the existence of irregular harmonics or wrong pitch lag is detected, a code-excited linear prediction (CELP) postfilter is made more aggressive.
    Type: Grant
    Filed: September 15, 2009
    Date of Patent: November 5, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
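
The short-pitch-lag check in the abstract above might look something like the sketch below. It assumes candidates are formed by dividing the transmitted lag by small integers, uses a normalized correlation as the pitch correlation, and treats "large enough" as a fixed ratio of the correlation at the transmitted lag. The names `correct_pitch_lag`, `ratio`, and `max_divisor` are hypothetical, and none of the numeric values come from the patent.

```python
import math

def normalized_correlation(x, lag):
    """Normalized correlation between x[n] and x[n - lag]."""
    idx = range(lag, len(x))
    num = sum(x[n] * x[n - lag] for n in idx)
    e_cur = sum(x[n] ** 2 for n in idx)
    e_past = sum(x[n - lag] ** 2 for n in idx)
    denom = math.sqrt(e_cur * e_past)
    return num / denom if denom > 0 else 0.0

def correct_pitch_lag(x, transmitted_lag, min_lag, ratio=0.9, max_divisor=4):
    """Return a shorter lag if its correlation rivals the transmitted lag's.

    Candidates are transmitted_lag / k for k = 2..max_divisor (an assumed
    reading of the "approximated multiple relationship"), restricted to
    lags below min_lag. The best qualifying candidate wins; otherwise the
    transmitted lag is kept.
    """
    reference = normalized_correlation(x, transmitted_lag)
    best_lag, best_corr = transmitted_lag, float("-inf")
    for k in range(2, max_divisor + 1):
        candidate = round(transmitted_lag / k)
        if candidate < 1 or candidate >= min_lag:
            continue
        corr = normalized_correlation(x, candidate)
        if corr >= ratio * reference and corr > best_corr:
            best_lag, best_corr = candidate, corr
    return best_lag

# Usage: a pulse train with true period 10 whose encoder reported lag 20.
pulses = [1.0 if n % 10 == 0 else 0.0 for n in range(100)]
print(correct_pitch_lag(pulses, transmitted_lag=20, min_lag=15))
```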
  • Patent number: 8577674
    Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: December 12, 2012
    Date of Patent: November 5, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8571853
    Abstract: A method and apparatus for laughter detection. Laughter is detected through the presence of a sequence of at least a predetermined number such as three consecutive bursts, each burst comprising a voiced portion and an unvoiced portion. After detecting bursts, N-tuples such as triplets are detected, and a likelihood of each burst N-tuple to represent laughter is provided by comparison to predetermined thresholds. Finally, a total score is assigned to the signal based on the grades associated with the triplets and parameters such as the distance between the N-tuples, the total score representing the probability that the audio signal comprises a laughter episode. The method and apparatus preferably comprise a training step and module for determining the thresholds according to manually marked audio signals.
    Type: Grant
    Filed: February 11, 2007
    Date of Patent: October 29, 2013
    Assignee: Nice Systems Ltd.
    Inventors: Oren Peleg, Moshe Wasserblat
  • Patent number: 8571852
    Abstract: A scalable decoder device (50) for signals representing audio comprises a primary decoder (21) connected to an input (40). The primary decoder (21) is arranged to provide a primary decoded signal (23) based on received parameters (4). A primary postfilter (31) is connected to the primary decoder (21) to provide a primary postfiltered signal (32). A secondary enhancement decoder (45) is connected to the input (40) and arranged to provide a secondary decoded enhancement signal (44). The device further comprises a combiner arrangement (55), arranged for combining the primary postfiltered signal (32) and a signal (53) based on the secondary decoded enhancement signal (44) into an output signal (6) to be provided at an output (6). The combining is made with an adaptable strength relation between contributions from the two signals. A method for decoding coded signals representing audio operates in analogy with the scalable decoder device (50).
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: October 29, 2013
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventor: Stefan Bruhn
  • Patent number: 8566107
    Abstract: Disclosed is a method of processing a signal, which includes receiving at least one of a first signal and a second signal, receiving mode information, and decoding the at least one of the first signal and the second signal using at least one of a first coding scheme and a second coding scheme according to the mode information. The mode information is information for indicating that a prescribed mode corresponds to one of at least three modes. The method includes detecting when a restricted mode change occurs and changing at least one mode when detecting a restricted mode change.
    Type: Grant
    Filed: October 15, 2008
    Date of Patent: October 22, 2013
    Assignees: LG Electronics Inc., Intellectual Discovery Co., Ltd.
    Inventors: Hyen-O Oh, Hong Goo Kang, Chang Heon Lee, Sang Wook Shin, Yang Won Jung
  • Patent number: 8560330
    Abstract: In accordance with an embodiment, a method of encoding an audio bitstream at an encoder includes encoding an original low band signal at the encoder by using a closed loop analysis-by-synthesis approach to obtain a coded low band signal, encoding an original high band signal at the encoder by using an open loop energy matching approach to obtain coded high band energy envelopes, comparing an energy of the coded low band signal with an energy of a corresponding original low band signal for a subframe, and generating an indication flag that indicates whether an energy envelope perceptual correction is needed for the subframe based on comparing the energy.
    Type: Grant
    Filed: July 19, 2011
    Date of Patent: October 15, 2013
    Assignee: Futurewei Technologies, Inc.
    Inventor: Yang Gao
  • Patent number: 8560313
    Abstract: A method of and system for transient noise rejection for improved speech recognition. The method comprises the steps of (a) receiving audio including user speech and at least some transient noise associated with the speech, (b) converting the received audio into digital data, (c) segmenting the digital data into acoustic frames, and (d) extracting acoustic feature vectors from the acoustic frames. The method also comprises the steps of (e) evaluating the acoustic frames for transient noise on a frame-by-frame basis, (f) rejecting those acoustic frames having transient noise, (g) accepting as speech frames those acoustic frames having no transient noise and, thereafter, (h) recognizing the user speech using the speech frames.
    Type: Grant
    Filed: May 13, 2010
    Date of Patent: October 15, 2013
    Assignee: General Motors LLC
    Inventors: Gaurav Talwar, Rathinavelu Chengalvarayan
  • Patent number: 8560301
    Abstract: A language expression apparatus and a method based on context and intent awareness are provided. The apparatus and method may recognize a context and an intent of a user and may generate a language expression based on the recognized context and the recognized intent, thereby providing an interpretation/translation service and/or providing an education service for learning a language.
    Type: Grant
    Filed: March 2, 2010
    Date of Patent: October 15, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Yeo Jin Kim
  • Publication number: 20130262098
    Abstract: A speech analysis apparatus is provided. An F0 extraction part extracts a pitch value from speech information. A spectrum extraction part extracts spectrum information from the speech information. An MVF extraction part extracts a maximum voiced frequency and allows boundary information for respectively filtering a harmonic component and a non-harmonic component to be obtained. According to the speech analysis apparatus, speech synthesis apparatus, and speech analysis synthesis system of the present invention, speech that is closer to the original voice and is more natural may be synthesized. Also, speech may be represented with less data capacity.
    Type: Application
    Filed: March 27, 2013
    Publication date: October 3, 2013
    Applicant: GWANGJU INSTITUTE OF SCIENCE AND TECHNOLOGY
    Inventors: Hong-Kook KIM, Kwang-Myung JEON
  • Publication number: 20130262099
    Abstract: According to one embodiment, an apparatus for applying pitch features in automatic speech recognition is provided. The apparatus includes a distribution evaluation module, normalization module, and random value adjusting module. The distribution evaluation module evaluates the global distribution of pitch features of voiced frames in speech signals, and the global distribution of random values for unvoiced frames in speech signals. The normalization module normalizes the global distribution of random values for unvoiced frames based on the global distribution of pitch features of voiced frames. The random value adjusting module adjusts random values for unvoiced frames based on the normalized global distribution, so that the adjusted random values can be assigned to unvoiced frames in speech signals as pitch features of the unvoiced frames.
    Type: Application
    Filed: March 28, 2013
    Publication date: October 3, 2013
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Pei DING, Liqiang HE
  • Patent number: 8548803
    Abstract: A system and method may be configured to process an audio signal. The system and method may track pitch, chirp rate, and/or harmonic envelope across the audio signal, may reconstruct sound represented in the audio signal, and/or may segment or classify the audio signal. A transform may be performed on the audio signal to place the audio signal in a frequency chirp domain that enhances the sound parameter tracking, reconstruction, and/or classification.
    Type: Grant
    Filed: August 8, 2011
    Date of Patent: October 1, 2013
    Assignee: The Intellisis Corporation
    Inventors: David C. Bradley, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher, Rodney Gateau, Derrick R. Roos, Eric Wiewiora
  • Patent number: 8542983
    Abstract: A method of generating a summary of an audio/visual data stream is provided, the data stream comprising a plurality of consecutive frames having audio and visual properties. A plurality of shots of an audio/visual data stream are detected (step 204). A plurality of segments of the audio/visual data stream are determined (step 206), each segment comprising a plurality of the shots of the data stream having similar visual properties. A segment of the determined plurality of segments is selected (step 208). For each shot of said selected segment of said data stream, the audio in a plurality of consecutive frames which occur after the end of said shot is extracted (step 210). At least one of the shots is selected based on the extracted audio (step 212). A summary is generated to include the selected at least one of the shots (step 214).
    Type: Grant
    Filed: June 2, 2009
    Date of Patent: September 24, 2013
    Assignee: Koninklijke Philips N.V.
    Inventors: Milan Pastrnak, Pedro Fonseca
  • Patent number: 8538765
    Abstract: A parameter decoding apparatus includes a prediction residue decoder that finds a quantized prediction residue based on encoded information included in a current frame subject to decoding and an auto-regressive predictor produces a predicted parameter by multiplying a predictive coefficient with a past decoded parameter. An adder decodes a parameter by adding the quantized prediction residue and the predicted parameter, wherein the prediction residue decoder, when the current frame is erased, finds a current-frame quantized prediction residue from a weighted linear sum of a parameter decoded in the past and a future-frame quantized prediction residue.
    Type: Grant
    Filed: May 17, 2013
    Date of Patent: September 17, 2013
    Assignee: Panasonic Corporation
    Inventor: Hiroyuki Ehara
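
As a rough illustration of the decoding arithmetic in the abstract above, the sketch below treats each parameter as a single scalar. The predictive coefficient, the two concealment weights, and the function names are hypothetical placeholders; the patent's actual values and vector dimensions are not given in this listing.

```python
def decode_parameter(quantized_residue, past_decoded, coeff):
    """Normal decoding: quantized residue plus the auto-regressive prediction."""
    predicted = coeff * past_decoded
    return quantized_residue + predicted

def concealed_residue(past_decoded, future_residue, w_past, w_future):
    """Erased frame: estimate the current residue as a weighted linear sum of
    a past decoded parameter and the future frame's quantized residue."""
    return w_past * past_decoded + w_future * future_residue

# Usage with made-up numbers: the current frame is erased, so its residue is
# estimated first and then decoded through the same predictor.
past = 2.0
residue = concealed_residue(past, future_residue=1.0, w_past=0.25, w_future=0.5)
print(decode_parameter(residue, past, coeff=0.5))
```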
  • Patent number: 8532984
    Abstract: Applications of dim-and-burst techniques to coding of wideband speech signals are described. Reconstruction of a highband portion of a frame of a wideband speech signal using information from a previous frame is also described.
    Type: Grant
    Filed: July 30, 2007
    Date of Patent: September 10, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Vivek Rajendran, Ananthapadmanabhan A. Kandhadai
  • Patent number: 8529473
    Abstract: A method and apparatus are provided for processing a set of communicated signals associated with a set of muscles, such as the muscles near the larynx of the person, or any other muscles the person uses to achieve a desired response. The method includes the steps of attaching a single integrated sensor, for example, near the throat of the person proximate to the larynx and detecting an electrical signal through the sensor. The method further includes the steps of extracting features from the detected electrical signal and continuously transforming them into speech sounds without the need for further modulation. The method also includes comparing the extracted features to a set of prototype features and selecting a prototype feature of the set of prototype features providing a smallest relative difference.
    Type: Grant
    Filed: July 27, 2012
    Date of Patent: September 10, 2013
    Inventors: Michael Callahan, Thomas Coleman
  • Patent number: 8532986
    Abstract: A speech signal evaluation apparatus includes: an acquisition unit that acquires, as a first frame, a speech signal of a specified length from speech signals; a first detection unit that detects, on the basis of a speech condition, whether the first frame is voiced or unvoiced; a variation calculation unit that, when the first frame is unvoiced, calculates a variation in a spectrum associated with the first frame on the basis of a spectrum of the first frame and a spectrum of a second frame that is unvoiced and precedes the first frame in time; and a second detection unit that detects, on the basis of a non-stationary condition based on the variation in spectrum, whether the variation of the first frame satisfies the non-stationary condition.
    Type: Grant
    Filed: March 24, 2010
    Date of Patent: September 10, 2013
    Assignee: Fujitsu Limited
    Inventor: Chikako Matsumoto
  • Patent number: 8527264
    Abstract: A method for determining mantissa bit allocation of frequency domain audio data to be encoded, including by performing adaptive low frequency compensation on each frequency band of a set of low frequency bands of the data.
    Type: Grant
    Filed: August 17, 2012
    Date of Patent: September 3, 2013
    Assignees: Dolby Laboratories Licensing Corporation, Dolby International AB
    Inventors: Arijit Biswas, Vinay Melkote, Michael Schug, Grant Allen Davidson, Mark Stuart Vinton
  • Patent number: 8521519
    Abstract: An adaptive sound source vector quantization device includes a first pitch cycle instructor, a search range calculator, and a second pitch cycle instructor. The first pitch cycle instructor successively instructs pitch cycle search candidates in a predetermined search range having a search resolution which transits over a predetermined pitch cycle candidate for the first sub-frame. The search range calculator calculates a predetermined range before and after the pitch cycle of the first sub-frame as the pitch cycle search range for the second sub-frame, if the predetermined range includes the predetermined pitch cycle search candidate. In the predetermined range, the search resolution transits over a boundary defined by the predetermined pitch cycle. The second pitch cycle instructor successively instructs the pitch cycle search candidates in the search range for the second sub-frame.
    Type: Grant
    Filed: February 29, 2008
    Date of Patent: August 27, 2013
    Assignee: Panasonic Corporation
    Inventors: Kaoru Sato, Toshiyuki Morii
  • Patent number: 8521530
    Abstract: A method, system, and computer program for enhancing a signal are presented. The signal is received, and energy estimates of the signal may be determined. At least one characteristic of the signal may be inferred based on the energy estimates. A mask may be generated based, in part, on the at least one characteristic. In turn, the mask may be applied to the signal to produce an enhanced signal, which may be outputted.
    Type: Grant
    Filed: June 30, 2008
    Date of Patent: August 27, 2013
    Assignee: Audience, Inc.
    Inventors: Mark Every, David Klein
  • Patent number: 8504365
    Abstract: Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrates little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: August 6, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst Schroeter
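
The variance test in the abstract above can be sketched as follows, assuming each speech sample has already been reduced to a fixed-length feature vector and that variation is measured as the mean pairwise Euclidean distance. Both the representation and the `min_variation` threshold are assumptions made for illustration, not details from the patent.

```python
from itertools import combinations
import math

def mean_pairwise_distance(samples):
    """Average Euclidean distance over all pairs of feature vectors."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        raise ValueError("need at least two speech samples to compare")
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

def verify(samples, min_variation=0.1):
    """Deny (False) when repetitions are identical or nearly so, which may
    indicate replayed or synthetic speech; accept (True) otherwise."""
    return mean_pairwise_distance(samples) >= min_variation

# Usage: identical repetitions are rejected, naturally varying ones pass.
print(verify([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]))
print(verify([[1.0, 2.0], [1.3, 1.8], [0.9, 2.4]]))
```

Raising `min_variation` corresponds loosely to the embodiment in which the threshold is adjusted to the required authentication certainty.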
  • Patent number: 8498861
    Abstract: An apparatus and method for concealing frame erasure and a voice decoding apparatus and method using the same. The frame erasure concealment apparatus includes: a parameter extraction unit determining whether there is an erased frame in a voice packet, and extracting an excitation signal parameter and a line spectrum pair parameter of a previous good frame; and an erasure frame concealment unit, if there is an erased frame, restoring the excitation signal and line spectrum pair parameter of the erased frame by using a regression analysis from the excitation signal and line spectrum pair parameter of the previous good frame. According to the method and apparatus, by predicting and restoring the parameter of the erased frame through the regression analysis, the quality of the restored voice signal can be enhanced and the algorithm can be simplified.
    Type: Grant
    Filed: May 22, 2012
    Date of Patent: July 30, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hosang Sung, Kangeun Lee, Seungho Choi
  • Patent number: 8494849
    Abstract: A method of transmitting speech data to a remote device in a distributed speech recognition system, includes the steps of: dividing an input speech signal into frames; calculating, for each frame, a voice activity value representative of the presence of speech activity in the frame; grouping the frames into multiframes, each multiframe including a predetermined number of frames; calculating, for each multiframe, a voice activity marker representative of the number of frames in the multiframe representing speech activity; and selectively transmitting, on the basis of the voice activity marker associated with each multiframe, the multiframes to the remote device.
    Type: Grant
    Filed: June 20, 2005
    Date of Patent: July 23, 2013
    Assignee: Telecom Italia S.p.A.
    Inventors: Ivano Salvatore Collotta, Donato Ettorre, Maurizio Fodrini, Pierluigi Gallo, Roberto Spagnolo
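
The multiframe selection in the abstract above can be sketched as below. The sketch assumes the per-frame voice activity value is a boolean flag, that the marker is simply the count of speech frames in a multiframe, and that a trailing partial group is still treated as a multiframe. The function name and both default parameters are hypothetical.

```python
def select_multiframes(frame_flags, frames_per_multiframe=4, min_speech_frames=1):
    """Return (multiframe_index, marker) for each multiframe worth sending.

    frame_flags holds one truthy/falsy voice activity value per frame.
    The marker counts speech frames in the multiframe; a multiframe is
    selected for transmission when its marker reaches min_speech_frames.
    """
    selected = []
    for start in range(0, len(frame_flags), frames_per_multiframe):
        group = frame_flags[start:start + frames_per_multiframe]
        marker = sum(1 for flag in group if flag)
        if marker >= min_speech_frames:
            selected.append((start // frames_per_multiframe, marker))
    return selected

# Usage: twelve frames grouped in fours; only multiframes with at least two
# speech frames would be transmitted to the remote recognizer.
flags = [0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1]
print(select_multiframes(flags, frames_per_multiframe=4, min_speech_frames=2))
```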
  • Patent number: 8494842
    Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
    Type: Grant
    Filed: November 3, 2008
    Date of Patent: July 23, 2013
    Assignee: Soundhound, Inc.
    Inventors: Aaron Master, Seyed Majid Emami
  • Patent number: 8494840
    Abstract: The invention relates to audio signal processing and speech enhancement. In accordance with one aspect, the invention combines a high-quality audio program that is a mix of speech and non-speech audio with a lower-quality copy of the speech components contained in the audio program for the purpose of generating a high-quality audio program with an increased ratio of speech to non-speech audio such as may benefit the elderly, hearing impaired or other listeners. Aspects of the invention are particularly useful for television and home theater sound, although they may be applicable to other audio and sound applications. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.
    Type: Grant
    Filed: February 12, 2008
    Date of Patent: July 23, 2013
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Hannes Muesch
  • Patent number: 8494844
    Abstract: A computerized method and system is provided for automatically selecting from a digitized sound sample a segment of the sample that is optimal for the purpose of measuring clinical metrics for voice and speech assessment. A quality measure based on quality parameters of segments of the sound sample is applied to candidate segments to identify the highest quality segment within the sound sample. The invention can optionally provide feedback to the speaker to help the speaker increase the quality of the sound sample provided. The invention also can optionally perform sound pressure level calibration and noise calibration. The invention may optionally compute clinical metrics on the selected segment and may further include a normative database method or system for storing and analyzing clinical measurements.
    Type: Grant
    Filed: November 19, 2009
    Date of Patent: July 23, 2013
    Assignee: Human Centered Technologies, Inc.
    Inventor: David N. Fernandes
  • Patent number: 8489392
    Abstract: A system and method for modeling speech in such a way that both voiced and unvoiced contributions can co-exist at certain frequencies. In various embodiments, three spectral bands (or bands of up to three different types) are used. In one embodiment, the lowest band or group of bands is completely voiced, the middle band or group of bands contains both voiced and unvoiced contributions, and the highest band or group of bands is completely unvoiced. The embodiments of the present invention may be used for speech coding and other speech processing applications.
    Type: Grant
    Filed: September 13, 2007
    Date of Patent: July 16, 2013
    Assignee: Nokia Corporation
    Inventors: Jani Nurminen, Sakari Himanen
  • Patent number: 8484022
    Abstract: A method and system for adaptive auto-encoders is disclosed. An input audio training signal may be transformed into a sequence of feature vectors, each bearing quantitative measures of acoustic properties of the input audio training signal. An auto-encoder may process the feature vectors to generate an encoded form of the quantitative measures, and a recovered form of the quantitative measures based on an inverse operation by the auto-encoder on the encoded form of the quantitative measures. A duplicate copy of the sequence of feature vectors may be normalized to form a normalized signal in which supra-phonetic acoustic properties are reduced in comparison with phonetic acoustic properties of the input audio training signal. The auto-encoder may then be trained to compensate for supra-phonetic features by reducing the magnitude of an error signal corresponding to a difference between the normalized signal and the recovered form of the quantitative measures.
    Type: Grant
    Filed: July 27, 2012
    Date of Patent: July 9, 2013
    Assignee: Google Inc.
    Inventor: Vincent Vanhoucke
  • Patent number: 8478585
    Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a region of the signal representing speech. The region can comprise a portion of a frame of the signal representing speech classified as a voiced frame. The region can be marked based on one or more pitch estimates for the region. A cord can be identified within the region based on occurrence of one or more events within the region of the signal. For example, the one or more events can comprise one or more glottal pulses. In such cases, the cord can begin with the onset of a first glottal pulse and extend to a point prior to the onset of a second glottal pulse. The cord may exclude a portion of the region of the signal prior to the onset of the second glottal pulse.
    Type: Grant
    Filed: October 19, 2012
    Date of Patent: July 2, 2013
    Assignee: Red Shift Company, LLC
    Inventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
  • Patent number: 8473284
    Abstract: A voice encoding/decoding method and apparatus. A voice encoder includes: a quantization selection unit generating a quantization selection signal; and a quantization unit extracting a linear prediction coding (LPC) coefficient from an input signal, converting the extracted LPC coefficient into a line spectral frequency (LSF), quantizing the LSF with a first LSF quantization unit or a second LSF quantization unit based on the quantization selection signal, and converting the quantized LSF into a quantized LPC coefficient. The quantization selection signal selects the first LSF quantization unit or second LSF quantization unit based on characteristics of a synthesized voice signal in previous frames of the input signal.
    Type: Grant
    Filed: April 4, 2005
    Date of Patent: June 25, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kangeun Lee, Hosang Sung, Kihyun Choo
  • Patent number: 8473283
    Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
    Type: Grant
    Filed: November 3, 2008
    Date of Patent: June 25, 2013
    Assignee: Soundhound, Inc.
    Inventors: Aaron Master, Seyed Majid Emami
  • Patent number: 8473282
    Abstract: In a sound processing device, a modulation spectrum specifier specifies a modulation spectrum of an input sound for each of a plurality of unit intervals. An index calculator calculates an index value corresponding to a magnitude of components of modulation frequencies belonging to a predetermined range of the modulation spectrum. A determinator determines whether the input sound of each of the unit intervals is a vocal sound or a non-vocal sound based on the index value.
    Type: Grant
    Filed: January 23, 2009
    Date of Patent: June 25, 2013
    Assignee: Yamaha Corporation
    Inventor: Yasuo Yoshioka
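The vocal/non-vocal decision in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the RMS envelope, the 10 ms frame length, the 2 to 8 Hz modulation band, the normalization by the DC term, and the threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import cmath
import math

def frame_envelope(samples, frame_len):
    """RMS amplitude of each non-overlapping frame (a coarse temporal envelope)."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def modulation_index(samples, sample_rate, frame_len=80, band_hz=(2.0, 8.0)):
    """Index value: modulation-spectrum magnitude inside band_hz, relative to DC.

    The band limits are an assumed range, not a value taken from the patent.
    """
    env = frame_envelope(samples, frame_len)
    n = len(env)
    env_rate = sample_rate / frame_len  # envelope samples per second

    def magnitude(k):
        # Magnitude of DFT bin k of the envelope (one bin of the modulation spectrum).
        return abs(sum(e * cmath.exp(-2j * math.pi * k * m / n)
                       for m, e in enumerate(env)))

    dc = magnitude(0)
    if dc == 0.0:
        return 0.0  # silent interval
    lo = math.ceil(band_hz[0] * n / env_rate)
    hi = math.floor(band_hz[1] * n / env_rate)
    return sum(magnitude(k) for k in range(lo, hi + 1)) / dc

def is_vocal(samples, sample_rate, threshold=0.1):
    """Classify one unit interval; the threshold is an illustrative assumption."""
    return modulation_index(samples, sample_rate) > threshold
```

Calling `is_vocal` once per unit interval (for example, one second of audio) mirrors the per-interval decision described in the abstract. A tone whose amplitude fluctuates at a few hertz yields a large index, while a steady tone yields an index near zero.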
  • Patent number: 8468014
    Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
    Type: Grant
    Filed: November 3, 2008
    Date of Patent: June 18, 2013
    Assignee: Soundhound, Inc.
    Inventors: Aaron Master, Seyed Majid Emami
  • Patent number: 8468015
    Abstract: A parameter decoding device performs a parameter compensation process so as to suppress degradation of a main observation quality in a prediction quantization. The parameter decoding device includes first amplifiers which multiply inputted quantization prediction residual vectors by a weighting coefficient. A further amplifier multiplies the preceding frame decoding LSF vector yn−1 by the weighting coefficient. An additional amplifier multiplies the code vector xn+1 outputted from a codebook by the weighting coefficient ?0. An adder calculates the total of the vectors outputted from the amplifiers, the further amplifier, and the additional amplifier. A selector switch selects the vector outputted from the adder if the frame erasure coding Bn of the current frame indicates that ‘the n-th frame is an erased frame’ and the frame erasure coding Bn+1 of the next frame indicates that ‘the n+1-th frame is a normal frame’.
    Type: Grant
    Filed: November 9, 2007
    Date of Patent: June 18, 2013
    Assignee: Panasonic Corporation
    Inventor: Hiroyuki Ehara
  • Patent number: 8463600
    Abstract: A system and method for automatically adjusting floor controls based on conversational characteristics is provided. Audio streams are received, which each originate from an audio source. Floor controls for a current configuration including at least a portion of the audio streams are maintained. Conversational characteristics shared by two or more of the audio sources are determined. Possible configurations for the audio streams are identified based on the conversational characteristics. An analysis of the current configuration and the possible configurations is performed. A change threshold comprising a minimum number of timeslices for at least one of the current configuration and one of the possible configurations is applied to the analysis. When the analysis satisfies the change threshold, the floor controls are automatically adjusted. The audio streams are mixed into one or more outputs based on the adjusted floor controls.
    Type: Grant
    Filed: February 27, 2012
    Date of Patent: June 11, 2013
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Paul Masami Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison Gyle Woodruff
  • Patent number: 8463599
    Abstract: A method includes defining a transition band for a signal having a spectrum within a first frequency band, where the transition band is defined as a portion of the first frequency band, and is located near an adjacent frequency band that is adjacent to the first frequency band. The method analyzes the transition band to obtain a transition band spectral envelope and a transition band excitation spectrum; estimates an adjacent frequency band spectral envelope; generates an adjacent frequency band excitation spectrum by periodic repetition of at least a part of the transition band excitation spectrum with a repetition period determined by a pitch frequency of the signal; and combines the adjacent frequency band spectral envelope and the adjacent frequency band excitation spectrum to obtain an adjacent frequency band signal spectrum. A signal processing logic for performing the method is also disclosed.
    Type: Grant
    Filed: February 4, 2009
    Date of Patent: June 11, 2013
    Assignee: Motorola Mobility LLC
    Inventors: Tenkasi Ramabadran, Mark Jasiuk
  • Publication number: 20130144612
    Abstract: A method for automatic segmentation of pitch periods of speech waveforms takes a speech waveform, a corresponding fundamental frequency contour of the speech waveform, which can be computed by some standard fundamental frequency detection algorithm, and optionally the voicing information of the speech waveform, which can be computed by some standard voicing detection algorithm, as inputs and calculates the corresponding pitch period boundaries of the speech waveform as outputs by iteratively • calculating the Fast Fourier Transform (FFT) of a speech segment having a length of approximately two periods, the period being calculated as the inverse of the mean fundamental frequency associated with these speech segments, • placing the pitch period boundary either at the position where the phase of the third FFT coefficient is −180 degrees, or at the position where the correlation coefficient of two speech segments shifted within the two period long analysis frame maximizes, or at a position calculated as a combination
    Type: Application
    Filed: December 29, 2010
    Publication date: June 6, 2013
    Applicant: SYNVO GMBH
    Inventor: Harald Romsdorfer
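The phase-based boundary placement in the abstract above can be sketched as follows. In a frame two periods long, the third DFT coefficient (zero-based bin 2) sits at the fundamental, so its phase tracks where the frame starts within the pitch cycle. This is a rough illustration only: the constant period, the convention that the analysis frame starts at the candidate boundary, the half-period hop between searches, and the default target phase of −180 degrees are assumptions, and the function names are hypothetical.

```python
import cmath
import math

def bin_phase_deg(segment, k=2):
    """Phase in degrees of DFT bin k; bin 2 is the 'third FFT coefficient'."""
    n = len(segment)
    coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(segment))
    return math.degrees(cmath.phase(coeff))

def circular_distance_deg(a, b):
    """Smallest absolute angle between two phases, so +180 and -180 coincide."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def find_boundary(signal, start, period, target_deg=-180.0):
    """Position in [start, start + period) whose two-period frame best hits the target phase."""
    frame_len = 2 * period
    return min(
        range(start, start + period),
        key=lambda p: circular_distance_deg(
            bin_phase_deg(signal[p:p + frame_len]), target_deg),
    )

def pitch_boundaries(signal, period, target_deg=-180.0):
    """Iterate over the waveform, placing one boundary per pitch period.

    Assumes a constant period in samples; the abstract instead derives the
    period from the local mean fundamental frequency.
    """
    boundaries = []
    pos = 0
    while pos + 3 * period <= len(signal):
        b = find_boundary(signal, pos, period, target_deg)
        boundaries.append(b)
        pos = b + period // 2  # restart the search half a period later
    return boundaries
```

For a pure cosine with a 50-sample period, the boundaries land 50 samples apart, at the points where the cosine reaches its minimum. The correlation-based alternative mentioned in the abstract is not covered by this sketch.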
  • Publication number: 20130144613
    Abstract: Encoding a sequence of digital speech samples into a bit stream includes dividing the digital speech samples into one or more frames, computing model parameters for a frame, and quantizing the model parameters to produce pitch bits conveying pitch information, voicing bits conveying voicing information, and gain bits conveying signal level information. One or more of the pitch bits are combined with one or more of the voicing bits and one or more of the gain bits to create a first parameter codeword that is encoded with an error control code to produce a first FEC codeword that is included in a bit stream for the frame. The process may be reversed to decode the bit stream.
    Type: Application
    Filed: January 18, 2013
    Publication date: June 6, 2013
    Applicant: DIGITAL VOICE SYSTEMS, INC.
    Inventor: DIGITAL VOICE SYSTEMS, INC.
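The codeword construction in the abstract above can be sketched in a few lines. The bit widths (7 pitch bits, 4 voicing bits, 5 gain bits), the choice of which bits go into the first codeword, and the triple-repetition code are all stand-in assumptions; the abstract does not name the error control code, and the function names are hypothetical.

```python
PITCH_BITS, VOICING_BITS, GAIN_BITS = 7, 4, 5  # hypothetical quantizer widths
WORD_BITS = 12                                 # size of the first parameter codeword

def top_bits(value, width, count):
    """Return the `count` most significant bits of a `width`-bit value."""
    return value >> (width - count)

def build_codeword(pitch, voicing, gain):
    """Combine 4 pitch MSBs, 4 voicing bits and 4 gain MSBs into one 12-bit word."""
    return ((top_bits(pitch, PITCH_BITS, 4) << 8)
            | (voicing << 4)
            | top_bits(gain, GAIN_BITS, 4))

def fec_encode(word, nbits=WORD_BITS):
    """Stand-in error control code: transmit the word three times."""
    return (word << (2 * nbits)) | (word << nbits) | word

def fec_decode(coded, nbits=WORD_BITS):
    """Bitwise majority vote over the three copies; fixes one error per bit position."""
    mask = (1 << nbits) - 1
    a = (coded >> (2 * nbits)) & mask
    b = (coded >> nbits) & mask
    c = coded & mask
    return (a & b) | (a & c) | (b & c)

def split_codeword(word):
    """Reverse build_codeword: recover the pitch MSBs, voicing bits and gain MSBs."""
    return (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF
```

Grouping the most significant bits of several parameters into one protected word means a channel error is more likely to hit bits that matter less perceptually. A real codec would use a stronger block code than repetition.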
  • Patent number: 8457953
    Abstract: A method of smoothing background noise in a telecommunication speech session comprises receiving and decoding (S10) a signal representative of a speech session, the signal comprising both a speech component and a background noise component; determining LPC parameters (S20) and an excitation signal (S30) for the received signal; and synthesizing and outputting (S40) an output signal based on the determined LPC parameters and excitation signal. In addition, the determined excitation signal is modified (S35) by reducing its power and spectral fluctuations to provide a smoothed output signal.
    Type: Grant
    Filed: February 13, 2008
    Date of Patent: June 4, 2013
    Assignee: Telefonaktiebolaget LM Ericsson (Publ)
    Inventor: Stefan Bruhn
  • Patent number: 8447592
    Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: May 21, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
  • Patent number: 8447617
    Abstract: There is provided a method or a device for extending a bandwidth of a first band speech signal to generate a second band speech signal wider than the first band speech signal and including the first band speech signal. The method comprises receiving a segment of the first band speech signal having a low cut off frequency and a high cut off frequency; determining the high cut off frequency of the segment; determining whether the segment is voiced or unvoiced; if the segment is voiced, applying a first bandwidth extension function to the segment to generate a first bandwidth extension in high frequencies; if the segment is unvoiced, applying a second bandwidth extension function to the segment to generate a second bandwidth extension in the high frequencies; using the first bandwidth extension and the second bandwidth extension to extend the first band speech signal beyond the high cut off frequency.
    Type: Grant
    Filed: March 15, 2010
    Date of Patent: May 21, 2013
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Norbert Rossello, Fabien Klein
  • Patent number: 8442817
    Abstract: A voice activity decision apparatus is provided that can accurately decide whether an input signal belongs to a sound interval or a silence interval, even when the input signal has many aperiodic components and/or plural mixed different periodic components. The apparatus 1 comprises: an autocorrelation calculating unit 11 for calculating autocorrelation values of an input signal; a delay calculating unit 12 for calculating plural delays at which autocorrelation values calculated by the autocorrelation calculating unit 11 become maximums; a noise deciding unit 13 for deciding whether the input signal is a noise or not based on the plurality of delays calculated by the delay calculating unit 12; and an activity decision unit 14 for performing the activity decision in terms of the input signal based on results of decision by the noise deciding unit 13 and the input signal.
    Type: Grant
    Filed: December 23, 2004
    Date of Patent: May 14, 2013
    Assignee: NTT DoCoMo, Inc.
    Inventors: Nobuhiko Naka, Tomoyuki Ohya
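The four units named in the abstract above (autocorrelation, delay calculation, noise decision, activity decision) can be sketched as a small pipeline. The abstract does not give the decision rules, so the ones here are assumptions: a frame counts as noise when no autocorrelation peak exceeds a floor, and as active when it is not noise and its power exceeds a threshold. All names and thresholds are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation values for lags 0..max_lag, normalized by frame energy."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return [0.0] * (max_lag + 1)
    n = len(frame)
    return [
        sum(frame[i] * frame[i + k] for i in range(n - k)) / energy
        for k in range(max_lag + 1)
    ]

def peak_delays(r, min_lag=20, floor=0.3):
    """Delays at which the autocorrelation has a local maximum above `floor`."""
    return [
        k for k in range(min_lag, len(r) - 1)
        if r[k] > floor and r[k] >= r[k - 1] and r[k] > r[k + 1]
    ]

def is_noise(delays):
    """Assumed rule: no significant autocorrelation peak means an aperiodic (noise) frame."""
    return len(delays) == 0

def is_active(frame, max_lag=100, power_floor=1e-4):
    """Activity decision from the noise decision plus the input signal's power."""
    power = sum(x * x for x in frame) / len(frame)
    delays = peak_delays(autocorrelation(frame, max_lag))
    return power > power_floor and not is_noise(delays)
```

Because `peak_delays` returns every qualifying delay rather than a single best lag, a frame that mixes several periodic components yields several delays. A fuller noise rule could examine the relationship between those delays, which the abstract implies but does not specify.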