Formant Patents (Class 704/209)
  • Patent number: 11495245
    Abstract: An urgency level estimation technique of estimating an urgency level of a speaker for free uttered speech, which does not require a specific word, is provided. An urgency level estimation apparatus includes a feature amount extracting part configured to extract a feature amount of an utterance from uttered speech, and an urgency level estimating part configured to estimate an urgency level of a speaker of the uttered speech from the feature amount based on a relationship between a feature amount extracted from uttered speech and an urgency level of a speaker of the uttered speech, the relationship being determined in advance, and the feature amount includes at least one of a feature indicating speaking speed of the uttered speech, a feature indicating voice pitch of the uttered speech and a feature indicating a power level of the uttered speech.
    Type: Grant
    Filed: November 15, 2018
    Date of Patent: November 8, 2022
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hosana Kamiyama, Satoshi Kobashikawa, Atsushi Ando
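The abstract of 11495245 above names three concrete feature families: speaking speed, voice pitch, and power. A minimal NumPy/SciPy sketch of how such utterance-level features could be computed; the autocorrelation pitch estimate, the envelope-peak speaking-rate proxy, and all constants are illustrative assumptions, not taken from the patent. In the patent's framing these features would then feed a pre-trained urgency estimator, which is not shown here.

```python
import numpy as np
from scipy.signal import find_peaks

def urgency_features(x, sr):
    """Illustrative utterance-level features: power, pitch, speaking rate."""
    x = np.asarray(x, dtype=float)

    # Power: RMS level of the whole utterance
    power = float(np.sqrt(np.mean(x ** 2)))

    # Pitch: lag of the strongest autocorrelation peak in 60-400 Hz
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo, hi = int(sr / 400), int(sr / 60)
    lag = lo + int(np.argmax(ac[lo:hi]))
    pitch_hz = sr / lag

    # Speaking speed: energy-envelope peaks per second as a syllable-rate proxy
    frame = int(0.02 * sr)
    env = np.array([np.sqrt(np.mean(x[i:i + frame] ** 2))
                    for i in range(0, len(x) - frame, frame)])
    peaks, _ = find_peaks(env, height=env.mean())
    speaking_rate = len(peaks) / (len(x) / sr)

    return {"power": power, "pitch_hz": pitch_hz, "speaking_rate": speaking_rate}
```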
  • Patent number: 11495200
    Abstract: A method of converting a frame of a voice sample to a singing frame includes obtaining a pitch value of the frame; obtaining formant information of the frame using the pitch value; obtaining aperiodicity information of the frame using the pitch value; obtaining a tonic pitch and chord pitches; using the formant information, the aperiodicity information, the tonic pitch, and the chord pitches to obtain the singing frame; and outputting or saving the singing frame.
    Type: Grant
    Filed: January 14, 2021
    Date of Patent: November 8, 2022
    Assignee: Agora Lab, Inc.
    Inventors: Jianyuan Feng, Ruixiang Hang, Linsheng Zhao, Fan Li
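One step in 11495200 above is mapping a frame's pitch onto a tonic and chord pitches before resynthesis. A small illustrative helper that snaps a pitch value to the nearest chord tone; the major-triad intervals and the snapping rule are assumptions, not the patent's method.

```python
import numpy as np

def snap_pitch_to_chord(pitch_hz, tonic_hz, chord_semitones=(0, 4, 7)):
    """Snap a frame's pitch to the nearest chord tone relative to the tonic.

    Unvoiced frames (pitch <= 0) are passed through; chord_semitones is a
    major triad here purely for illustration.
    """
    if pitch_hz <= 0:
        return pitch_hz
    semis = 12.0 * np.log2(pitch_hz / tonic_hz)
    octave, within = np.floor(semis / 12.0), semis % 12.0
    nearest = min(chord_semitones, key=lambda s: abs(within - s))
    return float(tonic_hz * 2.0 ** (octave + nearest / 12.0))
```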
  • Patent number: 11314945
    Abstract: In some embodiments, text for user consumption may be generated based on an intended user action category and a user profile. In some embodiments, an action category, a plurality of text seeds, and a profile comprising feature values may be obtained. Context values may be generated based on the feature values, and text generation models may be obtained based on the text seeds. In some embodiments, messages may be generated using the text generation models based on the action category and the context values. Weights associated with the messages may be determined, and a first text message of the messages may be sent to an address associated with the profile based on the weights. A first expected allocation value may then be updated based on a reaction value obtained in response to the first message.
    Type: Grant
    Filed: January 6, 2021
    Date of Patent: April 26, 2022
    Assignee: Capital One Services, LLC
    Inventors: Huong Nguyen, Isha Chaturvedi, Kalanand Mishra
  • Patent number: 11074926
    Abstract: A method for voice signal fatigue compensation includes sampling, using an audio signal capturing apparatus, a segment of a voice signal in a normal time series to form a normal series sample; generating, using a processor and a memory, a reversed series sample from the normal series sample; and constructing, by executing a time-series mixing component using the processor and the memory, a first synthesized segment by mixing the normal series sample and the reversed series sample, the first synthesized segment including a compensation for an instance of micro fatigue in the segment of the voice signal. The method also includes forming a fatigue-compensated voice segment from the first synthesized segment and outputting the first synthesized segment as the fatigue-compensated voice segment.
    Type: Grant
    Filed: January 7, 2020
    Date of Patent: July 27, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Aaron K. Baughman, Shikhar Kwatra, Gary William Reiss, Gray Cannon
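The central construction in 11074926 above, mixing a normal-order sample with its time-reversed copy, is simple to sketch; the fixed mixing weight below is an assumption, and any per-segment weighting the patent may use is not shown.

```python
import numpy as np

def fatigue_compensated_segment(segment, alpha=0.5):
    """Mix a voice segment with its time-reversed copy to form the
    synthesized segment described in the abstract; alpha is illustrative."""
    segment = np.asarray(segment, dtype=float)
    return alpha * segment + (1.0 - alpha) * segment[::-1]
```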
  • Patent number: 10964307
    Abstract: A method for adjusting a voice frequency and a sound playing device thereof are disclosed. The method includes the following steps: acquiring an input voice; when the input voice has a consonant, detecting whether the main frequency range of the consonant contains an ambient sound whose energy is enough to disturb the consonant; if not, the frequency of the consonant is not shifted and the consonant is output; and if so, the consonant is shifted to a target frequency that avoids the ambient sound, forming a frequency-shifted consonant. The frequency-shifted consonant is then output to form an output voice. The target frequency is located near the main frequency of the consonant, at a point where no other ambient sound exists or where the energy of any other ambient sound is not enough to disturb the consonant.
    Type: Grant
    Filed: March 21, 2019
    Date of Patent: March 30, 2021
    Assignee: PIXART IMAGING INC.
    Inventors: Yu-Chieh Huang, Kuan-Li Chao, Neo Bob Chih-Yung Young, Kuo-Ping Yang
  • Patent number: 10565533
    Abstract: Exemplary embodiments of the present disclosure provide for identifying similar trademarks from one or more repositories based on training a goods and/or services similarity engine to identify similarities between pairs of descriptions of goods and/or services in a corpus of training data that includes the descriptions of goods and/or services for registered trademarks and trademark classes associated with the descriptions of goods and/or services. A goods and/or services similarity value indicative of similarities between a reference description of goods and/or services and descriptions of goods and/or services associated with registered trademarks can be generated by a goods and/or services similarity engine and a presentation of at least a subset of the set of trademarks can be generated that includes graphics emphasizing the registered trademarks in the subset based, at least in part, on the plurality of goods and/or services similarity values.
    Type: Grant
    Filed: May 19, 2016
    Date of Patent: February 18, 2020
    Assignee: Camelot UK Bidco Limited
    Inventors: Peter Keyngnaert, Jan Waerniers, Ann Smet
  • Patent number: 10403274
    Abstract: An automatic speech recognition device with detection of at least one contextual element, and its application to aircraft flying and maintenance, are provided. The automatic speech recognition device comprises a unit for acquiring an audio signal, a device for detecting the state of at least one contextual element, and a language decoder for determining an oral instruction corresponding to the audio signal. The language decoder comprises at least one acoustic model defining an acoustic probability law and at least two syntax models each defining a syntax probability law.
    Type: Grant
    Filed: September 14, 2016
    Date of Patent: September 3, 2019
    Assignee: DASSAULT AVIATION
    Inventors: Hervé Girod, Paul Kou, Jean-François Saez
  • Patent number: 10297268
    Abstract: A voice signal processing apparatus and a voice signal processing method are provided. A consonant signal judgment condition of a target voice frame is adjusted according to whether the original voice sampling signal corresponding to the previous voice frame adjacent to the target voice frame is a consonant signal, so as to improve listening comfort and recognition of the voice signal.
    Type: Grant
    Filed: November 2, 2017
    Date of Patent: May 21, 2019
    Assignee: Acer Incorporated
    Inventors: Po-Jen Tu, Jia-Ren Chang, Kai-Meng Tzeng
  • Patent number: 10235998
    Abstract: Systems and methods are disclosed. A digitized human vocal expression of a user and digital images are received over a network from a remote device. The digitized human vocal expression is processed to determine characteristics of the human vocal expression, including pitch, volume, rapidity, a magnitude spectrum, and/or pauses in speech. Digital images are received and processed to detect characteristics of the user's face, including detecting if one or more of the following is present: a sagging lip, a crooked smile, uneven eyebrows, and/or facial droop. Based at least in part on the human vocal expression characteristics and face characteristics, a determination is made as to what action is to be taken. A cepstrum pitch may be determined using an inverse Fourier transform of a logarithm of a spectrum of a human vocal expression signal. The volume may be determined using peak heights in a power spectrum of the human vocal expression.
    Type: Grant
    Filed: February 28, 2018
    Date of Patent: March 19, 2019
    Inventor: Karen Elaine Khaleghi
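The cepstrum-pitch step in 10235998 above is stated explicitly (inverse Fourier transform of the log spectrum) and fits in a few lines; the window and the 60–400 Hz search range are illustrative choices.

```python
import numpy as np

def cepstrum_pitch(x, sr, fmin=60.0, fmax=400.0):
    """Cepstral pitch: inverse FFT of the log magnitude spectrum, then the
    quefrency of the largest peak in the plausible pitch-lag range."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x * np.hanning(len(x)))
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    cepstrum = np.fft.irfft(log_mag)
    q_lo, q_hi = int(sr / fmax), int(sr / fmin)
    q = q_lo + int(np.argmax(cepstrum[q_lo:q_hi]))
    return sr / q
```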
  • Patent number: 10192561
    Abstract: An audio processor for processing an audio signal includes an audio signal phase measure calculator configured for calculating a phase measure of an audio signal for a time frame, a target phase measure determiner for determining a target phase measure for the time frame, and a phase corrector configured for correcting phases of the audio signal for the time frame using the calculated phase measure and the target phase measure to obtain a processed audio signal.
    Type: Grant
    Filed: December 28, 2016
    Date of Patent: January 29, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Mikko-Ville Laitinen, Ville Pulkki
  • Patent number: 10141001
    Abstract: An apparatus includes a first calculator configured to determine a long-term noise estimate of the audio signal. The apparatus also includes a second calculator configured to determine a formant-sharpening factor based on the determined long-term noise estimate. The apparatus includes a filter configured to filter a codebook vector to generate a filtered codebook vector. The filter is based on the determined formant-sharpening factor, and the codebook vector is based on information from the audio signal. The apparatus further includes an audio coder configured to generate a formant-sharpened low-band excitation signal based on the filtered codebook vector.
    Type: Grant
    Filed: June 28, 2017
    Date of Patent: November 27, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatraman Atti, Vivek Rajendran, Venkatesh Krishnan
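One way to read 10141001 above: a long-term noise estimate drives a formant-sharpening factor, which parameterizes a filter applied to the codebook vector. The sketch below uses the common A(z/γ1)/A(z/γ2) weighting form and an invented noise-to-factor mapping; neither the constants nor the filter form are claimed to be the patent's. With factor > 0.75, the denominator poles sit closer to the LPC poles than the numerator zeros, so the spectral peaks (formants) are emphasized.

```python
import numpy as np
from scipy.signal import lfilter

def sharpening_factor(noise_db, lo=0.60, hi=0.85):
    """Map a long-term noise estimate (dB) to a sharpening factor.
    The linear law and the bounds are illustrative assumptions."""
    t = np.clip((noise_db - 20.0) / 40.0, 0.0, 1.0)   # more noise -> stronger
    return lo + t * (hi - lo)

def sharpen_codebook(vec, lpc, factor):
    """Filter a codebook vector with the weighting filter A(z/0.75)/A(z/factor)
    so peaks of the LPC envelope (formants) are emphasized."""
    a = np.asarray(lpc, dtype=float)                  # [1, a1, ..., ap]
    num = a * 0.75 ** np.arange(len(a))
    den = a * factor ** np.arange(len(a))
    return lfilter(num, den, vec)
```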
  • Patent number: 10043525
    Abstract: A method is provided for extending the frequency band of an audio signal during a decoding or improvement process. The method includes obtaining the decoded signal in a first frequency band, referred to as a low band. Tonal components and a surround signal are extracted from the low-band signal, and the tonal components and the surround signal are combined by adaptive mixing using energy-level control factors to obtain an audio signal, referred to as a combined signal. The low-band decoded signal before the extraction step, or the combined signal after the combination step, is extended over at least one second frequency band which is higher than the first frequency band. Also provided are a frequency-band extension device which implements the described method and a decoder including a device of this type.
    Type: Grant
    Filed: February 4, 2015
    Date of Patent: August 7, 2018
    Assignee: KONINKLIJKE PHILIPS N.V.
    Inventors: Magdalena Kaniewska, Stephane Ragot
  • Patent number: 9998081
    Abstract: An apparatus comprising at least one processor and at least one memory including computer program code. The at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to determine a loudness estimate of a first audio signal, generate a parameter dependent on the loudness estimate, and control the first audio signal dependent on the parameter.
    Type: Grant
    Filed: May 12, 2010
    Date of Patent: June 12, 2018
    Assignee: Nokia Technologies Oy
    Inventors: Jukka Vesa Rauhala, Koray Ozcan
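The chain in 9998081 above (loudness estimate, derived parameter, signal control) can be illustrated with a crude stand-in that uses RMS level as the loudness estimate and a target-level gain as the derived parameter; the real loudness model and control rule are unspecified in the abstract.

```python
import numpy as np

def loudness_controlled_gain(x, target_db=-23.0):
    """Loudness estimate (RMS level in dB), a gain parameter derived from it,
    and the controlled signal; a crude stand-in for the patent's model."""
    x = np.asarray(x, dtype=float)
    rms_db = 20.0 * np.log10(np.sqrt(np.mean(x ** 2)) + 1e-12)
    gain = 10.0 ** ((target_db - rms_db) / 20.0)
    return gain * x, gain
```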
  • Patent number: 9911358
    Abstract: A real-time wireless system for recording natural tongue movements in the 3D oral space. By attaching a small magnetic tracer to the tongue, either temporarily or semi-permanently, and placing an array of magnetic sensors around the mouth, the tracer can be localized with sub-millimeter precision. The system can also combine the tracer localization with acoustic, video, and flow data via additional sensors to form a comprehensive audiovisual biofeedback mechanism for diagnosing speech impairments and improving speech therapy. Additionally, the system can record tongue trajectories and create an indexed library of such traces. The indexed library can be used as a tongue tracking silent speech interface. The library can synthesize words, phrases, or execute commands tied to the individual patterns of magnetic field variations or tongue trajectories.
    Type: Grant
    Filed: May 19, 2014
    Date of Patent: March 6, 2018
    Assignee: Georgia Tech Research Corporation
    Inventors: Maysam Ghovanloo, Jacob Block
  • Patent number: 9886959
    Abstract: A voice encoder/decoder (vocoder) may receive a voice sample, generate zero crossings of the voice sample in response to voice excitation in a first formant, and create a corresponding output signal. Additional operations may include dividing the output signal by two and sampling the output signal at a predefined frequency such that the resulting combination uses half of a bit rate for an excitation and the remainder for short-term spectrum analysis.
    Type: Grant
    Filed: October 9, 2013
    Date of Patent: February 6, 2018
    Assignee: Open Invention Network LLC
    Inventor: Clyde Holmes
  • Patent number: 9818429
    Abstract: A method and apparatus for encoding or decoding an audio signal are provided. In the method and apparatus, a noise-floor level to use in encoding or decoding a high frequency signal is updated according to the degree of voiced or unvoiced sound included in the signal.
    Type: Grant
    Filed: October 9, 2015
    Date of Patent: November 14, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki-hyun Choo, Eun-mi Oh, Ho-sang Sung, Jung-Hoe Kim, Mi-young Kim
  • Patent number: 9763006
    Abstract: In an approach for reducing feedback in a device, a first audio signal is received by the device. A processor determines that an intensity value of the first audio signal is increasing above a predetermined threshold value. The processor determines whether the first audio signal includes a vowel type sound. Responsive to determining that the first audio signal does not include the vowel type sound, the processor clips the first audio signal.
    Type: Grant
    Filed: March 26, 2015
    Date of Patent: September 12, 2017
    Assignee: International Business Machines Corporation
    Inventors: Anusha Chippigiri Acharya, Mukundan Sundararajan
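A toy version of the decision in 9763006 above: clip a frame only when its level is rising above a threshold and a vowel test fails. The low-band energy ratio standing in for the vowel detector, and every threshold, are assumptions for illustration.

```python
import numpy as np

def suppress_feedback(frame, prev_rms, threshold=0.2, vowel_ratio=0.5):
    """Clip the frame only when its level is rising above the threshold and
    the frame does not look vowel-like (low-band energy ratio as a crude
    stand-in for a vowel detector). All constants are illustrative."""
    frame = np.asarray(frame, dtype=float)
    rms = float(np.sqrt(np.mean(frame ** 2)))
    spectrum = np.abs(np.fft.rfft(frame))
    low_band_ratio = spectrum[: len(spectrum) // 4].sum() / (spectrum.sum() + 1e-12)
    vowel_like = low_band_ratio > vowel_ratio
    if rms > threshold and rms > prev_rms and not vowel_like:
        frame = np.clip(frame, -threshold, threshold)
    return frame, rms
```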
  • Patent number: 9672833
    Abstract: Provided are methods and systems for concealing missing segments and/or discontinuities in an audio signal, thereby restoring the continuity of the signal. The methods and systems are designed for and targeted at audio signals, are based on interpolation and extrapolation operations for sinusoids, and do not rely on the assumption that the sinusoids are harmonic. The methods and systems are improvements over existing audio concealment approaches in that, among other advantages, the methods and systems facilitate asynchronous interpolation, use an interpolation procedure that corresponds to time-domain waveform interpolation if the signal is harmonic, and have a peak selection procedure that is effective for audio signals.
    Type: Grant
    Filed: February 28, 2014
    Date of Patent: June 6, 2017
    Assignee: Google Inc.
    Inventors: Willem Bastiaan Kleijn, Turaj Zakizadeh Shabestary
  • Patent number: 9646633
    Abstract: Method and device of processing audio signals are disclosed. The method includes: obtaining a set of data, the set of data comprising LSP parameters for an audio signal; determining a set of sampling data points from the set of LSP parameters using a predetermined sampling rule, the set of sampling data points including spectrum amplitude values for a plurality of sampled frequency values; identifying one or more local maxima among the set of sampling data points, and a respective preceding local minimum and a respective succeeding local minimum for each of the identified local maxima; for each of the identified local maxima, shifting one or more of the set of data comprising LSP parameters located between the respective preceding local minimum and the respective succeeding local minimum of an identified local maximum towards the identified local maximum; and adjusting the set of data comprising LSP parameters using an energy coefficient.
    Type: Grant
    Filed: June 16, 2016
    Date of Patent: May 9, 2017
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Xiaoping Wu
  • Patent number: 9576550
    Abstract: A low voltage differential signaling (LVDS) transmitter may include an LVDS transmission device configured to generate a transmission clock and serial data, which are synchronized to the transmission clock on respective clock and data channels. The transmission clock may have different signal patterns when the LVDS transmission device is operating in normal and de-skew modes of operation. A de-skew controller is also provided, which is electrically coupled to the LVDS transmission device. The de-skew controller is configured to drive the LVDS transmission device with control signals that switch the LVDS transmission device between the normal and de-skew modes of operation. A duty cycle of the transmission clock during the de-skew mode of operation may be unequal to a duty cycle of the transmission clock during the normal mode of operation.
    Type: Grant
    Filed: January 15, 2015
    Date of Patent: February 21, 2017
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Phil Jae Jeon
  • Patent number: 9576590
    Abstract: An apparatus comprising at least one processor and at least one memory including computer code for one or more programs, the at least one memory and the computer code configured to, with the at least one processor, cause the apparatus to at least perform: estimating a signal to noise ratio value for an audio signal; and generating a post-filter comprising at least one of a first formant frequency filter and a second formant frequency filter, wherein the post-filter is dependent on the signal to noise ratio value for the audio signal.
    Type: Grant
    Filed: February 24, 2012
    Date of Patent: February 21, 2017
    Assignee: Nokia Technologies Oy
    Inventors: Jari Sjoberg, Ville Myllyla, Emma Johanna Jokinen, Paavo Ilmari Alku, Hannu Juhani Pulakka
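A sketch of an SNR-dependent post-filter in the spirit of 9576590 above, boosting bands around nominal first and second formant frequencies more strongly at low SNR; the formant centres, Q, and gain law are invented for illustration since the abstract gives no numbers.

```python
import numpy as np
from scipy.signal import iirpeak, lfilter

def snr_formant_postfilter(x, sr, snr_db, f1=700.0, f2=1800.0):
    """Boost narrow bands around nominal F1/F2 centres, more strongly when
    the estimated SNR is low. Centres, Q and the gain law are illustrative."""
    x = np.asarray(x, dtype=float)
    gain_db = np.clip(12.0 - 0.4 * snr_db, 0.0, 12.0)   # low SNR -> more boost
    g = 10.0 ** (gain_db / 20.0)
    y = x.copy()
    for fc in (f1, f2):
        b, a = iirpeak(fc, Q=2.0, fs=sr)                 # unity-gain resonator at fc
        y = y + (g - 1.0) * lfilter(b, a, x)
    return y
```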
  • Patent number: 9524650
    Abstract: An automated training system comprising a database containing audio files and a training script that defines a sequence of the audio files making up a training call. The system includes a training engine that automatically makes a call to an external system via a first communications connection, executes the training script and outputs audio data contained in the audio files to the external system via the first communications connection in accordance with the training script. The system has a response receiver that receives voice data from the external system, the voice data representing the voice responses of a user of the external system to the training call. A method for training an employee using the automated training system is also disclosed.
    Type: Grant
    Filed: June 3, 2013
    Date of Patent: December 20, 2016
    Inventor: Hesam Yavari
  • Patent number: 9312826
    Abstract: Systems and methods are described to automatically balance acoustic channel sensitivity. A long-term power level of a main acoustic signal is calculated to obtain an averaged main acoustic signal. Segments of the main acoustic signal are excluded from the averaged main acoustic signal using a desired voice activity detection signal. A long-term power level of a reference acoustic signal is calculated to obtain an averaged reference acoustic signal. Segments of the reference acoustic signal are excluded from the averaged reference acoustic signal using a desired voice activity detection signal. An amplitude correction signal is created using the averaged main acoustic signal and the averaged reference acoustic signal.
    Type: Grant
    Filed: March 12, 2014
    Date of Patent: April 12, 2016
    Assignee: KOPIN CORPORATION
    Inventor: Dashen Fan
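The long-term averaging with VAD-based exclusion in 9312826 above can be sketched directly. Which frames are excluded (here, frames where the desired-voice detector is active) is an interpretation of the abstract, and the frame-power average is an illustrative choice of long-term level.

```python
import numpy as np

def channel_balance_gain(main_frames, ref_frames, desired_voice_flags):
    """Average frame power of each channel over frames where the desired-voice
    detector is inactive, and return the amplitude correction that matches the
    reference channel's long-term level to the main channel's."""
    main = np.asarray(main_frames, dtype=float)          # (n_frames, frame_len)
    ref = np.asarray(ref_frames, dtype=float)
    keep = ~np.asarray(desired_voice_flags, dtype=bool)  # exclude desired-voice frames
    main_power = np.mean(main[keep] ** 2)
    ref_power = np.mean(ref[keep] ** 2)
    return float(np.sqrt(main_power / (ref_power + 1e-12)))
```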
  • Patent number: 9268845
    Abstract: Systems and methods for audio matching using interest point overlap are disclosed herein. The systems include determining at least one matching reference segment based on a probe segment. Interest points for both the at least one matching reference segment and the probe segment can be generated. Probe segment interest points and matching reference segment interest points can be time aligned and frequency aligned. A count can be generated based on the number of overlapping interest points between each set of reference interest points and the set of probe segment interest points. The disclosed systems and methods allow false-positive reference matches to be identified and eliminated based on the count. Eliminating false-positive matches improves the accuracy of an audio matching system.
    Type: Grant
    Filed: March 8, 2012
    Date of Patent: February 23, 2016
    Assignee: GOOGLE INC.
    Inventors: Matthew Sharifi, Gheorghe Postelnicu, Annie Chen, Dominik Roblek
  • Patent number: 9141604
    Abstract: The present disclosure describes a novel method of analyzing and presenting human emotion in real time during a session such as chat, video, audio, or a combination thereof. The analysis is done using semiotic analysis and hierarchical slope clustering to give feedback on the current session or historical sessions to the user or any professional. The method and system are useful for recognizing the reaction to a particular session or detecting abnormal behavior. The method and system, with a unique algorithm, provide instant feedback on whether to stay the course or change strategy to achieve a desired result during the session.
    Type: Grant
    Filed: February 22, 2013
    Date of Patent: September 22, 2015
    Assignee: RIAEX INC
    Inventors: Rajkumar Thirumalainambi, Shubha Ranjan
  • Patent number: 9117455
    Abstract: Systems and methods for adaptively processing speech to improve voice intelligibility are described. These systems and methods can adaptively identify and track formant locations, thereby enabling formants to be emphasized as they change. As a result, these systems and methods can improve near-end intelligibility, even in noisy environments. The systems and methods can be implemented in Voice-over IP (VoIP) applications, telephone and/or video conference applications (including on cellular phones, smart phones, and the like), laptop and tablet communications, and the like. The systems and methods can also enhance non-voiced speech, which can include speech generated without the vocal tract, such as transient speech.
    Type: Grant
    Filed: July 26, 2012
    Date of Patent: August 25, 2015
    Assignee: DTS LLC
    Inventors: James Tracey, Daekyong Noh, Xing He
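A single-frame version of the formant identification that a tracker like the one in 9117455 above would start from: LPC analysis by the autocorrelation method, then formant estimates from the root angles of the LPC polynomial. The adaptive tracking and emphasis across frames are not shown, and the bandwidth screening of roots is omitted for brevity.

```python
import numpy as np

def lpc_formants(frame, sr, order=12):
    """Formant frequency estimates for one frame from the root angles of an
    LPC polynomial (autocorrelation method, Levinson-Durbin recursion)."""
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    r = np.correlate(x, x, mode="full")[len(x) - 1: len(x) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12
    for i in range(1, order + 1):                     # Levinson-Durbin
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        a[1:i + 1] = a[1:i + 1] + k * a[i - 1::-1][:i]
        err *= (1.0 - k * k)
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0.0]               # one root per conjugate pair
    freqs = np.sort(np.angle(roots) * sr / (2.0 * np.pi))
    return freqs[freqs > 90.0]                        # discard near-DC roots
```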
  • Patent number: 9031835
    Abstract: A method of improving the perceived loudness and sharpness of a reconstructed speech signal delimited by a predetermined bandwidth includes providing (S10) the speech signal and separating (S20) the provided signal into at least a first and a second signal portion. Subsequently, the first signal portion is adapted (S30) to emphasize at least a predetermined frequency or frequency interval within the first bandwidth portion. Finally, the second signal portion is reconstructed (S40) based on at least the first signal portion, and the adapted first signal portion and the reconstructed second signal portion are combined (S50) to provide a reconstructed speech signal with an overall improved perceived loudness and sharpness.
    Type: Grant
    Filed: June 29, 2010
    Date of Patent: May 12, 2015
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Volodya Grancharov, Sigurdur Sverrisson
  • Patent number: 8990081
    Abstract: A method of analyzing an audio signal is disclosed. A digital representation of an audio signal is received and a first output function is generated based on a response of a physiological model to the digital representation. At least one property of the first output function may be determined. One or more values are determined for use in analyzing the audio signal, based on the determined property of the first output function.
    Type: Grant
    Filed: September 11, 2009
    Date of Patent: March 24, 2015
    Assignee: Newsouth Innovations Pty Limited
    Inventors: Wenliang Lu, Dipanjan Sen
  • Patent number: 8949125
    Abstract: Systems and methods are provided to select the most typical pronunciation of a location name on a map from a plurality of user pronunciations. A server generates a reference speech model based on the user pronunciations, compares the user pronunciations with the speech model, and selects a pronunciation based on the comparison. Alternatively, the server compares the distances between each user pronunciation and every other user pronunciation and selects a pronunciation based on the comparison. The server then annotates the map with the selected pronunciation and provides the audio output of the location name to a user device upon a user's request.
    Type: Grant
    Filed: June 16, 2010
    Date of Patent: February 3, 2015
    Assignee: Google Inc.
    Inventor: Gal Chechik
  • Patent number: 8949128
    Abstract: Techniques for providing speech output for speech-enabled applications. A synthesis system receives from a speech-enabled application a text input including a text transcription of a desired speech output. The synthesis system selects one or more audio recordings corresponding to one or more portions of the text input. In one aspect, the synthesis system selects from audio recordings provided by a developer of the speech-enabled application. In another aspect, the synthesis system selects an audio recording of a speaker speaking a plurality of words. The synthesis system forms a speech output including the one or more selected audio recordings and provides the speech output for the speech-enabled application.
    Type: Grant
    Filed: February 12, 2010
    Date of Patent: February 3, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Darren C. Meyer, Corinne Bos-Plachez, Martine Marguerite Staessen
  • Patent number: 8949116
    Abstract: A signal processing method is provided. The signal processing method includes extracting a first signal having a first frequency band from a sum signal of a left signal and a right signal, generating a second signal having a second frequency band by using the first signal, generating a third signal by using the first signal and the second signal, and applying a gain, generated by using a rate of a center signal included in the sum signal, to the third signal.
    Type: Grant
    Filed: January 28, 2011
    Date of Patent: February 3, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Jae-Hyun Kim
  • Patent number: 8930182
    Abstract: Method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.
    Type: Grant
    Filed: March 17, 2011
    Date of Patent: January 6, 2015
    Assignee: International Business Machines Corporation
    Inventors: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo
  • Patent number: 8924200
    Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: December 30, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Patent number: 8914291
    Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
    Type: Grant
    Filed: September 24, 2013
    Date of Patent: December 16, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Darren C. Meyer, Stephen R. Springer
  • Patent number: 8879762
    Abstract: A method and apparatus to evaluate a quality of an audio signal, in which the number of effective channels is determined for each of a reference signal of a current frame and a test signal indicative of the reference signal that has passed through an audio codec, and an audio quality evaluation score of the current frame is calculated by evaluating an audio quality of the current frame based on the determined number of effective channels for each of the reference signal and the test signal by means of a predetermined evaluator.
    Type: Grant
    Filed: January 28, 2010
    Date of Patent: November 4, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: In-Yong Choi
  • Patent number: 8868432
    Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: October 21, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Publication number: 20140309992
    Abstract: Formant frequencies in a voiced speech signal are detected by filtering the speech signal into multiple frequency channels, determining whether each of the frequency channels meets an energy criterion, and determining minima in envelope fluctuations. The identified formant frequencies can then be enhanced by identifying and amplifying the harmonic of the fundamental frequency (F0) closest to the formant frequency.
    Type: Application
    Filed: April 16, 2014
    Publication date: October 16, 2014
    Applicant: UNIVERSITY OF ROCHESTER
    Inventor: Laurel H. Carney
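The enhancement step in 20140309992 above, amplifying the F0 harmonic nearest an identified formant, can be sketched as a narrow-band FFT gain; the gain and band-width values are illustrative, and the channel-filtering detection stage is not shown.

```python
import numpy as np

def enhance_nearest_harmonic(x, sr, f0, formant_hz, gain_db=6.0, bw_hz=50.0):
    """Amplify the F0 harmonic closest to an identified formant frequency by
    scaling FFT bins in a narrow band around it."""
    x = np.asarray(x, dtype=float)
    harmonic = f0 * max(1, int(round(formant_hz / f0)))  # nearest harmonic of F0
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    band = np.abs(freqs - harmonic) <= bw_hz / 2.0
    spectrum[band] *= 10.0 ** (gain_db / 20.0)
    return np.fft.irfft(spectrum, n=len(x))
```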
  • Patent number: 8861689
    Abstract: Methods and systems to facilitate communications between users via different modalities. A method includes identifying, by a first user device, a voice call originating from a second user device, and presenting a user interface to a user of the first user device, where the user interface provides an option to respond to the voice call by voice and an option to respond to the voice call in a text form. The method further includes detecting that the user of the first user device has selected the option to respond to the voice call in the text form, and causing a user response to the voice call to be converted into voice data for the second user device.
    Type: Grant
    Filed: June 8, 2012
    Date of Patent: October 14, 2014
    Assignee: Amazon Technologies, Inc.
    Inventor: Marcello Typrin
  • Patent number: 8825486
    Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
    Type: Grant
    Filed: January 22, 2014
    Date of Patent: September 2, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Darren C. Meyer, Stephen R. Springer
  • Patent number: 8804035
    Abstract: A method and system for communicating text descriptive data has a receiving device that receives a data signal having text description data corresponding to a description of a video signal. A text-to-speech converter associated with the receiving device converts the text description data to a first audio signal. A display device in communication with the text-to-speech converter converts the first audio signal associated with the receiving device to an audible signal.
    Type: Grant
    Filed: September 25, 2012
    Date of Patent: August 12, 2014
    Assignee: The DIRECTV Group, Inc.
    Inventors: Scott D. Casavant, Brian D. Jupin, Stephen P. Dulac
  • Patent number: 8793123
    Abstract: An apparatus for converting an audio signal into a parameterized representation has a signal analyzer for analyzing a portion of the audio signal to obtain an analysis result; a band pass estimator for estimating information on a plurality of band pass filters based on the analysis result, wherein the information on the plurality of band pass filters has information on a filter shape for the portion of the audio signal, and wherein the bandwidth of a band pass filter is different over the audio spectrum and depends on the center frequency of the band pass filter; a modulation estimator for estimating an amplitude modulation, a frequency modulation or a phase modulation for each band of the plurality of band pass filters for the portion of the audio signal using the information on the plurality of band pass filters; and an output interface for transmitting, storing or modifying the information on the amplitude modulation, the information on the frequency modulation or phase modulation, or the information on the plurality of band pass filters.
    Type: Grant
    Filed: March 10, 2009
    Date of Patent: July 29, 2014
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventor: Sascha Disch
  • Patent number: 8793124
    Abstract: A scheme to judge emphasized speech portions is provided, wherein the judgment is executed by statistical processing in terms of a set of speech parameters including a fundamental frequency, power, and a temporal variation of a dynamic measure and/or their derivatives. The emphasized speech portions are used as clues to summarize an audio content or a video content with speech.
    Type: Grant
    Filed: April 5, 2006
    Date of Patent: July 29, 2014
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Kota Hidaka, Shinya Nakajima, Osamu Mizuno, Hidetaka Kuwano, Haruhiko Kojima
  • Patent number: 8788256
    Abstract: Computer implemented speech processing generates one or more pronunciations of an input word in a first language by a non-native speaker of the first language who is a native speaker of a second language. The input word is converted into one or more pronunciations. Each pronunciation includes one or more phonemes selected from a set of phonemes associated with the second language. Each pronunciation is associated with the input word in an entry in a computer database. Each pronunciation in the database is associated with information identifying a pronunciation language and/or a phoneme language.
    Type: Grant
    Filed: February 2, 2010
    Date of Patent: July 22, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Ruxin Chen, Gustavo Hernandez-Abrego, Masanori Omote, Xavier Menendez-Pidal
  • Patent number: 8762158
    Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
    Type: Grant
    Filed: August 5, 2011
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
  • Patent number: 8738370
    Abstract: A speech analyzer includes a speech acquiring section, a frequency converting section, an autocorrelation section, and a pitch detection section. The frequency converting section converts the speech signal acquired by the speech acquiring section into a frequency spectrum. The autocorrelation section determines an autocorrelation waveform by shifting the frequency spectrum along the frequency axis. The pitch detection section determines the pitch frequency from the distance between two local crests or troughs of the autocorrelation waveform.
    Type: Grant
    Filed: June 2, 2006
    Date of Patent: May 27, 2014
    Assignees: AGI Inc.
    Inventors: Shunji Mitsuyoshi, Kaoru Ogata, Fumiaki Monma
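The spectrum-autocorrelation idea in 8738370 above is compact: harmonics repeat every F0 Hz along the frequency axis, so a crest of the spectrum's autocorrelation appears at a lag equal to the pitch. The peak-picking rule below is deliberately simple and is not the patent's detection logic.

```python
import numpy as np
from scipy.signal import find_peaks

def pitch_from_spectrum_autocorrelation(x, sr):
    """Autocorrelate the magnitude spectrum along the frequency axis; the
    harmonic comb repeats every F0 Hz, so the first strong crest marks the
    pitch. Lag (in bins) is converted to Hz via the FFT bin spacing."""
    x = np.asarray(x, dtype=float)
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    ac = np.correlate(spec, spec, mode="full")[len(spec) - 1:]
    peaks, _ = find_peaks(ac, height=0.1 * ac[0])
    if len(peaks) == 0:
        return None
    return peaks[0] * sr / len(x)
```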
  • Patent number: 8725518
    Abstract: A system for providing automatic quality management regarding a level of conformity to a specific accent, including a recording system, a statistical model database with statistical models representing speech data of different levels of conformity to a specific accent, a speech analysis system, and a quality management system. The recording system is adapted to record one or more samples of a speaker's speech and provide them to the speech analysis system for analysis, and the speech analysis system is adapted to provide a score of the speaker's speech samples to the quality management system by analyzing the recorded speech samples relative to the statistical models in the statistical model database.
    Type: Grant
    Filed: April 25, 2006
    Date of Patent: May 13, 2014
    Assignee: Nice Systems Ltd.
    Inventors: Moshe Waserblat, Barak Eilam
  • Patent number: 8719019
    Abstract: Speaker identification techniques are described. In one or more implementations, sample data is received at a computing device of one or more user utterances captured using a microphone. The sample data is processed by the computing device to identify a speaker of the one or more user utterances. The processing involves use of a feature set that includes features obtained using a filterbank having filters spaced linearly at higher frequencies and logarithmically at lower frequencies; features that model the speaker's vocal tract transfer function; and features that indicate a vibration rate of the vocal folds of the speaker of the sample data.
    Type: Grant
    Filed: April 25, 2011
    Date of Patent: May 6, 2014
    Assignee: Microsoft Corporation
    Inventors: Hoang T. Do, Ivan J. Tashev, Alejandro Acero, Jason S. Flaks, Robert N. Heitkamp, Molly R. Suver
  • Publication number: 20140122067
    Abstract: A speech analysis system uses one or more digital processors to reconstruct a speech signal by accurately extracting speech formants from a digitized version of the speech signal. The system extracts the formants by determining an estimated instantaneous frequency and an estimated instantaneous bandwidth of speech resonances of the digital version of the speech signal in real time. The system digitally filters the digital speech signal using a plurality of complex digital filters in parallel having overlapping bandwidths to ensure that substantially all of the bandwidth of the speech signal is covered. This virtual chain of overlapping complex digital filters produces a corresponding plurality of complex filtered signals. A first estimated frequency and a first estimated bandwidth is generated for each of the filtered signals, and speech resonances of the input speech signal are identified therefrom.
    Type: Application
    Filed: October 31, 2012
    Publication date: May 1, 2014
    Inventors: John P. Kroeker, Janet Slifka, Richard S. McGowan
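A sketch in the spirit of 20140122067 above: a bank of overlapping one-pole complex resonators, with each channel's instantaneous frequency taken from the phase derivative of its complex output. The filter shape, bandwidth, and the finite-difference frequency estimate are assumptions, not the published filter design.

```python
import numpy as np
from scipy.signal import lfilter

def complex_filter_bank(x, sr, centers_hz, bw_hz=400.0):
    """Overlapping one-pole complex resonators; each channel's instantaneous
    frequency is the phase derivative of its complex output."""
    x = np.asarray(x, dtype=float)
    channels = []
    for fc in centers_hz:
        r = np.exp(-np.pi * bw_hz / sr)                  # pole radius from bandwidth
        pole = r * np.exp(2j * np.pi * fc / sr)
        y = lfilter([1.0 - r], [1.0, -pole], x.astype(complex))
        inst_freq_hz = np.diff(np.unwrap(np.angle(y))) * sr / (2.0 * np.pi)
        channels.append((fc, y, inst_freq_hz))
    return channels
```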
  • Patent number: 8712768
    Abstract: A method, device, system, and computer program product expand narrowband speech signals to wideband speech signals. The method includes determining signal type information from a signal, obtaining characteristics for forming an upper band signal using the determined signal type information, determining signal noise information, using the determined signal noise information to modify the obtained characteristics for forming the upper band signal, and forming the upper band signal using the modified characteristics.
    Type: Grant
    Filed: May 25, 2004
    Date of Patent: April 29, 2014
    Assignee: Nokia Corporation
    Inventors: Laura Laaksonen, Päivi Valve
  • Patent number: 8712763
    Abstract: The present disclosure relates to a method, apparatus, and system for encoding and decoding signals. The encoding method includes: converting a first-domain signal into a second-domain signal; performing Linear Prediction (LP) processing and Long-Term Prediction (LTP) processing for the second-domain signal; obtaining a long-term flag value according to a decision criterion; obtaining a second-domain predictive signal according to the LP processing result and the LTP processing result when the long-term flag value is a first value; obtaining a second-domain predictive signal according to the LP processing result when the long-term flag value is a second value; converting the second-domain predictive signal into a first-domain predictive signal, and calculating a first-domain predictive residual signal; and outputting a bit stream that includes the first-domain predictive residual signal.
    Type: Grant
    Filed: July 17, 2013
    Date of Patent: April 29, 2014
    Assignee: Huawei Technologies Co., Ltd
    Inventors: Dejun Zhang, Lei Miao, Jianfeng Xu, Fengyan Qi, Qing Zhang, Lixiong Li, Fuwei Ma, Yang Gao