Pitch Patents (Class 704/207)
  • Patent number: 9818427
    Abstract: Embodiments of a system and method for removing speech by a user from audio frames are generally described herein. A method may include receiving a plurality of frames of audio data, extracting a set of frames of the plurality of frames, the set of frames including speech by a user with a set of remaining frames in the plurality of frames not in the set of frames, suppressing the speech by the user from the set of frames using a trained model to create a speech-suppressed set of frames, and recompiling the plurality of frames using the speech-suppressed set of frames and the set of remaining frames.
    Type: Grant
    Filed: December 22, 2015
    Date of Patent: November 14, 2017
    Assignee: Intel Corporation
    Inventors: Niall Cahill, Jakub Wenus, Mark Kelly
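The extract, suppress, and recompile flow described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented method: `is_user_speech` and `suppress` are hypothetical stand-ins for the speech detector and the trained model, and each frame is a plain list of samples.

```python
from typing import Callable, List, Sequence

Frame = List[float]


def remove_user_speech(
    frames: Sequence[Frame],
    is_user_speech: Callable[[Frame], bool],
    suppress: Callable[[Frame], Frame],
) -> List[Frame]:
    """Suppress only the frames flagged as user speech, keeping frame order."""
    # Indices of the frames that contain the user's speech (the extracted set).
    speech_idx = [i for i, f in enumerate(frames) if is_user_speech(f)]
    # Run the (hypothetical) suppression model on the extracted set only.
    suppressed = {i: suppress(frames[i]) for i in speech_idx}
    # Recompile: suppressed frames return to their original positions,
    # and the remaining frames pass through unchanged.
    return [suppressed.get(i, list(f)) for i, f in enumerate(frames)]


# Toy stand-ins (assumptions): "speech" is any frame with a high peak amplitude,
# and "suppression" attenuates the frame by a fixed factor.
def toy_detector(frame: Frame) -> bool:
    return max(abs(s) for s in frame) > 0.5


def toy_suppressor(frame: Frame) -> Frame:
    return [0.1 * s for s in frame]
```

Keeping the frame indices is what lets the speech-suppressed frames be merged back with the untouched frames in their original order.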
  • Patent number: 9813366
    Abstract: A method of communicating between a sender and a recipient via a personalized message is disclosed comprising: (a) identifying text, via the user interface of a communication device, of a desired lyric phrase from within a pre-existing audio recording; (b) extracting audio substantially associated with the desired lyric phrase from the pre-existing recording into a desired audio clip; (c) inputting personalized text via the user interface; (d) creating the personalized message with the sender identification, the personalized text and access to the desired audio clip; (e) sending an electronic message to the electronic address of the recipient, wherein the electronic message may be an SMS/EMS/MMS message, instant message or email message including a link to the personalized message or an EMS/MMS or email message including the personalized message. An associated method of earning money from the communication along with associated systems are also disclosed.
    Type: Grant
    Filed: February 12, 2016
    Date of Patent: November 7, 2017
    Assignee: Rednote LLC
    Inventors: Scott Guthery, Richard van den Bosch
  • Patent number: 9792899
    Abstract: A method for inter-dataset variability compensation, the method comprising using at least one hardware processor for: receiving a heterogeneous development dataset comprising multiple samples and metadata associated with at least some of the multiple samples; dividing the multiple samples into multiple homogenous subsets, based on the metadata; averaging high-level features of each of the multiple homogenous subsets, to produce multiple central high-level features for the multiple homogenous subsets, respectively; computing an inter-dataset variability subspace spanned by the multiple central high-level features; removing the inter-dataset variability subspace from the high-level features of the multiple homogenous subsets, to produce denoised samples; and training a machine learning system using the denoised speech samples.
    Type: Grant
    Filed: July 15, 2014
    Date of Patent: October 17, 2017
    Assignee: International Business Machines Corporation
    Inventor: Hagai Aronowitz
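The steps in this abstract (split by metadata, average each subset, span a subspace, remove it) can be sketched in plain Python. This is a minimal sketch under assumptions: the subspace is spanned by the subset centers taken relative to their overall mean, which the abstract does not specify, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean(vectors: Sequence[Sequence[float]]) -> Vector:
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def orthonormal_basis(vectors: Sequence[Sequence[float]], tol: float = 1e-10) -> List[Vector]:
    """Gram-Schmidt: an orthonormal basis for the span of `vectors`."""
    basis: List[Vector] = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors already inside the span
            basis.append([wi / norm for wi in w])
    return basis


def compensate(samples: Sequence[Tuple[Sequence[float], Hashable]]) -> List[Vector]:
    # 1. Divide the samples into homogeneous subsets using the metadata label.
    subsets: Dict[Hashable, List[Sequence[float]]] = defaultdict(list)
    for features, label in samples:
        subsets[label].append(features)
    # 2. Average the features of each subset to get its center.
    centers = [mean(vs) for vs in subsets.values()]
    # 3. Build the variability subspace from the centers, measured relative
    #    to their overall mean (an assumption of this sketch).
    grand = mean(centers)
    basis = orthonormal_basis([[c - g for c, g in zip(ctr, grand)] for ctr in centers])
    # 4. Remove that subspace from every feature vector by projection.
    cleaned = []
    for features, _ in samples:
        v = list(features)
        for b in basis:
            coeff = dot(v, b)
            v = [vi - coeff * bi for vi, bi in zip(v, b)]
        cleaned.append(v)
    return cleaned
```

The cleaned vectors would then serve as the training input; the training step itself is outside the scope of this sketch.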
  • Patent number: 9773426
    Abstract: A method and apparatus to facilitate tone challenged singers to sing intended notes. In one aspect, the singer determines a note to sing corresponding to an intended frequency fi. The singer utters a note continuously with fundamental frequency fu into a microphone of the natural ear apparatus. The note is processed by the apparatus to produce sound emphasizing the fundamental frequency fu and output through a speaker to the auditory organs of the singer. The singer detects differences between intended frequency fi and uttered fundamental frequency fu. The singer adjusts his vocal organs as he utters the note with the intention of changing fu to reduce difference between fi and fu.
    Type: Grant
    Filed: February 1, 2016
    Date of Patent: September 26, 2017
    Assignee: Board of Regents, The University of Texas System
    Inventors: Eric A. Freudenthal, Eric M. Hanson, Bryan E. Usevitch
  • Patent number: 9767810
    Abstract: A speech coding method that reduces error propagation due to voice packet loss is achieved by limiting or reducing a pitch gain only for the first subframe or the first two subframes within a speech frame. The method is used for a voiced speech class. A pitch cycle length is compared to a subframe size to decide whether to reduce the pitch gain for the first subframe or for the first two subframes within the frame. A strongly voiced class is decided by checking if the pitch lags are stable and the pitch gains are high enough within the frame; for the strongly voiced frame, the pitch lags and the pitch gains can be encoded more efficiently than for other speech classes.
    Type: Grant
    Filed: April 24, 2016
    Date of Patent: September 19, 2017
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Yang Gao
  • Patent number: 9761235
    Abstract: An encoding method, a decoding method, an encoding apparatus, a decoding apparatus, a transmitter, a receiver, and a communications system. The encoding method includes: dividing a to-be-encoded time-domain signal into a low band signal and a high band signal; performing encoding on the low band signal to obtain a low frequency encoding parameter; performing encoding on the high band signal to obtain a high frequency encoding parameter, and obtaining a synthesized high band signal; performing short-time post-filtering processing on the synthesized high band signal to obtain a short-time filtering signal; and calculating a high frequency gain based on the high band signal and the short-time filtering signal. A technical solution according to the embodiments of the present invention can improve an encoding and/or decoding effect.
    Type: Grant
    Filed: May 26, 2015
    Date of Patent: September 12, 2017
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Bin Wang, Zexin Liu, Lei Miao
  • Patent number: 9754193
    Abstract: Provided is a method of authenticating a user by correlating speech and corresponding lip shape. An audiovisual of a user requesting authentication is captured. The audiovisual is processed to generate a speech vector quantization sequence and a corresponding lip vector quantization sequence of the user. A likelihood of the speech vector quantization sequence and the corresponding lip vector quantization sequence with probability distributions of speech vector quantization code words corresponding to different lip shape vector quantization code words of the user requesting authentication weighed by probabilities of speech and lip vector quantization indices of the user requesting authentication is evaluated. If upon evaluation, a likelihood of the user requesting authentication being an authentic user is more than a predefined threshold, the user is authenticated.
    Type: Grant
    Filed: June 27, 2013
    Date of Patent: September 5, 2017
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Sitaram Ramachandrula, Hariharan Ravishankar
  • Patent number: 9754585
    Abstract: Different advantageous embodiments provide a crowdsourcing method for modeling user intent in conversational interfaces. One or more stimuli are presented to a plurality of describers. One or more sets of describer data are captured from the plurality of describers using a data collection mechanism. The one or more sets of describer data are processed to generate one or more models. Each of the one or more models is associated with a specific stimulus from the one or more stimuli.
    Type: Grant
    Filed: April 3, 2012
    Date of Patent: September 5, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Christopher John Brockett, Piali Choudhury, William Brennan Dolan, Yun-Cheng Ju, Patrick Pantel, Noelle Mallory Sophy, Svitlana Volkova
  • Patent number: 9747927
    Abstract: A system for multifaceted singing analysis for retrieval of songs or music including singing voices having some relationship in latent semantics with a singing voice included in one particular song or music. A topic analyzing processor uses a topic model to analyze a plurality of vocal symbolic time series obtained for a plurality of musical audio signals. The topic analyzing processor generates a vocal topic distribution for each of the musical audio signals whereby the vocal topic distribution is composed of a plurality of vocal topics each indicating a relationship of one of the musical audio signals with the other musical audio signals. The topic analyzing processor generates a vocal symbol distribution for each of the vocal topics whereby the vocal symbol distribution indicates occurrence probabilities for the vocal symbols. A multifaceted singing analyzing processor performs analysis of singing voices included in musical audio signals, in the multifaceted viewpoint.
    Type: Grant
    Filed: August 15, 2014
    Date of Patent: August 29, 2017
    Assignee: NATIONAL INSTITUTE OF ADVANCED INDUSTRIAL SCIENCE AND TECHNOLOGY
    Inventors: Tomoyasu Nakano, Kazuyoshi Yoshii, Masataka Goto
  • Patent number: 9747907
    Abstract: According to an embodiment, a digital watermark detecting device includes a residual signal extractor, a voiced period estimator, a storage, a phase estimator, and a watermark determiner. The residual signal extractor is configured to extract a residual signal from a speech signal. The voiced period estimator is configured to estimate a voiced period based on the speech signal. The storage is configured to store pulse signals modulated in advance so as to have different phases. The phase estimator is configured to clip the voiced period in units of an analysis frame having a predetermined length, and perform pattern matching between the residual signal in the analysis frame and the pulse signals to estimate phase of the speech signal. The watermark determiner is configured to, based on a sequence of phases estimated by the phase estimator, determine whether a digital watermark is embedded in the speech signal or not.
    Type: Grant
    Filed: May 10, 2016
    Date of Patent: August 29, 2017
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kentaro Tachibana, Masahiro Morita
  • Patent number: 9741357
    Abstract: System and method embodiments are provided for very short pitch detection and coding for speech or audio signals. The system and method include detecting whether there is a very short pitch lag in a speech or audio signal that is shorter than a conventional minimum pitch limitation using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in time domain and detecting a lack of low frequency energy in the speech or audio signal in frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation that is smaller than the conventional minimum pitch limitation.
    Type: Grant
    Filed: June 19, 2015
    Date of Patent: August 22, 2017
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Yang Gao, Fengyan Qi
  • Patent number: 9734836
    Abstract: A method and an apparatus for decoding a speech/audio bitstream are disclosed, where the method for decoding a speech/audio bitstream includes determining whether a current frame is a normal decoding frame or a redundancy decoding frame, obtaining a decoded parameter of the current frame by means of parsing when the current frame is a normal decoding frame or a redundancy decoding frame, performing post-processing on the decoded parameter of the current frame to obtain a post-processed decoded parameter of the current frame, and using the post-processed decoded parameter of the current frame to reconstruct a speech/audio signal.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: August 15, 2017
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Zexin Liu, Xingtao Zhang, Lei Miao
  • Patent number: 9734844
    Abstract: Embodiments of the present invention relate to detecting irregularities in audio, such as music. An input signal corresponding to an audio stream is received. The input signal is transformed from a time domain into a frequency domain to generate a plurality of frames that each comprises frequency information for a portion of the input signal. An irregular event in a portion of the input signal corresponding to a set of frames in the plurality of frames is identified based on a comparison of frequency information of the set of frames to the frequency information of other sets of frames of the plurality of frames. This allows an indication of the irregular event to be provided, or for the input signal to be automatically synchronized to a multimedia event.
    Type: Grant
    Filed: November 23, 2015
    Date of Patent: August 15, 2017
    Assignee: Adobe Systems Incorporated
    Inventors: Minje Kim, Gautham Mysore, Peter Merrill, Paris Smaragdis
  • Patent number: 9722965
    Abstract: A method to send an alert for nonproductivity associated with a conversation is provided. The method may include recording a plurality of communication outputs of at least two users engaged in a remote message exchange or a remote conversation. The method may also include creating a plurality of text tokens based on the recorded plurality of communication outputs. The method may include analyzing, by a graphical text analyzer, the created plurality of text tokens to determine whether the plurality of text tokens has fallen below a threshold. The method may further include sending an alert to the plurality of users involved in the conversation if it is determined that the plurality of text tokens has fallen below the threshold.
    Type: Grant
    Filed: January 29, 2015
    Date of Patent: August 1, 2017
    Assignee: International Business Machines Corporation
    Inventors: Guillermo A. Cecchi, James R. Kozloski, Clifford A. Pickover, Irina Rish
  • Patent number: 9715540
    Abstract: Systems and associated methods configured to provide user-driven audio content navigation for the spoken web are described. Embodiments allow users to skim audio for content that seems to be of relevance to the user, similar to visual skimming of standard web pages, and mark points of interest within the audio. Embodiments provide techniques for navigating audio content while interacting with information systems in a client-server environment, where the client device can be a simple, standard telephone.
    Type: Grant
    Filed: June 24, 2010
    Date of Patent: July 25, 2017
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Nitendra Rajput
  • Patent number: 9710552
    Abstract: Systems and associated methods configured to provide user-driven audio content navigation for the spoken web are described. Embodiments allow users to skim audio for content that seems to be of relevance to the user, similar to visual skimming of standard web pages, and mark points of interest within the audio. Embodiments provide techniques for navigating audio content while interacting with information systems in a client-server environment, where the client device can be a simple, standard telephone.
    Type: Grant
    Filed: August 28, 2012
    Date of Patent: July 18, 2017
    Assignee: International Business Machines Corporation
    Inventors: Nitendra Rajput, Om D. Deshmukh
  • Patent number: 9685170
    Abstract: According to some embodiments of the present invention, there is provided a computerized method for selecting and correcting pitch marks in speech processing and modification. The method comprises an action of receiving a continuous speech signal representing audible speech recorded by a microphone, where a sequence of pitch values and two or more pitch mark temporal values are computed from the continuous speech signal. The method comprises an action of computing for each of the pitch mark temporal values a lower limit temporal value and an upper limit temporal value by a cross-correlation function of the continuous speech signal around the pitch mark temporal values associated with pairs of elements in the sequence and replacing one or more of the pitch mark temporal values with one or more new temporal value between the lower limit temporal value and the upper limit temporal value.
    Type: Grant
    Filed: October 21, 2015
    Date of Patent: June 20, 2017
    Assignee: International Business Machines Corporation
    Inventor: Slava Shechtman
  • Patent number: 9653095
    Abstract: A dataset representing repeated sounds within a musical composition recorded on an audio track may be constructed. An audio track duration of an audio track may be partitioned into partitions of a partition size. A current partition may be compared to remaining partitions of the audio track. Audio information for the current partition may be correlated to audio information for remaining partitions to determine a correlated partition for the current partition from among the remaining partitions of the track duration. The correlated partition determined may be identified as most likely to represent the same sound as the current partition. This comparison process may be performed iteratively, for individual ones of the remaining partitions. Correlation results of the comparison process may be recorded to represent the partition time period of the correlated partition as a function of partition time period of the current partition.
    Type: Grant
    Filed: August 30, 2016
    Date of Patent: May 16, 2017
    Assignee: GoPro, Inc.
    Inventor: David Tcheng
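The partition-and-correlate procedure in this abstract can be sketched as below. The choice of normalized (Pearson) correlation and the function names are assumptions made for illustration; the abstract does not say which correlation measure is used.

```python
from typing import List, Optional, Sequence


def partition(signal: Sequence[float], size: int) -> List[List[float]]:
    """Split the signal into consecutive partitions of `size` samples (tail dropped)."""
    return [list(signal[i:i + size]) for i in range(0, len(signal) - size + 1, size)]


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized (Pearson) correlation of two equal-length partitions."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den if den > 0 else 0.0


def repeated_sound_map(signal: Sequence[float], size: int) -> List[Optional[int]]:
    """For each partition, the index of the other partition it correlates with best."""
    parts = partition(signal, size)
    best: List[Optional[int]] = []
    for i, current in enumerate(parts):
        # Compare the current partition with every remaining partition.
        scores = [(correlation(current, other), j) for j, other in enumerate(parts) if j != i]
        best.append(max(scores)[1] if scores else None)
    return best
```

The returned list plays the role of the recorded result: entry `i` gives the correlated partition as a function of the current partition `i`.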
  • Patent number: 9641673
    Abstract: Embodiments of the present invention disclose a method, a network element, and a system for assessing voice quality, which relates to the communications field and solves a problem that user perception cannot be reflected according to a voice quality assessment result. The method includes acquiring a voice code stream, and collecting statistics on a transmission parameter in each short-time assessment period; decoding the voice code stream, and collecting statistics on a source parameter according to the decoded voice code stream; and calculating a comprehensive voice quality assessment result according to the transmission parameter and the source parameter. The present invention is used for voice quality assessment.
    Type: Grant
    Filed: February 24, 2015
    Date of Patent: May 2, 2017
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Chunmei Lu, Hongbo Yang, Yunjuan Xie, Haijiao Wang
  • Patent number: 9640200
    Abstract: An apparatus includes a function module, a strength module, and a filter module. The function module compares an input signal, which has a component, to a first delayed version of the input signal and a second delayed version of the input signal to produce a multi-dimensional model. The strength module calculates a strength of each extremum from a plurality of extrema of the multi-dimensional model based on a value of at least one opposite extremum of the multi-dimensional model. The strength module then identifies a first extremum from the plurality of extrema, which is associated with a pitch of the component of the input signal, that has the strength greater than the strength of the remaining extrema. The filter module extracts the pitch of the component from the input signal based on the strength of the first extremum.
    Type: Grant
    Filed: March 3, 2014
    Date of Patent: May 2, 2017
    Assignee: University of Maryland, College Park
    Inventors: Carol Espy-Wilson, Srikanth Vishnubhotla
  • Patent number: 9626987
    Abstract: A speech enhancement apparatus includes: a noise estimating unit which estimates a noise component contained in a speech signal for each frequency band; a signal-to-noise ratio computing unit which computes, for each frequency band, a signal-to-noise ratio; a gain computing unit which selects a frequency band whose computed signal-to-noise ratio indicates that the signal component contained in the speech signal for the frequency band is recognizable, and which determines a gain indicating the degree of enhancement to be applied to the speech signal in accordance with the signal-to-noise ratio of the selected frequency band; and an enhancing unit which amplifies an amplitude component of a frequency domain signal in each frequency band in accordance with the gain, and which corrects the amplitude component of the frequency domain signal by subtracting the noise component from the amplitude component in each frequency band.
    Type: Grant
    Filed: November 6, 2013
    Date of Patent: April 18, 2017
    Assignee: FUJITSU LIMITED
    Inventor: Naoshi Matsuo
  • Patent number: 9613629
    Abstract: A signal processing device, media, and method are provided, where a signal comprises a succession of samples distributed in successive frames. The processing is implemented during decoding of such a signal in order to replace at least one signal frame lost in decoding, and comprising in particular: a) searching, in a valid signal available to the decoder, for a signal segment of length corresponding to a period set as a function of the valid signal; b) analyzing a spectrum of the segment in order to determine spectral components of the segment; and c) synthesizing at least one replacement frame for the lost frame by construction of a synthesized signal from at least a portion of the spectral components.
    Type: Grant
    Filed: January 30, 2014
    Date of Patent: April 4, 2017
    Assignee: Orange
    Inventors: Julien Faure, Stephane Ragot
  • Patent number: 9607612
    Abstract: Technologies for natural language interactions with virtual personal assistant systems include a computing device configured to capture audio input, distort the audio input to produce a number of distorted audio variations, and perform speech recognition on the audio input and the distorted audio variants. The computing device selects a result from a large number of potential speech recognition results based on contextual information. The computing device may measure a user's engagement level by using an eye tracking sensor to determine whether the user is visually focused on an avatar rendered by the virtual personal assistant. The avatar may be rendered in a disengaged state, a ready state, or an engaged state based on the user engagement level. The avatar may be rendered as semitransparent in the disengaged state, and the transparency may be reduced in the ready state or the engaged state. Other embodiments are described and claimed.
    Type: Grant
    Filed: May 20, 2013
    Date of Patent: March 28, 2017
    Assignee: Intel Corporation
    Inventor: William C. Deleeuw
  • Patent number: 9590825
    Abstract: The discovery of a topology of a network with an unknown topology can enable the selection of a data path within the network, and the establishment of a data stream over the selected data path. Routing tables mapping originating nodes to input ports can be created based on the receipt of discovery messages generated by the originating nodes. A source node can select a data path between the source node and a sink node in order to establish a data stream using the routing tables. Data paths can be selected based on, for instance, routing table bandwidth information, latency information, and/or distance information. Data streams can be established over the selected data path, and each node can release any reserved output bandwidth determined to be unnecessary for the data stream.
    Type: Grant
    Filed: January 14, 2015
    Date of Patent: March 7, 2017
    Assignee: Lattice Semiconductor Corporation
    Inventors: Taliaferro Smith, Sergey Yarygin
  • Patent number: 9553900
    Abstract: Systems and methods for a conferencing system. Responsive to a new conference request received at a conference orchestration service, participants of the conference and participant regions for each determined participant are determined. A mixer topology is generated that specifies an assignment of each determined participant to at least one input channel of a plurality of mixers. A mixer state manager generates the mixer topology based on the determined participant regions and at least one regional association of a mixer. Media of each determined participant is routed to the assigned at least one input channel according to the generated mixer topology by using the conference orchestration service. The mixer state manager generates the topology responsive to a request provided by the conference state manager. The conference orchestration service receives the generated mixer topology from the mixer state manager via the conference state manager.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: January 24, 2017
    Assignee: Twilio, Inc.
    Inventors: Christer Fahlgren, Nico Acosta Amador
  • Patent number: 9553553
    Abstract: An engine sound synthesis system is operable to analyze sound. Operation of the system may include providing an input sound signal to be analysed and determining a fundamental frequency of the input signal from the input signal or from at least one guide signal. Furthermore, the frequencies of higher harmonics of the fundamental frequency are determined, thus determining harmonic model parameters. A harmonic signal based on the harmonic model parameters is synthesized and a residual signal is estimated by subtracting the harmonic signal from the input signal. Residual model parameters are estimated based on the residual signal. Furthermore, a corresponding method for synthesizing a sound signal is described.
    Type: Grant
    Filed: July 12, 2013
    Date of Patent: January 24, 2017
    Assignee: Harman Becker Automotive Systems GmbH
    Inventor: Markus Christoph
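The harmonic-plus-residual analysis in this abstract can be illustrated with a small sketch. It assumes the fundamental frequency is already known and that the analysis window holds a whole number of fundamental periods, so that simple projections onto sines and cosines recover the harmonic amplitudes. The function name and parameters are hypothetical, and the residual modelling step is not covered.

```python
import math
from typing import List, Sequence, Tuple


def harmonic_plus_residual(
    signal: Sequence[float], f0: float, sample_rate: float, n_harmonics: int
) -> Tuple[List[Tuple[float, float, float]], List[float], List[float]]:
    """Split `signal` into a sum of harmonics of `f0` and a residual."""
    n = len(signal)
    harmonic = [0.0] * n
    params = []  # (frequency in Hz, cosine amplitude, sine amplitude)
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k * f0 / sample_rate
        cos_k = [math.cos(w * t) for t in range(n)]
        sin_k = [math.sin(w * t) for t in range(n)]
        # Project the input onto each harmonic; this is exact only when the
        # window holds a whole number of periods of f0 (sketch assumption).
        a = 2.0 / n * sum(x * c for x, c in zip(signal, cos_k))
        b = 2.0 / n * sum(x * s for x, s in zip(signal, sin_k))
        params.append((k * f0, a, b))
        # Accumulate the synthesized harmonic signal.
        harmonic = [h + a * c + b * s for h, c, s in zip(harmonic, cos_k, sin_k)]
    # The residual is whatever the harmonic model does not explain.
    residual = [x - h for x, h in zip(signal, harmonic)]
    return params, harmonic, residual
```

In this sketch `params` corresponds to the harmonic model parameters, and `residual` is the signal from which residual model parameters would be estimated.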
  • Patent number: 9548060
    Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes extracting temporal envelope information and spectral components of the baseband portion. The method further includes obtaining a decoded baseband audio signal. The obtaining includes filtering in a frequency domain at least some of the spectral components of the baseband portion with the reconstruction filter using the temporal envelope information to shape a temporal envelope of the baseband portion. The method also includes extracting a noise parameter and an estimated spectral envelope of the highband portion and obtaining a plurality of subband signals by filtering the decoded baseband audio signal.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: January 17, 2017
    Inventors: Michael M. Truman, Mark S. Vinton
  • Patent number: 9542939
    Abstract: In speech recognition, the duration of a phoneme is taken into account when determining recognition scores. Specifically, the duration of a phoneme may be evaluated relative to the duration of neighboring phonemes. A phoneme that is interpreted to be significantly longer or shorter than its neighbors may be given a lower duration score. A duration score for a phoneme may be calculated and used to adjust a recognition score. In this manner a duration model may supplement an acoustic model and language model to improve speech recognition results.
    Type: Grant
    Filed: August 31, 2012
    Date of Patent: January 10, 2017
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventor: Bjorn Hoffmeister
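A relative duration score of the kind this abstract describes could look like the sketch below. The scoring formula, the use of immediate neighbors only, and the `weight` parameter are assumptions chosen for illustration; the abstract does not give a formula. Durations are assumed to be positive.

```python
import math
from typing import List, Sequence


def duration_scores(durations: Sequence[float]) -> List[float]:
    """Score in (0, 1]: 1.0 when a phoneme is as long as its neighbors' average."""
    scores = []
    for i, d in enumerate(durations):
        neighbors = [durations[j] for j in (i - 1, i + 1) if 0 <= j < len(durations)]
        if not neighbors:
            scores.append(1.0)  # a lone phoneme has nothing to be compared with
            continue
        ref = sum(neighbors) / len(neighbors)
        # Symmetric penalty: twice as long and half as long score the same.
        scores.append(math.exp(-abs(math.log(d / ref))))
    return scores


def adjusted_score(recognition_score: float, durations: Sequence[float], weight: float = 1.0) -> float:
    """Add the weighted log duration scores to a log-domain recognition score."""
    return recognition_score + weight * sum(math.log(s) for s in duration_scores(durations))
```

A hypothesis whose phonemes have similar lengths keeps its recognition score, while one with a phoneme much longer or shorter than its neighbors is pushed down.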
  • Patent number: 9536523
    Abstract: A system for distinguishing and identifying speech segments originating from speech of one or more relevant speakers in a predefined detection area. The system includes an optical system which outputs optical patterns, each representing audio signals as detected by the optical system in the area within a specific time frame; and a computer processor which receives each of the outputted optical patterns and analyses each respective optical pattern to provide information that enables identification of speech segments thereby, by identifying blank spaces in the optical pattern, which define beginning or ending of each respective speech segment.
    Type: Grant
    Filed: June 21, 2012
    Date of Patent: January 3, 2017
    Assignee: VOCALZOOM SYSTEMS LTD.
    Inventors: Tal Bakish, Gavriel Horowitz, Yekutiel Avargel, Yechiel Kurtz
  • Patent number: 9530423
    Abstract: A method, system and program for encoding and decoding speech according to a source-filter model whereby speech is modelled to comprise a source signal filtered by a time-varying filter. The method comprises: receiving a speech signal comprising successive frames. For each of a plurality of frames of the speech signal: adding a predetermined noise signal generated by a quantization gain multiplied by 0.5 times an inverse of a pitch correlation to the speech signal to generate a simulated signal, determining linear predictive coding coefficients based on the simulated signal frame, and determining a linear predictive coding residual signal based on the linear predictive coding coefficients and one of the speech signal and the simulated signal. Then forming an encoded signal representing said speech signal, based on the linear predictive coding coefficients and the linear predictive coding residual signal.
    Type: Grant
    Filed: August 28, 2009
    Date of Patent: December 27, 2016
    Assignee: Skype
    Inventor: Koen Bernard Vos
  • Patent number: 9524735
    Abstract: A method for adapting a threshold used in multi-channel audio voice activity detection. Strengths of primary and secondary sound pick up channels are computed. A separation, being a measure of difference between the strengths of the primary and secondary channels, is also computed. An analysis of the peaks in separation is performed, e.g. using a leaky peak capture function that captures a peak in the separation and then decays over time, or using a sliding window min-max detector. A threshold that is to be used in a voice activity detection (VAD) process is adjusted, in accordance with the analysis of the peaks. Other embodiments are also described and claimed.
    Type: Grant
    Filed: January 31, 2014
    Date of Patent: December 20, 2016
    Assignee: Apple Inc.
    Inventors: Vasu Iyengar, Aram M. Lindahl
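The leaky peak capture mentioned in this abstract can be sketched as follows. The `decay` and `fraction` parameters, and the rule that sets the threshold to a fixed fraction of the tracked peak, are assumptions for illustration only; channel strengths are taken to be per-frame values such as levels in dB.

```python
from typing import List, Sequence


def adapt_vad_threshold(
    primary: Sequence[float],
    secondary: Sequence[float],
    decay: float = 0.95,
    fraction: float = 0.5,
) -> List[float]:
    """Return a per-frame VAD threshold derived from peaks in channel separation."""
    peak = 0.0
    thresholds = []
    for p, s in zip(primary, secondary):
        # Separation: difference between primary and secondary channel strength.
        separation = p - s
        # Leaky peak capture: jump to a new peak, otherwise decay over time.
        peak = max(separation, peak * decay)
        # Place the threshold at a fixed fraction of the tracked peak.
        thresholds.append(fraction * peak)
    return thresholds
```

A VAD decision would then compare each frame's separation with the matching threshold; a sliding-window min-max detector, the alternative named in the abstract, is not shown.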
  • Patent number: 9510309
    Abstract: A device includes a receiver and a processor. The receiver is configured to receive a signal. The processor is configured to generate a first flag indicating whether the signal satisfies one or more first conditions that are based on a number of detected correlation peaks associated with the signal, a correlation peak amplitude, or both, and to generate a second flag indicating whether an inverted signal satisfies one or more second conditions. The processor is further configured to generate a first value of a first synchronization sign indicator associated with the signal and to generate a second value of a second synchronization sign indicator associated with the inverted signal. The processor is also configured to generate an invert flag that indicates whether synchronization inversion is detected in the signal based at least in part on the first flag, the second flag, the first value, and the second value.
    Type: Grant
    Filed: May 13, 2015
    Date of Patent: November 29, 2016
    Assignee: Qualcomm Incorporated
    Inventor: Ralf Martin Weber
  • Patent number: 9502045
    Abstract: In general, techniques are described for coding an ambient higher order ambisonic coefficient. An audio decoding device comprising a memory and a processor may perform the techniques. The memory may store a first frame of a bitstream and a second frame of the bitstream. The processor may obtain, from the first frame, one or more bits indicative of whether the first frame is an independent frame that includes additional reference information to enable the first frame to be decoded without reference to the second frame. The processor may further obtain, in response to the one or more bits indicating that the first frame is not an independent frame, prediction information for first channel side information data of a transport channel. The prediction information may be used to decode the first channel side information data of the transport channel with reference to second channel side information data of the transport channel.
    Type: Grant
    Filed: January 29, 2015
    Date of Patent: November 22, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Nils Günther Peters, Dipanjan Sen
  • Patent number: 9489864
    Abstract: Computer-implemented systems and methods are provided for assessing non-native speech proficiency. A non-native speech sample is processed to identify a plurality of vowel sound boundaries in the non-native speech sample. Portions of the non-native speech sample are analyzed within the vowel sound boundaries to extract vowel characteristics associated with a first vowel sound and a second vowel sound represented in the non-native speech sample. The vowel characteristics are processed to identify a first vowel pronunciation metric for the first vowel sound and a second vowel pronunciation metric for the second vowel sound, and the first vowel pronunciation metric and the second vowel pronunciation metric are processed to determine whether the non-native speech sample exhibits a distinction in pronunciation of the first vowel sound and the second vowel sound.
    Type: Grant
    Filed: January 7, 2014
    Date of Patent: November 8, 2016
    Assignee: Educational Testing Service
    Inventor: Keelan Evanini
  • Patent number: 9484035
    Abstract: A system and method for distributed speech recognition is provided. A prompt is provided to a caller during a call. One or more audio responses are received from the caller in response to the prompt. Distributed speech recognition is performed on the audio responses by providing a non-overlapping section of a main grammar to each of a plurality of secondary recognizers for each audio response. Speech recognition is performed on the audio responses by each of the secondary recognizers using the non-overlapping section of the main grammar associated with that secondary recognizer. A new grammar is generated based on results of the speech recognition from each of the secondary recognizers. Further speech recognition is performed on the audio responses against the new grammar and a further prompt is selected for providing to the caller based on results of the distributed speech recognition.
    Type: Grant
    Filed: December 28, 2015
    Date of Patent: November 1, 2016
    Assignee: INTELLISIST, INC
    Inventor: Gilad Odinak
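    Illustration (not from the patent): the flow in this abstract can be sketched roughly as follows. It is a toy under stated assumptions: the "audio response" is stood in for by a text string, recognition is plain substring matching, and the names partition_grammar, secondary_recognize, and distributed_recognize are hypothetical.

```python
from typing import List, Optional


def partition_grammar(phrases: List[str], n_recognizers: int) -> List[List[str]]:
    """Split the main grammar into non-overlapping sections, one per recognizer."""
    return [phrases[i::n_recognizers] for i in range(n_recognizers)]


def secondary_recognize(utterance: str, section: List[str]) -> List[str]:
    """Stand-in recognizer (assumption): a phrase matches if it occurs in the text."""
    return [phrase for phrase in section if phrase in utterance]


def distributed_recognize(utterance: str, main_grammar: List[str],
                          n_recognizers: int) -> Optional[str]:
    # Each secondary recognizer sees only its own section of the main grammar.
    sections = partition_grammar(main_grammar, n_recognizers)
    new_grammar: List[str] = []
    for section in sections:
        new_grammar.extend(secondary_recognize(utterance, section))
    # Further recognition against the new, much smaller grammar.
    matches = secondary_recognize(utterance, new_grammar)
    return max(matches, key=len) if matches else None


main_grammar = ["check balance", "transfer funds", "pay bill", "speak to agent"]
sections = partition_grammar(main_grammar, 2)
result = distributed_recognize("i want to pay bill please", main_grammar, 2)
```

    In this sketch the final result would then drive the choice of the next prompt; that step is omitted here.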
  • Patent number: 9484027
    Abstract: A method of automated speech recognition in a vehicle. The method includes receiving audio in the vehicle, pre-processing the received audio to generate acoustic feature vectors, decoding the generated acoustic feature vectors to produce at least one speech hypothesis, and post-processing the at least one speech hypothesis using pitch to improve speech recognition accuracy. The speech hypothesis can be accepted as recognized speech during post-processing if pitch is present in the received audio. Alternatively, a pitch count for the received audio can be determined, N-best speech hypotheses can be post-processed by comparing the pitch count to syllable counts associated with the speech hypotheses, and the speech hypothesis having a syllable count equal to the pitch count can be accepted as recognized speech.
    Type: Grant
    Filed: December 10, 2009
    Date of Patent: November 1, 2016
    Assignee: General Motors LLC
    Inventors: Xufang Zhao, Uma Arun
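    Illustration (not from the patent): the second post-processing variant in this abstract, comparing a pitch count against syllable counts of N-best hypotheses, might look like the minimal sketch below. The function name and the assumption that each hypothesis arrives with a precomputed syllable count are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def select_by_pitch_count(n_best: Sequence[Tuple[str, int]],
                          pitch_count: int) -> Optional[str]:
    """Return the first hypothesis whose syllable count equals the pitch count.

    n_best holds (hypothesis text, syllable count) pairs in ranked order.
    Returns None when no hypothesis matches.
    """
    for text, syllables in n_best:
        if syllables == pitch_count:
            return text
    return None


# Hypothetical N-best list with syllable counts supplied by the caller.
n_best = [("call home", 2), ("call my home", 3), ("cancel", 2)]
accepted = select_by_pitch_count(n_best, pitch_count=3)
```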
  • Patent number: 9449607
    Abstract: A method for detecting overflow on an electronic device is described. The method includes determining a linear predictive coding synthesis filter gain. The method further includes determining whether overflow is detected based on the linear predictive coding synthesis filter gain and a fixed codebook gain. The method further includes determining a scaling factor if overflow is detected.
    Type: Grant
    Filed: November 1, 2012
    Date of Patent: September 20, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Vivek Rajendran, Ananthapadmanabhan Arasanipalai Kandhadai
  • Patent number: 9437193
    Abstract: Computerized estimation of an identity of a user of a computing system. The system estimates environment-specific alterations of a received user sound that is received at the computing system. The system estimates whether the received user sound is from a particular user by use of a corresponding user-dependent audio model. The user-dependent audio model may be stored in a multi-system store accessible such that the method may be performed for a given user across multiple systems and on a system that the user has never before trained to recognize the user. This reduces or even eliminates the need for a user to train a system to recognize the voice of a user, and allows multiple systems to take advantage of previous training performed by the user.
    Type: Grant
    Filed: January 21, 2015
    Date of Patent: September 6, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Andrew William Lovitt
  • Patent number: 9412383
    Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes deformatting an encoded audio signal into a first part and a second part and obtaining a decoded baseband audio signal by decoding the first part. The method also includes extracting, from the second part, a noise parameter and an estimated spectral envelope of the highband portion and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying in a circular manner a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal by adjusting, based on the estimated spectral envelope of the highband portion, a spectral envelope of the high-frequency reconstructed signal.
    Type: Grant
    Filed: April 14, 2016
    Date of Patent: August 9, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael M. Truman, Mark S. Vinton
  • Patent number: 9412388
    Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes deformatting an encoded audio signal into a first part and a second part and extracting, from the first part, temporal envelope information and spectral components of the baseband portion. The method further includes decoding the first part to obtain a decoded baseband audio signal. The decoding includes filtering in a frequency domain at least some of the spectral components of the baseband portion with the reconstruction filter using the temporal envelope information to shape a temporal envelope of the baseband portion. The method also includes extracting, from the second part, a noise parameter and an estimated spectral envelope of the highband portion and obtaining a plurality of subband signals by filtering the decoded baseband audio signal.
    Type: Grant
    Filed: April 20, 2016
    Date of Patent: August 9, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael M. Truman, Mark S. Vinton
  • Patent number: 9412389
    Abstract: According to an aspect of the present invention, a method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes obtaining a decoded baseband audio signal by decoding an encoded audio signal and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying in a circular manner a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal by adjusting, based on an estimated spectral envelope of the highband portion, a spectral envelope of the high-frequency reconstructed signal. The method further includes generating a noise component based on a noise parameter and obtaining a combined high-frequency signal by adding the noise component to the envelope adjusted high-frequency signal.
    Type: Grant
    Filed: April 14, 2016
    Date of Patent: August 9, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael M. Truman, Mark S. Vinton
  • Patent number: 9406311
    Abstract: An encoding method executed by a computer, the method including: converting, by the computer, information about a transient included in a low-frequency component of an audio signal into information about a transient included in a high-frequency component of the audio signal; detecting, by the computer, the transient of the high-frequency component of the audio signal based on the high-frequency component of the audio signal and on the information about the transient of the high-frequency component obtained by the converting; and encoding, by the computer, the high-frequency component of the audio signal based on the transient detected by the detecting.
    Type: Grant
    Filed: August 23, 2012
    Date of Patent: August 2, 2016
    Assignee: FUJITSU LIMITED
    Inventors: Shusaku Ito, Yoshiteru Tsuchinaga, Katsumori Hagiwara, Sosaku Moriki
  • Patent number: 9401160
    Abstract: Voice activity detectors and related methods are provided. Methods include receiving a frame of the input signal; determining a first SNR of the received frame; comparing the determined first SNR with an adaptive threshold; and detecting whether the received frame comprises voice based on the comparison. The adaptive threshold is at least based on total noise energy of a noise level, an estimate of a second SNR and on energy variation between different frames.
    Type: Grant
    Filed: October 18, 2010
    Date of Patent: July 26, 2016
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventor: Martin Sehlstedt
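    Illustration (not from the patent): a frame-level decision of the kind this abstract describes can be sketched as below. The abstract only says the adaptive threshold depends on total noise energy, a second SNR estimate, and energy variation between frames; the linear combination and all weights here are assumptions made for the sketch, as are the function names.

```python
import math
from typing import Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def snr_db(signal_energy: float, noise_energy: float) -> float:
    """SNR in dB, with small floors to avoid log of zero."""
    return 10.0 * math.log10(max(signal_energy, 1e-12) / max(noise_energy, 1e-12))


def adaptive_threshold(noise_energy: float, long_term_snr_db: float,
                       energy_variation: float) -> float:
    # Hypothetical combination: the weights 3.0, 0.5, 0.1, 0.5 are assumptions.
    return (3.0
            + 0.5 * 10.0 * math.log10(1.0 + noise_energy)
            + 0.1 * long_term_snr_db
            + 0.5 * 10.0 * math.log10(1.0 + energy_variation))


def is_voice(frame: Sequence[float], noise_energy: float,
             long_term_snr_db: float, prev_energy: float) -> bool:
    energy = frame_energy(frame)
    variation = abs(energy - prev_energy)
    threshold = adaptive_threshold(noise_energy, long_term_snr_db, variation)
    # Voice is detected when the frame SNR exceeds the adaptive threshold.
    return snr_db(energy, noise_energy) > threshold


loud = is_voice([0.5] * 160, noise_energy=1e-4, long_term_snr_db=20.0, prev_energy=0.25)
quiet = is_voice([0.01] * 160, noise_energy=1e-4, long_term_snr_db=20.0, prev_energy=1e-4)
```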
  • Patent number: 9390085
    Abstract: Method(s) and system(s) for speech processing of second language speech are described. According to the present subject matter, the system(s) implement the described method(s) for speech processing of Oriya English. The method for speech processing includes receiving a plurality of speech samples of Oriya English to form a speech corpus, where the plurality of speech samples comprise sounds of both vowels and consonants and a plurality of speech parameters are associated with each of the plurality of speech samples. The method also includes determining values of the plurality of speech parameters for each of the plurality of speech samples and identifying a difference between the values of each of the plurality of speech parameters and a corresponding value of accent neutral English. Further, the method includes articulating governing language rules based on the identifying to assess phonetic variation and mother tongue influence in sounds of vowels and consonants of Oriya English.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: July 12, 2016
    Assignee: TATA CONSULTANCY SERVICES LIMITED
    Inventor: Suman Bhattacharya
  • Patent number: 9382901
    Abstract: The present invention can be included in the technical field of power control systems of electrical generation units comprising a supervisory regulation link applicable to a generation unit which calculates operating parameters or orders based on temporary averages of the power measurement.
    Type: Grant
    Filed: July 31, 2014
    Date of Patent: July 5, 2016
    Assignee: Acciona Windpower S.A.
    Inventors: Jose Miguel Garcia Sayes, Teresa Arlaban Gabeiras, Alfonso Ruiz Aldama, Alberto Garcia Barace, Ana Fernandez Garcia de Iturrospe, Diego Otamendi Claramunt, Alejandro Gonzalez Murua, Miguel Nunez Polo
  • Patent number: 9384750
    Abstract: The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described.
    Type: Grant
    Filed: October 3, 2014
    Date of Patent: July 5, 2016
    Assignee: Dolby International AB
    Inventors: Lars Villemoes, Per Ekstrand
  • Patent number: 9343071
    Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes deformatting an encoded audio signal into a first part and a second part and decoding the first part to obtain a decoded baseband audio signal. The method also includes extracting an estimated spectral envelope of the highband portion and a noise parameter from the second part and filtering the decoded baseband audio signal to obtain a plurality of subband signals. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and adjusting a spectral envelope of the high-frequency reconstructed signal based on the estimated spectral envelope of the highband portion to obtain an envelope adjusted high-frequency signal.
    Type: Grant
    Filed: June 10, 2015
    Date of Patent: May 17, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael M. Truman, Mark S. Vinton
  • Patent number: 9338492
    Abstract: The present invention refers to a method for reproducing an audio and/or video sequence, as well as a reproducing device and reproducing apparatus that make use of the method; the method reproduces an audio and/or video sequence by means of a decoder (Dav) apt to decode said sequence and a buffer (B) connected upstream to said decoder (Dav) and able to store at least a part of said sequence; the sequence is transmitted by means of a number of data blocks; each of said blocks comprises an audio and/or video information data section and a corresponding error correction data section; such sections are transmitted in different time intervals; the method comprises a transitory operation mode and a steady state operation mode; in the steady state operation mode the correction data of the block (FEC) are applied to the corresponding information data before said information data are supplied to said decoder (Dav), while in the transitory operation mode the information data of a block are directly supplied to said dec
    Type: Grant
    Filed: September 18, 2007
    Date of Patent: May 10, 2016
    Assignees: RAI Radiotelevisione Italiana S.P.A., S.I.SV.EL. S.P.A
    Inventors: Alberto Morello, Massimo Mancin
  • Patent number: 9324328
    Abstract: A method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes decoding an encoded audio signal to obtain a decoded baseband audio signal, filtering the decoded baseband audio signal to obtain subband signals, and generating a high-frequency reconstructed signal by copying a number of consecutive subband signals. The method also includes adjusting a spectral envelope of the high-frequency reconstructed signal based on an estimated spectral envelope of the highband portion extracted from the encoded audio signal to obtain an envelope adjusted high-frequency signal, generating a noise component based on a noise parameter extracted from the encoded audio signal, and adding the noise component to the envelope adjusted high-frequency signal to obtain a noise and envelope adjusted high-frequency signal.
    Type: Grant
    Filed: May 11, 2015
    Date of Patent: April 26, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael M. Truman, Mark S. Vinton
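    Illustration (not from the patent): the decoder-side steps in this abstract (copy consecutive subband signals, adjust the spectral envelope, add a noise component) can be sketched as below. Representing each subband as a list of samples, matching the envelope by mean energy, and using Gaussian noise scaled by a single noise_level are assumptions of the sketch; the function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def mean_energy(band: Sequence[float]) -> float:
    return sum(v * v for v in band) / len(band)


def reconstruct_highband(low_subbands: Sequence[Sequence[float]], start: int,
                         target_envelope: Sequence[float], noise_level: float,
                         seed: int = 0) -> List[List[float]]:
    """Build highband subbands from consecutive baseband subbands.

    target_envelope[k] is the desired mean energy of highband subband k.
    """
    rng = random.Random(seed)
    high = []
    for offset, target in enumerate(target_envelope):
        # Copy the next consecutive baseband subband.
        copied = list(low_subbands[start + offset])
        # Envelope adjustment: scale so the mean energy matches the target.
        energy = mean_energy(copied)
        gain = math.sqrt(target / energy) if energy > 0 else 0.0
        adjusted = [gain * v for v in copied]
        # Add the noise component derived from the noise parameter.
        high.append([v + noise_level * rng.gauss(0.0, 1.0) for v in adjusted])
    return high


low = [[1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, -0.5], [2.0, 0.0, -2.0, 0.0]]
high = reconstruct_highband(low, start=1, target_envelope=[0.04, 0.01], noise_level=0.0)
```

    With noise_level set to 0.0, each output subband's mean energy should match its target, which makes the envelope step easy to check in isolation.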
  • Patent number: 9318127
    Abstract: An apparatus for generating a bandwidth extended audio signal from an input signal, includes a patch generator for generating one or more patch signals from the input signal, wherein the patch generator is configured for performing a time stretching of subband signals from an analysis filterbank, and wherein the patch generator further includes a phase adjuster for adjusting phases of the subband signals using a filterbank-channel dependent phase correction.
    Type: Grant
    Filed: September 5, 2012
    Date of Patent: April 19, 2016
    Assignees: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Dolby International AB
    Inventors: Sascha Disch, Frederik Nagel, Stephan Wilde, Lars Villemoes, Per Ekstrand