Pitch Patents (Class 704/207)

Voiced or unvoiced (Class 704/208)

Automatic self-utterance removal from multimedia files

Patent number: 9818427

Abstract: Embodiments of a system and method for removing speech by a user from audio frames are generally described herein. A method may include receiving a plurality of frames of audio data, extracting a set of frames of the plurality of frames, the set of frames including speech by a user with a set of remaining frames in the plurality of frames not in the set of frames, suppressing the speech by the user from the set of frames using a trained model to create a speech-suppressed set of frames, and recompiling the plurality of frames using the speech-suppressed set of frames and the set of remaining frames.

Type: Grant

Filed: December 22, 2015

Date of Patent: November 14, 2017

Assignee: Intel Corporation

Inventors: Niall Cahill, Jakub Wenus, Mark Kelly
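The extract, suppress, and recompile flow described in the abstract above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: `remove_user_speech`, `is_user_speech`, and `suppress` are hypothetical names, and the toy amplitude check and muting function merely stand in for the speaker detection and trained model, which the abstract does not specify.

```python
from typing import Callable, List, Sequence

Frame = List[float]


def remove_user_speech(
    frames: Sequence[Frame],
    is_user_speech: Callable[[Frame], bool],
    suppress: Callable[[Frame], Frame],
) -> List[Frame]:
    """Suppress flagged frames and recompile them, in order, with the rest."""
    recompiled = []
    for frame in frames:
        if is_user_speech(frame):
            # Frame belongs to the extracted set: run it through the model.
            recompiled.append(suppress(frame))
        else:
            # Remaining frame: passed through unchanged.
            recompiled.append(list(frame))
    return recompiled


# Hypothetical stand-ins for the detector and the trained model.
def loud(frame: Frame) -> bool:
    return max(abs(sample) for sample in frame) > 0.5


def mute(frame: Frame) -> Frame:
    return [0.0 for _ in frame]


frames = [[0.1, -0.1], [0.9, -0.8], [0.2, 0.0]]
cleaned = remove_user_speech(frames, loud, mute)
```

Keeping the detector and suppressor as injected callables leaves the frame bookkeeping independent of whichever model is actually used.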
Method and system for communicating between a sender and a recipient via a personalized message including an audio clip extracted from a pre-existing recording

Patent number: 9813366

Abstract: A method of communicating between a sender and a recipient via a personalized message is disclosed comprising: (a) identifying text, via the user interface of a communication device, of a desired lyric phrase from within a pre-existing audio recording; (b) extracting audio substantially associated with the desired lyric phrase from the pre-existing recording into a desired audio clip; (c) inputting personalized text via the user interface; (d) creating the personalized message with the sender identification, the personalized text and access to the desired audio clip; (e) sending an electronic message to the electronic address of the recipient, wherein the electronic message may be an SMS/EMS/MMS message, instant message or email message including a link to the personalized message or an EMS/MMS or email message including the personalized message. An associated method of earning money from the communication along with associated systems are also disclosed.

Type: Grant

Filed: February 12, 2016

Date of Patent: November 7, 2017

Assignee: Rednote LLC

Inventors: Scott Guthery, Richard van den Bosch

Dataset shift compensation in machine learning

Patent number: 9792899

Abstract: A method for inter-dataset variability compensation, the method comprising using at least one hardware processor for: receiving a heterogeneous development dataset comprising multiple samples and metadata associated with at least some of the multiple samples; dividing the multiple samples into multiple homogenous subsets, based on the metadata; averaging high-level features of each of the multiple homogenous subsets, to produce multiple central high-level features for the multiple homogenous subsets, respectively; computing an inter-dataset variability subspace spanned by the multiple central high-level features; removing the inter-dataset variability subspace from the high-level features of the multiple homogenous subsets, to produce denoised samples; and training a machine learning system using the denoised speech samples.

Type: Grant

Filed: July 15, 2014

Date of Patent: October 17, 2017

Assignee: International Business Machines Corporation

Inventor: Hagai Aronowitz
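The averaging and subspace-removal steps in the abstract above can be illustrated with a small pure-Python sketch. It assumes the subset centers are taken relative to their overall mean before spanning the subspace (the abstract does not say), and all function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def project_out(v: Vector, basis: List[Vector]) -> Vector:
    """Remove from v its components along each orthonormal basis vector."""
    for b in basis:
        scale = dot(v, b)
        v = [x - scale * y for x, y in zip(v, b)]
    return v


def orthonormal_basis(vectors: List[Vector], tol: float = 1e-9) -> List[Vector]:
    """Gram-Schmidt; linearly dependent vectors are dropped."""
    basis: List[Vector] = []
    for v in vectors:
        r = project_out(v, basis)
        norm = math.sqrt(dot(r, r))
        if norm > tol:
            basis.append([x / norm for x in r])
    return basis


def compensate(subsets: List[List[Vector]]) -> List[List[Vector]]:
    """Remove the subspace spanned by the (centered) subset means."""
    centers = [mean_vector(s) for s in subsets]
    grand = mean_vector(centers)  # assumption: center the means first
    basis = orthonormal_basis(
        [[c - g for c, g in zip(center, grand)] for center in centers]
    )
    return [[project_out(x, basis) for x in s] for s in subsets]


# Two toy subsets that differ only by an offset along the first axis.
subset_a = [[0.0, 1.0], [0.0, 3.0]]
subset_b = [[4.0, 1.0], [4.0, 3.0]]
cleaned = compensate([subset_a, subset_b])
```

In this toy case the first axis carries all of the between-subset variation, so after compensation both subsets coincide.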
Apparatus and method to facilitate singing intended notes

Patent number: 9773426

Abstract: A method and apparatus to help tone-challenged singers sing intended notes. In one aspect, the singer determines a note to sing corresponding to an intended frequency fi. The singer utters a note continuously with fundamental frequency fu into a microphone of the natural ear apparatus. The note is processed by the apparatus to produce sound emphasizing the fundamental frequency fu and output through a speaker to the auditory organs of the singer. The singer detects differences between intended frequency fi and uttered fundamental frequency fu. The singer adjusts his vocal organs as he utters the note with the intention of changing fu to reduce the difference between fi and fu.

Type: Grant

Filed: February 1, 2016

Date of Patent: September 26, 2017

Assignee: Board of Regents, The University of Texas System

Inventors: Eric A. Freudenthal, Eric M. Hanson, Bryan E. Usevitch

Packet loss concealment for speech coding

Patent number: 9767810

Abstract: A speech coding method that reduces error propagation due to voice packet loss is achieved by limiting or reducing a pitch gain only for the first subframe or the first two subframes within a speech frame. The method is used for a voiced speech class. A pitch cycle length is compared to a subframe size to decide whether to reduce the pitch gain for the first subframe or the first two subframes within the frame. A strongly voiced class is decided by checking if the pitch lags are stable and the pitch gains are high enough within the frame; for the strongly voiced frame, the pitch lags and the pitch gains can be encoded more efficiently than other speech classes.

Type: Grant

Filed: April 24, 2016

Date of Patent: September 19, 2017

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Yang Gao
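The pitch-gain limiting rule in the abstract above might look something like the sketch below. The decision rule (limit two subframes when the pitch cycle is longer than one subframe) and the `max_gain` ceiling are assumptions made for illustration; the abstract only says that the pitch cycle length is compared with the subframe size.

```python
from typing import List


def limit_pitch_gains(
    pitch_gains: List[float],
    pitch_lag: int,
    subframe_size: int,
    max_gain: float = 0.8,  # hypothetical ceiling, not from the patent
) -> List[float]:
    """Cap the pitch gain of the first one or two subframes of a voiced frame."""
    # Assumption: a pitch cycle longer than a subframe spills into the
    # second subframe, so both leading subframes are limited.
    n_limited = 1 if pitch_lag <= subframe_size else 2
    return [
        min(gain, max_gain) if index < n_limited else gain
        for index, gain in enumerate(pitch_gains)
    ]


short_cycle = limit_pitch_gains([1.2, 1.1, 0.9, 1.0], pitch_lag=40, subframe_size=64)
long_cycle = limit_pitch_gains([1.2, 1.1, 0.9, 1.0], pitch_lag=100, subframe_size=64)
```

Later subframes keep their original gains, which matches the abstract's point that only the start of the frame is constrained.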
Encoding method, decoding method, encoding apparatus, and decoding apparatus

Patent number: 9761235

Abstract: An encoding method, a decoding method, an encoding apparatus, a decoding apparatus, a transmitter, a receiver, and a communications system. The encoding method includes: dividing a to-be-encoded time-domain signal into a low band signal and a high band signal; performing encoding on the low band signal to obtain a low frequency encoding parameter; performing encoding on the high band signal to obtain a high frequency encoding parameter, and obtaining a synthesized high band signal; performing short-time post-filtering processing on the synthesized high band signal to obtain a short-time filtering signal; and calculating a high frequency gain based on the high band signal and the short-time filtering signal. A technical solution according to the embodiments of the present invention can improve an encoding and/or decoding effect.

Type: Grant

Filed: May 26, 2015

Date of Patent: September 12, 2017

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Bin Wang, Zexin Liu, Lei Miao

Authenticating a user by correlating speech and corresponding lip shape

Patent number: 9754193

Abstract: Provided is a method of authenticating a user by correlating speech and corresponding lip shape. An audiovisual of a user requesting authentication is captured. The audiovisual is processed to generate a speech vector quantization sequence and a corresponding lip vector quantization sequence of the user. A likelihood of the speech vector quantization sequence and the corresponding lip vector quantization sequence with probability distributions of speech vector quantization code words corresponding to different lip shape vector quantization code words of the user requesting authentication weighed by probabilities of speech and lip vector quantization indices of the user requesting authentication is evaluated. If upon evaluation, a likelihood of the user requesting authentication being an authentic user is more than a predefined threshold, the user is authenticated.

Type: Grant

Filed: June 27, 2013

Date of Patent: September 5, 2017

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Sitaram Ramachandrula, Hariharan Ravishankar

Crowdsourced, grounded language for intent modeling in conversational interfaces

Patent number: 9754585

Abstract: Different advantageous embodiments provide a crowdsourcing method for modeling user intent in conversational interfaces. One or more stimuli are presented to a plurality of describers. One or more sets of describer data are captured from the plurality of describers using a data collection mechanism. The one or more sets of describer data are processed to generate one or more models. Each of the one or more models is associated with a specific stimulus from the one or more stimuli.

Type: Grant

Filed: April 3, 2012

Date of Patent: September 5, 2017

Assignee: Microsoft Technology Licensing, LLC

Inventors: Christopher John Brockett, Piali Choudhury, William Brennan Dolan, Yun-Cheng Ju, Patrick Pantel, Noelle Mallory Sophy, Svitlana Volkova

System and method for multifaceted singing analysis

Patent number: 9747927

Abstract: A system for multifaceted singing analysis for retrieval of songs or music including singing voices having some relationship in latent semantics with a singing voice included in one particular song or music. A topic analyzing processor uses a topic model to analyze a plurality of vocal symbolic time series obtained for a plurality of musical audio signals. The topic analyzing processor generates a vocal topic distribution for each of the musical audio signals whereby the vocal topic distribution is composed of a plurality of vocal topics each indicating a relationship of one of the musical audio signals with the other musical audio signals. The topic analyzing processor generates a vocal symbol distribution for each of the vocal topics whereby the vocal symbol distribution indicates occurrence probabilities for the vocal symbols. A multifaceted singing analyzing processor performs analysis of singing voices included in musical audio signals, in the multifaceted viewpoint.

Type: Grant

Filed: August 15, 2014

Date of Patent: August 29, 2017

Assignee: NATIONAL INSTITUTE OF ADVANCED INDUSTRIAL SCIENCE AND TECHNOLOGY

Inventors: Tomoyasu Nakano, Kazuyoshi Yoshii, Masataka Goto

Digital watermark detecting device, method, and program

Patent number: 9747907

Abstract: According to an embodiment, a digital watermark detecting device includes a residual signal extractor, a voiced period estimator, a storage, a phase estimator, and a watermark determiner. The residual signal extractor is configured to extract a residual signal from a speech signal. The voiced period estimator is configured to estimate a voiced period based on the speech signal. The storage is configured to store pulse signals modulated in advance so as to have different phases. The phase estimator is configured to clip the voiced period in units of an analysis frame having a predetermined length, and perform pattern matching between the residual signal in the analysis frame and the pulse signals to estimate phase of the speech signal. The watermark determiner is configured to, based on a sequence of phases estimated by the phase estimator, determine whether a digital watermark is embedded in the speech signal or not.

Type: Grant

Filed: May 10, 2016

Date of Patent: August 29, 2017

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Kentaro Tachibana, Masahiro Morita

Very short pitch detection and coding

Patent number: 9741357

Abstract: System and method embodiments are provided for very short pitch detection and coding for speech or audio signals. The system and method include detecting whether there is a very short pitch lag in a speech or audio signal that is shorter than a conventional minimum pitch limitation using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in time domain and detecting a lack of low frequency energy in the speech or audio signal in frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation that is smaller than the conventional minimum pitch limitation.

Type: Grant

Filed: June 19, 2015

Date of Patent: August 22, 2017

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Yang Gao, Fengyan Qi
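The combination of a time-domain correlation check and a frequency-domain low-energy check described in the abstract above could be sketched as follows. The thresholds, the choice of cutoff frequency (the frequency of the conventional minimum lag), and all function names are assumptions for illustration only; the patented detector is not specified at this level of detail.

```python
import math
from typing import List, Optional


def normalized_correlation(x: List[float], lag: int) -> float:
    """Correlation between the signal and itself delayed by `lag` samples."""
    a, b = x[lag:], x[:-lag]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0


def low_band_energy_ratio(x: List[float], sample_rate: float, cutoff_hz: float) -> float:
    """Share of spectral energy at or below cutoff_hz, using a plain DFT."""
    n = len(x)
    low = total = 0.0
    for k in range(1, n // 2):
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power = re * re + im * im
        total += power
        if k * sample_rate / n <= cutoff_hz:
            low += power
    return low / total if total > 0 else 0.0


def detect_very_short_pitch(
    x: List[float],
    sample_rate: float,
    min_lag: int,
    conventional_min_lag: int,
    corr_threshold: float = 0.8,       # hypothetical
    low_ratio_threshold: float = 0.1,  # hypothetical
) -> Optional[int]:
    """Return a lag below the conventional minimum, or None if not detected."""
    candidates = range(min_lag, conventional_min_lag)
    best_lag = max(candidates, key=lambda lag: normalized_correlation(x, lag))
    strong = normalized_correlation(x, best_lag) >= corr_threshold
    cutoff_hz = sample_rate / conventional_min_lag
    lacks_low = low_band_energy_ratio(x, sample_rate, cutoff_hz) <= low_ratio_threshold
    return best_lag if strong and lacks_low else None


fs = 8000
high_tone = [math.sin(2 * math.pi * 800 * t / fs) for t in range(160)]  # 10-sample period
low_tone = [math.sin(2 * math.pi * 200 * t / fs) for t in range(160)]   # 40-sample period
short_lag = detect_very_short_pitch(high_tone, fs, min_lag=5, conventional_min_lag=20)
no_lag = detect_very_short_pitch(low_tone, fs, min_lag=5, conventional_min_lag=20)
```

Requiring both conditions guards against picking a sub-multiple of a longer pitch period, since such a signal would still carry energy below the cutoff.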
Method and apparatus for decoding speech/audio bitstream

Patent number: 9734836

Abstract: A method and an apparatus for decoding a speech/audio bitstream are disclosed, where the method for decoding a speech/audio bitstream includes determining whether a current frame is a normal decoding frame or a redundancy decoding frame, obtaining a decoded parameter of the current frame by means of parsing when the current frame is a normal decoding frame or a redundancy decoding frame, performing post-processing on the decoded parameter of the current frame to obtain a post-processed decoded parameter of the current frame, and using the post-processed decoded parameter of the current frame to reconstruct a speech/audio signal.

Type: Grant

Filed: June 29, 2016

Date of Patent: August 15, 2017

Assignee: Huawei Technologies Co., Ltd.

Inventors: Zexin Liu, Xingtao Zhang, Lei Miao

Irregularity detection in music

Patent number: 9734844

Abstract: Embodiments of the present invention relate to detecting irregularities in audio, such as music. An input signal corresponding to an audio stream is received. The input signal is transformed from a time domain into a frequency domain to generate a plurality of frames that each comprises frequency information for a portion of the input signal. An irregular event in a portion of the input signal corresponding to a set of frames in the plurality of frames is identified based on a comparison of frequency information of the set of frames to the frequency information of other sets of frames of the plurality of frames. This allows an indication of the irregular event to be provided, or for the input signal to be automatically synchronized to a multimedia event.

Type: Grant

Filed: November 23, 2015

Date of Patent: August 15, 2017

Assignee: Adobe Systems Incorporated

Inventors: Minje Kim, Gautham Mysore, Peter Merrill, Paris Smaragdis

Smartphone indicator for conversation nonproductivity

Patent number: 9722965

Abstract: A method to send an alert for nonproductivity associated with a conversation is provided. The method may include recording a plurality of communication outputs of at least two users engaged in a remote message exchange or a remote conversation. The method may also include creating a plurality of text tokens based on the recorded plurality of communication outputs. The method may include analyzing, by a graphical text analyzer, the created plurality of text tokens to determine whether the plurality of text tokens has fallen below a threshold. The method may further include sending an alert to the plurality of users involved in the conversation if it is determined that the plurality of text tokens has fallen below the threshold.

Type: Grant

Filed: January 29, 2015

Date of Patent: August 1, 2017

Assignee: International Business Machines Corporation

Inventors: Guillermo A. Cecchi, James R. Kozloski, Clifford A. Pickover, Irina Rish

User driven audio content navigation

Patent number: 9715540

Abstract: Systems and associated methods configured to provide user-driven audio content navigation for the spoken web are described. Embodiments allow users to skim audio for content that seems to be of relevance to the user, similar to visual skimming of standard web pages, and mark points of interest within the audio. Embodiments provide techniques for navigating audio content while interacting with information systems in a client-server environment, where the client device can be a simple, standard telephone.

Type: Grant

Filed: June 24, 2010

Date of Patent: July 25, 2017

Assignee: International Business Machines Corporation

Inventors: Om D. Deshmukh, Nitendra Rajput

User driven audio content navigation

Patent number: 9710552

Abstract: Systems and associated methods configured to provide user-driven audio content navigation for the spoken web are described. Embodiments allow users to skim audio for content that seems to be of relevance to the user, similar to visual skimming of standard web pages, and mark points of interest within the audio. Embodiments provide techniques for navigating audio content while interacting with information systems in a client-server environment, where the client device can be a simple, standard telephone.

Type: Grant

Filed: August 28, 2012

Date of Patent: July 18, 2017

Assignee: International Business Machines Corporation

Inventors: Nitendra Rajput, Om D. Deshmukh

Pitch marking in speech processing

Patent number: 9685170

Abstract: According to some embodiments of the present invention, there is provided a computerized method for selecting and correcting pitch marks in speech processing and modification. The method comprises an action of receiving a continuous speech signal representing audible speech recorded by a microphone, where a sequence of pitch values and two or more pitch mark temporal values are computed from the continuous speech signal. The method comprises an action of computing for each of the pitch mark temporal values a lower limit temporal value and an upper limit temporal value by a cross-correlation function of the continuous speech signal around the pitch mark temporal values associated with pairs of elements in the sequence and replacing one or more of the pitch mark temporal values with one or more new temporal value between the lower limit temporal value and the upper limit temporal value.

Type: Grant

Filed: October 21, 2015

Date of Patent: June 20, 2017

Assignee: International Business Machines Corporation

Inventor: Slava Shechtman

Systems and methods for determining a repeatogram in a music composition using audio features

Patent number: 9653095

Abstract: A dataset representing repeated sounds within a musical composition recorded on an audio track may be constructed. An audio track duration of an audio track may be partitioned into partitions of a partition size. A current partition may be compared to remaining partitions of the audio track. Audio information for the current partition may be correlated to audio information for remaining partitions to determine a correlated partition for the current partition from among the remaining partitions of the track duration. The correlated partition determined may be identified as most likely to represent the same sound as the current partition. This comparison process may be performed iteratively, for individual ones of the remaining partitions. Correlation results of the comparison process may be recorded to represent the partition time period of the correlated partition as a function of partition time period of the current partition.

Type: Grant

Filed: August 30, 2016

Date of Patent: May 16, 2017

Assignee: GoPro, Inc.

Inventor: David Tcheng
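The partition-and-correlate loop described in the abstract above can be sketched in a few lines. This version assumes raw samples as the "audio information" and normalized dot-product correlation as the similarity measure, neither of which the abstract pins down; `repeatogram` and `correlation` are illustrative names.

```python
import math
from typing import List


def correlation(a: List[float], b: List[float]) -> float:
    """Normalized dot product of two equal-length partitions."""
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / den if den > 0 else 0.0


def repeatogram(samples: List[float], partition_size: int) -> List[int]:
    """For each partition, return the index of its best-correlated other partition."""
    parts = [
        samples[i:i + partition_size]
        for i in range(0, len(samples) - partition_size + 1, partition_size)
    ]
    if len(parts) < 2:
        raise ValueError("need at least two full partitions")
    result = []
    for i, current in enumerate(parts):
        others = [j for j in range(len(parts)) if j != i]
        # The most correlated remaining partition is taken as the likely repeat.
        result.append(max(others, key=lambda j: correlation(current, parts[j])))
    return result


phrase_a = [1.0, 2.0, 3.0, 4.0]
phrase_b = [4.0, -1.0, 0.0, 2.0]
track = phrase_a + phrase_b + phrase_a + phrase_b
mapping = repeatogram(track, partition_size=4)
```

The returned list plays the role of the recorded correlation results: position `i` holds the partition most likely to contain the same sound as partition `i`.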
Method, network element, and system for assessing voice quality

Patent number: 9641673

Abstract: Embodiments of the present invention disclose a method, a network element, and a system for assessing voice quality, which relates to the communications field and solves a problem that user perception cannot be reflected according to a voice quality assessment result. The method includes acquiring a voice code stream, and collecting statistics on a transmission parameter in each short-time assessment period; decoding the voice code stream, and collecting statistics on a source parameter according to the decoded voice code stream; and calculating a comprehensive voice quality assessment result according to the transmission parameter and the source parameter. The present invention is used for voice quality assessment.

Type: Grant

Filed: February 24, 2015

Date of Patent: May 2, 2017

Assignee: Huawei Technologies Co., Ltd.

Inventors: Chunmei Lu, Hongbo Yang, Yunjuan Xie, Haijiao Wang

Multiple pitch extraction by strength calculation from extrema

Patent number: 9640200

Abstract: An apparatus includes a function module, a strength module, and a filter module. The function module compares an input signal, which has a component, to a first delayed version of the input signal and a second delayed version of the input signal to produce a multi-dimensional model. The strength module calculates a strength of each extremum from a plurality of extrema of the multi-dimensional model based on a value of at least one opposite extremum of the multi-dimensional model. The strength module then identifies a first extremum from the plurality of extrema, which is associated with a pitch of the component of the input signal, that has the strength greater than the strength of the remaining extrema. The filter module extracts the pitch of the component from the input signal based on the strength of the first extremum.

Type: Grant

Filed: March 3, 2014

Date of Patent: May 2, 2017

Assignee: University of Maryland, College Park

Inventors: Carol Espy-Wilson, Srikanth Vishnubhotla

Speech enhancement apparatus and speech enhancement method

Patent number: 9626987

Abstract: A speech enhancement apparatus includes: a noise estimating unit which estimates a noise component contained in a speech signal for each frequency band; a signal-to-noise ratio computing unit which computes, for each frequency band, a signal-to-noise ratio; a gain computing unit which selects a frequency band whose computed signal-to-noise ratio indicates that the signal component contained in the speech signal for the frequency band is recognizable, and which determines a gain indicating the degree of enhancement to be applied to the speech signal in accordance with the signal-to-noise ratio of the selected frequency band; and an enhancing unit which amplifies an amplitude component of a frequency domain signal in each frequency band in accordance with the gain, and which corrects the amplitude component of the frequency domain signal by subtracting the noise component from the amplitude component in each frequency band.

Type: Grant

Filed: November 6, 2013

Date of Patent: April 18, 2017

Assignee: FUJITSU LIMITED

Inventor: Naoshi Matsuo
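A rough per-band sketch of the processing chain in the abstract above (SNR per band, band selection, gain, amplification, noise subtraction) is given below. The mapping from SNR to gain, the 6 dB "recognizable" floor, the 20 dB target, and the gain cap are all assumptions invented for this example; the abstract does not give them.

```python
import math
from typing import List


def enhance_bands(
    amplitudes: List[float],
    noise: List[float],
    recognizable_snr_db: float = 6.0,  # hypothetical floor
    target_snr_db: float = 20.0,       # hypothetical target
    max_gain: float = 4.0,             # hypothetical cap
) -> List[float]:
    """Amplify every band by one gain, then subtract the noise estimate."""
    # Per-band SNR in dB (amplitudes and noise assumed strictly positive).
    snrs = [20.0 * math.log10(a / n) for a, n in zip(amplitudes, noise)]
    # Select bands whose SNR suggests the signal component is recognizable.
    selected = [snr for snr in snrs if snr >= recognizable_snr_db]
    if selected:
        # Assumption: raise the weakest recognizable band toward the target.
        gain = min(max_gain, max(1.0, 10.0 ** ((target_snr_db - min(selected)) / 20.0)))
    else:
        gain = 1.0
    # Amplify, then correct by subtracting the noise component (floored at 0).
    return [max(gain * a - n, 0.0) for a, n in zip(amplitudes, noise)]


already_clear = enhance_bands([10.0, 1.0], [1.0, 1.0])
needs_boost = enhance_bands([2.0, 1.0], [1.0, 1.0])
```

Flooring at zero keeps the corrected amplitude non-negative when the noise estimate exceeds the amplified band.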
Correction of frame loss during signal decoding

Patent number: 9613629

Abstract: A signal processing device, media, and method are provided, where a signal comprises a succession of samples distributed in successive frames. The processing is implemented during decoding of such a signal in order to replace at least one signal frame lost in decoding, and comprising in particular: a) searching, in a valid signal available to the decoder, for a signal segment of length corresponding to a period set as a function of the valid signal; b) analyzing a spectrum of the segment in order to determine spectral components of the segment; and c) synthesizing at least one replacement frame for the lost frame by construction of a synthesized signal from at least a portion of the spectral components.

Type: Grant

Filed: January 30, 2014

Date of Patent: April 4, 2017

Assignee: Orange

Inventors: Julien Faure, Stephane Ragot

Natural human-computer interaction for virtual personal assistant systems

Patent number: 9607612

Abstract: Technologies for natural language interactions with virtual personal assistant systems include a computing device configured to capture audio input, distort the audio input to produce a number of distorted audio variations, and perform speech recognition on the audio input and the distorted audio variants. The computing device selects a result from a large number of potential speech recognition results based on contextual information. The computing device may measure a user's engagement level by using an eye tracking sensor to determine whether the user is visually focused on an avatar rendered by the virtual personal assistant. The avatar may be rendered in a disengaged state, a ready state, or an engaged state based on the user engagement level. The avatar may be rendered as semitransparent in the disengaged state, and the transparency may be reduced in the ready state or the engaged state. Other embodiments are described and claimed.

Type: Grant

Filed: May 20, 2013

Date of Patent: March 28, 2017

Assignee: Intel Corporation

Inventor: William C. Deleeuw

Stream creation with limited topology information

Patent number: 9590825

Abstract: The discovery of a topology of a network with an unknown topology can enable the selection of a data path within the network, and the establishment of a data stream over the selected data path. Routing tables mapping originating nodes to input ports can be created based on the receipt of discovery messages generated by the originating nodes. A source node can select a data path between the source node and a sink node in order to establish a data stream using the routing tables. Data paths can be selected based on, for instance, routing table bandwidth information, latency information, and/or distance information. Data streams can be established over the selected data path, and each node can release any reserved output bandwidth determined to be unnecessary for the data stream.

Type: Grant

Filed: January 14, 2015

Date of Patent: March 7, 2017

Assignee: Lattice Semiconductor Corporation

Inventors: Taliaferro Smith, Sergey Yarygin

System and method for managing conferencing in a distributed communication network

Patent number: 9553900

Abstract: Systems and methods for a conferencing system. Responsive to a new conference request received at a conference orchestration service, participants of the conference and participant regions for each determined participant are determined. A mixer topology is generated that specifies an assignment of each determined participant to at least one input channel of a plurality of mixers. A mixer state manager generates the mixer topology based on the determined participant regions and at least one regional association of a mixer. Media of each determined participant is routed to the assigned at least one input channel according to the generated mixer topology by using the conference orchestration service. The mixer state manager generates the topology responsive to a request provided by the conference state manager. The conference orchestration service receives the generated mixer topology from the mixer state manager via the conference state manager.

Type: Grant

Filed: December 9, 2015

Date of Patent: January 24, 2017

Assignee: Twilio, Inc.

Inventors: Christer Fahlgren, Nico Acosta Amador

Engine sound synthesis system

Patent number: 9553553

Abstract: An engine sound synthesis system is operable to analyze sound. Operation of the system may include providing an input sound signal to be analysed and determining a fundamental frequency of the input signal from the input signal or from at least one guide signal. Furthermore, the frequencies of higher harmonics of the fundamental frequency are determined, thus determining harmonic model parameters. A harmonic signal based on the harmonic model parameters is synthesized and a residual signal is estimated by subtracting the harmonic signal from the input signal. Residual model parameters are estimated based on the residual signal. Furthermore, a corresponding method for synthesizing a sound signal is described.

Type: Grant

Filed: July 12, 2013

Date of Patent: January 24, 2017

Assignee: Harman Becker Automotive Systems GmbH

Inventor: Markus Christoph
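The harmonic-plus-residual decomposition in the abstract above can be illustrated with a small sketch. It assumes the fundamental frequency is already known and that the analysis window spans a whole number of periods, so that each harmonic's amplitude can be estimated by simple projection onto sine and cosine; the function name and the test signal are invented for the example.

```python
import math
from typing import List, Tuple


def harmonic_residual(
    x: List[float], f0: float, sample_rate: float, n_harmonics: int
) -> Tuple[List[Tuple[float, float]], List[float], List[float]]:
    """Return harmonic (cos, sin) amplitudes, the harmonic signal, and the residual."""
    n = len(x)
    params = []
    harmonic = [0.0] * n
    for h in range(1, n_harmonics + 1):
        w = 2 * math.pi * h * f0 / sample_rate
        cos_h = [math.cos(w * t) for t in range(n)]
        sin_h = [math.sin(w * t) for t in range(n)]
        # Projection onto the harmonic's cosine and sine components.
        a = 2.0 / n * sum(xi * c for xi, c in zip(x, cos_h))
        b = 2.0 / n * sum(xi * s for xi, s in zip(x, sin_h))
        params.append((a, b))
        harmonic = [hv + a * c + b * s for hv, c, s in zip(harmonic, cos_h, sin_h)]
    # Residual: whatever the harmonic model does not explain.
    residual = [xi - hv for xi, hv in zip(x, harmonic)]
    return params, harmonic, residual


fs, f0, n = 8000, 100.0, 80  # one full period of the fundamental
w0 = 2 * math.pi * f0 / fs
signal = [
    math.sin(w0 * t) + 0.5 * math.cos(2 * w0 * t) + 0.1 * (-1) ** t
    for t in range(n)
]
params, harmonic, residual = harmonic_residual(signal, f0, fs, n_harmonics=2)
```

Here the alternating 0.1 component is not harmonic-related at the two modeled orders, so it ends up in the residual, which is the part a residual model would then be fitted to.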
High frequency regeneration of an audio signal with temporal shaping

Patent number: 9548060

Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes extracting temporal envelope information and spectral components of the baseband portion. The method further includes obtaining a decoded baseband audio signal. The obtaining includes filtering in a frequency domain at least some of the spectral components of the baseband portion with the reconstruction filter using the temporal envelope information to shape a temporal envelope of the baseband portion. The method also includes extracting a noise parameter and an estimated spectral envelope of the highband portion and obtaining a plurality of subband signals by filtering the decoded baseband audio signal.

Type: Grant

Filed: September 7, 2016

Date of Patent: January 17, 2017

Inventors: Michael M. Truman, Mark S. Vinton

Duration ratio modeling for improved speech recognition

Patent number: 9542939

Abstract: In speech recognition, the duration of a phoneme is taken into account when determining recognition scores. Specifically, the duration of a phoneme may be evaluated relative to the duration of neighboring phonemes. A phoneme that is interpreted to be significantly longer or shorter than its neighbors may be given a lower duration score. A duration score for a phoneme may be calculated and used to adjust a recognition score. In this manner a duration model may supplement an acoustic model and language model to improve speech recognition results.

Type: Grant

Filed: August 31, 2012

Date of Patent: January 10, 2017

Assignee: AMAZON TECHNOLOGIES, INC.

Inventor: Bjorn Hoffmeister
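The idea of scoring a phoneme's duration relative to its neighbors, as described in the abstract above, might be sketched like this. The scoring function (negative absolute log ratio) and the mixing weight are assumptions chosen for illustration, not the model used in the patent.

```python
import math
from typing import List


def duration_scores(durations: List[float]) -> List[float]:
    """Score each phoneme by how its duration compares with adjacent phonemes.

    A score of 0 means the phoneme matches its neighbors; the score becomes
    more negative the longer or shorter it is relative to them.
    Requires at least two phonemes with positive durations.
    """
    scores = []
    for i, duration in enumerate(durations):
        neighbors = durations[max(0, i - 1):i] + durations[i + 1:i + 2]
        reference = sum(neighbors) / len(neighbors)
        scores.append(-abs(math.log(duration / reference)))
    return scores


def adjusted_score(recognition_score: float, duration_score: float, weight: float = 0.5) -> float:
    """Fold the duration score into a recognition score (weight is hypothetical)."""
    return recognition_score + weight * duration_score


even = duration_scores([0.1, 0.1, 0.1])
stretched = duration_scores([0.1, 0.4, 0.1])
penalized = adjusted_score(-10.0, stretched[1])
```

Using a log ratio treats "four times longer" and "four times shorter" symmetrically, which fits the abstract's statement that either extreme may lower the score.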
Method and system for identification of speech segments

Patent number: 9536523

Abstract: A system for distinguishing and identifying speech segments originating from speech of one or more relevant speakers in a predefined detection area. The system includes an optical system which outputs optical patterns, each representing audio signals as detected by the optical system in the area within a specific time frame; and a computer processor which receives each of the outputted optical patterns and analyses each respective optical pattern to provide information that enables identification of speech segments thereby, by identifying blank spaces in the optical pattern, which define beginning or ending of each respective speech segment.

Type: Grant

Filed: June 21, 2012

Date of Patent: January 3, 2017

Assignee: VOCALZOOM SYSTEMS LTD.

Inventors: Tal Bakish, Gavriel Horowitz, Yekutiel Avargel, Yechiel Kurtz
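Locating speech segments by finding blank spaces, as the abstract above describes, reduces to a run-finding pass once the optical pattern is summarized as a per-frame energy value. That summary, the threshold, and the function name are assumptions made for this sketch.

```python
from typing import List, Tuple


def speech_segments(energies: List[float], blank_threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of non-blank runs; end is exclusive."""
    segments = []
    start = None
    for i, energy in enumerate(energies):
        if energy > blank_threshold and start is None:
            start = i                      # a blank space just ended
        elif energy <= blank_threshold and start is not None:
            segments.append((start, i))    # a blank space just began
            start = None
    if start is not None:
        segments.append((start, len(energies)))
    return segments


found = speech_segments([0, 0, 5, 6, 0, 7, 0], blank_threshold=1)
unbroken = speech_segments([3, 3], blank_threshold=1)
```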
Speech encoding by determining a quantization gain based on inverse of a pitch correlation

Patent number: 9530423

Abstract: A method, system and program for encoding and decoding speech according to a source-filter model whereby speech is modelled to comprise a source signal filtered by a time-varying filter. The method comprises: receiving a speech signal comprising successive frames. For each of a plurality of frames of the speech signal: adding a predetermined noise signal generated by a quantization gain multiplied by 0.5 times an inverse of a pitch correlation to the speech signal to generate a simulated signal, determining linear predictive coding coefficients based on the simulated signal frame, and determining a linear predictive coding residual signal based on the linear predictive coding coefficients and one of the speech signal and the simulated signal. Then forming an encoded signal representing said speech signal, based on the linear predictive coding coefficients and the linear predictive coding residual signal.

Type: Grant

Filed: August 28, 2009

Date of Patent: December 27, 2016

Assignee: Skype

Inventor: Koen Bernard Vos
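The noise-injection step in the abstract above can be written out directly. This sketch reads the wording literally, scaling a predetermined noise signal by the quantization gain times 0.5 times the inverse of the pitch correlation; that reading, along with the function and parameter names, is an assumption.

```python
from typing import List


def simulated_frame(
    speech: List[float],
    noise: List[float],
    quantization_gain: float,
    pitch_correlation: float,
) -> List[float]:
    """Add scaled noise to a speech frame to form the simulated signal."""
    # scale = quantization gain * 0.5 * (1 / pitch correlation)
    scale = quantization_gain * 0.5 / pitch_correlation
    return [s + scale * n for s, n in zip(speech, noise)]


frame = simulated_frame([1.0, 2.0], [1.0, -1.0], quantization_gain=2.0, pitch_correlation=0.5)
```

With this scaling, weakly correlated (less periodic) frames receive more noise than strongly voiced ones before the linear predictive coding analysis is run.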
Threshold adaptation in two-channel noise estimation and voice activity detection

Patent number: 9524735

Abstract: A method for adapting a threshold used in multi-channel audio voice activity detection. Strengths of primary and secondary sound pick up channels are computed. A separation, being a measure of difference between the strengths of the primary and secondary channels, is also computed. An analysis of the peaks in separation is performed, e.g. using a leaky peak capture function that captures a peak in the separation and then decays over time, or using a sliding window min-max detector. A threshold that is to be used in a voice activity detection (VAD) process is adjusted, in accordance with the analysis of the peaks. Other embodiments are also described and claimed.

Type: Grant

Filed: January 31, 2014

Date of Patent: December 20, 2016

Assignee: Apple Inc.

Inventors: Vasu Iyengar, Aram M. Lindahl
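The leaky peak capture mentioned in the abstract above could be sketched as follows. Channel strengths are assumed to be in dB so that separation is a simple difference, and the decay factor and the fraction of the peak used as the threshold are hypothetical values.

```python
from typing import List


def adapt_vad_threshold(
    primary: List[float],
    secondary: List[float],
    decay: float = 0.9,     # hypothetical leak rate per frame
    fraction: float = 0.5,  # hypothetical share of the captured peak
) -> List[float]:
    """Track peaks in channel separation and derive a VAD threshold per frame."""
    peak = 0.0
    thresholds = []
    for p, s in zip(primary, secondary):
        separation = p - s
        # Capture a new peak immediately; otherwise let the old one leak away.
        peak = max(separation, peak * decay)
        thresholds.append(fraction * peak)
    return thresholds


thresholds = adapt_vad_threshold([10.0, 4.0, 4.0], [2.0, 3.0, 3.0])
```

The threshold jumps when the separation peaks and then relaxes gradually, so a brief loud event does not leave the detector desensitized indefinitely.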
Codec inversion detection

Patent number: 9510309

Abstract: A device includes a receiver and a processor. The receiver is configured to receive a signal. The processor is configured to generate a first flag indicating whether the signal satisfies one or more first conditions that are based on a number of detected correlation peaks associated with the signal, a correlation peak amplitude, or both, and to generate a second flag indicating whether an inverted signal satisfies one or more second conditions. The processor is further configured to generate a first value of a first synchronization sign indicator associated with the signal and to generate a second value of a second synchronization sign indicator associated with the inverted signal. The processor is also configured to generate an invert flag that indicates whether synchronization inversion is detected in the signal based at least in part on the first flag, the second flag, the first value, and the second value.

Type: Grant

Filed: May 13, 2015

Date of Patent: November 29, 2016

Assignee: Qualcomm Incorporated

Inventor: Ralf Martin Weber
Coding independent frames of ambient higher-order ambisonic coefficients

Patent number: 9502045

Abstract: In general, techniques are described for coding an ambient higher order ambisonic coefficient. An audio decoding device comprising a memory and a processor may perform the techniques. The memory may store a first frame of a bitstream and a second frame of the bitstream. The processor may obtain, from the first frame, one or more bits indicative of whether the first frame is an independent frame that includes additional reference information to enable the first frame to be decoded without reference to the second frame. The processor may further obtain, in response to the one or more bits indicating that the first frame is not an independent frame, prediction information for first channel side information data of a transport channel. The prediction information may be used to decode the first channel side information data of the transport channel with reference to second channel side information data of the transport channel.

Type: Grant

Filed: January 29, 2015

Date of Patent: November 22, 2016

Assignee: QUALCOMM Incorporated

Inventors: Nils Günther Peters, Dipanjan Sen
Systems and methods for an automated pronunciation assessment system for similar vowel pairs

Patent number: 9489864

Abstract: Computer-implemented systems and methods are provided for assessing non-native speech proficiency. A non-native speech sample is processed to identify a plurality of vowel sound boundaries in the non-native speech sample. Portions of the non-native speech sample are analyzed within the vowel sound boundaries to extract vowel characteristics associated with a first vowel sound and a second vowel sound represented in the non-native speech sample. The vowel characteristics are processed to identify a first vowel pronunciation metric for the first vowel sound and a second vowel pronunciation metric for the second vowel sound, and the first vowel pronunciation metric and the second vowel pronunciation metric are processed to determine whether the non-native speech sample exhibits a distinction in pronunciation of the first vowel sound and the second vowel sound.

Type: Grant

Filed: January 7, 2014

Date of Patent: November 8, 2016

Assignee: Educational Testing Service

Inventor: Keelan Evanini
System and method for distributed speech recognition

Patent number: 9484035

Abstract: A system and method for distributed speech recognition is provided. A prompt is provided to a caller during a call. One or more audio responses are received from the caller in response to the prompt. Distributed speech recognition is performed on the audio responses by providing a non-overlapping section of a main grammar to each of a plurality of secondary recognizers for each audio response. Speech recognition is performed on the audio responses by each of the secondary recognizers using the non-overlapping section of the main grammar associated with that secondary recognizer. A new grammar is generated based on results of the speech recognition from each of the secondary recognizers. Further speech recognition is performed on the audio responses against the new grammar and a further prompt is selected for providing to the caller based on results of the distributed speech recognition.

Type: Grant

Filed: December 28, 2015

Date of Patent: November 1, 2016

Assignee: INTELLISIST, INC

Inventor: Gilad Odinak
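
The grammar handling described in the abstract of patent 9484035 (non-overlapping sections of a main grammar handed to secondary recognizers, then a new grammar built from their results) can be sketched as follows. The round-robin split, the order-preserving merge, and the names `partition_grammar` and `build_new_grammar` are illustrative assumptions; the abstract does not say how sections are chosen or how results are combined.

```python
def partition_grammar(phrases, n_recognizers):
    """Split a main grammar into non-overlapping sections, one per recognizer.

    A round-robin split is assumed here purely for illustration.
    """
    return [phrases[i::n_recognizers] for i in range(n_recognizers)]


def build_new_grammar(results):
    """Merge per-recognizer matches into a new grammar without duplicates,
    keeping the order in which phrases were first seen."""
    new_grammar = []
    for matches in results:
        for phrase in matches:
            if phrase not in new_grammar:
                new_grammar.append(phrase)
    return new_grammar


if __name__ == "__main__":
    sections = partition_grammar(["a", "b", "c", "d", "e"], 2)
    print(sections)
    print(build_new_grammar([["yes"], ["no", "yes"]]))
```
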
Using pitch during speech recognition post-processing to improve recognition accuracy

Patent number: 9484027

Abstract: A method of automated speech recognition in a vehicle. The method includes receiving audio in the vehicle, pre-processing the received audio to generate acoustic feature vectors, decoding the generated acoustic feature vectors to produce at least one speech hypothesis, and post-processing the at least one speech hypothesis using pitch to improve speech recognition accuracy. The speech hypothesis can be accepted as recognized speech during post-processing if pitch is present in the received audio. Alternatively, a pitch count for the received audio can be determined, N-best speech hypotheses can be post-processed by comparing the pitch count to syllable counts associated with the speech hypotheses, and the speech hypothesis having a syllable count equal to the pitch count can be accepted as recognized speech.

Type: Grant

Filed: December 10, 2009

Date of Patent: November 1, 2016

Assignee: General Motors LLC

Inventors: Xufang Zhao, Uma Arun
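
The second post-processing option in the abstract of patent 9484027, accepting the hypothesis whose syllable count equals the pitch count, can be sketched as below. The data layout (a list of `(text, syllable_count)` pairs in N-best order) and the function name `select_by_pitch_count` are assumptions made for illustration; how pitch and syllables are counted is outside this sketch.

```python
def select_by_pitch_count(hypotheses, pitch_count):
    """Return the first N-best hypothesis whose syllable count equals the
    pitch count, or None if no hypothesis matches.

    `hypotheses` is assumed to be a sequence of (text, syllable_count)
    pairs ordered from best to worst.
    """
    for text, syllables in hypotheses:
        if syllables == pitch_count:
            return text
    return None


if __name__ == "__main__":
    n_best = [("call home", 2), ("call Homer", 3)]
    print(select_by_pitch_count(n_best, 3))
```
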
Systems and methods for detecting overflow

Patent number: 9449607

Abstract: A method for detecting overflow on an electronic device is described. The method includes determining a linear predictive coding synthesis filter gain. The method further includes determining whether overflow is detected based on the linear predictive coding synthesis filter gain and a fixed codebook gain. The method further includes determining a scaling factor if overflow is detected.

Type: Grant

Filed: November 1, 2012

Date of Patent: September 20, 2016

Assignee: QUALCOMM Incorporated

Inventors: Vivek Rajendran, Ananthapadmanabhan Arasanipalai Kandhadai
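
The three steps named in the abstract of patent 9449607 (compute an LPC synthesis filter gain, test for overflow using that gain and the fixed codebook gain, and derive a scaling factor) can be illustrated with a rough sketch. The abstract does not state how any of these quantities are computed, so the choices here are assumptions: the gain is taken as the root of the energy of a truncated impulse response of 1/A(z), overflow is flagged when gain times fixed codebook gain exceeds a 16-bit limit, and the function names are hypothetical.

```python
import math


def synthesis_impulse_response(lpc, length=64):
    """Truncated impulse response of 1/A(z), A(z) = 1 + sum(lpc[k-1] * z^-k)."""
    h = []
    for n in range(length):
        x = 1.0 if n == 0 else 0.0
        feedback = sum(a * h[n - k] for k, a in enumerate(lpc, start=1) if n - k >= 0)
        h.append(x - feedback)
    return h


def synthesis_gain(lpc, length=64):
    """Assumed gain measure: square root of the impulse-response energy."""
    return math.sqrt(sum(v * v for v in synthesis_impulse_response(lpc, length)))


def overflow_scale(lpc, fixed_codebook_gain, limit=32767.0):
    """Return a scaling factor below 1.0 if overflow is predicted, else 1.0.

    The product test and the 16-bit `limit` are illustrative assumptions.
    """
    estimate = synthesis_gain(lpc) * abs(fixed_codebook_gain)
    if estimate > limit:
        return limit / estimate
    return 1.0


if __name__ == "__main__":
    print(synthesis_gain([-0.5]))
    print(overflow_scale([], 65534.0))
```
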
Environment adjusted speaker identification

Patent number: 9437193

Abstract: Computerized estimation of an identity of a user of a computing system. The system estimates environment-specific alterations of a received user sound that is received at the computing system. The system estimates whether the received user sound is from a particular user by use of a corresponding user-dependent audio model. The user-dependent audio model may be stored in a multi-system store accessible such that the method may be performed for a given user across multiple systems and on a system that the user has never before trained to recognize the user. This reduces or even eliminates the need for a user to train a system to recognize the voice of a user, and allows multiple systems to take advantage of previous training performed by the user.

Type: Grant

Filed: January 21, 2015

Date of Patent: September 6, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventor: Andrew William Lovitt
High frequency regeneration of an audio signal by copying in a circular manner

Patent number: 9412383

Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes deformatting an encoded audio signal into a first part and a second part and obtaining a decoded baseband audio signal by decoding the first part. The method also includes extracting, from the second part, a noise parameter and an estimated spectral envelope of the highband portion and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying in a circular manner a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal by adjusting, based on the estimated spectral envelope of the highband portion, a spectral envelope of the high-frequency reconstructed signal.

Type: Grant

Filed: April 14, 2016

Date of Patent: August 9, 2016

Assignee: Dolby Laboratories Licensing Corporation

Inventors: Michael M. Truman, Mark S. Vinton
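
The phrase "copying in a circular manner a number of consecutive subband signals" in the abstract of patent 9412383 can be read as a wrap-around fill of the highband. The sketch below shows that reading only; the function name `circular_copy` and its parameters (`start`, `count`, `n_high`) are illustrative assumptions, and the envelope adjustment that follows in the abstract is not modeled.

```python
def circular_copy(subbands, start, count, n_high):
    """Fill `n_high` highband slots from `count` consecutive subbands
    beginning at index `start`, wrapping back to `start` when the run ends.
    """
    if count <= 0 or start < 0 or start + count > len(subbands):
        raise ValueError("source run must lie inside the subband list")
    return [subbands[start + (i % count)] for i in range(n_high)]


if __name__ == "__main__":
    baseband = ["b0", "b1", "b2", "b3"]
    print(circular_copy(baseband, start=1, count=3, n_high=5))
```
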
High frequency regeneration of an audio signal with temporal shaping

Patent number: 9412388

Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes deformatting an encoded audio signal into a first part and a second part and extracting, from the first part, temporal envelope information and spectral components of the baseband portion. The method further includes decoding the first part to obtain a decoded baseband audio signal. The decoding includes filtering in a frequency domain at least some of the spectral components of the baseband portion with the reconstruction filter using the temporal envelope information to shape a temporal envelope of the baseband portion. The method also includes extracting, from the second part, a noise parameter and an estimated spectral envelope of the highband portion and obtaining a plurality of subband signals by filtering the decoded baseband audio signal.

Type: Grant

Filed: April 20, 2016

Date of Patent: August 9, 2016

Assignee: Dolby Laboratories Licensing Corporation

Inventors: Michael M. Truman, Mark S. Vinton
High frequency regeneration of an audio signal by copying in a circular manner

Patent number: 9412389

Abstract: According to an aspect of the present invention, a method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes obtaining a decoded baseband audio signal by decoding an encoded audio signal and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying in a circular manner a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal by adjusting, based on an estimated spectral envelope of the highband portion, a spectral envelope of the high-frequency reconstructed signal. The method further includes generating a noise component based on a noise parameter and obtaining a combined high-frequency signal by adding the noise component to the envelope adjusted high-frequency signal.

Type: Grant

Filed: April 14, 2016

Date of Patent: August 9, 2016

Assignee: Dolby Laboratories Licensing Corporation

Inventors: Michael M. Truman, Mark S. Vinton
Encoding method, encoding apparatus, and computer readable recording medium

Patent number: 9406311

Abstract: An encoding method executed by a computer, the method including: converting, by the computer, information about a transient included in a low-frequency component of an audio signal into information about a transient included in a high-frequency component of the audio signal; detecting, by the computer, the transient of the high-frequency component of the audio signal based on the high-frequency component of the audio signal and on the information about the transient of the high-frequency component obtained by the converting; and encoding, by the computer, the high-frequency component of the audio signal based on the transient detected by the detecting.

Type: Grant

Filed: August 23, 2012

Date of Patent: August 2, 2016

Assignee: FUJITSU LIMITED

Inventors: Shusaku Ito, Yoshiteru Tsuchinaga, Katsumori Hagiwara, Sosaku Moriki
Methods and voice activity detectors for speech encoders

Patent number: 9401160

Abstract: Voice activity detectors and related methods are provided. Methods include receiving a frame of the input signal; determining a first SNR of the received frame; comparing the determined first SNR with an adaptive threshold; and detecting whether the received frame comprises voice based on the comparison. The adaptive threshold is at least based on total noise energy of a noise level, an estimate of a second SNR and on energy variation between different frames.

Type: Grant

Filed: October 18, 2010

Date of Patent: July 26, 2016

Assignee: Telefonaktiebolaget LM Ericsson (publ)

Inventor: Martin Sehlstedt
Speech processing system and method for recognizing speech samples from a speaker with an Oriyan accent when speaking English

Patent number: 9390085

Abstract: Method(s) and system(s) for speech processing of second language speech are described. According to the present subject matter, the system(s) implement the described method(s) for speech processing of Oriya English. The method for speech processing includes receiving a plurality of speech samples of Oriya English to form a speech corpus, where the plurality of speech samples comprise sounds of both vowels and consonants, and a plurality of speech parameters are associated with each of the plurality of speech samples. The method also includes determining values of the plurality of speech parameters for each of the plurality of speech samples and identifying differences between the values of each of the plurality of speech parameters and a corresponding value of accent neutral English. Further, the method includes articulating governing language rules based on the identifying to assess phonetic variation and mother tongue influence in sounds of vowels and consonants of Oriya English.

Type: Grant

Filed: March 13, 2013

Date of Patent: July 12, 2016

Assignee: TATA CONSULTANCY SERVICES LIMITED

Inventor: Suman Bhattacharya
Power control system of an electrical generation unit

Patent number: 9382901

Abstract: The present invention can be included in the technical field of power control systems of electrical generation units comprising a supervisory regulation link applicable to a generation unit which calculates operating parameters or orders based on temporary averages of the power measurement.

Type: Grant

Filed: July 31, 2014

Date of Patent: July 5, 2016

Assignee: Acciona Windpower S.A.

Inventors: Jose Miguel Garcia Sayes, Teresa Arlaban Gabeiras, Alfonso Ruiz Aldama, Alberto Garcia Barace, Ana Fernandez Garcia de Iturrospe, Diego Otamendi Claramunt, Alejandro Gonzalez Murua, Miguel Nunez Polo
Oversampling in a combined transposer filterbank

Patent number: 9384750

Abstract: The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described.

Type: Grant

Filed: October 3, 2014

Date of Patent: July 5, 2016

Assignee: Dolby International AB

Inventors: Lars Villemoes, Per Ekstrand
Reconstructing an audio signal with a noise parameter

Patent number: 9343071

Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes deformatting an encoded audio signal into a first part and a second part and decoding the first part to obtain a decoded baseband audio signal. The method also includes extracting an estimated spectral envelope of the highband portion and a noise parameter from the second part and filtering the decoded baseband audio signal to obtain a plurality of subband signals. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and adjusting a spectral envelope of the high-frequency reconstructed signal based on the estimated spectral envelope of the highband portion to obtain an envelope adjusted high-frequency signal.

Type: Grant

Filed: June 10, 2015

Date of Patent: May 17, 2016

Assignee: Dolby Laboratories Licensing Corporation

Inventors: Michael M. Truman, Mark S. Vinton
Method for reproducing an audio and/or video sequence, a reproducing device and reproducing apparatus using the method

Patent number: 9338492

Abstract: The present invention refers to a method for reproducing an audio and/or video sequence, as well as a reproducing device and reproducing apparatus that make use of the method; the method reproduces an audio and/or video sequence by means of a decoder (Dav) apt to decode said sequence and a buffer (B) connected upstream to said decoder (Dav) and able to store at least a part of said sequence; the sequence is transmitted by means of a number of data blocks; each of said blocks comprises an audio and/or video information data section and a corresponding error correction data section; such sections are transmitted in different time intervals; the method comprises a transitory operation mode and a steady state operation mode; in the steady state operation mode the correction data of the block (FEC) are applied to the corresponding information data before said information data are supplied to said decoder (Dav), while in the transitory operation mode the information data of a block are directly supplied to said dec

Type: Grant

Filed: September 18, 2007

Date of Patent: May 10, 2016

Assignees: RAI Radiotelevisione Italiana S.P.A., S.I.SV.EL. S.P.A

Inventors: Alberto Morello, Massimo Mancin
Reconstructing an audio signal with a noise parameter

Patent number: 9324328

Abstract: A method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes decoding an encoded audio signal to obtain a decoded baseband audio signal, filtering the decoded baseband audio signal to obtain subband signals, and generating a high-frequency reconstructed signal by copying a number of consecutive subband signals. The method also includes adjusting a spectral envelope of the high-frequency reconstructed signal based on an estimated spectral envelope of the highband portion extracted from the encoded audio signal to obtain an envelope adjusted high-frequency signal, generating a noise component based on a noise parameter extracted from the encoded audio signal, and adding the noise component to the envelope adjusted high-frequency signal to obtain a noise and envelope adjusted high-frequency signal.

Type: Grant

Filed: May 11, 2015

Date of Patent: April 26, 2016

Assignee: Dolby Laboratories Licensing Corporation

Inventors: Michael M. Truman, Mark S. Vinton
Device and method for improved magnitude response and temporal alignment in a phase vocoder based bandwidth extension method for audio signals

Patent number: 9318127

Abstract: An apparatus for generating a bandwidth extended audio signal from an input signal, includes a patch generator for generating one or more patch signals from the input signal, wherein the patch generator is configured for performing a time stretching of subband signals from an analysis filterbank, and wherein the patch generator further includes a phase adjuster for adjusting phases of the subband signals using a filterbank-channel dependent phase correction.

Type: Grant

Filed: September 5, 2012

Date of Patent: April 19, 2016

Assignees: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Dolby International AB

Inventors: Sascha Disch, Frederik Nagel, Stephan Wilde, Lars Villemoes, Per Ekstrand
