Pattern Matching Vocoders Patents (Class 704/221)
  • Patent number: 10482196
    Abstract: A method, computer readable medium, and system are disclosed for generating a Gaussian mixture model hierarchy. The method includes the steps of receiving point cloud data defining a plurality of points; defining a Gaussian Mixture Model (GMM) hierarchy that includes a number of mixels, each mixel encoding parameters for a probabilistic occupancy map; and adjusting the parameters for one or more probabilistic occupancy maps based on the point cloud data utilizing a number of iterations of an Expectation-Maximization (EM) algorithm.
    Type: Grant
    Filed: February 26, 2016
    Date of Patent: November 19, 2019
    Assignee: NVIDIA Corporation
    Inventors: Benjamin David Eckart, Kihwan Kim, Alejandro Jose Troccoli, Jan Kautz
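As a rough illustration of the EM fitting step named in the abstract above, the following is a minimal sketch on one-dimensional points with a flat two-component mixture. It does not model the hierarchy of mixels or the occupancy-map interpretation described in the patent; the function names `gaussian_pdf` and `em_fit`, the initial parameters, and the iteration count are all assumptions made for the example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_fit(points, means, variances, weights, iterations=50):
    """Fit a flat 1-D Gaussian mixture with EM (hypothetical helper)."""
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in points:
            likes = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(likes) or 1e-300  # guard against an all-zero row
            resp.append([like / total for like in likes])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component owns no points; leave it unchanged
            means[j] = sum(r[j] * x for r, x in zip(resp, points)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, points)) / nj
            variances[j] = max(spread, 1e-6)  # floor to avoid a collapsed variance
            weights[j] = nj / len(points)
    return means, variances, weights


points = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, variances, weights = em_fit(points, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
```

With two well-separated clusters like these, the two means settle near the cluster centres. The variance floor is a design choice of this sketch, not something stated in the abstract.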
  • Patent number: 10460727
    Abstract: Various systems and methods for multi-talker speech separation and recognition are disclosed herein. In one example, a system includes a memory and a processor to process mixed speech audio received from a microphone. In an example, the processor can also separate the mixed speech audio using permutation invariant training, wherein a criterion of the permutation invariant training is defined on an utterance of the mixed speech audio. In an example, the processor can also generate a plurality of separated streams for submission to a speech decoder.
    Type: Grant
    Filed: May 23, 2017
    Date of Patent: October 29, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: James Droppo, Xuedong Huang, Dong Yu
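The utterance-level criterion mentioned in the abstract above can be sketched as follows: the assignment of output streams to reference speakers is chosen once for the whole utterance, by taking the permutation with the lowest total error. This is only an illustrative sketch; the squared-error measure, the list-of-frames representation, and the names `stream_error` and `utterance_pit_loss` are assumptions, not details taken from the patent.

```python
from itertools import permutations


def stream_error(estimate, reference):
    """Squared error summed over every frame of one stream."""
    return sum((e - r) ** 2 for e, r in zip(estimate, reference))


def utterance_pit_loss(estimates, references):
    """Return (loss, permutation) minimising the whole-utterance error.

    permutation[i] is the index of the reference assigned to estimate i.
    """
    best_loss, best_perm = None, None
    for perm in permutations(range(len(references))):
        loss = sum(stream_error(estimates[i], references[p]) for i, p in enumerate(perm))
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Two separated streams that match the references in swapped order.
estimates = [[0.0, 0.1, 0.0], [1.0, 0.9, 1.0]]
references = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
loss, perm = utterance_pit_loss(estimates, references)
```

Because the permutation is fixed across all frames, each output stream stays tied to one speaker for the full utterance rather than switching frame by frame.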
  • Patent number: 10424319
    Abstract: Input of a conversation is received. The conversation includes at least a first user. An utterance of the conversation is analyzed to identify a dialog act attribute, an emotion attribute, and a tone attribute. The dialog act attribute, emotion attribute, and tone attribute are annotated to the utterance of the conversation. The conversation is validated based on the annotated attributes compared with a threshold. The annotated conversation and the validation of the conversation are stored.
    Type: Grant
    Filed: September 26, 2017
    Date of Patent: September 24, 2019
    Assignee: International Business Machines Corporation
    Inventors: Rama Kalyani T. Akkiraju, Jalal Mahmud, Vibha S. Sinha, Anbang Xu, Pritam S. Gundecha, Mansurul A. Bhuiyan, Shereen M. Oraby
  • Patent number: 10325611
    Abstract: An audio decoder for providing a decoded audio information on the basis of an encoded audio information includes a linear-prediction-domain decoder configured to provide a first decoded audio information on the basis of an audio frame encoded in a linear prediction domain, a frequency domain decoder configured to provide a second decoded audio information on the basis of an audio frame encoded in a frequency domain, and a transition processor. The transition processor is configured to obtain a zero-input-response of a linear predictive filtering, wherein an initial state of the linear predictive filtering is defined depending on the first decoded audio information and the second decoded audio information, and modify the second decoded audio information depending on the zero-input-response, to obtain a smooth transition between the first and the modified second decoded audio information.
    Type: Grant
    Filed: January 26, 2017
    Date of Patent: June 18, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Emmanuel Ravelli, Guillaume Fuchs, Sascha Disch, Markus Multrus, Grzegorz Pietrzyk, Benjamin Schubert
  • Patent number: 10311895
    Abstract: Input of a conversation is received. The conversation includes at least a first user. An utterance of the conversation is analyzed to identify a dialog act attribute, an emotion attribute, and a tone attribute. The dialog act attribute, emotion attribute, and tone attribute are annotated to the utterance of the conversation. The conversation is validated based on the annotated attributes compared with a threshold. The annotated conversation and the validation of the conversation are stored.
    Type: Grant
    Filed: June 5, 2018
    Date of Patent: June 4, 2019
    Assignee: International Business Machines Corporation
    Inventors: Rama Kalyani T. Akkiraju, Jalal Mahmud, Vibha S. Sinha, Anbang Xu, Pritam S. Gundecha, MD Mansurul A. Bhuiyan, Shereen M. Oraby
  • Patent number: 10297273
    Abstract: Input of a conversation is received. The conversation includes at least a first user. An utterance of the conversation is analyzed to identify a dialog act attribute, an emotion attribute, and a tone attribute. The dialog act attribute, emotion attribute, and tone attribute are annotated to the utterance of the conversation. The conversation is validated based on the annotated attributes compared with a threshold. The annotated conversation and the validation of the conversation are stored.
    Type: Grant
    Filed: June 5, 2018
    Date of Patent: May 21, 2019
    Assignee: International Business Machines Corporation
    Inventors: Rama Kalyani T. Akkiraju, Jalal Mahmud, Vibha S. Sinha, Anbang Xu, Pritam S. Gundecha, Md Mansurul A. Bhuiyan, Shereen M. Oraby
  • Patent number: 10217454
    Abstract: According to an embodiment, a voice synthesizer includes a content selection unit, a content generation unit, and a content registration unit. The content selection unit determines selected content among a plurality of pieces of content registered in a content storage unit. The content includes tagged text in which tag information for controlling voice synthesis is added to text serving as a target of the voice synthesis. The content generation unit applies the tag information in the tagged text included in the selected content to designated text to generate new content. The content registration unit registers the generated new content in the content storage unit.
    Type: Grant
    Filed: September 15, 2016
    Date of Patent: February 26, 2019
    Assignees: KABUSHIKI KAISHA TOSHIBA, TOSHIBA SOLUTIONS CORPORATION
    Inventors: Kaoru Hirano, Masaru Suzuki, Hiroyuki Mizutani
  • Patent number: 10157620
    Abstract: A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
    Type: Grant
    Filed: March 4, 2015
    Date of Patent: December 18, 2018
    Inventors: Srinath Cheluvaraja, Ananth Nagaraja Iyer, Aravind Ganapathiraju, Felix Immanuel Wyss
  • Patent number: 10147422
    Abstract: Systems, methods, and computer-readable media that may be used to modify a voice action system to include voice actions provided by advertisers or users are provided. One method includes receiving electronic voice action bids from advertisers to modify the voice action system to include a specific voice action (e.g., a triggering phrase and an action). One or more bids may be selected. The method includes, for each of the selected bids, modifying data associated with the voice action system to include the voice action associated with the bid, such that the action associated with the respective voice action is performed when voice input from a user is received that the voice action system determines to correspond to the triggering phrase associated with the respective voice action.
    Type: Grant
    Filed: February 26, 2016
    Date of Patent: December 4, 2018
    Assignee: GOOGLE LLC
    Inventor: Pedro J. Moreno Mengibar
  • Patent number: 10033836
    Abstract: A server comprising a processor circuit and a database may receive address book data comprising information associated with at least one contact from a communication device via a network. The processor circuit may identify information associated with the at least one contact in the database and/or from public data. The processor circuit may add the identified information to the address book data. The processor circuit may store the address book data with the added information in the database and send the added information with or without the address book data to the communication device via the network.
    Type: Grant
    Filed: October 11, 2017
    Date of Patent: July 24, 2018
    Assignee: FUZE, INC.
    Inventors: Alberto Lopez Toledo, Julio Andres Viera Sotillo, Inaki Berenguer, Joaquim Castellà Vilaseca
  • Patent number: 9953633
    Abstract: Various implementations disclosed herein include a training module configured to produce a set of segment templates from a concurrent segmentation of a plurality of vocalization instances of a VSP vocalized by a particular speaker, who is identifiable by a corresponding set of vocal characteristics. Each segment template provides a stochastic characterization of how each of one or more portions of a VSP is vocalized by the particular speaker in accordance with the corresponding set of vocal characteristics. Additionally, in various implementations, the training module includes systems, methods and/or devices configured to produce a set of VSP segment maps that each provide a quantitative characterization of how respective segments of the plurality of vocalization instances vary in relation to a corresponding one of a set of segment templates.
    Type: Grant
    Filed: July 23, 2015
    Date of Patent: April 24, 2018
    Assignee: MALASPINA LABS (BARBADOS), INC.
    Inventors: Clarence Chu, Alireza Kenarsari Anhari
  • Patent number: 9934793
    Abstract: Disclosed are a method for determining whether a person is drunk after consuming alcohol capable of analyzing alcohol consumption in a time domain by analyzing a voice, and a recording medium and a terminal for carrying out same.
    Type: Grant
    Filed: January 24, 2014
    Date of Patent: April 3, 2018
    Assignee: FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION
    Inventors: Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
  • Patent number: 9916844
    Abstract: Disclosed are a method for determining whether a person is drunk after consuming alcohol on the basis of a difference among a plurality of formant energies, which are generated by applying linear predictive coding according to a plurality of linear prediction orders, and a recording medium and a terminal for carrying out the method.
    Type: Grant
    Filed: January 28, 2014
    Date of Patent: March 13, 2018
    Assignee: FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION
    Inventors: Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
  • Patent number: 9899039
    Abstract: Disclosed is a method for determining alcohol consumption capable of analyzing alcohol consumption in a time domain by analyzing a formant slope of a voice signal, and a recording medium and a terminal for carrying out same. A terminal for determining whether a person is drunk comprises: a voice input unit for generating a voice frame by receiving a voice signal; a voiced/unvoiced sound analysis unit for determining whether a received voice frame corresponds to a voiced sound; a formant frequency extraction unit for extracting a plurality of formant frequencies of the voice frame corresponding to the voiced sound; and an alcohol consumption determining unit for calculating a formant slope between the plurality of formant frequencies, and determining the state of alcohol consumption depending on the formant slope, thereby determining whether a person is drunk by analyzing the formant slope of an inputted voice.
    Type: Grant
    Filed: January 24, 2014
    Date of Patent: February 20, 2018
    Assignee: FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION
    Inventors: Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
  • Patent number: 9875081
    Abstract: A system may use multiple speech interface devices to interact with a user by speech. All or a portion of the speech interface devices may detect a user utterance and may initiate speech processing to determine a meaning or intent of the utterance. Within the speech processing, arbitration is employed to select one of the multiple speech interface devices to respond to the user utterance. Arbitration may be based in part on metadata that directly or indirectly indicates the proximity of the user to the devices, and the device that is deemed to be nearest the user may be selected to respond to the user utterance.
    Type: Grant
    Filed: September 21, 2015
    Date of Patent: January 23, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: James David Meyers, Shah Samir Pravinchandra, Yue Liu, Arlen Dean, Daniel Miller, Arindam Mandal
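The arbitration step in the abstract above can be pictured with a small sketch: each device that heard the utterance reports metadata, and the device judged nearest to the user is chosen to respond. The field names `device_id` and `signal_energy`, and the use of signal energy as the proximity proxy, are hypothetical choices for this example; the abstract only says the metadata directly or indirectly indicates proximity.

```python
def arbitrate(candidates):
    """Pick the device with the strongest proximity indicator.

    Each candidate is a dict with hypothetical keys 'device_id' and
    'signal_energy' (higher energy is assumed to mean a closer user).
    """
    if not candidates:
        raise ValueError("no device detected the utterance")
    nearest = max(candidates, key=lambda c: c["signal_energy"])
    return nearest["device_id"]


detections = [
    {"device_id": "living_room", "signal_energy": 0.42},
    {"device_id": "kitchen", "signal_energy": 0.87},
    {"device_id": "hallway", "signal_energy": 0.15},
]
responder = arbitrate(detections)
```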
  • Patent number: 9799329
    Abstract: This disclosure describes, in part, techniques and devices for identifying recurring environmental sounds in an environment such that these sounds may be canceled out of corresponding audio signals to increase signal-to-noise ratios (SNRs) of the signals and, hence, improve automatic speech recognition (ASR) on the signals. Recurring environmental sounds may include the ringing of a mobile phone, the beeping sound of a microphone, the buzzing of a washing machine, or the like.
    Type: Grant
    Filed: December 3, 2014
    Date of Patent: October 24, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Michael Alan Pogue, Kurt Wesley Piersol
  • Patent number: 9747910
    Abstract: A device comprising a memory and one or more processors may be configured to extract, from the bitstream, a type of quantization mode. The one or more processors may also be configured to switch, based on the type of quantization mode, between non-predictive vector dequantization to reconstruct a first set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain, and predictive vector dequantization to reconstruct a second set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain. The memory may be configured to store the reconstructed first set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain, and the reconstructed second set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain.
    Type: Grant
    Filed: September 18, 2015
    Date of Patent: August 29, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Moo Young Kim, Nils Günther Peters
  • Patent number: 9728196
    Abstract: A method and apparatus to encode and decode an audio/speech signal is provided. An inputted audio signal or speech signal may be transformed into at least one of a high frequency resolution signal and a high temporal resolution signal. The signal may be encoded by determining an appropriate resolution, the encoded signal may be decoded, and thus the audio signal, the speech signal, and a mixed signal of the audio signal and the speech signal may be processed.
    Type: Grant
    Filed: May 9, 2016
    Date of Patent: August 8, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Eun Mi Oh, Jung Hoe Kim, Ki-hyun Choo, Ho Sang Sung, Mi Young Kim
  • Patent number: 9667963
    Abstract: The prediction error energy in inter-frame prediction with motion compensation is reduced and the coding efficiency is improved.
    Type: Grant
    Filed: June 22, 2012
    Date of Patent: May 30, 2017
    Assignee: Nippon Telegraph And Telephone Corporation
    Inventors: Shohei Matsuo, Yukihiro Bandoh, Seishi Takamura, Hirohisa Jozawa
  • Patent number: 9595261
    Abstract: According to an embodiment, a pattern recognition device includes a signal processor, a first recognizer, a detector, and a second recognizer. The signal processor is configured to calculate a feature of a time-series signal for each frame. The first recognizer is configured to recognize which of a leaf class and a single class of a first class group the time-series signal belongs to for each frame based on the feature and output a recognition result. The detector is configured to detect a segment including a first target class on the basis of a sum of probabilities of the leaf classes which the frame belongs to on the basis of the recognition results for each frame. The second recognizer is configured to recognize which of second target classes the segment belongs to on the basis of the recognition results for the frames within the segment.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: March 14, 2017
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Hiroshi Fujimura
  • Patent number: 9582702
    Abstract: Embodiments of the present invention generally relate to data processing, and further the embodiments of the invention relate to a method of processing a visible coding sequence and a system thereof, a method of playing a visible coding sequence and a system thereof. The present invention creatively proposes a scheme of determining sampling rate with synchronized frames to realize effective processing of a visible coding sequence. The scheme of processing a visible coding sequence according to the present invention is helpful for visible coding synchronization on the capturing side, enabling the capturing side to determine appropriate sampling rate and sampling timing, and thus effectively acquire the visible coding sequence, which may not only reduce resource waste, but also acquire a complete visible coding sequence.
    Type: Grant
    Filed: August 31, 2016
    Date of Patent: February 28, 2017
    Assignee: International Business Machines Corporation
    Inventors: Jiexin Jiao, Mengxiang Lin, Song Song, XiaoFeng Wang
  • Patent number: 9531862
    Abstract: A system to optimize a user's messaging by having a mechanism to recommend that a user utilizes an alternative communication channel. The invention relates to mobile messaging applications and to analyzing message content and providing feedback to the user in the form of a graphical or spoken output containing an offer of an alternative communication mode, wherein processing content of the user input comprises analyzing message content to collect parameters relating to message priority, channel type, channel availability, user schedule, user time zone, relationship of user to recipient calculated using a familiarity index, type of content, and number of recipients.
    Type: Grant
    Filed: September 4, 2015
    Date of Patent: December 27, 2016
    Inventor: Vishal Vadodaria
  • Patent number: 9420081
    Abstract: A system and method for providing voice communications with desired characteristics based upon the intended recipient of a voice communication. An apparatus includes a list of dial strings associated with parties having desired voice communication characteristics. A dial string entered by a user and associated with an intended recipient is compared to a list of preferred dial strings to determine the characteristics of an encoded voice signal to be sent to the recipient. The apparatus can include a vocoder having different bit rate modes and a bit rate mode is selected based upon the dial string entered by a user. Dial strings can be stored at the device or on a network. The apparatus can include a mode selector to select a desired vocoder mode to generate an encoded voice signal.
    Type: Grant
    Filed: March 18, 2014
    Date of Patent: August 16, 2016
    Assignee: AT&T Mobility II LLC
    Inventors: Jun Shen, Jack Denenberg, Alan MacDonald
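The dial-string comparison described in the abstract above amounts to a lookup that picks a vocoder mode. A minimal sketch, assuming a dictionary of preferred dial strings and two made-up mode names (`high_rate`, `standard_rate`); the phone numbers, mode names, and the digit-only normalisation are illustrative assumptions rather than details from the patent.

```python
# Hypothetical list of preferred dial strings and the mode each one requests.
PREFERRED_DIAL_STRINGS = {
    "5551230001": "high_rate",
    "5551230002": "high_rate",
}
DEFAULT_MODE = "standard_rate"


def select_vocoder_mode(dial_string, preferred=PREFERRED_DIAL_STRINGS):
    """Return the vocoder bit rate mode for the entered dial string."""
    # Keep digits only so "555-123-0001" and "5551230001" compare equal.
    normalized = "".join(ch for ch in dial_string if ch.isdigit())
    return preferred.get(normalized, DEFAULT_MODE)


mode_for_preferred = select_vocoder_mode("555-123-0001")
mode_for_other = select_vocoder_mode("555-999-0000")
```

The table is kept in memory here; the abstract notes that dial strings may equally be stored on the device or on a network.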
  • Patent number: 9378746
    Abstract: Disclosed are a method and apparatus for encoding and decoding a high frequency for bandwidth extension. The method includes: estimating a weight; and generating a high frequency excitation signal by applying the weight between random noise and a decoded low frequency spectrum.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: June 28, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Ki-hyun Choo
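One way to read "applying the weight between random noise and a decoded low frequency spectrum" is as a per-bin blend of the two sources. The sketch below assumes a simple linear mix with a weight in [0, 1]; that formula, the function name `high_band_excitation`, and the Gaussian noise source are assumptions for illustration, and the patent's actual weighting may differ.

```python
import random


def high_band_excitation(low_spectrum, weight, rng):
    """Blend random noise with the decoded low band (assumed linear mix).

    weight = 0.0 copies the low band unchanged; weight = 1.0 is pure noise.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    noise = [rng.gauss(0.0, 1.0) for _ in low_spectrum]
    return [weight * n + (1.0 - weight) * s for n, s in zip(noise, low_spectrum)]


rng = random.Random(0)  # seeded so the sketch is repeatable
decoded_low_band = [1.0, 2.0, 0.5, 0.25]
excitation = high_band_excitation(decoded_low_band, 0.3, rng)
```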
  • Patent number: 9373341
    Abstract: Method for measuring level of speech determined by an audio signal in a manner which corrects for and reduces the effect of modification of the signal by the addition of noise thereto and/or amplitude compression thereof, and a system configured to perform any embodiment of the method. In some embodiments, the method includes steps of generating frequency banded, frequency-domain data indicative of an input speech signal, determining from the data a Gaussian parametric spectral model of the speech signal, and determining from the parametric spectral model an estimated mean speech level and a standard deviation value for each frequency band of the data; and generating speech level data indicative of a bias corrected mean speech level for each frequency band, including using at least one correction value to correct the estimated mean speech level for the frequency band, where each correction value has been predetermined using a reference speech model.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: June 21, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: David Gunawan, Glenn Dickins
  • Patent number: 9275639
    Abstract: A client-server architecture for Automatic Speech Recognition (ASR) applications includes: (a) a client side including: a client being part of a distributed front end for converting acoustic waves to feature vectors; a VAD for separating speech from non-speech acoustic signals; an adaptor for WebSockets; and (b) a server side including: a web layer utilizing HTTP protocols and including a Web Server having a Servlet Container; an intermediate layer for transport based on Message-Oriented Middleware being a message broker; a recognition server and an adaptation server both connected to said intermediate layer; a Speech processing server; a Recognition Server for instantiation of a recognition channel per client; an Adaptation Server for adapting acoustic and linguistic models for each speaker; a Bidirectional communication channel between a Speech processing server and the client side; and a Persistent layer for storing a Language Knowledge Base connected to said Speech processing server.
    Type: Grant
    Filed: March 31, 2013
    Date of Patent: March 1, 2016
    Assignee: Dixilang Ltd.
    Inventor: Victor Shagalov
  • Patent number: 9263054
    Abstract: A method for controlling an average encoding rate by an electronic device is described. The method includes obtaining a speech signal. The method also includes determining a first average rate. The method further includes determining a first threshold based on the first average rate. The method additionally includes controlling the average encoding rate by determining at least one other threshold based on the first threshold. The method also includes sending an encoded speech signal.
    Type: Grant
    Filed: August 30, 2013
    Date of Patent: February 16, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Subasingha Shaminda Subasingha, Vivek Rajendran, Venkatesh Krishnan, Venkatraman Srinivasa Atti
  • Patent number: 9111531
    Abstract: Improved audio classification is provided for encoding applications. An initial classification is performed, followed by a finer classification, to produce speech classifications and music classifications with higher accuracy and less complexity than previously available. Audio is classified as speech or music on a frame by frame basis. If the frame is classified as music by the initial classification, that frame undergoes a second, finer classification to confirm that the frame is music and not speech (e.g., speech that is tonal and/or structured that may not have been classified as speech by the initial classification). Depending on the implementation, one or more parameters may be used in the finer classification. Example parameters include voicing, modified correlation, signal activity, and long term pitch gain.
    Type: Grant
    Filed: December 20, 2012
    Date of Patent: August 18, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatraman Srinivasa Atti, Ethan Robert Duni
  • Patent number: 9098812
    Abstract: The claimed subject matter provides systems and/or methods for training feature weights in a statistical machine translation model. The system can include components that obtain lists of translation hypotheses and associated feature values, set a current point in the multidimensional feature weight space to an initial value, choose a line in the feature weight space that passes through the current point, and reset the current point to optimize the feature weights with respect to the line. The system can further include components that set the current point to be a best point attained, reduce the list of translation hypotheses based on a determination that a particular hypothesis has never been touched in optimizing the feature weights from at least one of an initial starting point or a randomly selected restarting point, and output the point ascertained to be the best point in the feature weight space.
    Type: Grant
    Filed: April 14, 2009
    Date of Patent: August 4, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Robert Carter Moore, Christopher Brian Quirk
  • Patent number: 9037455
    Abstract: Techniques for a computing device operating in limited-access states are provided. One example method includes determining, by a computing device, that a notification is scheduled for output by the computing device during a first time period and that a pattern of audio detected during the first time period is indicative of human speech. The method further includes delaying output of the notification during the first time period and determining that a pattern of audio detected during a second time period is not indicative of human speech. The method also includes outputting at least a portion of the notification at an earlier in time of an end of the second time period or an expiration of a third time period.
    Type: Grant
    Filed: January 8, 2014
    Date of Patent: May 19, 2015
    Assignee: Google Inc.
    Inventors: Alexander Faaborg, Tristan Harris, Austin Robison
  • Patent number: 9015041
    Abstract: An audio encoder has a window function controller, a windower, a time warper with a final quality check functionality, a time/frequency converter, a TNS stage or a quantizer encoder. The window function controller, the time warper, the TNS stage or an additional noise filling analyzer is controlled by signal analysis results obtained by a time warp analyzer or a signal classifier. Furthermore, a decoder applies a noise filling operation using a manipulated noise filling estimate depending on a harmonic or speech characteristic of the audio signal.
    Type: Grant
    Filed: January 11, 2011
    Date of Patent: April 21, 2015
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Stefan Bayer, Sascha Disch, Ralf Geiger, Guillaume Fuchs, Max Neuendorf, Gerald Schuller, Bernd Edler
  • Patent number: 9008329
    Abstract: Provided are methods and systems for noise suppression within multiple time-frequency points of spectral representations. A multi-feature cluster tracker is used to track signal and noise sources and to predict signal versus noise dominance at each time-frequency point. Multiple features, such as binaural and monaural features, may be used for these purposes. A Gaussian mixture model (GMM) is developed and, in some embodiments, dynamically updated for distinguishing signal from noise and performing mask-based noise reduction. Each frequency band may use a different GMM or share a GMM with other frequency bands. A GMM may be combined from two models, with one trained to model time-frequency points in which the target dominates and another trained to model time-frequency points in which the noise dominates. Dynamic updates of a GMM may be performed using an expectation-maximization algorithm in an unsupervised fashion.
    Type: Grant
    Filed: June 8, 2012
    Date of Patent: April 14, 2015
    Assignee: Audience, Inc.
    Inventors: Michael Mandel, Carlos Avendano
  • Patent number: 8996362
    Abstract: For a bandwidth extension of an audio signal, in a signal spreader the audio signal is temporally spread by a spread factor greater than 1. The temporally spread audio signal is then supplied to a decimator to decimate the temporally spread version by a decimation factor matched to the spread factor. The band generated by this decimation operation is extracted and distorted, and finally combined with the audio signal to obtain a bandwidth extended audio signal. A phase vocoder in the filterbank implementation or transformation implementation may be used for signal spreading.
    Type: Grant
    Filed: January 20, 2009
    Date of Patent: March 31, 2015
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Frederik Nagel, Sascha Disch, Max Neuendorf
  • Patent number: 8990074
    Abstract: A method of noise-robust speech classification is disclosed. Classification parameters are input to a speech classifier from external components. Internal classification parameters are generated in the speech classifier from at least one of the input parameters. A Normalized Auto-correlation Coefficient Function threshold is set. A parameter analyzer is selected according to a signal environment. A speech mode classification is determined based on a noise estimate of multiple frames of input speech.
    Type: Grant
    Filed: April 10, 2012
    Date of Patent: March 24, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Ethan Robert Duni, Vivek Rajendran
  • Patent number: 8982942
    Abstract: Disclosed herein are tools and techniques for storing and using video processing tool configuration information that can identify combinations of video processing tools to be used for processing video. In one exemplary embodiment, video processing tools of a computing system are identified. The performance of a combination of the video processing tools is measured. The performance measurement is compared with another performance measurement of another combination of the video processing tools. Based on the comparison, video processing tool configuration information is set. In another exemplary embodiment, video processing tool configuration information indicating a combination of video processing tools is accessed, and video data is processed using the combination of video processing tools based on the video processing tool configuration information.
    Type: Grant
    Filed: June 17, 2011
    Date of Patent: March 17, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Wenfeng Gao, Shyam Sadhwani
  • Patent number: 8977544
    Abstract: A quantizing method is provided that includes quantizing an input signal by selecting one of a first quantization scheme not using an inter-frame prediction and a second quantization scheme using the inter-frame prediction, in consideration of one or more of a prediction mode, a predictive error and a transmission channel state.
    Type: Grant
    Filed: April 23, 2012
    Date of Patent: March 10, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ho-sang Sung, Eun-mi Oh
  • Patent number: 8977543
    Abstract: A quantizing apparatus is provided that includes a quantization path determiner that determines a path from a first path not using inter-frame prediction and a second path using the inter-frame prediction, as a quantization path of an input signal, based on a criterion before quantization of the input signal; a first quantizer that quantizes the input signal, if the first path is determined as the quantization path of the input signal; and a second quantizer that quantizes the input signal, if the second path is determined as the quantization path of the input signal.
    Type: Grant
    Filed: April 23, 2012
    Date of Patent: March 10, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ho-sang Sung, Eun-mi Oh
  • Patent number: 8965773
    Abstract: A method is provided for hierarchical coding of a digital audio signal comprising, for a current frame of the input signal: a core coding delivering a scalar quantization index for each sample of the current frame, and at least one enhancement coding delivering indices of scalar quantization for each coded sample of an enhancement signal. The enhancement coding comprises a step of obtaining a coding-noise shaping filter that is used to determine a target signal, and the indices of scalar quantization of said enhancement signal are determined by minimizing the error between a set of possible values of scalar quantization and said target signal. The coding method can also comprise a shaping of the coding noise for the core bitrate coding. A coder implementing the coding method is also provided.
    Type: Grant
    Filed: November 17, 2009
    Date of Patent: February 24, 2015
    Assignee: Orange
    Inventors: Balazs Kovesi, Stéphane Ragot, Alain Le Guyader
  • Patent number: 8965761
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: February 27, 2014
    Date of Patent: February 24, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: William Kress Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Thomas James Watson, Daniel Mark Schumacher
  • Patent number: 8942975
    Abstract: Techniques are described herein that suppress noise in a Mel-filtered spectral domain. For example, a window may be applied to a representation of a speech signal in a time domain. The windowed representation in the time domain may be converted to a subsequent representation of the speech signal in the Mel-filtered spectral domain. A noise suppression operation may be performed with respect to the subsequent representation to provide noise-suppressed Mel coefficients.
    Type: Grant
    Filed: March 22, 2011
    Date of Patent: January 27, 2015
    Assignee: Broadcom Corporation
    Inventor: Jonas Borgstrom
  • Patent number: 8930197
    Abstract: A method comprising receiving at a user equipment encrypted content. The content is stored in said user equipment in an encrypted form. At least one key for decryption of said stored encrypted content is stored in the user equipment.
    Type: Grant
    Filed: May 9, 2008
    Date of Patent: January 6, 2015
    Assignee: Nokia Corporation
    Inventors: Anssi Ramo, Mikko Tammi, Adriana Vasilache, Lasse Laaksonen
  • Patent number: 8909521
    Abstract: A lossless coding technique for near-logarithmic companded PCM that achieves high compression performance is provided. In coding, the coding method that produces the smaller code amount is selected between the prediction coding method, which performs linear prediction of samples in a frame and codes the amplitude of the prediction error, and the normalization coding method, which normalizes the amplitude of the samples in the frame and codes the normalized amplitude, and a selection code that indicates the selection result is output. The samples in the frame are coded according to the selected coding method to produce a compression code. In decoding, the compression code is decoded according to a decoding process corresponding to the coding method specified by the selection code.
    Type: Grant
    Filed: May 28, 2010
    Date of Patent: December 9, 2014
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Takehiro Moriya, Noboru Harada, Yutaka Kamamoto
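The selection step described in the abstract of patent 8909521 (code the frame with whichever of two methods yields the smaller code amount, and emit a selection code) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the first-order predictor, the normalization by shared trailing zero bits, the fixed-width cost model, and the names `bit_width`, `encode_frame` and `decode_frame`. The sketch also works on plain integers in a non-empty frame rather than on companded PCM.

```python
def bit_width(values):
    """Bits per sample for a fixed-width signed code covering all values."""
    return max((abs(v) for v in values), default=0).bit_length() + 1  # +1 sign bit

def encode_frame(samples):
    """Code a non-empty integer frame with whichever toy method costs fewer bits.

    Returns (selection_code, side_info, payload, cost_in_bits).
    """
    # Method 0 (assumed): first-order prediction; code the prediction error.
    residual = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]
    pred_cost = bit_width(residual[:1]) + bit_width(residual[1:]) * (len(residual) - 1)

    # Method 1 (assumed): normalization; drop trailing zero bits shared by all samples.
    shift = 0
    while any(samples) and all((s >> shift) % 2 == 0 for s in samples):
        shift += 1
    normalized = [s >> shift for s in samples]
    norm_cost = bit_width(normalized) * len(normalized)

    if pred_cost <= norm_cost:
        return 0, None, residual, pred_cost
    return 1, shift, normalized, norm_cost

def decode_frame(selection_code, side_info, payload):
    """Invert encode_frame using the method named by the selection code."""
    if selection_code == 0:
        out, prev = [], 0
        for e in payload:          # running sum undoes the first-order prediction
            prev += e
            out.append(prev)
        return out
    return [v << side_info for v in payload]  # restore the dropped zero bits

smooth = [1000, 1001, 1002, 1003]   # small sample-to-sample changes
coarse = [256, -512, 768, -256]     # large jumps, but a shared factor of 256
for frame in (smooth, coarse):
    code, info, payload, cost = encode_frame(frame)
    assert decode_frame(code, info, payload) == frame  # lossless round trip
```

In this toy setup the slowly varying frame is cheaper under prediction, while the frame with large jumps and a common scale factor is cheaper under normalization, which is why a per-frame selection code is useful.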
  • Patent number: 8898060
    Abstract: Method and arrangement in a network node for adapting a property of source coding to the quality of a communication link in packet switched conversational services in a communication system. The method comprises obtaining (404) information related to the quality of a communication link. The method further comprises selecting (406) a source coding mode with an associated source coding delay, based on the obtained information and the associated source coding delay. The selected source coding mode is selected from a set of at least two source coding modes associated with different source coding delays, and is to be used when source coding voice data to be transmitted over the communication link.
    Type: Grant
    Filed: March 2, 2010
    Date of Patent: November 25, 2014
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventor: Stefan Bruhn
  • Patent number: 8898058
    Abstract: Systems, methods, apparatus, and machine-readable media for voice activity detection in a single-channel or multichannel audio signal are disclosed.
    Type: Grant
    Filed: October 24, 2011
    Date of Patent: November 25, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Jongwon Shin, Erik Visser, Ian Ernan Liu
  • Patent number: 8892429
    Abstract: The present invention relates to an encoding device and an encoding method, a decoding device and a decoding method, and a program that reduce deterioration of sound quality due to encoding of audio signals. An envelope emphasis part (51) emphasizes an envelope (ENV). A noise shaping part (52) divides an emphasized envelope (D) formed by emphasis of the envelope (ENV) by a value larger than 1, and subtracts noise shaping (G) specified by information (NS) from a result of the division. A quantization part (14) sets a result of the subtraction as a quantization bit count (WL), and quantizes a normalized spectrum (S1) formed by normalization of a spectrum (S0) based on the quantization bit count (WL). A multiplexing part (53) multiplexes the information (NS), a quantized spectrum (QS) formed by quantization of the normalized spectrum (S1), and the envelope (ENV). The present invention can be applied to an encoding device encoding audio signals, for example.
    Type: Grant
    Filed: March 8, 2011
    Date of Patent: November 18, 2014
    Assignee: Sony Corporation
    Inventors: Shiro Suzuki, Yuuki Matsumura, Yasuhiro Toguri, Yuuji Maeda
  • Publication number: 20140330556
    Abstract: Low complexity detection of a time-wise position of a representative segment in media data is described. A subset of offset values is located in a set of offset values in media data using a first type of one or more types of features, which are extractable from (e.g., derivable from components of) the media data. The subset of offset values comprises values that are selected from the set of offset values based on one or more selection criteria. A set of candidate seed time points is identified based on the subset of offset values using a second type of the one or more types of features.
    Type: Application
    Filed: December 10, 2012
    Publication date: November 6, 2014
    Applicants: DOLBY INTERNATIONAL AB, DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Barbara Resch, Regunathan Radhakrishnan, Arijit Biswas, Jonas Engdegard
  • Patent number: 8868432
    Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: October 21, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Patent number: 8849656
    Abstract: A system enhances speech by detecting a speaker's utterance through a first microphone positioned a first distance from a source of interference. A second microphone may detect the speaker's utterance at a different position. A monitoring device may estimate the power level of a first microphone signal. A synthesizer may synthesize part of the first microphone signal by processing the second microphone signal. The synthesis may occur when the power level is below a predetermined level.
    Type: Grant
    Filed: October 14, 2011
    Date of Patent: September 30, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Gerhard Schmidt, Mohamed Krini
  • Patent number: 8831937
    Abstract: Provided are methods and systems for improving quality of speech communications. The method may be for improving quality of speech communications in a system having a speech encoder configured to encode a first audio signal using a first set of encoding parameters associated with a first noise suppressor. A method may involve receiving a second audio signal at a second noise suppressor which provides much higher quality noise suppression than the first noise suppressor. The second audio signal may be generated by a single microphone or a combination of multiple microphones. The second noise suppressor may suppress the noise in the second audio signal to generate a processed signal which may be sent to a speech encoder. A second set of encoding parameters may be provided by the second noise suppressor for use by the speech encoder when encoding the processed signal into corresponding data.
    Type: Grant
    Filed: November 14, 2011
    Date of Patent: September 9, 2014
    Assignee: Audience, Inc.
    Inventors: Carlo Murgia, Scott Isabelle
  • Patent number: 8781822
    Abstract: Methods and apparatus for audio and speech processing including generating a plurality of frames, each of the frames comprising a plurality of transform coefficients, and allocating bits to the transform coefficients in each of the frames such that at least two of the transform coefficients in the same frame have different bit allocations and the total number of the bits allocated to the transform coefficients in at least two of the frames is equal.
    Type: Grant
    Filed: February 2, 2010
    Date of Patent: July 15, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Somdeb Majumdar, Amin Fazeldehkordi, Harinath Garudadri
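The constraint stated in the abstract of patent 8781822 (coefficients within one frame may receive different bit allocations, while the total number of bits per frame stays equal across frames) can be sketched as follows. The greedy rule, the assumption that each extra bit halves a coefficient's remaining quantization error, the function name `allocate_bits`, and the budget of 8 bits are all hypothetical choices for illustration, not the patented method.

```python
import heapq

def allocate_bits(coeffs, budget):
    """Greedily distribute `budget` bits over one frame's transform coefficients.

    Assumption: every extra bit halves a coefficient's remaining error, so the
    next bit always goes to the coefficient with the largest remaining error.
    """
    if not coeffs:
        return []
    bits = [0] * len(coeffs)
    # heapq is a min-heap, so store negated errors to pop the largest first.
    heap = [(-abs(c), i) for i, c in enumerate(coeffs)]
    heapq.heapify(heap)
    for _ in range(budget):
        neg_err, i = heapq.heappop(heap)
        bits[i] += 1
        heapq.heappush(heap, (neg_err / 2.0, i))
    return bits

frames = [[8.0, 1.0, 0.5, 0.25], [0.1, 4.0, 4.0, 0.2]]
allocations = [allocate_bits(frame, budget=8) for frame in frames]

# Every frame spends exactly the same number of bits in total.
assert all(sum(a) == 8 for a in allocations)
```

Because the budget is consumed one bit at a time, the per-frame total is fixed by construction, while the split inside a frame follows the coefficient magnitudes.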