Pattern Matching Vocoders Patents (Class 704/221)
  • Patent number: 11875139
    Abstract: The present disclosure provides systems and methods for synthesizing computer-readable code based on the receipt of input and output examples. A computing system in accordance with the disclosure can be configured to receive a given input and output, access a library of operations, and perform a search of the library of operations (e.g., transpose, slice, norm, etc.) that can be applied to the input. By applying the operations to the input and tracking the results, the computing system may identify an expression comprising one or a combination of operations that when applied to the input generates the output. In this manner, implementations of the disclosure may be used to identify one or more solutions that a user having access to the library of operations may use to generate the output from the input.
    Type: Grant
    Filed: February 6, 2023
    Date of Patent: January 16, 2024
    Assignee: GOOGLE LLC
    Inventors: Kensen Shi, Rishabh Singh, David J. Bieber
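    Illustrative sketch: the search described in this abstract can be pictured as a breadth-first enumeration of operation compositions, with each intermediate result tracked so it is not revisited. The three-operation library, the function names, and the depth limit below are hypothetical stand-ins chosen for the example; they are not taken from the patent.

```python
# Hypothetical operation library acting on list-of-lists matrices.
def transpose(m):
    return [list(row) for row in zip(*m)]

def reverse_rows(m):
    return [list(reversed(row)) for row in m]

def flip(m):
    return [list(row) for row in reversed(m)]

LIBRARY = {"transpose": transpose, "reverse_rows": reverse_rows, "flip": flip}

def synthesize(inp, out, max_depth=3):
    """Return an expression string mapping inp to out, or None if none is found."""
    if inp == out:
        return "x"
    frontier = [("x", inp)]   # (expression, value) pairs reached so far
    seen = {repr(inp)}        # track results so each value is expanded once
    for _ in range(max_depth):
        next_frontier = []
        for expr, value in frontier:
            for name, op in LIBRARY.items():
                try:
                    result = op(value)
                except Exception:
                    continue  # operation not applicable to this value
                new_expr = f"{name}({expr})"
                if result == out:
                    return new_expr
                key = repr(result)
                if key not in seen:
                    seen.add(key)
                    next_frontier.append((new_expr, result))
        frontier = next_frontier
    return None

# A 90-degree clockwise rotation is found as a two-operation expression.
print(synthesize([[1, 2], [3, 4]], [[3, 1], [4, 2]]))
```

    A practical system would presumably handle multi-argument operations and weight the search; this sketch only shows the enumerate, apply, and compare loop.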
  • Patent number: 11573774
    Abstract: The present disclosure provides systems and methods for synthesizing computer-readable code based on the receipt of input and output examples. A computing system in accordance with the disclosure can be configured to receive a given input and output, access a library of operations, and perform a search of the library of operations (e.g., transpose, slice, norm, etc.) that can be applied to the input. By applying the operations to the input and tracking the results, the computing system may identify an expression comprising one or a combination of operations that when applied to the input generates the output. In this manner, implementations of the disclosure may be used to identify one or more solutions that a user having access to the library of operations may use to generate the output from the input.
    Type: Grant
    Filed: February 21, 2022
    Date of Patent: February 7, 2023
    Assignee: GOOGLE LLC
    Inventors: Kensen Shi, Rishabh Singh, David J. Bieber
  • Patent number: 11562761
    Abstract: Dynamic adjustment of audio characteristics for enhancing musical sound during a networked conference is disclosed. In an embodiment, a method is provided for sound enhancement performed by a device coupled to a network. The method includes receiving an audio signal to be transmitted over the network, detecting when musical content is present in the audio signal, processing the audio signal to enhance voice characteristics to generate an enhanced audio signal when the musical content is not detected, processing the audio signal to enhance music characteristics to generate the enhanced audio signal when the musical content is detected, and transmitting the enhanced audio signal over the network.
    Type: Grant
    Filed: July 31, 2020
    Date of Patent: January 24, 2023
    Assignee: Zoom Video Communications, Inc.
    Inventors: Qiyong Liu, Jiachuan Deng, Yuhui Chen, Oded Gal
  • Patent number: 11256485
    Abstract: The present disclosure provides systems and methods for synthesizing computer-readable code based on the receipt of input and output examples. A computing system in accordance with the disclosure can be configured to receive a given input and output, access a library of operations, and perform a search of the library of operations (e.g., transpose, slice, norm, etc.) that can be applied to the input. By applying the operations to the input and tracking the results, the computing system may identify an expression comprising one or a combination of operations that when applied to the input generates the output. In this manner, implementations of the disclosure may be used to identify one or more solutions that a user having access to the library of operations may use to generate the output from the input.
    Type: Grant
    Filed: July 15, 2020
    Date of Patent: February 22, 2022
    Assignee: Google LLC
    Inventors: Kensen Shi, Rishabh Singh, David J. Bieber
  • Patent number: 11250867
    Abstract: A vocoder system incorporates situational awareness data into unused bits in the trailing bytes of the vocoder frames by dividing the situational awareness data according to the number of known blank bits in each vocoder frame and incorporating the data, in order, such that the receiving system can extract and reconstruct the situational awareness data. Synchronization signals of predefined bit streams are incorporated to allow the receiving system to more accurately identify situational awareness bits in the trailing byte.
    Type: Grant
    Filed: October 8, 2019
    Date of Patent: February 15, 2022
    Assignee: Rockwell Collins, Inc.
    Inventor: James A. Stevens
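    Illustrative sketch: the bit-stuffing idea in this abstract can be shown by splitting a payload into chunks sized to the known number of blank bits in each frame's trailing byte, writing the chunks in order, and reading them back on the receiving side. The frame layout (blank bits are the low-order bits of the last byte), the function names, and the omission of synchronization signals are assumptions made for this example.

```python
def to_bits(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]

def from_bits(bits):
    """Pack a list of bits (MSB first) back into bytes, ignoring a partial tail."""
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for b in bits[i:i + 8]:
            value = (value << 1) | b
        out.append(value)
    return bytes(out)

def embed(frames, spare_bits, payload):
    """Write payload bits, in order, into the spare low bits of each frame's last byte."""
    bits = to_bits(payload)
    if len(bits) > spare_bits * len(frames):
        raise ValueError("payload does not fit in the available spare bits")
    mask = (1 << spare_bits) - 1
    out, pos = [], 0
    for frame in frames:
        chunk = bits[pos:pos + spare_bits]
        pos += spare_bits
        chunk = chunk + [0] * (spare_bits - len(chunk))  # pad the final chunk
        value = 0
        for b in chunk:
            value = (value << 1) | b
        last = (frame[-1] & ~mask & 0xFF) | value  # keep the vocoder bits intact
        out.append(frame[:-1] + bytes([last]))
    return out

def extract(frames, spare_bits, n_bytes):
    """Collect the spare bits from each frame and rebuild the first n_bytes of payload."""
    bits = []
    for frame in frames:
        last = frame[-1]
        bits.extend((last >> (spare_bits - 1 - i)) & 1 for i in range(spare_bits))
    return from_bits(bits[:n_bytes * 8])

frames = [bytes([i, 0xF8]) for i in range(6)]   # 3 spare bits per frame (assumed)
stuffed = embed(frames, 3, b"Hi")
print(extract(stuffed, 3, 2))
```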
  • Patent number: 11228958
    Abstract: Certain aspects of the present disclosure provide techniques for transmitting a recommended bit rate query. A method that may be performed by a user equipment (UE) generally includes participating in a voice call with a base station using a channel and a bit rate for the voice call, measuring one or more channel quality metrics for the channel during the voice call, determining whether to transmit a query message to the base station to request a change in the bit rate based, at least in part, on the measured one or more channel quality metrics, at least one of a handover indication received from the base station or a change mode request received from the base station, and a prohibit timer, and taking one or more actions based on the determination.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: January 18, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Vashishth Jhunjhunwala, Tapas Ranjan Das, Ravi Kanth Kotreka
  • Patent number: 11212381
    Abstract: Embodiments disclosed herein are directed to a method and system of processing a short code voice call request. A computing system receives a voice call request. The voice call request includes a short code associated with a target recipient. The computing system determines the target recipient based on the short code in the voice call request. The computing system determines preferences of the target recipient for processing the voice call request. The computing system processes the voice call request based on the determined preferences.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: December 28, 2021
    Inventor: Christopher A. Currie
  • Patent number: 10983808
    Abstract: The present invention relates to a method and apparatus for providing an emotion-adaptive user interface (UI) on the basis of an affective computing service, in which the provided service is configured with at least one of a service operation condition, a service end condition, and an emotion-adaptive UI service type on the basis of purpose information of the service, and the detailed pattern is changed and provided on the basis of the purpose information and the usefulness information of the service.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: April 20, 2021
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Kyoung Ju Noh, Hyun Tae Jeong, Ga Gue Kim, Ji Youn Lim, Seung Eun Chung
  • Patent number: 10885931
    Abstract: A voice processing method for estimating an impression of speech includes: executing an acquisition process that includes acquiring voice signals; executing a feature acquisition process that includes acquiring acoustic features regarding the voice signals from the voice signals; executing a voice-parameter acquisition process that includes acquiring a voice parameter regarding a frame of the voice signals; executing a relative-value determination process that includes determining a relative value between the determined voice parameter and a statistical value of the voice parameter; executing a weight assignment process that includes assigning a weight to the frame of the voice signals in accordance with the relative value; and executing a distribution determination process that includes determining a distribution of the acoustic features, based on the weight assigned to the frame of the voice signals.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: January 5, 2021
    Assignee: FUJITSU LIMITED
    Inventors: Taro Togawa, Sayuri Nakayama, Takeshi Otani
  • Patent number: 10867621
    Abstract: Methods, systems, and apparatuses for audio event detection, where the determination of a type of sound data is made at the cluster level rather than at the frame level. The techniques provided are thus more robust to the local behavior of features of an audio signal or audio recording. The audio event detection is performed by using Gaussian mixture models (GMMs) to classify each cluster or by extracting an i-vector from each cluster. Each cluster may be classified based on an i-vector classification using a support vector machine or probabilistic linear discriminant analysis. The audio event detection significantly reduces potential smoothing error and avoids any dependency on accurate window-size tuning. Segmentation may be performed using a generalized likelihood ratio and a Bayesian information criterion, and the segments may be clustered using hierarchical agglomerative clustering. Audio frames may be clustered using K-means and GMMs.
    Type: Grant
    Filed: November 26, 2018
    Date of Patent: December 15, 2020
    Assignee: Pindrop Security, Inc.
    Inventors: Elie Khoury, Matthew Garland
  • Patent number: 10832696
    Abstract: A method for improving speech signal intelligibility is performed at a device. A speech signal is obtained. A correspondence between the speech signal and a respective user group among different user groups having distinct voice characteristics is identified. Pre-encoding signal augmentation is performed on the speech signal with a respective pre-augmentation filtering coefficient that corresponds to the respective user group to obtain a group-specific pre-augmented speech signal. The device encodes the pre-augmented speech signal for subsequent transmission through the voice communication channel. An encoded version of the pre-augmented speech signal has reduced loss of signal quality as compared to an encoded version of the speech signal that is obtained without the pre-encoding signal augmentation.
    Type: Grant
    Filed: June 6, 2018
    Date of Patent: November 10, 2020
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Junbin Liang
  • Patent number: 10482196
    Abstract: A method, computer readable medium, and system are disclosed for generating a Gaussian mixture model hierarchy. The method includes the steps of receiving point cloud data defining a plurality of points; defining a Gaussian Mixture Model (GMM) hierarchy that includes a number of mixels, each mixel encoding parameters for a probabilistic occupancy map; and adjusting the parameters for one or more probabilistic occupancy maps based on the point cloud data utilizing a number of iterations of an Expectation-Maximization (EM) algorithm.
    Type: Grant
    Filed: February 26, 2016
    Date of Patent: November 19, 2019
    Assignee: NVIDIA Corporation
    Inventors: Benjamin David Eckart, Kihwan Kim, Alejandro Jose Troccoli, Jan Kautz
  • Patent number: 10460727
    Abstract: Various systems and methods for multi-talker speech separation and recognition are disclosed herein. In one example, a system includes a memory and a processor to process mixed speech audio received from a microphone. In an example, the processor can also separate the mixed speech audio using permutation invariant training, wherein a criterion of the permutation invariant training is defined on an utterance of the mixed speech audio. In an example, the processor can also generate a plurality of separated streams for submission to a speech decoder.
    Type: Grant
    Filed: May 23, 2017
    Date of Patent: October 29, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: James Droppo, Xuedong Huang, Dong Yu
  • Patent number: 10424319
    Abstract: Input of a conversation is received. The conversation includes at least a first user. An utterance of the conversation is analyzed to identify a dialog act attribute, an emotion attribute, and a tone attribute. The dialog act attribute, emotion attribute, and tone attribute are annotated to the utterance of the conversation. The conversation is validated based on the annotated attributes compared with a threshold. The annotated conversation and the validation of the conversation are stored.
    Type: Grant
    Filed: September 26, 2017
    Date of Patent: September 24, 2019
    Assignee: International Business Machines Corporation
    Inventors: Rama Kalyani T. Akkiraju, Jalal Mahmud, Vibha S. Sinha, Anbang Xu, Pritam S. Gundecha, Mansurul A. Bhuiyan, Shereen M. Oraby
  • Patent number: 10325611
    Abstract: An audio decoder for providing a decoded audio information on the basis of an encoded audio information includes a linear-prediction-domain decoder configured to provide a first decoded audio information on the basis of an audio frame encoded in a linear prediction domain, a frequency domain decoder configured to provide a second decoded audio information on the basis of an audio frame encoded in a frequency domain, and a transition processor. The transition processor is configured to obtain a zero-input-response of a linear predictive filtering, wherein an initial state of the linear predictive filtering is defined depending on the first decoded audio information and the second decoded audio information, and modify the second decoded audio information depending on the zero-input-response, to obtain a smooth transition between the first and the modified second decoded audio information.
    Type: Grant
    Filed: January 26, 2017
    Date of Patent: June 18, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Emmanuel Ravelli, Guillaume Fuchs, Sascha Disch, Markus Multrus, Grzegorz Pietrzyk, Benjamin Schubert
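    Illustrative sketch: a zero-input response (ZIR) is the output an all-pole linear predictive synthesis filter keeps producing from its internal state once its input is set to zero. The example below assumes the filter is 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and assumes the initial state is the sample-wise difference between the tail of the linear-prediction-domain output and a corresponding frequency-domain tail; that choice of state, and all names, are hypothetical simplifications rather than the patented procedure.

```python
def zero_input_response(lpc, state, n):
    """ZIR of 1/A(z): lpc holds a_1..a_p, state holds past outputs (oldest first)."""
    if len(state) < len(lpc):
        raise ValueError("state must hold at least len(lpc) samples")
    history = list(state)
    out = []
    for _ in range(n):
        # With zero input, y[n] = -sum_k a_k * y[n-k].
        y = -sum(a * history[-1 - k] for k, a in enumerate(lpc))
        out.append(y)
        history.append(y)
    return out

def smooth_transition(lpd_tail, fd_tail, fd_frame, lpc):
    """Add a decaying ZIR to the frequency-domain frame (assumed state definition)."""
    state = [a - b for a, b in zip(lpd_tail, fd_tail)]
    zir = zero_input_response(lpc, state, len(fd_frame))
    return [x + z for x, z in zip(fd_frame, zir)]

# One-pole example: A(z) = 1 - 0.5 z^-1, so the mismatch halves every sample.
print(smooth_transition([1.0], [0.0], [0.0, 0.0, 0.0], [-0.5]))
```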
  • Patent number: 10311895
    Abstract: Input of a conversation is received. The conversation includes at least a first user. An utterance of the conversation is analyzed to identify a dialog act attribute, an emotion attribute, and a tone attribute. The dialog act attribute, emotion attribute, and tone attribute are annotated to the utterance of the conversation. The conversation is validated based on the annotated attributes compared with a threshold. The annotated conversation and the validation of the conversation are stored.
    Type: Grant
    Filed: June 5, 2018
    Date of Patent: June 4, 2019
    Assignee: International Business Machines Corporation
    Inventors: Rama Kalyani T. Akkiraju, Jalal Mahmud, Vibha S. Sinha, Anbang Xu, Pritam S. Gundecha, MD Mansurul A. Bhuiyan, Shereen M. Oraby
  • Patent number: 10297273
    Abstract: Input of a conversation is received. The conversation includes at least a first user. An utterance of the conversation is analyzed to identify a dialog act attribute, an emotion attribute, and a tone attribute. The dialog act attribute, emotion attribute, and tone attribute are annotated to the utterance of the conversation. The conversation is validated based on the annotated attributes compared with a threshold. The annotated conversation and the validation of the conversation are stored.
    Type: Grant
    Filed: June 5, 2018
    Date of Patent: May 21, 2019
    Assignee: International Business Machines Corporation
    Inventors: Rama Kalyani T. Akkiraju, Jalal Mahmud, Vibha S. Sinha, Anbang Xu, Pritam S. Gundecha, Md Mansurul A. Bhuiyan, Shereen M. Oraby
  • Patent number: 10217454
    Abstract: According to an embodiment, a voice synthesizer includes a content selection unit, a content generation unit, and a content registration unit. The content selection unit determines selected content among a plurality of pieces of content registered in a content storage unit. The content includes tagged text in which tag information for controlling voice synthesis is added to text serving as a target of the voice synthesis. The content generation unit applies the tag information in the tagged text included in the selected content to designated text to generate new content. The content registration unit registers the generated new content in the content storage unit.
    Type: Grant
    Filed: September 15, 2016
    Date of Patent: February 26, 2019
    Assignees: KABUSHIKI KAISHA TOSHIBA, TOSHIBA SOLUTIONS CORPORATION
    Inventors: Kaoru Hirano, Masaru Suzuki, Hiroyuki Mizutani
  • Patent number: 10157620
    Abstract: A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
    Type: Grant
    Filed: March 4, 2015
    Date of Patent: December 18, 2018
    Inventors: Srinath Cheluvaraja, Ananth Nagaraja Iyer, Aravind Ganapathiraju, Felix Immanuel Wyss
  • Patent number: 10147422
    Abstract: Systems, methods, and computer-readable media that may be used to modify a voice action system to include voice actions provided by advertisers or users are provided. One method includes receiving electronic voice action bids from advertisers to modify the voice action system to include a specific voice action (e.g., a triggering phrase and an action). One or more bids may be selected. The method includes, for each of the selected bids, modifying data associated with the voice action system to include the voice action associated with the bid, such that the action associated with the respective voice action is performed when voice input from a user is received that the voice action system determines to correspond to the triggering phrase associated with the respective voice action.
    Type: Grant
    Filed: February 26, 2016
    Date of Patent: December 4, 2018
    Assignee: GOOGLE LLC
    Inventor: Pedro J. Moreno Mengibar
  • Patent number: 10033836
    Abstract: A server comprising a processor circuit and a database may receive address book data comprising information associated with at least one contact from a communication device via a network. The processor circuit may identify information associated with the at least one contact in the database and/or from public data. The processor circuit may add the identified information to the address book data. The processor circuit may store the address book data with the added information in the database and send the added information with or without the address book data to the communication device via the network.
    Type: Grant
    Filed: October 11, 2017
    Date of Patent: July 24, 2018
    Assignee: FUZE, INC.
    Inventors: Alberto Lopez Toledo, Julio Andres Viera Sotillo, Inaki Berenguer, Joaquim Castellà Vilaseca
  • Patent number: 9953633
    Abstract: Various implementations disclosed herein include a training module configured to produce a set of segment templates from a concurrent segmentation of a plurality of vocalization instances of a VSP vocalized by a particular speaker, who is identifiable by a corresponding set of vocal characteristics. Each segment template provides a stochastic characterization of how each of one or more portions of a VSP is vocalized by the particular speaker in accordance with the corresponding set of vocal characteristics. Additionally, in various implementations, the training module includes systems, methods and/or devices configured to produce a set of VSP segment maps that each provide a quantitative characterization of how respective segments of the plurality of vocalization instances vary in relation to a corresponding one of a set of segment templates.
    Type: Grant
    Filed: July 23, 2015
    Date of Patent: April 24, 2018
    Assignee: MALASPINA LABS (BARBADOS), INC.
    Inventors: Clarence Chu, Alireza Kenarsari Anhari
  • Patent number: 9934793
    Abstract: Disclosed are a method for determining whether a person is drunk, capable of analyzing alcohol consumption in a time domain by analyzing a voice, and a recording medium and a terminal for carrying out the same.
    Type: Grant
    Filed: January 24, 2014
    Date of Patent: April 3, 2018
    Assignee: FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION
    Inventors: Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
  • Patent number: 9916844
    Abstract: Disclosed are a method for determining whether a person is drunk after consuming alcohol on the basis of a difference among a plurality of formant energies, which are generated by applying linear predictive coding according to a plurality of linear prediction orders, and a recording medium and a terminal for carrying out the method.
    Type: Grant
    Filed: January 28, 2014
    Date of Patent: March 13, 2018
    Assignee: FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION
    Inventors: Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
  • Patent number: 9899039
    Abstract: Disclosed are a method for determining alcohol consumption capable of analyzing alcohol consumption in a time domain by analyzing a formant slope of a voice signal, and a recording medium and a terminal for carrying out the same. A terminal for determining whether a person is drunk comprises: a voice input unit for generating a voice frame by receiving a voice signal; a voiced/unvoiced sound analysis unit for determining whether a received voice frame corresponds to a voiced sound; a formant frequency extraction unit for extracting a plurality of formant frequencies of the voice frame corresponding to the voiced sound; and an alcohol consumption determining unit for calculating a formant slope between the plurality of formant frequencies, and determining the state of alcohol consumption depending on the formant slope, thereby determining whether a person is drunk by analyzing the formant slope of an inputted voice.
    Type: Grant
    Filed: January 24, 2014
    Date of Patent: February 20, 2018
    Assignee: FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION
    Inventors: Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
  • Patent number: 9875081
    Abstract: A system may use multiple speech interface devices to interact with a user by speech. All or a portion of the speech interface devices may detect a user utterance and may initiate speech processing to determine a meaning or intent of the utterance. Within the speech processing, arbitration is employed to select one of the multiple speech interface devices to respond to the user utterance. Arbitration may be based in part on metadata that directly or indirectly indicates the proximity of the user to the devices, and the device that is deemed to be nearest the user may be selected to respond to the user utterance.
    Type: Grant
    Filed: September 21, 2015
    Date of Patent: January 23, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: James David Meyers, Shah Samir Pravinchandra, Yue Liu, Arlen Dean, Daniel Miller, Arindam Mandal
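    Illustrative sketch: the arbitration step can be reduced to picking, among the devices that heard the same utterance, the one whose metadata suggests the user is closest. The use of received signal energy as the proximity proxy and the report format are assumptions for this example; the abstract only says the metadata directly or indirectly indicates proximity.

```python
def arbitrate(reports):
    """Pick the device to respond.

    reports: list of (device_id, signal_energy) tuples, one per device that
    detected the utterance. Higher energy is treated as "nearer" (assumption).
    Returns the chosen device_id, or None if no device reported.
    """
    if not reports:
        return None
    device_id, _ = max(reports, key=lambda report: report[1])
    return device_id

print(arbitrate([("kitchen", 0.2), ("living_room", 0.7), ("hall", 0.4)]))
```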
  • Patent number: 9799329
    Abstract: This disclosure describes, in part, techniques and devices for identifying recurring environmental sounds in an environment such that these sounds may be canceled out of corresponding audio signals to increase signal-to-noise ratios (SNRs) of the signals and, hence, improve automatic speech recognition (ASR) on the signals. Recurring environmental sounds may include the ringing of a mobile phone, the beeping sound of a microphone, the buzzing of a washing machine, or the like.
    Type: Grant
    Filed: December 3, 2014
    Date of Patent: October 24, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Michael Alan Pogue, Kurt Wesley Piersol
  • Patent number: 9747910
    Abstract: A device comprising a memory and one or more processors may be configured to extract, from the bitstream, a type of quantization mode. The one or more processors may also be configured to switch, based on the type of quantization mode, between non-predictive vector dequantization to reconstruct a first set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain, and predictive vector dequantization to reconstruct a second set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain. The memory may be configured to store the reconstructed first set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain, and the reconstructed second set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain.
    Type: Grant
    Filed: September 18, 2015
    Date of Patent: August 29, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Moo Young Kim, Nils Günther Peters
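    Illustrative sketch: the mode switch in this abstract can be pictured as a decoder that either reads the weights directly from a codebook entry (non-predictive) or treats the codebook entry as a residual added to the previously reconstructed weights (predictive). The mode names, the codebook contents, and the simple additive predictor are assumptions for this example and are not taken from the patent.

```python
# Hypothetical codebook: each index maps to a vector of weights (or residuals).
CODEBOOK = [
    [0.0, 0.0],
    [0.5, -0.5],
    [1.0, 0.25],
]

def dequantize_weights(mode, index, previous=None, codebook=CODEBOOK):
    """Reconstruct a weight vector according to the signalled quantization mode."""
    entry = list(codebook[index])
    if mode == "non_predictive":
        return entry
    if mode == "predictive":
        if previous is None:
            raise ValueError("predictive mode needs previously decoded weights")
        # Assumed predictor: previous weights plus the decoded residual.
        return [p + r for p, r in zip(previous, entry)]
    raise ValueError(f"unknown quantization mode: {mode}")

first = dequantize_weights("non_predictive", 2)
second = dequantize_weights("predictive", 1, previous=first)
print(first, second)
```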
  • Patent number: 9728196
    Abstract: A method and apparatus to encode and decode an audio/speech signal is provided. An inputted audio signal or speech signal may be transformed into at least one of a high frequency resolution signal and a high temporal resolution signal. The signal may be encoded by determining an appropriate resolution, the encoded signal may be decoded, and thus the audio signal, the speech signal, and a mixed signal of the audio signal and the speech signal may be processed.
    Type: Grant
    Filed: May 9, 2016
    Date of Patent: August 8, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Eun Mi Oh, Jung Hoe Kim, Ki-hyun Choo, Ho Sang Sung, Mi Young Kim
  • Patent number: 9667963
    Abstract: The prediction error energy in inter-frame prediction with motion compensation is reduced and the coding efficiency is improved.
    Type: Grant
    Filed: June 22, 2012
    Date of Patent: May 30, 2017
    Assignee: Nippon Telegraph And Telephone Corporation
    Inventors: Shohei Matsuo, Yukihiro Bandoh, Seishi Takamura, Hirohisa Jozawa
  • Patent number: 9595261
    Abstract: According to an embodiment, a pattern recognition device includes a signal processor, a first recognizer, a detector, and a second recognizer. The signal processor is configured to calculate a feature of a time-series signal for each frame. The first recognizer is configured to recognize which of a leaf class and a single class of a first class group the time-series signal belongs to for each frame based on the feature and output a recognition result. The detector is configured to detect a segment including a first target class on the basis of a sum of probabilities of the leaf classes which the frame belongs to on the basis of the recognition results for each frame. The second recognizer is configured to recognize which of second target classes the segment belongs to on the basis of the recognition results for the frames within the segment.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: March 14, 2017
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Hiroshi Fujimura
  • Patent number: 9582702
    Abstract: Embodiments of the present invention generally relate to data processing, and further the embodiments of the invention relate to a method of processing a visible coding sequence and a system thereof, a method of playing a visible coding sequence and a system thereof. The present invention creatively proposes a scheme of determining sampling rate with synchronized frames to realize effective processing of a visible coding sequence. The scheme of processing a visible coding sequence according to the present invention is helpful for visible coding synchronization on the capturing side, enabling the capturing side to determine appropriate sampling rate and sampling timing, and thus effectively acquire the visible coding sequence, which may not only reduce resource waste, but also acquire a complete visible coding sequence.
    Type: Grant
    Filed: August 31, 2016
    Date of Patent: February 28, 2017
    Assignee: International Business Machines Corporation
    Inventors: Jiexin Jiao, Mengxiang Lin, Song Song, XiaoFeng Wang
  • Patent number: 9531862
    Abstract: A system to optimize a user's messaging by having a mechanism to recommend that a user utilizes an alternative communication channel. The invention relates to mobile messaging applications and to analyzing message content and providing feedback to the user in the form of a graphical or spoken output containing an offer of an alternative communication mode, wherein processing content of the user input comprises analyzing message content to collect parameters relating to message priority, channel type, channel availability, user schedule, user time zone, relationship of user to recipient calculated using a familiarity index, type of content, and number of recipients.
    Type: Grant
    Filed: September 4, 2015
    Date of Patent: December 27, 2016
    Inventor: Vishal Vadodaria
  • Patent number: 9420081
    Abstract: A system and method for providing voice communications with desired characteristics based upon the intended recipient of a voice communication. An apparatus includes a list of dial strings associated with parties having desired voice communication characteristics. A dial string entered by a user and associated with an intended recipient is compared to a list of preferred dial strings to determine the characteristics of an encoded voice signal to be sent to the recipient. The apparatus can include a vocoder having different bit rate modes and a bit rate mode is selected based upon the dial string entered by a user. Dial strings can be stored at the device or on a network. The apparatus can include a mode selector to select a desired vocoder mode to generate an encoded voice signal.
    Type: Grant
    Filed: March 18, 2014
    Date of Patent: August 16, 2016
    Assignee: AT&T Mobility II LLC
    Inventors: Jun Shen, Jack Denenberg, Alan MacDonald
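    Illustrative sketch: the selection described in this abstract amounts to comparing the entered dial string against a stored list and choosing a vocoder bit rate mode accordingly. The table contents, the mode names, and the normalization rule below are hypothetical; the abstract does not specify them.

```python
# Hypothetical list of preferred dial strings and their vocoder modes.
PREFERRED_MODES = {
    "+15550100": "full_rate",
    "5550123": "half_rate",
}
DEFAULT_MODE = "standard_rate"

def normalize(dial_string):
    """Strip spaces and punctuation so equivalent dial strings compare equal."""
    return "".join(ch for ch in dial_string if ch.isdigit() or ch == "+")

def select_vocoder_mode(dial_string, preferred=PREFERRED_MODES):
    """Return the mode for a listed dial string, otherwise the default mode."""
    return preferred.get(normalize(dial_string), DEFAULT_MODE)

print(select_vocoder_mode("+1 (555) 0100"))
print(select_vocoder_mode("555-9999"))
```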
  • Patent number: 9378746
    Abstract: Disclosed are a method and apparatus for encoding and decoding a high frequency for bandwidth extension. The method includes: estimating a weight; and generating a high frequency excitation signal by applying the weight between random noise and a decoded low frequency spectrum.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: June 28, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Ki-hyun Choo
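    Illustrative sketch: one way to read "applying the weight between random noise and a decoded low frequency spectrum" is as a linear blend, where the weight sets how much of the high-band excitation comes from noise and how much is copied from the low band. The blend formula, the Gaussian noise source, and the function name are assumptions for this example, and the weight-estimation step is not shown.

```python
import random

def high_band_excitation(low_spectrum, weight, rng=None):
    """Blend noise with the decoded low-band spectrum.

    weight = 0.0 copies the low band unchanged; weight = 1.0 yields pure noise.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if rng is None:
        rng = random.Random(0)  # fixed seed keeps the example reproducible
    return [
        weight * rng.gauss(0.0, 1.0) + (1.0 - weight) * coeff
        for coeff in low_spectrum
    ]

low = [0.9, 0.4, -0.2, 0.1]
print(high_band_excitation(low, 0.0))   # identical to the low band
print(high_band_excitation(low, 0.3))   # mostly tonal, some noise
```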
  • Patent number: 9373341
    Abstract: Method for measuring level of speech determined by an audio signal in a manner which corrects for and reduces the effect of modification of the signal by the addition of noise thereto and/or amplitude compression thereof, and a system configured to perform any embodiment of the method. In some embodiments, the method includes steps of generating frequency banded, frequency-domain data indicative of an input speech signal, determining from the data a Gaussian parametric spectral model of the speech signal, and determining from the parametric spectral model an estimated mean speech level and a standard deviation value for each frequency band of the data; and generating speech level data indicative of a bias corrected mean speech level for each frequency band, including using at least one correction value to correct the estimated mean speech level for the frequency band, where each correction value has been predetermined using a reference speech model.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: June 21, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: David Gunawan, Glenn Dickins
  • Patent number: 9275639
    Abstract: A client-server architecture for Automatic Speech Recognition (ASR) applications includes: (a) a client side including: a client being part of a distributed front end for converting acoustic waves to feature vectors; a VAD for separating speech from non-speech acoustic signals; an adaptor for WebSockets; and (b) a server side including: a web layer utilizing HTTP protocols and including a Web Server having a Servlet Container; an intermediate layer for transport based on Message-Oriented Middleware being a message broker; a recognition server and an adaptation server both connected to said intermediate layer; a Speech processing server; a Recognition Server for instantiation of a recognition channel per client; an Adaptation Server for adaptation of acoustic and linguistic models for each speaker; a Bidirectional communication channel between a Speech processing server and the client side; and a Persistent layer for storing a Language Knowledge Base connected to said Speech processing server.
    Type: Grant
    Filed: March 31, 2013
    Date of Patent: March 1, 2016
    Assignee: Dixilang Ltd.
    Inventor: Victor Shagalov
  • Patent number: 9263054
    Abstract: A method for controlling an average encoding rate by an electronic device is described. The method includes obtaining a speech signal. The method also includes determining a first average rate. The method further includes determining a first threshold based on the first average rate. The method additionally includes controlling the average encoding rate by determining at least one other threshold based on the first threshold. The method also includes sending an encoded speech signal.
    Type: Grant
    Filed: August 30, 2013
    Date of Patent: February 16, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Subasingha Shaminda Subasingha, Vivek Rajendran, Venkatesh Krishnan, Venkatraman Srinivasa Atti
  • Patent number: 9111531
    Abstract: Improved audio classification is provided for encoding applications. An initial classification is performed, followed by a finer classification, to produce speech classifications and music classifications with higher accuracy and less complexity than previously available. Audio is classified as speech or music on a frame by frame basis. If the frame is classified as music by the initial classification, that frame undergoes a second, finer classification to confirm that the frame is music and not speech (e.g., speech that is tonal and/or structured that may not have been classified as speech by the initial classification). Depending on the implementation, one or more parameters may be used in the finer classification. Example parameters include voicing, modified correlation, signal activity, and long term pitch gain.
    Type: Grant
    Filed: December 20, 2012
    Date of Patent: August 18, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatraman Srinivasa Atti, Ethan Robert Duni
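The two-stage flow in the abstract above (an initial classification, then a finer check applied only to frames labeled music) can be sketched as follows. The four finer-stage parameters are the ones the abstract lists as examples; the `initial_score` field, every threshold, and the majority-vote rule are assumptions made for illustration and are not taken from the patent.

```python
from dataclasses import dataclass

SPEECH, MUSIC = "speech", "music"


@dataclass
class FrameFeatures:
    initial_score: float          # hypothetical coarse score, higher = more speech-like
    voicing: float
    modified_correlation: float
    signal_activity: float
    long_term_pitch_gain: float


def initial_classification(f, threshold=0.5):
    """Coarse first pass; the threshold is an illustrative assumption."""
    return SPEECH if f.initial_score >= threshold else MUSIC


def finer_classification(f):
    """Second pass: vote over the example parameters named in the abstract.

    All thresholds and the 3-of-4 vote are hypothetical.
    """
    votes = sum([
        f.voicing >= 0.6,
        f.modified_correlation >= 0.7,
        f.signal_activity >= 0.5,
        f.long_term_pitch_gain >= 0.6,
    ])
    return SPEECH if votes >= 3 else MUSIC


def classify_frame(f):
    """Only frames first labeled music are re-checked by the finer stage."""
    label = initial_classification(f)
    if label == MUSIC:
        label = finer_classification(f)
    return label
```

Running the finer stage only on frames first labeled music keeps the common speech path cheap, which is consistent with the abstract's claim of lower complexity.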
  • Patent number: 9098812
    Abstract: The claimed subject matter provides systems and/or methods for training feature weights in a statistical machine translation model. The system can include components that obtain lists of translation hypotheses and associated feature values, set a current point in the multidimensional feature weight space to an initial value, choose a line in the feature weight space that passes through the current point, and reset the current point to optimize the feature weights with respect to the line. The system can further include components that set the current point to be a best point attained, reduce the list of translation hypotheses based on a determination that a particular hypothesis has never been touched in optimizing the feature weights from at least one of an initial starting point or a randomly selected restarting point, and output the point ascertained to be the best point in the feature weight space.
    Type: Grant
    Filed: April 14, 2009
    Date of Patent: August 4, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Robert Carter Moore, Christopher Brian Quirk
  • Patent number: 9037455
    Abstract: Techniques for a computing device operating in limited-access states are provided. One example method includes determining, by a computing device, that a notification is scheduled for output by the computing device during a first time period and that a pattern of audio detected during the first time period is indicative of human speech. The method further includes delaying output of the notification during the first time period and determining that a pattern of audio detected during a second time period is not indicative of human speech. The method also includes outputting at least a portion of the notification at an earlier in time of an end of the second time period or an expiration of a third time period.
    Type: Grant
    Filed: January 8, 2014
    Date of Patent: May 19, 2015
    Assignee: Google Inc.
    Inventors: Alexander Faaborg, Tristan Harris, Austin Robison
  • Patent number: 9015041
    Abstract: An audio encoder has a window function controller, a windower, a time warper with a final quality check functionality, a time/frequency converter, a TNS stage or a quantizer encoder. The window function controller, the time warper, the TNS stage or an additional noise filling analyzer are controlled by signal analysis results obtained by a time warp analyzer or a signal classifier. Furthermore, a decoder applies a noise filling operation using a manipulated noise filling estimate depending on a harmonic or speech characteristic of the audio signal.
    Type: Grant
    Filed: January 11, 2011
    Date of Patent: April 21, 2015
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Stefan Bayer, Sascha Disch, Ralf Geiger, Guillaume Fuchs, Max Neuendorf, Gerald Schuller, Bernd Edler
  • Patent number: 9008329
    Abstract: Provided are methods and systems for noise suppression within multiple time-frequency points of spectral representations. A multi-feature cluster tracker is used to track signal and noise sources and to predict signal versus noise dominance at each time-frequency point. Multiple features, such as binaural and monaural features, may be used for these purposes. A Gaussian mixture model (GMM) is developed and, in some embodiments, dynamically updated for distinguishing signal from noise and performing mask-based noise reduction. Each frequency band may use a different GMM or share a GMM with other frequency bands. A GMM may be combined from two models, with one trained to model time-frequency points in which the target dominates and another trained to model time-frequency points in which the noise dominates. Dynamic updates of a GMM may be performed using an expectation-maximization algorithm in an unsupervised fashion.
    Type: Grant
    Filed: June 8, 2012
    Date of Patent: April 14, 2015
    Assignee: Audience, Inc.
    Inventors: Michael Mandel, Carlos Avendano
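As a simplified illustration of mask-based noise reduction with two models (one for target-dominated and one for noise-dominated time-frequency points), the sketch below uses a single one-dimensional feature and one Gaussian per model. The abstract describes Gaussian mixture models over multiple features with dynamic updates; the single-Gaussian reduction, the parameter values, and the function names here are all assumptions.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def target_posterior(feature, target=(1.0, 0.5), noise=(-1.0, 0.5),
                     target_prior=0.5):
    """Posterior probability that the target dominates this point.

    `target` and `noise` are (mean, std) pairs; the defaults are hypothetical.
    """
    p_target = target_prior * gaussian_pdf(feature, *target)
    p_noise = (1.0 - target_prior) * gaussian_pdf(feature, *noise)
    total = p_target + p_noise
    return p_target / total if total > 0 else 0.5


def apply_mask(magnitudes, features, **model):
    """Scale each time-frequency magnitude by its target posterior (soft mask)."""
    return [m * target_posterior(f, **model)
            for m, f in zip(magnitudes, features)]
```

Points whose feature sits near the target model's mean pass almost unchanged, while points near the noise model's mean are attenuated toward zero.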
  • Patent number: 8996362
    Abstract: For a bandwidth extension of an audio signal, in a signal spreader the audio signal is temporally spread by a spread factor greater than 1. The temporally spread audio signal is then supplied to a decimator to decimate the temporally spread version by a decimation factor matched to the spread factor. The band generated by this decimation operation is extracted and distorted, and finally combined with the audio signal to obtain a bandwidth extended audio signal. A phase vocoder in the filterbank implementation or transformation implementation may be used for signal spreading.
    Type: Grant
    Filed: January 20, 2009
    Date of Patent: March 31, 2015
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Frederik Nagel, Sascha Disch, Max Neuendorf
  • Patent number: 8990074
    Abstract: A method of noise-robust speech classification is disclosed. Classification parameters are input to a speech classifier from external components. Internal classification parameters are generated in the speech classifier from at least one of the input parameters. A Normalized Auto-correlation Coefficient Function threshold is set. A parameter analyzer is selected according to a signal environment. A speech mode classification is determined based on a noise estimate of multiple frames of input speech.
    Type: Grant
    Filed: April 10, 2012
    Date of Patent: March 24, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Ethan Robert Duni, Vivek Rajendran
  • Patent number: 8982942
    Abstract: Disclosed herein are tools and techniques for storing and using video processing tool configuration information that can identify combinations of video processing tools to be used for processing video. In one exemplary embodiment, video processing tools of a computing system are identified. The performance of a combination of the video processing tools is measured. The performance measurement is compared with another performance measurement of another combination of the video processing tools. Based on the comparison, video processing tool configuration information is set. In another exemplary embodiment, video processing tool configuration information indicating a combination of video processing tools is accessed, and video data is processed using the combination of video processing tools based on the video processing tool configuration information.
    Type: Grant
    Filed: June 17, 2011
    Date of Patent: March 17, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Wenfeng Gao, Shyam Sadhwani
  • Patent number: 8977544
    Abstract: A quantizing method is provided that includes quantizing an input signal by selecting one of a first quantization scheme not using an inter-frame prediction and a second quantization scheme using the inter-frame prediction, in consideration of one or more of a prediction mode, a predictive error and a transmission channel state.
    Type: Grant
    Filed: April 23, 2012
    Date of Patent: March 10, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ho-sang Sung, Eun-mi Oh
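The scheme selection in the abstract above can be sketched as a decision between a quantizer that uses inter-frame prediction and one that does not. The sketch assumes a first-order predictor with a fixed coefficient, a mean-squared-error threshold, and a boolean channel-state flag; these specifics, and all names, are illustrative assumptions rather than details from the patent.

```python
def prediction_error(frame, previous_quantized, rho):
    """Mean squared error between the frame and its inter-frame prediction."""
    predicted = [rho * p for p in previous_quantized]
    return sum((x - p) ** 2 for x, p in zip(frame, predicted)) / len(frame)


def select_scheme(frame, previous_quantized, rho=0.8, error_threshold=0.05,
                  channel_is_lossy=False):
    """Pick 'predictive' or 'non_predictive' for the current frame.

    rho, error_threshold and the channel flag are hypothetical parameters.
    """
    # Without a usable previous frame, or on a poor channel, avoid depending
    # on past frames so a lost frame cannot corrupt the following ones.
    if channel_is_lossy or previous_quantized is None:
        return "non_predictive"
    if prediction_error(frame, previous_quantized, rho) <= error_threshold:
        return "predictive"
    return "non_predictive"
```

Prediction pays off when consecutive frames are similar, so the sketch falls back to the non-predictive scheme whenever the prediction error is large or the channel state suggests frames may be lost.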
  • Patent number: 8977543
    Abstract: A quantizing apparatus is provided that includes a quantization path determiner that determines a path from a first path not using inter-frame prediction and a second path using the inter-frame prediction, as a quantization path of an input signal, based on a criterion before quantization of the input signal; a first quantizer that quantizes the input signal, if the first path is determined as the quantization path of the input signal; and a second quantizer that quantizes the input signal, if the second path is determined as the quantization path of the input signal.
    Type: Grant
    Filed: April 23, 2012
    Date of Patent: March 10, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ho-sang Sung, Eun-mi Oh
  • Patent number: 8965761
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: February 27, 2014
    Date of Patent: February 24, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: William Kress Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Thomas James Watson, Daniel Mark Schumacher
  • Patent number: 8965773
    Abstract: A method is provided for hierarchical coding of a digital audio signal comprising, for a current frame of the input signal: a core coding, delivering a scalar quantization index for each sample of the current frame, and at least one enhancement coding delivering indices of scalar quantization for each coded sample of an enhancement signal. The enhancement coding comprises a step of obtaining a filter for shaping the coding noise used to determine a target signal, and the indices of scalar quantization of said enhancement signal are determined by minimizing the error between a set of possible values of scalar quantization and said target signal. The coding method can also comprise a shaping of the coding noise for the core bitrate coding. A coder implementing the coding method is also provided.
    Type: Grant
    Filed: November 17, 2009
    Date of Patent: February 24, 2015
    Assignee: Orange
    Inventors: Balazs Kovesi, Stéphane Ragot, Alain Le Guyader