Pattern Matching Vocoders Patents (Class 704/221)

Vector quantization (Class 704/222)

Excitation patterns (Class 704/223)

Method, apparatus and system for hybrid speech synthesis

Patent number: 12254889

Abstract: A method of decoding an original speech signal for hybrid adversarial-parametric speech synthesis comprising: (a) receiving quantized original linear prediction coding parameters estimated by applying linear prediction coding analysis filtering to an original speech signal and a quantized compressed representation of a residual of the original speech signal; (b) dequantizing the original linear prediction coding parameters and the compressed representation of the residual; (c) inputting the dequantized compressed representation of the residual into a decoder part of a Generator for applying adversarial mapping from the compressed residual domain to a fake (first) signal domain; (d) outputting, by the decoder part of the Generator, a fake speech signal; (e) applying linear prediction coding analysis filtering to the fake speech signal for obtaining a corresponding fake residual; (f) reconstructing the original speech signal by applying linear prediction coding cross-synthesis filtering to the fake residual and

Type: Grant

Filed: December 20, 2019

Date of Patent: March 18, 2025

Assignee: DOLBY INTERNATIONAL AB

Inventors: Ahmed Mustafa, Arijit Biswas
Systems and methods for synthesizing code from input and output examples

Patent number: 11875139

Abstract: The present disclosure provides systems and methods for synthesizing computer-readable code based on the receipt of input and output examples. A computing system in accordance with the disclosure can be configured to receive a given input and output, access a library of operations, and perform a search of a library of operations (e.g., transpose, slice, norm, etc.) that can be applied to the input. By applying the operations to the input and tracking the results, the computing system may identify an expression comprising one or a combination of operations that when applied to the input generates the output. In this manner, implementations of the disclosure may be used to identify one or more solutions that a user having access to the library of operations may use to generate the output from the input.
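
The search described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three-operation library, the function names, and the breadth-first strategy are hypothetical and are not taken from the patent, which does not specify its search procedure here.

```python
from collections import deque

# Hypothetical library of operations on matrices stored as lists of rows.
def transpose(m):
    return [list(col) for col in zip(*m)]

def flip_rows(m):
    return m[::-1]

def drop_first_row(m):
    return m[1:]

LIBRARY = {
    "transpose": transpose,
    "flip_rows": flip_rows,
    "drop_first_row": drop_first_row,
}

def synthesize(inp, out, max_depth=3):
    """Breadth-first search for a shortest list of operation names mapping inp to out."""
    queue = deque([(inp, [])])
    seen = {repr(inp)}  # track intermediate results so each value is expanded once
    while queue:
        value, path = queue.popleft()
        if value == out:
            return path
        if len(path) == max_depth:
            continue
        for name, op in LIBRARY.items():
            result = op(value)
            key = repr(result)
            if key not in seen:
                seen.add(key)
                queue.append((result, path + [name]))
    return None  # no expression found within max_depth operations
```

Calling `synthesize([[1, 2], [3, 4]], [[2, 4], [1, 3]])` returns the operation names to apply in order; `None` means no combination of at most `max_depth` operations produces the output.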

Type: Grant

Filed: February 6, 2023

Date of Patent: January 16, 2024

Assignee: GOOGLE LLC

Inventors: Kensen Shi, Rishabh Singh, David J. Bieber
Systems and methods for synthesizing code from input and output examples

Patent number: 11573774

Abstract: The present disclosure provides systems and methods for synthesizing computer-readable code based on the receipt of input and output examples. A computing system in accordance with the disclosure can be configured to receive a given input and output, access a library of operations, and perform a search of a library of operations (e.g., transpose, slice, norm, etc.) that can be applied to the input. By applying the operations to the input and tracking the results, the computing system may identify an expression comprising one or a combination of operations that when applied to the input generates the output. In this manner, implementations of the disclosure may be used to identify one or more solutions that a user having access to the library of operations may use to generate the output from the input.

Type: Grant

Filed: February 21, 2022

Date of Patent: February 7, 2023

Assignee: GOOGLE LLC

Inventors: Kensen Shi, Rishabh Singh, David J. Bieber
Methods and apparatus for enhancing musical sound during a networked conference

Patent number: 11562761

Abstract: Dynamic adjustment of audio characteristics for enhancing musical sound during a networked conference is disclosed. In an embodiment, a method is provided for sound enhancement performed by a device coupled to a network. The method includes receiving an audio signal to be transmitted over the network, detecting when musical content is present in the audio signal, processing the audio signal to enhance voice characteristics to generate an enhanced audio signal when the musical content is not detected, processing the audio signal to enhance music characteristics to generate the enhanced audio signal when the musical content is detected, and transmitting the enhanced audio signal over the network.

Type: Grant

Filed: July 31, 2020

Date of Patent: January 24, 2023

Assignee: Zoom Video Communications, Inc.

Inventors: Qiyong Liu, Jiachuan Deng, Yuhui Chen, Oded Gal
Systems and methods for synthesizing code from input and output examples

Patent number: 11256485

Abstract: The present disclosure provides systems and methods for synthesizing computer-readable code based on the receipt of input and output examples. A computing system in accordance with the disclosure can be configured to receive a given input and output, access a library of operations, and perform a search of a library of operations (e.g., transpose, slice, norm, etc.) that can be applied to the input. By applying the operations to the input and tracking the results, the computing system may identify an expression comprising one or a combination of operations that when applied to the input generates the output. In this manner, implementations of the disclosure may be used to identify one or more solutions that a user having access to the library of operations may use to generate the output from the input.

Type: Grant

Filed: July 15, 2020

Date of Patent: February 22, 2022

Assignee: Google LLC

Inventors: Kensen Shi, Rishabh Singh, David J. Bieber
Incorporating data into a voice signal with zero overhead

Patent number: 11250867

Abstract: A vocoder system incorporates situational awareness data into unused bits in the trailing bytes of the vocoder frames by dividing the situational awareness data according to the number of known blank bits in each vocoder frame and incorporating the data, in order, such that the receiving system can extract and reconstruct the situational awareness data. Synchronization signals of predefined bit streams are incorporated to allow the receiving system to more accurately identify situational awareness bits in the trailing byte.
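
A minimal sketch of the bit-packing idea in this abstract follows. It assumes, purely for illustration, that every frame has the same number of blank bits and that they are the low-order bits of the frame's last byte; the function names are hypothetical, and the synchronization signals mentioned in the abstract are omitted.

```python
def embed(frames, blank_bits, payload_bits):
    """Write payload bits, in order, into the low `blank_bits` bits of each frame's last byte."""
    if len(payload_bits) > blank_bits * len(frames):
        raise ValueError("payload does not fit in the available blank bits")
    mask = (1 << blank_bits) - 1
    out, pos = [], 0
    for frame in frames:
        chunk = payload_bits[pos:pos + blank_bits]
        pos += len(chunk)
        chunk = chunk + [0] * (blank_bits - len(chunk))  # zero-pad the final chunk
        value = 0
        for bit in chunk:
            value = (value << 1) | bit
        # Keep the voice bits of the last byte and overwrite only the blank bits.
        last = (frame[-1] & ~mask & 0xFF) | value
        out.append(frame[:-1] + bytes([last]))
    return out

def extract(frames, blank_bits, n_bits):
    """Read back the first `n_bits` payload bits from the frames' trailing bytes."""
    mask = (1 << blank_bits) - 1
    bits = []
    for frame in frames:
        value = frame[-1] & mask
        bits.extend((value >> shift) & 1 for shift in range(blank_bits - 1, -1, -1))
    return bits[:n_bits]
```

Because only bits that the vocoder leaves unused are overwritten, the frame length and the voice payload are unchanged, which is the sense in which the scheme adds no overhead.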

Type: Grant

Filed: October 8, 2019

Date of Patent: February 15, 2022

Assignee: Rockwell Collins, Inc.

Inventor: James A. Stevens
Techniques for transmission of recommended bit rate queries

Patent number: 11228958

Abstract: Certain aspects of the present disclosure provide techniques for transmitting a recommended bit rate query. A method that may be performed by a user equipment (UE) generally includes participating in a voice call with a base station using a channel and a bit rate for the voice call, measuring one or more channel quality metrics for the channel during the voice call, determining whether to transmit a query message to the base station to request a change in the bit rate based, at least in part, on the measured one or more channel quality metrics, at least one of a handover indication received from the base station or a change mode request received from the base station, and a prohibit timer, and taking one or more actions based on the determination.

Type: Grant

Filed: May 13, 2020

Date of Patent: January 18, 2022

Assignee: QUALCOMM Incorporated

Inventors: Vashishth Jhunjhunwala, Tapas Ranjan Das, Ravi Kanth Kotreka
Methods and systems for short code voice dialing

Patent number: 11212381

Abstract: Embodiments disclosed herein are directed to a method and system of processing a short code voice call request. A computing system receives a voice call request. The voice call request includes a short code associated with a target recipient. The computing system determines the target recipient based on the short code in the voice call request. The computing system determines preferences of the target recipient for processing the voice call request. The computing system processes the voice call request based on the determined preferences.

Type: Grant

Filed: May 11, 2020

Date of Patent: December 28, 2021

Inventor: Christopher A. Currie
Method and apparatus for providing emotion-adaptive user interface

Patent number: 10983808

Abstract: The present invention relates to a method and apparatus for providing an emotion-adaptive user interface (UI) on the basis of an affective computing service, in which the provided service is configured with at least one of a service operation condition, a service end condition, and an emotion-adaptive UI service type on the basis of purpose information of the service, and the detailed pattern is changed and provided on the basis of the purpose information and the usefulness information of the service.

Type: Grant

Filed: August 22, 2019

Date of Patent: April 20, 2021

Assignee: Electronics and Telecommunications Research Institute

Inventors: Kyoung Ju Noh, Hyun Tae Jeong, Ga Gue Kim, Ji Youn Lim, Seung Eun Chung
Voice processing method, voice processing apparatus, and non-transitory computer-readable storage medium for storing voice processing computer program

Patent number: 10885931

Abstract: A voice processing method for estimating an impression of speech includes: executing an acquisition process that includes acquiring voice signals; executing a feature acquisition process that includes acquiring acoustic features regarding the voice signals from the voice signals; executing a voice-parameter acquisition process that includes acquiring a voice parameter regarding a frame of the voice signals; executing a relative-value determination process that includes determining a relative value between the determined voice parameter and a statistical value of the voice parameter; executing a weight assignment process that includes assigning a weight to the frame of the voice signals in accordance with the relative value; and executing a distribution determination process that includes determining a distribution of the acoustic features, based on the weight assigned to the frame of the voice signals.
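
The weighting steps in this abstract can be sketched as a weighted histogram. The choices below are assumptions made for illustration only: the statistical value is taken to be the mean, the relative value is a simple difference, and the two-level weight rule and the name `weighted_histogram` are hypothetical.

```python
from statistics import mean

def weighted_histogram(features, params, bin_edges, low_weight=0.2):
    """Histogram of per-frame features, with each frame weighted by its voice parameter.

    features  : one acoustic feature per frame (e.g. pitch in Hz)
    params    : one voice parameter per frame (e.g. frame power)
    bin_edges : ascending bin boundaries; bin i covers [bin_edges[i], bin_edges[i + 1])
    """
    stat = mean(params)  # statistical value of the voice parameter
    hist = [0.0] * (len(bin_edges) - 1)
    for feat, param in zip(features, params):
        relative = param - stat  # relative value for this frame
        weight = 1.0 if relative >= 0 else low_weight  # hypothetical weight rule
        for i in range(len(hist)):
            if bin_edges[i] <= feat < bin_edges[i + 1]:
                hist[i] += weight
                break
    return hist
```

Frames whose parameter falls below the average still contribute, but less, so the resulting feature distribution is dominated by the frames the weighting rule favors.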

Type: Grant

Filed: September 24, 2018

Date of Patent: January 5, 2021

Assignee: FUJITSU LIMITED

Inventors: Taro Togawa, Sayuri Nakayama, Takeshi Otani
System and method for cluster-based audio event detection

Patent number: 10867621

Abstract: Methods, systems, and apparatuses for audio event detection, where the determination of a type of sound data is made at the cluster level rather than at the frame level. The techniques provided are thus more robust to the local behavior of features of an audio signal or audio recording. The audio event detection is performed by using Gaussian mixture models (GMMs) to classify each cluster or by extracting an i-vector from each cluster. Each cluster may be classified based on an i-vector classification using a support vector machine or probabilistic linear discriminant analysis. The audio event detection significantly reduces potential smoothing error and avoids any dependency on accurate window-size tuning. Segmentation may be performed using a generalized likelihood ratio and a Bayesian information criterion, and the segments may be clustered using hierarchical agglomerative clustering. Audio frames may be clustered using K-means and GMMs.

Type: Grant

Filed: November 26, 2018

Date of Patent: December 15, 2020

Assignee: Pindrop Security, Inc.

Inventors: Elie Khoury, Matthew Garland
Speech signal cascade processing method, terminal, and computer-readable storage medium

Patent number: 10832696

Abstract: A method for improving speech signal intelligibility is performed at a device. A speech signal is obtained. A correspondence between the speech signal and a respective user group among different user groups having distinct voice characteristics is identified. Pre-encoding signal augmentation is performed on the speech signal with a respective pre-augmentation filtering coefficient that corresponds to the respective user group to obtain a group-specific pre-augmented speech signal. The device encodes the pre-augmented speech signal for subsequent transmission through the voice communication channel. An encoded version of the pre-augmented speech signal has reduced loss of signal quality as compared to an encoded version of the speech signal that is obtained without the pre-encoding signal augmentation.

Type: Grant

Filed: June 6, 2018

Date of Patent: November 10, 2020

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventor: Junbin Liang
Modeling point cloud data using hierarchies of Gaussian mixture models

Patent number: 10482196

Abstract: A method, computer readable medium, and system are disclosed for generating a Gaussian mixture model hierarchy. The method includes the steps of receiving point cloud data defining a plurality of points; defining a Gaussian Mixture Model (GMM) hierarchy that includes a number of mixels, each mixel encoding parameters for a probabilistic occupancy map; and adjusting the parameters for one or more probabilistic occupancy maps based on the point cloud data utilizing a number of iterations of an Expectation-Maximization (EM) algorithm.

Type: Grant

Filed: February 26, 2016

Date of Patent: November 19, 2019

Assignee: NVIDIA Corporation

Inventors: Benjamin David Eckart, Kihwan Kim, Alejandro Jose Troccoli, Jan Kautz
Multi-talker speech recognizer

Patent number: 10460727

Abstract: Various systems and methods for multi-talker speech separation and recognition are disclosed herein. In one example, a system includes a memory and a processor to process mixed speech audio received from a microphone. In an example, the processor can also separate the mixed speech audio using permutation invariant training, wherein a criterion of the permutation invariant training is defined on an utterance of the mixed speech audio. In an example, the processor can also generate a plurality of separated streams for submission to a speech decoder.
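
Permutation invariant training with an utterance-level criterion can be sketched as follows. This is an illustrative reading, not the patented system: the mean-squared-error measure and the name `utterance_pit_loss` are assumptions, and each stream is represented as a plain list of samples covering the whole utterance.

```python
from itertools import permutations

def utterance_pit_loss(estimates, references):
    """Return (loss, assignment) for the best speaker permutation over a whole utterance.

    assignment[i] is the index of the reference matched to estimates[i]. One
    permutation is chosen for the entire utterance rather than per frame.
    """
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

    best_loss, best_perm = None, None
    for perm in permutations(range(len(references))):
        loss = sum(mse(estimates[i], references[j]) for i, j in enumerate(perm))
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

Choosing a single assignment per utterance keeps each output stream tied to one talker from start to end, which is what lets the separated streams be passed on to a decoder without a later speaker-tracing step.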

Type: Grant

Filed: May 23, 2017

Date of Patent: October 29, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: James Droppo, Xuedong Huang, Dong Yu
Assessing the structural quality of conversations

Patent number: 10424319

Abstract: Input of a conversation is received. The conversation includes at least a first user. An utterance of the conversation is analyzed to identify a dialog act attribute, an emotion attribute, and a tone attribute. The dialog act attribute, emotion attribute, and tone attribute are annotated to the utterance of the conversation. The conversation is validated based on the annotated attributes compared with a threshold. The annotated conversation and the validation of the conversation are stored.

Type: Grant

Filed: September 26, 2017

Date of Patent: September 24, 2019

Assignee: International Business Machines Corporation

Inventors: Rama Kalyani T. Akkiraju, Jalal Mahmud, Vibha S. Sinha, Anbang Xu, Pritam S. Gundecha, Mansurul A. Bhuiyan, Shereen M. Oraby
Audio decoder, method and computer program using a zero-input-response to obtain a smooth transition

Patent number: 10325611

Abstract: An audio decoder for providing a decoded audio information on the basis of an encoded audio information includes a linear-prediction-domain decoder configured to provide a first decoded audio information on the basis of an audio frame encoded in a linear prediction domain, a frequency domain decoder configured to provide a second decoded audio information on the basis of an audio frame encoded in a frequency domain, and a transition processor. The transition processor is configured to obtain a zero-input-response of a linear predictive filtering, wherein an initial state of the linear predictive filtering is defined depending on the first decoded audio information and the second decoded audio information, and modify the second decoded audio information depending on the zero-input-response, to obtain a smooth transition between the first and the modified second decoded audio information.
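
The zero-input response itself can be computed with a short recursion. The sketch below assumes an all-pole synthesis filter 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p and a state made of the last p output samples; how the patent derives that initial state from the two decoded signals, and how it modifies the second signal, is not reproduced here. The function name is hypothetical.

```python
def zero_input_response(lpc, state, length):
    """Output of the synthesis filter 1/A(z) when the input is zero.

    lpc    : [a_1, ..., a_p], the coefficients of A(z) after the leading 1
    state  : at least p past output samples, most recent last
    length : number of response samples to generate
    """
    history = list(state)
    out = []
    for _ in range(length):
        # With zero input, y[n] = -(a_1 * y[n-1] + ... + a_p * y[n-p]).
        y = -sum(a * history[-k] for k, a in enumerate(lpc, start=1))
        out.append(y)
        history.append(y)
    return out
```

For a stable filter the response decays toward zero, so it describes how the filter "rings out" from a given state once the input stops.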

Type: Grant

Filed: January 26, 2017

Date of Patent: June 18, 2019

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Emmanuel Ravelli, Guillaume Fuchs, Sascha Disch, Markus Multrus, Grzegorz Pietrzyk, Benjamin Schubert
Assessing the structural quality of conversations

Patent number: 10311895

Abstract: Input of a conversation is received. The conversation includes at least a first user. An utterance of the conversation is analyzed to identify a dialog act attribute, an emotion attribute, and a tone attribute. The dialog act attribute, emotion attribute, and tone attribute are annotated to the utterance of the conversation. The conversation is validated based on the annotated attributes compared with a threshold. The annotated conversation and the validation of the conversation are stored.

Type: Grant

Filed: June 5, 2018

Date of Patent: June 4, 2019

Assignee: International Business Machines Corporation

Inventors: Rama Kalyani T. Akkiraju, Jalal Mahmud, Vibha S. Sinha, Anbang Xu, Pritam S. Gundecha, MD Mansurul A. Bhuiyan, Shereen M. Oraby
Assessing the structural quality of conversations

Patent number: 10297273

Abstract: Input of a conversation is received. The conversation includes at least a first user. An utterance of the conversation is analyzed to identify a dialog act attribute, an emotion attribute, and a tone attribute. The dialog act attribute, emotion attribute, and tone attribute are annotated to the utterance of the conversation. The conversation is validated based on the annotated attributes compared with a threshold. The annotated conversation and the validation of the conversation are stored.

Type: Grant

Filed: June 5, 2018

Date of Patent: May 21, 2019

Assignee: International Business Machines Corporation

Inventors: Rama Kalyani T. Akkiraju, Jalal Mahmud, Vibha S. Sinha, Anbang Xu, Pritam S. Gundecha, Md Mansurul A. Bhuiyan, Shereen M. Oraby
Voice synthesizer, voice synthesis method, and computer program product

Patent number: 10217454

Abstract: According to an embodiment, a voice synthesizer includes a content selection unit, a content generation unit, and a content registration unit. The content selection unit determines selected content among a plurality of pieces of content registered in a content storage unit. The content includes tagged text in which tag information for controlling voice synthesis is added to text serving as a target of the voice synthesis. The content generation unit applies the tag information in the tagged text included in the selected content to designated text to generate new content. The content registration unit registers the generated new content in the content storage unit.

Type: Grant

Filed: September 15, 2016

Date of Patent: February 26, 2019

Assignees: KABUSHIKI KAISHA TOSHIBA, TOSHIBA SOLUTIONS CORPORATION

Inventors: Kaoru Hirano, Masaru Suzuki, Hiroyuki Mizutani
System and method to correct for packet loss in automatic speech recognition systems utilizing linear interpolation

Patent number: 10157620

Abstract: A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
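
The linear interpolation named in the title can be sketched as below. This is a simplified assumption: each frame is reduced to a single scalar feature, lost frames are marked with `None`, and the score normalization and model-based estimation described in the abstract are not modeled. The function name is hypothetical.

```python
def interpolate_lost(frames):
    """Fill None entries by linear interpolation between the nearest received frames."""
    out = list(frames)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1  # i stops at the first received frame after the gap (or at n)
        left = out[start - 1] if start > 0 else None
        right = out[i] if i < n else None
        gap = i - start + 1
        for k in range(start, i):
            if left is not None and right is not None:
                t = (k - start + 1) / gap
                out[k] = left + t * (right - left)
            else:
                # Gap touches the start or end of the signal: hold the only known neighbor.
                out[k] = left if left is not None else right
    return out
```

Because the repair happens on the incoming frames, the acoustic models used for recognition stay untouched, in line with the abstract's point that behavior without packet loss is not altered.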

Type: Grant

Filed: March 4, 2015

Date of Patent: December 18, 2018

Inventors: Srinath Cheluvaraja, Ananth Nagaraja Iyer, Aravind Ganapathiraju, Felix Immanuel Wyss
Customized voice action system

Patent number: 10147422

Abstract: Systems, methods, and computer-readable media that may be used to modify a voice action system to include voice actions provided by advertisers or users are provided. One method includes receiving electronic voice action bids from advertisers to modify the voice action system to include a specific voice action (e.g., a triggering phrase and an action). One or more bids may be selected. The method includes, for each of the selected bids, modifying data associated with the voice action system to include the voice action associated with the bid, such that the action associated with the respective voice action is performed when voice input from a user is received that the voice action system determines to correspond to the triggering phrase associated with the respective voice action.

Type: Grant

Filed: February 26, 2016

Date of Patent: December 4, 2018

Assignee: GOOGLE LLC

Inventor: Pedro J. Moreno Mengibar
Systems and methods of address book management

Patent number: 10033836

Abstract: A server comprising a processor circuit and a database may receive address book data comprising information associated with at least one contact from a communication device via a network. The processor circuit may identify information associated with the at least one contact in the database and/or from public data. The processor circuit may add the identified information to the address book data. The processor circuit may store the address book data with the added information in the database and send the added information with or without the address book data to the communication device via the network.

Type: Grant

Filed: October 11, 2017

Date of Patent: July 24, 2018

Assignee: FUZE, INC.

Inventors: Alberto Lopez Toledo, Julio Andres Viera Sotillo, Inaki Berenguer, Joaquim Castellà Vilaseca
Speaker dependent voiced sound pattern template mapping

Patent number: 9953633

Abstract: Various implementations disclosed herein include a training module configured to produce a set of segment templates from a concurrent segmentation of a plurality of vocalization instances of a VSP vocalized by a particular speaker, who is identifiable by a corresponding set of vocal characteristics. Each segment template provides a stochastic characterization of how each of one or more portions of a VSP is vocalized by the particular speaker in accordance with the corresponding set of vocal characteristics. Additionally, in various implementations, the training module includes systems, methods and/or devices configured to produce a set of VSP segment maps that each provide a quantitative characterization of how respective segments of the plurality of vocalization instances vary in relation to a corresponding one of a set of segment templates.

Type: Grant

Filed: July 23, 2015

Date of Patent: April 24, 2018

Assignee: MALASPINA LABS (BARBADOS), INC.

Inventors: Clarence Chu, Alireza Kenarsari Anhari
Method for determining alcohol consumption, and recording medium and terminal for carrying out same

Patent number: 9934793

Abstract: Disclosed are a method for determining whether a person is drunk, capable of analyzing alcohol consumption in a time domain by analyzing a voice, and a recording medium and a terminal for carrying out the same.

Type: Grant

Filed: January 24, 2014

Date of Patent: April 3, 2018

Assignee: FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION

Inventors: Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
Method for determining alcohol consumption, and recording medium and terminal for carrying out same

Patent number: 9916844

Abstract: Disclosed are a method for determining whether a person is drunk after consuming alcohol on the basis of a difference among a plurality of formant energies, which are generated by applying linear predictive coding according to a plurality of linear prediction orders, and a recording medium and a terminal for carrying out the method.

Type: Grant

Filed: January 28, 2014

Date of Patent: March 13, 2018

Assignee: FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION

Inventors: Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
Method for determining alcohol consumption, and recording medium and terminal for carrying out same

Patent number: 9899039

Abstract: Disclosed is a method for determining alcohol consumption capable of analyzing alcohol consumption in a time domain by analyzing a formant slope of a voice signal, and a recording medium and a terminal for carrying out same. A terminal for determining whether a person is drunk comprises: a voice input unit for generating a voice frame by receiving a voice signal; a voiced/unvoiced sound analysis unit for determining whether a received voice frame corresponds to a voiced sound; a formant frequency extraction unit for extracting a plurality of formant frequencies of the voice frame corresponding to the voiced sound; and an alcohol consumption determining unit for calculating a formant slope between the plurality of formant frequencies, and determining the state of alcohol consumption depending on the formant slope, thereby determining whether a person is drunk by analyzing the formant slope of an inputted voice.

Type: Grant

Filed: January 24, 2014

Date of Patent: February 20, 2018

Assignee: FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION

Inventors: Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
Device selection for providing a response

Patent number: 9875081

Abstract: A system may use multiple speech interface devices to interact with a user by speech. All or a portion of the speech interface devices may detect a user utterance and may initiate speech processing to determine a meaning or intent of the utterance. Within the speech processing, arbitration is employed to select one of the multiple speech interface devices to respond to the user utterance. Arbitration may be based in part on metadata that directly or indirectly indicates the proximity of the user to the devices, and the device that is deemed to be nearest the user may be selected to respond to the user utterance.
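
The arbitration step can be sketched as a selection over per-device reports. The metadata fields used here (`signal_energy` as a proximity proxy, `detected_at` as a tie-breaker) and the function name are hypothetical; the abstract only says that the metadata directly or indirectly indicates proximity.

```python
def arbitrate(reports):
    """Pick the device whose metadata suggests it is nearest to the user.

    Assumption: higher signal energy means the user is closer; ties go to
    the device that detected the utterance first.
    """
    best = min(reports, key=lambda r: (-r["signal_energy"], r["detected_at"]))
    return best["device_id"]
```

Each report would describe one device's detection of the same utterance, and only the selected device goes on to respond.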

Type: Grant

Filed: September 21, 2015

Date of Patent: January 23, 2018

Assignee: Amazon Technologies, Inc.

Inventors: James David Meyers, Shah Samir Pravinchandra, Yue Liu, Arlen Dean, Daniel Miller, Arindam Mandal
Removing recurring environmental sounds

Patent number: 9799329

Abstract: This disclosure describes, in part, techniques and devices for identifying recurring environmental sounds in an environment such that these sounds may be canceled out of corresponding audio signals to increase signal-to-noise ratios (SNRs) of the signals and, hence, improve automatic speech recognition (ASR) on the signals. Recurring environmental sounds may include the ringing of a mobile phone, the beeping sound of a microphone, the buzzing of a washing machine, or the like.

Type: Grant

Filed: December 3, 2014

Date of Patent: October 24, 2017

Assignee: Amazon Technologies, Inc.

Inventors: Michael Alan Pogue, Kurt Wesley Piersol
Switching between predictive and non-predictive quantization techniques in a higher order ambisonics (HOA) framework

Patent number: 9747910

Abstract: A device comprising a memory and one or more processors may be configured to extract, from the bitstream, a type of quantization mode. The one or more processors may also be configured to switch, based on the type of quantization mode, between non-predictive vector dequantization to reconstruct a first set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain, and predictive vector dequantization to reconstruct a second set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain. The memory may be configured to store the reconstructed first set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain, and the reconstructed second set of one or more weights used to approximate the multi-directional V-Vector in the higher order ambisonics domain.

Type: Grant

Filed: September 18, 2015

Date of Patent: August 29, 2017

Assignee: QUALCOMM Incorporated

Inventors: Moo Young Kim, Nils Günther Peters
Method and apparatus to encode and decode an audio/speech signal

Patent number: 9728196

Abstract: A method and apparatus to encode and decode an audio/speech signal is provided. An inputted audio signal or speech signal may be transformed into at least one of a high frequency resolution signal and a high temporal resolution signal. The signal may be encoded by determining an appropriate resolution, the encoded signal may be decoded, and thus the audio signal, the speech signal, and a mixed signal of the audio signal and the speech signal may be processed.

Type: Grant

Filed: May 9, 2016

Date of Patent: August 8, 2017

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Eun Mi Oh, Jung Hoe Kim, Ki-hyun Choo, Ho Sang Sung, Mi Young Kim
Method and apparatus for encoding video, method and apparatus for decoding video, and programs therefor

Patent number: 9667963

Abstract: The prediction error energy in inter-frame prediction with motion compensation is reduced and the coding efficiency is improved.

Type: Grant

Filed: June 22, 2012

Date of Patent: May 30, 2017

Assignee: Nippon Telegraph And Telephone Corporation

Inventors: Shohei Matsuo, Yukihiro Bandoh, Seishi Takamura, Hirohisa Jozawa
Pattern recognition device, pattern recognition method, and computer program product

Patent number: 9595261

Abstract: According to an embodiment, a pattern recognition device includes a signal processor, a first recognizer, a detector, and a second recognizer. The signal processor is configured to calculate a feature of a time-series signal for each frame. The first recognizer is configured to recognize which of a leaf class and a single class of a first class group the time-series signal belongs to for each frame based on the feature and output a recognition result. The detector is configured to detect a segment including a first target class on the basis of a sum of probabilities of the leaf classes which the frame belongs to on the basis of the recognition results for each frame. The second recognizer is configured to recognize which of second target classes the segment belongs to on the basis of the recognition results for the frames within the segment.

Type: Grant

Filed: March 11, 2015

Date of Patent: March 14, 2017

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventor: Hiroshi Fujimura
Processing visible coding sequence, playing visible coding sequence

Patent number: 9582702

Abstract: Embodiments of the present invention generally relate to data processing, and further the embodiments of the invention relate to a method of processing a visible coding sequence and a system thereof, a method of playing a visible coding sequence and a system thereof. The present invention creatively proposes a scheme of determining sampling rate with synchronized frames to realize effective processing of a visible coding sequence. The scheme of processing a visible coding sequence according to the present invention is helpful for visible coding synchronization on the capturing side, enabling the capturing side to determine appropriate sampling rate and sampling timing, and thus effectively acquire the visible coding sequence, which may not only reduce resource waste, but also acquire a complete visible coding sequence.

Type: Grant

Filed: August 31, 2016

Date of Patent: February 28, 2017

Assignee: International Business Machines Corporation

Inventors: Jiexin Jiao, Mengxiang Lin, Song Song, XiaoFeng Wang
Contextual linking module with interactive intelligent agent for managing communications with contacts and navigation features

Patent number: 9531862

Abstract: A system to optimize a user's messaging by having a mechanism to recommend that a user utilizes an alternative communication channel. The invention relates to mobile messaging applications and to analyzing message content and providing feedback to the user in the form of a graphical or spoken output containing an offer of an alternative communication mode, wherein processing content of the user input comprises analyzing message content to collect parameters relating to message priority, channel type, channel availability, user schedule, user time zone, relationship of user to recipient calculated using a familiarity index, type of content, and number of recipients.

Type: Grant

Filed: September 4, 2015

Date of Patent: December 27, 2016

Inventor: Vishal Vadodaria
Dialed digits based vocoder assignment

Patent number: 9420081

Abstract: A system and method for providing voice communications with desired characteristics based upon the intended recipient of a voice communication. An apparatus includes a list of dial strings associated with parties having desired voice communication characteristics. A dial string entered by a user and associated with an intended recipient is compared to a list of preferred dial strings to determine the characteristics of an encoded voice signal to be sent to the recipient. The apparatus can include a vocoder having different bit rate modes and a bit rate mode is selected based upon the dial string entered by a user. Dial strings can be stored at the device or on a network. The apparatus can include a mode selector to select a desired vocoder mode to generate an encoded voice signal.

Type: Grant

Filed: March 18, 2014

Date of Patent: August 16, 2016

Assignee: AT&T Mobility II LLC

Inventors: Jun Shen, Jack Denenberg, Alan MacDonald
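
The selection step in the abstract above (compare the dialed string to a list of preferred dial strings, then pick a vocoder bit rate mode) can be sketched as a table lookup. This is a minimal illustration, not the patented implementation: the table contents, the bit rates, and the names `PREFERRED_DIAL_STRINGS`, `normalize`, and `select_vocoder_mode` are all hypothetical.

```python
from typing import Dict

# Hypothetical table: dial string -> preferred vocoder bit-rate mode (kbit/s).
# The numbers and rates are illustrative only, not taken from the patent.
PREFERRED_DIAL_STRINGS: Dict[str, float] = {
    "911": 12.2,
    "18005550100": 12.2,
}
DEFAULT_MODE_KBPS = 7.4  # assumed fallback mode


def normalize(dial_string: str) -> str:
    """Keep only the digits so '1-800-555-0100' matches '18005550100'."""
    return "".join(ch for ch in dial_string if ch.isdigit())


def select_vocoder_mode(dial_string: str,
                        preferred: Dict[str, float] = PREFERRED_DIAL_STRINGS,
                        default: float = DEFAULT_MODE_KBPS) -> float:
    """Return the bit-rate mode for the dialed number, or the default mode."""
    return preferred.get(normalize(dial_string), default)
```

The table is passed as a parameter because the abstract says dial strings may be stored at the device or on a network; a caller could supply a list fetched from either place.
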
Method and apparatus for encoding and decoding high frequency for bandwidth extension

Patent number: 9378746

Abstract: Disclosed are a method and apparatus for encoding and decoding a high frequency for bandwidth extension. The method includes: estimating a weight; and generating a high frequency excitation signal by applying the weight between random noise and a decoded low frequency spectrum.

Type: Grant

Filed: March 21, 2013

Date of Patent: June 28, 2016

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventor: Ki-hyun Choo
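
The abstract above says only that a weight is applied "between" random noise and the decoded low frequency spectrum. One plausible reading is a linear blend, which the sketch below assumes; the mixing rule, the Gaussian noise source, and the name `high_band_excitation` are assumptions made for illustration, not details from the patent.

```python
import random
from typing import List, Sequence


def high_band_excitation(low_band: Sequence[float], weight: float,
                         rng: random.Random) -> List[float]:
    """Blend random noise with the decoded low-band spectrum.

    Assumed mixing rule: weight * noise + (1 - weight) * low_band,
    so weight = 0 copies the low band and weight = 1 is pure noise.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * rng.gauss(0.0, 1.0) + (1.0 - weight) * coeff
            for coeff in low_band]
```

Passing the random generator explicitly keeps the function deterministic under a fixed seed, which makes the blend easy to test.
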
Method and system for bias corrected speech level determination

Patent number: 9373341

Abstract: Method for measuring level of speech determined by an audio signal in a manner which corrects for and reduces the effect of modification of the signal by the addition of noise thereto and/or amplitude compression thereof, and a system configured to perform any embodiment of the method. In some embodiments, the method includes steps of generating frequency banded, frequency-domain data indicative of an input speech signal, determining from the data a Gaussian parametric spectral model of the speech signal, and determining from the parametric spectral model an estimated mean speech level and a standard deviation value for each frequency band of the data; and generating speech level data indicative of a bias corrected mean speech level for each frequency band, including using at least one correction value to correct the estimated mean speech level for the frequency band, where each correction value has been predetermined using a reference speech model.

Type: Grant

Filed: March 21, 2013

Date of Patent: June 21, 2016

Assignee: Dolby Laboratories Licensing Corporation

Inventors: David Gunawan, Glenn Dickins
Client-server architecture for automatic speech recognition applications

Patent number: 9275639

Abstract: A client-server architecture for Automatic Speech Recognition (ASR) applications includes: (a) a client-side including: a client being part of distributed front end for converting acoustic waves to feature vectors; VAD for separating between speech and non-speech acoustic signals; adaptor for WebSockets; and (b) a server side including: a web layer utilizing HTTP protocols and including a Web Server having a Servlet Container; an intermediate layer for transport based on Message-Oriented Middleware being a message broker; a recognition server and an adaptation server both connected to said intermediate layer; a Speech processing server; a Recognition Server for instantiation of a recognition channel per client; an Adaptation Server for adaptation of acoustic and linguistic models for each speaker; a Bidirectional communication channel between a Speech processing server and client side; and a Persistent layer for storing a Language Knowledge Base connected to said Speech processing server.

Type: Grant

Filed: March 31, 2013

Date of Patent: March 1, 2016

Assignee: Dixilang Ltd.

Inventor: Victor Shagalov
Systems and methods for controlling an average encoding rate for speech signal encoding

Patent number: 9263054

Abstract: A method for controlling an average encoding rate by an electronic device is described. The method includes obtaining a speech signal. The method also includes determining a first average rate. The method further includes determining a first threshold based on the first average rate. The method additionally includes controlling the average encoding rate by determining at least one other threshold based on the first threshold. The method also includes sending an encoded speech signal.

Type: Grant

Filed: August 30, 2013

Date of Patent: February 16, 2016

Assignee: QUALCOMM Incorporated

Inventors: Subasingha Shaminda Subasingha, Vivek Rajendran, Venkatesh Krishnan, Venkatraman Srinivasa Atti
Multiple coding mode signal classification

Patent number: 9111531

Abstract: Improved audio classification is provided for encoding applications. An initial classification is performed, followed by a finer classification, to produce speech classifications and music classifications with higher accuracy and less complexity than previously available. Audio is classified as speech or music on a frame by frame basis. If the frame is classified as music by the initial classification, that frame undergoes a second, finer classification to confirm that the frame is music and not speech (e.g., speech that is tonal and/or structured that may not have been classified as speech by the initial classification). Depending on the implementation, one or more parameters may be used in the finer classification. Example parameters include voicing, modified correlation, signal activity, and long term pitch gain.

Type: Grant

Filed: December 20, 2012

Date of Patent: August 18, 2015

Assignee: QUALCOMM Incorporated

Inventors: Venkatraman Srinivasa Atti, Ethan Robert Duni
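
The two-stage flow in the abstract above (an initial classification, then a finer check applied only to frames first labelled music) can be sketched as follows. The feature set, the thresholds, and the decision rules here are hypothetical placeholders; only the control flow follows the abstract.

```python
from dataclasses import dataclass


@dataclass
class FrameFeatures:
    # Hypothetical per-frame features, each assumed to be scaled to [0, 1].
    tonality: float
    voicing: float
    long_term_pitch_gain: float


def initial_classification(f: FrameFeatures) -> str:
    """Coarse first pass (assumed rule): tonal frames are labelled music."""
    return "music" if f.tonality > 0.5 else "speech"


def fine_classification(f: FrameFeatures) -> str:
    """Second pass (assumed rule): strongly voiced, pitched frames are speech."""
    if f.voicing > 0.7 and f.long_term_pitch_gain > 0.6:
        return "speech"
    return "music"


def classify_frame(f: FrameFeatures) -> str:
    """Run the finer classifier only when the first pass says music."""
    label = initial_classification(f)
    if label == "music":
        label = fine_classification(f)
    return label
```

With these placeholder thresholds, a tonal but strongly voiced frame such as `FrameFeatures(0.9, 0.9, 0.8)` is first labelled music and then corrected to speech, which is the case the abstract singles out.
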
Faster minimum error rate training for weighted linear models

Patent number: 9098812

Abstract: The claimed subject matter provides systems and/or methods for training feature weights in a statistical machine translation model. The system can include components that obtain lists of translation hypotheses and associated feature values, set a current point in the multidimensional feature weight space to an initial value, choose a line in the feature weight space that passes through the current point, and reset the current point to optimize the feature weights with respect to the line. The system can further include components that set the current point to be a best point attained, reduce the list of translation hypotheses based on a determination that a particular hypothesis has never been touched in optimizing the feature weights from at least one of an initial starting point or a randomly selected restarting point, and output the point ascertained to be the best point in the feature weight space.

Type: Grant

Filed: April 14, 2009

Date of Patent: August 4, 2015

Assignee: Microsoft Technology Licensing, LLC

Inventors: Robert Carter Moore, Christopher Brian Quirk
Limiting notification interruptions

Patent number: 9037455

Abstract: Techniques for a computing device operating in limited-access states are provided. One example method includes determining, by a computing device, that a notification is scheduled for output by the computing device during a first time period and that a pattern of audio detected during the first time period is indicative of human speech. The method further includes delaying output of the notification during the first time period and determining that a pattern of audio detected during a second time period is not indicative of human speech. The method also includes outputting at least a portion of the notification at an earlier in time of an end of the second time period or an expiration of a third time period.

Type: Grant

Filed: January 8, 2014

Date of Patent: May 19, 2015

Assignee: Google Inc.

Inventors: Alexander Faaborg, Tristan Harris, Austin Robison
Time warp activation signal provider, audio signal encoder, method for providing a time warp activation signal, method for encoding an audio signal and computer programs

Patent number: 9015041

Abstract: An audio encoder has a window function controller, a windower, a time warper with a final quality check functionality, a time/frequency converter, a TNS stage or a quantizer encoder, the window function controller, the time warper, the TNS stage or an additional noise filling analyzer are controlled by signal analysis results obtained by a time warp analyzer or a signal classifier. Furthermore, a decoder applies a noise filling operation using a manipulated noise filling estimate depending on a harmonic or speech characteristic of the audio signal.

Type: Grant

Filed: January 11, 2011

Date of Patent: April 21, 2015

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Stefan Bayer, Sascha Disch, Ralf Geiger, Guillaume Fuchs, Max Neuendorf, Gerald Schuller, Bernd Edler
Noise reduction using multi-feature cluster tracker

Patent number: 9008329

Abstract: Provided are methods and systems for noise suppression within multiple time-frequency points of spectral representations. A multi-feature cluster tracker is used to track signal and noise sources and to predict signal versus noise dominance at each time-frequency point. Multiple features, such as binaural and monaural features, may be used for these purposes. A Gaussian mixture model (GMM) is developed and, in some embodiments, dynamically updated for distinguishing signal from noise and performing mask-based noise reduction. Each frequency band may use a different GMM or share a GMM with other frequency bands. A GMM may be combined from two models, with one trained to model time-frequency points in which the target dominates and another trained to model time-frequency points in which the noise dominates. Dynamic updates of a GMM may be performed using an expectation-maximization algorithm in an unsupervised fashion.

Type: Grant

Filed: June 8, 2012

Date of Patent: April 14, 2015

Assignee: Audience, Inc.

Inventors: Michael Mandel, Carlos Avendano
Device and method for a bandwidth extension of an audio signal

Patent number: 8996362

Abstract: For a bandwidth extension of an audio signal, in a signal spreader the audio signal is temporally spread by a spread factor greater than 1. The temporally spread audio signal is then supplied to a decimator to decimate the temporally spread version by a decimation factor matched to the spread factor. The band generated by this decimation operation is extracted and distorted, and finally combined with the audio signal to obtain a bandwidth extended audio signal. A phase vocoder in the filterbank implementation or transformation implementation may be used for signal spreading.

Type: Grant

Filed: January 20, 2009

Date of Patent: March 31, 2015

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Frederik Nagel, Sascha Disch, Max Neuendorf
Noise-robust speech coding mode classification

Patent number: 8990074

Abstract: A method of noise-robust speech classification is disclosed. Classification parameters are input to a speech classifier from external components. Internal classification parameters are generated in the speech classifier from at least one of the input parameters. A Normalized Auto-correlation Coefficient Function threshold is set. A parameter analyzer is selected according to a signal environment. A speech mode classification is determined based on a noise estimate of multiple frames of input speech.

Type: Grant

Filed: April 10, 2012

Date of Patent: March 24, 2015

Assignee: QUALCOMM Incorporated

Inventors: Ethan Robert Duni, Vivek Rajendran
Adaptive codec selection

Patent number: 8982942

Abstract: Disclosed herein are tools and techniques for storing and using video processing tool configuration information that can identify combinations of video processing tools to be used for processing video. In one exemplary embodiment, video processing tools of a computing system are identified. The performance of a combination of the video processing tools is measured. The performance measurement is compared with another performance measurement of another combination of the video processing tools. Based on the comparison, video processing tool configuration information is set. In another exemplary embodiment, video processing tool configuration information indicating a combination of video processing tools is accessed, and video data is processed using the combination of video processing tools based on the video processing tool configuration information.

Type: Grant

Filed: June 17, 2011

Date of Patent: March 17, 2015

Assignee: Microsoft Technology Licensing, LLC

Inventors: Wenfeng Gao, Shyam Sadhwani
Apparatus for quantizing linear predictive coding coefficients, sound encoding apparatus, apparatus for de-quantizing linear predictive coding coefficients, sound decoding apparatus, and electronic device therefore

Patent number: 8977543

Abstract: A quantizing apparatus is provided that includes a quantization path determiner that determines a path from a first path not using inter-frame prediction and a second path using the inter-frame prediction, as a quantization path of an input signal, based on a criterion before quantization of the input signal; a first quantizer that quantizes the input signal, if the first path is determined as the quantization path of the input signal; and a second quantizer that quantizes the input signal, if the second path is determined as the quantization path of the input signal.

Type: Grant

Filed: April 23, 2012

Date of Patent: March 10, 2015

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ho-sang Sung, Eun-mi Oh
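
The abstract above describes choosing, before quantization, between a path without inter-frame prediction and a path with it. The sketch below shows one way such a decision could look. The criterion (prediction-error energy against a threshold), the first-order predictor with coefficient `rho`, and the uniform scalar quantizer are assumptions chosen for brevity, not the patent's actual criterion or quantizers.

```python
from typing import List, Sequence, Tuple


def prediction_error(current: Sequence[float], previous: Sequence[float],
                     rho: float) -> float:
    """Energy of the residual left after predicting current from previous."""
    return sum((c - rho * p) ** 2 for c, p in zip(current, previous))


def choose_path(current: Sequence[float], previous: Sequence[float],
                threshold: float, rho: float = 0.5) -> str:
    """Decide the path before any quantization is done (assumed criterion)."""
    if prediction_error(current, previous, rho) < threshold:
        return "predictive"
    return "non_predictive"


def quantize(values: Sequence[float], step: float) -> List[int]:
    """Uniform scalar quantizer, standing in for the real quantizers."""
    return [round(v / step) for v in values]


def encode_frame(current: Sequence[float], previous: Sequence[float],
                 threshold: float, step: float = 0.5,
                 rho: float = 0.5) -> Tuple[str, List[int]]:
    """Quantize the residual on the predictive path, else the raw frame."""
    path = choose_path(current, previous, threshold, rho)
    if path == "predictive":
        target = [c - rho * p for c, p in zip(current, previous)]
    else:
        target = list(current)
    return path, quantize(target, step)
```

In a real codec `previous` would be the previously decoded frame, so that encoder and decoder predict from the same values; the sketch leaves that bookkeeping to the caller.
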
Method of quantizing linear predictive coding coefficients, sound encoding method, method of de-quantizing linear predictive coding coefficients, sound decoding method, and recording medium and electronic device therefor

Patent number: 8977544

Abstract: A quantizing method is provided that includes quantizing an input signal by selecting one of a first quantization scheme not using an inter-frame prediction and a second quantization scheme using the inter-frame prediction, in consideration of one or more of a prediction mode, a predictive error and a transmission channel state.

Type: Grant

Filed: April 23, 2012

Date of Patent: March 10, 2015

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ho-sang Sung, Eun-mi Oh
Coding with noise shaping in a hierarchical coder

Patent number: 8965773

Abstract: A method is provided for hierarchical coding of a digital audio signal comprising, for a current frame of the input signal: a core coding, delivering a scalar quantization index for each sample of the current frame and at least one enhancement coding delivering indices of scalar quantization for each coded sample of an enhancement signal. The enhancement coding comprises a step of obtaining a filter for shaping the coding noise used to determine a target signal and in that the indices of scalar quantization of said enhancement signal are determined by minimizing the error between a set of possible values of scalar quantization and said target signal. The coding method can also comprise a shaping of the coding noise for the core bitrate coding. A coder implementing the coding method is also provided.

Type: Grant

Filed: November 17, 2009

Date of Patent: February 24, 2015

Assignee: Orange

Inventors: Balazs Kovesi, Stéphane Ragot, Alain Le Guyader
