Patents Examined by Matthew Baker
  • Patent number: 10056090
    Abstract: The present disclosure provides a speech/audio signal processing method based on wideband switching and a coding apparatus. The method includes: if a first wideband speech/audio signal is a harmonic signal, adjusting a determining condition for determining that a second wideband speech/audio signal is a harmonic signal, to obtain a first determining condition, where the first wideband speech/audio signal is a signal before wideband switching, and the second wideband speech/audio signal is a signal after the wideband switching; and determining, according to the first determining condition, whether the second wideband speech/audio signal is a harmonic signal. In the case of wideband switching, signal types of speech/audio signals remain as consistent as possible before and after the switching, so that continuity of the speech/audio signal decoded by a decoder device is ensured as much as possible, further improving speech communication service quality.
    Type: Grant
    Filed: December 5, 2014
    Date of Patent: August 21, 2018
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Chen Hu, Zexin Liu, Lei Miao
  • Patent number: 10032465
    Abstract: Systems and methods are disclosed for displaying electronic multimedia content to a user. One computer-implemented method for manipulating electronic multimedia content includes generating, using a processor, a speech model and at least one speaker model of an individual speaker. The method further includes receiving electronic media content over a network; extracting an audio track from the electronic media content; and detecting speech segments within the electronic media content based on the speech model. The method further includes detecting a speaker segment within the electronic media content and calculating a probability of the detected speaker segment involving the individual speaker based on the at least one speaker model.
    Type: Grant
    Filed: March 1, 2016
    Date of Patent: July 24, 2018
    Assignee: OATH INC.
    Inventors: Peter F. Kocks, Guoning Hu, Ping-Hao Wu
  • Patent number: 10007661
    Abstract: Techniques are provided for performing automated operations to analyze and prioritize incoming user messages. An indication of a message sent to a recipient user is received. Based at least in part on configuration information associated with the recipient user, the received message is analyzed. Analyzing the received message includes at least one of determining sentiments associated with the received message, determining intentions associated with the received message, determining document classes associated with the received message, and generating summary information corresponding to the received message. Based at least in part on the analyzing of the received message, a prioritized listing of multiple messages associated with the recipient user, including the received message, is displayed to the recipient user.
    Type: Grant
    Filed: September 26, 2016
    Date of Patent: June 26, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gregg M. Arquero, Eli M. Dow, Syed F. Hossain, Joshua Schaeffer, Yunli Tang
  • Patent number: 10008208
    Abstract: Embodiments of the present invention perform speaker identification and verification by first prompting a user to speak a phrase that includes a common phrase component and a personal identifier. Then, the embodiments decompose the spoken phrase to locate the personal identifier. Finally, the embodiments identify and verify the user based on the results of the decomposing.
    Type: Grant
    Filed: September 18, 2014
    Date of Patent: June 26, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Almog Aley-Raz, Kevin R. Farrell, Oshrit Yaron, Luca Scarpato
  • Patent number: 10006777
    Abstract: A system and method of recognizing speech received at a vehicle includes: receiving speech from a vehicle occupant via a microphone; determining whether the speech relates to a point of interest (POI) or an address without receiving a POI command prompt or an address command prompt in the speech from the vehicle occupant; selecting a POI function or an address function based on the determination; and processing the received speech to identify a POI or an address.
    Type: Grant
    Filed: October 2, 2015
    Date of Patent: June 26, 2018
    Assignee: GM Global Technology Operations LLC
    Inventors: Gaurav Talwar, Xufang Zhao
  • Patent number: 9990918
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing an utterance, and the input acoustic sequence comprising a respective acoustic feature representation at each of a first number of time steps; processing the input acoustic sequence using a first neural network to convert the input acoustic sequence into an alternative representation for the input acoustic sequence; processing the alternative representation for the input acoustic sequence using an attention-based Recurrent Neural Network (RNN) to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings; and generating a sequence of substrings that represent a transcription of the utterance.
    Type: Grant
    Filed: October 19, 2017
    Date of Patent: June 5, 2018
    Assignee: Google LLC
    Inventors: William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Noam M. Shazeer
  • Patent number: 9990176
    Abstract: Methods and devices for determining whether a local version of content is stored on an electronic device associated with a user account on a backend system are described herein. In a non-limiting embodiment, the backend system may track and monitor the content stored on the electronic device using the associated user account. If an individual speaks an utterance requesting a particular content item, the backend system may determine, prior to sending the content to the electronic device, whether a local version is stored within the electronic device's memory. If so, the backend system may instruct the electronic device to output the local version, thereby reducing the amount of bandwidth consumed. The backend system may further be capable of predictively generating and then caching certain audio data to the electronic device.
    Type: Grant
    Filed: June 28, 2016
    Date of Patent: June 5, 2018
    Assignee: Amazon Technologies, Inc.
    Inventor: Timothy Thomas Gray
  • Patent number: 9984585
    Abstract: A method and system for constructive response grading for spoken language is disclosed. The method and system are computer implemented and involve a crowdsourcing step to derive evaluation features. The method includes steps for posting a speech test through an automated speech assessment tool; receiving candidate responses from candidates for the speech test; delivering the candidate responses to crowdsource volunteers; receiving crowdsourced responses from crowdsource volunteers, where the crowdsourced responses comprise a transcription of the speech test; deriving features from the transcription; and deriving individual scores based on the features, where the individual scores are representative of pronunciation score, fluency score, content organization score and grammar score of the spoken language for each candidate.
    Type: Grant
    Filed: September 18, 2014
    Date of Patent: May 29, 2018
    Inventors: Varun Aggarwal, Vinay Shashidhar
  • Patent number: 9978363
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating domain-specific speech recognition models for a domain of interest by combining and tuning existing speech recognition models when a speech recognizer does not have access to a speech recognition model for that domain of interest and when available domain-specific data is below a minimum desired threshold to create a new domain-specific speech recognition model. A system configured to practice the method identifies a speech recognition domain and combines a set of speech recognition models, each speech recognition model of the set of speech recognition models being from a respective speech recognition domain. The system receives an amount of data specific to the speech recognition domain, wherein the amount of data is less than a minimum threshold to create a new domain-specific model, and tunes the combined speech recognition model for the speech recognition domain based on the data.
    Type: Grant
    Filed: June 12, 2017
    Date of Patent: May 22, 2018
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Srinivas Bangalore, Robert Bell, Diamantino Antonio Caseiro, Mazin Gilbert, Patrick Haffner
  • Patent number: 9972340
    Abstract: In a computer system for navigating to a location in recorded content, a computer receives a descriptive term or phrase associated with a searchable tag. The searchable tag corresponds to a point-in-time at which a non-speech sound occurred during the recording of recorded content of a communication between a plurality of participants. The recorded content includes speech from one or more of the plurality of participants, the descriptive term includes an automatically generated phonetic translation of the non-speech sound, and the non-speech sound was transmitted to the plurality of participants during the recording. The computer navigates to a location in the recorded content corresponding to the point-in-time at which the non-speech sound occurred.
    Type: Grant
    Filed: July 27, 2016
    Date of Patent: May 15, 2018
    Assignee: International Business Machines Corporation
    Inventors: Denise A. Bell, Lisa Seacat Deluca, Jana H. Jenkins, Jeffrey A. Kusnitz
  • Patent number: 9943260
    Abstract: An alcohol consumption determination method includes detecting a plurality of effective frames of an input voice signal; detecting a difference signal of the original signal of each of the effective frames; detecting average energy of the original signal and average energy of the difference signal for each of the effective frames; and determining whether alcohol has been consumed based on a difference between the average energy of the original signal and the average energy of the difference signal for each effective frame. Accordingly, it is also possible to determine, from a remote location, whether a driver or an operator has consumed alcohol and the degree of inebriation by using a difference signal energy method using a voice signal, thus preventing accidents caused by operating vehicles and machines under the influence of alcohol.
    Type: Grant
    Filed: April 2, 2014
    Date of Patent: April 17, 2018
    Assignee: FOUNDATION OF SOONGSIL UNIVERSITY—INDUSTRY COOPERATION
    Inventors: Myung Jin Bae, Sang Gil Lee, Seong Geon Bae
  • Patent number: 9916825
    Abstract: There are disclosed methods and systems for text-to-speech synthesis for outputting a synthetic speech having a selected speech attribute. First, an acoustic space model is trained based on a set of training data of speech attributes, using a deep neural network (DNN) to determine interdependency factors between the speech attributes in the training data, the DNN generating a single, continuous acoustic space model based on the interdependency factors, the acoustic space model thereby taking into account a plurality of interdependent speech attributes and allowing for modelling of a continuous spectrum of the interdependent speech attributes. Next, a text is received; a selection of one or more speech attributes is received, each speech attribute having a selected attribute weight; the text is converted into synthetic speech using the acoustic space model, the synthetic speech having the selected speech attribute; and the synthetic speech is outputted as audio having the selected speech attribute.
    Type: Grant
    Filed: September 13, 2016
    Date of Patent: March 13, 2018
    Assignee: YANDEX EUROPE AG
    Inventor: Ilya Vladimirovich Edrenkin
  • Patent number: 9910851
    Abstract: Disclosed are an on-line voice translation method and device. The method comprises: conducting voice recognition on first voice information input by a first user, so as to obtain first recognition information; prompting the first user to confirm the first recognition information; translating the confirmed first recognition information to obtain and output first translation information; extracting, according to second information which is fed back by a second user, associated information corresponding to the second information; and correcting the first translation information according to the associated information and outputting the corrected translation information. By means of the on-line voice translation method and device, smooth communication can be ensured in cross-language exchanges.
    Type: Grant
    Filed: November 12, 2014
    Date of Patent: March 6, 2018
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Haifeng Wang, Hua Wu
  • Patent number: 9907509
    Abstract: An alcohol consumption determination method includes: detecting an effective frame of an input voice signal; detecting a difference signal of an original signal of the effective frame; performing a fast Fourier transform on the original signal and the difference signal; and determining, in the frequency domain, whether alcohol has been consumed based on a slope difference between the fast-Fourier-transformed original signal and the fast-Fourier-transformed difference signal. Accordingly, it is also possible to determine, from a remote location, whether a driver or an operator has consumed alcohol and the degree of the consumption, thus preventing an accident caused by an individual operating a vehicle under the influence.
    Type: Grant
    Filed: April 2, 2014
    Date of Patent: March 6, 2018
    Assignee: FOUNDATION OF SOONGSIL UNIVERSITY—INDUSTRY COOPERATION
    Inventors: Myung Jin Bae, Sang Gil Lee, Seong Geon Bae
  • Patent number: 9905223
    Abstract: Disclosed herein is a system, method and computer readable medium storing instructions related to semantic and syntactic information in a language understanding system. The method embodiment of the invention is a method for classifying utterances during a natural language dialog between a human and a computing device. The method comprises receiving a user utterance; generating a semantic and syntactic graph associated with the received utterance; extracting all n-grams as features from the generated semantic and syntactic graph; and classifying the utterance. Classifying the utterance may be performed in any number of ways, such as using the extracted n-grams, semantic and syntactic graphs, or writing rules.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: February 27, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
  • Patent number: 9892730
    Abstract: Speech recognition may be improved by generating and using a topic specific language model. A topic specific language model may be created by performing an initial pass on an audio signal using a generic or basis language model. A speech recognition device may then determine topics relating to the audio signal based on the words identified in the initial pass and retrieve a corpus of text relating to those topics. Using the retrieved corpus of text, the speech recognition device may create a topic specific language model. In one example, the speech recognition device may adapt or otherwise modify the generic language model based on the retrieved corpus of text.
    Type: Grant
    Filed: July 1, 2009
    Date of Patent: February 13, 2018
    Assignee: Comcast Interactive Media, LLC
    Inventors: David F. Houghton, Seth Michael Murray, Sibley Verbeck Simon
  • Patent number: 9875232
    Abstract: There is provided a method of performing an on-line definition of a first word, the first word received from a user of an electronic device via a communication network. The method can be executed at a server. The method comprises: obtaining a first definition set from a first source, the first definition set being based on the first word; obtaining a second definition set from a second source, the second definition set being based on the first word; parsing the first definition set to obtain individual first set words; parsing the second definition set to obtain individual second set words; organizing the individual first set words into at least one definition cluster; causing the electronic device to display to the user at least the first cluster.
    Type: Grant
    Filed: October 22, 2014
    Date of Patent: January 23, 2018
    Assignee: YANDEX EUROPE AG
    Inventors: Andrey Nikolaevich Mikheev, Andrei Igorevich Shevchenko
  • Patent number: 9870768
    Abstract: A subject estimation system includes a convolutional neural network to estimate a subject label of a dialog. The convolutional neural network includes: one or more topic-dependent convolutional layers and one topic-independent convolutional layer, each of the one or more topic-dependent convolutional layers performing, on an input of a word-string vector sequence corresponding to dialog text transcribed from a dialog, a convolution operation dependent on a topic, and the topic-independent convolutional layer performing, on the input of the word-string vector sequence, a convolution operation not dependent on the topic; a pooling layer performing a pooling process on outputs of the convolutional layers; and a fully connected layer performing a full connection process on outputs of the pooling layer.
    Type: Grant
    Filed: September 12, 2016
    Date of Patent: January 16, 2018
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Hongjie Shi, Takashi Ushio, Mitsuru Endo, Katsuyoshi Yamagami
  • Patent number: 9858938
    Abstract: In a pulse encoding and decoding method and a pulse codec, more than two tracks are jointly encoded, so that free codebook space in the situation of single track encoding can be combined during joint encoding to become code bits that may be saved. Furthermore, a pulse that is on each track and required to be encoded is combined according to positions, and the number of positions having pulses, distribution of the positions that have pulses on the track, and the number of pulses on each position that has a pulse are encoded separately, so as to avoid separate encoding performed on multiple pulses of a same position, thereby further saving code bits.
    Type: Grant
    Filed: October 28, 2016
    Date of Patent: January 2, 2018
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Fuwei Ma, Dejun Zhang
  • Patent number: 9858921
    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving geographical information derived from a non-verbal user action associated with a first computing device. The non-verbal user action implies an interest of a user in a geographic location. The method also includes identifying a grammar associated with the geographic location using the derived geographical information and outputting a grammar indicator for use in selecting the identified grammar for voice recognition processing of vocal input from the user.
    Type: Grant
    Filed: August 2, 2013
    Date of Patent: January 2, 2018
    Assignee: Google Inc.
    Inventors: David P. Singleton, Debajit Ghosh