Patents Examined by Paras D Shah
  • Patent number: 10971131
    Abstract: The present disclosure discloses a method and apparatus for generating a speech synthesis model. A specific embodiment of the method comprises: acquiring a plurality of types of training samples, each type including a text of that type and speech of the text, read by the announcer corresponding to the type in the style of speech corresponding to the type; and training a neural network corresponding to a speech synthesis model using the plurality of types of training samples and the annotation of the style of speech in each training sample, to obtain a speech synthesis model that can synthesize speech in a plurality of styles for the announcer corresponding to each type.
    Type: Grant
    Filed: August 3, 2018
    Date of Patent: April 6, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventor: Yongguo Kang
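As a rough illustration of how the multi-type corpus described in this entry might be organized before training, here is a minimal sketch; the `TrainingSample` structure, field names, and example data are assumptions rather than the patent's actual implementation, and the neural-network training itself is omitted.

```python
from dataclasses import dataclass

# Hypothetical sample structure for the multi-type corpus: each type pairs
# texts with recordings by that type's announcer, plus a style annotation,
# so a single model can learn every announcer/style combination.
@dataclass
class TrainingSample:
    sample_type: str   # e.g. "news", "storytelling"
    text: str
    audio_path: str    # speech of the text read by the type's announcer
    style_label: str   # annotation of the style of speech

def build_training_set(corpora: dict) -> list:
    """Flatten per-type corpora into one annotated training set."""
    samples = []
    for sample_type, records in corpora.items():
        for text, audio_path, style in records:
            samples.append(TrainingSample(sample_type, text, audio_path, style))
    return samples

corpora = {"news": [("Top story today...", "news_001.wav", "formal")],
           "story": [("Once upon a time...", "story_001.wav", "narrative")]}
print(build_training_set(corpora))
```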
  • Patent number: 10971155
    Abstract: Methods and systems are provided for monitoring onboard communications after a change to a functionality of an onboard system. An exemplary method involves identifying a source initiating the change to the functionality of the onboard system, determining an expected response to the change to the functionality of the onboard system by a vehicle operator based at least in part on the change, the source, and one or more callout rules associated with the onboard system, monitoring for the expected response from the vehicle operator, and generating a user notification in response to an absence of the expected response from the vehicle operator.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: April 6, 2021
    Assignee: HONEYWELL INTERNATIONAL INC.
    Inventors: Anil Kumar Songa, Paula Renee Gardner, Kishore Kumar Sandrana
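A minimal sketch of the callout-rule lookup the abstract above describes, assuming a hypothetical rule table mapping (source, change) pairs to an expected vehicle-operator response and a response window; the rule contents and timings are invented for illustration.

```python
# Hypothetical callout rules: (source, change) -> expected operator response
# and the window, in seconds, within which it should be observed.
CALLOUT_RULES = {
    ("autopilot", "altitude_target_changed"): ("altitude callout", 10.0),
}

def monitor_callout(source, change, heard_responses, elapsed_s):
    """Return a user notification if the expected response is absent."""
    rule = CALLOUT_RULES.get((source, change))
    if rule is None:
        return None
    expected, window_s = rule
    if elapsed_s >= window_s and expected not in heard_responses:
        return f"Notification: expected '{expected}' after {change} from {source}"
    return None

print(monitor_callout("autopilot", "altitude_target_changed", [], 12.0))
```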
  • Patent number: 10964308
    Abstract: A speech processing apparatus is provided in which face feature points are extracted, frame by frame, from moving image data obtained by imaging a speaker's face. A first generation network, which generates the face feature points of each frame from speech feature data extracted frame by frame from the speaker's uttered speech, is generated, and whether it is appropriate is evaluated using an identification network. A second generation network is then generated, which produces the uttered speech from a plurality of variable settings (including at least text representing the utterance content and information indicating the emotions included in the uttered speech), a plurality of types of fixed settings that define speech quality, and the face feature points generated by the first generation network evaluated as appropriate; whether the second generation network is appropriate is likewise evaluated using the identification network.
    Type: Grant
    Filed: October 29, 2018
    Date of Patent: March 30, 2021
    Inventor: Ken-ichi Kainuma
  • Patent number: 10963510
    Abstract: A natural language processing system that includes an artificial intelligence (AI) engine and a tagging engine. The AI engine is configured to receive a set of audio files and to identify concepts within the set of audio files. The AI engine is further configured to determine a usage frequency for each of the identified concepts and to generate an AI-defined tag for concepts whose usage frequency is greater than a usage frequency threshold. The tagging engine is configured to receive an audio file and to identify observed concepts within the audio file. The tagging engine is further configured to compare the observed concepts to the AI-identified concepts, to determine whether one or more observed concepts match concepts linked with AI-defined tags, and to modify metadata for the audio file to include the matching AI-defined tags.
    Type: Grant
    Filed: August 9, 2018
    Date of Patent: March 30, 2021
    Assignee: Bank of America Corporation
    Inventors: James McCormack, Sean M. Gutman, Manu J. Kurian, Sasidhar Purushothaman, Suki Ramasamy, William P. Jacobson
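The frequency-thresholded tagging logic in the entry above lends itself to a short sketch. This is a simplified stand-in that assumes concepts arrive as plain strings, with `ai_defined_tags` playing the AI engine's role and `tag_audio_file` the tagging engine's.

```python
from collections import Counter

def ai_defined_tags(concepts_per_file, usage_threshold):
    """Tag concepts whose usage frequency across the corpus exceeds the threshold."""
    counts = Counter(c for concepts in concepts_per_file for c in concepts)
    return {c for c, n in counts.items() if n > usage_threshold}

def tag_audio_file(observed_concepts, tags, metadata):
    """Add matching AI-defined tags to the audio file's metadata."""
    matched = [c for c in observed_concepts if c in tags]
    metadata.setdefault("ai_tags", []).extend(matched)
    return metadata

corpus = [["wire transfer", "fraud"], ["fraud", "loan"], ["fraud"]]
tags = ai_defined_tags(corpus, usage_threshold=2)
print(tag_audio_file(["fraud", "mortgage"], tags, {"file": "call_017.wav"}))
```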
  • Patent number: 10963644
    Abstract: Computer-implemented techniques are described herein for generating and utilizing a universal encoder component (UEC). The UEC maps a linguistic expression in a natural language to a language-agnostic representation of the linguistic expression. The representation is said to be agnostic with respect to language because it captures semantic content that is largely independent of the syntactic rules associated with the natural language used to compose the linguistic expression. The representation is also agnostic with respect to task because a downstream training system can leverage it to produce different kinds of machine-trained components that serve different respective tasks. The UEC facilitates the generation of downstream machine-trained models by permitting a developer to train a model on input examples expressed in one natural language and thereafter apply it to the interpretation of documents in another natural language, with no additional training required.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: March 30, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Armen Aghajanyan, Xia Song, Saurabh Kumar Tiwary
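A toy demonstration of the cross-lingual transfer the abstract above claims: a "downstream model" (here just nearest centroids) is trained on examples in one language, then applied to another language with no additional training. The shared vector space is faked with a lookup table; a real UEC would be a trained neural encoder.

```python
# Toy stand-in for the universal encoder: maps text in any language to a
# shared vector space. Purely illustrative, not the patented model.
SHARED_SPACE = {"dog": (1.0, 0.0), "chien": (1.0, 0.0),
                "car": (0.0, 1.0), "voiture": (0.0, 1.0)}

def encode(text):
    return SHARED_SPACE[text]

def train_centroids(examples):
    """Nearest-centroid 'downstream model' trained on one language only."""
    sums = {}
    for text, label in examples:
        v = encode(text)
        s, n = sums.get(label, ((0.0, 0.0), 0))
        sums[label] = ((s[0] + v[0], s[1] + v[1]), n + 1)
    return {lab: (s[0] / n, s[1] / n) for lab, (s, n) in sums.items()}

def classify(text, centroids):
    v = encode(text)
    return min(centroids, key=lambda lab: sum((a - b) ** 2
                                              for a, b in zip(v, centroids[lab])))

centroids = train_centroids([("dog", "animal"), ("car", "vehicle")])  # English only
print(classify("chien", centroids))  # French input, no additional training
```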
  • Patent number: 10949626
    Abstract: The present disclosure provides a global simultaneous interpretation method and a product thereof. The method includes the following steps: a smart phone receives a calling request sent by a terminal, answers the calling request, and establishes a call connection; the smart phone receives first voice information transmitted over the call connection and, when the first voice information is identified as being in a non-specified language, translates it into second voice information in a specified language; and the smart phone plays the second voice information through a speaker device.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: March 16, 2021
    Assignee: WING TAK LEE SILICONE RUBBER TECHNOLOGY (SHENZHEN) CO., LTD
    Inventor: Tak Nam Liu
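A minimal sketch of the call flow described above, with `detect_language`, `translate`, and the speaker playback stubbed out as hypothetical stand-ins for the phone's actual speech services.

```python
# Illustrative call-handling flow; the stubs below are assumptions.
SPECIFIED_LANGUAGE = "en"

def detect_language(voice):          # stub: assume metadata carries the language
    return voice["lang"]

def translate(voice, target_lang):   # stub translation service
    return {"lang": target_lang, "audio": f"translated({voice['audio']})"}

def handle_incoming_voice(first_voice):
    """Translate non-specified-language speech, then play it on the speaker."""
    if detect_language(first_voice) != SPECIFIED_LANGUAGE:
        second_voice = translate(first_voice, SPECIFIED_LANGUAGE)
    else:
        second_voice = first_voice
    print("playing on speaker:", second_voice["audio"])

handle_incoming_voice({"lang": "zh", "audio": "ni hao"})
```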
  • Patent number: 10930287
    Abstract: In some embodiments, an exemplary inventive system for improving the computer speed and accuracy of automatic speech transcription includes at least a computer processor configured to perform: generating a recognition model specification for a plurality of distinct speech-to-text transcription engines, where each distinct engine corresponds to a respective distinct speech recognition model; receiving at least one audio recording representing a speech of a person; segmenting the audio recording into a plurality of audio segments; determining a respective distinct speech-to-text transcription engine to transcribe a respective audio segment; receiving, from the respective transcription engine, a hypothesis for the respective audio segment; and accepting the hypothesis so as to remove the need to submit the respective audio segment to another distinct speech-to-text transcription engine, resulting in the improved computer speed and accuracy of automatic speech transcription.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: February 23, 2021
    Inventors: Tejas Shastry, Matthew Goldey, Svyat Vergun
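A sketch of the per-segment routing and hypothesis-acceptance idea from the entry above, under the assumption that segments carry an acoustic-profile label and engines report a confidence; the registry, profiles, and confidence floor are all invented for illustration.

```python
# Hypothetical registry of transcription engines keyed by the acoustic
# profile their recognition model targets; choosing one engine per segment
# avoids fanning every segment out to all engines.
def route_segment(segment, engines):
    """Pick the engine whose model matches the segment's profile."""
    return engines.get(segment["profile"], engines["general"])

def transcribe(audio_segments, engines, confidence_floor=0.8):
    transcript = []
    for seg in audio_segments:
        engine = route_segment(seg, engines)
        text, confidence = engine(seg)
        # Accept the hypothesis outright when confident enough, so the
        # segment is never resubmitted to another engine.
        if confidence >= confidence_floor:
            transcript.append(text)
        else:
            transcript.append(f"[low confidence: {text}]")
    return " ".join(transcript)

engines = {"general": lambda s: (s["audio"].upper(), 0.9),
           "telephony": lambda s: (s["audio"], 0.95)}
print(transcribe([{"profile": "telephony", "audio": "hello"}], engines))
```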
  • Patent number: 10923118
    Abstract: An audio input method includes: in an audio-input mode, receiving a first audio input by a user, recognizing the first audio to generate a first recognition result, and displaying corresponding verbal content to the user based on the first recognition result; and in an editing mode, receiving a second audio input by the user, recognizing the second audio to generate a second recognition result, converting the second recognition result into an editing instruction, and executing a corresponding operation based on the editing instruction. The audio-input mode and the editing mode are switchable.
    Type: Grant
    Filed: November 17, 2016
    Date of Patent: February 16, 2021
    Assignee: BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO., LTD.
    Inventors: Liping Li, Suhang Wang, Congxian Yan, Lei Yang, Min Liu, Hong Zhao, Jia Yao
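A minimal sketch of the switchable two-mode behavior described above, with `recognize` stubbed and a single hard-coded editing instruction; a real system would map many recognition results to editing operations.

```python
# Minimal two-mode controller; recognize() is a hypothetical ASR stand-in.
def recognize(audio):
    return audio  # stub: the "recognition result" is just the text itself

class AudioInput:
    def __init__(self):
        self.mode = "input"       # switchable: "input" or "editing"
        self.text = ""

    def handle(self, audio):
        result = recognize(audio)
        if self.mode == "input":
            self.text += result   # display verbal content to the user
        elif result == "delete last word":   # recognition result -> editing instruction
            self.text = " ".join(self.text.split()[:-1])

ui = AudioInput()
ui.handle("hello world ")
ui.mode = "editing"
ui.handle("delete last word")
print(ui.text)                    # -> "hello"
```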
  • Patent number: 10923109
    Abstract: A computer-implemented method and an apparatus for facilitating training of conversational agents are disclosed. The method includes automatically extracting a workflow associated with each conversation from among a plurality of conversations between agents and customers of an enterprise. The workflow is extracted, at least in part, by encoding one or more utterances associated with the respective conversation and mapping the encoded one or more utterances to predefined workflow stages. A clustering of the plurality of conversations is performed based on a similarity among respective extracted workflows. The clustering of the plurality of conversations configures a plurality of workflow groups. At least one conversational agent is trained in customer engagement using a set of conversations associated with at least one workflow group from among the plurality of workflow groups.
    Type: Grant
    Filed: August 2, 2018
    Date of Patent: February 16, 2021
    Assignee: [24]7.ai, Inc.
    Inventors: Abir Chakraborty, Sruti Rallapalli, Vidhya Duthaluru
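A simplified sketch of the workflow extraction and grouping in the entry above: keyword spotting stands in for utterance encoding, and exact-match grouping stands in for similarity-based clustering. The stage names and keywords are assumptions.

```python
from collections import defaultdict

# Hypothetical mapping from utterance keywords to predefined workflow stages.
STAGE_KEYWORDS = {"greet": "greeting", "order": "order_lookup", "refund": "resolution"}

def extract_workflow(conversation):
    """Encode utterances and map them onto predefined workflow stages."""
    stages = []
    for utterance in conversation:
        for kw, stage in STAGE_KEYWORDS.items():
            if kw in utterance and (not stages or stages[-1] != stage):
                stages.append(stage)
    return tuple(stages)

def group_by_workflow(conversations):
    """Group conversations whose extracted workflows match exactly
    (a stand-in for similarity-based clustering)."""
    groups = defaultdict(list)
    for conv in conversations:
        groups[extract_workflow(conv)].append(conv)
    return groups

convs = [["greetings!", "order 123 please", "refund issued"],
         ["greet", "order 99", "refund done"]]
print(group_by_workflow(convs).keys())
```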
  • Patent number: 10923139
    Abstract: Systems and methods are provided for processing information of a meeting. An exemplary system may include a communication interface configured to receive meeting information obtained by a plurality of client devices. The meeting information may include multiple audio streams. The system may also include a memory and a processor. The processor may execute instructions stored on the memory to perform operations. The operations may include determining signal-to-noise-ratio (SNR) indicators associated with the audio streams. The operations may also include selecting, from the audio streams, a candidate audio stream based on the SNR indicators. The SNR indicator associated with the candidate audio stream may indicate that the candidate audio stream has a higher average SNR than that of a predetermined number of other audio streams. In addition, the operations may include generating an output data stream including at least a portion of the candidate audio stream.
    Type: Grant
    Filed: August 13, 2018
    Date of Patent: February 16, 2021
    Assignee: MELO INC.
    Inventors: Guobin Shen, Zheng Han
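A sketch of the SNR-indicator selection described above, assuming 16 kHz mono streams as NumPy arrays; the percentile-based SNR estimate is a crude illustrative stand-in for whatever estimator the system actually uses.

```python
import numpy as np

def snr_db(stream, noise_floor=1e-3):
    """Crude per-stream SNR estimate: loudest frames vs quietest frames."""
    frames = stream.reshape(-1, 160)                 # 10 ms frames at 16 kHz
    power = (frames ** 2).mean(axis=1)
    signal = np.percentile(power, 90)
    noise = max(np.percentile(power, 10), noise_floor)
    return 10 * np.log10(signal / noise)

def select_candidate(streams):
    """Pick the stream whose SNR indicator beats the others."""
    return max(streams, key=lambda name: snr_db(streams[name]))

rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
voice = np.sin(2 * np.pi * 220 * t) * (t > 0.5)      # speech only in 2nd half
streams = {"near_mic": voice + 0.01 * rng.normal(size=t.size),
           "far_mic": 0.2 * voice + 0.05 * rng.normal(size=t.size)}
print(select_candidate(streams))                     # -> "near_mic"
```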
  • Patent number: 10916235
    Abstract: Systems, methods, and computer programs are described which utilize the structure of syllables as an organizing element of automated speech recognition processing to overcome variations in pronunciation, to efficiently resolve confusable aspects, to exploit context, and to map the speech to orthography.
    Type: Grant
    Filed: July 10, 2018
    Date of Patent: February 9, 2021
    Assignee: VOX FRONTERA, INC.
    Inventors: Mark B. Pinson, Darrel T. Pinson
  • Patent number: 10915706
    Abstract: A computer-implemented method includes: receiving, by a computing device, a text report request from a user device associated with a user; obtaining a behavior history and personal information of the user; inputting the behavior history and the personal information of the user into a model, to obtain a plurality of personalized evaluation results, each personalized evaluation result corresponding to a respective text report category of a plurality of text report categories, in which each personalized evaluation result indicates a predicted relevance of the corresponding text report category to a problem faced by the user, and in which the model includes a classification model trained using one or more supervised learning techniques on a plurality of user behavior history samples and a plurality of personal information samples; and determining an order in which the plurality of text report categories are to be presented to the user.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: February 9, 2021
    Assignee: Advanced New Technologies Co., Ltd.
    Inventors: Hong Jin, Weiqiang Wang
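The final ordering step in the entry above reduces to sorting categories by the model's predicted relevance; a minimal sketch, with the classifier's scores supplied as a made-up dictionary.

```python
# Hypothetical classifier output: one relevance score per text report
# category, predicted from the user's behavior history and profile.
def rank_report_categories(model_scores):
    """Order categories by predicted relevance to the user's problem."""
    return [cat for cat, _ in sorted(model_scores.items(),
                                     key=lambda kv: kv[1], reverse=True)]

scores = {"account_security": 0.91, "billing_dispute": 0.34, "spam": 0.12}
print(rank_report_categories(scores))
```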
  • Patent number: 10896765
    Abstract: A mathematical model may be trained to diagnose a medical condition of a person by processing acoustic features and language features of speech of the person. The performance of the mathematical model may be improved by appropriately selecting the features to be used with the mathematical model. Features may be selected by computing a feature selection score for each acoustic feature and each language feature, and then selecting features using the scores, such as by selecting features with the highest scores. In some implementations, stability determinations may be computed for each feature and features may be selected using both the feature selection scores and the stability determinations. A mathematical model may then be trained using the selected features and deployed. In some implementations, prompts may be selected using computed prompt selection scores, and the deployed mathematical model may be used with the selected prompts.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: January 19, 2021
    Assignee: CANARY SPEECH, LLC
    Inventors: Jangwon Kim, Namhee Kwon, Henry O'Connell, Phillip Walstad, Kevin Shengbin Yang
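A compact sketch of score-plus-stability feature selection as described above: a class-separation score per feature, with a split-half agreement check standing in for the patent's stability determination. Feature names, values, and thresholds are invented.

```python
import statistics

def feature_score(values, labels):
    """Class-separation score for one feature (higher = more discriminative)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        return 0.0
    spread = statistics.pstdev(values) or 1.0
    return abs(statistics.mean(pos) - statistics.mean(neg)) / spread

def select_features(features, labels, top_k=1, stability_cap=0.5):
    """Keep high-scoring features whose score agrees across the two halves
    of the data (a simple stand-in for a stability determination)."""
    half = len(labels) // 2
    kept = []
    for name, values in features.items():
        s1 = feature_score(values[:half], labels[:half])
        s2 = feature_score(values[half:], labels[half:])
        if abs(s1 - s2) <= stability_cap:
            kept.append(((s1 + s2) / 2, name))
    return [name for _, name in sorted(kept, reverse=True)[:top_k]]

feats = {"pause_rate": [0.9, 0.1, 0.8, 0.2], "pitch_var": [0.5, 0.5, 0.4, 0.6]}
labels = [1, 0, 1, 0]
print(select_features(feats, labels))   # -> ['pause_rate']
```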
  • Patent number: 10891953
    Abstract: Embodiments may be implemented by a computing device, such as a head-mountable display, in order to use a single guard phrase to enable different voice commands in different interface modes. An example device includes an audio sensor and a computing system configured to analyze audio data captured by the audio sensor to detect speech that includes a predefined guard phrase, and to operate in a plurality of different interface modes comprising at least a first and a second interface mode. During operation in the first interface mode, the computing system may initially disable one or more first-mode speech commands, and respond to detection of the guard phrase by enabling the one or more first-mode speech commands. During operation in the second interface mode, the computing system may initially disable a second-mode speech command, and respond to detection of the guard phrase by enabling the second-mode speech command.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: January 12, 2021
    Assignee: GOOGLE LLC
    Inventors: Michael J. LeBeau, Mat Balez
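A minimal sketch of guard-phrase gating across interface modes as described above; the guard phrase, mode names, and command sets are illustrative, not Google's actual ones.

```python
# Mode-specific command sets start disabled until the guard phrase is heard.
MODE_COMMANDS = {"first": {"take photo", "record video"},
                 "second": {"send message"}}
GUARD_PHRASE = "ok glass"

class SpeechController:
    def __init__(self, mode="first"):
        self.mode = mode
        self.enabled = False      # this mode's commands start disabled

    def hear(self, utterance):
        if utterance == GUARD_PHRASE:
            self.enabled = True   # guard phrase enables this mode's commands
        elif self.enabled and utterance in MODE_COMMANDS[self.mode]:
            return f"executing: {utterance}"
        return None

hmd = SpeechController(mode="first")
print(hmd.hear("take photo"))    # None: disabled until the guard phrase
hmd.hear("ok glass")
print(hmd.hear("take photo"))    # executing: take photo
```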
  • Patent number: 10891948
    Abstract: A system, method and computer product are provided for processing audio signals. An audio signal of a voice and background noise is input, and speech recognition is performed to retrieve the speech content of the voice. Content metadata corresponding to the speech content and environmental metadata corresponding to the background noise are then retrieved. Preferences for media content corresponding to the content metadata and the environmental metadata are determined, and an output corresponding to the preferences is provided.
    Type: Grant
    Filed: February 21, 2018
    Date of Patent: January 12, 2021
    Assignee: SPOTIFY AB
    Inventor: Stéphane Hulaud
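At its simplest, the final step above can be pictured as a lookup keyed on both metadata streams; this toy mapping is entirely invented.

```python
# Illustrative preference lookup keyed on speech-content metadata and
# background-noise (environmental) metadata; the mapping is made up.
PREFERENCES = {("workout", "gym_noise"): "high-energy playlist",
               ("relax", "quiet"): "ambient playlist"}

def recommend(content_meta, environment_meta):
    """Determine media-content preferences from both metadata signals."""
    return PREFERENCES.get((content_meta, environment_meta), "default mix")

print(recommend("workout", "gym_noise"))
```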
  • Patent number: 10891444
    Abstract: A computer-implemented method, a computer system and a non-transitory computer-readable medium for constructing human-readable sentences from imaging data of a subject can include: receiving imaging data including image elements of at least one region of interest of the subject; segmenting the imaging data of the region of interest into a plurality of sub-regions, where each sub-region includes a portion of the image elements; calculating an abnormality factor for each of the sub-regions by quantitatively analyzing segmented image information of the imaging data of the sub-regions using data from a normal database; comparing each abnormality factor to a threshold value; constructing a human-understandable sentence for the subject when a corresponding abnormality factor exceeds the threshold, where each human-understandable sentence references a physical structure associated with the calculation for the region or sub-region; and outputting the human-understandable sentences for the at least one region of interest.
    Type: Grant
    Filed: October 26, 2016
    Date of Patent: January 12, 2021
    Assignee: The Johns Hopkins University
    Inventors: Susumu Mori, Michael I. Miller
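A toy sketch of the abnormality-factor-to-sentence pipeline above: z-scoring a measured volume against a normal database and emitting a templated sentence when the threshold is exceeded. Region names, values, and the 2-SD threshold are made up.

```python
# Illustrative normal database: mean and std volume (mL) per sub-region.
NORMAL_DB = {"hippocampus_left": (4.2, 0.3)}

def abnormality_factor(region, measured_volume):
    mean, std = NORMAL_DB[region]
    return abs(measured_volume - mean) / std

def report(measurements, threshold=2.0):
    """Construct a sentence for each sub-region exceeding the threshold."""
    sentences = []
    for region, volume in measurements.items():
        z = abnormality_factor(region, volume)
        if z > threshold:
            sentences.append(f"The {region.replace('_', ' ')} volume "
                             f"({volume} mL) deviates {z:.1f} SD from normal.")
    return sentences

print(report({"hippocampus_left": 3.0}))
```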
  • Patent number: 10872620
    Abstract: Embodiments of the present disclosure provide a voice detection method. An audio signal can be divided into a plurality of audio segments. Audio characteristics can be extracted from each of the plurality of audio segments. The audio characteristics of the respective audio segment include a time domain characteristic and a frequency domain characteristic of the respective audio segment. At least one target voice segment can be detected from the plurality of audio segments according to the audio characteristics of the plurality of audio segments.
    Type: Grant
    Filed: May 1, 2018
    Date of Patent: December 22, 2020
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Haijin Fan
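A sketch of combining one time-domain characteristic (short-time energy) with one frequency-domain characteristic (spectral centroid) to flag voice segments, as the abstract above describes in general terms; the specific features and thresholds here are illustrative assumptions.

```python
import numpy as np

def voice_segments(audio, sr=16000, frame_len=400):
    """Flag frames whose short-time energy (time domain) and spectral
    centroid (frequency domain) both look speech-like."""
    n = len(audio) // frame_len
    frames = audio[:n * frame_len].reshape(n, frame_len)
    energy = (frames ** 2).mean(axis=1)                    # time-domain feature
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, 1 / sr)
    centroid = (spectrum * freqs).sum(axis=1) / (spectrum.sum(axis=1) + 1e-9)
    return (energy > 10 * energy.min() + 1e-9) & (centroid > 100) & (centroid < 3000)

t = np.arange(16000) / 16000
audio = np.where(t < 0.5, 0.01, 1.0) * np.sin(2 * np.pi * 200 * t)
print(voice_segments(audio).astype(int))   # quiet first half 0, voiced second half 1
```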
  • Patent number: 10847165
    Abstract: Detecting a replay attack on a voice biometrics system comprises: receiving a speech signal from a voice source; generating and transmitting an ultrasound signal through a transducer of the device; detecting a reflection of the transmitted ultrasound signal; detecting Doppler shifts in the reflection of the generated ultrasound signal; and identifying whether the received speech signal is indicative of liveness of a speaker based on the detected Doppler shifts. The method further comprises: obtaining information about a position of the device; and adapting the generating and transmitting of the ultrasound signal based on the information about the position of the device.
    Type: Grant
    Filed: October 10, 2018
    Date of Patent: November 24, 2020
    Assignee: Cirrus Logic, Inc.
    Inventor: John Paul Lesso
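A worked toy example of the Doppler-shift measurement at the heart of the entry above: find the reflection's spectral peak and subtract the transmitted tone's frequency. The 22 kHz tone and 96 kHz sample rate are assumptions for illustration; a nonzero shift suggests a moving source (a live speaker's face) rather than a static loudspeaker replaying a recording.

```python
import numpy as np

def doppler_shift_hz(reflection, tx_freq=22000, sr=96000):
    """Offset of the reflection's spectral peak from the transmitted tone."""
    spectrum = np.abs(np.fft.rfft(reflection * np.hanning(len(reflection))))
    freqs = np.fft.rfftfreq(len(reflection), 1 / sr)
    return freqs[spectrum.argmax()] - tx_freq

t = np.arange(96000) / 96000                      # 1 s of reflected signal
static = np.sin(2 * np.pi * 22000 * t)            # no movement: no shift
moving = np.sin(2 * np.pi * 22040 * t)            # 40 Hz Doppler shift
print(doppler_shift_hz(static), doppler_shift_hz(moving))   # 0.0 40.0
```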
  • Patent number: 10847177
    Abstract: Described embodiments include an apparatus that includes a network interface and a processor. The processor is configured to receive, via the network interface, a speech signal that represents speech uttered by a subject, the speech including one or more speech segments, divide the speech signal into multiple frames, such that one or more sequences of the frames represent the speech segments, respectively, compute respective estimated total volumes of air exhaled by the subject while the speech segments were uttered, by, for each of the sequences, computing respective estimated flow rates of air exhaled by the subject during the frames belonging to the sequence and, based on the estimated flow rates, computing a respective one of the estimated total volumes of air, and, in response to the estimated total volumes of air, generate an alert. Other embodiments are also described.
    Type: Grant
    Filed: October 11, 2018
    Date of Patent: November 24, 2020
    Assignee: CORDIO MEDICAL LTD.
    Inventor: Ilan D. Shallom
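The volume computation in the entry above reduces to integrating per-frame flow-rate estimates over each speech segment; a minimal sketch with made-up flow rates and alert threshold.

```python
# Illustrative volume estimate: per-frame flow rates (made-up values in
# mL/s) integrated over each speech segment's frames.
FRAME_S = 0.02                      # 20 ms frames

def segment_volumes(segments):
    """Sum estimated flow rate x frame duration for each speech segment."""
    return [sum(flow * FRAME_S for flow in seg) for seg in segments]

def check_and_alert(segments, low_volume_ml=50.0):
    """Generate an alert in response to abnormally low exhaled volumes."""
    for i, vol in enumerate(segment_volumes(segments)):
        if vol < low_volume_ml:
            print(f"alert: segment {i} exhaled only {vol:.0f} mL")

# two segments of per-frame estimated flow rates (mL/s)
check_and_alert([[900] * 100, [40] * 50])   # second segment triggers the alert
```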
  • Patent number: 10848868
    Abstract: In an example, an audio signal may be routed to an audio device based on an indication of audio device historical usage, a measure of audio quality of the audio device, or a combination thereof.
    Type: Grant
    Filed: February 21, 2017
    Date of Patent: November 24, 2020
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Mohit Gupta
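A sketch of combining historical usage with measured audio quality into one routing score, per the entry above; the weights and device data are assumptions.

```python
# Illustrative routing decision: weight each device's historical-usage
# share against its measured audio quality.
def route_audio(devices, w_history=0.4, w_quality=0.6):
    """Pick the device with the best combined history/quality score."""
    return max(devices, key=lambda d: w_history * devices[d]["usage_share"]
                                      + w_quality * devices[d]["quality"])

devices = {"headset": {"usage_share": 0.7, "quality": 0.9},
           "speakers": {"usage_share": 0.3, "quality": 0.6}}
print(route_audio(devices))   # -> "headset"
```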