Subportions Patents (Class 704/254)

Speech recognition method and apparatus

Patent number: 11955119

Abstract: A speech recognition method includes receiving speech data, obtaining, from the received speech data, a candidate text including at least one word and a phonetic symbol sequence associated with a pronunciation of a target word included in the received speech data, using a speech recognition model, replacing the phonetic symbol sequence included in the candidate text with a replacement word corresponding to the phonetic symbol sequence, and determining a target text corresponding to the received speech data based on a result of the replacing.

Type: Grant

Filed: December 16, 2022

Date of Patent: April 9, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventor: Jihyun Lee
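
The replacement step in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the candidate text arrives as a list of tokens in which ordinary words are strings and a phonetic symbol sequence is a tuple of symbols. The function name `replace_phonetic`, the `lexicon` mapping, and the fallback of spelling out unknown sequences are hypothetical and not taken from the patent.

```python
from typing import Dict, List, Tuple, Union

Token = Union[str, Tuple[str, ...]]  # word, or a phonetic symbol sequence


def replace_phonetic(tokens: List[Token], lexicon: Dict[Tuple[str, ...], str]) -> str:
    """Build a target text by swapping each phonetic sequence for a word."""
    words = []
    for token in tokens:
        if isinstance(token, tuple):
            # Look up the replacement word; if the sequence is unknown,
            # fall back to the concatenated symbols (an assumption).
            words.append(lexicon.get(token, "".join(token)))
        else:
            words.append(token)
    return " ".join(words)


lexicon = {("M", "IY", "N", "AH"): "Mina"}
candidate = ["call", ("M", "IY", "N", "AH"), "now"]
target_text = replace_phonetic(candidate, lexicon)
```

In this sketch the lexicon stands in for whatever source of replacement words a real system would consult, such as a user's contact list.
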
Extensible search, content, and dialog management system with human-in-the-loop curation

Patent number: 11948566

Abstract: The present disclosure describes systems and methods for extensible search, content, and dialog management. Embodiments of the present disclosure provide a dialog system with a trained intent recognition model (e.g., a deep learning model) to receive and understand a natural language query from a user. In cases where intent is not identified for a received query, the dialog system generates one or more candidate responses that may be refined (e.g., using human-in-the-loop curation) to generate a response. The intent recognition model may be updated (e.g., retrained) accordingly. Upon receiving a subsequent query with similar intent, the dialog system may identify the intent using the updated intent recognition model.

Type: Grant

Filed: March 24, 2021

Date of Patent: April 2, 2024

Assignee: ADOBE INC.

Inventors: Oliver Brdiczka, Kyoung Tak Kim, Charat Maheshwari
Speech processing optimizations based on microphone array

Patent number: 11935525

Abstract: Systems and methods for utilizing microphone array information for acoustic modeling are disclosed. Audio data may be received from a device having a microphone array configuration. Microphone configuration data may also be received that indicates the configuration of the microphone array. The microphone configuration data may be utilized as an input vector to an acoustic model, along with the audio data, to generate phoneme data. Additionally, the microphone configuration data may be utilized to train and/or generate acoustic models, select an acoustic model to perform speech recognition with, and/or to improve trigger sound detection.

Type: Grant

Filed: June 8, 2020

Date of Patent: March 19, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Shiva Kumar Sundaram, Minhua Wu, Anirudh Raju, Spyridon Matsoukas, Arindam Mandal, Kenichi Kumatani
Presenting location related information and implementing a task based on gaze, gesture, and voice detection

Patent number: 11906317

Abstract: Systems and methods for presenting information and executing a task. In an aspect, when a user gazes at a display of a standby device, location related information is presented. In another aspect, when a user utters a voice command and gazes or gestures at a device, a task is executed. In another aspect, a voice input, a gesture, and user information are used to determine a destination for a trip or a product for a purchase. In another aspect, a voice input and user information are used to determine a destination when a user hails a vehicle.

Type: Grant

Filed: June 8, 2021

Date of Patent: February 20, 2024

Inventor: Chian Chiu Li
Method and system for processing speech signal

Patent number: 11900958

Abstract: Embodiments of the present disclosure provide methods and systems for processing a speech signal. The method can include: processing the speech signal to generate a plurality of speech frames; generating a first number of acoustic features based on the plurality of speech frames using a frame shift at a given frequency; and generating a second number of posteriori probability vectors based on the first number of acoustic features using an acoustic model, wherein each of the posteriori probability vectors comprises probabilities of the acoustic features corresponding to a plurality of modeling units, respectively.

Type: Grant

Filed: December 26, 2022

Date of Patent: February 13, 2024

Assignee: Alibaba Group Holding Limited

Inventors: Shiliang Zhang, Ming Lei, Wei Li, Haitao Yao
Display apparatus and method for registration of user command

Patent number: 11900939

Abstract: A display apparatus includes an input unit configured to receive a user command; an output unit configured to output a registration suitability determination result for the user command; and a processor configured to generate phonetic symbols for the user command, analyze the generated phonetic symbols to determine registration suitability for the user command, and control the output unit to output the registration suitability determination result for the user command. Therefore, the display apparatus may register a user command which is resistant to misrecognition and guarantees high recognition rate among user commands defined by a user.

Type: Grant

Filed: October 7, 2022

Date of Patent: February 13, 2024

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Nam-yeong Kwon, Kyung-mi Park
Display apparatus and method for registration of user command

Patent number: 11862166

Abstract: A display apparatus includes an input unit configured to receive a user command; an output unit configured to output a registration suitability determination result for the user command; and a processor configured to generate phonetic symbols for the user command, analyze the generated phonetic symbols to determine registration suitability for the user command, and control the output unit to output the registration suitability determination result for the user command. Therefore, the display apparatus may register a user command which is resistant to misrecognition and guarantees high recognition rate among user commands defined by a user.

Type: Grant

Filed: October 7, 2022

Date of Patent: January 2, 2024

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Nam-yeong Kwon, Kyung-mi Park
Method and system for detecting unsupported utterances in natural language understanding

Patent number: 11854528

Abstract: An apparatus for detecting unsupported utterances in natural language understanding includes a memory storing instructions, and at least one processor configured to execute the instructions to classify a feature that is extracted from an input utterance of a user, as one of in-domain and out-of-domain (OOD) for a response to the input utterance, obtain an OOD score of the extracted feature, and identify whether the feature is classified as OOD. The at least one processor is further configured to execute the instructions to, based on the feature being identified to be classified as in-domain, identify whether the obtained OOD score is greater than a predefined threshold, and based on the OOD score being identified to be greater than the predefined threshold, re-classify the feature as OOD.

Type: Grant

Filed: August 13, 2021

Date of Patent: December 26, 2023

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Yen-Chang Hsu, Yilin Shen, Avik Ray, Hongxia Jin
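
The two-stage decision in the abstract above (classify first, then re-check in-domain results against an OOD score threshold) can be sketched as follows. The `Prediction` container, the label strings, and the function name `final_label` are hypothetical; how the classifier and the OOD score are actually computed is outside this sketch.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str        # "in-domain" or "OOD", as produced by some classifier
    ood_score: float  # higher means more likely out-of-domain


def final_label(pred: Prediction, threshold: float) -> str:
    """Re-classify an in-domain prediction as OOD if its score exceeds the threshold."""
    if pred.label == "OOD":
        return "OOD"  # already rejected by the classifier
    if pred.ood_score > threshold:
        return "OOD"  # classified in-domain, but the score says otherwise
    return "in-domain"
```

The second check only ever moves a prediction from in-domain to OOD, so it can catch utterances that the classifier accepted with low confidence without overriding its rejections.
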
Method, apparatus, electronic device and storage medium for speech recognition

Patent number: 11842726

Abstract: A computer-implemented method for speech recognition is disclosed. The method includes extracting a feature word associated with location information from a speech to be recognized, and calculating a similarity between the feature word and respective ones of a plurality of candidate words in a corpus. The corpus includes a first sub-corpus associated with at least one user, and the plurality of candidate words include, in the first sub-corpus, a first standard candidate word and at least one first erroneous candidate word. The at least one first erroneous candidate word has a preset correspondence with the first standard candidate word. The method further includes in response to the similarity between the feature word and one or more of the at least one first erroneous candidate word satisfying a predetermined condition, outputting the first standard candidate word as a recognition result based on the preset correspondence.

Type: Grant

Filed: September 8, 2021

Date of Patent: December 12, 2023

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Jing Pei, Xiantao Chen, Meng Xu
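
The mapping from erroneous candidate words back to a standard candidate word, as described in the abstract above, can be sketched like this. The similarity measure here is `difflib.SequenceMatcher.ratio()` from the Python standard library, which is an assumption chosen for illustration; the patent abstract does not name a measure. The function name `resolve`, the corpus layout, and the 0.8 threshold are likewise hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def resolve(feature_word: str,
            sub_corpus: Dict[str, List[str]],
            min_similarity: float = 0.8) -> Optional[str]:
    """Return the standard word whose own form or erroneous variant best matches.

    sub_corpus maps each standard candidate word to its known erroneous
    variants (the "preset correspondence").
    """
    best_word, best_score = None, 0.0
    for standard, erroneous in sub_corpus.items():
        for candidate in (standard, *erroneous):
            score = SequenceMatcher(None, feature_word, candidate).ratio()
            if score > best_score:
                best_word, best_score = standard, score
    # Only output a result when the predetermined condition is satisfied.
    return best_word if best_score >= min_similarity else None


corpus = {"Springfield": ["Sprngfield", "Springfeld"]}
```
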
Near real time out of home audience measurement

Patent number: 11844043

Abstract: Methods, apparatus, systems and articles of manufacture for near real time out of home audience measurement are disclosed. An example apparatus includes at least one memory; instructions; and processor circuitry to execute the instructions to at least: receive a first data transmission request at a first portable meter; send a second data transmission request from the first portable meter to a second portable meter; determine whether the first portable meter is capable of transmitting at least one data packet, based at least in part on an indication the second portable meter is capable of transmitting the at least one data packet; and in response to determining the first portable meter is capable of transmitting the at least one data packet, transmit the at least one data packet.

Type: Grant

Filed: June 28, 2021

Date of Patent: December 12, 2023

Assignee: The Nielsen Company (US), LLC

Inventors: John T. Livoti, Stanley Wellington Woodruff
Age-sensitive automatic speech recognition

Patent number: 11837221

Abstract: Systems and methods are described to receive a query from a user and provide a reply that is appropriate for an age group of the user. A query for a media asset is received, where such query comprises an inputted term, and the query is determined to be received from a user belonging to a first age group. A context of the inputted term within the query is identified, and in response to the determining, based on the identified context, that the inputted term of the query is inappropriate for the first age group, a replacement term for the inputted term that is related to the inputted term and is appropriate for the first age group in the context of the query is identified. The query is modified to replace the inputted term with the identified replacement term, and a reply to the modified query is generated for output.

Type: Grant

Filed: February 26, 2021

Date of Patent: December 5, 2023

Assignee: Rovi Guides, Inc.

Inventors: Ankur Anil Aher, Jeffry Copps Robert Jose
Speech recognition with selective use of dynamic language models

Patent number: 11810568

Abstract: A computer-implemented method for transcribing an utterance includes receiving, at a computing system, speech data that characterizes an utterance of a user. A first set of candidate transcriptions of the utterance can be generated using a static class-based language model that includes a plurality of classes that are each populated with class-based terms selected independently of the utterance or the user. The computing system can then determine whether the first set of candidate transcriptions includes class-based terms. Based on whether the first set of candidate transcriptions includes class-based terms, the computing system can determine whether to generate a dynamic class-based language model that includes at least one class that is populated with class-based terms selected based on a context associated with at least one of the utterance and the user.

Type: Grant

Filed: December 10, 2020

Date of Patent: November 7, 2023

Assignee: Google LLC

Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
Systems and methods for adaptive proper name entity recognition and understanding

Patent number: 11783830

Abstract: Various embodiments contemplate systems and methods for performing automatic speech recognition (ASR) and natural language understanding (NLU) that enable high accuracy recognition and understanding of freely spoken utterances which may contain proper names and similar entities. The proper name entities may contain or be comprised wholly of words that are not present in the vocabularies of these systems as normally constituted. Recognition of the other words in the utterances in question, e.g. words that are not part of the proper name entities, may occur at regular, high recognition accuracy. Various embodiments provide as output not only accurately transcribed running text of the complete utterance, but also a symbolic representation of the meaning of the input, including appropriate symbolic representations of proper name entities, adequate to allow a computer system to respond appropriately to the spoken request without further analysis of the user's input.

Type: Grant

Filed: May 26, 2021

Date of Patent: October 10, 2023

Assignee: Promptu Systems Corporation

Inventor: Harry William Printz
System and method for controllable machine text generation architecture

Patent number: 11763100

Abstract: A system is provided comprising a processor and a memory storing instructions which configure the processor to process an original sentence structure through an encoder neural network to decompose the original sentence structure into an original semantics component and an original syntax component, process the original syntax component through a syntax variational autoencoder (VAE) to receive a syntax mean vector and a syntax covariance matrix, obtain a sampled syntax vector from a syntax Gaussian posterior parameterized by the syntax mean vector and the syntax covariance matrix, process the original semantics component through a semantics VAE to receive a semantics mean vector and a semantics covariance matrix, obtain a sampled semantics vector from the Gaussian semantics posterior parameterized by the semantics mean vector and the semantics covariance matrix, and process the sampled syntax vector and the sampled semantics vector through a decoder neural network to compose a new sentence.

Type: Grant

Filed: May 22, 2020

Date of Patent: September 19, 2023

Assignee: ROYAL BANK OF CANADA

Inventors: Peng Xu, Yanshuai Cao, Jackie C. K. Cheung
Speaker recognition with assessment of audio frame contribution

Patent number: 11735191

Abstract: This application describes methods and apparatus for speaker recognition. An apparatus according to an embodiment has an analyzer for analyzing each frame of a sequence of frames of audio data which correspond to speech sounds uttered by a user to determine at least one characteristic of the speech sound of that frame. An assessment module determines, for each frame of audio data, a contribution indicator of the extent to which that frame of audio data should be used for speaker recognition processing based on the determined characteristic of the speech sound. Said contribution indicator comprises a weighting to be applied to each frame in the speaker recognition processing. In this way frames which correspond to speech sounds that are of most use for speaker discrimination may be emphasized and/or frames which correspond to speech sounds that are of least use for speaker discrimination may be de-emphasized.

Type: Grant

Filed: June 25, 2019

Date of Patent: August 22, 2023

Assignee: Cirrus Logic, Inc.

Inventors: John Paul Lesso, John Laurence Melanson
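
One simple way to apply a per-frame weighting of the kind the abstract above describes is a weighted average of per-frame feature vectors. This is only a sketch under that assumption: the function name `weighted_frame_average` is hypothetical, and the abstract does not say how the weights are derived or how they enter the speaker recognition processing.

```python
from typing import List, Sequence


def weighted_frame_average(frames: Sequence[Sequence[float]],
                           weights: Sequence[float]) -> List[float]:
    """Average frame feature vectors, scaling each frame by its contribution weight."""
    if not frames or len(frames) != len(weights):
        raise ValueError("need one weight per frame and at least one frame")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(frames[0])
    return [
        sum(w * frame[d] for frame, w in zip(frames, weights)) / total
        for d in range(dim)
    ]
```

A frame with weight 0 drops out of the result entirely, while a frame with a larger weight pulls the average toward its features.
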
Electronic device and method for controlling the same, and storage medium

Patent number: 11735167

Abstract: Disclosed is an electronic device recognizing an utterance voice in units of individual characters. The electronic device includes: a voice receiver; and a processor configured to: obtain a recognition character converted from a character section of a user voice received through the voice receiver, and recognize a candidate character having high acoustic feature related similarity with the character section among a plurality of acquired candidate characters as an utterance character of the character section based on a confusion possibility with the acquired recognition character.

Type: Grant

Filed: November 24, 2020

Date of Patent: August 22, 2023

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Jihun Park, Dongheon Seok
Display device and system comprising same

Patent number: 11704089

Abstract: A display device according to an embodiment of the present invention may comprise: a display unit for displaying a content image; a microphone for receiving voice commands from a user; a network interface unit for communicating with a natural language processing server and a search server; and a control unit for transmitting the received voice commands to the natural language processing server, receiving intention analysis result information indicating the user's intention corresponding to the voice commands from the natural language processing server, and performing a function of the display device according to the received intention analysis result information.

Type: Grant

Filed: January 11, 2021

Date of Patent: July 18, 2023

Assignee: LG ELECTRONICS INC.

Inventors: Sunki Min, Kiwoong Lee, Hyangjin Lee, Jeean Chang, Seunghyun Heo, Jaekyung Lee
Methods and apparatus to analyze performance of watermark encoding devices

Patent number: 11676073

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that analyze performance of manufacturer independent devices. An example apparatus includes a software development kit (SDK) deployment engine to deploy an SDK to a manufacturer of a device, the SDK to define heartbeat data to be collected from the device and interfacing techniques to transmit the heartbeat data to a measurement entity. In some examples, the apparatus includes a machine learning engine to predict whether the device is associated with one or more failure modes. The example apparatus also includes an alert generator to generate an alert based on a prediction, the alert to indicate at least one of a type of a first one of the failure modes or at least one component of the device to be remedied according to the first one of the one or more failure modes, and transmit the alert to a management agent.

Type: Grant

Filed: July 12, 2021

Date of Patent: June 13, 2023

Assignee: The Nielsen Company (US), LLC

Inventors: John T. Livoti, Susan Cimino, Stanley Wellington Woodruff, Rajakumar Madhanganesh, Alok Garg
Pre-training with alignments for recurrent neural network transducer based end-to-end speech recognition

Patent number: 11657799

Abstract: Techniques performed by a data processing system for training a Recurrent Neural Network Transducer (RNN-T) herein include encoder pretraining by training a neural network-based token classification model using first token-aligned training data representing a plurality of utterances, where each utterance is associated with a plurality of frames of audio data and tokens representing each utterance are aligned with frame boundaries of the plurality of audio frames; obtaining a first cross-entropy (CE) criterion from the token classification model, wherein the CE criterion represents a divergence between expected outputs and reference outputs of the model; pretraining an encoder of an RNN-T based on the first CE criterion; and training the RNN-T with second training data after pretraining the encoder of the RNN-T. These techniques also include whole-network pre-training of the RNN-T.

Type: Grant

Filed: April 3, 2020

Date of Patent: May 23, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Rui Zhao, Jinyu Li, Liang Lu, Yifan Gong, Hu Hu
Content analysis to enhance voice search

Patent number: 11636146

Abstract: Methods and apparatus for improving speech recognition accuracy in media content searches are described. An advertisement for a media content item is analyzed to identify keywords that may describe the media content item. The identified keywords are associated with the media content item for use during a voice search to locate the media content item. A user may speak one or more of the keywords as a search input and be provided with the media content item as a result of the search.

Type: Grant

Filed: September 23, 2019

Date of Patent: April 25, 2023

Assignee: Comcast Cable Communications, LLC

Inventor: George Thomas Des Jardins
Voice morphing apparatus having adjustable parameters

Patent number: 11600284

Abstract: A voice morphing apparatus having adjustable parameters is described. The disclosed system and method include a voice morphing apparatus that morphs input audio to mask a speaker's identity. Parameter adjustment uses evaluation of an objective function that is based on the input audio and output of the voice morphing apparatus. The voice morphing apparatus includes objectives that are based adversarially on speaker identification and positively on audio fidelity. Thus, the voice morphing apparatus is adjusted to reduce identifiability of speakers while maintaining fidelity of the morphed audio. The voice morphing apparatus may be used as part of an automatic speech recognition system.

Type: Grant

Filed: January 11, 2020

Date of Patent: March 7, 2023

Assignee: SOUNDHOUND, INC.

Inventor: Steve Pearson
Information processing apparatus, information processing method, and computer program product

Patent number: 11593621

Abstract: An information processing apparatus according to an embodiment includes one or more hardware processors. The hardware processors obtain a first categorical distribution sequence corresponding to first input data and obtain a second categorical distribution sequence corresponding to second input data neighboring the first input data, by using a prediction model outputting a categorical distribution sequence representing a sequence of L categorical distributions for a single input data piece, where L is a natural number of two or more. The hardware processors calculate, for each i of 1 to L, an inter-distribution distance between i-th categorical distributions in the first and second categorical distribution sequences. The hardware processors calculate a sum of L inter-distribution distances. The hardware processors update the prediction model's parameters to lessen the sum.

Type: Grant

Filed: January 24, 2020

Date of Patent: February 28, 2023

Assignees: KABUSHIKI KAISHA TOSHIBA, TOSHIBA DIGITAL SOLUTIONS CORPORATION

Inventor: Ryohei Tanaka
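
The quantity the abstract above minimizes, a sum over positions i = 1..L of a distance between the i-th categorical distributions of two sequences, can be sketched as follows. The Kullback-Leibler divergence is used here as the inter-distribution distance, which is an assumption; the abstract does not name a specific distance. The function names and the `eps` smoothing term are hypothetical.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two categorical distributions over the same categories."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def sequence_distance(seq_p: Sequence[Sequence[float]],
                      seq_q: Sequence[Sequence[float]]) -> float:
    """Sum the per-position distances between two length-L distribution sequences."""
    if len(seq_p) != len(seq_q):
        raise ValueError("both sequences must contain L distributions")
    return sum(kl_divergence(p, q) for p, q in zip(seq_p, seq_q))
```

In a training loop this sum would act as a loss term computed on an input and a neighboring input, so that lowering it pushes the model toward similar predictions for both.
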
Methods for natural language model training in natural language understanding (NLU) systems

Patent number: 11574127

Abstract: Systems and methods for training a classifier binary model of a natural language understanding (NLU) system are disclosed herein. A determination is made as to whether a text string, with a content entity, includes an obsequious expression. In response to determining the text string includes an obsequious expression, a determination is made as to whether the obsequious expression describes the content entity. The model is trained based on a determination of at least one of: an absence of an obsequious expression in response to determining the obsequious expression describes the content entity; a presence of an obsequious expression in response to determining the obsequious expression describes the content entity; an absence of an obsequious expression in response to determining the obsequious expression does not describe the content entity, and a presence of an obsequious expression in response to determining the obsequious expression does not describe the content entity.

Type: Grant

Filed: February 28, 2020

Date of Patent: February 7, 2023

Assignee: Rovi Guides, Inc.

Inventors: Jeffry Copps Robert Jose, Mithun Umesh
Error correction method and device for search term

Patent number: 11574012

Abstract: The present application provides an error correction method and device for search terms. The method comprises: identifying an incorrect search term; calculating weighted edit distances between the search term and pre-obtained hot terms by using a weighted edit distance algorithm, wherein, during the calculation of the weighted edit distances, different weights are set respectively for the following operations of transforming from the search term to the hot terms: an operation of inserting characters, an operation of deleting characters, an operation of replacing by characters with similar appearance or pronunciation, an operation of replacing by characters with dissimilar appearance or pronunciation, and an operation of exchanging characters; and selecting a predetermined number of hot terms based on the weighted edit distances and popularity of the hot terms for error correction prompt. The method and device of the present application can improve the error correction accuracy for erroneous search terms.

Type: Grant

Filed: August 14, 2017

Date of Patent: February 7, 2023

Assignee: BEIJING QIYI CENTURY SCIENCE & TECHNOLOGY CO., LTD.

Inventors: Jun Hu, Yingjie Chen, Tianchang Wang, Chengcan Ye
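
The weighted edit distance described in the abstract above can be sketched with a standard dynamic-programming table that charges a different cost for each of the five operations. All weight values, the `similar` set of look-alike or sound-alike character pairs, and the way distance and popularity are combined in `suggest` (sort by distance, break ties by popularity) are assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, FrozenSet, List, Set


def weighted_edit_distance(src: str, dst: str, similar: Set[FrozenSet[str]],
                           w_ins: float = 1.0, w_del: float = 1.0,
                           w_sim: float = 0.5, w_dis: float = 1.0,
                           w_swap: float = 0.8) -> float:
    """Cost of turning src into dst with per-operation weights."""
    n, m = len(src), len(dst)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + w_del
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + w_ins
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a, b = src[i - 1], dst[j - 1]
            if a == b:
                sub = 0.0
            elif frozenset((a, b)) in similar:
                sub = w_sim   # replace by a similar-looking/sounding character
            else:
                sub = w_dis   # replace by a dissimilar character
            best = min(d[i - 1][j] + w_del, d[i][j - 1] + w_ins, d[i - 1][j - 1] + sub)
            # Exchange of two adjacent characters.
            if i > 1 and j > 1 and a != b and a == dst[j - 2] and src[i - 2] == b:
                best = min(best, d[i - 2][j - 2] + w_swap)
            d[i][j] = best
    return d[n][m]


def suggest(term: str, hot_terms: Dict[str, int],
            similar: Set[FrozenSet[str]], k: int = 3) -> List[str]:
    """Pick k hot terms: smallest distance first, more popular first on ties."""
    ranked = sorted(
        hot_terms,
        key=lambda t: (weighted_edit_distance(term, t, similar), -hot_terms[t]),
    )
    return ranked[:k]
```

Giving similar-character replacements a lower weight than dissimilar ones means a typo such as one confusable letter ranks closer to the intended hot term than an arbitrary substitution would.
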
Robust checkpoint selection for monotonic autoregressive seq2seq neural generative models

Patent number: 11557274

Abstract: Embodiments may provide improved techniques to assess model checkpoint stability on unseen data on-the-fly, so as to prevent unstable checkpoints from being saved, and to avoid or reduce the need for an expensive thorough evaluation. For example, a method may comprise passing a set of input sequences through a checkpoint of a sequence to sequence model in inference mode to obtain a set of generated sequences of feature vectors, determining whether each of a plurality of generated sequences of feature vectors is complete, counting a number of incomplete generated sequences of feature vectors among the plurality of generated sequences of feature vectors, generating a score indicating a stability of the model based on the count of incomplete generated sequences of feature vectors, and storing the model checkpoint when the score indicating the stability of the model is above a predetermined threshold.

Type: Grant

Filed: March 15, 2021

Date of Patent: January 17, 2023

Assignee: International Business Machines Corporation

Inventor: Vyacheslav Shechtman
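
The checkpoint gate in the abstract above (count incomplete generated sequences, turn the count into a stability score, store the checkpoint only above a threshold) can be sketched as follows. The score used here, the fraction of complete sequences, is an assumed formula; the abstract only says the score is based on the count. How completeness is decided for each sequence is left to the caller, and the names and the 0.95 threshold are hypothetical.

```python
from typing import Sequence


def stability_score(completed: Sequence[bool]) -> float:
    """Fraction of generated sequences that were judged complete."""
    total = len(completed)
    if total == 0:
        raise ValueError("need at least one generated sequence")
    incomplete = sum(1 for done in completed if not done)
    return 1.0 - incomplete / total


def should_store(completed: Sequence[bool], threshold: float = 0.95) -> bool:
    """Store the checkpoint only when the stability score is above the threshold."""
    return stability_score(completed) > threshold
```

Because the check needs only inference on a small set of unseen inputs, it can run at each checkpoint without a full evaluation pass.
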
Speech recognition method and apparatus

Patent number: 11557286

Abstract: A speech recognition method includes receiving speech data, obtaining, from the received speech data, a candidate text including at least one word and a phonetic symbol sequence associated with a pronunciation of a target word included in the received speech data, using a speech recognition model, replacing the phonetic symbol sequence included in the candidate text with a replacement word corresponding to the phonetic symbol sequence, and determining a target text corresponding to the received speech data based on a result of the replacing.

Type: Grant

Filed: December 30, 2019

Date of Patent: January 17, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventor: Jihyun Lee
Background audio identification for speech disambiguation

Patent number: 11557280

Abstract: Implementations relate to techniques for providing context-dependent search results. A computer-implemented method includes receiving an audio stream at a computing device during a time interval, the audio stream comprising user speech data and background audio, separating the audio stream into a first substream that includes the user speech data and a second substream that includes the background audio, identifying concepts related to the background audio, generating a set of terms related to the identified concepts, influencing a speech recognizer based on at least one of the terms related to the background audio, and obtaining a recognized version of the user speech data using the speech recognizer.

Type: Grant

Filed: November 23, 2020

Date of Patent: January 17, 2023

Assignee: Google LLC

Inventors: Jason Sanders, Gabriel Taubman, John J. Lee
Electronic device for processing user utterance and controlling method thereof

Patent number: 11538470

Abstract: A system includes at least one communication interface, at least one processor operatively connected to the at least one communication interface, and at least one memory operatively connected to the at least one processor and storing a plurality of natural language understanding (NLU) models. The at least one memory stores instructions that, when executed, cause the processor to receive first information associated with a user from an external electronic device associated with a user account, using the at least one communication interface, to select at least one of the plurality of NLU models, based on at least part of the first information, and to transmit the selected at least one NLU model to the external electronic device, using the at least one communication interface such that the external electronic device uses the selected at least one NLU model for natural language processing.

Type: Grant

Filed: June 29, 2020

Date of Patent: December 27, 2022

Assignee: Samsung Electronics Co., Ltd.

Inventors: Sean Minsung Kim, Jaeyung Yeo
Utilizing a joint-learning self-distillation framework for improving text sequential labeling machine-learning models

Patent number: 11537950

Abstract: This disclosure describes one or more implementations of a text sequence labeling system that accurately and efficiently utilize a joint-learning self-distillation approach to improve text sequence labeling machine-learning models. For example, in various implementations, the text sequence labeling system trains a text sequence labeling machine-learning teacher model to generate text sequence labels. The text sequence labeling system then creates and trains a text sequence labeling machine-learning student model utilizing the training and the output of the teacher model. Upon the student model achieving improved results over the teacher model, the text sequence labeling system re-initializes the teacher model with the learned model parameters of the student model and repeats the above joint-learning self-distillation framework. The text sequence labeling system then utilizes a trained text sequence labeling model to generate text sequence labels from input documents.

Type: Grant

Filed: October 14, 2020

Date of Patent: December 27, 2022

Assignee: Adobe Inc.

Inventors: Trung Bui, Tuan Manh Lai, Quan Tran, Doo Soon Kim
Method and system for processing speech signal

Patent number: 11538488

Abstract: Embodiments of the present disclosure provide methods and systems for processing a speech signal. The method can include: processing the speech signal to generate a plurality of speech frames; generating a first number of acoustic features based on the plurality of speech frames using a frame shift at a given frequency; and generating a second number of posteriori probability vectors based on the first number of acoustic features using an acoustic model, wherein each of the posteriori probability vectors comprises probabilities of the acoustic features corresponding to a plurality of modeling units, respectively.

Type: Grant

Filed: November 27, 2019

Date of Patent: December 27, 2022

Assignee: Alibaba Group Holding Limited

Inventors: Shiliang Zhang, Ming Lei, Wei Li, Haitao Yao
Extracting natural language semantics from speech without the use of speech recognition

Patent number: 11508355

Abstract: Systems and methods are disclosed herein for discerning aspects of user speech to determine user intent and/or other acoustic features of a sound input without the use of an ASR engine. To this end, a processor may receive a sound signal comprising raw acoustic data from a client device, and divide the data into acoustic units. The processor feeds the acoustic units through a first machine learning model to obtain a first output and determines a first mapping, using the first output, of each respective acoustic unit to a plurality of candidate representations of the respective acoustic unit. The processor feeds each candidate representation of the plurality through a second machine learning model to obtain a second output, determines a second mapping, using the second output, of each candidate representation to a known condition, and determines a label for the sound signal based on the second mapping.

Type: Grant

Filed: October 26, 2018

Date of Patent: November 22, 2022

Assignee: Interactions LLC

Inventors: Ryan Price, Srinivas Bangalore
Contact resolution for communications systems

Patent number: 11495224

Abstract: Methods and systems for performing contact resolution are described herein. When initiating a communications session using a voice activated electronic device, a contact name may be resolved to determine an appropriate contact to which the communications session may be directed. Contacts from an individual's contact list may be queried to determine a listing of probable contacts associated with the contact name, and contact identifiers associated with the contact may be determined. Using one or more rules for disambiguating between similar contact names, a single contact may be identified, and a communications session with that contact may be initiated.

Type: Grant

Filed: May 19, 2020

Date of Patent: November 8, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Someshwaran Elangovan, Aparna Nandyal, Venkatesh Kancharla, Arun Rajendran, Sumedha Arvind Kshirsagar, Christopher Geiger Parker
Display apparatus and method for registration of user command

Patent number: 11495228

Abstract: An apparatus including a user input receiver; a user voice input receiver; a display; and a processor. The processor is configured to: (a) based on a user input being received through the user input receiver, perform a function corresponding to voice input state for receiving a user voice input; (b) receive a user voice input through the user voice input receiver; (c) identify whether or not a text corresponding to the received user voice input is related to a pre-registered voice command or a prohibited expression; and (d) based on the text being related to the pre-registered voice command or the prohibited expression, control the display to display an indicator that the text is related to the pre-registered voice command or the prohibited expression. A method and non-transitory computer-readable medium are also provided.

Type: Grant

Filed: November 30, 2020

Date of Patent: November 8, 2022

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Nam-yeong Kwon, Kyung-mi Park
Electronic device, method, and computer program which support naming

Patent number: 11481556

Abstract: A naming support system is provided that includes a processing unit that receives first language name information input from a user, determines name evaluation's basic information about the first language name information, and generates and transmits name's evaluation information to an output unit based on a target language which includes at least one of a plurality of languages for the name evaluation's basic information, wherein the first language name information includes at least one of character notation information of a first language name, pronunciation information of a first language name, or desired information for a first language name, and the name evaluation's basic information includes the first language name information.

Type: Grant

Filed: April 29, 2020

Date of Patent: October 25, 2022

Inventor: Chul Hwan Jung
Fault diagnosis method of reciprocating machinery based on keyphasor-free complete-cycle signal

Patent number: 11454567

Abstract: The present disclosure relates to a fault diagnosis method for reciprocating machinery based on a keyphasor-free complete-cycle signal. The method includes the following steps: 1) building a complete-cycle vibration signal image library; 2) training an image recognition model; 3) acquiring complete-cycle data on a keyphasor-free basis; 4) building an automatic feature extraction model; and 5) inputting a hidden layer feature of an autoencoder into a support vector machine (SVM) classifier to obtain a diagnosis result. By using a deep cascade convolutional neural network (CNN), the present disclosure achieves the goal of complete-cycle data acquisition on a keyphasor-free basis, and solves the problems that traditional intelligent fault diagnosis relies on a keyphasor signal and that real-time diagnosis fails due to insufficient installation space. In addition, by using an autoencoder for automatic feature extraction, the present disclosure avoids manual feature selection and reduces labor costs.

Type: Grant

Filed: February 2, 2021

Date of Patent: September 27, 2022

Assignee: Beijing University of Chemical Technology

Inventors: Jinjie Zhang, Zhinong Jiang, Haipeng Zhao, Zhiwei Mao, Kun Chang
Systems and methods for improved accuracy of bullying or altercation detection or identification of excessive machine noise

Patent number: 11450327

Abstract: Systems and methods for identifying potential bullying are disclosed. In various aspects, a system for identifying potential bullying includes a sound detector configured to provide samples of sounds over time, a processor, and a memory storing instructions. The instructions, when executed by the processor, cause the system to determine that a noise event has occurred by processing the samples to determine that the sounds exceed a sound level threshold over a time period that exceeds a time period threshold, process the samples to provide frequency spectrum information of the noise event, determine whether the noise event is a potential bullying occurrence based on comparing the frequency spectrum information of the noise event and at least one frequency spectrum profile, and initiate a bullying notification in a case of determining that the noise event is a potential bullying occurrence.

Type: Grant

Filed: April 20, 2021

Date of Patent: September 20, 2022

Assignee: SOTER TECHNOLOGIES, LLC

Inventor: Cary Chu
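
The noise-event test described in the abstract of patent 11450327 above (sound level above a threshold for longer than a time threshold) can be sketched as follows. This is only an illustration under stated assumptions: the samples are taken to be sound levels in decibels at a fixed sampling period, and the function name, parameter names, and example values are hypothetical. The frequency-spectrum comparison that the abstract also describes is not shown.

```python
def find_noise_events(levels_db, sample_period_s, level_threshold_db, min_duration_s):
    """Return (start, end) index pairs for runs of loud samples.

    A run qualifies when every sample exceeds level_threshold_db and the
    run lasts longer than min_duration_s. The end index is exclusive.
    """
    events = []
    run_start = None
    for i, level in enumerate(levels_db):
        if level > level_threshold_db:
            if run_start is None:
                run_start = i  # a loud run begins here
        elif run_start is not None:
            # The run just ended; keep it only if it lasted long enough.
            if (i - run_start) * sample_period_s > min_duration_s:
                events.append((run_start, i))
            run_start = None
    # Handle a run that is still open at the end of the data.
    if run_start is not None:
        if (len(levels_db) - run_start) * sample_period_s > min_duration_s:
            events.append((run_start, len(levels_db)))
    return events


if __name__ == "__main__":
    # Hypothetical readings, one every 0.5 s.
    levels = [50, 80, 82, 85, 81, 50, 90, 50]
    print(find_noise_events(levels, 0.5, 75, 1.5))
```

Requiring a minimum duration keeps a single short spike, such as a dropped book, from being flagged as a noise event.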
Method for generating filled pause detecting model corresponding to new domain and device therefor

Patent number: 11443735

Abstract: Disclosed are a method and a device for generating a filled pause detection model using a small amount of speech data included in a new domain in a 5G communication environment by executing artificial intelligence (AI) algorithms and/or machine learning algorithms provided therein. According to the present disclosure, the filled pause detection model generating method for a new domain may include constructing a filled pause detection model for the new domain, determining the initial model parameter for the filled pause detection model for the new domain by combining model parameters for filled pause detection models of a plurality of existing domains, and training the filled pause detection model of the new domain in which the initial model parameter is set, using speech data from the new domain as training data.

Type: Grant

Filed: March 2, 2020

Date of Patent: September 13, 2022

Assignee: LG ELECTRONICS INC.

Inventor: Yun Jin Lee
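
The abstract of patent 11443735 above says only that the initial parameters for the new domain are obtained by "combining" the parameters of existing-domain models. One simple way to combine them, assumed here purely for illustration, is a weighted average. The function name, the dictionary-of-lists parameter layout, and the uniform default weights are hypothetical and are not taken from the patent.

```python
def combine_parameters(domain_params, weights=None):
    """Weighted average of per-domain parameter sets.

    domain_params: list of dicts mapping a parameter name to a list of floats.
    weights: one weight per domain; defaults to a uniform average (assumption).
    """
    if not domain_params:
        raise ValueError("at least one existing domain is required")
    if weights is None:
        weights = [1.0 / len(domain_params)] * len(domain_params)
    if len(weights) != len(domain_params):
        raise ValueError("need exactly one weight per domain")
    combined = {}
    for name, reference in domain_params[0].items():
        # Average each parameter element across all domains.
        combined[name] = [
            sum(w * params[name][k] for w, params in zip(weights, domain_params))
            for k in range(len(reference))
        ]
    return combined


if __name__ == "__main__":
    # Hypothetical parameters from two existing domains.
    domain_a = {"w": [1.0, 3.0]}
    domain_b = {"w": [3.0, 5.0]}
    print(combine_parameters([domain_a, domain_b]))
```

The combined values would then serve as the starting point for training on the small amount of speech data available in the new domain.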
Multi-tap minimum variance distortionless response beamformer with neural networks for target speech separation

Patent number: 11423906

Abstract: A method, computer system, and computer readable medium are provided for automatic speech recognition. Video data and audio data corresponding to one or more speakers are received. A minimum variance distortionless response function is applied to the received audio and video data. A predicted target waveform corresponding to a target speaker from among the one or more speakers is generated based on back-propagating the output of the applied minimum variance distortionless response function.

Type: Grant

Filed: July 10, 2020

Date of Patent: August 23, 2022

Assignee: TENCENT AMERICA LLC

Inventors: Yong Xu, Meng Yu, Shi-Xiong Zhang, Chao Weng, Jianming Liu, Dong Yu
Synthesizing patient-specific speech models

Patent number: 11417342

Abstract: An apparatus includes a communication interface and a processor. The processor is configured to receive, via the communication interface, a plurality of speech samples {u_m^0}, m = 1, ..., M, which were uttered by a subject while in a first state with respect to a disease, and using {u_m^0} and at least one reference discriminator, which is not specific to the subject, synthesize a subject-specific discriminator, which is specific to the subject and is configured to generate, in response to one or more test utterances uttered by the subject, an output indicating a likelihood that the subject is in a second state with respect to the disease. Other embodiments are also described.

Type: Grant

Filed: June 29, 2020

Date of Patent: August 16, 2022

Assignee: CORDIO MEDICAL LTD.

Inventor: Ilan D. Shallom
Public speaking trainer with 3-D simulation and real-time feedback

Patent number: 11403961

Abstract: A public speaking trainer has a computer system including a display monitor. A microphone is coupled to the computer system. A video capture device is coupled to the computer system. A biometric device is coupled to the computer system. A simulated environment including a simulated audience member is rendered on the display monitor using the computer system. A presentation is recorded onto the computer system using the microphone and video capture device. A first feature of the presentation is extracted based on data from the microphone and video capture device while recording the presentation. A metric is calculated based on the first feature. The simulated audience member is animated in response to a change in the metric. A score is generated based on the metric. The score is displayed on the display monitor of the computer system after recording the presentation. A training video is suggested based on the score.

Type: Grant

Filed: September 6, 2019

Date of Patent: August 2, 2022

Assignee: PITCHVANTAGE LLC

Inventors: Anindya Gupta, Yegor Makhiboroda, Brad H. Story
Verbal periodic screening for heart disease

Patent number: 11398243

Abstract: A method for analyzing a voice sample of a subject to determine a cardiac condition, for example, an arrhythmic condition, comprising extracting at least one voice feature from the voice sample, detecting an effect of the cardiac condition on the at least one voice feature and determining the cardiac condition based on the effect. Also disclosed is a system for determining the cardiac condition in the voice sample provided by the subject. Related apparatus and methods are also described.

Type: Grant

Filed: February 12, 2018

Date of Patent: July 26, 2022

Assignee: CardioKol Ltd.

Inventors: Yirmiyahu Hauptman, Alon Goren, Eli Attar, Pinhas Sabach
Methods for natural language model training in natural language understanding (NLU) systems

Patent number: 11392771

Abstract: Systems and methods for training a natural language model of a natural language understanding (NLU) system are disclosed herein. A text string including at least a content entity is received. A determination is made as to whether the text string includes an obsequious expression. In response to determining the text string includes an obsequious expression, a determination is made as to whether the obsequious expression describes the content entity. A query is forwarded in response to determining the text string includes an obsequious expression and determining whether the obsequious expression describes the content entity. In response to determining the obsequious expression describes the content entity, the query includes the obsequious expression, and in response to determining the obsequious expression does not describe the content entity, the query does not include the obsequious expression.

Type: Grant

Filed: February 28, 2020

Date of Patent: July 19, 2022

Assignee: Rovi Guides, Inc.

Inventors: Jeffry Copps Robert Jose, Mithun Umesh
Methods for natural language model training in natural language understanding (NLU) systems

Patent number: 11393455

Abstract: Systems and methods for generating a query using a trained natural language model of a natural language understanding (NLU) system are disclosed herein. A text string including at least a content entity is received. A determination is made as to whether the text string includes an obsequious expression. In response to determining the text string includes an obsequious expression, a determination is made as to whether the obsequious expression describes the content entity. In response to determining whether the obsequious expression describes the content entity, the query is generated. In response to determining the obsequious expression describes the content entity, the content entity and the obsequious expression are included in the query, and in response to determining the obsequious expression does not describe the content entity, the content entity is included in the query and the obsequious expression is excluded from the query.

Type: Grant

Filed: February 28, 2020

Date of Patent: July 19, 2022

Assignee: ROVI GUIDES, INC.

Inventors: Jeffry Copps Robert Jose, Mithun Umesh
System and method for providing real-time feedback of remote collaborative communication

Patent number: 11386899

Abstract: A system and method for providing real-time feedback of remote collaborative communication includes: processing first audio signals to extract first speech-related features therefrom; processing first EEG signals to extract first brain activity features therefrom; processing second audio signals to extract second speech-related features therefrom; processing second EEG signals to extract second brain activity features therefrom; processing the first and second speech-related features to determine if the speech from the first and second users exhibits positive or negative vocal entrainment; processing the first and second brain activity features to determine if the brain activity of the first and second users is aligned or misaligned; and generating feedback, on at least one display device, that indicates if the speech from the first and second users exhibits positive or negative vocal entrainment and if the brain activity of the first and second users is aligned or misaligned.

Type: Grant

Filed: August 4, 2020

Date of Patent: July 12, 2022

Assignee: HONEYWELL INTERNATIONAL INC.

Inventors: Nichola Lubold, Santosh Mathan
Using machine learning to correct the output of an automatic speech recognition system

Patent number: 11355122

Abstract: In some examples, a software agent executing on a server receives an utterance from a customer. The software agent converts the utterance to text. The software agent creates an audio representation of the text and performs a comparison of the audio representation and the utterance. The software agent creates edited text based on the comparison. For example, the software agent may determine, based on the comparison, audio differences between the audio representation and the utterance, create a sequence of edit actions based on the audio differences, and apply the sequence of edit actions to the text to create the edited text. The software agent outputs the edited text as a dialog response to the utterance.

Type: Grant

Filed: September 1, 2021

Date of Patent: June 7, 2022

Assignee: ConverseNowAI

Inventors: Fernando Ezequiel Gonzalez, Vinay Kumar Shukla, Rahul Aggarwal, Vrajesh Navinchandra Sejpal, Leonardo Cordoba, Julia Milanese, Zubair Talib, Matias Grinberg
Emotion estimation system and non-transitory computer readable medium

Patent number: 11355140

Abstract: An emotion estimation system includes a feature amount extraction unit, a vowel section specification unit, and an estimation unit. The feature amount extraction unit analyzes recorded produced speech to extract a predetermined feature amount. The vowel section specification unit specifies, based on the feature amount extracted by the feature amount extraction unit, a section in which a vowel is produced. The estimation unit estimates, based on the feature amount in a vowel section specified by the vowel section specification unit, an emotion of a speaker.

Type: Grant

Filed: July 1, 2019

Date of Patent: June 7, 2022

Assignee: FUJIFILM Business Innovation Corp.

Inventor: Xuan Luo
Operation receiving apparatus, control method, image forming system, and recording medium

Patent number: 11350001

Abstract: An operation receiving apparatus includes: a display; a user interface that overlaps with the display and receives a manual operation by a user; and a controller that acquires a recognition result of a user voice, and controls the display to display a first display region and a second display region. The first display region displays an operation item that can receive an instruction by the manual operation, and the second display region displays an operation item that can be instructed by the user voice based on the recognition result.

Type: Grant

Filed: March 18, 2020

Date of Patent: May 31, 2022

Assignee: Konica Minolta, Inc.

Inventor: Tomoko Kuroiwa
Methods and systems for managing virtual assistants in multiple device environments based on user movements

Patent number: 11281727

Abstract: Embodiments for managing virtual assistants are described. Information associated with a user in an internet of things (IoT) device environment having a plurality of IoT devices is received. A request from the user is received. In response to the receiving of the request, a first portion of a response to the request is caused to be rendered utilizing a first of the plurality of IoT devices. Movement of the user within the IoT device environment is detected. In response to the detecting of the movement of the user, a second portion of the response to the request is caused to be rendered utilizing a second of the plurality of IoT devices based on said detected movement of the user and said received information about the user.

Type: Grant

Filed: July 3, 2019

Date of Patent: March 22, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Zachary Silverstein, Robert Grant, Ruchika Bengani, Sarbajit Rakshit
Audio interval detection apparatus, method, and recording medium to eliminate a specified interval that does not represent speech based on a divided phoneme

Patent number: 11276390

Abstract: An audio interval detection apparatus has a processor and a storage storing instructions that, when executed by the processor, control the processor to: detect, from a target audio signal, a specified audio interval including a specified audio signal representing a state of a phoneme of a same consonant produced continuously over a period longer than a specified time, and, by eliminating, from the target audio signal, at least the detected specified audio interval, detect from the target audio signal an utterance audio interval that includes a speech utterance signal representing a speech utterance uttered by a speaker.

Type: Grant

Filed: March 13, 2019

Date of Patent: March 15, 2022

Assignee: CASIO COMPUTER CO., LTD.

Inventor: Hiroki Tomita
System and method for input recognition linguistic resource management

Patent number: 11262909

Abstract: A system, method and computer program product for use in providing a linguistic resource for input recognition of multiple input types on a computing device are provided. The computing device is connected to an input interface. A user is able to provide input by applying pressure to or gesturing above the input interface using a finger or an instrument such as a stylus or pen. The computing device has an input management system for recognizing the input. The input management system is configured to allow setting, in the computing device memory, parameters of a linguistic resource for a language model of one or more languages, and cause recognition of input to the input interface of the different input types using the linguistic resource. The resource parameters are set to optimize recognition performance characteristics of each input type while providing the linguistic resource with a pre-determined size.

Type: Grant

Filed: July 21, 2016

Date of Patent: March 1, 2022

Assignee: MYSCRIPT

Inventors: Ali Reza Ebadat, Lois Rigouste
