Patents Examined by Farzad Kazeminezhad

Voice user interface for intervening in conversation of at least one user by adjusting two different thresholds

Patent number: 11594224

Abstract: An electronic device is provided. The electronic device includes a memory configured to store at least one instruction, and at least one processor where the at least one processor is configured to execute the instruction to obtain voice data from a conversation of at least one user, convert the voice data to text data, determine at least one parameter indicating characteristic of the conversation based on at least one of the voice data or the text data, adjust a condition for triggering intervention in the conversation based on the determined at least one parameter, and output a feedback based on the text data when the adjusted condition is satisfied, wherein the adjustment of the condition includes adjusting a first and a second threshold based on change of the at least one parameter.

Type: Grant

Filed: December 4, 2020

Date of Patent: February 28, 2023

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Kun Wang, Yanfang Pan, Yazhi Zhao, Lin Ding, Yueyue Jiang, Xu Fan, Bo Peng
Speech translation method electronic device and computer-readable storage medium using SEQ2SEQ for determining alternative translated speech segments

Patent number: 11586831

Abstract: Provided are a speech translation method and apparatus, an electronic device and a storage medium. The method includes: acquiring a source speech corresponding to a to-be-translated language; acquiring a specified target language; inputting the source speech and indication information matched with the target language into a pre-trained speech translation model, where the speech translation model is configured to translate a language in a first language set into a language in a second language set, the first language set includes a plurality of languages, the first language set includes the to-be-translated language, the second language set includes a plurality of languages, and the second language set includes the target language; and acquiring a translated speech corresponding to the target language and output by the speech translation model; where the to-be-translated language is different from the target language.

Type: Grant

Filed: February 26, 2021

Date of Patent: February 21, 2023

Assignee: BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD.

Inventors: Mingxuan Wang, Qianqian Dong, Lei Li
Signal processing apparatus, communication system, method performed by signal processing apparatus, storage medium for signal processing apparatus, method performed by communication terminal, and storage medium for communication terminal to receive text data from another communication terminal in response to a unique texting completion notice

Patent number: 11568154

Abstract: According to one embodiment, a signal processing apparatus correlates a plurality of communication terminals as a group and enables one-to-many communications in the group. The signal processing apparatus includes processing circuitry. The processing circuitry assigns a transmission right to one of the communication terminals in the group. The processing circuitry generates text data based on voice data from said one of the communication terminals in possession of the transmission right. The processing circuitry gives a texting completion notice indicative of completion of texting processing to the communication terminals in the group. The processing circuitry transmits, after the texting completion notice is given, the generated text data to at least one of the communication terminals in the group.

Type: Grant

Filed: July 23, 2019

Date of Patent: January 31, 2023

Assignee: Science Arts, Inc.

Inventors: Hidekazu Hiraoka, Kazuaki Okimoto, Katsumi Yokomichi
Techniques for separating driving emotion from media induced emotion using an additive/subtractive, conjunctive, disjunctive, or Bayesian technique in a driver monitoring system

Patent number: 11532319

Abstract: One or more embodiments include an emotion analysis system for computing and analyzing emotional state of a user. The emotion analysis system acquires, via at least one sensor, sensor data associated with a user. The emotion analysis system determines, based on the sensor data, an emotional state associated with a user. The emotion analysis system determines a first component of the emotional state that corresponds to media content being accessed by the user. The emotion analysis system applies a first function to the emotional state to remove the first component from the emotional state.

Type: Grant

Filed: March 16, 2020

Date of Patent: December 20, 2022

Assignee: Harman International Industries, Incorporated

Inventors: Stefan Marti, Joseph Verbeke
Lip-language identification method and apparatus, and augmented reality (AR) device and storage medium which identifies an object based on an azimuth angle associated with the AR field of view

Patent number: 11527242

Abstract: A lip-language identification method and an apparatus thereof, an augmented reality device and a storage medium. The lip-language identification method includes: acquiring a sequence of face images for an object to be identified; performing lip-language identification based on a sequence of face images so as to determine semantic information of speech content of the object to be identified corresponding to lip actions in a face image; and outputting the semantic information.

Type: Grant

Filed: April 24, 2019

Date of Patent: December 13, 2022

Assignee: Beijing BOE Technology Development Co., Ltd.

Inventors: Naifu Wu, Xitong Ma, Lixin Kou, Sha Feng
Apparatus and method for encoding an audio signal using compensation values between three spectral bands

Patent number: 11521628

Abstract: An apparatus for encoding an audio signal includes: a core encoder for core encoding first audio data in a first spectral band; a parametric coder for parametrically coding second audio data in a second spectral band being different from the first spectral band, wherein the parametric coder includes: an analyzer for analyzing first audio data in the first spectral band to obtain a first analysis result and for analyzing second audio data in the second spectral band to obtain a second analysis result; a compensator for calculating a compensation value using the first analysis result and the second analysis result; and a parameter calculated for calculating a parameter from the second audio data in the second spectral band using the compensation value.

Type: Grant

Filed: February 22, 2019

Date of Patent: December 6, 2022

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Sascha Disch, Franz Reutelhuber, Jan Buethe, Markus Multrus, Bernd Edler
Adaptively modifying dialog output by an artificial intelligence engine during a conversation with a customer based on changing the customer's negative emotional state to a positive one

Patent number: 11514894

Abstract: In some examples, a server may receive an utterance from a customer. The utterance may be included in a conversation between the artificial intelligence engine and the customer. The server may convert the utterance to text and determine a customer intent based on the text and a user history. The server may determine a user model of the customer based on the text and the customer intent. The server may update a conversation state associated with the conversation based on the customer intent and the user model. The server may determine a user state based on the user model and the conversation state. The server may select, using a reinforcement learning based module, a particular action from a set of actions, the particular action including a response and provide the response to the customer.

Type: Grant

Filed: November 18, 2021

Date of Patent: November 29, 2022

Assignee: ConverseNowAI

Inventors: Vrajesh Navinchandra Sejpal, Akshay Labh Kayastha, Yuganeshan A J, Pranav Nirmal Mehra, Rahul Aggarwal, Vinay Kumar Shukla, Zubair Talib
Electronic device and method for controlling the electronic device thereof based on determining intent of a user speech in a first language machine translated into a predefined second language

Patent number: 11501089

Abstract: An electronic device and a method for controlling the electronic device thereof are provided. The electronic device includes a memory storing instructions, and a processor configured to control the electronic device by executing the instructions stored in the memory, and the processor is configured to, based on a user's speech being input, acquire a first sentence in a first language corresponding to the user's speech through a speech recognition model corresponding to a language of the user's speech, acquire a second sentence in a second language corresponding to the first sentence in the first language through a machine translation model trained to translate a plurality of languages into the predefined second language, and acquire a control instruction of the electronic device corresponding to the acquired second sentence or acquire a response to the second sentence through a natural language understanding model trained based on the second language.

Type: Grant

Filed: April 2, 2020

Date of Patent: November 15, 2022

Assignee: Samsung Electronics Co., Ltd.

Inventors: Jiwan Kim, Seungsoo Kang, Jongyoub Ryu, Soyoon Park, Sangha Kim, Hakjung Kim, Myungjin Eom
Lip language recognition method and mobile terminal using sound and silent modes

Patent number: 11495231

Abstract: A lip language recognition method, applied to a mobile terminal having a sound mode and a silent mode, includes: training a deep neural network in the sound mode; collecting a user's lip images in the silent mode; and identifying content corresponding to the user's lip images with the deep neural network trained in the sound mode. The method further includes: switching from the sound mode to the silent mode when a privacy need of the user arises.

Type: Grant

Filed: November 26, 2018

Date of Patent: November 8, 2022

Assignee: BEIJING BOE TECHNOLOGY DEVELOPMENT CO., LTD.

Inventors: Lihua Geng, Xitong Ma, Zhiguo Zhang
System for creating speaker model based on vocal sounds for a speaker recognition system, computer program product, and controller, using two neural networks

Patent number: 11495235

Abstract: According to one embodiment, a system for creating a speaker model includes one or more processors. The processors change a part of network parameters from an input layer to a predetermined intermediate layer based on a plurality of patterns and inputs a piece of speech into each of neural networks so as to obtain a plurality of outputs from the intermediate layer. The part of network parameters of the each of the neural networks is changed based on one of the plurality of patterns. The processors create a speaker model with respect to one or more words detected from the speech based on the outputs.

Type: Grant

Filed: March 8, 2019

Date of Patent: November 8, 2022

Assignee: Kabushiki Kaisha Toshiba

Inventor: Hiroshi Fujimura
Apparatus and method for encoding and decoding of integrated speech and audio utilizing a band expander with a spectral band replication (SBR) to output the SBR to either time or transform domain encoding according to the input signal

Patent number: 11456002

Abstract: Provided are an apparatus and a method for integrally encoding and decoding a speech signal and a audio signal. The encoding apparatus may include: an input signal analyzer to analyze a characteristic of an input signal; a first conversion encoder to convert the input signal to a frequency domain signal, and to encode the input signal when the input signal is a audio characteristic signal; a Linear Predictive Coding (LPC) encoder to perform LPC encoding of the input signal when the input signal is a speech characteristic signal; and a bitstream generator to generate a bitstream using an output.

Type: Grant

Filed: September 11, 2020

Date of Patent: September 27, 2022

Assignees: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION

Inventors: Tae Jin Lee, Seung-Kwon Baek, Min Je Kim, Dae Young Jang, Jeongil Seo, Kyeongok Kang, Jin-Woo Hong, Hochong Park, Young-cheol Park
Systems and methods for generating synthesized speech responses to voice inputs by training a neural network model based on the voice input prosodic metrics and training voice inputs

Patent number: 11450306

Abstract: The system trains a model to provide information used to provide a synthesized speech response to a voice input. The model takes as input prosodic information that may include pitch, note, duration, prominence, timbre, rate, and rhythm, for example. The system receives a plurality of voice inputs, each associated with prosodic metric, as well as a plurality of responses, each also associated with prosodic metrics. The system trains the model based on the plurality of voice inputs, the plurality of responses, the prosodic metrics of the voice inputs, and the prosodic metrics of the responses such that the model outputs information used to generate the response. The model may also take as input user profile information, emotion metrics, and transition information to generate output. The output of the training model may be used by the system to provide synthesized speech responses having relevant prosodic character to received voice inputs.

Type: Grant

Filed: May 13, 2020

Date of Patent: September 20, 2022

Assignee: ROVl GUIDES, INC.

Inventors: Ankur Aher, Jeffry Copps Robert Jose
Systems and methods for generating synthesized speech responses to voice inputs by training a neural network model based on the voice input prosodic metrics and training voice inputs

Patent number: 11443731

Abstract: The system provides a synthesized speech response to a voice input, based on the prosodic character of the voice input. The system receives the voice input and calculates at least one prosodic metric of the voice input. The at least one prosodic metric can be associated with a word, phrase, grouping thereof, or the entire voice input. The system also determines a response to the voice input, which may include the sequence of words that form the response. The system generates the synthesized speech response, by determining prosodic characteristics based on the response, and on the prosodic character of the voice input. The system outputs the synthesized speech response, which includes a more natural, relevant, or both answer to the call of the voice input. The prosodic character of the voice input and/or response may include pitch, note, duration, prominence, timbre, rate, and rhythm, for example.

Type: Grant

Filed: May 13, 2020

Date of Patent: September 13, 2022

Assignee: ROVI GUIDES, INC.

Inventors: Ankur Aher, Jeffry Copps Robert Jose
Activating speech recognition based on hand patterns detected using plurality of filters

Patent number: 11437031

Abstract: A device to process an audio signal representing input sound includes a hand detector configured to generate a first indication responsive to detection of at least a portion of a hand over at least a portion of the device. The device also includes an automatic speech recognition system configured to be activated, responsive to the first indication, to process the audio signal.

Type: Grant

Filed: July 30, 2019

Date of Patent: September 6, 2022

Assignee: QUALCOMM Incorporated

Inventors: Sungrack Yun, Young Mo Kang, Hye Jin Jang, Byeonggeun Kim, Kyu Woong Hwang
Handling away messages with intelligent assistance using voice services

Patent number: 11423905

Abstract: Systems and methods for handling away messages with intelligent assistance using voice services. In some embodiments, an Information Handling System (IHS) may include: a processor; and a memory coupled to the processor, the memory having program instructions stored thereon that, upon execution, cause the IHS to: detect the presence of a person; output an audio greeting in response to the detection; receive an audio instruction in response to the audio greeting; transmit the audio instruction to a voice service provider, the voice service provider configured to: (i) convert the audio instruction into a text instruction, and (ii) transmit the text instruction to an intelligent assistance provider; receive a command from the intelligent assistance provider, the intelligent assistance provider configured to generate the command based upon the text instruction; and execute the command.

Type: Grant

Filed: November 30, 2020

Date of Patent: August 23, 2022

Assignee: Dell Products, L.P.

Inventors: Marc Randall Hammons, Todd Erick Swierk, Tyler Ryan Cox
Method and apparatus for determining a reply statement to a statement based on a sum of a probability of the reply statement being output in response to the statement and a second probability in which the statement is output in response to the statement and further based on a terminator

Patent number: 11416681

Abstract: Aspects of the present disclosure provide a method and an apparatus for determining a reply to a statement. The apparatus includes processing circuitry determining, based on a preset lexicon, potential reply statements in response to a statement, and first matching probabilities respectively corresponding to the potential reply statements. A first matching probability indicates a probability of the corresponding potential reply statement being output in response to the statement according to the preset lexicon. The processing circuitry also obtains second matching probabilities respectively corresponding to the potential reply statements. A second matching probability indicates a probability of words in the statement being output in response to the corresponding potential reply statement according to the preset lexicon.

Type: Grant

Filed: March 20, 2019

Date of Patent: August 16, 2022

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventor: Yuan Xin
Methods, apparatuses, systems, devices, and computer-readable storage media for processing speech signals based on horizontal and pitch angles and distance of a sound source relative to a microphone array

Patent number: 11398235

Abstract: The disclosed embodiments disclose methods, apparatuses, systems, devices and computer-readable storage media for processing speech signals. The method comprises: acquiring a real-time image by using an image capturing device, performing facial recognition by using the real-time image, and detecting a period during which a target user makes speech sounds based on a facial recognition result; locating a sound source in an audio signal received by a microphone array, and determining the orientation information of a sound source in the audio signal; and based on the period during which the target user in the real-time image makes the speech sounds and the orientation information of the sound source, performing a speech sound start and end point analysis to determine start and end time points of the speech sounds in the audio signal.

Type: Grant

Filed: August 28, 2019

Date of Patent: July 26, 2022

Assignee: ALIBABA GROUP HOLDING LIMITED

Inventors: Biao Tian, Zhaowei He, Tao Yu
Goal-oriented dialog generation using dialog template, API, and entity data

Patent number: 11393454

Abstract: A dialog generator receives data corresponding to desired dialog, such as application programming interface (API) information and sample dialog. A first model corresponding to an agent simulator and a second model corresponding to a user simulator take turns creating a plurality of dialog outlines of the desired dialog. The dialog generator may determine that one or more additional APIs are relevant to the dialog and may create further dialog outlines related thereto. The dialog outlines are converted to natural dialog to generate the dialog.

Type: Grant

Filed: December 13, 2018

Date of Patent: July 19, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Anish Acharya, Angeliki Metallinou, Tagyoung Chung, Shachi Paul, Shubhra Chandra, Chien-wei Lin, Dilek Hakkani-Tur, Arindam Mandal
Error concealment unit, audio decoder, and related method and computer program using characteristics of a decoded representation of a properly decoded audio frame

Patent number: 11386906

Abstract: There is provided an error concealment unit, method, and computer program, for providing an error concealment audio information for concealing a loss of an audio frame in an encoded audio information. In one embodiment, the error concealment unit provides an error concealment audio information for a lost audio frame on the basis of a properly decoded audio frame preceding the lost audio frame. The error concealment unit derives a damping factor on the basis of characteristics of a decoded representation of the properly decoded audio frame preceding the lost audio frame. The error concealment unit performs a fade out using the damping factor.

Type: Grant

Filed: August 28, 2020

Date of Patent: July 12, 2022

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung, e.V.

Inventors: Jérémie Lecomte, Adrian Tomasek
Characterizing accuracy of ensemble models for automatic speech recognition by determining a predetermined number of multiple ASR engines based on their historical performance

Patent number: 11380315

Abstract: One embodiment of the present invention sets forth a technique for analyzing a transcription of a recording. The technique includes generating features representing transcriptions produced by multiple automatic speech recognition (ASR) engines from voice activity in the recording and a best transcription of the recording produced by an ensemble model from the transcriptions. The technique also includes applying a machine learning model to the features to produce a score representing an accuracy of the best transcription. The technique further includes storing the score in association with the best transcription.

Type: Grant

Filed: March 9, 2019

Date of Patent: July 5, 2022

Assignee: CISCO TECHNOLOGY, INC.

Inventors: Ahmad Abdulkader, Mohamed Gamal Mohamed Mahmoud

prev 1 2 3 4 5 6 7 … next