Patents Examined by Farzad Kazeminezhad
  • Patent number: 11594224
    Abstract: An electronic device is provided. The electronic device includes a memory configured to store at least one instruction, and at least one processor, where the at least one processor is configured to execute the instruction to obtain voice data from a conversation of at least one user, convert the voice data to text data, determine at least one parameter indicating a characteristic of the conversation based on at least one of the voice data or the text data, adjust a condition for triggering intervention in the conversation based on the determined at least one parameter, and output feedback based on the text data when the adjusted condition is satisfied, wherein the adjustment of the condition includes adjusting a first threshold and a second threshold based on a change of the at least one parameter.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: February 28, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kun Wang, Yanfang Pan, Yazhi Zhao, Lin Ding, Yueyue Jiang, Xu Fan, Bo Peng
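A minimal Python sketch of the adaptive triggering described in the abstract above. The specific parameters (speech rate, silence length), the linear threshold-update rule, and the class name are illustrative assumptions, not the patented method.
```python
from dataclasses import dataclass

@dataclass
class InterventionTrigger:
    silence_threshold: float = 3.0   # first threshold: seconds of silence
    rate_threshold: float = 1.5      # second threshold: words per second

    def adjust(self, speech_rate: float) -> None:
        """Shift both thresholds as the character of the conversation changes."""
        # Faster conversations tolerate shorter pauses before intervening.
        self.silence_threshold = max(1.0, 3.0 - 0.5 * speech_rate)
        # Slower conversations lower the rate that counts as "stalled".
        self.rate_threshold = max(0.5, 0.5 * speech_rate)

    def should_intervene(self, speech_rate: float, silence_sec: float) -> bool:
        self.adjust(speech_rate)
        return silence_sec > self.silence_threshold or speech_rate < self.rate_threshold

trigger = InterventionTrigger()
if trigger.should_intervene(speech_rate=0.4, silence_sec=4.2):
    print("Feedback: would you like a suggestion to keep the conversation going?")
```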
  • Patent number: 11586831
    Abstract: Provided are a speech translation method and apparatus, an electronic device and a storage medium. The method includes: acquiring a source speech corresponding to a to-be-translated language; acquiring a specified target language; inputting the source speech and indication information matched with the target language into a pre-trained speech translation model, where the speech translation model is configured to translate a language in a first language set into a language in a second language set, the first language set includes a plurality of languages, the first language set includes the to-be-translated language, the second language set includes a plurality of languages, and the second language set includes the target language; and acquiring a translated speech corresponding to the target language and output by the speech translation model; where the to-be-translated language is different from the target language.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: February 21, 2023
    Assignee: BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD.
    Inventors: Mingxuan Wang, Qianqian Dong, Lei Li
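A rough sketch of the "indication information" idea: one many-to-many model receives the source speech plus a target-language tag and emits speech in that language. The tag vocabulary and the stubbed model function are assumptions made for illustration.
```python
LANG_TAGS = {"en": "<2en>", "de": "<2de>", "zh": "<2zh>"}

def translate_speech(source_speech: list[float], target_lang: str) -> list[float]:
    if target_lang not in LANG_TAGS:
        raise ValueError(f"unsupported target language: {target_lang}")
    model_input = {"speech": source_speech, "indication": LANG_TAGS[target_lang]}
    return speech_translation_model(model_input)   # hypothetical pre-trained model

def speech_translation_model(model_input: dict) -> list[float]:
    # Stand-in for the pre-trained model; returns dummy target-language speech.
    return [0.0] * len(model_input["speech"])

print(len(translate_speech([0.1, 0.2, 0.3], "de")))
```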
  • Patent number: 11568154
    Abstract: According to one embodiment, a signal processing apparatus correlates a plurality of communication terminals as a group and enables one-to-many communications in the group. The signal processing apparatus includes processing circuitry. The processing circuitry assigns a transmission right to one of the communication terminals in the group. The processing circuitry generates text data based on voice data from said one of the communication terminals in possession of the transmission right. The processing circuitry gives a texting completion notice indicative of completion of texting processing to the communication terminals in the group. The processing circuitry transmits, after the texting completion notice is given, the generated text data to at least one of the communication terminals in the group.
    Type: Grant
    Filed: July 23, 2019
    Date of Patent: January 31, 2023
    Assignee: Science Arts, Inc.
    Inventors: Hidekazu Hiraoka, Kazuaki Okimoto, Katsumi Yokomichi
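A small Python sketch of the one-to-many flow in the abstract: one terminal holds the transmission right, its speech is converted to text, every terminal in the group receives a texting-completion notice, and only then is the text delivered. The Terminal class and transcribe() stub are hypothetical.
```python
class Terminal:
    def __init__(self, name: str):
        self.name = name

    def notify(self, message: str) -> None:
        print(f"{self.name} <- NOTICE: {message}")

    def deliver(self, text: str) -> None:
        print(f"{self.name} <- TEXT: {text}")

def transcribe(voice_data: bytes) -> str:
    return "<recognized text>"   # stand-in for the speech-to-text step

class GroupSession:
    def __init__(self, terminals):
        self.terminals = list(terminals)
        self.holder = None                      # terminal holding the transmission right

    def assign_transmission_right(self, terminal: Terminal) -> None:
        self.holder = terminal

    def handle_voice(self, sender: Terminal, voice_data: bytes) -> None:
        if sender is not self.holder:
            return                              # only the right holder may transmit
        text = transcribe(voice_data)
        for t in self.terminals:
            t.notify("texting completed")       # completion notice goes out first
        for t in self.terminals:
            if t is not sender:
                t.deliver(text)                 # then the text itself is transmitted

a, b, c = Terminal("A"), Terminal("B"), Terminal("C")
session = GroupSession([a, b, c])
session.assign_transmission_right(a)
session.handle_voice(a, b"<voice frame>")
```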
  • Patent number: 11532319
    Abstract: One or more embodiments include an emotion analysis system for computing and analyzing the emotional state of a user. The emotion analysis system acquires, via at least one sensor, sensor data associated with a user. The emotion analysis system determines, based on the sensor data, an emotional state associated with the user. The emotion analysis system determines a first component of the emotional state that corresponds to media content being accessed by the user. The emotion analysis system applies a first function to the emotional state to remove the first component from the emotional state.
    Type: Grant
    Filed: March 16, 2020
    Date of Patent: December 20, 2022
    Assignee: Harman International Industries, Incorporated
    Inventors: Stefan Marti, Joseph Verbeke
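A toy sketch of separating the media-induced part of an emotional state from the rest. Modeling the state as a two-dimensional (valence, arousal) vector and the "first function" as plain subtraction are simplifying assumptions, not the patented analysis.
```python
def remove_media_component(user_state, media_component):
    """Return the user's emotional state with the media-driven part removed."""
    return tuple(round(u - m, 3) for u, m in zip(user_state, media_component))

measured_state = (0.7, 0.6)      # from sensors: valence, arousal
media_component = (0.4, 0.3)     # estimated response to the content itself
residual_state = remove_media_component(measured_state, media_component)
print(residual_state)            # (0.3, 0.3): emotion not explained by the media
```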
  • Patent number: 11527242
    Abstract: A lip-language identification method and an apparatus thereof, an augmented reality device, and a storage medium are provided. The lip-language identification method includes: acquiring a sequence of face images for an object to be identified; performing lip-language identification based on the sequence of face images to determine semantic information of speech content of the object to be identified corresponding to lip actions in the face images; and outputting the semantic information.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: December 13, 2022
    Assignee: Beijing BOE Technology Development Co., Ltd.
    Inventors: Naifu Wu, Xitong Ma, Lixin Kou, Sha Feng
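A bare-bones Python pipeline matching the three steps in the abstract: collect a face image sequence, run lip-language identification on it, and output the semantics. The frame grabber and recognizer are stubs; a real system would use a trained visual speech-recognition model.
```python
def acquire_face_sequence(num_frames: int = 16):
    return [f"frame_{i}" for i in range(num_frames)]   # placeholder frames

def identify_lip_language(frames) -> str:
    # Stub for a sequence model mapping lip movements to speech content.
    return "hello, nice to meet you"

def output_semantics(text: str) -> None:
    print(f"Recognized speech content: {text}")

output_semantics(identify_lip_language(acquire_face_sequence()))
```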
  • Patent number: 11521628
    Abstract: An apparatus for encoding an audio signal includes: a core encoder for core encoding first audio data in a first spectral band; a parametric coder for parametrically coding second audio data in a second spectral band being different from the first spectral band, wherein the parametric coder includes: an analyzer for analyzing first audio data in the first spectral band to obtain a first analysis result and for analyzing second audio data in the second spectral band to obtain a second analysis result; a compensator for calculating a compensation value using the first analysis result and the second analysis result; and a parameter calculator for calculating a parameter from the second audio data in the second spectral band using the compensation value.
    Type: Grant
    Filed: February 22, 2019
    Date of Patent: December 6, 2022
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Franz Reutelhuber, Jan Buethe, Markus Multrus, Bernd Edler
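A simplified numeric sketch of the compensation idea: analyze the low band and the high band, derive a compensation value from the two analysis results, and use it when computing the high-band parameter. Using band energies and a simple ratio is an assumption made only for illustration.
```python
import numpy as np

def band_energy(spectrum: np.ndarray, lo: int, hi: int) -> float:
    return float(np.sum(np.abs(spectrum[lo:hi]) ** 2))

rng = np.random.default_rng(0)
frame = rng.standard_normal(1024)
spectrum = np.fft.rfft(frame)

low_result = band_energy(spectrum, 0, 256)      # analysis of the core-coded band
high_result = band_energy(spectrum, 256, 513)   # analysis of the parametric band
compensation = high_result / (low_result + 1e-12)

# High-band parameter (here, a gain) computed from the second band and corrected
# by the compensation value derived from both analysis results.
high_band_gain = np.sqrt(high_result / 257) * compensation
print(round(float(high_band_gain), 4))
```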
  • Patent number: 11514894
    Abstract: In some examples, a server may receive an utterance from a customer. The utterance may be included in a conversation between an artificial intelligence engine and the customer. The server may convert the utterance to text and determine a customer intent based on the text and a user history. The server may determine a user model of the customer based on the text and the customer intent. The server may update a conversation state associated with the conversation based on the customer intent and the user model. The server may determine a user state based on the user model and the conversation state. The server may select, using a reinforcement-learning-based module, a particular action from a set of actions, the particular action including a response, and provide the response to the customer.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: November 29, 2022
    Assignee: ConverseNowAI
    Inventors: Vrajesh Navinchandra Sejpal, Akshay Labh Kayastha, Yuganeshan A J, Pranav Nirmal Mehra, Rahul Aggarwal, Vinay Kumar Shukla, Zubair Talib
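A sketch of the final selection step: given a user state, pick one action from a fixed action set with an epsilon-greedy policy over learned values. The Q-table, state encoding, and action names are illustrative assumptions; the abstract only says a reinforcement-learning-based module selects the action.
```python
import random

ACTIONS = ["confirm_order", "upsell_side", "ask_clarification", "transfer_to_human"]
q_values = {("ordering", a): v for a, v in zip(ACTIONS, [0.8, 0.5, 0.3, 0.1])}

def select_action(user_state: str, epsilon: float = 0.1) -> str:
    if random.random() < epsilon:                    # explore occasionally
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_values.get((user_state, a), 0.0))  # exploit

action = select_action("ordering")
print(f"Selected action: {action} -> respond to the customer accordingly")
```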
  • Patent number: 11501089
    Abstract: An electronic device and a method for controlling the electronic device thereof are provided. The electronic device includes a memory storing instructions, and a processor configured to control the electronic device by executing the instructions stored in the memory, and the processor is configured to, based on a user's speech being input, acquire a first sentence in a first language corresponding to the user's speech through a speech recognition model corresponding to a language of the user's speech, acquire a second sentence in a predefined second language corresponding to the first sentence in the first language through a machine translation model trained to translate a plurality of languages into the predefined second language, and acquire a control instruction of the electronic device corresponding to the acquired second sentence or acquire a response to the second sentence through a natural language understanding model trained based on the second language.
    Type: Grant
    Filed: April 2, 2020
    Date of Patent: November 15, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jiwan Kim, Seungsoo Kang, Jongyoub Ryu, Soyoon Park, Sangha Kim, Hakjung Kim, Myungjin Eom
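A pipeline sketch of the flow in the abstract: language-specific speech recognition, machine translation into one pivot language, then a single natural-language-understanding model trained on that pivot language. All three stages are stubs named only for illustration.
```python
def asr(speech: bytes, language: str) -> str:
    return "불 좀 켜줘" if language == "ko" else "turn on the light"

def translate_to_pivot(sentence: str, source_language: str, pivot: str = "en") -> str:
    return "turn on the light"   # stand-in for the multilingual translation model

def nlu(sentence_in_pivot: str) -> dict:
    # One NLU model, trained only on the pivot language, covers all input languages.
    return {"intent": "device_control", "action": "light_on"}

speech, language = b"...", "ko"
command = nlu(translate_to_pivot(asr(speech, language), language))
print(command)
```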
  • Patent number: 11495231
    Abstract: A lip language recognition method, applied to a mobile terminal having a sound mode and a silent mode, includes: training a deep neural network in the sound mode; collecting a user's lip images in the silent mode; and identifying content corresponding to the user's lip images with the deep neural network trained in the sound mode. The method further includes: switching from the sound mode to the silent mode when a privacy need of the user arises.
    Type: Grant
    Filed: November 26, 2018
    Date of Patent: November 8, 2022
    Assignee: BEIJING BOE TECHNOLOGY DEVELOPMENT CO., LTD.
    Inventors: Lihua Geng, Xitong Ma, Zhiguo Zhang
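A minimal sketch of the two-mode idea: in sound mode, paired audio transcripts supervise the lip images used to train the network; in silent mode, only lip images are collected and the trained network does the recognition. The class, its training buffer, and the prediction stub are assumptions for illustration.
```python
class LipReader:
    def __init__(self):
        self.mode = "sound"
        self.training_pairs = []

    def observe(self, lip_frames, audio_transcript=None):
        if self.mode == "sound" and audio_transcript is not None:
            # Sound mode: audio-derived transcripts label the visual training data.
            self.training_pairs.append((lip_frames, audio_transcript))
        elif self.mode == "silent":
            # Silent mode: recognize content from lip images alone.
            return self.predict(lip_frames)

    def predict(self, lip_frames) -> str:
        return "<text recognized from lip movements>"   # stub for the trained DNN

    def enter_silent_mode(self):    # e.g., when the user's privacy need arises
        self.mode = "silent"

reader = LipReader()
reader.observe(["f0", "f1"], audio_transcript="call me later")
reader.enter_silent_mode()
print(reader.observe(["f2", "f3"]))
```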
  • Patent number: 11495235
    Abstract: According to one embodiment, a system for creating a speaker model includes one or more processors. The processors change a part of the network parameters from an input layer to a predetermined intermediate layer based on a plurality of patterns and input a piece of speech into each of the neural networks so as to obtain a plurality of outputs from the intermediate layer. The part of the network parameters of each of the neural networks is changed based on one of the plurality of patterns. The processors create a speaker model with respect to one or more words detected from the speech based on the outputs.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: November 8, 2022
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Hiroshi Fujimura
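A rough numeric sketch of varying the lower layers: the same utterance is passed through several copies of a small network whose input-to-intermediate weights are perturbed by different patterns, and the collected intermediate outputs are pooled into a speaker model. The network sizes, the random perturbation scheme, and mean pooling are illustrative assumptions.
```python
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.standard_normal((16, 40))        # input layer -> intermediate layer
speech_features = rng.standard_normal(40)   # acoustic features of one utterance

def intermediate_output(weights: np.ndarray, features: np.ndarray) -> np.ndarray:
    return np.tanh(weights @ features)      # activation of the intermediate layer

patterns = [0.05 * rng.standard_normal(W_in.shape) for _ in range(5)]
outputs = [intermediate_output(W_in + p, speech_features) for p in patterns]

speaker_model = np.mean(outputs, axis=0)    # pooled representation for this speaker
print(speaker_model.shape)                  # (16,)
```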
  • Patent number: 11456002
    Abstract: Provided are an apparatus and a method for integrally encoding and decoding a speech signal and an audio signal. The encoding apparatus may include: an input signal analyzer to analyze a characteristic of an input signal; a first conversion encoder to convert the input signal to a frequency domain signal, and to encode the input signal when the input signal is an audio characteristic signal; a Linear Predictive Coding (LPC) encoder to perform LPC encoding of the input signal when the input signal is a speech characteristic signal; and a bitstream generator to generate a bitstream using an output.
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: September 27, 2022
    Assignees: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
    Inventors: Tae Jin Lee, Seung-Kwon Baek, Min Je Kim, Dae Young Jang, Jeongil Seo, Kyeongok Kang, Jin-Woo Hong, Hochong Park, Young-cheol Park
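A sketch of the classify-then-route structure: an input analyzer labels each frame as speech-like or audio(music)-like, speech frames go to an LPC coder, audio frames go to a frequency-domain coder, and the bitstream records which path was used. The zero-crossing-rate heuristic and placeholder payloads are purely illustrative.
```python
def is_speech_like(frame: list[float]) -> bool:
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / max(len(frame) - 1, 1) < 0.2   # crude speech/music split

def lpc_encode(frame) -> bytes:
    return b"\x01" + bytes(4)        # placeholder payload, first byte = mode flag

def transform_encode(frame) -> bytes:
    return b"\x02" + bytes(4)

def encode(frames) -> bytes:
    bitstream = bytearray()
    for frame in frames:
        coded = lpc_encode(frame) if is_speech_like(frame) else transform_encode(frame)
        bitstream += coded
    return bytes(bitstream)

print(encode([[0.1, 0.2, 0.1, 0.3], [0.1, -0.2, 0.3, -0.4]]).hex())
```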
  • Patent number: 11450306
    Abstract: The system trains a model to provide information used to provide a synthesized speech response to a voice input. The model takes as input prosodic information that may include pitch, note, duration, prominence, timbre, rate, and rhythm, for example. The system receives a plurality of voice inputs, each associated with prosodic metrics, as well as a plurality of responses, each also associated with prosodic metrics. The system trains the model based on the plurality of voice inputs, the plurality of responses, the prosodic metrics of the voice inputs, and the prosodic metrics of the responses such that the model outputs information used to generate the response. The model may also take as input user profile information, emotion metrics, and transition information to generate output. The output of the trained model may be used by the system to provide synthesized speech responses having relevant prosodic character to received voice inputs.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: September 20, 2022
    Assignee: ROVI GUIDES, INC.
    Inventors: Ankur Aher, Jeffry Copps Robert Jose
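A toy sketch of the training setup: each (voice input, response) pair carries prosodic metrics, and a model is fit to map input prosody to response prosody. Restricting the metrics to pitch and rate, the made-up training values, and ordinary least squares as the model class are simplifying assumptions; the abstract does not specify the model.
```python
import numpy as np

# (input_pitch_hz, input_rate_wps) -> response_pitch_hz for a few training pairs
X = np.array([[220.0, 3.0], [180.0, 2.5], [250.0, 3.5], [200.0, 2.8]])
y = np.array([200.0, 170.0, 225.0, 185.0])

X1 = np.hstack([X, np.ones((len(X), 1))])          # add a bias column
weights, *_ = np.linalg.lstsq(X1, y, rcond=None)   # fit the prosody model

new_input = np.array([210.0, 3.2, 1.0])            # new voice input plus bias term
print(round(float(new_input @ weights), 1))        # predicted response pitch
```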
  • Patent number: 11443731
    Abstract: The system provides a synthesized speech response to a voice input, based on the prosodic character of the voice input. The system receives the voice input and calculates at least one prosodic metric of the voice input. The at least one prosodic metric can be associated with a word, phrase, grouping thereof, or the entire voice input. The system also determines a response to the voice input, which may include the sequence of words that form the response. The system generates the synthesized speech response by determining prosodic characteristics based on the response, and on the prosodic character of the voice input. The system outputs the synthesized speech response, which provides a more natural answer, a more relevant answer, or both, to the voice input. The prosodic character of the voice input and/or response may include pitch, note, duration, prominence, timbre, rate, and rhythm, for example.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: September 13, 2022
    Assignee: ROVI GUIDES, INC.
    Inventors: Ankur Aher, Jeffry Copps Robert Jose
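An illustrative sketch of carrying prosody from the voice input into the synthesized reply: measure a couple of prosodic metrics on the input, decide the reply text, then derive target prosody for the reply from both. The metric extraction and the mapping rule are assumptions, not the actual system.
```python
def prosodic_metrics(voice_input: dict) -> dict:
    # In practice these would be estimated from the waveform (pitch tracker, etc.).
    return {"pitch_hz": voice_input["pitch_hz"], "rate_wps": voice_input["rate_wps"]}

def plan_response_prosody(input_metrics: dict, response_text: str) -> dict:
    return {
        "pitch_hz": 0.9 * input_metrics["pitch_hz"],      # answer slightly lower in pitch
        "rate_wps": min(input_metrics["rate_wps"], 3.0),  # never faster than the user
        "text": response_text,
    }

user_turn = {"pitch_hz": 210.0, "rate_wps": 3.4}
plan = plan_response_prosody(prosodic_metrics(user_turn), "Sure, playing it now.")
print(plan)   # a TTS front end would render this text with the planned prosody
```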
  • Patent number: 11437031
    Abstract: A device to process an audio signal representing input sound includes a hand detector configured to generate a first indication responsive to detection of at least a portion of a hand over at least a portion of the device. The device also includes an automatic speech recognition system configured to be activated, responsive to the first indication, to process the audio signal.
    Type: Grant
    Filed: July 30, 2019
    Date of Patent: September 6, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Sungrack Yun, Young Mo Kang, Hye Jin Jang, Byeonggeun Kim, Kyu Woong Hwang
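A tiny event-flow sketch: a hand-over-device detection produces the "first indication", and only that indication activates speech recognition on the audio stream. The proximity reading, coverage threshold, and recognize() call are stand-ins.
```python
class Device:
    def __init__(self):
        self.asr_active = False

    def on_proximity_reading(self, hand_coverage: float) -> None:
        if hand_coverage > 0.6:          # hand detected over part of the device
            self.asr_active = True       # first indication -> activate ASR

    def on_audio_frame(self, frame: bytes):
        if self.asr_active:
            return recognize(frame)
        return None                      # audio ignored until the gesture occurs

def recognize(frame: bytes) -> str:
    return "<recognized command>"        # placeholder for the ASR system

d = Device()
print(d.on_audio_frame(b"..."))          # None: ASR not yet activated
d.on_proximity_reading(0.8)
print(d.on_audio_frame(b"..."))          # '<recognized command>'
```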
  • Patent number: 11423905
    Abstract: Systems and methods for handling away messages with intelligent assistance using voice services. In some embodiments, an Information Handling System (IHS) may include: a processor; and a memory coupled to the processor, the memory having program instructions stored thereon that, upon execution, cause the IHS to: detect the presence of a person; output an audio greeting in response to the detection; receive an audio instruction in response to the audio greeting; transmit the audio instruction to a voice service provider, the voice service provider configured to: (i) convert the audio instruction into a text instruction, and (ii) transmit the text instruction to an intelligent assistance provider; receive a command from the intelligent assistance provider, the intelligent assistance provider configured to generate the command based upon the text instruction; and execute the command.
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: August 23, 2022
    Assignee: Dell Products, L.P.
    Inventors: Marc Randall Hammons, Todd Erick Swierk, Tyler Ryan Cox
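A sequence sketch of the away-message flow described above: detect a person, greet them, capture their spoken instruction, hand it to a voice service for speech-to-text, pass the text to an intelligent-assistance provider, and execute the returned command. Every external service here is a named stub.
```python
def voice_service(audio: bytes) -> str:
    return "tell her I'll be back at three"           # speech -> text

def intelligent_assistance(text: str) -> dict:
    return {"command": "record_message", "payload": text}

def execute(command: dict) -> None:
    print(f"Executing {command['command']}: {command['payload']}")

def handle_visitor(presence_detected: bool, captured_audio: bytes) -> None:
    if not presence_detected:
        return
    print("Greeting: Hi, the owner is away. Can I take a message?")
    text = voice_service(captured_audio)
    command = intelligent_assistance(text)
    execute(command)

handle_visitor(True, b"<audio>")
```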
  • Patent number: 11416681
    Abstract: Aspects of the present disclosure provide a method and an apparatus for determining a reply to a statement. The apparatus includes processing circuitry that determines, based on a preset lexicon, potential reply statements in response to a statement, and first matching probabilities respectively corresponding to the potential reply statements. A first matching probability indicates a probability of the corresponding potential reply statement being output in response to the statement according to the preset lexicon. The processing circuitry also obtains second matching probabilities respectively corresponding to the potential reply statements. A second matching probability indicates a probability of words in the statement being output in response to the corresponding potential reply statement according to the preset lexicon.
    Type: Grant
    Filed: March 20, 2019
    Date of Patent: August 16, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Yuan Xin
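A numeric sketch of combining the two probabilities described above: for each candidate reply, a forward score P(reply | statement) and a backward score P(statement | reply), fused here by a weighted log-linear sum. The candidate scores and the fusion weight are made-up illustrative values.
```python
import math

candidates = {
    "I'm fine, thanks.":   {"p_forward": 0.40, "p_backward": 0.30},
    "OK.":                 {"p_forward": 0.45, "p_backward": 0.05},  # generic reply
    "The weather is bad.": {"p_forward": 0.15, "p_backward": 0.20},
}

def fused_score(p_forward: float, p_backward: float, lam: float = 0.5) -> float:
    return (1 - lam) * math.log(p_forward) + lam * math.log(p_backward)

best = max(candidates, key=lambda r: fused_score(**candidates[r]))
print(best)   # the backward probability penalizes the bland "OK." reply
```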
  • Patent number: 11398235
    Abstract: Disclosed are methods, apparatuses, systems, devices, and computer-readable storage media for processing speech signals. The method comprises: acquiring a real-time image by using an image capturing device, performing facial recognition by using the real-time image, and detecting a period during which a target user makes speech sounds based on a facial recognition result; locating a sound source in an audio signal received by a microphone array, and determining orientation information of the sound source in the audio signal; and, based on the period during which the target user in the real-time image makes the speech sounds and the orientation information of the sound source, performing a speech sound start and end point analysis to determine start and end time points of the speech sounds in the audio signal.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: July 26, 2022
    Assignee: ALIBABA GROUP HOLDING LIMITED
    Inventors: Biao Tian, Zhaowei He, Tao Yu
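A sketch of the fusion step: the camera gives a period during which the target user's lips are moving, the microphone array gives time-stamped sound-source bearings, and the speech start and end points are taken where the two agree. The interval representation and angle tolerance are illustrative assumptions.
```python
def speech_endpoints(visual_period, source_events, user_bearing_deg, tol_deg=15.0):
    """visual_period: (start, end) in seconds; source_events: [(t, bearing_deg), ...]."""
    v_start, v_end = visual_period
    matched = [t for t, bearing in source_events
               if v_start <= t <= v_end and abs(bearing - user_bearing_deg) <= tol_deg]
    if not matched:
        return None
    return min(matched), max(matched)   # start and end points of the speech sounds

events = [(0.8, 30.0), (1.2, 31.0), (1.9, 95.0), (2.4, 29.0), (3.5, 28.0)]
print(speech_endpoints(visual_period=(1.0, 3.0), source_events=events,
                       user_bearing_deg=30.0))   # (1.2, 2.4)
```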
  • Patent number: 11393454
    Abstract: A dialog generator receives data corresponding to desired dialog, such as application programming interface (API) information and sample dialog. A first model corresponding to an agent simulator and a second model corresponding to a user simulator take turns creating a plurality of dialog outlines of the desired dialog. The dialog generator may determine that one or more additional APIs are relevant to the dialog and may create further dialog outlines related thereto. The dialog outlines are converted to natural dialog to generate the dialog.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: July 19, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Anish Acharya, Angeliki Metallinou, Tagyoung Chung, Shachi Paul, Shubhra Chandra, Chien-wei Lin, Dilek Hakkani-Tur, Arindam Mandal
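A sketch of the turn-taking generation loop: a user simulator and an agent simulator alternate turns, conditioned on API information, until a dialog outline is complete. Both simulators are trivial rule-based stubs here; the abstract describes learned models, and the BookFlight API and its slots are hypothetical.
```python
API_INFO = {"name": "BookFlight", "slots": ["origin", "destination", "date"]}

def user_turn(state: dict) -> str:
    missing = [s for s in API_INFO["slots"] if s not in state]
    return f"provide {missing[0]}" if missing else "confirm booking"

def agent_turn(state: dict, user_utterance: str) -> str:
    if user_utterance.startswith("provide "):
        slot = user_utterance.split(" ", 1)[1]
        state[slot] = f"<{slot}>"
        return f"ask next detail after receiving {slot}"
    return f"call {API_INFO['name']} and confirm"

def generate_outline(max_turns: int = 10):
    state, outline = {}, []
    for _ in range(max_turns):
        u = user_turn(state)
        a = agent_turn(state, u)
        outline += [("USER", u), ("AGENT", a)]
        if u == "confirm booking":
            break
    return outline   # later converted into natural-language dialog

for speaker, line in generate_outline():
    print(f"{speaker}: {line}")
```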
  • Patent number: 11386906
    Abstract: There is provided an error concealment unit, method, and computer program, for providing an error concealment audio information for concealing a loss of an audio frame in an encoded audio information. In one embodiment, the error concealment unit provides an error concealment audio information for a lost audio frame on the basis of a properly decoded audio frame preceding the lost audio frame. The error concealment unit derives a damping factor on the basis of characteristics of a decoded representation of the properly decoded audio frame preceding the lost audio frame. The error concealment unit performs a fade out using the damping factor.
    Type: Grant
    Filed: August 28, 2020
    Date of Patent: July 12, 2022
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung, e.V.
    Inventors: Jérémie Lecomte, Adrian Tomasek
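A simplified sketch of the concealment step: when a frame is lost, repeat the last properly decoded frame scaled by a damping factor, and let that factor depend on the character of the last good frame. Deriving the factor from a normalized autocorrelation peak is an assumption made for illustration.
```python
import numpy as np

def damping_factor(last_good: np.ndarray) -> float:
    x = last_good - last_good.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    periodicity = float(ac[1:].max() / (ac[0] + 1e-12))   # 0 = noise-like, 1 = tonal
    return 0.5 + 0.4 * max(0.0, min(1.0, periodicity))    # fade slower for tonal frames

def conceal(last_good: np.ndarray, num_lost: int) -> list:
    factor, frames, gain = damping_factor(last_good), [], 1.0
    for _ in range(num_lost):
        gain *= factor                      # progressive fade-out
        frames.append(gain * last_good)
    return frames

t = np.arange(256) / 16000.0
good_frame = np.sin(2 * np.pi * 440 * t)
print([round(float(np.abs(f).max()), 3) for f in conceal(good_frame, 3)])
```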
  • Patent number: 11380315
    Abstract: One embodiment of the present invention sets forth a technique for analyzing a transcription of a recording. The technique includes generating features representing transcriptions produced by multiple automatic speech recognition (ASR) engines from voice activity in the recording and a best transcription of the recording produced by an ensemble model from the transcriptions. The technique also includes applying a machine learning model to the features to produce a score representing an accuracy of the best transcription. The technique further includes storing the score in association with the best transcription.
    Type: Grant
    Filed: March 9, 2019
    Date of Patent: July 5, 2022
    Assignee: CISCO TECHNOLOGY, INC.
    Inventors: Ahmad Abdulkader, Mohamed Gamal Mohamed Mahmoud
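A sketch of the scoring stage: build simple agreement features between each ASR engine's transcript and the ensemble's best transcript, then feed them to a model that predicts transcription accuracy. The word-overlap features and the hand-set linear weights stand in for the learned machine learning model.
```python
def word_overlap(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def transcript_features(engine_outputs: list[str], best: str) -> list[float]:
    overlaps = [word_overlap(t, best) for t in engine_outputs]
    return [sum(overlaps) / len(overlaps), min(overlaps), float(len(best.split()))]

def accuracy_score(features: list[float]) -> float:
    weights, bias = [0.6, 0.3, 0.005], 0.05        # stand-in for a trained model
    return min(1.0, bias + sum(w * f for w, f in zip(weights, features)))

engines = ["please play the next song", "please play the next track", "lease play next song"]
best = "please play the next song"
score = accuracy_score(transcript_features(engines, best))
print(round(score, 3))   # stored in association with the best transcription
```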