Patents Examined by Angela A. Armstrong
-
Patent number: 11640505
Abstract: Embodiments described herein provide systems and methods for an Explicit Memory Tracker (EMT) that tracks each rule sentence to perform decision making and to generate follow-up clarifying questions. Specifically, the EMT first segments the regulation text into several rule sentences and allocates the segmented rule sentences into memory modules, and then feeds information regarding the user scenario and dialogue history into the EMT sequentially to update each memory module separately. At each dialogue turn, the EMT decides, based on the current memory status of the memory modules, whether further clarification is needed to come up with an answer to a user question. The EMT determines that further clarification is needed by identifying an underspecified rule sentence span by modulating token-level span distributions with sentence-level selection scores. The EMT extracts the underspecified rule sentence span and rephrases the underspecified rule sentence span to generate a follow-up question.
Type: Grant
Filed: April 30, 2020
Date of Patent: May 2, 2023
Assignee: Salesforce.com, Inc.
Inventors: Yifan Gao, Chu Hong Hoi, Shafiq Rayhan Joty, Chien-Sheng Wu
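Illustration (not part of the patent record): the span-selection step in this abstract can be sketched as follows. This is a minimal sketch that assumes "modulating" means multiplying each token-level start and end probability by the sentence-level selection score; the function name and the input layout are hypothetical and are not taken from the patent.

```python
def select_underspecified_span(sentence_scores, start_probs, end_probs):
    """Return (sentence_index, start_token, end_token) of the best span.

    Assumption: the combined score of a span is
    sentence_score * P(start) * P(end), with start <= end.
    Returns None when there are no sentences or tokens.
    """
    best = None
    for k, sentence_score in enumerate(sentence_scores):
        starts, ends = start_probs[k], end_probs[k]
        for i, p_start in enumerate(starts):
            # Only consider spans whose end is at or after the start.
            for j in range(i, len(ends)):
                value = sentence_score * p_start * ends[j]
                if best is None or value > best[0]:
                    best = (value, k, i, j)
    return None if best is None else best[1:]
```

With this weighting, a span inside a sentence that has a low selection score is penalised even when its token-level probabilities are high.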
-
Patent number: 11587563
Abstract: A method of presenting a signal to a speech processing engine is disclosed. According to an example of the method, an audio signal is received via a microphone. A portion of the audio signal is identified, and a probability is determined that the portion comprises speech directed by a user of the speech processing engine as input to the speech processing engine. In accordance with a determination that the probability exceeds a threshold, the portion of the audio signal is presented as input to the speech processing engine. In accordance with a determination that the probability does not exceed the threshold, the portion of the audio signal is not presented as input to the speech processing engine.
Type: Grant
Filed: February 28, 2020
Date of Patent: February 21, 2023
Assignee: Magic Leap, Inc.
Inventors: Anthony Robert Sheeder, Colby Nelson Leider
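Illustration (not part of the patent record): the threshold gate in this abstract can be sketched as a simple filter. The classifier is passed in as a callable because the abstract does not say how the probability is computed; `directed_speech_probability`, the default threshold of 0.5, and the function name are all assumptions made for the example.

```python
def gate_segments(segments, directed_speech_probability, threshold=0.5):
    """Keep only the audio portions whose probability exceeds the threshold.

    `directed_speech_probability` is a hypothetical callable that maps an
    audio portion to the probability that it is speech directed at the engine.
    """
    forwarded = []
    for segment in segments:
        # "Exceeds" is read as a strict comparison; a portion scoring exactly
        # the threshold is not presented to the engine.
        if directed_speech_probability(segment) > threshold:
            forwarded.append(segment)
    return forwarded
```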
-
Patent number: 11568864
Abstract: A computing system for generating image data representing a speaker's face includes a detection device configured to route data representing a voice signal to one or more processors and a data processing device comprising the one or more processors configured to generate a representation of a speaker that generated the voice signal in response to receiving the voice signal. The data processing device executes a voice embedding function to generate a feature vector from the voice signal representing one or more signal features of the voice signal, maps a signal feature of the feature vector to a visual feature of the speaker by a modality transfer function specifying a relationship between the visual feature of the speaker and the signal feature of the feature vector; and generates a visual representation of at least a portion of the speaker based on the mapping, the visual representation comprising the visual feature.
Type: Grant
Filed: August 13, 2019
Date of Patent: January 31, 2023
Assignee: Carnegie Mellon University
Inventor: Rita Singh
-
Patent number: 11557284
Abstract: A method, system and computer program product for speech recognition using multiple languages includes receiving, by one or more processors, an input from a user, the input includes a sentence in a first language. The one or more processors translate the sentence to a plurality of languages different than the first language, and create vectors associated with the plurality of languages, each vector includes a representation of the sentence in each of the plurality of languages. The one or more processors calculate eigenvectors for each vector associated with a language in the plurality of languages, and based on the calculated eigenvectors, a score is assigned to each of the plurality of languages according to a relevance for determining a meaning of the sentence.
Type: Grant
Filed: January 3, 2020
Date of Patent: January 17, 2023
Assignee: International Business Machines Corporation
Inventors: Zhong Fang Yuan, Kun Yan Yin, He Li, Tong Liu, Hai Ji
-
Patent number: 11545163
Abstract: A loss function of a signal including an audio signal is determined. A loss function determining system for an audio signal is provided. A loss function is determined by: determining a reference quantization index by quantizing an original input signal; inputting the original input signal to a neural network classifier and applying an activation function to an output layer of the neural network classifier; and determining a total loss function for the neural network classifier using an output of the activation function and the reference quantization index.
Type: Grant
Filed: December 27, 2019
Date of Patent: January 3, 2023
Assignee: Electronics and Telecommunications Research Institute
Inventors: Seung Kwon Beack, Woo-taek Lim, Tae Jin Lee
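Illustration (not part of the patent record): one way to read the loss described in this abstract is as a classification loss against the quantizer's output. The sketch below assumes a uniform scalar quantizer, a softmax activation, and a summed cross-entropy as the total loss; the abstract names none of these, and every function name and parameter here is hypothetical.

```python
import math


def quantize(sample, step, levels):
    """Uniform quantizer (assumed): map a sample to an index in [0, levels - 1]."""
    index = int(round(sample / step)) + levels // 2
    return max(0, min(levels - 1, index))


def softmax(logits):
    """Numerically stable softmax over one output-layer vector."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def total_loss(logit_rows, reference_indices):
    """Sum of cross-entropy terms between softmax outputs and reference indices."""
    return sum(
        -math.log(softmax(row)[index])
        for row, index in zip(logit_rows, reference_indices)
    )
```

Here `logit_rows` stands in for the classifier's output layer, one row per input sample, and `reference_indices` for the result of applying `quantize` to the same samples.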
-
Patent number: 11532181
Abstract: An electronic device and method are disclosed herein. The electronic device includes a microphone, a camera, an output device, a memory, and a processor. The processor implements the method, including receiving a voice input and/or capturing an image, analyzing the voice input or the image to determine at least one of a user's intent, emotion, and situation based on predefined keywords and expressions, identifying a category based on the input, selecting first information based on the category, selecting and outputting a first query prompting confirmation of output of the first information, detecting a first responsive input to the first query, outputting a second query when a condition to output the first information is satisfied, detecting a second input responsive to the second query, and selectively outputting the first information based on the second input.
Type: Grant
Filed: March 30, 2018
Date of Patent: December 20, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventors: Yong Ju Yu, Ja Min Goo, Seong Hoon You, Ki Young Kwon, Ki Won Kim, Eun Young Kim, Ji Min Kim, Chul Kwi Kim, Hyung Woo Kim, Joo Namkung, Ji Hyun Park, Sae Gee Oh, Dong Kyu Lee, Im Sung Lee, Chan Won Lee, Si Hak Jang
-
Patent number: 11531819
Abstract: Machine learned models take in vectors representing desired behaviors and generate voice vectors that provide the parameters for text-to-speech (TTS) synthesis. Models may be trained on behavior vectors that include user profile attributes, situational attributes, or semantic attributes. Situational attributes may include age of people present, music that is playing, location, noise, and mood. Semantic attributes may include presence of proper nouns, number of modifiers, emotional charge, and domain of discourse. TTS voice parameters may apply per utterance and per word so as to enable contrastive emphasis.
Type: Grant
Filed: January 14, 2020
Date of Patent: December 20, 2022
Assignee: SoundHound, Inc.
Inventors: Bernard Mont-Reynaud, Monika Almudafar-Depeyrot
-
Patent number: 11527241
Abstract: A display device and a method for controlling the same are provided. The display device includes a rollable display screen, a voice acquisition unit, an identification control unit, a drive control unit and a display control unit. The voice acquisition unit is configured to acquire a first voice command. The identification control unit is configured to identify the first voice command acquired by the voice acquisition unit as a voice process command, and the voice process command includes a rolling operation command and a display drive command. The drive control unit is configured to perform an operation corresponding to the rolling operation command on the rollable display screen according to the rolling operation command. The display control unit is configured to control a display state of the rollable display screen according to the display drive command.
Type: Grant
Filed: October 11, 2019
Date of Patent: December 13, 2022
Assignees: BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD., Beijing BOE Technology Development Co., Ltd.
Inventors: Jiyang Shao, Yuxin Bi, Jian Sun, Hao Zhang
-
Patent number: 11495236
Abstract: An apparatus for processing an input audio signal relies on a cascade of filterbanks, the cascade having a synthesis filterbank for synthesizing an audio intermediate signal from the input audio signal, the input audio signal being represented by a plurality of first subband signals generated by an analysis filterbank, wherein a number of filterbank channels of the synthesis filterbank is smaller than a number of channels of the analysis filterbank. The apparatus furthermore has a further analysis filterbank for generating a plurality of second subband signals from the audio intermediate signal, wherein the further analysis filterbank has a number of channels being different from the number of channels of the synthesis filterbank, so that a sampling rate of a subband signal of the plurality of second subband signals is different from a sampling rate of a first subband signal of the plurality of first subband signals.
Type: Grant
Filed: May 19, 2020
Date of Patent: November 8, 2022
Assignees: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Dolby International AB
Inventors: Lars Villemoes, Per Ekstrand, Sascha Disch, Frederik Nagel, Stephan Wilde
-
Patent number: 11488596
Abstract: A method for recording audio content in a group conversation among a plurality of members includes: controlling an image capturing device to continuously capture images of the members; executing an image processing procedure on the images of the members to determine whether a specific gesture is detected; when the determination is affirmative, controlling an audio recording device to activate and perform directional audio collection with respect to a direction that is associated with the specific gesture to record audio data; and controlling a data storage to store the audio data and a time stamp associated with the audio data as an entry of conversation record.
Type: Grant
Filed: April 27, 2020
Date of Patent: November 1, 2022
Inventor: Hsiao-Han Chen
-
Patent number: 11488577
Abstract: The present application discloses a training method and an apparatus for a speech synthesis model, electronic device, and storage medium. The method includes: taking a syllable input sequence, a phoneme input sequence and a Chinese character input sequence of a current sample as inputs of an encoder of a model to be trained, to obtain encoded representations of these three sequences at an output end of the encoder; fusing the encoded representations of these three sequences, to obtain a weighted combination of these three sequences; taking the weighted combination as an input of an attention module, to obtain a weighted average of the weighted combination at each moment at an output end of the attention module; taking the weighted average as an input of a decoder of the model to be trained, to obtain a speech Mel spectrum of the current sample at an output end of the decoder.
Type: Grant
Filed: June 19, 2020
Date of Patent: November 1, 2022
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Zhipeng Chen, Jinfeng Bai, Lei Jia
-
Patent number: 11461779
Abstract: Techniques for transferring control of a system-user dialog session are described. A first speechlet component may interact with a user until the first speechlet component receives user input that the first speechlet component cannot handle. The first speechlet component may output an action representing the user input. A system may determine a second speechlet component configured to execute the action. The system may send the second speechlet component a navigator object that results in the second speechlet component handling the user interaction that the first speechlet component could not handle. Once the second speechlet component is finished processing, the second speechlet component may output an updated navigator object, which causes the first speechlet component to either further interact with a user or cause a current dialog session to be closed.
Type: Grant
Filed: March 23, 2018
Date of Patent: October 4, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Rohin Dabas, Troy Dean Schuring, Xu Zhang, Maksym Kolodeznyi, Andres Felipe Borja Jaramillo, Nnenna Eleanya Okwara, Alberto Milan Gutierrez, Rashmi Tonge
-
Patent number: 11449682
Abstract: Systems, devices, and methods provide improved autonomous agents that are configured to respond to a user's query based on an emotion with which the query was expressed and a personality trait of the user. The agent may identify candidate answers to the query that are each associated with an emotion and/or a personality trait. The autonomous agent may utilize a predefined protocol set that indicates transitions between emotional states. A transition may correspond to an action associated with an emotion and/or a personality trait that, if performed, is likely to maintain a user in or transition the user to a preferred emotional state. The responses may be scored based at least in part on their corresponding emotions and/or personality traits and in light of the transitions identified in the protocol set. A particular scored response may be selected and provided to the user in response to their query.
Type: Grant
Filed: May 7, 2020
Date of Patent: September 20, 2022
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventor: Boris Galitsky
-
Patent number: 11430427
Abstract: This application can provide a method and electronic device for separating a mixed sound signal. The method includes: obtaining a first hidden variable representing a human voice feature and a second hidden variable representing an accompaniment sound feature by inputting feature data of a mixed sound extracted from a mixed sound signal into a coding model for the mixed sound; obtaining first feature data of a human voice and second feature data of an accompaniment sound by inputting the first hidden variable and the second hidden variable into a first decoding model for the human voice and a second decoding model for the accompaniment sound respectively; and obtaining, based on the first feature data and the second feature data, the human voice and the accompaniment sound.
Type: Grant
Filed: June 21, 2021
Date of Patent: August 30, 2022
Assignee: Beijing Dajia Internet Information Technology Co., Ltd.
Inventors: Ning Zhang, Yan Li, Tao Jiang
-
Patent number: 11423906
Abstract: A method, computer system, and computer readable medium are provided for automatic speech recognition. Video data and audio data corresponding to one or more speakers is received. A minimum variance distortionless response function is applied to the received audio and video data. A predicted target waveform corresponding to a target speaker from among the one or more speakers is generated based on back-propagating the output of the applied minimum variance distortionless response function.
Type: Grant
Filed: July 10, 2020
Date of Patent: August 23, 2022
Assignee: TENCENT AMERICA LLC
Inventors: Yong Xu, Meng Yu, Shi-Xiong Zhang, Chao Weng, Jianming Liu, Dong Yu
-
Patent number: 11410657
Abstract: Disclosed is a speech recognition method of an artificial intelligence robot. The speech recognition method includes: receiving uttered speech information of a user from an external device; inputting the speech information to a pre-learned first intent analysis model, and determining an utterance intent of the user according to an output value of the first intent analysis model; transmitting response information corresponding to the determined utterance intent of the user to the external device; receiving evaluation information of the user on the response information from the external device; and generating a second intent analysis model by adding the evaluation information to learning data and learning the first intent analysis model. Accordingly, an intelligent device is capable of learning an accurate utterance intent even without error correction by a manager.
Type: Grant
Filed: October 3, 2019
Date of Patent: August 9, 2022
Assignee: LG ELECTRONICS INC.
Inventor: Yireun Kim
-
Patent number: 11404041
Abstract: A message management unit receives and accumulates a message, wherein the message is distributed for every update and is message data representing a latest situation of a competition. An explanation generation unit generates an explanatory text for conveying unconveyed information detected from the message, based on conveyed information. A speech synthesis unit outputs a speech converted from the explanatory text. The explanation generation unit stores the unconveyed information for the explanatory text as the conveyed information, stands by until completion of the speech, and initiates a procedure for generating a new explanatory text based on updated unconveyed information.
Type: Grant
Filed: May 23, 2018
Date of Patent: August 2, 2022
Assignees: NIPPON HOSO KYOKAI, NHK Engineering System, Inc.
Inventors: Tadashi Kumano, Ichiro Yamada, Atsushi Imai, Hideki Sumiyoshi, Yuko Yamanouchi, Toshihiro Shimizu, Nobumasa Seiyama, Shoei Sato, Reiko Saito, Taro Miyazaki, Kiyoshi Kurihara, Manon Ichiki, Tohru Takagi
-
Patent number: 11398231
Abstract: Recommending an automated assistant action for inclusion in an existing automated assistant routine of a user, where the existing automated assistant routine includes a plurality of preexisting automated assistant actions. If the user confirms the recommendation through affirmative user interface input, the automated assistant action can be automatically added to the existing automated assistant routine. Thereafter, when the automated assistant routine is initialized, the preexisting automated assistant actions of the routine will be performed, as well as the automated assistant action that was automatically added to the routine in response to affirmative user interface input received in response to the recommendation.
Type: Grant
Filed: May 4, 2019
Date of Patent: July 26, 2022
Assignee: GOOGLE LLC
Inventor: Michael Andrew Goodman
-
Patent number: 11393476
Abstract: Implementations relate to determining a language for speech recognition of a spoken utterance, received via an automated assistant interface, for interacting with an automated assistant. In various implementations, audio data indicative of a voice input that includes a natural language request from a user may be applied as input across multiple speech-to-text ("STT") machine learning models to generate multiple candidate speech recognition outputs. Each STT machine learning model may be trained in a particular language. For each respective STT machine learning model of the multiple STT models, the multiple candidate speech recognition outputs may be analyzed to determine an entropy score for the respective STT machine learning model. Based on the entropy scores, a target language associated with at least one STT machine learning model of the multiple STT machine learning models may be selected. The automated assistant may respond to the request using the target language.
Type: Grant
Filed: January 8, 2019
Date of Patent: July 19, 2022
Assignee: GOOGLE LLC
Inventors: Ignacio Lopez Moreno, Lukas Lopatovsky, Ágoston Weisz
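Illustration (not part of the patent record): the entropy-based language choice in this abstract can be sketched as below. The sketch assumes each model's candidate outputs come with non-negative confidence values, that the entropy score is the Shannon entropy of those values after normalisation, and that the language with the lowest entropy is selected. The abstract specifies none of these details, and the function names are hypothetical.

```python
import math


def entropy_score(confidences):
    """Shannon entropy (in nats) of the normalised candidate confidences."""
    total = sum(confidences)
    if total <= 0:
        # No usable candidates: treat the model as maximally uncertain.
        return math.inf
    probabilities = [c / total for c in confidences if c > 0]
    return -sum(p * math.log(p) for p in probabilities)


def pick_language(confidences_by_language):
    """Return the language whose STT candidates have the lowest entropy."""
    return min(
        confidences_by_language,
        key=lambda language: entropy_score(confidences_by_language[language]),
    )
```

The intuition behind the assumed rule is that a model trained on the spoken language tends to concentrate its confidence on one hypothesis, while a mismatched model spreads it across many.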
-
Patent number: 11355099
Abstract: A word extraction method according to at least one embodiment of the present disclosure includes: converting, with at least one processor operating with a memory device in a device, received speech information into text data; converting the text data into a string of words including a plurality of words; extracting, with the at least one processor operating with the memory device in the device, a keyword included in a keyword database from the plurality of words; and calculating, with the at least one processor operating with the memory device in the device, importance levels of the plurality of words based on timing of utterance of the keyword and timing of utterance of each of the plurality of words.
Type: Grant
Filed: September 17, 2019
Date of Patent: June 7, 2022
Assignee: YAMAHA CORPORATION
Inventor: Satoshi Ukai
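Illustration (not part of the patent record): the timing-based importance calculation in this abstract can be sketched as below. The sketch assumes that importance decays exponentially with the time gap between a word and the nearest keyword utterance; the decay rule, the `tau` parameter, and the function name are hypothetical, since the abstract only says that importance depends on the two timings.

```python
import math


def importance_levels(timed_words, keywords, tau=2.0):
    """Score each (word, time_in_seconds) pair by proximity to a keyword.

    Assumed rule: importance = exp(-gap / tau), where gap is the time
    distance to the nearest keyword utterance. Words score 0.0 when no
    keyword was uttered at all.
    """
    keyword_times = [time for word, time in timed_words if word in keywords]
    levels = []
    for word, time in timed_words:
        if keyword_times:
            gap = min(abs(time - keyword_time) for keyword_time in keyword_times)
            level = math.exp(-gap / tau)
        else:
            level = 0.0
        levels.append((word, level))
    return levels
```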