Patents Examined by Susan I McFadden
-
Patent number: 11875787
Abstract: This document relates to machine learning. One example includes a method or technique that can be performed on a computing device. The method or technique can include obtaining a task-semantically-conditioned generative model that has been pretrained based at least on a first training data set having unlabeled training examples and semantically conditioned based at least on a second training data set having dialog act-labeled utterances. The method or technique can also include inputting dialog acts into the semantically-conditioned generative model and obtaining synthetic utterances that are output by the semantically-conditioned generative model. The method or technique can also include outputting the synthetic utterances.
Type: Grant
Filed: October 11, 2022
Date of Patent: January 16, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Nanshan Zeng, Jianfeng Gao
-
Patent number: 11869507
Abstract: Methods, systems, and apparatuses for improved speech recognition and transcription of user utterances are described herein. A user utterance may be processed by a speech recognition computing device. One or more acoustic features associated with the user utterance may be used to determine whether one or more actions are to be performed based on a transcription of the user utterance.
Type: Grant
Filed: December 20, 2022
Date of Patent: January 9, 2024
Assignee: COMCAST CABLE COMMUNICATIONS, LLC
Inventors: Rui Min, Stefan Deichmann, Hongcheng Wang, Geifei Yang
-
Patent number: 11869491
Abstract: A speech recognition unit converts an input utterance sequence into a confusion network sequence constituted by a k-best of candidate words of speech recognition results; a lattice generating unit generates a lattice sequence having the candidate words as internal nodes and a combination of k words among the candidate words for an identical speech as an external node, in which edges are extended between internal nodes other than internal nodes included in an identical external node, from the confusion network sequence; an integer programming problem generating unit generates an integer programming problem for selecting a path that maximizes an objective function including at least a coverage score of an important word, of paths following the internal nodes with the edges extended, in the lattice sequence; and the summary generating unit generates a high-quality summary having less speech recognition errors and low redundancy using candidate words indicated by the internal nodes included in the path selected b
Type: Grant
Filed: January 16, 2020
Date of Patent: January 9, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Tsutomu Hirao, Atsunori Ogawa, Tomohiro Nakatani, Masaaki Nagata
-
Patent number: 11862167
Abstract: A spoken dialogue device includes a recognition unit that recognizes an acquired user speech, a barge-in speech control unit that determines whether to engage a barge-in speech, a dialogue control unit that outputs a system response to a user based on a recognition result of the user speech other than the barge-in speech determined not to be engaged by the barge-in speech control unit, a response generation unit that generates a system speech based on the system response, and an output unit that outputs a system speech. When each user speech element included in the user speech corresponds to a predetermined morpheme included in the immediately previous system speech and does not correspond to a response candidate to the immediately previous system speech by a user, the barge-in speech control unit does not engage at least the user speech element.
Type: Grant
Filed: January 14, 2020
Date of Patent: January 2, 2024
Assignee: NTT DOCOMO, INC.
Inventors: Mariko Chiba, Taichi Asami
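The filtering rule in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `filter_barge_in`, the use of plain lowercase strings as "speech elements", and the set-membership test standing in for "corresponds to" are all assumptions made for illustration.

```python
def filter_barge_in(user_elements, system_morphemes, response_candidates):
    """Return the user speech elements that remain engaged.

    An element is dropped (not engaged) when it matches a morpheme of the
    immediately previous system speech AND is not an expected response
    candidate. Exact string matching is an illustrative assumption.
    """
    engaged = []
    for element in user_elements:
        echoes_system = element in system_morphemes
        is_expected_reply = element in response_candidates
        if echoes_system and not is_expected_reply:
            continue  # likely an echo of the system prompt; do not engage
        engaged.append(element)
    return engaged


# Hypothetical data: the system just said "tokyo station is ok"
# and expects one of "yes", "no", or "tokyo" as a reply.
kept = filter_barge_in(
    ["tokyo", "station", "yes"],
    {"tokyo", "station", "is", "ok"},
    {"yes", "no", "tokyo"},
)
```

Here "station" is dropped because it repeats the system speech without being a valid reply, while "tokyo" survives because it is also a response candidate.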
-
Patent number: 11862172
Abstract: Systems, methods, and devices provide a user experience capable of integrating robo-advising with human advising based on various inputs that are actively detected. Inputs from a conversation, or multiple conversations separated in time, may be analyzed to determine, based on voice inputs, that live communications should be initiated. Based on triggers identified, a robo-advising session may additionally or alternatively be initiated. Transitions between advising sessions may be facilitated to allow users to more efficiently employ robo-advising until human advising is triggered.
Type: Grant
Filed: January 6, 2023
Date of Patent: January 2, 2024
Assignee: Wells Fargo Bank, N.A.
Inventors: Balin Kina Brandt, Laura Fisher, Marie Jeanette Floyd, Katherine J. McGee, Teresa Lynn Rench, Sruthi Vangala
-
Patent number: 11862166
Abstract: A display apparatus includes an input unit configured to receive a user command; an output unit configured to output a registration suitability determination result for the user command; and a processor configured to generate phonetic symbols for the user command, analyze the generated phonetic symbols to determine registration suitability for the user command, and control the output unit to output the registration suitability determination result for the user command. Therefore, the display apparatus may register a user command which is resistant to misrecognition and guarantees high recognition rate among user commands defined by a user.
Type: Grant
Filed: October 7, 2022
Date of Patent: January 2, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Nam-yeong Kwon, Kyung-mi Park
-
Patent number: 11849908
Abstract: A method of providing an intelligent voice recognition model includes obtaining space type information about a placement area of the voice recognition device, extracting space feature information from the space type information; and generating a predetermined voice recognition model matched to the extracted space feature information. At least one device implementing the method of providing the intelligent voice recognition model may be associated with an artificial intelligence module, an unmanned aerial vehicle (UAV), a robot, an augmented reality (AR) device, a virtual reality (VR) device, devices related to 5G services, and the like.
Type: Grant
Filed: June 5, 2019
Date of Patent: December 26, 2023
Assignee: LG Electronics Inc.
Inventor: Jonghoon Chae
-
Patent number: 11853651
Abstract: Systems and methods are described for recognizing and responding to commands in a virtual or physical environment. A system may receive voice data and determine an intended command. The system may then determine a position and viewpoint orientation of the user to be able to determine one or more digital assets associated with the user. The system may then determine a current state associated with each digital asset of the one or more digital assets to be able to determine at least one digital asset that is configured to process the command. The system can then apply the command to at least a first digital asset of the at least one digital asset that is configured to process the command.
Type: Grant
Filed: November 10, 2022
Date of Patent: December 26, 2023
Assignee: COMCAST CABLE COMMUNICATIONS, LLC
Inventor: Mark David Francisco
-
Patent number: 11848001
Abstract: Systems and methods are disclosed for providing non-lexical cues in synthesized speech. An example system includes processor circuitry to generate a breathing cue to enhance speech to be synthesized from text; determine a first insertion point of the breathing cue in the text, wherein the breathing cue is identified by a first tag of a markup language; generate a prosody cue to enhance speech to be synthesized from the text; determine a second insertion point of the prosody cue in the text, wherein the prosody cue is identified by a second tag of the markup language; insert the breathing cue at the first insertion point based on the first tag and the prosody cue at the second insertion point based on the second tag; and trigger a synthesis of the speech from the text, the breathing cue, and the prosody cue.
Type: Grant
Filed: June 23, 2022
Date of Patent: December 19, 2023
Assignee: Intel Corporation
Inventors: Jessica M. Christian, Peter Graff, Crystal A. Nakatsu, Beth Ann Hockey
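The insertion step described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tag strings `<breath/>` and `<slow/>` are hypothetical placeholders rather than tags from any real markup language, insertion points are taken to be character offsets into the original text, and how those offsets are chosen is left out entirely.

```python
def insert_cues(text, cues):
    """Insert markup cues into text at the given character offsets.

    `cues` is a list of (offset, tag) pairs, with offsets measured against
    the ORIGINAL text. Inserting from the highest offset down keeps the
    remaining offsets valid as the string grows.
    """
    marked_up = text
    for offset, tag in sorted(cues, key=lambda cue: cue[0], reverse=True):
        if not 0 <= offset <= len(text):
            raise ValueError(f"offset {offset} is outside the text")
        marked_up = marked_up[:offset] + tag + marked_up[offset:]
    return marked_up


# Hypothetical cues: a prosody marker at the start and a breathing
# marker after the first sentence (offset 13 follows "Hello there. ").
marked = insert_cues(
    "Hello there. How are you?",
    [(13, "<breath/> "), (0, "<slow/> ")],
)
```

The marked-up string would then be handed to a synthesizer that understands the tags; that step is outside this sketch.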
-
Patent number: 11848007
Abstract: A display apparatus including a display, a voice input receiver, a memory, a communication circuitry and a processor. The processor being configured to control the display to display at least one first identifier corresponding to at least one first component on a first area in the screen during a first time such that one of the at least one first identifier is selectable by a first user voice input, and control the display to display at least one second identifier corresponding to the at least one second component on a second area in the screen during a second time different from the first time, such that one of the at least one second identifier is selectable by a second user voice input.
Type: Grant
Filed: July 13, 2022
Date of Patent: December 19, 2023
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Kyeonga Han, Soungmin Yoo
-
Patent number: 11842743
Abstract: Embodiments relate to audio processing unit(s) and methods for decoding an encoded audio bitstream, that includes a fill element with an identifier indicating a start of the fill element and fill data which includes a flag identifying whether to perform a base form of spectral band replication or an enhanced form of spectral band replication, wherein the base form of spectral band replication includes spectral patching, the enhanced form of spectral band replication includes harmonic transposition, one value of the flag indicates that said enhanced form of spectral band replication should be performed on the audio content, and another indicates that said base form of spectral band replication but not said harmonic transposition should be performed on the audio content, wherein the fill data further includes a parameter indicating whether pre-flattening is to be performed after spectral patching for avoiding spectral discontinuities.
Type: Grant
Filed: June 2, 2022
Date of Patent: December 12, 2023
Assignee: Dolby International AB
Inventors: Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11842720
Abstract: An audio processing system and a method thereof generate a synthesis model that can input an audio signal to generate feature data that can be used by a signal generator to generate a modified audio signal. Specifically, a pre-trained synthesis model is first generated using training audio data. Thereafter, a re-trained synthesis model is established by additionally training the pre-trained synthesis model. Based on a received instruction to modify at least one of sounding conditions of an audio signal to be processed, feature data is generated by inputting additional condition data into the re-trained synthesis model. The signal generator generates the modified audio signal from the generated feature data.
Type: Grant
Filed: May 3, 2021
Date of Patent: December 12, 2023
Assignee: YAMAHA CORPORATION
Inventor: Ryunosuke Daido
-
Patent number: 11837232
Abstract: This relates to an intelligent automated assistant in a video communication session environment. An example method includes, during a video communication session between at least two user devices, and at a first user device: receiving a first user voice input; in accordance with a determination that the first user voice input represents a communal digital assistant request, transmitting a request to provide context information associated with the first user voice input to the first user device; receiving context information associated with the first user voice input; obtaining a first digital assistant response based at least on a portion of the context information received from the second user device and at least a portion of context information associated with the first user voice input that is stored on the first user device; providing the first digital assistant response to the second user device; and outputting the first digital assistant response.
Type: Grant
Filed: February 28, 2023
Date of Patent: December 5, 2023
Assignee: Apple Inc.
Inventors: Niranjan Manjunath, Willem Mattelaer, Jessica Peck, Lily Shuting Zhang
-
Patent number: 11837251
Abstract: The present disclosure relates to a virtual counseling system in which a user can virtually receive counseling by inputting query information into a system. A virtual counseling system according to an embodiment of the present disclosure may include an input unit obtaining audio information from a user and generating audio data; a determination unit receiving the audio data through the input unit, determining a type of the audio data, and generating type information on the audio data; and a text data generation unit generating object data by receiving the type information from the determination unit, converting content of the audio data into first text data, and combining the object data and the first text data to generate second text data.
Type: Grant
Filed: March 25, 2021
Date of Patent: December 5, 2023
Assignee: SOLUGATE INC.
Inventor: Sung Tae Min
-
Patent number: 11830497
Abstract: A multi-tier domain is provided for processing user voice queries and making routing decisions for generating responses, including for user voice queries that include multi-domain trigger words or phrases. When an utterance is recognized as different intents in different domains, a routing system for a domain may consider contextual signals, including those associated with other domains, to determine whether the domain is the proper one to handle the request. This determination can be performed with a statistical model specifically trained to make such determinations using the available contextual data.
Type: Grant
Filed: June 24, 2021
Date of Patent: November 28, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Ponnu Jacob, Jingqian Zhao, Prathap Ramachandra, Krupal Maddipati, Jinning Wu, Charlotte Alizerine Dzialo, Daksh Gautam, Wenbo Yan, Liu Yang, Uday Kumar Kollu
-
Patent number: 11823679
Abstract: Techniques related to a method and system of audio false keyphrase rejection using speaker recognition are described herein. Such techniques use speaker recognition of a computer originated voice to omit actions triggered when a keyphrase is present in captured audio and omitted when speech of the captured audio was spoken by the computer originated voice.
Type: Grant
Filed: July 13, 2022
Date of Patent: November 21, 2023
Assignee: Intel Corporation
Inventors: Jacek Ossowski, Tobias Bocklet, Kuba Lopatka
-
Patent number: 11817105
Abstract: An authentication system may receive non-voice biometric authentication information from a user and authenticate the user via a first authentication method using the non-voice biometric authentication information. After authenticating the user via the first authentication method, the authentication system can enhance or create, based on a verbal request or a verbal command received from the user, a voice profile associated with the user. Once the profile is enhanced or created, the user is enrolled into a voice biometric authentication program.
Type: Grant
Filed: January 27, 2022
Date of Patent: November 14, 2023
Assignee: United Services Automobile Association (USAA)
Inventors: Zakery Layne Johnson, Maland Keith Mortensen, Gabriel Carlos Fernandez, Debra Randall Casillas, Sudarshan Rangarajan, Thomas Bret Buckingham
-
Patent number: 11817098
Abstract: Systems and methods for detecting demographic bias in automatic speech recognition (ASR) systems. Corpuses of transcriptions from different demographic groups are analyzed, where one of the groups is known to be susceptible to bias and another group is known not to be susceptible to bias. A difference between the transcription accuracy for the first group and a transcription accuracy for a second group is measured. ASR accuracy for each group is measured and compared to each other using both statistics-based and practicality-based methodologies to determine whether a given ASR system or model exhibits a meaningful level of bias. Based on the statistical significance and the practical significance, an alert including a recommendation to adjust the ASR model is generated.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 14, 2023
Assignee: WELLS FARGO BANK, N.A.
Inventors: Yong Yi Bay, Menglin Cao, Yang Yang
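One way to picture the combination of statistical and practical significance mentioned in this abstract is the Python sketch below. The abstract does not say which tests are used, so the pooled two-proportion z-test, the function names, and the `alpha` and `min_gap` thresholds are all assumptions chosen for illustration, not the patented method.

```python
import math


def two_proportion_z_test(correct_a, total_a, correct_b, total_b):
    """Return (accuracy gap, z statistic, two-sided p-value) for two groups."""
    p_a = correct_a / total_a
    p_b = correct_b / total_b
    pooled = (correct_a + correct_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return p_a - p_b, 0.0, 1.0  # identical, degenerate accuracies
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return p_a - p_b, z, p_value


def bias_alert(correct_a, total_a, correct_b, total_b,
               alpha=0.05, min_gap=0.02):
    """Return an alert string only when the gap is both statistically
    significant (p < alpha) and practically significant (|gap| >= min_gap).
    Both thresholds are illustrative assumptions."""
    gap, _, p_value = two_proportion_z_test(
        correct_a, total_a, correct_b, total_b)
    if p_value < alpha and abs(gap) >= min_gap:
        return (f"ALERT: accuracy gap {gap:+.3f} (p={p_value:.2g}); "
                "consider adjusting the ASR model")
    return None


# Hypothetical word-level counts: group A 900/1000 correct, group B 950/1000.
alert = bias_alert(900, 1000, 950, 1000)
no_alert = bias_alert(940, 1000, 950, 1000)
```

Requiring both conditions avoids alerting on gaps that are tiny but detectable in large corpora, and on large gaps that could be noise in small ones.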
-
Patent number: 11810556
Abstract: Techniques for outputting interactive content and processing interactions with respect to the interactive content are described. While outputting requested content, a system may determine that interactive content is to be outputted. The system may determine output data including a first portion indicating that interactive content is going to be output and a second portion representing content corresponding to an item. The system may send the output data to the device. A user may interact with the output data, for example, by requesting performance of an action with respect to the item.
Type: Grant
Filed: June 24, 2021
Date of Patent: November 7, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Mark Conrad Kockerbeck, Srikanth Nori, Jilani Zeribi, Ryan Summers, Volkan Aginlar
-
Patent number: 11810566
Abstract: Systems and methods are described for handling interruptions during a digital assistant session between a user and a digital assistant by detecting if an interruption event is to occur during the digital assistant session. In response to detecting that the interruption event is to occur, options to address the interruption are provided.
Type: Grant
Filed: February 18, 2022
Date of Patent: November 7, 2023
Assignee: Rovi Guides, Inc.
Inventors: Vikram Makam Gupta, Vishwas Sharadanagar Panchaksharaiah