Speech Recognition Depending On Application Context, E.g., In A Computer, Etc. (EPO) Patents (Class 704/E15.044)
  • Patent number: 11972759
    Abstract: Mitigating mistranscriptions resolves errors in a transcription of the audio portion of a video based on a semantic matching with contextualized data electronically garnered from one or more sources other than the audio portion of the video. A mistranscription is identified using a pretrained word embedding model that maps words to an embedding space derived from the contextualizing data. A similarity value for each vocabulary word of a multi-word vocabulary of the pretrained word embedding model is determined in relation to the mistranscription. Candidate words are selected based on the similarity values, each indicating a closeness of a corresponding vocabulary word to the mistranscription. The textual rendering is modified by replacing the mistranscription with a candidate word that, based on average semantic similarity values, is more similar to the mistranscription than is each other candidate word.
    Type: Grant
    Filed: December 2, 2020
    Date of Patent: April 30, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shikhar Kwatra, Vijay Ekambaram, Hemant Kumar Sivaswamy, Rodrigo Goulart Silva
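The candidate-selection idea in the abstract of patent 11972759 can be illustrated with a minimal sketch. The three-dimensional toy vectors, the vocabulary, and the helper names `cosine`, `rank_candidates`, and `replace_word` are all assumptions made for illustration. The abstract does not specify an implementation, and this sketch ranks by a single cosine similarity per word rather than the averaged similarity values the abstract mentions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_candidates(mistranscription_vec, vocabulary, top_k=2):
    """Score every vocabulary word against the mistranscription, best first."""
    scored = [(word, cosine(mistranscription_vec, vec))
              for word, vec in vocabulary.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

def replace_word(transcript, wrong, wrong_vec, vocabulary):
    """Replace the mistranscribed token with the closest vocabulary word."""
    best_word, _ = rank_candidates(wrong_vec, vocabulary, top_k=1)[0]
    return [best_word if token == wrong else token for token in transcript]

# Hypothetical embedding space built from contextual data (invented values).
TOY_VOCAB = {
    "kernel": [0.9, 0.1, 0.0],
    "colonel": [0.1, 0.9, 0.1],
    "thread": [0.7, 0.2, 0.3],
}

fixed = replace_word(["the", "curnel", "panicked"], "curnel",
                     [0.8, 0.15, 0.1], TOY_VOCAB)
print(fixed)
```

A real system would obtain the vectors from a pretrained embedding model instead of a hand-written dictionary.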
  • Patent number: 11973806
    Abstract: Techniques are disclosed relating to automatically altering a displayed user interface for an event. A server computer system may cause, via a conferencing service, display of a user interface for an event having a group of users accessing the conferencing service via a plurality of user devices, the displayed interface including an indication of a video feed of a user in the group of users that is currently active. The system may store, in a database, data for the event, including content of audio and video feeds of users in the event. The system may analyze a set of characteristics included in the content of the audio and video feeds. The system may alter, while the indication of the video feed of the user is being displayed, aspects of the displayed user interface other than the indication, where the altering is performed based on the analyzing.
    Type: Grant
    Filed: January 19, 2023
    Date of Patent: April 30, 2024
    Assignee: Toucan Events Inc.
    Inventors: Ivo Walter Rothschild, Paul Robert Murphy, Asahi Sato, Antonia Theodora Hellman, Ethan Duncan He-Li Hellman, Steven Emmanuel Hellman
  • Patent number: 11955124
    Abstract: An example electronic device includes a housing; a touchscreen display; a microphone; at least one speaker; a button disposed on a portion of the housing or set to be displayed on the touchscreen display; a wireless communication circuit; a processor; and a memory. When a user interface is not displayed on the touchscreen display, the electronic device enables a user to receive a user input through the button, receives user speech through the microphone, and then provides data on the user speech to an external server. An instruction for performing a task is received from the server. When the user interface is displayed on the touchscreen display, the electronic device enables the user to receive the user input through the button, receives user speech through the microphone, and then provides data on the user speech to the external server.
    Type: Grant
    Filed: January 10, 2022
    Date of Patent: April 9, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Sang-Ki Kang, Jang-Seok Seo, Kook-Tae Choi, Hyun-Woo Kang, Jin-Yeol Kim, Chae-Hwan Li, Kyung-Tae Kim, Dong-Ho Jang, Min-Kyung Hwang
  • Patent number: 11932256
    Abstract: The disclosure generally pertains to systems and methods for identifying a location of an occupant in a vehicle. In an example method, a processor deconvolves a vocal utterance by an occupant of a vehicle and also determines an angle of arrival of the vocal utterance. The location of the occupant in the vehicle is then identified by the processor based on evaluating the deconvolved vocal utterance and the angle of arrival of the vocal utterance. Deconvolving the vocal utterance can involve applying a cabin impulse response to the vocal utterance for eliminating undesirable effects that may be imposed upon the vocal utterance by acoustic characteristics of the cabin of the vehicle (echo, sound reflections, sound damping, reverberation etc.). In some applications, the processor may refer to a lookup table to estimate a location of the occupant in the vehicle.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: March 19, 2024
    Assignee: Ford Global Technologies, LLC
    Inventors: Ranjani Rangarajan, Leah Busch, Karthik Krishnamurthy, Nikhitha Bekkanti
  • Patent number: 11935529
    Abstract: Techniques for virtual assistant execution of ambiguous commands are provided. A voice instruction from a user may be received at a virtual assistant. The voice instruction may request the virtual assistant to perform a command. The command that is most likely being requested by the voice instruction from the user is identified. An ordered set of actions to execute when performing the command may be retrieved. Each action of the ordered set of actions may indicate if the action is reversible. Each action of the ordered set of actions may be executed in order until a not reversible action is reached or no further actions are in the ordered set of actions.
    Type: Grant
    Filed: June 15, 2021
    Date of Patent: March 19, 2024
    Assignee: MOTOROLA SOLUTIONS, INC.
    Inventors: Ying Bin Tan, Chew How Lim, Yih Farn Ghee, Joe Yie Chong
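The execution rule described in the abstract of patent 11935529 can be sketched as a short loop. The `Action` type, the function name `execute_until_irreversible`, and the example actions are hypothetical. The sketch also assumes that execution stops before the first non-reversible action rather than after it, which the abstract leaves open.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Action:
    name: str
    reversible: bool
    run: Callable[[], None]

def execute_until_irreversible(actions: List[Action]) -> List[str]:
    """Run actions in order; stop at the first one that cannot be undone."""
    executed = []
    for action in actions:
        if not action.reversible:
            break  # assumption: the irreversible action itself is not run
        action.run()
        executed.append(action.name)
    return executed

log: List[str] = []
plan = [
    Action("draft_message", True, lambda: log.append("draft_message")),
    Action("show_preview", True, lambda: log.append("show_preview")),
    Action("send_message", False, lambda: log.append("send_message")),
    Action("archive_thread", True, lambda: log.append("archive_thread")),
]
done = execute_until_irreversible(plan)
print(done)
```

Stopping at the first irreversible step means an ambiguous command can make progress without committing to anything that cannot be rolled back.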
  • Patent number: 11922127
    Abstract: According to an embodiment, an electronic device comprises: a memory, a communication module comprising communication circuitry, and a processor operatively connected with the memory and the communication module. The processor is configured to control the electronic device to: obtain an utterance text corresponding to utterance speech, obtain an intent of the utterance text and emotion information based on the utterance speech and the utterance text, obtain a response text for the utterance text based on the intent of the utterance text and the emotion information, obtain a markup language including information about an output unit of text of the response text based on at least one of the intent of the utterance text, the emotion information, or the response text, and add the markup language to the response text and provide the response text. The text output unit is at least one selected from among a phoneme unit, a consonant and vowel unit, a syllable unit, or a word unit.
    Type: Grant
    Filed: May 21, 2021
    Date of Patent: March 5, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kichul Kim, Yoonjae Park, Jooyong Byeon, Youngkyu Kim, Byungkeon Park, Soowon Jang, Changyong Jeong, Sungbin Jin, Jaeyung Yeo
  • Patent number: 11875792
    Abstract: A computer implemented method, computer system, and computer program product for executing a voice command. A number of processor units displays a view of a location with voice command devices in response to detecting the voice command from a user. The number of processor units displays a voice command direction for the voice command in the view of the location. The number of processor units changes the voice command direction in response to a user input. The number of processor units identifies a voice command device from the voice command devices in the location based on the voice command direction to form a selected voice command device. The number of processor units executes the voice command using the selected voice command device.
    Type: Grant
    Filed: August 17, 2021
    Date of Patent: January 16, 2024
    Assignee: International Business Machines Corporation
    Inventors: Clement Decrop, Jeremy R. Fox, Tushar Agrawal, Sarbajit K. Rakshit
  • Patent number: 11869499
    Abstract: An information processing apparatus includes an extracting unit (133) that extracts a changing message related to a change in macro data (M), the changing message including at least one piece of first information indicating a function to be executed, and second information linked to the first information, from a user speech; a presuming unit (134) that presumes an element to be changed in the macro data (M) based on the changing message extracted by the extracting unit (133); and a changing unit (135) that changes the element to be changed in the macro data (M) presumed by the presuming unit (134), based on the changing message.
    Type: Grant
    Filed: July 1, 2019
    Date of Patent: January 9, 2024
    Assignee: Sony Corporation
    Inventors: Yuhei Taki, Hiro Iwase, Kunihito Sawai, Masaki Takase, Akira Miyashita
  • Patent number: 11854573
    Abstract: Techniques for performing conversation recovery of a system/user exchange are described. In response to determining that an action responsive to a user input cannot be performed, a system may determine a topic to recommend to a user. The topic may be unrelated to the original substance of the user input. The system may have access to various data representing a context in which a user provides an input to the system. The system may use these inputs and various data at runtime to make a determination regarding whether a user should be recommended a topic, as well as what that topic should be. The system may cause a question to be output to the user, with the question asking the user about the topic, for example whether the user would like a song played, whether the user would like to hear information about a particular individual (e.g., artist), whether the user would like to know about a particular skill (e.g.
    Type: Grant
    Filed: September 10, 2020
    Date of Patent: December 26, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Gregory Newell, Eliav Kahan, Ravi Chandra Reddy Yasa, David Suarez, Joel Toledano
  • Patent number: 11854422
    Abstract: A method and a device for information interaction. The method comprises: in response to receiving an oral practice request initiated by a user, outputting task information for indicating a target oral practice task (201), wherein the task information corresponds to task intention information and task keyword information; acquiring voice information inputted by the user with regard to the task information (202); recognizing the voice information, so as to determine user intention information and user keyword information corresponding to the user (203); generating a matching result for indicating whether the user has completed the target oral practice task (204), wherein the matching result is obtained by the following step: respectively matching the user intention information with the task intention information, and the user keyword information with the task keyword information, so as to obtain the matching result; and presenting the matching result to the user (205).
    Type: Grant
    Filed: August 15, 2022
    Date of Patent: December 26, 2023
    Assignee: DOUYIN VISION CO., LTD.
    Inventors: Haoran Huang, Xi Luo, Fuxiang Li, Hang Li
  • Patent number: 11847124
    Abstract: Techniques for contextual search on multimedia content are provided. An example method includes extracting entities associated with multimedia content, wherein the entities include values characterizing one or more objects represented in the multimedia content, generating one or more query rewrite candidates based on the extracted entities and one or more terms in a query related to the multimedia content, providing the one or more query rewrite candidates to a search engine, scoring the one or more query rewrite candidates, ranking the scored one or more query rewrite candidates based on their respective scores, rewriting the query related to the multimedia content based on a particular ranked query rewrite candidate and providing for display, responsive to the query related to the multimedia content, a result set from the search engine based on the rewritten query.
    Type: Grant
    Filed: November 19, 2021
    Date of Patent: December 19, 2023
    Assignee: GOOGLE LLC
    Inventors: Gökhan Hasan Bakir, Károly Csalogány, Behshad Behzadi
  • Patent number: 11823678
    Abstract: Techniques for determining a command or intent likely to be subsequently invoked by a user of a system are described. A user inputs a command (either via a spoken utterance or textual input) to a system. The system determines content responsive to the command. The system also determines a second command or corresponding intent likely to be invoked by the user subsequent to the previous command. Such determination may involve analyzing pairs of intents, with each pair being associated with a probability that one intent of the pair will be invoked by a user subsequent to a second intent of the pair. The system then outputs first content responsive to the first command and second content soliciting the user as to whether the system is to execute the second command.
    Type: Grant
    Filed: February 28, 2022
    Date of Patent: November 21, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Anjishnu Kumar, Xing Fan, Arpit Gupta, Ruhi Sarikaya
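The intent-pair analysis in the abstract of patent 11823678 can be pictured as a lookup over pair probabilities. The function name `predict_next_intent`, the intent names, the probability values, and the `threshold` cutoff are all assumptions for illustration. The abstract states only that each intent pair carries a probability of one intent following the other.

```python
def predict_next_intent(current_intent, pair_probabilities, threshold=0.5):
    """Return the most probable follow-up intent, or None if none is likely enough.

    pair_probabilities maps (previous_intent, next_intent) to a probability.
    """
    candidates = {
        nxt: prob
        for (prev, nxt), prob in pair_probabilities.items()
        if prev == current_intent
    }
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    # Only suggest a follow-up when it clears the (assumed) confidence cutoff.
    return best if candidates[best] >= threshold else None

# Invented probabilities for demonstration only.
PAIRS = {
    ("PlayMusic", "SetVolume"): 0.6,
    ("PlayMusic", "GetWeather"): 0.1,
    ("GetWeather", "SetAlarm"): 0.3,
}

suggestion = predict_next_intent("PlayMusic", PAIRS)
print(suggestion)
```

When a suggestion is returned, the system described in the abstract would output a prompt asking the user whether to execute it alongside the response to the first command.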
  • Patent number: 11823704
    Abstract: Coordinating signal processing among computing devices in a voice-driven computing environment is provided. A first and second digital assistant can detect an input audio signal, perform a signal quality check, and provide indications that the first and second digital assistants are operational to process the input audio signal. A system can select the first digital assistant for further processing. The system can receive, from the first digital assistant, data packets including a command. The system can generate, for a network connected device selected from a plurality of network connected devices, an action data structure based on the data packets, and transmit the action data structure to the selected network connected device.
    Type: Grant
    Filed: March 18, 2021
    Date of Patent: November 21, 2023
    Assignee: GOOGLE LLC
    Inventors: Anshul Kothari, Gaurav Bhaya, Tarun Jain
  • Patent number: 11804215
    Abstract: An example process includes: receiving a first natural language input; initiating, by a digital assistant operating on the electronic device, a first task based on the first natural language input; determining whether the first task is of a predetermined type; and in accordance with a determination that the first task is of a predetermined type: determining whether one or more criteria are satisfied; and providing a response to the first natural language input, where providing the response includes: in accordance with a determination that the one or more criteria are not satisfied, outputting a first sound indicative of the initiated first task and a first verbal response indicative of the initiated first task; and in accordance with a determination that the one or more criteria are satisfied, outputting the first sound without outputting the first verbal response.
    Type: Grant
    Filed: September 21, 2022
    Date of Patent: October 31, 2023
    Assignee: Apple Inc.
    Inventors: Daniel A. Castellani, James N. Jones, Pedro Mari, Jessica J. Peck, Hugo D. Verweij, Garrett L. Weinberg, Mitchell R. Lerner
  • Patent number: 11798549
    Abstract: Embodiments include systems and methods for receiving an action item trigger by a user of a conferencing application; and in response to receiving the action item trigger, generating spoken words from audio data of a session of the conferencing application; normalizing the spoken words; generating higher-level representations of the normalized spoken words; determining semantic similarities of the higher-level representations of the normalized spoken words and higher level representations of normalized action words of an action word list; ranking options for top spoken words and action words based at least in part on the semantic similarities; identifying candidates for action words and/or phrases from the top spoken words and action words; and parsing the candidates to generate one or more action items.
    Type: Grant
    Filed: March 19, 2021
    Date of Patent: October 24, 2023
    Assignee: Mitel Networks Corporation
    Inventors: Jonathan Braganza, Kevin Lee, Logendra Naidoo
  • Patent number: 11790914
    Abstract: The present disclosure generally relates to voice-control for electronic devices. In some embodiments, the method includes, in response to detecting a plurality of utterances, associating the plurality of operations with a first stored operation set and detecting a second set of one or more inputs corresponding to a request to perform the operations associated with the first stored operation set; and performing the plurality of operations associated with the first stored operation set, in the respective order.
    Type: Grant
    Filed: September 22, 2022
    Date of Patent: October 17, 2023
    Assignee: Apple Inc.
    Inventors: Kevin Bartlett Aitken, Clare T. Kasemset
  • Patent number: 11763804
    Abstract: A method of leveraging a dialogue history of a conversational computing interface to execute an updated dialogue plan. The method comprises maintaining an annotated dialogue history of the conversational computing interface. The annotated dialogue history includes a plurality of traced steps defining a data-flow including input data used to execute a context-dependent operation and output data recorded from a previous execution of the context-dependent operation. The method further comprises recognizing an updated dialogue plan including a prefix of executable steps and an updated executable step following the prefix. The method further comprises automatically computer-recognizing that the prefix of executable steps of the updated dialogue plan matches a corresponding prefix of traced steps in the annotated dialogue history. The method further comprises re-using the data-flow from the prefix of traced steps in the annotated dialogue history to automatically determine input data of the updated executable step.
    Type: Grant
    Filed: June 29, 2020
    Date of Patent: September 19, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: David Leo Wright Hall, Pengyu Chen, Jason Andrew Wolfe, Jayant Sivarama Krishnamurthy
  • Patent number: 11755756
    Abstract: Systems and methods for sensitive data management are disclosed. A voice-enabled device may generate audio data representing a request from a user utterance. A remote system may perform speech-processing operations, including obtaining responsive text data from a third-party application. In examples, a sensitivity designation may be received from the third-party application, which may cause the remote system to encrypt the responsive text data, redact the text data, and/or remove the text data from the remote system after the response is provided to the voice-enabled device.
    Type: Grant
    Filed: August 14, 2020
    Date of Patent: September 12, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Jason Cline, Yolando Pereira, Arvind Kumar Babel, Bharanidharan Arul Janakiammal, Rohan Manish Chandra, Gary Scot Henderson
  • Patent number: 11749286
    Abstract: AM and LM parameters to be used for adapting an ASR model are derived for each audio segment of an audio stream comprising multiple audio programs. A set of identifiers, including a speaker identifier, a speaker domain identifier and a program domain identifier, is obtained for each audio segment. The set of identifiers are used to select most suitable AM and LM parameters for the particular audio segment. The embodiments enable provision of maximum constraints on the AMs and LMs and enable adaptation of the ASR model on the fly for audio streams of multiple audio programs, such as broadcast audio. This means that the embodiments enable selecting AM and LM parameters that are most suitable in terms of ASR performance for each audio segment.
    Type: Grant
    Filed: December 6, 2021
    Date of Patent: September 5, 2023
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Volodya Grancharov, Erlendur Karlsson, Sigurdur Sverrisson, Maxim Teslenko, Konstantinos Vandikas, Aneta Vulgarakis Feljan
  • Patent number: 11734502
    Abstract: Systems and methods to maintain amends to an annotation as discrete chronological events are disclosed. Exemplary implementations may: obtain a selection of a first annotation template for a first annotation via a client computing platform; generate a root node based on the selection of the first annotation template; obtain a first command to update to the first annotation; append a first update node to the root node responsive to the first command; obtain a second command to update to the first annotation; append a second update node to the first node responsive to the second command; receive an indication to present the first annotation; generate, responsive to the indication, the first annotation by populating the first annotation template included in the root node based on the first node set and in sequential order indicated by the edges; and effectuate presentation of the first annotation.
    Type: Grant
    Filed: December 1, 2022
    Date of Patent: August 22, 2023
    Assignee: Suki AI, Inc.
    Inventors: Badarinarayan Parthasarathi Burli, Harish Chandra Thuwal, Sai Chaitanya Ramineni
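The root-node and update-node structure in the abstract of patent 11734502 resembles an append-only linked list that is replayed to produce the current annotation. The sketch below is one possible reading. The class names `Node` and `AnnotationHistory`, the use of `str.format` templates, and the example field names are assumptions that do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    payload: dict
    next: Optional["Node"] = None  # edge to the next chronological update

class AnnotationHistory:
    """Keeps every amendment as a separate node instead of overwriting."""

    def __init__(self, template: str):
        self.root = Node({"template": template})
        self.tail = self.root

    def append_update(self, fields: dict) -> None:
        """Append an update node after the most recent node."""
        node = Node(dict(fields))
        self.tail.next = node
        self.tail = node

    def render(self) -> str:
        """Replay updates in order; later values override earlier ones."""
        values = {}
        node = self.root.next
        while node is not None:
            values.update(node.payload)
            node = node.next
        # Raises KeyError if a template field was never supplied.
        return self.root.payload["template"].format(**values)

history = AnnotationHistory("BP: {bp}, HR: {hr}")
history.append_update({"bp": "120/80", "hr": "70"})
history.append_update({"hr": "72"})
print(history.render())
```

Because no node is ever modified after it is appended, the full sequence of amendments remains available for audit.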
  • Patent number: 11726806
    Abstract: A display apparatus is provided. The display apparatus according to an embodiment includes a display, and a processor configured to control the display to display a UI screen including a plurality of text objects, control the display to display a text object in a different language from a preset language among the plurality of text objects, along with a preset number, and in response to a recognition result of a voice uttered by a user including the displayed number, perform an operation relating to a text object corresponding to the displayed number.
    Type: Grant
    Filed: September 2, 2020
    Date of Patent: August 15, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Yang-soo Kim, Suraj Singh Tanwar
  • Patent number: 11720324
    Abstract: Various embodiments of the present invention relate to an apparatus and a method for displaying an electronic document for processing a user's voice command in an electronic device. The electronic device includes an input device; a display; and a processor, wherein the processor may be configured to detect a voice command of a user using the input device, if outputting an electronic document corresponding to the voice command, identify at least one input field in the electronic document, determine guide information based on information of the at least one input field, and display the electronic document comprising the guide information using the display. Other embodiments may be possible.
    Type: Grant
    Filed: January 2, 2019
    Date of Patent: August 8, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kyungtae Kim, Seonho Lee, Yoonjeong Choi, Hosung You, Bunam Jeon, Taeho Ha, Changho Lee
  • Patent number: 11715455
    Abstract: A machine has a processor and a memory connected to the processor. The memory stores instructions executed by the processor to supply a name page in response to a request from an administrator machine. Name page updates are received from the administrator machine. The name page updates include participants and associated network contact information for the participants. A code is utilized to form a link to the name page. Prompts for textual name information and audio name information are supplied to a client machine that activates the link to the name page. Textual name information and audio name information are received from the client machine. The textual name information and audio name information are stored in association with the name page. Navigation tools are supplied to facilitate access to the textual name information and audio name information.
    Type: Grant
    Filed: October 12, 2020
    Date of Patent: August 1, 2023
    Assignee: NAMECOACH, INC.
    Inventor: Praveen Shanbhag
  • Patent number: 11711469
    Abstract: Methods, computer program products, and systems are presented.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: July 25, 2023
    Assignee: International Business Machines Corporation
    Inventors: Ryan Brink, Andrew R. Freed, Marco Noel
  • Patent number: 11690578
    Abstract: Systems and methods for controlling healthcare devices and systems using voice commands are presented. In some aspects a listening device may receive a voice command from a person. The voice command may be translated into human readable or machine readable text via a speech-to-text service. A control component may receive the text and send device-specific instructions to a medical device associated with a patient based on the translated voice command. In response to the instructions, the medical device may take an action on a patient. Some examples of actions taken may include setting an alarm limit on a monitor actively monitoring a patient and adjusting the amount of medication delivered by an infusion pump. Because these devices may be controlled using a voice command, in some cases, no physical or manual interaction is needed with the device. As such, multiple devices may be hands-free controlled from any location.
    Type: Grant
    Filed: February 2, 2021
    Date of Patent: July 4, 2023
    Assignee: CERNER INNOVATION, INC.
    Inventors: Chad Hays, Randy Lantz
  • Patent number: 11676592
    Abstract: A natural-language voice chatbot engages a consumer in a voice dialogue. The chatbot is customized for engaging the specific consumer based on features and characteristics of that specific consumer's speech and a lexicon associated with terms, words, and commands for item ordering. The consumer can perform voice queries for specific items and/or specific establishments for placing a pre-staged order with the chatbot. Once the consumer selects options with a specific establishment, a pre-staged order is provided to the corresponding establishment on behalf of the user. Location data for a consumer-operated device is monitored and when it is determined that the consumer will arrive at the establishment within a time period required by the establishment to prepare the pre-staged order, a message is sent to the establishment to begin preparing the pre-staged order.
    Type: Grant
    Filed: November 25, 2020
    Date of Patent: June 13, 2023
    Assignee: NCR Corporation
    Inventors: Jodessiah Sumpter, Christian McDaniel, Kendall Marie Rose, Shaundell D. Thompson
  • Patent number: 11664024
    Abstract: An artificial intelligence device may receive first voice data corresponding to first voice uttered by a user from a first peripheral device, acquire a first intention corresponding to the first voice data, transmit a first search result corresponding to the first intention to the first peripheral device, receive second voice data corresponding to second voice uttered by the user from a second peripheral device, acquire a second intention corresponding to the received second voice data, and transmit a search result corresponding to the second intention to the second peripheral device depending on whether the second intention is an interactive intention associated with the first intention.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: May 30, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Hyangjin Lee, Myeongok Son, Jaekyung Lee
  • Patent number: 11657107
    Abstract: Methods and systems for generating search results are disclosed. In some examples, one or more keywords are extracted from one or more stored reviews associated with a merchant offering. A first extracted keyword is associated with a stored listing of the merchant offering. The first extracted keyword may be absent from the stored listing. In response to a search query that includes the first keyword, a set of search results is provided, where the set of search results includes the listing associated with the first keyword.
    Type: Grant
    Filed: April 5, 2022
    Date of Patent: May 23, 2023
    Assignee: SHOPIFY INC.
    Inventors: Siavash Ghorbani, Carl Johan Gustavsson
  • Patent number: 11657803
    Abstract: Disclosed is a speech recognition method performed by one or more processors of a computing device, the speech recognition method including: performing first speech recognition on voice information to obtain first text information on the voice information; receiving feedback regarding the first text information; and generating final text information for the voice information based on the received feedback, in which the first speech recognition includes real-time speech recognition, and is performed through a neural network model of a first structure.
    Type: Grant
    Filed: November 2, 2022
    Date of Patent: May 23, 2023
    Assignee: ActionPower Corp.
    Inventors: Hyungwoo Kim, Dongchan Shin
  • Patent number: 11651770
    Abstract: The present disclosure is generally related to a data processing system to validate vehicular functions in a voice activated computer network environment. The data processing system can improve the efficiency of the network by discarding action data structures and requests that are invalid prior to their transmission across the network. The system can invalidate requests by comparing attributes of a vehicular state to attributes of a request state.
    Type: Grant
    Filed: September 14, 2020
    Date of Patent: May 16, 2023
    Assignee: GOOGLE LLC
    Inventors: Haris Ramic, Vikram Aggarwal, Moises Morgenstern Gali, David Roy Schairer, Yao Chen
  • Patent number: 11651765
    Abstract: Techniques and apparatuses for recognizing accented speech are described. In some embodiments, an accent module recognizes accented speech using an accent library based on device data, uses different speech recognition correction levels based on an application field into which recognized words are set to be provided, or updates an accent library based on corrections made to incorrectly recognized speech.
    Type: Grant
    Filed: October 14, 2020
    Date of Patent: May 16, 2023
    Assignee: Google Technology Holdings LLC
    Inventor: Kristin A. Gray
  • Patent number: 11647249
    Abstract: The present disclosure relates to methods and devices for testing video data being rendered at or using a media device. A plurality of video frames to be rendered is received, each frame comprising one or more primary screen objects and at least one further screen object. The received frames are rendered at or using the media device wherein the at least one further screen object is superimposed on the one or more primary screen objects of a given frame during rendering. The rendered frames are provided to a data model. Extracted metadata indicating the presence or absence of further screen objects in the rendered video frames is the output of the data model. The data model is also provided with original metadata associated with the video frames prior to rendering. The rendering of each further screen object is then tested based on the original metadata and extracted metadata relating to a given video frame.
    Type: Grant
    Filed: August 9, 2019
    Date of Patent: May 9, 2023
    Assignee: NAGRAVISION S.A.
    Inventors: Douglas Gore, Ping Zou
  • Patent number: 11631407
    Abstract: Smart speaker system mechanisms, associated with a smart speaker device comprising an audio capture device, are provided for processing audio sample data captured by the audio capture device. The mechanisms receive, from the audio capture device of the smart speaker device, an audio sample captured from a monitored environment. The mechanisms classify a sound in the audio sample data as a type of sound based on performing a joint analysis of a plurality of different characteristics of the sound and matching results of the joint analysis to criteria specified in a plurality of sound models. The mechanisms determine, based on the classification of the sound, whether a responsive action is to be performed based on the classification of the sound. In response to determining that a responsive action is to be performed, the mechanisms initiate performance of the responsive action by the smart speaker system.
    Type: Grant
    Filed: September 9, 2020
    Date of Patent: April 18, 2023
    Assignee: International Business Machines Corporation
    Inventors: Michael S. Gordon, James Kozloski, Ashish Kundu, Clifford A. Pickover, Komminist Weldemariam
  • Patent number: 11620340
    Abstract: Systems and methods for a media guidance application that generates results in multiple languages for search queries. In particular, the media guidance application resolves multiple language barriers by taking automatic and manual user language settings and applying those settings to a variety of potential search results.
    Type: Grant
    Filed: July 13, 2020
    Date of Patent: April 4, 2023
    Assignee: Rovi Product Corporation
    Inventor: Arun Sreedhara
  • Patent number: 11616756
    Abstract: Systems, methods, and computer-readable storage media for enabling secure transfer of Internet domains between registrars. An example method can include receiving, at a registry, a request from a first registrar for information associated with an object recorded in the registry and registered by the first registrar, then generating, at the registry, an authorization code, the authorization code having an expiration. The registry can then transmit, to the first registrar, the authorization code, which in turn can be given to the registrant. The registrant can forward the authorization code to the second registrar, and the registry can receive, from a second registrar before the expiration has been reached: the authorization code and a transfer request for the object, the transfer request identifying a transfer of the object from the first registrar to the second registrar.
    Type: Grant
    Filed: April 14, 2022
    Date of Patent: March 28, 2023
    Assignee: VeriSign, Inc.
    Inventors: James Gould, Srikanth Veeramachaneni, Matthew Pozun
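    Example (illustrative sketch, not taken from the patent): the flow described in the abstract above, in which a registry issues an expiring authorization code to the current registrar and later accepts it together with a transfer request, might look roughly like the following. The `Registry` class, its method names, and the one-hour default lifetime are all assumptions made for illustration; the abstract does not specify any of them.

```python
import secrets
import time


class Registry:
    """Toy registry that issues expiring transfer authorization codes (hypothetical)."""

    def __init__(self, ttl_seconds=3600.0):
        self.ttl_seconds = ttl_seconds  # assumed lifetime of a code
        self.owner = {}                 # object name -> registrar of record
        self.codes = {}                 # object name -> (code, expiry timestamp)

    def issue_code(self, obj, registrar, now=None):
        """Generate a code for obj, but only for the registrar that registered it."""
        now = time.time() if now is None else now
        if self.owner.get(obj) != registrar:
            raise PermissionError("only the registrar of record may request a code")
        code = secrets.token_urlsafe(16)
        self.codes[obj] = (code, now + self.ttl_seconds)
        return code

    def transfer(self, obj, code, new_registrar, now=None):
        """Move obj to new_registrar if the code matches and has not expired."""
        now = time.time() if now is None else now
        stored = self.codes.get(obj)
        if stored is None:
            return False
        expected, expiry = stored
        if now >= expiry or not secrets.compare_digest(code, expected):
            return False
        self.owner[obj] = new_registrar
        del self.codes[obj]  # a code is single-use in this sketch
        return True
```

    In this sketch the registrant would receive the value returned by `issue_code` from the first registrar and hand it to the second registrar, which then calls `transfer` before the expiry time.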
  • Patent number: 11610590
    Abstract: AM and LM parameters to be used for adapting an ASR model are derived for each audio segment of an audio stream comprising multiple audio programs. A set of identifiers, including a speaker identifier, a speaker domain identifier and a program domain identifier, is obtained for each audio segment. The set of identifiers is used to select the most suitable AM and LM parameters for the particular audio segment. The embodiments enable provision of maximum constraints on the AMs and LMs and enable adaptation of the ASR model on the fly for audio streams of multiple audio programs, such as broadcast audio. This means that the embodiments enable selecting AM and LM parameters that are most suitable in terms of ASR performance for each audio segment.
    Type: Grant
    Filed: March 30, 2021
    Date of Patent: March 21, 2023
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Volodya Grancharov, Erlendur Karlsson, Sigurdur Sverrisson, Maxim Teslenko, Konstantinos Vandikas, Aneta Vulgarakis Feljan
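    Example (illustrative sketch, not taken from the patent): one plausible way to use the three identifiers named in the abstract above is to look up parameters from the most specific identifier to the least specific one, falling back to a default. The fallback order, the `select_params` name, and the table layout are assumptions; the abstract only states that the identifier set is used to select the most suitable AM and LM parameters.

```python
def select_params(table, speaker_id, speaker_domain, program_domain, default):
    """Pick AM/LM parameters for one audio segment (hypothetical lookup order).

    table maps (identifier_kind, identifier_value) to a parameter set.
    The most specific identifier that has an entry wins.
    """
    candidates = [
        ("speaker", speaker_id),
        ("speaker_domain", speaker_domain),
        ("program_domain", program_domain),
    ]
    for key in candidates:
        # Skip identifiers that could not be obtained for this segment.
        if key[1] is not None and key in table:
            return table[key]
    return default


# Usage with made-up identifiers and parameter names.
table = {
    ("speaker", "anchor_17"): {"am": "am_anchor_17", "lm": "lm_news"},
    ("program_domain", "sports"): {"am": "am_generic", "lm": "lm_sports"},
}
generic = {"am": "am_generic", "lm": "lm_generic"}
segment_params = select_params(table, None, "commentator", "sports", generic)
```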
  • Patent number: 11594218
    Abstract: Web content with a speech interaction user interface capability is provided. Interactable elements of the web content are identified. For each of the interactable elements, one or more associated identifiers are determined and associated with a corresponding interactable element of the identified interactable elements in a data structure. A speech input is received from a user. Using the data structure, one of the interactable elements is matched to the received speech input. An action is automatically performed on the matched interactable element.
    Type: Grant
    Filed: September 18, 2020
    Date of Patent: February 28, 2023
    Assignee: ServiceNow, Inc.
    Inventors: Jebakumar Mathuram Santhosm Swvigaradoss, Satya Sarika Sunkara, Ankit Goel, Jason Aloia, Rishabh Verma
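    Example (illustrative sketch, not taken from the patent): the data structure described in the abstract above, which associates identifiers with interactable elements and is then used to match a speech input, could be approximated with a dictionary and fuzzy string matching. The element ids, the identifiers, and the use of `difflib` with a 0.6 cutoff are assumptions; the speech input is represented here as already-transcribed text.

```python
import difflib


def build_index(elements):
    """Map each lower-cased identifier to the id of its interactable element."""
    index = {}
    for element_id, identifiers in elements.items():
        for identifier in identifiers:
            index[identifier.lower()] = element_id
    return index


def match_element(index, speech_text, cutoff=0.6):
    """Return the element id whose identifier is closest to the spoken text, or None."""
    query = speech_text.lower().strip()
    close = difflib.get_close_matches(query, list(index), n=1, cutoff=cutoff)
    return index[close[0]] if close else None


# Usage with made-up page elements.
elements = {
    "btn-submit": ["submit", "send form"],
    "link-home": ["home", "go home"],
}
index = build_index(elements)
matched = match_element(index, "Submit")
```

    An action such as a click would then be dispatched to the matched element; that step depends on the page runtime and is left out of the sketch.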
  • Patent number: 11595459
    Abstract: A conferencing system terminal device includes a communication device receiving one or more videoconference feeds depicting one or more subjects engaged in a videoconference from one or more remote electronic devices. The conferencing system terminal device includes a contextual information extraction engine extracting contextual information of the videoconference from the one or more videoconference feeds. One or more processors automatically apply overlay indicia generated from the contextual information to at least one videoconference feed during the videoconference. A display presents the at least one videoconference feed after the overlay indicia is applied.
    Type: Grant
    Filed: October 14, 2021
    Date of Patent: February 28, 2023
    Assignee: Motorola Mobility LLC
    Inventors: Alexandre Novaes Olivieri, Amit Kumar Agrawal
  • Patent number: 11580958
    Abstract: The present disclosure relates to a method and a device for recognizing speech in a vehicle. The method for recognizing the speech in the vehicle may include collecting one or more types of information, determining information to be linked with each other for speech recognition based on an information processing priority predefined corresponding to each type of the collected information, analyzing the determined information to perform the speech recognition for a signal input through a microphone, and extracting at least one of a wake up voice or a command voice through the speech recognition to control the vehicle. Therefore, the present disclosure has an advantage of more accurately performing the speech recognition by linking collected various information in the vehicle with each other.
    Type: Grant
    Filed: September 9, 2020
    Date of Patent: February 14, 2023
    Assignees: HYUNDAI MOTOR COMPANY, KIA MOTORS CORPORATION
    Inventors: Kyung Chul Lee, Young Jae Park
  • Patent number: 11562744
    Abstract: In one embodiment, a method includes receiving a voice input from a user and determining a first style of the voice input, based on first features extracted from the voice input. A second style for a voice response having second features may then be determined based on the first style. Finally, the voice response may be generated based on the second features of the second style, and this voice response may be provided in response to the voice input.
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: January 24, 2023
    Assignee: Meta Platforms Technologies, LLC
    Inventors: Yang Gao, Weiyi Zheng, Zhaojun Yang, Thilo Wolfgang Koehler, Christian Fuegen, Qing He
  • Patent number: 11562747
    Abstract: One embodiment provides a method that includes obtaining a default language corpus. A second language corpus is obtained based on a second language preference. A first transcription of an utterance is received using the default language corpus and natural language processing (NLP). At least one problem word in the first transcription is determined based on an associated grammatical relevance to neighboring words in the first transcription. Upon determining that a first probability score is below a first threshold, an acoustic lookup is performed for an audible match for the problem word in the first transcription based on an associated acoustical relevance. Upon determining that a second probability score is below a second threshold, it is determined whether a match for the problem word exists in the secondary language corpus. Upon determining that the match exists in the secondary language corpus, a second transcription for the utterance is provided.
    Type: Grant
    Filed: March 22, 2021
    Date of Patent: January 24, 2023
    Assignee: International Business Machines Corporation
    Inventors: Raphael Arar, Chris Kau, Robert J. Moore, Chung-hao Tan
  • Patent number: 11551691
    Abstract: Systems and techniques for an adaptive conversation support bot are described herein. An audio stream may be obtained including a conversation of a first user. An event may be identified in the conversation using the audio stream. A first keyword phrase may be extracted from the audio stream in response to identification of the event. The audio stream may be searched for a second keyword phrase based on the first keyword phrase. An action may be performed based on the first keyword phrase and the second keyword phrase. Results of the action may be output via a context appropriate output channel. The context appropriate output channel may be determined based on a context of the conversation and a privacy setting of the first user.
    Type: Grant
    Filed: January 6, 2021
    Date of Patent: January 10, 2023
    Assignee: Wells Fargo Bank, N.A.
    Inventor: Vincent Le Chevalier
  • Patent number: 11538466
    Abstract: Among other things, a developer of an interaction application for an enterprise can create items of content to be provided to an assistant platform for use in responses to requests of end-users. The developer can deploy the interaction application using defined items of content and an available general interaction model including intents and sample utterances having slots. The developer can deploy the interaction application without requiring the developer to formulate any of the intents, sample utterances, or slots of the general interaction model.
    Type: Grant
    Filed: March 12, 2020
    Date of Patent: December 27, 2022
    Assignee: Voicify, LLC
    Inventors: Jeffrey K. McMahon, Robert T. Naughton, Nicholas G. Laidlaw, Alexander M. Dunn, Gavin Berkowitz
  • Patent number: 11521624
    Abstract: A voice controlled assistant has a housing to hold one or more microphones, one or more speakers, and various computing components. The housing has an elongated cylindrical body extending along a center axis between a base end and a top end. The microphone(s) are mounted in the top end and the speaker(s) are mounted proximal to the base end. The microphone(s) and speaker(s) are coaxially aligned along the center axis. The speaker(s) are oriented to output sound directionally toward the base end and opposite to the microphone(s) in the top end. The sound may then be redirected in a radial outward direction from the center axis at the base end so that the sound is output symmetric to, and equidistant from, the microphone(s).
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: December 6, 2022
    Assignee: Amazon Technologies, Inc.
    Inventor: Timothy Theodore List
  • Patent number: 11518399
    Abstract: An agent device includes a display controller configured to cause a first display to display an agent image when an agent providing a service including causing an output device to output a voice response to an utterance of a user is activated, and a controller configured to execute particular control for causing a second display to display the agent image according to loudness of a voice received by an external terminal receiving a vocal input.
    Type: Grant
    Filed: March 25, 2020
    Date of Patent: December 6, 2022
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Yoshifumi Wagatsuma, Kengo Naiki, Yusuke Oi
  • Patent number: 11508260
    Abstract: Disclosed is a language learning technology for deaf people. A deaf-specific language learning system includes: a sound input device configured to receive a voice from an external source; a learning server configured to store learning data and correction information; a signal processor configured to output voice pattern information corresponding to a voice signal received from the sound input device; a learning processor configured to output learning pattern information regarding the learning data received from the learning server and also output a learning result through similarity analysis; and an actuator controller configured to vibrate a vibration actuator according to the voice pattern information and the learning pattern information.
    Type: Grant
    Filed: November 8, 2018
    Date of Patent: November 22, 2022
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Sung Yong Shin, Jong Moo Sohn
  • Patent number: 11505200
    Abstract: An agent management device includes a processor including hardware, the processor being configured to: generate a response to an inquiry from an occupant of a moving body; determine whether or not a change in a control content of the moving body to correspond to the response is possible; output a question whether or not to execute the change to the control content to the occupant when determining that the change to the control content is possible; and execute the change to the control content when the occupant approves the change to the control content.
    Type: Grant
    Filed: October 1, 2020
    Date of Patent: November 22, 2022
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventor: Eiichi Maeda
  • Patent number: 11475884
    Abstract: Systems and processes for operating an intelligent automated assistant are provided. An example process includes causing a first recognition result for a received natural language speech input to be displayed, where the first recognition result is in a first language and a second recognition result for the received natural language speech input is available for display responsive to receiving input indicative of user selection of the first recognition result, the second recognition result being in a second language. The example process further includes receiving the input indicative of user selection of the first recognition result and in response to receiving the input indicative of user selection of the first recognition result, causing the second recognition result to be displayed.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: October 18, 2022
    Assignee: Apple Inc.
    Inventors: Arnab Ghoshal, Roger Hsiao, Gorm Amand, Patrick L. Coffman, Mary Young
  • Patent number: 11468890
    Abstract: The present disclosure generally relates to voice-control for electronic devices. In some embodiments, the method includes, in response to detecting a plurality of utterances, associating the plurality of operations with a first stored operation set and detecting a second set of one or more inputs corresponding to a request to perform the operations associated with the first stored operation set; and performing the plurality of operations associated with the first stored operation set, in the respective order.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: October 11, 2022
    Assignee: Apple Inc.
    Inventors: Jigar Vasant Gada, Kevin Bartlett Aitken
  • Patent number: 11450326
    Abstract: An artificial intelligence (AI) device, such as a robot, comprises: an output interface to output content in response to a request of a user; a camera to acquire an image of the user; a microphone to acquire a voice signal including a voice content uttered by the user; a processor to determine a characteristic of the user based on the content, the image, and/or the voice signal, and recognize the voice content through a voice recognition mode corresponding to the determined characteristic. The AI device may include a communication interface to forward the voice signal to a remote computer that identifies the characteristic and recognizes the voice content based on the characteristic. According to an embodiment, when an irregular voice is recognized from the acquired voice signal, the processor may recognize a regular voice corresponding to the irregular voice using an artificial intelligence-based learning model.
    Type: Grant
    Filed: February 5, 2020
    Date of Patent: September 20, 2022
    Assignee: LG ELECTRONICS INC.
    Inventor: Namgeon Kim