Speech Recognition (EPO) Patents (Class 704/E15.001)

  • Patent number: 12248659
    Abstract: Methods, apparatus, systems, and computer-readable media are provided for tailoring composite graphical assistant interfaces for interacting with multiple different connected devices. The composite graphical assistant interfaces can be generated in response to a user providing a request for an automated assistant to cause a connected device to perform a particular function. In response to the automated assistant receiving the request, the automated assistant can identify other connected devices, and other functions capable of being performed by the other connected devices. The other functions can then be mapped to various graphical control elements in order to provide a composite graphical assistant interface from which the user can interact with different connected devices. Each graphical control element can be arranged to reflect how each connected device is operating simultaneous to the presentation of the composite graphical assistant interface.
    Type: Grant
    Filed: May 23, 2023
    Date of Patent: March 11, 2025
    Assignee: GOOGLE LLC
    Inventors: Yuzhao Ni, David Roy Schairer
  • Patent number: 12244392
    Abstract: A hub for supporting a voice relay between wearable terminals in a vehicle includes a reception processing device that receives a voice signal from at least one transmission terminal among two or more wearable terminals in the vehicle, a transmission processing device that transmits the voice signal to at least one reception terminal among the wearable terminals, and a controller that establishes a communication channel with each of the wearable terminals, selects a target terminal based on a voice signal obtained from a conversation attempt terminal attempting to transmit voice, and controls the reception processing device and the transmission processing device to relay bidirectional voice information transfer between the conversation attempt terminal and the target terminal.
    Type: Grant
    Filed: May 24, 2022
    Date of Patent: March 4, 2025
    Assignees: Hyundai Motor Company, Kia Corporation
    Inventor: Yong Sik Cho
  • Patent number: 12243517
    Abstract: A task-oriented dialog system determines an endpoint in a user utterance by receiving incremental portions of a user utterance that is provided in real time during a task-oriented communication session between a user and a virtual agent (VA). The task-oriented dialog system recognizes words in the incremental portions using an automated speech recognition (ASR) model and generates semantic information for the incremental portions of the utterance by applying a natural language processing (NLP) model to the recognized words. An acoustic-prosodic signature of the incremental portions of the utterance is generated using an acoustic-prosodic model. The task-oriented dialog system can generate a feature vector that represents the incrementally recognized words, the semantic information, the acoustic-prosodic signature, and corresponding confidence scores of the model outputs. A model is applied to the feature vector to identify a likely endpoint in the user utterance.
    Type: Grant
    Filed: October 13, 2021
    Date of Patent: March 4, 2025
    Assignee: Interactions LLC
    Inventors: Mahnoosh Mehrabani, Srinivas Bangalore
  • Patent number: 12236969
    Abstract: A method for processing sound used in a speech recognition robot is disclosed. The method for processing sound comprises the steps of: recognizing, by a robot, an obstacle on a driving path; calculating, by the robot, a driving distance to the obstacle; calculating a driving speed, by the robot; and determining, by the robot, a point in time at which a transient sound is generated by an impact caused by passing through the obstacle, wherein the point in time at which the transient sound is generated may be determined, by the robot, from the driving distance to the obstacle and the driving speed. The robot can transmit and receive a wireless signal on a mobile communication network established according to 5G (fifth generation) communication.
    Type: Grant
    Filed: June 18, 2019
    Date of Patent: February 25, 2025
    Assignee: LG ELECTRONICS INC.
    Inventor: Ji Hwan Park
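    Illustration: the abstract above says the time of the transient sound can be determined from the driving distance to the obstacle and the driving speed. A minimal sketch of that arithmetic follows, assuming constant speed until impact. The function name, parameter names, and units are hypothetical and do not come from the patent.

```python
def transient_sound_time(distance_m: float, speed_mps: float, now_s: float = 0.0) -> float:
    """Estimate when the impact sound occurs (assumption: constant speed).

    distance_m: driving distance to the obstacle, in metres (hypothetical unit)
    speed_mps:  current driving speed, in metres per second (hypothetical unit)
    now_s:      current timestamp, in seconds
    """
    if speed_mps <= 0:
        raise ValueError("speed must be positive to reach the obstacle")
    # time to reach the obstacle = distance / speed
    return now_s + distance_m / speed_mps


if __name__ == "__main__":
    # A robot 3 m from a threshold, driving at 1.5 m/s
    print(transient_sound_time(3.0, 1.5))
```

    A recognizer could use such an estimate to discount audio frames around the predicted instant, though the abstract does not say how the estimate is consumed.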
  • Patent number: 12219215
    Abstract: Systems and methods are described herein for displaying subjects of a portion of content. Media data of content is analyzed during playback, and a number of action signatures are identified. Each action signature is associated with a particular subject within the content. The action signature is stored, along with a timestamp corresponding to a playback position at which the action signature begins, in association with an identifier of the particular subject. Upon receiving a command, icons representing each of a number of action signatures at or near the current playback position are displayed. Upon receiving user selection of an icon corresponding to a particular signature, a portion of the content corresponding to the action signature is played back.
    Type: Grant
    Filed: November 28, 2023
    Date of Patent: February 4, 2025
    Assignee: Adeia Guides Inc.
    Inventors: Gabriel C Dalbec, Nicholas Lovell, Lance G. O'Connor
  • Patent number: 12206715
    Abstract: A method includes, at a media bridge configured to distribute a plurality of media streams among a plurality of client devices connected to the media bridge over a network, receiving the plurality of media streams from the plurality of client devices via the media bridge. The media bridge connects the plurality of client devices. The method further includes assigning a pair of names for each of the plurality of media streams. The pair of names includes a contribution name and a distribution name. The method further includes presenting a first list to the plurality of client devices. The first list includes a plurality of the distribution names for the plurality of media streams received from the plurality of client devices. The method further includes providing an indication of a current active speaker within the plurality of media streams via a signaling process.
    Type: Grant
    Filed: December 29, 2023
    Date of Patent: January 21, 2025
    Assignee: Cisco Technology, Inc.
    Inventors: Jacques Samain, Giovanna Carofiglio, Giulio Grassi, Enrico Loparco, Michele Papalini
  • Patent number: 12198715
    Abstract: A method for generating an impulse response representing a sound wave propagation from at least one sound source received at a listening point in a room includes obtaining the generated impulse response at the listening point in the room from a neural network architecture by providing at least the position of the listening point as input. The generated impulse response is generated using a neural network architecture. The network is trained by obtaining a 3D model of the room including the at least one sound source emitting sound in the room and obtaining a training group of simulated impulse responses, wherein each simulated impulse response is generated for a respective predefined listening point in the 3D model of the virtual room.
    Type: Grant
    Filed: November 10, 2023
    Date of Patent: January 14, 2025
    Assignee: TREBLE TECHNOLOGIES
    Inventor: Martin Eineborg
  • Patent number: 12183335
    Abstract: Provided is an information processing system including: a voice information acquisition unit that acquires voice information including an utterance made by a person; a status acquisition unit that acquires status information related to status of the person; and a support information generation unit that generates support information used for supporting operation of the person based on the voice information and the status information.
    Type: Grant
    Filed: July 5, 2019
    Date of Patent: December 31, 2024
    Assignee: NEC CORPORATION
    Inventor: Masamichi Tanabe
  • Patent number: 12166923
    Abstract: Exemplary aspects involve a data-communications apparatus or system that communicates over a broadband network with a plurality of remotely-located data-communications circuits respectively associated with a plurality of remotely-situated client entities. The system includes a data-communications platform (e.g., UC-CC) that processes incoming data-communication interactions including different types of digitally-represented communications, among which are incoming calls, and that is integrated with a memory circuit including a database of information sets. Each of the information sets includes experience data corresponding to past incoming data-communication interactions processed by the platform, and with aggregated and organized data based on data collected in previous incoming interactions.
    Type: Grant
    Filed: January 3, 2024
    Date of Patent: December 10, 2024
    Assignee: 8x8, Inc.
    Inventors: Bryan R. Martin, Matt Taylor, Manu Mukerji
  • Patent number: 12148432
    Abstract: Provided is a signal processing device including a main speech detection unit that detects, by using a neural network, whether or not a signal input to a sound collection device assigned to each of at least two speakers includes a main speech that is a voice of the corresponding speaker, and outputs frame information indicating presence or absence of the main speech.
    Type: Grant
    Filed: December 10, 2020
    Date of Patent: November 19, 2024
    Assignee: SONY GROUP CORPORATION
    Inventor: Atsuo Hiroe
  • Patent number: 12131740
    Abstract: Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user's voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.
    Type: Grant
    Filed: June 8, 2023
    Date of Patent: October 29, 2024
    Assignee: Capital One Services, LLC
    Inventors: Bozhao Tan, Isabelle Alice Yvonne Moulinier, David Almquist, June Wu
  • Patent number: 12117463
    Abstract: Implementations described herein are directed to leveraging odor sensor(s) of client device(s) in responding to user request(s) and/or in generating notification(s). Processor(s) of a given client device can receive a request to identify an odor in an environment of the given client device, process an odor data instance generated by the odor sensor(s) of the given client device, identify the odor based on processing the odor data instance, generate a response that identifies the odor and/or a source of the odor, and cause the response to the request to be rendered via the given client device. Processor(s) of the given client device can additionally, or alternatively, establish baseline odor(s) in the environment and generate a notification when an odor is detected that does not correspond to the baseline odor(s) and/or exclude the baseline odor(s) in generating the response to the request.
    Type: Grant
    Filed: December 13, 2021
    Date of Patent: October 15, 2024
    Assignee: GOOGLE LLC
    Inventor: Evan Brown
  • Patent number: 12087305
    Abstract: Techniques for performing spoken language understanding (SLU) processing are described. An SLU component may include an audio encoder configured to perform an audio-to-text processing task and an audio-to-NLU processing task. The SLU component may also include a joint decoder configured to perform the audio-to-text processing task, the audio-to-NLU processing task and a text-to-NLU processing task. Input audio data, representing a spoken input, is processed by the audio encoder and the joint decoder to determine NLU data corresponding to the spoken input.
    Type: Grant
    Filed: May 26, 2023
    Date of Patent: September 10, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Beiye Liu, Wael Hamza, Liwei Cai, Konstantine Arkoudas, Chengwei Su, Subendhu Rongali
  • Patent number: 12057124
    Abstract: A streaming speech recognition model includes an audio encoder configured to receive a sequence of acoustic frames and generate a higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The streaming speech recognition model also includes a label encoder configured to receive a sequence of non-blank symbols output by a final softmax layer and generate a dense representation. The streaming speech recognition model also includes a joint network configured to receive the higher order feature representation generated by the audio encoder and the dense representation generated by the label encoder and generate a probability distribution over possible speech recognition hypotheses. Here, the streaming speech recognition model is trained using self-alignment to reduce prediction delay by encouraging an alignment path that is one frame left from a reference forced-alignment frame.
    Type: Grant
    Filed: December 15, 2021
    Date of Patent: August 6, 2024
    Assignee: Google LLC
    Inventors: Jaeyoung Kim, Han Lu, Anshuman Tripathi, Qian Zhang, Hasim Sak
  • Patent number: 12033621
    Abstract: A method for speech recognition based on language adaptivity comprises obtaining voice data of a user. The method also comprises extracting, based on the obtained voice data, a phoneme feature representing pronunciation phoneme information. The phoneme feature is input to a pre-trained language discrimination model that is pre-trained based on a multilingual corpus. A language discrimination result corresponding to the phoneme feature and in accordance with the language discrimination model is obtained. The method also comprises obtaining a speech recognition result of the voice data based on a language acoustic model of a language corresponding to the language discrimination result. The method further comprises determining a speech recognition result of the voice data based on a language acoustic model of a language corresponding to the language discrimination result.
    Type: Grant
    Filed: April 15, 2021
    Date of Patent: July 9, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Dan Su, Tianxiao Fu, Min Luo, Qi Chen, Yulu Zhang, Lin Luo
  • Patent number: 12032919
    Abstract: Examples provide a large language model confidence scoring post-calibration based on a combination of temperature scaling, softmax denominator top-k probabilities selection, and polynomial regression. A secure machine learning system receives results generated by a machine learning (ML) model, the results including at least one confidence score. The secure ML system identifies at least one challenge in accuracy of the results generated by the ML model configured to perform document processing and understanding.
    Type: Grant
    Filed: August 16, 2023
    Date of Patent: July 9, 2024
    Assignee: Snowflake Inc.
    Inventor: Andrzej Szwabe
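    Illustration: the abstract above names temperature scaling and a softmax denominator restricted to the top-k probabilities. The sketch below shows one plausible reading of those two steps only. It is an assumption, not the patented method, and it omits the polynomial regression stage. The function name and default values are hypothetical.

```python
import math


def topk_confidence(logits, temperature=1.0, k=5):
    """Confidence of the top candidate under a temperature-scaled softmax
    whose denominator sums only the k largest scaled logits (assumed reading)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature scaling: divide logits by T, then keep the k largest
    scaled = sorted((z / temperature for z in logits), reverse=True)[:k]
    top = scaled[0]
    # Subtract the maximum before exponentiating for numerical stability
    exps = [math.exp(z - top) for z in scaled]
    return exps[0] / sum(exps)


if __name__ == "__main__":
    print(topk_confidence([2.0, 0.0, 0.0], temperature=1.0, k=3))
    print(topk_confidence([2.0, 0.0, 0.0], temperature=2.0, k=3))
```

    Raising the temperature flattens the distribution and lowers the reported confidence, which is the usual motivation for temperature scaling as a post-hoc calibration step.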
  • Patent number: 12027166
    Abstract: Systems and processes for operating a digital assistant are provided. An example process for performing a task includes, at an electronic device having one or more processors and memory, receiving a spoken input including a request, receiving an image input including a plurality of objects, selecting a reference resolution module of a plurality of reference resolution modules based on the request and the image input, determining, with the selected reference resolution module, whether the request references a first object of the plurality of objects based on at least the spoken input, and in accordance with a determination that the request references the first object of the plurality of objects, determining a response to the request including information about the first object.
    Type: Grant
    Filed: August 13, 2021
    Date of Patent: July 2, 2024
    Assignee: Apple Inc.
    Inventors: Hong Yu, Saurabh Adya, Shruti Bhargava, Myra C. Lukens, Jianpeng Cheng, Lin Li, Alkeshkumar M. Patel, Dhivya Piraviperumal, Stephen G. Pulman
  • Patent number: 11977841
    Abstract: An apparatus includes a display device that displays an input document in a user interface and at least one processor configured to receive a command to determine a document type of the input document and classify the input document to assign at least one document type and a respective confidence score. The processor assigns a significance score to each word of the input document that is indicative of a degree of influence the word has in deciding that the input document is of the at least one document type. The processor determines a level of visual emphasis to be placed on each word of the input document based on the significance score of the word and displays the input document on the display device with each word of the input document visually emphasized in accordance with the determined level of visual emphasis of the word.
    Type: Grant
    Filed: December 22, 2021
    Date of Patent: May 7, 2024
    Assignee: Bank of America Corporation
    Inventors: Jeremy A. Geiman, Kongkuo Lu, Ron Papka
  • Patent number: 11978471
    Abstract: A signal processing device according to an embodiment of the present invention includes: a conversion unit configured to convert an input mixed acoustic signal into a plurality of first internal states, a weighting unit configured to generate a second internal state which is a weighted sum of the plurality of first internal states based on auxiliary information regarding an acoustic signal of a target sound source when the auxiliary information is input, and generate the second internal state by selecting one of the plurality of first internal states when the auxiliary information is not input, and a mask estimation unit configured to estimate a mask based on the second internal state.
    Type: Grant
    Filed: February 12, 2020
    Date of Patent: May 7, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Tomohiro Nakatani
  • Patent number: 11966712
    Abstract: Provided are a server and a method for providing a multilingual subtitle service using an artificial intelligence learning model, and a method for controlling the server. The server includes: a communication unit configured to perform data communication with either or both of a first user terminal device of a client requesting translation of a content image and a second user terminal device of a worker performing a translation task; a storage configured to store a worker search list based on learned worker information, and an artificial intelligence learning model for performing a worker's task performance evaluation; and a controller configured to input image information on the content image to the artificial intelligence learning model in accordance with a worker recommendation command of the client to acquire a worker list of workers capable of translating the content image, and control the communication unit to transmit the acquired worker list to the first user terminal device.
    Type: Grant
    Filed: July 8, 2021
    Date of Patent: April 23, 2024
    Assignee: GLOZ INC.
    Inventors: Kug Koung Lee, Ho Kyun Kim, Bong Wan Kim
  • Patent number: 11968088
    Abstract: Example implementations include a method, apparatus, and computer-readable medium configured for generating a network configuration using a large language model (LLM). The apparatus receives, at an interface between a user and LLM, a natural language intent for a network configuration. The apparatus requests the large language model to update the network configuration to an updated network configuration that satisfies the natural language intent in a declarative network configuration language. The apparatus verifies whether the updated network configuration satisfies a configuration syntax of the declarative network configuration language to detect an error. The apparatus requests the large language model to update the updated network configuration to correct the error. The apparatus deploys the updated network configuration to a user network.
    Type: Grant
    Filed: June 7, 2023
    Date of Patent: April 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yu Yan, Ryan Andrew Beckett, Paramvir Bahl
  • Patent number: 11955114
    Abstract: Disclosed herein is a method for providing real-time trustworthiness analysis. The method comprises the steps of: receiving, by a speech data receiving module, speech data; delivering, by the speech data receiving module, the speech data to a speech analysis module; analyzing, by the speech analysis module, the speech data to identify one or more speech attributes; quantifying, by the speech analysis module, at least one of the speech attributes with an attribute score; and determining, by a trustworthiness determination module, a trustworthiness level based on the attribute score of the at least one of the speech attributes.
    Type: Grant
    Filed: July 14, 2023
    Date of Patent: April 9, 2024
    Inventor: Craig Hancock, Sr.
  • Patent number: 11908445
    Abstract: A method for proactive notifications in a voice interface device includes: receiving a first user voice request for an action with a future performance time; assigning the first user voice request to a voice assistant service for performance; subsequent to the receiving, receiving a second user voice request and in response to the second user voice request initiating a conversation with the user; and during the conversation: receiving a notification from the voice assistant service of performance of the action; triggering a first audible announcement to the user to indicate a transition from the conversation and interrupting the conversation; triggering a second audible announcement to the user to indicate performance of the action; and triggering a third audible announcement to the user to indicate a transition back to the conversation and rejoining the conversation.
    Type: Grant
    Filed: May 16, 2022
    Date of Patent: February 20, 2024
    Assignee: Google LLC
    Inventors: Kenneth Mixter, Daniel Colish, Tuan Nguyen
  • Patent number: 11904888
    Abstract: A system for controlling autonomously-controllable vehicle functions of an autonomous vehicle cooperating with partner subjects includes a database device with information on communication signals from partner subjects, action objectives, and scenarios, and has an autonomous vehicle with autonomously controllable vehicle functions communicatively connected to the database device. The autonomous vehicle includes a control device with a programmable unit and a surround sensor device. The control device receives sensor signals acquired by the surround sensor device of a surrounding area of the vehicle and communication signals originating from at least one partner subject. The control device determines a situation context based on the database information, and converts the captured communication signals into control signals for the autonomously controllable vehicle functions based on the situation context.
    Type: Grant
    Filed: October 27, 2022
    Date of Patent: February 20, 2024
    Assignee: Ford Global Technologies, LLC
    Inventors: Ahmed Benmimoun, Mohamed Benmimoun, Sufian Ashraf Mazhari, Mohsen Lakehal-Ayat, Muhammad Adeel Awan
  • Patent number: 11889026
    Abstract: Exemplary aspects involve a data-communications apparatus or system that communicates over a broadband network with a plurality of remotely-located data-communications circuits respectively associated with a plurality of remotely-situated client entities. The system includes a data-communications platform (e.g., UC-CC) that processes incoming data-communication interactions including different types of digitally-represented communications, among which are incoming calls, and that is integrated with a memory circuit including a database of information sets. Each of the information sets includes experience data corresponding to past incoming data-communication interactions processed by the platform, and with aggregated and organized data based on data collected in previous incoming interactions.
    Type: Grant
    Filed: August 18, 2022
    Date of Patent: January 30, 2024
    Assignee: 8x8, Inc.
    Inventors: Bryan R. Martin, Matt Taylor, Manu Mukerji
  • Patent number: 11887020
    Abstract: A thermal load prediction method and apparatus. The method includes configuring multiple prediction states and corresponding error thresholds and forming a prediction model. The prediction model predicts first thermal load magnitudes respectively corresponding to multiple testing time periods, wherein a target steam user uses boiler steam in the multiple testing time periods. The method further includes determining, according to the first thermal load magnitudes, relative prediction errors respectively corresponding to the multiple testing time periods; forming a state transition probability matrix according to the relative prediction errors; and determining a state probability of each prediction state in each future time period of future time periods according to the state transition probability matrix.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: January 30, 2024
    Assignee: ENNEW Technology Co., Ltd.
    Inventors: Shengwei Liu, Xin Huang
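    Illustration: the abstract above describes binning relative prediction errors into states by error thresholds, forming a state transition probability matrix, and deriving state probabilities for future periods. The sketch below shows a generic first-order Markov chain version of those steps. It is an assumed reading, not the patented procedure; all names, the threshold values, and the uniform fallback for unseen states are hypothetical.

```python
from bisect import bisect_right


def to_states(errors, thresholds):
    """Map each relative error to a state index using sorted thresholds."""
    return [bisect_right(thresholds, e) for e in errors]


def transition_matrix(states, n_states):
    """Estimate transition probabilities from consecutive state pairs."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a state never left in the data gets a uniform row
        matrix.append([c / total for c in row] if total else [1.0 / n_states] * n_states)
    return matrix


def step(distribution, matrix):
    """Advance the state probability distribution by one future period."""
    n = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    errors = [-0.2, 0.0, 0.2, 0.0, -0.2]   # made-up relative errors
    thresholds = [-0.1, 0.1]               # under / accurate / over
    states = to_states(errors, thresholds)
    matrix = transition_matrix(states, 3)
    print(step([1.0, 0.0, 0.0], matrix))
```

    Calling `step` repeatedly gives the state probabilities for successive future periods, starting from the distribution for the most recent observed period.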
  • Patent number: 11863592
    Abstract: A method includes, at a media bridge configured to distribute a plurality of media streams among a plurality of client devices connected to the media bridge over a network, receiving the plurality of media streams from the plurality of client devices via the media bridge. The media bridge connects the plurality of client devices. The method further includes assigning a pair of names for each of the plurality of media streams. The pair of names includes a contribution name and a distribution name. The method further includes presenting a first list to the plurality of client devices. The first list includes a plurality of the distribution names for the plurality of media streams received from the plurality of client devices. The method further includes providing an indication of a current active speaker within the plurality of media streams via a signaling process.
    Type: Grant
    Filed: May 14, 2021
    Date of Patent: January 2, 2024
    Assignee: CISCO TECHNOLOGY, INC.
    Inventors: Jacques Samain, Giovanna Carofiglio, Giulio Grassi, Enrico Loparco, Michele Papalini
  • Patent number: 11829720
    Abstract: Systems and methods for analysis and validation of language models trained using data that is unavailable or inaccessible are provided. One example method includes, at an electronic device with one or more processors and memory, obtaining a first set of data corresponding to one or more tokens predicted based on one or more previous tokens. The method determines a probability that the first set of data corresponds to a prediction generated by a first language model trained using a user privacy preserving training process. In accordance with a determination that the probability is within a predetermined range, the method determines that the one or more tokens correspond to a prediction associated with the user privacy preserving training process and outputs a predicted token sequence including the one or more tokens and the one or more previous tokens.
    Type: Grant
    Filed: December 1, 2020
    Date of Patent: November 28, 2023
    Assignee: Apple Inc.
    Inventors: Jerome R. Bellegarda, Bishal Barman, Brent D. Ramerth
  • Patent number: 11790908
    Abstract: A voice command can be received from a user. One or more voice command devices (VCDs) that the voice command is targeting can be determined. A visual indicator of each of the one or more targeted VCDs can be displayed on an XR device worn by the user, wherein each visual indicator visually indicates a respective targeted VCD the voice command is directed to on the XR device.
    Type: Grant
    Filed: February 9, 2021
    Date of Patent: October 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Soma Shekar Naganna, Sarbajit K. Rakshit, Abhishek Seth, Matheen Ahmed Pasha
  • Patent number: 11776546
    Abstract: Techniques are described for providing information during a service session, using an intelligent agent. The intelligent agent executes as a process to monitor communications exchanged during a service session between an individual and a service representative (SR) within a service environment. The agent analyzes the communications to identify questions or other topics that are posed by the individual during the service session. The agent retrieves stored data related to such questions or other topics, and generates a message to address each question or other topic. The message is injected into the service session to be presented to the individual, to supplement the conversation that is taking place between the SR and the individual. In some implementations, the agent monitors the communications, generates the message, and/or injects the message into the service session at least partly autonomously of any explicit action taken by the SR.
    Type: Grant
    Filed: September 8, 2021
    Date of Patent: October 3, 2023
    Assignee: United Services Automobile Association (USAA)
    Inventors: Michael Waldmeier, Yuibi Fujimoto
  • Patent number: 11778102
    Abstract: A system and method providing an accessibility tool that enhances a graphical user interface of an online meeting application is described. In one aspect, a computer-implemented method performed by an accessibility tool (128), the method includes accessing (802), in real-time, audio data of a session of an online meeting application (120), identifying (804) a target user, a speaking user, and a task based on the audio data, the speaking user indicating the task assigned to the target user in the audio data, generating (806) a message (318) that identifies the speaking user, the target user, and the task, the message (318) including textual content, and displaying (808) the message (318) in a chat pane (906) of a graphical user interface (902) of the online meeting application (120) during the session.
    Type: Grant
    Filed: April 1, 2022
    Date of Patent: October 3, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shahil Soni, Charles Yin-Che Lee
  • Patent number: 11748462
    Abstract: A method for authenticating a user of an electronic device is disclosed. The method comprises: responsive to detection of a trigger event indicative of a user interaction with the electronic device, generating an audio probe signal to play through an audio transducer of the electronic device; receiving a first audio signal comprising a response of the user's ear to the audio probe signal; receiving a second audio signal comprising speech of the user; and applying an ear biometric algorithm to the first audio signal and a voice biometric algorithm to the second audio signal to authenticate the user as an authorised user.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: September 5, 2023
    Assignee: Cirrus Logic Inc.
    Inventor: John Paul Lesso
  • Patent number: 11749264
    Abstract: Embodiments described herein provide methods and systems for training task-oriented dialogue (TOD) language models. In some embodiments, a TOD language model may receive a TOD dataset including a plurality of dialogues and a model input sequence may be generated from the dialogues using a first token prefixed to each user utterance and a second token prefixed to each system response of the dialogues. In some embodiments, the first token or the second token may be randomly replaced with a mask token to generate a masked training sequence and a masked language modeling (MLM) loss may be computed using the masked training sequence. In some embodiments, the TOD language model may be updated based on the MLM loss.
    Type: Grant
    Filed: November 3, 2020
    Date of Patent: September 5, 2023
    Assignee: Salesforce, Inc.
    Inventors: Chien-Sheng Wu, Chu Hong Hoi, Richard Socher, Caiming Xiong
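    The masking step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the token strings `[USR]`, `[SYS]`, and `[MASK]`, the whitespace tokenizer, and the function names are all assumptions made for the example, and the MLM loss itself is left out.

```python
import random

# Hypothetical token names; the abstract only speaks of a "first token",
# a "second token", and a "mask token".
USR, SYS, MASK = "[USR]", "[SYS]", "[MASK]"


def build_sequence(dialogue):
    """Flatten (speaker, text) turns into one token list with role prefixes."""
    tokens = []
    for speaker, text in dialogue:
        tokens.append(USR if speaker == "user" else SYS)
        tokens.extend(text.split())  # naive whitespace tokenization (assumption)
    return tokens


def mask_role_tokens(tokens, mask_prob, rng):
    """Randomly replace role tokens with MASK.

    Returns the masked sequence and per-position labels: the original role
    token where a replacement happened, None elsewhere.
    """
    masked, labels = [], []
    for tok in tokens:
        if tok in (USR, SYS) and rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


if __name__ == "__main__":
    seq = build_sequence([("user", "book a table"), ("system", "for how many")])
    print(mask_role_tokens(seq, 0.5, random.Random(0)))
```

    The labels list marks the positions a masked language modeling loss would be computed over; a real training loop would feed both lists to the model.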
  • Patent number: 11741984
    Abstract: An acoustic scene conversion method, comprising: receiving sound signals including user's speech and scenic sounds; processing the sound signals according to an artificial intelligence model to generate enhanced speech signals without scenic sounds; and mixing the enhanced speech signals with new scenic sounds to produce converted sound signals.
    Type: Grant
    Filed: June 1, 2021
    Date of Patent: August 29, 2023
    Assignee: ACADEMIA SINICA
    Inventors: Tsao Yu, Syu-Siang Wang, Szu-Wei Fu, Alexander Chao-Fu Kang, Hsin-Min Wang
  • Patent number: 11710484
    Abstract: An agent control device configured to execute a plurality of agents and including a processor, the processor being configured to store an interruptibility list that stipulates interruptibility of execution for each function of one given agent being executed or for an execution status of the one given agent; request execution of each of the agents at a prescribed trigger, or request execution of another given agent at a specific trigger, reference the interruptibility list in order to set permissibility information relating to executability of the other given agent in conjunction with execution of the one given agent; and perform management such that, in a case in which there is a request at the specific trigger for execution of the other given agent while the one given agent is executing, the other given agent is executed based on the request.
    Type: Grant
    Filed: April 8, 2021
    Date of Patent: July 25, 2023
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventor: Satoshi Aihara
  • Patent number: 11709654
    Abstract: The present disclosure generally relates to a computer-implemented system for intelligently retaining and recalling memory data. An exemplary method comprises receiving, via a microphone of an electronic device, a speech input of the user; receiving a text input of the user; constructing a first instance of a memory data structure based on the speech input; constructing a second instance of the memory data structure based on the text input; adding the first instance and the second instance of the memory data structure to a memory stack of the user; displaying a user interface for retrieving memory data of the user; receiving, via the user interface, a beginning of a statement from the user; retrieving a particular instance of the memory data structure from the memory stack based on the beginning of the statement; and automatically displaying a completion of the statement.
    Type: Grant
    Filed: May 19, 2022
    Date of Patent: July 25, 2023
    Assignee: Human AI Labs, Inc.
    Inventors: Suman Kanuganti, Xiaoran Zhang, Kristie Kaiser
  • Patent number: 11704460
    Abstract: Embodiments herein provide for reverse engineering of integrated circuits (ICs) for design verification. In example embodiments, an apparatus receives a gate-level netlist for an integrated circuit (IC), generates a list of equivalence classes related to signals included in the gate-level netlist, determines control signals of the gate-level netlist based at least in part on the list of equivalence classes, determines a logic flow of a finite state transducer (FST) based at least in part on the control signals, and generates register transfer level (RTL) source code for the IC based on the FST.
    Type: Grant
    Filed: June 9, 2021
    Date of Patent: July 18, 2023
    Assignee: UNIVERSITY OF FLORIDA RESEARCH FOUNDATION, INCORPORATED
    Inventors: Yier Jin, Shaojie Zhang, James Geist, Travis Meade, Jason Liam Portillo
  • Patent number: 11706482
    Abstract: Provided is a display device including a display unit, a storage unit configured to store information on a web page, a microphone configured to receive a user's voice command, a network interface unit configured to perform communication with a natural language processing (NLP) server, and a controller configured to transmit text data of the voice command to the NLP server, to receive intention analysis result information corresponding to the voice command from the NLP server, to select, as a final candidate address, one of a plurality of candidate addresses related to a search word included in the received intention analysis result information if the search word is not stored in the storage unit, and to access a website corresponding to the selected final candidate address.
    Type: Grant
    Filed: February 20, 2018
    Date of Patent: July 18, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Chulmin Son, Seunghyun Heo, Jaekyung Lee
  • Patent number: 11699446
    Abstract: Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user's voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.
    Type: Grant
    Filed: May 19, 2021
    Date of Patent: July 11, 2023
    Assignee: Capital One Services, LLC
    Inventors: Bozhao Tan, Isabelle Alice Yvonne Moulinier, David Almquist, June Wu
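    As a rough sketch of the similarity-threshold filtering the abstract above describes, the example below drops audio portions whose embedding is too far from the user's voice signature. Representing portions and the signature as plain vectors, using cosine similarity, and the function names are assumptions for illustration only; the abstract does not say which similarity measure is used.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def keep_matching_segments(segments, signature, threshold):
    """Keep only segment embeddings that meet the similarity threshold."""
    return [seg for seg in segments if cosine(seg, signature) >= threshold]


if __name__ == "__main__":
    signature = [1.0, 0.0]  # hypothetical voice signature embedding
    segments = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
    print(keep_matching_segments(segments, signature, threshold=0.8))
```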
  • Patent number: 11699429
    Abstract: An electronic system is provided. The electronic system includes a host and a display. The host includes an audio processing module, and a smart interpreter engine. The audio processing module is utilized for acquiring audio data corresponding to a first language from audio streams processed by an application program executed on the host. The application program executed on the host includes a specific game software. The smart interpreter engine is utilized for receiving the audio data corresponding to the first language from the audio processing module and converting the audio data corresponding to the first language into text data corresponding to a second language according to the game software executed on the host. The display is utilized for receiving the text data corresponding to the second language from the smart interpreter engine and displaying the text data corresponding to the second language.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: July 11, 2023
    Assignee: ACER INCORPORATED
    Inventors: Gianna Tseng, Szu-Ting Chou, Shang-Yao Lin, Shih-Cheng Huang
  • Patent number: 11694032
    Abstract: The present disclosure relates to chatbot systems and, more particularly, to techniques for determining that an input utterance is representative of a task that a particular chatbot can perform, based on matching the input utterance to a template. Techniques are also described for generating templates based on example utterances that have been provided for a chatbot. In certain embodiments, an initial set of templates is generated based on example utterances. This initial set of templates is then refined using template generalization techniques, which can be performed at the word or sentence level to generate a final set of templates for use at runtime, when the templates are matched against user utterances. The final set of templates may include one or more generalized templates that were derived from the initial set of templates and may also include the initial set of templates.
    Type: Grant
    Filed: September 3, 2020
    Date of Patent: July 4, 2023
    Assignee: Oracle International Corporation
    Inventors: Stephen Andrew McRitchie, Sunghye Jeon
  • Patent number: 11687908
    Abstract: A payment button on a device capable of making telephone calls, such as a mobile phone, allows a payer to electronically transfer money while in a phone call with a payee. The payment button also allows a payee to initiate an electronic payment transaction while in a phone call with a payer. The payment button may be a clickable or tappable virtual button presented on a display of the phone when being used to make or receive a call. The payer or the payee can simply enter a payment amount on the phone to complete an electronic payment transaction. A notification of payment is instantly transmitted to the phones being used for the phone call, so that the parties can safely and conveniently conclude a purchase and/or payment transaction during one phone call.
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: June 27, 2023
    Assignee: PAYPAL, INC.
    Inventors: Saumil Ashvin Gandhi, Ray Hideki Tanaka
  • Patent number: 11683632
    Abstract: An automatic speech recognition (ASR) triggering system, and a method of providing an ASR trigger signal, are described. The ASR triggering system can include a microphone to generate an acoustic signal representing an acoustic vibration and an accelerometer worn in an ear canal of a user to generate a non-acoustic signal representing a bone conduction vibration. A processor of the ASR triggering system can receive an acoustic trigger signal based on the acoustic signal and a non-acoustic trigger signal based on the non-acoustic signal, and combine the trigger signals to gate an ASR trigger signal. For example, the ASR trigger signal may be provided to an ASR server only when the trigger signals are simultaneously asserted. Other embodiments are also described and claimed.
    Type: Grant
    Filed: August 17, 2021
    Date of Patent: June 20, 2023
    Assignee: Apple Inc.
    Inventors: Sorin V. Dusan, Aram M. Lindahl, Robert D. Watson
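    The gating example in the abstract above (the ASR trigger is provided only when both trigger signals are simultaneously asserted) amounts to a per-frame logical AND. The sketch below assumes the two trigger signals arrive as equal-length lists of booleans, one value per frame; that representation and the function name are illustrative assumptions.

```python
def gate_asr_trigger(acoustic_trigger, non_acoustic_trigger):
    """Assert the ASR trigger only in frames where both inputs are asserted."""
    if len(acoustic_trigger) != len(non_acoustic_trigger):
        raise ValueError("trigger signals must cover the same frames")
    return [a and b for a, b in zip(acoustic_trigger, non_acoustic_trigger)]


if __name__ == "__main__":
    mic = [False, True, True, False]     # acoustic trigger per frame
    accel = [False, False, True, True]   # bone-conduction trigger per frame
    print(gate_asr_trigger(mic, accel))  # only the third frame is True
```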
  • Patent number: 11615781
    Abstract: A single audio-visual automated speech recognition model for transcribing speech from audio-visual data includes an encoder frontend and a decoder. The encoder includes an attention mechanism configured to receive an audio track of the audio-visual data and a video portion of the audio-visual data. The video portion of the audio-visual data includes a plurality of video face tracks each associated with a face of a respective person. For each video face track of the plurality of video face tracks, the attention mechanism is configured to determine a confidence score indicating a likelihood that the face of the respective person associated with the video face track includes a speaking face of the audio track. The decoder is configured to process the audio track and the video face track of the plurality of video face tracks associated with the highest confidence score to determine a speech recognition result of the audio track.
    Type: Grant
    Filed: October 2, 2020
    Date of Patent: March 28, 2023
    Assignee: Google LLC
    Inventor: Otavio Braga
  • Patent number: 11610586
    Abstract: A method includes receiving a speech recognition result, and using a confidence estimation module (CEM), for each sub-word unit in a sequence of hypothesized sub-word units for the speech recognition result: obtaining a respective confidence embedding that represents a set of confidence features; generating, using a first attention mechanism, a confidence feature vector; generating, using a second attention mechanism, an acoustic context vector; and generating, as output from an output layer of the CEM, a respective confidence output score for each corresponding sub-word unit based on the confidence feature vector and the acoustic context vector received as input by the output layer of the CEM. For each of the one or more words formed by the sequence of hypothesized sub-word units, the method also includes determining a respective word-level confidence score for the word. The method also includes determining an utterance-level confidence score by aggregating the word-level confidence scores.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: March 21, 2023
    Assignee: Google LLC
    Inventors: David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara Sainath, Ian Mcgraw
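    The last two steps of the abstract above (word-level scores from sub-word scores, then an aggregated utterance-level score) can be illustrated as below. The abstract does not state how either score is computed, so this sketch makes its own choices: a word takes the minimum of its sub-word scores, the utterance score is the product of word scores, and a leading "_" marks the first piece of a word. All three are assumptions, as are the function names.

```python
from math import prod


def word_confidences(subwords, scores):
    """Group sub-word pieces into words; a word's score is the min of its pieces."""
    words, confs = [], []
    for piece, score in zip(subwords, scores):
        if piece.startswith("_") or not words:
            # A leading "_" (hypothetical marker) starts a new word.
            words.append(piece.lstrip("_"))
            confs.append(score)
        else:
            words[-1] += piece
            confs[-1] = min(confs[-1], score)
    return list(zip(words, confs))


def utterance_confidence(word_confs):
    """Aggregate word-level scores by taking their product (one possible choice)."""
    return prod(conf for _, conf in word_confs)


if __name__ == "__main__":
    wc = word_confidences(["_hel", "lo", "_world"], [0.9, 0.8, 0.5])
    print(wc, utterance_confidence(wc))
```

    Other aggregations, such as a mean or a minimum over words, would fit the abstract's wording equally well.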
  • Patent number: 11595535
    Abstract: An information processing apparatus is capable of reducing the time and effort needed to configure a smart speaker that cooperates with the information processing apparatus when a user starts to use the smart speaker. The information processing apparatus acquires identification information of the user, and acquires audio control information associated with the acquired identification information. Then, the information processing apparatus requests the smart speaker to change the audio setting of the smart speaker based on the acquired audio control information.
    Type: Grant
    Filed: June 10, 2021
    Date of Patent: February 28, 2023
    Assignee: CANON KABUSHIKI KAISHA
    Inventor: Ryosuke Kasahara
  • Patent number: 11574638
    Abstract: A system and method are disclosed for generating a teleconference space for two or more communication devices using a computer coupled with a database and comprising a processor and memory. The computer generates a teleconference space and transmits requests to join the teleconference space to the two or more communication devices. The computer stores in memory identification information, and audiovisual data associated with one or more users, for each of the two or more communication devices. The computer stores audio transcription data, transmitted to the computer by each of the two or more communication devices and associated with one or more communication device users, in the computer memory. The computer merges the audio transcription data from each of the two or more communication devices into a master audio transcript, and transmits the master audio transcript to each of the two or more communication devices.
    Type: Grant
    Filed: May 9, 2022
    Date of Patent: February 7, 2023
    Assignee: Nextiva, Inc.
    Inventors: Tomas Gorny, Jean-Baptiste Martinoli, Tracy Conrad, Lukas Gorny
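    One way to picture the merging of per-device transcription data into a master transcript, as described in the abstract above, is a timestamp-ordered merge. The abstract does not say how entries are ordered, so the (timestamp, speaker, text) tuple layout, the assumption that each device's list is already sorted by time, and the function name are all hypothetical.

```python
import heapq


def merge_transcripts(per_device):
    """Merge per-device transcripts into one list ordered by timestamp.

    Each transcript is a list of (timestamp_seconds, speaker, text) tuples
    that is already sorted by timestamp.
    """
    return list(heapq.merge(*per_device, key=lambda entry: entry[0]))


if __name__ == "__main__":
    device_a = [(0.0, "Ana", "Hello"), (5.2, "Ana", "Sounds good")]
    device_b = [(2.1, "Ben", "Hi, can you hear me?")]
    for entry in merge_transcripts([device_a, device_b]):
        print(entry)
```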
  • Patent number: 11562573
    Abstract: Aspects of the disclosure relate to training and using a phrase recognition model to identify phrases in images. As an example, a selected phrase list that includes a plurality of phrases is received. Each phrase of the plurality of phrases includes text. An initial plurality of images may be received. A training image set may be selected from the initial plurality of images by identifying the phrase-containing images that include one or more phrases from the selected phrase list. Each given phrase-containing image of the training image set may be labeled with information identifying the one or more phrases from the selected phrase list included in the given phrase-containing images. The model may be trained based on the training image set such that the model is configured to, in response to receiving an input image, output data indicating whether a phrase of the plurality of phrases is included in the input image.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: January 24, 2023
    Assignee: Waymo LLC
    Inventors: Victoria Dean, Abhijit S Ogale, Henrik Kretzschmar, David Harrison Silver, Carl Kershaw, Pankaj Chaudhari, Chen Wu, Congcong Li
  • Patent number: 11514787
    Abstract: In an information processing device, a first acquirer acquires, from a user, plan information including a scheduled time and a destination. A second acquirer acquires a spare time. A third acquirer acquires travelling schedule information for enabling arrival at the destination earlier than the scheduled time by the spare time or more. A display controller displays, on a display unit, information regarding the travelling schedule information and the spare time.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: November 29, 2022
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Koichi Suzuki, Makoto Akahane
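    The timing constraint in the abstract above (arrive earlier than the scheduled time by at least the spare time) reduces to simple date arithmetic once a travel duration is known. The sketch below assumes a fixed, known travel duration and a hypothetical function name; how the device actually obtains travelling schedule information is not described in the abstract.

```python
from datetime import datetime, timedelta


def latest_departure(scheduled, spare, travel):
    """Latest departure that still arrives `spare` before the scheduled time."""
    return scheduled - spare - travel


if __name__ == "__main__":
    scheduled = datetime(2022, 11, 29, 10, 0)
    departure = latest_departure(
        scheduled,
        spare=timedelta(minutes=15),
        travel=timedelta(minutes=40),
    )
    print(departure)
```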
  • Patent number: 11495234
    Abstract: A data mining device, and a speech recognition method and system using the same are disclosed. The speech recognition method includes selecting speech data including a dialect from speech data, analyzing and refining the speech data including a dialect, and learning an acoustic model and a language model through an artificial intelligence (AI) algorithm using the refined speech data including a dialect. The user is able to use a dialect speech recognition service which is improved using services such as eMBB, URLLC, or mMTC of 5G mobile communications.
    Type: Grant
    Filed: May 30, 2019
    Date of Patent: November 8, 2022
    Assignee: LG Electronics Inc.
    Inventors: Jee Hye Lee, Seon Yeong Park