Patents Examined by Thuykhanh Le
  • Patent number: 11977852
    Abstract: A device configured to receive a sentence that includes a plurality of words. The device is further configured to input the words into a machine learning model that is configured to output a first feature vector based on the words. The device is further configured to identify a keyword within the sentence and to determine that the keyword is an implicit reference to an item. The device is further configured to identify a second feature vector in a reference list that most closely matches a numeric value of the first feature vector and to identify an explicit reference in the reference list that is associated with the second feature vector. The device is further configured to replace the keyword with the explicit reference in the sentence and to output the sentence that includes the explicit reference.
    Type: Grant
    Filed: January 12, 2022
    Date of Patent: May 7, 2024
    Assignee: Bank of America Corporation
    Inventors: Aaron Michael Hosford, Donatus E. Asumu, Emad Noorizadeh, Ramakrishna Reddy Yannam
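The vector-matching step this abstract describes can be sketched as follows. This is a minimal illustration, not the patented model: the sentence embedding, the reference list contents, and the use of cosine similarity as the "closest match" measure are all assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def resolve_implicit(sentence_vec, keyword, sentence, reference_list):
    """Replace `keyword` (an implicit reference, e.g. "it") with the
    explicit reference whose stored vector most closely matches the
    sentence's feature vector."""
    best = max(reference_list, key=lambda e: cosine(e["vector"], sentence_vec))
    return sentence.replace(keyword, best["explicit"])

# Toy reference list pairing feature vectors with explicit references.
reference_list = [
    {"vector": [0.9, 0.1, 0.0], "explicit": "the checking account"},
    {"vector": [0.1, 0.8, 0.2], "explicit": "the credit card"},
]
# A stand-in for the model's output vector for "I want to close it".
vec = [0.85, 0.15, 0.05]
print(resolve_implicit(vec, "it", "I want to close it", reference_list))
# prints "I want to close the checking account"
```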
  • Patent number: 11972226
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that facilitate generating stable real-time textual translations in a target language of an input audio data stream that is recorded in a source language. An audio stream that is recorded in a first language is obtained. A partial transcription of the audio can be generated at each time interval in a plurality of successive time intervals. Each partial transcription can be translated into a second language that is different from the first language. Each translated partial transcription can be input to a model that determines whether a portion of an input translated partial transcription is stable. Based on the input translated partial transcription, the model identifies a portion of the translated partial transcription that is predicted to be stable. This stable portion of the translated partial transcription is provided for display on a user device.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: April 30, 2024
    Assignee: Google LLC
    Inventor: Dirk Ryan Padfield
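One simple way to picture the stability decision this abstract describes: treat the longest common word prefix of successive partial translations as the stable portion. The patent uses a learned model to make this prediction; the prefix heuristic below is a hand-rolled stand-in for illustration only.

```python
def stable_prefix(partials):
    """Return the word prefix shared by all partial translations so far;
    words past the first disagreement are considered unstable."""
    if not partials:
        return []
    tokenized = [p.split() for p in partials]
    stable = []
    for words in zip(*tokenized):  # truncates to the shortest partial
        if all(w == words[0] for w in words):
            stable.append(words[0])
        else:
            break
    return stable

# Successive partial translations of a growing audio stream.
partials = [
    "the cat sat",
    "the cat sat on",
    "the cat is sitting on the mat",
]
print(" ".join(stable_prefix(partials)))  # prints "the cat"
```

Only the agreed-upon prefix would be shown on the user device, so displayed text never flickers as later partials revise the tail.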
  • Patent number: 11972752
    Abstract: Disclosed is a method for detecting a speech segment, which is performed by a computing device. The method may include: detecting a start point of a speech segment in an audio signal; and detecting an end point of the speech segment based on an offset threshold which is dynamically changed, and the dynamically changed offset threshold may be based on a length of the speech segment.
    Type: Grant
    Filed: November 2, 2022
    Date of Patent: April 30, 2024
    Assignee: ActionPower Corp.
    Inventor: Dongchan Shin
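The dynamically changing offset threshold can be sketched like this: the longer the speech segment has run, the longer a pause must be before the end point is declared. The linear growth schedule and the millisecond constants below are illustrative assumptions, not the patented formula.

```python
def offset_threshold_ms(segment_len_ms, base_ms=300, max_ms=900, rate=0.1):
    """Silence (ms) required before the end point is declared; grows
    with the length of the speech segment so far, up to a cap."""
    return min(max_ms, base_ms + rate * segment_len_ms)

def end_detected(segment_len_ms, silence_ms):
    return silence_ms >= offset_threshold_ms(segment_len_ms)

# The same 400 ms pause ends a short utterance but not a long one,
# where a pause is more likely to be a mid-sentence hesitation.
print(end_detected(500, 400))   # True  (threshold is 350 ms)
print(end_detected(5000, 400))  # False (threshold is 800 ms)
```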
  • Patent number: 11961505
    Abstract: Methods and devices for identifying language level are provided. A first automatic speech recognition (ASR) module is identified, from among a plurality of ASR modules, based on information on a target received at the electronic device. First voice data and first image data for the target are received. The first voice data and the first image data are converted to first text data using the first ASR module. A first language level of the target is identified based on the first text data. Data including at least one of a voice output and an image output is output based on the first language level satisfying a condition.
    Type: Grant
    Filed: January 11, 2022
    Date of Patent: April 16, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Taegu Kim
  • Patent number: 11948558
    Abstract: A messaging system performs trend analysis on content produced by users of the messaging system. The messaging system is configured to extract modifications from content items received from client devices associated with users, where the content items are modified using modifications that comprise a text caption or a media overlay. The messaging system is further configured to determine one or more words from the content items and the extracted modifications and to determine a frequency of the one or more words in the content items and the extracted modifications. The messaging system is further configured to determine whether the one or more words are a trend based on the frequency and an aggregate frequency. The messaging system is further configured to, in response to determining that the one or more words are a trend, generate trend content associated with the one or more words, the trend content being a text, an image, or an augmentation content.
    Type: Grant
    Filed: October 6, 2022
    Date of Patent: April 2, 2024
    Assignee: Snap Inc.
    Inventors: Leonardo Ribas Machado das Neves, VĂ­tor Silva Sousa, Shubham Vij
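The frequency-versus-aggregate-frequency test can be sketched as follows. The counting scheme, the 5x ratio, and the one-occurrence floor for unseen words are assumptions made for the example, not the patented criteria.

```python
from collections import Counter

def trending_words(recent_words, aggregate_counts, aggregate_total, ratio=5.0):
    """Flag words whose frequency in the recent window is at least
    `ratio` times their long-run (aggregate) frequency."""
    recent = Counter(recent_words)
    total = len(recent_words)
    trends = []
    for word, count in recent.items():
        recent_freq = count / total
        # Words never seen in the aggregate get a floor of one occurrence.
        base_freq = max(aggregate_counts.get(word, 0), 1) / aggregate_total
        if recent_freq / base_freq >= ratio:
            trends.append(word)
    return trends

recent = ["solstice", "party", "solstice", "cat"]
aggregate = {"cat": 800, "party": 40}
print(trending_words(recent, aggregate, aggregate_total=10000))
# prints ['solstice', 'party'] -- "cat" is common overall, so its
# recent frequency is not a large enough multiple of its baseline
```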
  • Patent number: 11948599
    Abstract: A computing system for a plurality of classes of audio events is provided, including one or more processors configured to divide a run-time audio signal into a plurality of segments and process each segment of the run-time audio signal in a time domain to generate a normalized time domain representation of each segment. The processor is further configured to feed the normalized time domain representation of each segment to an input layer of a trained neural network. The processor is further configured to generate, by the neural network, a plurality of predicted classification scores and associated probabilities for each class of audio event contained in each segment of the run-time audio signal. In post-processing, the processor is further configured to generate smoothed predicted classification scores, associated smoothed probabilities, and class window confidence values for each class for each of a plurality of candidate window sizes.
    Type: Grant
    Filed: January 6, 2022
    Date of Patent: April 2, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Lihi Ahuva Shiloh Perl, Ben Fishman, Gilad Pundak, Yonit Hoffman
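The smoothing step in post-processing can be pictured as a moving average over per-segment class scores. A centered moving average is only one plausible smoother, and the window size and scores below are invented for illustration; the patent also derives smoothed probabilities and per-window confidence values, which are omitted here.

```python
def smooth_scores(scores, window=3):
    """Centered moving average of per-segment classification scores;
    segments at the edges average over the neighbors that exist."""
    half = window // 2
    out = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out

# Noisy per-segment scores for one audio-event class.
raw = [0.1, 0.9, 0.2, 0.8, 0.85, 0.9]
print([round(s, 2) for s in smooth_scores(raw)])
```

Smoothing suppresses single-segment spikes, so a brief mis-scored segment no longer flips the detected class.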
  • Patent number: 11942086
    Abstract: A description support device displays information on a topic to be checked in an utterance by a user. The description support device includes: an inputter to acquire input information indicating an utterance sentence corresponding to the utterance; a controller to generate information indicating a check result of the topic for the utterance sentence; and a display to display the information generated by the controller. The display is configured to display a checklist indicating whether or not the topic is described in the utterance sentence indicated by the input information sequentially acquired by the inputter, and to display, according to a likelihood of each utterance sentence, display information including the utterance sentence, the likelihood defining the check result of the topic in the checklist.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: March 26, 2024
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Natsuki Saeki, Shoichi Araki, Masakatsu Hoshimi, Takahiro Kamai
  • Patent number: 11934740
    Abstract: Methods, systems and apparatus for associating electronic devices together based on received audio commands are described. Methods for associating an audio-controlled device with a physically separate display screen device such that information responses can then be provided in both audio and graphic formats using the two devices in conjunction with each other are described. The audio-controlled device can receive audio commands that can be analyzed to determine the author, which can then be used to further streamline the association operation.
    Type: Grant
    Filed: August 9, 2019
    Date of Patent: March 19, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Justin-Josef Angel, Eric Alan Breitbard, Sean Robert Ryan, Robert Steven Murdock, Michael Douglas McQueen, Ryan Charles Chase, Colin Neil Swann
  • Patent number: 11915698
    Abstract: A system configured to improve track selection while performing audio type detection using sound source localization (SSL) data is provided. A device processes audio data representing sounds from multiple sound sources to determine SSL data that distinguishes between each of the sound sources. The system detects an acoustic event and performs SSL track selection to select the sound source that corresponds to the acoustic event based on input features. To improve SSL track selection, the system detects current conditions of the environment and determines adaptive weight values that vary based on the current conditions, such as a noise level of the environment, whether playback is detected, whether the device is located near one or more walls, etc. By adjusting the adaptive weight values, the system improves an accuracy of the SSL track selection by prioritizing the input features that are most predictive during the current conditions.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: February 27, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Borham Lee, Wai Chung Chu
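The adaptive weighting this abstract describes can be sketched as a weighted sum over track features whose weights shift with the detected environment. The feature names, weight values, and the single noisy/quiet condition below are invented for illustration; the patent adapts to several conditions (playback, nearby walls, etc.).

```python
def track_score(features, noisy):
    """Score one SSL track; in noise, direction stability is weighted
    more heavily than raw energy (illustrative weights)."""
    weights = (
        {"energy": 0.2, "direction_stability": 0.6, "duration": 0.2}
        if noisy else
        {"energy": 0.5, "direction_stability": 0.3, "duration": 0.2}
    )
    return sum(weights[k] * features[k] for k in weights)

def select_track(tracks, noisy):
    return max(tracks, key=lambda t: track_score(t["features"], noisy))

tracks = [
    {"id": "A", "features": {"energy": 0.9, "direction_stability": 0.3, "duration": 0.5}},
    {"id": "B", "features": {"energy": 0.4, "direction_stability": 0.9, "duration": 0.6}},
]
print(select_track(tracks, noisy=True)["id"])   # prints "B" (stability dominates)
print(select_track(tracks, noisy=False)["id"])  # prints "A" (energy dominates)
```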
  • Patent number: 11900928
    Abstract: Natural language grammars interpret expressions at the conversational human-machine interfaces of devices. Under conditions favoring engagement, as specified in a unit of conversational code, the device initiates a discussion using one or more of TTS, images, video, audio, and animation depending on the device capabilities of screen and audio output. Conversational code units specify conditions based on conversation state, mood, and privacy. Grammars provide intents that cause calls to system functions. Units can provide scripts for guiding the conversation. The device, or supporting server system, can provide feedback to creators of the conversational code units for analysis and machine learning.
    Type: Grant
    Filed: December 23, 2017
    Date of Patent: February 13, 2024
    Assignee: SoundHound AI IP, LLC
    Inventors: Joel McKenzie, Qindi Zhang
  • Patent number: 11893985
    Abstract: Methods and systems are provided for extending ranges of a plurality of voice assistant systems and interacting with the plurality of voice assistant systems using a common beacon network system. In one example, the beacon network system includes at least one primary beacon device communicatively coupled to and/or within corresponding listening ranges of one or more voice assistant systems, and a plurality of secondary beacon devices communicatively coupled to the primary beacon device and distributed within an operating environment. The beacon network system enables multi-user interaction with the plurality of voice assistant systems. Further, the beacon network system is scalable and may be deployed in a wide range of operating environments (e.g., operating environments of various sizes).
    Type: Grant
    Filed: January 15, 2021
    Date of Patent: February 6, 2024
    Assignee: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED
    Inventors: Sagar Shelke, Srinath Arunachalam
  • Patent number: 11893983
    Abstract: An approach for improving speech recognition is provided. A processor receives a new word to add to a prefix tree. A processor determines a bonus score for a first transition from a first node to a second node in a prefix tree on condition that the first transition is included in a path of at least one transition representing the new word. A processor determines a hypothesis score for a hypothesis that corresponds to a speech sequence based on the prefix tree, where the hypothesis score adds the bonus score to an initial hypothesis score to determine the hypothesis score. In response to a determination that the hypothesis score exceeds a threshold value, a processor generates an output text sequence for the speech sequence based on the hypothesis.
    Type: Grant
    Filed: June 23, 2021
    Date of Patent: February 6, 2024
    Assignee: International Business Machines Corporation
    Inventors: Masayuki Suzuki, Gakuto Kurata
  • Patent number: 11893990
    Abstract: Speech-to-text transcription is used to generate a transcript for an audio file. Text segments are associated with time segments in the transcript. A trained machine learning model determines, based on the text in the transcript, one or more topics for the audio file. The transcript is modified to include the determined one or more topics. A user interface may be presented that allows a user to search for portions of an audio file that relate to a particular topic. In response to the selected or entered topic, the user interface presents segments having a matching topic. The user may use voice or other user interface commands to modify the annotation of the audio file. User commands may also be used to extract data from the transcript and copy the data to a clipboard or to another application.
    Type: Grant
    Filed: September 27, 2021
    Date of Patent: February 6, 2024
    Assignee: SAP SE
    Inventor: Hans-Martin Ramsl
  • Patent number: 11869509
    Abstract: Methods and systems for natural language processing/understanding of voice conversations are provided. Using natural language processing, a clinical condition is extracted from a voice conversation. A clinical ontology identifies clinical concepts associated with the clinical condition. The clinical concepts are classified for documentation. The clinical concepts are searched and validated from within an individual's longitudinal record.
    Type: Grant
    Filed: December 19, 2019
    Date of Patent: January 9, 2024
    Assignee: Cerner Innovation, Inc.
    Inventors: Emin Agassi, Tanuj Gupta, Leo V. Perez
  • Patent number: 11847426
    Abstract: A system and method for translating sign language utterances into a target language, including: receiving motion capture data; producing phonemes/sign fragments from the received motion capture data; producing a plurality of sign sequences from the phonemes/sign fragments; parsing these sign sequences to produce grammatically parsed sign utterances; translating the grammatically parsed sign utterances into grammatical representations in the target language; and generating output utterances in the target language based upon the grammatical representations.
    Type: Grant
    Filed: November 8, 2018
    Date of Patent: December 19, 2023
    Assignee: Snap Inc.
    Inventors: David Retek, David Palhazi, Marton Kajtar, Attila Alvarez, Peter Poscsi, Andras Nemeth, Matyas Trosztel, Zsolt Robotka, Janos Rovnyai
  • Patent number: 11843719
    Abstract: Certain aspects of the disclosure are directed to apparatuses and methods for analyzing customer-interaction metrics from digital voice data in a data-communication system. An example method includes accessing digital voice data indicative of transcribed audio conversations between a plurality of agents and customers of a remotely-situated client entity, the digital voice data including keywords and speech characteristic parameters associated with outcomes of the transcribed audio conversations.
    Type: Grant
    Filed: February 28, 2019
    Date of Patent: December 12, 2023
    Assignee: 8x8, Inc.
    Inventors: Zhishen Liu, Bryan R. Martin
  • Patent number: 11842726
    Abstract: A computer-implemented method for speech recognition is disclosed. The method includes extracting a feature word associated with location information from a speech to be recognized, and calculating a similarity between the feature word and respective ones of a plurality of candidate words in a corpus. The corpus includes a first sub-corpus associated with at least one user, and the plurality of candidate words include, in the first sub-corpus, a first standard candidate word and at least one first erroneous candidate word. The at least one first erroneous candidate word has a preset correspondence with the first standard candidate word. The method further includes in response to the similarity between the feature word and one or more of the at least one first erroneous candidate word satisfying a predetermined condition, outputting the first standard candidate word as a recognition result based on the preset correspondence.
    Type: Grant
    Filed: September 8, 2021
    Date of Patent: December 12, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Jing Pei, Xiantao Chen, Meng Xu
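The correction step can be sketched as a lookup against known erroneous variants: if the recognized feature word is sufficiently similar to a stored erroneous candidate, the standard candidate it corresponds to is output instead. The similarity measure (a character-set Dice coefficient), the threshold, and the example corpus are all stand-ins for the patent's actual corpus and condition.

```python
def similarity(a, b):
    """Dice coefficient over character sets -- a simple stand-in for
    whatever similarity the recognizer actually computes."""
    sa, sb = set(a), set(b)
    return 2 * len(sa & sb) / (len(sa) + len(sb))

def recognize(feature_word, erroneous_to_standard, threshold=0.7):
    """Map a (possibly misrecognized) location word to its standard
    form via the preset erroneous-to-standard correspondence."""
    for wrong, standard in erroneous_to_standard.items():
        if similarity(feature_word, wrong) >= threshold:
            return standard
    return feature_word  # no close erroneous variant: pass through

# Preset correspondence: erroneous candidate -> standard candidate.
corpus = {"centrel park": "Central Park", "centrol park": "Central Park"}
print(recognize("centrel park", corpus))  # prints "Central Park"
```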
  • Patent number: 11804220
    Abstract: This voice processing device is provided with: an utterer-position detection unit which, when a wake-up word (WuW) voice is detected, identifies the microphones that receive the WuW voice signal as the utterer's position microphones, on the basis of the characteristics of each voice signal for a prescribed time held in a voice signal buffer unit; and a CTC unit (one example of a voice processing unit) which outputs the voice uttered by the utterer and suppresses the voice uttered by an occupant who is not the utterer, by using the voice signal for the prescribed time held in the voice signal buffer unit and information relating to the utterer's position microphones.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: October 31, 2023
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Naoya Tanaka, Tomofumi Yamanashi, Masanari Miyamoto
  • Patent number: 11804209
    Abstract: Methods and systems are described herein for generating an audible presentation of a communication received from a remote server. A presentation of a media asset on a user equipment device is generated for a first user. A textual-based communication is received, at the user equipment device from the remote server. The textual-based communication is transmitted to the remote server by a second user and the remote server transmits the textual-based communication to the user equipment device responsive to determining that the second user is on a list of users associated with the first user. An engagement level of the first user with the user equipment device is determined. Responsive to determining that the engagement level does not exceed a threshold value, a presentation of the textual-based communication is generated in audible form.
    Type: Grant
    Filed: December 21, 2022
    Date of Patent: October 31, 2023
    Assignee: Rovi Product Corporation
    Inventor: William Korbecki
  • Patent number: 11798549
    Abstract: Embodiments include systems and methods for receiving an action item trigger by a user of a conferencing application; and in response to receiving the action item trigger, generating spoken words from audio data of a session of the conferencing application; normalizing the spoken words; generating higher-level representations of the normalized spoken words; determining semantic similarities of the higher-level representations of the normalized spoken words and higher-level representations of normalized action words of an action word list; ranking options for top spoken words and action words based at least in part on the semantic similarities; identifying candidates for action words and/or phrases from the top spoken words and action words; and parsing the candidates to generate one or more action items.
    Type: Grant
    Filed: March 19, 2021
    Date of Patent: October 24, 2023
    Assignee: Mitel Networks Corporation
    Inventors: Jonathan Braganza, Kevin Lee, Logendra Naidoo