Patents Examined by Thuykhanh Le
  • Patent number: 11977852
    Abstract: A device configured to receive a sentence that includes a plurality of words. The device is further configured to input the words into a machine learning model that is configured to output a first feature vector based on the words. The device is further configured to identify a keyword within the sentence and to determine that the keyword is an implicit reference to an item. The device is further configured to identify a second feature vector in a reference list that most closely matches a numeric value of the first feature vector and to identify an explicit reference in the reference list that is associated with the second feature vector. The device is further configured to replace the keyword with the explicit reference in the sentence and to output the sentence that includes the explicit reference.
    Type: Grant
    Filed: January 12, 2022
    Date of Patent: May 7, 2024
    Assignee: Bank of America Corporation
    Inventors: Aaron Michael Hosford, Donatus E. Asumu, Emad Noorizadeh, Ramakrishna Reddy Yannam
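The vector-matching step this abstract describes can be sketched as follows. This is a minimal illustration, not the patented model: the sentence embedding, the reference list contents, and the use of cosine similarity as the "closest match" measure are all assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def resolve_implicit(sentence_vec, keyword, sentence, reference_list):
    """Replace `keyword` (an implicit reference, e.g. "it") with the
    explicit reference whose stored vector most closely matches the
    sentence's feature vector."""
    best = max(reference_list, key=lambda e: cosine(e["vector"], sentence_vec))
    return sentence.replace(keyword, best["explicit"])

# Toy reference list pairing feature vectors with explicit references.
reference_list = [
    {"vector": [0.9, 0.1, 0.0], "explicit": "the checking account"},
    {"vector": [0.1, 0.8, 0.2], "explicit": "the credit card"},
]
# A stand-in for the model's output vector for "I want to close it".
vec = [0.85, 0.15, 0.05]
print(resolve_implicit(vec, "it", "I want to close it", reference_list))
# prints "I want to close the checking account"
```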
  • Patent number: 11972226
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that facilitate generating stable real-time textual translations in a target language of an input audio data stream that is recorded in a source language. An audio stream that is recorded in a first language is obtained. A partial transcription of the audio can be generated at each time interval in a plurality of successive time intervals. Each partial transcription can be translated into a second language that is different from the first language. Each translated partial transcription can be input to a model that determines whether a portion of an input translated partial transcription is stable. Based on the input translated partial transcription, the model identifies a portion of the translated partial transcription that is predicted to be stable. This stable portion of the translated partial transcription is provided for display on a user device.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: April 30, 2024
    Assignee: Google LLC
    Inventor: Dirk Ryan Padfield
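One simple way to picture the stability decision this abstract describes: treat the longest common word prefix of successive partial translations as the stable portion. The patent uses a learned model to make this prediction; the prefix heuristic below is a hand-rolled stand-in for illustration only.

```python
def stable_prefix(partials):
    """Return the word prefix shared by all partial translations so far;
    words past the first disagreement are considered unstable."""
    if not partials:
        return []
    tokenized = [p.split() for p in partials]
    stable = []
    for words in zip(*tokenized):  # truncates to the shortest partial
        if all(w == words[0] for w in words):
            stable.append(words[0])
        else:
            break
    return stable

# Successive partial translations of a growing audio stream.
partials = [
    "the cat sat",
    "the cat sat on",
    "the cat is sitting on the mat",
]
print(" ".join(stable_prefix(partials)))  # prints "the cat"
```

Only the agreed-upon prefix would be shown on the user device, so displayed text never flickers as later partials revise the tail.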
  • Patent number: 11972752
    Abstract: Disclosed is a method for detecting a speech segment, which is performed by a computing device. The method may include: detecting a start point of a speech segment in an audio signal; and detecting an end point of the speech segment based on an offset threshold which is dynamically changed, and the dynamically changed offset threshold may be based on a length of the speech segment.
    Type: Grant
    Filed: November 2, 2022
    Date of Patent: April 30, 2024
    Assignee: ActionPower Corp.
    Inventor: Dongchan Shin
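The dynamically changing offset threshold can be sketched like this: the longer the speech segment has run, the longer a pause must be before the end point is declared. The linear growth schedule and the millisecond constants below are illustrative assumptions, not the patented formula.

```python
def offset_threshold_ms(segment_len_ms, base_ms=300, max_ms=900, rate=0.1):
    """Silence (ms) required before the end point is declared; grows
    with the length of the speech segment so far, up to a cap."""
    return min(max_ms, base_ms + rate * segment_len_ms)

def end_detected(segment_len_ms, silence_ms):
    return silence_ms >= offset_threshold_ms(segment_len_ms)

# The same 400 ms pause ends a short utterance but not a long one,
# where a pause is more likely to be a mid-sentence hesitation.
print(end_detected(500, 400))   # True  (threshold is 350 ms)
print(end_detected(5000, 400))  # False (threshold is 800 ms)
```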
  • Patent number: 11961505
    Abstract: Methods and devices for identifying language level are provided. A first automatic speech recognition (ASR) module is identified, from among a plurality of ASR modules, based on information on a target received at the electronic device. First voice data and first image data for the target are received. The first voice data and the first image data are converted to first text data using the first ASR module. A first language level of the target is identified based on the first text data. Data including at least one of a voice output and an image output is output based on the first language level satisfying a condition.
    Type: Grant
    Filed: January 11, 2022
    Date of Patent: April 16, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Taegu Kim
  • Patent number: 11948558
    Abstract: A messaging system performs trend analysis on content produced by users of the messaging system. The messaging system is configured to extract modifications from content items received from client devices associated with users, where the content items are modified using modifications that comprise a text caption or a media overlay. The messaging system is further configured to determine one or more words from the content items and the extracted modifications and to determine a frequency of the one or more words in the content items and the extracted modifications. The messaging system is further configured to determine whether the one or more words are a trend based on the frequency and an aggregate frequency. The messaging system is further configured to, in response to determining that the one or more words are a trend, generate trend content associated with the one or more words, the trend content being a text, an image, or an augmentation content.
    Type: Grant
    Filed: October 6, 2022
    Date of Patent: April 2, 2024
    Assignee: Snap Inc.
    Inventors: Leonardo Ribas Machado das Neves, VĂ­tor Silva Sousa, Shubham Vij
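The frequency-versus-aggregate-frequency test can be sketched as follows. The counting scheme, the 5x ratio, and the one-occurrence floor for unseen words are assumptions made for the example, not the patented criteria.

```python
from collections import Counter

def trending_words(recent_words, aggregate_counts, aggregate_total, ratio=5.0):
    """Flag words whose frequency in the recent window is at least
    `ratio` times their long-run (aggregate) frequency."""
    recent = Counter(recent_words)
    total = len(recent_words)
    trends = []
    for word, count in recent.items():
        recent_freq = count / total
        # Words never seen in the aggregate get a floor of one occurrence.
        base_freq = max(aggregate_counts.get(word, 0), 1) / aggregate_total
        if recent_freq / base_freq >= ratio:
            trends.append(word)
    return trends

recent = ["solstice", "party", "solstice", "cat"]
aggregate = {"cat": 800, "party": 40}
print(trending_words(recent, aggregate, aggregate_total=10000))
# prints ['solstice', 'party'] -- "cat" is common overall, so its
# recent frequency is not a large enough multiple of its baseline
```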
  • Patent number: 11948599
    Abstract: A computing system for a plurality of classes of audio events is provided, including one or more processors configured to divide a run-time audio signal into a plurality of segments and process each segment of the run-time audio signal in a time domain to generate a normalized time domain representation of each segment. The processor is further configured to feed the normalized time domain representation of each segment to an input layer of a trained neural network. The processor is further configured to generate, by the neural network, a plurality of predicted classification scores and associated probabilities for each class of audio event contained in each segment of the run-time audio signal. In post-processing, the processor is further configured to generate smoothed predicted classification scores, associated smoothed probabilities, and class window confidence values for each class for each of a plurality of candidate window sizes.
    Type: Grant
    Filed: January 6, 2022
    Date of Patent: April 2, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Lihi Ahuva Shiloh Perl, Ben Fishman, Gilad Pundak, Yonit Hoffman
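The smoothing step in post-processing can be pictured as a moving average over per-segment class scores. A centered moving average is only one plausible smoother, and the window size and scores below are invented for illustration; the patent also derives smoothed probabilities and per-window confidence values, which are omitted here.

```python
def smooth_scores(scores, window=3):
    """Centered moving average of per-segment classification scores;
    segments at the edges average over the neighbors that exist."""
    half = window // 2
    out = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out

# Noisy per-segment scores for one audio-event class.
raw = [0.1, 0.9, 0.2, 0.8, 0.85, 0.9]
print([round(s, 2) for s in smooth_scores(raw)])
```

Smoothing suppresses single-segment spikes, so a brief mis-scored segment no longer flips the detected class.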
  • Patent number: 11942086
    Abstract: A description support device displays information on a topic to be checked in an utterance by a user. The description support device includes: an inputter to acquire input information indicating an utterance sentence corresponding to the utterance; a controller to generate information indicating a check result of the topic for the utterance sentence; and a display to display the information generated by the controller. The display is configured to display a checklist indicating whether or not the topic is described in the utterance sentence indicated by the input information sequentially acquired by the inputter, and to display, according to a likelihood of each utterance sentence, display information including the utterance sentence, the likelihood defining the check result of the topic in the checklist.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: March 26, 2024
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Natsuki Saeki, Shoichi Araki, Masakatsu Hoshimi, Takahiro Kamai
  • Patent number: 11934740
    Abstract: Methods, systems and apparatus for associating electronic devices together based on received audio commands are described. Methods for associating an audio-controlled device with a physically separate display screen device such that information responses can then be provided in both audio and graphic formats using the two devices in conjunction with each other are described. The audio-controlled device can receive audio commands that can be analyzed to determine the author, which can then be used to further streamline the association operation.
    Type: Grant
    Filed: August 9, 2019
    Date of Patent: March 19, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Justin-Josef Angel, Eric Alan Breitbard, Sean Robert Ryan, Robert Steven Murdock, Michael Douglas McQueen, Ryan Charles Chase, Colin Neil Swann
  • Patent number: 11915698
    Abstract: A system configured to improve track selection while performing audio type detection using sound source localization (SSL) data is provided. A device processes audio data representing sounds from multiple sound sources to determine SSL data that distinguishes between each of the sound sources. The system detects an acoustic event and performs SSL track selection to select the sound source that corresponds to the acoustic event based on input features. To improve SSL track selection, the system detects current conditions of the environment and determines adaptive weight values that vary based on the current conditions, such as a noise level of the environment, whether playback is detected, whether the device is located near one or more walls, etc. By adjusting the adaptive weight values, the system improves an accuracy of the SSL track selection by prioritizing the input features that are most predictive during the current conditions.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: February 27, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Borham Lee, Wai Chung Chu
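The adaptive weighting this abstract describes can be sketched as a weighted sum over track features whose weights shift with the detected environment. The feature names, weight values, and the single noisy/quiet condition below are invented for illustration; the patent adapts to several conditions (playback, nearby walls, etc.).

```python
def track_score(features, noisy):
    """Score one SSL track; in noise, direction stability is weighted
    more heavily than raw energy (illustrative weights)."""
    weights = (
        {"energy": 0.2, "direction_stability": 0.6, "duration": 0.2}
        if noisy else
        {"energy": 0.5, "direction_stability": 0.3, "duration": 0.2}
    )
    return sum(weights[k] * features[k] for k in weights)

def select_track(tracks, noisy):
    return max(tracks, key=lambda t: track_score(t["features"], noisy))

tracks = [
    {"id": "A", "features": {"energy": 0.9, "direction_stability": 0.3, "duration": 0.5}},
    {"id": "B", "features": {"energy": 0.4, "direction_stability": 0.9, "duration": 0.6}},
]
print(select_track(tracks, noisy=True)["id"])   # prints "B" (stability dominates)
print(select_track(tracks, noisy=False)["id"])  # prints "A" (energy dominates)
```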
  • Patent number: 11900928
    Abstract: Natural language grammars interpret expressions at the conversational human-machine interfaces of devices. Under conditions favoring engagement, as specified in a unit of conversational code, the device initiates a discussion using one or more of TTS, images, video, audio, and animation depending on the device capabilities of screen and audio output. Conversational code units specify conditions based on conversation state, mood, and privacy. Grammars provide intents that cause calls to system functions. Units can provide scripts for guiding the conversation. The device, or supporting server system, can provide feedback to creators of the conversational code units for analysis and machine learning.
    Type: Grant
    Filed: December 23, 2017
    Date of Patent: February 13, 2024
    Assignee: SoundHound AI IP, LLC
    Inventors: Joel McKenzie, Qindi Zhang
  • Patent number: 11893985
    Abstract: Methods and systems are provided for extending ranges of a plurality of voice assistant systems and interacting with the plurality of voice assistant systems using a common beacon network system. In one example, the beacon network system includes at least one primary beacon device communicatively coupled to and/or within corresponding listening ranges of one or more voice assistant systems, and a plurality of secondary beacon devices communicatively coupled to the primary beacon device and distributed within an operating environment. The beacon network system enables multi-user interaction with the plurality of voice assistant systems. Further, the beacon network system is scalable and may be deployed in a wide range of operating environments (e.g., operating environments of various sizes).
    Type: Grant
    Filed: January 15, 2021
    Date of Patent: February 6, 2024
    Assignee: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED
    Inventors: Sagar Shelke, Srinath Arunachalam
  • Patent number: 11893983
    Abstract: An approach for improving speech recognition is provided. A processor receives a new word to add to a prefix tree. A processor determines a bonus score for a first transition from a first node to a second node in a prefix tree on condition that the first transition is included in a path of at least one transition representing the new word. A processor determines a hypothesis score for a hypothesis that corresponds to a speech sequence based on the prefix tree, where the hypothesis score adds the bonus score to an initial hypothesis score to determine the hypothesis score. In response to a determination that the hypothesis score exceeds a threshold value, a processor generates an output text sequence for the speech sequence based on the hypothesis.
    Type: Grant
    Filed: June 23, 2021
    Date of Patent: February 6, 2024
    Assignee: International Business Machines Corporation
    Inventors: Masayuki Suzuki, Gakuto Kurata
  • Patent number: 11893990
    Abstract: Speech-to-text transcription is used to generate a transcript for an audio file. Text segments are associated with time segments in the transcript. A trained machine learning model determines, based on the text in the transcript, one or more topics for the audio file. The transcript is modified to include the determined one or more topics. A user interface may be presented that allows a user to search for portions of an audio file that relate to a particular topic. In response to the selected or entered topic, the user interface presents segments having a matching topic. The user may use voice or other user interface commands to modify the annotation of the audio file. User commands may also be used to extract data from the transcript and copy the data to a clipboard or to another application.
    Type: Grant
    Filed: September 27, 2021
    Date of Patent: February 6, 2024
    Assignee: SAP SE
    Inventor: Hans-Martin Ramsl
  • Patent number: 11869509
    Abstract: Methods and systems for natural language processing/understanding of voice conversations are provided. Using natural language processing, a clinical condition is extracted from a voice conversation. A clinical ontology identifies clinical concepts associated with the clinical condition. The clinical concepts are classified for documentation. The clinical concepts are searched and validated from within an individual's longitudinal record.
    Type: Grant
    Filed: December 19, 2019
    Date of Patent: January 9, 2024
    Assignee: Cerner Innovation, Inc.
    Inventors: Emin Agassi, Tanuj Gupta, Leo V. Perez
  • Patent number: 11847426
    Abstract: A system and method for translating sign language utterances into a target language, including: receiving motion capture data; producing phonemes/sign fragments from the received motion capture data; producing a plurality of sign sequences from the phonemes/sign fragments; parsing these sign sequences to produce grammatically parsed sign utterances; translating the grammatically parsed sign utterances into grammatical representations in the target language; and generating output utterances in the target language based upon the grammatical representations.
    Type: Grant
    Filed: November 8, 2018
    Date of Patent: December 19, 2023
    Assignee: Snap Inc.
    Inventors: David Retek, David Palhazi, Marton Kajtar, Attila Alvarez, Peter Poscsi, Andras Nemeth, Matyas Trosztel, Zsolt Robotka, Janos Rovnyai
  • Patent number: 11843719
    Abstract: Certain aspects of the disclosure are directed to apparatuses and methods for analyzing customer-interaction metrics from digital voice data in a data-communication system. An example method includes accessing digital voice data indicative of transcribed audio conversations between a plurality of agents and customers of a remotely-situated client entity, the digital voice data including keywords and speech characteristic parameters associated with outcomes of the transcribed audio conversations.
    Type: Grant
    Filed: February 28, 2019
    Date of Patent: December 12, 2023
    Assignee: 8x8, Inc.
    Inventors: Zhishen Liu, Bryan R. Martin
  • Patent number: 11842726
    Abstract: A computer-implemented method for speech recognition is disclosed. The method includes extracting a feature word associated with location information from a speech to be recognized, and calculating a similarity between the feature word and respective ones of a plurality of candidate words in a corpus. The corpus includes a first sub-corpus associated with at least one user, and the plurality of candidate words include, in the first sub-corpus, a first standard candidate word and at least one first erroneous candidate word. The at least one first erroneous candidate word has a preset correspondence with the first standard candidate word. The method further includes in response to the similarity between the feature word and one or more of the at least one first erroneous candidate word satisfying a predetermined condition, outputting the first standard candidate word as a recognition result based on the preset correspondence.
    Type: Grant
    Filed: September 8, 2021
    Date of Patent: December 12, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Jing Pei, Xiantao Chen, Meng Xu
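The correction step can be sketched as a lookup against known erroneous variants: if the recognized feature word is sufficiently similar to a stored erroneous candidate, the standard candidate it corresponds to is output instead. The similarity measure (a character-set Dice coefficient), the threshold, and the example corpus are all stand-ins for the patent's actual corpus and condition.

```python
def similarity(a, b):
    """Dice coefficient over character sets -- a simple stand-in for
    whatever similarity the recognizer actually computes."""
    sa, sb = set(a), set(b)
    return 2 * len(sa & sb) / (len(sa) + len(sb))

def recognize(feature_word, erroneous_to_standard, threshold=0.7):
    """Map a (possibly misrecognized) location word to its standard
    form via the preset erroneous-to-standard correspondence."""
    for wrong, standard in erroneous_to_standard.items():
        if similarity(feature_word, wrong) >= threshold:
            return standard
    return feature_word  # no close erroneous variant: pass through

# Preset correspondence: erroneous candidate -> standard candidate.
corpus = {"centrel park": "Central Park", "centrol park": "Central Park"}
print(recognize("centrel park", corpus))  # prints "Central Park"
```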
  • Patent number: 11804220
    Abstract: This voice processing device is provided with: an utterer-position detection unit which, when a wake-up word (WuW) voice is detected, identifies the microphones that receive the WuW voice signal as the utterer's position microphones, on the basis of the characteristics of each voice signal for a prescribed time held in a voice signal buffer unit; and a CTC unit (one example of a voice processing unit) which outputs the voice uttered by the utterer and suppresses the voice uttered by an occupant who is not the utterer, by using the voice signal for the prescribed time held in the voice signal buffer unit and information relating to the utterer's position microphones.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: October 31, 2023
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Naoya Tanaka, Tomofumi Yamanashi, Masanari Miyamoto
  • Patent number: 11804209
    Abstract: Methods and systems are described herein for generating an audible presentation of a communication received from a remote server. A presentation of a media asset on a user equipment device is generated for a first user. A textual-based communication is received, at the user equipment device from the remote server. The textual-based communication is transmitted to the remote server by a second user and the remote server transmits the textual-based communication to the user equipment device responsive to determining that the second user is on a list of users associated with the first user. An engagement level of the first user with the user equipment device is determined. Responsive to determining that the engagement level does not exceed a threshold value, a presentation of the textual-based communication is generated in audible form.
    Type: Grant
    Filed: December 21, 2022
    Date of Patent: October 31, 2023
    Assignee: Rovi Product Corporation
    Inventor: William Korbecki
  • Patent number: 11798549
    Abstract: Embodiments include systems and methods for receiving an action item trigger by a user of a conferencing application; and in response to receiving the action item trigger, generating spoken words from audio data of a session of the conferencing application; normalizing the spoken words; generating higher-level representations of the normalized spoken words; determining semantic similarities of the higher-level representations of the normalized spoken words and higher-level representations of normalized action words of an action word list; ranking options for top spoken words and action words based at least in part on the semantic similarities; identifying candidates for action words and/or phrases from the top spoken words and action words; and parsing the candidates to generate one or more action items.
    Type: Grant
    Filed: March 19, 2021
    Date of Patent: October 24, 2023
    Assignee: Mitel Networks Corporation
    Inventors: Jonathan Braganza, Kevin Lee, Logendra Naidoo