Patents Examined by Michael Colucci

Acoustic event detection

Patent number: 12646502

Abstract: Techniques for reducing occurrences of cross-triggering event types not represented in audio data and false detection of event types are described. Different event types, such as a hand clap event type and a door knock event type, may have substantially similar audio characteristics, and if one of these event types is represented in audio data, then event detection processing of that audio data may lead to detection of event types not represented in the audio data. Example embodiments involve training a model configured to detect multiple event types to enforce mutual exclusivity between different event type pairs or sets of the multiple event types. The model is trained to enforce mutual exclusivity using a regularizer function and a weight parameter to reduce any positive detection scores of event types not represented in received audio. Similar techniques may be applied to models for object detection using image data.

Type: Grant

Filed: November 30, 2023

Date of Patent: June 2, 2026

Assignee: Amazon Technologies, Inc.

Inventors: Quoc Huy Phan, Byeonggeun Kim, Andrew Thomas Bydlon, Qingming Tang, Chieh-Chi Kao, Chao Wang, Tien Vu Nguyen
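
The training idea in the abstract above (a regularizer function plus a weight parameter that penalizes positive scores for event types not present in the audio) can be illustrated with a minimal sketch. This is not the patented method: the product-of-scores penalty, the function names, the event-type indices, and the weight value are all assumptions chosen for illustration.

```python
import math

def bce(scores, labels):
    """Mean binary cross-entropy over per-event-type detection scores."""
    eps = 1e-7
    total = 0.0
    for p, y in zip(scores, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(scores)

def exclusivity_penalty(scores, exclusive_pairs):
    """Assumed regularizer: sum of score products over pairs treated as mutually exclusive."""
    return sum(scores[i] * scores[j] for i, j in exclusive_pairs)

def total_loss(scores, labels, exclusive_pairs, weight=0.5):
    """Detection loss plus the weighted exclusivity regularizer."""
    return bce(scores, labels) + weight * exclusivity_penalty(scores, exclusive_pairs)

# Hypothetical event types: 0 = hand clap, 1 = door knock, 2 = dog bark.
labels = [1, 0, 0]                 # only a hand clap is present
pairs = [(0, 1)]                   # clap and knock assumed mutually exclusive
cross_triggered = [0.9, 0.8, 0.1]  # knock fires although it is absent
clean = [0.9, 0.1, 0.1]            # knock correctly suppressed

print(total_loss(cross_triggered, labels, pairs))
print(total_loss(clean, labels, pairs))
```

In this toy setup the cross-triggered scores incur a larger loss than the clean scores, so gradient-based training would push the spurious door knock score down.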
Application specific auto-evaluation for large language models (LLMs)

Patent number: 12645883

Abstract: In one embodiment, a non-transitory computer-readable media stores instructions executable by processors for generating a prompt configured for eliciting outputs from large language models (LLMs) based on information associated with a task, inputting the prompt to a first LLM configured to output a response based on processing the prompt, determining metrics for evaluating the first LLM based on the task, wherein each of the metrics is associated with a scoring guideline, generating metric prompts based on the respective metrics and the scoring guidelines associated with the respective metrics, inputting the response and the metric prompts to second LLMs configured to output scores corresponding to the respective metrics based on processing the response and the metric prompts, and generating an analysis report based on the metrics and their corresponding scores.

Type: Grant

Filed: July 16, 2024

Date of Patent: June 2, 2026

Assignee: Oracle International Corporation

Inventors: Liyu Gong, Michael Avendi, Yuying Wang, Tao Sheng, Jun Qian, Vinod Mamtani
Generating synthetic conference transcripts using natural language processing

Patent number: 12640144

Abstract: Synthetic conference transcripts are generated and used to train a natural language processing engine to derive intelligence from conference recordings or conference transcripts. A server generates, using a natural language processing engine, synthetic conference transcripts. The server compares the synthetic conference transcripts with conference data to identify artifacts in the synthetic conference transcripts. The server provides additional training to the natural language processing engine using online learning based on the identified artifacts. The server outputs a portion of the synthetic conference transcripts selected based on the identified artifacts.

Type: Grant

Filed: January 25, 2024

Date of Patent: May 26, 2026

Assignee: Zoom Communications, Inc.

Inventors: Yuanling Geng, Liwei Wu, Bing Zhao, Sanqiang Zhao
Machine learning model improvement

Patent number: 12633286

Abstract: There is disclosed, in an example, a computer-implemented system and method, which includes providing a large set of validation prompts; testing a first ML intent model with the large set of validation prompts, wherein the first ML intent model is to select for respective validation prompts a first intent from an intent set; testing a second ML intent model with the large set of validation prompts, wherein the second ML intent model is to select for the same validation prompts a second intent from the intent set; selecting a reduced set of validation prompts, comprising validation prompts for which the first intent and second intent do not match; receiving an analysis of the reduced set of validation prompts, including indicia of hits, wherein one of the ML intent models inferred a correct intent; and selecting as a preferred model an ML model of the first ML intent model or second ML model that provided more hits.

Type: Grant

Filed: December 29, 2023

Date of Patent: May 19, 2026

Assignee: CX360, Inc.

Inventors: Schuyler K. Rank, Laura J. Kleiman, Patrick M. Peterson
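
The selection procedure in the abstract above (test two intent models on the same prompts, keep only the prompts where they disagree, have that reduced set analyzed, and prefer the model with more hits) can be sketched as follows. The function name, the toy models, and the reviewed labels are hypothetical, and the tie-breaking rule is an assumption.

```python
def pick_preferred(prompts, model_a, model_b, reviewed_intent):
    """Compare two intent models on the prompts where their predictions differ.

    reviewed_intent maps each disagreeing prompt to the intent judged correct
    by an outside analysis (for example, a human reviewer).
    """
    disagreements = [p for p in prompts if model_a(p) != model_b(p)]
    hits_a = sum(1 for p in disagreements if model_a(p) == reviewed_intent[p])
    hits_b = sum(1 for p in disagreements if model_b(p) == reviewed_intent[p])
    preferred = "a" if hits_a >= hits_b else "b"  # ties go to model a (assumption)
    return disagreements, hits_a, hits_b, preferred

# Toy stand-ins for trained models: plain lookup tables.
prompts = ["refund please", "where is my order", "cancel my plan", "talk to a person"]
model_a = {
    "refund please": "refund", "where is my order": "tracking",
    "cancel my plan": "cancel", "talk to a person": "agent",
}.get
model_b = {
    "refund please": "refund", "where is my order": "refund",
    "cancel my plan": "cancel", "talk to a person": "tracking",
}.get

# Only the disagreeing prompts need review, which is the point of the reduction.
reviewed = {"where is my order": "tracking", "talk to a person": "agent"}

result = pick_preferred(prompts, model_a, model_b, reviewed)
print(result)
```

Here only two of the four prompts need review, and model a is preferred with two hits to none.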
System and method for keyword false alarm reduction

Patent number: 12626697

Abstract: A method includes extracting, using a keyword detection model, audio features from audio data. The method also includes processing the audio features by a first layer of the keyword detection model configured to predict a first likelihood that the audio data includes speech. The method also includes processing the audio features by a second layer of the keyword detection model configured to predict a second likelihood that the audio data includes keyword-like speech. The method also includes processing the audio features by a third layer of the keyword detection model configured to predict a third likelihood, for each of a plurality of possible keywords, that the audio data includes the keyword. The method also includes identifying a keyword included in the audio data. The method also includes generating instructions to perform an action based at least in part on the identified keyword.

Type: Grant

Filed: July 14, 2023

Date of Patent: May 12, 2026

Assignee: Samsung Electronics Co., Ltd.

Inventors: Rakshith Sharma Srinivasa, Yashas Malur Saidutta, Ching-Hua Lee, Chou-Chang Yang, Yilin Shen, Hongxia Jin
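
One way to read the abstract above is as a cascade: a keyword is reported only if the audio is likely speech, the speech is likely keyword-like, and one specific keyword is likely enough. The sketch below shows only that decision step, under stated assumptions. The abstract does not say how the three likelihoods are combined, so the shared threshold and the function name are hypothetical.

```python
def identify_keyword(p_speech, p_keyword_like, p_keywords, threshold=0.5):
    """Return the most likely keyword, or None if any stage falls below threshold.

    p_speech:       likelihood that the audio contains speech (first layer)
    p_keyword_like: likelihood that the speech is keyword-like (second layer)
    p_keywords:     dict mapping each candidate keyword to its likelihood (third layer)
    """
    if p_speech < threshold or p_keyword_like < threshold:
        return None  # rejecting early is what reduces false alarms
    best = max(p_keywords, key=p_keywords.get)
    return best if p_keywords[best] >= threshold else None

# Clear keyword: all three stages agree.
print(identify_keyword(0.95, 0.90, {"hello": 0.8, "stop": 0.1}))

# Speech that is not keyword-like: rejected despite a high per-keyword score.
print(identify_keyword(0.95, 0.20, {"hello": 0.8, "stop": 0.1}))
```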
Using artificial entities for generating personalized responses

Patent number: 12620262

Abstract: Systems, methods and non-transitory computer readable media for generating and operating artificial entities are provided. Some disclosed embodiments may involve receiving information related to a source individual; generating an artificial entity associated with the source individual based on the received information; receiving data reflecting an interaction with the artificial entity; and determining a manner for the artificial entity to respond to the interaction based on the collected information.

Type: Grant

Filed: June 2, 2025

Date of Patent: May 5, 2026

Inventors: Ben Avi Ingel, Ron Zass
Encoding and decoding of acoustic environment

Patent number: 12592240

Abstract: There are disclosed apparatus and methods for encoding and decoding of acoustic environment. In accordance with an example, there is provided an apparatus for decoding an acoustic environment, the acoustic environment including at least one audio source and at least one audio object, the at least one audio object being represented by structural-acoustic data which links positional data of polygons with acoustic properties of acoustic materials, wherein the positional data includes, for each polygon, the position of the vertexes, the apparatus comprising a bitstream reader for reading, from the bitstream, an encoded version of structural-acoustic data and at least one audio stream to be rendered as generated by the at least one audio source in the acoustic environment. The apparatus also comprises an audio source decoding block to decode the at least one audio stream representing the at least one audio source, and a structural-acoustic data decoding block to decode the structural-acoustic data.

Type: Grant

Filed: November 21, 2023

Date of Patent: March 31, 2026

Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.

Inventors: Jürgen Herre, Florin Ghido
Context-based automatic speech recognition processing

Patent number: 12586565

Abstract: Techniques for biasing for entities during automatic speech recognition (ASR) processing are described. In some embodiments, a system implements a gating component that is configured to switch on and off entity biasing on an audio frame basis when processing a spoken input. The gating component processes an audio frame to determine whether the audio frame likely includes a representation of a custom entity. Based on the determination, a biasing component, which is configured to generate entity embeddings, may be turned on or off. In this manner, entity biasing does not run on every audio frame, but only on the audio frames where it can be helpful in increasing ASR accuracy.

Type: Grant

Filed: March 29, 2023

Date of Patent: March 24, 2026

Assignee: Amazon Technologies, Inc.

Inventors: Anastasios Alexandridis, Kanthashree Mysore Sathyendra, Grant Strimel, Feng-Ju Chang, Ariya Rastrow, Nathan Anthony Susanj, Athanasios Mouchtaris
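
The per-frame gating described in the abstract above can be sketched as a loop that runs the biasing component only on frames the gate flags. The function names, the threshold, and the use of short strings as stand-ins for audio frames are assumptions for illustration. A real system would operate on acoustic features and learned models.

```python
def run_with_gating(frames, gate, bias, threshold=0.5):
    """Apply bias(frame) only where gate(frame) meets the threshold.

    Returns the processed frames and the number of times biasing ran.
    """
    outputs = []
    bias_calls = 0
    for frame in frames:
        if gate(frame) >= threshold:
            outputs.append(bias(frame))  # biasing switched on for this frame
            bias_calls += 1
        else:
            outputs.append(frame)        # biasing switched off, frame passes through
    return outputs, bias_calls

def toy_gate(frame):
    """Hypothetical gate: high score when the frame looks like a custom entity."""
    return 0.9 if frame.startswith("<") else 0.1

def toy_bias(frame):
    """Hypothetical biasing step, shown here as a visible transformation."""
    return frame.upper()

frames = ["play", "songs", "by", "<artist name>"]
outputs, bias_calls = run_with_gating(frames, toy_gate, toy_bias)
print(outputs, bias_calls)
```

Biasing runs on one frame out of four here, which mirrors the abstract's point that entity biasing does not need to run on every audio frame.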
Sentiment-based conversation hotspot detection

Patent number: 12586573

Abstract: A system may include machine learning models. A system may receive audio information representing an utterance of a conversation session. A system may divide the audio information into a plurality of audio portions. A system may evaluate a first audio portion using a tone-based sentiment analysis model to generate sentiment probabilities. A system may determine a first positive sentiment probability exceeds a threshold. A system may generate a textual representation of the first audio portion. A system may evaluate the textual representation using a topic identification model to generate a topic result indicating a topic. A system may evaluate a second audio portion using the tone-based sentiment analysis model to generate second sentiment probabilities. A system may determine a second positive sentiment probability does not exceed the threshold.

Type: Grant

Filed: September 29, 2023

Date of Patent: March 24, 2026

Assignee: Amazon Technologies, Inc.

Inventors: Gizem Tabak, Masahito Togami, Michael Mark Goodwin, Amalavoyal Chari, Siddhartha Shankara Rao
Chunk-wise attention for longform ASR

Patent number: 12586570

Abstract: A method includes receiving training data including a corpus of multilingual unspoken textual utterances, a corpus of multilingual un-transcribed non-synthetic speech utterances, and a corpus of multilingual transcribed non-synthetic speech utterances. For each un-transcribed non-synthetic speech utterance, the method includes generating a target quantized vector token and a target token index, generating contrastive context vectors from corresponding masked audio features, and deriving a contrastive loss term. The method also includes generating an alignment output, generating a first probability distribution over possible speech recognition hypotheses for the alignment output, and determining an alignment output loss term. The method also includes generating a second probability distribution over possible speech recognition hypotheses and determining a non-synthetic speech loss term.

Type: Grant

Filed: February 23, 2024

Date of Patent: March 24, 2026

Assignee: Google LLC

Inventors: Yongqiang Wang, Yu Zhang, Wei Han, Parisa Haghani, Pedro J. Moreno Mengibar
Word correction using automatic speech recognition (ASR) incremental response

Patent number: 12573405

Abstract: An exemplary automatic speech recognition (ASR) system may receive an audio input including a segment of speech. The segment of speech may be independently processed by general ASR and domain-specific ASR to generate multiple ASR results. A selection between the multiple ASR results may be performed based on respective confidence levels for the general ASR and domain-specific ASR. As incremental ASR is performed, a composite result may be generated based on general ASR and domain-specific ASR.

Type: Grant

Filed: April 5, 2023

Date of Patent: March 10, 2026

Assignee: Adeia Guides Inc.

Inventor: Jeffry Copps Robert Jose
Transforming natural language to a logical form

Patent number: 12573380

Abstract: Techniques are disclosed herein for managing ambiguous date mentions in natural language utterances in transforming natural language utterances to logical forms by encoding the uncertainties of the ambiguous date mentions and including the encoded uncertainties in the logical forms. In a training phase, training examples including natural language utterances, logical forms, and database schema information are automatically augmented and used to train a machine learning model to convert natural language utterances to logical form. In an inference phase, input database schema information is augmented and used by the trained machine learning model to convert an input natural language utterance to logical form.

Type: Grant

Filed: May 6, 2024

Date of Patent: March 10, 2026

Assignee: Oracle International Corporation

Inventors: Gioacchino Tangari, Cong Duy Vu Hoang, Stephen Andrew McRitchie, Steve Wai-Chun Siu, Dalu Guo, Christopher Mark Broadbent, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Kenneth Khiaw Hong Eng, Chandan Basavaraju
System and method for detecting a wakeup command for a voice assistant

Patent number: 12567414

Abstract: A method for detecting a wakeup command for a voice assistant is provided. The method includes receiving an audio signal from one or more sources and determining at least one of acoustic parameters or an environmental context of the user. Further, the method includes generating an embedding vector representation associated with the received audio signal and comparing the generated embedding vector representation with one or more prestored embedding vector representations. Furthermore, the method includes detecting the wakeup command in the received audio signal.

Type: Grant

Filed: October 23, 2023

Date of Patent: March 3, 2026

Assignee: Samsung Electronics Co., Ltd.

Inventor: Ranjan Kumar Samal
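
The comparison step in the abstract above (match an embedding of the incoming audio against prestored embeddings) is sketched below using cosine similarity. The abstract does not name a similarity measure or a decision rule, so cosine similarity, the fixed threshold, and the function names are assumptions. In practice the threshold might be adjusted using the acoustic parameters or environmental context the abstract mentions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def detect_wakeup(embedding, stored_embeddings, threshold=0.8):
    """Report a wakeup command if any stored embedding is similar enough."""
    best = max((cosine(embedding, s) for s in stored_embeddings), default=0.0)
    return best >= threshold

# Toy two-dimensional embeddings; real embeddings would have many more dimensions.
stored = [[0.9, 0.1], [0.0, 1.0]]
print(detect_wakeup([1.0, 0.0], stored))        # close to the first stored vector
print(detect_wakeup([1.0, 0.0], [[0.0, 1.0]]))  # orthogonal, so no match
```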
Audio-visual question answering

Patent number: 12562165

Abstract: The present disclosure describes techniques for improving audio-visual question answering. A machine learning model is configured for audio-visual question answering (AVQA). The machine learning model comprises a first sub-model configured to capture semantic audio information and output an audio spatial feature map xas(1). The machine learning model comprises a second sub-model configured to extract visual features xvs and audio features xas and further configured to obtain a question vector xq. The machine learning model comprises a third sub-model configured to capture audio-visual correspondence at a granular level. A balanced AVQA dataset is created. The balanced AVQA dataset comprises balanced answer distribution in each question category. The machine learning model is trained to answer questions about visual objects, sounds, and their associations in videos using at least a subset of the balanced AVQA dataset.

Type: Grant

Filed: September 22, 2023

Date of Patent: February 24, 2026

Assignee: Lemon Inc.

Inventors: Peng Zhang, Xiulong Liu, Zhikang Dong
Explanation of system determination

Patent number: 12562162

Abstract: Techniques for generating and outputting a natural language explanation of a determination made by a system are described. The system presents content to a user, where the content is generated based on a system determination. The system determines history data associated with a user profile associated with the user and context data associated with the system determination. The system uses the history data and the context data to determine a natural language explanation that the output was generated based on the system determination. The system further uses the history data and the context data to generate a predicted system determination representing the system determination that resulted in the output presented to the user. Based on a similarity between the predicted system determination and the actual system determination, the natural language explanation is presented to the user.

Type: Grant

Filed: March 31, 2023

Date of Patent: February 24, 2026

Assignee: Amazon Technologies, Inc.

Inventors: Zheng Chen, Chen Tong, Xing Fan, Michael Alan Frey, Daniel Grace, Jie Hao, Ziyan Jiang, Chenlei Guo, Aram Galstyan, Yang Liu, Pradeep Natarajan
Contemporaneous machine-learning analysis of audio streams

Patent number: 12562153

Abstract: Described techniques select portions of an audio stream for transmission to a trained machine learning application, which generates response recommendations in real-time. This real-time response is facilitated by the system identifying, selecting and transmitting those portions of the audio stream likely to be most relevant to the conversation. Portions of an audio stream less likely to be relevant to the conversation are identified accordingly and not transmitted. The system may identify the relevant portions of an audio stream by detecting events in a contemporaneous event stream, use a trained machine learning model to identify events in an audio stream, or both.

Type: Grant

Filed: January 19, 2022

Date of Patent: February 24, 2026

Assignee: CRESTA INTELLIGENCE INC.

Inventors: Tianlin Shi, Kenneth George Oetzel
Semantic reasoning-based environment learning for activity insights

Patent number: 12555570

Abstract: In one embodiment, a device identifies, using a semantic reasoning engine, activities in a location, based on sensor data obtained from a plurality of sensors deployed to the location. The device associates the activities with areas of the location in which they occurred. The device makes, using the semantic reasoning engine, an inference about a particular activity, based in part on where that activity occurred. The device raises, based on the inference, an alert regarding the particular activity.

Type: Grant

Filed: November 30, 2021

Date of Patent: February 17, 2026

Assignee: Cisco Technology, Inc.

Inventors: Hugo Latapie, Ozkan Kilic, Adam James Lawrence, Gaowen Liu, Ramana Rao V. R. Kompella, Ali Payani
Method and system of automatic context-bound domain-specific speech recognition

Patent number: 12555572

Abstract: A system, article, and method of automatic context-bound domain-specific speech recognition uses general language models.

Type: Grant

Filed: December 24, 2021

Date of Patent: February 17, 2026

Assignee: Intel Corporation

Inventors: Szymon Jessa, Jakub Nowicki, Michal Papaj, Piotr Hoffmann, Krzysztof Swider, Georg Stemmer
Automatic extraction of conversation highlights

Patent number: 12555580

Abstract: This disclosure describes techniques for generating a conversation summary. The techniques may include processing at least one statement indication of the conversation to determine at least one statement that is a candidate highlight of the conversation. The techniques may further include applying linguistic filtering rules to the candidate highlight to determine the candidate highlight is an actual highlight. The techniques may further include generating the conversation summary including providing the actual highlight as at least a portion of the conversation summary.

Type: Grant

Filed: November 14, 2023

Date of Patent: February 17, 2026

Assignee: Cisco Technology, Inc.

Inventors: Varsha Ravikumar Embar, Karthik Raghunathan
Training neural network components

Patent number: 12548559

Abstract: A machine learning model may be configured for training using an associated learning technique. A model configured for end-to-end backpropagation may be adapted for associated learning by introducing functions for projecting hidden vectors and labels to a shared representation space and for reconstructing labels from representation vectors. An associated learning loss may be calculated at each layer, with the resulting gradients backpropagated locally through that layer rather than all layers. A reconstruction loss may be calculated using each layer's output including the predicted label. Training by associated learning may be parallelized (e.g., layer by layer) to yield efficiency gains. In addition, associated learning training may be more robust to training label errors.

Type: Grant

Filed: June 26, 2023

Date of Patent: February 10, 2026

Assignee: Amazon Technologies, Inc.

Inventors: I-Fan Chen, Satya Venkata Phani Sankar Nidadavolu, Brian King, Pegah Ghahremani, Pin-Jui Ku
