Patents Examined by Richemond Dorvill

Speech processing system and method

Patent number: 12380882

Abstract: A speech processing system includes an input for receiving an input utterance spoken by a user and a word alignment unit configured to align different sequences of acoustic speech models with the input utterance spoken by the user. Each different sequence of acoustic speech models corresponds to a different possible utterance that a user might make. The system identifies any parts of a read prompt text that the user skipped; any parts of the read prompt text that the user repeated; and any speech sounds that the user inserted between words of the read prompt text. The information from the word alignment unit can be used to assess the proficiency and/or fluency of the user's speech.

Type: Grant

Filed: June 19, 2020

Date of Patent: August 5, 2025

Assignee: THE CHANCELLOR, MASTERS, AND SCHOLARS OF THE UNIVERSITY OF CAMBRIDGE

Inventors: Thomas William John Ash, Anthony John Robinson

Audio automatic mixer with frequency weighting

Patent number: 12380912

Abstract: A method is provided that is performed at a system including multiple speech collectors to collect speech from a talker to produce corresponding ones of multiple audio signals that each convey speech energy: for each audio signal: separating high-frequency speech energy from low-frequency speech energy; and determining a first energy level of the high-frequency speech energy; and determining a preferred audio signal among the multiple audio signals for subsequent processing at least based on the first energy level of each audio signal.
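The selection step in this abstract can be illustrated with a minimal sketch: split each collected signal into low- and high-frequency parts, measure the high-frequency energy, and prefer the signal where that energy is greatest. The one-pole filter, the `alpha` constant, and every function name below are assumptions made for illustration; the abstract does not specify the filters or the weighting the patent actually uses.

```python
def split_bands(samples, alpha=0.9):
    """Split a signal into (low, high) parts.

    Assumption: a one-pole low-pass filter stands in for the band
    separation; the high band is whatever the low-pass removes.
    """
    low, high = [], []
    state = 0.0
    for x in samples:
        state = alpha * state + (1.0 - alpha) * x
        low.append(state)
        high.append(x - state)
    return low, high


def energy(samples):
    """Mean squared amplitude of a block of samples."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


def preferred_signal(signals, alpha=0.9):
    """Return (index, levels): the index of the signal with the highest
    high-frequency energy, and the per-signal high-frequency levels."""
    levels = [energy(split_bands(sig, alpha)[1]) for sig in signals]
    best = max(range(len(signals)), key=lambda i: levels[i])
    return best, levels


# Hypothetical input: one slowly varying signal, one rapidly alternating.
slow = [1.0] * 64
fast = [1.0, -1.0] * 32
best_index, hf_levels = preferred_signal([slow, fast])
```

Here `best_index` identifies the rapidly alternating signal, since almost all of its energy survives the high-band split. A real mixer would operate on short frames and might combine this level with other criteria, which the abstract allows with "at least based on".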

Type: Grant

Filed: March 8, 2022

Date of Patent: August 5, 2025

Assignee: CISCO TECHNOLOGY, INC.

Inventor: Asbjorn Therkelsen

Systems and methods for virtual meeting speaker separation

Patent number: 12380910

Abstract: A computer-implemented machine learning method for improving speaker separation is provided. The method comprises processing audio data to generate prepared audio data and determining feature data and speaker data from the prepared audio data through a clustering iteration to generate an audio file. The method further comprises re-segmenting the audio file to generate a speaker segment and causing to display the speaker segment through a client device.

Type: Grant

Filed: June 30, 2021

Date of Patent: August 5, 2025

Assignee: RingCentral, Inc.

Inventors: Prashant Kukde, Sushant Shivram Hiray

Inter-channel phase difference parameter extraction method and apparatus

Patent number: 12367885

Abstract: An inter-channel phase difference (IPD) parameter extraction method includes obtaining a parameter for obtaining an information extraction manner for a current frame of a multi-channel signal; obtaining an IPD parameter extraction manner for the current frame based on the parameter for obtaining the information extraction manner, where the obtained IPD parameter extraction manner is one of at least two preset IPD parameter extraction manners; and obtaining an IPD parameter of the current frame based on the obtained IPD parameter extraction manner for the current frame.

Type: Grant

Filed: January 19, 2024

Date of Patent: July 22, 2025

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Xingtao Zhang, Haiting Li, Zexin Liu, Lei Miao

Nested named entity recognition

Patent number: 12361218

Abstract: Named Entity Recognition is the identification and classification of named entities within a document. Nested NEs occur when an NE is contained within another NE. The disclosed invention leverages the CapsNet architecture for improved nested NE identification and classification. This includes deriving the features of an input text. The derived features are used to identify and classify any named entities in the text. The system is further configured to identify named entities in the text and perform clustering to group named entities. The disclosed CapsNet considers the context of the whole text to activate higher capsule layers in order to identify the named entities and classify them. The teachings of this invention are applicable to other NER models to improve nested NE identification and classification.

Type: Grant

Filed: April 14, 2022

Date of Patent: July 15, 2025

Inventors: Suzanne M Kirch, Rajiv Baronia, Vineeth Thanikonda Munirathnam, Jack Porter

Method and systems for decoding an audio query

Patent number: 12347423

Abstract: A method for decoding an audio query may include: extracting one or more acoustic features from the audio query in response to receiving the audio query from a user; determining a generic word and a domain specific word based on the one or more acoustic features; and decoding the audio query based on the generic word, the domain specific word, and the one or more acoustic features to identify at least one word associated with the audio query.

Type: Grant

Filed: March 11, 2022

Date of Patent: July 1, 2025

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Ankur Kumar, Shatrughan Singh, Aman Maghan, Ashutosh Gupta, Sachin Singh

Door knock access control

Patent number: 12334103

Abstract: Methods, systems, and apparatus for door knock access control are disclosed. A monitoring system for monitoring a property includes: a proximity sensor located at a door of the property; a microphone that is configured to detect sound within an area near the door and generate audio data that represents the detected sound; and a monitor control unit configured to perform operations including: receiving, from the proximity sensor, proximity data indicating an object positioned within a set proximity to the door; based on receiving the proximity data, activating the microphone; receiving, from the microphone, the audio data; determining that a similarity between the audio data and stored audio data representing a knocking pattern satisfies similarity criteria; and in response to determining that the similarity between the audio data and the stored audio data satisfies similarity criteria, performing a monitoring system action.
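One way to picture the similarity check in this abstract is to compare the rhythm of detected knocks against a stored pattern. The sketch below is only an assumption about how such a comparison could work: it presumes knock onset times have already been extracted from the audio data, and the function names and the `tolerance` value are hypothetical. The abstract itself says only that a similarity must satisfy "similarity criteria".

```python
def normalized_intervals(onset_times):
    """Gaps between consecutive knocks, scaled to sum to 1 so that the
    same rhythm knocked faster or slower still compares equal."""
    gaps = [b - a for a, b in zip(onset_times, onset_times[1:])]
    total = sum(gaps)
    if total <= 0:
        return []
    return [g / total for g in gaps]


def matches_pattern(detected, stored, tolerance=0.05):
    """True when both patterns have the same number of knocks and every
    normalized gap differs by at most `tolerance` (an assumed criterion)."""
    a = normalized_intervals(detected)
    b = normalized_intervals(stored)
    if not a or len(a) != len(b):
        return False
    return all(abs(x - y) <= tolerance for x, y in zip(a, b))


# Hypothetical onset times in seconds.
stored_pattern = [0.0, 0.5, 1.0, 2.0]
close_attempt = [0.0, 0.55, 1.05, 2.1]
wrong_attempt = [0.0, 1.0, 1.2, 1.4]

unlock = matches_pattern(close_attempt, stored_pattern)
reject = matches_pattern(wrong_attempt, stored_pattern)
```

With these inputs `unlock` is true and `reject` is false. In the system the abstract describes, this check would run only after the proximity sensor has activated the microphone, and a match would trigger a monitoring system action.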

Type: Grant

Filed: December 30, 2020

Date of Patent: June 17, 2025

Assignee: Alarm.com Incorporated

Inventors: William Wireko Mensah, Daniel John Koniar, Liyu Yao, Martin Logan Elliott, John Zhang

Systems and methods for handling multilingual queries

Patent number: 12333268

Abstract: Systems and methods for handling multilingual queries are provided. One example method includes receiving, at a computing device, an input, wherein the input comprises a multi-lingual query comprising at least a first source language and a second source language. The multi-lingual query is translated, word for word, into a destination language to produce a monolingual query, with the word order of the multilingual query and the word order of the monolingual query being the same. The monolingual query is processed using natural language processing to map the mono-lingual query to a natural language query in the destination language.
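The word-for-word translation step in this abstract can be sketched as a per-token lookup that leaves word order untouched. The lexicon, the sample query, and the function name below are hypothetical, and treating unknown tokens as already being in the destination language is an assumption; the later natural language processing step that maps the monolingual query to a natural language query is not shown.

```python
# Hypothetical Spanish-to-English lexicon; a real system would need
# language identification and a far larger dictionary.
SPANISH_TO_ENGLISH = {
    "canciones": "songs",
    "de": "of",
}


def translate_word_for_word(query, lexicon):
    """Replace each token that has a lexicon entry and keep every other
    token as-is, so the output has the same word order as the input."""
    tokens = query.split()
    return " ".join(lexicon.get(tok.lower(), tok) for tok in tokens)


mixed_query = "play canciones de Shakira"
monolingual_query = translate_word_for_word(mixed_query, SPANISH_TO_ENGLISH)
```

For this input `monolingual_query` is "play songs of Shakira": the same number of words in the same order, which is the property the abstract requires before the downstream mapping.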

Type: Grant

Filed: February 8, 2024

Date of Patent: June 17, 2025

Assignee: ADEIA GUIDES INC.

Inventors: Ajay Kumar Mishra, Jeffry Copps Robert Jose

Method for training a speech recognition model and method for speech recognition

Patent number: 12334052

Abstract: This application relates to a method for training a speech recognition model comprising: providing a speech training data set comprising a plurality of speech data items and corresponding speech tags; providing a speech recognition model to be trained comprising a convolution neural network, a first fully connected network, a recurrent neural network and a second fully connected network which are cascade coupled together, wherein each of the networks comprises one or more network layers each having a parameter matrix; and the speech recognition model processing speech data items to generate corresponding speech recognition results; and using the speech training data set to train the speech recognition model such that the parameter matrices of at least two adjacent network layers satisfy a predetermined constraint condition; and the speech recognition model trained using at least one loss function can generate speech recognition results at an accuracy satisfying a predetermined recognition target.

Type: Grant

Filed: July 12, 2022

Date of Patent: June 17, 2025

Assignee: MONTAGE LZ TECHNOLOGIES (CHENGDU) CO., LTD.

Inventors: Xuepeng Wang, Taibo Dong, Gang Hu

Internet calling method and apparatus, computer device, and storage medium

Patent number: 12335328

Abstract: This disclosure provides a network call method and apparatus, a computer device, and a storage medium, and belongs to the field of audio data processing. The method includes: performing time-frequency transformation on an acquired audio signal, to obtain a plurality of pieces of frequency domain information of the audio signal; determining a target bit rate corresponding to the audio signal according to the plurality of pieces of frequency domain information; and encoding the audio signal based on the target bit rate, and performing a network call based on the encoded audio signal.

Type: Grant

Filed: October 21, 2021

Date of Patent: June 17, 2025

Assignee: Tencent Technology (Shenzhen) Company Limited

Inventor: Junbin Liang

Identifying input for speech recognition engine

Patent number: 12327573

Abstract: A method of presenting a signal to a speech recognition engine is disclosed. According to an example of the method, an audio signal is received from a user. A portion of the audio signal is identified, the portion having a first time and a second time. A pause in the portion of the audio signal, the pause comprising the second time, is identified. It is determined whether the pause indicates the completion of an utterance of the audio signal. In accordance with a determination that the pause indicates the completion of the utterance, the portion of the audio signal is presented as input to the speech recognition engine. In accordance with a determination that the pause does not indicate the completion of the utterance, the portion of the audio signal is not presented as input to the speech recognition engine.

Type: Grant

Filed: April 16, 2020

Date of Patent: June 10, 2025

Assignee: Magic Leap, Inc.

Inventors: Anthony Robert Sheeder, Tushar Arora

System and method for direct speech translation system

Patent number: 12327091

Abstract: A system for translating speech from at least two source languages into another target language provides direct speech to target language translation. The target text is converted to speech in the target language through a TTS system. The system simplifies speech recognition and translation process by providing direct translation, includes mechanisms described herein that facilitate mixed language source speech translation, and punctuating output text streams in the target language. It also in some embodiments allows translation of speech into the target language to reflect the voice of the speaker of the source speech based on characteristics of the source language speech and speaker's voice and to produce subtitled data in the target language corresponding to the source speech. The system uses models having been trained using (i) encoder-decoder architectures with attention mechanisms and training data using TTS and (ii) parallel text training data in more than two different languages.

Type: Grant

Filed: January 13, 2020

Date of Patent: June 10, 2025

Assignee: Applications Technology (AppTek), LLC

Inventors: Evgeny Matusov, Jintao Jiang, Mudar Yaghi

Automatic audio editor software for interviews and recorded speech

Patent number: 12327563

Abstract: Systems and methods for audio processing are described. The system may include a display; a user input device; an audio communications terminal; a recording medium; a tagging system for tagging each of the messages recorded on the recording medium as a function of a question with which each of the messages recorded on the recording medium is associated; and a processor comprising code for: initiating one of the questions of the plurality of questions; displaying the one of the questions of the plurality of questions, having been initiated, on the display; recording via the audio communications terminal audio for a respective response associated with the one of the questions of the plurality of questions, having been initiated; directing the tagging system to tag the response associated with the one of the questions of the plurality of questions; and repeating the steps of each of the plurality of questions.

Type: Grant

Filed: February 24, 2021

Date of Patent: June 10, 2025

Inventor: Kenneth O'Reilly

Integration of high frequency reconstruction techniques with reduced post-processing delay

Patent number: 12296028

Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.

Type: Grant

Filed: December 16, 2024

Date of Patent: May 13, 2025

Assignee: DOLBY INTERNATIONAL AB

Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand

Integration of high frequency reconstruction techniques with reduced post-processing delay

Patent number: 12300258

Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.

Type: Grant

Filed: December 16, 2024

Date of Patent: May 13, 2025

Assignee: DOLBY INTERNATIONAL AB

Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand

Posted information extraction control device and computer readable storage medium

Patent number: 12293150

Abstract: A registree management function receives member (user) registration, carries out a survey upon registration, performs category classification for the registered user, learns the classified categories, and the like. A comment analysis function performs text mining on comments acquired from an SNS posted comment server, determines post origin positions identified by the text mining and the level of credibility thereof, and executes evaluation and the like of a target relating to a theme. An information provision function edits a social heat map generated based on the results of analyzing the comments to be provided to the user, and also performs user category management and the like.

Type: Grant

Filed: June 17, 2019

Date of Patent: May 6, 2025

Assignee: Takenaka Corporation

Inventors: Kuniaki Andou, Rikuto Kunimoto, Takeshi Takai, Kazuo Ohtake

Systems and methods for flexible regularized distillation of natural language processing models to facilitate interpretation

Patent number: 12288029

Abstract: Systems, apparatuses, methods, and computer program products are disclosed for distillation of a natural language processing model. An example method includes receiving, by communications circuitry, a set of text data comprising a set of observations and predicting, by processing circuitry and using the NLP model, classifications for each observation in the text data. The example method further includes generating, by model training engine, a balanced sampled data structure based on the predicted classifications for each observation in the text data and training, by the model training engine, a surrogate model using the balanced sampled data structure. The example method further includes identifying, by an interpreter and from the surrogate model, a set of most-influential tokens in the text data.

Type: Grant

Filed: May 14, 2024

Date of Patent: April 29, 2025

Assignee: Wells Fargo Bank, N.A.

Inventors: Ye Yu, Harsh Singhal, Wayne B. Shoumaker

Voice feedback for user interface of media playback device

Patent number: 12283271

Abstract: A method of providing voice feedback to a listener as part of a user interface of a media playback system may include: storing multiple different voice feedback recordings in at least one computer-readable storage device, where each of the multiple different voice feedback recordings is of a different voice artist; receiving a listener command corresponding to a musical selection; determining an identifying musical characteristic of the musical selection; selecting a first voice feedback recording from the multiple different voice feedback recordings, where the first voice feedback recording corresponds to the identifying musical characteristic; and playing the first voice feedback recording to the listener via the media playback system.

Type: Grant

Filed: May 18, 2021

Date of Patent: April 22, 2025

Assignee: Spotify AB

Inventor: Sten Garmark

Systems and methods to alter voice interactions

Patent number: 12272366

Abstract: Systems and methods are disclosed for providing voice interactions based on user context. Data is received that causes a voice interaction to be generated for output at a user device during an output time interval. In response, current user contextual data of the user device is retrieved. The voice interaction and output time interval are altered to increase consumption likelihood of the voice interaction based on the current user contextual data. The altered voice interaction is outputted at the user device during the altered output time interval.

Type: Grant

Filed: April 29, 2021

Date of Patent: April 8, 2025

Assignee: Adeia Guides Inc.

Inventors: Ankur Anil Aher, Jeffry Copps Robert Jose, Reda Harb

Image display device and voice recognition method therefor

Patent number: 12266348

Abstract: An image display device and a voice recognition method thereof are proposed. The image display device and voice recognition method thereof stores received first utterance data in a first buffer when a wake up word is recognized from utterance data, stores second utterance data consecutively uttered from a user in a second buffer, stores third utterance data consecutively uttered from the user in a third buffer when verification of the wake up word is successful on the basis of the first utterance data after booting of an operating system is completed, and joins the second uttered data and the third uttered data respectively stored in the second buffer and the third buffer, so as to perform voice recognition for the utterance data consecutively uttered by the user. Accordingly, including the commands uttered at intervals after the wake up word, user's commands that are consecutively uttered may be recognized.

Type: Grant

Filed: September 24, 2019

Date of Patent: April 1, 2025

Assignee: LG ELECTRONICS INC.

Inventors: Woo Jin Choi, Sung Eun Kim, Hyun Woo Park, Eun Kyung Jung, Dae Gon Chae
