Detect Speech In Noise Patents (Class 704/233)

Vehicle diagnosis method and information presentation method

Patent number: 12347241

Abstract: A computer of a vehicle diagnosis system acquires a sound produced in a traveling vehicle. The computer of the vehicle diagnosis system diagnoses a status of the vehicle based on the sound acquired. The computer of the vehicle diagnosis system causes a diagnosis result indicating the status of the vehicle to be output from an apparatus mounted on the vehicle.

Type: Grant

Filed: December 6, 2022

Date of Patent: July 1, 2025

Assignee: PANASONIC AUTOMOTIVE SYSTEMS CO., LTD.

Inventor: Riku Usami

Multi-level emotional enhancement of dialogue

Patent number: 12333258

Abstract: A system for emotionally enhancing dialogue includes a computing platform having processing hardware and a system memory storing a software code including a predictive model. The processing hardware is configured to execute the software code to receive dialogue data identifying an utterance for use by a digital character in a conversation, analyze, using the dialogue data, an emotionality of the utterance at multiple structural levels of the utterance, and supplement the utterance with one or more emotional attributions, using the predictive model and the emotionality of the utterance at the multiple structural levels, to provide one or more candidate emotionally enhanced utterance(s).

Type: Grant

Filed: August 24, 2022

Date of Patent: June 17, 2025

Assignee: Disney Enterprises, Inc.

Inventors: Sanchita Tiwari, Justin Ali Kennedy, Dirk Van Dall, Xiuyang Yu, Daniel Cahall, Brian Kazmierczak

Fraud detection in contact centers using deep learning model

Patent number: 12335440

Abstract: An example method is described. The method includes receiving, by a computing system, data indicative of a call into an interactive voice response (IVR) system from a user device and determining, by the computing system and based on the data, a set of actions performed by the user device within the IVR system and a corresponding set of results performed by the IVR system during the call. Additionally, the method includes converting, by the computing system, the set of actions and the corresponding set of results into a sequence of code pairs using a dictionary established based on training data, determining, by the computing system, an activity pattern during the call based on the sequence of code pairs; and calculating, by the computing system, a probability that the call is fraudulent based on the activity pattern during the call.
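
As an illustration of the encoding step in this abstract, the sketch below maps each (action, result) pair from a call to a pair of integer codes. The dictionaries, the event names, and the `UNKNOWN` fallback are hypothetical; the abstract only states that the dictionary is established based on training data, and the model that turns the code sequence into a fraud probability is not sketched here.

```python
from typing import Dict, List, Tuple

# Hypothetical dictionaries. In the abstract, the dictionary is
# "established based on training data"; these entries are made up.
ACTION_CODES: Dict[str, int] = {"enter_account": 1, "enter_pin": 2, "request_balance": 3}
RESULT_CODES: Dict[str, int] = {"accepted": 1, "rejected": 2, "played": 3}
UNKNOWN = 0  # assumed code for anything not seen in the training data


def to_code_pairs(events: List[Tuple[str, str]]) -> List[Tuple[int, int]]:
    """Convert (action, result) events into (action_code, result_code) pairs."""
    return [
        (ACTION_CODES.get(action, UNKNOWN), RESULT_CODES.get(result, UNKNOWN))
        for action, result in events
    ]


# Example call: one accepted account entry, two failed PIN attempts,
# then an action that is not in the dictionary.
call = [
    ("enter_account", "accepted"),
    ("enter_pin", "rejected"),
    ("enter_pin", "rejected"),
    ("press_star", "played"),
]
pairs = to_code_pairs(call)
```

A downstream sequence model would consume `pairs`; repeated `(2, 2)` entries are the kind of activity pattern such a model might weigh.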

Type: Grant

Filed: March 14, 2024

Date of Patent: June 17, 2025

Assignee: Wells Fargo Bank, N.A.

Inventors: Nick A. Maiorana, Judy Cantor, Kevin R. Cieslak, David Gorlesky, Jeremy Ernst

Systems and methods to improve automatic speech recognition of audio data

Patent number: 12334097

Abstract: Systems and methods to improve automatic speech recognition of audio data are disclosed. Exemplary implementations may: process a first noisy audio data from a set of the noisy audio data, with a generator, to generate first refined audio data; process the first refined audio data and a first comprehensible audio data from a set of the comprehensible audio data, individually with a discriminator, to determine individual decision values; determine a generator update and a discriminator update based on the individual decision values; apply the generator update to the generator such that the generator is modified to improve generation of refined audio data or is not modified; and apply the discriminator update to the discriminator such that the discriminator is modified to improve subsequent determinations of the decision values or is not modified.

Type: Grant

Filed: October 13, 2022

Date of Patent: June 17, 2025

Assignee: Suki AI, Inc.

Inventors: Valerii Garnaga, Vamsi Reddy Chagari, Ajithkumar Warrier, Jian Feng, Mona Abdul-Azeez Sheikh

Detecting utterance in audio

Patent number: 12283272

Abstract: Techniques for processing utterance audio are described. In an example, a computer system determines audio data representing an utterance detected by a device, and generates, based at least in part on the audio data, first data representing at least one portion of the utterance in a frequency domain. The first data is specific to a first frequency range. The computer system determines a second frequency range that is a subset of the first frequency range, the second frequency range meeting a frequency threshold, and generates, based at least in part on the first data, second data that represents the at least one portion in the frequency domain. The second data is specific to the second frequency range. The computer system determines, based at least in part on the second data, that additional audio data associated with the device is to be processed.

Type: Grant

Filed: July 19, 2021

Date of Patent: April 22, 2025

Assignee: Amazon Technologies, Inc.

Inventors: Henry Michael D Souza, Vladimir Adam, Ketan Ashok Kulkarni, Oliver Benjamin Hill, Ragini Rajendra Prasad

Image display device and voice recognition method therefor

Patent number: 12266348

Abstract: An image display device and a voice recognition method thereof are proposed. The image display device and voice recognition method thereof store received first utterance data in a first buffer when a wake up word is recognized from utterance data, store second utterance data consecutively uttered from a user in a second buffer, store third utterance data consecutively uttered from the user in a third buffer when verification of the wake up word is successful on the basis of the first utterance data after booting of an operating system is completed, and join the second utterance data and the third utterance data respectively stored in the second buffer and the third buffer, so as to perform voice recognition for the utterance data consecutively uttered by the user. Accordingly, the user's consecutively uttered commands, including commands uttered at intervals after the wake up word, may be recognized.

Type: Grant

Filed: September 24, 2019

Date of Patent: April 1, 2025

Assignee: LG ELECTRONICS INC.

Inventors: Woo Jin Choi, Sung Eun Kim, Hyun Woo Park, Eun Kyung Jung, Dae Gon Chae

Application of interaction modeling for lambda triggering of authorization

Patent number: 12267431

Abstract: A conversational interface receives a conversational request for an action which requires step-up authentication; responds to the user using conversational AI with a warning that an additional step is required; transmits the authorization intent and the spoken submission to a virtual private cloud (VPC), and in response to confirmation of the spoken submission, authenticates the user, thereby authorizing the requested action; and receives a confirmation of validation of the passphrase and that the action is being or will be performed.

Type: Grant

Filed: May 11, 2022

Date of Patent: April 1, 2025

Assignee: TRUIST BANK

Inventors: Peter Eric Sutherland, Laurence Henry Lipsmeyer, Nelson Joseph Erb

Audio signal detector

Patent number: 12254896

Abstract: An audio signal detector comprising: an input adapted to receive a transducer signal; an output to provide an output signal indicative of the presence of a voice component in the transducer signal; and a processor circuit adapted to estimate a noise characteristic of the transducer signal; detect a voice component of the transducer signal; adjust one or more adjustable voice-activity parameters for detecting the voice component, wherein the said one or more adjustable voice-activity parameters are adjusted based on the noise characteristic; and generate the output signal.
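
The idea in this abstract, a voice-activity parameter that is adjusted from a running noise estimate, can be sketched as a simple energy detector. This is only an illustration under assumptions not found in the abstract: the detection parameter is an energy threshold set to `margin` times the noise estimate, the noise estimate is an exponential average with weight `alpha`, and the first frame is taken to contain noise only.

```python
from typing import List, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one non-empty frame."""
    return sum(x * x for x in frame) / len(frame)


def detect_voice(frames: Sequence[Sequence[float]],
                 margin: float = 4.0,
                 alpha: float = 0.9) -> List[bool]:
    """Return one voice/no-voice flag per frame.

    The threshold (the adjustable voice-activity parameter in this sketch)
    is recomputed for every frame from the current noise estimate.
    """
    if not frames:
        return []
    noise = frame_energy(frames[0])  # assumption: first frame is noise only
    flags: List[bool] = []
    for frame in frames:
        energy = frame_energy(frame)
        threshold = margin * noise   # parameter adjusted from the noise characteristic
        is_voice = energy > threshold
        if not is_voice:
            # Update the noise estimate only on frames judged to be noise.
            noise = alpha * noise + (1.0 - alpha) * energy
        flags.append(is_voice)
    return flags


# Two quiet frames, one loud frame, one quiet frame.
frames = [[0.01] * 4, [0.01] * 4, [0.5] * 4, [0.01] * 4]
flags = detect_voice(frames)
```

Freezing the noise estimate during detected voice keeps loud speech from inflating the threshold, which is one common design choice for detectors of this kind.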

Type: Grant

Filed: July 19, 2021

Date of Patent: March 18, 2025

Assignee: Renesas Design Netherlands B.V.

Inventors: Gavin Radolan, Yuan Chia Lu, Chih Chuan Chou, Ching-Hua Yeh

Foundational AI model for capturing and encoding audio with artificial intelligence semantic analysis and without low pass or high pass filters

Patent number: 12236964

Abstract: A system and method for enhancing or restoring audio data utilizing an artificial intelligence module, and more particularly utilizing deep neural networks and generative adversarial networks. The system and method are both able to train the artificial intelligence module to provide for different format and other characteristic-specific transforms for determining how to restore audio to source quality and even beyond. The present invention includes the steps of acquiring source data, pre-processing the source data, implementing the artificial intelligence module, indexing the data, applying transforms, and optimizing the data for a particular audio modality.

Type: Grant

Filed: July 29, 2024

Date of Patent: February 25, 2025

Assignee: Seer Global, Inc.

Inventor: Andrew Denis

Metadata-based diarization of teleconferences

Patent number: 12217760

Abstract: A method for audio processing includes receiving a recording of a teleconference among multiple participants over a network, including an audio stream containing speech uttered by the participants and information outside the audio stream. The method further includes processing the audio stream to identify speech segments interspersed with intervals of silence, extracting speaker identifications from the information outside the audio stream in the received recording, labeling a first set of the identified speech segments from the audio stream with the speaker identifications, extracting acoustic features from the speech segments in the first set, learning a correlation between the speaker identifications labelled to the segments in the first set and the extracted acoustic features, and labeling a second set of the identified speech segments using the learned correlation, to indicate the participants who spoke during the speech segments in the second set.
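
The two-stage labeling in this abstract can be illustrated with a toy nearest-centroid classifier. The sketch assumes each speech segment has already been reduced to a small feature vector and that segments without a speaker identification from the metadata carry `None`; the nearest-centroid rule stands in for the learned correlation and is not claimed to be the patented method.

```python
from typing import Dict, List, Optional, Sequence


def centroid(vectors: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def label_segments(features: List[Sequence[float]],
                   metadata_labels: List[Optional[str]]) -> List[str]:
    """Label every segment with a speaker.

    First set: segments whose label comes from the metadata.
    Second set: segments labeled from acoustic similarity to the first set.
    Assumes at least one segment has a metadata label.
    """
    by_speaker: Dict[str, List[Sequence[float]]] = {}
    for vec, label in zip(features, metadata_labels):
        if label is not None:
            by_speaker.setdefault(label, []).append(vec)
    # "Learn" one centroid per speaker from the first set.
    centroids = {spk: centroid(vecs) for spk, vecs in by_speaker.items()}

    labels: List[str] = []
    for vec, label in zip(features, metadata_labels):
        if label is None:
            label = min(centroids, key=lambda spk: sq_dist(vec, centroids[spk]))
        labels.append(label)
    return labels


features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
metadata = ["A", "A", "B", "B", None, None]
labels = label_segments(features, metadata)
```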

Type: Grant

Filed: January 30, 2022

Date of Patent: February 4, 2025

Assignee: GONG.IO Ltd.

Inventors: Eilon Reshef, Hanan Shteingart, Zohar Shay, Shlomi Medalion

Method for generating training data and method for post-processing of speech recognition using the same

Patent number: 12217738

Abstract: Disclosed is a training data construction method and a speech recognition method using the same. The training data construction method is performed by a computing apparatus including at least one processor and includes converting first text data including a plurality of sentences to first speech data; acquiring second speech data by adding noise to the first speech data; and converting the second speech data to second text data. The second text data includes a sentence corresponding to each of the plurality of sentences included in the first text data.
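
Of the three steps in this abstract, the middle one (adding noise to the first speech data) is easy to illustrate. The sketch below adds white Gaussian noise at a chosen signal-to-noise ratio; the choice of white noise, the `snr_db` parameter, and the fixed seed are assumptions for the example, and the text-to-speech and speech-to-text steps are not shown.

```python
import math
import random
from typing import List, Sequence


def add_noise(signal: Sequence[float], snr_db: float, seed: int = 0) -> List[float]:
    """Return a copy of a non-empty signal with white Gaussian noise added.

    The noise power is chosen so that
    10 * log10(signal_power / noise_power) is approximately snr_db.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Stand-in for synthesized speech: a sine wave.
clean = [math.sin(2.0 * math.pi * n / 50.0) for n in range(2000)]
noisy = add_noise(clean, snr_db=10.0)
```

Pairing the recognizer's output on `noisy` with the original sentence would give one (noisy transcript, reference text) training example in the spirit of the abstract.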

Type: Grant

Filed: May 9, 2022

Date of Patent: February 4, 2025

Assignee: Korea University Research and Business Foundation

Inventors: Heuiseok Lim, Chanjun Park

Apparatus and method for generating an enhanced signal using independent noise-filling identified by an identification vector

Patent number: 12205604

Abstract: An apparatus for generating an enhanced signal from an input signal, wherein the enhanced signal has spectral values for an enhancement spectral region, the spectral values for the enhancement spectral regions not being contained in the input signal, includes a mapper for mapping a source spectral region of the input signal to a target region in the enhancement spectral region, the source spectral region including a noise-filling region; and a noise filler configured for generating first noise values for the noise-filling region in the source spectral region of the input signal and for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from the first noise values or for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from first noise values in the source region.

Type: Grant

Filed: August 14, 2023

Date of Patent: January 21, 2025

Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Inventors: Sascha Disch, Ralf Geiger, Andreas Niedermeier, Matthias Neusinger, Konstantin Schmidt, Stephan Wilde, Benjamin Schubert, Christian Neukam

Context-based model selection

Patent number: 12198057

Abstract: A device includes one or more processors configured to receive sensor data from one or more sensor devices. The one or more processors are also configured to determine a context of the device based on the sensor data. The one or more processors are further configured to select a model based on the context. The one or more processors are also configured to process an input signal using the model to generate a context-specific output.

Type: Grant

Filed: November 24, 2020

Date of Patent: January 14, 2025

Assignee: QUALCOMM Incorporated

Inventors: Fatemeh Saki, Yinyi Guo, Erik Visser

Text processing model training method, and text processing method and apparatus

Patent number: 12182507

Abstract: A text processing model training method, and a text processing method and apparatus in the natural language processing field in the artificial intelligence field are disclosed. The training method includes: obtaining training text; separately inputting the training text into a teacher model and a student model to obtain sample data output by the teacher model and prediction data output by the student model; the sample data includes a sample semantic feature and a sample label; the prediction data includes a prediction semantic feature and a prediction label; and the teacher model is a pre-trained language model used for text classification; and training a model parameter of the student model based on the sample data and the prediction data, to obtain a target student model. The method enables the student model to effectively perform knowledge transfer, thereby improving accuracy of a text processing result of the student model.

Type: Grant

Filed: February 28, 2022

Date of Patent: December 31, 2024

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen

System and method for lookup source segmentation scoring in a natural language understanding (NLU) framework

Patent number: 12175193

Abstract: A natural language understanding (NLU) framework includes a lookup source system having one or more lookup sources. Each lookup source includes a respective source data representation that is compiled from respective source data. Once compiled, a user utterance can be submitted to the lookup source system, which generates segmentations of the user utterance. Each segmentation generally includes a collection of non-overlapping segments, and each segment generally describes how tokens of the user utterance can be grouped together and matched to the states of the source data representations. During lookup source inference, matches can be made to produced states or using fuzzy matchers that have corresponding scoring adjustments. These scoring adjustments may be used by a segmentation scoring subsystem, potentially in combination with one or more additional segmentation scoring plugins, to score and rank the segmentations determined by the lookup source system for the user utterance.

Type: Grant

Filed: January 19, 2022

Date of Patent: December 24, 2024

Assignee: ServiceNow, Inc.

Inventors: Omer Anil Turkkan, Edwin Sapugay, Phani Bhushan Kumar Nivarthi

Anonymizing caller identity based on voice print match

Patent number: 12147574

Abstract: Methods and systems for anonymizing an identity of a caller using a voice print match. One system includes a surveillance system including a camera and a microphone; and a dispatch computer. The dispatch computer is configured to receive a voice call from the caller, generate a voice print from the voice call, compare the voice print to a plurality of stored voice prints, and, in response to the voice print not matching any of the plurality of stored voice prints, transmit the voice print to the surveillance system. The surveillance system is configured to receive the voice print, capture audio data via the microphone, compare the audio data to the voice print, and, in response to identifying a match between the voice print and the audio data, tag image data, captured via the camera, corresponding to the audio data for redaction.

Type: Grant

Filed: August 10, 2022

Date of Patent: November 19, 2024

Assignee: MOTOROLA SOLUTIONS, INC.

Inventors: Chun Meng Tan, Anoop Sehgal Paras Ram, Daniela Stankovska

Acoustic aware voice user interface

Patent number: 12142288

Abstract: A method for optimal configuration of a voice user interface is disclosed herein. The method includes receiving an audio signal; processing the audio signal by a context awareness module to generate context information regarding an acoustic environment of the audio signal; determining, based on the context information, an optimal one of a plurality of different configurations of a voice user interface to perform voice user interface processing of the audio signal; and performing the voice user interface processing of the audio signal using the optimal configuration of the plurality of different configurations.

Type: Grant

Filed: December 21, 2020

Date of Patent: November 12, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventors: Sridhar Nemala, Dhruv Bharadwaj, Navjot Singh, Sharon Gadonniex

Speaker embeddings for improved automatic speech recognition

Patent number: 12136410

Abstract: A method includes receiving a reference audio signal corresponding to reference speech spoken by a target speaker with atypical speech, and generating, by a speaker embedding network configured to receive the reference audio signal as input, a speaker embedding for the target speaker. The speaker embedding conveys speaker characteristics of the target speaker. The method also includes receiving a speech conversion request that includes input audio data corresponding to an utterance spoken by the target speaker associated with the atypical speech. The method also includes biasing, using the speaker embedding generated for the target speaker by the speaker embedding network, a speech conversion model to convert the input audio data corresponding to the utterance spoken by the target speaker associated with atypical speech into an output canonical representation of the utterance spoken by the target speaker.

Type: Grant

Filed: May 3, 2022

Date of Patent: November 5, 2024

Assignee: Google LLC

Inventors: Fadi Biadsy, Dirk Ryan Padfield, Victoria Zayats

Systems, methods, apparatus, and storage medium for processing a signal

Patent number: 12119015

Abstract: The present disclosure provides systems and methods for processing a signal. The system for processing a signal may include at least one microphone and at least one vibration sensor. The at least one microphone may be configured to collect a sound signal, and the sound signal may include at least one of user voice and environmental noise. The at least one vibration sensor may be configured to collect a vibration signal, and the vibration signal may include at least one of the user voice and the environmental noise. The system for processing a signal may also comprise a processor. The processor may be configured to determine a relationship between a noise component in the sound signal and a noise component in the vibration signal, and obtain a target vibration signal by performing, based at least on the relationship, noise reduction processing on the vibration signal.

Type: Grant

Filed: January 30, 2022

Date of Patent: October 15, 2024

Assignee: SHENZHEN SHOKZ CO., LTD.

Inventors: Jinbo Zheng, Fengyun Liao, Xin Qi

Apparatus and method for estimating direction of sound by using acoustic sensor

Patent number: 12099131

Abstract: Provided is a direction estimating apparatus using an acoustic sensor, the direction estimating apparatus including a non-directional acoustic sensor, a plurality of directional acoustic sensors provided adjacent to the non-directional acoustic sensor, and a processor configured to obtain a first output signal from the non-directional acoustic sensor and a plurality of second output signals from the plurality of directional acoustic sensors, and estimate a direction of a sound source within an error range from -5 degrees to +5 degrees by comparing magnitudes between the two output signals and phase information between the first output signal and one of the second output signals.

Type: Grant

Filed: June 7, 2022

Date of Patent: September 24, 2024

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Daehyuk Son, Cheheung Kim, Jaehyung Jang, Hyeokki Hong

Tone and echo cancellation using two acoustic sound cancellers

Patent number: 12087302

Abstract: Example techniques involve systems with multiple acoustic echo cancellers. An example implementation captures first audio within an acoustic environment and detects, within the captured first audio content, a wake-word. In response to the wake-word and before playing an acknowledgement tone, the implementation activates (a) a first sound canceller when one or more speakers are playing back audio content or (b) a second sound canceller when the one or more speakers are idle. In response to the wake-word and after activating either (a) the first sound canceller or (b) the second sound canceller, the implementation outputs the acknowledgement tone via the one or more speakers. The implementation captures second audio within the acoustic environment and cancels the acoustic echo of the acknowledgement tone from the captured second audio using the activated sound canceller.

Type: Grant

Filed: September 21, 2023

Date of Patent: September 10, 2024

Assignee: Sonos, Inc.

Inventor: Saeed Bagheri Sereshki

Terminal, audio cooperative reproduction system, and content display apparatus

Patent number: 12052556

Abstract: A terminal is provided with a first audio output unit and a second audio output unit configured to output an audio content, and the terminal includes: an audio acquiring unit configured to acquire a first audio content and a second audio content; a setting unit configured to set information indicating an output destination of the first audio content and the second audio content; and an audio output controller configured to output the first audio content to one of the first audio output unit and the second audio output unit and output the second audio content to the other of the first audio output unit and the second audio output unit based on the information set by the setting unit.

Type: Grant

Filed: September 30, 2021

Date of Patent: July 30, 2024

Assignee: MAXELL, LTD.

Inventors: Shogo Kitamura, Yuusuke Yatabe, Takashi Matsubara

Conferencing device with beamforming and echo cancellation

Patent number: 12052393

Abstract: This disclosure describes a conferencing device with beamforming and echo cancellation that includes: a microphone array that further comprises a plurality of microphones oriented to develop a corresponding plurality of microphone signals; a processor configured to execute the following steps: (1) performing a beamforming operation; (2) performing an acoustic echo cancellation operation; (3) post processing; (4) selecting one or more of the post processed signals for transmission to the far end where a signal selector uses the far end signal as information to inhibit the signal selector from changing the selection of the post processed signals while only the far end signal is active.

Type: Grant

Filed: August 24, 2023

Date of Patent: July 30, 2024

Assignee: ClearOne, Inc.

Inventors: Ashutosh Pandey, Darrin T. Thurston, David K. Lambert, Tracy A. Bathurst

Speech data processing method and apparatus, electronic device, and readable storage medium

Patent number: 12039987

Abstract: A speech data processing method is provided. In the method, first speech data is received from a sender. An initial speech enhancement parameter is obtained. Speech enhancement is performed on the first speech data based on the obtained initial speech enhancement parameter to obtain first speech enhancement data. A first speech enhancement parameter is determined based on the first speech data. The first speech enhancement data is transmitted to a recipient. Further, the obtained initial speech enhancement parameter is updated based on the first speech enhancement parameter to obtain an updated speech enhancement parameter. The first speech enhancement parameter can be used to perform speech enhancement on second speech data of the sender.

Type: Grant

Filed: September 13, 2021

Date of Patent: July 16, 2024

Assignee: Tencent Technology (Shenzhen) Company Limited

Inventors: Jun Huang, Yannan Wang

Information processing device, and calculation method

Patent number: 12015901

Abstract: An information processing device includes a sound signal acquisition unit that acquires sound signals outputted from a mic array, an analysis unit that analyzes frequencies of the sound signals, an information acquisition unit that acquires predetermined information indicating a steering vector in a first direction as a direction from the mic array to a target sound source, and a calculation unit that calculates a filter for formation in a second direction as a direction different from the first direction based on the frequencies and the information indicating the steering vector in the first direction and calculates a steering vector in the second direction by using an expression indicating a relationship between the calculated filter and the steering vector in the second direction.

Type: Grant

Filed: June 2, 2022

Date of Patent: June 18, 2024

Assignee: Mitsubishi Electric Corporation

Inventors: Tomoharu Awano, Masaru Kimura

Detection and mitigation of a wind whistle

Patent number: 11984109

Abstract: An image capture device detects a wind whistle using two or more microphones. The image capture device includes a processor that obtains microphone signals from the two or more microphones and measures coherence values between the microphone signals across a frequency band. The frequency band includes frequency bins, and the processor measures a coherence value for each frequency bin. Based on a detection of an elevated coherence value in a frequency bin, the processor determines the presence of a whistle. The processor attenuates the frequency bin based on a determination that the elevated coherence value is above a threshold.
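
The per-bin coherence test in this abstract can be illustrated with a small magnitude-squared coherence estimate. Everything concrete here is an assumption for the example: the frame length of 32 samples, the plain DFT without windowing, the 0.8 threshold, the 0.1 attenuation gain, and the synthetic signals (independent noise at each microphone plus a shared tone standing in for the whistle).

```python
import cmath
import math
import random
from typing import List, Sequence


def dft_half(frame: Sequence[float]) -> List[complex]:
    """DFT of a real frame, bins 0..n/2 (direct O(n^2) form, fine for short frames)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


def coherence(x: Sequence[float], y: Sequence[float], frame_len: int = 32) -> List[float]:
    """Magnitude-squared coherence per bin, averaged over non-overlapping frames."""
    bins = frame_len // 2 + 1
    sxx = [0.0] * bins
    syy = [0.0] * bins
    sxy = [0j] * bins
    for start in range(0, min(len(x), len(y)) - frame_len + 1, frame_len):
        fx = dft_half(x[start:start + frame_len])
        fy = dft_half(y[start:start + frame_len])
        for k in range(bins):
            sxx[k] += abs(fx[k]) ** 2
            syy[k] += abs(fy[k]) ** 2
            sxy[k] += fx[k] * fy[k].conjugate()
    result = []
    for k in range(bins):
        denom = sxx[k] * syy[k]
        result.append(abs(sxy[k]) ** 2 / denom if denom > 0 else 0.0)
    return result


def whistle_gains(coh: Sequence[float], threshold: float = 0.8,
                  attenuation: float = 0.1) -> List[float]:
    """Gain per bin: attenuate bins whose coherence exceeds the threshold."""
    return [attenuation if c > threshold else 1.0 for c in coh]


# Synthetic test signals: a shared tone in bin 8 plus independent noise per microphone.
rng = random.Random(0)
n = 32 * 32
tone = [math.sin(2.0 * math.pi * 8 * t / 32) for t in range(n)]
mic1 = [tone[t] + rng.gauss(0.0, 0.5) for t in range(n)]
mic2 = [tone[t] + rng.gauss(0.0, 0.5) for t in range(n)]
coh = coherence(mic1, mic2)
gains = whistle_gains(coh)
```

Averaging over many frames matters: with a single frame the estimate is 1.0 in every bin, so uncorrelated noise could not be told apart from a shared tone.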

Type: Grant

Filed: September 1, 2022

Date of Patent: May 14, 2024

Assignee: GoPro, Inc.

Inventor: Erich Tisch

Direction based end-pointing for speech recognition

Patent number: 11978478

Abstract: A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.

Type: Grant

Filed: March 13, 2023

Date of Patent: May 7, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Kenneth John Basye, Jeffrey Penrod Adams

Auralization for multi-microphone devices

Patent number: 11924618

Abstract: A method for auralizing a multi-microphone device. Path information for one or more sound paths using dimensions and room reflection coefficients of a simulated room for one of a plurality of microphones included in a multi-microphone device is determined. Array-related transfer functions (ARTFs) for the one of the plurality of microphones are retrieved. The auralized impulse response for the one of the plurality of microphones is generated based at least on the retrieved ARTFs and the determined path information.

Type: Grant

Filed: October 4, 2022

Date of Patent: March 5, 2024

Assignee: Google LLC

Inventors: Rajeev Conrad Nongpiur, Ananya Misra, Chanwoo Kim

Portable acoustic imaging tool with scanning and analysis capability

Patent number: 11913829

Abstract: Systems and methods directed toward acoustic analysis can include an acoustic sensor array comprising a plurality of acoustic sensor elements, an electromagnetic imaging tool, and a processor in communication with the acoustic sensor array and the electromagnetic imaging tool. The processor can be configured to analyze acoustic data to extract one or more acoustic parameters representative of acoustic signals at one or more locations in an acoustic scene and generate a display image that includes electromagnetic image data and acoustic image data. The display image can further include information indicative of the one or more acoustic parameters at one or more locations in the acoustic scene, such as including acoustic image data in the display image at locations in the scene at which the one or more acoustic parameters satisfies a predetermined condition.

Type: Grant

Filed: December 23, 2021

Date of Patent: February 27, 2024

Assignee: Fluke Corporation

Inventors: Michael D. Stuart, Dileepa Prabhakar

Apparatus and method for generating an enhanced signal using independent noise-filling at random values and scaling thereupon

Patent number: 11908484

Abstract: An apparatus for generating an enhanced signal from an input signal, wherein the enhanced signal has spectral values for an enhancement spectral region, the spectral values for the enhancement spectral regions not being contained in the input signal, includes a mapper for mapping a source spectral region of the input signal to a target region in the enhancement spectral region, the source spectral region including a noise-filling region; and a noise filler configured for generating first noise values for the noise-filling region in the source spectral region of the input signal and for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from the first noise values or for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from first noise values in the source region.

Type: Grant

Filed: January 19, 2022

Date of Patent: February 20, 2024

Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Inventors: Sascha Disch, Ralf Geiger, Andreas Niedermeier, Matthias Neusinger, Konstantin Schmidt, Stephan Wilde, Benjamin Schubert, Christian Neukam

Neural-network-based approach for speech denoising

Patent number: 11894012

Abstract: Disclosed are methods, systems, devices, and other implementations, including a method that includes receiving an audio signal representation, detecting in the received audio signal representation, using a first learning model, one or more silent intervals with reduced foreground sound levels, determining based on the detected one or more silent intervals an estimated full noise profile corresponding to the audio signal representation, and generating with a second learning model, based on the received audio signal representation and on the determined estimated full noise profile, a resultant audio signal representation with a reduced noise level.

Type: Grant

Filed: May 19, 2023

Date of Patent: February 6, 2024

Assignees: SoftBank Corp.

Inventors: Changxi Zheng, Ruilin Xu, Rundi Wu, Carl Vondrick, Yuko Ishiwaka

Systems and methods for scripted audio production

Patent number: 11875797

Abstract: A scripted audio production system in which the scripted audio production computerized process decreases production time by improving computerized processes and technological systems for pronunciation research and script preparation, narration, editing, proofing and mastering. The system enables the user to upload their manuscript and recorded audio of the narration of the manuscript to the system. The system then compares the recorded audio against the previously uploaded manuscript, and any mistakes or deviations from the manuscript are highlighted or otherwise indicated to the user. The system automatically pieces together the last-read audio into a clean file without the need for significant user interaction. The process may also be performed on the recorded audio by the narrator first uploading the audio and manuscript to the scripted audio production technology system.

Type: Grant

Filed: June 22, 2021

Date of Patent: January 16, 2024

Assignee: Pozotron Inc.

Inventors: Jakub Poznanski, Kostiantyn Hlushak
Systems, devices, and methods for segmenting a musical composition into musical segments

Patent number: 11869466

Abstract: Systems, devices, and methods for segmenting musical compositions are described. Discrete, musically-coherent segments (such as intro, verse, chorus, bridge, solo, and the like) of a musical composition are identified. Distance measures are used to evaluate whether each bar of a musical composition is more like the bars that directly precede it or more like the bars that directly succeed it, and each respective series of musically similar bars is assigned to the same respective segment. Large changes in the distance measure(s) between adjacent bars may be used to identify boundaries between abutting musical segments. Computer systems and computer program products for implementing segmentation are also described. The results of segmentation may advantageously be applied in computer-based composition of music and musical variations, as well as in other applications involving labelling, characterizing, or otherwise processing music.
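
The boundary rule in this abstract (a large change in the distance measure between adjacent bars marks a new segment) can be sketched as follows. This is only an illustration, not the patented method: the two-number feature vector per bar, the Euclidean distance, and the fixed `threshold` are all assumptions chosen for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length bar feature vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def segment_bars(bars, threshold):
    """Group consecutive bar indices into segments.

    A new segment starts whenever the distance between a bar and the
    bar directly before it exceeds `threshold`.
    """
    if not bars:
        return []
    segments = [[0]]
    for i in range(1, len(bars)):
        if distance(bars[i - 1], bars[i]) > threshold:
            segments.append([i])      # large jump: boundary between segments
        else:
            segments[-1].append(i)    # similar to the previous bar: same segment
    return segments


# Hypothetical per-bar features; real features would come from the music itself.
bars = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (0.0, 0.2)]
print(segment_bars(bars, threshold=1.0))  # [[0, 1, 2], [3, 4], [5]]
```

A fixed threshold is the simplest possible choice; the abstract also describes comparing each bar against the bars that precede and succeed it, which this sketch does not attempt.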

Type: Grant

Filed: January 8, 2023

Date of Patent: January 9, 2024

Assignee: Obeebo Labs Ltd.

Inventor: Colin P. Williams
Speaker awareness using speaker dependent speech model(s)

Patent number: 11854533

Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.

Type: Grant

Filed: January 28, 2022

Date of Patent: December 26, 2023

Assignee: GOOGLE LLC

Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Li Wan, Alexander Gruenstein, Hakan Erdogan
Voice command scrubbing

Patent number: 11848015

Abstract: The invention is directed towards an audio scrubbing system that allows for scrubbing recognized voice commands from audio data and replacing the recognized voice commands with environment audio data. Specifically, as a user captures video and audio data via a HMD, audio data captured by the HMD may be processed by an audio scrubbing module to identify voice commands in the audio data that are used for controlling the HMD. When a voice command is identified in the audio data, timestamps corresponding to the voice command may be determined. Filler audio data may then be generated to imitate the environment by processing at least a portion of the audio data by a neural network of a machine learning model. The filler audio data may then be used to replace the audio data corresponding to the identified voice commands, thereby scrubbing the voice command from the audio data.

Type: Grant

Filed: October 1, 2020

Date of Patent: December 19, 2023

Assignee: RealWear, Inc.

Inventor: Christopher Iain Parkinson
Conferencing device with beamforming and echo cancellation

Patent number: 11831812

Abstract: This disclosure describes a conferencing device with beamforming and echo cancellation that includes: a microphone array that further comprises a plurality of microphones oriented to develop a corresponding plurality of microphone signals; a processor configured to execute the following steps: (1) performing a beamforming operation to combine the plurality of microphone signals from the microphone array into a plurality of combined signals, (2) performing an acoustic echo cancellation operation on the plurality of combined signals to generate a plurality of combined echo cancelled signals, (3) receiving with a voice activity detector the far end signal as an input, (4) selecting one or more of the combined echo cancelled signals for transmission to the far end where a signal selector uses the far end signal as information to inhibit the signal selector from changing the selection of the combined echo cancelled signals while only the far end signal is active.

Type: Grant

Filed: November 22, 2022

Date of Patent: November 28, 2023

Assignee: ClearOne, Inc.

Inventors: Ashutosh Pandey, Darrin T. Thurston, David K. Lambert, Tracy A. Bathurst
Electronic apparatus and controlling method thereof

Patent number: 11804241

Abstract: An electronic apparatus and a controlling method thereof are provided. The controlling method includes, based on an audio signal being received through a microphone, determining whether a user is on a public transport; detecting whether the audio signal includes a voice signal output through an acoustic device of the public transport; determining whether the voice signal from the acoustic device includes a voice signal for guiding at least one stop from among a plurality of stops; and outputting information on the at least one stop.

Type: Grant

Filed: January 18, 2022

Date of Patent: October 31, 2023

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Jubum Han, Changwoo Han
Voice separation device, voice separation method, voice separation program, and voice separation system

Patent number: 11798574

Abstract: A speech separation device (12) of a speech separation system includes a feature amount extraction unit (121) configured to extract time-series data of a speech feature amount of mixed speech, a block division unit (122) configured to divide the time-series data of the speech feature amount into blocks having a certain time width, a speech separation neural network (1b) configured to create time-series data of a mask of each of a plurality of speakers from the time-series data of the speech feature amount divided into blocks, and a speech restoration unit (123) configured to restore the speech data of each of the plurality of speakers from the time-series data of the mask and the time-series data of the speech feature amount of the mixed speech.
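
The data flow in this abstract (block division of the feature time series, one mask per speaker, restoration from mask and mixture) can be sketched roughly as below. The neural network is not reproduced; the masks are written by hand, and restoring each speaker by element-wise multiplication of mask and mixture is an assumption of this sketch, as are the function names.

```python
def split_blocks(frames, width):
    """Divide a time series of feature frames into blocks of `width` frames."""
    return [frames[i:i + width] for i in range(0, len(frames), width)]


def apply_masks(mixture, masks):
    """Return one masked copy of `mixture` per speaker.

    `mixture` is a list of frames (each a list of per-bin values) and
    `masks` holds one same-shaped mask per speaker.
    """
    return [
        [[m * x for m, x in zip(mask_row, mix_row)]
         for mask_row, mix_row in zip(mask, mixture)]
        for mask in masks
    ]


# Hypothetical 3-frame, 2-bin mixture; values are illustrative only.
mixture = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
blocks = split_blocks(mixture, width=2)

# Stand-in masks: speaker A owns bin 0, speaker B owns bin 1.
mask_a = [[1.0, 0.0]] * 3
mask_b = [[0.0, 1.0]] * 3
speaker_a, speaker_b = apply_masks(mixture, [mask_a, mask_b])
print(blocks)
print(speaker_a, speaker_b)
```

In the described system the masks would be produced block by block by the separation network rather than fixed in advance.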

Type: Grant

Filed: January 12, 2021

Date of Patent: October 24, 2023

Assignees: MITSUBISHI ELECTRIC CORPORATION, MITSUBISHI ELECTRIC RESEARCH LABORATORIES, INC.

Inventors: Ryo Aihara, Toshiyuki Hanazawa, Yohei Okato, Gordon P Wichern, Jonathan Le Roux
Audio-based identification interfaces for selecting objects from video

Patent number: 11783352

Abstract: A method, system, and device for audio-based identification interfaces for selecting objects from video generates and stores frequency-based audio identifiers associated with segments of an audio stream that is integrated with a video stream. The generation of the frequency-based audio identifiers may be performed by a hashing function applied to audio frequencies within audio segments. The video stream comprises identified objects that may be identified by application of a trained neural network. An audio segment is received from a user and a corresponding frequency-based audio identifier is generated and matched against stored frequency-based audio identifiers. The matching determines an audio segment and a temporally corresponding identified object, which is then embodied within an interactive user interface.
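
A minimal sketch of the lookup this abstract describes: hash the frequencies of each stored audio segment, then hash a query segment and match it to find the object shown at that time. The 50 Hz quantization, the SHA-256 hash, the exact-match lookup, and the object labels are assumptions made for illustration, not details from the patent.

```python
import hashlib


def audio_identifier(peak_frequencies_hz, bin_hz=50):
    """Hash a segment's dominant frequencies after coarse quantization."""
    bins = sorted({int(round(f / bin_hz)) for f in peak_frequencies_hz})
    key = ",".join(str(b) for b in bins)
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def build_index(segments):
    """Map identifier -> (segment start in seconds, object label)."""
    return {
        audio_identifier(freqs): (start, label)
        for start, freqs, label in segments
    }


def match(index, query_frequencies_hz):
    """Return the stored (start, label) for a query segment, or None."""
    return index.get(audio_identifier(query_frequencies_hz))


# Hypothetical stored segments: (start time, peak frequencies, identified object).
stored = [
    (0.0, [440.0, 880.0, 1320.0], "red jacket"),
    (5.0, [262.0, 523.0, 784.0], "wristwatch"),
]
index = build_index(stored)
print(match(index, [442.0, 878.0, 1318.0]))  # (0.0, 'red jacket')
print(match(index, [100.0]))                 # None
```

Quantizing before hashing lets slightly detuned queries land on the same identifier, but a query that falls just across a bin edge would still miss; a practical system would need a more tolerant matching step.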

Type: Grant

Filed: January 20, 2023

Date of Patent: October 10, 2023

Assignee: Revealit Corporation

Inventors: Garry Anthony Smith, Zachary Oakes, Steven Dennis Flinn
Speaker recognition method and apparatus

Patent number: 11763805

Abstract: A speaker recognition method and apparatus receives a first voice signal of a speaker, generates a second voice signal by enhancing the first voice signal through speech enhancement, generates a multi-channel voice signal by associating the first voice signal with the second voice signal, and recognizes the speaker based on the multi-channel voice signal.

Type: Grant

Filed: May 27, 2022

Date of Patent: September 19, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Sung-Jae Cho, Kyuhong Kim, Jaejoon Han
Contextual beamforming to improve signal-to-noise ratio sensitive audio input processing efficiency in noisy environments

Patent number: 11741982

Abstract: An audio processing system includes a microphone array, a speech detection system, and a neural network noise reduction module. The microphone array includes at least two microphones and provides an audio signal from an environment surrounding the microphone array. The speech detection system receives the audio signal, and processes the audio signal to a) detect that a first user is speaking, b) determine a first direction relative to the audio array when the first user is located at a first location within the environment, and c) provide beamforming processing on the audio signal in the first direction, and to provide a processed audio signal based upon the beamforming processing. The neural network noise reduction module reduces noise in the processed audio signal.

Type: Grant

Filed: October 5, 2021

Date of Patent: August 29, 2023

Assignee: Dell Products L.P.

Inventors: Cola Hung Shih, Vivek Viswanathan Iyer
Automated transcript generation from multi-channel audio

Patent number: 11699456

Abstract: Systems and methods are described for generating a transcript of a legal proceeding or other multi-speaker conversation or performance in real time or near-real time using multi-channel audio capture. Different speakers or participants in a conversation may each be assigned a separate microphone that is placed in proximity to the given speaker, where each audio channel includes audio captured by a different microphone. Filters may be applied to isolate each channel to include speech utterances of a different speaker, and these filtered channels of audio data may then be processed in parallel to generate speech-to-text results that are interleaved to form a generated transcript.
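
The final step in this abstract, interleaving per-channel speech-to-text results into one transcript, can be sketched as a merge of time-ordered lists. The `Utterance` record, the speaker labels, and ordering by start time are assumptions for the example; the abstract does not say how the interleaving is ordered.

```python
import heapq
from typing import NamedTuple


class Utterance(NamedTuple):
    start: float   # seconds from the start of the recording
    speaker: str
    text: str


def interleave(*channels):
    """Merge per-channel utterance lists (each sorted by start) into one list."""
    return list(heapq.merge(*channels, key=lambda u: u.start))


# Hypothetical speech-to-text output for two microphone channels.
judge = [
    Utterance(0.0, "JUDGE", "Please state your name."),
    Utterance(4.2, "JUDGE", "Thank you."),
]
witness = [Utterance(2.1, "WITNESS", "Jane Doe.")]

for u in interleave(judge, witness):
    print(f"[{u.start:5.1f}] {u.speaker}: {u.text}")
```

`heapq.merge` reads each channel lazily, so the same approach would work on results that arrive incrementally, provided each channel stays in time order.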

Type: Grant

Filed: February 12, 2021

Date of Patent: July 11, 2023

Assignee: Veritext, LLC

Inventors: Anthony Donofrio, David Joseph DaSilva, James Andrew Maraska, Jr., Jonathan Mordecai Kaplan
Method and system for performing voice processing

Patent number: 11659325

Abstract: A voice processing method, an electronic device and a readable storage medium, which relate to the field of voice processing technologies, are disclosed. The method includes: collecting a first audio signal; processing the first audio signal using a preset algorithm to obtain a second audio signal; and sending the second audio signal to a first device, such that the first device performs a voice processing operation on the second audio signal.

Type: Grant

Filed: December 23, 2021

Date of Patent: May 23, 2023

Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Jingran Li, Liufeng Wang
Sub-matrix input for neural network layers

Patent number: 11620989

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes generating, by a speech recognition system, a matrix from a predetermined quantity of vectors that each represent input for a layer of a neural network, generating a plurality of sub-matrices from the matrix, using, for each of the sub-matrices, the respective sub-matrix as input to a node in the layer of the neural network to determine whether an utterance encoded in an audio signal comprises a keyword for which the neural network is trained.

Type: Grant

Filed: June 26, 2019

Date of Patent: April 4, 2023

Assignee: Google LLC

Inventors: Ignacio Lopez Moreno, Yu-hsin Joyce Chen
Learning speech data generating apparatus, learning speech data generating method, and program

Patent number: 11621015

Abstract: A training speech data generating apparatus includes: a voice conversion unit that converts, using fourth noise data, which is noise data based on third noise data, and speech data, the speech data so as to make the speech data clearly audible under a noise environment corresponding to the fourth noise data; and a noise superimposition unit that obtains training speech data by superimposing the third noise data and the converted speech data.

Type: Grant

Filed: March 11, 2019

Date of Patent: April 4, 2023

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takaaki Fukutomi, Manabu Okamoto, Takashi Nakamura, Kiyoaki Matsui
Systems, devices, and methods for segmenting a musical composition into musical segments

Patent number: 11551651

Abstract: Systems, devices, and methods for segmenting musical compositions are described. Discrete, musically-coherent segments (such as intro, verse, chorus, bridge, solo, and the like) of a musical composition are identified. Distance measures are used to evaluate whether each bar of a musical composition is more like the bars that directly precede it or more like the bars that directly succeed it, and each respective series of musically similar bars is assigned to the same respective segment. Large changes in the distance measure(s) between adjacent bars may be used to identify boundaries between abutting musical segments. Computer systems and computer program products for implementing segmentation are also described. The results of segmentation may advantageously be applied in computer-based composition of music and musical variations, as well as in other applications involving labelling, characterizing, or otherwise processing music.

Type: Grant

Filed: May 30, 2021

Date of Patent: January 10, 2023

Assignee: Obeebo Labs Ltd.

Inventor: Colin P. Williams
Acoustic device

Patent number: 11540038

Abstract: The present disclosure may provide an acoustic device. The acoustic device may include a housing, at least one low-frequency acoustic driver, at least one high-frequency acoustic driver, and a noise reduction assembly. The housing may be configured to be rested on a shoulder of a user. The at least one low-frequency acoustic driver may be carried by the housing and configured to output first sound from at least two first sound guiding holes. The at least one high-frequency acoustic driver may be carried by the housing and configured to output second sound from at least two second sound guiding holes. The noise reduction assembly may be configured to receive third sound and reduce noise of the third sound.

Type: Grant

Filed: February 7, 2021

Date of Patent: December 27, 2022

Assignee: SHENZHEN SHOKZ CO., LTD.

Inventors: Lei Zhang, Junjiang Fu, Bingyan Yan, Fengyun Liao, Xin Qi
Pre-voice separation/recognition synchronization of time-based voice collections based on device clockcycle differentials

Patent number: 11532302

Abstract: Methods and devices for conducting, based on a clock difference, a synchronization process on voice information collected by a plurality of voice collection devices. Then, after the synchronization process is performed on the voice information collected by the plurality of voice collection devices, conducting a voice separation and recognition process on voice information that was collected by the plurality of voice collection devices and synchronized based on the clock difference among the plurality of voice collection devices.
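
As a rough illustration of the synchronization step in this abstract, the sketch below shifts one device's timestamps by a known clock difference before the streams are combined for later separation and recognition. How the clock difference is measured is not covered; the offset value, the `(timestamp, label)` sample format, and the function names are assumptions.

```python
def synchronize(samples, clock_offset_s):
    """Shift (timestamp, value) pairs onto the reference device's clock.

    `clock_offset_s` is how far this device's clock runs ahead of the
    reference clock, in seconds (assumed to be known in advance).
    """
    return [(t - clock_offset_s, v) for t, v in samples]


def combine(*streams):
    """Merge already synchronized streams into one timeline."""
    return sorted((s for stream in streams for s in stream), key=lambda s: s[0])


# Device A is the reference; device B's clock is assumed to run 0.5 s ahead.
device_a = [(1.0, "a0"), (2.0, "a1")]
device_b = [(1.75, "b0"), (2.75, "b1")]

aligned_b = synchronize(device_b, clock_offset_s=0.5)
print(combine(device_a, aligned_b))
# [(1.0, 'a0'), (1.25, 'b0'), (2.0, 'a1'), (2.25, 'b1')]
```

Only after this alignment do samples from different devices share one timeline, which is why the abstract places synchronization before separation and recognition.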

Type: Grant

Filed: September 28, 2017

Date of Patent: December 20, 2022

Assignee: Harman International Industries, Incorporated

Inventors: Xiangru Bi, Guoxia Zhang
Selecting audio noise reduction models for non-stationary noise suppression in an information handling system

Patent number: 11508387

Abstract: Selecting audio noise reduction models for noise suppression in an information handling system (IHS), including performing calibration and configuration of an audio noise reduction selection model, including: identifying contextual data associated with contextual inputs to the IHS; training, based on the contextual data, the audio noise reduction selection model, including generating a configuration policy including configuration rules, the configuration rules for performing actions for selection of a combination of audio noise reduction models to reduce combinations of noise sources associated with the IHS; performing steady-state monitoring of the IHS, including: monitoring the contextual inputs of the IHS, and in response, accessing the audio noise reduction selection model, identifying configuration rules based on the monitored contextual inputs, applying the configuration rules to select a particular combination of audio noise reduction models, applying particular combination of audio noise reduction mod

Type: Grant

Filed: August 18, 2020

Date of Patent: November 22, 2022

Assignee: Dell Products L.P.

Inventors: Vivek Viswanathan Iyer, Michael S. Gatson
Noise reduction method and apparatus for on-board environment, electronic device and storage medium

Patent number: 11508349

Abstract: A noise reduction method and apparatus for an on-board environment, an electronic device and a storage medium are provided, which are applicable to a field of computer technology, and particularly to a field of audio processing. The noise reduction method for an on-board environment includes: receiving an interference signal in the on-board environment and receiving a sound signal in the on-board environment, the interference signal comprising a vibration signal of a vehicle; and performing noise reduction processing on the sound signal in the on-board environment to obtain a noise-reduced signal; wherein, the noise reduction processing comprises cancelling the interference signal from the sound signal in the on-board environment.

Type: Grant

Filed: March 19, 2021

Date of Patent: November 22, 2022

Assignee: Beijing Baidu Netcom Science and Technology Co., LTD

Inventors: Zaidong Zhang, Zhanxue Li, Tingting Che
