Voiced Or Unvoiced Patents (Class 704/208)

Integration of high frequency reconstruction techniques with reduced post-processing delay

Patent number: 12296028

Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.

Type: Grant

Filed: December 16, 2024

Date of Patent: May 13, 2025

Assignee: DOLBY INTERNATIONAL AB

Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
Integration of high frequency reconstruction techniques with reduced post-processing delay

Patent number: 12300258

Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.

Type: Grant

Filed: December 16, 2024

Date of Patent: May 13, 2025

Assignee: DOLBY INTERNATIONAL AB

Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
Integration of high frequency reconstruction techniques with reduced post-processing delay

Patent number: 12243543

Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.

Type: Grant

Filed: January 19, 2024

Date of Patent: March 4, 2025

Assignee: DOLBY INTERNATIONAL AB

Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
Methods and apparatus to perform speed-enhanced playback of recorded media

Patent number: 12190911

Abstract: Methods, apparatus, systems, and articles of manufacture to perform speed-enhanced playback of recorded media are disclosed. Example apparatus to playback media disclosed herein comprise at least one memory, machine-readable instructions, and processor circuitry to execute the machine-readable instructions to parse an audio frame included in the media to determine a number of skip bytes included in the audio frame, compare the number of skip bytes to a threshold, associate the audio frame with a plurality of candidate frames identified in the media when the number of skip bytes satisfies the threshold, and calculate a speed-enhanced playback rate for the media based on the plurality of candidate frames identified in the media.

Type: Grant

Filed: December 19, 2023

Date of Patent: January 7, 2025

Assignee: The Nielsen Company (US), LLC

Inventor: Venugopal Srinivasan
Enhancing musical sound during a networked conference

Patent number: 12183357

Abstract: Dynamic adjustment of audio characteristics for enhancing musical sound during a networked conference is disclosed. In an embodiment, a method is provided for sound enhancement performed by a device coupled to a network. The method includes receiving an audio signal to be transmitted over the network, detecting when musical content is present in the audio signal, processing the audio signal to enhance voice characteristics to generate an enhanced audio signal when the musical content is not detected, processing the audio signal to enhance music characteristics to generate the enhanced audio signal when the musical content is detected, and transmitting the enhanced audio signal over the network.

Type: Grant

Filed: December 16, 2022

Date of Patent: December 31, 2024

Assignee: Zoom Video Communications, Inc.

Inventors: Qiyong Liu, Jiachuan Deng, Yuhui Chen, Oded Gal
Audio decoder, audio encoder, method for providing a decoded audio signal, method for providing an encoded audio signal, audio stream, audio stream provider and computer program using a stream identifier

Patent number: 12142286

Abstract: An audio decoder for providing a decoded audio signal representation on the basis of an encoded audio signal representation is configured to adjust decoding parameters in dependence on a configuration information, to decode one or more audio frames using a current configuration information, to compare a configuration information in a configuration structure associated with one or more frames to be decoded by the current configuration information, and to make a transition to perform decoding using the configuration information in the configuration structure associated with the one or more frames to be decoded as a new configuration information if the configuration information in the configuration structure associated with the one or more frames to be decoded, or a relevant portion thereof, is different from the current configuration information, and to consider a stream identifier information included in the configuration structure when comparing the configuration information.

Type: Grant

Filed: October 23, 2023

Date of Patent: November 12, 2024

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Max Neuendorf, Matthias Felix, Matthias Hildenbrand, Lukas Schuster, Ingo Hofmann, Bernd Herrmann, Nikolaus Rettelbach
Detecting scam callers using conversational agent and machine learning systems and methods

Patent number: 12132862

Abstract: Systems and methods for detecting indications of a scam caller are disclosed. Call data, such as call audio, is received and used to create a training dataset. Using the training dataset, a machine learning model is trained to detect indications of a scam caller in a phone call. An Interactive Voice Response (IVR) model is trained or configured, using voice samples of speech of a subscriber of a telecommunications service provider, to simulate speech and conversation of the subscriber. A conversational agent is generated using the IVR model and the trained machine learning model. The conversational agent receives a phone call, engages a caller in simulated conversation, and detects indications of whether the caller is a likely scam caller. If the caller is determined to be a likely scam caller, an alert can be generated and/or the call can be disconnected.

Type: Grant

Filed: October 6, 2023

Date of Patent: October 29, 2024

Assignee: T-Mobile USA, Inc.

Inventor: Ovidiu Serban
Reducing perceived effects of non-voice data in digital speech

Patent number: 11990144

Abstract: Non-voice data is embedded in a voice bit stream that includes frames of voice bits by selecting a frame of voice bits to carry the non-voice data, placing non-voice identifier bits in a first portion of the voice bits in the selected frame, and placing the non-voice data in a second portion of the voice bits in the selected frame. The non-voice identifier bits are employed to reduce a perceived effect of the non-voice data on audible speech produced from the voice bit stream.

Type: Grant

Filed: July 28, 2021

Date of Patent: May 21, 2024

Assignee: Digital Voice Systems, Inc.

Inventor: John C. Hardwick
Methods and apparatus to perform speed-enhanced playback of recorded media

Patent number: 11869542

Abstract: Methods, apparatus, systems, and articles of manufacture to perform speed-enhanced playback of recorded media are disclosed. Example apparatus to playback media disclosed herein comprise at least one memory, machine-readable instructions, and processor circuitry to execute the machine-readable instructions to parse an audio frame included in the media to determine a number of skip bytes included in the audio frame, compare the number of skip bytes to a threshold, associate the audio frame with a plurality of candidate frames identified in the media when the number of skip bytes satisfies the threshold, and calculate a speed-enhanced playback rate for the media based on the plurality of candidate frames identified in the media.

Type: Grant

Filed: March 7, 2022

Date of Patent: January 9, 2024

Assignee: The Nielsen Company (US), LLC

Inventor: Venugopal Srinivasan
Audio processing for voice encoding and decoding using spectral shaper model

Patent number: 11621009

Abstract: The present disclosure relates to an audio encoding and decoding (codec) system for voice encoding/decoding using a spectral shaper model. In an embodiment, a method of audio signal decoding comprises: receiving a bit stream associated with an audio signal, the bit stream including encoded transform coefficients, spectral envelope data and one or more parameters of a spectral shaper model, the spectral shaper model indicative of a fundamental frequency of a multi-sinusoidal signal model, where the fundamental frequency corresponds to a time domain delay; decoding the encoded transform coefficients; adjusting the decoded transform coefficients using the spectral envelope data and the spectral shaper model; reconstructing transform coefficients of the audio signal using the adjusted, decoded transform coefficients; and transforming the reconstructed transform coefficients into a time domain audio signal.

Type: Grant

Filed: December 18, 2019

Date of Patent: April 4, 2023

Assignee: Dolby International AB

Inventors: Lars Villemoes, Janusz Klejsa, Per Hedelin
Voice processing method, apparatus, electronic device, and storage medium

Patent number: 11587574

Abstract: Provided in the present disclosure are a voice processing method, an apparatus, an electronic device, and a storage medium, the method comprising: detecting the working state of a current call system, and when the working state is a two-end speaking state or a remote-end speaking state, performing compression processing on a subsequent remote-end voice signal, acquiring a near-end voice signal by means of a microphone, performing echo processing on the basis of the near-end voice signal and the compression-processed remote-end voice signal to obtain an echo-processed near-end voice signal and a remaining echo signal, performing non-linear suppression processing on the near-end voice signal and the remaining echo signal, and performing gain control on the suppression-processed near-end voice signal.

Type: Grant

Filed: August 17, 2021

Date of Patent: February 21, 2023

Assignee: Beijing Dajia Internet Information Technology Co., Ltd.

Inventors: Chen Zhang, Pei Dong
Vowel sensing voice activity detector

Patent number: 11587579

Abstract: Methods and apparatuses for detecting user speech are described. In one example, a method for detecting user speech includes receiving a microphone output signal corresponding to sound received at a microphone and identifying a spoken vowel sound in the microphone signal. The method further includes outputting an indication of user speech detection responsive to identifying the spoken vowel sound.

Type: Grant

Filed: August 5, 2021

Date of Patent: February 21, 2023

Assignee: PLANTRONICS, INC.

Inventor: Arthur Leland Schiro
Techniques for spatially selective wake-up word recognition and related systems and methods

Patent number: 11437020

Abstract: According to some aspects, a system for detecting a designated wake-up word is provided, the system comprising a plurality of microphones to detect acoustic information from a physical space having a plurality of acoustic zones, at least one processor configured to receive a first acoustic signal representing the acoustic information received by the plurality of microphones, process the first acoustic signal to identify content of the first acoustic signal originating from each of the plurality of acoustic zones, provide a plurality of second acoustic signals, each of the plurality of second acoustic signals substantially corresponding to the content identified as originating from a respective one of the plurality of acoustic zones, and perform automatic speech recognition on each of the plurality of second acoustic signals to determine whether the designated wake-up word was spoken.

Type: Grant

Filed: February 10, 2016

Date of Patent: September 6, 2022

Assignee: CERENCE OPERATING COMPANY

Inventors: Julien Prémont, Tim Haulick, Emanuele Dalmasso, Munir Nikolai Alexander Georges, Andreas Kellner, Gaetan Martens, Oliver Van Porten, Holger Quast, Martin Roessler, Tobias Wolff, Markus Buck
Text-to-speech (TTS) processing with transfer of vocal characteristics

Patent number: 11410684

Abstract: Audio data from a first, source speaker is received and processed to determine linguistic units and vocal characteristics corresponding to those linguistic units. The linguistic units may either be determined from received text data or may be determined from the audio data using automatic speech recognition. A model is trained using training data from a second, target speaker. The trained model concatenates the linguistic units with the vocal characteristics to produce output speech that has the “voice” of the target speaker and the vocal characteristics of the source speaker.

Type: Grant

Filed: June 4, 2019

Date of Patent: August 9, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Viacheslav Klimkov, Thomas Renaud Drugman, Alexander Galkin, Srikanth Ronanki
Vehicle-based media system with audio ad and visual content synchronization feature

Patent number: 11362747

Abstract: In one aspect, an example method to be performed by a vehicle-based media system includes (a) receiving audio content; (b) causing one or more speakers to output the received audio content; (c) using a microphone of the vehicle-based media system to capture the output audio content; (d) identifying reference audio content that has at least a threshold extent of similarity with the captured audio content; (e) identifying visual content based at least on the identified reference audio content; and (f) outputting, via a user interface of the vehicle-based media system, the identified visual content.

Type: Grant

Filed: January 28, 2021

Date of Patent: June 14, 2022

Assignee: Gracenote, Inc.

Inventor: Nisarg A. Modi
Voice activity detection method

Patent number: 11315591

Abstract: The present invention relates to the field of voice activity detection technologies, and more particularly, to a voice activity detection method. The method comprises: providing an acquisition unit for acquiring an external sound signal; providing a judgment unit for judging whether the sound signal is a voice signal; if the sound signal is a voice signal, starting a voice processing unit for processing the sound signal; if the sound signal is not a voice signal, the voice processing unit is kept in a sleep state. With the voice activity detection method, the voice processing unit with large power consumption is made to be in a sleep state for a long time, and therefore, the entire system can be kept in a low-power consumption state; the voice activity detection method is low in implementation cost and can obtain better performance with a small amount of calculation and less resource consumption.

Type: Grant

Filed: December 18, 2019

Date of Patent: April 26, 2022

Assignee: AMLOGIC (SHANGHAI) CO., LTD.

Inventors: Jinhong Zhang, Jiye Ye
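
The gating idea in the abstract of patent 11315591 (wake the power-hungry voice processing unit only when a judgment unit decides the sound is voice) can be illustrated with a minimal sketch. The mean-energy threshold used as the judgment rule, the class name `VoiceProcessingUnit`, and the function names are all assumptions made for illustration; the abstract does not say how the judgment is made.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples (0.0 for an empty frame)."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


class VoiceProcessingUnit:
    """Hypothetical stand-in for the expensive processing stage."""

    def __init__(self):
        self.awake = False
        self.processed = 0

    def process(self, frame):
        # Waking and processing happen together in this sketch.
        self.awake = True
        self.processed += 1

    def sleep(self):
        self.awake = False


def run_gate(frames, unit, energy_threshold=0.01):
    """Pass a frame to the unit only when the (assumed) energy test says 'voice'."""
    for frame in frames:
        if frame_energy(frame) >= energy_threshold:
            unit.process(frame)
        else:
            # Non-voice frames leave the unit asleep, which is where the power saving comes from.
            unit.sleep()
    return unit
```

With three frames of which only the middle one is loud, the unit processes one frame and is asleep again at the end.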
Methods and apparatus to perform speed-enhanced playback of recorded media

Patent number: 11270735

Abstract: Methods, apparatus, systems and articles of manufacture to perform speed-enhanced playback of recorded media are disclosed. Example media playback devices disclosed herein are to determine a target number of frames of recorded media to drop during playback of the recorded media, the target number determined based on a difference between (1) a total number of frames of the recorded media and (2) a ratio of the total number of frames of the recorded media to a target playback rate. Disclosed example media playback devices are also to select a subset of the frames of the recorded media to drop during the playback of the recorded media, the frames selector to select the subset of the frames based on the target number of frames to drop and skip bytes included in the subset of frames.

Type: Grant

Filed: November 16, 2020

Date of Patent: March 8, 2022

Assignee: The Nielsen Company (US), LLC

Inventor: Venugopal Srinivasan
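
The target-drop calculation in the abstract of patent 11270735 is simple arithmetic: the number of frames to drop is the total frame count minus the ratio of the total frame count to the target playback rate. A minimal sketch follows. The function name is hypothetical, and rounding to the nearest whole frame is an assumption, since the abstract does not specify rounding.

```python
def target_frames_to_drop(total_frames: int, target_rate: float) -> int:
    """Frames to drop so that playback runs at target_rate times normal speed."""
    if total_frames < 0:
        raise ValueError("total_frames must be non-negative")
    if target_rate < 1.0:
        raise ValueError("target_rate must be >= 1.0 for speed-enhanced playback")
    # Frames kept = total / rate; frames dropped = total - frames kept.
    frames_kept = total_frames / target_rate
    # Rounding to the nearest whole frame is an assumption of this sketch.
    return round(total_frames - frames_kept)
```

For example, 1,000 frames at a target rate of 1.25 gives 200 frames to drop. Which frames are dropped (the abstract mentions skip bytes as a selection criterion) is outside this sketch.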
Audio device with speech-based audio signal processing

Patent number: 11172294

Abstract: An audio device with an electro-acoustic transducer and a processor that is configured to determine if input audio signals are speech-based, and if the input audio signals are determined to be speech-based apply speech dynamic range compression to the input audio signals, to develop revised audio signals. The revised audio signals are provided to the transducer.

Type: Grant

Filed: December 27, 2019

Date of Patent: November 9, 2021

Assignee: Bose Corporation

Inventor: Michael Stark
Apparatus and method for determining a predetermined characteristic related to a spectral enhancement processing of an audio signal

Patent number: 11170794

Abstract: An apparatus for determining a predetermined characteristic related to a spectral enhancement processing of an audio signal includes a deriver configured for obtaining a spectrum of the audio signal and for deriving a local maximum signal from the spectrum. The apparatus includes a determiner configured for determining a similarity between segments of the local maximum signal and includes a processor for providing an information indicating that the audio signal includes the predetermined characteristic dependent on an evaluation of the similarity.

Type: Grant

Filed: September 27, 2019

Date of Patent: November 9, 2021

Assignee: Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.

Inventors: Patrick Gampp, Christian Uhle, Sascha Disch, Antonios Karampourniotis, Julia Havenstein, Oliver Hellmuth, Juergen Herre, Peter Prokein
Autonomous intelligent radio

Patent number: 11152016

Abstract: Embodiments of the disclosed technologies include finding content of interest in an RF spectrum by automatically scanning the RF spectrum; detecting, in a range of frequencies of the RF spectrum that includes one or more undefined channels, a candidate RF segment; where the candidate RF segment includes a frequency-bound time segment of electromagnetic energy; executing a machine learning-based process to determine, for the candidate RF segment, signal characterization data indicative of one or more of: a frequency range, a modulation type, a timestamp; using the signal characterization data to determine whether audio contained in the candidate RF segment corresponds to a search criterion; in response to determining that the candidate RF segment corresponds to the search criterion, outputting, through an electronic device, data indicative of the candidate RF segment; where the data indicative of the candidate RF segment is output in a real-time time interval after the candidate RF segment is detected.

Type: Grant

Filed: May 8, 2019

Date of Patent: October 19, 2021

Assignee: SRI INTERNATIONAL

Inventors: Aaron D. Lawson, Harry Bratt, Mitchell L. McLaren, Martin Graciarena
Acoustical in-cabin noise cancellation system for far-end telecommunications

Patent number: 11146887

Abstract: An in-vehicle noise-cancellation system may optimize far-end user experience. The noise-cancellation system may incorporate real-time acoustic input from the vehicle, as well as microphones from a telecommunications device. Audio signals from small, embedded microphones mounted in the vehicle can be processed and mixed into an outgoing telecom signal to effectively cancel acoustic energy from one or more unwanted sources in the vehicle. Multiple microphones may be mounted to headrests and spaced apart in one or more directions to give an indication of the direction of incoming sound from one or more listening zones so that sounds from certain zones may be suppressed. Unwanted noise captured by the embedded microphones may be used as direct inputs to the noise-cancellation system. As direct inputs, these streams can, therefore, be cancelled from the outgoing telecom signal, thus providing the user's far-end correspondent with much higher signal-to-noise ratio, call quality, and speech intelligibility.

Type: Grant

Filed: December 31, 2018

Date of Patent: October 12, 2021

Assignee: Harman International Industries, Incorporated

Inventors: Riley Winton, Chris Ludwig, Gorm H. Jorgensen, Lars Goller, Morten Lydolf
Vowel sensing voice activity detector

Patent number: 11120821

Abstract: Methods and apparatuses for detecting user speech are described. In one example, a method for detecting user speech includes receiving a microphone output signal corresponding to sound received at a microphone and identifying a spoken vowel sound in the microphone signal. The method further includes outputting an indication of user speech detection responsive to identifying the spoken vowel sound.

Type: Grant

Filed: August 8, 2016

Date of Patent: September 14, 2021

Assignee: Plantronics, Inc.

Inventor: Arthur Leland Schiro
Estimation of background noise in audio signals

Patent number: 11114105

Abstract: Background noise estimators and methods are disclosed for estimating background noise in an audio signal. Some methods include obtaining at least one parameter associated with an audio signal segment, such as a frame or part of a frame, based on a first linear prediction gain, calculated as a quotient between a residual signal from a 0th-order linear prediction and a residual signal from a 2nd-order linear prediction for the audio signal segment. A second linear prediction gain is calculated as a quotient between a residual signal from a 2nd-order linear prediction and a residual signal from a 16th-order linear prediction for the audio signal segment. Whether the audio signal segment comprises a pause is determined based at least on the obtained at least one parameter; and a background noise estimate is updated based on the audio signal segment when the audio signal segment comprises a pause.

Type: Grant

Filed: May 10, 2019

Date of Patent: September 7, 2021

Assignee: Telefonaktiebolaget LM Ericsson (publ)

Inventor: Martin Sehlstedt
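
The two linear prediction gains named in the abstract of patent 11114105 can be sketched as ratios of residual energies obtained from a Levinson-Durbin recursion. This is an illustration under assumptions, not the patented estimator: reading "residual signal" as residual energy, the biased autocorrelation estimate, the `eps` guard, and all function names are choices made here.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a sequence of samples."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lp_residual_energies(r, order):
    """Levinson-Durbin recursion; returns residual energies E[0..order]."""
    a = [1.0] + [0.0] * order   # prediction-error filter coefficients
    err = r[0]                  # 0th-order residual energy is the signal energy
    energies = [err]
    for m in range(1, order + 1):
        if err <= 0.0:
            # Silent or degenerate segment: nothing left to predict.
            energies.append(err)
            continue
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err        # reflection coefficient for order m
        new_a = a[:]
        for k in range(1, m):
            new_a[k] = a[k] + k_m * a[m - k]
        new_a[m] = k_m
        a = new_a
        err *= 1.0 - k_m * k_m
        energies.append(err)
    return energies


def lp_gains(segment, eps=1e-12):
    """Return (E0/E2, E2/E16) for one audio segment."""
    e = lp_residual_energies(autocorrelation(segment, 16), 16)
    gain_0_2 = e[0] / max(e[2], eps)
    gain_2_16 = e[2] / max(e[16], eps)
    return gain_0_2, gain_2_16
```

Both gains are at least 1 for any non-silent segment, because each added prediction order can only lower the residual energy. How the gains feed the pause decision and the noise-estimate update is not covered by this sketch.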
Method and system for generating advanced feature discrimination vectors for use in speech recognition

Patent number: 11056097

Abstract: A computer-implemented method of generating advanced feature discrimination vectors (AFDVs) representing sounds forming part of an audio signal input to a device is provided. The method includes taking a plurality of samples of the audio signal, and for each sample of the audio signal taken: performing a signal analysis on the sample to extract one or more high resolution oscillator peaks therefrom; renormalizing the extracted oscillator peaks to eliminate variations in the fundamental frequency and time duration for each sample occurring over the window; normalizing the power of the renormalized extracted oscillator peaks; and forming the renormalized and power normalized extracted oscillator peaks into a respective AFDV for the sample. The method further includes outputting the respective AFDV to a comparison function configured to identify a characteristic of the sample based on a comparison of the respective AFDV with a library of AFDVs associated with known sounds and/or known speakers.

Type: Grant

Filed: July 23, 2019

Date of Patent: July 6, 2021

Assignee: XMOS INC.

Inventors: Kevin M. Short, Brian Hone
Voice message categorization and tagging

Patent number: 11011166

Abstract: In embodiments, a method includes: defining a plurality of visual icons; receiving a voice message, metadata and voice tagging criteria from a first client device; converting the voice message to text; generating a summary of the converted text; generating keywords based on the summary; categorizing the voice message into categories based on the keywords and the metadata; selecting visual icons based on the categories and the tagging criteria; tagging the voice message with the selected visual icons; and sending the tagged voice message to a second client device.

Type: Grant

Filed: November 29, 2018

Date of Patent: May 18, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Si Bin Fan, Yu Gu, Su Liu, Yun Xiang
Randomized input device heating

Patent number: 11010491

Abstract: A security code input may be obfuscated from a thermal imaging device by randomly heating a random set of inputs of an input device. The security code is inputted on an input device, which communicates with a security system to grant or deny access to a user based on an entry of the security code. The input device includes a plurality of heating elements. The input device may receive an input from the user. A random set of heating elements, including one or more heating elements, is generated from the plurality of heating elements. A temperature is determined for the one or more heating elements of the random set of heating elements. The temperature is then applied to the one or more heating elements of the random set of heating elements of the input device.

Type: Grant

Filed: April 9, 2019

Date of Patent: May 18, 2021

Assignee: International Business Machines Corporation

Inventor: Brendan Murray
Noise filling without side information for CELP-like coders

Patent number: 10984810

Abstract: An audio decoder provides a decoded audio information on the basis of an encoded audio information including linear prediction coefficients (LPC) and includes a tilt adjuster to adjust a tilt of a noise using linear prediction coefficients of a current frame to acquire a tilt information and a noise inserter configured to add the noise to the current frame in dependence on the tilt information. Another audio decoder includes a noise level estimator to estimate a noise level for a current frame using a linear prediction coefficient of at least one previous frame to acquire a noise level information; and a noise inserter to add a noise to the current frame in dependence on the noise level information provided by the noise level estimator. Thus, side information about a background noise in the bit-stream may be omitted. Methods and computer programs serve a similar purpose.

Type: Grant

Filed: February 26, 2019

Date of Patent: April 20, 2021

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Guillaume Fuchs, Christian Helmrich, Manuel Jander, Benjamin Schubert, Yoshikazu Yokotani
Time domain spectral bandwidth replication

Patent number: 10978083

Abstract: A wireless audio system for encoding and decoding an audio signal using spectral bandwidth replication is provided. Bandwidth extension is performed in the time-domain, enabling low-latency audio coding.

Type: Grant

Filed: November 13, 2019

Date of Patent: April 13, 2021

Assignee: Shure Acquisition Holdings, Inc.

Inventors: Wenshun Tian, Michael Ryan Lester
Speech extraction method, system, and device based on supervised learning auditory attention

Patent number: 10923136

Abstract: A speech extraction method based on the supervised learning auditory attention includes: converting an original overlapping speech signal into a two-dimensional time-frequency signal representation by a short-time Fourier transform to obtain a first overlapping speech signal; performing a first sparsification on the first overlapping speech signal, mapping intensity information of a time-frequency unit of the first overlapping speech signal to preset D intensity levels, and performing a second sparsification on the first overlapping speech signal based on information of the preset D intensity levels to obtain a second overlapping speech signal; converting the second overlapping speech signal into a pulse signal by a time coding method; extracting a target pulse from the pulse signal by a trained target pulse extraction network; converting the target pulse into a time-frequency representation of the target speech to obtain the target speech by an inverse short-time Fourier transform.

Type: Grant

Filed: April 19, 2019

Date of Patent: February 16, 2021

Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES

Inventors: Jiaming Xu, Yating Huang, Bo Xu
System for shortening audio playback times

Patent number: 10878835

Abstract: Techniques for reducing the time used to output audio or video include increasing the output rate for portions of the content to reduce the time needed for a user to consume the content. Audio data is analyzed to determine portions that are silent and portions that include speech, such as by determining amplitude and other signal characteristics. Portions that include speech are analyzed to determine an initial rate of speech associated with each portion. Speech characteristics of the portions, such as language, volume, and pitch are used to determine a target maximum comprehensible rate of speech for a user. When audio output is generated, the portions that represent speech are each output at a rate that results in the speech being presented within a threshold of the target rate. The portions that represent silence may be removed, shortened, or output at a faster rate than the portions that represent speech.

Type: Grant

Filed: November 16, 2018

Date of Patent: December 29, 2020

Assignee: AMAZON TECHNOLOGIES, INC.

Inventors: Pedro Rodrigues Nacione Pedruzzi, Marcos Vinicius Romero, Gabriel Pereira Borges, Arthur Silva Freire, Rafael Jessen Werneck De Almeida Martins
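
The per-portion rate adjustment described in the abstract of patent 10878835 can be sketched as follows. The sketch assumes speech rate is measured in words per minute, that silence is shortened to a fixed cap rather than removed or sped up, and that the playback rate is the plain ratio of target to initial rate; the `Portion` type, the function name, and the default cap are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Portion:
    duration_s: float
    is_speech: bool
    words_per_minute: float = 0.0  # initial speech rate; unused for silence


def playback_plan(portions, target_wpm, max_silence_s=0.2):
    """Return (rate, output_duration_s) per portion; rate is None for silence."""
    plan = []
    for p in portions:
        if p.is_speech and p.words_per_minute > 0:
            # Speed factor that brings this portion to the target speech rate.
            rate = target_wpm / p.words_per_minute
            plan.append((rate, p.duration_s / rate))
        else:
            # Assumed policy: shorten silence to at most max_silence_s.
            plan.append((None, min(p.duration_s, max_silence_s)))
    return plan
```

A 10-second speech portion at 150 words per minute with a target of 300 is played at 2.0x and takes 5 seconds; a 2-second silence is cut to 0.2 seconds. The abstract's "within a threshold of the target rate" tolerance is not modeled.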
Response sentence generation apparatus, method and program, and voice interaction system

Patent number: 10861458

Abstract: A response sentence generation apparatus includes a conversion device for converting an input voice of a user into text information, an extraction device for extracting prosodic information from the input voice, a specifying device for specifying an emotion occurrence word indicating an occurrence of an emotion of the user based on the text information and the prosodic information, and a generation device for selecting a character string including the specified emotion occurrence word from the text information and generating a response sentence by performing predetermined processing on the selected character string.

Type: Grant

Filed: November 9, 2018

Date of Patent: December 8, 2020

Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA

Inventor: Narimasa Watanabe
Methods and apparatus to perform speed-enhanced playback of recorded media

Patent number: 10839854

Abstract: Methods, apparatus, systems and articles of manufacture to perform speed-enhanced playback of recorded media are disclosed. Example media playback devices disclosed herein include a target frames calculator to determine a target number of frames of recorded media to drop during playback of the recorded media based on a target playback rate and a total number of frames of the recorded media. Disclosed example media playback devices also include a frames selector to select a subset of the frames of the recorded media to drop during the playback of the recorded media, the frames selector to select the subset of the frames based on the target number of frames to drop and skip bytes included in the subset of frames.

Type: Grant

Filed: April 6, 2018

Date of Patent: November 17, 2020

Assignee: The Nielsen Company (US), LLC

Inventor: Venugopal Srinivasan
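The target-frame calculation in the abstract above can be sketched as simple arithmetic. This is an illustration only: the function names are hypothetical, and the evenly spaced selection below is a stand-in for the patent's selection based on skip bytes, which the abstract does not detail.

```python
def target_frames_to_drop(total_frames: int, playback_rate: float) -> int:
    """Frames to drop so that playback runs playback_rate times faster."""
    if total_frames < 0:
        raise ValueError("total_frames must be non-negative")
    if playback_rate < 1.0:
        raise ValueError("playback_rate must be >= 1.0")
    frames_to_keep = round(total_frames / playback_rate)
    return total_frames - frames_to_keep


def select_frames_to_drop(total_frames: int, drop_count: int) -> list:
    """Pick drop_count frame indices spread evenly across the recording.

    Assumption: even spacing replaces the skip-byte criterion named in the abstract.
    """
    if drop_count <= 0:
        return []
    step = total_frames / drop_count
    return [int(i * step + step / 2) for i in range(drop_count)]
```

With 1000 frames and a target rate of 1.25, 800 frames are kept and 200 are dropped.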
Precoding matrix channel information transmission apparatus and method, and system

Patent number: 10819485

Abstract: A channel information transmission apparatus and method, and a system are disclosed to improve channel information feedback precision. A first device includes: a receiving module, configured to receive a reference signal from a second device; a processing module, configured to obtain a precoding matrix based on the reference signal, and generate first channel information and second channel information that are used to indicate the precoding matrix; and a sending module, configured to send the first channel information and the second channel information to the second device.

Type: Grant

Filed: January 29, 2019

Date of Patent: October 27, 2020

Assignee: Huawei Technologies Co., Ltd.

Inventors: Di Zhang, Kunpeng Liu
Acoustic event detector with reduced resource consumption

Patent number: 10789941

Abstract: Techniques are provided for efficient acoustic event detection with reduced resource consumption. A methodology implementing the techniques according to an embodiment includes calculating frames of power spectra based on segments of received acoustic signals. The method further includes two processes, one for detecting impulsive acoustic events and another for detecting continuous acoustic events. The first process includes generating impulsive acoustic event features associated with first and second power spectrum frames, applying a neural network classifier to the impulsive acoustic event features to generate event scores, and detecting an impulsive acoustic event based on those event scores.

Type: Grant

Filed: September 28, 2018

Date of Patent: September 29, 2020

Assignee: Intel Corporation

Inventors: Kuba Lopatka, Mateusz Kotarski, Tobias Bocklet, Marek Zabkiewicz
Systems and methods of detecting speech activity of headphone user

Patent number: 10762915

Abstract: A headphone system is provided that includes a first earpiece, a first front microphone coupled to the first earpiece to receive a first acoustic signal, a first rear microphone coupled to the first earpiece to receive a second acoustic signal, the second acoustic signal being toward the rear of a user's head relative to the first acoustic signal, and a detection circuit coupled to the first front and rear microphones and configured to compare a front signal derived from the first front microphone to a rear signal derived from the first rear microphone and to selectively indicate that the user is speaking based at least in part upon the comparison.

Type: Grant

Filed: June 17, 2019

Date of Patent: September 1, 2020

Assignee: BOSE CORPORATION

Inventors: Xiang-Ern Yeo, Mehmet Ergezer, Alaganandan Ganeshkumar
Method and device for spectral expansion for an audio signal

Patent number: 10636436

Abstract: A method and device for automatically increasing the spectral bandwidth of an audio signal including generating a “mapping” (or “prediction”) matrix based on the analysis of a reference wideband signal and a reference narrowband signal, the mapping matrix being a transformation matrix to predict high frequency energy from a low frequency energy envelope, generating an energy envelope analysis of an input narrowband audio signal, generating a resynthesized noise signal by processing a random noise signal with the mapping matrix and the envelope analysis, high-pass filtering the resynthesized noise signal, and summing the high-pass filtered resynthesized noise signal with the original input narrowband audio signal. Other embodiments are disclosed.

Type: Grant

Filed: July 27, 2018

Date of Patent: April 28, 2020

Assignee: Staton Techiya, LLC

Inventors: John Usher, Dan Ellis
Voice activity detection using air conduction and bone conduction microphones

Patent number: 10535364

Abstract: A head-mounted wearable device incorporates a transducer that operates as a bone conduction (BC) microphone. Vibrations from a user's speech are transferred through the head of the user to the BC microphone. An air conduction (AC) microphone detects sound transferred via air. Signals from the BC microphone and the AC microphone are compared to determine if a common signal is present in both. For example, both signals may have a cross-correlation that exceeds a threshold value. Based on the comparison, voice activity data is generated that indicates the user wearing the device is speaking.

Type: Grant

Filed: September 8, 2016

Date of Patent: January 14, 2020

Assignee: AMAZON TECHNOLOGIES, INC.

Inventors: Xuan Zhong, Bozhao Tan, Jianchun Dong, Chia-Jean Wang
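The comparison in the abstract above, where a cross-correlation between the BC and AC signals is checked against a threshold, can be sketched as follows. The threshold value, the lag search range, and the function names are assumptions for illustration; the abstract does not specify them.

```python
import math


def normalized_cross_correlation(a, b, lag: int) -> float:
    """Correlation coefficient between a[i] and b[i + lag] over the overlapping samples."""
    pairs = [(a[i], b[i + lag]) for i in range(len(a)) if 0 <= i + lag < len(b)]
    if len(pairs) < 2:
        return 0.0
    mean_a = sum(x for x, _ in pairs) / len(pairs)
    mean_b = sum(y for _, y in pairs) / len(pairs)
    num = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x, _ in pairs)
        * sum((y - mean_b) ** 2 for _, y in pairs)
    )
    return num / den if den > 0 else 0.0


def voice_active(bc, ac, threshold: float = 0.6, max_lag: int = 8) -> bool:
    """Report voice activity when the peak correlation over small lags exceeds the threshold."""
    peak = max(
        normalized_cross_correlation(bc, ac, lag)
        for lag in range(-max_lag, max_lag + 1)
    )
    return peak > threshold
```

Searching over a few lags allows for the different propagation delays of bone-conducted and air-conducted sound; a signal that appears in only one microphone yields a low peak and no voice activity.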
Audio processing apparatus that outputs, among sounds surrounding user, sound to be provided to user

Patent number: 10510361

Abstract: An audio processing apparatus is provided that includes an acquirer that acquires a surrounding audio signal indicating a sound surrounding a user. The audio processing apparatus also includes an audio extractor that extracts, from the acquired surrounding audio signal, a providing audio signal indicating a sound to be provided to the user. The audio processing apparatus further includes an output that outputs a first audio signal, indicating a main sound, and the providing audio signal.

Type: Grant

Filed: March 3, 2016

Date of Patent: December 17, 2019

Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.

Inventor: Kazuya Nomura
Detection of acoustic impulse events in voice applications using a neural network

Patent number: 10475471

Abstract: In accordance with embodiments of the present disclosure, an integrated circuit for implementing at least a portion of an audio device may include an audio output configured to reproduce audio information by generating an audio output signal for communication to at least one transducer of the audio device, a microphone input configured to receive an input signal indicative of ambient sound external to the audio device, and a processor configured to implement an impulsive noise detector. The impulsive noise detector may comprise a plurality of processing blocks for determining a feature vector based on characteristics of the input signal and a neural network for determining based on the feature vector whether the impulsive event comprises a speech event or a noise event.

Type: Grant

Filed: May 1, 2017

Date of Patent: November 12, 2019

Assignee: Cirrus Logic, Inc.

Inventor: Samuel Pon Varma Ebenezer
Signal processor

Patent number: 10453469

Abstract: A signal processor comprising: a modelling block, configured to receive a frequency-domain-input-signal, a fundamental-frequency-signal representative of a fundamental frequency of the frequency-domain-input-signal; and configured to provide a pitch-model-signal based on a periodic function, the pitch-model-signal spanning a plurality of discrete frequency bins, each discrete frequency bin having a respective discrete frequency bin index, wherein within each discrete frequency bin the pitch-model-signal is defined by: the periodic function; the fundamental frequency; the frequency-domain-input-signal; and the respective discrete frequency bin index. The signal processor further comprises a manipulation block, configured to provide an output-signal based on the frequency-domain-input-signal and the pitch-model-signal.

Type: Grant

Filed: March 26, 2018

Date of Patent: October 22, 2019

Assignee: NXP B.V.

Inventors: Nilesh Madhu, Wouter Joos Tirry
Speech enhancement method and apparatus for same

Patent number: 10431241

Abstract: A speech enhancement method is provided. The speech enhancement method includes: estimating a direction of a speaker by using an input signal, generating direction information indicating the estimated direction, detecting speech of a speaker based on a result of the estimating the direction, and enhancing the speech of the speaker by using the direction information based on a result of the detecting the speech.

Type: Grant

Filed: May 30, 2014

Date of Patent: October 1, 2019

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Jae-youn Cho, Weiwei Cui, Seung-yeol Lee
Apparatus and method for synthesizing an audio signal, decoder, encoder, system and computer program

Patent number: 10431232

Abstract: A method and an apparatus for synthesizing an audio signal are described. A spectral tilt is applied to the code of a codebook used for synthesizing a current frame of the audio signal. The spectral tilt is based on the spectral tilt of the current frame of the audio signal. Further, an audio decoder operating in accordance with the inventive approach is described.

Type: Grant

Filed: July 28, 2015

Date of Patent: October 1, 2019

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Guillaume Fuchs, Tom Baeckstroem, Ralf Geiger, Wolfgang Jaegers, Emmanuel Ravelli
Methods for processing and analyzing a signal, and devices implementing such methods

Patent number: 10388302

Abstract: A method for processing an initial signal that includes a useful signal and added noise comprises a step of frequency-selective analysis that provides, starting from the initial signal, a plurality of wideband analysis signals, each corresponding to one of the analyzed frequencies, and that comprises the following actions: zero or more complex frequency translations, one or more undersampling operations, and computation of the instantaneous amplitude, instantaneous phase, and instantaneous frequency of the wideband analysis signals. This information then makes it possible to detect modulations of signals buried in high levels of noise and to detect, with good probability, the presence of a signal in a high level of noise.

Type: Grant

Filed: December 23, 2015

Date of Patent: August 20, 2019

Inventor: Yves Reza
Time-based frequency tuning of analog-to-information feature extraction

Patent number: 10373608

Abstract: A sound recognition system including time-dependent analog filtered feature extraction and sequencing. An analog front end (AFE) in the system receives input analog signals, such as signals representing an audio input to a microphone. Features in the input signal are extracted, by measuring such attributes as zero crossing events and total energy in filtered versions of the signal with different frequency characteristics at different times during the audio event. In one embodiment, a tunable analog filter is controlled to change its frequency characteristics at different times during the event. In another embodiment, multiple analog filters with different filter characteristics filter the input signal in parallel, and signal features are extracted from each filtered signal; a multiplexer selects the desired features at different times during the event.

Type: Grant

Filed: October 22, 2015

Date of Patent: August 6, 2019

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Zhenyong Zhang, Wei Ma
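The two attributes named in the abstract above, zero crossing events and total energy, can be illustrated on sampled data. The patent describes analog circuitry; this digital, frame-based sketch is an assumption made for illustration, and it leaves out the time-dependent filtering step.

```python
def zero_crossings(samples) -> int:
    """Count sign changes between consecutive samples."""
    return sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev < 0) != (cur < 0)
    )


def total_energy(samples):
    """Sum of squared sample values."""
    return sum(x * x for x in samples)


def frame_features(samples, frame_len: int):
    """Split samples into non-overlapping frames and return (zero crossings, energy) per frame."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    return [(zero_crossings(f), total_energy(f)) for f in frames]
```

A rapidly alternating frame gives a high zero-crossing count at modest energy, while a steady frame gives no crossings; a recognizer can use such per-frame pairs as a compact feature sequence.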
Audio encoder and decoder using a frequency domain processor with full-band gap filling and a time domain processor

Patent number: 10332535

Abstract: An audio encoder for encoding an audio signal has: a first encoding processor for encoding a first audio signal portion in a frequency domain, having: a time frequency converter for converting the first audio signal portion into a frequency domain representation; an analyzer for analyzing the frequency domain representation to determine first spectral portions to be encoded with a first spectral resolution and second regions to be encoded with a second resolution; and a spectral encoder for encoding the first spectral portions with the first spectral resolution and encoding the second portions with the second resolution; a second encoding processor for encoding a second different audio signal portion in the time domain; a controller for analyzing and determining, which portion of the audio signal is the first audio signal portion encoded in the frequency domain and which portion is the second audio signal portion encoded in the time domain; and an encoded signal former for forming an encoded audio signal havi

Type: Grant

Filed: January 24, 2017

Date of Patent: June 25, 2019

Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Inventors: Sascha Disch, Martin Dietz, Markus Multrus, Guillaume Fuchs, Emmanuel Ravelli, Matthias Neusinger, Markus Schnell, Benjamin Schubert, Bernhard Grill
Transient detection with hangover indicator for encoding an audio signal

Patent number: 10311883

Abstract: A transient detector (100) analyzes (110) a given frame n of the input audio signal to determine, based on audio signal characteristics of the given frame n, a transient hangover indicator for a following frame n+1, and signals (120) the determined transient hangover indicator to an associated audio encoder (10) to enable proper encoding of the following frame n+1.

Type: Grant

Filed: October 18, 2016

Date of Patent: June 4, 2019

Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)

Inventors: Anisse Taleb, Gustaf Ullberg
Noise filling without side information for CELP-like coders

Patent number: 10269365

Abstract: An audio decoder provides a decoded audio information on the basis of an encoded audio information including linear prediction coefficients (LPC) and includes a tilt adjuster to adjust a tilt of a noise using linear prediction coefficients of a current frame to acquire a tilt information and a noise inserter configured to add the noise to the current frame in dependence on the tilt information. Another audio decoder includes a noise level estimator to estimate a noise level for a current frame using a linear prediction coefficient of at least one previous frame to acquire a noise level information; and a noise inserter to add a noise to the current frame in dependence on the noise level information provided by the noise level estimator. Thus, side information about a background noise in the bit-stream may be omitted. Methods and computer programs serve a similar purpose.

Type: Grant

Filed: July 28, 2015

Date of Patent: April 23, 2019

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Guillaume Fuchs, Christian Helmrich, Manuel Jander, Benjamin Schubert, Yoshikazu Yokotani
Social networking with assistive technology device

Patent number: 10249288

Abstract: An approach is provided that assists visually impaired users. The approach analyzes a document that is being utilized by the visually impaired user. The analysis derives a sensitivity of the document. A vocal characteristic corresponding to the derived sensitivity is retrieved. Text from the document is audibly read to the visually impaired user with a text to speech process that utilizes the retrieved vocal characteristic. The retrieved vocal characteristic conveys the derived sensitivity of the document to the visually impaired user.

Type: Grant

Filed: June 16, 2018

Date of Patent: April 2, 2019

Assignee: International Business Machines Corporation

Inventors: Maureen E. Kraft, Fang Lu, Azadeh Salehi, Weisong Wang
Automatic and selective context-based gating of a speech-output function of an electronic digital assistant

Patent number: 10248379

Abstract: A process at an electronic computing device (device) for automatic and selective context-based gating of an electronic digital assistant speech-output function includes detecting, via a sensor communicatively coupled to the device or via a notification received at the device, an event associated with a need to reduce a cognitive load on the user. Responsively, temporarily gating a portion of an electronic digital assistant speech-output function to prevent reproduction to the user of one or more first speech-output events generated during a first period of time associated with the event. Subsequently, and after detecting that the event has ended, ending the temporary gating to allow reproduction of one or more second speech-output events generated during a second period of time after the event has ended.

Type: Grant

Filed: July 27, 2017

Date of Patent: April 2, 2019

Assignee: MOTOROLA SOLUTIONS, INC.

Inventor: Paul D. Steinberg
Detection of acoustic impulse events in voice applications

Patent number: 10242696

Abstract: In accordance with embodiments of the present disclosure, an integrated circuit for implementing at least a portion of an audio device may include an audio output configured to reproduce audio information by generating an audio output signal for communication to at least one transducer of the audio device, a microphone input configured to receive an input signal indicative of ambient sound external to the audio device and a processor configured to implement an impulsive noise detector. The impulsive noise detector may include a sudden onset detector for predicting an occurrence of a signal burst event of the input signal and an impulsive detector for determining whether the signal burst event comprises a speech event or a noise event.

Type: Grant

Filed: October 11, 2016

Date of Patent: March 26, 2019

Assignee: Cirrus Logic, Inc.

Inventor: Samuel Pon Varma Ebenezer
