Perceptual Measures For Quality Assessment (EPO) Patents (Class 704/E19.002)

Methods and apparatus to determine audio quality

Patent number: 11948598

Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to determine audio quality. Example apparatus disclosed herein include an equalization (EQ) model query generator to generate a query to a neural network, the query including a representation of a sample of an audio signal. Example apparatus disclosed herein also include an EQ analyzer to access a plurality of equalization settings determined by the neural network based on the query; and compare the equalization settings to an equalization threshold to determine if the audio signal is to be removed from subsequent processing.

Type: Grant

Filed: October 22, 2021

Date of Patent: April 2, 2024

Assignee: Gracenote, Inc.

Inventor: Markus Kurt Cremer
Multichannel audio speech classification

Patent number: 11900961

Abstract: Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.

Type: Grant

Filed: May 31, 2022

Date of Patent: February 13, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Oron Nir, Inbal Sagiv, Maayan Yedidia, Fardau Van Neerden, Itai Norman
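
The steps in the abstract of patent 11900961 (per-window average power, correlation of each channel against the combined power of the others, threshold comparison) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the use of Pearson correlation, and the direction of the comparison (low correlation treated as speech-like turn-taking) are assumptions, and the transcoding step is omitted.

```python
import math

def window_powers(samples, window):
    """Average power (mean squared amplitude) of each full window."""
    return [
        sum(x * x for x in samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either series is empty or constant."""
    n = len(a)
    if n == 0 or len(b) != n:
        return 0.0
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)

def channel_correlations(channels, window):
    """Correlate each channel's power with the summed power of the other channels."""
    powers = [window_powers(ch, window) for ch in channels]
    result = []
    for i, p in enumerate(powers):
        rest = [q for j, q in enumerate(powers) if j != i]
        combined = [sum(vals) for vals in zip(*rest)]
        result.append(pearson(p, combined))
    return result

def is_speech_like(channels, window, threshold):
    # Assumption: speakers on separate channels tend to alternate, so a mean
    # correlation below the threshold is treated as speech-based communication.
    corrs = channel_correlations(channels, window)
    return sum(corrs) / len(corrs) < threshold
```

For example, two channels whose activity alternates yield a correlation near -1, while two channels that rise and fall together yield a correlation near +1.
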
Broadcasting signal frame generation apparatus and method using layered division multiplexing

Patent number: 11778076

Abstract: An apparatus and method for generating a broadcast signal frame using layered division multiplexing are disclosed. An apparatus for generating a broadcast signal frame according to an embodiment of the present invention includes a combiner configured to generate a multiplexed signal by combining a core layer signal and an enhanced layer signal at different power levels; a power normalizer configured to reduce the power of the multiplexed signal to a power level corresponding to the core layer signal; a time interleaver configured to generate a time-interleaved signal by performing interleaving that is applied to both the core layer signal and the enhanced layer signal; and a frame builder configured to generate a broadcast signal frame including a preamble for signaling time interleaver information shared by the core layer signal and the enhanced layer signal, using the time-interleaved signal.

Type: Grant

Filed: August 8, 2022

Date of Patent: October 3, 2023

Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventors: Jae-Young Lee, Heung-Mook Kim, Sung-Ik Park, Sun-Hyoung Kwon
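
The combiner and power normalizer named in the abstract of patent 11778076 can be illustrated with a small sketch. It covers only those two stages (the time interleaver and frame builder are left out), and the function names and the `injection_db` parameter are hypothetical labels chosen for this example, not terms taken from the patent.

```python
import math

def mean_power(signal):
    """Mean squared magnitude of a real or complex sample sequence."""
    return sum(abs(x) ** 2 for x in signal) / len(signal)

def combine_layers(core, enhanced, injection_db):
    """Add the enhanced layer to the core layer after attenuating it.

    injection_db is an assumed parameter: the attenuation, in dB,
    applied to the enhanced layer before the two layers are summed.
    """
    gain = 10 ** (-injection_db / 20)
    return [c + gain * e for c, e in zip(core, enhanced)]

def normalize_to_core(multiplexed, core):
    """Scale the multiplexed signal so its mean power matches the core layer's."""
    scale = math.sqrt(mean_power(core) / mean_power(multiplexed))
    return [scale * x for x in multiplexed]
```

Summing two layers raises the total power above that of the core layer alone, which is why a normalization stage follows the combiner; after `normalize_to_core`, the mean power of the result equals that of `core`.
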
Automatically generating speech markup language tags for text

Patent number: 11380300

Abstract: In particular embodiments, an apparatus comprises non-transitory computer-readable storage media and a processor, coupled to the media, that executes instructions to access a plurality of text and to generate, using one or more natural language understanding (NLU) models, one or more scores for at least a portion of the plurality of text. The apparatus determines, based on the scores, one or more prosodic values corresponding to the portion of the plurality of text. The apparatus determines, based on the one or more prosodic values, one or more speech synthesis markup language (SSML) tags. The apparatus then generates, based on the prosodic values, SSML-tagged data comprising each determined SSML tag and that tag's location in the plurality of text.

Type: Grant

Filed: January 30, 2020

Date of Patent: July 5, 2022

Assignee: SAMSUNG ELECTRONICS COMPANY, LTD.

Inventors: Vinod Cherian Joseph, Varun Nambikrishnan
Audio quality estimation method, audio quality estimation apparatus, and computer readable recording medium recording a program

Patent number: 8868222

Abstract: An audio quality estimation apparatus includes an audio packet loss frequency calculation unit (11) which, when at least one audio packet to be assessed exists in singly or continuously generated IP packet losses, calculates an audio packet loss frequency based on information of received IP packets by counting the packet losses as an audio packet loss of one time regardless of the continuous length, an average influence time calculation unit (12) which calculates, based on information of received IP packets, an average influence time serving as an average time during which audio quality is influenced when the audio packet loss frequency is 1, and a subjective quality assessment value estimation unit (22) which estimates a subjective quality assessment value based on the audio packet loss frequency and average influence time.

Type: Grant

Filed: May 25, 2009

Date of Patent: October 21, 2014

Assignee: Nippon Telegraph and Telephone Corporation

Inventors: Noritsugu Egi, Takanori Hayashi
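
The burst-counting idea in the abstract of patent 8868222 (a run of consecutive losses counts as one loss event, whatever its length) can be sketched as below. The sketch assumes every packet in the sequence carries audio and that the influence time of a burst equals the number of lost packets times a fixed packet duration. Both are simplifying assumptions, and the final mapping to a subjective quality value is not shown because the abstract does not give it.

```python
from itertools import groupby

def loss_bursts(received):
    """Lengths of each run of consecutive lost packets.

    received is a sequence of booleans in arrival order: True for a
    packet that arrived, False for a packet that was lost.
    """
    return [sum(1 for _ in run) for ok, run in groupby(received) if not ok]

def loss_frequency(received):
    """Number of loss events, counting each burst once regardless of length."""
    return len(loss_bursts(received))

def average_influence_ms(received, packet_ms):
    """Average duration affected per loss event (assumed: lost packets x packet_ms)."""
    bursts = loss_bursts(received)
    if not bursts:
        return 0.0
    return packet_ms * sum(bursts) / len(bursts)
```

With `received = [True, False, False, True, True, False, True]` and 20 ms packets, there are two loss events (one of two packets, one of one packet), so the average influence time is 30 ms.
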
Method for qualitative evaluation of a digital audio signal

Patent number: 8606385

Abstract: The invention relates to a method of qualitatively evaluating a digital audio signal. It calculates a quality indicator consisting of a vector associated with each time window in real time, in continuous time, and in successive time windows. For example, the generation of a quality indicator vector calculates, for a reference audio signal and for an audio signal to be evaluated, the spectral power density of the audio signal, the coefficients of a prediction filter, using an autoregressive method, a temporal activity of the signal or the minimum value of the spectrum in successive blocks of the signal. To evaluate the deterioration of the audio signal, the method may calculate a distance between the vectors of the reference audio signal and the audio signal to be evaluated associated with each time window.

Type: Grant

Filed: August 26, 2011

Date of Patent: December 10, 2013

Assignee: Telediffusion de France

Inventor: Alexandre Joly
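
The per-window comparison described in the abstract of patent 8606385 can be sketched using one of the listed features, the spectral power density, as the quality indicator vector. This is an illustrative sketch only: the naive DFT, the non-overlapping windows, and the Euclidean distance are assumptions made for the example, and the other features named in the abstract (prediction-filter coefficients, temporal activity, spectral minimum) are not computed.

```python
import cmath
import math

def power_spectrum(window):
    """Power of each non-negative frequency bin, via a naive DFT."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(window))) ** 2 / n
        for k in range(n // 2 + 1)
    ]

def indicator_vectors(signal, window):
    """One indicator vector per non-overlapping window of the signal."""
    return [
        power_spectrum(signal[i:i + window])
        for i in range(0, len(signal) - window + 1, window)
    ]

def window_distances(reference, degraded, window):
    """Distance between reference and degraded vectors for each window."""
    ref = indicator_vectors(reference, window)
    deg = indicator_vectors(degraded, window)
    return [math.dist(r, d) for r, d in zip(ref, deg)]
```

A window in which the degraded signal matches the reference gives a distance of zero; a window in which the degraded signal has dropped out gives a positive distance.
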
Method and apparatus for estimating speech quality

Patent number: 8145205

Abstract: A method of estimating the quality of speech information associated with a voice call over a communication system comprising a core network and an access network where speech information is carried between the access network and the core network and within the core network in frames. The method comprises determining a rate of frame loss for frames transported between the access network and the core network and/or within the core network, and mapping the rate of frame loss to a quality estimation value using data collected by simulating frame loss on representative speech samples and determining quality estimation values for the damaged speech samples.

Type: Grant

Filed: October 17, 2005

Date of Patent: March 27, 2012

Assignee: Telefonaktiebolaget L M Ericsson (publ)

Inventors: Mika Väisänen, Peter Jungner, Johan Fagerström
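
The mapping step in the abstract of patent 8145205 (frame loss rate to a quality estimation value, using data gathered beforehand by simulation) resembles a table lookup. The sketch below assumes piecewise-linear interpolation between table points; the interpolation scheme, the function names, and the table values are hypothetical placeholders, not figures from the patent or from any measurement.

```python
from bisect import bisect_right

def frame_loss_rate(frames_sent, frames_received):
    """Fraction of frames lost in transport."""
    return (frames_sent - frames_received) / frames_sent

def map_to_quality(rate, table):
    """Interpolate a quality value from (loss_rate, quality) points.

    table must be sorted by loss rate; rates outside the table are
    clamped to the first or last quality value.
    """
    rates = [r for r, _ in table]
    if rate <= rates[0]:
        return table[0][1]
    if rate >= rates[-1]:
        return table[-1][1]
    i = bisect_right(rates, rate)
    (r0, q0), (r1, q1) = table[i - 1], table[i]
    return q0 + (q1 - q0) * (rate - r0) / (r1 - r0)

# Placeholder numbers for illustration only, not measured data.
PLACEHOLDER_TABLE = [(0.0, 4.0), (0.02, 3.0), (0.10, 1.0)]
```

In practice the table would be filled with the values obtained by simulating frame loss on representative speech samples, as the abstract describes.
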
System and method for dynamic modification of speech intelligibility scoring

Patent number: 8098833

Abstract: A system and method to detect and measure remediated speech intelligibility by evaluating received test audio transmitted across and received in a space or region of interest. Remediation of the test audio may include altering the rate, pitch, amplitude and frequency-band energy during presentation of the speech signal.

Type: Grant

Filed: January 29, 2007

Date of Patent: January 17, 2012

Assignee: Honeywell International Inc.

Inventors: Philip J. Zumsteg, D. Michael Shields
Method for qualitative evaluation of a digital audio signal

Patent number: 8036765

Abstract: The invention relates to a method of qualitatively evaluating a digital audio signal. It calculates a quality indicator consisting of a vector associated with each time window in real time, in continuous time, and in successive time windows. For example, the generation of a quality indicator vector calculates, for a reference audio signal and for an audio signal to be evaluated, the spectral power density of the audio signal, the coefficients of a prediction filter, using an autoregressive method, a temporal activity of the signal or the minimum value of the spectrum in successive blocks of the signal. To evaluate the deterioration of the audio signal, the method may calculate a distance between the vectors of the reference audio signal and the audio signal to be evaluated associated with each time window.

Type: Grant

Filed: January 23, 2003

Date of Patent: October 11, 2011

Assignee: Telediffusion de France

Inventor: Alexandre Joly
PSYCHOACOUSTIC TIME ALIGNMENT

Publication number: 20100169079

Abstract: A method of providing a quality measure for an output voice signal generated to reproduce an input voice signal, the method comprising: partitioning the input and output signals into frames; for each frame of the input signal, determining a disturbance relative to each of a plurality of frames of the output signal; determining a subset of the determined disturbances comprising one disturbance for each input frame such that a sum of the disturbances in the subset is a minimum; and using the set of disturbances to provide the measure of quality.

Type: Application

Filed: December 30, 2008

Publication date: July 1, 2010

Applicant: AUDIOCODES LTD.

Inventors: Ilan Shallom, Nitay Shiran
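
The selection step in the abstract of publication 20100169079 can be sketched as follows. The abstract states no ordering constraint between the chosen frames, and without one the minimum-sum subset is simply the smallest disturbance for each input frame, which is what this sketch computes. The disturbance function (mean absolute sample difference) and the `search` radius are placeholders assumed for the example; a real system would use a psychoacoustic disturbance measure.

```python
def split_frames(signal, size):
    """Non-overlapping frames of the given size; a trailing partial frame is dropped."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def disturbance(a, b):
    """Placeholder disturbance: mean absolute sample difference."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def aligned_disturbances(inp, out, size, search=1):
    """For each input frame, the smallest disturbance over nearby output frames."""
    fin, fout = split_frames(inp, size), split_frames(out, size)
    chosen = []
    for i, frame in enumerate(fin):
        lo, hi = max(0, i - search), min(len(fout), i + search + 1)
        candidates = fout[lo:hi]
        if candidates:  # skip input frames with no output frame in range
            chosen.append(min(disturbance(frame, g) for g in candidates))
    return chosen
```

If the output is a copy of the input delayed by one frame, the first input frames find an exact match one position later, while the last input frame has no matching output frame left and is assigned a non-zero disturbance.
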
METHOD AND SYSTEM FOR DECODING WCDMA AMR SPEECH DATA USING REDUNDANCY

Publication number: 20100153103

Abstract: WCDMA speech data is received over a plurality of channels each with at least one bit-sequence generated using a channel decoding such as a convolution decoding. At least one junction is selected in the generated at least one bit-sequence using a determined channel metric and/or physical constraint metric. Bits in the generated at least one bit-sequence are concatenated based on redundancy and the selected junctions to form at least one speech stream. A single speech stream is selected based on speech constraints for voice decoding. The at least one bit-sequence is selected, for example, using a maximum likelihood metric, by searching starting from a selected junction corresponding to a highest junction metric value. The selected at least one bit-sequence is verified using a selected redundancy verification parameter. The single speech stream is formed using the selected at least one bit-sequence over different channels for voice decoding.

Type: Application

Filed: November 18, 2009

Publication date: June 17, 2010

Inventor: Arie Heiman
Speech Enhancement in Entertainment Audio

Publication number: 20100121634

Abstract: The invention relates to audio signal processing. More specifically, the invention relates to enhancing entertainment audio, such as television audio, to improve the clarity and intelligibility of speech, such as dialog and narrative audio. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.

Type: Application

Filed: February 20, 2008

Publication date: May 13, 2010

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventor: Hannes Muesch
Spectrum Harmonic/Noise Sharpness Control

Publication number: 20100063803

Abstract: Transmitted data that includes audio data and a transmitted spectral sharpness parameter representing a spectral harmonic/noise sharpness of a plurality of subbands is received. A measured spectral sharpness parameter is estimated from received audio data. The transmitted spectral sharpness parameter is compared with the measured spectral sharpness parameter. A main sharpness control parameter is formed for each of the decoded subbands. The main sharpness control parameter for each of the decoded subbands is analyzed. Ones of the decoded subbands are sharpened if the corresponding main sharpness control indicates that a corresponding subband is not sharp enough, wherein sharpened subbands are formed. Likewise, ones of the decoded subbands are flattened if the corresponding main sharpness control indicates that a corresponding subband is not flat enough, wherein flattened subbands are formed.

Type: Application

Filed: September 4, 2009

Publication date: March 11, 2010

Applicant: GH Innovation, Inc.

Inventor: Yang Gao
VOICE CODING DEVICE, VOICE DECODING DEVICE AND THEIR METHODS

Publication number: 20100017197

Abstract: It is an object to disclose a voice coding device, etc. in which the deterioration of a voice quality of a decoded signal can be reduced in the case that low frequency domain components of a spectrum are used for coding high frequency domain components and that no low frequency domain components exist.

Type: Application

Filed: November 1, 2007

Publication date: January 21, 2010

Applicant: PANASONIC CORPORATION

Inventor: Masahiro Oshikiri
SOUND FRAME LENGTH ADAPTATION

Publication number: 20090287479

Abstract: A method of producing time domain sound data (B) from sound parameters (A), the method comprising the steps of: forming first frames, each first frame containing sound parameters representing sound; forming second frames from the first frames, each second frame containing transform domain sound data derived from the sound parameters, the transform domain sound data of each second frame representing sound having a specific time domain length, and each second frame having a length corresponding with an efficient inverse transform; inversely transforming the second frames into third frames (G1, G2, . . .

Type: Application

Filed: June 27, 2007

Publication date: November 19, 2009

Applicant: NXP B.V.

Inventors: Marek Szczerba, Andreas Gerrits, Marc Klein Middelink
SYSTEM, METHOD, AND PROGRAM PRODUCT FOR PROCESSING VOICE DATA IN A CONVERSATION BETWEEN TWO PERSONS

Publication number: 20090228268

Abstract: A system, method, and program product for processing voice data in a conversation between two persons to determine characteristic conversation patterns. The system includes: a variation calculator for calculating a variation of a speech ratio of a first speaker and a variation calculator for calculating a variation of a speech ratio of a second speaker; a difference calculator for calculating a difference data string; a smoother for generating a smoothed difference data string; and a presenter for presenting the difference between the variation of the speech ratio of the first speaker and the speech ratio of the second speaker. The method includes: calculating a variation of a speech ratio of a first speaker and a second speaker; calculating a difference data string; generating a smoothed difference data string; and grouping them according to their patterns.

Type: Application

Filed: March 6, 2009

Publication date: September 10, 2009

Inventors: Gakuto Kurata, Masafumi Nishimura
Economical Loudness Measurement of Coded Audio

Publication number: 20090067644

Abstract: Measuring the loudness of audio encoded in a bitstream that includes data from which an approximation of the power spectrum of the audio can be derived without fully decoding the audio is performed by deriving the approximation of the power spectrum of the audio from said bitstream without fully decoding the audio, and determining an approximate loudness of the audio in response to the approximation of the power spectrum of the audio. The data may include coarse representations of the audio and associated finer representations of the audio, the approximation of the power spectrum of the audio being derived from the coarse representations of the audio. In the case of subband encoded audio, the coarse representations of the audio may comprise scale factors and the associated finer representations of the audio may comprise sample data associated with each scale factor.

Type: Application

Filed: March 23, 2006

Publication date: March 12, 2009

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Brett Graham Crockett, Michael John Smithers, Alan Jeffrey Seefeldt
Method and Test Signal for Measuring Speech Intelligibility

Publication number: 20080255829

Abstract: A method and apparatus for estimating speech intelligibility in a mobile communications network component handling two-way communication between two ends of a signal path. Test signals adapted for speech intelligibility measurements are inserted into the signal path to simulate two-way communication. Double-talk is detected during the communication, and speech intelligibility measurements are performed only during periods of double-talk. This enables the effect of echo to be taken into account while avoiding undesirable effects from non-linear processing, and comfort noise if present, in the signal path. Voice enhancement devices may then be adjusted in response to the estimated speech intelligibility.

Type: Application

Filed: September 20, 2005

Publication date: October 16, 2008

Inventor: Jun Cheng
Method and Apparatus for Determining Audio Spatial Quality

Publication number: 20080249769

Abstract: Techniques for evaluating the audio quality of an audio test signal are disclosed. These techniques provide a quality analysis that takes into account spatial audio distortions between the audio test signal and a reference audio signal. These techniques involve, for example, determining a plurality of audio spatial cues for an audio test signal, determining a corresponding plurality of audio spatial cues for an audio reference signal, comparing the determined audio spatial cues of the audio test signal to the audio spatial cues of the audio reference signal, and determining the audio quality of the audio test signal.

Type: Application

Filed: April 4, 2007

Publication date: October 9, 2008

Inventor: Frank M. Baumgarte
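
The comparison of spatial cues described in the abstract of publication 20080249769 can be illustrated with a single cue. The abstract does not name the cues, so the choice of inter-channel level difference here is an assumption, as are the function names; a fuller sketch would compute several cues per time-frequency tile rather than one value per signal.

```python
import math

def level_difference_db(left, right, eps=1e-12):
    """Inter-channel level difference in dB (left power over right power)."""
    p_left = sum(x * x for x in left) / len(left)
    p_right = sum(x * x for x in right) / len(right)
    # eps avoids division by zero and log of zero for silent channels
    return 10 * math.log10((p_left + eps) / (p_right + eps))

def spatial_distortion_db(reference, test):
    """Absolute change in level difference between reference and test.

    reference and test are (left, right) pairs of sample sequences.
    """
    return abs(level_difference_db(*test) - level_difference_db(*reference))
```

If the reference has equal levels in both channels and the test signal has its right channel halved in amplitude, the level difference shifts by about 6 dB, and that shift is what `spatial_distortion_db` reports.
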
Bit rate reduction in audio encoders by exploiting inharmonicity effects and auditory temporal masking

Publication number: 20080221875

Abstract: The present invention relates to a method for encoding an audio signal. In a first embodiment, a model relating to temporal masking of sound provided to a human ear is provided. A temporal masking index is determined in dependence upon a received audio signal and the model using a forward and a backward masking function. Using a psychoacoustic model, a masking threshold is determined in dependence upon the temporal masking index. Finally, the audio signal is encoded in dependence upon the masking threshold. The method has been implemented using the MPEG-1 psychoacoustic model 2. Semiformal listening tests showed that, using the method for encoding an audio signal according to the present invention, the high subjective quality of the decoded compressed sounds was maintained while the bit rate was reduced by approximately 10%. In a second embodiment, the inharmonic structure of audio signals is modeled and incorporated into the MPEG-1 psychoacoustic model 2.

Type: Application

Filed: May 19, 2008

Publication date: September 11, 2008

Applicant: Her Majesty in Right of Canada as Represented by the Minister of Industry

Inventors: Hossein Najaf-Zadeh, Hassan Lahdili, Louis Thibault, William Treurniet