Perceptual Measures For Quality Assessment (EPO) Patents (Class 704/E19.002)
  • Patent number: 11948598
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to determine audio quality. Example apparatus disclosed herein include an equalization (EQ) model query generator to generate a query to a neural network, the query including a representation of a sample of an audio signal. Example apparatus disclosed herein also include an EQ analyzer to access a plurality of equalization settings determined by the neural network based on the query; and compare the equalization settings to an equalization threshold to determine if the audio signal is to be removed from subsequent processing.
    Type: Grant
    Filed: October 22, 2021
    Date of Patent: April 2, 2024
    Assignee: Gracenote, Inc.
    Inventor: Markus Kurt Cremer
  • Patent number: 11900961
    Abstract: Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.
    Type: Grant
    Filed: May 31, 2022
    Date of Patent: February 13, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Oron Nir, Inbal Sagiv, Maayan Yedidia, Fardau Van Neerden, Itai Norman
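The correlation test described in the abstract of patent 11900961 can be illustrated with a minimal sketch. This is not the patented implementation: the function names, the window size, the threshold value, and in particular the direction of the comparison are assumptions. The sketch assumes that turn-taking speakers produce power tracks that are negatively correlated, so a low average correlation is read as a speech-based communication. All channels are assumed to have the same length.

```python
import math

def window_powers(samples, window):
    """Average power (mean of squared samples) for each full window."""
    return [
        sum(x * x for x in samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]

def pearson(a, b):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def channel_correlations(channels, window):
    """Correlate each channel's power track with the combined power of the other channels."""
    powers = [window_powers(ch, window) for ch in channels]
    result = []
    for i, own in enumerate(powers):
        others = [
            sum(p[k] for j, p in enumerate(powers) if j != i)
            for k in range(len(own))
        ]
        result.append(pearson(own, others))
    return result

def is_speech_communication(channels, window, threshold=-0.5):
    # ASSUMPTION: the aggregated correlation is the mean, and values below
    # the (hypothetical) threshold indicate turn-taking speech.
    corrs = channel_correlations(channels, window)
    return sum(corrs) / len(corrs) < threshold

# Two channels whose activity alternates, as in a two-party call.
caller = [1.0] * 4 + [0.0] * 4 + [1.0] * 4 + [0.0] * 4
callee = [0.0] * 4 + [1.0] * 4 + [0.0] * 4 + [1.0] * 4
alternating = is_speech_communication([caller, callee], window=4)
identical = is_speech_communication([caller, caller], window=4)
```

With these toy inputs the alternating pair is classified as speech-based and the pair of identical channels is not.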
  • Patent number: 11778076
    Abstract: An apparatus and method for broadcast signal frame using layered division multiplexing are disclosed. An apparatus for generating broadcast signal frame according to an embodiment of the present invention includes a combiner configured to generate a multiplexed signal by combining a core layer signal and an enhanced layer signal at different power levels; a power normalizer configured to reduce the power of the multiplexed signal to a power level corresponding to the core layer signal; a time interleaver configured to generate a time-interleaved signal by performing interleaving that is applied to both the core layer signal and the enhanced layer signal; and a frame builder configured to generate a broadcast signal frame including a preamble for signaling time interleaver information shared by the core layer signal and the enhanced layer signal, using the time-interleaved signal.
    Type: Grant
    Filed: August 8, 2022
    Date of Patent: October 3, 2023
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Jae-Young Lee, Heung-Mook Kim, Sung-Ik Park, Sun-Hyoung Kwon
  • Patent number: 11380300
    Abstract: In particular embodiments, an apparatus comprises non-transitory computer-readable storage media and a processor that is coupled to the media and executes instructions to access a plurality of text and generate, using one or more natural language understanding (NLU) models, one or more scores for at least a portion of the plurality of text. The apparatus determines, based on the scores, one or more prosodic values corresponding to the portion of the plurality of text. The apparatus determines, based on the one or more prosodic values, one or more speech synthesis markup language (SSML) tags. The apparatus then generates, based on the prosodic values, SSML-tagged data comprising each determined SSML tag and that tag's location in the plurality of text.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: July 5, 2022
    Assignee: SAMSUNG ELECTRONICS COMPANY, LTD.
    Inventors: Vinod Cherian Joseph, Varun Nambikrishnan
  • Patent number: 8868222
    Abstract: An audio quality estimation apparatus includes an audio packet loss frequency calculation unit (11) which, when at least one audio packet to be assessed exists in singly or continuously generated IP packet losses, calculates an audio packet loss frequency based on information of received IP packets by counting the packet losses as an audio packet loss of one time regardless of the continuous length, an average influence time calculation unit (12) which calculates, based on information of received IP packets, an average influence time serving as an average time during which audio quality is influenced when the audio packet loss frequency is 1, and a subjective quality assessment value estimation unit (22) which estimates a subjective quality assessment value based on the audio packet loss frequency and average influence time.
    Type: Grant
    Filed: May 25, 2009
    Date of Patent: October 21, 2014
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Noritsugu Egi, Takanori Hayashi
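The counting rule in the abstract of patent 8868222, where a run of consecutive IP packet losses counts as a single audio packet loss if it contains at least one audio packet, can be sketched as follows. The packet representation (a pair of `received` and `is_audio` flags), the function names, and the 20 ms packet duration are hypothetical. The average influence time here is only a crude stand-in (mean number of lost audio packets per burst times the packet duration), not the calculation the patent claims.

```python
def audio_loss_bursts(packets):
    """Return, for each run of consecutive losses that contains audio,
    the number of lost audio packets in that run.

    packets: iterable of (received, is_audio) flags in sequence order.
    """
    bursts = []
    lost_audio = 0  # lost audio packets seen in the current run of losses
    for received, is_audio in packets:
        if not received:
            lost_audio += 1 if is_audio else 0
            continue
        if lost_audio:
            bursts.append(lost_audio)
        lost_audio = 0
    if lost_audio:  # the stream ended inside a loss run
        bursts.append(lost_audio)
    return bursts

def loss_frequency_and_influence(packets, packet_ms=20.0):
    """Loss frequency (bursts counted once each) and an ASSUMED average influence time."""
    bursts = audio_loss_bursts(packets)
    frequency = len(bursts)
    avg_ms = packet_ms * sum(bursts) / frequency if frequency else 0.0
    return frequency, avg_ms

stream = [
    (True, True),
    (False, True), (False, True),  # one burst of two lost audio packets
    (True, True),
    (False, False),                # lost non-audio packet: not counted
    (True, True),
    (False, True),                 # single lost audio packet
    (True, True),
]
frequency, influence_ms = loss_frequency_and_influence(stream)
```

The abstract's final step, estimating a subjective quality value from these two quantities, is not shown because the abstract does not give the estimation function.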
  • Patent number: 8606385
    Abstract: The invention relates to a method of qualitatively evaluating a digital audio signal. It calculates a quality indicator consisting of a vector associated with each time window in real time, in continuous time, and in successive time windows. For example, the generation of a quality indicator vector calculates, for a reference audio signal and for an audio signal to be evaluated, the spectral power density of the audio signal, the coefficients of a prediction filter, using an autoregressive method, a temporal activity of the signal or the minimum value of the spectrum in successive blocks of the signal. To evaluate the deterioration of the audio signal, the method may calculate a distance between the vectors of the reference audio signal and the audio signal to be evaluated associated with each time window.
    Type: Grant
    Filed: August 26, 2011
    Date of Patent: December 10, 2013
    Assignee: Telediffusion de France
    Inventor: Alexandre Joly
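The per-window comparison in the abstract of patent 8606385 can be illustrated with a small sketch. The two features used here (mean power, and a "temporal activity" defined as the mean absolute sample-to-sample difference) are simplified stand-ins chosen for illustration; the abstract also names spectral power density, prediction filter coefficients, and the spectrum minimum, none of which are modelled. The function names and the Euclidean distance are likewise assumptions.

```python
import math

def indicator_vectors(signal, window):
    """One (power, activity) vector per full, non-overlapping window (window >= 2)."""
    vectors = []
    for i in range(0, len(signal) - window + 1, window):
        block = signal[i:i + window]
        power = sum(x * x for x in block) / window
        # ASSUMED definition of temporal activity: mean absolute first difference.
        activity = sum(abs(b - a) for a, b in zip(block, block[1:])) / (window - 1)
        vectors.append((power, activity))
    return vectors

def window_distances(reference, degraded, window):
    """Euclidean distance between reference and degraded vectors, window by window."""
    ref = indicator_vectors(reference, window)
    deg = indicator_vectors(degraded, window)
    return [math.dist(r, d) for r, d in zip(ref, deg)]

reference = [0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
degraded = [0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # second window muted
distances = window_distances(reference, degraded, window=4)
```

A distance of zero for a window means the two signals are indistinguishable under these features; larger values flag windows where the degraded signal departs from the reference.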
  • Patent number: 8145205
    Abstract: A method of estimating the quality of speech information associated with a voice call over a communication system comprising a core network and an access network where speech information is carried between the access network and the core network and within the core network in frames. The method comprises determining a rate of frame loss for frames transported between the access network and the core network and/or within the core network, and mapping the rate of frame loss to a quality estimation value using data collected by simulating frame loss on representative speech samples and determining quality estimation values for the damaged speech samples.
    Type: Grant
    Filed: October 17, 2005
    Date of Patent: March 27, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Mika Väisänen, Peter Jungner, Johan Fagerström
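The mapping step in the abstract of patent 8145205, from a measured frame loss rate to a quality estimate, can be sketched as a lookup with linear interpolation. The calibration points below are hypothetical placeholders, not data from the patent; in the described method they would come from simulating frame loss on representative speech samples. The function names and the choice of linear interpolation are also assumptions.

```python
from bisect import bisect_right

# HYPOTHETICAL calibration points: (frame loss rate, quality estimation value).
CALIBRATION = [(0.00, 4.2), (0.01, 3.9), (0.03, 3.3), (0.05, 2.8), (0.10, 2.0)]

def frame_loss_rate(lost_frames, total_frames):
    """Fraction of frames lost on the monitored path."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    return lost_frames / total_frames

def estimate_quality(rate, table=CALIBRATION):
    """Interpolate a quality value for the given loss rate, clamping at the table ends."""
    rates = [r for r, _ in table]
    if rate <= rates[0]:
        return table[0][1]
    if rate >= rates[-1]:
        return table[-1][1]
    hi = bisect_right(rates, rate)
    (r0, q0), (r1, q1) = table[hi - 1], table[hi]
    return q0 + (q1 - q0) * (rate - r0) / (r1 - r0)

rate = frame_loss_rate(lost_frames=20, total_frames=1000)
quality = estimate_quality(rate)
```

Clamping outside the calibrated range is a design choice made for the sketch; extrapolating beyond the simulated loss rates would not be supported by the calibration data.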
  • Patent number: 8098833
    Abstract: A system and method to detect and measure remediated speech intelligibility by evaluating received test audio transmitted across and received in a space or region of interest. Remediation of the test audio may include altering the rate, pitch, amplitude and frequency band energy during presentation of the speech signal.
    Type: Grant
    Filed: January 29, 2007
    Date of Patent: January 17, 2012
    Assignee: Honeywell International Inc.
    Inventors: Philip J. Zumsteg, D. Michael Shields
  • Patent number: 8036765
    Abstract: The invention relates to a method of qualitatively evaluating a digital audio signal. It calculates a quality indicator consisting of a vector associated with each time window in real time, in continuous time, and in successive time windows. For example, the generation of a quality indicator vector calculates, for a reference audio signal and for an audio signal to be evaluated, the spectral power density of the audio signal, the coefficients of a prediction filter, using an autoregressive method, a temporal activity of the signal or the minimum value of the spectrum in successive blocks of the signal. To evaluate the deterioration of the audio signal, the method may calculate a distance between the vectors of the reference audio signal and the audio signal to be evaluated associated with each time window.
    Type: Grant
    Filed: January 23, 2003
    Date of Patent: October 11, 2011
    Assignee: Telediffusion de France
    Inventor: Alexandre Joly
  • Publication number: 20100169079
    Abstract: A method of providing a quality measure for an output voice signal generated to reproduce an input voice signal, the method comprising: partitioning the input and output signals into frames; for each frame of the input signal, determining a disturbance relative to each of a plurality of frames of the output signal; determining a subset of the determined disturbances comprising one disturbance for each input frame such that a sum of the disturbances in the subset is a minimum; and using the set of disturbances to provide the measure of quality.
    Type: Application
    Filed: December 30, 2008
    Publication date: July 1, 2010
    Applicant: AUDIOCODES LTD.
    Inventors: Ilan Shallom, Nitay Shiran
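The frame-matching idea in the abstract of publication 20100169079 can be sketched under simplifying assumptions. The disturbance used here (mean squared sample difference), the search range of neighbouring output frames, and the averaging into a single score are all hypothetical. With no ordering constraint between input frames, the minimum-sum subset reduces to taking the smallest disturbance for each input frame independently; any monotonicity or alignment constraints in the actual claims are not modelled. The sketch assumes the output has at least as many frames as the input.

```python
def split_frames(signal, size):
    """Partition a signal into full, non-overlapping frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def disturbance(a, b):
    # HYPOTHETICAL disturbance: mean squared difference between two frames.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def quality_measure(inp, out, size, search=1):
    """Mean of the per-input-frame minimum disturbances (lower is better)."""
    fin, fout = split_frames(inp, size), split_frames(out, size)
    chosen = []
    for i, frame in enumerate(fin):
        lo, hi = max(0, i - search), min(len(fout), i + search + 1)
        chosen.append(min(disturbance(frame, g) for g in fout[lo:hi]))
    return sum(chosen) / len(chosen)

original = [0, 0, 1, 1, 2, 2]
delayed = [9, 9, 0, 0, 1, 1]  # same content, shifted by one frame
score_delayed = quality_measure(original, delayed, size=2)
score_identical = quality_measure(original, original, size=2)
```

Searching neighbouring output frames lets the measure tolerate a small delay between input and output, which is why the delayed copy scores close to the identical one.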
  • Publication number: 20100153103
    Abstract: WCDMA speech data is received over a plurality of channels each with at least one bit-sequence generated using a channel decoding such as a convolution decoding. At least one junction is selected in the generated at least one bit-sequence using a determined channel metric and/or physical constraint metric. Bits in the generated at least one bit-sequence are concatenated based on redundancy and the selected junctions to form at least one speech stream. A single speech stream is selected based on speech constraints for voice decoding. The at least one bit-sequence is selected, for example, using a maximum likelihood metric, by searching starting from a selected junction corresponding to a highest junction metric value. The selected at least one bit-sequence is verified using a selected redundancy verification parameter. The single speech stream is formed using the selected at least one bit-sequence over different channels for voice decoding.
    Type: Application
    Filed: November 18, 2009
    Publication date: June 17, 2010
    Inventor: Arie Heiman
  • Publication number: 20100121634
    Abstract: The invention relates to audio signal processing. More specifically, the invention relates to enhancing entertainment audio, such as television audio, to improve the clarity and intelligibility of speech, such as dialog and narrative audio. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.
    Type: Application
    Filed: February 20, 2008
    Publication date: May 13, 2010
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Hannes Muesch
  • Publication number: 20100063803
    Abstract: Transmitted data that include audio data and a transmitted spectral sharpness parameter representing a spectral harmonic/noise sharpness of a plurality of subbands are received. A measured spectral sharpness parameter is estimated from received audio data. The transmitted spectral sharpness parameter is compared with the measured spectral sharpness parameter. A main sharpness control parameter is formed for each of the decoded subbands. The main sharpness control parameter for each of the decoded subbands is analyzed. Ones of the decoded subbands are sharpened if the corresponding main sharpness control indicates that a corresponding subband is not sharp enough, wherein sharpened subbands are formed. Likewise, ones of the decoded subbands are flattened if the corresponding main sharpness control indicates that a corresponding subband is not flat enough, wherein flattened subbands are formed.
    Type: Application
    Filed: September 4, 2009
    Publication date: March 11, 2010
    Applicant: GH Innovation, Inc.
    Inventor: Yang Gao
  • Publication number: 20100017197
    Abstract: It is an object to disclose a voice coding device, etc. in which the deterioration of a voice quality of a decoded signal can be reduced in the case that low frequency domain components of a spectrum are used for coding high frequency domain components and that no low frequency domain components exist.
    Type: Application
    Filed: November 1, 2007
    Publication date: January 21, 2010
    Applicant: PANASONIC CORPORATION
    Inventor: Masahiro Oshikiri
  • Publication number: 20090287479
    Abstract: A method of producing time domain sound data (B) from sound parameters (A), the method comprising the steps of: forming first frames, each first frame containing sound parameters representing sound; forming second frames from the first frames, each second frame containing transform domain sound data derived from the sound parameters, the transform domain sound data of each second frame representing sound having a specific time domain length, and each second frame having a length corresponding with an efficient inverse transform; inversely transforming the second frames into third frames (G1, G2, . . .
    Type: Application
    Filed: June 27, 2007
    Publication date: November 19, 2009
    Applicant: NXP B.V.
    Inventors: Marek Szczerba, Andreas Gerrits, Marc Klein Middelink
  • Publication number: 20090228268
    Abstract: A system, method, and program product for processing voice data in a conversation between two persons to determine characteristic conversation patterns. The system includes: a variation calculator for calculating a variation of a speech ratio of a first speaker and a variation calculator for calculating a variation of a speech ratio of a second speaker; a difference calculator for calculating a difference data string; a smoother for generating a smoothed difference data string; and a presenter for presenting the difference between the variation of the speech ratio of the first speaker and the speech ratio of the second speaker. The method includes: calculating a variation of a speech ratio of a first speaker and a second speaker; calculating a difference data string; generating a smoothed difference data string; and grouping them according to their patterns.
    Type: Application
    Filed: March 6, 2009
    Publication date: September 10, 2009
    Inventors: Gakuto Kurata, Masafumi Nishimura
  • Publication number: 20090067644
    Abstract: Measuring the loudness of audio encoded in a bitstream that includes data from which an approximation of the power spectrum of the audio can be derived without fully decoding the audio is performed by deriving the approximation of the power spectrum of the audio from said bitstream without fully decoding the audio, and determining an approximate loudness of the audio in response to the approximation of the power spectrum of the audio. The data may include coarse representations of the audio and associated finer representations of the audio, the approximation of the power spectrum of the audio being derived from the coarse representations of the audio. In the case of subband encoded audio, the coarse representations of the audio may comprise scale factors and the associated finer representations of the audio may comprise sample data associated with each scale factor.
    Type: Application
    Filed: March 23, 2006
    Publication date: March 12, 2009
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Brett Graham Crockett, Michael John Smithers, Alan Jeffrey Seefeldt
  • Publication number: 20080255829
    Abstract: A method and apparatus for estimating speech intelligibility in a mobile communications network component handling two-way communication between two ends of a signal path. Test signals adapted for speech intelligibility measurements are inserted into the signal path to simulate two-way communication. Double-talk is detected during the communication, and speech intelligibility measurements are performed only during periods of double-talk. This enables the effect of echo to be taken into account while avoiding undesirable effects from non-linear processing, and comfort noise if present, in the signal path. Voice enhancement devices may then be adjusted in response to the estimated speech intelligibility.
    Type: Application
    Filed: September 20, 2005
    Publication date: October 16, 2008
    Inventor: Jun Cheng
  • Publication number: 20080249769
    Abstract: Techniques for evaluating the audio quality of an audio test signal are disclosed. These techniques provide a quality analysis that takes into account spatial audio distortions between the audio test signal and a reference audio signal. These techniques involve, for example, determining a plurality of audio spatial cues for an audio test signal, determining a corresponding plurality of audio spatial cues for an audio reference signal, comparing the determined audio spatial cues of the audio test signal to the audio spatial cues of the audio reference signal, and determining the audio quality of the audio test signal.
    Type: Application
    Filed: April 4, 2007
    Publication date: October 9, 2008
    Inventor: Frank M. Baumgarte
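The cue comparison in the abstract of publication 20080249769 can be illustrated for a single stereo frame. The abstract does not name the spatial cues, so the two used here, inter-channel level difference (ILD, in dB) and inter-channel coherence (ICC, normalised cross-correlation at zero lag), are assumptions, as are the function names and the small `eps` guard against division by zero.

```python
import math

def spatial_cues(left, right, eps=1e-12):
    """Return (ILD in dB, ICC at zero lag) for one stereo frame."""
    power_l = sum(x * x for x in left)
    power_r = sum(x * x for x in right)
    ild = 10.0 * math.log10((power_l + eps) / (power_r + eps))
    cross = sum(l * r for l, r in zip(left, right))
    icc = cross / math.sqrt((power_l + eps) * (power_r + eps))
    return ild, icc

def cue_distortion(test, reference):
    """Absolute cue differences between a test frame and a reference frame.

    test, reference: (left, right) tuples of equal-length sample lists.
    """
    ild_t, icc_t = spatial_cues(*test)
    ild_r, icc_r = spatial_cues(*reference)
    return abs(ild_t - ild_r), abs(icc_t - icc_r)

reference = ([1.0, 0.0, -1.0, 0.0], [1.0, 0.0, -1.0, 0.0])
test = ([1.0, 0.0, -1.0, 0.0], [0.5, 0.0, -0.5, 0.0])  # right channel attenuated
ild_error, icc_error = cue_distortion(test, reference)
```

Here the attenuated right channel shifts the level difference by about 6 dB while leaving the coherence unchanged, the kind of spatial distortion a purely monaural quality measure would miss.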
  • Publication number: 20080221875
    Abstract: The present invention relates to a method for encoding an audio signal. In a first embodiment, a model relating to temporal masking of sound provided to a human ear is provided. A temporal masking index is determined in dependence upon a received audio signal and the model using a forward and a backward masking function. Using a psychoacoustic model, a masking threshold is determined in dependence upon the temporal masking index. Finally, the audio signal is encoded in dependence upon the masking threshold. The method has been implemented using the MPEG-1 psychoacoustic model 2. Semiformal listening tests showed that, using the method for encoding an audio signal according to the present invention, the high subjective quality of the decoded compressed sounds was maintained while the bit rate was reduced by approximately 10%. In a second embodiment, the inharmonic structure of audio signals is modeled and incorporated into the MPEG-1 psychoacoustic model 2.
    Type: Application
    Filed: May 19, 2008
    Publication date: September 11, 2008
    Applicant: Her Majesty in Right of Canada as Represented by the Minister of Industry
    Inventors: Hossein Najaf-Zadeh, Hassan Lahdili, Louis Thibault, William Treurniet