Specialized Information Patents (Class 704/206)
  • Patent number: 10783434
    Abstract: A method of training a non-verbal sound class detection machine learning system, the non-verbal sound class detection machine learning system comprising a machine learning model configured to: receive data for each frame of a sequence of frames of audio data obtained from an audio signal; for each frame of the sequence of frames: process the data for multiple frames; and output data for at least one sound class score representative of a degree of affiliation of the frame with at least one sound class of a plurality of sound classes, wherein the plurality of sound classes comprises: one or more target sound classes; and a non-target sound class representative of an absence of each of the one or more target sound classes; wherein the method comprises: training the machine learning model using a loss function.
    Type: Grant
    Filed: October 7, 2019
    Date of Patent: September 22, 2020
    Assignee: AUDIO ANALYTIC LTD
    Inventors: Christopher James Mitchell, Sacha Krstulovic, Cagdas Bilen, Juan Azcarreta Ortiz, Giacomo Ferroni, Arnoldas Jasonas, Francesco Tuveri
  • Patent number: 10770085
    Abstract: An encoding method, a decoding method, an encoding apparatus, a decoding apparatus, a transmitter, a receiver, and a communications system, where the encoding method includes dividing a to-be-encoded time-domain signal into a low band signal and a high band signal, performing encoding on the low band signal to obtain a low frequency encoding parameter, performing encoding on the high band signal to obtain a high frequency encoding parameter, obtaining a synthesized high band signal, performing short-time post-filtering processing on the synthesized high band signal to obtain a short-time filtering signal, and calculating a high frequency gain based on the high band signal and the short-time filtering signal.
    Type: Grant
    Filed: January 3, 2019
    Date of Patent: September 8, 2020
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Bin Wang, Zexin Liu, Lei Miao
  • Patent number: 10762889
    Abstract: A personalized news service provides personalized news programs for its users by generating personalized combinations of audible versions of news stories derived from text-based versions of the news stories. The audible versions may be generated from the text-based version by a text-to-speech system, or by recording a person reading aloud the text-based version. To acquire recordings, the personalized news service can make a determination that a particular news story has a threshold extent of popularity. The news service can then transmit a request to a remote recording station for a recording of a verbal reading of the particular news story. The news service can then receive the requested recording from the remote recording station.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: September 1, 2020
    Assignee: Gracenote Digital Ventures, LLC
    Inventors: Venkatarama Anilkumar Panguluri, Venkata Sunil Kumar Yarram, Lalit Kumar, Gregory P. Defouw
  • Patent number: 10734001
    Abstract: A device includes a receiver and a decoder. The receiver is configured to receive one or more upmix parameters, one or more inter-channel bandwidth extension parameters, one or more inter-channel prediction gain parameters, and an encoded audio signal. The encoded audio signal includes an encoded mid signal. The decoder is configured to generate a synthesized mid signal based on the encoded mid signal. The decoder is also configured to generate a synthesized side signal based on the synthesized mid signal and the one or more inter-channel prediction gain parameters. The decoder is further configured to generate one or more output signals based on the synthesized mid signal, the synthesized side signal, the one or more upmix parameters, and the one or more inter-channel bandwidth extension parameters.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: August 4, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Venkatraman Atti, Venkata Subramanyam Chandra Sekhar Chebiyyam
  • Patent number: 10692397
    Abstract: A smart nasometer according to an embodiment of the present invention includes: a hardware unit worn on a head of a user for measuring nasal and oral sounds and providing feedback for the user; and a computational unit for receiving and processing speech signals of the nasal and oral sounds measured by the hardware unit, wherein the hardware unit includes: a microphone unit for separately measuring the nasal and oral sounds in a non-touched state of the user's philtrum, wherein the computational unit includes: a nasalance adjustment unit for adjusting a nasalance of the nasal and oral sounds measured by the microphone unit.
    Type: Grant
    Filed: August 25, 2017
    Date of Patent: June 23, 2020
    Assignees: POSTECH ACADEMY-INDUSTRY FOUNDATION, INDUSTRIAL COOPERATION FOUNDATION OF CHONBUK NATIONAL UNIVERSITY, CHONBUK NATIONAL UNIVERSITY HOSPITAL
    Inventors: Heecheon You, Myoung-Hwan Ko, Jong-Kwan Park, Younggeun Choi, Hyun Gi Kim, Han Soo Lee, Gradiyan Budi Pratama, Min-Jung Yu, Ki Wook Kim, Yun Ju Jo, Jin Kook Lee
  • Patent number: 10657984
    Abstract: A method of regenerating wideband speech from narrowband speech, the method comprising: receiving samples of a narrowband speech signal having a first range of frequencies; identifying, based on a characteristic of the narrowband speech signal, frequencies in the first range of frequencies to translate into a target band of a regenerated speech signal; modulating the identified frequencies in the first range of frequencies of the received samples of the narrowband speech signal with a modulation signal, the modulation signal having a modulating frequency adapted to upshift the identified frequencies in the first range of frequencies into the target band; filtering the modulated samples, using a target band filter, to form the regenerated speech signal in the target band; and combining the narrowband speech signal with the regenerated speech signal to produce a new wideband speech signal.
    Type: Grant
    Filed: March 12, 2018
    Date of Patent: May 19, 2020
    Assignee: SKYPE
    Inventors: Mattias Nilsson, Soren Vang Andersen, Koen Bernard Vos
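The modulation step in the abstract of patent 10657984 can be illustrated with a minimal sketch. It shows only that multiplying samples by a cosine carrier translates a component at frequency f to f plus and minus the modulating frequency; the target band filter and the final combination with the narrowband signal are omitted. The sample rate, tone frequency, modulating frequency, and the helper names `modulate` and `tone_power` are assumptions made for illustration and are not taken from the patent.

```python
import math

SAMPLE_RATE_HZ = 16000   # assumed output sample rate
NUM_SAMPLES = 1600       # 0.1 s of audio, so every 10 Hz multiple is an exact bin


def modulate(samples, mod_freq_hz, sample_rate_hz):
    """Multiply each sample by a cosine carrier (shifts f to f +/- mod_freq_hz)."""
    return [
        x * math.cos(2.0 * math.pi * mod_freq_hz * n / sample_rate_hz)
        for n, x in enumerate(samples)
    ]


def tone_power(samples, freq_hz, sample_rate_hz):
    """Normalized power of the signal's projection onto one frequency."""
    step = 2.0 * math.pi * freq_hz / sample_rate_hz
    re = sum(x * math.cos(step * n) for n, x in enumerate(samples))
    im = sum(x * math.sin(step * n) for n, x in enumerate(samples))
    return (re * re + im * im) / (len(samples) ** 2)


# Hypothetical narrowband content: a single 3 kHz tone.
narrowband = [
    math.sin(2.0 * math.pi * 3000.0 * n / SAMPLE_RATE_HZ)
    for n in range(NUM_SAMPLES)
]

# A 2 kHz modulating frequency moves the tone to 1 kHz and 5 kHz.
shifted = modulate(narrowband, 2000.0, SAMPLE_RATE_HZ)

power_at_5k = tone_power(shifted, 5000.0, SAMPLE_RATE_HZ)
power_at_3k = tone_power(shifted, 3000.0, SAMPLE_RATE_HZ)
```

In a full pipeline, a band-pass filter would keep only the upshifted image (here the 5 kHz component) before it is added back to the narrowband signal.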
  • Patent number: 10600405
    Abstract: A speech signal processing method of a user terminal includes: receiving a speech signal, detecting a personalized information section including personal information in the speech signal, performing data processing on the personalized information section of the speech signal by using a personalized model generated based on the personal information, and receiving, from a server, a result of the data processing performed by the server on a general information section of the speech signal that is different than the personalized information section of the speech signal.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: March 24, 2020
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Tae-yoon Kim, Sang-ha Kim, Sung-Soo Kim, Jin-sik Lee, Chang-woo Han, Eun-kyoung Kim, Jae-won Lee
  • Patent number: 10586526
    Abstract: This invention discloses a speech analysis/synthesis method and a simplified form of such a method. Based on a harmonic model, the present method decomposes the parameters of the harmonic model into glottal source characteristics and vocal tract characteristics in its analysis stage and recombines the glottal source and vocal tract characteristics into harmonic model parameters in its synthesis stage.
    Type: Grant
    Filed: December 10, 2015
    Date of Patent: March 10, 2020
    Inventor: Kanru Hua
  • Patent number: 10515656
    Abstract: A pitch extraction device includes a processor configured to perform a process including: dividing a first bit stream in encoded data into a plurality of sections each having a prescribed section length, the encoded data being obtained by performing entropy encoding on a residual signal calculated by performing linear prediction analysis on a sound signal; allocating a first value or a second value to each of the plurality of sections in the first bit stream in accordance with a bit value in each of the plurality of sections; generating a second bit stream obtained by re-encoding the first bit stream according to the first value and the second value that have been allocated to each of the plurality of sections in the first bit stream; and calculating a fundamental frequency of the sound signal in accordance with an autocorrelation of the second bit stream.
    Type: Grant
    Filed: September 28, 2017
    Date of Patent: December 24, 2019
    Assignee: FUJITSU LIMITED
    Inventors: Akira Kamano, Yohei Kishi, Takeshi Otani
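The last step in the abstract of patent 10515656, calculating a fundamental frequency from an autocorrelation, can be sketched as follows. This is a minimal illustration, not the patented procedure: it assumes the second bit stream is already available as a list of +1/-1 values (one per section), and the section rate, lag range, and function names are hypothetical.

```python
def autocorrelation(values, lag):
    """Unnormalized autocorrelation of a sequence at a given lag."""
    return sum(values[i] * values[i + lag] for i in range(len(values) - lag))


def estimate_period(values, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(values, lag))


# Hypothetical re-encoded stream: a pattern that repeats every 8 sections.
pattern = [1, 1, 1, 1, -1, -1, -1, -1]
second_stream = pattern * 20

SECTIONS_PER_SECOND = 8000.0  # assumed rate of sections, for illustration only

period = estimate_period(second_stream, min_lag=2, max_lag=40)
fundamental_hz = SECTIONS_PER_SECOND / period
```

The unnormalized sum slightly favors shorter lags because they cover more terms, which in this toy case helps select the true period rather than one of its multiples.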
  • Patent number: 10510354
    Abstract: A speech/audio coding apparatus includes a receiver that receives a time-domain speech input signal. The apparatus also includes a processor that transforms a time-domain speech input signal into a frequency-domain spectrum, and divides a frequency region of the spectrum in an extended band into a plurality of bands. The processor sets a limited band for each divided band in the current frame, a width of the limited band in the current frame being narrower than the divided band and the limited band including a first frequency. The processor further encodes the spectrum in the limited band within each divided band in the current frame, wherein the width of the limited band is predetermined and is set to 31.
    Type: Grant
    Filed: January 9, 2019
    Date of Patent: December 17, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Takuya Kawashima, Masahiro Oshikiri
  • Patent number: 10475484
    Abstract: The present disclosure discloses a method including: performing a silence detection on a speech to be decoded; cutting the speech to be decoded off to obtain a target speech if detecting that the speech to be decoded is a silent speech; resetting tail features of the target speech with preset tail features of silent frames; and performing a CTC decoding process on the target speech reset. In embodiments, when a large number of blank frames are carried in the speech to be decoded, the speech to be decoded is cut off, and the tail features of the target speech are replaced with the tail features of the silent frames such that there may be one CTC peak when the CTC decoding process is performed on the tail features of the target speech. Therefore, a last word of text content may be displayed rapidly on a screen.
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: November 12, 2019
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Zhijian Wang, Sheng Qian
  • Patent number: 10446111
    Abstract: An image data transfer system includes a receiver and a transmitter configured to sequentially receive compressed image data and sequentially transmit transmission data corresponding to the compressed image data to the receiver. The transmitter is configured to, in transmitting a specific transmission data, perform data comparison of bits of a compressed image body data of a specific compressed image data with bits of a previous transmission data transmitted over signal lines allocated to the compressed image body data, incorporate the compressed image body data of the specific compressed image data or the bit-inverted data corresponding thereto into the specific transmission data, in response to the result of the data comparison, and incorporate the compression code of the specific compressed image data into the specific transmission data independently of the result of the data comparison.
    Type: Grant
    Filed: January 23, 2017
    Date of Patent: October 15, 2019
    Assignee: Synaptics Japan GK
    Inventors: Hirobumi Furihata, Masashige Harada, Iori Shiraishi, Takashi Nose
  • Patent number: 10446159
    Abstract: A speech/audio encoding device for selectively allocating bits for higher precision encoding. The speech/audio encoding device receives a time-domain speech/audio input signal, transforms the speech/audio input signal into a frequency domain, and quantizes an energy envelope corresponding to an energy level for a frequency spectrum of the speech/audio input signal. The speech/audio encoding device further groups quantized energy envelopes into a plurality of groups, determines a perceptual significant group including one or more significant bands and a local-peak frequency, and allocates bits to a plurality of subbands corresponding to the grouped quantized energy envelopes, in which each of the subbands is obtained by splitting the frequency spectrum of the speech/audio input signal. The speech/audio encoding device encodes the frequency spectrum using the bits allocated to the subbands.
    Type: Grant
    Filed: November 22, 2016
    Date of Patent: October 15, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Takuya Kawashima, Masahiro Oshikiri
  • Patent number: 10418048
    Abstract: A device for noise estimation comprises a first microphone capturing a nominal speech signal, and a second microphone capturing a nominal noise signal. A generalized sidelobe canceller of the device applies spatial noise reduction, and comprises a blocking matrix filter to adaptively process the nominal speech signal to produce a speech cancellation signal, a node for subtracting the speech cancellation signal from the nominal noise signal to produce a noise reference signal, a noise cancellation filter to adaptively filter the noise reference signal to produce a noise cancellation signal; and a node for subtracting the noise cancellation signal from the nominal speech signal to produce a speech reference signal.
    Type: Grant
    Filed: April 30, 2018
    Date of Patent: September 17, 2019
    Assignee: Cirrus Logic, Inc.
    Inventors: Benjamin Hutchins, Brenton Robert Steele
  • Patent number: 10354422
    Abstract: The present invention provides a diagram building system adapted for processing a signal with a time period. The diagram building system comprises an inputting device for receiving the signal; a computing device, dividing the signal into a plurality of window scales according to one of time interval scales; decomposing the window scales via HHT algorithm to generate a plurality of quantized windows according to different components; then, calculating the value of quantized windows with the same single-frequency component through a quantifying function to generate a plurality of specific frequency values; an outputting device, sequentially arranging the specific frequency values according to the time interval scales and the single-frequency components to form a visual diagram.
    Type: Grant
    Filed: April 4, 2016
    Date of Patent: July 16, 2019
    Assignee: NATIONAL CENTRAL UNIVERSITY
    Inventors: Norden E. Huang, Bo-Jau Kuo, Yu-Cheng Lin, Chung-Kang Peng, Men-Tzung Lo
  • Patent number: 10332540
    Abstract: Example embodiments disclosed herein relate to filter coefficient updating in time domain filtering. A method of processing an audio signal is disclosed. The method includes obtaining a predetermined number of target gains for a first portion of the audio signal by analyzing the first portion of the audio signal. Each of the target gains corresponds to a linear subband of the audio signal. The method also includes determining filter coefficients for time domain filtering the first portion of the audio signal so as to approximate a frequency response given by the target gains. The filter coefficients are determined by iteratively selecting at least one target gain from the target gains and updating the filter coefficients based on the selected at least one target gain. Corresponding system and computer program product for processing an audio signal are also disclosed.
    Type: Grant
    Filed: September 15, 2016
    Date of Patent: June 25, 2019
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Dong Shi, Xuejing Sun
  • Patent number: 10319383
    Abstract: Methods and systems are provided for customizing an action. In some implementations, voice input is received from a user and a context is determined from the voice input. Potential contextual data is identified based on the context and the voice input. A level of confidence is determined for an association of the potential contextual data and the context. An action is performed based on the voice input, the potential contextual data, and the level of confidence. The potential contextual data is used to customize the action.
    Type: Grant
    Filed: August 24, 2018
    Date of Patent: June 11, 2019
    Assignee: Google LLC
    Inventors: Zoltan Stekkelpak, Gyula Simonyi
  • Patent number: 10210877
    Abstract: A speech/audio decoding apparatus is provided that includes a receiver that receives encoded data including a limited-band mode flag, and a memory that stores information on a position of a maximum amplitude spectrum frequency of a previous frame in a divided band. The speech/audio decoding apparatus also includes a processor that identifies whether a decoding band is encoded using a limited-band mode based on the decoded limited-band mode flag. Additionally, the processor decodes the spectrum in a limited band within each of the divided bands in a current frame using the stored information. Furthermore, the limited-band mode is set at an encoder side, when a difference between a first frequency with a first maximum amplitude in a spectrum of the divided band in a preceding frame and a second frequency with a second maximum amplitude in a spectrum of the divided band in the current frame is below a threshold.
    Type: Grant
    Filed: December 20, 2017
    Date of Patent: February 19, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Takuya Kawashima, Masahiro Oshikiri
  • Patent number: 10186273
    Abstract: Provided are a method and apparatus for encoding an audio signal and a method and apparatus for decoding an audio signal, in which errors generated during encoding and decoding of the audio signal are reduced to enhance the audio quality of a reconstructed audio signal. The method of encoding the audio signal includes detecting a pitch of the audio signal, determining a filter coefficient based on the detected pitch, performing second filtering on the audio signal, based on the determined filter coefficient; and encoding an audio signal resulting from the second filtering.
    Type: Grant
    Filed: November 25, 2014
    Date of Patent: January 22, 2019
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Nam-suk Lee, Hyun-wook Kim
  • Patent number: 10134404
    Abstract: An apparatus for generating a decoded two-channel signal includes: an audio processor for decoding an encoded two-channel signal to obtain a first set of first spectral portions; a parametric decoder for providing parametric data for a second set of second spectral portions and a two-channel identification identifying either a first or a second different two-channel representation for the second spectral portions; and a frequency regenerator for regenerating a second spectral portion depending on a first spectral portion of the first set of first spectral portions, the parametric data for the second portion and the two-channel identification for the second portion.
    Type: Grant
    Filed: January 19, 2016
    Date of Patent: November 20, 2018
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Frederik Nagel, Ralf Geiger, Balaji Nagendran Thoshkahna, Konstantin Schmidt, Stefan Bayer, Christian Neukam, Bernd Edler, Christian Helmrich
  • Patent number: 10117247
    Abstract: A method implemented in a fronthaul communication unit, comprising applying, via a processor of the fronthaul communication unit, a plurality of first frequency-domain windowing (FDW) functions on a plurality of first communication channel signals to produce a plurality of first windowed signals, aggregating, via the processor, the plurality of first windowed signals to produce a first aggregated signal, and transmitting, via a frontend of the fronthaul communication unit, the first aggregated signal to a corresponding fronthaul communication unit over a fronthaul communication link to facilitate fronthaul communication.
    Type: Grant
    Filed: March 1, 2016
    Date of Patent: October 30, 2018
    Assignee: Futurewei Technologies, Inc.
    Inventors: Huaiyu Zeng, Xiang Liu
  • Patent number: 10096324
    Abstract: A frame error concealment (FEC) method is provided. The method includes: selecting an FEC mode based on states of a current frame and a previous frame of the current frame in a time domain signal generated after time-frequency inverse transform processing; and performing corresponding time domain error concealment processing on the current frame based on the selected FEC mode, wherein the current frame is an error frame or the current frame is a normal frame when the previous frame is an error frame.
    Type: Grant
    Filed: January 30, 2017
    Date of Patent: October 9, 2018
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ho-sang Sung, Nam-suk Lee
  • Patent number: 10089290
    Abstract: A meeting summarization method, system, and computer program product, include recording meeting audio of a meeting, capturing notes including a time stamp from each of a plurality of users associated with the meeting, synchronizing the recorded meeting audio of the meeting and each of the notes of each of the plurality of users based on a correlation between the time stamp, and analyzing the synchronized meeting audio and notes to determine highlights of the meeting based on a co-occurrence of notes between the plurality of users.
    Type: Grant
    Filed: October 17, 2017
    Date of Patent: October 2, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Keith William Grueneberg, Jason Crawford, Jonathan Lenchner, Satya V. Nitta, Christian Makaya, Sharad C. Sundararajan
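The highlight detection described in the abstract of patent 10089290 relies on notes from several users co-occurring in time. A minimal sketch of that idea, assuming notes are reduced to (user, timestamp in seconds) pairs and that co-occurrence means at least two distinct users wrote a note within the same fixed window; the window length, threshold, user names, and function name are all hypothetical.

```python
from collections import defaultdict


def find_highlights(notes, window_seconds=30, min_users=2):
    """Return (start, end) windows in which at least min_users users took notes."""
    users_by_window = defaultdict(set)
    for user, timestamp in notes:
        # Bucket each note by the fixed-length window its timestamp falls into.
        users_by_window[int(timestamp // window_seconds)].add(user)
    return sorted(
        (index * window_seconds, (index + 1) * window_seconds)
        for index, users in users_by_window.items()
        if len(users) >= min_users
    )


# Hypothetical time-stamped notes from three meeting participants.
notes = [
    ("ana", 12), ("ben", 20),    # two users within the first 30 s
    ("ana", 95), ("cho", 100),   # two users between 90 s and 120 s
    ("ben", 250),                # a single user, not a highlight
]

highlights = find_highlights(notes)
```

The returned windows could then be used to index into the synchronized meeting audio; fixed buckets are the simplest choice, and a sliding window would avoid splitting notes that straddle a bucket boundary.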
  • Patent number: 10062383
    Abstract: Methods and systems are provided for customizing an action. In some implementations, voice input is received from a user and a context is determined from the voice input. Potential contextual data is identified based on the context and the voice input. A level of confidence is determined for an association of the potential contextual data and the context. An action is performed based on the voice input, the potential contextual data, and the level of confidence. The potential contextual data is used to customize the action.
    Type: Grant
    Filed: November 20, 2017
    Date of Patent: August 28, 2018
    Assignee: Google LLC
    Inventors: Zoltan Stekkelpak, Gyula Simonyi
  • Patent number: 10032460
    Abstract: Embodiments of the present application propose a frequency envelope vector quantization method and apparatus, where the method includes: dividing N frequency envelopes in one frame into N1 vectors; quantizing a first vector in the N1 vectors by using a first codebook, to obtain a code word corresponding to the quantized first vector, where the first codebook is divided into 2^B1 portions; determining, according to the code word corresponding to the quantized first vector, that the quantized first vector is associated with an ith portion of the 2^B1 portions of the first codebook; determining a second codebook according to the codebook of the ith portion; and quantizing a second vector in the N1 vectors based on the second codebook. In the embodiments of the present application, vector quantization can be performed on frequency envelope vectors by using a codebook with a smaller quantity of bits. Therefore, complexity of vector quantization can be reduced, and an effect of vector quantization can also be ensured.
    Type: Grant
    Filed: September 26, 2017
    Date of Patent: July 24, 2018
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Chen Hu, Lei Miao, Zexin Liu
  • Patent number: 10008211
    Abstract: The present disclosure discloses a method and an apparatus for encoding a stereo phase parameter, which relate to the field of information technologies and can improve an effect of stereo audio phase information. The method includes: first, acquiring a global stereo phase parameter of a current frame; then, determining a value of the global stereo phase parameter of the current frame, and adjusting the value of the global stereo phase parameter of the current frame according to a determining result of the value of the global stereo phase parameter of the current frame; and finally, encoding an adjusted value of the global stereo phase parameter of the current frame. The embodiments of the present disclosure are applicable to recovering stereo phase information.
    Type: Grant
    Filed: May 13, 2016
    Date of Patent: June 26, 2018
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Xingtao Zhang, Lei Miao, Wenhai Wu
  • Patent number: 9973555
    Abstract: The present invention relates to an apparatus and method for transmitting/receiving streaming data using multiple paths, in which the streaming data is smoothly reproduced without being interrupted, and more particularly, to an apparatus and method for transmitting/receiving streaming data using multiple paths, in which exchange of the streaming data is performed in real-time using the multiple paths regardless of obstacles. The method for transmitting streaming data using multiple paths includes managing and maintaining a path list including sequence information about a transmission path capable of transmitting data, framing the streaming data, and transmitting the framed streaming data via the transmission path according to the sequence information.
    Type: Grant
    Filed: April 20, 2016
    Date of Patent: May 15, 2018
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Hyoung Jin Kwon, Jin Kyeong Kim, Woo Yong Lee, Kyeongpyo Kim
  • Patent number: 9928843
    Abstract: An apparatus and a method to encode and decode a speech signal using an encoding mode are provided. An encoding apparatus may select an encoding mode of a frame included in an input speech signal, and encode a frame having an unvoiced mode for an unvoiced speech as the selected encoding mode.
    Type: Grant
    Filed: November 18, 2013
    Date of Patent: March 27, 2018
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ho Sang Sung, Ki Hyun Choo, Jung Hoe Kim, Eun Mi Oh
  • Patent number: 9871916
    Abstract: A system and methods are provided for providing SIP based voice transcription services. A computer implemented method includes: transcribing a Session Initiation Protocol (SIP) based conversation between one or more users from voice to text transcription; identifying each of the one or more users that are speaking using a device SIP_ID of the one or more users; marking the identity of the one or more users that are speaking in the text transcription; and providing the text transcription of the speaking user to non-speaking users.
    Type: Grant
    Filed: March 5, 2009
    Date of Patent: January 16, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: John R. Dingler, Sri Ramanathan, Matthew A. Terry, Matthew B. Trevathan
  • Patent number: 9865277
    Abstract: Methods and apparatus for dynamically suppressing low frequency non-speech audio events, such as road bumps, without suppressing speech formants. In exemplary embodiments of the invention, maximum powers in first and second windows are computed and used to determine whether dampening should be applied, and if so, to what extent.
    Type: Grant
    Filed: July 10, 2013
    Date of Patent: January 9, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Friedrich Faubel, Patrick B. Hannon, Kai Wenzler
  • Patent number: 9865247
    Abstract: A device may receive a speech signal. The device may determine acoustic feature parameters for the speech signal. The acoustic feature parameters may include phase data. The device may determine circular space representations for the phase data based on an alignment of the phase data with given axes of the circular space representations. The device may map the phase data to linguistic features based on the circular space representations. The linguistic features may be associated with linguistic content that includes phonemic content or text content. The device may provide a synthetic audio pronunciation of the linguistic content based on the mapping.
    Type: Grant
    Filed: February 25, 2015
    Date of Patent: January 9, 2018
    Assignee: Google Inc.
    Inventors: Ioannis Agiomyrgiannakis, Byung Ha Chun
  • Patent number: 9865258
    Abstract: A method for recognizing a voice context for a voice control function in a vehicle. The method encompasses reading in a gaze direction datum regarding a current gaze direction of an occupant of the vehicle; allocating the gaze direction datum to a viewing zone in an interior of the vehicle in order to obtain a viewing zone datum regarding a viewing zone currently being viewed by the occupant; and determining, by utilization of the viewing zone datum, a voice context datum regarding a predetermined voice context allocated to the viewing zone currently being viewed.
    Type: Grant
    Filed: May 17, 2016
    Date of Patent: January 9, 2018
    Assignee: ROBERT BOSCH GMBH
    Inventor: Philippe Dreuw
  • Patent number: 9830929
    Abstract: A matrix is generated that stores sinusoidal components evaluated for a given sample rate corresponding to the matrix. The matrix is then used to convert an audio signal to chroma vectors representing a set of “chromae” (frequencies of interest). The conversion of an audio signal portion into its chromae enables more meaningful analysis of the audio signal than would be possible using the signal data alone. The chroma vectors of the audio signal can be used to perform analyses such as comparisons with the chroma vectors obtained from other audio signals in order to identify audio matches.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: November 28, 2017
    Assignee: GOOGLE INC.
    Inventor: Pedro Gonnet Anders
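The matrix-based conversion in the abstract of patent 9830929 can be sketched as a projection of an audio frame onto precomputed cosine and sine rows, one pair per frequency of interest. This is a minimal sketch under stated assumptions: the twelve semitone frequencies starting at 440 Hz, the sample rate, the frame length, and the function names are illustrative choices, not details from the patent.

```python
import math

SAMPLE_RATE_HZ = 8000   # assumed sample rate the matrix is built for
FRAME_LENGTH = 800      # assumed frame length in samples

# Assumed frequencies of interest: 12 semitones starting at A4 (440 Hz).
CHROMA_FREQS_HZ = [440.0 * 2.0 ** (k / 12.0) for k in range(12)]


def build_matrix(freqs_hz, sample_rate_hz, frame_length):
    """Store a cosine row and a sine row for each frequency of interest."""
    matrix = []
    for freq in freqs_hz:
        step = 2.0 * math.pi * freq / sample_rate_hz
        matrix.append([math.cos(step * n) for n in range(frame_length)])
        matrix.append([math.sin(step * n) for n in range(frame_length)])
    return matrix


def chroma_vector(matrix, frame):
    """Multiply the matrix by the frame and combine each cos/sin pair."""
    projections = [sum(m * x for m, x in zip(row, frame)) for row in matrix]
    return [
        math.hypot(projections[2 * k], projections[2 * k + 1]) / len(frame)
        for k in range(len(matrix) // 2)
    ]


MATRIX = build_matrix(CHROMA_FREQS_HZ, SAMPLE_RATE_HZ, FRAME_LENGTH)

# Hypothetical input frame: a pure 440 Hz tone.
frame = [
    math.sin(2.0 * math.pi * 440.0 * n / SAMPLE_RATE_HZ)
    for n in range(FRAME_LENGTH)
]
chroma = chroma_vector(MATRIX, frame)
strongest = chroma.index(max(chroma))
```

Because the matrix depends only on the sample rate and frame length, it can be built once and reused for every frame; two signals can then be compared through their chroma vectors instead of their raw samples.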
  • Patent number: 9812133
    Abstract: Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrates little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: August 5, 2016
    Date of Patent: November 7, 2017
    Assignee: Nuance Communications, Inc.
    Inventor: Horst J. Schroeter
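The variance check in the abstract of patent 9812133 can be illustrated with a minimal sketch. It assumes each speech sample has already been reduced to a short numeric feature vector and uses the mean pairwise Euclidean distance as a stand-in for "variance"; the feature values, the threshold, and the function names are hypothetical, and the sketch does not cover matching the samples against an enrolled speaker.

```python
import math
from itertools import combinations


def mean_pairwise_distance(samples):
    """Average Euclidean distance over every pair of feature vectors."""
    pairs = list(combinations(samples, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def passes_variance_check(samples, min_variation=0.05):
    """Deny (False) when repeated samples are identical or nearly so."""
    if len(samples) < 2:
        raise ValueError("at least two speech samples are needed for comparison")
    return mean_pairwise_distance(samples) >= min_variation


# Hypothetical feature vectors for three utterances of the same phrase.
replayed = [[0.20, 0.50, 0.10]] * 3      # identical, as a replayed recording would be
live = [
    [0.20, 0.50, 0.10],
    [0.25, 0.45, 0.12],
    [0.18, 0.55, 0.08],
]

replayed_ok = passes_variance_check(replayed)
live_ok = passes_variance_check(live)
```

Raising `min_variation` corresponds to the embodiment in which the threshold is adjusted according to how much authentication certainty is required.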
  • Patent number: 9779744
    Abstract: A linear prediction coefficient of a signal represented in a frequency domain is obtained by performing linear prediction analysis in a frequency direction by using a covariance method or an autocorrelation method. After the filter strength of the obtained linear prediction coefficient is adjusted, filtering may be performed in the frequency direction on the signal by using the adjusted coefficient, whereby the temporal envelope of the signal is shaped. This reduces the occurrence of pre-echo and post-echo and improves the subjective quality of the decoded signal, without significantly increasing the bit rate in a bandwidth extension technique in the frequency domain represented by SBR.
    Type: Grant
    Filed: August 18, 2016
    Date of Patent: October 3, 2017
    Assignee: NTT Docomo, Inc.
    Inventors: Kosuke Tsujino, Kei Kikuiri, Nobuhiko Naka
  • Patent number: 9761230
    Abstract: A method for processing a digital signal, implemented during decoding of the signal, in order to replace a succession of samples lost during decoding, the method comprising steps of: generating a structure of a signal for replacing the lost succession, this structure comprising spectral components determined from valid samples received during decoding before the succession of lost samples; generating a residue between a digital signal available to the decoder, comprising received valid samples, and a signal generated from the spectral components; and extracting blocks from the residue, method in which window weighted blocks are injected into the structure using an overlap-add approach, the injected blocks partially overlapping in time.
    Type: Grant
    Filed: April 17, 2014
    Date of Patent: September 12, 2017
    Assignee: Orange
    Inventors: Jerome Daniel, Julien Faure
  • Patent number: 9743183
    Abstract: An apparatus and method are disclosed for filtering an audio signal. The apparatus includes an analysis filter bank, a high frequency reconstructor or a parametric stereo processor, and a synthesis filter bank. The analysis filter bank receives real-valued time domain input audio samples and generates complex valued subband samples. The high frequency reconstructor or parametric stereo processor modifies at least some of the complex valued subband samples. The synthesis filter bank receives the modified complex valued subband samples and generates time domain output audio samples. The analysis filter bank comprises analysis filters that are complex exponential modulated versions of a prototype filter.
    Type: Grant
    Filed: May 4, 2017
    Date of Patent: August 22, 2017
    Assignee: Dolby International AB
    Inventor: Per Ekstrand
  • Patent number: 9564140
    Abstract: Some embodiments relate to techniques for encoding an audio signal represented by a plurality of frames including a first frame. The techniques include using at least one computer hardware processor to perform: obtaining an initial discrete spectral representation of the first frame; obtaining a primary discrete spectral representation of the initial discrete spectral representation at least in part by estimating a phase envelope of the initial discrete spectral representation and evaluating the estimated phase envelope at a discrete set of frequencies; calculating a residual discrete spectral representation of the initial discrete spectral representation based on the initial discrete spectral representation and the primary discrete spectral representation; and encoding the residual discrete spectral representation using a plurality of codewords.
    Type: Grant
    Filed: April 7, 2015
    Date of Patent: February 7, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Slava Shechtman, Alexander Sorin
  • Patent number: 9536517
    Abstract: Systems, methods, and computer-readable storage devices for crowd-sourced data labeling. The system requests a respective response from each of a set of entities. The set of entities includes crowd workers. Next, the system incrementally receives a number of responses from the set of entities until one of an accuracy threshold is reached and m responses are received, wherein the accuracy threshold is based on characteristics of the number of responses. Finally, the system generates an output response based on the number of responses.
    Type: Grant
    Filed: November 18, 2011
    Date of Patent: January 3, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Jason Williams, Tirso Alonso, Barbara B. Hollister, Ilya Dan Melamed
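    The incremental collection loop in this abstract might look like the sketch below. The abstract does not define the accuracy threshold, so this example assumes it is the fraction of responses agreeing with the current majority label; `min_responses`, the default values, and the function name are likewise assumptions.

```python
from collections import Counter


def collect_label(responses, m, agreement_threshold=0.8, min_responses=3):
    """Consume responses one at a time until agreement is high enough or m arrive.

    `responses` may be any iterable (for example a generator that blocks
    until the next crowd worker answers), so nothing is requested beyond
    what the stopping rule needs.
    """
    received = []
    for response in responses:
        received.append(response)
        _, votes = Counter(received).most_common(1)[0]
        agreement = votes / len(received)
        # Stop early once enough responses agree (assumed accuracy criterion).
        if len(received) >= min_responses and agreement >= agreement_threshold:
            break
        # Otherwise stop at the hard cap of m responses.
        if len(received) >= m:
            break
    if not received:
        raise ValueError("no responses received")
    label, _ = Counter(received).most_common(1)[0]
    return label, len(received)


# Unanimous early answers stop after three responses.
early = collect_label(iter(["cat", "cat", "cat", "dog", "cat"]), m=5)
# Disagreement runs to the cap of m = 5 and falls back to the majority label.
capped = collect_label(iter(["cat", "dog", "cat", "dog", "cat", "cat"]), m=5)
```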
  • Patent number: 9502047
    Abstract: From a plurality of received voice signals, a signal interval in which there is a talker collision between at least a first and a second voice signal is detected. A processor receives a positive detection result and processes, in response to this, at least one of the voice signals with the aim of making it perceptually distinguishable. A mixer mixes the voice signals to supply an output signal, wherein the processed signal(s) replaces the corresponding received signals. In example embodiments, signal content is shifted away from the talker collision in frequency or in time. The invention may be useful in a conferencing system.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: November 22, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Gary Spittle, Michael Hollier
  • Patent number: 9412382
    Abstract: Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: September 21, 2015
    Date of Patent: August 9, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst J. Schroeter
  • Patent number: 9325285
    Abstract: An audio processing device comprises a multitude of electric input signals, each electric input signal being provided in a digitized form, and a control unit receiving said digitized electric input signals and providing a resulting enhanced signal. The control unit is configured to determine the resulting enhanced signal from said digitized electric input signals, or signals derived therefrom, according to a predefined scheme.
    Type: Grant
    Filed: February 6, 2014
    Date of Patent: April 26, 2016
    Assignees: OTICON A/S, SENNHEISER COMMUNICATIONS A/S
    Inventors: Svend Feldt, Thomas Kaulberg, Torben Christiansen, Claus Benjaminsen
  • Patent number: 9319818
    Abstract: Embodiments of the present invention provide a stereo signal down-mixing method, encoding/decoding apparatus and system. The down-mixing method includes: converting a first channel time-domain signal and a second channel time-domain signal into a first channel frequency-domain signal and a second channel frequency-domain signal; obtaining a frequency-domain channel signal level difference and a frequency-domain channel signal phase difference between the two channel frequency-domain signals; for each frequency bin in each frequency band, using a function based on the frequency-domain channel signal level difference and frequency-domain channel signal phase difference to obtain a down-mixed signal phase that is located between phases of the two channel frequency-domain signals, and obtaining a down-mixed signal amplitude through calculation; and obtaining a frequency-domain down-mixed signal according to the phase and amplitude.
    Type: Grant
    Filed: August 13, 2012
    Date of Patent: April 19, 2016
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Lei Miao, Wenhai Wu, Yue Lang
  • Patent number: 9264094
    Abstract: A voice coding device capable of preventing overall quality degradation even when the bit rate for coding is lowered. The voice coding device codes a wide band signal in a first layer, and codes an extended band signal whose frequency band is located in higher frequency than the wide band signal in an extended band layer. An adaptive band selection unit (301) selects a frequency band to be excluded from a coding object in the extended band layer or a frequency band whose energy is to be attenuated in the extended band layer. A band-limited signal generation unit (302) excludes, within the frequency band of an input signal, the frequency band selected by the adaptive band selection unit (301) from the coding object, or attenuates the energy of the frequency band selected by the adaptive band selection unit (301).
    Type: Grant
    Filed: May 25, 2012
    Date of Patent: February 16, 2016
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Katsunori Daimou, Masahiro Oshikiri, Hiroyuki Ehara
  • Patent number: 9236059
    Abstract: An apparatus determining a weighting function for linear prediction coding coefficients quantization converts a linear prediction coding (LPC) coefficient of an input signal into one of a line spectral frequency (LSF) coefficient and an immittance spectral frequency (ISF) coefficient and determines a weighting function associated with one of an importance of the ISF coefficient and an importance of the LSF coefficient using one of the converted ISF coefficient and the converted LSF coefficient.
    Type: Grant
    Filed: May 26, 2011
    Date of Patent: January 12, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ho Sang Sung, Eun Mi Oh
  • Patent number: 9237349
    Abstract: A method and apparatus for sharing information in a video decoding system are disclosed. The method derives reconstructed data for a picture from a bitstream, where the picture is partitioned into multiple slices. An information-sharing flag is parsed from the bitstream associated with a current reconstructed slice. If the information-sharing flag indicates information sharing, shared information is determined from a part of the bitstream not corresponding to the current reconstructed slice, and an in-loop filtering process is applied to the current reconstructed slice according to the shared information. If the information-sharing flag indicates no information sharing, individual information is determined from a part of the bitstream corresponding to the current reconstructed slice, and an in-loop filtering process is applied to the current reconstructed slice according to the individual information. A method for a corresponding encoder is also disclosed.
    Type: Grant
    Filed: February 17, 2015
    Date of Patent: January 12, 2016
    Assignee: MEDIATEK INC
    Inventors: Chia-Yang Tsai, Chih-Wei Hsu, Yu-Wen Huang, Ching-Yeh Chen, Chih-Ming Fu, Shaw-Min Lei
  • Patent number: 9048906
    Abstract: Beamforming precoding matrix using non-uniform angles quantization. Adaptively generated feedback information is provided between communication devices that communicate using more than one communication path, link, connection, etc. With respect to feeding back different types of information having different respective characteristics (e.g., different respective probability density functions), different and respective quantization may be employed for the different types of information. For example, uniform, Gaussian, or per bit loop optimized quantization may be individually selected and employed for each of the different types of feedback information used in a wired communication system (e.g.
    Type: Grant
    Filed: January 9, 2012
    Date of Patent: June 2, 2015
    Assignee: Broadcom Corporation
    Inventor: Avi Kliger
  • Patent number: 9043200
    Abstract: The present invention is based on the finding that parameters including: a first set of parameters of a representation of a first portion of an original signal and a second set of parameters of a representation of a second portion of the original signal can be efficiently encoded when the parameters are arranged in a first sequence of tuples and a second sequence of tuples. The first sequence of tuples includes tuples of parameters having two parameters from a single portion of the original signal and the second sequence of tuples includes tuples of parameters having one parameter from the first portion and one parameter from the second portion of the original signal. A bit estimator estimates the number of necessary bits to encode the first and the second sequence of tuples. Only the sequence of tuples, which results in the lower number of bits, is encoded.
    Type: Grant
    Filed: November 17, 2010
    Date of Patent: May 26, 2015
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Ralph Sperschneider, Jürgen Herre, Karsten Linzmeier, Johannes Hilpert
  • Patent number: 9031835
    Abstract: A method of improving perceived loudness and sharpness of a reconstructed speech signal delimited by a predetermined bandwidth comprises the steps of providing (S10) the speech signal and separating (S20) the provided signal into at least a first and a second signal portion. Subsequently, the first signal portion is adapted (S30) to emphasize at least a predetermined frequency or frequency interval within the first bandwidth portion. Finally, the second signal portion is reconstructed (S40) based on at least the first signal portion, and the adapted first signal portion and the reconstructed second signal portion are combined (S50) to provide a reconstructed speech signal with an overall improved perceived loudness and sharpness.
    Type: Grant
    Filed: June 29, 2010
    Date of Patent: May 12, 2015
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Volodya Grancharov, Sigurdur Sverrisson
  • Patent number: 9026435
    Abstract: The invention provides a method for estimating a fundamental frequency of a speech signal comprising the steps of receiving a signal spectrum of the speech signal, filtering the signal spectrum to obtain a refined signal spectrum, determining a cross-power spectral density using the refined signal spectrum and the signal spectrum, transforming the cross-power spectral density into the time domain to obtain a cross-correlation function, and estimating the fundamental frequency of the speech signal based on the cross-correlation function.
    Type: Grant
    Filed: May 3, 2010
    Date of Patent: May 5, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Mohamed Krini, Gerhard Schmidt
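    The processing chain in this abstract (spectrum, refined spectrum, cross-power spectral density, cross-correlation, fundamental frequency) can be sketched as below. The abstract does not specify the refinement filter, so a plain low-pass mask stands in for it here; the function names, the 120 to 400 Hz search range, and the test signal are assumptions, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spec):
    """Naive inverse DFT, returning only the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def refine(spec, keep_bins):
    """Stand-in refinement filter: keep the lowest bins and their mirror images."""
    n = len(spec)
    return [s if (k <= keep_bins or k >= n - keep_bins) else 0.0
            for k, s in enumerate(spec)]


def estimate_f0(frame, sample_rate, f_min=120.0, f_max=400.0, keep_bins=40):
    spec = dft(frame)
    refined = refine(spec, keep_bins)
    # Cross-power spectral density of the refined and original spectra.
    cross_psd = [r * s.conjugate() for r, s in zip(refined, spec)]
    # Back to the time domain: a (circular) cross-correlation function.
    xcorr = idft_real(cross_psd)
    lag_lo = int(sample_rate / f_max)
    lag_hi = int(sample_rate / f_min)
    best_lag = max(range(lag_lo, lag_hi + 1), key=lambda lag: xcorr[lag])
    return sample_rate / best_lag


# Synthetic voiced frame: 200 Hz fundamental plus two weaker harmonics.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
         + 0.5 * math.sin(2 * math.pi * 400 * n / SAMPLE_RATE)
         + 0.25 * math.sin(2 * math.pi * 600 * n / SAMPLE_RATE)
         for n in range(400)]
f0 = estimate_f0(frame, SAMPLE_RATE)
```

    The lag search is limited to a plausible pitch range because the correlation of a periodic signal also peaks at multiples of the true period.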