Frequency Patents (Class 704/205)
  • Patent number: 11922956
    Abstract: An apparatus for decoding an encoded audio signal includes a spectral domain audio decoder for generating a first decoded representation of a first set of first spectral portions, the decoded representation having a first spectral resolution; a parametric decoder for generating a second decoded representation of a second set of second spectral portions having a second spectral resolution being lower than the first spectral resolution; a frequency regenerator for regenerating a reconstructed second spectral portion having the first spectral resolution using a first spectral portion and spectral envelope information for the second spectral portion; and a spectrum time converter for converting the first decoded representation and the reconstructed second spectral portion into a time representation.
    Type: Grant
    Filed: March 3, 2022
    Date of Patent: March 5, 2024
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Frederik Nagel, Ralf Geiger, Balaji Nagendran Thoshkahna, Konstantin Schmidt, Stefan Bayer, Christian Neukam, Bernd Edler, Christian Helmrich
  • Patent number: 11922967
    Abstract: In one aspect, a method includes detecting a fingerprint match between query fingerprint data representing at least one audio segment within podcast content and reference fingerprint data representing known repetitive content within other podcast content, detecting a feature match between a set of audio features across multiple time-windows of the podcast content, and detecting a text match between at least one query text sentence from a transcript of the podcast content and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content. The method also includes, responsive to the detections, generating sets of labels identifying potential repetitive content within the podcast content. The method also includes selecting, from the sets of labels, a consolidated set of labels identifying segments of repetitive content within the podcast content, and responsive to selecting the consolidated set of labels, performing an action.
    Type: Grant
    Filed: December 10, 2020
    Date of Patent: March 5, 2024
    Assignee: Gracenote, Inc.
    Inventors: Amanmeet Garg, Aneesh Vartakavi
  • Patent number: 11817107
    Abstract: Innovations in phase quantization during speech encoding and phase reconstruction during speech decoding are described. For example, to encode a set of phase values, a speech encoder omits higher-frequency phase values and/or represents at least some of the phase values as a weighted sum of basis functions. Or, as another example, to decode a set of phase values, a speech decoder reconstructs at least some of the phase values using a weighted sum of basis functions and/or reconstructs lower-frequency phase values then uses at least some of the lower-frequency phase values to synthesize higher-frequency phase values. In many cases, the innovations improve the performance of a speech codec in low bitrate scenarios, even when encoded data is delivered over a network that suffers from insufficient bandwidth or transmission quality problems.
    Type: Grant
    Filed: July 27, 2022
    Date of Patent: November 14, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Soren Skak Jensen, Sriram Srinivasan, Koen Bernard Vos
  • Patent number: 11812230
    Abstract: A measurement method includes generating a plurality of second measurement signals by disposing a plurality of first measurement signals corresponding to each of the plurality of speakers in respective different time zones on a time axis, generating a plurality of third measurement signals by copying a portion of a back end of each of the plurality of second measurement signals and adding the portion to a front end of each of the plurality of second measurement signals, outputting sounds according to each of the plurality of third measurement signals from each of the plurality of speakers, collecting the sounds with a microphone, and calculating a plurality of impulse responses corresponding to the plurality of first measurement signals, based on the collected sound signal collected with the microphone and the plurality of third measurement signals.
    Type: Grant
    Filed: March 23, 2022
    Date of Patent: November 7, 2023
    Assignee: Yamaha Corporation
    Inventor: Ryo Matsuda
  • Patent number: 11756556
    Abstract: An audio packet error concealment system includes an encoding unit for encoding an audio signal consisting of a plurality of frames, and an auxiliary information encoding unit for estimating and encoding auxiliary information about a temporal change of power of the audio signal. The auxiliary information is used in packet loss concealment in decoding of the audio signal. The auxiliary information about the temporal change of power may contain a parameter that functionally approximates a plurality of powers of subframes shorter than one frame, or may contain information about a vector obtained by vector quantization of a plurality of powers of subframes shorter than one frame.
    Type: Grant
    Filed: March 23, 2022
    Date of Patent: September 12, 2023
    Assignees: NTT DOCOMO, INC., JPMORGAN CHASE BANK, N.A., AS COLLATERAL AGENT
    Inventors: Kimitaka Tsutsumi, Kei Kikuiri
  • Patent number: 11749262
    Abstract: A keyword detection method includes: obtaining an enhanced speech signal of a to-be-detected speech signal, the enhanced speech signal corresponding to a target speech speed; performing speed adjustment on the enhanced speech signal to obtain a first speed-adjusted speech signal having a first speech speed, the first speech speed being different from the target speech speed; obtaining a first speech feature signal according to the first speed-adjusted speech signal; obtaining a detection result according to a first keyword detection result corresponding to the first speech feature signal, the detection result indicating whether a target keyword exists in the to-be-detected speech signal; and performing an operation corresponding to the target keyword in response to determining that the target keyword exists according to the detection result.
    Type: Grant
    Filed: June 10, 2021
    Date of Patent: September 5, 2023
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Yi Gao, Ian Ernan Liu, Min Luo
  • Patent number: 11751001
    Abstract: An audio system for an elevator includes two or more speaker cabinets arranged inside a suspended ceiling fixed to a ceiling board of a car of the elevator, an input device to which sound content radiated to an inside of the car from each of the two or more speaker cabinets is input, and a sound field control device configured to conduct phase control and reverberation time control for the sound content and thereby cause a sound wave based on the sound content to be radiated from the speaker cabinet to the inside of the car. Each of the speaker cabinets includes a casing arranged inside the suspended ceiling, and a speaker unit arranged inside the casing and having a radiation surface that radiates the sound wave.
    Type: Grant
    Filed: March 13, 2020
    Date of Patent: September 5, 2023
    Assignee: Mitsubishi Electric Corporation
    Inventors: Susumu Fujiwara, Keigo Taruishi, Masami Aikawa
  • Patent number: 11694707
    Abstract: A target speech signal extraction method for robust speech recognition includes: initializing a steering vector for a target speech source and an adaptive vector, setting a real output channel of the target speech source as an output by the adaptive vector, initializing adaptive vectors for a noise and setting a dummy channel as an output by the adaptive vectors for the noise; setting a cost function for minimizing dependency between a real output for the target speech source and a dummy output for the noise; setting an auxiliary function to the cost function, and updating the adaptive vector for the target speech source and the adaptive vectors for the noise by using the auxiliary function and the steering vector; estimating the target speech signal by using the adaptive vector thereby extracting the target speech signal from the input signals; and updating the steering vector for the target speech source.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: July 4, 2023
    Assignee: INDUSTRY-UNIVERSITY COOPERATION FOUNDATION SOGANG UNIVERSITY
    Inventors: Hyung Min Park, Uihyeop Shin
  • Patent number: 11682404
    Abstract: An audio encoder has a first information sink oriented encoding branch such as a spectral domain encoding branch, a second information source or SNR oriented encoding branch such as an LPC-domain encoding branch, and a switch for switching between the first and second encoding branches, the second encoding branch having a converter into a specific domain different from the spectral domain such as an LPC analysis stage generating an excitation signal, and the second encoding branch having a specific domain coding branch such as LPC domain processing branch, and a specific spectral domain coding branch such as LPC spectral domain processing branch, and an additional switch for switching between the specific domain coding branch and the specific spectral domain coding branch. An audio decoder has a first domain decoder, a second domain decoder, and a third domain decoder as well as two cascaded switches for switching between the decoders.
    Type: Grant
    Filed: September 20, 2022
    Date of Patent: June 20, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Bernhard Grill, Roch Lefebvre, Bruno Bessette, Jimmy Lapierre, Philippe Gournay, Redwan Salami, Stefan Bayer, Guillaume Fuchs, Stefan Geyersberger, Ralf Geiger, Johannes Hilpert, Ulrich Kraemer, Jérémie Lecomte, Markus Multrus, Max Neuendorf, Harald Popp, Nikolaus Rettelbach
  • Patent number: 11676611
    Abstract: An audio encoder has a first information sink oriented encoding branch such as a spectral domain encoding branch, a second information source or SNR oriented encoding branch such as an LPC-domain encoding branch, and a switch for switching between the first and second encoding branches, the second encoding branch having a converter into a specific domain different from the spectral domain such as an LPC analysis stage generating an excitation signal, and the second encoding branch having a specific domain coding branch such as LPC domain processing branch, and a specific spectral domain coding branch such as LPC spectral domain processing branch, and an additional switch for switching between the specific domain coding branch and the specific spectral domain coding branch. An audio decoder has a first domain decoder, a second domain decoder, and a third domain decoder as well as two cascaded switches for switching between the decoders.
    Type: Grant
    Filed: September 20, 2022
    Date of Patent: June 13, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Bernhard Grill, Roch Lefebvre, Bruno Bessette, Jimmy Lapierre, Philippe Gournay, Redwan Salami, Stefan Bayer, Guillaume Fuchs, Stefan Geyersberger, Ralf Geiger, Johannes Hilpert, Ulrich Kraemer, Jérémie Lecomte, Markus Multrus, Max Neuendorf, Harald Popp, Nikolaus Rettelbach
  • Patent number: 11646009
    Abstract: A device capable of autonomous motion may move in an environment and may receive audio data from a microphone. A model may be trained to process the audio data to determine mask data, which may be used to mask noise in the audio data. Training data for the model may be normalized before training, and different loss functions may be used for different types of training data.
    Type: Grant
    Filed: June 16, 2020
    Date of Patent: May 9, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Amit Singh Chhetri, Navin Chatlani
  • Patent number: 11600269
    Abstract: A system for detection of at least one designated wake-up word for at least one speech-enabled application. The system comprises at least one microphone; and at least one computer hardware processor configured to perform: receiving an acoustic signal generated by the at least one microphone at least in part as a result of receiving an utterance spoken by a speaker; obtaining information indicative of the speaker's identity; interpreting the acoustic signal at least in part by determining, using the information indicative of the speaker's identity and automated speech recognition, whether the utterance spoken by the speaker includes the at least one designated wake-up word; and interacting with the speaker based, at least in part, on results of the interpreting.
    Type: Grant
    Filed: June 15, 2016
    Date of Patent: March 7, 2023
    Assignee: Cerence Operating Company
    Inventors: Meik Pfeffinger, Timo Matheja, Tobias Herbig, Tim Haulick
  • Patent number: 11562758
    Abstract: An encoder operable to filter audio signals into a plurality of frequency band components, generate quantized digital components for each band, identify a potential for pre-echo events within the generated quantized digital components, generate an approximate signal by decoding the quantized digital components using inverse pulse code modulation, generate an error signal by comparing the approximate signal with the sampled audio signal, and process the error signal and quantized digital components. The encoder operable to process the error signal by processing delayed audio signals and Q band values, determining the potential for pre-echo events from the Q band values, and determining scale factors and MDCT block sizes for the potential for pre-echo events.
    Type: Grant
    Filed: March 29, 2022
    Date of Patent: January 24, 2023
    Assignee: IMMERSION NETWORKS, INC.
    Inventors: James David Johnston, Stephen Daniel White, King Wei Hor, Barry M. Genova
  • Patent number: 11551704
    Abstract: A method and device for automatically increasing the spectral bandwidth of an audio signal including generating a “mapping” (or “prediction”) matrix based on the analysis of a reference wideband signal and a reference narrowband signal, the mapping matrix being a transformation matrix to predict high frequency energy from a low frequency energy envelope, generating an energy envelope analysis of an input narrowband audio signal, generating a resynthesized noise signal by processing a random noise signal with the mapping matrix and the envelope analysis, high-pass filtering the resynthesized noise signal, and summing the high-pass filtered resynthesized noise signal with the original input narrowband audio signal. Other embodiments are disclosed.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: January 10, 2023
    Assignee: Staton Techiya, LLC
    Inventors: John Usher, Dan Ellis
  • Patent number: 11538487
    Abstract: The disclosure provides a voice signal enhancing method and device, which divide a voice signal at the present scene into multiple frame signals based on a preset time interval; feed the multiple frame signals into a trained neural network based on a preset step size; perform convolution operations on the multiple frame signals through skip-connected convolutional layers to obtain multiple enhanced frame signals; and superpose the enhanced frame signals according to the time domain of each enhanced frame signal to obtain an enhanced voice signal. Compared with the prior art, the present disclosure automatically enhances voice signals through the neural network without manual interference, so the effects and application scenes of voice enhancement need not be limited by the preset method and method designers, thereby reducing the occurrence frequency of signal distortion and extra noise, which in turn improves the effects of the voice signal enhancement.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: December 27, 2022
    Assignee: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.
    Inventors: Wanjian Feng, Lianchang Zhang, Jiantao Liu
  • Patent number: 11521630
    Abstract: A method, system, and computer readable medium for decomposing an audio signal into different isolated sources. The techniques and mechanisms convert an audio signal into K input spectrogram fragments. The fragments are sent into a deep neural network to isolate for different sources. The isolated fragments are then combined to form full isolated source audio signals.
    Type: Grant
    Filed: October 2, 2020
    Date of Patent: December 6, 2022
    Assignee: AUDIOSHAKE, INC.
    Inventor: Luke Miner
  • Patent number: 11423874
    Abstract: A speech synthesis model training device includes one or more hardware processors configured to perform the following. Storing, in a speech corpus storing unit, speech data, and pitch mark information and context information of the speech data. From the speech data, analyzing acoustic feature parameters at each pitch mark timing in pitch mark information. From the acoustic feature parameters analyzed, training a statistical model which has a plurality of states and which includes an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution based on timing parameters.
    Type: Grant
    Filed: July 29, 2020
    Date of Patent: August 23, 2022
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune Tamura, Masahiro Morita
  • Patent number: 11412296
    Abstract: Disclosed are methods and systems to help disambiguate channel identification in a scenario where a video fingerprint of media content matches multiple reference video fingerprints corresponding respectively with multiple different channels. Given such a multi-match situation, an entity could disambiguate based on an audio component of the media content, such as by further determining that an audio fingerprint of the media content at issue matches an audio fingerprint of just one of the multiple channels, thereby establishing that that is the channel on which the media content being rendered by the media presentation device is arriving.
    Type: Grant
    Filed: June 30, 2021
    Date of Patent: August 9, 2022
    Assignee: Roku, Inc.
    Inventors: Chung Won Seo, Youngmoo Kwon, Jaehyung Lee
  • Patent number: 11405875
    Abstract: The present disclosure describes various examples of a method, an apparatus, and a computer readable medium for signaling synchronization block patterns in wireless communications (e.g., 5th Generation New Radio (5G NR)). For example, one of the methods described may include receiving, by a user equipment (UE), a message including information of a configuration. The configuration includes at least a group of repetitions of one or more synchronization signal (SS) blocks in an SS burst set, and the repetitions of the one or more SS blocks are configured into at least two groups. The method may further include determining, by the UE, which group of the at least two groups to search for during a synchronous neighbor cell search based on the information and at least one condition at the UE.
    Type: Grant
    Filed: April 10, 2020
    Date of Patent: August 2, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Sony Akkarakaran, Tao Luo
  • Patent number: 11398230
    Abstract: An electronic device includes a display, a microphone, a memory, a communication circuit, and a processor. The processor is configured to display a user interface for adjusting voice recognition sensitivity of each of a plurality of voice recognizing devices configured to start a voice recognition service in response to the same start utterance, through the display, to transmit a value of the changed sensitivity to at least part of the plurality of voice recognizing devices when the voice recognition sensitivity is changed through the user interface, to transmit a signal for waiting to receive a first utterance of a user, to the plurality of voice recognizing devices, to receive utterance information corresponding to the first utterance from the plurality of voice recognizing devices, and to update the user interface based on the utterance information.
    Type: Grant
    Filed: October 1, 2019
    Date of Patent: July 26, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sungwoon Jang, Sangki Kang, Namkoo Lee, Euisuk Chung
  • Patent number: 11393443
    Abstract: A data generating apparatus for generating noise environment noisy data is disclosed. The data generating apparatus according to the present application comprises a signal conversion unit configured to convert each of a noisy signal obtained in a real environment and an original sound signal for the noisy signal into a noisy signal spectrum and an original sound signal spectrum in a short-time frequency domain; and a noisy signal generation training unit configured to train a deep neural network to output the noisy signal spectrum corresponding to each short-time using the original sound signal spectrum as an input.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: July 19, 2022
    Assignee: Agency for Defense Development
    Inventors: Hong Kook Kim, Jung Hyuk Lee, Seung Ho Choi, Deokgyu Yun
  • Patent number: 11373671
    Abstract: The present disclosure relates to a device for processing an audio signal. The device may include a first acoustic-electric transducer and a second acoustic-electric transducer. The first acoustic-electric transducer may have a first frequency response, and may be configured to detect the audio signal and generate a first sub-band signal according to the detected audio signal. The second acoustic-electric transducer may have a second frequency response, the second frequency response being different from the first frequency response. The second acoustic-electric transducer may be configured to detect the audio signal and generate a second sub-band signal according to the detected audio signal.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: June 28, 2022
    Assignee: SHENZHEN SHOKZ CO., LTD.
    Inventors: Xin Qi, Lei Zhang
  • Patent number: 11375224
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for watermarking using starting phase modulation. An example apparatus includes memory, and processor circuitry to execute instructions to at least determine a first analyzed phase value for a watermark component of a watermarked media signal at a first time, determine a sum of differences for analyzed phase values with respect to a first one of a plurality of possible starting phase values, the analyzed phase values associated with the watermarked media signal, the analyzed phase values including the first analyzed phase value, in response to the sum of differences satisfying a threshold, decode a first data value corresponding to the first one of the possible starting phase values, and determine a watermark payload based on the first data value.
    Type: Grant
    Filed: July 6, 2020
    Date of Patent: June 28, 2022
    Assignee: THE NIELSEN COMPANY (US), LLC
    Inventors: Alexander Topchy, Vladimir Kuznetsov, Jeremey M. Davis
  • Patent number: 11362702
    Abstract: An echo and near-end cross-talk (NEXT) cancellation system includes a time-domain processing module and a frequency-domain processing module. The time-domain processing module is configured to receive an unprocessed signal after an analog-to-digital conversion, remove at least one time-domain dominant component of interference from the unprocessed signal, and accordingly generate a time-domain processed signal. The frequency-domain processing module is connected to the time-domain processing module, and configured to receive the time-domain processed signal, cancel at least one frequency-domain component of the interference from the time-domain processed signal, and accordingly generate a processed signal.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: June 14, 2022
    Assignee: AIROHA TECHNOLOGY (SUZHOU) LIMITED
    Inventors: Chia-Lung Wu, Dong-Ming Chuang
  • Patent number: 11355134
    Abstract: A method, system, and computer readable medium for decomposing an audio signal into different isolated sources. The techniques and mechanisms convert an audio signal into K input spectrogram fragments. The fragments are sent into a deep neural network to isolate for different sources. The isolated fragments are then combined to form full isolated source audio signals.
    Type: Grant
    Filed: October 2, 2020
    Date of Patent: June 7, 2022
    Assignee: AUDIOSHAKE, INC.
    Inventor: Luke Miner
  • Patent number: 11341977
    Abstract: To provide a bandwidth extension method which allows reduction of computation amount in bandwidth extension and suppression of deterioration of quality in the bandwidth to be extended. In the bandwidth extension method: a low frequency bandwidth signal is transformed into a QMF domain to generate a first low frequency QMF spectrum; pitch-shifted signals are generated by applying different shifting factors on the low frequency bandwidth signal; a high frequency QMF spectrum is generated by time-stretching the pitch-shifted signals in the QMF domain; the high frequency QMF spectrum is modified; and the modified high frequency QMF spectrum is combined with the first low frequency QMF spectrum.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: May 24, 2022
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Tomokazu Ishikawa, Takeshi Norimatsu, Huan Zhou, Kok Seng Chong, Haishan Zhong
  • Patent number: 11335358
    Abstract: The present disclosure relates to a device for processing an audio signal. The device may include a first acoustic-electric transducer and a second acoustic-electric transducer. The first acoustic-electric transducer may have a first frequency response, and may be configured to detect the audio signal and generate a first sub-band signal according to the detected audio signal. The second acoustic-electric transducer may have a second frequency response, the second frequency response being different from the first frequency response. The second acoustic-electric transducer may be configured to detect the audio signal and generate a second sub-band signal according to the detected audio signal.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: May 17, 2022
    Assignee: SHENZHEN SHOKZ CO., LTD.
    Inventors: Xin Qi, Lei Zhang
  • Patent number: 11312382
    Abstract: A method for ascertaining features in an environment of at least one mobile unit for implementation of a localization and/or mapping by a control unit. In the course of the method, sensor measurement data of the environment are received, the sensor measurement data received are transformed by an alignment algorithm into a cost function and a cost map is generated with the aid of the cost function, a convergence map is generated based on the alignment algorithm. At least one feature is extracted from the cost map and/or the convergence map and stored, the at least one feature being provided in order to optimize a localization and/or mapping. A control unit, a computer program, and a machine-readable storage medium are also described.
    Type: Grant
    Filed: October 20, 2020
    Date of Patent: April 26, 2022
    Assignee: Robert Bosch GmbH
    Inventors: Philipp Rasp, Carsten Hasberg, Muhammad Sheraz Khan
  • Patent number: 11282505
    Abstract: According to one embodiment, a signal generation device includes one or more processors. The processors convert an acoustic signal and output amplitude and phase at a plurality of frequencies. The processors, for each of a plurality of nodes of a hidden layer included in a neural network that treats the amplitude and the phase as input, obtain frequency based on a plurality of weights used in arithmetic operation of the node. The processors generate an acoustic signal based on the plurality of obtained frequencies and based on amplitude and phase corresponding to each of the plurality of nodes.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: March 22, 2022
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Daichi Hayakawa, Takehiko Kagoshima, Hiroshi Fujimura
  • Patent number: 11257506
    Abstract: A decoding device includes: a separating unit separating first encoded data, a spectrum including a low-band spectrum of audio signals having been encoded, and second encoded data, a high-band spectrum of a higher band having been encoded, based on the first encoded data; a first decoding unit decoding the first encoded data and generating a first decoded spectrum; a first amplitude normalizer dividing amplitude of the first decoded spectrum into sub-bands, normalizing the spectrum of each sub-band by the largest amplitude of the first decoded spectrum within each sub-band, and generating a normalized spectrum; an addition unit adding noise spectrum to the normalized spectrum and generating a noise-added normalized spectrum; a second decoding unit decoding the second encoded data using the noise-added normalized spectrum, and generating a second noise-added spectrum; and a converter performing time-frequency conversion regarding a spectrum coupled based on the first decoded spectrum and second noise-added spe
    Type: Grant
    Filed: January 24, 2020
    Date of Patent: February 22, 2022
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Takuya Kawashima, Hiroyuki Ehara
  • Patent number: 11227614
    Abstract: A system and method of recording and transmitting compressed audio signals over a network is disclosed. The end node device first converts the audio signal to a spectrogram, which is commonly used by machine learning algorithms to perform speech recognition. The end node device then compresses the spectrogram prior to transmission. In certain embodiments, the compression is performed using Discrete Cosine Transforms (DCT). Furthermore, in some embodiments, the DCT is performed on the difference between two columns of the spectrogram. Further, in some embodiments, a function that replaces values below a predetermined threshold with zeroes in the Encoded Spectrogram is utilized. These functions may be performed in hardware or software.
    Type: Grant
    Filed: June 11, 2020
    Date of Patent: January 18, 2022
    Assignee: Silicon Laboratories Inc.
    Inventors: Antonio Torrini, Sebastian Ahmed
  • Patent number: 11217259
    Abstract: The invention provides methods and devices for outputting a stereo audio signal having a left channel and a right channel. The apparatus includes a demultiplexer, a decoder, and an upmixer. The upmixer is configured to operate either in a prediction mode or a non-prediction mode based on a parameter encoded in the audio bitstream.
    Type: Grant
    Filed: July 16, 2020
    Date of Patent: January 4, 2022
    Assignee: Dolby International AB
    Inventors: Heiko Purnhagen, Pontus Carlsson, Lars Villemoes
  • Patent number: 11195007
    Abstract: Systems and methods for identifying patterns of symbols in standardized system diagrams are disclosed. Disclosed implementations obtain or synthetically generate a symbol recognition training data set including multiple training images, generate a symbol recognition model based on the symbol recognition training data set, obtain an image comprising a pattern of symbols, group symbols into process loops based on the logical relationships captured by process loop identification algorithm, apply a character classification model to image contours to identify the characters and group characters into tags via hierarchical clustering, and store the identified tags, symbols and identified process loops in a relational database.
    Type: Grant
    Filed: April 5, 2019
    Date of Patent: December 7, 2021
    Assignee: CHEVRON U.S.A. INC.
    Inventors: Paul Duke, Shuxing Cheng
  • Patent number: 11170756
    Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.
    Type: Grant
    Filed: April 7, 2020
    Date of Patent: November 9, 2021
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune Tamura, Masahiro Morita
  • Patent number: 11145317
    Abstract: A method for generating a psychoacoustic model from an audio signal transforms a block of samples of an audio signal into a frequency spectrum comprising frequency components. From this frequency spectrum, it derives group masking energies. These group masking energies each correspond to a group of neighboring frequency components in the frequency spectrum. For a group of frequency components, the method allocates the group masking energy to the frequency components in the group in proportion to energy of the frequency components within the group to provide adapted mask energies for the frequency components within the group, the adapted mask energies providing masking thresholds for the psychoacoustic model of the audio signal.
    Type: Grant
    Filed: August 6, 2018
    Date of Patent: October 12, 2021
    Assignee: Digimarc Corporation
    Inventors: Aparna R. Gurijala, Shankar Thagadur Shivappa, Ravi K. Sharma, Brett A. Bradley
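    The proportional allocation described in this abstract amounts to simple arithmetic, sketched below. The function name `allocate_group_mask` is hypothetical, and the even split used when a group carries no energy is an assumption; the abstract does not address that case.

```python
def allocate_group_mask(component_energies, group_mask_energy):
    """Split one group's masking energy across its frequency components
    in proportion to each component's energy (hypothetical helper)."""
    count = len(component_energies)
    total = sum(component_energies)
    if total == 0:
        # Assumption: with no energy in the group, split the mask evenly.
        return [group_mask_energy / count] * count
    return [group_mask_energy * e / total for e in component_energies]
```

    For example, a group masking energy of 8.0 spread over components with energies 1.0 and 3.0 yields adapted mask energies of 2.0 and 6.0.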
  • Patent number: 11146307
    Abstract: The invention relates to a method, a circuit, and an apparatus for detecting distortion in spread spectrum signals. An edge in a spread spectrum clock signal is identified based on a reference clock signal. The edge data is then provided to a set of counters which are incremented corresponding to an identified edge. Each bit of a respective output of the counters are provided to a respective OR gate of a set of OR gates. An OR gate from the set of OR gates corresponding to a selected bit then outputs an indication of whether distortion exists in the spread spectrum clock signal.
    Type: Grant
    Filed: April 13, 2020
    Date of Patent: October 12, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: John Borkenhagen, Christopher Steffen, Grant P. Kesselring
  • Patent number: 11074922
    Abstract: An audio encoding method includes dividing an energy spectrum of a current audio frame into P FFT energy spectrum coefficients; determining a minimum bandwidth of distribution, on spectrum, of first-preset-proportion energy of the current audio frame according to the energy of the P FFT energy spectrum coefficients of the current audio frame, wherein the minimum bandwidth of distribution, on spectrum, of first-preset-proportion energy of the current audio frame indicates sparseness of distribution, on the spectrum, of energy of the current audio frame; and determining to use a linear-prediction-based encoding method to encode the current audio frame in response to the minimum bandwidth of distribution being greater than a first preset value.
    Type: Grant
    Filed: June 13, 2019
    Date of Patent: July 27, 2021
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Zhe Wang
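    One plausible reading of the "minimum bandwidth of distribution" in this abstract is the smallest number of FFT energy coefficients whose combined energy reaches the preset proportion of the frame energy. The sketch below follows that reading, which is an assumption rather than the claimed procedure; `min_bandwidth` and `use_linear_prediction` are hypothetical names.

```python
def min_bandwidth(energies, proportion):
    """Smallest number of coefficients whose energy sums to at least
    `proportion` of the total frame energy (assumed interpretation)."""
    total = sum(energies)
    if total <= 0:
        return 0
    target = proportion * total
    accumulated = 0.0
    # Taking the largest coefficients first gives the minimum count.
    for count, energy in enumerate(sorted(energies, reverse=True), start=1):
        accumulated += energy
        if accumulated >= target:
            return count
    return len(energies)


def use_linear_prediction(energies, proportion, preset_value):
    """Choose linear-prediction coding when the energy is spread widely,
    i.e. the minimum bandwidth exceeds the preset value."""
    return min_bandwidth(energies, proportion) > preset_value
```

    A frame whose energy sits in a few bins gives a small count (a sparse spectrum), while a flat spectrum gives a large count and, per the abstract, selects the linear-prediction-based encoder.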
  • Patent number: 11068571
    Abstract: An electronic device for identity verification includes a memory and a processor; the system of identity verification is stored in the memory, and executed by the processor to implement: after receiving current voice data of a target user, carrying out framing processing on the current voice data according to preset framing parameters to obtain multiple voice frames; extracting preset types of acoustic features in all the voice frames by using a predetermined filter, and generating multiple observed feature units corresponding to the current voice data according to the extracted acoustic features; pairwise coupling all the observed feature units with pre-stored observed feature units respectively to obtain multiple groups of coupled observed feature units; inputting the multiple groups of coupled observed feature units into a preset type of identity verification model generated by pre-training to carry out the identity verification on the target user.
    Type: Grant
    Filed: August 31, 2017
    Date of Patent: July 20, 2021
    Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
    Inventors: Jianzong Wang, Hui Guo, Jing Xiao
  • Patent number: 11064296
    Abstract: Provided are a voice denoising method and apparatus, a server and a storage medium. The voice denoising method comprises: acquiring voice signals synchronously collected by an acoustic microphone and a non-acoustic microphone (S100); carrying out voice activity detection according to the voice signal collected by the non-acoustic microphone to obtain a voice activity detection result (S110); and according to the voice activity detection result, denoising the voice signal collected by the acoustic microphone to obtain a denoised voice signal (S120). The effect of denoising can be enhanced, and the quality of voice signals can be improved.
    Type: Grant
    Filed: June 15, 2018
    Date of Patent: July 13, 2021
    Assignee: IFLYTEK CO., LTD.
    Inventors: Haikun Wang, Feng Ma, Zhiguo Wang
  • Patent number: 11024332
    Abstract: The present disclosure proposes a speech processing method and a cloud-based speech processing apparatus. The speech processing method includes: acquiring a piece of speech to be recognized collected by a terminal; performing a speech recognition on the piece of speech to be recognized; detecting whether the piece of speech to be recognized ends during the speech recognition; and feeding back a recognized result of the piece of speech to be recognized to the terminal when it is detected that the piece of speech to be recognized ends.
    Type: Grant
    Filed: October 8, 2018
    Date of Patent: June 1, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventor: Sheng Qian
  • Patent number: 11024321
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: June 1, 2021
    Assignee: Google LLC
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Patent number: 11004463
    Abstract: A speech processing method for estimating a pitch frequency includes: specifying, for each determination result of a speech-like-frame, a fundamental sound by using a plurality of local maximum values included in a spectrum of a respective frame determined as the speech-like-frame; obtaining a learned value by performing learning processing on a magnitude of the fundamental sound specified from each determination result of the speech-like-frame, the learned value including an average value and a variance of the magnitude of the fundamental sound specified from each determination result of the speech-like-frame; and executing a detection process by using the learned value, the detection process including detecting a pitch frequency of the respective frame determined as the speech-like-frame by using a threshold, the threshold being obtained by subtracting the variance included in the learned value from the average value included in the learned value.
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: May 11, 2021
    Assignee: FUJITSU LIMITED
    Inventors: Sayuri Nakayama, Taro Togawa, Takeshi Otani
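    The threshold rule in this abstract (average minus variance of the learned fundamental-sound magnitudes) is sketched below. The abstract does not say how the threshold is applied to pick a pitch, so `detect_pitch` is a hypothetical rule that returns the lowest-frequency spectral peak whose magnitude reaches the threshold; the population variance is also an assumption.

```python
from statistics import mean, pvariance


def learn_threshold(fundamental_magnitudes):
    """Threshold = average minus variance of the magnitudes observed
    in frames judged speech-like, as stated in the abstract."""
    average = mean(fundamental_magnitudes)
    variance = pvariance(fundamental_magnitudes)  # assumption: population variance
    return average - variance


def detect_pitch(peaks, threshold):
    """Hypothetical detection rule: return the frequency of the
    lowest-frequency peak whose magnitude reaches the threshold.

    `peaks` is a list of (frequency_hz, magnitude) local maxima.
    """
    for frequency, magnitude in sorted(peaks):
        if magnitude >= threshold:
            return frequency
    return None
```

    With learned magnitudes of 10, 10, 14 and 14, the average is 12 and the variance is 4, so the threshold is 8.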
  • Patent number: 10977005
    Abstract: A service running on a server for developing software collaboratively. The service includes accessing at least one repository of code for software applications. A code tree structure is extracted from the repository which represents a plurality of preexisting pipeline requirements to be used with a tree kernel similarity algorithm. At least one development repository of code is accessed. A code tree structure is extracted from the development repository of code which represents a new pipeline requirement to be used with a tree kernel similarity algorithm. A tree kernel similarity algorithm is used that includes a specified similarity function to create a feature map between the new pipeline requirements and the preexisting pipeline requirements. One or more features of the new pipeline requirements are clustered. Different requirements are extracted to different definitions based upon the features that have been clustered. A preexisting pipeline requirement is selected for a highest similarity.
    Type: Grant
    Filed: June 14, 2017
    Date of Patent: April 13, 2021
    Assignee: International Business Machines Corporation
    Inventors: Xiao Xi Liu, Jing Min Xu, Yuan Wang, Jian Ming Zhang
  • Patent number: 10978076
    Abstract: A speaker retrieval device includes a first converting unit, a receiving unit, and a searching unit. The first converting unit converts, using an inverse transform model of a first conversion model for converting score vectors representing the features of voice quality into acoustic models, pre-registered acoustic models into score vectors; and registers the score vectors in a corresponding manner to a speaker identifier in score management information. The receiving unit receives input of a score vector. The searching unit searches the score management information for the speaker identifiers whose score vectors are similar to the received score vector.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: April 13, 2021
    Assignees: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
    Inventors: Kouichirou Mori, Masaru Suzuki, Yamato Ohtani, Masahiro Morita
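    The search step in this abstract, finding speaker identifiers whose registered score vectors are similar to a received score vector, can be sketched as a nearest-neighbor lookup. The abstract does not name a similarity measure, so cosine similarity is an assumption here, and `search_speakers`, `score_table`, and `top_k` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_speakers(score_table, query, top_k=1):
    """Return the speaker identifiers whose score vectors are most
    similar to `query`, best match first.

    `score_table` maps speaker identifier -> score vector and stands in
    for the score management information (assumed representation).
    """
    ranked = sorted(
        score_table.items(),
        key=lambda item: cosine_similarity(item[1], query),
        reverse=True,
    )
    return [speaker_id for speaker_id, _ in ranked[:top_k]]
```

    A call such as `search_speakers({"a": [1.0, 0.0], "b": [0.0, 1.0]}, [0.9, 0.1])` ranks speaker "a" first because its score vector points in nearly the same direction as the query.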
  • Patent number: 10957331
    Abstract: Innovations in phase quantization during speech encoding and phase reconstruction during speech decoding are described. For example, to encode a set of phase values, a speech encoder omits higher-frequency phase values and/or represents at least some of the phase values as a weighted sum of basis functions. Or, as another example, to decode a set of phase values, a speech decoder reconstructs at least some of the phase values using a weighted sum of basis functions and/or reconstructs lower-frequency phase values then uses at least some of the lower-frequency phase values to synthesize higher-frequency phase values. In many cases, the innovations improve the performance of a speech codec in low bitrate scenarios, even when encoded data is delivered over a network that suffers from insufficient bandwidth or transmission quality problems.
    Type: Grant
    Filed: December 17, 2018
    Date of Patent: March 23, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Soren Skak Jensen, Sriram Srinivasan, Koen Bernard Vos
  • Patent number: 10937447
    Abstract: A harmony generation device and a program for the same which can generate a natural harmony sound are provided. The harmony generation device (1) generates first and second harmony tones to which a voice input through a microphone (M) is shifted in pitch by first and second shift amounts calculated based on both the voice input through the microphone (M) and a chord determined from performance information of an electric guitar (G) input through an input device (34). That is, since the first and second harmony tones can be tones based on the chord of the electric guitar (G) that changes from moment to moment, the harmony sound obtained by mixing the first and second harmony tones with the voice input through the microphone (M) can be a natural harmony sound that is rich in variation according to the chord of the electric guitar (G).
    Type: Grant
    Filed: August 26, 2019
    Date of Patent: March 2, 2021
    Assignee: Roland Corporation
    Inventors: Hideyuki Mamori, Masayuki Nakayama, Hideaki Shiraishi
  • Patent number: 10930301
    Abstract: A method is provided. Intermediate audio features are generated from an input acoustic sequence. Using a nearest neighbor search, segments of the input acoustic sequence are classified based on the intermediate audio features to generate a final intermediate feature as a classification for the input acoustic sequence. Each segment corresponds to a respective different acoustic window. The generating step includes learning the intermediate audio features from Multi-Frequency Cepstral Component (MFCC) features extracted from the input acoustic sequence. The generating step includes dividing the same scene into the different acoustic windows having varying MFCC features.
    Type: Grant
    Filed: August 19, 2020
    Date of Patent: February 23, 2021
    Inventors: Cristian Lumezanu, Yuncong Chen, Dongjin Song, Takehiko Mizuguchi, Haifeng Chen, Bo Dong
  • Patent number: 10891960
    Abstract: A method of coding for multi-channel audio signals includes estimating comparison values at an encoder indicative of an amount of temporal mismatch between a reference channel and a corresponding target channel. The method includes smoothing the comparison values to generate short-term and first long-term smoothed comparison values. The method includes calculating a cross-correlation value between the comparison values and the short-term smoothed comparison values. The method also includes adjusting the first long-term smoothed comparison values in response to comparing the cross-correlation value with a threshold. The method further includes estimating a tentative shift value and non-causally shifting the target channel by a non-causal shift value to generate an adjusted target channel. The non-causal shift value is based on the tentative shift value. The method further includes generating, based on the reference channel and the adjusted target channel, at least one of a mid-band channel or a side-band channel.
    Type: Grant
    Filed: August 28, 2018
    Date of Patent: January 12, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Venkata Subrahmanyam Chandra Sekhar Chebiyyam, Venkatraman Atti
  • Patent number: 10885925
    Abstract: A method includes processing a time-domain decoded high-band mid signal to generate a time-domain high-band residual prediction signal. The method also includes generating a high-band left channel and a high-band right channel based on the time-domain decoded high-band mid signal and the time-domain high-band residual prediction signal.
    Type: Grant
    Filed: July 15, 2019
    Date of Patent: January 5, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Venkatraman Atti, Venkata Subrahmanyam Chandra Sekhar Chebiyyam
  • Patent number: 10867620
    Abstract: The present disclosure relates to sibilance detection and mitigation in a voice signal. A method of sibilance detection and mitigation is described. In the method, a predetermined spectrum feature is extracted from a voice signal, the predetermined spectrum feature representing a distribution of signal energy over a voice frequency band. Sibilance is then identified based on the predetermined spectrum feature. Excessive sibilance is further identified from the identified sibilance based on a level of the identified sibilance. Then the voice signal is processed by decreasing a level of the excessive sibilance so as to suppress the excessive sibilance. Corresponding system and computer program products are described as well.
    Type: Grant
    Filed: June 19, 2017
    Date of Patent: December 15, 2020
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Kai Li, David Gunawan