Patents Examined by Bryan S Blankenagel
  • Patent number: 11621015
    Abstract: A training speech data generating apparatus includes: a voice conversion unit that converts, using fourth noise data, which is noise data based on third noise data, and speech data, the speech data so as to make the speech data clearly audible under a noise environment corresponding to the fourth noise data; and a noise superimposition unit that obtains training speech data by superimposing the third noise data and the converted speech data.
    Type: Grant
    Filed: March 11, 2019
    Date of Patent: April 4, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Takaaki Fukutomi, Manabu Okamoto, Takashi Nakamura, Kiyoaki Matsui
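The noise superimposition step in the abstract above can be pictured with a minimal sketch. It is an illustration only, not the patented method: the function name `mix_at_snr` and the idea of scaling the noise to a target signal-to-noise ratio are assumptions, and the voice conversion step is not modeled here.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Superimpose noise on speech at an assumed target SNR (in dB).

    Hypothetical helper: the abstract only says the noise data and the
    converted speech data are superimposed; the SNR control is an assumption.
    """
    if not speech or len(speech) != len(noise):
        raise ValueError("speech and noise must be non-empty and equal in length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        # Silent noise: nothing to superimpose.
        return list(speech)
    # Noise power that yields the requested speech-to-noise ratio.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]
```

At 0 dB the scaled noise carries the same power as the speech; larger `snr_db` values attenuate the noise.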
  • Patent number: 11610591
    Abstract: Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user's voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.
    Type: Grant
    Filed: May 19, 2021
    Date of Patent: March 21, 2023
    Assignee: Capital One Services, LLC
    Inventors: Bozhao Tan, Isabelle Alice Yvonne Moulinier, David Almquist, June Wu
  • Patent number: 11609115
    Abstract: To provide an anomalous sound detection training technique by which a feature amount extraction function for detecting anomalous sound can be generated irrespective of whether training data for anomalous signals is available or not.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: March 21, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Yuma Koizumi, Shoichiro Saito, Hisashi Uematsu
  • Patent number: 11605369
    Abstract: An audio translation system includes a feature extractor and a style transfer machine learning model. The feature extractor generates for each of a plurality of source voice files one or more source voice parameters encoded as a collection of source feature vectors, and generates for each of a plurality of target voice files one or more target voice parameters encoded as a collection of target feature vectors. The style transfer machine learning model is trained on the collection of source feature vectors for the plurality of source voice files and the collection of target feature vectors for the plurality of target voice files to generate a style transformed feature vector.
    Type: Grant
    Filed: March 10, 2021
    Date of Patent: March 14, 2023
    Assignee: Spotify AB
    Inventor: Marco Marchini
  • Patent number: 11600261
    Abstract: Systems are configured for generating spectrogram data characterized by a voice timbre of a target speaker and a prosody style of a source speaker by converting a waveform of source speaker data to phonetic posterior gram (PPG) data, extracting additional prosody features from the source speaker data, and generating a spectrogram based on the PPG data and the extracted prosody features. The systems are configured to utilize/train a machine learning model for generating spectrogram data and for training a neural text-to-speech model with the generated spectrogram data.
    Type: Grant
    Filed: May 27, 2022
    Date of Patent: March 7, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Shifeng Pan, Lei He, Yulin Li, Sheng Zhao, Chunling Ma
  • Patent number: 11594234
    Abstract: The present invention relates to transposing signals in time and/or frequency and in particular to coding of audio signals. More particularly, the present invention relates to high frequency reconstruction (HFR) methods including a frequency domain harmonic transposer. A method and system for generating a transposed output signal from an input signal using a transposition factor T is described. The system comprises an analysis window of length La, extracting a frame of the input signal, and an analysis transformation unit of order M transforming the samples into M complex coefficients. M is a function of the transposition factor T. The system further comprises a nonlinear processing unit altering the phase of the complex coefficients by using the transposition factor T, a synthesis transformation unit of order M transforming the altered coefficients into M altered samples, and a synthesis window of length Ls, generating a frame of the output signal.
    Type: Grant
    Filed: September 27, 2022
    Date of Patent: February 28, 2023
    Assignee: Dolby International AB
    Inventors: Per Ekstrand, Lars Villemoes
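The nonlinear processing step described in the abstract above can be sketched as follows. This is an assumption-laden illustration: the abstract only says the phase of each complex coefficient is altered using the transposition factor T, and multiplying the phase by T while keeping the magnitude is one common reading, not a confirmed detail of the patent. The name `transpose_phase` is hypothetical, and the analysis and synthesis windows and transforms are omitted.

```python
import cmath


def transpose_phase(coefficients, T):
    """Return coefficients with unchanged magnitude and phase multiplied by T.

    Assumed form of the nonlinear phase modification; not taken from the patent.
    """
    return [cmath.rect(abs(c), T * cmath.phase(c)) for c in coefficients]
```

For example, a coefficient with magnitude 2 and phase 0.3 rad maps, for T = 2, to a coefficient with magnitude 2 and phase 0.6 rad.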
  • Patent number: 11594215
    Abstract: Techniques for providing a contextual voice user interface that enables a user to query a speech processing system with respect to the decisions made to answer the user's command are described. The speech processing system may store speech processing pipeline data used to process a command. At some point after the system outputs content deemed responsive to the command, a user may speak an utterance corresponding to an inquiry with respect to the processing performed to respond to the command. For example, the user may state “why did you tell me that?” In response thereto, the speech processing system may determine the stored speech processing pipeline data used to respond to the command, and may generate output audio data that describes the data and computing decisions involved in determining the content deemed responsive to the command.
    Type: Grant
    Filed: October 11, 2019
    Date of Patent: February 28, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Michael James Moniz, Abishek Ravi, Ryan Scott Aldrich, Michael Bennett Adams
  • Patent number: 11587574
    Abstract: Provided in the present disclosure are a voice processing method, an apparatus, an electronic device, and a storage medium, the method comprising: detecting the working state of a current call system, and when the working state is a two-end speaking state or a remote-end speaking state, performing compression processing on a subsequent remote-end voice signal, acquiring a near-end voice signal by means of a microphone, performing echo processing on the basis of the near-end voice signal and the compression-processed remote-end voice signal to obtain an echo-processed near-end voice signal and a remaining echo signal, performing non-linear suppression processing on the near-end voice signal and the remaining echo signal, and performing gain control on the suppression-processed near-end voice signal.
    Type: Grant
    Filed: August 17, 2021
    Date of Patent: February 21, 2023
    Assignee: Beijing Dajia Internet Information Technology Co., Ltd.
    Inventors: Chen Zhang, Pei Dong
  • Patent number: 11587554
    Abstract: The control system includes a calculation unit configured to control a voice interaction system including voice recognition models, in which the calculation unit instructs, when a conversation with a target person is started, the voice interaction system to first perform voice recognition and response generation by one voice recognition model tentatively selected from among the voice recognition models, determines a voice recognition model estimated to be optimal among the voice recognition models held in the voice interaction system based on results of the voice recognition of a speech made by the target person in a voice recognition server, and instructs, when the voice recognition model estimated to be optimal is different from the one voice recognition model tentatively selected, the voice interaction system to switch the voice recognition model to the one estimated to be optimal and to perform voice recognition and response generation.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: February 21, 2023
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventor: Narimasa Watanabe
  • Patent number: 11580999
    Abstract: An audio signal encoding method performed by an encoder includes identifying an audio signal of a time domain in units of a block, generating a combined block by combining i) a current original block of the audio signal and ii) a previous original block chronologically adjacent to the current original block, extracting a first residual signal of a frequency domain from the combined block using linear predictive coding of a time domain, overlapping chronologically adjacent first residual signals among first residual signals converted into a time domain, and quantizing a second residual signal of a time domain extracted from the overlapped first residual signal by converting the second residual signal of the time domain into a frequency domain using linear predictive coding of a frequency domain.
    Type: Grant
    Filed: May 26, 2021
    Date of Patent: February 14, 2023
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Seung Kwon Beack, Jongmo Sung, Mi Suk Lee, Tae Jin Lee, Woo-taek Lim, Inseon Jang
  • Patent number: 11581005
    Abstract: A method for improving decomposition of digital signals using training sequences is presented. A method for improving decomposition of digital signals using initialization is also provided. A method for sorting digital signals using frames based upon energy content in the frame is further presented. A method for utilizing user input for combining parts of a decomposed signal is also presented.
    Type: Grant
    Filed: January 28, 2022
    Date of Patent: February 14, 2023
    Assignee: Meta Platforms Technologies, LLC
    Inventors: Elias Kokkinis, Alexandros Tsilfidis
  • Patent number: 11568000
    Abstract: A method for dialog state tracking includes decoding, by a fertility decoder, encoded dialog information associated with a dialog to generate fertilities for generating dialog states of the dialog. Each dialog state includes one or more domains. Each domain includes one or more slots. Each slot includes one or more slot tokens. The method further includes generating an input sequence to a state decoder based on the fertilities. A total number of each slot token in the input sequence is based on a corresponding fertility. The method further includes encoding, by a state encoder, the input sequence to the state decoder, and decoding, by the state decoder, the encoded input sequence to generate a complete sequence of the dialog states.
    Type: Grant
    Filed: January 7, 2020
    Date of Patent: January 31, 2023
    Assignee: SALESFORCE.COM, INC.
    Inventors: Hung Le, Chu Hong Hoi
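The way fertilities shape the state decoder's input, as described in the abstract above, can be sketched in a few lines. The function name, the slot names, and the dictionary representation are assumptions made for illustration; the abstract states only that the number of times each slot token appears in the input sequence is based on its fertility.

```python
def build_state_decoder_input(slot_tokens, fertilities):
    """Repeat each slot token as many times as its predicted fertility.

    Hypothetical helper: slots missing from `fertilities` are treated as
    having fertility 0 and are dropped from the sequence.
    """
    sequence = []
    for token in slot_tokens:
        sequence.extend([token] * fertilities.get(token, 0))
    return sequence
```

With fertilities of 2 for `hotel-name` and 1 for `hotel-area`, the resulting sequence holds `hotel-name` twice followed by `hotel-area` once, giving the state decoder one input position per output token.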
  • Patent number: 11562755
    Abstract: The present invention relates to transposing signals in time and/or frequency and in particular to coding of audio signals. More particularly, the present invention relates to high frequency reconstruction (HFR) methods including a frequency domain harmonic transposer. A method and system for generating a transposed output signal from an input signal using a transposition factor T is described. The system comprises an analysis window of length La, extracting a frame of the input signal, and an analysis transformation unit of order M transforming the samples into M complex coefficients. M is a function of the transposition factor T. The system further comprises a nonlinear processing unit altering the phase of the complex coefficients by using the transposition factor T, a synthesis transformation unit of order M transforming the altered coefficients into M altered samples, and a synthesis window of length Ls, generating a frame of the output signal.
    Type: Grant
    Filed: August 23, 2021
    Date of Patent: January 24, 2023
    Assignee: DOLBY INTERNATIONAL AB
    Inventors: Per Ekstrand, Lars Villemoes
  • Patent number: 11562761
    Abstract: Dynamic adjustment of audio characteristics for enhancing musical sound during a networked conference is disclosed. In an embodiment, a method is provided for sound enhancement performed by a device coupled to a network. The method includes receiving an audio signal to be transmitted over the network, detecting when musical content is present in the audio signal, processing the audio signal to enhance voice characteristics to generate an enhanced audio signal when the musical content is not detected, processing the audio signal to enhance music characteristics to generate the enhanced audio signal when the musical content is detected, and transmitting the enhanced audio signal over the network.
    Type: Grant
    Filed: July 31, 2020
    Date of Patent: January 24, 2023
    Assignee: Zoom Video Communications, Inc.
    Inventors: Qiyong Liu, Jiachuan Deng, Yuhui Chen, Oded Gal
  • Patent number: 11545143
    Abstract: Within each harmonic spectrum of a sequence of spectra derived from analysis of a waveform representing human speech are identified two or more fundamental or harmonic components that have frequencies that are separated by integer multiples of a fundamental acoustic frequency. The highest harmonic frequency that is also greater than 410 Hz is a primary cap frequency, which is used to select a primary phonetic note that corresponds to a subset of phonetic chords from a set of phonetic chords for which acoustic spectral data is available. The spectral data can also include frequencies for primary band, secondary band (or secondary note), basal band, or reduced basal band acoustic components, which can be used to select a phonetic chord from the subset of phonetic chords corresponding to the selected primary note.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: January 3, 2023
    Inventor: Boris Fridman-Mintz
  • Patent number: 11532314
    Abstract: A computer-implemented method can include receiving a first signal corresponding to a first flow of acoustic energy, applying a transform to the received first signal using at least a first amplitude-independent window size at a first frequency and a second amplitude-independent window size at a second frequency, the second amplitude-independent window size improving a temporal response at the second frequency, wherein the second frequency is subject to amplitude reduction due to a resonance phenomenon associated with the first frequency, and storing a first encoded signal, the first encoded signal based on applying the transform to the received first signal.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: December 20, 2022
    Assignee: GOOGLE LLC
    Inventors: Jyrki Antero Alakuijala, Martin Bruse
  • Patent number: 11508353
    Abstract: A personalized news service provides personalized news programs for its users by generating personalized combinations of audible versions of news stories derived from text-based versions of the news stories. The audible versions may be generated from the text-based version by a text-to-speech system, or by recording a person reading aloud the text-based version. To acquire recordings, the personalized news service can make a determination that a particular news story has a threshold extent of popularity. The news service can then transmit a request to a remote recording station for a recording of a verbal reading of the particular news story. The news service can then receive the requested recording from the remote recording station.
    Type: Grant
    Filed: June 28, 2021
    Date of Patent: November 22, 2022
    Assignee: Gracenote Digital Ventures, LLC
    Inventors: Venkatarama Anilkumar Panguluri, Venkata Sunil Kumar Yarram, Lalit Kumar, Gregory P. Defouw
  • Patent number: 11507807
    Abstract: An audio signal processing device includes a neural network circuit that includes an input layer including input units, an intermediate layer, and an output layer including output units, an input section that executes simultaneous inputting of, at each of unit time intervals, each of pieces of unit data of consecutive sampling units in an input signal data string generated through sampling based on an audio signal string into each of the input units on a one-to-one basis, one of the pieces of unit data input into one of the input units at one of the unit time intervals being input into another of the input units at another of the unit time intervals in the simultaneous inputting at each of the unit time intervals, and an output section that outputs, in accordance with the simultaneous inputting over a plurality of the unit time intervals that are consecutive, a computation result at each of the unit time intervals, the computation result being based on pieces of data output from the output units at each of th
    Type: Grant
    Filed: September 22, 2017
    Date of Patent: November 22, 2022
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventor: Ryoji Suzuki
  • Patent number: 11501788
    Abstract: An envelope sequence is provided that can improve approximation accuracy near peaks caused by the pitch period of an audio signal. A periodic-combined-envelope-sequence generation device according to the present invention takes, as an input audio signal, a time-domain audio digital signal in each frame, which is a predetermined time segment, and generates a periodic combined envelope sequence as an envelope sequence. The periodic-combined-envelope-sequence generation device according to the present invention comprises at least a spectral-envelope-sequence calculating part and a periodic-combined-envelope generating part. The spectral-envelope-sequence calculating part calculates a spectral envelope sequence of the input audio signal on the basis of time-domain linear prediction of the input audio signal.
    Type: Grant
    Filed: June 18, 2021
    Date of Patent: November 15, 2022
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
  • Patent number: 11501786
    Abstract: Methods and apparatus are disclosed for supplementing partially readable and/or inaccurate codes. An example apparatus includes a watermark analyzer to select a first watermark and a second watermark decoded from media; a comparator to compare a first decoded timestamp of the first watermark to a second decoded timestamp of the second watermark; and a timestamp adjuster to adjust the second decoded timestamp based on the first decoded timestamp of the second watermark when at least a threshold number of symbols of the second decoded timestamp match corresponding symbols of the first decoded timestamp.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: November 15, 2022
    Assignee: The Nielsen Company (US), LLC
    Inventors: David Gish, Jeremey M. Davis, Wendell D. Lynch, Christen V. Nielsen
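The comparator and timestamp adjuster described in the abstract above can be sketched as a symbol-by-symbol comparison. This is a simplified illustration under stated assumptions: timestamps are modeled as equal-length strings, unreadable symbols appear as `?`, and "adjusting" is taken to mean adopting the first timestamp's symbols. The abstract does not specify any of these details, and the name `adjust_timestamp` is hypothetical.

```python
def adjust_timestamp(first, second, threshold):
    """Repair `second` from `first` when enough symbols agree.

    Assumed behavior: if at least `threshold` symbols match position by
    position, return `first`; otherwise leave `second` unchanged.
    """
    if len(first) != len(second):
        return second
    matches = sum(a == b for a, b in zip(first, second))
    return first if matches >= threshold else second
```

A real decoder would presumably also account for the time elapsed between the two watermarks; that offset handling is left out of this sketch.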