Patents Examined by Bryan S Blankenagel

Learning speech data generating apparatus, learning speech data generating method, and program

Patent number: 11621015

Abstract: A training speech data generating apparatus includes: a voice conversion unit that converts, using fourth noise data, which is noise data based on third noise data, and speech data, the speech data so as to make the speech data clearly audible under a noise environment corresponding to the fourth noise data; and a noise superimposition unit that obtains training speech data by superimposing the third noise data and the converted speech data.

Type: Grant

Filed: March 11, 2019

Date of Patent: April 4, 2023

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takaaki Fukutomi, Manabu Okamoto, Takashi Nakamura, Kiyoaki Matsui
Machine learning for improving quality of voice biometrics

Patent number: 11610591

Abstract: Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user's voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.

Type: Grant

Filed: May 19, 2021

Date of Patent: March 21, 2023

Assignee: Capital One Services, LLC

Inventors: Bozhao Tan, Isabelle Alice Yvonne Moulinier, David Almquist, June Wu
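The similarity-threshold step in the abstract of patent 11610591 above can be illustrated with a minimal sketch. It assumes each audio portion and the voice signature are already represented as fixed-length embedding vectors and that similarity is cosine similarity. Neither assumption comes from the abstract, and the names `filter_segments`, `embedding`, and the threshold of 0.8 are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def filter_segments(segments, signature, threshold=0.8):
    """Keep only the segments whose embedding meets the similarity threshold."""
    return [seg for seg in segments
            if cosine_similarity(seg["embedding"], signature) >= threshold]

# Hypothetical two-dimensional embeddings, chosen only for illustration.
signature = [1.0, 0.0]
segments = [
    {"id": "a", "embedding": [0.9, 0.1]},
    {"id": "b", "embedding": [0.0, 1.0]},
    {"id": "c", "embedding": [1.0, 0.5]},
]
kept_ids = [seg["id"] for seg in filter_segments(segments, signature)]
```

Here segment `b` is orthogonal to the signature and is dropped, while `a` and `c` are kept.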
Anomalous sound detection apparatus, degree-of-anomaly calculation apparatus, anomalous sound generation apparatus, anomalous sound detection training apparatus, anomalous signal detection apparatus, anomalous signal detection training apparatus, and methods and programs therefor

Patent number: 11609115

Abstract: To provide an anomalous sound detection training technique by which a feature amount extraction function for detecting anomalous sound can be generated irrespective of whether training data for anomalous signals is available or not.

Type: Grant

Filed: September 14, 2017

Date of Patent: March 21, 2023

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Yuma Koizumi, Shoichiro Saito, Hisashi Uematsu
Audio translator

Patent number: 11605369

Abstract: Audio translation system includes a feature extractor and a style transfer machine learning model. The feature extractor generates for each of a plurality of source voice files one or more source voice parameters encoded as a collection of source feature vectors, and generates for each of a plurality of target voice files one or more target voice parameters encoded as a collection of target feature vectors. The style transfer machine learning model is trained on the collection of source feature vectors for the plurality of source voice files and the collection of target feature vectors for the plurality of target voice files to generate a style transformed feature vector.

Type: Grant

Filed: March 10, 2021

Date of Patent: March 14, 2023

Assignee: Spotify AB

Inventor: Marco Marchini
System and method for cross-speaker style transfer in text-to-speech and training data generation

Patent number: 11600261

Abstract: Systems are configured for generating spectrogram data characterized by a voice timbre of a target speaker and a prosody style of a source speaker by converting a waveform of source speaker data to phonetic posterior gram (PPG) data, extracting additional prosody features from the source speaker data, and generating a spectrogram based on the PPG data and the extracted prosody features. The systems are configured to utilize/train a machine learning model for generating spectrogram data and for training a neural text-to-speech model with the generated spectrogram data.

Type: Grant

Filed: May 27, 2022

Date of Patent: March 7, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Shifeng Pan, Lei He, Yulin Li, Sheng Zhao, Chunling Ma
Harmonic transposition in an audio coding method and system

Patent number: 11594234

Abstract: The present invention relates to transposing signals in time and/or frequency and in particular to coding of audio signals. More particularly, the present invention relates to high frequency reconstruction (HFR) methods including a frequency domain harmonic transposer. A method and system for generating a transposed output signal from an input signal using a transposition factor T is described. The system comprises an analysis window of length La, extracting a frame of the input signal, and an analysis transformation unit of order M transforming the samples into M complex coefficients. M is a function of the transposition factor T. The system further comprises a nonlinear processing unit altering the phase of the complex coefficients by using the transposition factor T, a synthesis transformation unit of order M transforming the altered coefficients into M altered samples, and a synthesis window of length Ls, generating a frame of the output signal.

Type: Grant

Filed: September 27, 2022

Date of Patent: February 28, 2023

Assignee: Dolby International AB

Inventors: Per Ekstrand, Lars Villemoes
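The nonlinear processing step in the abstract of patent 11594234 above can be sketched in isolation. The sketch assumes that "altering the phase ... by using the transposition factor T" means multiplying each coefficient's phase by T while keeping its magnitude. This is one common reading, not something the abstract states. The analysis and synthesis windows and transforms are omitted, and `alter_phase` is a hypothetical name.

```python
import cmath

def alter_phase(coefficients, T):
    """Multiply the phase of each complex coefficient by T, keeping its magnitude."""
    return [cmath.rect(abs(c), T * cmath.phase(c)) for c in coefficients]

# Two hypothetical transform coefficients given as (magnitude, phase in radians).
coeffs = [cmath.rect(2.0, 0.3), cmath.rect(0.5, -0.4)]
altered = alter_phase(coeffs, T=2)
```

With T = 2 the magnitudes stay at 2.0 and 0.5 while the phases become 0.6 and -0.8 radians.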
Contextual voice user interface

Patent number: 11594215

Abstract: Techniques for providing a contextual voice user interface that enables a user to query a speech processing system with respect to the decisions made to answer the user's command are described. The speech processing system may store speech processing pipeline data used to process a command. At some point after the system outputs content deemed responsive to the command, a user may speak an utterance corresponding to an inquiry with respect to the processing performed to respond to the command. For example, the user may state “why did you tell me that?” In response thereto, the speech processing system may determine the stored speech processing pipeline data used to respond to the command, and may generate output audio data that describes the data and computing decisions involved in determining the content deemed responsive to the command.

Type: Grant

Filed: October 11, 2019

Date of Patent: February 28, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Michael James Moniz, Abishek Ravi, Ryan Scott Aldrich, Michael Bennett Adams
Voice processing method, apparatus, electronic device, and storage medium

Patent number: 11587574

Abstract: Provided in the present disclosure are a voice processing method, an apparatus, an electronic device, and a storage medium, the method comprising: detecting the working state of a current call system, and when the working state is a two-end speaking state or a remote-end speaking state, performing compression processing on a subsequent remote-end voice signal, acquiring a near-end voice signal by means of a microphone, performing echo processing on the basis of the near-end voice signal and the compression-processed remote-end voice signal to obtain an echo-processed near-end voice signal and a remaining echo signal, performing non-linear suppression processing on the near-end voice signal and the remaining echo signal, and performing gain control on the suppression-processed near-end voice signal.

Type: Grant

Filed: August 17, 2021

Date of Patent: February 21, 2023

Assignee: Beijing Dajia Internet Information Technology Co., Ltd.

Inventors: Chen Zhang, Pei Dong
Control apparatus, voice interaction apparatus, voice recognition server, and program

Patent number: 11587554

Abstract: The control system includes a calculation unit configured to control a voice interaction system including voice recognition models, in which the calculation unit instructs, when a conversation with a target person is started, the voice interaction system to first perform voice recognition and response generation by one voice recognition model tentatively selected from among the voice recognition models, determines a voice recognition model estimated to be optimal among the voice recognition models held in the voice interaction system based on results of the voice recognition of a speech made by the target person in a voice recognition server, and instructs, when the voice recognition model estimated to be optimal is different from the one voice recognition model tentatively selected, the voice interaction system to switch the voice recognition model to the one estimated to be optimal and to perform voice recognition and response generation.

Type: Grant

Filed: December 16, 2019

Date of Patent: February 21, 2023

Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA

Inventor: Narimasa Watanabe
Method and apparatus for encoding and decoding audio signal to reduce quantization noise

Patent number: 11580999

Abstract: An audio signal encoding method performed by an encoder includes identifying an audio signal of a time domain in units of a block, generating a combined block by combining i) a current original block of the audio signal and ii) a previous original block chronologically adjacent to the current original block, extracting a first residual signal of a frequency domain from the combined block using linear predictive coding of a time domain, overlapping chronologically adjacent first residual signals among first residual signals converted into a time domain, and quantizing a second residual signal of a time domain extracted from the overlapped first residual signal by converting the second residual signal of the time domain into a frequency domain using linear predictive coding of a frequency domain.

Type: Grant

Filed: May 26, 2021

Date of Patent: February 14, 2023

Assignee: Electronics and Telecommunications Research Institute

Inventors: Seung Kwon Beack, Jongmo Sung, Mi Suk Lee, Tae Jin Lee, Woo-taek Lim, Inseon Jang
Methods and systems for improved signal decomposition

Patent number: 11581005

Abstract: A method for improving decomposition of digital signals using training sequences is presented. A method for improving decomposition of digital signals using initialization is also provided. A method for sorting digital signals using frames based upon energy content in the frame is further presented. A method for utilizing user input for combining parts of a decomposed signal is also presented.

Type: Grant

Filed: January 28, 2022

Date of Patent: February 14, 2023

Assignee: Meta Platforms Technologies, LLC

Inventors: Elias Kokkinis, Alexandros Tsilfidis
System and method for automatic task-oriented dialog system

Patent number: 11568000

Abstract: A method for dialog state tracking includes decoding, by a fertility decoder, encoded dialog information associated with a dialog to generate fertilities for generating dialog states of the dialog. Each dialog state includes one or more domains. Each domain includes one or more slots. Each slot includes one or more slot tokens. The method further includes generating an input sequence to a state decoder based on the fertilities. A total number of each slot token in the input sequence is based on a corresponding fertility. The method further includes encoding, by a state encoder, the input sequence to the state decoder, and decoding, by the state decoder, the encoded input sequence to generate a complete sequence of the dialog states.

Type: Grant

Filed: January 7, 2020

Date of Patent: January 31, 2023

Assignee: SALESFORCE.COM, INC.

Inventors: Hung Le, Chu Hong Hoi
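The way fertilities shape the state decoder's input in the abstract of patent 11568000 above can be sketched as follows. The sketch assumes fertilities arrive as a mapping from (domain, slot) pairs to integers and that a slot token is written as `domain-slot`. Both are illustrative choices, and `build_state_decoder_input` is a hypothetical name.

```python
def build_state_decoder_input(fertilities):
    """Repeat each (domain, slot) token as many times as its fertility."""
    sequence = []
    for (domain, slot), fertility in fertilities.items():
        sequence.extend([f"{domain}-{slot}"] * fertility)
    return sequence

# Hypothetical fertilities; a fertility of 0 means the slot contributes no tokens.
fertilities = {
    ("hotel", "name"): 2,
    ("hotel", "area"): 1,
    ("train", "day"): 0,
}
input_sequence = build_state_decoder_input(fertilities)
```

The resulting sequence contains `hotel-name` twice, `hotel-area` once, and no token for `train-day`, so the count of each slot token equals its fertility.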
Harmonic transposition in an audio coding method and system

Patent number: 11562755

Abstract: The present invention relates to transposing signals in time and/or frequency and in particular to coding of audio signals. More particularly, the present invention relates to high frequency reconstruction (HFR) methods including a frequency domain harmonic transposer. A method and system for generating a transposed output signal from an input signal using a transposition factor T is described. The system comprises an analysis window of length La, extracting a frame of the input signal, and an analysis transformation unit of order M transforming the samples into M complex coefficients. M is a function of the transposition factor T. The system further comprises a nonlinear processing unit altering the phase of the complex coefficients by using the transposition factor T, a synthesis transformation unit of order M transforming the altered coefficients into M altered samples, and a synthesis window of length Ls, generating a frame of the output signal.

Type: Grant

Filed: August 23, 2021

Date of Patent: January 24, 2023

Assignee: DOLBY INTERNATIONAL AB

Inventors: Per Ekstrand, Lars Villemoes
Methods and apparatus for enhancing musical sound during a networked conference

Patent number: 11562761

Abstract: Dynamic adjustment of audio characteristics for enhancing musical sound during a networked conference is disclosed. In an embodiment, a method is provided for sound enhancement performed by a device coupled to a network. The method includes receiving an audio signal to be transmitted over the network, detecting when musical content is present in the audio signal, processing the audio signal to enhance voice characteristics to generate an enhanced audio signal when the musical content is not detected, processing the audio signal to enhance music characteristics to generate the enhanced audio signal when the musical content is detected, and transmitting the enhanced audio signal over the network.

Type: Grant

Filed: July 31, 2020

Date of Patent: January 24, 2023

Assignee: Zoom Video Communications, Inc.

Inventors: Qiyong Liu, Jiachuan Deng, Yuhui Chen, Oded Gal
Recognition or synthesis of human-uttered harmonic sounds

Patent number: 11545143

Abstract: Within each harmonic spectrum of a sequence of spectra derived from analysis of a waveform representing human speech are identified two or more fundamental or harmonic components that have frequencies that are separated by integer multiples of a fundamental acoustic frequency. The highest harmonic frequency that is also greater than 410 Hz is a primary cap frequency, which is used to select a primary phonetic note that corresponds to a subset of phonetic chords from a set of phonetic chords for which acoustic spectral data is available. The spectral data can also include frequencies for primary band, secondary band (or secondary note), basal band, or reduced basal band acoustic components, which can be used to select a phonetic chord from the subset of phonetic chords corresponding to the selected primary note.

Type: Grant

Filed: May 18, 2021

Date of Patent: January 3, 2023

Inventor: Boris Fridman-Mintz
Amplitude-independent window sizes in audio encoding

Patent number: 11532314

Abstract: A computer-implemented method can include receiving a first signal corresponding to a first flow of acoustic energy, applying a transform to the received first signal using at least a first amplitude-independent window size at a first frequency and a second amplitude-independent window size at a second frequency, the second amplitude-independent window size improving a temporal response at the second frequency, wherein the second frequency is subject to amplitude reduction due to a resonance phenomenon associated with the first frequency, and storing a first encoded signal, the first encoded signal based on applying the transform to the received first signal.

Type: Grant

Filed: December 16, 2019

Date of Patent: December 20, 2022

Assignee: GOOGLE LLC

Inventors: Jyrki Antero Alakuijala, Martin Bruse
Real time popularity based audible content acquisition

Patent number: 11508353

Abstract: A personalized news service provides personalized news programs for its users by generating personalized combinations of audible versions of news stories derived from text-based versions of the news stories. The audible versions may be generated from the text-based version by a text-to-speech system, or may be generated by recording a person reading aloud the text-based version. To acquire recordings, the personalized news service can make a determination that a particular news story has a threshold extent of popularity. The news service can then transmit a request to a remote recording station for a recording of a verbal reading of the particular news story. The news service can then receive the requested recording from the remote recording station.

Type: Grant

Filed: June 28, 2021

Date of Patent: November 22, 2022

Assignee: Gracenote Digital Ventures, LLC

Inventors: Venkatarama Anilkumar Panguluri, Venkata Sunil Kumar Yarram, Lalit Kumar, Gregory P. Defouw
Audio signal processing device, audio signal processing method, and control program

Patent number: 11507807

Abstract: An audio signal processing device includes a neural network circuit that includes an input layer including input units, an intermediate layer, and an output layer including output units, an input section that executes simultaneous inputting of, at each of unit time intervals, each of pieces of unit data of consecutive sampling units in an input signal data string generated through sampling based on an audio signal string into each of the input units on a one-to-one basis, one of the pieces of unit data input into one of the input units at one of the unit time intervals being input into another of the input units at another of the unit time intervals in the simultaneous inputting at each of the unit time intervals, and an output section that outputs, in accordance with the simultaneous inputting over a plurality of the unit time intervals that are consecutive, a computation result at each of the unit time intervals, the computation result being based on pieces of data output from the output units at each of th

Type: Grant

Filed: September 22, 2017

Date of Patent: November 22, 2022

Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.

Inventor: Ryoji Suzuki
Periodic-combined-envelope-sequence generation device, periodic-combined-envelope-sequence generation method, periodic-combined-envelope-sequence generation program and recording medium

Patent number: 11501788

Abstract: An envelope sequence is provided that can improve approximation accuracy near peaks caused by the pitch period of an audio signal. A periodic-combined-envelope-sequence generation device according to the present invention takes, as an input audio signal, a time-domain audio digital signal in each frame, which is a predetermined time segment, and generates a periodic combined envelope sequence as an envelope sequence. The periodic-combined-envelope-sequence generation device according to the present invention comprises at least a spectral-envelope-sequence calculating part and a periodic-combined-envelope generating part. The spectral-envelope-sequence calculating part calculates a spectral envelope sequence of the input audio signal on the basis of time-domain linear prediction of the input audio signal.

Type: Grant

Filed: June 18, 2021

Date of Patent: November 15, 2022

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
Methods and apparatus for supplementing partially readable and/or inaccurate codes in media

Patent number: 11501786

Abstract: Methods and apparatus are disclosed for supplementing partially readable and/or inaccurate codes. An example apparatus includes a watermark analyzer to select a first watermark and a second watermark decoded from media; a comparator to compare a first decoded timestamp of the first watermark to a second decoded timestamp of the second watermark; and a timestamp adjuster to adjust the second decoded timestamp based on the first decoded timestamp of the second watermark when at least a threshold number of symbols of the second decoded timestamp match corresponding symbols of the first decoded timestamp.

Type: Grant

Filed: April 30, 2020

Date of Patent: November 15, 2022

Assignee: The Nielsen Company (US), LLC

Inventors: David Gish, Jeremey M. Davis, Wendell D. Lynch, Christen V. Nielsen
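The comparator and timestamp adjuster in the abstract of patent 11501786 above can be sketched under simple assumptions. Timestamps are modelled as equal-length symbol sequences with `None` marking an unreadable symbol, and the adjustment is assumed to copy the first timestamp's decoded symbols into the second. The abstract does not specify the adjustment rule, and `adjust_timestamp` is a hypothetical name.

```python
def adjust_timestamp(first, second, threshold):
    """Return `second`, corrected from `first` when enough symbols agree.

    Timestamps are equal-length sequences of symbols; None marks a symbol
    that could not be decoded.
    """
    matches = sum(1 for a, b in zip(first, second) if a is not None and a == b)
    if matches < threshold:
        return list(second)  # too little agreement: leave the timestamp unchanged
    # Assumed adjustment: use the first timestamp's symbol wherever it was decoded.
    return [a if a is not None else b for a, b in zip(first, second)]

# Hypothetical decoded symbols: the second timestamp has one unreadable
# symbol (None) and one inaccurate symbol (9 instead of 3).
first = [1, 4, 2, 7, 3]
second = [1, 4, None, 7, 9]
adjusted = adjust_timestamp(first, second, threshold=3)
unchanged = adjust_timestamp(first, second, threshold=4)
```

Three symbols match, so a threshold of 3 triggers the adjustment and a threshold of 4 leaves the second timestamp as decoded.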
