General Speech Analysis Without Concrete Application (EPO) Patents (Class 704/E11.002)

Multiple stage indexing of audio content

Patent number: 11954149

Abstract: Techniques of content unification are disclosed. In some example embodiments, a computer-implemented method comprises: determining clusters based on a comparison of a plurality of audio content using a first matching criteria, each cluster of the plurality of clusters comprising at least two audio content from the plurality of audio content; for each cluster of the plurality of clusters, determining a representative audio content for the cluster from the at least two audio content of the cluster; loading the corresponding representative audio content of each cluster into an index; matching the query audio content to one of the representative audio contents using a first matching criteria; determining the corresponding cluster of the matched representative audio content; and identifying a match between the query audio content and at least one of the audio content of the cluster of the matched representative audio content based on a comparison using a second matching criteria.

Type: Grant

Filed: October 27, 2022

Date of Patent: April 9, 2024

Assignee: Gracenote, Inc.

Inventors: Peter C. DiMaria, Markus K. Cremer, Barnabas Mink, Tanji Koshio, Kei Tsuji
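
The two-stage lookup described in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the fingerprint representation, the `distance` function, the greedy clustering rule, the choice of the first cluster member as representative, and the `coarse` and `fine` thresholds are all assumptions made for illustration. Unlike the abstract, this sketch also allows single-member clusters.

```python
from typing import Dict, List, Tuple

Fingerprint = Tuple[float, ...]  # hypothetical fixed-length audio fingerprint


def distance(a: Fingerprint, b: Fingerprint) -> float:
    """Hypothetical comparison: mean absolute difference of two fingerprints."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def build_index(items: Dict[str, Fingerprint], coarse: float) -> Dict[str, List[str]]:
    """Greedy clustering with the coarse (first) criterion.

    Each key of the returned index is a representative; its value lists the
    members of that representative's cluster (including the representative).
    """
    index: Dict[str, List[str]] = {}
    for name, fingerprint in items.items():
        for representative in index:
            if distance(items[representative], fingerprint) <= coarse:
                index[representative].append(name)
                break
        else:
            # No existing cluster is close enough: start a new one.
            index[name] = [name]
    return index


def lookup(query: Fingerprint, items: Dict[str, Fingerprint],
           index: Dict[str, List[str]], coarse: float, fine: float) -> List[str]:
    """Stage 1 matches representatives only; stage 2 checks the matched cluster."""
    if not index:
        return []
    best = min(index, key=lambda rep: distance(items[rep], query))
    if distance(items[best], query) > coarse:
        return []
    # Stage 2: stricter (second) criterion, applied only inside one cluster.
    return [m for m in index[best] if distance(items[m], query) <= fine]
```

Only the representatives are compared in the first stage, so the cost of a query grows with the number of clusters rather than the number of stored items.
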
Information processing method, system, apparatus, electronic device and storage medium

Patent number: 11900945

Abstract: An information processing method, a system, an apparatus, an electronic device and a storage medium, where the method is applied to a client, and includes: receiving a transcript and a sentence identifier of the transcript sent by a service server; reading a local sentence identifier, and when the received sentence identifier is the same as the local sentence identifier, updating a displayed caption content corresponding to the local sentence identifier with the transcript. When the received sentence identifier of the client is the same as the local sentence identifier, the displayed caption content is replaced with the received transcript.

Type: Grant

Filed: March 21, 2022

Date of Patent: February 13, 2024

Assignee: BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD.

Inventors: Li Zhao, Xiao Han, Kojung Chen, Jian Tong
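
The client-side update rule in the abstract above can be sketched as follows. This is a minimal illustration under assumptions: the `CaptionView` class, its method names, and the behavior for a previously unseen sentence identifier (appending a new caption) are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CaptionView:
    """Hypothetical client-side caption state, keyed by sentence identifier."""
    captions: Dict[str, str] = field(default_factory=dict)
    order: List[str] = field(default_factory=list)

    def on_transcript(self, sentence_id: str, transcript: str) -> None:
        """Handle a (transcript, sentence identifier) pair sent by the server."""
        if sentence_id not in self.captions:
            # Assumption: an unseen identifier starts a new caption line.
            self.order.append(sentence_id)
        # Same identifier as a local one: replace the displayed caption text.
        self.captions[sentence_id] = transcript

    def rendered(self) -> List[str]:
        """Return the captions in display order."""
        return [self.captions[s] for s in self.order]
```

For example, receiving `("s1", "hello wor")` and then `("s1", "hello world")` leaves a single caption line whose text has been replaced by the later, fuller transcript.
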
Sample generation method and apparatus

Patent number: 11810546

Abstract: Provided are a sample generation method and apparatus. The sample generation method comprises: acquiring a plurality of text-audio pairs, wherein each text-audio pair contains a text segment and an audio segment; calculating an audio feature of an audio segment of each of the plurality of text-audio pairs, and selecting, by means of screening and according to the audio feature, a target text-audio pair and a splicing text-audio pair corresponding to the target text-audio pair from among the plurality of text-audio pairs; splicing the target text-audio pair and the splicing text-audio pair into a text-audio pair to be tested, and testing the text-audio pair to be tested; and when the text-audio pair to be tested meets a preset test condition, writing the text-audio pair to be tested into a training database.

Type: Grant

Filed: November 12, 2021

Date of Patent: November 7, 2023

Assignee: Beijing Yuanli Weilai Science and Technology Co., Ltd.

Inventors: Dongxiao Wang, Mingqi Yang, Nan Ma, Long Xia, Changzhen Guo
Model agnostic time series analysis via matrix estimation

Patent number: 11775608

Abstract: A system and method model a time series from missing data by imputing missing values, denoising measured but noisy values, and forecasting future values of a single time series. A time series of potentially noisy, partially-measured values of a physical process is represented as a non-overlapping matrix. For several classes of common model functions, it can be proved that the resulting matrix has a low rank or approximately low rank, allowing a matrix estimation technique, for example singular value thresholding, to be efficiently applied. Applying such a technique produces a mean matrix that estimates latent values, of the physical process at times or intervals corresponding to measurements, with less error than previously known methods. These latent values have been denoised (if noisy) and imputed (if missing). Linear regression of the estimated latent values permits forecasting with an error that decreases as more measurements are made.

Type: Grant

Filed: July 7, 2022

Date of Patent: October 3, 2023

Assignee: Massachusetts Institute of Technology

Inventors: Devavrat D. Shah, Anish Agarwal, Muhammad Amjad, Dennis Shen
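
The matrix-estimation step in the abstract above can be sketched with NumPy. This is an illustrative sketch, not the patented method: the function name, the zero-fill of missing entries, the rescaling by the observed fraction, and the hard cutoff that keeps the `keep` largest singular values are assumptions, and the forecasting stage is not covered.

```python
import numpy as np


def impute_and_denoise(series: np.ndarray, rows: int, keep: int) -> np.ndarray:
    """Estimate latent values of a 1-D series whose missing entries are NaN.

    The series is cut into non-overlapping blocks of length `rows`, which
    become the columns of a matrix; singular values beyond the first `keep`
    are set to zero (a hard form of singular value thresholding).
    """
    n = (len(series) // rows) * rows          # drop a trailing partial block
    matrix = series[:n].reshape(-1, rows).T   # shape: (rows, number of blocks)
    observed = ~np.isnan(matrix)
    fraction = observed.mean()
    if fraction == 0:
        raise ValueError("series has no observed values")
    filled = np.where(observed, matrix, 0.0)  # missing entries become 0
    u, s, vt = np.linalg.svd(filled, full_matrices=False)
    s[keep:] = 0.0                            # keep only the leading components
    estimate = (u * s) @ vt / fraction        # rescale for the zero-filled gaps
    return estimate.T.reshape(-1)             # back to time order
```

A pure sinusoid arranged this way has rank 2, so with no missing or noisy values and `keep=2` the function returns the input unchanged up to rounding.
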
End-to-end streaming keyword spotting

Patent number: 11682385

Abstract: A method for training hotword detection includes receiving a training input audio sequence including a sequence of input frames that define a hotword that initiates a wake-up process on a device. The method also includes feeding the training input audio sequence into an encoder and a decoder of a memorized neural network. Each of the encoder and the decoder of the memorized neural network include sequentially-stacked single value decomposition filter (SVDF) layers. The method further includes generating a logit at each of the encoder and the decoder based on the training input audio sequence. For each of the encoder and the decoder, the method includes smoothing each respective logit generated from the training input audio sequence, determining a max pooling loss from a probability distribution based on each respective logit, and optimizing the encoder and the decoder based on all max pooling losses associated with the training input audio sequence.

Type: Grant

Filed: June 15, 2021

Date of Patent: June 20, 2023

Assignee: Google LLC

Inventors: Raziel Alvarez Guevara, Hyun Jin Park, Patrick Violette
Information processing method, system, apparatus, electronic device and storage medium

Patent number: 11568872

Abstract: An information processing method, a system, an apparatus, an electronic device and a storage medium, where the method is applied to a client, and includes: receiving a transcript and a sentence identifier of the transcript sent by a service server; reading a local sentence identifier, and when the received sentence identifier is the same as the local sentence identifier, updating a displayed caption content corresponding to the local sentence identifier with the transcript. When the received sentence identifier of the client is the same as the local sentence identifier, the displayed caption content is replaced with the received transcript.

Type: Grant

Filed: March 21, 2022

Date of Patent: January 31, 2023

Assignee: BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD.

Inventors: Li Zhao, Xiao Han, Kojung Chen, Jian Tong
Multiple stage indexing of audio content

Patent number: 11487814

Abstract: Techniques of content unification are disclosed. In some example embodiments, a computer-implemented method comprises: determining clusters based on a comparison of a plurality of audio content using a first matching criteria, each cluster of the plurality of clusters comprising at least two audio content from the plurality of audio content; for each cluster of the plurality of clusters, determining a representative audio content for the cluster from the at least two audio content of the cluster; loading the corresponding representative audio content of each cluster into an index; matching the query audio content to one of the representative audio contents using a first matching criteria; determining the corresponding cluster of the matched representative audio content; and identifying a match between the query audio content and at least one of the audio content of the cluster of the matched representative audio content based on a comparison using a second matching criteria.

Type: Grant

Filed: January 4, 2021

Date of Patent: November 1, 2022

Assignee: Gracenote, Inc.

Inventors: Peter C. DiMaria, Markus K. Cremer, Barnabas Mink, Tanji Koshio, Kei Tsuji
Sensor data fusion for prognostics and health monitoring

Patent number: 11340602

Abstract: A method includes converting time-series data from a plurality of prognostic and health monitoring (PHM) sensors into frequency domain data. One or more portions of the frequency domain data are labeled as indicative of one or more target modes to form labeled target data. A model including a deep neural network is applied to the labeled target data. A result of applying the model is classified as one or more discretized PHM training indicators associated with the one or more target modes. The one or more discretized PHM training indicators are output.

Type: Grant

Filed: December 18, 2015

Date of Patent: May 24, 2022

Assignee: RAYTHEON TECHNOLOGIES CORPORATION

Inventors: Michael J. Giering, Madhusudana Shashanka, Soumik Sarkar, Vivek Venugopalan
Methods and apparatus to perform windowed sliding transforms

Patent number: 10726852

Abstract: Methods and apparatus to perform windowed sliding transforms are disclosed. An example apparatus includes a transformer to transform a first block of time-domain samples of an input signal into a first frequency-domain representation based on a second frequency-domain representation of a second block of time-domain samples of the input signal, and a windower to apply a third frequency-domain representation of a time-domain window function to the first frequency-domain representation.

Type: Grant

Filed: February 19, 2018

Date of Patent: July 28, 2020

Assignee: The Nielsen Company (US), LLC

Inventor: Zafar Rafii
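
One well-known way to realize the idea in the abstract above is a sliding DFT combined with a window applied in the frequency domain. The sketch below is an assumption about how such a scheme might look, not the apparatus claimed in the patent: it uses a hop of one sample and a periodic Hann window, and the function name is hypothetical. The spectrum of each block is derived from the previous block's spectrum, and the Hann window becomes a three-tap combination of neighbouring bins.

```python
import numpy as np


def windowed_sliding_dft(x, n):
    """Return Hann-windowed DFTs of every length-n block of x (hop of 1)."""
    x = np.asarray(x, dtype=float)
    twiddle = np.exp(2j * np.pi * np.arange(n) / n)
    spectrum = np.fft.fft(x[:n])  # only the first block needs a full FFT
    frames = []
    for start in range(len(x) - n + 1):
        if start > 0:
            # Sliding DFT update: remove the oldest sample, add the newest,
            # then rotate each bin by one sample of phase.
            spectrum = (spectrum - x[start - 1] + x[start + n - 1]) * twiddle
        # Periodic Hann window in the frequency domain:
        # Y[k] = 0.5*X[k] - 0.25*X[k-1] - 0.25*X[k+1] (indices modulo n).
        windowed = (0.5 * spectrum
                    - 0.25 * np.roll(spectrum, 1)
                    - 0.25 * np.roll(spectrum, -1))
        frames.append(windowed)
    return np.array(frames)
```

Each row of the result should match `np.fft.fft` of the corresponding block multiplied by `0.5 - 0.5 * cos(2*pi*k/n)` in the time domain, up to accumulated rounding error in the recursive update.
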
Methods and apparatus to perform windowed sliding transforms

Patent number: 10692507

Abstract: Methods and apparatus to perform windowed sliding transforms are disclosed. An example apparatus includes a transformer to transform a first block of time-domain samples of an input signal into a first frequency-domain representation based on a second frequency-domain representation of a second block of time-domain samples of the input signal, and a windower to apply a third frequency-domain representation of a time-domain window function to the first frequency-domain representation.

Type: Grant

Filed: February 19, 2018

Date of Patent: June 23, 2020

Assignee: The Nielsen Company (US), LLC

Inventor: Zafar Rafii
Composite inspection

Patent number: 10571390

Abstract: A method of detecting local material changes in a composite structure is presented. A pulsed laser beam is directed towards the composite structure comprised of a number of composite materials. Wide-band ultrasonic signals are formed in the composite structure when radiation of the pulsed laser beam is absorbed by the composite structure. The wide-band ultrasonic signals are detected to form data. The data is processed to identify a local frequency value for the composite structure. The local frequency value is used to determine if local material changes are present in the number of composite materials.

Type: Grant

Filed: March 15, 2016

Date of Patent: February 25, 2020

Assignee: The Boeing Company

Inventors: William P. Motzer, Gary Ernest Georgeson, Jill Paisley Bingham, Steven Kenneth Brady, Alan F. Stewart, James C. Kennedy, Ivan Pelivanov, Matthew O'Donnell, Jeffrey Reyner Kollgaard
QUALITY OF USER GENERATED AUDIO CONTENT IN VOICE APPLICATIONS

Publication number: 20130030813

Abstract: Methods and arrangements for improving quality of content in voice applications. A specification is provided for acceptable content for a voice application, and user generated audio content for the voice application is inputted. At least one test is applied to the user generated audio content, and it is thereupon determined as to whether the user generated audio content meets the provided specification.

Type: Application

Filed: August 29, 2012

Publication date: January 31, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Nitendra Rajput, Kundan Shrivastava
NOISE POWER ESTIMATION SYSTEM, NOISE POWER ESTIMATING METHOD, SPEECH RECOGNITION SYSTEM AND SPEECH RECOGNIZING METHOD

Publication number: 20120095753

Abstract: A noise power estimation system for estimating noise power of each frequency spectral component includes a cumulative histogram generating section for generating a cumulative histogram for each frequency spectral component of a time series signal, in which the horizontal axis indicates index of power level and the vertical axis indicates cumulative frequency and which is weighted by exponential moving average; and a noise power estimation section for determining an estimated value of noise power for each frequency spectral component of the time series signal based on the cumulative histogram.

Type: Application

Filed: September 14, 2011

Publication date: April 19, 2012

Applicant: HONDA MOTOR CO., LTD.

Inventors: Hirofumi NAKAJIMA, Kazuhiro NAKADAI, Yuji HASEGAWA
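
The cumulative-histogram estimator described in the abstract above can be sketched as follows. This is a minimal illustration under assumptions: the class name, the dB level grid (`level_min_db`, `level_step_db`, `n_levels`), the smoothing factor `alpha`, and the use of a fixed `quantile` of the cumulative histogram as the noise level are hypothetical choices, not details taken from the application.

```python
import numpy as np


class HistogramNoiseEstimator:
    """Per-frequency noise power estimate from an EMA-weighted histogram."""

    def __init__(self, n_freq, level_min_db=-100.0, level_step_db=0.5,
                 n_levels=400, alpha=0.01, quantile=0.5):
        self.level_min_db = level_min_db
        self.level_step_db = level_step_db
        self.n_levels = n_levels
        self.alpha = alpha
        self.quantile = quantile
        self.hist = np.zeros((n_freq, n_levels))  # one histogram per frequency
        self.rows = np.arange(n_freq)

    def update(self, power):
        """Add one frame of spectral power values and return noise estimates."""
        power = np.maximum(np.asarray(power, dtype=float), 1e-12)
        level_db = 10.0 * np.log10(power)
        idx = np.floor((level_db - self.level_min_db) / self.level_step_db)
        idx = np.clip(idx.astype(int), 0, self.n_levels - 1)
        # Exponential moving average: old counts decay, the new frame adds alpha.
        self.hist *= 1.0 - self.alpha
        self.hist[self.rows, idx] += self.alpha
        # Cumulative histogram over power-level index, per frequency component.
        cumulative = np.cumsum(self.hist, axis=1)
        target = self.quantile * cumulative[:, -1]
        noise_idx = np.argmax(cumulative >= target[:, None], axis=1)
        noise_db = self.level_min_db + self.level_step_db * noise_idx
        return 10.0 ** (noise_db / 10.0)
```

Because the estimate is read off a quantile of the weighted histogram, short bursts of high power (such as speech) move it far less than they would move a running mean.
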
METHOD, SYSTEM, AND MEDIA GATEWAY FOR REPORTING MEDIA INSTANCE INFORMATION

Publication number: 20110264446

Abstract: A method, a system, and a media gateway (MG) for reporting media instance information are disclosed. The method for reporting media instance information includes: detecting, by an MG, received media data according to a set media instance detection (MID) event; and reporting, by the MG, the MID event when the media instance information is detected. With the present invention, the MG reports the detected media instance information related to the media data to a media gateway controller (MGC) through a set MID event, so that the MG can detect media instance information related to the media data, and report the detected media instance information related to the media data to the MGC. In this way, the MGC can execute corresponding control operations according to the media instance information related to the media data, extending the applicable scope of media services.

Type: Application

Filed: July 7, 2011

Publication date: October 27, 2011

Inventor: Weiwei YANG
Method and apparatus for recovering line spectrum pair parameter and speech decoding apparatus using same

Publication number: 20100191523

Abstract: A method and an apparatus for recovering a line spectrum pair (LSP) parameter of a spectrum region when frame loss occurs during speech decoding and a speech decoding apparatus adopting the same are provided. The method of recovering an LSP parameter in speech decoding includes: if it is determined that a received speech packet has an erased frame, converting an LSP parameter of a previous good frame (PGF) of the erased frame or LSP parameters of the PGF and a next good frame (NGF) of the erased frame into a spectrum region and obtaining a spectrum envelope of the PGF or spectrum envelopes of the PGF and NGF; recovering a spectrum envelope of the erased frame using the spectrum envelope of the PGF or the spectrum envelopes of the PGF and NGF; and converting the recovered spectrum envelope of the erased frame into an LSP parameter of the erased frame.

Type: Application

Filed: March 25, 2010

Publication date: July 29, 2010

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Hosang Sung, Seungho Choi, Kihyun Choo
SOUND SIGNAL GENERATING METHOD, SOUND SIGNAL GENERATING DEVICE, AND RECORDING MEDIUM

Publication number: 20100145690

Abstract: A sound signal generating method includes: generating, using a computer, a plurality of unit waveform signals by dividing the original sound signal having a periodic length of repeating similar waveforms by the length of the waveform; generating, using a computer, a repetitive waveform signal for each of the generated unit waveform signals by repeating the waveform of the unit waveform signal a given number of times; and generating, using a computer, an output sound signal by shifting each of the repetitive waveform signals in each length with a sequence in which the unit waveform signals form the original sound signal and then superimposing on one another.

Type: Application

Filed: February 10, 2010

Publication date: June 10, 2010

Applicant: FUJITSU LIMITED

Inventor: Kazuhiro Watanabe
System for Voice-Based Interaction on Web Pages

Publication number: 20100094635

Abstract: SYSTEM FOR VOICE-BASED INTERACTION ON WEB PAGES, of type that permits the incorporation of voice-handling functions on a Web page, in which from a Terminal (1) a Web page (3) of a Web site that is structured under the DOM (Document Object Model), or any of its extensions, and a networked Voice Service Server (5), by means of a downloadable module (6) for further incorporation in a Web browser, the system including the operating procedures for enabling said module to act as a transparent gateway in a dialogue between said Voice Service Server (5) and said Web page (3), said Web browser permitting to handle said Voice Services of said Server (5) through script functions incorporated in said Web page (3).

Type: Application

Filed: November 30, 2007

Publication date: April 15, 2010

Inventor: Juan Jose Bermudez Perez
METHOD AND SYSTEM FOR THE INTEGRAL AND DIAGNOSTIC ASSESSMENT OF LISTENING SPEECH QUALITY

Publication number: 20090099843

Abstract: A method for determining a speech quality measure of an output speech signal with respect to an input speech signal, wherein the input signal passes through a signal path of a data transmission system resulting in the output signal, includes the steps of pre-processing the output signal; determining at least one of an interruption rate of the pre-processed output signal and a measure for an intensity of musical tones present in the pre-processed output signal; and determining the speech quality measure from at least one of the interruption rate and the measure for the intensity of the musical tones.

Type: Application

Filed: September 11, 2008

Publication date: April 16, 2009

Applicants: DEUTSCHE TELEKOM AG, FRANCE TELECOM, TECHNISCHE UNIVERSITAET BERLIN

Inventors: Vincent Barriac, Nicolas Cote, Valerie Gautier-Turbin, Sebastian Moeller, Alexander Raake, Marcel Waeltermann, Ulrich Heute, Kirstin Scholz
Positive affirmation bear

Publication number: 20080228491

Abstract: The technical disclosure of my invention is the pre-recorded sayings inside a sound module housed in a collectable plush bear, and also the fact that both pre-recorded and recordable sound modules will be used with specific wording that will be used in the pre-recorded left hand of the plush bear. The pre-recorded sayings are as follows: I can do all things through Christ who strengthens me. When you believe you can achieve. Everything I touch turns to gold. What you dream about you bring about.

Type: Application

Filed: March 12, 2007

Publication date: September 18, 2008

Inventor: Nancy C. Hale
TECHNIQUE FOR ACCURATELY DETECTING SYSTEM FAILURE

Publication number: 20080215325

Abstract: An apparatus, method and program for dividing a conversational dialog into utterance. The apparatus includes a computer processor; a word database for storing spellings and pronunciations of words; a grammar database for storing syntactic rules on words; a pause detecting section which detects a pause location in a channel making a main speech among conversational dialogs inputted in at least two channels; an acknowledgement detecting section which detects an acknowledgement location in a channel not making the main speech; a boundary-candidate extracting section which extracts boundary candidates in the main speech, by extracting pauses existing within a predetermined range before and after a base point that is the acknowledgement location; and a recognizing unit which outputs a word string of the main speech segmented by one of the extracted boundary candidates after dividing the segmented speech into optimal utterance in reference to the word database and grammar database.

Type: Application

Filed: December 27, 2007

Publication date: September 4, 2008

Inventors: Hiroshi Horii, Hideki Tai, Gaku Yamamoto
METHOD AND DEVICE FOR CLASSIFYING SPOKEN LANGUAGE IN SPEECH DIALOG SYSTEMS

Publication number: 20080162146

Abstract: A method and device are provided for classifying at least two languages in an automatic dialogue system, which processes digitized speech input. At least one speech recognition method and at least one language identification method are used on the digitized speech input in order, by logical evaluation of the results of the method, to identify the language of the speech input.

Type: Application

Filed: December 3, 2007

Publication date: July 3, 2008

Applicant: Deutsche Telekom AG

Inventors: Martin Eckert, Roman Englert, Wiebke Johannsen, Fred Runge, Markus Van Ballegooy
Methods and apparatuses for dynamically adjusting an audio signal based on a parameter

Publication number: 20080120115

Abstract: In one embodiment, the methods and apparatuses detect an original audio signal; detect a sound model wherein the sound model includes a sound parameter; transform the original audio signal based on the parameter, thereby forming a transformed audio signal; and compare the transformed audio signal with the original audio signal.

Type: Application

Filed: November 16, 2006

Publication date: May 22, 2008

Inventor: Xiao Dong Mao