Neural Network Patents (Class 704/202)
-
Patent number: 10194203
Abstract: A multimodal and real-time method for filtering sensitive content, receiving as input a digital video stream, the method including: segmenting the digital video into video fragments along the video timeline; extracting features containing significant information about sensitive media from the digital video input; reducing the semantic difference between each of the low-level video features and the high-level sensitive concept; classifying the video fragments, generating a high-level label (positive or negative) with a confidence score for each fragment representation; performing high-level fusion to properly match the possible high-level labels and confidence scores for each fragment; and predicting the sensitive time by combining the labels of the fragments along the video timeline, indicating the moments when the content becomes sensitive.
Type: Grant
Filed: June 30, 2016
Date of Patent: January 29, 2019
Assignees: SAMSUNG ELETRÔNICA DA AMAZÔNIA LTDA., UNIVERSIDADE ESTADUAL DE CAMPINAS
Inventors: Sandra Avila, Daniel Moreira, Mauricio Perez, Daniel Moraes, Vanessa Testoni, Siome Goldenstein, Eduardo Valle, Anderson Rocha
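The last step of the abstract, combining per-fragment labels along the timeline into sensitive moments, can be sketched as follows. This is a minimal illustration, not the patented method; the fragment tuple layout and the merge rule are assumptions.

```python
def sensitive_moments(fragments):
    """fragments: list of (start_s, end_s, label, confidence) tuples, one per
    video fragment in timeline order; returns merged spans labeled positive."""
    spans = []
    for start, end, label, conf in fragments:
        if label != "positive":
            continue
        if spans and start <= spans[-1][1]:   # touches/overlaps previous span
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return spans

# two positive fragments that touch are merged into one sensitive span
print(sensitive_moments([(0, 5, "positive", 0.9), (5, 10, "negative", 0.8),
                         (10, 15, "positive", 0.7), (15, 20, "positive", 0.6)]))
```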
-
Patent number: 10095768
Abstract: The disclosed computer-implemented method for aggregating information-asset classifications may include (1) identifying a data collection that includes two or more information assets, (2) identifying a classification for each of the information assets, (3) deriving, based at least in part on the classifications of the information assets, an aggregate classification for the data collection, and (4) associating the aggregate classification with the data collection to enable a data management system to enforce a data management policy based on the aggregate classification. Various other methods, systems, and computer-readable media are also disclosed.
Type: Grant
Filed: November 14, 2014
Date of Patent: October 9, 2018
Assignee: Veritas Technologies LLC
Inventor: Robert Koeten
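A minimal sketch of steps (2) through (4), assuming a hypothetical severity ranking; the patent does not specify how the aggregate is derived, and taking the most restrictive classification is just one plausible policy:

```python
# Assumed severity ranking and class names -- not from the patent.
SEVERITY = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}

def aggregate_classification(asset_classifications):
    """Derive an aggregate classification for a data collection by taking
    the most restrictive classification among its information assets."""
    return max(asset_classifications, key=lambda c: SEVERITY[c])

collection = ["public", "confidential", "internal"]
print(aggregate_classification(collection))   # the most restrictive wins
```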
-
Patent number: 10032461
Abstract: An apparatus includes microphone receivers configured to receive microphone signals from a plurality of microphones. A comparator is configured to determine, for each microphone signal, a speech similarity indication indicative of a similarity between the microphone signal and non-reverberant speech. The determination is made in response to a comparison of a property derived from the microphone signal to a reference property for non-reverberant speech. In some embodiments, the comparator is configured to determine the similarity indication by comparing against reference properties for speech samples of a set of non-reverberant speech samples. A generator is configured to generate a speech signal by combining the microphone signals in response to the similarity indications. The apparatus may be distributed over a plurality of devices each containing a microphone, and the approach may determine the microphone best suited for generating the speech signal.
Type: Grant
Filed: February 18, 2014
Date of Patent: July 24, 2018
Assignee: KONINKLIJKE PHILIPS N.V.
Inventor: Sriram Srinivasan
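As a rough illustration of the idea, assuming a toy "property" vector per microphone and an inverse-distance similarity (the patent fixes neither choice), a similarity-weighted combination might look like:

```python
import numpy as np

def similarity_indications(mic_props, reference_prop):
    """Per-microphone similarity to the non-reverberant reference property:
    smaller distance to the reference => higher similarity (assumed mapping)."""
    d = np.linalg.norm(np.asarray(mic_props) - np.asarray(reference_prop), axis=1)
    return 1.0 / (1.0 + d)

def combine_signals(mic_signals, sims):
    """Combine microphone signals weighted by their similarity indications."""
    w = np.asarray(sims) / np.sum(sims)
    return (w[:, None] * np.asarray(mic_signals)).sum(axis=0)

sims = similarity_indications([[0.0, 0.0], [3.0, 4.0]], [0.0, 0.0])
out = combine_signals([[1.0, 1.0], [0.0, 0.0]], sims)
```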
-
Patent number: 10026396
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving a sequence representing an utterance, the sequence comprising a plurality of audio frames; determining one or more warping factors for each audio frame in the sequence using a warping neural network; applying, for each audio frame, the one or more warping factors for the audio frame to the audio frame to generate a respective modified audio frame, wherein the applying comprises using at least one of the warping factors to scale a respective frequency of the audio frame to a new respective frequency in the respective modified audio frame; and decoding the modified audio frames using a decoding neural network, wherein the decoding neural network is configured to output a word sequence that is a transcription of the utterance.
Type: Grant
Filed: July 27, 2016
Date of Patent: July 17, 2018
Assignee: Google LLC
Inventor: Andrew W. Senior
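The warping step, scaling each frequency to a new frequency by a warping factor, might be sketched for a single frame's magnitude spectrum as follows. Linear interpolation is an assumption, and the warping neural network that produces the factor is omitted:

```python
import numpy as np

def warp_frame(spectrum, alpha):
    """Frequency-warp one frame: content at frequency bin f moves to bin
    alpha*f, implemented by reading each output bin k from source bin k/alpha
    with linear interpolation (bins outside the frame are zero-filled)."""
    n = len(spectrum)
    src = np.arange(n) / alpha
    return np.interp(src, np.arange(n), spectrum, left=0.0, right=0.0)

s = np.array([0.0, 2.0, 4.0, 6.0])
print(warp_frame(s, 2.0))   # spectrum stretched by a factor of 2
```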
-
Patent number: 9953638
Abstract: A computer-implemented method is described for front end speech processing for automatic speech recognition. A sequence of speech features which characterize an unknown speech input provided on an audio input channel, and associated meta-data which characterize the audio input channel, are received. The speech features are transformed with a computer process that uses a trained mapping function controlled by the meta-data, and automatic speech recognition is performed on the transformed speech features.
Type: Grant
Filed: June 28, 2012
Date of Patent: April 24, 2018
Assignee: Nuance Communications, Inc.
Inventors: Daniel Willett, Karl Jonas Lööf, Yue Pan, Joel Pinto, Christian Gollan
-
Patent number: 9875747
Abstract: A sensor device may include a computing device in communication with multiple microphones. A neural network executing on the computing device may receive audio signals from each microphone. One microphone signal may serve as a reference signal. The neural network may extract differences in signal characteristics of the other microphone signals as compared to the reference signal. The neural network may combine these signal differences into a lossy compressed signal. The sensor device may transmit the lossy compressed signal and the lossless reference signal to a remote neural network executing in a cloud computing environment for decompression and sound recognition analysis.
Type: Grant
Filed: July 15, 2016
Date of Patent: January 23, 2018
Assignee: GOOGLE LLC
Inventors: Chanwoo Kim, Rajeev Conrad Nongpiur, Tara Sainath
-
Patent number: 9785313
Abstract: A method and system for providing a distraction free reading mode with an electronic personal display is disclosed. One example accesses non-adjustable settings for a reader mode. In addition, user adjustable settings for the reader mode on the electronic personal display are also accessed. The user adjustable settings and the non-adjustable settings are then implemented when the reader mode is initiated.
Type: Grant
Filed: June 28, 2013
Date of Patent: October 10, 2017
Assignee: RAKUTEN KOBO, INC.
Inventors: James Wu, Peter James Farmer, Michael Serbinis, Pamela Lynn Hilborn
-
Patent number: 9721202
Abstract: Sound processing techniques using recurrent neural networks are described. In one or more implementations, temporal dependencies are captured in sound data that are modeled through use of a recurrent neural network (RNN). The captured temporal dependencies are employed as part of feature extraction performed using nonnegative matrix factorization (NMF). One or more sound processing techniques are performed on the sound data based at least in part on the feature extraction.
Type: Grant
Filed: February 21, 2014
Date of Patent: August 1, 2017
Assignee: Adobe Systems Incorporated
Inventors: Nicolas Maurice Boulanger-Lewandowski, Gautham J. Mysore, Matthew Douglas Hoffman
-
Patent number: 9672207
Abstract: A method, system, and non-transitory computer-readable medium for determining and discerning items with multiple meanings in a sequence of items, including: producing a distributed representation for each item of the sequence of items, including a word vector and a context vector; partitioning the sequence of items into classes; for an item, using a representative word vector of each class, calculating a cosine distance between the word vector of said item and the class representative vector; and producing a new sequence of items by modifying the distributed representation, replacing each occurrence of an item depending on the calculated cosine distance.
Type: Grant
Filed: October 19, 2015
Date of Patent: June 6, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Oded Shmueli
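A minimal sketch of the cosine-distance step, with a hypothetical relabeling rule; the threshold and the treatment of ambiguous items are assumptions, not the patent's:

```python
import numpy as np

def cosine_distance(u, v):
    """1 - cosine similarity between two word vectors."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def assign_class(word_vec, class_reps, threshold=0.5):
    """Assign the item to the nearest class representative vector, or
    return None (flag as ambiguous) if even the nearest is too far."""
    dists = [cosine_distance(word_vec, rep) for rep in class_reps]
    k = int(np.argmin(dists))
    return k if dists[k] <= threshold else None

reps = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(assign_class(np.array([0.9, 0.1]), reps))   # nearest class index
```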
-
Patent number: 9659579
Abstract: The present invention relates to a method of evaluating intelligibility of a degraded speech signal received from an audio transmission system conveying a reference signal. The method comprises sampling said reference and degraded signal into frames, and forming frame pairs. For each pair one or more difference functions representing a difference between the degraded and reference signal are provided. A difference function is selected and compensated for different disturbance types, such as to provide a disturbance density function adapted to human auditory perception. An overall quality parameter is determined indicative of the intelligibility of the degraded signal. The method comprises determining a switching parameter indicative of audio power level of said degraded signal, for performing said selecting.
Type: Grant
Filed: November 15, 2012
Date of Patent: May 23, 2017
Assignee: Nederlandse Organisatie voor toegepast-natuurwetenschappelijk onderzoek TNO
Inventor: John Gerard Beerends
-
Patent number: 9342781
Abstract: We describe a signal processor, the signal processor comprising: a probability vector generation system, wherein said probability vector generation system has an input to receive a category vector for a category of output example and an output to provide a probability vector for said category of output example, wherein said output example comprises a set of data points, and wherein said probability vector defines a probability of each of said set of data points for said category of output example; a memory storing a plurality of said category vectors, one for each of a plurality of said categories of output example; and a stochastic selector to select a said stored category of output example for presentation of the corresponding category vector to said probability vector generation system; wherein said signal processor is configured to output data for an output example corresponding to said selected stored category.
Type: Grant
Filed: June 24, 2013
Date of Patent: May 17, 2016
Assignee: Google Inc.
Inventors: Julien Robert Michel Cornebise, Danilo Jimenez Rezende, Daniël Pieter Wierstra
-
Patent number: 9336775
Abstract: A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods for Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial distance of a Gaussian is already too small, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. The partial distance elimination occurs when less than all of the dimensions in the group are sequentially added.
Type: Grant
Filed: March 5, 2013
Date of Patent: May 10, 2016
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Jinyu Li, Zhijie Yan, Qiang Huo, Yifan Gong
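The partial distance elimination idea, accumulating the distance a few dimensions at a time and stopping early, can be sketched as follows. The diagonal-covariance squared distance, the group size, and the pruning direction (pruning once the accumulated distance exceeds a threshold, i.e. once the log likelihood is guaranteed to be too small) are assumptions for illustration:

```python
import numpy as np

def partial_distance_eliminate(x, mean, threshold, group=4):
    """Accumulate the squared distance between feature x and a Gaussian
    mean a few dimensions at a time; return None (posterior set to zero)
    as soon as the partial distance already exceeds the threshold."""
    dist = 0.0
    for start in range(0, len(x), group):
        dist += float(np.sum((x[start:start + group] - mean[start:start + group]) ** 2))
        if dist > threshold:
            return None          # pruned before adding all dimensions
    return dist

print(partial_distance_eliminate(np.zeros(8), np.zeros(8), 1.0))
```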
-
Publication number: 20150127327
Abstract: The technology described herein can be embodied in a method that includes receiving an audio signal encoding a portion of an utterance, and providing, to a first neural network, data corresponding to the audio signal. The method also includes generating, by a processor, data representing a transcription for the utterance based on an output of the first neural network. The first neural network is trained using features of multiple context-dependent states, the context-dependent states being derived from a plurality of context-independent states provided by a second neural network.
Type: Application
Filed: May 20, 2014
Publication date: May 7, 2015
Applicant: Google Inc.
Inventors: Michiel A.U. Bacchiani, David Rybach
-
Patent number: 9020812
Abstract: Disclosed is an audio signal processing method comprising the steps of: receiving an audio signal containing current frame data; generating a first temporary output signal for the current frame when an error occurs in the current frame data, by carrying out frame error concealment with respect to the current frame data using a random codebook; generating a parameter by carrying out one or more of short-term prediction, long-term prediction and a fixed codebook search based on the first temporary output signal; and updating the memory with the parameter for the next frame; wherein the parameter comprises one or more of pitch gain, pitch delay, fixed codebook gain and a fixed codebook.
Type: Grant
Filed: November 24, 2010
Date of Patent: April 28, 2015
Assignees: LG Electronics Inc., Industry-Academic Cooperation Foundation, Yonsei University
Inventors: Hye Jeong Jeon, Dae Hwan Kim, Hong Goo Kang, Min Ki Lee, Byung Suk Lee, Gyu Hyeok Jeong
-
Publication number: 20150039299
Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.
Type: Application
Filed: September 18, 2013
Publication date: February 5, 2015
Applicant: Google Inc.
Inventors: Eugene Weinstein, Pedro J. Moreno Mengibar, Johan Schalkwyk
-
Patent number: 8930183
Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice; dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input, and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
Type: Grant
Filed: August 25, 2011
Date of Patent: January 6, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Mark John Francis Gales
-
Patent number: 8924200
Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.
Type: Grant
Filed: September 28, 2011
Date of Patent: December 30, 2014
Assignee: Motorola Mobility LLC
Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
-
Patent number: 8924199
Abstract: A voice correction device includes a detector that detects a response from a user, a calculator that calculates an acoustic characteristic amount of an input voice signal, an analyzer that outputs an acoustic characteristic amount of a predetermined amount when having acquired a response signal due to the response from the detector, a storage unit that stores the acoustic characteristic amount output by the analyzer, a controller that calculates a correction amount of the voice signal on the basis of a result of a comparison between the acoustic characteristic amount calculated by the calculator and the acoustic characteristic amount stored in the storage unit, and a correction unit that corrects the voice signal on the basis of the correction amount calculated by the controller.
Type: Grant
Filed: December 20, 2011
Date of Patent: December 30, 2014
Assignee: Fujitsu Limited
Inventors: Chisato Ishikawa, Takeshi Otani, Taro Togawa, Masanao Suzuki, Masakiyo Tanaka
-
Patent number: 8924220
Abstract: In a multiband compressor (100), a level calculation unit (121) calculates a signal level inputted for each of the bands, a gain calculation unit (122) calculates a gain value from the calculated signal level, and a gain limitation unit (130) limits a gain value by comparison with the gain value of the other bands in the compressor for each band. With this configuration, provided is a multiband compressor capable of achieving a balance, at a high level, between the quality of sound and the effect of enhancing the sound level.
Type: Grant
Filed: September 7, 2010
Date of Patent: December 30, 2014
Assignee: Lenovo Innovations Limited (Hong Kong)
Inventor: Satoshi Hosokawa
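A toy version of the gain limitation step, in which each band's gain is capped relative to the other bands' gains. The `max_ratio` rule is an assumption; the abstract does not disclose the exact comparison:

```python
def limit_gains(gains, max_ratio=2.0):
    """Limit each band's gain by comparison with the other bands: no band
    may be boosted more than max_ratio times the smallest band gain, which
    keeps loudness enhancement from distorting the inter-band balance."""
    floor = min(gains)
    return [min(g, floor * max_ratio) for g in gains]

print(limit_gains([1.0, 4.0, 1.5], max_ratio=2.0))   # the 4.0 gain is capped
```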
-
Publication number: 20140358526
Abstract: A non-intrusive objective speech quality assessment is performed on a degraded speech signal. The methods are well suited for systems where random and bursty packet losses may occur and/or packet stream regeneration may occur prior to speech signal quality assessment. In one embodiment received packetized speech is analyzed to determine an overall final signal quality score. A limited set of trained neural networks, e.g., 5, corresponding to different signal features, each determine a signal feature quality score. A trained joint quality score determination module determines a joint quality score based on the signal feature quality scores. Packet loss is estimated based on received packet header information and/or detected gap durations. The determined joint quality score is adjusted, based on estimated packet loss information obtained from examining the speech signal, network level statistics and/or codec parameters, to generate the final quality score.
Type: Application
Filed: May 31, 2013
Publication date: December 4, 2014
Inventors: Ahmed Abdelal, Vladimir Botchev, Wassim Matragi
-
Patent number: 8886612
Abstract: An apparatus for providing improved data compression may include an encoder comprising a quantizer for encoding input data and a side model. The quantizer may be trained with respect to high priority data among the input data and may be configured to partially encode the input data by encoding the high priority data. The side model may be trained jointly with the training of the quantizer and is configured to model low priority data among the input data.
Type: Grant
Filed: October 4, 2007
Date of Patent: November 11, 2014
Assignee: Core Wireless Licensing S.a.r.l.
Inventors: Jani K. Nurminen, Sakari Himanen
-
Publication number: 20140278379
Abstract: In one implementation, a computer-implemented method includes receiving, at a computer system, a request to predict a next word in a dialog being uttered by a speaker; accessing, by the computer system, a neural network comprising i) an input layer, ii) one or more hidden layers, and iii) an output layer; identifying the local context for the dialog of the speaker; selecting, by the computer system and using a semantic model, at least one vector that represents the semantic context for the dialog; applying input to the input layer of the neural network, the input comprising i) the local context of the dialog and ii) the values for the at least one vector; generating probability values for at least a portion of the candidate words; and providing, by the computer system and based on the probability values, information that identifies one or more of the candidate words.
Type: Application
Filed: April 16, 2013
Publication date: September 18, 2014
Applicant: Google Inc.
Inventors: Noah B. Coccaro, Patrick An Phu Nguyen
-
Method and apparatus of transforming speech feature vectors using an auto-associative neural network
Patent number: 8838446
Abstract: Provided is a method and apparatus for transforming a speech feature vector. The method includes extracting a feature vector required for speech recognition from a speech signal and transforming the extracted feature vector using an auto-associative neural network (AANN).
Type: Grant
Filed: August 31, 2007
Date of Patent: September 16, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: So-young Jeong, Kwang-cheol Oh, Jae-hoon Jeong, Jeong-su Kim
-
Patent number: 8762142
Abstract: Provided are a multi-stage speech recognition apparatus and method. The multi-stage speech recognition apparatus includes a first speech recognition unit performing initial speech recognition on a feature vector, which is extracted from an input speech signal, and generating a plurality of candidate words; and a second speech recognition unit rescoring the candidate words, which are provided by the first speech recognition unit, using a temporal posterior feature vector extracted from the speech signal.
Type: Grant
Filed: August 15, 2007
Date of Patent: June 24, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: So-young Jeong, Kwang-cheol Oh, Jae-hoon Jeong, Jeong-su Kim
-
Patent number: 8762158
Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
Type: Grant
Filed: August 5, 2011
Date of Patent: June 24, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
-
Publication number: 20140142929
Abstract: The use of a pipelined algorithm that performs parallelized computations to train deep neural networks (DNNs) for performing data analysis may reduce training time. The DNNs may be one of context-independent DNNs or context-dependent DNNs. The training may include partitioning training data into sample batches of a specific batch size. The partitioning may be performed based on rates of data transfers between processors that execute the pipelined algorithm, considerations of accuracy and convergence, and the execution speed of each processor. Other techniques for training may include grouping layers of the DNNs for processing on a single processor, distributing a layer of the DNNs to multiple processors for processing, or modifying an execution order of steps in the pipelined algorithm.
Type: Application
Filed: November 20, 2012
Publication date: May 22, 2014
Applicant: MICROSOFT CORPORATION
Inventors: Frank Torsten Bernd Seide, Gang Li, Dong Yu, Adam C. Eversole, Xie Chen
-
Patent number: 8606571
Abstract: The present technology provides noise reduction of an acoustic signal using a configurable classification threshold which provides a sophisticated level of control to balance the tradeoff between positional robustness and noise reduction robustness. The configurable classification threshold corresponds to a configurable spatial region, such that signals arising from sources within the configurable spatial region are preserved, and signals arising from sources outside it are rejected. In embodiments, the configurable classification threshold can be automatically and dynamically adjusted in real-time based on evaluated environmental conditions surrounding an audio device implementing the noise reduction techniques described herein.
Type: Grant
Filed: July 15, 2010
Date of Patent: December 10, 2013
Assignee: Audience, Inc.
Inventors: Mark Every, Carlo Murgia
-
Patent number: 8589334
Abstract: Methods and systems are provided for developing decision information relating to a single system based on data received from a plurality of sensors. The method includes receiving first data from a first sensor that defines first information of a first type that is related to a system, receiving second data from a second sensor that defines second information of a second type that is related to said system, wherein the first type is different from the second type, generating a first decision model, a second decision model, and a third decision model, determining whether data is available from only the first sensor, only the second sensor, or both the first and second sensors, and selecting, based on the determination of availability, an additional model to apply to the available data, wherein the additional model is selected from a plurality of additional decision models including the third decision model.
Type: Grant
Filed: January 18, 2011
Date of Patent: November 19, 2013
Assignee: Telcordia Technologies, Inc.
Inventor: Akshay Vashist
-
Patent number: 8577045
Abstract: An encoding apparatus comprises a frame processor (105) which receives a multi channel audio signal comprising at least a first audio signal from a first microphone (101) and a second audio signal from a second microphone (103). An ITD processor (107) then determines an inter time difference between the first audio signal and the second audio signal, and a set of delays (109, 111) generates a compensated multi channel audio signal from the multi channel audio signal by delaying at least one of the first and second audio signals in response to the inter time difference signal. A combiner (113) then generates a mono signal by combining channels of the compensated multi channel audio signal, and a mono signal encoder (115) encodes the mono signal. The inter time difference may specifically be determined by an algorithm based on determining cross correlations between the first and second audio signals.
Type: Grant
Filed: September 9, 2008
Date of Patent: November 5, 2013
Assignee: Motorola Mobility LLC
Inventor: Jonathan A. Gibbs
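The cross-correlation-based inter time difference estimate might be sketched as follows, restricted to non-negative delays for simplicity (the patent's actual algorithm is not disclosed in the abstract):

```python
import numpy as np

def inter_time_difference(x, y, max_lag):
    """Estimate how many samples y is delayed relative to x by maximizing
    the cross-correlation sum_n x[n] * y[n + d] over candidate delays d."""
    best_d, best_c = 0, -np.inf
    for d in range(0, max_lag + 1):
        c = float(np.dot(x[:len(x) - d], y[d:]))
        if c > best_c:
            best_d, best_c = d, c
    return best_d

x = np.array([0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0])  # x delayed by 2
print(inter_time_difference(x, y, 4))
```

The estimated delay would then drive the compensating delays before the channels are combined into the mono signal.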
-
Patent number: 8577821
Abstract: A method and apparatus for encoding the amplitude of frequency signals into a temporal electrical spike pattern, and a method for decoding the generated pattern by recognizing and classifying it with a correlating (associative pattern recognition) neuron-like device. The first part of the system takes signals and converts their amplitude into a set of timed pulses. The timing of each pulse is a function of the amplitude of the incoming signal. Once a series of timed pulses has been created, they can be sent to a decoder. The decoder circuit uses bandpass filters, delays/gains, and a low-pass summation. The recognition portion compares the timed input spikes with a known sample. Only when the input signals match the sample signals will the amplitudes be sufficient to meet the trigger threshold. When the output spike is generated, a pattern match has been found.
Type: Grant
Filed: April 16, 2010
Date of Patent: November 5, 2013
Inventor: Thomas D. Humphrey
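A minimal sketch of the encoding and matching ideas; the linear amplitude-to-time mapping and the tolerance-based comparison are illustrative assumptions, not the patented circuit:

```python
def amplitude_to_spike_time(amplitude, a_max, window=10.0):
    """Encode an amplitude as a spike time within a window: larger
    amplitudes fire earlier (time decreases linearly with amplitude)."""
    return window * (1.0 - amplitude / a_max)

def matches(input_times, sample_times, tol=0.5):
    """Recognition stand-in: the timed input spikes match a known sample
    only if every spike lands within a small tolerance of the sample's."""
    return all(abs(i - s) <= tol for i, s in zip(input_times, sample_times))

print(amplitude_to_spike_time(5.0, 5.0))   # maximal amplitude fires first
print(matches([0.0, 3.0], [0.2, 3.1]))
```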
-
Publication number: 20130262096
Abstract: A system-effected method for synthesizing speech, or recognizing speech including a sequence of expressive speech utterances. The method can be computer-implemented and can include system-generating a speech signal embodying the sequence of expressive speech utterances. Other possible steps include: system-marking the speech signal with a pitch marker indicating a pitch change at or near a first zero amplitude crossing point of the speech signal following a glottal closure point, at a minimum, at a maximum or at another location; system-marking the speech signal with at least one further pitch marker; system-aligning a sequence of prosodically marked text with the pitch-marked speech signal according to the pitch markers; and system-outputting the aligned text or the aligned speech signal, respectively. Computerized systems, and stored programs for implementing method embodiments of the invention, are also disclosed.
Type: Application
Filed: September 21, 2012
Publication date: October 3, 2013
Applicant: LESSAC TECHNOLOGIES, INC.
Inventors: Reiner WILHELMS-TRICARICO, Brian MOTTERSHEAD, Rattima NITISAROJ, Michael BAUMGARTNER, John B. REICHENBACH, Gary A. MARPLE
-
Patent number: 8494840
Abstract: The invention relates to audio signal processing and speech enhancement. In accordance with one aspect, the invention combines a high-quality audio program that is a mix of speech and non-speech audio with a lower-quality copy of the speech components contained in the audio program for the purpose of generating a high-quality audio program with an increased ratio of speech to non-speech audio, such as may benefit the elderly, hearing impaired or other listeners. Aspects of the invention are particularly useful for television and home theater sound, although they may be applicable to other audio and sound applications. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.
Type: Grant
Filed: February 12, 2008
Date of Patent: July 23, 2013
Assignee: Dolby Laboratories Licensing Corporation
Inventor: Hannes Muesch
-
Patent number: 8463720
Abstract: A method for defining a network of nodes is provided, each representing a unique concept, and making connections between individual concepts through unique relationships to other concepts. Each of the nodes is operable to store a unique identifier in the network and information regarding the concept in addition to the unique relationships.
Type: Grant
Filed: March 26, 2010
Date of Patent: June 11, 2013
Assignee: Neuric Technologies, LLC
Inventors: Jennifer Seale, Hannah Lindsley, Timothy Allen Margheim
-
Patent number: 8428953
Abstract: An audio decoding device of the present invention includes: a decoding unit decoding a stream to a spectrum coefficient, and outputting stream information when a frame included in the stream cannot be decoded; an orthogonal transformation unit transforming the spectrum coefficient to a time signal; a correction unit generating a correction time signal, when the decoding unit outputs the stream information, based on an output waveform within a reference section that is in a section that overlaps between an error frame section to which the stream information is outputted and an adjacent frame section, and that is a section in the middle of the adjacent frame section; and an output unit generating the output waveform by synthesizing the correction time signal and the time signal.
Type: Grant
Filed: May 20, 2008
Date of Patent: April 23, 2013
Assignee: Panasonic Corporation
Inventors: Kojiro Ono, Takeshi Norimatsu, Yoshiaki Takagi, Takashi Katayama
-
Patent number: 8428946
Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
Type: Grant
Filed: July 6, 2012
Date of Patent: April 23, 2013
Assignee: Google Inc.
Inventor: Marco Paniconi
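As a stand-in for the layered network model (which the abstract does not detail), per-bin fusion across channels could be as simple as a weighted average of per-channel speech probabilities:

```python
import numpy as np

def fuse_speech_probability(channel_probs, weights=None):
    """Fuse per-channel speech probabilities for one time/frequency bin.
    A weighted average is an assumed simplification of the layered model."""
    p = np.asarray(channel_probs, dtype=float)
    w = np.ones_like(p) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(np.dot(w, p))

def classify_bin(channel_probs, threshold=0.5):
    """Label one time/frequency bin as speech or noise from the fused probability."""
    return "speech" if fuse_speech_probability(channel_probs) >= threshold else "noise"

print(classify_bin([0.9, 0.7]))
```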
-
Patent number: 8391373
Abstract: A method is provided for concealing a transmission error in a digital signal chopped into a plurality of successive frames associated with different time intervals in which, on reception, the signal may comprise erased frames and valid frames, the valid frames comprising information relating to the concealment of frame loss. The method is implemented during a hierarchical decoding using a core decoding and a transform-based decoding using windows introducing a time delay of less than a frame with respect to the core decoding. The method includes a step of concealing a first set of missing samples for the erased frame, implemented in a first time interval; a step of concealing a second set of missing samples, utilizing information of said valid frame and implemented in a second time interval; and a step of transition between the first and the second set of missing samples to obtain at least part of the missing frame.
Type: Grant
Filed: March 20, 2009
Date of Patent: March 5, 2013
Assignee: France Telecom
Inventors: David Virette, Pierrick Philippe, Balazs Kovesi
-
Publication number: 20120290294Abstract: A method and apparatus are provided for processing a set of communicated signals associated with a set of muscles, such as the muscles near the larynx of the person, or any other muscles the person uses to achieve a desired response. The method includes the steps of attaching a single integrated sensor, for example, near the throat of the person proximate to the larynx and detecting an electrical signal through the sensor. The method further includes the steps of extracting features from the detected electrical signal and continuously transforming them into speech sounds without the need for further modulation. The method also includes comparing the extracted features to a set of prototype features and selecting the prototype feature of the set providing the smallest relative difference.Type: ApplicationFiled: July 27, 2012Publication date: November 15, 2012Inventors: Michael Callahan, Thomas Coleman
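The prototype-selection step above is a nearest-template match. A minimal sketch, with the relative-difference metric (norm of the difference divided by the prototype's norm) chosen as an assumption:

```python
import numpy as np

def nearest_prototype(features, prototypes):
    # Return the index of the prototype with the smallest relative
    # difference to the extracted feature vector, plus that difference.
    features = np.asarray(features, dtype=float)
    best_idx, best_diff = -1, float("inf")
    for idx, proto in enumerate(prototypes):
        proto = np.asarray(proto, dtype=float)
        diff = (np.linalg.norm(features - proto) /
                max(np.linalg.norm(proto), 1e-12))
        if diff < best_diff:
            best_idx, best_diff = idx, diff
    return best_idx, best_diff
```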
-
Patent number: 8296134Abstract: A spectrum modifying method whereby the efficiency of signal estimation and prediction can be improved and the spectrum can be encoded more efficiently. According to this method, the pitch period is calculated from an original signal, which serves as a reference signal, and then a basic pitch frequency (f0) is calculated. Thereafter, the spectrum of a target signal, which is a target of spectrum modification, is divided into a plurality of partitions. It is specified here that the width of each partition be the basic pitch frequency. Then, the spectra of bands are interleaved such that a plurality of peaks having similar amplitudes are unified into a group. The basic pitch frequency is used as an interleave pitch.Type: GrantFiled: May 11, 2006Date of Patent: October 23, 2012Assignee: Panasonic CorporationInventors: Chun Woei Teo, Sua Hong Neo, Koji Yoshida, Michiyo Goto
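The partition-and-interleave step can be sketched as follows (an assumption of how the regrouping works, with the pitch expressed in spectral bins): harmonic peaks repeat roughly every pitch width, so taking bin k of every partition in turn places similarly sized peaks next to each other.

```python
import numpy as np

def interleave_by_pitch(spectrum, pitch_bins):
    # Split the spectrum into partitions one basic-pitch-frequency wide,
    # then interleave: bin k of every partition becomes adjacent, which
    # groups the harmonic peaks together. Trailing bins that do not fill
    # a whole partition are dropped in this sketch.
    n = (len(spectrum) // pitch_bins) * pitch_bins
    parts = spectrum[:n].reshape(-1, pitch_bins)  # one row per partition
    return parts.T.reshape(-1)                    # column-major regrouping
```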
-
Patent number: 8239196Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.Type: GrantFiled: July 28, 2011Date of Patent: August 7, 2012Assignee: Google Inc.Inventor: Marco Paniconi
-
Patent number: 8239194Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.Type: GrantFiled: September 26, 2011Date of Patent: August 7, 2012Assignee: Google Inc.Inventor: Marco Paniconi
-
Patent number: 8214203Abstract: A method and an apparatus for recovering a line spectrum pair (LSP) parameter of a spectrum region when frame loss occurs during speech decoding and a speech decoding apparatus adopting the same are provided. The method of recovering an LSP parameter in speech decoding includes: if it is determined that a received speech packet has an erased frame, converting an LSP parameter of a previous good frame (PGF) of the erased frame or LSP parameters of the PGF and a next good frame (NGF) of the erased frame into a spectrum region and obtaining a spectrum envelope of the PGF or spectrum envelopes of the PGF and NGF; recovering a spectrum envelope of the erased frame using the spectrum envelope of the PGF or the spectrum envelopes of the PGF and NGF; and converting the recovered spectrum envelope of the erased frame into an LSP parameter of the erased frame.Type: GrantFiled: March 25, 2010Date of Patent: July 3, 2012Assignee: Samsung Electronics Co., Ltd.Inventors: Hosang Sung, Seungho Choi, Kihyun Choo
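The envelope-recovery step above can be sketched simply: with both a previous and a next good frame available, the erased frame's envelope is interpolated between them; with only the previous good frame, it is repeated. The fixed interpolation weight and the omission of the LSP-to-envelope conversion are assumptions of this sketch:

```python
import numpy as np

def recover_envelope(env_pgf, env_ngf=None, alpha=0.5):
    # Recover the erased frame's spectrum envelope from its neighbours.
    # env_pgf: envelope of the previous good frame (required).
    # env_ngf: envelope of the next good frame, if one was received.
    if env_ngf is None:
        return np.copy(env_pgf)          # only the PGF is available
    return (1.0 - alpha) * env_pgf + alpha * env_ngf
```

In the patented method the recovered envelope would then be converted back into LSP parameters for the erased frame.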
-
Conservative training method for adapting a neural network of an automatic speech recognition device
Patent number: 8126710Abstract: A method of adapting a neural network of an automatic speech recognition device, includes the steps of: providing a neural network including an input stage, an intermediate stage and an output stage, the output stage outputting phoneme probabilities; providing a linear stage in the neural network; and training the linear stage by means of an adaptation set; wherein the step of providing the linear stage includes the step of providing the linear stage after the intermediate stage.Type: GrantFiled: June 1, 2005Date of Patent: February 28, 2012Assignee: Loquendo S.p.A.Inventors: Roberto Gemello, Franco Mana
-
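The adaptation idea above can be sketched as a square linear layer inserted after the hidden stage, initialised to the identity so the network is unchanged before adaptation, with only this stage updated on the adaptation set. The gradient-descent update rule here is an assumption, not the patent's conservative training procedure:

```python
import numpy as np

class LinearAdaptationStage:
    """Identity-initialised linear stage placed after the hidden layer;
    only its weights are trained during adaptation."""

    def __init__(self, size, lr=0.01):
        self.W = np.eye(size)   # identity: no effect until adapted
        self.lr = lr

    def forward(self, h):
        # h: hidden-stage activations for one frame.
        return h @ self.W

    def update(self, h, grad_out):
        # One SGD step; for y = h @ W, dL/dW = outer(h, dL/dy).
        self.W -= self.lr * np.outer(h, grad_out)
```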
Patent number: 8036425Abstract: A neural network-controlled automatic tracking and recognizing system includes a fixed field of view collection module, a full functions variable field of view collection module, a video image recognition algorithm module, a neural network control module, a suspect object track-tracking module, a database comparison and alarm judgment module, a monitored characteristic recording and rule setting module, a light monitoring and control module, a backlight module, an alarm output/display/storage module, and security monitoring sensors. The invention also relates to a method of operating the system.Type: GrantFiled: June 26, 2008Date of Patent: October 11, 2011Inventor: Billy Hou
-
Patent number: 7966177Abstract: The invention relates to a method for recognizing a phonetic sound sequence or a character sequence, e.g.Type: GrantFiled: August 13, 2001Date of Patent: June 21, 2011Inventor: Hans Geiger
-
Patent number: 7836002Abstract: A system that can automatically narrow the search space or recognition scope within an activity-centric environment based upon a current activity or set of activities is provided. In addition, the activity and context data can also be used to rank the results of the recognition or search activity. In accordance with the domain scoping, natural language processing (NLP) as well as other types of conversion and recognition systems can dynamically adjust to the scope of the activity or group of activities, thereby increasing the recognition system's accuracy and usefulness. In operation, a user context, activity context, environment context and/or device profile can be employed to effectuate the scoping. As well, the system can combine context with extrinsic data, including but not limited to, calendar, profile, historical activity data, etc. in order to define the parameters for an appropriate scoping.Type: GrantFiled: June 27, 2006Date of Patent: November 16, 2010Assignee: Microsoft CorporationInventors: Steven W. Macbeth, Roland L. Fernandez, Brian R. Meyers, Desney S. Tan, George G. Robertson, Nuria M. Oliver, Oscar E. Murillo
-
Patent number: 7827031Abstract: A neural network in a speech-recognition system has computing units organized in levels including at least one hidden level and one output level. The computing units of the hidden level are connected to the computing units of the output level via weighted connections, and the computing units of the output level correspond to acoustic-phonetic units of the general vocabulary. This network executes the following steps: determining a subset of acoustic-phonetic units necessary for recognizing all the words contained in the general vocabulary subset; eliminating from the neural network all the weighted connections afferent to computing units of the output level that correspond to acoustic-phonetic units not contained in the previously determined subset of acoustic-phonetic units, thus obtaining a compacted neural network optimized for recognition of the words contained in the general vocabulary subset; and executing, at each moment in time, only the compacted neural network.Type: GrantFiled: February 12, 2003Date of Patent: November 2, 2010Assignee: Loquendo S.p.A.Inventors: Dario Albesano, Roberto Gemello
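The compaction step above amounts to slicing away the output units (and their incoming weights) whose acoustic-phonetic units are not needed for the vocabulary subset. A minimal sketch, assuming the output layer is a weight matrix with one row per acoustic-phonetic unit:

```python
import numpy as np

def compact_output_layer(W, b, phonetic_units, needed_units):
    # Keep only the output rows whose acoustic-phonetic unit is required
    # by the vocabulary subset; all other weighted connections afferent
    # to the output level are eliminated.
    keep = [i for i, u in enumerate(phonetic_units) if u in needed_units]
    return W[keep], b[keep], [phonetic_units[i] for i in keep]
```

At run time only the compacted network is executed, saving the computation for the dropped units.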
-
Patent number: 7769580Abstract: A method of optimizing the execution of a neural network in a speech recognition system provides for conditionally skipping a variable number of frames, depending on a distance computed between output probabilities, or likelihoods, of a neural network. The distance is initially evaluated between two frames at times t and t+k, where k is a predetermined maximum distance between frames, and if such distance is sufficiently small, the frames between times t and t+k are calculated by interpolation, avoiding further executions of the neural network. If, on the contrary, such distance is not small enough, it means that the outputs of the network are changing quickly, and it is not possible to skip too many frames. In that case, the method attempts to skip remaining frames, calculating and evaluating a new distance.Type: GrantFiled: December 23, 2002Date of Patent: August 3, 2010Assignee: Loquendo S.p.A.Inventors: Roberto Gemello, Dario Albesano
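The skipping loop described above can be sketched as follows. The maximum-difference distance metric and the linear interpolation of skipped outputs are assumptions of this sketch, not the patent's exact formulation:

```python
import numpy as np

def forward_with_skipping(net, frames, k=3, eps=0.05):
    """Evaluate net on each frame, skipping up to k-1 frames when the
    network outputs at the endpoints are close enough to interpolate."""
    outputs = [None] * len(frames)
    outputs[0] = net(frames[0])
    t = 0
    while t + 1 < len(frames):
        step = min(k, len(frames) - 1 - t)
        while step > 1:
            candidate = net(frames[t + step])
            if np.abs(candidate - outputs[t]).max() < eps:
                # Outputs change slowly: fill the gap by interpolation.
                for j in range(1, step):
                    a = j / step
                    outputs[t + j] = (1 - a) * outputs[t] + a * candidate
                outputs[t + step] = candidate
                break
            step -= 1  # outputs change quickly: try a shorter skip
        else:
            # No skip possible: run the network on the next frame.
            outputs[t + 1] = net(frames[t + 1])
            step = 1
        t += step
    return outputs
```

When the outputs vary slowly (e.g. steady vowels or silence), most frames are interpolated and the network runs only a few times.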
-
Patent number: 7765100Abstract: A method and an apparatus for recovering a line spectrum pair (LSP) parameter of a spectrum region when frame loss occurs during speech decoding and a speech decoding apparatus adopting the same are provided. The method of recovering an LSP parameter in speech decoding includes: if it is determined that a received speech packet has an erased frame, converting an LSP parameter of a previous good frame (PGF) of the erased frame or LSP parameters of the PGF and a next good frame (NGF) of the erased frame into a spectrum region and obtaining a spectrum envelope of the PGF or spectrum envelopes of the PGF and NGF; recovering a spectrum envelope of the erased frame using the spectrum envelope of the PGF or the spectrum envelopes of the PGF and NGF; and converting the recovered spectrum envelope of the erased frame into an LSP parameter of the erased frame.Type: GrantFiled: February 6, 2006Date of Patent: July 27, 2010Assignee: Samsung Electronics Co., Ltd.Inventors: Hosang Sung, Seungho Choi, Kihyun Choo
-
Patent number: 7739111Abstract: A pattern matching method for matching between a first symbol sequence and a second symbol sequence which is shorter than the first symbol sequence is provided. The method includes the steps of performing DP matching between the first and second symbol sequences to create a matrix of the DP matching transition, detecting the maximum length of lengths of consecutive correct answers based on the matrix of the DP matching transition, and calculating similarity based on the maximum length.Type: GrantFiled: August 9, 2006Date of Patent: June 15, 2010Assignee: Canon Kabushiki KaishaInventor: Kazue Kaneko
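The "maximum length of consecutive correct answers" idea above resembles a longest-common-substring dynamic program. This sketch uses that standard DP table rather than the patent's DP-matching transition matrix, and the similarity normalization by the shorter sequence length is an assumption:

```python
def longest_common_run(first, second):
    """DP table run[i][j] = length of the run of consecutive matches
    ending at first[i-1], second[j-1]; similarity is the maximum run
    length normalized by the shorter sequence's length."""
    m, n = len(first), len(second)
    run = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if first[i - 1] == second[j - 1]:
                run[i][j] = run[i - 1][j - 1] + 1
                best = max(best, run[i][j])
    return best, best / min(m, n)
```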
-
Patent number: 7627467Abstract: Real-time packet-based audio communications over packet-based networks frequently results in the loss of one or more packets during any given communication session. The real-time nature of such communications precludes retransmission of lost packets due to the unacceptable delays that would result. Consequently, packet loss concealment methods are employed to “hide” lost packets from the listener. Unfortunately, conventional loss concealment methods, such as packet repetition or stretch/overlap methods, do not fully exploit information available from partially received samples. Therefore, when a single frame of N coefficients is lost, 2N samples are only partially reconstructed, thereby degrading the reconstructed signal.Type: GrantFiled: June 30, 2005Date of Patent: December 1, 2009Assignee: Microsoft CorporationInventors: Dinei A. Florencio, Philip A. Chou
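For contrast, the conventional packet-repetition baseline that the abstract critiques can be sketched in a few lines; the fade ramp (to soften the buzzy artifact of repeating a frame verbatim) is an assumption, and this is the baseline, not the patented method:

```python
import numpy as np

def conceal_by_repetition(prev_frame, fade=True):
    # Replace a lost frame with a copy of the previous good frame,
    # optionally attenuated toward the end to reduce repetition artifacts.
    out = np.copy(prev_frame)
    if fade:
        out *= np.linspace(1.0, 0.5, len(out))
    return out
```

The patent's point is that such methods ignore the partially received transform samples that could improve the reconstruction.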