Neural Network Patents (Class 704/202)
-
Patent number: 10194203
Abstract: A multimodal and real-time method for filtering sensitive content, receiving as input a digital video stream, the method including: segmenting the digital video into video fragments along the video timeline; extracting features containing significant information about sensitive media from the digital video input; reducing the semantic difference between each of the low-level video features and the high-level sensitive concept; classifying the video fragments, generating a high-level label (positive or negative) with a confidence score for each fragment representation; performing high-level fusion to properly match the possible high-level labels and confidence scores for each fragment; and predicting the sensitive time by combining the labels of the fragments along the video timeline, indicating the moments when the content becomes sensitive.
Type: Grant
Filed: June 30, 2016
Date of Patent: January 29, 2019
Assignees: SAMSUNG ELETRÔNICA DA AMAZÔNIA LTDA., UNIVERSIDADE ESTADUAL DE CAMPINAS
Inventors: Sandra Avila, Daniel Moreira, Mauricio Perez, Daniel Moraes, Vanessa Testoni, Siome Goldenstein, Eduardo Valle, Anderson Rocha
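The last step of the abstract, combining per-fragment labels along the timeline into sensitive moments, can be sketched as follows. This is a minimal illustration, not the patented method; the fragment tuple layout and the merge rule are assumptions.

```python
def sensitive_moments(fragments):
    """fragments: list of (start_s, end_s, label, confidence) tuples, one per
    video fragment in timeline order; returns merged spans labeled positive."""
    spans = []
    for start, end, label, conf in fragments:
        if label != "positive":
            continue
        if spans and start <= spans[-1][1]:   # touches/overlaps previous span
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return spans

# two positive fragments that touch are merged into one sensitive span
print(sensitive_moments([(0, 5, "positive", 0.9), (5, 10, "negative", 0.8),
                         (10, 15, "positive", 0.7), (15, 20, "positive", 0.6)]))
```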
-
Patent number: 10095768
Abstract: The disclosed computer-implemented method for aggregating information-asset classifications may include (1) identifying a data collection that includes two or more information assets, (2) identifying a classification for each of the information assets, (3) deriving, based at least in part on the classifications of the information assets, an aggregate classification for the data collection, and (4) associating the aggregate classification with the data collection to enable a data management system to enforce a data management policy based on the aggregate classification. Various other methods, systems, and computer-readable media are also disclosed.
Type: Grant
Filed: November 14, 2014
Date of Patent: October 9, 2018
Assignee: Veritas Technologies LLC
Inventor: Robert Koeten
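A minimal sketch of steps (2) through (4), assuming a hypothetical severity ranking; the patent does not specify how the aggregate is derived, and taking the most restrictive classification is just one plausible policy:

```python
# Assumed severity ranking and class names -- not from the patent.
SEVERITY = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}

def aggregate_classification(asset_classifications):
    """Derive an aggregate classification for a data collection by taking
    the most restrictive classification among its information assets."""
    return max(asset_classifications, key=lambda c: SEVERITY[c])

collection = ["public", "confidential", "internal"]
print(aggregate_classification(collection))   # the most restrictive wins
```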
-
Patent number: 10032461
Abstract: An apparatus includes microphone receivers configured to receive microphone signals from a plurality of microphones. A comparator is configured to determine, for each microphone signal, a speech similarity indication indicative of a similarity between the microphone signal and non-reverberant speech. The determination is made in response to a comparison of a property derived from the microphone signal to a reference property for non-reverberant speech. In some embodiments, the comparator is configured to determine the similarity indication by comparing against reference properties for speech samples of a set of non-reverberant speech samples. A generator is configured to generate a speech signal by combining the microphone signals in response to the similarity indications. The apparatus may be distributed over a plurality of devices each containing a microphone, and the approach may determine the microphone best suited for generating the speech signal.
Type: Grant
Filed: February 18, 2014
Date of Patent: July 24, 2018
Assignee: KONINKLIJKE PHILIPS N.V.
Inventor: Sriram Srinivasan
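As a rough illustration of the idea, assuming a toy "property" vector per microphone and an inverse-distance similarity (the patent fixes neither choice), a similarity-weighted combination might look like:

```python
import numpy as np

def similarity_indications(mic_props, reference_prop):
    """Per-microphone similarity to the non-reverberant reference property:
    smaller distance to the reference => higher similarity (assumed mapping)."""
    d = np.linalg.norm(np.asarray(mic_props) - np.asarray(reference_prop), axis=1)
    return 1.0 / (1.0 + d)

def combine_signals(mic_signals, sims):
    """Combine microphone signals weighted by their similarity indications."""
    w = np.asarray(sims) / np.sum(sims)
    return (w[:, None] * np.asarray(mic_signals)).sum(axis=0)

sims = similarity_indications([[0.0, 0.0], [3.0, 4.0]], [0.0, 0.0])
out = combine_signals([[1.0, 1.0], [0.0, 0.0]], sims)
```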
-
Patent number: 10026396
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving a sequence representing an utterance, the sequence comprising a plurality of audio frames; determining one or more warping factors for each audio frame in the sequence using a warping neural network; applying, for each audio frame, the one or more warping factors for the audio frame to the audio frame to generate a respective modified audio frame, wherein the applying comprises using at least one of the warping factors to scale a respective frequency of the audio frame to a new respective frequency in the respective modified audio frame; and decoding the modified audio frames using a decoding neural network, wherein the decoding neural network is configured to output a word sequence that is a transcription of the utterance.
Type: Grant
Filed: July 27, 2016
Date of Patent: July 17, 2018
Assignee: Google LLC
Inventor: Andrew W. Senior
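The warping step, scaling each frequency to a new frequency by a warping factor, might be sketched for a single frame's magnitude spectrum as follows. Linear interpolation is an assumption, and the warping neural network that produces the factor is omitted:

```python
import numpy as np

def warp_frame(spectrum, alpha):
    """Frequency-warp one frame: content at frequency bin f moves to bin
    alpha*f, implemented by reading each output bin k from source bin k/alpha
    with linear interpolation (bins outside the frame are zero-filled)."""
    n = len(spectrum)
    src = np.arange(n) / alpha
    return np.interp(src, np.arange(n), spectrum, left=0.0, right=0.0)

s = np.array([0.0, 2.0, 4.0, 6.0])
print(warp_frame(s, 2.0))   # spectrum stretched by a factor of 2
```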
-
Patent number: 9953638
Abstract: A computer-implemented method is described for front end speech processing for automatic speech recognition. A sequence of speech features which characterize an unknown speech input provided on an audio input channel, and associated meta-data which characterize the audio input channel, are received. The speech features are transformed with a computer process that uses a trained mapping function controlled by the meta-data, and automatic speech recognition is performed on the transformed speech features.
Type: Grant
Filed: June 28, 2012
Date of Patent: April 24, 2018
Assignee: Nuance Communications, Inc.
Inventors: Daniel Willett, Karl Jonas Lööf, Yue Pan, Joel Pinto, Christian Gollan
-
Patent number: 9875747
Abstract: A sensor device may include a computing device in communication with multiple microphones. A neural network executing on the computing device may receive audio signals from each microphone. One microphone signal may serve as a reference signal. The neural network may extract differences in signal characteristics of the other microphone signals as compared to the reference signal. The neural network may combine these signal differences into a lossy compressed signal. The sensor device may transmit the lossy compressed signal and the lossless reference signal to a remote neural network executing in a cloud computing environment for decompression and sound recognition analysis.
Type: Grant
Filed: July 15, 2016
Date of Patent: January 23, 2018
Assignee: GOOGLE LLC
Inventors: Chanwoo Kim, Rajeev Conrad Nongpiur, Tara Sainath
-
Patent number: 9785313
Abstract: A method and system for providing a distraction free reading mode with an electronic personal display is disclosed. One example accesses non-adjustable settings for a reader mode. In addition, user adjustable settings for the reader mode on the electronic personal display are also accessed. The user adjustable settings and the non-adjustable settings are then implemented when the reader mode is initiated.
Type: Grant
Filed: June 28, 2013
Date of Patent: October 10, 2017
Assignee: RAKUTEN KOBO, INC.
Inventors: James Wu, Peter James Farmer, Michael Serbinis, Pamela Lynn Hilborn
-
Patent number: 9721202
Abstract: Sound processing techniques using recurrent neural networks are described. In one or more implementations, temporal dependencies are captured in sound data that are modeled through use of a recurrent neural network (RNN). The captured temporal dependencies are employed as part of feature extraction performed using nonnegative matrix factorization (NMF). One or more sound processing techniques are performed on the sound data based at least in part on the feature extraction.
Type: Grant
Filed: February 21, 2014
Date of Patent: August 1, 2017
Assignee: Adobe Systems Incorporated
Inventors: Nicolas Maurice Boulanger-Lewandowski, Gautham J. Mysore, Matthew Douglas Hoffman
-
Patent number: 9672207
Abstract: A method, system, and non-transitory computer-readable medium for determining and discerning items with multiple meanings in a sequence of items, including: producing a distributed representation for each item of the sequence of items, including a word vector and a context vector; partitioning the sequence of items into classes; for an item, using a representative word vector of each class, calculating a cosine distance between the word vector of said item and the class representative vector; and producing a new sequence of items by modifying the distributed representation, replacing each occurrence of an item depending on the calculated cosine distance.
Type: Grant
Filed: October 19, 2015
Date of Patent: June 6, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Oded Shmueli
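A minimal sketch of the cosine-distance step, with a hypothetical relabeling rule; the threshold and the treatment of ambiguous items are assumptions, not the patent's:

```python
import numpy as np

def cosine_distance(u, v):
    """1 - cosine similarity between two word vectors."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def assign_class(word_vec, class_reps, threshold=0.5):
    """Assign the item to the nearest class representative vector, or
    return None (flag as ambiguous) if even the nearest is too far."""
    dists = [cosine_distance(word_vec, rep) for rep in class_reps]
    k = int(np.argmin(dists))
    return k if dists[k] <= threshold else None

reps = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(assign_class(np.array([0.9, 0.1]), reps))   # nearest class index
```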
-
Patent number: 9659579
Abstract: The present invention relates to a method of evaluating intelligibility of a degraded speech signal received from an audio transmission system conveying a reference signal. The method comprises sampling said reference and degraded signal into frames, and forming frame pairs. For each pair one or more difference functions representing a difference between the degraded and reference signal are provided. A difference function is selected and compensated for different disturbance types, such as to provide a disturbance density function adapted to human auditory perception. An overall quality parameter is determined indicative of the intelligibility of the degraded signal. The method comprises determining a switching parameter indicative of audio power level of said degraded signal, for performing said selecting.
Type: Grant
Filed: November 15, 2012
Date of Patent: May 23, 2017
Assignee: Nederlandse Organisatie voor toegepast-natuurwetenschappelijk onderzoek TNO
Inventor: John Gerard Beerends
-
Patent number: 9342781
Abstract: We describe a signal processor, the signal processor comprising: a probability vector generation system, wherein said probability vector generation system has an input to receive a category vector for a category of output example and an output to provide a probability vector for said category of output example, wherein said output example comprises a set of data points, and wherein said probability vector defines a probability of each of said set of data points for said category of output example; a memory storing a plurality of said category vectors, one for each of a plurality of said categories of output example; and a stochastic selector to select a said stored category of output example for presentation of the corresponding category vector to said probability vector generation system; wherein said signal processor is configured to output data for an output example corresponding to said selected stored category.
Type: Grant
Filed: June 24, 2013
Date of Patent: May 17, 2016
Assignee: Google Inc.
Inventors: Julien Robert Michel Cornebise, Danilo Jimenez Rezende, Daniël Pieter Wierstra
-
Patent number: 9336775
Abstract: A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods for Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial distance of a Gaussian is already too small, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. The partial distance elimination occurs when less than all of the dimensions in the group are sequentially added.
Type: Grant
Filed: March 5, 2013
Date of Patent: May 10, 2016
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Jinyu Li, Zhijie Yan, Qiang Huo, Yifan Gong
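The partial distance elimination idea, accumulating the distance a few dimensions at a time and stopping early, can be sketched as follows. The diagonal-covariance squared distance, the group size, and the pruning direction (pruning once the accumulated distance exceeds a threshold, i.e. once the log likelihood is guaranteed to be too small) are assumptions for illustration:

```python
import numpy as np

def partial_distance_eliminate(x, mean, threshold, group=4):
    """Accumulate the squared distance between feature x and a Gaussian
    mean a few dimensions at a time; return None (posterior set to zero)
    as soon as the partial distance already exceeds the threshold."""
    dist = 0.0
    for start in range(0, len(x), group):
        dist += float(np.sum((x[start:start + group] - mean[start:start + group]) ** 2))
        if dist > threshold:
            return None          # pruned before adding all dimensions
    return dist

print(partial_distance_eliminate(np.zeros(8), np.zeros(8), 1.0))
```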
-
Publication number: 20150127327
Abstract: The technology described herein can be embodied in a method that includes receiving an audio signal encoding a portion of an utterance, and providing, to a first neural network, data corresponding to the audio signal. The method also includes generating, by a processor, data representing a transcription for the utterance based on an output of the first neural network. The first neural network is trained using features of multiple context-dependent states, the context-dependent states being derived from a plurality of context-independent states provided by a second neural network.
Type: Application
Filed: May 20, 2014
Publication date: May 7, 2015
Applicant: Google Inc.
Inventors: Michiel A.U. Bacchiani, David Rybach
-
Patent number: 9020812
Abstract: Disclosed is an audio signal processing method comprising the steps of: receiving an audio signal containing current frame data; generating a first temporary output signal for the current frame when an error occurs in the current frame data, by carrying out frame error concealment with respect to the current frame data using a random codebook; generating a parameter by carrying out one or more of short-term prediction, long-term prediction and a fixed codebook search based on the first temporary output signal; and updating the memory with the parameter for the next frame; wherein the parameter comprises one or more of pitch gain, pitch delay, fixed codebook gain and a fixed codebook.
Type: Grant
Filed: November 24, 2010
Date of Patent: April 28, 2015
Assignees: LG Electronics Inc., Industry-Academic Cooperation Foundation, Yonsei University
Inventors: Hye Jeong Jeon, Dae Hwan Kim, Hong Goo Kang, Min Ki Lee, Byung Suk Lee, Gyu Hyeok Jeong
-
Publication number: 20150039299
Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.
Type: Application
Filed: September 18, 2013
Publication date: February 5, 2015
Applicant: Google Inc.
Inventors: Eugene Weinstein, Pedro J. Moreno Mengibar, Johan Schalkwyk
-
Patent number: 8930183
Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice; dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input, and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
Type: Grant
Filed: August 25, 2011
Date of Patent: January 6, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Mark John Francis Gales
-
Patent number: 8924200
Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.
Type: Grant
Filed: September 28, 2011
Date of Patent: December 30, 2014
Assignee: Motorola Mobility LLC
Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
-
Patent number: 8924199
Abstract: A voice correction device includes a detector that detects a response from a user, a calculator that calculates an acoustic characteristic amount of an input voice signal, an analyzer that outputs an acoustic characteristic amount of a predetermined amount when having acquired a response signal due to the response from the detector, a storage unit that stores the acoustic characteristic amount output by the analyzer, a controller that calculates a correction amount of the voice signal on the basis of a result of a comparison between the acoustic characteristic amount calculated by the calculator and the acoustic characteristic amount stored in the storage unit, and a correction unit that corrects the voice signal on the basis of the correction amount calculated by the controller.
Type: Grant
Filed: December 20, 2011
Date of Patent: December 30, 2014
Assignee: Fujitsu Limited
Inventors: Chisato Ishikawa, Takeshi Otani, Taro Togawa, Masanao Suzuki, Masakiyo Tanaka
-
Patent number: 8924220
Abstract: In a multiband compressor (100), a level calculation unit (121) calculates a signal level inputted for each of the bands, a gain calculation unit (122) calculates a gain value from the calculated signal level, and a gain limitation unit (130) limits a gain value by comparison with the gain value of the other bands in the compressor for each band. With this configuration, provided is a multiband compressor capable of achieving a balance, at a high level, between the quality of sound and the effect of enhancing the sound level.
Type: Grant
Filed: September 7, 2010
Date of Patent: December 30, 2014
Assignee: Lenovo Innovations Limited (Hong Kong)
Inventor: Satoshi Hosokawa
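A toy version of the gain limitation step, in which each band's gain is capped relative to the other bands' gains. The `max_ratio` rule is an assumption; the abstract does not disclose the exact comparison:

```python
def limit_gains(gains, max_ratio=2.0):
    """Limit each band's gain by comparison with the other bands: no band
    may be boosted more than max_ratio times the smallest band gain, which
    keeps loudness enhancement from distorting the inter-band balance."""
    floor = min(gains)
    return [min(g, floor * max_ratio) for g in gains]

print(limit_gains([1.0, 4.0, 1.5], max_ratio=2.0))   # the 4.0 gain is capped
```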
-
Publication number: 20140358526
Abstract: A non-intrusive objective speech quality assessment is performed on a degraded speech signal. The methods are well suited for systems where random and bursty packet losses may occur and/or packet stream regeneration may occur prior to speech signal quality assessment. In one embodiment received packetized speech is analyzed to determine an overall final signal quality score. A limited set of trained neural networks, e.g., 5, corresponding to different signal features, each determine a signal feature quality score. A trained joint quality score determination module determines a joint quality score based on the signal feature quality scores. Packet loss is estimated based on received packet header information and/or detected gap durations. The determined joint quality score is adjusted, based on estimated packet loss information obtained from examining the speech signal, network level statistics and/or codec parameters, to generate the final quality score.
Type: Application
Filed: May 31, 2013
Publication date: December 4, 2014
Inventors: Ahmed Abdelal, Vladimir Botchev, Wassim Matragi
-
Patent number: 8886612
Abstract: An apparatus for providing improved data compression may include an encoder comprising a quantizer for encoding input data and a side model. The quantizer may be trained with respect to high priority data among the input data and may be configured to partially encode the input data by encoding the high priority data. The side model may be trained jointly with the training of the quantizer and is configured to model low priority data among the input data.
Type: Grant
Filed: October 4, 2007
Date of Patent: November 11, 2014
Assignee: Core Wireless Licensing S.a.r.l.
Inventors: Jani K. Nurminen, Sakari Himanen
-
Publication number: 20140278379
Abstract: In one implementation, a computer-implemented method includes receiving, at a computer system, a request to predict a next word in a dialog being uttered by a speaker; accessing, by the computer system, a neural network comprising i) an input layer, ii) one or more hidden layers, and iii) an output layer; identifying the local context for the dialog of the speaker; selecting, by the computer system and using a semantic model, at least one vector that represents the semantic context for the dialog; applying input to the input layer of the neural network, the input comprising i) the local context of the dialog and ii) the values for the at least one vector; generating probability values for at least a portion of the candidate words; and providing, by the computer system and based on the probability values, information that identifies one or more of the candidate words.
Type: Application
Filed: April 16, 2013
Publication date: September 18, 2014
Applicant: Google Inc.
Inventors: Noah B. Coccaro, Patrick An Phu Nguyen
-
Method and apparatus of transforming speech feature vectors using an auto-associative neural network
Patent number: 8838446
Abstract: Provided is a method and apparatus for transforming a speech feature vector. The method includes extracting a feature vector required for speech recognition from a speech signal and transforming the extracted feature vector using an auto-associative neural network (AANN).
Type: Grant
Filed: August 31, 2007
Date of Patent: September 16, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: So-young Jeong, Kwang-cheol Oh, Jae-hoon Jeong, Jeong-su Kim
-
Patent number: 8762142
Abstract: Provided are a multi-stage speech recognition apparatus and method. The multi-stage speech recognition apparatus includes a first speech recognition unit performing initial speech recognition on a feature vector, which is extracted from an input speech signal, and generating a plurality of candidate words; and a second speech recognition unit rescoring the candidate words, which are provided by the first speech recognition unit, using a temporal posterior feature vector extracted from the speech signal.
Type: Grant
Filed: August 15, 2007
Date of Patent: June 24, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: So-young Jeong, Kwang-cheol Oh, Jae-hoon Jeong, Jeong-su Kim
-
Patent number: 8762158
Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
Type: Grant
Filed: August 5, 2011
Date of Patent: June 24, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
-
Publication number: 20140142929
Abstract: The use of a pipelined algorithm that performs parallelized computations to train deep neural networks (DNNs) for performing data analysis may reduce training time. The DNNs may be one of context-independent DNNs or context-dependent DNNs. The training may include partitioning training data into sample batches of a specific batch size. The partitioning may be performed based on rates of data transfers between processors that execute the pipelined algorithm, considerations of accuracy and convergence, and the execution speed of each processor. Other techniques for training may include grouping layers of the DNNs for processing on a single processor, distributing a layer of the DNNs to multiple processors for processing, or modifying an execution order of steps in the pipelined algorithm.
Type: Application
Filed: November 20, 2012
Publication date: May 22, 2014
Applicant: MICROSOFT CORPORATION
Inventors: Frank Torsten Bernd Seide, Gang Li, Dong Yu, Adam C. Eversole, Xie Chen
-
Patent number: 8606571
Abstract: The present technology provides noise reduction of an acoustic signal using a configurable classification threshold which provides a sophisticated level of control to balance the tradeoff between positional robustness and noise reduction robustness. The configurable classification threshold corresponds to a configurable spatial region, such that signals arising from sources within the configurable spatial region are preserved, and signals arising from sources outside it are rejected. In embodiments, the configurable classification threshold can be automatically and dynamically adjusted in real-time based on evaluated environmental conditions surrounding an audio device implementing the noise reduction techniques described herein.
Type: Grant
Filed: July 15, 2010
Date of Patent: December 10, 2013
Assignee: Audience, Inc.
Inventors: Mark Every, Carlo Murgia
-
Patent number: 8589334
Abstract: Methods and systems are provided for developing decision information relating to a single system based on data received from a plurality of sensors. The method includes receiving first data from a first sensor that defines first information of a first type that is related to a system, receiving second data from a second sensor that defines second information of a second type that is related to said system, wherein the first type is different from the second type, generating a first decision model, a second decision model, and a third decision model, determining whether data is available from only the first sensor, only the second sensor, or both the first and second sensors, and selecting, based on the determination of availability, an additional model to apply to the available data, wherein the additional model is selected from a plurality of additional decision models including the third decision model.
Type: Grant
Filed: January 18, 2011
Date of Patent: November 19, 2013
Assignee: Telcordia Technologies, Inc.
Inventor: Akshay Vashist
-
Patent number: 8577045
Abstract: An encoding apparatus comprises a frame processor (105) which receives a multi channel audio signal comprising at least a first audio signal from a first microphone (101) and a second audio signal from a second microphone (103). An ITD processor (107) then determines an inter time difference between the first audio signal and the second audio signal, and a set of delays (109, 111) generates a compensated multi channel audio signal from the multi channel audio signal by delaying at least one of the first and second audio signals in response to the inter time difference signal. A combiner (113) then generates a mono signal by combining channels of the compensated multi channel audio signal, and a mono signal encoder (115) encodes the mono signal. The inter time difference may specifically be determined by an algorithm based on determining cross correlations between the first and second audio signals.
Type: Grant
Filed: September 9, 2008
Date of Patent: November 5, 2013
Assignee: Motorola Mobility LLC
Inventor: Jonathan A. Gibbs
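The cross-correlation-based inter time difference estimate might be sketched as follows, restricted to non-negative delays for simplicity (the patent's actual algorithm is not disclosed in the abstract):

```python
import numpy as np

def inter_time_difference(x, y, max_lag):
    """Estimate how many samples y is delayed relative to x by maximizing
    the cross-correlation sum_n x[n] * y[n + d] over candidate delays d."""
    best_d, best_c = 0, -np.inf
    for d in range(0, max_lag + 1):
        c = float(np.dot(x[:len(x) - d], y[d:]))
        if c > best_c:
            best_d, best_c = d, c
    return best_d

x = np.array([0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0])  # x delayed by 2
print(inter_time_difference(x, y, 4))
```

The estimated delay would then drive the compensating delays before the channels are combined into the mono signal.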
-
Patent number: 8577821
Abstract: A method and apparatus for encoding the amplitude of frequency signals into a temporal electrical spike pattern, and a method for decoding the generated pattern by recognizing and classifying it with a correlating (associative pattern recognition) neuron-like device. The first part of the system takes signals and converts their amplitude into a set of timed pulses. The timing of each pulse is a function of the amplitude of the incoming signal. Once a series of timed pulses has been created, they can be sent to a decoder. The decoder circuit uses bandpass filters, delays/gains, and a low-pass summation. The recognition portion compares the timed input spikes with a known sample. Only when the input signals match the sample signals will the amplitudes be sufficient to meet the trigger threshold. When the output spike is generated, a pattern match has been found.
Type: Grant
Filed: April 16, 2010
Date of Patent: November 5, 2013
Inventor: Thomas D. Humphrey
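A minimal sketch of the encoding and matching ideas; the linear amplitude-to-time mapping and the tolerance-based comparison are illustrative assumptions, not the patented circuit:

```python
def amplitude_to_spike_time(amplitude, a_max, window=10.0):
    """Encode an amplitude as a spike time within a window: larger
    amplitudes fire earlier (time decreases linearly with amplitude)."""
    return window * (1.0 - amplitude / a_max)

def matches(input_times, sample_times, tol=0.5):
    """Recognition stand-in: the timed input spikes match a known sample
    only if every spike lands within a small tolerance of the sample's."""
    return all(abs(i - s) <= tol for i, s in zip(input_times, sample_times))

print(amplitude_to_spike_time(5.0, 5.0))   # maximal amplitude fires first
print(matches([0.0, 3.0], [0.2, 3.1]))
```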
-
Publication number: 20130262096
Abstract: A system-effected method for synthesizing speech, or recognizing speech including a sequence of expressive speech utterances. The method can be computer-implemented and can include system-generating a speech signal embodying the sequence of expressive speech utterances. Other possible steps include: system-marking the speech signal with a pitch marker indicating a pitch change at or near a first zero amplitude crossing point of the speech signal following a glottal closure point, at a minimum, at a maximum or at another location; system-marking the speech signal with at least one further pitch marker; system-aligning a sequence of prosodically marked text with the pitch-marked speech signal according to the pitch markers; and system-outputting the aligned text or the aligned speech signal, respectively. Computerized systems, and stored programs for implementing method embodiments of the invention, are also disclosed.
Type: Application
Filed: September 21, 2012
Publication date: October 3, 2013
Applicant: LESSAC TECHNOLOGIES, INC.
Inventors: Reiner WILHELMS-TRICARICO, Brian MOTTERSHEAD, Rattima NITISAROJ, Michael BAUMGARTNER, John B. REICHENBACH, Gary A. MARPLE
-
Patent number: 8494840
Abstract: The invention relates to audio signal processing and speech enhancement. In accordance with one aspect, the invention combines a high-quality audio program that is a mix of speech and non-speech audio with a lower-quality copy of the speech components contained in the audio program for the purpose of generating a high-quality audio program with an increased ratio of speech to non-speech audio, such as may benefit the elderly, hearing impaired or other listeners. Aspects of the invention are particularly useful for television and home theater sound, although they may be applicable to other audio and sound applications. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.
Type: Grant
Filed: February 12, 2008
Date of Patent: July 23, 2013
Assignee: Dolby Laboratories Licensing Corporation
Inventor: Hannes Muesch
-
Patent number: 8463720
Abstract: A method for defining a network of nodes is provided, each representing a unique concept, and making connections between individual concepts through unique relationships to other concepts. Each of the nodes is operable to store a unique identifier in the network and information regarding the concept in addition to the unique relationships.
Type: Grant
Filed: March 26, 2010
Date of Patent: June 11, 2013
Assignee: Neuric Technologies, LLC
Inventors: Jennifer Seale, Hannah Lindsley, Timothy Allen Margheim
-
Patent number: 8428953
Abstract: An audio decoding device of the present invention includes: a decoding unit decoding a stream to a spectrum coefficient, and outputting stream information when a frame included in the stream cannot be decoded; an orthogonal transformation unit transforming the spectrum coefficient to a time signal; a correction unit generating a correction time signal, when the decoding unit outputs the stream information, based on an output waveform within a reference section that is in a section that overlaps between an error frame section to which the stream information is outputted and an adjacent frame section, and that is a section in the middle of the adjacent frame section; and an output unit generating the output waveform by synthesizing the correction time signal and the time signal.
Type: Grant
Filed: May 20, 2008
Date of Patent: April 23, 2013
Assignee: Panasonic Corporation
Inventors: Kojiro Ono, Takeshi Norimatsu, Yoshiaki Takagi, Takashi Katayama
-
Patent number: 8428946
Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
Type: Grant
Filed: July 6, 2012
Date of Patent: April 23, 2013
Assignee: Google Inc.
Inventor: Marco Paniconi
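As a stand-in for the layered network model (which the abstract does not detail), per-bin fusion across channels could be as simple as a weighted average of per-channel speech probabilities:

```python
import numpy as np

def fuse_speech_probability(channel_probs, weights=None):
    """Fuse per-channel speech probabilities for one time/frequency bin.
    A weighted average is an assumed simplification of the layered model."""
    p = np.asarray(channel_probs, dtype=float)
    w = np.ones_like(p) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(np.dot(w, p))

def classify_bin(channel_probs, threshold=0.5):
    """Label one time/frequency bin as speech or noise from the fused probability."""
    return "speech" if fuse_speech_probability(channel_probs) >= threshold else "noise"

print(classify_bin([0.9, 0.7]))
```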
-
Patent number: 8391373
Abstract: A method is provided for concealing a transmission error in a digital signal chopped into a plurality of successive frames associated with different time intervals in which, on reception, the signal may comprise erased frames and valid frames, the valid frames comprising information relating to the concealment of frame loss. The method is implemented during a hierarchical decoding using a core decoding and a transform-based decoding using windows introducing a time delay of less than a frame with respect to the core decoding. The method includes a step of concealing a first set of missing samples for the erased frame, implemented in a first time interval; a step of concealing a second set of missing samples, utilizing information of said valid frame and implemented in a second time interval; and a step of transition between the first and the second set of missing samples to obtain at least part of the missing frame.
Type: Grant
Filed: March 20, 2009
Date of Patent: March 5, 2013
Assignee: France Telecom
Inventors: David Virette, Pierrick Philippe, Balazs Kovesi
-
Publication number: 20120290294Abstract: A method and apparatus are provided for processing a set of communicated signals associated with a set of muscles, such as the muscles near the larynx of the person, or any other muscles the person uses to achieve a desired response. The method includes the steps of attaching a single integrated sensor, for example, near the throat of the person proximate to the larynx and detecting an electrical signal through the sensor. The method further includes the steps of extracting features from the detected electrical signal and continuously transforming them into speech sounds without the need for further modulation. The method also includes comparing the extracted features to a set of prototype features and selecting the prototype feature of the set providing the smallest relative difference.Type: ApplicationFiled: July 27, 2012Publication date: November 15, 2012Inventors: Michael Callahan, Thomas Coleman
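The prototype-selection step above is a nearest-template match. A minimal sketch, with the relative-difference metric (norm of the difference divided by the prototype's norm) chosen as an assumption:

```python
import numpy as np

def nearest_prototype(features, prototypes):
    # Return the index of the prototype with the smallest relative
    # difference to the extracted feature vector, plus that difference.
    features = np.asarray(features, dtype=float)
    best_idx, best_diff = -1, float("inf")
    for idx, proto in enumerate(prototypes):
        proto = np.asarray(proto, dtype=float)
        diff = (np.linalg.norm(features - proto) /
                max(np.linalg.norm(proto), 1e-12))
        if diff < best_diff:
            best_idx, best_diff = idx, diff
    return best_idx, best_diff
```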
-
Patent number: 8296134Abstract: A spectrum modifying method whereby the efficiency of signal estimation and prediction can be improved and the spectrum can be encoded more efficiently. According to this method, the pitch period is calculated from an original signal, which serves as a reference signal, and then a basic pitch frequency (f0) is calculated. Thereafter, the spectrum of a target signal, which is a target of spectrum modification, is divided into a plurality of partitions. It is specified here that the width of each partition be the basic pitch frequency. Then, the spectra of bands are interleaved such that a plurality of peaks having similar amplitudes are unified into a group. The basic pitch frequency is used as an interleave pitch.Type: GrantFiled: May 11, 2006Date of Patent: October 23, 2012Assignee: Panasonic CorporationInventors: Chun Woei Teo, Sua Hong Neo, Koji Yoshida, Michiyo Goto
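The partition-and-interleave step can be sketched as follows (an assumption of how the regrouping works, with the pitch expressed in spectral bins): harmonic peaks repeat roughly every pitch width, so taking bin k of every partition in turn places similarly sized peaks next to each other.

```python
import numpy as np

def interleave_by_pitch(spectrum, pitch_bins):
    # Split the spectrum into partitions one basic-pitch-frequency wide,
    # then interleave: bin k of every partition becomes adjacent, which
    # groups the harmonic peaks together. Trailing bins that do not fill
    # a whole partition are dropped in this sketch.
    n = (len(spectrum) // pitch_bins) * pitch_bins
    parts = spectrum[:n].reshape(-1, pitch_bins)  # one row per partition
    return parts.T.reshape(-1)                    # column-major regrouping
```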
-
Patent number: 8239196Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.Type: GrantFiled: July 28, 2011Date of Patent: August 7, 2012Assignee: Google Inc.Inventor: Marco Paniconi
-
Patent number: 8239194Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.Type: GrantFiled: September 26, 2011Date of Patent: August 7, 2012Assignee: Google Inc.Inventor: Marco Paniconi
-
Patent number: 8214203Abstract: A method and an apparatus for recovering a line spectrum pair (LSP) parameter of a spectrum region when frame loss occurs during speech decoding and a speech decoding apparatus adopting the same are provided. The method of recovering an LSP parameter in speech decoding includes: if it is determined that a received speech packet has an erased frame, converting an LSP parameter of a previous good frame (PGF) of the erased frame or LSP parameters of the PGF and a next good frame (NGF) of the erased frame into a spectrum region and obtaining a spectrum envelope of the PGF or spectrum envelopes of the PGF and NGF; recovering a spectrum envelope of the erased frame using the spectrum envelope of the PGF or the spectrum envelopes of the PGF and NGF; and converting the recovered spectrum envelope of the erased frame into an LSP parameter of the erased frame.Type: GrantFiled: March 25, 2010Date of Patent: July 3, 2012Assignee: Samsung Electronics Co., Ltd.Inventors: Hosang Sung, Seungho Choi, Kihyun Choo
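The envelope-recovery step above can be sketched simply: with both a previous and a next good frame available, the erased frame's envelope is interpolated between them; with only the previous good frame, it is repeated. The fixed interpolation weight and the omission of the LSP-to-envelope conversion are assumptions of this sketch:

```python
import numpy as np

def recover_envelope(env_pgf, env_ngf=None, alpha=0.5):
    # Recover the erased frame's spectrum envelope from its neighbours.
    # env_pgf: envelope of the previous good frame (required).
    # env_ngf: envelope of the next good frame, if one was received.
    if env_ngf is None:
        return np.copy(env_pgf)          # only the PGF is available
    return (1.0 - alpha) * env_pgf + alpha * env_ngf
```

In the patented method the recovered envelope would then be converted back into LSP parameters for the erased frame.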
-
Conservative training method for adapting a neural network of an automatic speech recognition device
Patent number: 8126710Abstract: A method of adapting a neural network of an automatic speech recognition device, includes the steps of: providing a neural network including an input stage, an intermediate stage and an output stage, the output stage outputting phoneme probabilities; providing a linear stage in the neural network; and training the linear stage by means of an adaptation set; wherein the step of providing the linear stage includes the step of providing the linear stage after the intermediate stage.Type: GrantFiled: June 1, 2005Date of Patent: February 28, 2012Assignee: Loquendo S.p.A.Inventors: Roberto Gemello, Franco Mana
-
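The adaptation idea above can be sketched as a square linear layer inserted after the hidden stage, initialised to the identity so the network is unchanged before adaptation, with only this stage updated on the adaptation set. The gradient-descent update rule here is an assumption, not the patent's conservative training procedure:

```python
import numpy as np

class LinearAdaptationStage:
    """Identity-initialised linear stage placed after the hidden layer;
    only its weights are trained during adaptation."""

    def __init__(self, size, lr=0.01):
        self.W = np.eye(size)   # identity: no effect until adapted
        self.lr = lr

    def forward(self, h):
        # h: hidden-stage activations for one frame.
        return h @ self.W

    def update(self, h, grad_out):
        # One SGD step; for y = h @ W, dL/dW = outer(h, dL/dy).
        self.W -= self.lr * np.outer(h, grad_out)
```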
Patent number: 8036425Abstract: A neural network-controlled automatic tracking and recognizing system includes a fixed field of view collection module, a full functions variable field of view collection module, a video image recognition algorithm module, a neural network control module, a suspect object track-tracking module, a database comparison and alarm judgment module, a monitored characteristic recording and rule setting module, a light monitoring and control module, a backlight module, an alarm output/display/storage module, and security monitoring sensors. The invention also relates to a method of operating the system.Type: GrantFiled: June 26, 2008Date of Patent: October 11, 2011Inventor: Billy Hou
-
Patent number: 7966177Abstract: The invention relates to a method for recognizing a phonetic sound sequence or a character sequence, e.g.Type: GrantFiled: August 13, 2001Date of Patent: June 21, 2011Inventor: Hans Geiger
-
Patent number: 7836002Abstract: A system that can automatically narrow the search space or recognition scope within an activity-centric environment based upon a current activity or set of activities is provided. In addition, the activity and context data can also be used to rank the results of the recognition or search activity. In accordance with the domain scoping, natural language processing (NLP) as well as other types of conversion and recognition systems can dynamically adjust to the scope of the activity or group of activities, thereby increasing the recognition system's accuracy and usefulness. In operation, a user context, activity context, environment context and/or device profile can be employed to effectuate the scoping. As well, the system can combine context with extrinsic data, including but not limited to, calendar, profile, historical activity data, etc. in order to define the parameters for an appropriate scoping.Type: GrantFiled: June 27, 2006Date of Patent: November 16, 2010Assignee: Microsoft CorporationInventors: Steven W. Macbeth, Roland L. Fernandez, Brian R. Meyers, Desney S. Tan, George G. Robertson, Nuria M. Oliver, Oscar E. Murillo
-
Patent number: 7827031Abstract: A neural network in a speech-recognition system has computing units organized in levels including at least one hidden level and one output level. The computing units of the hidden level are connected to the computing units of the output level via weighted connections, and the computing units of the output level correspond to acoustic-phonetic units of the general vocabulary. This network executes the following steps: determining a subset of acoustic-phonetic units necessary for recognizing all the words contained in the general vocabulary subset; eliminating from the neural network all the weighted connections afferent to computing units of the output level that correspond to acoustic-phonetic units not contained in the previously determined subset of acoustic-phonetic units, thus obtaining a compacted neural network optimized for recognition of the words contained in the general vocabulary subset; and executing, at each moment in time, only the compacted neural network.Type: GrantFiled: February 12, 2003Date of Patent: November 2, 2010Assignee: Loquendo S.p.A.Inventors: Dario Albesano, Roberto Gemello
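The compaction step above amounts to slicing away the output units (and their incoming weights) whose acoustic-phonetic units are not needed for the vocabulary subset. A minimal sketch, assuming the output layer is a weight matrix with one row per acoustic-phonetic unit:

```python
import numpy as np

def compact_output_layer(W, b, phonetic_units, needed_units):
    # Keep only the output rows whose acoustic-phonetic unit is required
    # by the vocabulary subset; all other weighted connections afferent
    # to the output level are eliminated.
    keep = [i for i, u in enumerate(phonetic_units) if u in needed_units]
    return W[keep], b[keep], [phonetic_units[i] for i in keep]
```

At run time only the compacted network is executed, saving the computation for the dropped units.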
-
Patent number: 7769580Abstract: A method of optimizing the execution of a neural network in a speech recognition system provides for conditionally skipping a variable number of frames, depending on a distance computed between output probabilities, or likelihoods, of a neural network. The distance is initially evaluated between two frames at times t and t+k, where k is a predetermined maximum distance between frames, and if such distance is sufficiently small, the frames between times t and t+k are calculated by interpolation, avoiding further executions of the neural network. If, on the contrary, such distance is not small enough, it means that the outputs of the network are changing quickly, and it is not possible to skip too many frames. In that case, the method attempts to skip remaining frames, calculating and evaluating a new distance.Type: GrantFiled: December 23, 2002Date of Patent: August 3, 2010Assignee: Loquendo S.p.A.Inventors: Roberto Gemello, Dario Albesano
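The skipping loop described above can be sketched as follows. The maximum-difference distance metric and the linear interpolation of skipped outputs are assumptions of this sketch, not the patent's exact formulation:

```python
import numpy as np

def forward_with_skipping(net, frames, k=3, eps=0.05):
    """Evaluate net on each frame, skipping up to k-1 frames when the
    network outputs at the endpoints are close enough to interpolate."""
    outputs = [None] * len(frames)
    outputs[0] = net(frames[0])
    t = 0
    while t + 1 < len(frames):
        step = min(k, len(frames) - 1 - t)
        while step > 1:
            candidate = net(frames[t + step])
            if np.abs(candidate - outputs[t]).max() < eps:
                # Outputs change slowly: fill the gap by interpolation.
                for j in range(1, step):
                    a = j / step
                    outputs[t + j] = (1 - a) * outputs[t] + a * candidate
                outputs[t + step] = candidate
                break
            step -= 1  # outputs change quickly: try a shorter skip
        else:
            # No skip possible: run the network on the next frame.
            outputs[t + 1] = net(frames[t + 1])
            step = 1
        t += step
    return outputs
```

When the outputs vary slowly (e.g. steady vowels or silence), most frames are interpolated and the network runs only a few times.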
-
Patent number: 7765100Abstract: A method and an apparatus for recovering a line spectrum pair (LSP) parameter of a spectrum region when frame loss occurs during speech decoding and a speech decoding apparatus adopting the same are provided. The method of recovering an LSP parameter in speech decoding includes: if it is determined that a received speech packet has an erased frame, converting an LSP parameter of a previous good frame (PGF) of the erased frame or LSP parameters of the PGF and a next good frame (NGF) of the erased frame into a spectrum region and obtaining a spectrum envelope of the PGF or spectrum envelopes of the PGF and NGF; recovering a spectrum envelope of the erased frame using the spectrum envelope of the PGF or the spectrum envelopes of the PGF and NGF; and converting the recovered spectrum envelope of the erased frame into an LSP parameter of the erased frame.Type: GrantFiled: February 6, 2006Date of Patent: July 27, 2010Assignee: Samsung Electronics Co., Ltd.Inventors: Hosang Sung, Seungho Choi, Kihyun Choo
-
Patent number: 7739111Abstract: A pattern matching method for matching between a first symbol sequence and a second symbol sequence which is shorter than the first symbol sequence is provided. The method includes the steps of performing DP matching between the first and second symbol sequences to create a matrix of the DP matching transition, detecting the maximum length of lengths of consecutive correct answers based on the matrix of the DP matching transition, and calculating similarity based on the maximum length.Type: GrantFiled: August 9, 2006Date of Patent: June 15, 2010Assignee: Canon Kabushiki KaishaInventor: Kazue Kaneko
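The "maximum length of consecutive correct answers" idea above resembles a longest-common-substring dynamic program. This sketch uses that standard DP table rather than the patent's DP-matching transition matrix, and the similarity normalization by the shorter sequence length is an assumption:

```python
def longest_common_run(first, second):
    """DP table run[i][j] = length of the run of consecutive matches
    ending at first[i-1], second[j-1]; similarity is the maximum run
    length normalized by the shorter sequence's length."""
    m, n = len(first), len(second)
    run = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if first[i - 1] == second[j - 1]:
                run[i][j] = run[i - 1][j - 1] + 1
                best = max(best, run[i][j])
    return best, best / min(m, n)
```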
-
Patent number: 7627467Abstract: Real-time packet-based audio communications over packet-based networks frequently results in the loss of one or more packets during any given communication session. The real-time nature of such communications precludes retransmission of lost packets due to the unacceptable delays that would result. Consequently, packet loss concealment methods are employed to “hide” lost packets from the listener. Unfortunately, conventional loss concealment methods, such as packet repetition or stretch/overlap methods, do not fully exploit information available from partially received samples. Therefore, when a single frame of N coefficients is lost, 2N samples are only partially reconstructed, thereby degrading the reconstructed signal.Type: GrantFiled: June 30, 2005Date of Patent: December 1, 2009Assignee: Microsoft CorporationInventors: Dinei A. Florencio, Philip A. Chou
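For contrast, the conventional packet-repetition baseline that the abstract critiques can be sketched in a few lines; the fade ramp (to soften the buzzy artifact of repeating a frame verbatim) is an assumption, and this is the baseline, not the patented method:

```python
import numpy as np

def conceal_by_repetition(prev_frame, fade=True):
    # Replace a lost frame with a copy of the previous good frame,
    # optionally attenuated toward the end to reduce repetition artifacts.
    out = np.copy(prev_frame)
    if fade:
        out *= np.linspace(1.0, 0.5, len(out))
    return out
```

The patent's point is that such methods ignore the partially received transform samples that could improve the reconstruction.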