Neural Network Patents (Class 704/202)
  • Patent number: 10194203
    Abstract: A multimodal and real-time method for filtering sensitive content, receiving as input a digital video stream, the method including segmenting digital video into video fragments along the video timeline; extracting features containing significant information on sensitive media from the digital video input; reducing the semantic difference between each of the low-level video features and the high-level sensitive concept; classifying the video fragments, generating a high-level label (positive or negative) with a confidence score for each fragment representation; performing high-level fusion to properly match the possible high-level labels and confidence scores for each fragment; and predicting the sensitive time by combining the labels of the fragments along the video timeline, indicating the moments when the content becomes sensitive.
    Type: Grant
    Filed: June 30, 2016
    Date of Patent: January 29, 2019
    Assignees: SAMSUNG ELETRÔNICA DA AMAZÔNIA LTDA., UNIVERSIDADE ESTADUAL DE CAMPINAS
    Inventors: Sandra Avila, Daniel Moreira, Mauricio Perez, Daniel Moraes, Vanessa Testoni, Siome Goldenstein, Eduardo Valle, Anderson Rocha
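The final step of the abstract above, predicting sensitive moments by combining fragment labels along the timeline, can be sketched as follows. This is a minimal illustration, not the patented method: the fragment boundaries, the 0.5 confidence cut-off, and the interval-merging rule are all assumptions.

```python
# Combine per-fragment (label, confidence) pairs along the video timeline
# into merged intervals of sensitive content. Inputs are hypothetical.
def predict_sensitive_times(fragments):
    """fragments: list of (start_s, end_s, label, confidence),
    where label is 'positive' (sensitive) or 'negative'."""
    sensitive = []
    for start, end, label, confidence in fragments:
        if label == "positive" and confidence >= 0.5:
            # Merge with the previous interval if the fragments touch.
            if sensitive and sensitive[-1][1] >= start:
                sensitive[-1] = (sensitive[-1][0], end)
            else:
                sensitive.append((start, end))
    return sensitive

print(predict_sensitive_times([
    (0, 10, "negative", 0.9),
    (10, 20, "positive", 0.8),
    (20, 30, "positive", 0.7),
    (30, 40, "negative", 0.6),
]))  # [(10, 30)]
```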
  • Patent number: 10095768
    Abstract: The disclosed computer-implemented method for aggregating information-asset classifications may include (1) identifying a data collection that includes two or more information assets, (2) identifying a classification for each of the information assets, (3) deriving, based at least in part on the classifications of the information assets, an aggregate classification for the data collection, and (4) associating the aggregate classification with the data collection to enable a data management system to enforce a data management policy based on the aggregate classification. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Grant
    Filed: November 14, 2014
    Date of Patent: October 9, 2018
    Assignee: Veritas Technologies LLC
    Inventor: Robert Koeten
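Step (3) of the abstract above, deriving an aggregate classification from per-asset classifications, could be implemented as "most restrictive wins". The level names and their ordering below are illustrative assumptions, not taken from the patent.

```python
# Derive an aggregate classification for a data collection as the most
# restrictive classification among its information assets.
# LEVELS is a hypothetical ordering from least to most restrictive.
LEVELS = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}

def aggregate_classification(asset_classifications):
    return max(asset_classifications, key=lambda c: LEVELS[c])

collection = ["public", "confidential", "internal"]
print(aggregate_classification(collection))  # confidential
```

A data management system could then enforce a policy keyed on the aggregate label rather than inspecting every asset.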
  • Patent number: 10032461
    Abstract: An apparatus includes microphone receivers configured to receive microphone signals from a plurality of microphones. A comparator is configured to determine, for each microphone signal, a speech similarity indication indicative of a similarity between the microphone signal and non-reverberant speech. The determination is made in response to a comparison of a property derived from the microphone signal to a reference property for non-reverberant speech. In some embodiments, the comparator determines the similarity indication by comparing against reference properties for speech samples of a set of non-reverberant speech samples. A generator is configured to generate a speech signal by combining the microphone signals in response to the similarity indications. The apparatus may be distributed over a plurality of devices, each containing a microphone, and the approach may determine the most suitable microphone for generating the speech signal.
    Type: Grant
    Filed: February 18, 2014
    Date of Patent: July 24, 2018
    Assignee: KONINKLIJKE PHILIPS N.V.
    Inventor: Sriram Srinivasan
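The selection idea in the abstract above can be sketched by deriving a property from each microphone signal and picking the channel closest to a reference property for non-reverberant speech. Using sample kurtosis as that property is an assumption made here for illustration (reverberation tends to lower the kurtosis of speech); the patent does not commit to one specific property.

```python
import numpy as np

def kurtosis(x):
    # Fourth standardized moment of the signal.
    x = x - x.mean()
    return np.mean(x**4) / (np.mean(x**2) ** 2 + 1e-12)

def best_microphone(mic_signals, reference_kurtosis):
    # Similarity = negative distance between each channel's kurtosis
    # and the reference value for non-reverberant speech.
    similarities = [-abs(kurtosis(s) - reference_kurtosis) for s in mic_signals]
    return int(np.argmax(similarities))
```

For example, a heavy-tailed (speech-like) channel would be preferred over a Gaussian (diffuse, reverberant-sounding) one when the reference kurtosis is set near that of clean speech.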
  • Patent number: 10026396
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving a sequence representing an utterance, the sequence comprising a plurality of audio frames; determining one or more warping factors for each audio frame in the sequence using a warping neural network; applying, for each audio frame, the one or more warping factors for the audio frame to the audio frame to generate a respective modified audio frame, wherein the applying comprises using at least one of the warping factors to scale a respective frequency of the audio frame to a new respective frequency in the respective modified audio frame; and decoding the modified audio frames using a decoding neural network, wherein the decoding neural network is configured to output a word sequence that is a transcription of the utterance.
    Type: Grant
    Filed: July 27, 2016
    Date of Patent: July 17, 2018
    Assignee: Google LLC
    Inventor: Andrew W. Senior
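The per-frame warping step in the abstract above, scaling each frequency to a new frequency, can be sketched as resampling a frame's magnitude spectrum along a warped frequency axis. The single piecewise-linear warp below is an assumption for illustration; the patent's warping neural network may emit several factors per frame.

```python
import numpy as np

def warp_frame(magnitudes, alpha):
    """Map each frequency bin k to alpha * k by reading each output bin
    from input position k / alpha (linear interpolation)."""
    n = len(magnitudes)
    source_bins = np.arange(n) / alpha
    return np.interp(source_bins, np.arange(n), magnitudes,
                     left=magnitudes[0], right=magnitudes[-1])
```

The modified frames would then be passed to the decoding network that outputs the word sequence.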
  • Patent number: 9953638
    Abstract: A computer-implemented method is described for front end speech processing for automatic speech recognition. A sequence of speech features which characterize an unknown speech input provided on an audio input channel and associated meta-data which characterize the audio input channel are received. The speech features are transformed with a computer process that uses a trained mapping function controlled by the meta-data, and automatic speech recognition is performed of the transformed speech features.
    Type: Grant
    Filed: June 28, 2012
    Date of Patent: April 24, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel Willett, Karl Jonas Lööf, Yue Pan, Joel Pinto, Christian Gollan
  • Patent number: 9875747
    Abstract: A sensor device may include a computing device in communication with multiple microphones. A neural network executing on the computing device may receive audio signals from each microphone. One microphone signal may serve as a reference signal. The neural network may extract differences in signal characteristics of the other microphone signals as compared to the reference signal. The neural network may combine these signal differences into a lossy compressed signal. The sensor device may transmit the lossy compressed signal and the lossless reference signal to a remote neural network executing in a cloud computing environment for decompression and sound recognition analysis.
    Type: Grant
    Filed: July 15, 2016
    Date of Patent: January 23, 2018
    Assignee: Google LLC
    Inventors: Chanwoo Kim, Rajeev Conrad Nongpiur, Tara Sainath
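The core compression idea above, one lossless reference channel plus lossy encodings of the other channels' differences, can be sketched with a uniform quantizer standing in for the patent's neural-network compressor. The step size and channel layout below are illustrative assumptions.

```python
import numpy as np

def encode(mic_signals, ref_index=0, step=0.05):
    """Keep one channel as a lossless reference; coarsely quantize the
    differences of the remaining channels (the lossy part)."""
    reference = mic_signals[ref_index]
    diffs = [np.round((s - reference) / step).astype(int)
             for i, s in enumerate(mic_signals) if i != ref_index]
    return reference, diffs

def decode(reference, diffs, step=0.05):
    # Reconstruct each non-reference channel from the reference plus
    # its dequantized difference signal.
    return [reference] + [reference + d * step for d in diffs]
```

Because nearby microphones capture highly correlated signals, the difference streams are small and quantize cheaply, which is what makes transmitting them to a cloud-side decompressor attractive.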
  • Patent number: 9785313
    Abstract: A method and system for providing a distraction free reading mode with an electronic personal display is disclosed. One example accesses non-adjustable settings for a reader mode. In addition, user adjustable settings for the reader mode on the electronic personal display are also accessed. The user adjustable settings and the non-adjustable settings are then implemented when the reader mode is initiated.
    Type: Grant
    Filed: June 28, 2013
    Date of Patent: October 10, 2017
    Assignee: RAKUTEN KOBO, INC.
    Inventors: James Wu, Peter James Farmer, Michael Serbinis, Pamela Lynn Hilborn
  • Patent number: 9721202
    Abstract: Sound processing techniques using recurrent neural networks are described. In one or more implementations, temporal dependencies are captured in sound data that are modeled through use of a recurrent neural network (RNN). The captured temporal dependencies are employed as part of feature extraction performed using nonnegative matrix factorization (NMF). One or more sound processing techniques are performed on the sound data based at least in part on the feature extraction.
    Type: Grant
    Filed: February 21, 2014
    Date of Patent: August 1, 2017
    Assignee: Adobe Systems Incorporated
    Inventors: Nicolas Maurice Boulanger-Lewandowski, Gautham J. Mysore, Matthew Douglas Hoffman
  • Patent number: 9672207
    Abstract: A method, system, and non-transitory computer readable medium for determining and discerning items with multiple meanings in a sequence of items, including: producing a distributed representation for each item of the sequence of items, including a word vector and a context vector; partitioning the sequence of items into classes; for an item, using a representative word vector of each class, calculating a cosine distance between the word vector of said item and the class representative vector; and producing a new sequence of items by modifying the distributed representation, replacing each occurrence of an item depending on the calculated cosine distance.
    Type: Grant
    Filed: October 19, 2015
    Date of Patent: June 6, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Oded Shmueli
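The cosine comparison in the abstract above can be sketched directly: compare an item's word vector with each class's representative vector and rewrite the occurrence with a sense tag for the nearest class. The vectors and the `word#class` tagging scheme are illustrative assumptions.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def tag_occurrence(word, word_vec, class_reps):
    """class_reps: {class_id: representative word vector}.
    Return the item rewritten with its nearest class as a sense tag."""
    best = max(class_reps, key=lambda c: cosine(word_vec, class_reps[c]))
    return f"{word}#{best}"

reps = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0])}
print(tag_occurrence("bank", np.array([0.9, 0.1]), reps))  # bank#0
```

Re-training embeddings on the tagged sequence would then give each sense of an ambiguous item its own vector.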
  • Patent number: 9659579
    Abstract: The present invention relates to a method of evaluating intelligibility of a degraded speech signal received from an audio transmission system conveying a reference signal. The method comprises sampling said reference and degraded signals into frames and forming frame pairs. For each pair, one or more difference functions representing a difference between the degraded and reference signal are provided. A difference function is selected and compensated for different disturbance types, so as to provide a disturbance density function adapted to human auditory perception. An overall quality parameter is determined indicative of the intelligibility of the degraded signal. The method comprises determining a switching parameter, indicative of the audio power level of said degraded signal, for performing said selecting.
    Type: Grant
    Filed: November 15, 2012
    Date of Patent: May 23, 2017
    Assignee: Nederlandse Organisatie voor toegepast-natuurwetenschappelijk onderzoek TNO
    Inventor: John Gerard Beerends
  • Patent number: 9342781
    Abstract: We describe a signal processor, the signal processor comprising: a probability vector generation system, wherein said probability vector generation system has an input to receive a category vector for a category of output example and an output to provide a probability vector for said category of output example, wherein said output example comprises a set of data points, and wherein said probability vector defines a probability of each of said set of data points for said category of output example; a memory storing a plurality of said category vectors, one for each of a plurality of said categories of output example; and a stochastic selector to select a said stored category of output example for presentation of the corresponding category vector to said probability vector generation system; wherein said signal processor is configured to output data for an output example corresponding to said selected stored category.
    Type: Grant
    Filed: June 24, 2013
    Date of Patent: May 17, 2016
    Assignee: Google Inc.
    Inventors: Julien Robert Michel Cornebise, Danilo Jimenez Rezende, Daniël Pieter Wierstra
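The generative path in the abstract above can be sketched as: a stochastic selector picks a stored category, the category vector is mapped to a probability vector, and each data point of the output example is an independent Bernoulli draw from that vector. The linear projection and sigmoid below stand in for the patent's probability vector generation system and are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_example(category_vectors, projection):
    """category_vectors: list of stored category vectors (length d).
    projection: (n_points, d) matrix mapping a category vector to logits."""
    category = int(rng.integers(len(category_vectors)))   # stochastic selector
    logits = projection @ category_vectors[category]
    probabilities = 1.0 / (1.0 + np.exp(-logits))         # probability vector
    sample = (rng.random(len(probabilities)) < probabilities).astype(int)
    return category, sample
```

Each stored category thus behaves like a learned template from which concrete output examples (sets of binary data points) are sampled.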
  • Patent number: 9336775
    Abstract: A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods of the Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial log likelihood of a Gaussian is already too small, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. Partial distance elimination occurs when fewer than all of the dimensions in the group have been sequentially added.
    Type: Grant
    Filed: March 5, 2013
    Date of Patent: May 10, 2016
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Jinyu Li, Zhijie Yan, Qiang Huo, Yifan Gong
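Partial distance elimination for a diagonal-covariance Gaussian can be sketched as below: the log likelihood is accumulated dimension by dimension, and evaluation stops as soon as the partial score falls under a pruning threshold, at which point the Gaussian is assigned a zero posterior. The threshold value is an assumed tuning constant.

```python
import numpy as np

def log_likelihood_with_pde(x, mean, var, log_const, threshold):
    """Accumulate the diagonal-Gaussian log likelihood one dimension at a
    time; return None (zero posterior) on early elimination."""
    score = log_const                 # constant term of the log likelihood
    for d in range(len(x)):
        score -= 0.5 * (x[d] - mean[d]) ** 2 / var[d]
        if score < threshold:         # partial score already too small
            return None               # -> zero posterior, skip remaining dims
    return score
```

The saving comes from the early return: distant Gaussians are rejected after evaluating only a few dimensions instead of all of them.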
  • Publication number: 20150127327
    Abstract: The technology described herein can be embodied in a method that includes receiving an audio signal encoding a portion of an utterance, and providing, to a first neural network, data corresponding to the audio signal. The method also includes generating, by a processor, data representing a transcription for the utterance based on an output of the first neural network. The first neural network is trained using features of multiple context-dependent states, the context-dependent states being derived from a plurality of context-independent states provided by a second neural network.
    Type: Application
    Filed: May 20, 2014
    Publication date: May 7, 2015
    Applicant: Google Inc.
    Inventors: Michiel A.U. Bacchiani, David Rybach
  • Patent number: 9020812
    Abstract: Disclosed is an audio signal processing method comprising the steps of: receiving an audio signal containing current frame data; generating a first temporary output signal for the current frame when an error occurs in the current frame data, by carrying out frame error concealment on the current frame data using a random codebook; generating a parameter by carrying out one or more of short-term prediction, long-term prediction, and a fixed codebook search based on the first temporary output signal; and updating a memory with the parameter for the next frame; wherein the parameter comprises one or more of pitch gain, pitch delay, fixed codebook gain, and a fixed codebook.
    Type: Grant
    Filed: November 24, 2010
    Date of Patent: April 28, 2015
    Assignees: LG Electronics Inc., Industry-Academic Cooperation Foundation, Yonsei University
    Inventors: Hye Jeong Jeon, Dae Hwan Kim, Hong Goo Kang, Min Ki Lee, Byung Suk Lee, Gyu Hyeok Jeong
  • Publication number: 20150039299
    Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.
    Type: Application
    Filed: September 18, 2013
    Publication date: February 5, 2015
    Applicant: Google Inc.
    Inventors: Eugene Weinstein, Pedro J. Moreno Mengibar, Johan Schalkwyk
  • Patent number: 8930183
    Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice; dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text from that of the speech input, and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
    Type: Grant
    Filed: August 25, 2011
    Date of Patent: January 6, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Byung Ha Chun, Mark John Francis Gales
  • Patent number: 8924200
    Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: December 30, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Patent number: 8924199
    Abstract: A voice correction device includes a detector that detects a response from a user, a calculator that calculates an acoustic characteristic amount of an input voice signal, an analyzer that outputs an acoustic characteristic amount of a predetermined amount when having acquired a response signal due to the response from the detector, a storage unit that stores the acoustic characteristic amount output by the analyzer, a controller that calculates a correction amount of the voice signal on the basis of a result of a comparison between the acoustic characteristic amount calculated by the calculator and the acoustic characteristic amount stored in the storage unit, and a correction unit that corrects the voice signal on the basis of the correction amount calculated by the controller.
    Type: Grant
    Filed: December 20, 2011
    Date of Patent: December 30, 2014
    Assignee: Fujitsu Limited
    Inventors: Chisato Ishikawa, Takeshi Otani, Taro Togawa, Masanao Suzuki, Masakiyo Tanaka
  • Patent number: 8924220
    Abstract: In a multiband compressor 100, a level calculation unit 121 calculates the level of the signal input for each band, a gain calculation unit 122 calculates a gain value from the calculated signal level, and a gain limitation unit 130 limits the gain value for each band by comparison with the gain values of the other bands. This configuration provides a multiband compressor capable of balancing sound quality against the effect of enhancing the sound level.
    Type: Grant
    Filed: September 7, 2010
    Date of Patent: December 30, 2014
    Assignee: Lenovo Innovations Limited (Hong Kong)
    Inventor: Satoshi Hosokawa
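The per-band gain path above can be sketched as: compute a compressor gain from each band's level, then limit each gain by comparison with the other bands so no band is boosted far beyond its neighbours. The compressor curve, target level, and 6 dB spread limit are illustrative choices, not values from the patent.

```python
import numpy as np

def band_gains_db(levels_db, target_db=-10.0, ratio=2.0, max_spread_db=6.0):
    """Gain calculation + gain limitation for a multiband compressor."""
    # Compressor: move each band partway toward the target level.
    gains = (target_db - np.asarray(levels_db, dtype=float)) / ratio
    # Gain limitation: clamp each band's gain relative to the smallest
    # gain among the bands, preserving inter-band balance.
    return np.minimum(gains, gains.min() + max_spread_db)

print(band_gains_db([-30.0, -12.0]))  # [7. 1.]
```

Without the limitation step the quiet band would receive a 10 dB boost, unbalancing the spectrum; clamping it to 7 dB trades some loudness for sound quality, which is the balance the abstract describes.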
  • Publication number: 20140358526
    Abstract: A non-intrusive objective speech quality assessment is performed on a degraded speech signal. The methods are well suited for systems where random and bursty packet losses may occur and/or packet stream regeneration may occur prior to speech signal quality assessment. In one embodiment, received packetized speech is analyzed to determine an overall final signal quality score. A limited set of trained neural networks, e.g., 5, corresponding to different signal features, each determine a signal feature quality score. A trained joint quality score determination module determines a joint quality score based on the signal feature quality scores. Packet loss is estimated based on received packet header information and/or detected gap durations. The determined joint quality score is adjusted, based on estimated packet loss information obtained from examining the speech signal, network level statistics and/or codec parameters, to generate the final quality score.
    Type: Application
    Filed: May 31, 2013
    Publication date: December 4, 2014
    Inventors: Ahmed Abdelal, Vladimir Botchev, Wassim Matragi
  • Patent number: 8886612
    Abstract: An apparatus for providing improved data compression may include an encoder comprising a quantizer for encoding input data and a side model. The quantizer may be trained with respect to high priority data among the input data and may be configured to partially encode the input data by encoding the high priority data. The side model may be trained jointly with the training of the quantizer and is configured to model low priority data among the input data.
    Type: Grant
    Filed: October 4, 2007
    Date of Patent: November 11, 2014
    Assignee: Core Wireless Licensing S.a.r.l.
    Inventors: Jani K. Nurminen, Sakari Himanen
  • Publication number: 20140278379
    Abstract: In one implementation, a computer-implemented method includes receiving, at a computer system, a request to predict a next word in a dialog being uttered by a speaker; accessing, by the computer system, a neural network comprising i) an input layer, ii) one or more hidden layers, and iii) an output layer; identifying the local context for the dialog of the speaker; selecting, by the computer system and using a semantic model, at least one vector that represents the semantic context for the dialog; applying input to the input layer of the neural network, the input comprising i) the local context of the dialog and ii) the values for the at least one vector; generating probability values for at least a portion of the candidate words; and providing, by the computer system and based on the probability values, information that identifies one or more of the candidate words.
    Type: Application
    Filed: April 16, 2013
    Publication date: September 18, 2014
    Applicant: Google Inc.
    Inventors: Noah B. Coccaro, Patrick An Phu Nguyen
  • Patent number: 8838446
    Abstract: Provided is a method and apparatus for transforming a speech feature vector. The method includes extracting a feature vector required for speech recognition from a speech signal and transforming the extracted feature vector using an auto-associative neural network (AANN).
    Type: Grant
    Filed: August 31, 2007
    Date of Patent: September 16, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: So-young Jeong, Kwang-cheol Oh, Jae-hoon Jeong, Jeong-su Kim
  • Patent number: 8762142
    Abstract: Provided are a multi-stage speech recognition apparatus and method. The multi-stage speech recognition apparatus includes a first speech recognition unit performing initial speech recognition on a feature vector, which is extracted from an input speech signal, and generating a plurality of candidate words; and a second speech recognition unit rescoring the candidate words, which are provided by the first speech recognition unit, using a temporal posterior feature vector extracted from the speech signal.
    Type: Grant
    Filed: August 15, 2007
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: So-young Jeong, Kwang-cheol Oh, Jae-hoon Jeong, Jeong-su Kim
  • Patent number: 8762158
    Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
    Type: Grant
    Filed: August 5, 2011
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
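The synthesis step above can be sketched as multiplying each transformed sub-band signal by its synthesis-filter-bank coefficient; summing the weighted sub-bands into one output, as done here, is an assumed combination step, and the coefficients are illustrative.

```python
import numpy as np

def synthesize(transformed_subbands, synthesis_coeffs):
    """Multiply each of the n transformed sub-band signals by the value
    corresponding to its synthesis filter bank coefficient, then combine."""
    weighted = [c * s for c, s in zip(synthesis_coeffs, transformed_subbands)]
    return np.sum(weighted, axis=0)
```

In a real decoder the coefficients would come from the codec's synthesis filter bank rather than being free parameters.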
  • Publication number: 20140142929
    Abstract: The use of a pipelined algorithm that performs parallelized computations to train deep neural networks (DNNs) for performing data analysis may reduce training time. The DNNs may be one of context-independent DNNs or context-dependent DNNs. The training may include partitioning training data into sample batches of a specific batch size. The partitioning may be performed based on rates of data transfers between processors that execute the pipelined algorithm, considerations of accuracy and convergence, and the execution speed of each processor. Other techniques for training may include grouping layers of the DNNs for processing on a single processor, distributing a layer of the DNNs to multiple processors for processing, or modifying an execution order of steps in the pipelined algorithm.
    Type: Application
    Filed: November 20, 2012
    Publication date: May 22, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Frank Torsten Bernd Seide, Gang Li, Dong Yu, Adam C. Eversole, Xie Chen
  • Patent number: 8606571
    Abstract: The present technology provides noise reduction of an acoustic signal using a configurable classification threshold which provides a sophisticated level of control to balance the tradeoff between positional robustness and noise reduction robustness. The configurable classification threshold corresponds to a configurable spatial region, such that signals arising from sources within the configurable spatial region are preserved, and signals arising from sources outside it are rejected. In embodiments, the configurable classification threshold can be automatically and dynamically adjusted in real-time based on evaluated environmental conditions surrounding an audio device implementing the noise reduction techniques described herein.
    Type: Grant
    Filed: July 15, 2010
    Date of Patent: December 10, 2013
    Assignee: Audience, Inc.
    Inventors: Mark Every, Carlo Murgia
  • Patent number: 8589334
    Abstract: Methods and systems are provided for developing decision information relating to a single system based on data received from a plurality of sensors. The method includes receiving first data from a first sensor that defines first information of a first type that is related to a system, receiving second data from a second sensor that defines second information of a second type that is related to said system, wherein the first type is different from the second type, generating a first decision model, a second decision model, and a third decision model, determining whether data is available from only the first sensor, only the second sensor, or both the first and second sensors, and selecting based on the determination of availability an additional model to apply the available data, wherein the additional model is selected from a plurality of additional decision models including the third decision model.
    Type: Grant
    Filed: January 18, 2011
    Date of Patent: November 19, 2013
    Assignee: Telcordia Technologies, Inc.
    Inventor: Akshay Vashist
  • Patent number: 8577045
    Abstract: An encoding apparatus comprises a frame processor (105) which receives a multi channel audio signal comprising at least a first audio signal from a first microphone (101) and a second audio signal from a second microphone (103). An ITD processor (107) then determines an inter time difference between the first audio signal and the second audio signal and a set of delays (109, 111) generates a compensated multi channel audio signal from the multi channel audio signal by delaying at least one of the first and second audio signals in response to the inter time difference signal. A combiner (113) then generates a mono signal by combining channels of the compensated multi channel audio signal and a mono signal encoder (115) encodes the mono signal. The inter time difference may specifically be determined by an algorithm based on determining cross correlations between the first and second audio signals.
    Type: Grant
    Filed: September 9, 2008
    Date of Patent: November 5, 2013
    Assignee: Motorola Mobility LLC
    Inventor: Jonathan A. Gibbs
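The cross-correlation-based inter time difference and delay compensation described above can be sketched as below. Using the full cross-correlation over the whole signal and a simple averaging downmix are simplifications for illustration.

```python
import numpy as np

def inter_time_difference(first, second):
    """Lag (in samples) by which `second` trails `first`, from the peak
    of their cross-correlation."""
    correlation = np.correlate(second, first, mode="full")
    return int(np.argmax(correlation)) - (len(first) - 1)

def align_and_mix(first, second):
    """Delay the leading channel by the estimated ITD, then downmix."""
    delay = inter_time_difference(first, second)
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    if delay > 0:    # first leads: delay it
        first = np.concatenate([np.zeros(delay), first[:len(first) - delay]])
    elif delay < 0:  # second leads: delay it
        second = np.concatenate([np.zeros(-delay), second[:len(second) + delay]])
    return 0.5 * (first + second)
```

Aligning the channels before the downmix prevents the comb-filtering that would result from summing time-shifted copies of the same source.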
  • Patent number: 8577821
    Abstract: A method and apparatus for encoding the amplitude of frequency signals into a temporal electrical spike pattern, and a method for decoding the generated pattern by recognizing and classifying it with a correlating (associative pattern recognition) neuron-like device. The first part of the system takes signals and converts their amplitude into a set of timed pulses. The timing of each pulse is a function of the amplitude of the incoming signal. Once a series of timed pulses has been created, they can be sent to a decoder. The decoder circuit uses bandpass filters, delays/gains, and a low-pass summation. The recognition portion compares the timed input spikes with a known sample. Only when the input signals match the sample signals will the amplitudes be sufficient to meet the trigger threshold. When the output spike is generated, a pattern match has been found.
    Type: Grant
    Filed: April 16, 2010
    Date of Patent: November 5, 2013
    Inventor: Thomas D. Humphrey
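The encoder half of the scheme above, converting amplitude into pulse timing, can be sketched with a linear time code in which larger amplitudes fire earlier within a fixed encoding window. The linear mapping and the 10 ms window are illustrative assumptions.

```python
def amplitude_to_spike_time(amplitude, max_amplitude=1.0, window_ms=10.0):
    """Encode an amplitude as the firing time of a single pulse:
    stronger inputs fire earlier in the encoding window."""
    amplitude = min(max(amplitude, 0.0), max_amplitude)  # clip to valid range
    return window_ms * (1.0 - amplitude / max_amplitude)

print(amplitude_to_spike_time(1.0))   # 0.0 (strongest input fires immediately)
print(amplitude_to_spike_time(0.25))  # 7.5
```

A decoder with matched delays can then recognize a known amplitude pattern by checking whether the delayed spikes arrive in coincidence, which is the correlation test the abstract describes.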
  • Publication number: 20130262096
    Abstract: A system-effected method for synthesizing speech, or recognizing speech including a sequence of expressive speech utterances. The method can be computer-implemented and can include system-generating a speech signal embodying the sequence of expressive speech utterances. Other possible steps include: system-marking the speech signal with a pitch marker indicating a pitch change at or near a first zero amplitude crossing point of the speech signal following a glottal closure point, at a minimum, at a maximum or at another location; system marking the speech signal with at least one further pitch marker; system-aligning a sequence of prosodically marked text with the pitch-marked speech signal according to the pitch markers; and system outputting the aligned text or the aligned speech signal, respectively. Computerized systems, and stored programs for implementing method embodiments of the invention are also disclosed.
    Type: Application
    Filed: September 21, 2012
    Publication date: October 3, 2013
    Applicant: LESSAC TECHNOLOGIES, INC.
    Inventors: Reiner Wilhelms-Tricarico, Brian Mottershead, Rattima Nitisaroj, Michael Baumgartner, John B. Reichenbach, Gary A. Marple
  • Patent number: 8494840
    Abstract: The invention relates to audio signal processing and speech enhancement. In accordance with one aspect, the invention combines a high-quality audio program that is a mix of speech and non-speech audio with a lower-quality copy of the speech components contained in the audio program for the purpose of generating a high-quality audio program with an increased ratio of speech to non-speech audio such as may benefit the elderly, hearing impaired or other listeners. Aspects of the invention are particularly useful for television and home theater sound, although they may be applicable to other audio and sound applications. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.
    Type: Grant
    Filed: February 12, 2008
    Date of Patent: July 23, 2013
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Hannes Muesch
  • Patent number: 8463720
    Abstract: A method is provided for defining a network of nodes, each node representing a unique concept, and making connections between individual concepts through unique relationships to other concepts. Each node is operable to store a unique identifier in the network and information regarding the concept, in addition to the unique relationships.
    Type: Grant
    Filed: March 26, 2010
    Date of Patent: June 11, 2013
    Assignee: Neuric Technologies, LLC
    Inventors: Jennifer Seale, Hannah Lindsley, Timothy Allen Margheim
  • Patent number: 8428953
    Abstract: An audio decoding device of the present invention includes: a decoding unit decoding a stream to a spectrum coefficient and outputting stream information when a frame included in the stream cannot be decoded; an orthogonal transformation unit transforming the spectrum coefficient to a time signal; a correction unit generating a correction time signal, when the decoding unit outputs the stream information, based on an output waveform within a reference section that lies in the overlap between the error frame section for which the stream information is outputted and the adjacent frame section, in the middle of the adjacent frame section; and an output unit generating the output waveform by synthesizing the correction time signal and the time signal.
    Type: Grant
    Filed: May 20, 2008
    Date of Patent: April 23, 2013
    Assignee: Panasonic Corporation
    Inventors: Kojiro Ono, Takeshi Norimatsu, Yoshiaki Takagi, Takashi Katayama
  • Patent number: 8428946
    Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
    Type: Grant
    Filed: July 6, 2012
    Date of Patent: April 23, 2013
    Assignee: Google Inc.
    Inventor: Marco Paniconi
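    A minimal sketch of the per-bin classification and cross-channel fusion described above, under assumed simplifications: one scalar feature per time/frequency bin, a logistic speech/noise probability function, and plain averaging as the fusion rule (the patent's layered network model is more elaborate). All names are illustrative.

    ```python
    import numpy as np

    def speech_probability(feature, weight=1.0, bias=0.0):
        # Logistic mapping of a per-bin feature to a speech probability.
        return 1.0 / (1.0 + np.exp(-(weight * feature + bias)))

    def fuse_channels(channel_feats):
        """channel_feats: array of shape (channels, frames, bins).
        Compute a per-channel speech probability for every time/frequency
        bin, then fuse across channels by averaging."""
        probs = speech_probability(np.asarray(channel_feats, dtype=float))
        fused = probs.mean(axis=0)
        return fused, fused > 0.5      # probability map and binary speech mask
    ```
    
    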
  • Patent number: 8391373
    Abstract: A method is provided for concealing a transmission error in a digital signal chopped into a plurality of successive frames associated with different time intervals in which, on reception, the signal may comprise erased frames and valid frames, the valid frames comprising information relating to the concealment of frame loss. The method is implemented during a hierarchical decoding using a core decoding and a transform-based decoding using windows introducing a time delay of less than a frame with respect to the core decoding. The method includes concealing a first set of missing samples for the erased frame, implemented in a first time interval; a step of concealing a second set of missing samples utilizing information of said valid frame and implemented in a second time interval; and a step of transition between the first and the second set of missing samples to obtain at least part of the missing frame.
    Type: Grant
    Filed: March 20, 2009
    Date of Patent: March 5, 2013
    Assignee: France Telecom
    Inventors: David Virette, Pierrick Philippe, Balazs Kovesi
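    The transition step between the two sets of concealed samples can be pictured as a crossfade: a sketch, assuming a simple linear weighting from the first concealment (extrapolated from the past) to the second (built from the concealment information in the next valid frame). The linear window is an assumption; the patent does not commit to it here.

    ```python
    import numpy as np

    def transition(first_estimate, second_estimate):
        """Linear crossfade from the first set of concealed samples
        to the second set over the length of the missing segment."""
        n = len(first_estimate)
        w = np.linspace(0.0, 1.0, n)
        return (1.0 - w) * first_estimate + w * second_estimate
    ```
    
    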
  • Publication number: 20120290294
    Abstract: A method and apparatus are provided for processing a set of communicated signals associated with a set of muscles, such as the muscles near the larynx of the person, or any other muscles the person use to achieve a desired response. The method includes the steps of attaching a single integrated sensor, for example, near the throat of the person proximate to the larynx and detecting an electrical signal through the sensor. The method further includes the steps of extracting features from the detected electrical signal and continuously transforming them into speech sounds without the need for further modulation. The method also includes comparing the extracted features to a set of prototype features and selecting a prototype feature of the set of prototype features providing a smallest relative difference.
    Type: Application
    Filed: July 27, 2012
    Publication date: November 15, 2012
    Inventors: Michael Callahan, Thomas Coleman
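    The prototype-selection step above (smallest relative difference between extracted features and a stored prototype set) can be sketched as a nearest-prototype search. The relative-difference metric used here (Euclidean distance normalized by the prototype's norm) is an assumption for illustration.

    ```python
    import numpy as np

    def select_prototype(features, prototypes):
        """Return the index of the prototype with the smallest relative
        difference to the extracted feature vector, and that difference."""
        feats = np.asarray(features, dtype=float)
        best_idx, best_diff = -1, np.inf
        for i, proto in enumerate(prototypes):
            p = np.asarray(proto, dtype=float)
            rel = np.linalg.norm(feats - p) / (np.linalg.norm(p) + 1e-12)
            if rel < best_diff:
                best_idx, best_diff = i, rel
        return best_idx, best_diff
    ```
    
    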
  • Patent number: 8296134
    Abstract: A spectrum modifying method and the like wherein the efficiency of signal estimation and prediction can be improved and the spectrum can be encoded more efficiently. According to this method, the pitch period is calculated from an original signal, which serves as a reference signal, and then a basic pitch frequency (f0) is calculated. Thereafter, the spectrum of a target signal, which is a target of spectrum modification, is divided into a plurality of partitions. It is specified here that the width of each partition be the basic pitch frequency. Then, the spectra of bands are interleaved such that a plurality of peaks having similar amplitudes are unified into a group. The basic pitch frequency is used as an interleave pitch.
    Type: Grant
    Filed: May 11, 2006
    Date of Patent: October 23, 2012
    Assignee: Panasonic Corporation
    Inventors: Chun Woei Teo, Sua Hong Neo, Koji Yoshida, Michiyo Goto
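    The partition-and-interleave step can be sketched in a few lines, assuming the basic pitch frequency is expressed as a partition width in spectral bins: splitting the spectrum into pitch-wide partitions and reading them column-wise places the harmonically aligned bins (the peaks with similar amplitudes) next to each other. Truncation of leftover bins is a simplification.

    ```python
    import numpy as np

    def interleave_spectrum(spectrum, pitch_bins):
        """Split the spectrum into partitions of width `pitch_bins` (the
        basic pitch frequency in bins) and interleave so the k-th bin of
        every partition ends up adjacent in the output."""
        n = (len(spectrum) // pitch_bins) * pitch_bins
        parts = np.asarray(spectrum[:n]).reshape(-1, pitch_bins)  # one partition per row
        return parts.T.reshape(-1)     # column-major read groups aligned bins
    ```
    
    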
  • Patent number: 8239196
    Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
    Type: Grant
    Filed: July 28, 2011
    Date of Patent: August 7, 2012
    Assignee: Google Inc.
    Inventor: Marco Paniconi
  • Patent number: 8239194
    Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
    Type: Grant
    Filed: September 26, 2011
    Date of Patent: August 7, 2012
    Assignee: Google Inc.
    Inventor: Marco Paniconi
  • Patent number: 8214203
    Abstract: A method and an apparatus for recovering a line spectrum pair (LSP) parameter of a spectrum region when frame loss occurs during speech decoding and a speech decoding apparatus adopting the same are provided. The method of recovering an LSP parameter in speech decoding includes: if it is determined that a received speech packet has an erased frame, converting an LSP parameter of a previous good frame (PGF) of the erased frame or LSP parameters of the PGF and a next good frame (NGF) of the erased frame into a spectrum region and obtaining a spectrum envelope of the PGF or spectrum envelopes of the PGF and NGF; recovering a spectrum envelope of the erased frame using the spectrum envelope of the PGF or the spectrum envelopes of the PGF and NGF; and converting the recovered spectrum envelope of the erased frame into an LSP parameter of the erased frame.
    Type: Grant
    Filed: March 25, 2010
    Date of Patent: July 3, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hosang Sung, Seungho Choi, Kihyun Choo
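    The envelope-recovery step above reduces to interpolation in the spectrum domain: a sketch, assuming the previous good frame (PGF) and next good frame (NGF) envelopes are already available as arrays; the linear blend and the repeat-PGF fallback are illustrative choices, and the LSP-to-envelope conversions are omitted.

    ```python
    import numpy as np

    def recover_envelope(pgf_env, ngf_env=None, alpha=0.5):
        """Recover the erased frame's spectrum envelope by interpolating
        between the PGF and NGF envelopes, or by repeating the PGF
        envelope when no NGF is available."""
        pgf = np.asarray(pgf_env, dtype=float)
        if ngf_env is None:
            return pgf.copy()
        return (1.0 - alpha) * pgf + alpha * np.asarray(ngf_env, dtype=float)
    ```
    
    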
  • Patent number: 8126710
    Abstract: A method of adapting a neural network of an automatic speech recognition device, includes the steps of: providing a neural network including an input stage, an intermediate stage and an output stage, the output stage outputting phoneme probabilities; providing a linear stage in the neural network; and training the linear stage by means of an adaptation set; wherein the step of providing the linear stage includes the step of providing the linear stage after the intermediate stage.
    Type: Grant
    Filed: June 1, 2005
    Date of Patent: February 28, 2012
    Assignee: Loquendo S.p.A.
    Inventors: Roberto Gemello, Franco Mana
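    The adaptation scheme above (a trainable linear stage inserted after the intermediate stage, with the rest of the network frozen) can be sketched with NumPy. The sizes, learning rate, and identity initialization of the linear stage are assumptions for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    W_h = rng.normal(size=(8, 4))      # input -> intermediate stage (frozen)
    W_o = rng.normal(size=(4, 3))      # intermediate -> output stage (frozen)
    A = np.eye(4)                      # linear adaptation stage, starts as identity

    def forward(x, A):
        h = np.tanh(x @ W_h)           # intermediate stage
        z = (h @ A) @ W_o              # linear stage, then output stage
        e = np.exp(z - z.max(axis=1, keepdims=True))
        return h, e / e.sum(axis=1, keepdims=True)   # phoneme probabilities

    def adapt_step(x, y_onehot, A, lr=0.01):
        """One cross-entropy gradient step on the linear stage only;
        W_h and W_o stay frozen, as in adaptation on a small set."""
        h, p = forward(x, A)
        grad_z = (p - y_onehot) / len(x)
        grad_A = h.T @ (grad_z @ W_o.T)
        return A - lr * grad_A
    ```
    
    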
  • Patent number: 8036425
    Abstract: A neural network-controlled automatic tracking and recognizing system includes a fixed field of view collection module, a full functions variable field of view collection module, a video image recognition algorithm module, a neural network control module, a suspect object track-tracking module, a database comparison and alarm judgment module, a monitored characteristic recording and rule setting module, a light monitoring and control module, a backlight module, an alarm output/display/storage module, and security monitoring sensors. The invention relates also to the operation method of the system.
    Type: Grant
    Filed: June 26, 2008
    Date of Patent: October 11, 2011
    Inventor: Billy Hou
  • Patent number: 7966177
    Abstract: The invention relates to a method for recognizing a phonetic sound sequence or a character sequence, e.g.
    Type: Grant
    Filed: August 13, 2001
    Date of Patent: June 21, 2011
    Inventor: Hans Geiger
  • Patent number: 7836002
    Abstract: A system that can automatically narrow the search space or recognition scope within an activity-centric environment based upon a current activity or set of activities is provided. In addition, the activity and context data can also be used to rank the results of the recognition or search activity. In accordance with the domain scoping, natural language processing (NLP) as well as other types of conversion and recognition systems can dynamically adjust to the scope of the activity or group of activities, thereby increasing the recognition system's accuracy and usefulness. In operation, a user context, activity context, environment context and/or device profile can be employed to effectuate the scoping. As well, the system can combine context with extrinsic data, including but not limited to, calendar, profile, historical activity data, etc. in order to define the parameters for an appropriate scoping.
    Type: Grant
    Filed: June 27, 2006
    Date of Patent: November 16, 2010
    Assignee: Microsoft Corporation
    Inventors: Steven W. Macbeth, Roland L. Fernandez, Brian R. Meyers, Desney S. Tan, George G. Robertson, Nuria M. Oliver, Oscar E. Murillo
  • Patent number: 7827031
    Abstract: A neural network in a speech-recognition system has computing units organized in levels including at least one hidden level and one output level. The computing units of the hidden level are connected to the computing units of the output level via weighted connections, and the computing units of the output level correspond to acoustic-phonetic units of the general vocabulary. This network executes the following steps: determining a subset of acoustic-phonetic units necessary for recognizing all the words contained in the general vocabulary subset; eliminating from the neural network all the weighted connections afferent to computing units of the output level that correspond to acoustic-phonetic units not contained in the previously determined subset of acoustic-phonetic units, thus obtaining a compacted neural network optimized for recognition of the words contained in the general vocabulary subset; and executing, at each moment in time, only the compacted neural network.
    Type: Grant
    Filed: February 12, 2003
    Date of Patent: November 2, 2010
    Assignee: Loquendo S.p.A.
    Inventors: Dario Albesano, Roberto Gemello
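    The compaction step above amounts to dropping the output units (and their afferent weights) whose acoustic-phonetic units are not needed for the active vocabulary subset. A sketch, assuming the output stage is a plain weight matrix plus bias with one column per acoustic-phonetic unit; the data layout is an illustrative assumption.

    ```python
    import numpy as np

    def compact_output_stage(W_out, b_out, unit_phonemes, needed_phonemes):
        """Keep only the output units whose acoustic-phonetic unit occurs in
        the active vocabulary subset; their afferent weights survive, the
        rest are eliminated, yielding a compacted output stage."""
        keep = [i for i, ph in enumerate(unit_phonemes) if ph in needed_phonemes]
        return W_out[:, keep], b_out[keep], [unit_phonemes[i] for i in keep]
    ```
    
    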
  • Patent number: 7769580
    Abstract: A method of optimizing the execution of a neural network in a speech recognition system provides for conditionally skipping a variable number of frames, depending on a distance computed between output probabilities, or likelihoods, of a neural network. The distance is initially evaluated between two frames at times t and t+k, where k is a predetermined maximum distance between frames, and if such distance is sufficiently small, the frames between times t and t+k are calculated by interpolation, avoiding further executions of the neural network. If, on the contrary, such distance is not small enough, it means that the outputs of the network are changing quickly, and it is not possible to skip too many frames. In that case, the method attempts to skip remaining frames, calculating and evaluating a new distance.
    Type: Grant
    Filed: December 23, 2002
    Date of Patent: August 3, 2010
    Assignee: Loquendo S.p.A.
    Inventors: Roberto Gemello, Dario Albesano
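    The skip-and-interpolate loop can be sketched as follows. This is a simplified version: when the distance between the outputs at t and t+k is too large, the sketch simply evaluates all intermediate frames rather than retrying progressively smaller skips as the abstract describes; the distance metric and threshold are assumptions.

    ```python
    import numpy as np

    def decode_with_skipping(frames, net, k=3, threshold=0.1):
        """Run `net` on frames t and t+k; if the output distributions are
        close, linearly interpolate the frames in between instead of
        executing the network on them."""
        outputs = [None] * len(frames)
        t = 0
        while t < len(frames):
            if outputs[t] is None:
                outputs[t] = net(frames[t])
            end = min(t + k, len(frames) - 1)
            if end == t:
                break
            outputs[end] = net(frames[end])
            dist = np.abs(outputs[t] - outputs[end]).max()
            if dist < threshold:           # outputs change slowly: interpolate
                for j in range(t + 1, end):
                    w = (j - t) / (end - t)
                    outputs[j] = (1 - w) * outputs[t] + w * outputs[end]
            else:                          # changing quickly: evaluate each frame
                for j in range(t + 1, end):
                    outputs[j] = net(frames[j])
            t = end
        return outputs
    ```
    
    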
  • Patent number: 7765100
    Abstract: A method and an apparatus for recovering a line spectrum pair (LSP) parameter of a spectrum region when frame loss occurs during speech decoding and a speech decoding apparatus adopting the same are provided. The method of recovering an LSP parameter in speech decoding includes: if it is determined that a received speech packet has an erased frame, converting an LSP parameter of a previous good frame (PGF) of the erased frame or LSP parameters of the PGF and a next good frame (NGF) of the erased frame into a spectrum region and obtaining a spectrum envelope of the PGF or spectrum envelopes of the PGF and NGF; recovering a spectrum envelope of the erased frame using the spectrum envelope of the PGF or the spectrum envelopes of the PGF and NGF; and converting the recovered spectrum envelope of the erased frame into an LSP parameter of the erased frame.
    Type: Grant
    Filed: February 6, 2006
    Date of Patent: July 27, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hosang Sung, Seungho Choi, Kihyun Choo
  • Patent number: 7739111
    Abstract: A pattern matching method for matching between a first symbol sequence and a second symbol sequence which is shorter than the first symbol sequence is provided. The method includes the steps of performing DP matching between the first and second symbol sequences to create a matrix of the DP matching transition, detecting the maximum length of lengths of consecutive correct answers based on the matrix of the DP matching transition, and calculating similarity based on the maximum length.
    Type: Grant
    Filed: August 9, 2006
    Date of Patent: June 15, 2010
    Assignee: Canon Kabushiki Kaisha
    Inventor: Kazue Kaneko
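    The three steps above can be sketched with a standard edit-distance DP table: align the two sequences, backtrace the optimal path while tracking the longest run of consecutive matched symbols ("consecutive correct answers"), and score similarity as that run length over the short sequence's length. The similarity normalization is an illustrative assumption.

    ```python
    def dp_similarity(long_seq, short_seq):
        """DP matching between a longer and a shorter symbol sequence;
        similarity from the maximum run of consecutive correct answers."""
        n, m = len(long_seq), len(short_seq)
        d = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            d[i][0] = i
        for j in range(m + 1):
            d[0][j] = j
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = 0 if long_seq[i - 1] == short_seq[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                              d[i - 1][j - 1] + cost)
        # Backtrace the DP transition, tracking the longest match run.
        i, j, run, best = n, m, 0, 0
        while i > 0 and j > 0:
            match = long_seq[i - 1] == short_seq[j - 1]
            if d[i][j] == d[i - 1][j - 1] + (0 if match else 1):
                run = run + 1 if match else 0
                i, j = i - 1, j - 1
            elif d[i][j] == d[i - 1][j] + 1:
                run, i = 0, i - 1
            else:
                run, j = 0, j - 1
            best = max(best, run)
        return best / m if m else 0.0
    ```
    
    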
  • Patent number: 7627467
    Abstract: Real-time packet-based audio communications over packet-based networks frequently results in the loss of one or more packets during any given communication session. The real-time nature of such communications precludes retransmission of lost packets due to the unacceptable delays that would result. Consequently, packet loss concealment methods are employed to “hide” lost packets from the listener. Unfortunately, conventional loss concealment methods, such as packet repetition or stretch/overlap methods, do not fully exploit information available from partially received samples. Therefore, when a single frame of N coefficients is lost, 2N samples are only partially reconstructed, thereby degrading the reconstructed signal.
    Type: Grant
    Filed: June 30, 2005
    Date of Patent: December 1, 2009
    Assignee: Microsoft Corporation
    Inventors: Dinei A. Florencio, Philip A. Chou