Neural Network Patents (Class 704/232)
  • Patent number: 8315864
    Abstract: Provided herein are systems and methods for using context-sensitive speech recognition logic in a computer to create a software program, including context-aware voice entry of instructions that make up a software program, automatic context-sensitive instruction formatting, and automatic context-sensitive insertion-point positioning.
    Type: Grant
    Filed: April 24, 2012
    Date of Patent: November 20, 2012
    Inventor: Lunis Orcutt
  • Patent number: 8249870
    Abstract: A semi-automatic speech transcription system of the invention leverages the complementary capabilities of human and machine, building a system which combines automatic and manual approaches. With the invention, collected audio data is automatically distilled into speech segments, using signal processing and pattern recognition algorithms. The detected speech segments are presented to a human transcriber using a transcription tool with a streamlined transcription interface, requiring the transcriber to simply “listen and type”. This eliminates the need to manually navigate the audio, coupling the human effort to the amount of speech, rather than the amount of audio. Errors produced by the automatic system can be quickly identified by the human transcriber, which are used to improve the automatic system performance. The automatic system is tuned to maximize the human transcriber efficiency.
    Type: Grant
    Filed: November 12, 2008
    Date of Patent: August 21, 2012
    Assignee: Massachusetts Institute of Technology
    Inventors: Brandon Cain Roy, Deb Kumar Roy
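The "automatically distilled into speech segments" step in the abstract above can be sketched with a toy energy-based detector. The threshold and minimum length are illustrative; the abstract does not specify the actual signal-processing and pattern-recognition algorithms used.

```python
def detect_speech_segments(energies, threshold=0.5, min_len=2):
    """Distil per-frame energies into (start, end) speech segments:
    frames at or above the threshold are merged into runs, and runs
    shorter than min_len frames are discarded as noise."""
    segments, start = [], None
    for i, e in enumerate(energies):
        if e >= threshold and start is None:
            start = i                      # a speech run begins
        elif e < threshold and start is not None:
            if i - start >= min_len:       # keep only long-enough runs
                segments.append((start, i))
            start = None
    if start is not None and len(energies) - start >= min_len:
        segments.append((start, len(energies)))
    return segments

# Six frames of audio: a two-frame speech run survives, a one-frame
# blip is discarded.
segments = detect_speech_segments([0.1, 0.9, 0.8, 0.1, 0.9, 0.1])
```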
  • Patent number: 8239196
    Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
    Type: Grant
    Filed: July 28, 2011
    Date of Patent: August 7, 2012
    Assignee: Google Inc.
    Inventor: Marco Paniconi
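A minimal sketch of the per-bin speech/noise probability and cross-channel fusion described in this abstract. The sigmoid mapping and log-odds averaging are illustrative stand-ins for the patent's probability function and layered network model.

```python
import math

def speech_probability(feature, threshold=1.0, slope=4.0):
    """Map a per-bin feature (e.g. a local SNR estimate) to a speech
    probability with a sigmoid; threshold and slope are illustrative."""
    return 1.0 / (1.0 + math.exp(-slope * (feature - threshold)))

def fuse_channels(per_channel_probs):
    """Fuse the per-channel speech probabilities for one time/frequency
    bin by averaging log-odds, a simple stand-in for the layered
    network model in the abstract."""
    logits = [math.log(p / (1.0 - p)) for p in per_channel_probs]
    mean = sum(logits) / len(logits)
    return 1.0 / (1.0 + math.exp(-mean))

# Two microphones observe the same bin; both lean toward "speech".
p = fuse_channels([speech_probability(1.5), speech_probability(2.0)])
```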
  • Patent number: 8239194
    Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
    Type: Grant
    Filed: September 26, 2011
    Date of Patent: August 7, 2012
    Assignee: Google Inc.
    Inventor: Marco Paniconi
  • Patent number: 8209170
    Abstract: Provided herein are systems and methods for using context-sensitive speech recognition logic in a computer to create a software program, including context-aware voice entry of instructions that make up a software program, automatic context-sensitive instruction formatting, and automatic context-sensitive insertion-point positioning.
    Type: Grant
    Filed: June 2, 2011
    Date of Patent: June 26, 2012
    Assignee: Lunis Orcutt
    Inventor: Lunis Orcutt
  • Patent number: 8200486
    Abstract: Method and system for processing and identifying a sub-audible signal formed by a source of sub-audible sounds. Sequences of samples of sub-audible sound patterns (“SASPs”) for known words/phrases in a selected database are received for overlapping time intervals, and Signal Processing Transforms (“SPTs”) are formed for each sample, as part of a matrix of entry values. The matrix is decomposed into contiguous, non-overlapping two-dimensional cells of entries, and neural net analysis is applied to estimate reference sets of weight coefficients that provide sums with optimal matches to reference sets of values. The reference sets of weight coefficients are used to determine a correspondence between a new (unknown) word/phrase and a word/phrase in the database.
    Type: Grant
    Filed: June 5, 2003
    Date of Patent: June 12, 2012
    Assignee: The United States of America as represented by the Administrator of the National Aeronautics & Space Administration (NASA)
    Inventors: Charles C. Jorgensen, Diana D. Lee, Shane T. Agabon
  • Patent number: 8180633
    Abstract: A system and method for semantic extraction using a neural network architecture includes indexing each word in an input sentence into a dictionary and using these indices to map each word to a d-dimensional vector (the features of which are learned). Together with this, position information for a word of interest (the word to be labeled) and a verb of interest (the verb that the semantic role is being predicted for) with respect to a given word are also used. These positions are integrated by employing a linear layer that is adapted to the input sentence. Several linear transformations and squashing functions are then applied to output class probabilities for semantic role labels. All the weights for the whole architecture are trained by backpropagation.
    Type: Grant
    Filed: February 29, 2008
    Date of Patent: May 15, 2012
    Assignee: NEC Laboratories America, Inc.
    Inventors: Ronan Collobert, Jason Weston
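The architecture in this abstract (embedding lookup, position features for the word and verb of interest, a linear layer with a squashing function, then class probabilities over role labels) can be sketched as follows. The dimensions are arbitrary and the random weights are placeholders for values learned by backpropagation.

```python
import math, random

random.seed(0)
VOCAB, DIM, HID, CLASSES = 10, 4, 5, 3

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)]
            for _ in range(rows)]

# In the patent all of these are trained by backpropagation; random here.
embed = rand_matrix(VOCAB, DIM)       # d-dimensional word vectors
W1 = rand_matrix(HID, DIM + 2)        # +2 for the two position features
W2 = rand_matrix(CLASSES, HID)

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]

def role_probs(word_idx, dist_to_word, dist_to_verb):
    """Class probabilities for semantic role labels: embedding lookup
    plus position features, a linear layer with tanh squashing, then a
    linear output layer and softmax."""
    x = embed[word_idx] + [dist_to_word, dist_to_verb]
    h = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in W1]
    z = [sum(w * v for w, v in zip(row, h)) for row in W2]
    return softmax(z)

probs = role_probs(3, -1.0, 2.0)
```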
  • Patent number: 8150687
    Abstract: An example embodiment of the invention includes a speech recognition processing unit for specifying speech segments for speech data, recognizing a speech in each of the speech segments, and associating a character string of obtained recognition data with the speech data for each speech segment, based on information on a time of the speech, and an output control unit for displaying/outputting the text prepared by sorting the recognition data in each speech segment. Sometimes, the system further includes a text editing unit for editing the prepared text, and a speech correspondence estimation unit for associating a character string in the edited text with the speech data by using a technique of dynamic programming.
    Type: Grant
    Filed: November 30, 2004
    Date of Patent: April 3, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Shinsuke Mori, Nobuyasu Itoh, Masafumi Nishimura
  • Patent number: 8145492
    Abstract: A behavior control system of a robot for learning a phoneme sequence includes a sound inputting device inputting a phoneme sequence, a sound signal learning unit operable to convert the phoneme sequence into a sound synthesis parameter and to learn or evaluate a relationship between a sound synthesis parameter of a phoneme sequence that is generated by the robot and a sound synthesis parameter used for sound imitation, and a sound synthesizer operable to generate a phoneme sequence based on the sound synthesis parameter obtained by the sound signal learning unit.
    Type: Grant
    Filed: April 6, 2005
    Date of Patent: March 27, 2012
    Assignee: Sony Corporation
    Inventor: Masahiro Fujita
  • Patent number: 8126710
    Abstract: A method of adapting a neural network of an automatic speech recognition device, includes the steps of: providing a neural network including an input stage, an intermediate stage and an output stage, the output stage outputting phoneme probabilities; providing a linear stage in the neural network; and training the linear stage by means of an adaptation set; wherein the step of providing the linear stage includes the step of providing the linear stage after the intermediate stage.
    Type: Grant
    Filed: June 1, 2005
    Date of Patent: February 28, 2012
    Assignee: Loquendo S.p.A.
    Inventors: Roberto Gemello, Franco Mana
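A sketch of the adaptation scheme above: a linear stage inserted after the intermediate (hidden) stage. Initializing it to the identity leaves the original network's phoneme probabilities unchanged until the linear stage is trained on the adaptation set. The weights here are illustrative.

```python
import math

def forward(x, W_hid, A, W_out):
    """Hidden stage (tanh), then the inserted linear adaptation stage A,
    then the output stage producing phoneme probabilities (softmax)."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in W_hid]
    a = [sum(w * v for w, v in zip(row, h)) for row in A]  # linear stage
    z = [sum(w * v for w, v in zip(row, a)) for row in W_out]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]

# Identity initialization: before training on the adaptation set, the
# linear stage leaves the original network's outputs unchanged.
W_hid = [[0.5, -0.2], [0.1, 0.3]]
A_id = [[1.0, 0.0], [0.0, 1.0]]
W_out = [[0.7, 0.2], [-0.3, 0.9]]
p = forward([1.0, -1.0], W_hid, A_id, W_out)
```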
  • Publication number: 20110307252
    Abstract: Described is the use of utterance classification based methods and other machine learning techniques to provide a telephony application or other voice menu application (e.g., an automotive application) that need not use Context-Free-Grammars to determine a user's spoken intent. A classifier receives text from an information retrieval-based speech recognizer and outputs a semantic label corresponding to the likely intent of a user's speech. The semantic label is then output, such as for use by a voice menu program in branching between menus. Also described is training, including training the language model from acoustic data without transcriptions, and training the classifier from speech-recognized acoustic data having associated semantic labels.
    Type: Application
    Filed: June 15, 2010
    Publication date: December 15, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Yun-Cheng Ju, James Garnet Droppo, III
  • Patent number: 8041571
    Abstract: A method and apparatus detect and localize electric faults in electrical power grids and circuits. High impedance faults are detected by analyzing data from remote sensor units deployed over the network using the algorithms of speech and speaker analysis software. This is accomplished by converting the voltage and/or current waveform readouts from the sensors into a digital form which is then transmitted to a computer located either near the sensors or at an operations center. The digitized data is converted by a dedicated software or software/hardware interface to a format accepted by a reliable and stable software solution, such as speech or speaker recognition software. The speech or speaker recognition software must be “trained” to recognize various signal patterns that do or do not indicate the occurrence of a fault. The readout of the speech or speaker recognition software, if indicating a fault, is transmitted to a central processor and displayed to provide information on the most likely type of fault.
    Type: Grant
    Filed: January 5, 2007
    Date of Patent: October 18, 2011
    Assignee: International Business Machines Corporation
    Inventors: Sarah C. McAllister, Tomasz J. Nowicki, Jason W. Pelecanos, Grzegorz M. Swirszcz
  • Patent number: 8032363
    Abstract: A method of processing a decoded speech (DS) signal including successive DS frames, each DS frame including DS samples. The method comprises: adaptively filtering the DS signal to produce a filtered signal; gain-scaling the filtered signal with an adaptive gain updated once a DS frame, thereby producing a gain-scaled signal; and performing a smoothing operation to smooth possible waveform discontinuities in the gain-scaled signal.
    Type: Grant
    Filed: August 9, 2002
    Date of Patent: October 4, 2011
    Assignee: Broadcom Corporation
    Inventors: Juin-Hwey Chen, Jes Thyssen, Chris C Lee
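The per-frame gain update plus smoothing can be sketched as below. The linear ramp over the first few samples of each frame is one simple way to realize "a smoothing operation to smooth possible waveform discontinuities"; the abstract does not commit to this particular ramp.

```python
def gain_scale(frames, gains, ramp=4):
    """Scale each frame by its adaptive gain (one gain per frame).  To
    smooth the waveform discontinuity at frame boundaries, the first
    `ramp` samples blend linearly from the previous frame's gain."""
    out, prev = [], gains[0]
    for frame, g in zip(frames, gains):
        scaled = []
        for i, s in enumerate(frame):
            if i < ramp:
                w = (i + 1) / ramp            # 1/ramp .. 1.0
                scaled.append(s * ((1 - w) * prev + w * g))
            else:
                scaled.append(s * g)
        out.append(scaled)
        prev = g
    return out

# Gain doubles on the second frame; the ramp spreads the jump over the
# first four samples instead of a hard step.
out = gain_scale([[1.0] * 16, [1.0] * 16], [1.0, 2.0])
```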
  • Patent number: 8032372
    Abstract: A computer program product for computing a correction rate predictor for medical record dictations, the computer program product residing on a computer-readable medium includes computer-readable instructions for causing a computer to obtain a draft medical transcription of at least a portion of a dictation, the dictation being from medical personnel and concerning a patient, determine features of the dictation to produce a feature set comprising a combination of features of the dictation, the features being relevant to a quantity of transcription errors in the transcription, analyze the feature set to compute a predicted correction rate associated with the dictation and use the predicted correction rate to determine whether to provide at least a portion of the transcription to a transcriptionist.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: October 4, 2011
    Assignee: eScription, Inc.
    Inventors: Roger Scott Zimmerman, George Zavaliagkos
  • Patent number: 7966177
    Abstract: The invention relates to a method for recognizing a phonetic sound sequence or a character sequence, e.g.
    Type: Grant
    Filed: August 13, 2001
    Date of Patent: June 21, 2011
    Inventor: Hans Geiger
  • Publication number: 20110144986
    Abstract: Described is a calibration model for use in a speech recognition system. The calibration model adjusts the confidence scores output by a speech recognition engine to thereby provide an improved calibrated confidence score for use by an application. The calibration model is one that has been trained for a specific usage scenario, e.g., for that application, based upon a calibration training set obtained from a previous similar/corresponding usage scenario or scenarios. Different calibration models may be used with different usage scenarios, e.g., during different conditions. The calibration model may comprise a maximum entropy classifier with distribution constraints, trained with continuous raw confidence scores and multi-valued word tokens, and/or other distributions and extracted features.
    Type: Application
    Filed: December 10, 2009
    Publication date: June 16, 2011
    Applicant: Microsoft Corporation
    Inventors: Dong Yu, Li Deng, Jinyu Li
  • Patent number: 7949679
    Abstract: A method of operating a storage of a finite state machine includes organizing information concerning an operation of the machine in a payload-transition matrix, in which a given number of columns of the matrix reflect features of a state of the machine and other columns describe valid transitions between the states of the machine depending on input characters, and compressing the payload-transition matrix in a row-displaced format.
    Type: Grant
    Filed: March 5, 2008
    Date of Patent: May 24, 2011
    Assignee: International Business Machines Corporation
    Inventor: Branimir Z. Lambov
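Row displacement is a classic way to compress sparse transition tables; the sketch below shows the core idea, though not the patent's exact layout, which also interleaves the per-state payload columns.

```python
def compress_row_displaced(matrix, empty=0):
    """Pack a sparse transition matrix into a single array: each row is
    shifted by a displacement chosen so its non-empty cells land in
    free slots; an owner array records which row filled each slot."""
    ncols = len(matrix[0])
    data, owner, disp = [], [], []
    for r, row in enumerate(matrix):
        cols = [c for c, v in enumerate(row) if v != empty]
        d = 0
        while True:
            while len(data) < d + ncols:      # grow the packed arrays
                data.append(empty)
                owner.append(None)
            if all(owner[d + c] is None for c in cols):
                break                         # no collision at this shift
            d += 1
        for c in cols:
            data[d + c] = row[c]
            owner[d + c] = r
        disp.append(d)
    return data, owner, disp

def lookup(data, owner, disp, r, c, empty=0):
    """Read entry (r, c); the owner check distinguishes this row's
    cells from cells packed there by other rows."""
    i = disp[r] + c
    return data[i] if owner[i] == r else empty

# A 3x3 matrix with one entry per row packs into just 3 slots.
M = [[0, 5, 0], [7, 0, 0], [0, 0, 9]]
data, owner, disp = compress_row_displaced(M)
```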
  • Publication number: 20110119057
    Abstract: Disclosed are systems, methods, and computer-program products for segmenting content of an input signal and applications thereof. In an embodiment, the system includes simulated neurons, a phase modulator, and an entity-identifier module. Each simulated neuron is connected to one or more other simulated neurons and is associated with an activity and a phase. The activity and the phase of each simulated neuron is set based on the activity and the phase of the one or more other simulated neurons connected to each simulated neuron. The phase modulator includes individual modulators, each configured to modulate the activity and the phase of each of the plurality of simulated neurons based on a modulation function. The entity-identifier module is configured to identify one or more distinct entities (e.g., objects, sound sources, etc.) included in the input signal based on the one or more distinct collections of simulated neurons that have substantially distinct phases.
    Type: Application
    Filed: November 18, 2009
    Publication date: May 19, 2011
    Applicant: The Intellisis Corporation
    Inventors: Douglas A. Moore, Kristi H. Tsukida, Paulo B. Ang
  • Patent number: 7890329
    Abstract: Disclosed are an apparatus and method to reduce recognition errors through context relations among multiple dialogue turns. The apparatus includes a rule set storage unit having a rule set containing one or more rules, an evolutionary rule generation module connected to the rule storage unit, and a rule trigger unit connected to the rule storage unit. The rule set uses dialogue turn as a unit for the information described by each rule. The method analyzes a dialogue history through an evolutionary massive parallelism approach to get a rule set describing the context relation among dialogue turns. Based on the rule set and recognition result of an ASR system, it reevaluates the recognition result and computes a confidence measure for the reevaluated recognition result. After each successful dialogue turn, the rule set is dynamically adapted.
    Type: Grant
    Filed: August 1, 2007
    Date of Patent: February 15, 2011
    Assignee: Industrial Technology Research Institute
    Inventors: Hsu-Chih Wu, Ching-Hsien Lee
  • Patent number: 7827031
    Abstract: A neural network in a speech-recognition system has computing units organized in levels including at least one hidden level and one output level. The computing units of the hidden level are connected to the computing units of the output level via weighted connections, and the computing units of the output level correspond to acoustic-phonetic units of the general vocabulary. This network executes the following steps: determining a subset of acoustic-phonetic units necessary for recognizing all the words contained in the general vocabulary subset; eliminating from the neural network all the weighted connections afferent to computing units of the output level that correspond to acoustic-phonetic units not contained in the previously determined subset of acoustic-phonetic units, thus obtaining a compacted neural network optimized for recognition of the words contained in the general vocabulary subset; and executing, at each moment in time, only the compacted neural network.
    Type: Grant
    Filed: February 12, 2003
    Date of Patent: November 2, 2010
    Assignee: Loquendo S.p.A.
    Inventors: Dario Albesano, Roberto Gemello
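The compaction step above, dropping output units (and the weighted connections afferent to them) for acoustic-phonetic units not needed by the active vocabulary subset, can be sketched as:

```python
import math

def compact_network(W_out, unit_names, needed_units):
    """Keep only output units whose acoustic-phonetic unit appears in
    the subset needed for the active vocabulary; their afferent weight
    rows survive, all others are dropped."""
    keep = [i for i, u in enumerate(unit_names) if u in needed_units]
    return [W_out[i] for i in keep], [unit_names[i] for i in keep]

def output_probs(h, W_out):
    """Softmax over the (possibly compacted) output stage."""
    z = [sum(w * v for w, v in zip(row, h)) for row in W_out]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]

# Full network covers four phone units; the active subset needs two.
unit_names = ["a", "b", "k", "s"]
W_full = [[0.2, 0.1], [0.4, -0.3], [-0.1, 0.5], [0.3, 0.3]]
W_small, names = compact_network(W_full, unit_names, {"a", "k"})
```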
  • Patent number: 7818271
    Abstract: A method and apparatus are disclosed for selecting interaction policies. Values may be provided for a group of parameters for user models. Interaction policies within a specific tolerance of an optimal interaction policy for the user models may be learned. Up to a predetermined number of the learned interaction policies, within a specific tolerance of an optimal policy for the user models, may be selected for a wireless communication device. The wireless communication device, including the selected interaction policies, may determine whether any of a group of parameters, representing a user preference or contextual information with respect to use of the wireless communication device, is updated. When any of the group of parameters has been updated, the wireless communication device may select one of the selected interaction policies, such that the selected one of the selected interaction policies may determine a better interaction behavior for the wireless communication device.
    Type: Grant
    Filed: June 13, 2007
    Date of Patent: October 19, 2010
    Assignee: Motorola Mobility, Inc.
    Inventor: Michael E. Groble
  • Publication number: 20100217589
    Abstract: The invention provides a method for automated training of a plurality of artificial neural networks for phoneme recognition using training data, wherein the training data comprises speech signals subdivided into frames, each frame associated with a phoneme label, wherein the phoneme label indicates a phoneme associated with the frame. A sequence of frames from the training data is provided, wherein the number of frames in the sequence of frames is at least equal to the number of artificial neural networks. Each of the artificial neural networks is assigned a different subsequence of the provided sequence, wherein each subsequence comprises a predetermined number of frames. A common phoneme label for the sequence of frames is determined based on the phoneme labels of one or more frames of one or more subsequences of the provided sequence. Each artificial neural network is then trained using the common phoneme label.
    Type: Application
    Filed: February 17, 2010
    Publication date: August 26, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Rainer Gruhn, Daniel Vasquez, Guillermo Aradilla
  • Patent number: 7769580
    Abstract: A method of optimizing the execution of a neural network in a speech recognition system provides for conditionally skipping a variable number of frames, depending on a distance computed between output probabilities, or likelihoods, of a neural network. The distance is initially evaluated between two frames at times t and t+k, where k is a predetermined maximum distance between frames, and if such distance is sufficiently small, the frames between times t and t+k are calculated by interpolation, avoiding further executions of the neural network. If, on the contrary, such distance is not small enough, it means that the outputs of the network are changing quickly, and it is not possible to skip too many frames. In that case, the method attempts to skip remaining frames, calculating and evaluating a new distance.
    Type: Grant
    Filed: December 23, 2002
    Date of Patent: August 3, 2010
    Assignee: Loquendo S.p.A.
    Inventors: Roberto Gemello, Dario Albesano
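A simplified sketch of the frame-skipping idea: when the distance is too large, the patent retries with smaller skips, whereas this sketch simply falls back to evaluating every intermediate frame. The L1 distance and eps are illustrative.

```python
def process_frames(frames, net, k=3, eps=0.05):
    """Run `net` at times t and t+k; if the output distributions are
    within eps (L1 distance), interpolate the frames in between
    instead of running the network."""
    outputs = [None] * len(frames)
    outputs[0] = net(frames[0])
    t = 0
    while t + 1 < len(frames):
        step = min(k, len(frames) - 1 - t)
        nxt = net(frames[t + step])
        dist = sum(abs(a - b) for a, b in zip(outputs[t], nxt))
        for j in range(1, step):
            if dist < eps:   # outputs change slowly: interpolate
                w = j / step
                outputs[t + j] = [(1 - w) * a + w * b
                                  for a, b in zip(outputs[t], nxt)]
            else:            # outputs change quickly: compute
                outputs[t + j] = net(frames[t + j])
        outputs[t + step] = nxt
        t += step
    return outputs

calls = []
def toy_net(frame):
    calls.append(frame)
    return [frame, 1.0 - frame]

# Seven identical frames: the network runs only 3 times instead of 7.
outputs = process_frames([0.5] * 7, toy_net)
```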
  • Patent number: 7743011
    Abstract: In a weighted finite state network process, a finite state network object is stored. The finite state network object includes arcs, and each arc has an associated weight stored as a weight-defining finite state network object. The finite state network object is applied to an input. The applying includes combining weights of one or more arcs matching the input using finite state network-combining operations.
    Type: Grant
    Filed: December 21, 2006
    Date of Patent: June 22, 2010
    Assignee: Xerox Corporation
    Inventor: Kenneth R. Beesley
  • Patent number: 7720012
    Abstract: A system, method, and apparatus for identifying a speaker of an utterance, particularly when the utterance has portions of it missing due to packet losses. Different packet loss models are applied to each speaker's training data in order to improve accuracy, especially for small packet sizes.
    Type: Grant
    Filed: July 11, 2005
    Date of Patent: May 18, 2010
    Assignee: Arrowhead Center, Inc.
    Inventors: Deva K. Borah, Phillip De Leon
  • Patent number: 7680332
    Abstract: Techniques for efficiently and accurately organizing freeform handwriting into lines. A global cost function is employed to find the simplest partitioning of electronic ink strokes into line groups that also maximize the “goodness” of the resulting lines and the consistency of their configuration. The “goodness” of a line may be based upon its linear regression error and the horizontal and vertical compactness of the strokes making up the line. The line consistency configuration for a grouping of strokes is measured by the angle difference between neighboring groups. The global cost function also takes into account the complexity of the stroke partitioning, measured by the number of lines into which the strokes are grouped. An initial grouping of strokes is made, and the cost for this initial grouping is determined. Alternate groupings of the initial stroke grouping are then generated.
    Type: Grant
    Filed: May 30, 2005
    Date of Patent: March 16, 2010
    Assignee: Microsoft Corporation
    Inventors: Ming Ye, Herry Sutanto, Sashi Raghupathy, Chengyang Li, Michael Shilman
  • Publication number: 20100057452
    Abstract: The described implementations relate to speech interfaces and in some instances to speech pattern recognition techniques that enable speech interfaces. One system includes a feature pipeline configured to produce speech feature vectors from input speech. This system also includes a classifier pipeline configured to classify individual speech feature vectors utilizing multi-level classification.
    Type: Application
    Filed: August 28, 2008
    Publication date: March 4, 2010
    Applicant: Microsoft Corporation
    Inventors: Kunal Mukerjee, Brendan Meeder
  • Publication number: 20100057453
    Abstract: Discrimination between at least two classes of events in an input signal is carried out in the following way. A set of frames containing an input signal is received, and at least two different feature vectors are determined for each of said frames. Said at least two different feature vectors are classified using respective sets of preclassifiers trained for said at least two classes of events. Values for at least one weighting factor are determined based on outputs of said preclassifiers for each of said frames. A combined feature vector is calculated for each of said frames by applying said at least one weighting factor to said at least two different feature vectors. Said combined feature vector is classified using a set of classifiers trained for said at least two classes of events.
    Type: Application
    Filed: November 16, 2006
    Publication date: March 4, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Zica Valsan
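The weighting-and-combination step above can be sketched as a convex blend of the two feature vectors, with the weight derived from the preclassifier outputs. The particular weighting function below is an assumption; the abstract does not specify it at this level of detail.

```python
def weight_from_preclassifiers(score_a, score_b):
    """One simple weighting: trust the feature stream whose
    preclassifier output is larger (scores assumed non-negative and
    not both zero)."""
    return score_a / (score_a + score_b)

def combine(features_a, features_b, w):
    """Combined feature vector for a frame: a convex blend of the two
    different feature vectors using the weighting factor."""
    return [w * a + (1 - w) * b for a, b in zip(features_a, features_b)]

w = weight_from_preclassifiers(0.8, 0.2)
combined = combine([1.0, 1.0], [0.0, 0.0], w)
```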
  • Publication number: 20090292538
    Abstract: Systems and methods of improving speech recognition accuracy using statistical analysis of word or phrase-based search terms are disclosed. An illustrative system for statistically analyzing search terms includes an interface adapted to receive a text-based search term, a textual-linguistic analysis module that detects textual features within the search term and generates a first score, a phonetic conversion module that converts the search term into a phoneme string, a phonetic-linguistic analysis module that detects phonemic features within the phoneme string and generates a second score, and a score normalization module that normalizes the first and second scores and outputs a search term score to a user or process.
    Type: Application
    Filed: May 20, 2008
    Publication date: November 26, 2009
    Applicant: Calabrio, Inc.
    Inventor: David M. Barnish
  • Patent number: 7624082
    Abstract: Material (G) is converted by a combustion process in a plant (1) while air (L) is supplied. The state of the system in the plant (1) is described by state variables (x, y) and is regulated at least by one control loop (3, 5, 7, 9). Groups of states (Z) are defined for at least one pair of correlated state variables (x, y), with the groups being comparable as regards changes (dx/dt, dy/dt) of the correlated state variables (x, y). Each group of comparable states (Z) is characterized, as regards their transition functions, by parameters (Kp, Tn, Tv) of a standard controller. In the event of changes in the state of the system in the plant (1), the closest groups of comparable states (Z) are selected, and their transition functions, characterized by the parameters (Kp, Tn, Tv), are used for the purpose of regulating the system.
    Type: Grant
    Filed: September 27, 2007
    Date of Patent: November 24, 2009
    Assignee: Powitec Intelligent Technologies GmbH
    Inventors: Franz Wintrich, Volker Stephan
  • Patent number: 7620546
    Abstract: A speech signal isolation system configured to isolate and reconstruct a speech signal transmitted in an environment in which frequency components of the speech signal are masked by background noise. The speech signal isolation system obtains a noisy speech signal from an audio source. The noisy speech signal may then be fed through a neural network that has been trained to isolate and reconstruct a clean speech signal from background noise. Once the noisy speech signal has been fed through the neural network, the speech signal isolation system generates an estimated speech signal with substantially reduced noise.
    Type: Grant
    Filed: March 21, 2005
    Date of Patent: November 17, 2009
    Assignee: QNX Software Systems (Wavemakers), Inc.
    Inventors: Phillip Hetherington, Pierre Zakarauskas, Shahla Parveen
  • Patent number: 7617101
    Abstract: A method and system for utterance verification is disclosed. It first extracts a sequence of feature vectors from speech signal. At least one candidate string is obtained after speech recognition. Then, speech signal is segmented into speech segments according to the verification-unit-specified structure of candidate string for making each speech segment corresponding to a verification unit. After calculating the verification feature vectors of speech segments, these verification feature vectors are sequentially used to generate verification scores of speech segments in verification process. This invention uses neural networks for calculating verification scores, where each neural network is a Multi-Layer Perceptron (MLP) developed for each verification unit. Verification score is obtained through using feed-forward process of MLP.
    Type: Grant
    Filed: July 29, 2003
    Date of Patent: November 10, 2009
    Assignee: Industrial Technology Research Institute
    Inventors: Sen-Chia Chang, Shih-Chieh Chien
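A sketch of a per-unit MLP verification score and the segment-wise decision. The weights and threshold are illustrative; in the patent each MLP is trained for its own verification unit.

```python
import math

def mlp_verification_score(feature_vec, W_hid, w_out):
    """Feed-forward pass of one verification unit's MLP: tanh hidden
    layer, then a sigmoid output giving a score in (0, 1)."""
    h = [math.tanh(sum(w * x for w, x in zip(row, feature_vec)))
         for row in W_hid]
    z = sum(w * v for w, v in zip(w_out, h))
    return 1.0 / (1.0 + math.exp(-z))

def verify_utterance(segment_scores, threshold=0.5):
    """Accept the candidate string only if every segment's unit
    verifies above the threshold."""
    return all(s >= threshold for s in segment_scores)

s = mlp_verification_score([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]],
                           [1.0, 1.0])
```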
  • Publication number: 20090259464
    Abstract: A system and method for facilitating cognitive processing of simultaneous remote voice conversations is provided. A plurality of remote voice conversations participated in by distributed participants are provided over a shared communication channel. A main conversation between at least two of the distributed participants and one or more subconversations between at least two other of the distributed participants are identified from within the remote voice conversations. Segments of interest to one of the distributed participants are defined including a conversation excerpt having a lower attention activation threshold for the one distributed participant. Each of the subconversations is parsed into conversation excerpts. The conversation excerpts are compared to the segments of interest. One or more gaps between conversation flow in the main conversation are predicted.
    Type: Application
    Filed: April 11, 2008
    Publication date: October 15, 2009
    Applicant: PALO ALTO RESEARCH CENTER INCORPORATED
    Inventors: Nicolas B. Ducheneaut, Trevor F. Smith
  • Patent number: 7603272
    Abstract: Disclosed is a system and method of decomposing a lattice transition matrix into a block diagonal matrix. The method is applicable to automatic speech recognition but can be used in other contexts as well, such as parsing, named entity extraction and any other methods. The method normalizes the topology of any input graph according to a canonical form.
    Type: Grant
    Filed: June 19, 2007
    Date of Patent: October 13, 2009
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Giuseppe Riccardi
  • Patent number: 7584098
    Abstract: A method of identifying a location of a query string in an audio signal is provided. Under the method, a segment of the audio signal is selected. A score for a query string in the segment of the audio signal is determined by determining the product of probabilities of overlapping sequences of tokens. The score is then used to decide if the segment of the audio signal is likely to contain the query string.
    Type: Grant
    Filed: November 29, 2004
    Date of Patent: September 1, 2009
    Assignee: Microsoft Corporation
    Inventors: Roger Peng Yu, Frank Torsten Seide
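The scoring rule above, the product of probabilities of overlapping sequences of tokens, can be sketched as below; `ngram_prob` stands in for the probabilities the recognizer would supply.

```python
def segment_score(tokens, ngram_prob, n=3):
    """Score a segment for the query string: the product of
    probabilities of overlapping n-token sequences in the segment."""
    score = 1.0
    for i in range(len(tokens) - n + 1):
        score *= ngram_prob(tuple(tokens[i:i + n]))
    return score

# With a flat probability of 0.5 per trigram, five tokens give three
# overlapping trigrams: 0.5 ** 3 = 0.125.
score = segment_score(["find", "the", "query", "string", "here"],
                      lambda g: 0.5)
```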
  • Publication number: 20090216528
    Abstract: A method of adapting a neural network of an automatic speech recognition device, includes the steps of: providing a neural network including an input stage, an intermediate stage and an output stage, the output stage outputting phoneme probabilities; providing a linear stage in the neural network; and training the linear stage by means of an adaptation set; wherein the step of providing the linear stage includes the step of providing the linear stage after the intermediate stage.
    Type: Application
    Filed: June 1, 2005
    Publication date: August 27, 2009
    Inventors: Roberto Gemello, Franco Mana
  • Publication number: 20090112599
    Abstract: Disclosed are systems, methods and computer readable media for applying a multi-state barge-in acoustic model in a spoken dialogue system comprising the steps of (1) presenting a prompt to a user from the spoken dialog system, (2) receiving an audio speech input from the user during the presentation of the prompt, (3) accumulating the audio speech input from the user, (4) applying a non-speech component having at least two one-state Hidden Markov Models (HMMs) to the audio speech input from the user, (5) applying a speech component having at least five three-state HMMs to the audio speech input from the user, in which each of the five three-state HMMs represents a different phonetic category, (6) determining whether the audio speech input is a barge-in-speech input from the user, and (7) if the audio speech input is determined to be the barge-in-speech input from the user, terminating the presentation of the prompt.
    Type: Application
    Filed: October 31, 2007
    Publication date: April 30, 2009
    Applicant: AT&T Labs
    Inventor: Andrej Ljolje
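A greatly simplified stand-in for the barge-in logic this abstract describes: each accumulated frame is classified by comparing speech and non-speech model likelihoods, and the prompt is terminated once enough consecutive frames score as speech. The patent uses multi-state HMMs per phonetic category; this sketch substitutes single Gaussians over frame energy, and all parameters are illustrative.

```python
import math

def gauss_loglik(x, mean, var):
    """Log-likelihood of a scalar observation under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def is_barge_in(frame_energies, speech=(0.8, 0.05), nonspeech=(0.1, 0.05), need=3):
    """Classify each frame as speech or non-speech by likelihood comparison;
    report barge-in after `need` consecutive speech frames (illustrative)."""
    run = 0
    for e in frame_energies:
        if gauss_loglik(e, *speech) > gauss_loglik(e, *nonspeech):
            run += 1
            if run >= need:
                return True       # would terminate the prompt here
        else:
            run = 0
    return False
```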
  • Publication number: 20090106022
    Abstract: An improved system and method is provided for efficiently learning a network of categories using prediction. A learning engine may receive a stream of characters and incrementally segment the stream of characters beginning with individual characters into larger and larger categories. To do so, a prediction engine may be provided for predicting a target category from the stream of characters using one or more context categories. Upon predicting the target category, the edges of the network of categories may be updated. A category composer may also be provided for composing a new category from existing categories in the network of categories, and a new category composed may then be added to the network of categories. Advantageously, iterative episodes of prediction and learning of categories for large scale applications may result in hundreds of thousands of categories connected by millions of prediction edges.
    Type: Application
    Filed: October 18, 2007
    Publication date: April 23, 2009
    Applicant: Yahoo! Inc.
    Inventor: Omid Madani
  • Patent number: 7502736
    Abstract: Disclosed is a voice registration method for voice recognition, comprising the steps of: analyzing a spectrum of a sound signal inputted from the outside; extracting predetermined language units for speaker recognition from a voice signal in the sound signal; measuring the loudness of each language unit; collecting voice data on registered (background) speakers, including loudness data of the plurality of background speakers, into a voice database as a reference; determining whether the loudness of each language unit is within a predetermined loudness range based on the voice database; learning each language unit by using a multi-layer perceptron in the case that at least a predetermined number of language units are within the predetermined loudness range; and storing data on the learned language units as data for recognizing the speaker. With this configuration, the loudness of a speaker is considered both when learning to register his/her voice and when verifying the speaker.
    Type: Grant
    Filed: December 6, 2001
    Date of Patent: March 10, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sang-jin Hong, Sung-zoo Lee, Tae-soo Kim, Tae-sung Lee, Ho-jin Choi, Byoung-won Hwang
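A small sketch of the gating step this abstract describes: each language unit's loudness is checked against a predetermined range derived from the background-speaker database, and MLP learning proceeds only if enough units pass. The dB-style loudness scale, tolerance, and minimum count are illustrative assumptions.

```python
def within_loudness_range(unit_loudness, background_mean, tolerance=6.0):
    """Check whether a language unit's loudness falls within a predetermined
    range around the registered background speakers (tolerance is illustrative)."""
    return abs(unit_loudness - background_mean) <= tolerance

def enough_units(units, background_mean, minimum=3):
    """Proceed to multi-layer perceptron learning only if at least `minimum`
    language units lie inside the predetermined loudness range."""
    ok = [u for u in units if within_loudness_range(u, background_mean)]
    return len(ok) >= minimum, ok
```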
  • Patent number: 7499892
    Abstract: An information processing apparatus includes a first learning unit adapted to learn a first SOM (self-organization map), based on a first parameter extracted from an observed value, a winner node determination unit adapted to determine a winner node on the first SOM, a searching unit adapted to search for a generation node on a second SOM having highest connection strength with the winner node, a parameter generation unit adapted to generate a second parameter from the generation node, a modification unit adapted to modify the second parameter generated from the generation node, a first connection weight modification unit adapted to modify the connection weight when an end condition is satisfied, a second connection weight modification unit adapted to modify the connection weight depending on evaluation made by a user, and a second learning unit adapted to learn the second SOM based on the second parameter obtained when the end condition is satisfied.
    Type: Grant
    Filed: April 4, 2006
    Date of Patent: March 3, 2009
    Assignee: Sony Corporation
    Inventors: Kazumi Aoyama, Katsuki Minamino, Hideki Shimomura
  • Patent number: 7454341
    Abstract: According to one aspect of the invention, a method is provided in which a mean vector set and a variance vector set of a set of N Gaussians are divided into multiple mean sub-vector sets and variance sub-vector sets, respectively. Each mean sub-vector set contains a subset of the dimensions of the corresponding mean vector set and each variance sub-vector set contains a subset of the dimensions of the corresponding variance vector set. Each resultant sub-vector set is clustered to build a codebook for the respective sub-vector set using a modified K-means clustering process which dynamically merges and splits clusters based upon the size and average distortion of each cluster during each iteration in the modified K-means clustering process.
    Type: Grant
    Filed: September 30, 2000
    Date of Patent: November 18, 2008
    Assignee: Intel Corporation
    Inventors: Jielin Pan, Baosheng Yuan
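A compact NumPy sketch of a K-means loop with a dynamic merge/split step in the spirit of this abstract: an undersized cluster is merged away by re-seeding its codeword at the largest cluster's most distorted point. The `min_size` threshold and the re-seeding rule are simplifications of the patent's size- and distortion-based criteria.

```python
import numpy as np

def modified_kmeans(data, k, iters=20, min_size=2):
    """K-means over sub-vectors with a merge/split step after each update:
    the smallest cluster, if undersized, is re-seeded inside the largest
    cluster at its most distorted (farthest) point."""
    centers = data[:k].astype(float)
    labels = np.zeros(len(data), dtype=int)
    for _ in range(iters):
        # Assignment step: nearest codeword per vector
        dist = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        # Update step: recompute each codeword as its cluster mean
        for j in range(k):
            pts = data[labels == j]
            if len(pts):
                centers[j] = pts.mean(0)
        # Merge/split step based on cluster size
        sizes = np.bincount(labels, minlength=k)
        if sizes.min() < min_size:
            big, small = sizes.argmax(), sizes.argmin()
            pts = data[labels == big]
            worst = pts[((pts - centers[big]) ** 2).sum(1).argmax()]
            centers[small] = worst.astype(float)
    return centers, labels

# Two well-separated blobs of 2-D sub-vectors
data = np.vstack([np.arange(10).reshape(5, 2) * 0.1,
                  np.arange(10).reshape(5, 2) * 0.1 + 10])
centers, labels = modified_kmeans(data, k=2)
```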
  • Patent number: 7444282
    Abstract: A method of automatic labeling using an optimum-partitioned classified neural network includes searching for neural networks having minimum errors with respect to a number of L phoneme combinations from a number of K neural network combinations generated at an initial stage or updated, updating weights during learning of the K neural networks by K phoneme combination groups searched with the same neural networks, and composing an optimum-partitioned classified neural network combination using the K neural networks of which a total error sum has converged; and tuning a phoneme boundary of a first label file by using the phoneme combination group classification result and the optimum-partitioned classified neural network combination, and generating a final label file reflecting the tuning result.
    Type: Grant
    Filed: March 1, 2004
    Date of Patent: October 28, 2008
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ki-hyun Choo, Jeong-su Kim, Jae-won Lee, Ki-seung Lee
  • Publication number: 20080221878
    Abstract: A system and method for semantic extraction using a neural network architecture includes indexing each word in an input sentence into a dictionary and using these indices to map each word to a d-dimensional vector (the features of which are learned). Together with this, position information for a word of interest (the word to be labeled) and a verb of interest (the verb that the semantic role is being predicted for) with respect to a given word are also used. These positions are integrated by employing a linear layer that is adapted to the input sentence. Several linear transformations and squashing functions are then applied to output class probabilities for semantic role labels. All the weights for the whole architecture are trained by backpropagation.
    Type: Application
    Filed: February 29, 2008
    Publication date: September 11, 2008
    Applicant: NEC LABORATORIES AMERICA, INC.
    Inventors: Ronan Collobert, Jason Weston
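A toy NumPy forward pass in the spirit of this abstract: words are indexed into a dictionary, mapped to d-dimensional vectors, augmented with positions relative to the word and verb of interest, then passed through linear transforms and a squashing function to produce class probabilities. The mean-pooling here is a crude stand-in for the patent's sentence-adapted linear layer; the vocabulary, dimensions, and weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"the": 0, "cat": 1, "chased": 2, "mouse": 3}   # toy dictionary
d = 5
E = rng.normal(scale=0.1, size=(len(vocab), d))  # learned d-dimensional word vectors
W1 = rng.normal(scale=0.1, size=(d + 2, 8))      # linear transformation
W2 = rng.normal(scale=0.1, size=(8, 4))          # -> 4 semantic-role classes

def role_probabilities(sentence, word_pos, verb_pos):
    """Embed each word, append its position relative to the word and verb of
    interest, pool over the sentence, then apply linear layers, a squashing
    function, and a softmax to get semantic-role class probabilities."""
    feats = [np.concatenate([E[vocab[w]], [i - word_pos, i - verb_pos]])
             for i, w in enumerate(sentence)]
    x = np.mean(feats, axis=0)       # fixed-size sentence summary
    h = np.tanh(x @ W1)              # squashing function
    z = h @ W2
    e = np.exp(z - z.max())
    return e / e.sum()               # class probabilities for role labels

p = role_probabilities(["the", "cat", "chased", "the", "mouse"],
                       word_pos=1, verb_pos=2)
```

In the trained system all of `E`, `W1`, and `W2` would be learned jointly by backpropagation, as the abstract states.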
  • Patent number: 7409340
    Abstract: A neural network is used to obtain more robust performance in determining prosodic markers on the basis of linguistic categories.
    Type: Grant
    Filed: January 27, 2003
    Date of Patent: August 5, 2008
    Assignee: Siemens Aktiengesellschaft
    Inventors: Martin Holzapfel, Achim Mueller
  • Publication number: 20080147391
    Abstract: Provided is a method and apparatus for transforming a speech feature vector. The method includes extracting a feature vector required for speech recognition from a speech signal and transforming the extracted feature vector using an auto-associative neural network (AANN).
    Type: Application
    Filed: August 31, 2007
    Publication date: June 19, 2008
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: So-young Jeong, Kwang-cheol Oh, Jae-hoon Jeong, Jeong-su Kim
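A minimal sketch of the transform this abstract describes: a speech feature vector is passed through an auto-associative neural network (one trained to reconstruct its own input), and the bottleneck activation serves as the transformed feature. The MFCC dimension, bottleneck size, and untrained random weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
D, B = 13, 5          # feature dimension and bottleneck size (illustrative)
W_enc = rng.normal(scale=0.1, size=(D, B))   # encoder half of the AANN
W_dec = rng.normal(scale=0.1, size=(B, D))   # decoder half of the AANN

def aann_transform(feature):
    """Return the bottleneck activation as the transformed speech feature."""
    return np.tanh(feature @ W_enc)

def aann_reconstruct(feature):
    """Full auto-associative pass; after training, the output would
    approximate the input, forcing the bottleneck to be informative."""
    return aann_transform(feature) @ W_dec
```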
  • Patent number: 7379507
    Abstract: A modulation recognition method and device for digitally modulated signals with multi-level magnitudes are provided. The modulation recognition method includes selecting plural quantization sizes used to construct plural statistic histograms related to the magnitude of a sequence of data, setting up an off-line processing to extract plural useful feature patterns for each modulation type of interest, receiving a sequence of samples of a modulated object signal and constructing plural statistic histograms related to the magnitude of these samples, and adopting a hierarchical classification method for modulation recognition. It can be applied to the adaptive-modulation communication system, software defined radio, digital broadcasting systems and military communication systems. It can also be integrated with modulation recognition techniques for other types of modulated signals to function in a universal demodulator.
    Type: Grant
    Filed: October 1, 2004
    Date of Patent: May 27, 2008
    Assignee: Industrial Technology Research Institute
    Inventors: Ching-Yung Chen, Chih-Chun Feng
  • Patent number: 7376553
    Abstract: An apparatus for signal processing based on an algorithm for representing harmonics in a fractal lattice. The apparatus includes a plurality of tuned segments, each tuned segment including a transceiver having an intrinsic resonant frequency, the amplitude of which can be modified either by receiving an external input signal or by internally generating a response to an applied feedback signal. A plurality of signal processing elements are arranged in an array pattern, the signal processing elements including at least one function selected from the group including buffers for storing information, a feedback device for generating a feedback signal, a controller for controlling an output signal, a connection circuit for connecting the plurality of tuned segments to signal processing elements, and a feedback connection circuit for conveying signals from the plurality of signal processing elements in the array to the tuned segments.
    Type: Grant
    Filed: July 8, 2004
    Date of Patent: May 20, 2008
    Inventor: Robert Patel Quinn
  • Patent number: 7346497
    Abstract: An automatic speech recognition system comprising a speech decoder to resolve phone- and word-level information, and a vector generator to generate information vectors on which a confidence measure is computed by a neural network classifier (ANN). An error signal is designed that is not subject to false saturation or over-specialization. The error signal is integrated into an error function which is back-propagated through the ANN.
    Type: Grant
    Filed: May 8, 2001
    Date of Patent: March 18, 2008
    Assignee: Intel Corporation
    Inventors: Xiaobo Pi, Ying Jia
  • Patent number: 7319960
    Abstract: A speech recognition system uses a phoneme counter to determine the length of a word to be recognized. The result is used to split a lexicon into one or more sub-lexicons containing only words which have the same or similar length to that of the word to be recognized, thereby restricting the search space significantly. In another aspect, a phoneme counter is used to estimate the number of phonemes in a word so that a transition bias can be calculated. This bias is applied to the transition probabilities between phoneme models in an HNN-based recognizer to improve recognition performance for relatively short or long words.
    Type: Grant
    Filed: December 19, 2001
    Date of Patent: January 15, 2008
    Assignee: Nokia Corporation
    Inventors: Soren Riis, Konstantinos Koumpis
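A short sketch of the lexicon-splitting idea in this abstract: given an estimated phoneme count for the word to be recognized, the search is restricted to lexicon entries of the same or similar length. The toy lexicon, pronunciations, and tolerance are illustrative.

```python
def split_lexicon(lexicon, estimated_len, tolerance=1):
    """Build a sub-lexicon of words whose phoneme count is within
    `tolerance` of the phoneme counter's estimate (tolerance is illustrative)."""
    return {word: phones for word, phones in lexicon.items()
            if abs(len(phones) - estimated_len) <= tolerance}

lexicon = {
    "cat":       ["k", "ae", "t"],
    "dog":       ["d", "ao", "g"],
    "catalog":   ["k", "ae", "t", "ah", "l", "ao", "g"],
    "telephone": ["t", "eh", "l", "ah", "f", "ow", "n"],
}
sub = split_lexicon(lexicon, estimated_len=3)   # only short words survive
```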
  • Patent number: 7295977
    Abstract: The method of the present invention utilizes machine-learning techniques, particularly Support Vector Machines in combination with a neural network, to process a unique machine-learning enabled representation of the audio bitstream. Using this method, a classifying machine is able to autonomously detect characteristics of a piece of music, such as the artist or genre, and classify it accordingly. The method includes transforming digital time-domain representation of music into a frequency-domain representation, then dividing that frequency data into time slices, and compressing it into frequency bands to form multiple learning representations of each song. The learning representations that result are processed by a group of Support Vector Machines, then by a neural network, both previously trained to distinguish among a given set of characteristics, to determine the classification.
    Type: Grant
    Filed: August 27, 2001
    Date of Patent: November 13, 2007
    Assignee: NEC Laboratories America, Inc.
    Inventors: Brian Whitman, Gary W. Flake, Stephen R. Lawrence
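A NumPy sketch of the front-end this abstract describes: the time-domain signal is transformed to the frequency domain, divided into time slices, and compressed into coarse frequency bands; the resulting matrix is the learning representation that would feed the Support Vector Machines. Slice length, band count, and sample rate are illustrative.

```python
import numpy as np

def learning_representation(signal, slice_len=1024, n_bands=16):
    """Per time slice: FFT magnitude spectrum compressed into coarse
    frequency bands by averaging (band edges are illustrative)."""
    n_slices = len(signal) // slice_len
    reps = []
    for s in range(n_slices):
        frame = signal[s * slice_len:(s + 1) * slice_len]
        mag = np.abs(np.fft.rfft(frame))         # frequency-domain representation
        bands = np.array_split(mag, n_bands)     # compress into frequency bands
        reps.append([b.mean() for b in bands])
    return np.array(reps)                        # shape: (n_slices, n_bands)

# One second of a 440 Hz tone at an assumed 8 kHz sample rate
t = np.arange(8000) / 8000.0
rep = learning_representation(np.sin(2 * np.pi * 440 * t))
```

Each row of `rep` is one time slice; stacking several such representations per song yields the inputs the patent's SVM/neural-network cascade would classify.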