Creating Patterns For Matching Patents (Class 704/243)
-
Patent number: 8914283
Abstract: A system and method is provided for combining active and unsupervised learning for automatic speech recognition. This process enables a reduction in the amount of human supervision required for training acoustic and language models and an increase in the performance given the transcribed and un-transcribed data.
Type: Grant
Filed: August 5, 2013
Date of Patent: December 16, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Dilek Zeynep Hakkani-Tur, Giuseppe Riccardi
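The combination of active and unsupervised learning this abstract describes is commonly realized by partitioning recognizer output on confidence: low-confidence utterances go to human transcribers, high-confidence ones are kept with their automatic transcripts. The sketch below is a hypothetical illustration only; the `split_for_training` helper and the 0.8 threshold are assumptions, not the patented method.

```python
def split_for_training(utterances, threshold=0.8):
    """Partition ASR hypotheses by confidence: low-confidence utterances
    are routed to human transcription (active learning); high-confidence
    ones keep their automatic transcripts (unsupervised learning)."""
    to_transcribe, auto_labeled = [], []
    for text, confidence in utterances:
        if confidence < threshold:
            to_transcribe.append(text)
        else:
            auto_labeled.append(text)
    return to_transcribe, auto_labeled
```

In practice the threshold trades transcription cost against label noise in the automatically labeled set.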
-
Patent number: 8914290
Abstract: Method and apparatus that dynamically adjusts operational parameters of a text-to-speech engine in a speech-based system. A voice engine or other application of a device provides a mechanism to alter the adjustable operational parameters of the text-to-speech engine. In response to one or more environmental conditions, the adjustable operational parameters of the text-to-speech engine are modified to increase the intelligibility of synthesized speech.
Type: Grant
Filed: May 18, 2012
Date of Patent: December 16, 2014
Assignee: Vocollect, Inc.
Inventors: James Hendrickson, Debra Drylie Scott, Duane Littleton, John Pecorari, Arkadiusz Slusarczyk
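A minimal sketch of the kind of environment-driven adjustment this abstract describes: as ambient noise rises, synthesized speech is made louder and slower. The `adapt_tts_parameters` function, its decibel thresholds, and the volume/rate deltas are all invented for illustration, not taken from the patent.

```python
def adapt_tts_parameters(noise_db, base_volume=0.5, base_rate=1.0):
    """Map a measured ambient noise level (dB) to TTS engine settings.
    Hypothetical mapping: louder, slower speech as noise increases."""
    if noise_db > 80:
        return {"volume": min(1.0, base_volume + 0.4), "rate": base_rate * 0.8}
    if noise_db > 60:
        return {"volume": min(1.0, base_volume + 0.2), "rate": base_rate * 0.9}
    return {"volume": base_volume, "rate": base_rate}
```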
-
Patent number: 8914278
Abstract: A computer-assisted language correction system including spelling correction functionality, misused word correction functionality, grammar correction functionality and vocabulary enhancement functionality utilizing contextual feature-sequence functionality employing an internet corpus.
Type: Grant
Filed: July 31, 2008
Date of Patent: December 16, 2014
Assignee: Ginger Software, Inc.
Inventors: Yael Karov Zangvil, Avner Zangvil
-
Patent number: 8909534
Abstract: A method may include selecting, by a computing device, sets of two or more text candidates from a plurality of text candidates corresponding to vocal input. The method may further include, for each set, providing, by the computing device, representations of each of the respective two or more text candidates in the set to users, wherein the representations are provided as audio. The method may further include receiving a selection from each of the users of one text candidate from the set, wherein the selection is based on satisfying a criterion. The method may further include determining that a text candidate included in the plurality of text candidates has a highest probability out of the plurality of text candidates of being a correct textual transcription of the vocal input based at least in part on selections from the users.
Type: Grant
Filed: March 9, 2012
Date of Patent: December 9, 2014
Assignee: Google Inc.
Inventor: Taliver Heath
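Determining the most probable transcription from user selections can be sketched as a simple majority vote over the candidates users picked. This `best_transcription` helper is an illustrative assumption; the patent's actual probability criterion may be more elaborate.

```python
from collections import Counter

def best_transcription(selections):
    """Return the candidate chosen most often by users; the winner is
    treated as the most probable correct transcription of the vocal input."""
    return Counter(selections).most_common(1)[0][0]
```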
-
Patent number: 8909528
Abstract: A method (and system) of determining confusable list items and resolving this confusion in a spoken dialog system includes receiving user input, processing the user input and determining if a list of items needs to be played back to the user, retrieving the list to be played back to the user, identifying acoustic confusions between items on the list, changing the items on the list as necessary to remove the acoustic confusions, and playing unambiguous list items back to the user.
Type: Grant
Filed: May 9, 2007
Date of Patent: December 9, 2014
Assignee: Nuance Communications, Inc.
Inventors: Ellen Marie Eide, Vaibhava Goel, Ramesh Gopinath, Osamuyimen T. Stewart
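The "identify acoustic confusions, then change items" loop above can be approximated with a string-similarity proxy for acoustic distance: pairs of item names that are too similar get a distinguishing suffix. This is a sketch under stated assumptions; real systems would compare phone sequences, not spellings, and the `disambiguate` name and 0.8 threshold are invented here.

```python
from difflib import SequenceMatcher

def disambiguate(items, threshold=0.8):
    """Append a distinguishing suffix to list items whose names are
    acoustically close (approximated here by surface-string similarity)."""
    out = list(items)
    for i in range(len(out)):
        for j in range(i + 1, len(out)):
            if SequenceMatcher(None, out[i], out[j]).ratio() >= threshold:
                out[j] = f"{out[j]} (option {j + 1})"
    return out
```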
-
Patent number: 8909529
Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
Type: Grant
Filed: November 15, 2013
Date of Patent: December 9, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventor: Giuseppe Riccardi
-
Publication number: 20140358539
Abstract: A method includes: acquiring data samples; performing categorized sentence mining in the acquired data samples to obtain categorized training samples for multiple categories; building a text classifier based on the categorized training samples; classifying the data samples using the text classifier to obtain a class vocabulary and a corpus for each category; mining the corpus for each category according to the class vocabulary for the category to obtain a respective set of high-frequency language templates; training on the templates for each category to obtain a template-based language model for the category; training on the corpus for each category to obtain a class-based language model for the category; training on the class vocabulary for each category to obtain a lexicon-based language model for the category; building a speech decoder according to an acoustic model, the class-based language model and the lexicon-based language model for any given field, and the data samples.
Type: Application
Filed: February 14, 2014
Publication date: December 4, 2014
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Feng Rao, Li Lu, Bo Chen, Xiang Zhang, Shuai Yue, Lu Li
-
Publication number: 20140358538
Abstract: Methods and systems are provided for shaping speech dialog of a speech system. In one embodiment, a method includes: receiving data related to a first utterance from a user of the speech system; processing the data based on at least one attribute processing technique that determines at least one attribute of the first utterance; determining a shaping pattern based on the at least one attribute; and generating a speech prompt based on the shaping pattern.
Type: Application
Filed: May 28, 2013
Publication date: December 4, 2014
Applicant: GM Global Technology Operations LLC
Inventors: Ron M. Hecht, Eli Tzirkel-Hancock, Omer Tsimhoni, Ute Winter
-
Publication number: 20140350931
Abstract: A Statistical Machine Translation (SMT) model is trained using pairs of sentences that include content obtained from one or more content sources (e.g. feed(s)) with corresponding queries that have been used to access the content. A query click graph may be used to assist in determining candidate pairs for the SMT training data. All or a portion of the candidate pairs may be used to train the SMT model. After training the SMT model using the SMT training data, the SMT model is applied to content to determine predicted queries that may be used to search for the content. The predicted queries are used to train a language model, such as a query language model. The query language model may be interpolated with other language models, such as a background language model, as well as a feed language model trained using the content used in determining the predicted queries.
Type: Application
Filed: May 24, 2013
Publication date: November 27, 2014
Applicant: Microsoft Corporation
Inventors: Michael Levit, Dilek Hakkani-Tur, Gokhan Tur
-
Patent number: 8892437
Abstract: Example embodiments of the present invention may include a method that provides transcribing spoken utterances occurring during a call and assigning each of the spoken utterances with a corresponding set of first classifications. The method may also include determining a confidence rating associated with each of the spoken utterances and the assigned set of first classifications, and performing at least one of reclassifying the spoken utterances with new classifications based on at least one additional classification operation, and adding the assigned first classifications and the corresponding plurality of spoken utterances to a training data set.
Type: Grant
Filed: November 13, 2013
Date of Patent: November 18, 2014
Assignee: West Corporation
Inventor: Silke Witt-Ehsani
-
Patent number: 8892436
Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
Type: Grant
Filed: October 19, 2011
Date of Patent: November 18, 2014
Assignees: Samsung Electronics Co., Ltd., Seoul National University Industry Foundation
Inventors: Ki-wan Eom, Chang-woo Han, Tae-gyoon Kang, Nam-soo Kim, Doo-hwa Hong, Jae-won Lee, Hyung-joon Lim
-
Publication number: 20140337026
Abstract: A method and system for generating training data for a target domain using speech data of a source domain. The training data generation method including: reading out a Gaussian mixture model (GMM) of a target domain trained with a clean speech data set of the target domain; mapping, by referring to the GMM of the target domain, a set of source domain speech data received as an input to the set of target domain speech data on a basis of a channel characteristic of the target domain speech data; and adding a noise of the target domain to the mapped set of source domain speech data to output a set of pseudo target domain speech data.
Type: Application
Filed: April 14, 2014
Publication date: November 13, 2014
Applicant: International Business Machines Corporation
Inventors: Osamu Ichikawa, Steven J. Rennie
-
Patent number: 8886535
Abstract: A method of optimizing the calculation of matching scores between phone states and acoustic frames across a matrix of an expected progression of phone states aligned with an observed progression of acoustic frames within an utterance is provided. The matrix has a plurality of cells associated with a characteristic acoustic frame and a characteristic phone state. A first set and second set of cells that meet a threshold probability of matching a first phone state or a second phone state, respectively, are determined. The phone states are stored on a local cache of a first core and a second core, respectively. The first and second sets of cells are also provided to the first core and second core, respectively. Further, matching scores of each characteristic state and characteristic observation of each cell of the first set of cells and of the second set of cells are calculated.
Type: Grant
Filed: January 23, 2014
Date of Patent: November 11, 2014
Assignee: Accumente, LLC
Inventors: Jike Chong, Ian Richard Lane, Senaka Wimal Buthpitiya
-
Patent number: 8886533
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations.
Type: Grant
Filed: October 25, 2011
Date of Patent: November 11, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Sumit Chopra, Dimitrios Dimitriadis, Patrick Haffner
-
Patent number: 8886534
Abstract: A speech recognition apparatus includes a speech input unit that receives input speech, a phoneme recognition unit that recognizes phonemes of the input speech and generates a first phoneme sequence representing corrected speech, a matching unit that matches the first phoneme sequence with a second phoneme sequence representing original speech, and a phoneme correcting unit that corrects phonemes of the second phoneme sequence based on the matching result.
Type: Grant
Filed: January 27, 2011
Date of Patent: November 11, 2014
Assignee: Honda Motor Co., Ltd.
Inventors: Mikio Nakano, Naoto Iwahashi, Kotaro Funakoshi, Taisuke Sumii
-
Patent number: 8886543
Abstract: System and methods for characterizing interest points within a fingerprint are disclosed herein. The systems include generating a set of interest points and an anchor point related to an audio sample. A quantized absolute frequency of an anchor point can be calculated and used to calculate a set of quantized ratios. A fingerprint can then be generated based upon the set of quantized ratios and used in comparison to reference fingerprints to identify the audio sample. The disclosed systems and methods provide for an audio matching system robust to pitch-shift distortion by using quantized ratios within fingerprints rather than solely using absolute frequencies of interest points. Thus, the disclosed system and methods result in more accurate audio identification.
Type: Grant
Filed: November 15, 2011
Date of Patent: November 11, 2014
Assignee: Google Inc.
Inventors: Matthew Sharifi, George Tzanetakis, Annie Chen, Dominik Roblek
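The pitch-shift robustness claimed above rests on a simple observation: a uniform pitch shift scales every frequency by the same factor, so ratios between interest-point frequencies and an anchor frequency are unchanged. A minimal sketch, with the `fingerprint` function, bin count, and ratio cap all assumed for illustration:

```python
def fingerprint(anchor_freq, interest_freqs, num_bins=32, max_ratio=4.0):
    """Quantize each interest point's frequency ratio to the anchor
    instead of its absolute frequency; the resulting codes survive a
    uniform pitch shift because both numerator and denominator scale."""
    codes = []
    for f in interest_freqs:
        ratio = min(f / anchor_freq, max_ratio)
        codes.append(int(ratio / max_ratio * (num_bins - 1)))
    return tuple(codes)
```

A 1.5x pitch-shifted copy of a sample produces the same codes, so it still matches the reference fingerprint.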
-
Patent number: 8880402
Abstract: A speech recognition method includes receiving input speech from a user, processing the input speech to obtain at least one parameter value, and determining an experience level of the user using the parameter value(s). The method can also include prompting the user based upon the determined experience level of the user to assist the user in delivering speech commands.
Type: Grant
Filed: October 28, 2006
Date of Patent: November 4, 2014
Assignee: General Motors LLC
Inventors: Ryan J. Wasson, John P. Weiss, Jason W. Clark
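One plausible reading of "determining an experience level ... using the parameter value(s)" is a threshold heuristic over interaction statistics. The `experience_level` function below, its inputs (average pause length, command error count), and its cutoffs are all hypothetical, offered only to make the idea concrete:

```python
def experience_level(avg_pause_s, command_errors):
    """Hypothetical heuristic: long pauses and frequent misrecognized
    commands suggest a novice who should receive fuller prompts."""
    if avg_pause_s > 1.5 or command_errors > 3:
        return "novice"
    if avg_pause_s > 0.7 or command_errors > 1:
        return "intermediate"
    return "expert"
```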
-
Patent number: 8880397
Abstract: Exemplary embodiments provide systems, devices and methods that allow creation and management of lists of items in an integrated manner on an interactive graphical user interface. A user may speak a plurality of list items in a natural unbroken manner to provide an audio input stream into an audio input device. Exemplary embodiments may automatically process the audio input stream to convert the stream into a text output, and may process the text output into one or more n-grams that may be used as list items to populate a list on a user interface.
Type: Grant
Filed: October 21, 2011
Date of Patent: November 4, 2014
Assignee: Wal-Mart Stores, Inc.
Inventors: Dion Almaer, Bernard Paul Cousineau, Ben Galbraith
-
Publication number: 20140324427
Abstract: A dialog manager and spoken dialog service having a dialog manager generated according to a method comprising selecting a top level flow controller based on application type, selecting available reusable subdialogs for each application part, developing a subdialog for each application part not having an available subdialog, and testing and deploying the spoken dialog service using the selected top level flow controller, selected reusable subdialogs and developed subdialogs. The dialog manager is capable of handling context shifts in a spoken dialog with a user. Application dependencies are established in the top level flow controller, thus enabling the subdialogs to be reusable and to be capable of managing context shifts and mixed initiative dialogs.
Type: Application
Filed: April 25, 2014
Publication date: October 30, 2014
Applicant: AT&T Intellectual Property II, L.P.
Inventors: Giuseppe Di Fabbrizio, Charles Alfred Lewis
-
Patent number: 8868423
Abstract: Systems and methods for controlling access to resources using spoken Completely Automatic Public Turing Tests To Tell Humans And Computers Apart (CAPTCHA) tests are disclosed. In these systems and methods, entities seeking access to resources are required to produce an input utterance that contains at least some audio. That utterance is compared with voice reference data for human and machine entities, and a determination is made as to whether the entity requesting access is a human or a machine. Access is then permitted or refused based on that determination.
Type: Grant
Filed: July 11, 2013
Date of Patent: October 21, 2014
Assignee: John Nicholas and Kristin Gross Trust
Inventor: John Nicholas Gross
-
Patent number: 8868410
Abstract: The invention provides a dialogue-based learning apparatus through dialogue with users comprising: a speech input unit (10) for inputting speeches; a speech recognition unit (20) for recognizing the input speech; and a behavior and dialogue controller (30) for controlling behaviors and dialogues according to speech recognition results, wherein the behavior and dialogue controller (30) has a topic recognition expert (34) to memorise contents of utterances and to retrieve the topic that best matches the speech recognition results, and a mode switching expert (35) to control mode switching in accordance with a user utterance, wherein the mode switching expert switches modes in accordance with a user utterance, wherein the topic recognition expert registers a plurality of words in the utterance as topics in a first mode, performs searches from among the registered topics, and selects the maximum likelihood topic in a second mode.
Type: Grant
Filed: August 29, 2008
Date of Patent: October 21, 2014
Assignees: National Institute of Information and Communications Technology, Honda Motor Co., Ltd.
Inventors: Naoto Iwahashi, Noriyuki Kimura, Mikio Nakano, Kotaro Funakoshi
-
Patent number: 8862468
Abstract: A system and method of refining context-free grammars (CFGs). The method includes deriving back-off grammar (BOG) rules from an initially developed CFG and utilizing the initial CFG and the derived BOG rules to recognize user utterances. Based on a response of the initial CFG and the derived BOG rules to the user utterances, at least a portion of the derived BOG rules are utilized to modify the initial CFG and thereby produce a refined CFG. The above method can be carried out iteratively, with each new iteration utilizing a refined CFG from preceding iterations.
Type: Grant
Filed: December 22, 2011
Date of Patent: October 14, 2014
Assignee: Microsoft Corporation
Inventors: Timothy Paek, Max Chickering, Eric Badger
-
Patent number: 8856005
Abstract: A method for receiving processed information at a remote device is described. The method includes transmitting from the remote device a verbal request to a first information provider and receiving a digital message from the first information provider in response to the transmitted verbal request. The digital message includes a symbolic representation indicator associated with a symbolic representation of the verbal request and data used to control an application. The method also includes transmitting, using the application, the symbolic representation indicator to a second information provider for generating results to be displayed on the remote device.
Type: Grant
Filed: January 8, 2014
Date of Patent: October 7, 2014
Assignee: Google Inc.
Inventors: Gudmundur Hafsteinsson, Michael J. LeBeau, Natalia Marmasse, Sumit Agarwal, Dipchad Nishar
-
Patent number: 8856002
Abstract: A universal pattern processing system receives input data and produces output patterns that are best associated with said data. The system uses input means receiving and processing input data, a universal pattern decoder means transforming models using the input data and associating output patterns with original models that are changed least during transforming, and output means outputting best associated patterns chosen by a pattern decoder means.
Type: Grant
Filed: April 11, 2008
Date of Patent: October 7, 2014
Assignee: International Business Machines Corporation
Inventors: Dimitri Kanevsky, David Nahamoo, Tara N. Sainath
-
Patent number: 8849662
Abstract: A method and a system for segmenting phonemes from voice signals. A method for accurately segmenting phonemes, in which a histogram showing a peak distribution corresponding to an order is formed by using a high order concept, and a boundary indicating a starting point and an ending point of each phoneme is determined by calculating a peak statistic based on the histogram. The phoneme segmentation method can remarkably reduce an amount of calculation, and has an advantage of being applied to sound signal systems which perform sound coding, sound recognition, sound synthesizing, sound reinforcement, etc.
Type: Grant
Filed: December 28, 2006
Date of Patent: September 30, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventor: Hyun-Soo Kim
-
Patent number: 8849660
Abstract: Systems and methods for training voice activation control of electronic equipment are disclosed. One example method includes receiving a selection corresponding to at least one command used to control the electronic equipment. The method further includes instructing a user to speak, and responsive to the instruction, receiving a digitized speech stream. The method further includes segmenting the speech stream into speech segments, storing at least one of the speech segments as an entry in a dictionary, and associating the dictionary entry with the selected command.
Type: Grant
Filed: December 14, 2007
Date of Patent: September 30, 2014
Inventors: Arturo A. Rodriguez, David A. Sedacca, Albert Garcia
-
Patent number: 8843371
Abstract: The instant application includes computationally-implemented systems and methods that include managing adaptation data, the adaptation data is at least partly based on at least one speech interaction of a particular party, facilitating transmission of the adaptation data to a target device when there is an indication of a speech-facilitated transaction between the target device and the particular party, such that the adaptation data is to be applied to the target device to assist in execution of the speech-facilitated transaction, and facilitating acquisition of adaptation result data that is based on at least one aspect of the speech-facilitated transaction and to be used in determining whether to modify the adaptation data. In addition to the foregoing, other aspects are described in the claims, drawings, and text.
Type: Grant
Filed: August 1, 2012
Date of Patent: September 23, 2014
Assignee: Elwha LLC
Inventors: Royce A. Levien, Richard T. Lord, Robert W. Lord, Mark A. Malamud
-
Patent number: 8843370
Abstract: Adjusting model parameters is described for a speech recognition system that combines recognition outputs from multiple speech recognition processes. Discriminative adjustments are made to model parameters of at least one acoustic model based on a joint discriminative criterion over multiple complementary acoustic models to lower recognition word error rate in the system.
Type: Grant
Filed: November 26, 2007
Date of Patent: September 23, 2014
Assignee: Nuance Communications, Inc.
Inventors: Daniel Willett, Chuang He
-
Publication number: 20140278413
Abstract: An electronic device with one or more processors and memory includes a procedure for training a digital assistant. In some embodiments, the device detects an impasse in a dialogue between the digital assistant and a user including a speech input. During a learning session, the device utilizes a subsequent clarification input from the user to adjust intent inference or task execution associated with the speech input to produce a satisfactory response. In some embodiments, the device identifies a pattern of success or failure associated with an aspect previously used to complete a task and generates a hypothesis regarding a parameter used in speech recognition, intent inference or task execution as a cause for the pattern. Then, the device tests the hypothesis by altering the parameter for a subsequent completion of the task and adopts or rejects the hypothesis based on feedback information collected from the subsequent completion.
Type: Application
Filed: March 14, 2014
Publication date: September 18, 2014
Applicant: Apple Inc.
Inventors: Donald W. Pitschel, Adam J. Cheyer, Christopher D. Brigham, Thomas R. Gruber
-
Publication number: 20140278395
Abstract: A method and apparatus for determining a motion environment profile to adapt voice recognition processing includes a device receiving an acoustic signal including a speech signal, which is provided to a voice recognition module. The method also includes determining a motion profile for the device, determining a temperature profile for the device, and determining a noise profile for the acoustic signal. The method further includes determining, from the motion, temperature, and noise profiles, a motion environment profile for the device and adapting voice recognition processing for the speech signal based on the motion environment profile.
Type: Application
Filed: July 31, 2013
Publication date: September 18, 2014
Applicant: Motorola Mobility LLC
Inventors: Robert A. Zurek, Kevin J. Bastyr, Giles T. Davis, Plamen A. Ivanov, Adrian M. Schuster
-
Publication number: 20140257810
Abstract: According to an embodiment, a pattern classifier device includes a decision unit, an execution unit, a calculator, and a determination unit. The decision unit is configured to decide a subclass to which the input pattern is to belong, based on attribute information of the input pattern. The execution unit is configured to determine whether the input pattern belongs to a class that is divided into subclasses, using a weak classifier allocated to the decided subclass, and output a result of the determination and a reliability of the weak classifier. The calculator is configured to calculate an integrated value obtained by integrating an evaluation value based on the determination result and the reliability. The determination unit is configured to repeat the determination processing when a termination condition of the determination processing is not satisfied, and terminate the determination processing and output the integrated value when the termination condition has been satisfied.
Type: Application
Filed: January 24, 2014
Publication date: September 11, 2014
Applicant: Kabushiki Kaisha Toshiba
Inventors: Hiroshi Fujimura, Takashi Masuko
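The weak-classifier loop with a termination condition described above resembles a boosting-style cascade: reliability-weighted votes accumulate into an integrated value, and evaluation stops early once the score clears a margin. The `cascade_classify` sketch below is an assumed illustration of that pattern, not the publication's exact procedure:

```python
def cascade_classify(x, weak_classifiers, margin=2.0):
    """Accumulate reliability-weighted votes from weak classifiers and
    stop early once the integrated value clears a confidence margin
    (a simple form of the termination condition)."""
    score = 0.0
    for clf, reliability in weak_classifiers:
        score += reliability * (1.0 if clf(x) else -1.0)
        if abs(score) >= margin:
            break  # termination condition satisfied; skip remaining classifiers
    return score
```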
-
Patent number: 8831763
Abstract: System and methods for intelligently pruning interest points are disclosed herein. The systems include generating a plurality of distorted audio samples and associated distorted interest points based upon a clean audio sample. Interest points that are common to sets of distorted interest points are retained, with interest points not robust to distortion discarded. The disclosed systems and methods therefore can provide for a scalable audio matching solution by eliminating interest points in reference sample fingerprints. The set of pruned interest points are robust to distortion, and the benefits of both scalability and accuracy can be had.
Type: Grant
Filed: October 18, 2011
Date of Patent: September 9, 2014
Assignee: Google Inc.
Inventors: Matthew Sharifi, Gheorghe Postelnicu, George Tzanetakis, Dominik Roblek
-
Patent number: 8831940
Abstract: A dictation system that allows using trainable code phrases is provided. The dictation system operates by receiving audio and recognizing the audio as text. The text/audio may contain code phrases that are identified by a comparator that matches the text/audio and replaces the code phrase with a standard clause that is associated with the code phrase. The database or memory containing the code phrases is loaded with matched standard clauses that may be identified to provide a hierarchal system such that certain code phrases may have multiple meanings depending on the user.
Type: Grant
Filed: March 21, 2011
Date of Patent: September 9, 2014
Assignee: NVOQ Incorporated
Inventors: Charles Corfield, Brian Marquette, David Mondragon, Rebecca Heins
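The code-phrase mechanism above is essentially a lookup table with per-user overrides layered on a global table, which gives the hierarchical behaviour where one phrase can mean different things for different users. A minimal sketch (the `expand_code_phrases` helper and the sample clauses are invented for illustration):

```python
def expand_code_phrases(text, global_phrases, user_phrases=None):
    """Replace trainable code phrases in recognized text with their
    standard clauses. User-specific entries override global ones."""
    table = dict(global_phrases)
    table.update(user_phrases or {})
    for phrase, clause in table.items():
        text = text.replace(phrase, clause)
    return text
```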
-
Patent number: 8831957
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speech recognition using models that are based on where, within a building, a speaker makes an utterance are disclosed. The methods, systems, and apparatus include actions of receiving data corresponding to an utterance, and obtaining location indicia for an area within a building where the utterance was spoken. Further actions include selecting one or more models for speech recognition based on the location indicia, wherein each of the selected one or more models is associated with a weight based on the location indicia. Additionally, the actions include generating a composite model using the selected one or more models and the respective weights of the selected one or more models. And the actions also include generating a transcription of the utterance using the composite model.
Type: Grant
Filed: October 15, 2012
Date of Patent: September 9, 2014
Assignee: Google Inc.
Inventors: Gabriel Taubman, Brian Strope
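The composite model described above can be sketched as a weighted interpolation of location-specific models, with weights derived from the location indicia. Here each model is simplified to a word-probability dictionary; the `composite_score` function and that representation are assumptions for illustration only:

```python
def composite_score(word, models, weights):
    """Interpolate a word's probability across location-specific models,
    weighted by how strongly the location indicia match each area."""
    total = sum(weights)
    return sum(w / total * m.get(word, 0.0) for m, w in zip(models, weights))
```

With a strong "kitchen" signal, kitchen vocabulary dominates the composite probability.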
-
Patent number: 8825481
Abstract: Techniques are described for training a speech recognition model for accented speech. A subword parse table is employed that models mispronunciations at multiple subword levels, such as the syllable, position-specific cluster, and/or phone levels. Mispronunciation probability data is then generated at each level based on inputted training data, such as phone-level annotated transcripts of accented speech. Data from different levels of the subword parse table may then be combined to determine the accented speech model. Mispronunciation probability data at each subword level is based at least in part on context at that level. In some embodiments, phone-level annotated transcripts are generated using a semi-supervised method.
Type: Grant
Filed: January 20, 2012
Date of Patent: September 2, 2014
Assignee: Microsoft Corporation
Inventors: Albert Joseph Kishan Thambiratnam, Timo Pascal Mertens, Frank Torsten Bernd Seide
-
Patent number: 8825478
Abstract: Audio content is converted to text using speech recognition software. The text is then associated with a distinct voice or a generic placeholder label if no distinction can be made. From the text and voice information, a word cloud is generated based on key words and key speakers. A visualization of the cloud displays as it is being created. Words grow in size in relation to their dominance. When it is determined that the predominant words or speakers have changed, the word cloud is complete. That word cloud continues to be displayed statically and a new word cloud display begins based upon a new set of predominant words or a new predominant speaker or set of speakers. This process may continue until the meeting is concluded. At the end of the meeting, the completed visualization may be saved to a storage device, sent to selected individuals, removed, or any combination of the preceding.
Type: Grant
Filed: January 10, 2011
Date of Patent: September 2, 2014
Assignee: Nuance Communications, Inc.
Inventors: Susan Marie Cox, Janani Janakiraman, Fang Lu, Loulwa F. Salem
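"Words grow in size in relation to their dominance" is typically implemented by scaling each word's display size by its frequency share. The `cloud_sizes` helper below, with its point-size range, is a hypothetical sketch of that scaling step:

```python
from collections import Counter

def cloud_sizes(words, min_size=10, max_size=48):
    """Scale each word's display size by its frequency relative to the
    most frequent word, so dominant words grow as the cloud is built."""
    counts = Counter(words)
    top = max(counts.values())
    return {w: min_size + int((c / top) * (max_size - min_size))
            for w, c in counts.items()}
```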
-
Publication number: 20140244254
Abstract: A development system is described for facilitating the development of a spoken natural language (SNL) interface. The development system receives seed templates from a developer, each of which provides a command phrasing that can be used to invoke a function, when spoken by an end user. The development system then uses one or more development resources, such as a crowdsourcing system and a paraphrasing system, to provide additional templates. This yields an extended set of templates. A generation system then generates one or more models based on the extended set of templates. A user device may install the model(s) for use in interpreting commands spoken by an end user. When the user device recognizes a command, it may automatically invoke a function associated with that command. Overall, the development system provides an easy-to-use tool for producing an SNL interface.
Type: Application
Filed: February 25, 2013
Publication date: August 28, 2014
Applicant: Microsoft Corporation
Inventors: Yun-Cheng Ju, Matthai Philipose, Seungyeop Han
-
Patent number: 8818808
Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
Type: Grant
Filed: February 23, 2005
Date of Patent: August 26, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
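The "intelligently selected" step in this abstract is the classic active-learning move: pick the automatically transcribed utterances the recognizer was least confident about and send those for manual transcription. A minimal sketch, with hypothetical names (the patent does not specify the selection criterion):

```python
def select_for_transcription(auto_transcripts, k):
    """auto_transcripts: list of (utterance_id, recognizer_confidence) pairs.
    Return the k utterance ids with the lowest confidence, i.e. those
    whose manual transcription should help the model most."""
    ranked = sorted(auto_transcripts, key=lambda t: t[1])
    return [uid for uid, _ in ranked[:k]]

picked = select_for_transcription([("a", 0.9), ("b", 0.2), ("c", 0.5)], 2)
```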
-
Patent number: 8818809
Abstract: Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability.
Type: Grant
Filed: June 20, 2013
Date of Patent: August 26, 2014
Assignee: Google Inc.
Inventors: Craig L. Reding, Suzi Levas
-
Patent number: 8817952
Abstract: Methods, apparatus, and systems are provided such that a Public Safety Answering Point (PSAP) may utilize a new model to handle Open Line emergency calls, including audio optimization, automation, analysis, and presentation. Embodiments of the present disclosure assist with the difficult task of identifying background noise while trying to listen and talk to a caller, and give the best possible audio from the caller to the emergency call-taker or dispatcher. More particularly, an audio stream is split into at least two instances, with a first instance being optimized for speech intelligibility and provided to a call-taker or dispatcher and a second instance being provided for background sound analysis. Accordingly, the new PSAP Open Line model may allow for significantly more efficient emergency assessment, location, and management of resources.
Type: Grant
Filed: March 15, 2013
Date of Patent: August 26, 2014
Assignee: Avaya Inc.
Inventors: Jon Bentley, Mark Fletcher, Joseph L. Hall, Avram Levi, Paul Roller Michaelis, Heinz Teutsch
-
Patent number: 8812315
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: October 1, 2013
Date of Patent: August 19, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
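The core operation here, replacing each dictionary phoneme's model with a weighted sum of the native-speech models of the plausible phonemes from the speaker's lattice, can be sketched in a few lines. Reducing an acoustic model to a single mean vector, and all names below, are simplifying assumptions for illustration:

```python
def restructure_phoneme_model(native_models, lattice_weights):
    """native_models: {phoneme: mean vector of its native acoustic model}.
    lattice_weights: {phoneme: weight of that phoneme in the new speaker's
    lattice}. Returns the weighted-sum model; the pronouncing dictionary
    itself is untouched, only the acoustic model is restructured."""
    total = sum(lattice_weights.values())
    dim = len(next(iter(native_models.values())))
    mixed = [0.0] * dim
    for ph, w in lattice_weights.items():
        for i, x in enumerate(native_models[ph]):
            mixed[i] += (w / total) * x
    return mixed

# A dictionary /ae/ that the new speaker realizes mostly as /ae/ but
# sometimes as /eh/ becomes a 3:1 blend of the two native models.
blended = restructure_phoneme_model(
    {"ae": [1.0, 0.0], "eh": [0.0, 1.0]}, {"ae": 3, "eh": 1}
)
```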
-
Publication number: 20140229178
Abstract: A method for real-time data-pattern analysis. The method includes receiving and queuing at least one data-pattern analysis request by a data-pattern analysis unit controller. At least one data stream portion is also received and stored by the data-pattern analysis unit controller, each data stream portion corresponding to a received data-pattern analysis request. Next, a received data-pattern analysis request is selected by the data-pattern analysis unit controller along with a corresponding data stream portion. A data-pattern analysis is performed based on the selected data-pattern analysis request and the corresponding data stream portion, wherein the data-pattern analysis is performed by one of a plurality of data-pattern analysis units.
Type: Application
Filed: April 22, 2014
Publication date: August 14, 2014
Applicant: Spansion LLC
Inventors: Richard FASTOW, Qamrul Hasan
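The controller flow in this abstract — queue each analysis request with its corresponding data-stream portion, then dispatch the pair to one of several analysis units — can be sketched as follows. The round-robin dispatch policy and all names are assumptions; the publication does not specify how a unit is chosen:

```python
from collections import deque
from itertools import cycle

class PatternAnalysisController:
    """Queues (request, stream portion) pairs and hands each to one of
    a plurality of analysis units, round-robin."""

    def __init__(self, units):
        self.queue = deque()
        self.units = cycle(units)  # assumption: simple round-robin

    def submit(self, request, stream_portion):
        # Each stored stream portion corresponds to a received request.
        self.queue.append((request, stream_portion))

    def dispatch_one(self):
        request, portion = self.queue.popleft()
        unit = next(self.units)
        return unit(request, portion)

# Toy "analysis units": check whether the requested pattern occurs.
units = [lambda r, p: ("unit1", r in p), lambda r, p: ("unit2", r in p)]
ctrl = PatternAnalysisController(units)
ctrl.submit("abc", "xxabcxx")
result = ctrl.dispatch_one()
```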
-
Patent number: 8805684
Abstract: Automatic speech recognition (ASR) may be performed on received utterances. The ASR may be performed by an ASR module of a computing device (e.g., a client device). The ASR may include: generating feature vectors based on the utterances, updating the feature vectors based on feature-space speaker adaptation parameters, transcribing the utterances to text strings, and updating the feature-space speaker adaptation parameters based on the feature vectors. The transcriptions may be based, at least in part, on an acoustic model and the updated feature vectors. Updated speaker adaptation parameters may be received from another computing device and incorporated into the ASR module.
Type: Grant
Filed: October 17, 2012
Date of Patent: August 12, 2014
Assignee: Google Inc.
Inventors: Petar Aleksic, Xin Lei
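The feature-space loop described above (shift each feature vector by the speaker-adaptation parameters, then update those parameters from the vectors) can be illustrated with a deliberately simple bias-correction scheme. The running-bias update, the learning rate, and all names are assumptions; the patent leaves the adaptation method unspecified:

```python
def adapt_and_update(feature_vectors, bias, rate=0.1):
    """Apply a per-dimension bias to each feature vector, then nudge the
    bias toward the observed vectors (a toy stand-in for feature-space
    speaker adaptation)."""
    adapted = []
    for vec in feature_vectors:
        shifted = [x - b for x, b in zip(vec, bias)]
        adapted.append(shifted)
        bias = [b + rate * x for b, x in zip(bias, shifted)]
    return adapted, bias

adapted, new_bias = adapt_and_update([[1.0, 1.0]], [0.0, 0.0])
```

The abstract's final step — receiving updated parameters from another device — would simply replace `bias` with the received values before the next call.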
-
Patent number: 8805685
Abstract: Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
Type: Grant
Filed: August 5, 2013
Date of Patent: August 12, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Horst J. Schroeter
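The variance check at the heart of this abstract — identical or near-identical repetitions of the same phrase suggest playback or synthesis, so verification is denied when variation falls below a threshold — can be sketched directly. The feature representation (equal-length vectors) and threshold value are assumptions:

```python
def verify_speaker(samples, min_variance=1e-3):
    """samples: equal-length feature vectors of the same word or phrase.
    Returns True (verify) only if the samples show sufficient variance;
    suspiciously uniform repetitions are denied."""
    n = len(samples)
    dim = len(samples[0])
    means = [sum(s[i] for s in samples) / n for i in range(dim)]
    variance = sum(
        (s[i] - means[i]) ** 2 for s in samples for i in range(dim)
    ) / (n * dim)
    return variance >= min_variance

identical = verify_speaker([[1.0, 2.0], [1.0, 2.0]])   # replayed audio
natural = verify_speaker([[1.0, 2.0], [1.5, 2.5]])     # live repetitions
```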
-
Patent number: 8805560
Abstract: Systems and methods for noise based interest point density pruning are disclosed herein. The systems include determining an amount of noise in an audio sample and adjusting the amount of interest points within an audio sample fingerprint based on the amount of noise. Samples containing high amounts of noise correspondingly generate fingerprints with more interest points. The disclosed systems and methods allow reference fingerprints to be reduced in size while increasing the size of sample fingerprints. The benefits in scalability do not compromise the accuracy of an audio matching system using noise based interest point density pruning.
Type: Grant
Filed: October 18, 2011
Date of Patent: August 12, 2014
Assignee: Google Inc.
Inventors: George Tzanetakis, Dominik Roblek, Matthew Sharifi
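The density rule above — noisier samples keep more interest points in their fingerprints — can be sketched as a noise-dependent budget applied when pruning. The linear mapping, the bounds, and all names are illustrative assumptions:

```python
def interest_point_budget(noise_level, base=20, extra=60):
    """noise_level in [0, 1]: estimated fraction of the sample's energy
    that is noise. Noisier samples get a larger interest-point budget."""
    noise_level = max(0.0, min(1.0, noise_level))
    return base + round(extra * noise_level)

def prune_interest_points(points, noise_level):
    """points: [(time, strength), ...]. Keep the strongest points, with
    the count determined by the noise estimate."""
    budget = interest_point_budget(noise_level)
    return sorted(points, key=lambda p: p[1], reverse=True)[:budget]
```

A clean sample would be pruned down toward `base` points, while a noisy one retains up to `base + extra`, matching the abstract's trade of smaller reference fingerprints for larger sample fingerprints.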
-
Publication number: 20140222426
Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
Type: Application
Filed: April 7, 2014
Publication date: August 7, 2014
Applicant: AT&T Intellectual Property II, L.P.
Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
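The two-threshold routing in this abstract — accept a confidently classified utterance, re-prompt when confidence falls short of the acceptance threshold, and hand off to a (hidden) human when it falls below the rejection threshold — reduces to a small decision function. Threshold values and names are assumptions for illustration:

```python
def route_utterance(confidence, accept_threshold=0.8, reject_threshold=0.3):
    """Route a classified utterance by SLU classifier confidence."""
    if confidence >= accept_threshold:
        return "accept"             # usable for training the dialog system
    if confidence < reject_threshold:
        return "transfer_to_human"  # likely task-specific; hidden human takes over
    return "reprompt"               # ambiguous: ask the user again
```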
-
Publication number: 20140222425
Abstract: Provided are a speech recognition learning method using 3D geometric information and a speech recognition method by using 3D geometric information. The method performs learning by using 3D geometric information for learning or information derived from the 3D geometric information to generate a recognizer, and the speech recognition method performs speech recognition by applying 3D geometric information on a physical object correlated to or dependent on voice or information derived from the 3D geometric information to the recognizer.
Type: Application
Filed: February 7, 2014
Publication date: August 7, 2014
Applicant: SOGANG UNIVERSITY RESEARCH FOUNDATION
Inventors: Hyung-Min PARK, Changsoo JE, Bi Ho KIM, Min Wook KIM
-
Patent number: 8798990
Abstract: Disclosed herein are systems and methods to incorporate human knowledge when developing and using statistical models for natural language understanding. The disclosed systems and methods embrace a data-driven approach to natural language understanding which progresses seamlessly along the continuum of availability of annotated collected data, from when there is no available annotated collected data to when there is any amount of annotated collected data.
Type: Grant
Filed: April 30, 2013
Date of Patent: August 5, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Srinivas Bangalore, Mazin Gilbert, Narendra K. Gupta
-
Patent number: 8798994
Abstract: The present invention discloses a solution for conserving computing resources when implementing transformation based adaptation techniques. The disclosed solution limits the amount of speech data used by real-time adaptation algorithms to compute a transformation, which results in substantial computational savings. Appreciably, application of a transform is a relatively low memory and computationally cheap process compared to memory and resource requirements for computing the transform to be applied.
Type: Grant
Filed: February 6, 2008
Date of Patent: August 5, 2014
Assignee: International Business Machines Corporation
Inventors: John W. Eckhart, Michael Florio, Radek Hampl, Pavel Krbec, Jonathan Palgon
-
Publication number: 20140214422
Abstract: The application provides a method and system for determinism in non-linear systems for speech processing, particularly automatic speech segmentation for building speech recognition systems. More particularly, the application enables a method and system for detecting boundary of coarticulated units from isolated speech using recurrence plot.
Type: Application
Filed: July 18, 2012
Publication date: July 31, 2014
Applicant: Tata Consultancy Services Limited
Inventors: Mohd Bilal Arif Syed, Arijit Sinharay, Tanushyam Chattopadhyay