Update Patterns Patents (Class 704/244)
  • Publication number: 20150127343
    Abstract: A computing device may perform a feature identification of a received voice segment to recognize physical characteristics of the voice segment. The device may also determine paralinguistic voice characteristics of the voice segment according to the physical characteristics of the voice segment. The device may also indicate a match status of the voice segment according to a comparison of the physical characteristics and the paralinguistic voice characteristics of the voice segment to desired characteristics of matching voice segments.
    Type: Application
    Filed: November 4, 2014
    Publication date: May 7, 2015
    Inventors: Miki MULLOR, Luis J. SALAZAR G., Ying LI, Jose Daniel Contreras LANETTI
  • Patent number: 9026446
    Abstract: An adaptive workflow system can be used to implement captioning projects, such as projects for creating captions or subtitles for live and non-live broadcasts. Workers can repeat words spoken during a broadcast program or other program into a voice recognition system, which outputs text that may be used as captions or subtitles. The process of workers repeating these words to create such text can be referred to as respeaking. Respeaking can be used as an effective alternative to more expensive and hard-to-find stenographers for generating captions and subtitles.
    Type: Grant
    Filed: June 10, 2011
    Date of Patent: May 5, 2015
    Inventor: Morgan Fiumi
  • Patent number: 9026441
    Abstract: A device interface system is presented. Contemplated device interfaces allow for construction of complex device behaviors by aggregating device functions. The behaviors are triggered based on conditions derived from environmental data about the device.
    Type: Grant
    Filed: February 28, 2013
    Date of Patent: May 5, 2015
    Assignee: Nant Holdings IP, LLC
    Inventors: Farzad Ehsani, Silke Maren Witt-Ehsani, Demitrios Master
  • Patent number: 9026442
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model. (See the sketch after this entry.)
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: May 5, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
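    Sketch: the weighted-sum restructuring above can be illustrated with a toy model in which each phoneme's acoustic model is reduced to a single mean feature vector. This is a minimal Python sketch, not AT&T's implementation; the per-phoneme vectors and the plausible-phoneme weights are invented for illustration (a real system would blend HMM/GMM state distributions).

      import numpy as np

      # Toy acoustic "models": one mean feature vector per native phoneme.
      native_models = {
          "ae": np.array([1.0, 0.2]),
          "eh": np.array([0.8, 0.5]),
          "ih": np.array([0.3, 0.9]),
      }

      def restructure(plausible_weights, models):
          """Custom model for one dictionary phoneme: a weighted sum of the
          acoustic models of all plausible phonemes seen in the new
          speaker's lattice. Weights are assumed to sum to 1."""
          return sum(w * models[p] for p, w in plausible_weights.items())

      # The dictionary entry for "ae" is unchanged, but its acoustic model
      # becomes a blend reflecting how the new speaker realizes it.
      custom_ae = restructure({"ae": 0.4, "eh": 0.6}, native_models)
      print(custom_ae)  # -> [0.88 0.38]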
  • Publication number: 20150120298
    Abstract: A system for the control of an implant (32) in a body (11), comprising first (10, 20) and second parts (12) which communicate with each other. The first part (10, 20) is adapted for implantation and for control of and communication with the medical implant (32), and the second part (12) is adapted to be worn on the outside of the body (11) in contact with the body and to receive control commands from a user and to transmit them to the first part (10, 20). The body (11) is used as a conductor for communication between the first (10, 20) and the second (12) parts. The second part (12) is adapted to receive and recognize voice control commands from a user and to transform them into signals which are transmitted to the first part (10, 20) via the body (11).
    Type: Application
    Filed: August 10, 2014
    Publication date: April 30, 2015
    Inventor: Peter Forsell
  • Patent number: 9020818
    Abstract: Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal. (See the sketch after this entry.)
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 28, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
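    Sketch: the "sufficient new information" test in this patent family can be read as a novelty check on candidate codebook tuples. A minimal sketch under invented specifics: tuples are plain feature vectors, novelty is Euclidean distance, and both thresholds are placeholders.

      import numpy as np

      ADD_THRESHOLD = 1.0   # assumed: farther than this from every tuple -> add as new
      UPDATE_WEIGHT = 0.1   # assumed: blend factor when refreshing the nearest tuple

      def consider_candidate(codebook, candidate):
          """Add the candidate tuple if it is novel enough; otherwise use it
          to update the closest existing tuple."""
          if not codebook:
              codebook.append(candidate)
              return "added"
          dists = [np.linalg.norm(candidate - t) for t in codebook]
          i = int(np.argmin(dists))
          if dists[i] > ADD_THRESHOLD:
              codebook.append(candidate)  # novel formant pattern
              return "added"
          codebook[i] = (1 - UPDATE_WEIGHT) * codebook[i] + UPDATE_WEIGHT * candidate
          return "updated"

      codebook = []
      for tup in [np.array([0.0, 0.0]), np.array([0.1, 0.0]), np.array([3.0, 2.0])]:
          print(consider_candidate(codebook, tup))  # -> added, updated, added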
  • Publication number: 20150112680
    Abstract: A method for updating a voiceprint feature model and a terminal are provided that are applicable to the field of voice recognition technologies. The method includes: obtaining an original audio stream including at least one speaker; obtaining a respective audio stream of each speaker of the at least one speaker in the original audio stream according to a preset speaker segmentation and clustering algorithm; separately matching the respective audio stream of each speaker of the at least one speaker with an original voiceprint feature model, to obtain a successfully matched audio stream; and using the successfully matched audio stream as an additional audio stream training sample for generating the original voiceprint feature model, and updating the original voiceprint feature model. (See the sketch after this entry.)
    Type: Application
    Filed: December 30, 2014
    Publication date: April 23, 2015
    Inventor: Ting Lu
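    Sketch: the update loop of this abstract in skeletal Python. Every concrete piece is an assumption: segment_and_cluster, match_score, and retrain stand in for the patent's unspecified segmentation/clustering algorithm, voiceprint matcher, and model trainer.

      MATCH_THRESHOLD = 0.7  # assumed acceptance threshold for a voiceprint match

      def update_voiceprint(original_stream, model,
                            segment_and_cluster, match_score, retrain):
          """Split the stream into per-speaker audio, keep the streams that
          match the existing voiceprint model, and retrain with them."""
          per_speaker = segment_and_cluster(original_stream)  # one stream per speaker
          matched = [s for s in per_speaker
                     if match_score(s, model) >= MATCH_THRESHOLD]
          if matched:
              # Matched audio becomes additional training data for the model.
              model = retrain(model, extra_samples=matched)
          return model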
  • Patent number: 9015045
    Abstract: A method for refining a search is provided. Embodiments may include receiving a first speech signal corresponding to a first utterance and receiving a second speech signal corresponding to a second utterance, wherein the second utterance is a refinement to the first utterance. Embodiments may also include identifying information associated with the first speech signal as first speech signal information and identifying information associated with the second speech signal as second speech signal information. Embodiments may also include determining a first quantity of search results based upon the first speech signal information and determining a second quantity of search results based upon the second speech signal information.
    Type: Grant
    Filed: March 11, 2013
    Date of Patent: April 21, 2015
    Assignee: Nuance Communications, Inc.
    Inventor: Jean-Francois Lavallee
  • Patent number: 9015044
    Abstract: Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 21, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
  • Publication number: 20150106096
    Abstract: The disclosure includes a system and method for configuring custom vocabularies for personalized speech recognition. The system includes a processor and a memory storing instructions that when executed cause the system to: detect a provisioning trigger event; determine a state of a journey associated with a user based on the provisioning trigger event; determine one or more interest places based on the state of the journey; populate a place vocabulary associated with the user using the one or more interest places; filter the place vocabulary based on one or more place filtering parameters; and register the filtered place vocabulary for the user. (See the sketch after this entry.)
    Type: Application
    Filed: October 15, 2013
    Publication date: April 16, 2015
    Applicant: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Divya Sai Toopran, Vinuth Rai, Rahul Parundekar
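    Sketch: the provisioning flow above as a small pipeline. The journey states, the interest-place lookup, and the distance filter are invented stand-ins for the patent's unspecified logic.

      def provision_place_vocabulary(trigger_event, user, lookup_places, register):
          """Build, filter, and register a per-user place vocabulary."""
          # 1. Derive the journey state from the trigger (state names assumed).
          state = trigger_event.get("journey_state", "en_route")
          # 2. Find interest places for that state (lookup supplied by caller).
          places = lookup_places(user, state)
          # 3. Filter the vocabulary; a max-distance cut is one plausible
          #    place filtering parameter.
          max_km = trigger_event.get("max_distance_km", 5.0)
          vocabulary = [p["name"] for p in places if p["distance_km"] <= max_km]
          # 4. Register the filtered vocabulary for this user's recognizer.
          register(user, vocabulary)
          return vocabulary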
  • Patent number: 9009041
    Abstract: A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription is received of an audio recording. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data is received that is associated with the transcribed word. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.
    Type: Grant
    Filed: July 26, 2011
    Date of Patent: April 14, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: George Zavaliagkos, William F. Ganong, III, Uwe H. Jost, Shreedhar Madhavapeddi, Gary B. Clayton
  • Patent number: 8996373
    Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in a depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood. (See the sketch after this entry.)
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: March 31, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
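    Sketch: the two-likelihood decision reduces to comparing log-likelihoods of the input under the two speaker-specific models. A toy version assuming diagonal-covariance Gaussians over feature frames; the real system would use richer models.

      import numpy as np

      def gaussian_loglik(frames, mean, var):
          """Total log-likelihood of feature frames under a diagonal Gaussian."""
          return float(np.sum(-0.5 * (np.log(2 * np.pi * var)
                                      + (frames - mean) ** 2 / var)))

      def detect_state(frames, model_undepressed, model_depressed):
          """Return the state whose model better explains the input voice."""
          first = gaussian_loglik(frames, *model_undepressed)   # first likelihood
          second = gaussian_loglik(frames, *model_depressed)    # second likelihood
          return "undepressed" if first >= second else "depressed"

      # Toy 1-D "features"; (mean, variance) pairs fit per state.
      undep = (np.array([0.0]), np.array([1.0]))
      dep = (np.array([1.5]), np.array([1.0]))
      frames = np.array([[1.4], [1.6], [1.2]])
      print(detect_state(frames, undep, dep))  # -> depressed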
  • Patent number: 8996371
    Abstract: A system and method for adapting a language model to a specific environment by receiving interactions captured in the specific environment, generating a collection of documents from documents retrieved from external resources, detecting in the collection of documents terms related to the environment that are not included in an initial language model and adapting the initial language model to include the terms detected. (See the sketch after this entry.)
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: March 31, 2015
    Assignee: Nice-Systems Ltd.
    Inventors: Eyal Hurvitz, Ezra Daya, Oren Pereg, Moshe Wasserblat
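    Sketch: the detection step (environment-specific terms absent from the initial language model) via a simple frequency cutoff. The tokenizer and the cutoff value are illustrative assumptions.

      import re
      from collections import Counter

      MIN_COUNT = 3  # assumed: ignore terms too rare to matter

      def detect_new_terms(documents, lm_vocabulary):
          """Count terms across the collected documents and keep frequent
          ones that the initial language model does not include."""
          counts = Counter(tok for doc in documents
                           for tok in re.findall(r"[a-z']+", doc.lower()))
          return {t: c for t, c in counts.items()
                  if c >= MIN_COUNT and t not in lm_vocabulary}

      docs = ["the chargeback was reversed", "chargeback dispute opened",
              "another chargeback case", "routine account question"]
      vocab = {"the", "was", "another", "account", "question", "routine",
               "case", "dispute", "opened", "reversed"}
      print(detect_new_terms(docs, vocab))  # -> {'chargeback': 3}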
  • Patent number: 8996372
    Abstract: Speech recognition may be improved using data derived from an utterance. In some embodiments, audio data is received by a user device. Adaptation data may be retrieved from a data store accessible by the user device. The audio data and the adaptation data may be transmitted to a server device. The server device may use the audio data to calculate second adaptation data. The second adaptation data may be transmitted to the user device. Synchronously or asynchronously, the server device may perform speech recognition using the audio data and the second adaptation data and transmit speech recognition results back to the user device.
    Type: Grant
    Filed: October 30, 2012
    Date of Patent: March 31, 2015
    Assignee: Amazon Technologies, Inc.
    Inventors: Hugh Secker-Walker, Bjorn Hoffmeister, Ryan Thomas, Stan Salvador, Karthik Ramakrishnan
  • Patent number: 8996380
    Abstract: Systems and methods of synchronizing media are provided. A client device may be used to capture a sample of a media stream being rendered by a media rendering source. The client device sends the sample to a position identification module to determine a time offset indicating a position in the media stream corresponding to the sampling time of the sample, and optionally a timescale ratio indicating a speed at which the media stream is being rendered by the media rendering source based on a reference speed of the media stream. The client device calculates a real-time offset using a present time, a timestamp of the media sample, the time offset, and optionally the timescale ratio. The client device then renders a second media stream at a position corresponding to the real-time offset to be in synchrony to the media stream being rendered by the media rendering source. (See the sketch after this entry.)
    Type: Grant
    Filed: May 4, 2011
    Date of Patent: March 31, 2015
    Assignee: Shazam Entertainment Ltd.
    Inventors: Avery Li-Chun Wang, Rahul Powar, William Michael Mills, Christopher Jacques Penrose Barton, Philip Georges Inghelbrecht, Dheeraj Shankar Mukherjee
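    Sketch: the abstract names the inputs to the real-time offset computation without giving the formula. One plausible combination, assuming the timescale ratio scales elapsed wall-clock time, is shown; it is not Shazam's published implementation.

      def real_time_offset(present_time, sample_timestamp,
                           time_offset, timescale_ratio=1.0):
          """Estimated current position (seconds) in the media stream.
          time_offset is the stream position when the sample was taken;
          timescale_ratio is rendering speed relative to reference speed."""
          elapsed = present_time - sample_timestamp       # wall-clock time since sampling
          return time_offset + elapsed * timescale_ratio  # advance the stream position

      # Sample matched 2.5 s into the song, taken 4 s ago, playing 2% fast:
      print(real_time_offset(present_time=104.0, sample_timestamp=100.0,
                             time_offset=2.5, timescale_ratio=1.02))  # -> 6.58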
  • Publication number: 20150088511
    Abstract: In embodiments, apparatuses, methods and storage media are described that are associated with recognition of speech based on sequences of named entities. Language models may be trained as being associated with sequences of named entities. A language model may be selected for speech recognition after identification of one or more sequences of named entities by an initial language model. After identification of the one or more sequences of named entities, weights may be assigned to the one or more sequences of named entities. These weights may be utilized to select a language model and/or update the initial language model to one that is associated with the identified one or more sequences of named entities. In various embodiments, the language model may be repeatedly updated until the recognized speech converges sufficiently to satisfy a predetermined threshold. Other embodiments may be described and claimed.
    Type: Application
    Filed: September 24, 2013
    Publication date: March 26, 2015
    Applicant: Verizon Patent and Licensing Inc.
    Inventors: Sujeeth S. Bharadwaj, Suri B. Medapati
  • Patent number: 8990085
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for handling expected repeat speech queries or other inputs. The method causes a computing device to detect a misrecognized speech query from a user, determine a tendency of the user to repeat speech queries based on previous user interactions, and adapt a speech recognition model based on the determined tendency before an expected repeat speech query. The method can further include recognizing the expected repeat speech query from the user based on the adapted speech recognition model. Adapting the speech recognition model can include modifying an acoustic model, a language model, and a semantic model. Adapting the speech recognition model can also include preparing a personalized search speech recognition model for the expected repeat query based on usage history and entries in a recognition lattice. The method can include retaining unmodified speech recognition models with adapted speech recognition models.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: March 24, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro
  • Patent number: 8990080
    Abstract: Techniques to normalize names for name-based speech recognition grammars are described. Some embodiments are particularly directed to techniques to normalize names for name-based speech recognition grammars more efficiently by caching, and on a per-culture basis. A technique may comprise receiving a name for normalization during name processing for a name-based speech grammar generating process. A normalization cache may be examined to determine if the name is already in the cache in a normalized form. When the name is not already in the cache, the name may be normalized and added to the cache. When the name is in the cache, the normalization result may be retrieved and passed to the next processing step. Other embodiments are described and claimed. (See the sketch after this entry.)
    Type: Grant
    Filed: January 27, 2012
    Date of Patent: March 24, 2015
    Assignee: Microsoft Corporation
    Inventors: Mini Varkey, Bernardo Sana, Victor Boctor, Diego Carlomagno
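    Sketch: the cache is the standard memoization idiom, keyed per culture. The lowercase/strip fallback below is a stand-in for the patent's actual normalization rules.

      import re

      class NameNormalizer:
          """Per-culture cache in front of an expensive normalization step."""

          def __init__(self):
              self._cache = {}  # (culture, name) -> normalized form

          def normalized(self, name, culture="en-US"):
              key = (culture, name)
              if key not in self._cache:        # miss: do the work once
                  self._cache[key] = self._normalize(name, culture)
              return self._cache[key]           # hit: reuse the prior result

          def _normalize(self, name, culture):
              # Placeholder rule; real normalization is culture-specific.
              return re.sub(r"[^\w\s'-]", "", name).strip().lower()

      n = NameNormalizer()
      print(n.normalized("O'Brien, Pat"))  # computed, then cached
      print(n.normalized("O'Brien, Pat"))  # served from the cache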
  • Publication number: 20150081295
    Abstract: According to an aspect of the present disclosure, a method for controlling access to a plurality of applications in an electronic device is disclosed. The method includes receiving a voice command from a speaker for accessing a target application among the plurality of applications, and verifying whether the voice command is indicative of a user authorized to access the applications based on a speaker model of the authorized user. In this method, each application is associated with a security level having a threshold value. The method further includes updating the speaker model with the voice command if the voice command is verified to be indicative of the user, and adjusting at least one of the threshold values based on the updated speaker model. (See the sketch after this entry.)
    Type: Application
    Filed: September 16, 2013
    Publication date: March 19, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Sungrack Yun, Taesu Kim, Jun-Cheol Cho, Min-Kyu Park, Kyu Woong Hwang
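    Sketch: the access-control loop as score-versus-threshold verification followed by model and threshold updates. The scoring, update, and recalibration callables are invented stand-ins.

      def try_access(command_features, app, speaker_model, thresholds,
                     score, update_model, recalibrate):
          """Verify a voice command against the target app's security
          threshold, then refine the speaker model and thresholds."""
          s = score(command_features, speaker_model)   # higher = more like the user
          if s < thresholds[app]:
              return False, speaker_model, thresholds  # not verified: reject
          speaker_model = update_model(speaker_model, command_features)
          # A better-trained model separates the authorized user more
          # cleanly, so per-app thresholds can be recomputed from it.
          thresholds = {a: recalibrate(speaker_model, a, t)
                        for a, t in thresholds.items()}
          return True, speaker_model, thresholds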
  • Patent number: 8983836
    Abstract: Mechanisms for performing dynamic automatic speech recognition on a portion of multimedia content are provided. Multimedia content is segmented into homogeneous segments of content with regard to speakers and background sounds. For the at least one segment, a speaker providing speech in an audio track of the at least one segment is identified using information retrieved from a social network service source. A speech profile for the speaker is generated using information retrieved from the social network service source, an acoustic profile for the segment is generated based on the generated speech profile, and an automatic speech recognition engine is dynamically configured for operation on the at least one segment based on the acoustic profile. Automatic speech recognition operations are performed on the audio track of the at least one segment to generate a textual representation of speech content in the audio track corresponding to the speaker.
    Type: Grant
    Filed: September 26, 2012
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Elizabeth V. Woodward, Shunguo Yan
  • Publication number: 20150073797
    Abstract: The present disclosure relates to systems, methods, and computer-readable media for generating a lexicon for use with speech recognition. The method includes overgenerating potential pronunciations based on symbolic input, identifying potential pronunciations in a speech recognition context, and storing the identified potential pronunciations in a lexicon. Overgenerating potential pronunciations can include establishing a set of conversion rules for short sequences of letters, converting portions of the symbolic input into a number of possible lexical pronunciation variants based on the set of conversion rules, modeling the possible lexical pronunciation variants in one of a weighted network and a list of phoneme lists, and iteratively retraining the set of conversion rules based on improved pronunciations. Symbolic input can include multiple examples of a same spoken word. Speech data can be labeled explicitly or implicitly and can include words as text and recorded audio.
    Type: Application
    Filed: November 12, 2014
    Publication date: March 12, 2015
    Inventors: Alistair D. CONKIE, Mazin GILBERT, Andrej LJOLJE
  • Publication number: 20150073796
    Abstract: Disclosed herein are an apparatus and a method of generating a language model for speech recognition. The present invention provides an apparatus for generating a language model capable of improving speech recognition performance by predicting a position at which a break is present and reflecting the predicted break information.
    Type: Application
    Filed: April 2, 2014
    Publication date: March 12, 2015
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Jeong-Se Kim, Sang-Hun Kim
  • Patent number: 8972261
    Abstract: A computer-implemented system and method for voice transcription error reduction is provided. Speech utterances are obtained from a voice stream and each speech utterance is associated with a transcribed value and a confidence score. Those utterances with transcription values associated with lower confidence scores are identified as questionable utterances. One of the questionable utterances is selected from the voice stream. A predetermined number of questionable utterances from other voice streams and having transcribed values similar to the transcribed value of the selected questionable utterance are identified as a pool of related utterances. A further transcribed value is received for each of a plurality of the questionable utterances in the pool of related utterances. A transcribed message is generated for the voice stream using those transcribed values with higher confidence scores and the further transcribed value for the selected questionable utterance.
    Type: Grant
    Filed: February 3, 2014
    Date of Patent: March 3, 2015
    Assignee: Intellisist, Inc.
    Inventor: David Milstein
  • Patent number: 8972260
    Abstract: In accordance with one embodiment, a method of generating language models for speech recognition includes identifying a plurality of utterances in training data corresponding to speech, generating a frequency count of each utterance in the plurality of utterances, generating a high-frequency plurality of utterances from the plurality of utterances having a frequency that exceeds a predetermined frequency threshold, generating a low-frequency plurality of utterances from the plurality of utterances having a frequency that is below the predetermined frequency threshold, generating a grammar-based language model using the high-frequency plurality of utterances as training data, and generating a statistical language model using the low-frequency plurality of utterances as training data. (See the sketch after this entry.)
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: March 3, 2015
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Zhe Feng, Kui Xu, Lin Zhao
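    Sketch: the frequency partition of training utterances. The threshold value and the downstream trainers are assumptions.

      from collections import Counter

      FREQ_THRESHOLD = 3  # assumed cutoff between "high" and "low" frequency

      def split_training_data(utterances):
          """Frequent utterances train a grammar-based language model;
          rare ones train a statistical language model."""
          counts = Counter(utterances)
          high = [u for u, c in counts.items() if c >= FREQ_THRESHOLD]
          low = [u for u, c in counts.items() if c < FREQ_THRESHOLD]
          return high, low

      data = (["play music"] * 5 + ["call home"] * 4
              + ["navigate to the dentist on elm street"])
      grammar_data, statistical_data = split_training_data(data)
      print(grammar_data)      # formulaic commands -> grammar-based model
      print(statistical_data)  # open-ended requests -> statistical model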
  • Patent number: 8964948
    Abstract: A method for setting a voice tag is provided, which comprises the following steps. First, counting a number of phone calls performed between a user and a contact person. If the number of phone calls exceeds a predetermined count, or a voice dialing attempt by the user has failed before calling the contact person within a predetermined duration, the user is asked whether or not to set a voice tag corresponding to the contact person after the phone call is complete. If the user decides to set the voice tag, a voice training procedure is executed for setting the voice tag corresponding to the contact person. (See the sketch after this entry.)
    Type: Grant
    Filed: May 29, 2012
    Date of Patent: February 24, 2015
    Assignee: HTC Corporation
    Inventor: Fu-Chiang Chou
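    Sketch: the trigger condition combines a call counter with recent voice-dialing failures. The thresholds and the per-contact state layout are invented.

      import time

      CALL_THRESHOLD = 5                 # assumed "predetermined count"
      FAILURE_WINDOW_S = 7 * 24 * 3600   # assumed "predetermined duration"

      def should_offer_voice_tag(stats, now=None):
          """After a completed call, decide whether to ask the user to
          record a voice tag for this contact."""
          now = now if now is not None else time.time()
          if stats.get("has_voice_tag"):
              return False
          if stats["call_count"] > CALL_THRESHOLD:
              return True
          last_fail = stats.get("last_voice_dial_failure")
          # Also prompt when voice dialing to this contact failed recently.
          return last_fail is not None and now - last_fail <= FAILURE_WINDOW_S

      stats = {"call_count": 2, "last_voice_dial_failure": time.time() - 3600}
      print(should_offer_voice_tag(stats))  # -> True (recent failed voice dial)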
  • Patent number: 8965763
    Abstract: Training data from a plurality of utterance-to-text-string mappings of an automatic speech recognition (ASR) system may be selected. Parameters of the ASR system that characterize the utterances and their respective mappings may be determined through application of a first acoustic model and a language model. A second acoustic model and the language model may be applied to the selected training data utterances to determine a second set of utterance-to-text-string mappings. The first set of utterance-to-text-string mappings may be compared to the second set of utterance-to-text-string mappings, and the parameters of the ASR system may be updated based on the comparison.
    Type: Grant
    Filed: May 1, 2012
    Date of Patent: February 24, 2015
    Assignee: Google Inc.
    Inventors: Ciprian Ioan Chelba, Brian Strope, Preethi Jyothi, Leif Johnson
  • Patent number: 8954325
    Abstract: The present invention allows feedback from operator workstations to be used to update databases used for providing automated information services. When an automated process fails, recorded speech of the caller is passed on to the operator for decision making. Based on the selections made by the operator in light of the speech or other interactions with the caller, a comparison is made between the speech and the selections made by the operator to arrive at information to update the databases in the information services automation system. Thus, when the operator inputs the words corresponding to the speech provided at the information services automation system, the speech may be associated with those words. The association between the speech and the words may be used to update different databases in the information services automation system.
    Type: Grant
    Filed: March 22, 2004
    Date of Patent: February 10, 2015
    Assignee: Rockstar Consortium US LP
    Inventors: Bruce Bokish, Michael Craig Presnell
  • Patent number: 8953889
    Abstract: An augmented reality environment allows interaction between virtual and real objects and enhances an unstructured real-world environment. An object datastore comprising attributes of an object within the environment may be built and/or maintained from sources including manufacturers, retailers, shippers, and users. This object datastore may be local, cloud based, or a combination thereof. Applications may interrogate the object datastore to provide user functionality.
    Type: Grant
    Filed: September 14, 2011
    Date of Patent: February 10, 2015
    Assignee: Rawles LLC
    Inventors: William Spencer Worley, III, Edward Dietz Crump
  • Publication number: 20150039310
    Abstract: An electronic device includes a microphone that receives an audio signal, and a processor that is electrically coupled to the microphone. The processor detects a trigger phrase in the received audio signal and measures characteristics of the detected trigger phrase. Based on the measured characteristics of the detected trigger phrase, the processor determines whether the detected trigger phrase is valid.
    Type: Application
    Filed: October 10, 2013
    Publication date: February 5, 2015
    Applicant: Motorola Mobility LLC
    Inventors: Joel A. Clark, Tenkasi V. Ramabadran, Mark A. Jasiuk
  • Publication number: 20150039311
    Abstract: An electronic device includes a microphone that receives an audio signal that includes a spoken trigger phrase, and a processor that is electrically coupled to the microphone. The processor measures characteristics of the audio signal, and determines, based on the measured characteristics, whether the spoken trigger phrase is acceptable for trigger phrase model training. If the spoken trigger phrase is determined not to be acceptable for trigger phrase model training, the processor rejects the trigger phrase for trigger phrase model training.
    Type: Application
    Filed: October 10, 2013
    Publication date: February 5, 2015
    Applicant: Motorola Mobility LLC
    Inventors: Joel A. Clark, Tenkasi V. Ramabadran, Mark A. Jasiuk
  • Patent number: 8949130
    Abstract: In embodiments of the present invention improved capabilities are described for a user interacting with a mobile communication facility, where speech presented by the user is recorded using a mobile communication facility resident capture facility. The recorded speech may be recognized using an external speech recognition facility to produce an external output and a resident speech recognition facility to produce an internal output, where at least one of the external output and the internal output may be selected based on a criteria.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: February 3, 2015
    Assignee: Vlingo Corporation
    Inventor: Michael S. Phillips
  • Patent number: 8949126
    Abstract: Methods for creating statistical language models (SLMs) for spoken Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs) are disclosed. In these methods, candidate challenge items including one or more words are automatically selected from a document corpus. Selected ones of the challenge items are articulated by a machine text-to-speech (TTS) system as candidate articulations. Those articulations are ranked based on a human listener score indicating whether a candidate articulation originated from a machine. The SLM is then trained to recognize machine TTS articulations according to those rankings, by using a subset of the plurality of candidate challenge items identified as machine articulations as a seed set.
    Type: Grant
    Filed: April 21, 2014
    Date of Patent: February 3, 2015
    Assignee: The John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Publication number: 20150032451
    Abstract: A method on a mobile device for voice recognition training is described. A voice training mode is entered. A voice training sample for a user of the mobile device is recorded. The voice training mode is interrupted to enter a noise indicator mode based on a sample background noise level for the voice training sample and a sample background noise type for the voice training sample. The voice training mode is resumed from the noise indicator mode when the user provides a continuation input that indicates the current background noise level meets an indicator threshold value.
    Type: Application
    Filed: December 27, 2013
    Publication date: January 29, 2015
    Applicant: Motorola Mobility LLC
    Inventors: Michael E. Gunn, Boris Bekkerman, Mark A. Jasiuk, Pratik M. Kamdar, Jeffrey A. Sierawski
  • Publication number: 20150025886
    Abstract: A clausifier for extracting clauses for spoken language understanding is disclosed. The method relates to generating a set of clauses from speech utterance text and comprises inserting at least one boundary tag in speech utterance text related to sentence boundaries, inserting at least one edit tag indicating a portion of the speech utterance text to remove, and inserting at least one conjunction tag within the speech utterance text. The result is a set of clauses that may be identified within the speech utterance text according to the inserted at least one boundary tag, at least one edit tag and at least one conjunction tag. The disclosed clausifier comprises a sentence boundary classifier, an edit detector classifier, and a conjunction detector classifier. The clausifier may comprise a single classifier or a plurality of classifiers to perform the steps of identifying sentence boundaries, editing text, and identifying conjunctions within the text.
    Type: Application
    Filed: August 26, 2014
    Publication date: January 22, 2015
    Inventors: Srinivas BANGALORE, Narendra K. GUPTA, Mazin G. RAHIM
  • Patent number: 8938392
    Abstract: Methods, apparatus, and products are disclosed for configuring a speech engine for a multimodal application based on location. The multimodal application operates on a multimodal device supporting multiple modes of user interaction with the multimodal application. The multimodal application is operatively coupled to a speech engine.
    Type: Grant
    Filed: February 27, 2007
    Date of Patent: January 20, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Charles W. Cross, Jr., Igor R. Jablokov
  • Publication number: 20150019219
    Abstract: Systems and methods for arbitrating spoken dialog services include determining a capability catalog associated with a plurality of devices accessible within an environment. The capability catalog includes a list of the plurality of devices mapped to a list of spoken dialog services provided by each of the plurality of devices. The system arbitrates between the plurality of devices and the spoken dialog services in the capability catalog to determine a selected device and a selected dialog service. (See the sketch after this entry.)
    Type: Application
    Filed: December 2, 2013
    Publication date: January 15, 2015
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: ELI TZIRKEL-HANCOCK, GREG T. LINDEMANN, ROBERT D. SIMS, OMER TSIMHONI
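    Sketch: the capability catalog maps devices to the spoken dialog services they provide; arbitration then selects a device for a requested service. The catalog contents and the priority ordering are invented.

      # Capability catalog: device -> spoken dialog services it provides.
      catalog = {
          "head_unit":  ["navigation", "media", "phone"],
          "smartphone": ["navigation", "messaging", "web_search"],
          "smartwatch": ["messaging"],
      }

      # Assumed arbitration policy: prefer devices earlier in this list.
      DEVICE_PRIORITY = ["head_unit", "smartphone", "smartwatch"]

      def arbitrate(service):
          """Return (selected device, selected service) for a request."""
          for device in DEVICE_PRIORITY:
              if service in catalog.get(device, []):
                  return device, service
          raise LookupError(f"no device in the environment provides {service!r}")

      print(arbitrate("navigation"))  # -> ('head_unit', 'navigation')
      print(arbitrate("messaging"))   # -> ('smartphone', 'messaging')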
  • Publication number: 20150019220
    Abstract: A method for configuring a speech recognition system comprises obtaining a speech sample utilised by a voice authentication system in a voice authentication process. The speech sample is processed to generate acoustic models for units of speech associated with the speech sample. The acoustic models are stored for subsequent use by the speech recognition system as part of a speech recognition process.
    Type: Application
    Filed: January 23, 2013
    Publication date: January 15, 2015
    Inventors: Habib Emile Talhami, Amit Sadanand Malegaonkar, Renuka Amit Malegaonkar, Clive David Summerfield
  • Patent number: 8935167
    Abstract: Methods, systems, and computer-readable media related to selecting observation-specific training data (also referred to as “observation-specific exemplars”) from a general training corpus, and then creating, from the observation-specific training data, a focused, observation-specific acoustic model for recognizing the observation in an output domain are disclosed. In one aspect, a global speech recognition model is established based on an initial set of training data; a plurality of input speech segments to be recognized in an output domain are received; and for each of the plurality of input speech segments: a respective set of focused training data relevant to the input speech segment is identified in the global speech recognition model; a respective focused speech recognition model is generated based on the respective set of focused training data; and the respective focused speech recognition model is provided to a recognition device for recognizing the input speech segment in the output domain.
    Type: Grant
    Filed: September 25, 2012
    Date of Patent: January 13, 2015
    Assignee: Apple Inc.
    Inventor: Jerome Bellegarda
  • Patent number: 8918318
    Abstract: Speech recognition is enabled even for a new speaker of a speech recognition system by using an extended recognition dictionary suited to that speaker, without requiring any previous learning based on an utterance label corresponding to the speaker's speech.
    Type: Grant
    Filed: January 15, 2008
    Date of Patent: December 23, 2014
    Assignee: NEC Corporation
    Inventor: Yoshifumi Onishi
  • Patent number: 8914286
    Abstract: Provided are systems and methods for using hierarchical networks for recognition, such as speech recognition. Conventional automatic recognition systems may not be both efficient and flexible. Recognition systems are disclosed that may achieve efficiency and flexibility by employing hierarchical networks, prefix consolidation of networks, and future consolidation of networks. The disclosed networks may be associated with a network model and the associated network model may be modified during recognition to achieve greater flexibility.
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: December 16, 2014
    Assignee: Canyon IP Holdings, LLC
    Inventors: Hugh Secker-Walker, Kenneth J. Basye, Mahesh Krishnamoorthy
  • Patent number: 8914292
    Abstract: In embodiments of the present invention improved capabilities are described for a user interacting with a mobile communication facility, where speech presented by the user is recorded using a mobile communication facility resident capture facility. The recorded speech may be recognized using an external speech recognition facility to produce an external output and a resident speech recognition facility to produce an internal output, where at least one of the external output and the internal output may be selected based on a criteria.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: December 16, 2014
    Assignee: Vlingo Corporation
    Inventor: Michael S. Phillips
  • Publication number: 20140365218
    Abstract: A received utterance is recognized using different language models. For example, recognition of the utterance is independently performed using a baseline language model (BLM) and using an adapted language model (ALM). A determination is made as to which results from the different language models are more likely to be accurate. Different features (e.g. language model scores, recognition confidences, acoustic model scores, quality measurements, . . . ) may be used to assist in making the determination. A classifier may be trained and then used in determining whether to select the results using the BLM or to select the results using the ALM. A language model may be automatically trained or re-trained that adjusts a weight of the training data used in training the model in response to differences between the two results obtained from applying the different language models. (See the sketch after this entry.)
    Type: Application
    Filed: June 7, 2013
    Publication date: December 11, 2014
    Inventors: Shuangyu Chang, Michael Levit
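    Sketch: the selection step feeds features from both recognition passes to a classifier. In place of the trained classifier, a confidence-margin heuristic shows the shape of the decision; the feature names and the margin are assumptions.

      MARGIN = 0.05  # assumed: prefer the adapted model only if clearly better

      def select_result(blm_result, alm_result):
          """Pick between baseline-LM and adapted-LM recognition results.
          Each result is a dict of text plus feature scores; a trained
          classifier would normally replace this hand-set rule."""
          blm_score = blm_result["confidence"] + blm_result["lm_score"]
          alm_score = alm_result["confidence"] + alm_result["lm_score"]
          if alm_score > blm_score + MARGIN:
              return alm_result["text"]
          return blm_result["text"]

      blm = {"text": "recognize speech", "confidence": 0.80, "lm_score": 0.10}
      alm = {"text": "recognise speech", "confidence": 0.86, "lm_score": 0.12}
      print(select_result(blm, alm))  # adapted result wins by more than the margin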
  • Patent number: 8909529
    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: December 9, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Giuseppe Riccardi
  • Publication number: 20140358540
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Application
    Filed: August 14, 2014
    Publication date: December 4, 2014
    Inventors: Andrej LJOLJE, Alistair D. CONKIE, Ann K. Syrdal
  • Publication number: 20140343942
    Abstract: Systems for improving or generating a spoken language understanding system using a multitask learning method for intent or call-type classification. The multitask learning method aims at training tasks in parallel while using a shared representation. A computing device automatically re-uses the existing labeled data from various applications, which are similar but may have different call-types, intents or intent distributions, to improve performance. An automated intent mapping algorithm operates across applications. In one aspect, active learning is employed to selectively sample the data to be re-used.
    Type: Application
    Filed: May 27, 2014
    Publication date: November 20, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan TUR
  • Patent number: 8892436
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting, for each frame of the first speech, at least one frame positioned before that frame.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: November 18, 2014
    Assignees: Samsung Electronics Co., Ltd., Seoul National University Industry Foundation
    Inventors: Ki-wan Eom, Chang-woo Han, Tae-gyoon Kang, Nam-soo Kim, Doo-hwa Hong, Jae-won Lee, Hyung-joon Lim
  • Patent number: 8886535
    Abstract: A method of optimizing the calculation of matching scores between phone states and acoustic frames across a matrix of an expected progression of phone states aligned with an observed progression of acoustic frames within an utterance is provided. The matrix has a plurality of cells associated with a characteristic acoustic frame and a characteristic phone state. A first set and second set of cells that meet a threshold probability of matching a first phone state or a second phone state, respectively, are determined. The phone states are stored on a local cache of a first core and a second core, respectively. The first and second sets of cells are also provided to the first core and second core, respectively. Further, matching scores of each characteristic state and characteristic observation of each cell of the first set of cells and of the second set of cells are calculated.
    Type: Grant
    Filed: January 23, 2014
    Date of Patent: November 11, 2014
    Assignee: Accumente, LLC
    Inventors: Jike Chong, Ian Richard Lane, Senaka Wimal Buthpitiya
  • Patent number: 8886534
    Abstract: A speech recognition apparatus includes a speech input unit that receives input speech, a phoneme recognition unit that recognizes phonemes of the input speech and generates a first phoneme sequence representing corrected speech, a matching unit that matches the first phoneme sequence with a second phoneme sequence representing original speech, and a phoneme correcting unit that corrects phonemes of the second phoneme sequence based on the matching result.
    Type: Grant
    Filed: January 27, 2011
    Date of Patent: November 11, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventors: Mikio Nakano, Naoto Iwahashi, Kotaro Funakoshi, Taisuke Sumii
  • Patent number: 8886540
    Abstract: A method and system for entering information into a software application resident on a mobile communication facility is provided. The method and system may include recording speech presented by a user using a mobile communication facility resident capture facility, transmitting the recording through a wireless communication facility to a speech recognition facility, transmitting information relating to the software application to the speech recognition facility, generating results utilizing the speech recognition facility using an unstructured language model based at least in part on the information relating to the software application and the recording, transmitting the results to the mobile communications facility, loading the results into the software application and simultaneously displaying the results as a set of words and as a set of application results based on those words.
    Type: Grant
    Filed: August 1, 2008
    Date of Patent: November 11, 2014
    Assignee: Vlingo Corporation
    Inventors: Joseph P. Cerra, John N. Nguyen, Michael S. Phillips, Han Shu, Alexandra Beth Mischke
  • Publication number: 20140330565
    Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.
    Type: Application
    Filed: May 20, 2014
    Publication date: November 6, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan Tur