Update Patterns Patents (Class 704/244)
  • Publication number: 20150127343
    Abstract: A computing device may perform a feature identification of a received voice segment to recognize physical characteristics of the voice segment. The device may also determine paralinguistic voice characteristics of the voice segment according to the physical characteristics of the voice segment. The device may also indicate a match status of the voice segment according to a comparison of the physical characteristics and the paralinguistic voice characteristics of the voice segment to desired characteristics of matching voice segments.
    Type: Application
    Filed: November 4, 2014
    Publication date: May 7, 2015
    Inventors: Miki MULLOR, Luis J. SALAZAR G., Ying LI, Jose Daniel Contreras LANETTI
  • Patent number: 9026446
    Abstract: An adaptive workflow system can be used to implement captioning projects, such as projects for creating captions or subtitles for live and non-live broadcasts. Workers can repeat words spoken during a broadcast program or other program into a voice recognition system, which outputs text that may be used as captions or subtitles. The process of workers repeating these words to create such text can be referred to as respeaking. Respeaking can be used as an effective alternative to more expensive and hard-to-find stenographers for generating captions and subtitles.
    Type: Grant
    Filed: June 10, 2011
    Date of Patent: May 5, 2015
    Inventor: Morgan Fiumi
  • Patent number: 9026441
    Abstract: A device interface system is presented. Contemplated device interfaces allow for construction of complex device behaviors by aggregating device functions. The behaviors are triggered based on conditions derived from environmental data about the device.
    Type: Grant
    Filed: February 28, 2013
    Date of Patent: May 5, 2015
    Assignee: Nant Holdings IP, LLC
    Inventors: Farzad Ehsani, Silke Maren Witt-Ehsani, Demitrios Master
  • Patent number: 9026442
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model. (See the sketch after this entry.)
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: May 5, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
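    Sketch: the weighted-sum restructuring above can be illustrated with a toy model in which each phoneme's acoustic model is reduced to a single mean feature vector. This is a minimal Python sketch, not AT&T's implementation; the per-phoneme vectors and the plausible-phoneme weights are invented for illustration (a real system would blend HMM/GMM state distributions).

      import numpy as np

      # Toy acoustic "models": one mean feature vector per native phoneme.
      native_models = {
          "ae": np.array([1.0, 0.2]),
          "eh": np.array([0.8, 0.5]),
          "ih": np.array([0.3, 0.9]),
      }

      def restructure(plausible_weights, models):
          """Custom model for one dictionary phoneme: a weighted sum of the
          acoustic models of all plausible phonemes seen in the new
          speaker's lattice. Weights are assumed to sum to 1."""
          return sum(w * models[p] for p, w in plausible_weights.items())

      # The dictionary entry for "ae" is unchanged, but its acoustic model
      # becomes a blend reflecting how the new speaker realizes it.
      custom_ae = restructure({"ae": 0.4, "eh": 0.6}, native_models)
      print(custom_ae)  # -> [0.88 0.38]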
  • Publication number: 20150120298
    Abstract: A system for the control of an implant (32) in a body (11), comprising first (10, 20) and second parts (12) which communicate with each other. The first part (10, 20) is adapted for implantation and for control of and communication with the medical implant (32), and the second part (12) is adapted to be worn on the outside of the body (11) in contact with the body and to receive control commands from a user and to transmit them to the first part (10, 20). The body (11) is used as a conductor for communication between the first (10, 20) and the second (12) parts. The second part (12) is adapted to receive and recognize voice control commands from a user and to transform them into signals which are transmitted to the first part (10, 20) via the body (11).
    Type: Application
    Filed: August 10, 2014
    Publication date: April 30, 2015
    Inventor: Peter Forsell
  • Patent number: 9020818
    Abstract: Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal. (See the sketch after this entry.)
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 28, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
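    Sketch: the "sufficient new information" test in this patent family can be read as a novelty check on candidate codebook tuples. A minimal sketch under invented specifics: tuples are plain feature vectors, novelty is Euclidean distance, and both thresholds are placeholders.

      import numpy as np

      ADD_THRESHOLD = 1.0   # assumed: farther than this from every tuple -> add as new
      UPDATE_WEIGHT = 0.1   # assumed: blend factor when refreshing the nearest tuple

      def consider_candidate(codebook, candidate):
          """Add the candidate tuple if it is novel enough; otherwise use it
          to update the closest existing tuple."""
          if not codebook:
              codebook.append(candidate)
              return "added"
          dists = [np.linalg.norm(candidate - t) for t in codebook]
          i = int(np.argmin(dists))
          if dists[i] > ADD_THRESHOLD:
              codebook.append(candidate)  # novel formant pattern
              return "added"
          codebook[i] = (1 - UPDATE_WEIGHT) * codebook[i] + UPDATE_WEIGHT * candidate
          return "updated"

      codebook = []
      for tup in [np.array([0.0, 0.0]), np.array([0.1, 0.0]), np.array([3.0, 2.0])]:
          print(consider_candidate(codebook, tup))  # -> added, updated, added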
  • Publication number: 20150112680
    Abstract: A method for updating a voiceprint feature model and a terminal are provided that are applicable to the field of voice recognition technologies. The method includes: obtaining an original audio stream including at least one speaker; obtaining a respective audio stream of each speaker of the at least one speaker in the original audio stream according to a preset speaker segmentation and clustering algorithm; separately matching the respective audio stream of each speaker of the at least one speaker with an original voiceprint feature model, to obtain a successfully matched audio stream; and using the successfully matched audio stream as an additional audio stream training sample for generating the original voiceprint feature model, and updating the original voiceprint feature model. (See the sketch after this entry.)
    Type: Application
    Filed: December 30, 2014
    Publication date: April 23, 2015
    Inventor: Ting Lu
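    Sketch: the update loop of this abstract in skeletal Python. Every concrete piece is an assumption: segment_and_cluster, match_score, and retrain stand in for the patent's unspecified segmentation/clustering algorithm, voiceprint matcher, and model trainer.

      MATCH_THRESHOLD = 0.7  # assumed acceptance threshold for a voiceprint match

      def update_voiceprint(original_stream, model,
                            segment_and_cluster, match_score, retrain):
          """Split the stream into per-speaker audio, keep the streams that
          match the existing voiceprint model, and retrain with them."""
          per_speaker = segment_and_cluster(original_stream)  # one stream per speaker
          matched = [s for s in per_speaker
                     if match_score(s, model) >= MATCH_THRESHOLD]
          if matched:
              # Matched audio becomes additional training data for the model.
              model = retrain(model, extra_samples=matched)
          return model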
  • Patent number: 9015045
    Abstract: A method for refining a search is provided. Embodiments may include receiving a first speech signal corresponding to a first utterance and receiving a second speech signal corresponding to a second utterance, wherein the second utterance is a refinement to the first utterance. Embodiments may also include identifying information associated with the first speech signal as first speech signal information and identifying information associated with the second speech signal as second speech signal information. Embodiments may also include determining a first quantity of search results based upon the first speech signal information and determining a second quantity of search results based upon the second speech signal information.
    Type: Grant
    Filed: March 11, 2013
    Date of Patent: April 21, 2015
    Assignee: Nuance Communications, Inc.
    Inventor: Jean-Francois Lavallee
  • Patent number: 9015044
    Abstract: Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 21, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
  • Publication number: 20150106096
    Abstract: The disclosure includes a system and method for configuring custom vocabularies for personalized speech recognition. The system includes a processor and a memory storing instructions that when executed cause the system to: detect a provisioning trigger event; determine a state of a journey associated with a user based on the provisioning trigger event; determine one or more interest places based on the state of the journey; populate a place vocabulary associated with the user using the one or more interest places; filter the place vocabulary based on one or more place filtering parameters; and register the filtered place vocabulary for the user. (See the sketch after this entry.)
    Type: Application
    Filed: October 15, 2013
    Publication date: April 16, 2015
    Applicant: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Divya Sai Toopran, Vinuth Rai, Rahul Parundekar
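    Sketch: the provisioning flow above as a small pipeline. The journey states, the interest-place lookup, and the distance filter are invented stand-ins for the patent's unspecified logic.

      def provision_place_vocabulary(trigger_event, user, lookup_places, register):
          """Build, filter, and register a per-user place vocabulary."""
          # 1. Derive the journey state from the trigger (state names assumed).
          state = trigger_event.get("journey_state", "en_route")
          # 2. Find interest places for that state (lookup supplied by caller).
          places = lookup_places(user, state)
          # 3. Filter the vocabulary; a max-distance cut is one plausible
          #    place filtering parameter.
          max_km = trigger_event.get("max_distance_km", 5.0)
          vocabulary = [p["name"] for p in places if p["distance_km"] <= max_km]
          # 4. Register the filtered vocabulary for this user's recognizer.
          register(user, vocabulary)
          return vocabulary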
  • Patent number: 9009041
    Abstract: A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription is received of an audio recording. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data is received that is associated with the transcribed word. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.
    Type: Grant
    Filed: July 26, 2011
    Date of Patent: April 14, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: George Zavaliagkos, William F. Ganong, III, Uwe H. Jost, Shreedhar Madhavapeddi, Gary B. Clayton
  • Patent number: 8996373
    Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in a depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood. (See the sketch after this entry.)
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: March 31, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
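    Sketch: the two-likelihood decision reduces to comparing log-likelihoods of the input under the two speaker-specific models. A toy version assuming diagonal-covariance Gaussians over feature frames; the real system would use richer models.

      import numpy as np

      def gaussian_loglik(frames, mean, var):
          """Total log-likelihood of feature frames under a diagonal Gaussian."""
          return float(np.sum(-0.5 * (np.log(2 * np.pi * var)
                                      + (frames - mean) ** 2 / var)))

      def detect_state(frames, model_undepressed, model_depressed):
          """Return the state whose model better explains the input voice."""
          first = gaussian_loglik(frames, *model_undepressed)   # first likelihood
          second = gaussian_loglik(frames, *model_depressed)    # second likelihood
          return "undepressed" if first >= second else "depressed"

      # Toy 1-D "features"; (mean, variance) pairs fit per state.
      undep = (np.array([0.0]), np.array([1.0]))
      dep = (np.array([1.5]), np.array([1.0]))
      frames = np.array([[1.4], [1.6], [1.2]])
      print(detect_state(frames, undep, dep))  # -> depressed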
  • Patent number: 8996371
    Abstract: A system and method for adapting a language model to a specific environment by receiving interactions captured in the specific environment, generating a collection of documents from documents retrieved from external resources, detecting in the collection of documents terms related to the environment that are not included in an initial language model and adapting the initial language model to include the terms detected. (See the sketch after this entry.)
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: March 31, 2015
    Assignee: Nice-Systems Ltd.
    Inventors: Eyal Hurvitz, Ezra Daya, Oren Pereg, Moshe Wasserblat
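    Sketch: the detection step (environment-specific terms absent from the initial language model) via a simple frequency cutoff. The tokenizer and the cutoff value are illustrative assumptions.

      import re
      from collections import Counter

      MIN_COUNT = 3  # assumed: ignore terms too rare to matter

      def detect_new_terms(documents, lm_vocabulary):
          """Count terms across the collected documents and keep frequent
          ones that the initial language model does not include."""
          counts = Counter(tok for doc in documents
                           for tok in re.findall(r"[a-z']+", doc.lower()))
          return {t: c for t, c in counts.items()
                  if c >= MIN_COUNT and t not in lm_vocabulary}

      docs = ["the chargeback was reversed", "chargeback dispute opened",
              "another chargeback case", "routine account question"]
      vocab = {"the", "was", "another", "account", "question", "routine",
               "case", "dispute", "opened", "reversed"}
      print(detect_new_terms(docs, vocab))  # -> {'chargeback': 3}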
  • Patent number: 8996372
    Abstract: Speech recognition may be improved using data derived from an utterance. In some embodiments, audio data is received by a user device. Adaptation data may be retrieved from a data store accessible by the user device. The audio data and the adaptation data may be transmitted to a server device. The server device may use the audio data to calculate second adaptation data. The second adaptation data may be transmitted to the user device. Synchronously or asynchronously, the server device may perform speech recognition using the audio data and the second adaptation data and transmit speech recognition results back to the user device.
    Type: Grant
    Filed: October 30, 2012
    Date of Patent: March 31, 2015
    Assignee: Amazon Technologies, Inc.
    Inventors: Hugh Secker-Walker, Bjorn Hoffmeister, Ryan Thomas, Stan Salvador, Karthik Ramakrishnan
  • Patent number: 8996380
    Abstract: Systems and methods of synchronizing media are provided. A client device may be used to capture a sample of a media stream being rendered by a media rendering source. The client device sends the sample to a position identification module to determine a time offset indicating a position in the media stream corresponding to the sampling time of the sample, and optionally a timescale ratio indicating a speed at which the media stream is being rendered by the media rendering source based on a reference speed of the media stream. The client device calculates a real-time offset using a present time, a timestamp of the media sample, the time offset, and optionally the timescale ratio. The client device then renders a second media stream at a position corresponding to the real-time offset to be in synchrony to the media stream being rendered by the media rendering source. (See the sketch after this entry.)
    Type: Grant
    Filed: May 4, 2011
    Date of Patent: March 31, 2015
    Assignee: Shazam Entertainment Ltd.
    Inventors: Avery Li-Chun Wang, Rahul Powar, William Michael Mills, Christopher Jacques Penrose Barton, Philip Georges Inghelbrecht, Dheeraj Shankar Mukherjee
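    Sketch: the abstract names the inputs to the real-time offset computation without giving the formula. One plausible combination, assuming the timescale ratio scales elapsed wall-clock time, is shown; it is not Shazam's published implementation.

      def real_time_offset(present_time, sample_timestamp,
                           time_offset, timescale_ratio=1.0):
          """Estimated current position (seconds) in the media stream.
          time_offset is the stream position when the sample was taken;
          timescale_ratio is rendering speed relative to reference speed."""
          elapsed = present_time - sample_timestamp       # wall-clock time since sampling
          return time_offset + elapsed * timescale_ratio  # advance the stream position

      # Sample matched 2.5 s into the song, taken 4 s ago, playing 2% fast:
      print(real_time_offset(present_time=104.0, sample_timestamp=100.0,
                             time_offset=2.5, timescale_ratio=1.02))  # -> 6.58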
  • Publication number: 20150088511
    Abstract: In embodiments, apparatuses, methods and storage media are described that are associated with recognition of speech based on sequences of named entities. Language models may be trained as being associated with sequences of named entities. A language model may be selected for speech recognition after identification of one or more sequences of named entities by an initial language model. After identification of the one or more sequences of named entities, weights may be assigned to the one or more sequences of named entities. These weights may be utilized to select a language model and/or update the initial language model to one that is associated with the identified one or more sequences of named entities. In various embodiments, the language model may be repeatedly updated until the recognized speech converges sufficiently to satisfy a predetermined threshold. Other embodiments may be described and claimed.
    Type: Application
    Filed: September 24, 2013
    Publication date: March 26, 2015
    Applicant: Verizon Patent and Licensing Inc.
    Inventors: Sujeeth S. Bharadwaj, Suri B. Medapati
  • Patent number: 8990085
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for handling expected repeat speech queries or other inputs. The method causes a computing device to detect a misrecognized speech query from a user, determine a tendency of the user to repeat speech queries based on previous user interactions, and adapt a speech recognition model based on the determined tendency before an expected repeat speech query. The method can further include recognizing the expected repeat speech query from the user based on the adapted speech recognition model. Adapting the speech recognition model can include modifying an acoustic model, a language model, and a semantic model. Adapting the speech recognition model can also include preparing a personalized search speech recognition model for the expected repeat query based on usage history and entries in a recognition lattice. The method can include retaining unmodified speech recognition models with adapted speech recognition models.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: March 24, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro
  • Patent number: 8990080
    Abstract: Techniques to normalize names for name-based speech recognition grammars are described. Some embodiments are particularly directed to techniques to normalize names for name-based speech recognition grammars more efficiently by caching, and on a per-culture basis. A technique may comprise receiving a name for normalization during name processing for a name-based speech grammar generating process. A normalization cache may be examined to determine if the name is already in the cache in a normalized form. When the name is not already in the cache, the name may be normalized and added to the cache. When the name is in the cache, the normalization result may be retrieved and passed to the next processing step. Other embodiments are described and claimed. (See the sketch after this entry.)
    Type: Grant
    Filed: January 27, 2012
    Date of Patent: March 24, 2015
    Assignee: Microsoft Corporation
    Inventors: Mini Varkey, Bernardo Sana, Victor Boctor, Diego Carlomagno
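    Sketch: the cache is the standard memoization idiom, keyed per culture. The lowercase/strip fallback below is a stand-in for the patent's actual normalization rules.

      import re

      class NameNormalizer:
          """Per-culture cache in front of an expensive normalization step."""

          def __init__(self):
              self._cache = {}  # (culture, name) -> normalized form

          def normalized(self, name, culture="en-US"):
              key = (culture, name)
              if key not in self._cache:        # miss: do the work once
                  self._cache[key] = self._normalize(name, culture)
              return self._cache[key]           # hit: reuse the prior result

          def _normalize(self, name, culture):
              # Placeholder rule; real normalization is culture-specific.
              return re.sub(r"[^\w\s'-]", "", name).strip().lower()

      n = NameNormalizer()
      print(n.normalized("O'Brien, Pat"))  # computed, then cached
      print(n.normalized("O'Brien, Pat"))  # served from the cache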
  • Publication number: 20150081295
    Abstract: According to an aspect of the present disclosure, a method for controlling access to a plurality of applications in an electronic device is disclosed. The method includes receiving a voice command from a speaker for accessing a target application among the plurality of applications, and verifying whether the voice command is indicative of a user authorized to access the applications based on a speaker model of the authorized user. In this method, each application is associated with a security level having a threshold value. The method further includes updating the speaker model with the voice command if the voice command is verified to be indicative of the user, and adjusting at least one of the threshold values based on the updated speaker model. (See the sketch after this entry.)
    Type: Application
    Filed: September 16, 2013
    Publication date: March 19, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Sungrack Yun, Taesu Kim, Jun-Cheol Cho, Min-Kyu Park, Kyu Woong Hwang
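    Sketch: the access-control loop as score-versus-threshold verification followed by model and threshold updates. The scoring, update, and recalibration callables are invented stand-ins.

      def try_access(command_features, app, speaker_model, thresholds,
                     score, update_model, recalibrate):
          """Verify a voice command against the target app's security
          threshold, then refine the speaker model and thresholds."""
          s = score(command_features, speaker_model)   # higher = more like the user
          if s < thresholds[app]:
              return False, speaker_model, thresholds  # not verified: reject
          speaker_model = update_model(speaker_model, command_features)
          # A better-trained model separates the authorized user more
          # cleanly, so per-app thresholds can be recomputed from it.
          thresholds = {a: recalibrate(speaker_model, a, t)
                        for a, t in thresholds.items()}
          return True, speaker_model, thresholds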
  • Patent number: 8983836
    Abstract: Mechanisms for performing dynamic automatic speech recognition on a portion of multimedia content are provided. Multimedia content is segmented into homogeneous segments of content with regard to speakers and background sounds. For the at least one segment, a speaker providing speech in an audio track of the at least one segment is identified using information retrieved from a social network service source. A speech profile for the speaker is generated using information retrieved from the social network service source, an acoustic profile for the segment is generated based on the generated speech profile, and an automatic speech recognition engine is dynamically configured for operation on the at least one segment based on the acoustic profile. Automatic speech recognition operations are performed on the audio track of the at least one segment to generate a textual representation of speech content in the audio track corresponding to the speaker.
    Type: Grant
    Filed: September 26, 2012
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Elizabeth V. Woodward, Shunguo Yan
  • Publication number: 20150073797
    Abstract: The present disclosure relates to systems, methods, and computer-readable media for generating a lexicon for use with speech recognition. The method includes overgenerating potential pronunciations based on symbolic input, identifying potential pronunciations in a speech recognition context, and storing the identified potential pronunciations in a lexicon. Overgenerating potential pronunciations can include establishing a set of conversion rules for short sequences of letters, converting portions of the symbolic input into a number of possible lexical pronunciation variants based on the set of conversion rules, modeling the possible lexical pronunciation variants in one of a weighted network and a list of phoneme lists, and iteratively retraining the set of conversion rules based on improved pronunciations. Symbolic input can include multiple examples of a same spoken word. Speech data can be labeled explicitly or implicitly and can include words as text and recorded audio.
    Type: Application
    Filed: November 12, 2014
    Publication date: March 12, 2015
    Inventors: Alistair D. CONKIE, Mazin GILBERT, Andrej LJOLJE
  • Publication number: 20150073796
    Abstract: Disclosed herein are an apparatus and a method of generating a language model for speech recognition. The present invention provides an apparatus for generating a language model capable of improving speech recognition performance by predicting a position at which a break is present and reflecting the predicted break information.
    Type: Application
    Filed: April 2, 2014
    Publication date: March 12, 2015
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Jeong-Se Kim, Sang-Hun Kim
  • Patent number: 8972261
    Abstract: A computer-implemented system and method for voice transcription error reduction is provided. Speech utterances are obtained from a voice stream and each speech utterance is associated with a transcribed value and a confidence score. Those utterances with transcription values associated with lower confidence scores are identified as questionable utterances. One of the questionable utterances is selected from the voice stream. A predetermined number of questionable utterances from other voice streams and having transcribed values similar to the transcribed value of the selected questionable utterance are identified as a pool of related utterances. A further transcribed value is received for each of a plurality of the questionable utterances in the pool of related utterances. A transcribed message is generated for the voice stream using those transcribed values with higher confidence scores and the further transcribed value for the selected questionable utterance.
    Type: Grant
    Filed: February 3, 2014
    Date of Patent: March 3, 2015
    Assignee: Intellisist, Inc.
    Inventor: David Milstein
  • Patent number: 8972260
    Abstract: In accordance with one embodiment, a method of generating language models for speech recognition includes identifying a plurality of utterances in training data corresponding to speech, generating a frequency count of each utterance in the plurality of utterances, generating a high-frequency plurality of utterances from the plurality of utterances having a frequency that exceeds a predetermined frequency threshold, generating a low-frequency plurality of utterances from the plurality of utterances having a frequency that is below the predetermined frequency threshold, generating a grammar-based language model using the high-frequency plurality of utterances as training data, and generating a statistical language model using the low-frequency plurality of utterances as training data. (See the sketch after this entry.)
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: March 3, 2015
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Zhe Feng, Kui Xu, Lin Zhao
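    Sketch: the frequency partition of training utterances. The threshold value and the downstream trainers are assumptions.

      from collections import Counter

      FREQ_THRESHOLD = 3  # assumed cutoff between "high" and "low" frequency

      def split_training_data(utterances):
          """Frequent utterances train a grammar-based language model;
          rare ones train a statistical language model."""
          counts = Counter(utterances)
          high = [u for u, c in counts.items() if c >= FREQ_THRESHOLD]
          low = [u for u, c in counts.items() if c < FREQ_THRESHOLD]
          return high, low

      data = (["play music"] * 5 + ["call home"] * 4
              + ["navigate to the dentist on elm street"])
      grammar_data, statistical_data = split_training_data(data)
      print(grammar_data)      # formulaic commands -> grammar-based model
      print(statistical_data)  # open-ended requests -> statistical model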
  • Patent number: 8964948
    Abstract: A method for setting a voice tag is provided, which comprises the following steps. First, counting a number of phone calls performed between a user and a contact person. If the number of phone calls exceeds a predetermined count, or a voice dialing attempt by the user has failed before calling the contact person within a predetermined duration, the user is asked whether or not to set a voice tag corresponding to the contact person after the phone call is complete. If the user decides to set the voice tag, a voice training procedure is executed for setting the voice tag corresponding to the contact person. (See the sketch after this entry.)
    Type: Grant
    Filed: May 29, 2012
    Date of Patent: February 24, 2015
    Assignee: HTC Corporation
    Inventor: Fu-Chiang Chou
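    Sketch: the trigger condition combines a call counter with recent voice-dialing failures. The thresholds and the per-contact state layout are invented.

      import time

      CALL_THRESHOLD = 5                 # assumed "predetermined count"
      FAILURE_WINDOW_S = 7 * 24 * 3600   # assumed "predetermined duration"

      def should_offer_voice_tag(stats, now=None):
          """After a completed call, decide whether to ask the user to
          record a voice tag for this contact."""
          now = now if now is not None else time.time()
          if stats.get("has_voice_tag"):
              return False
          if stats["call_count"] > CALL_THRESHOLD:
              return True
          last_fail = stats.get("last_voice_dial_failure")
          # Also prompt when voice dialing to this contact failed recently.
          return last_fail is not None and now - last_fail <= FAILURE_WINDOW_S

      stats = {"call_count": 2, "last_voice_dial_failure": time.time() - 3600}
      print(should_offer_voice_tag(stats))  # -> True (recent failed voice dial)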
  • Patent number: 8965763
    Abstract: Training data from a plurality of utterance-to-text-string mappings of an automatic speech recognition (ASR) system may be selected. Parameters of the ASR system that characterize the utterances and their respective mappings may be determined through application of a first acoustic model and a language model. A second acoustic model and the language model may be applied to the selected training data utterances to determine a second set of utterance-to-text-string mappings. The first set of utterance-to-text-string mappings may be compared to the second set of utterance-to-text-string mappings, and the parameters of the ASR system may be updated based on the comparison.
    Type: Grant
    Filed: May 1, 2012
    Date of Patent: February 24, 2015
    Assignee: Google Inc.
    Inventors: Ciprian Ioan Chelba, Brian Strope, Preethi Jyothi, Leif Johnson
  • Patent number: 8954325
    Abstract: The present invention allows feedback from operator workstations to be used to update databases used for providing automated information services. When an automated process fails, recorded speech of the caller is passed on to the operator for decision making. Based on the selections made by the operator in light of the speech or other interactions with the caller, a comparison is made between the speech and the selections made by the operator to arrive at information to update the databases in the information services automation system. Thus, when the operator inputs the words corresponding to the speech provided at the information services automation system, the speech may be associated with those words. The association between the speech and the words may be used to update different databases in the information services automation system.
    Type: Grant
    Filed: March 22, 2004
    Date of Patent: February 10, 2015
    Assignee: Rockstar Consortium US LP
    Inventors: Bruce Bokish, Michael Craig Presnell
  • Patent number: 8953889
    Abstract: An augmented reality environment allows interaction between virtual and real objects and enhances an unstructured real-world environment. An object datastore comprising attributes of an object within the environment may be built and/or maintained from sources including manufacturers, retailers, shippers, and users. This object datastore may be local, cloud based, or a combination thereof. Applications may interrogate the object datastore to provide user functionality.
    Type: Grant
    Filed: September 14, 2011
    Date of Patent: February 10, 2015
    Assignee: Rawles LLC
    Inventors: William Spencer Worley, III, Edward Dietz Crump
  • Publication number: 20150039310
    Abstract: An electronic device includes a microphone that receives an audio signal, and a processor that is electrically coupled to the microphone. The processor detects a trigger phrase in the received audio signal and measures characteristics of the detected trigger phrase. Based on the measured characteristics of the detected trigger phrase, the processor determines whether the detected trigger phrase is valid.
    Type: Application
    Filed: October 10, 2013
    Publication date: February 5, 2015
    Applicant: Motorola Mobility LLC
    Inventors: Joel A. Clark, Tenkasi V. Ramabadran, Mark A. Jasiuk
  • Publication number: 20150039311
    Abstract: An electronic device includes a microphone that receives an audio signal that includes a spoken trigger phrase, and a processor that is electrically coupled to the microphone. The processor measures characteristics of the audio signal, and determines, based on the measured characteristics, whether the spoken trigger phrase is acceptable for trigger phrase model training. If the spoken trigger phrase is determined not to be acceptable for trigger phrase model training, the processor rejects the trigger phrase for trigger phrase model training.
    Type: Application
    Filed: October 10, 2013
    Publication date: February 5, 2015
    Applicant: Motorola Mobility LLC
    Inventors: Joel A. Clark, Tenkasi V. Ramabadran, Mark A. Jasiuk
  • Patent number: 8949130
    Abstract: In embodiments of the present invention improved capabilities are described for a user interacting with a mobile communication facility, where speech presented by the user is recorded using a mobile communication facility resident capture facility. The recorded speech may be recognized using an external speech recognition facility to produce an external output and a resident speech recognition facility to produce an internal output, where at least one of the external output and the internal output may be selected based on a criteria.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: February 3, 2015
    Assignee: Vlingo Corporation
    Inventor: Michael S. Phillips
  • Patent number: 8949126
    Abstract: Methods for creating statistical language models (SLMs) for spoken Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs) are disclosed. In these methods, candidate challenge items including one or more words are automatically selected from a document corpus. Selected ones of the challenge items are articulated by a machine text-to-speech (TTS) system as candidate articulations. Those articulations are ranked based on a human listener score indicating whether a candidate articulation originated from a machine. The SLM is then trained to recognize machine TTS articulations according to those rankings, by using a subset of the plurality of candidate challenge items identified as machine articulations as a seed set.
    Type: Grant
    Filed: April 21, 2014
    Date of Patent: February 3, 2015
    Assignee: The John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Publication number: 20150032451
    Abstract: A method on a mobile device for voice recognition training is described. A voice training mode is entered. A voice training sample for a user of the mobile device is recorded. The voice training mode is interrupted to enter a noise indicator mode based on a sample background noise level for the voice training sample and a sample background noise type for the voice training sample. The voice training mode is resumed from the noise indicator mode when the user provides a continuation input that indicates the current background noise level meets an indicator threshold value.
    Type: Application
    Filed: December 27, 2013
    Publication date: January 29, 2015
    Applicant: Motorola Mobility LLC
    Inventors: Michael E. Gunn, Boris Bekkerman, Mark A. Jasiuk, Pratik M. Kamdar, Jeffrey A. Sierawski
  • Publication number: 20150025886
    Abstract: A clausifier for extracting clauses for spoken language understanding is disclosed. The method relates to generating a set of clauses from speech utterance text and comprises inserting at least one boundary tag in speech utterance text related to sentence boundaries, inserting at least one edit tag indicating a portion of the speech utterance text to remove, and inserting at least one conjunction tag within the speech utterance text. The result is a set of clauses that may be identified within the speech utterance text according to the inserted at least one boundary tag, at least one edit tag and at least one conjunction tag. The disclosed clausifier comprises a sentence boundary classifier, an edit detector classifier, and a conjunction detector classifier. The clausifier may comprise a single classifier or a plurality of classifiers to perform the steps of identifying sentence boundaries, editing text, and identifying conjunctions within the text.
    Type: Application
    Filed: August 26, 2014
    Publication date: January 22, 2015
    Inventors: Srinivas BANGALORE, Narendra K. GUPTA, Mazin G. RAHIM
  • Patent number: 8938392
    Abstract: Methods, apparatus, and products are disclosed for configuring a speech engine for a multimodal application based on location. The multimodal application operates on a multimodal device supporting multiple modes of user interaction with the multimodal application. The multimodal application is operatively coupled to a speech engine.
    Type: Grant
    Filed: February 27, 2007
    Date of Patent: January 20, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Charles W. Cross, Jr., Igor R. Jablokov
  • Publication number: 20150019219
    Abstract: Systems and methods for arbitrating spoken dialog services include determining a capability catalog associated with a plurality of devices accessible within an environment. The capability catalog includes a list of the plurality of devices mapped to a list of spoken dialog services provided by each of the plurality of devices. The system arbitrates between the plurality of devices and the spoken dialog services in the capability catalog to determine a selected device and a selected dialog service. (See the sketch after this entry.)
    Type: Application
    Filed: December 2, 2013
    Publication date: January 15, 2015
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: ELI TZIRKEL-HANCOCK, GREG T. LINDEMANN, ROBERT D. SIMS, OMER TSIMHONI
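    Sketch: the capability catalog maps devices to the spoken dialog services they provide; arbitration then selects a device for a requested service. The catalog contents and the priority ordering are invented.

      # Capability catalog: device -> spoken dialog services it provides.
      catalog = {
          "head_unit":  ["navigation", "media", "phone"],
          "smartphone": ["navigation", "messaging", "web_search"],
          "smartwatch": ["messaging"],
      }

      # Assumed arbitration policy: prefer devices earlier in this list.
      DEVICE_PRIORITY = ["head_unit", "smartphone", "smartwatch"]

      def arbitrate(service):
          """Return (selected device, selected service) for a request."""
          for device in DEVICE_PRIORITY:
              if service in catalog.get(device, []):
                  return device, service
          raise LookupError(f"no device in the environment provides {service!r}")

      print(arbitrate("navigation"))  # -> ('head_unit', 'navigation')
      print(arbitrate("messaging"))   # -> ('smartphone', 'messaging')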
  • Publication number: 20150019220
    Abstract: A method for configuring a speech recognition system comprises obtaining a speech sample utilised by a voice authentication system in a voice authentication process. The speech sample is processed to generate acoustic models for units of speech associated with the speech sample. The acoustic models are stored for subsequent use by the speech recognition system as part of a speech recognition process.
    Type: Application
    Filed: January 23, 2013
    Publication date: January 15, 2015
    Inventors: Habib Emile Talhami, Amit Sadanand Malegaonkar, Renuka Amit Malegaonkar, Clive David Summerfield
  • Patent number: 8935167
    Abstract: Methods, systems, and computer-readable media related to selecting observation-specific training data (also referred to as “observation-specific exemplars”) from a general training corpus, and then creating, from the observation-specific training data, a focused, observation-specific acoustic model for recognizing the observation in an output domain are disclosed. In one aspect, a global speech recognition model is established based on an initial set of training data; a plurality of input speech segments to be recognized in an output domain are received; and for each of the plurality of input speech segments: a respective set of focused training data relevant to the input speech segment is identified in the global speech recognition model; a respective focused speech recognition model is generated based on the respective set of focused training data; and the respective focused speech recognition model is provided to a recognition device for recognizing the input speech segment in the output domain.
    Type: Grant
    Filed: September 25, 2012
    Date of Patent: January 13, 2015
    Assignee: Apple Inc.
    Inventor: Jerome Bellegarda
  • Patent number: 8918318
    Abstract: Speech recognition is enabled even for a new speaker of a speech recognition system by using an extended recognition dictionary suited to that speaker, without requiring any previous learning based on an utterance label corresponding to the speaker's speech.
    Type: Grant
    Filed: January 15, 2008
    Date of Patent: December 23, 2014
    Assignee: NEC Corporation
    Inventor: Yoshifumi Onishi
  • Patent number: 8914286
    Abstract: Provided are systems and methods for using hierarchical networks for recognition, such as speech recognition. Conventional automatic recognition systems may not be both efficient and flexible. Recognition systems are disclosed that may achieve efficiency and flexibility by employing hierarchical networks, prefix consolidation of networks, and future consolidation of networks. The disclosed networks may be associated with a network model and the associated network model may be modified during recognition to achieve greater flexibility.
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: December 16, 2014
    Assignee: Canyon IP Holdings, LLC
    Inventors: Hugh Secker-Walker, Kenneth J. Basye, Mahesh Krishnamoorthy
  • Patent number: 8914292
    Abstract: In embodiments of the present invention improved capabilities are described for a user interacting with a mobile communication facility, where speech presented by the user is recorded using a mobile communication facility resident capture facility. The recorded speech may be recognized using an external speech recognition facility to produce an external output and a resident speech recognition facility to produce an internal output, where at least one of the external output and the internal output may be selected based on a criteria.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: December 16, 2014
    Assignee: Vlingo Corporation
    Inventor: Michael S. Phillips
  • Publication number: 20140365218
    Abstract: A received utterance is recognized using different language models. For example, recognition of the utterance is independently performed using a baseline language model (BLM) and using an adapted language model (ALM). A determination is made as to which results from the different language models are more likely to be accurate. Different features (e.g. language model scores, recognition confidences, acoustic model scores, quality measurements, . . . ) may be used to assist in making the determination. A classifier may be trained and then used in determining whether to select the results using the BLM or to select the results using the ALM. A language model may be automatically trained or re-trained that adjusts a weight of the training data used in training the model in response to differences between the two results obtained from applying the different language models. (See the sketch after this entry.)
    Type: Application
    Filed: June 7, 2013
    Publication date: December 11, 2014
    Inventors: Shuangyu Chang, Michael Levit
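    Sketch: the selection step feeds features from both recognition passes to a classifier. In place of the trained classifier, a confidence-margin heuristic shows the shape of the decision; the feature names and the margin are assumptions.

      MARGIN = 0.05  # assumed: prefer the adapted model only if clearly better

      def select_result(blm_result, alm_result):
          """Pick between baseline-LM and adapted-LM recognition results.
          Each result is a dict of text plus feature scores; a trained
          classifier would normally replace this hand-set rule."""
          blm_score = blm_result["confidence"] + blm_result["lm_score"]
          alm_score = alm_result["confidence"] + alm_result["lm_score"]
          if alm_score > blm_score + MARGIN:
              return alm_result["text"]
          return blm_result["text"]

      blm = {"text": "recognize speech", "confidence": 0.80, "lm_score": 0.10}
      alm = {"text": "recognise speech", "confidence": 0.86, "lm_score": 0.12}
      print(select_result(blm, alm))  # adapted result wins by more than the margin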
  • Patent number: 8909529
    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: December 9, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Giuseppe Riccardi
  • Publication number: 20140358540
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Application
    Filed: August 14, 2014
    Publication date: December 4, 2014
    Inventors: Andrej LJOLJE, Alistair D. CONKIE, Ann K. Syrdal
  • Publication number: 20140343942
    Abstract: Systems for improving or generating a spoken language understanding system using a multitask learning method for intent or call-type classification. The multitask learning method aims at training tasks in parallel while using a shared representation. A computing device automatically re-uses the existing labeled data from various applications, which are similar but may have different call-types, intents or intent distributions, to improve performance. An automated intent mapping algorithm operates across applications. In one aspect, active learning is employed to selectively sample the data to be re-used.
    Type: Application
    Filed: May 27, 2014
    Publication date: November 20, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan TUR
  • Patent number: 8892436
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting, for each frame of the first speech, at least one frame positioned before that frame.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: November 18, 2014
    Assignees: Samsung Electronics Co., Ltd., Seoul National University Industry Foundation
    Inventors: Ki-wan Eom, Chang-woo Han, Tae-gyoon Kang, Nam-soo Kim, Doo-hwa Hong, Jae-won Lee, Hyung-joon Lim
  • Patent number: 8886535
    Abstract: A method of optimizing the calculation of matching scores between phone states and acoustic frames across a matrix of an expected progression of phone states aligned with an observed progression of acoustic frames within an utterance is provided. The matrix has a plurality of cells associated with a characteristic acoustic frame and a characteristic phone state. A first set and second set of cells that meet a threshold probability of matching a first phone state or a second phone state, respectively, are determined. The phone states are stored on a local cache of a first core and a second core, respectively. The first and second sets of cells are also provided to the first core and second core, respectively. Further, matching scores of each characteristic state and characteristic observation of each cell of the first set of cells and of the second set of cells are calculated.
    Type: Grant
    Filed: January 23, 2014
    Date of Patent: November 11, 2014
    Assignee: Accumente, LLC
    Inventors: Jike Chong, Ian Richard Lane, Senaka Wimal Buthpitiya
  • Patent number: 8886534
    Abstract: A speech recognition apparatus includes a speech input unit that receives input speech, a phoneme recognition unit that recognizes phonemes of the input speech and generates a first phoneme sequence representing corrected speech, a matching unit that matches the first phoneme sequence with a second phoneme sequence representing original speech, and a phoneme correcting unit that corrects phonemes of the second phoneme sequence based on the matching result.
    Type: Grant
    Filed: January 27, 2011
    Date of Patent: November 11, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventors: Mikio Nakano, Naoto Iwahashi, Kotaro Funakoshi, Taisuke Sumii
  • Patent number: 8886540
    Abstract: A method and system for entering information into a software application resident on a mobile communication facility is provided. The method and system may include recording speech presented by a user using a mobile communication facility resident capture facility, transmitting the recording through a wireless communication facility to a speech recognition facility, transmitting information relating to the software application to the speech recognition facility, generating results utilizing the speech recognition facility using an unstructured language model based at least in part on the information relating to the software application and the recording, transmitting the results to the mobile communications facility, loading the results into the software application and simultaneously displaying the results as a set of words and as a set of application results based on those words.
    Type: Grant
    Filed: August 1, 2008
    Date of Patent: November 11, 2014
    Assignee: Vlingo Corporation
    Inventors: Joseph P. Cerra, John N. Nguyen, Michael S. Phillips, Han Shu, Alexandra Beth Mischke
  • Publication number: 20140330565
    Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.
    Type: Application
    Filed: May 20, 2014
    Publication date: November 6, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan Tur