Creating Patterns For Matching: Patents (Class 704/243)
  • Patent number: 8914283
    Abstract: A system and method are provided for combining active and unsupervised learning for automatic speech recognition. This process enables a reduction in the amount of human supervision required for training acoustic and language models and an increase in performance given the transcribed and un-transcribed data.
    Type: Grant
    Filed: August 5, 2013
    Date of Patent: December 16, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Zeynep Hakkani-Tur, Giuseppe Riccardi
  • Patent number: 8914290
    Abstract: Method and apparatus that dynamically adjusts operational parameters of a text-to-speech engine in a speech-based system. A voice engine or other application of a device provides a mechanism to alter the adjustable operational parameters of the text-to-speech engine. In response to one or more environmental conditions, the adjustable operational parameters of the text-to-speech engine are modified to increase the intelligibility of synthesized speech.
    Type: Grant
    Filed: May 18, 2012
    Date of Patent: December 16, 2014
    Assignee: Vocollect, Inc.
    Inventors: James Hendrickson, Debra Drylie Scott, Duane Littleton, John Pecorari, Arkadiusz Slusarczyk
  • Patent number: 8914278
    Abstract: A computer-assisted language correction system including spelling correction functionality, misused word correction functionality, grammar correction functionality and vocabulary enhancement functionality utilizing contextual feature-sequence functionality employing an internet corpus.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: December 16, 2014
    Assignee: Ginger Software, Inc.
    Inventors: Yael Karov Zangvil, Avner Zangvil
  • Patent number: 8909534
    Abstract: A method may include selecting, by a computing device, sets of two or more text candidates from a plurality of text candidates corresponding to vocal input. The method may further include, for each set, providing, by the computing device, representations of each of the respective two or more text candidates in the set to users, wherein the representations are provided as audio. The method may further include receiving a selection, from each of the users, of one text candidate from the set, wherein the selection is based on satisfying a criterion. The method may further include determining that a text candidate included in the plurality of text candidates has the highest probability, out of the plurality of text candidates, of being a correct textual transcription of the vocal input, based at least in part on the selections from the users.
    Type: Grant
    Filed: March 9, 2012
    Date of Patent: December 9, 2014
    Assignee: Google Inc.
    Inventor: Taliver Heath
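A minimal sketch of the vote-tallying step in 8909534, assuming the per-user selections have already been collected; the candidate strings and the plurality-vote criterion are illustrative, not the patent's exact criterion:

```python
from collections import Counter

def most_likely_transcription(selections):
    """Given one selected candidate per user, return the candidate with
    the highest share of votes as the most probable transcription."""
    tally = Counter(selections)
    candidate, votes = tally.most_common(1)[0]
    return candidate, votes / len(selections)

# Hypothetical selections gathered from users who listened to audio
# renderings of the candidate transcriptions.
selections = ["recognize speech", "recognize speech",
              "wreck a nice beach", "recognize speech"]
best, probability = most_likely_transcription(selections)
print(f"{best!r} with estimated probability {probability:.2f}")
```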
  • Patent number: 8909528
    Abstract: A method (and system) of determining confusable list items and resolving this confusion in a spoken dialog system includes receiving user input, processing the user input and determining if a list of items needs to be played back to the user, retrieving the list to be played back to the user, identifying acoustic confusions between items on the list, changing the items on the list as necessary to remove the acoustic confusions, and playing unambiguous list items back to the user.
    Type: Grant
    Filed: May 9, 2007
    Date of Patent: December 9, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Ellen Marie Eide, Vaibhava Goel, Ramesh Gopinath, Osamuyimen T. Stewart
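One plausible way to realize the "identify acoustic confusions" step of 8909528 is a phoneme-level edit distance over the list items; the phoneme transcriptions and the threshold below are assumptions, not the patent's stated metric:

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (pa != pb)))   # substitution
        prev = cur
    return prev[-1]

def confusable_pairs(items, threshold=2):
    """Flag list items whose phoneme sequences are within `threshold` edits."""
    pairs = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if edit_distance(items[i][1], items[j][1]) <= threshold:
                pairs.append((items[i][0], items[j][0]))
    return pairs

# Hypothetical list items with phoneme transcriptions.
items = [("Austin", ["AO", "S", "T", "AH", "N"]),
         ("Boston", ["B", "AO", "S", "T", "AH", "N"]),
         ("Denver", ["D", "EH", "N", "V", "ER"])]
print(confusable_pairs(items))  # [('Austin', 'Boston')]
```

Flagged pairs would then be reworded or disambiguated before the list is played back to the user.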
  • Patent number: 8909529
    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain-independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: December 9, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Giuseppe Riccardi
  • Publication number: 20140358539
    Abstract: A method includes: acquiring data samples; performing categorized sentence mining in the acquired data samples to obtain categorized training samples for multiple categories; building a text classifier based on the categorized training samples; classifying the data samples using the text classifier to obtain a class vocabulary and a corpus for each category; mining the corpus for each category according to the class vocabulary for the category to obtain a respective set of high-frequency language templates; training on the templates for each category to obtain a template-based language model for the category; training on the corpus for each category to obtain a class-based language model for the category; training on the class vocabulary for each category to obtain a lexicon-based language model for the category; building a speech decoder according to an acoustic model, the class-based language model and the lexicon-based language model for any given field, and the data samples.
    Type: Application
    Filed: February 14, 2014
    Publication date: December 4, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Feng Rao, Li Lu, Bo Chen, Xiang Zhang, Shuai Yue, Lu Li
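A toy sketch of the template-mining step in 20140358539: words from the class vocabulary are replaced with a class tag, and the most frequent resulting sentence patterns become the high-frequency templates. The tag name and corpus are hypothetical:

```python
from collections import Counter

def mine_templates(corpus, class_vocab, tag="<CLASS>", top_n=3):
    """Replace in-class words with a class tag and count the resulting
    sentence patterns; the most frequent patterns serve as templates."""
    patterns = Counter()
    for sentence in corpus:
        tokens = [tag if w in class_vocab else w for w in sentence.split()]
        patterns[" ".join(tokens)] += 1
    return patterns.most_common(top_n)

corpus = ["play some jazz", "play some rock",
          "play some jazz", "stop the music"]
print(mine_templates(corpus, class_vocab={"jazz", "rock"}))
# [('play some <CLASS>', 3), ('stop the music', 1)]
```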
  • Publication number: 20140358538
    Abstract: Methods and systems are provided for shaping speech dialog of a speech system. In one embodiment, a method includes: receiving data related to a first utterance from a user of the speech system; processing the data based on at least one attribute processing technique that determines at least one attribute of the first utterance; determining a shaping pattern based on the at least one attribute; and generating a speech prompt based on the shaping pattern.
    Type: Application
    Filed: May 28, 2013
    Publication date: December 4, 2014
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Ron M. Hecht, Eli Tzirkel-Hancock, Omer Tsimhoni, Ute Winter
  • Publication number: 20140350931
    Abstract: A Statistical Machine Translation (SMT) model is trained using pairs of sentences that include content obtained from one or more content sources (e.g. feed(s)) with corresponding queries that have been used to access the content. A query click graph may be used to assist in determining candidate pairs for the SMT training data. All or a portion of the candidate pairs may be used to train the SMT model. After training the SMT model using the SMT training data, the SMT model is applied to content to determine predicted queries that may be used to search for the content. The predicted queries are used to train a language model, such as a query language model. The query language model may be interpolated with other language models, such as a background language model, as well as a feed language model trained using the content used in determining the predicted queries.
    Type: Application
    Filed: May 24, 2013
    Publication date: November 27, 2014
    Applicant: Microsoft Corporation
    Inventors: Michael Levit, Dilek Hakkani-Tur, Gokhan Tur
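The final interpolation step of 20140350931 can be written as P(w|h) = Σᵢ wᵢ Pᵢ(w|h); a minimal sketch follows, with the component models and weights as stand-in assumptions:

```python
def interpolated_prob(word, history, models, weights):
    """Linearly interpolate P(word | history) across several language
    models: sum_i w_i * P_i(word | history), with weights summing to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * m(word, history) for m, w in zip(models, weights))

# Hypothetical component models, each returning P(word | history).
query_lm      = lambda w, h: 0.30
background_lm = lambda w, h: 0.05
feed_lm       = lambda w, h: 0.12

p = interpolated_prob("word", ("some", "history"),
                      [query_lm, background_lm, feed_lm],
                      [0.5, 0.3, 0.2])
print(f"interpolated probability: {p:.3f}")  # 0.5*0.30 + 0.3*0.05 + 0.2*0.12
```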
  • Patent number: 8892437
    Abstract: Example embodiments of the present invention may include a method that provides transcribing spoken utterances occurring during a call and assigning each of the spoken utterances with a corresponding set of first classifications. The method may also include determining a confidence rating associated with each of the spoken utterances and the assigned set of first classifications, and performing at least one of reclassifying the spoken utterances with new classifications based on at least one additional classification operation, and adding the assigned first classifications and the corresponding plurality of spoken utterances to a training data set.
    Type: Grant
    Filed: November 13, 2013
    Date of Patent: November 18, 2014
    Assignee: West Corporation
    Inventor: Silke Witt-Ehsani
  • Patent number: 8892436
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: November 18, 2014
    Assignees: Samsung Electronics Co., Ltd., Seoul National University Industry Foundation
    Inventors: Ki-wan Eom, Chang-woo Han, Tae-gyoon Kang, Nam-soo Kim, Doo-hwa Hong, Jae-won Lee, Hyung-joon Lim
  • Publication number: 20140337026
    Abstract: A method and system for generating training data for a target domain using speech data of a source domain. The training data generation method includes: reading out a Gaussian mixture model (GMM) of the target domain trained with a clean speech data set of the target domain; mapping, by referring to the GMM of the target domain, a set of source domain speech data received as an input to the set of target domain speech data on the basis of a channel characteristic of the target domain speech data; and adding noise of the target domain to the mapped set of source domain speech data to output a set of pseudo target domain speech data.
    Type: Application
    Filed: April 14, 2014
    Publication date: November 13, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Osamu Ichikawa, Steven J. Rennie
  • Patent number: 8886535
    Abstract: A method of optimizing the calculation of matching scores between phone states and acoustic frames across a matrix of an expected progression of phone states aligned with an observed progression of acoustic frames within an utterance is provided. The matrix has a plurality of cells associated with a characteristic acoustic frame and a characteristic phone state. A first set and second set of cells that meet a threshold probability of matching a first phone state or a second phone state, respectively, are determined. The phone states are stored on a local cache of a first core and a second core, respectively. The first and second sets of cells are also provided to the first core and second core, respectively. Further, matching scores of each characteristic state and characteristic observation of each cell of the first set of cells and of the second set of cells are calculated.
    Type: Grant
    Filed: January 23, 2014
    Date of Patent: November 11, 2014
    Assignee: Accumente, LLC
    Inventors: Jike Chong, Ian Richard Lane, Senaka Wimal Buthpitiya
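A simplified sketch of the work-partitioning idea in 8886535: only cells meeting the threshold probability are scored, and each cell is routed to the core whose local cache holds the matching phone-state model. The cell probabilities and core assignment below are hypothetical:

```python
def assign_cells(cells, threshold, state_to_core):
    """Route each (frame, phone_state) cell that meets the probability
    threshold to the core whose local cache holds that state's model."""
    work = {core: [] for core in sorted(set(state_to_core.values()))}
    for (frame, state), prob in cells.items():
        if prob >= threshold:
            work[state_to_core[state]].append((frame, state))
    return work

# Hypothetical match probabilities for a tiny 2-frame x 2-state matrix.
cells = {(0, "AA"): 0.7, (0, "IY"): 0.2, (1, "AA"): 0.4, (1, "IY"): 0.6}
print(assign_cells(cells, threshold=0.4, state_to_core={"AA": 0, "IY": 1}))
# {0: [(0, 'AA'), (1, 'AA')], 1: [(1, 'IY')]}
```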
  • Patent number: 8886533
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations.
    Type: Grant
    Filed: October 25, 2011
    Date of Patent: November 11, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Sumit Chopra, Dimitrios Dimitriadis, Patrick Haffner
  • Patent number: 8886534
    Abstract: A speech recognition apparatus includes a speech input unit that receives input speech, a phoneme recognition unit that recognizes phonemes of the input speech and generates a first phoneme sequence representing corrected speech, a matching unit that matches the first phoneme sequence with a second phoneme sequence representing original speech, and a phoneme correcting unit that corrects phonemes of the second phoneme sequence based on the matching result.
    Type: Grant
    Filed: January 27, 2011
    Date of Patent: November 11, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventors: Mikio Nakano, Naoto Iwahashi, Kotaro Funakoshi, Taisuke Sumii
  • Patent number: 8886543
    Abstract: System and methods for characterizing interest points within a fingerprint are disclosed herein. The systems include generating a set of interest points and an anchor point related to an audio sample. A quantized absolute frequency of an anchor point can be calculated and used to calculate a set of quantized ratios. A fingerprint can then be generated based upon the set of quantized ratios and used in comparison to reference fingerprints to identify the audio sample. The disclosed systems and methods provide for an audio matching system robust to pitch-shift distortion by using quantized ratios within fingerprints rather than solely using absolute frequencies of interest points. Thus, the disclosed system and methods result in more accurate audio identification.
    Type: Grant
    Filed: November 15, 2011
    Date of Patent: November 11, 2014
    Assignee: Google Inc.
    Inventors: Matthew Sharifi, George Tzanetakis, Annie Chen, Dominik Roblek
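A minimal sketch of the quantized-ratio idea in 8886543: ratios of interest-point frequencies to an anchor frequency are invariant under a uniform pitch shift, so a fingerprint built from quantized ratios survives pitch-shift distortion. The bin widths here are assumptions:

```python
import math

def quantize(value, step):
    """Quantize a value onto a grid of the given step size."""
    return round(value / step)

def ratio_fingerprint(anchor_hz, interest_hz, ratio_step=0.05):
    """Build a fingerprint from quantized frequency ratios relative to an
    anchor point. A uniform pitch shift scales every frequency by the same
    factor, so the ratios (and hence the fingerprint) are unchanged."""
    q_anchor = quantize(math.log2(anchor_hz), 1 / 12)  # semitone grid
    ratios = tuple(quantize(f / anchor_hz, ratio_step) for f in interest_hz)
    return (q_anchor, ratios)

points = [880.0, 1320.0, 660.0]
original = ratio_fingerprint(440.0, points)
shifted  = ratio_fingerprint(440.0 * 1.06, [f * 1.06 for f in points])
print(original[1] == shifted[1])  # True: the ratios survive the pitch shift
```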
  • Patent number: 8880402
    Abstract: A speech recognition method includes receiving input speech from a user, processing the input speech to obtain at least one parameter value, and determining an experience level of the user using the parameter value(s). The method can also include prompting the user based upon the determined experience level of the user to assist the user in delivering speech commands.
    Type: Grant
    Filed: October 28, 2006
    Date of Patent: November 4, 2014
    Assignee: General Motors LLC
    Inventors: Ryan J. Wasson, John P. Weiss, Jason W. Clark
  • Patent number: 8880397
    Abstract: Exemplary embodiments provide systems, devices and methods that allow creation and management of lists of items in an integrated manner on an interactive graphical user interface. A user may speak a plurality of list items in a natural unbroken manner to provide an audio input stream into an audio input device. Exemplary embodiments may automatically process the audio input stream to convert the stream into a text output, and may process the text output into one or more n-grams that may be used as list items to populate a list on a user interface.
    Type: Grant
    Filed: October 21, 2011
    Date of Patent: November 4, 2014
    Assignee: Wal-Mart Stores, Inc.
    Inventors: Dion Almaer, Bernard Paul Cousineau, Ben Galbraith
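A sketch of only the final step of 8880397, splitting an already transcribed utterance into list items; the separator convention below stands in for the patent's n-gram processing:

```python
def items_from_transcript(transcript, separators=(",", " and ")):
    """Split one naturally spoken utterance, already transcribed to text,
    into individual list items."""
    text = transcript.lower()
    for sep in separators:
        text = text.replace(sep, "|")
    return [item.strip() for item in text.split("|") if item.strip()]

print(items_from_transcript("milk, brown eggs and whole wheat bread"))
# ['milk', 'brown eggs', 'whole wheat bread']
```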
  • Publication number: 20140324427
    Abstract: A dialog manager and spoken dialog service having a dialog manager generated according to a method comprising selecting a top level flow controller based on application type, selecting available reusable subdialogs for each application part, developing a subdialog for each application part not having an available subdialog, and testing and deploying the spoken dialog service using the selected top level flow controller, selected reusable subdialogs and developed subdialogs. The dialog manager is capable of handling context shifts in a spoken dialog with a user. Application dependencies are established in the top level flow controller, thus enabling the subdialogs to be reusable and capable of managing context shifts and mixed-initiative dialogs.
    Type: Application
    Filed: April 25, 2014
    Publication date: October 30, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Charles Alfred Lewis
  • Patent number: 8868423
    Abstract: Systems and methods for controlling access to resources using spoken Completely Automatic Public Turing Tests To Tell Humans And Computers Apart (CAPTCHA) tests are disclosed. In these systems and methods, entities seeking access to resources are required to produce an input utterance that contains at least some audio. That utterance is compared with voice reference data for human and machine entities, and a determination is made as to whether the entity requesting access is a human or a machine. Access is then permitted or refused based on that determination.
    Type: Grant
    Filed: July 11, 2013
    Date of Patent: October 21, 2014
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Patent number: 8868410
    Abstract: The invention provides a dialogue-based learning apparatus that learns through dialogue with users, comprising: a speech input unit (10) for inputting speech; a speech recognition unit (20) for recognizing the input speech; and a behavior and dialogue controller (30) for controlling behaviors and dialogues according to speech recognition results, wherein the behavior and dialogue controller (30) has a topic recognition expert (34) to memorise contents of utterances and to retrieve the topic that best matches the speech recognition results, and a mode switching expert (35) to control mode switching in accordance with a user utterance, wherein the mode switching expert switches modes in accordance with a user utterance, and the topic recognition expert registers a plurality of words in the utterance as topics in the first mode, performs searches from among the registered topics, and selects the maximum likelihood topic in the second mode.
    Type: Grant
    Filed: August 29, 2008
    Date of Patent: October 21, 2014
    Assignees: National Institute of Information and Communications Technology, Honda Motor Co., Ltd.
    Inventors: Naoto Iwahashi, Noriyuki Kimura, Mikio Nakano, Kotaro Funakoshi
  • Patent number: 8862468
    Abstract: A system and method of refining context-free grammars (CFGs). The method includes deriving back-off grammar (BOG) rules from an initially developed CFG and utilizing the initial CFG and the derived BOG rules to recognize user utterances. Based on the response of the initial CFG and the derived BOG rules to the user utterances, at least a portion of the derived BOG rules are utilized to modify the initial CFG and thereby produce a refined CFG. The above method can be carried out iteratively, with each new iteration utilizing the refined CFG from preceding iterations.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: October 14, 2014
    Assignee: Microsoft Corporation
    Inventors: Timothy Paek, Max Chickering, Eric Badger
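One way to illustrate deriving a back-off rule from a rigid CFG phrase, as in 8862468, is to relax the phrase into a keyword pattern that tolerates filler words; this regex relaxation is an illustration, not the patent's exact derivation:

```python
import re

def backoff_rule(keywords):
    """Relax a rigid CFG phrase into a back-off regex that matches its
    keywords in order with arbitrary filler words between them."""
    pattern = r".*\b" + r"\b.*\b".join(map(re.escape, keywords)) + r"\b.*"
    return re.compile(pattern)

# Hypothetical CFG phrase "call <contact> at work" and its back-off rule.
rule = backoff_rule(["call", "work"])
print(bool(rule.match("could you call my dentist at his work number")))  # True
print(bool(rule.match("what time is it")))                               # False
```

Utterances that fail the strict CFG but match the back-off rule point to phrasings the refined CFG should cover.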
  • Patent number: 8856005
    Abstract: A method for receiving processed information at a remote device is described. The method includes transmitting from the remote device a verbal request to a first information provider and receiving a digital message from the first information provider in response to the transmitted verbal request. The digital message includes a symbolic representation indicator associated with a symbolic representation of the verbal request and data used to control an application. The method also includes transmitting, using the application, the symbolic representation indicator to a second information provider for generating results to be displayed on the remote device.
    Type: Grant
    Filed: January 8, 2014
    Date of Patent: October 7, 2014
    Assignee: Google Inc.
    Inventors: Gudmundur Hafsteinsson, Michael J. LeBeau, Natalia Marmasse, Sumit Agarwal, Dipchand Nishar
  • Patent number: 8856002
    Abstract: A universal pattern processing system receives input data and produces the output patterns that are best associated with that data. The system uses input means for receiving and processing input data, universal pattern decoder means for transforming models using the input data and associating output patterns with the original models that are changed least during the transformation, and output means for outputting the best associated patterns chosen by the pattern decoder means.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: October 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Dimitri Kanevsky, David Nahamoo, Tara N Sainath
  • Patent number: 8849662
    Abstract: A method and a system for segmenting phonemes from voice signals. In the method, a histogram showing a peak distribution corresponding to an order is formed using a high-order concept, and a boundary indicating the starting point and the ending point of each phoneme is determined by calculating a peak statistic based on the histogram. The phoneme segmentation method remarkably reduces the amount of calculation and has the advantage of being applicable to sound signal systems which perform sound coding, sound recognition, sound synthesis, sound reinforcement, etc.
    Type: Grant
    Filed: December 28, 2006
    Date of Patent: September 30, 2014
    Assignee: Samsung Electronics Co., Ltd
    Inventor: Hyun-Soo Kim
  • Patent number: 8849660
    Abstract: Systems and methods for training voice activation control of electronic equipment are disclosed. One example method includes receiving a selection corresponding to at least one command used to control the electronic equipment. The method further includes instructing a user to speak, and responsive to the instruction, receiving a digitized speech stream. The method further includes segmenting the speech stream into speech segments, storing at least one of the speech segments as an entry in a dictionary, and associating the dictionary entry with the selected command.
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: September 30, 2014
    Inventors: Arturo A. Rodriguez, David A. Sedacca, Albert Garcia
  • Patent number: 8843371
    Abstract: The instant application includes computationally-implemented systems and methods that include managing adaptation data, wherein the adaptation data is at least partly based on at least one speech interaction of a particular party; facilitating transmission of the adaptation data to a target device when there is an indication of a speech-facilitated transaction between the target device and the particular party, such that the adaptation data is to be applied to the target device to assist in execution of the speech-facilitated transaction; and facilitating acquisition of adaptation result data that is based on at least one aspect of the speech-facilitated transaction and is to be used in determining whether to modify the adaptation data. In addition to the foregoing, other aspects are described in the claims, drawings, and text.
    Type: Grant
    Filed: August 1, 2012
    Date of Patent: September 23, 2014
    Assignee: Elwha LLC
    Inventors: Royce A. Levien, Richard T. Lord, Robert W. Lord, Mark A. Malamud
  • Patent number: 8843370
    Abstract: Adjusting model parameters is described for a speech recognition system that combines recognition outputs from multiple speech recognition processes. Discriminative adjustments are made to model parameters of at least one acoustic model based on a joint discriminative criterion over multiple complementary acoustic models to lower recognition word error rate in the system.
    Type: Grant
    Filed: November 26, 2007
    Date of Patent: September 23, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel Willett, Chuang He
  • Publication number: 20140278413
    Abstract: An electronic device with one or more processors and memory includes a procedure for training a digital assistant. In some embodiments, the device detects an impasse in a dialogue between the digital assistant and a user including a speech input. During a learning session, the device utilizes a subsequent clarification input from the user to adjust intent inference or task execution associated with the speech input to produce a satisfactory response. In some embodiments, the device identifies a pattern of success or failure associated with an aspect previously used to complete a task and generates a hypothesis regarding a parameter used in speech recognition, intent inference or task execution as a cause for the pattern. Then, the device tests the hypothesis by altering the parameter for a subsequent completion of the task and adopts or rejects the hypothesis based on feedback information collected from the subsequent completion.
    Type: Application
    Filed: March 14, 2014
    Publication date: September 18, 2014
    Applicant: Apple Inc.
    Inventors: Donald W. PITSCHEL, Adam J. CHEYER, Christopher D. BRIGHAM, Thomas R. GRUBER
  • Publication number: 20140278395
    Abstract: A method and apparatus for determining a motion environment profile to adapt voice recognition processing includes a device receiving an acoustic signal including a speech signal, which is provided to a voice recognition module. The method also includes determining a motion profile for the device, determining a temperature profile for the device, and determining a noise profile for the acoustic signal. The method further includes determining, from the motion, temperature, and noise profiles, a motion environment profile for the device and adapting voice recognition processing for the speech signal based on the motion environment profile.
    Type: Application
    Filed: July 31, 2013
    Publication date: September 18, 2014
    Applicant: Motorola Mobility LLC
    Inventors: Robert A. Zurek, Kevin J. Bastyr, Giles T. Davis, Plamen A. Ivanov, Adrian M. Schuster
  • Publication number: 20140257810
    Abstract: According to an embodiment, a pattern classifier device includes a decision unit, an execution unit, a calculator, and a determination unit. The decision unit is configured to decide a subclass to which the input pattern is to belong, based on attribute information of the input pattern. The execution unit is configured to determine whether the input pattern belongs to a class that is divided into subclasses, using a weak classifier allocated to the decided subclass, and output a result of the determination and a reliability of the weak classifier. The calculator is configured to calculate an integrated value obtained by integrating an evaluation value based on the determination result and the reliability. The determination unit is configured to repeat the determination processing when a termination condition of the determination processing is not satisfied, and to terminate the determination processing and output the integrated value when the termination condition has been satisfied.
    Type: Application
    Filed: January 24, 2014
    Publication date: September 11, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Hiroshi Fujimura, Takashi Masuko
  • Patent number: 8831763
    Abstract: System and methods for intelligently pruning interest points are disclosed herein. The systems include generating a plurality of distorted audio samples and associated distorted interest points based upon a clean audio sample. Interest points that are common to the sets of distorted interest points are retained, while interest points not robust to distortion are discarded. The disclosed systems and methods can therefore provide a scalable audio matching solution by eliminating interest points in reference sample fingerprints. The pruned set of interest points is robust to distortion, preserving both scalability and accuracy.
    Type: Grant
    Filed: October 18, 2011
    Date of Patent: September 9, 2014
    Assignee: Google Inc.
    Inventors: Matthew Sharifi, Gheorghe Postelnicu, George Tzanetakis, Dominik Roblek
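A minimal sketch of the pruning rule in 8831763: an interest point from the clean sample is kept only if it reappears, within a tolerance, in enough of the distorted versions. The tolerance, hit count, and point values are assumptions:

```python
def prune_interest_points(clean_points, distorted_sets, tol=0.01, min_hits=2):
    """Keep only the clean-sample interest points that reappear (within
    `tol`) in at least `min_hits` of the distorted versions; points that
    do not survive distortion are discarded."""
    def present(point, point_set):
        return any(abs(point - q) <= tol for q in point_set)

    return [p for p in clean_points
            if sum(present(p, s) for s in distorted_sets) >= min_hits]

# Hypothetical interest points (e.g., time offsets in seconds).
clean = [0.50, 1.20, 2.75, 3.10]
distorted = [[0.505, 1.195, 3.11],   # e.g., low-bitrate encode
             [0.498, 2.40, 3.09],    # e.g., added noise
             [0.51, 1.21, 3.105]]    # e.g., loudspeaker playback
print(prune_interest_points(clean, distorted))  # [0.5, 1.2, 3.1]
```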
  • Patent number: 8831940
    Abstract: A dictation system that allows using trainable code phrases is provided. The dictation system operates by receiving audio and recognizing the audio as text. The text/audio may contain code phrases that are identified by a comparator that matches the text/audio and replaces the code phrase with a standard clause that is associated with the code phrase. The database or memory containing the code phrases is loaded with matched standard clauses that may be identified to provide a hierarchical system such that certain code phrases may have multiple meanings depending on the user.
    Type: Grant
    Filed: March 21, 2011
    Date of Patent: September 9, 2014
    Assignee: NVOQ Incorporated
    Inventors: Charles Corfield, Brian Marquette, David Mondragon, Rebecca Heins
  • Patent number: 8831957
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speech recognition using models that are based on where, within a building, a speaker makes an utterance are disclosed. The methods, systems, and apparatus include actions of receiving data corresponding to an utterance, and obtaining location indicia for an area within a building where the utterance was spoken. Further actions include selecting one or more models for speech recognition based on the location indicia, wherein each of the selected one or more models is associated with a weight based on the location indicia. Additionally, the actions include generating a composite model using the selected one or more models and the respective weights of the selected one or more models. And the actions also include generating a transcription of the utterance using the composite model.
    Type: Grant
    Filed: October 15, 2012
    Date of Patent: September 9, 2014
    Assignee: Google Inc.
    Inventors: Gabriel Taubman, Brian Strope
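A sketch of the composite-model step of 8831957, combining per-area model scores with weights derived from the location indicia; the per-area models and weights are hypothetical:

```python
def composite_score(hypothesis, models, weights):
    """Score a transcription hypothesis with a composite model built as a
    location-weighted combination of per-area models."""
    total = sum(weights.values())
    return sum((w / total) * models[area](hypothesis)
               for area, w in weights.items())

# Hypothetical per-area models (e.g., trained on kitchen vs. living-room
# vocabulary) and location indicia expressed as weights.
models = {"kitchen":     lambda hyp: 0.8 if "timer" in hyp else 0.1,
          "living_room": lambda hyp: 0.6 if "channel" in hyp else 0.2}
weights = {"kitchen": 0.9, "living_room": 0.1}  # utterance heard near kitchen

print(composite_score("set a timer", models, weights))         # 0.74
print(composite_score("change the channel", models, weights))  # 0.15
```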
  • Patent number: 8825481
    Abstract: Techniques are described for training a speech recognition model for accented speech. A subword parse table is employed that models mispronunciations at multiple subword levels, such as the syllable, position-specific cluster, and/or phone levels. Mispronunciation probability data is then generated at each level based on inputted training data, such as phone-level annotated transcripts of accented speech. Data from different levels of the subword parse table may then be combined to determine the accented speech model. Mispronunciation probability data at each subword level is based at least in part on context at that level. In some embodiments, phone-level annotated transcripts are generated using a semi-supervised method.
    Type: Grant
    Filed: January 20, 2012
    Date of Patent: September 2, 2014
    Assignee: Microsoft Corporation
    Inventors: Albert Joseph Kishan Thambiratnam, Timo Pascal Mertens, Frank Torsten Bernd Seide
  • Patent number: 8825478
    Abstract: Audio content is converted to text using speech recognition software. The text is then associated with a distinct voice or a generic placeholder label if no distinction can be made. From the text and voice information, a word cloud is generated based on key words and key speakers. A visualization of the cloud displays as it is being created. Words grow in size in relation to their dominance. When it is determined that the predominant words or speakers have changed, the word cloud is complete. That word cloud continues to be displayed statically and a new word cloud display begins based upon a new set of predominant words or a new predominant speaker or set of speakers. This process may continue until the meeting is concluded. At the end of the meeting, the completed visualization may be saved to a storage device, sent to selected individuals, removed, or any combination of the preceding.
    Type: Grant
    Filed: January 10, 2011
    Date of Patent: September 2, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Susan Marie Cox, Janani Janakiraman, Fang Lu, Loulwa F Salem
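A toy version of the sizing rule behind the word cloud in 8825478: each key word's display size grows with its dominance in the transcript so far. The stopword list and point sizes are assumptions:

```python
from collections import Counter

def word_sizes(transcript_words, stopwords, min_pt=10, max_pt=48):
    """Map each key word to a font size proportional to its dominance
    in the transcript so far."""
    counts = Counter(w for w in transcript_words if w not in stopwords)
    if not counts:
        return {}
    top = counts.most_common(1)[0][1]
    return {w: min_pt + (max_pt - min_pt) * c / top
            for w, c in counts.items()}

stop = {"the", "a", "we", "to", "and", "of"}
words = "we need to ship the release and the release notes".split()
for word, size in sorted(word_sizes(words, stop).items()):
    print(f"{word}: {size:.0f}pt")  # 'release' dominates at 48pt
```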
  • Publication number: 20140244254
    Abstract: A development system is described for facilitating the development of a spoken natural language (SNL) interface. The development system receives seed templates from a developer, each of which provides a command phrasing that can be used to invoke a function, when spoken by an end user. The development system then uses one or more development resources, such as a crowdsourcing system and a paraphrasing system, to provide additional templates. This yields an extended set of templates. A generation system then generates one or more models based on the extended set of templates. A user device may install the model(s) for use in interpreting commands spoken by an end user. When the user device recognizes a command, it may automatically invoke a function associated with that command. Overall, the development system provides an easy-to-use tool for producing an SNL interface.
    Type: Application
    Filed: February 25, 2013
    Publication date: August 28, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Yun-Cheng Ju, Matthai Philipose, Seungyeop Han
  • Patent number: 8818808
    Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
    Type: Grant
    Filed: February 23, 2005
    Date of Patent: August 26, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
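The "intelligently selected" utterances of 8818808 might, for example, be those with the lowest recognizer confidence, since their transcriptions are the most informative to correct manually; the confidence-based criterion and the data below are assumptions:

```python
def select_for_transcription(utterances, k):
    """Pick the k utterances the recognizer is least confident about;
    these are the most informative to transcribe manually."""
    return sorted(utterances, key=lambda u: u["confidence"])[:k]

# Hypothetical automatically transcribed utterances with ASR confidences.
pool = [{"id": 1, "confidence": 0.91}, {"id": 2, "confidence": 0.42},
        {"id": 3, "confidence": 0.77}, {"id": 4, "confidence": 0.35}]
print([u["id"] for u in select_for_transcription(pool, 2)])  # [4, 2]
```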
  • Patent number: 8818809
    Abstract: Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability.
    Type: Grant
    Filed: June 20, 2013
    Date of Patent: August 26, 2014
    Assignee: Google Inc.
    Inventors: Craig L. Reding, Suzi Levas
  • Patent number: 8817952
    Abstract: Methods, apparatus, and systems are provided such that a Public Safety Answering Point (PSAP) may utilize a new model to handle Open Line emergency calls, including audio optimization, automation, analysis, and presentation. Embodiments of the present disclosure assist with the difficult task of identifying background noise while trying to listen and talk to a caller, and give the best possible audio from the caller to the emergency call-taker or dispatcher. More particularly, an audio stream is split into at least two instances, with a first instance being optimized for speech intelligibility and provided to a call-taker or dispatcher and a second instance being provided for background sound analysis. Accordingly, the new PSAP Open Line model may allow for significantly more efficient emergency assessment, location, and management of resources.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: August 26, 2014
    Assignee: Avaya Inc.
    Inventors: Jon Bentley, Mark Fletcher, Joseph L. Hall, Avram Levi, Paul Roller Michaelis, Heinz Teutsch
  • Patent number: 8812315
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: October 1, 2013
    Date of Patent: August 19, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Publication number: 20140229178
    Abstract: A method for real-time data-pattern analysis. The method includes receiving and queuing at least one data-pattern analysis request by a data-pattern analysis unit controller. At least one data stream portion is also received and stored by the data-pattern analysis unit controller, each data stream portion corresponding to a received data-pattern analysis request. Next, a received data-pattern analysis request is selected by the data-pattern analysis unit controller along with a corresponding data stream portion. A data-pattern analysis is performed based on the selected data-pattern analysis request and the corresponding data stream portion, wherein the data-pattern analysis is performed by one of a plurality of data-pattern analysis units.
    Type: Application
    Filed: April 22, 2014
    Publication date: August 14, 2014
    Applicant: Spansion LLC
    Inventors: Richard FASTOW, Qamrul Hasan
  • Patent number: 8805684
    Abstract: Automatic speech recognition (ASR) may be performed on received utterances. The ASR may be performed by an ASR module of a computing device (e.g., a client device). The ASR may include: generating feature vectors based on the utterances, updating the feature vectors based on feature-space speaker adaptation parameters, transcribing the utterances to text strings, and updating the feature-space speaker adaptation parameters based on the feature vectors. The transcriptions may be based, at least in part, on an acoustic model and the updated feature vectors. Updated speaker adaptation parameters may be received from another computing device and incorporated into the ASR module.
    Type: Grant
    Filed: October 17, 2012
    Date of Patent: August 12, 2014
    Assignee: Google Inc.
    Inventors: Petar Aleksic, Xin Lei
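A minimal sketch of the feature-space adaptation step in 8805684: each feature vector is passed through an affine (fMLLR-style) transform before decoding. The transform values here are placeholders:

```python
import numpy as np

def adapt_features(frames, A, b):
    """Apply a feature-space speaker adaptation transform (an fMLLR-style
    affine map x' = A @ x + b) to each feature vector before decoding."""
    return frames @ A.T + b

rng = np.random.default_rng(0)
dim = 13                              # e.g., 13 MFCCs per frame
frames = rng.normal(size=(100, dim))  # hypothetical feature vectors
A = np.eye(dim) * 0.95                # near-identity transform
b = np.full(dim, 0.1)                 # small bias term

adapted = adapt_features(frames, A, b)
print(adapted.shape)  # (100, 13)
```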
  • Patent number: 8805685
    Abstract: Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: August 5, 2013
    Date of Patent: August 12, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst J. Schroeter
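A toy version of the variance test in 8805685: repeated recordings of the same phrase should differ naturally, so near-identical samples are rejected as likely synthetic or replayed. The feature representation and threshold are illustrative assumptions:

```python
import numpy as np

def verify(samples, min_variance=1e-3):
    """Accept only if repeated recordings of the same phrase show natural
    variation; identical or near-identical samples suggest a replayed or
    synthesized voice."""
    samples = np.asarray(samples, dtype=float)
    # Mean squared deviation of each sample from the per-dimension mean.
    variance = np.mean((samples - samples.mean(axis=0)) ** 2)
    return variance >= min_variance

live   = [[1.0, 2.1, 0.9], [1.1, 1.9, 1.0], [0.9, 2.0, 1.1]]
replay = [[1.0, 2.0, 1.0], [1.0, 2.0, 1.0], [1.0, 2.0, 1.0]]
print(verify(live))    # True: natural variation across attempts
print(verify(replay))  # False: suspiciously identical samples
```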
  • Patent number: 8805560
    Abstract: Systems and methods for noise based interest point density pruning are disclosed herein. The systems include determining an amount of noise in an audio sample and adjusting the amount of interest points within an audio sample fingerprint based on the amount of noise. Samples containing high amounts of noise correspondingly generate fingerprints with more interest points. The disclosed systems and methods allow reference fingerprints to be reduced in size while increasing the size of sample fingerprints. The benefits in scalability do not compromise the accuracy of an audio matching system using noise based interest point density pruning.
    Type: Grant
    Filed: October 18, 2011
    Date of Patent: August 12, 2014
    Assignee: Google Inc.
    Inventors: George Tzanetakis, Dominik Roblek, Matthew Sharifi
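A sketch of the density rule in 8805560: the noisier the query audio, the more interest points its fingerprint keeps, so that enough points survive to match a sparsified reference fingerprint. The linear ramp and point counts are assumptions:

```python
def density_for_sample(noise_level, base_points=50, max_points=200):
    """Choose how many interest points to keep in a query fingerprint:
    noisier audio keeps more points, so enough survive to match a
    reduced-size reference fingerprint."""
    noise_level = min(max(noise_level, 0.0), 1.0)  # clamp to [0, 1]
    return round(base_points + (max_points - base_points) * noise_level)

for noise in (0.0, 0.4, 1.0):
    print(f"noise={noise:.1f} -> keep {density_for_sample(noise)} interest points")
```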
  • Publication number: 20140222426
    Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
    Type: Application
    Filed: April 7, 2014
    Publication date: August 7, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
  • Publication number: 20140222425
    Abstract: Provided are a speech recognition learning method using 3D geometric information and a speech recognition method using 3D geometric information. The learning method performs learning with 3D geometric information, or information derived from it, to generate a recognizer; the speech recognition method then performs speech recognition by applying to that recognizer 3D geometric information on a physical object correlated to or dependent on voice, or information derived from the 3D geometric information.
    Type: Application
    Filed: February 7, 2014
    Publication date: August 7, 2014
    Applicant: SOGANG UNIVERSITY RESEARCH FOUNDATION
    Inventors: Hyung-Min PARK, Changsoo JE, Bi Ho KIM, Min Wook KIM
  • Patent number: 8798990
    Abstract: Disclosed herein are systems and methods to incorporate human knowledge when developing and using statistical models for natural language understanding. The disclosed systems and methods embrace a data-driven approach to natural language understanding which progresses seamlessly along the continuum of availability of annotated collected data, from when there is no available annotated collected data to when there is any amount of annotated collected data.
    Type: Grant
    Filed: April 30, 2013
    Date of Patent: August 5, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas Bangalore, Mazin Gilbert, Narendra K. Gupta
  • Patent number: 8798994
    Abstract: The present invention discloses a solution for conserving computing resources when implementing transformation based adaptation techniques. The disclosed solution limits the amount of speech data used by real-time adaptation algorithms to compute a transformation, which results in substantial computational savings. Appreciably, application of a transform is a relatively low memory and computationally cheap process compared to memory and resource requirements for computing the transform to be applied.
    Type: Grant
    Filed: February 6, 2008
    Date of Patent: August 5, 2014
    Assignee: International Business Machines Corporation
    Inventors: John W. Eckhart, Michael Florio, Radek Hampl, Pavel Krbec, Jonathan Palgon
  • Publication number: 20140214422
    Abstract: The application provides a method and system for determinism in non-linear systems for speech processing, particularly automatic speech segmentation for building speech recognition systems. More particularly, the application enables a method and system for detecting boundaries of coarticulated units from isolated speech using a recurrence plot.
    Type: Application
    Filed: July 18, 2012
    Publication date: July 31, 2014
    Applicant: Tata Consultancy Services Limited
    Inventors: Mohd Bilal Arif Syed, Arijit Sinharay, Tanushyam Chattopadhyay