Probability Patents (Class 704/240)
  • Patent number: 9607624
    Abstract: A system for encoding and applying Dynamic Range Control/Compression (DRC) gain values to a piece of sound program content is described. In particular, a set of DRC gain values representing a DRC gain curve for the piece of content may be divided into frames corresponding to frames of the piece of content. A set of additional fields may be included with an audio signal representing the piece of content. The additional fields may represent the DRC gain values using linear or spline interpolation. The additional fields may include 1) an initial gain value for each DRC frame, 2) a set of slope values at particular points in the DRC curve, 3) a set of time delta values for each consecutive pair of slope values, and/or 4) one or more gain delta values representing changes of DRC gain values in the DRC gain curve between points of the slope values.
    Type: Grant
    Filed: March 26, 2014
    Date of Patent: March 28, 2017
    Assignee: Apple Inc.
    Inventor: Frank M. Baumgarte
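The linear-interpolation case in the abstract above can be sketched in a few lines: given an initial gain, per-segment slopes, and the time deltas between slope points, the per-sample gain curve follows directly. All names and the per-sample slope convention are illustrative, not taken from the patent.

```python
def reconstruct_drc_gains(initial_gain, slopes, time_deltas):
    """Reconstruct per-sample DRC gain values for one frame by linear
    interpolation: each segment advances `dt` samples at slope `m`.

    initial_gain: gain (dB) at the start of the frame
    slopes:       slope (dB per sample) for each segment
    time_deltas:  number of samples in each segment
    """
    gains = [initial_gain]
    g = initial_gain
    for m, dt in zip(slopes, time_deltas):
        for _ in range(dt):
            g += m
            gains.append(g)
    return gains
```

A spline variant would replace the inner loop with a cubic evaluated from the slope values at both segment endpoints.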
  • Patent number: 9589563
    Abstract: A method for speech recognition of partial proper names is described which includes natural language processing (NLP), partial name candidate generation, speech recognition and post processing. Natural language processing techniques including shallow and deep parsing are applied to long proper names to identify syntactic units (for example, noun phrases). The syntactic units form a basis for generating a candidate list of partial names for each original full name. A partial name is part of the original name, with some words omitted, or word order changed, or even word substitution. After candidate partial names are generated, their phonetic transcriptions are incorporated into a model for a speech recognizer to recognize the partial names in a speech recognition system.
    Type: Grant
    Filed: June 2, 2015
    Date of Patent: March 7, 2017
    Assignee: Robert Bosch GmbH
    Inventors: Lin Zhao, Zhe Feng, Kui Xu, Fuliang Weng
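The candidate-generation step described above (a partial name is the original name with some words omitted) can be sketched as an order-preserving subset enumeration. This ignores the NLP parsing, word reordering, and substitution the abstract also mentions; the function name is illustrative.

```python
from itertools import combinations

def partial_name_candidates(full_name, min_words=1):
    """Generate candidate partial names by omitting words from the full
    name while keeping the remaining words in their original order."""
    words = full_name.split()
    candidates = set()
    for k in range(min_words, len(words)):        # proper subsets only
        for idx in combinations(range(len(words)), k):
            candidates.add(" ".join(words[i] for i in idx))
    return sorted(candidates)
```

In the patented system, the candidate list would be filtered to syntactic units (e.g., noun phrases) before phonetic transcription.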
  • Patent number: 9576239
    Abstract: A computer-implemented system and method for identifying tasks using temporal footprints is provided. A database of temporal footprints is maintained. Each temporal footprint is representative of a different task and includes one or more significant patterns of two or more sequential events. Events performed by one or more users are tracked. At least one pattern including sequential occurrences of two or more of the tracked events is identified. The identified pattern is compared to each of the significant patterns of the temporal footprints. A footprint score for the identified pattern is determined with respect to each temporal footprint. The task associated with the temporal footprint having the highest footprint score is assigned to the identified pattern.
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: February 21, 2017
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Oliver Brdiczka, James (Bo) M.A. Begole
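The matching-and-assignment step above can be sketched by treating each footprint as a set of significant patterns (ordered event tuples) and scoring a tracked event sequence by the fraction of patterns it contains as a subsequence. The scoring rule is an assumption for illustration; the patent does not specify the footprint-score formula.

```python
def footprint_score(events, footprint):
    """Score how well a tracked event sequence matches a temporal
    footprint, i.e. a list of significant patterns (ordered event
    tuples that must occur in sequence, not necessarily adjacently)."""
    def matches(pattern, seq):
        it = iter(seq)
        return all(e in it for e in pattern)  # classic subsequence test
    hits = sum(matches(p, events) for p in footprint)
    return hits / len(footprint)

def assign_task(events, footprints):
    """Assign the task whose footprint has the highest footprint score."""
    return max(footprints, key=lambda t: footprint_score(events, footprints[t]))
```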
  • Patent number: 9576572
    Abstract: Methods and nodes for enabling and producing input generated by speech of a user, to an application. When the application has been activated (2:1), an application node (200) detects (2:2) a current context of the user and selects (2:3), from a set of predefined contexts (204a), a predefined context that matches the detected current context. The application node (200) then provides (2:4) keywords associated with the selected predefined context to a speech recognition node (202). When receiving (2:5) speech from the user, the speech recognition node (202) is able to recognize (2:6) any of the keywords in the speech. The recognized keyword is then used (2:7) as input to the application.
    Type: Grant
    Filed: June 18, 2012
    Date of Patent: February 21, 2017
    Assignee: Telefonaktiebolaget LM Ericsson (Publ)
    Inventors: Jari Arkko, Jouni Mäenpää, Tomas Mecklin
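The keyword-scoping flow above can be sketched with two small functions: one selecting a predefined context, one spotting a context keyword in recognized speech. The context names and keyword sets are hypothetical, and real context matching would be richer than name equality.

```python
# Hypothetical predefined contexts and their associated keywords.
PREDEFINED_CONTEXTS = {
    "driving": {"navigate", "traffic", "fuel"},
    "cooking": {"timer", "recipe", "convert"},
}

def select_context(detected):
    """Pick the predefined context matching the detected current
    context (here, by simple name equality)."""
    return detected if detected in PREDEFINED_CONTEXTS else None

def recognize_keyword(speech_words, context):
    """Return the first context keyword found in the recognized
    speech, to be used as input to the application."""
    keywords = PREDEFINED_CONTEXTS[context]
    for w in speech_words:
        if w in keywords:
            return w
    return None
```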
  • Patent number: 9552352
    Abstract: Technologies pertaining to retrieval of contextually relevant attribute values for an automatically identified named entity in a document are described herein. Named entity recognition technologies are employed to identify named entities in the text of a document. Context corresponding to an identified named entity is analyzed to probabilistically assign a class to the named entity. Attributes that are most relevant to the class are determined, and attribute values for such attributes are retrieved. The attribute values are presented in correlation with the named entity in the document responsive to user-selection of the named entity in the document.
    Type: Grant
    Filed: November 10, 2011
    Date of Patent: January 24, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Evelyne Viegas, Eric Anthony Rozell
  • Patent number: 9507852
    Abstract: A computer-implemented method can include receiving a speech input representing a question, converting the speech input to a string of characters, and obtaining tokens each representing a potential word. The method can include determining one or more part-of-speech (POS) tags for each token and determining sequences of the POS tags for the tokens, each sequence of the POS tags including one POS tag per token. The method can include determining one or more parses for each sequence of the POS tags for the tokens and determining a most-likely parse and its corresponding sequence of the POS tags for the tokens to obtain a selected parse and a selected sequence of the POS tags for the tokens. The method can also include determining a most-likely answer to the question using the selected parse and the selected sequence of the POS tags for the tokens and outputting the most-likely answer.
    Type: Grant
    Filed: December 10, 2013
    Date of Patent: November 29, 2016
    Assignee: Google Inc.
    Inventors: Slav Petrov, Alexander Rush
  • Patent number: 9501466
    Abstract: A system for identifying address components includes a training address interface, a training address probability processor, a parsing address interface, and a processor. The training address interface is to receive training addresses. The training addresses are a set of components with corresponding identifiers. The training address probability processor is to determine probabilities of each component of the training addresses being associated with each identifier. The parsing address interface is to receive an address for parsing. The processor is to determine a matching model of a set of models based at least in part on a matching probability for each model for a tokenized address, which is based on the address for parsing, and associate each component of the tokenized address with an identifier based at least in part on the matching model.
    Type: Grant
    Filed: June 3, 2015
    Date of Patent: November 22, 2016
    Assignee: Workday, Inc.
    Inventors: Parag Avinash Namjoshi, Shuangshuang Jiang, Mohammad Sabah
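The training step above (probabilities of each component being associated with each identifier) and the labeling step can be sketched with simple relative-frequency estimates. The model-matching stage is omitted; each token is labeled independently here, which is a simplification of the patented method.

```python
from collections import Counter, defaultdict

def train(training_addresses):
    """Estimate P(identifier | component) from training addresses,
    each given as a list of (component, identifier) pairs."""
    counts = defaultdict(Counter)
    for address in training_addresses:
        for component, ident in address:
            counts[component][ident] += 1
    return {
        comp: {i: n / sum(c.values()) for i, n in c.items()}
        for comp, c in counts.items()
    }

def label(tokens, probs, default="unknown"):
    """Associate each component of a tokenized address with its most
    probable identifier."""
    return [
        (t, max(probs[t], key=probs[t].get)) if t in probs else (t, default)
        for t in tokens
    ]
```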
  • Patent number: 9448991
    Abstract: Context-based corrections of voice recognition results are provided by displaying text-based result from a speech-to-text conversion operation on a display screen of an electronic client device. One or more element categories associated with corresponding portions of the text-based result are identified. Graphical icons corresponding to the element categories are also displayed on the display in areas where the corresponding portions of the text-based result are also displayed. A user selection of one of the graphical icons is then detected, and an edit operation is enabled for the portion of the text-based result associated with the selected graphical icon. An updated version of the text-based results is then displayed on the display.
    Type: Grant
    Filed: March 18, 2014
    Date of Patent: September 20, 2016
    Assignee: Bayerische Motoren Werke Aktiengesellschaft
    Inventor: Philipp Suessenguth
  • Patent number: 9412365
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to enhanced maximum entropy models. In some implementations, data indicating a candidate transcription for an utterance and a particular context for the utterance are received. A maximum entropy language model is obtained. Feature values are determined for n-gram features and backoff features of the maximum entropy language model. The feature values are input to the maximum entropy language model, and an output is received from the maximum entropy language model. A transcription for the utterance is selected from among a plurality of candidate transcriptions based on the output from the maximum entropy language model. The selected transcription is provided to a client device.
    Type: Grant
    Filed: March 24, 2015
    Date of Patent: August 9, 2016
    Assignee: Google Inc.
    Inventors: Fadi Biadsy, Brian E. Roark
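The feature-value and scoring steps above can be sketched as a log-linear model over candidate next words, with n-gram features of every order plus a simple backoff feature that always fires. This is a simplified stand-in for the patent's enhanced backoff features; the feature encoding is an assumption.

```python
import math

def features(context, word):
    """N-gram features of every order for (context, word), plus a
    word-level backoff feature (a simplification of the patent's
    backoff features)."""
    feats = []
    for n in range(len(context), -1, -1):
        feats.append(("ngram", tuple(context[len(context) - n:]), word))
    feats.append(("backoff", word))
    return feats

def score(weights, context, candidate_words):
    """Maximum entropy (log-linear) distribution over candidates:
    P(w | context) is proportional to exp(sum of active feature weights)."""
    logits = {w: sum(weights.get(f, 0.0) for f in features(context, w))
              for w in candidate_words}
    z = sum(math.exp(v) for v in logits.values())
    return {w: math.exp(v) / z for w, v in logits.items()}
```

In transcription selection, each candidate transcription would be scored word by word under such a model and the highest-probability candidate returned to the client.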
  • Patent number: 9368109
    Abstract: Reliable speaker-based clustering of speech utterances allows improved speaker recognition and speaker-based speech segmentation. According to at least one example embodiment, an iterative bottom-up speaker-based clustering approach employs voiceprints of speech utterances, such as i-vectors. At each iteration, a clustering confidence score in terms of Silhouette Width Criterion (SWC) values is evaluated, and a pair of nearest clusters is merged into a single cluster. The pair of nearest clusters merged is determined based on a similarity score indicative of similarity between voiceprints associated with different clusters. A final clustering pattern is then determined as a set of clusters associated with an iteration corresponding to the highest clustering confidence score evaluated. The SWC used may further be a modified SWC enabling detection of an early stop of the iterative approach.
    Type: Grant
    Filed: May 31, 2013
    Date of Patent: June 14, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Daniele Ernesto Colibro, Claudio Vair, Kevin R. Farrell
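The iterative bottom-up procedure above can be sketched directly: start from singleton clusters of voiceprints, repeatedly merge the two most similar clusters, evaluate a Silhouette Width Criterion after each merge, and keep the best-scoring partition. Cosine distance and single-linkage merging are assumptions; the patent's modified SWC is not reproduced here.

```python
import math

def dist(a, b):
    """Cosine distance between two voiceprints (e.g., i-vectors)."""
    num = sum(x * y for x, y in zip(a, b))
    return 1.0 - num / (math.hypot(*a) * math.hypot(*b))

def silhouette(points, clusters):
    """Mean silhouette width over all points; singletons contribute 0."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for i in cluster:
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            a = sum(dist(points[i], points[j])
                    for j in cluster if j != i) / (len(cluster) - 1)
            b = min(sum(dist(points[i], points[j]) for j in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def cluster_speakers(points):
    """Bottom-up clustering keeping the partition with the best score."""
    clusters = [[i] for i in range(len(points))]
    best, best_score = [list(c) for c in clusters], 0.0
    while len(clusters) > 2:
        # merge the pair of clusters with the smallest single-linkage distance
        x, y = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: min(dist(points[u], points[v])
                                     for u in clusters[p[0]]
                                     for v in clusters[p[1]]))
        clusters[x] += clusters.pop(y)
        s = silhouette(points, clusters)
        if s > best_score:
            best, best_score = [list(c) for c in clusters], s
    return best
```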
  • Patent number: 9355636
    Abstract: Features are provided for selectively scoring portions of user utterances based at least on articulatory features of the portions. One or more articulatory features of a portion of a user utterance can be determined. Acoustic models or subsets of individual acoustic model components (e.g., Gaussians or Gaussian mixture models) can be selected based on the articulatory features of the portion. The portion can then be scored using a selected acoustic model or subset of acoustic model components. The process may be repeated for the multiple portions of the utterance, and speech recognition results can be generated from the scored portions.
    Type: Grant
    Filed: September 16, 2013
    Date of Patent: May 31, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Jeffrey Cornelius O'Neill, Jeffrey Paul Lilly, Thomas Schaaf
  • Patent number: 9348915
    Abstract: Content items and other entities may be ranked or organized according to a relevance to a user. Relevance may take into consideration recency, proximity, popularity, air time (e.g., of television shows) and the like. In one example, the popularity and age of a movie may be used to determine a relevance ranking. Popularity (i.e., entity rank) may be determined based on a variety of factors. In the movie example, popularity may be based on gross earnings, awards, nominations, votes and the like. According to one or more embodiments, entities may initially be categorized into relevance groupings based on popularity and/or other factors. Once categorized, the entities may be sorted within each grouping and later combined into a single ranked list.
    Type: Grant
    Filed: May 4, 2012
    Date of Patent: May 24, 2016
    Assignee: Comcast Interactive Media, LLC
    Inventors: Ken Iwasa, Seth Michael Murray, Goldee Udani
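The group-then-sort-then-combine scheme above can be sketched with a toy relevance function over popularity and age. The exponential decay, the bucket width, and all parameter names are illustrative assumptions, not values from the patent.

```python
import math

def relevance(popularity, age_years, decay=0.2):
    """Toy relevance: more popular and more recent ranks higher.
    `decay` (illustrative) controls how fast relevance falls with age."""
    return popularity * math.exp(-decay * age_years)

def rank(movies):
    """Bucket entities into relevance groupings by popularity, sort
    within each grouping, then combine into a single ranked list.
    Each movie is a (title, popularity, age_years) tuple."""
    groups = {}
    for title, pop, age in movies:
        groups.setdefault(pop // 50, []).append((title, pop, age))
    ranked = []
    for bucket in sorted(groups, reverse=True):
        ranked += sorted(groups[bucket],
                         key=lambda m: relevance(m[1], m[2]), reverse=True)
    return [title for title, _, _ in ranked]
```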
  • Patent number: 9349372
    Abstract: A speaker identification system (100) includes a microphone (2) which acquires speech information of a speaker; a sex/age range information acquisition unit (7) which acquires age range information relating to a range of the age of the speaker, based on the speech information; a specific age information acquisition unit (8) which acquires specific age information relating to the specific age of the speaker, based on the speech information; a date and time information acquisition unit (9) which acquires date and time information representing the date and time when the speech information has been acquired; and a speaker database (4) which accumulates the specific age information and the date and time information in association with each other.
    Type: Grant
    Filed: July 8, 2014
    Date of Patent: May 24, 2016
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventors: Kazue Fusakawa, Tomomi Matsuoka, Masako Ikeda
  • Patent number: 9348912
    Abstract: Embodiments are configured to provide information based on a user query. In an embodiment, a system includes a search component having a ranking component that can be used to rank search results as part of a query response. In one embodiment, the ranking component includes a ranking algorithm that can use the length of documents returned in response to a search query to rank search results.
    Type: Grant
    Filed: September 10, 2008
    Date of Patent: May 24, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vladimir Tankovich, Dmitriy Meyerzon, Michael James Taylor
  • Patent number: 9336771
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for using non-parametric models in speech recognition. In some implementations, speech data is accessed. The speech data represents utterances of a particular phonetic unit occurring in a particular phonetic context, and the speech data includes values for multiple dimensions. Boundaries are determined for a set of quantiles for each of the multiple dimensions. Models for the distribution of values within the quantiles are generated. A multidimensional probability function is generated. Data indicating the boundaries of the quantiles, the models for the distribution of values in the quantiles, and the multidimensional probability function are stored.
    Type: Grant
    Filed: May 16, 2013
    Date of Patent: May 10, 2016
    Assignee: Google Inc.
    Inventor: Ciprian I. Chelba
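The quantile-boundary step above can be sketched with equal-occupancy cut points computed independently for every dimension of the speech data. The per-quantile distribution models and the multidimensional probability function are omitted; function names are illustrative.

```python
def quantile_boundaries(values, num_quantiles=4):
    """Boundaries splitting one dimension into equal-occupancy
    quantiles (num_quantiles - 1 cut points)."""
    ordered = sorted(values)
    step = len(ordered) / num_quantiles
    return [ordered[int(step * k)] for k in range(1, num_quantiles)]

def per_dimension_boundaries(vectors, num_quantiles=4):
    """Apply the quantile split independently to every dimension of
    the multi-dimensional speech data."""
    return [quantile_boundaries(dim, num_quantiles) for dim in zip(*vectors)]
```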
  • Patent number: 9324323
    Abstract: Speech recognition techniques may include: receiving audio; identifying one or more topics associated with audio; identifying language models in a topic space that correspond to the one or more topics, where the language models are identified based on proximity of a representation of the audio to representations of other audio in the topic space; using the language models to generate recognition candidates for the audio, where the recognition candidates have scores associated therewith that are indicative of a likelihood of a recognition candidate matching the audio; and selecting a recognition candidate for the audio based on the scores.
    Type: Grant
    Filed: December 14, 2012
    Date of Patent: April 26, 2016
    Assignee: Google Inc.
    Inventors: Daniel M. Bikel, Kapil R. Thadini, Fernando Pereira, Maria Shugrina, Fadi Biadsy
  • Patent number: 9298836
    Abstract: A source system searches a provider system for one or more listings. The source system receives a plurality of potential matching listings. The source system designates a representative listing of the entity located on a provider system from among the plurality of potential matching listings. The source system designates one or more remaining potential matching listings of the plurality of potential matching listings as one or more duplicate listings. The source system transmits, to the provider system, a request to synchronize the representative listing as the only representative listing of the entity on the provider system, the request comprising a first provider-supplied external identifier of the representative listing.
    Type: Grant
    Filed: July 7, 2015
    Date of Patent: March 29, 2016
    Assignee: Yext, Inc.
    Inventors: Howard C. Lerman, Thomas C. Dixon, Kevin Caffrey, David C. Lin
  • Patent number: 9262538
    Abstract: A system for the support and management of document search is presented. The system includes a knowledge-database, a query interface, and a connection to a database of documents to be searched. Information generated during a search session is collected by the system and added to the knowledge-database, where it is ranked automatically according to how users make use of it. During successive search sessions, or during searches made by other users, the system uses the knowledge-database to support users with keywords, queries, and references to documents.
    Type: Grant
    Filed: June 19, 2015
    Date of Patent: February 16, 2016
    Inventor: Haim Zvi Melman
  • Patent number: 9224394
    Abstract: A system and method for implementing a server-based speech recognition system for multi-modal automated interaction in a vehicle includes a vehicle driver receiving audio prompts through an on-board human-to-machine interface and responding with speech to complete tasks such as creating and sending text messages, web browsing, navigation, etc. This service-oriented architecture is utilized to call upon specialized speech recognizers in an adaptive fashion. The human-to-machine interface enables completion of a text input task while driving a vehicle in a way that minimizes the frequency of the driver's visual and mechanical interactions with the interface, thereby eliminating unsafe distractions during driving conditions. After the initial prompting, the typing task is followed by a computerized verbalization of the text. Subsequent interface steps can be visual in nature, or involve only sound.
    Type: Grant
    Filed: March 23, 2010
    Date of Patent: December 29, 2015
    Assignee: Sirius XM Connected Vehicle Services Inc.
    Inventors: Thomas Barton Schalk, Leonel Saenz, Barry Burch
  • Patent number: 9224404
    Abstract: A communication system includes a front-end audio gateway or bridge and a hands-free device. An automatic speech recognition platform accessible to the hands-free device provides or makes available one or more preprocessing schemes and/or acoustic models to the front-end audio gateway or bridge. The preprocessing schemes or acoustic models can be identified by or provided before a connection is established between the front-end audio gateway and the automatic speech recognition platform, when a connection occurs between the front-end audio gateway and the automatic speech recognition platform, and/or during a speech recognition session.
    Type: Grant
    Filed: January 28, 2013
    Date of Patent: December 29, 2015
    Assignee: 2236008 Ontario Inc.
    Inventor: Anthony Andrew Poliak
  • Patent number: 9218809
    Abstract: A method and system for training a user authentication by voice signal are described. In one embodiment, a set of feature vectors are decomposed into speaker-specific recognition units. The speaker-specific recognition units are used to compute distribution values to train the voice signal. In addition, spectral feature vectors are decomposed into speaker-specific characteristic units which are compared to the speaker-specific distribution values. If the speaker-specific characteristic units are within a threshold limit of the speaker-specific distribution values, the speech signal is authenticated.
    Type: Grant
    Filed: January 9, 2014
    Date of Patent: December 22, 2015
    Assignee: Apple Inc.
    Inventors: Jerome R. Bellegarda, Kim E. A. Silverman
  • Patent number: 9191515
    Abstract: A mass-scale, user-independent, device-independent, voice messaging system that converts unstructured voice messages into text for display on a screen is disclosed. The system comprises (i) computer-implemented sub-systems and also (ii) a network connection to human operators providing transcription and quality control; the system being adapted to optimize the effectiveness of the human operators by further comprising three core sub-systems, namely (i) a pre-processing front end that determines an appropriate conversion strategy; (ii) one or more conversion resources; and (iii) a quality control sub-system.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: November 17, 2015
    Assignee: Nuance Communications, Inc.
    Inventor: Daniel Michael Doulton
  • Patent number: 9142211
    Abstract: A speech recognition apparatus 20 includes: an identification language model creation unit 21 that selects, from learning texts 27 for various fields for generating language models 26 for the fields, a phrase that includes a word whose appearance frequency satisfies a set condition on a field-by-field basis, and generates an identification language model 25 for identifying the field of speech using the selected phrases; a speech recognition unit 22 that executes speech recognition on the speech using the identification language model 25, and outputs text data and word confidences as a recognition result; and a field determination unit 23 that specifies a field that includes the most words whose confidences are greater than or equal to a set value based on the text data, the word confidences, and the words in the learning texts for the fields, and determines that the specified field is the field of the speech.
    Type: Grant
    Filed: February 13, 2013
    Date of Patent: September 22, 2015
    Assignee: NEC CORPORATION
    Inventor: Atsunori Sakai
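The two stages above, building per-field vocabularies from words whose appearance frequency satisfies a set condition, then picking the field containing the most confidently recognized words, can be sketched as follows. The frequency condition, confidence threshold, and data shapes are illustrative assumptions.

```python
from collections import Counter

def field_vocabularies(learning_texts, min_count=2):
    """Per-field sets of words whose appearance frequency in that
    field's learning text satisfies the set condition (min_count)."""
    return {field: {w for w, n in Counter(text.split()).items()
                    if n >= min_count}
            for field, text in learning_texts.items()}

def determine_field(recognized, confidences, vocabularies, threshold=0.5):
    """Pick the field containing the most recognized words whose word
    confidence is greater than or equal to the set value."""
    confident = [w for w, c in zip(recognized, confidences) if c >= threshold]
    return max(vocabularies,
               key=lambda f: sum(w in vocabularies[f] for w in confident))
```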
  • Patent number: 9117449
    Abstract: Techniques disclosed herein include systems and methods that enable a voice trigger that wakes-up an electronic device or causes the device to make additional voice commands active, without manual initiation of voice command functionality. In addition, such a voice trigger is dynamically programmable or customizable. A speaker can program or designate a particular phrase as the voice trigger. In general, techniques herein execute a voice-activated wake-up system that operates on a digital signal processor (DSP) or other low-power, secondary processing unit of an electronic device instead of running on a central processing unit (CPU). A speech recognition manager runs two speech recognition systems on an electronic device. The CPU dynamically creates a compact speech system for the DSP. Such a compact system can be continuously run during a standby mode, without quickly exhausting a battery supply.
    Type: Grant
    Filed: April 26, 2012
    Date of Patent: August 25, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Michael Jack Newman, Robert Roth, William D. Alexander, Paul van Mulbregt
  • Patent number: 9092480
    Abstract: A method and apparatus for performing extended search are provided. The method includes receiving user-inputted keywords; extending the user-inputted keywords according to geographical information to acquire extended keywords; performing a search by using the extended keywords; and returning search results to the user. With the present technical solutions, privilege control can be effectively performed in a cloud storage system. With the present embodiments, more information may be provided to a user for reference.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: July 28, 2015
    Assignee: International Business Machines Corporation
    Inventors: Keke Cai, Hong Lei Guo, Zhong Su, Hui Jia Zhu
  • Patent number: 9087515
    Abstract: A speech recognition apparatus is disclosed. The apparatus converts a speech signal into a digitalized speech data, and performs speech recognition based on the speech data. The apparatus makes a comparison between the speech data inputted the last time and the speech data inputted the time before the last time in response to a user's indication that the speech recognition results in erroneous recognition multiple times in a row. When the speech data inputted the last time is determined to substantially match the speech data inputted the time before the last time, the apparatus outputs a guidance prompting the user to utter an input target by calling it by another name.
    Type: Grant
    Filed: October 13, 2011
    Date of Patent: July 21, 2015
    Assignee: DENSO CORPORATION
    Inventor: Takahiro Tsuda
  • Patent number: 9031840
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving (i) audio data that encodes a spoken natural language query, and (ii) environmental audio data, obtaining a transcription of the spoken natural language query, determining a particular content type associated with one or more keywords in the transcription, providing at least a portion of the environmental audio data to a content recognition engine, and identifying a content item that has been output by the content recognition engine, and that matches the particular content type.
    Type: Grant
    Filed: December 27, 2013
    Date of Patent: May 12, 2015
    Assignee: Google Inc.
    Inventors: Matthew Sharifi, Gheorghe Postelnicu
  • Patent number: 9031841
    Abstract: An apparatus includes: a storage unit to store a model representing a relationship between a relative time and an occurrence probability; a first period detection unit to detect a first speech period of a first speaker; a second period detection unit to detect a second speech period of a second speaker; a unit to calculate a feature value of the first speech period; a detection unit to detect a word using the calculated feature value; an adjustment unit that, when the detection unit detects a word for a reply, retrieves the occurrence probability corresponding to the relative position of the reply in the second speech period and adjusts a word score or a detection threshold value for the reply; and a second detection unit to re-detect, using the adjusted word score or the adjusted detection threshold value, the word detected by the detection unit.
    Type: Grant
    Filed: December 12, 2012
    Date of Patent: May 12, 2015
    Assignee: Fujitsu Limited
    Inventor: Nobuyuki Washio
  • Patent number: 9026446
    Abstract: An adaptive workflow system can be used to implement captioning projects, such as projects for creating captions or subtitles for live and non-live broadcasts. Workers can repeat words spoken during a broadcast program or other program into a voice recognition system, which outputs text that may be used as captions or subtitles. The process of workers repeating these words to create such text can be referred to as respeaking. Respeaking can be used as an effective alternative to more expensive and hard-to-find stenographers for generating captions and subtitles.
    Type: Grant
    Filed: June 10, 2011
    Date of Patent: May 5, 2015
    Inventor: Morgan Fiumi
  • Patent number: 9020818
    Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 28, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
  • Patent number: 9020816
    Abstract: A method, system and apparatus are shown for identifying non-language speech sounds in a speech or audio signal. An audio signal is segmented and feature vectors are extracted from the segments of the audio signal. The segment is classified using a hidden Markov model (HMM) that has been trained on sequences of these feature vectors. Post-processing components can be utilized to enhance classification. An embodiment is described in which the hidden Markov model is used to classify a segment as a language speech sound or one of a variety of non-language speech sounds. Another embodiment is described in which the hidden Markov model is trained using discriminative learning.
    Type: Grant
    Filed: August 13, 2009
    Date of Patent: April 28, 2015
    Assignee: 21CT, Inc.
    Inventor: Matthew McClain
  • Patent number: 9020820
    Abstract: A state detecting apparatus includes: a processor to execute acquiring utterance data related to uttered speech, computing a plurality of statistical quantities for feature parameters regarding features of the utterance data, creating, on the basis of the plurality of statistical quantities regarding the utterance data and another plurality of statistical quantities regarding reference utterance data based on other uttered speech, pseudo-utterance data having at least one statistical quantity equal to a statistical quantity in the other plurality of statistical quantities, computing a plurality of statistical quantities for synthetic utterance data synthesized on the basis of the pseudo-utterance data and the utterance data, and determining, on the basis of a comparison between statistical quantities of the synthetic utterance data and statistical quantities of the reference utterance data, whether the speaker who produced the uttered speech is in a first state or a second state; and a memory.
    Type: Grant
    Filed: April 13, 2012
    Date of Patent: April 28, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
  • Patent number: 9015044
    Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 21, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
  • Publication number: 20150100316
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for advanced turn-taking in an interactive spoken dialog system. A system configured according to this disclosure can incrementally process speech prior to completion of the speech utterance, and can communicate partial speech recognition results upon finding particular conditions. A first condition which, if found, allows the system to communicate partial speech recognition results, is that the most recent word found in the partial results is statistically likely to be the termination of the utterance, also known as a terminal node. A second condition is the determination that all search paths within a speech lattice converge to a common node, also known as a pinch node, before branching out again. Upon finding either condition, the system can communicate the partial speech recognition results. Stability and correctness probabilities can also determine which partial results are communicated.
    Type: Application
    Filed: December 10, 2014
    Publication date: April 9, 2015
    Inventors: Jason D. Williams, Ethan SELFRIDGE
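The "pinch node" condition described above (all search paths converge to a common node before branching again) can be illustrated on a toy lattice. This sketch enumerates paths explicitly, which is only feasible for small lattices; the node layout is invented.

```python
# Toy sketch of detecting "pinch nodes": interior nodes that every path
# from the lattice start to the lattice end must pass through.
def all_paths(lattice, start, end, path=None):
    path = (path or []) + [start]
    if start == end:
        return [path]
    paths = []
    for nxt in lattice.get(start, []):
        paths.extend(all_paths(lattice, nxt, end, path))
    return paths

def pinch_nodes(lattice, start, end):
    """Return interior nodes shared by every start->end path."""
    paths = all_paths(lattice, start, end)
    common = set(paths[0])
    for p in paths[1:]:
        common &= set(p)
    return common - {start, end}

# Two competing hypotheses diverge at 0, reconverge at node 3, then
# diverge again -- so node 3 is a pinch node.
lattice = {0: [1, 2], 1: [3], 2: [3], 3: [4, 5], 4: [6], 5: [6]}
print(pinch_nodes(lattice, 0, 6))   # {3}
```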
  • Patent number: 9002704
    Abstract: A speaker state detecting apparatus comprises: an audio input unit for acquiring, at least, a first voice emanated by a first speaker and a second voice emanated by a second speaker; a speech interval detecting unit for detecting an overlap period between a first speech period of the first speaker included in the first voice and a second speech period of the second speaker included in the second voice, which starts before the first speech period, or an interval between the first speech period and the second speech period; a state information extracting unit for extracting state information representing a state of the first speaker from the first speech period; and a state detecting unit for detecting the state of the first speaker in the first speech period based on the overlap period or the interval and the extracted state information.
    Type: Grant
    Filed: February 3, 2012
    Date of Patent: April 7, 2015
    Assignee: Fujitsu Limited
    Inventor: Akira Kamano
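The overlap-or-interval measurement above reduces to simple interval arithmetic on the two speech periods. A minimal sketch, assuming periods are given as (start, end) pairs in seconds and that the second speaker's period starts first, as the abstract states:

```python
# Toy sketch: measure either the overlap between two speech periods or
# the silent interval between them. Times are invented examples.
def overlap_or_interval(first_period, second_period):
    """Periods are (start, end) seconds; second_period starts earlier."""
    overlap = min(first_period[1], second_period[1]) - first_period[0]
    if overlap > 0:
        return ("overlap", overlap)     # first speaker started while
                                        # the second was still talking
    return ("interval", -overlap)       # silence between the two turns

print(overlap_or_interval((3.0, 6.0), (1.0, 4.0)))   # ('overlap', 1.0)
print(overlap_or_interval((5.0, 7.0), (1.0, 4.0)))   # ('interval', 1.0)
```

A downstream state detector could then combine this duration with prosodic features extracted from the first speech period.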
  • Patent number: 8990083
    Abstract: A method is provided in one example and includes receiving data propagating in a network environment, and identifying selected words within the data based on a whitelist. The whitelist includes a plurality of designated words to be tagged. The method further includes assigning a weight to the selected words based on at least one characteristic associated with the data, and associating the selected words to an individual. A resultant composite is generated for the selected words that are tagged. In more specific embodiments, the resultant composite is partitioned amongst a plurality of individuals associated with the data propagating in the network environment. A social graph can be generated that identifies a relationship between a selected individual and the plurality of individuals based on a plurality of words exchanged between the selected individual and the plurality of individuals.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: March 24, 2015
    Assignee: Cisco Technology, Inc.
    Inventors: Satish K. Gannu, Ashutosh A. Malegaonkar, Virgil N. Mihailovici
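The whitelist tagging and weighting described above can be sketched in a few lines: only designated words are counted, and each occurrence is weighted by a characteristic of the data (here an invented per-source multiplier) before being attributed to an individual. The whitelist, weights, and names are all hypothetical.

```python
# Toy sketch of whitelist-based tagging: count only designated words,
# weight them by the data's source, and accumulate per individual.
from collections import defaultdict

WHITELIST = {"router", "firewall", "latency"}
SOURCE_WEIGHT = {"email": 1.0, "video": 2.0}    # assumed characteristic

def tag_words(text, source, individual, composite):
    for word in text.lower().split():
        if word in WHITELIST:
            composite[individual][word] += SOURCE_WEIGHT.get(source, 1.0)
    return composite

composite = defaultdict(lambda: defaultdict(float))
tag_words("the firewall dropped packets", "email", "alice", composite)
tag_words("firewall latency spiked", "video", "alice", composite)
print(dict(composite["alice"]))   # {'firewall': 3.0, 'latency': 2.0}
```

A social graph could then be built by comparing these per-individual composites across pairs of people who exchanged the underlying messages.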
  • Patent number: 8983841
    Abstract: A network communication node includes an audio outputter that outputs an audible representation of data to be provided to a requester. The network communication node also includes a processor that determines a categorization of the data to be provided to the requester and that varies a pause between segments of the audible representation of the data in accordance with the categorization of the data to be provided to the requester.
    Type: Grant
    Filed: July 15, 2008
    Date of Patent: March 17, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Gregory Pulz, Steven Lewis, Charles Rajnai
  • Publication number: 20150073792
    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
    Type: Application
    Filed: November 13, 2014
    Publication date: March 12, 2015
    Inventor: Giuseppe RICCARDI
  • Publication number: 20150073793
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for a speech recognition application for directory assistance that is based on a user's spoken search query. The spoken search query is received by a portable device, and the portable device then determines its present location. The device's location is incorporated into a local language model that is used to process the search query. Finally, the portable device outputs the results of the search query based on the local language model.
    Type: Application
    Filed: November 14, 2014
    Publication date: March 12, 2015
    Inventors: Enrico BOCCHIERI, Diamantino Antonio Caseiro
  • Publication number: 20150066507
    Abstract: A sound recognition apparatus can include a sound feature value calculating unit configured to calculate a sound feature value based on a sound signal, and a label converting unit configured to convert the sound feature value into a corresponding label with reference to label data in which sound feature values and labels indicating sound units are correlated. A sound identifying unit is configured to calculate a probability of each sound unit group sequence that a label sequence is segmented for each sound unit group with reference to segmentation data. The segmentation data indicates a probability that a sound unit sequence will be segmented into at least one sound unit group. The sound identifying unit can also identify a sound event corresponding to the sound unit group sequence selected based on the calculated probability.
    Type: Application
    Filed: August 26, 2014
    Publication date: March 5, 2015
    Inventors: Keisuke NAKAMURA, Kazuhiro NAKADAI
  • Patent number: 8972260
    Abstract: In accordance with one embodiment, a method of generating language models for speech recognition includes identifying a plurality of utterances in training data corresponding to speech, generating a frequency count of each utterance in the plurality of utterances, generating a high-frequency plurality of utterances from the plurality of utterances having a frequency that exceeds a predetermined frequency threshold, generating a low-frequency plurality of utterances from the plurality of utterances having a frequency that is below the predetermined frequency threshold, generating a grammar-based language model using the high-frequency plurality of utterances as training data, and generating a statistical language model using the low-frequency plurality of utterances as training data.
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: March 3, 2015
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Zhe Feng, Kui Xu, Lin Zhao
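The frequency-based split above is straightforward to sketch: count utterance occurrences, then route frequent utterances to the grammar-based model and rare ones to the statistical model. The utterances and threshold are invented, and utterances exactly at the threshold are routed to the high-frequency set here (the abstract does not specify that edge case).

```python
# Toy sketch: partition training utterances by frequency so that frequent,
# formulaic utterances train a grammar-based LM and rare, varied ones
# train a statistical LM.
from collections import Counter

def split_training_data(utterances, threshold=3):
    counts = Counter(utterances)
    high = [u for u, c in counts.items() if c >= threshold]
    low = [u for u, c in counts.items() if c < threshold]
    return high, low

data = ["play music"] * 5 + ["call mom"] * 4 + ["find thai food near the park"]
high, low = split_training_data(data)
print(high)  # ['play music', 'call mom'] -> grammar-based LM
print(low)   # ['find thai food near the park'] -> statistical LM
```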
  • Patent number: 8948466
    Abstract: In real biometric systems, false match rates and false non-match rates of 0% do not exist. There is always some probability that a purported match is false, and that a genuine match is not identified. The performance of biometric systems is often expressed in part in terms of their false match rate and false non-match rate, with the equal error rate being when the two are equal. There is a tradeoff between the FMR and FNMR in biometric systems which can be adjusted by changing a matching threshold. This matching threshold can be adjusted automatically, dynamically, and/or by the user so that a biometric system of interest can achieve a desired FMR and FNMR.
    Type: Grant
    Filed: October 11, 2013
    Date of Patent: February 3, 2015
    Assignee: Aware, Inc.
    Inventor: David Benini
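The FMR/FNMR tradeoff above can be demonstrated by sweeping the matching threshold over sample score distributions and finding where the two error rates coincide (the equal error rate). The score samples below are invented, and a real system would use far larger samples and a smarter search than this linear sweep.

```python
# Toy sketch: compute FMR/FNMR at a given matching threshold, then sweep
# thresholds to find the (approximate) equal-error operating point.
def error_rates(threshold, genuine, impostor):
    fnmr = sum(s < threshold for s in genuine) / len(genuine)    # missed matches
    fmr = sum(s >= threshold for s in impostor) / len(impostor)  # false matches
    return fmr, fnmr

def equal_error_threshold(genuine, impostor, steps=101):
    scores = genuine + impostor
    lo, hi = min(scores), max(scores)
    best_t, best_gap = lo, float("inf")
    for i in range(steps):
        t = lo + (hi - lo) * i / (steps - 1)
        fmr, fnmr = error_rates(t, genuine, impostor)
        if abs(fmr - fnmr) < best_gap:
            best_gap, best_t = abs(fmr - fnmr), t
    return best_t

genuine = [0.70, 0.80, 0.85, 0.90, 0.95]     # scores for true matches
impostor = [0.10, 0.20, 0.30, 0.40, 0.75]    # scores for impostors
t = equal_error_threshold(genuine, impostor)
fmr, fnmr = error_rates(t, genuine, impostor)
print(round(t, 2), fmr, fnmr)
```

Raising the threshold from this point lowers the FMR at the cost of a higher FNMR, and vice versa, which is the adjustable tradeoff the abstract describes.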
  • Patent number: 8935151
    Abstract: A source language sentence is tagged with non-lexical tags, such as part-of-speech tags, and is parsed using a lexicalized parser trained in the source language. A target language sentence that is a translation of the source language sentence is tagged with non-lexical labels (e.g., part-of-speech tags) and is parsed using a delexicalized parser that has been trained in the source language to produce k-best parses. The best parse is selected based on the parse's alignment with the lexicalized parse of the source language sentence. The selected best parse can be used to update the parameter vector of a lexicalized parser for the target language.
    Type: Grant
    Filed: December 7, 2011
    Date of Patent: January 13, 2015
    Assignee: Google Inc.
    Inventors: Slav Petrov, Ryan McDonald, Keith Hall
  • Patent number: 8930189
    Abstract: A particular method includes receiving, at a representational state transfer endpoint device, a first user input related to a first speech to text conversion performed by a speech to text transcription service. The method also includes receiving, at the representational state transfer endpoint device, a second user input related to a second speech to text conversion performed by the speech to text transcription service. The method includes processing of the first user input and the second user input at the representational state transfer endpoint device to generate speech to text adjustment information.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: January 6, 2015
    Assignee: Microsoft Corporation
    Inventors: Jeremy Edward Cath, Timothy Edwin Harris, Marc Mercuri, James Oliver Tisdale, III
  • Patent number: 8914277
    Abstract: According to example configurations, a speech-processing system parses an uttered sentence into segments. The speech-processing system translates each of the segments in the uttered sentence into candidate textual expressions (i.e., phrases of one or more words) in a first language. The uttered sentence can include multiple phrases or candidate textual expressions. Additionally, the speech-processing system translates each of the candidate textual expressions into candidate textual phrases in a second language. Based at least in part on a product of confidence values associated with the candidate textual expressions in the first language and confidence values associated with the candidate textual phrases in the second language, the speech-processing system produces a confidence metric for each of the candidate textual phrases in the second language.
    Type: Grant
    Filed: September 20, 2011
    Date of Patent: December 16, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Ding Liu
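The confidence metric above is a product of two stages: recognition confidence in the first language times translation confidence into the second. A minimal sketch with an invented stand-in translation table and invented confidence values:

```python
# Toy sketch: score each target-language candidate by the product of its
# recognition confidence and its translation confidence, then rank.
def rank_translations(recognition_candidates, translate):
    """recognition_candidates: list of (text, confidence) in language 1.
    translate(text): list of (translation, confidence) in language 2."""
    scored = []
    for text, rec_conf in recognition_candidates:
        for trans, tr_conf in translate(text):
            scored.append((trans, rec_conf * tr_conf))
    return sorted(scored, key=lambda x: -x[1])

def toy_translate(text):                      # stand-in translation model
    table = {
        "where is the bank": [("ou est la banque", 0.9)],
        "where is the tank": [("ou est le char", 0.8)],
    }
    return table.get(text, [])

candidates = [("where is the bank", 0.7), ("where is the tank", 0.2)]
ranked = rank_translations(candidates, toy_translate)
print(ranked[0][0], round(ranked[0][1], 2))   # ou est la banque 0.63
```

Because the metric multiplies the two stages, a confidently recognized but ambiguous phrase and a weakly recognized but unambiguous one are compared on a single scale.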
  • Patent number: 8909518
    Abstract: A warping factor estimation system comprises a label information generation unit that outputs voice/non-voice label information, a warp model storage unit in which a probability model representing voice and non-voice occurrence probabilities is stored, and a warp estimation unit that calculates a warping factor in the frequency-axis direction using the probability model representing voice and non-voice occurrence probabilities, the voice and non-voice labels, and a cepstrum.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: December 9, 2014
    Assignee: NEC Corporation
    Inventor: Tadashi Emori
  • Patent number: 8898058
    Abstract: Systems, methods, apparatus, and machine-readable media for voice activity detection in a single-channel or multichannel audio signal are disclosed.
    Type: Grant
    Filed: October 24, 2011
    Date of Patent: November 25, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Jongwon Shin, Erik Visser, Ian Ernan Liu
  • Patent number: 8898061
    Abstract: A particular method includes receiving, at a representational state transfer endpoint device, a first user input related to a first speech to text conversion performed by a speech to text transcription service. The method also includes receiving, at the representational state transfer endpoint device, a second user input related to a second speech to text conversion performed by the speech to text transcription service. The method includes processing of the first user input and the second user input at the representational state transfer endpoint device to generate speech to text adjustment information.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: November 25, 2014
    Assignee: Microsoft Corporation
    Inventors: Jeremy Edward Cath, Timothy Edwin Harris, Marc Mercuri, James Oliver Tisdale, III
  • Patent number: 8892436
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech while taking into account at least one earlier-positioned frame of the first speech.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: November 18, 2014
    Assignees: Samsung Electronics Co., Ltd., Seoul National University Industry Foundation
    Inventors: Ki-wan Eom, Chang-woo Han, Tae-gyoon Kang, Nam-soo Kim, Doo-hwa Hong, Jae-won Lee, Hyung-joon Lim
  • Patent number: 8892996
    Abstract: User input is received, specifying a continuous traced path across a keyboard presented on a touch sensitive display. An input sequence is resolved, including traced keys and auxiliary keys proximate to the traced keys by prescribed criteria. For each of one or more candidate entries of a prescribed vocabulary, a set-edit-distance metric is computed between said input sequence and the candidate entry. Various rules specify when penalties are imposed, or not, in computing the set-edit-distance metric. Candidate entries are ranked and displayed according to the computed metric.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: November 18, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Erland Unruh
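The set-edit-distance above can be illustrated with a small dynamic program: each traced position becomes a *set* of keys (the traced key plus its neighbors), and a substitution costs nothing when the candidate's letter falls inside that set. The keyboard neighborhood table, penalty scheme, and vocabulary below are invented simplifications of the patented criteria.

```python
# Toy sketch of a set-edit-distance for trace input: positions are key
# sets (traced key + adjacent keys); matching within a set is free.
NEIGHBORS = {"q": "wa", "w": "qes", "e": "wrd", "r": "etf", "t": "ryg"}

def key_sets(traced):
    return [set(k) | set(NEIGHBORS.get(k, "")) for k in traced]

def set_edit_distance(sets, word):
    m, n = len(sets), len(word)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if word[j - 1] in sets[i - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion penalty
                          d[i][j - 1] + 1,        # insertion penalty
                          d[i - 1][j - 1] + sub)  # (possibly free) match
    return d[m][n]

# User traced w-r-t; "wet" and "wry" both match at distance 0 via
# neighboring keys, while "mum" costs a full 3 substitutions.
sets = key_sets("wrt")
ranked = sorted(["wet", "wry", "mum"], key=lambda w: set_edit_distance(sets, w))
print(ranked[0])   # wet
```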