Probability Patents (Class 704/240)
  • Patent number: 9607624
    Abstract: A system for encoding and applying Dynamic Range Control/Compression (DRC) gain values to a piece of sound program content is described. In particular, a set of DRC gain values representing a DRC gain curve for the piece of content may be divided into frames corresponding to frames of the piece of content. A set of additional fields may be included with an audio signal representing the piece of content. The additional fields may represent the DRC gain values using linear or spline interpolation. The additional fields may include 1) an initial gain value for each DRC frame, 2) a set of slope values at particular points in the DRC curve, 3) a set of time delta values for each consecutive pair of slope values, and/or 4) one or more gain delta values representing changes of DRC gain values in the DRC gain curve between points of the slope values.
    Type: Grant
    Filed: March 26, 2014
    Date of Patent: March 28, 2017
    Assignee: Apple Inc.
    Inventor: Frank M. Baumgarte
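The linear-interpolation case in the abstract above can be sketched in a few lines: given an initial gain, per-segment slopes, and the time deltas between slope points, the per-sample gain curve follows directly. All names and the per-sample slope convention are illustrative, not taken from the patent.

```python
def reconstruct_drc_gains(initial_gain, slopes, time_deltas):
    """Reconstruct per-sample DRC gain values for one frame by linear
    interpolation: each segment advances `dt` samples at slope `m`.

    initial_gain: gain (dB) at the start of the frame
    slopes:       slope (dB per sample) for each segment
    time_deltas:  number of samples in each segment
    """
    gains = [initial_gain]
    g = initial_gain
    for m, dt in zip(slopes, time_deltas):
        for _ in range(dt):
            g += m
            gains.append(g)
    return gains
```

A spline variant would replace the inner loop with a cubic evaluated from the slope values at both segment endpoints.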
  • Patent number: 9589563
    Abstract: A method for speech recognition of partial proper names is described which includes natural language processing (NLP), partial name candidate generation, speech recognition and post processing. Natural language processing techniques including shallow and deep parsing are applied to long proper names to identify syntactic units (for example, noun phrases). The syntactic units form a basis for generating a candidate list of partial names for each original full name. A partial name is part of the original name, with some words omitted, or word order changed, or even word substitution. After candidate partial names are generated, their phonetic transcriptions are incorporated into a model for a speech recognizer to recognize the partial names in a speech recognition system.
    Type: Grant
    Filed: June 2, 2015
    Date of Patent: March 7, 2017
    Assignee: Robert Bosch GmbH
    Inventors: Lin Zhao, Zhe Feng, Kui Xu, Fuliang Weng
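The candidate-generation step described above (a partial name is the original name with some words omitted) can be sketched as an order-preserving subset enumeration. This ignores the NLP parsing, word reordering, and substitution the abstract also mentions; the function name is illustrative.

```python
from itertools import combinations

def partial_name_candidates(full_name, min_words=1):
    """Generate candidate partial names by omitting words from the full
    name while keeping the remaining words in their original order."""
    words = full_name.split()
    candidates = set()
    for k in range(min_words, len(words)):        # proper subsets only
        for idx in combinations(range(len(words)), k):
            candidates.add(" ".join(words[i] for i in idx))
    return sorted(candidates)
```

In the patented system, the candidate list would be filtered to syntactic units (e.g., noun phrases) before phonetic transcription.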
  • Patent number: 9576239
    Abstract: A computer-implemented system and method for identifying tasks using temporal footprints is provided. A database of temporal footprints is maintained. Each temporal footprint is representative of a different task and includes one or more significant patterns of two or more sequential events. Events performed by one or more users are tracked. At least one pattern including sequential occurrences of two or more of the tracked events is identified. The identified pattern is compared to each of the significant patterns of the temporal footprints. A footprint score for the identified pattern is determined with respect to each temporal footprint. The task associated with the temporal footprint having the highest footprint score is assigned to the identified pattern.
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: February 21, 2017
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Oliver Brdiczka, James (Bo) M.A. Begole
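The matching-and-assignment step above can be sketched by treating each footprint as a set of significant patterns (ordered event tuples) and scoring a tracked event sequence by the fraction of patterns it contains as a subsequence. The scoring rule is an assumption for illustration; the patent does not specify the footprint-score formula.

```python
def footprint_score(events, footprint):
    """Score how well a tracked event sequence matches a temporal
    footprint, i.e. a list of significant patterns (ordered event
    tuples that must occur in sequence, not necessarily adjacently)."""
    def matches(pattern, seq):
        it = iter(seq)
        return all(e in it for e in pattern)  # classic subsequence test
    hits = sum(matches(p, events) for p in footprint)
    return hits / len(footprint)

def assign_task(events, footprints):
    """Assign the task whose footprint has the highest footprint score."""
    return max(footprints, key=lambda t: footprint_score(events, footprints[t]))
```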
  • Patent number: 9576572
    Abstract: Methods and nodes for enabling and producing input generated by speech of a user, to an application. When the application has been activated (2:1), an application node (200) detects (2:2) a current context of the user and selects (2:3), from a set of predefined contexts (204a), a predefined context that matches the detected current context. The application node (200) then provides (2:4) keywords associated with the selected predefined context to a speech recognition node (202). When receiving (2:5) speech from the user, the speech recognition node (202) is able to recognize (2:6) any of the keywords in the speech. The recognized keyword is then used (2:7) as input to the application.
    Type: Grant
    Filed: June 18, 2012
    Date of Patent: February 21, 2017
    Assignee: Telefonaktiebolaget LM Ericsson (Publ)
    Inventors: Jari Arkko, Jouni Mäenpää, Tomas Mecklin
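The keyword-scoping flow above can be sketched with two small functions: one selecting a predefined context, one spotting a context keyword in recognized speech. The context names and keyword sets are hypothetical, and real context matching would be richer than name equality.

```python
# Hypothetical predefined contexts and their associated keywords.
PREDEFINED_CONTEXTS = {
    "driving": {"navigate", "traffic", "fuel"},
    "cooking": {"timer", "recipe", "convert"},
}

def select_context(detected):
    """Pick the predefined context matching the detected current
    context (here, by simple name equality)."""
    return detected if detected in PREDEFINED_CONTEXTS else None

def recognize_keyword(speech_words, context):
    """Return the first context keyword found in the recognized
    speech, to be used as input to the application."""
    keywords = PREDEFINED_CONTEXTS[context]
    for w in speech_words:
        if w in keywords:
            return w
    return None
```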
  • Patent number: 9552352
    Abstract: Technologies pertaining to retrieval of contextually relevant attribute values for an automatically identified named entity in a document are described herein. Named entity recognition technologies are employed to identify named entities in the text of a document. Context corresponding to an identified named entity is analyzed to probabilistically assign a class to the named entity. Attributes that are most relevant to the class are determined, and attribute values for such attributes are retrieved. The attribute values are presented in correlation with the named entity in the document responsive to user-selection of the named entity in the document.
    Type: Grant
    Filed: November 10, 2011
    Date of Patent: January 24, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Evelyne Viegas, Eric Anthony Rozell
  • Patent number: 9507852
    Abstract: A computer-implemented method can include receiving a speech input representing a question, converting the speech input to a string of characters, and obtaining tokens each representing a potential word. The method can include determining one or more part-of-speech (POS) tags for each token and determining sequences of the POS tags for the tokens, each sequence of the POS tags including one POS tag per token. The method can include determining one or more parses for each sequence of the POS tags for the tokens and determining a most-likely parse and its corresponding sequence of the POS tags for the tokens to obtain a selected parse and a selected sequence of the POS tags for the tokens. The method can also include determining a most-likely answer to the question using the selected parse and the selected sequence of the POS tags for the tokens and outputting the most-likely answer.
    Type: Grant
    Filed: December 10, 2013
    Date of Patent: November 29, 2016
    Assignee: Google Inc.
    Inventors: Slav Petrov, Alexander Rush
  • Patent number: 9501466
    Abstract: A system for identifying address components includes a training address interface, a training address probability processor, a parsing address interface, and a processor. The training address interface is to receive training addresses. The training addresses are a set of components with corresponding identifiers. The training address probability processor is to determine probabilities of each component of the training addresses being associated with each identifier. The parsing address interface is to receive an address for parsing. The processor is to determine a matching model of a set of models based at least in part on a matching probability for each model for a tokenized address, which is based on the address for parsing, and associate each component of the tokenized address with an identifier based at least in part on the matching model.
    Type: Grant
    Filed: June 3, 2015
    Date of Patent: November 22, 2016
    Assignee: Workday, Inc.
    Inventors: Parag Avinash Namjoshi, Shuangshuang Jiang, Mohammad Sabah
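The training step above (probabilities of each component being associated with each identifier) and the labeling step can be sketched with simple relative-frequency estimates. The model-matching stage is omitted; each token is labeled independently here, which is a simplification of the patented method.

```python
from collections import Counter, defaultdict

def train(training_addresses):
    """Estimate P(identifier | component) from training addresses,
    each given as a list of (component, identifier) pairs."""
    counts = defaultdict(Counter)
    for address in training_addresses:
        for component, ident in address:
            counts[component][ident] += 1
    return {
        comp: {i: n / sum(c.values()) for i, n in c.items()}
        for comp, c in counts.items()
    }

def label(tokens, probs, default="unknown"):
    """Associate each component of a tokenized address with its most
    probable identifier."""
    return [
        (t, max(probs[t], key=probs[t].get)) if t in probs else (t, default)
        for t in tokens
    ]
```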
  • Patent number: 9448991
    Abstract: Context-based corrections of voice recognition results are provided by displaying text-based result from a speech-to-text conversion operation on a display screen of an electronic client device. One or more element categories associated with corresponding portions of the text-based result are identified. Graphical icons corresponding to the element categories are also displayed on the display in areas where the corresponding portions of the text-based result are also displayed. A user selection of one of the graphical icons is then detected, and an edit operation is enabled for the portion of the text-based result associated with the selected graphical icon. An updated version of the text-based results is then displayed on the display.
    Type: Grant
    Filed: March 18, 2014
    Date of Patent: September 20, 2016
    Assignee: Bayerische Motoren Werke Aktiengesellschaft
    Inventor: Philipp Suessenguth
  • Patent number: 9412365
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to enhanced maximum entropy models. In some implementations, data indicating a candidate transcription for an utterance and a particular context for the utterance are received. A maximum entropy language model is obtained. Feature values are determined for n-gram features and backoff features of the maximum entropy language model. The feature values are input to the maximum entropy language model, and an output is received from the maximum entropy language model. A transcription for the utterance is selected from among a plurality of candidate transcriptions based on the output from the maximum entropy language model. The selected transcription is provided to a client device.
    Type: Grant
    Filed: March 24, 2015
    Date of Patent: August 9, 2016
    Assignee: Google Inc.
    Inventors: Fadi Biadsy, Brian E. Roark
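The feature-value and scoring steps above can be sketched as a log-linear model over candidate next words, with n-gram features of every order plus a simple backoff feature that always fires. This is a simplified stand-in for the patent's enhanced backoff features; the feature encoding is an assumption.

```python
import math

def features(context, word):
    """N-gram features of every order for (context, word), plus a
    word-level backoff feature (a simplification of the patent's
    backoff features)."""
    feats = []
    for n in range(len(context), -1, -1):
        feats.append(("ngram", tuple(context[len(context) - n:]), word))
    feats.append(("backoff", word))
    return feats

def score(weights, context, candidate_words):
    """Maximum entropy (log-linear) distribution over candidates:
    P(w | context) is proportional to exp(sum of active feature weights)."""
    logits = {w: sum(weights.get(f, 0.0) for f in features(context, w))
              for w in candidate_words}
    z = sum(math.exp(v) for v in logits.values())
    return {w: math.exp(v) / z for w, v in logits.items()}
```

In transcription selection, each candidate transcription would be scored word by word under such a model and the highest-probability candidate returned to the client.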
  • Patent number: 9368109
    Abstract: Reliable speaker-based clustering of speech utterances allows improved speaker recognition and speaker-based speech segmentation. According to at least one example embodiment, an iterative bottom-up speaker-based clustering approach employs voiceprints of speech utterances, such as i-vectors. At each iteration, a clustering confidence score in terms of Silhouette Width Criterion (SWC) values is evaluated, and a pair of nearest clusters is merged into a single cluster. The pair of nearest clusters merged is determined based on a similarity score indicative of similarity between voiceprints associated with different clusters. A final clustering pattern is then determined as a set of clusters associated with an iteration corresponding to the highest clustering confidence score evaluated. The SWC used may further be a modified SWC enabling detection of an early stop of the iterative approach.
    Type: Grant
    Filed: May 31, 2013
    Date of Patent: June 14, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Daniele Ernesto Colibro, Claudio Vair, Kevin R. Farrell
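The iterative bottom-up procedure above can be sketched directly: start from singleton clusters of voiceprints, repeatedly merge the two most similar clusters, evaluate a Silhouette Width Criterion after each merge, and keep the best-scoring partition. Cosine distance and single-linkage merging are assumptions; the patent's modified SWC is not reproduced here.

```python
import math

def dist(a, b):
    """Cosine distance between two voiceprints (e.g., i-vectors)."""
    num = sum(x * y for x, y in zip(a, b))
    return 1.0 - num / (math.hypot(*a) * math.hypot(*b))

def silhouette(points, clusters):
    """Mean silhouette width over all points; singletons contribute 0."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for i in cluster:
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            a = sum(dist(points[i], points[j])
                    for j in cluster if j != i) / (len(cluster) - 1)
            b = min(sum(dist(points[i], points[j]) for j in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def cluster_speakers(points):
    """Bottom-up clustering keeping the partition with the best score."""
    clusters = [[i] for i in range(len(points))]
    best, best_score = [list(c) for c in clusters], 0.0
    while len(clusters) > 2:
        # merge the pair of clusters with the smallest single-linkage distance
        x, y = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: min(dist(points[u], points[v])
                                     for u in clusters[p[0]]
                                     for v in clusters[p[1]]))
        clusters[x] += clusters.pop(y)
        s = silhouette(points, clusters)
        if s > best_score:
            best, best_score = [list(c) for c in clusters], s
    return best
```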
  • Patent number: 9355636
    Abstract: Features are provided for selectively scoring portions of user utterances based at least on articulatory features of the portions. One or more articulatory features of a portion of a user utterance can be determined. Acoustic models or subsets of individual acoustic model components (e.g., Gaussians or Gaussian mixture models) can be selected based on the articulatory features of the portion. The portion can then be scored using a selected acoustic model or subset of acoustic model components. The process may be repeated for the multiple portions of the utterance, and speech recognition results can be generated from the scored portions.
    Type: Grant
    Filed: September 16, 2013
    Date of Patent: May 31, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Jeffrey Cornelius O'Neill, Jeffrey Paul Lilly, Thomas Schaaf
  • Patent number: 9348915
    Abstract: Content items and other entities may be ranked or organized according to a relevance to a user. Relevance may take into consideration recency, proximity, popularity, air time (e.g., of television shows) and the like. In one example, the popularity and age of a movie may be used to determine a relevance ranking. Popularity (i.e., entity rank) may be determined based on a variety of factors. In the movie example, popularity may be based on gross earnings, awards, nominations, votes and the like. According to one or more embodiments, entities may initially be categorized into relevance groupings based on popularity and/or other factors. Once categorized, the entities may be sorted within each grouping and later combined into a single ranked list.
    Type: Grant
    Filed: May 4, 2012
    Date of Patent: May 24, 2016
    Assignee: Comcast Interactive Media, LLC
    Inventors: Ken Iwasa, Seth Michael Murray, Goldee Udani
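The group-then-sort-then-combine scheme above can be sketched with a toy relevance function over popularity and age. The exponential decay, the bucket width, and all parameter names are illustrative assumptions, not values from the patent.

```python
import math

def relevance(popularity, age_years, decay=0.2):
    """Toy relevance: more popular and more recent ranks higher.
    `decay` (illustrative) controls how fast relevance falls with age."""
    return popularity * math.exp(-decay * age_years)

def rank(movies):
    """Bucket entities into relevance groupings by popularity, sort
    within each grouping, then combine into a single ranked list.
    Each movie is a (title, popularity, age_years) tuple."""
    groups = {}
    for title, pop, age in movies:
        groups.setdefault(pop // 50, []).append((title, pop, age))
    ranked = []
    for bucket in sorted(groups, reverse=True):
        ranked += sorted(groups[bucket],
                         key=lambda m: relevance(m[1], m[2]), reverse=True)
    return [title for title, _, _ in ranked]
```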
  • Patent number: 9349372
    Abstract: A speaker identification system (100) includes a microphone (2) which acquires speech information of a speaker; a sex/age range information acquisition unit (7) which acquires age range information relating to a range of the age of the speaker, based on the speech information; a specific age information acquisition unit (8) which acquires specific age information relating to the specific age of the speaker, based on the speech information; a date and time information acquisition unit (9) which acquires date and time information representing the date and time when the speech information has been acquired; and a speaker database (4) which accumulates the specific age information and the date and time information in association with each other.
    Type: Grant
    Filed: July 8, 2014
    Date of Patent: May 24, 2016
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventors: Kazue Fusakawa, Tomomi Matsuoka, Masako Ikeda
  • Patent number: 9348912
    Abstract: Embodiments are configured to provide information based on a user query. In an embodiment, a system includes a search component having a ranking component that can be used to rank search results as part of a query response. In one embodiment, the ranking component includes a ranking algorithm that can use the length of documents returned in response to a search query to rank search results.
    Type: Grant
    Filed: September 10, 2008
    Date of Patent: May 24, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vladimir Tankovich, Dmitriy Meyerzon, Michael James Taylor
  • Patent number: 9336771
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for using non-parametric models in speech recognition. In some implementations, speech data is accessed. The speech data represents utterances of a particular phonetic unit occurring in a particular phonetic context, and the speech data includes values for multiple dimensions. Boundaries are determined for a set of quantiles for each of the multiple dimensions. Models for the distribution of values within the quantiles are generated. A multidimensional probability function is generated. Data indicating the boundaries of the quantiles, the models for the distribution of values in the quantiles, and the multidimensional probability function are stored.
    Type: Grant
    Filed: May 16, 2013
    Date of Patent: May 10, 2016
    Assignee: Google Inc.
    Inventor: Ciprian I. Chelba
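The quantile-boundary step above can be sketched with equal-occupancy cut points computed independently for every dimension of the speech data. The per-quantile distribution models and the multidimensional probability function are omitted; function names are illustrative.

```python
def quantile_boundaries(values, num_quantiles=4):
    """Boundaries splitting one dimension into equal-occupancy
    quantiles (num_quantiles - 1 cut points)."""
    ordered = sorted(values)
    step = len(ordered) / num_quantiles
    return [ordered[int(step * k)] for k in range(1, num_quantiles)]

def per_dimension_boundaries(vectors, num_quantiles=4):
    """Apply the quantile split independently to every dimension of
    the multi-dimensional speech data."""
    return [quantile_boundaries(dim, num_quantiles) for dim in zip(*vectors)]
```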
  • Patent number: 9324323
    Abstract: Speech recognition techniques may include: receiving audio; identifying one or more topics associated with audio; identifying language models in a topic space that correspond to the one or more topics, where the language models are identified based on proximity of a representation of the audio to representations of other audio in the topic space; using the language models to generate recognition candidates for the audio, where the recognition candidates have scores associated therewith that are indicative of a likelihood of a recognition candidate matching the audio; and selecting a recognition candidate for the audio based on the scores.
    Type: Grant
    Filed: December 14, 2012
    Date of Patent: April 26, 2016
    Assignee: Google Inc.
    Inventors: Daniel M. Bikel, Kapil R. Thadini, Fernando Pereira, Maria Shugrina, Fadi Biadsy
  • Patent number: 9298836
    Abstract: A source system searches a provider system for one or more listings. The source system receives a plurality of potential matching listings. The source system designates a representative listing of the entity located on a provider system from among the plurality of potential matching listings. The source system designates one or more remaining potential matching listings of the plurality of potential matching listings as one or more duplicate listings. The source system transmits, to the provider system, a request to synchronize the representative listing as the only representative listing of the entity on the provider system, the request comprising a first provider-supplied external identifier of the representative listing.
    Type: Grant
    Filed: July 7, 2015
    Date of Patent: March 29, 2016
    Assignee: Yext, Inc.
    Inventors: Howard C. Lerman, Thomas C. Dixon, Kevin Caffrey, David C. Lin
  • Patent number: 9262538
    Abstract: A system for the support and management of document search is presented. The system includes a knowledge-database, a query interface, and a connection to a database of documents to be searched. Information generated during a search session is collected by the system and added to the knowledge-database, where it is ranked automatically according to how users make use of it. During successive search sessions, or during searches made by other users, the system uses the knowledge-database to support users with keywords, queries, and references to documents.
    Type: Grant
    Filed: June 19, 2015
    Date of Patent: February 16, 2016
    Inventor: Haim Zvi Melman
  • Patent number: 9224394
    Abstract: A system and method for implementing a server-based speech recognition system for multi-modal automated interaction in a vehicle includes a vehicle driver receiving audio prompts through an on-board human-to-machine interface and responding with speech to complete tasks such as creating and sending text messages, web browsing, navigation, etc. This service-oriented architecture is utilized to call upon specialized speech recognizers in an adaptive fashion. The human-to-machine interface enables completion of a text input task while driving a vehicle in a way that minimizes the frequency of the driver's visual and mechanical interactions with the interface, thereby eliminating unsafe distractions during driving conditions. After the initial prompting, the typing task is followed by a computerized verbalization of the text. Subsequent interface steps can be visual in nature, or involve only sound.
    Type: Grant
    Filed: March 23, 2010
    Date of Patent: December 29, 2015
    Assignee: Sirius XM Connected Vehicle Services Inc.
    Inventors: Thomas Barton Schalk, Leonel Saenz, Barry Burch
  • Patent number: 9224404
    Abstract: A communication system includes a front-end audio gateway or bridge and a hands-free device. An automatic speech recognition platform accessible to the hands-free device provides or makes available one or more preprocessing schemes and/or acoustic models to the front-end audio gateway or bridge. The preprocessing schemes or acoustic models can be identified by or provided before a connection is established between the front-end audio gateway and the automatic speech recognition platform, when a connection occurs between the front-end audio gateway and the automatic speech recognition platform, and/or during a speech recognition session.
    Type: Grant
    Filed: January 28, 2013
    Date of Patent: December 29, 2015
    Assignee: 2236008 Ontario Inc.
    Inventor: Anthony Andrew Poliak
  • Patent number: 9218809
    Abstract: A method and system for training a user authentication by voice signal are described. In one embodiment, a set of feature vectors are decomposed into speaker-specific recognition units. The speaker-specific recognition units are used to compute distribution values to train the voice signal. In addition, spectral feature vectors are decomposed into speaker-specific characteristic units which are compared to the speaker-specific distribution values. If the speaker-specific characteristic units are within a threshold limit of the speaker-specific distribution values, the speech signal is authenticated.
    Type: Grant
    Filed: January 9, 2014
    Date of Patent: December 22, 2015
    Assignee: Apple Inc.
    Inventors: Jerome R. Bellegarda, Kim E. A. Silverman
  • Patent number: 9191515
    Abstract: A mass-scale, user-independent, device-independent, voice messaging system that converts unstructured voice messages into text for display on a screen is disclosed. The system comprises (i) computer-implemented sub-systems and also (ii) a network connection to human operators providing transcription and quality control; the system being adapted to optimize the effectiveness of the human operators by further comprising three core sub-systems, namely (i) a pre-processing front end that determines an appropriate conversion strategy; (ii) one or more conversion resources; and (iii) a quality control sub-system.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: November 17, 2015
    Assignee: Nuance Communications, Inc.
    Inventor: Daniel Michael Doulton
  • Patent number: 9142211
    Abstract: A speech recognition apparatus 20 includes: an identification language model creation unit 21 that selects, from learning texts 27 for various fields for generating language models 26 for the fields, a phrase that includes a word whose appearance frequency satisfies a set condition on a field-by-field basis, and generates an identification language model 25 for identifying the field of speech using the selected phrases; a speech recognition unit 22 that executes speech recognition on the speech using the identification language model 25, and outputs text data and word confidences as a recognition result; and a field determination unit 23 that specifies a field that includes the most words whose confidences are greater than or equal to a set value based on the text data, the word confidences, and the words in the learning texts for the fields, and determines that the specified field is the field of the speech.
    Type: Grant
    Filed: February 13, 2013
    Date of Patent: September 22, 2015
    Assignee: NEC CORPORATION
    Inventor: Atsunori Sakai
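The two stages above, building per-field vocabularies from words whose appearance frequency satisfies a set condition, then picking the field containing the most confidently recognized words, can be sketched as follows. The frequency condition, confidence threshold, and data shapes are illustrative assumptions.

```python
from collections import Counter

def field_vocabularies(learning_texts, min_count=2):
    """Per-field sets of words whose appearance frequency in that
    field's learning text satisfies the set condition (min_count)."""
    return {field: {w for w, n in Counter(text.split()).items()
                    if n >= min_count}
            for field, text in learning_texts.items()}

def determine_field(recognized, confidences, vocabularies, threshold=0.5):
    """Pick the field containing the most recognized words whose word
    confidence is greater than or equal to the set value."""
    confident = [w for w, c in zip(recognized, confidences) if c >= threshold]
    return max(vocabularies,
               key=lambda f: sum(w in vocabularies[f] for w in confident))
```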
  • Patent number: 9117449
    Abstract: Techniques disclosed herein include systems and methods that enable a voice trigger that wakes-up an electronic device or causes the device to make additional voice commands active, without manual initiation of voice command functionality. In addition, such a voice trigger is dynamically programmable or customizable. A speaker can program or designate a particular phrase as the voice trigger. In general, techniques herein execute a voice-activated wake-up system that operates on a digital signal processor (DSP) or other low-power, secondary processing unit of an electronic device instead of running on a central processing unit (CPU). A speech recognition manager runs two speech recognition systems on an electronic device. The CPU dynamically creates a compact speech system for the DSP. Such a compact system can be continuously run during a standby mode, without quickly exhausting a battery supply.
    Type: Grant
    Filed: April 26, 2012
    Date of Patent: August 25, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Michael Jack Newman, Robert Roth, William D. Alexander, Paul van Mulbregt
  • Patent number: 9092480
    Abstract: A method and apparatus for performing extended search are provided. The method includes receiving user-inputted keywords; extending the user-inputted keywords according to geographical information to acquire extended keywords; performing a search by using the extended keywords; and returning search results to the user. With the present technical solutions, privilege control can be effectively performed in a cloud storage system. With the present embodiments, more information may be provided to a user for reference.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: July 28, 2015
    Assignee: International Business Machines Corporation
    Inventors: Keke Cai, Hong Lei Guo, Zhong Su, Hui Jia Zhu
  • Patent number: 9087515
    Abstract: A speech recognition apparatus is disclosed. The apparatus converts a speech signal into a digitalized speech data, and performs speech recognition based on the speech data. The apparatus makes a comparison between the speech data inputted the last time and the speech data inputted the time before the last time in response to a user's indication that the speech recognition results in erroneous recognition multiple times in a row. When the speech data inputted the last time is determined to substantially match the speech data inputted the time before the last time, the apparatus outputs a guidance prompting the user to utter an input target by calling it by another name.
    Type: Grant
    Filed: October 13, 2011
    Date of Patent: July 21, 2015
    Assignee: DENSO CORPORATION
    Inventor: Takahiro Tsuda
  • Patent number: 9031840
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving (i) audio data that encodes a spoken natural language query, and (ii) environmental audio data, obtaining a transcription of the spoken natural language query, determining a particular content type associated with one or more keywords in the transcription, providing at least a portion of the environmental audio data to a content recognition engine, and identifying a content item that has been output by the content recognition engine, and that matches the particular content type.
    Type: Grant
    Filed: December 27, 2013
    Date of Patent: May 12, 2015
    Assignee: Google Inc.
    Inventors: Matthew Sharifi, Gheorghe Postelnicu
  • Patent number: 9031841
    Abstract: An apparatus includes: a storage unit to store a model representing a relationship between a relative time and an occurrence probability; a first period detection unit to detect a first speech period of a first speaker; a second period detection unit to detect a second speech period of a second speaker; a unit to calculate a feature value of the first speech period; a detection unit to detect a word using the calculated feature value; an adjustment unit that, when the detection unit detects a word for a reply, retrieves the occurrence probability corresponding to the relative position of the reply in the second speech period and adjusts a word score or a detection threshold value for the reply; and a second detection unit to re-detect, using the adjusted word score or the adjusted detection threshold value, the word detected by the detection unit.
    Type: Grant
    Filed: December 12, 2012
    Date of Patent: May 12, 2015
    Assignee: Fujitsu Limited
    Inventor: Nobuyuki Washio
  • Patent number: 9026446
    Abstract: An adaptive workflow system can be used to implement captioning projects, such as projects for creating captions or subtitles for live and non-live broadcasts. Workers can repeat words spoken during a broadcast program or other program into a voice recognition system, which outputs text that may be used as captions or subtitles. The process of workers repeating these words to create such text can be referred to as respeaking. Respeaking can be used as an effective alternative to more expensive and hard-to-find stenographers for generating captions and subtitles.
    Type: Grant
    Filed: June 10, 2011
    Date of Patent: May 5, 2015
    Inventor: Morgan Fiumi
  • Patent number: 9020818
    Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 28, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
  • Patent number: 9020816
    Abstract: A method, system and apparatus are shown for identifying non-language speech sounds in a speech or audio signal. An audio signal is segmented and feature vectors are extracted from the segments of the audio signal. The segment is classified using a hidden Markov model (HMM) that has been trained on sequences of these feature vectors. Post-processing components can be utilized to enhance classification. An embodiment is described in which the hidden Markov model is used to classify a segment as a language speech sound or one of a variety of non-language speech sounds. Another embodiment is described in which the hidden Markov model is trained using discriminative learning.
    Type: Grant
    Filed: August 13, 2009
    Date of Patent: April 28, 2015
    Assignee: 21CT, Inc.
    Inventor: Matthew McClain
  • Patent number: 9020820
    Abstract: A state detecting apparatus includes: a processor to execute acquiring utterance data related to uttered speech, computing a plurality of statistical quantities for feature parameters regarding features of the utterance data, creating, on the basis of the plurality of statistical quantities regarding the utterance data and another plurality of statistical quantities regarding reference utterance data based on other uttered speech, pseudo-utterance data having at least one statistical quantity equal to a statistical quantity in the other plurality of statistical quantities, computing a plurality of statistical quantities for synthetic utterance data synthesized on the basis of the pseudo-utterance data and the utterance data, and determining, on the basis of a comparison between statistical quantities of the synthetic utterance data and statistical quantities of the reference utterance data, whether the speaker who produced the uttered speech is in a first state or a second state; and a memory.
    Type: Grant
    Filed: April 13, 2012
    Date of Patent: April 28, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
  • Patent number: 9015044
    Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 21, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
  • Publication number: 20150100316
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for advanced turn-taking in an interactive spoken dialog system. A system configured according to this disclosure can incrementally process speech prior to completion of the speech utterance, and can communicate partial speech recognition results upon finding particular conditions. A first condition which, if found, allows the system to communicate partial speech recognition results, is that the most recent word found in the partial results is statistically likely to be the termination of the utterance, also known as a terminal node. A second condition is the determination that all search paths within a speech lattice converge to a common node, also known as a pinch node, before branching out again. Upon finding either condition, the system can communicate the partial speech recognition results. Stability and correctness probabilities can also determine which partial results are communicated.
    Type: Application
    Filed: December 10, 2014
    Publication date: April 9, 2015
    Inventors: Jason D. Williams, Ethan SELFRIDGE
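The "pinch node" condition described above (all search paths converge to a common node before branching again) can be illustrated on a toy lattice. This sketch enumerates paths explicitly, which is only feasible for small lattices; the node layout is invented.

```python
# Toy sketch of detecting "pinch nodes": interior nodes that every path
# from the lattice start to the lattice end must pass through.
def all_paths(lattice, start, end, path=None):
    path = (path or []) + [start]
    if start == end:
        return [path]
    paths = []
    for nxt in lattice.get(start, []):
        paths.extend(all_paths(lattice, nxt, end, path))
    return paths

def pinch_nodes(lattice, start, end):
    """Return interior nodes shared by every start->end path."""
    paths = all_paths(lattice, start, end)
    common = set(paths[0])
    for p in paths[1:]:
        common &= set(p)
    return common - {start, end}

# Two competing hypotheses diverge at 0, reconverge at node 3, then
# diverge again -- so node 3 is a pinch node.
lattice = {0: [1, 2], 1: [3], 2: [3], 3: [4, 5], 4: [6], 5: [6]}
print(pinch_nodes(lattice, 0, 6))   # {3}
```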
  • Patent number: 9002704
    Abstract: A speaker state detecting apparatus comprises: an audio input unit for acquiring, at least, a first voice emanated by a first speaker and a second voice emanated by a second speaker; a speech interval detecting unit for detecting an overlap period between a first speech period of the first speaker included in the first voice and a second speech period of the second speaker included in the second voice, which starts before the first speech period, or an interval between the first speech period and the second speech period; a state information extracting unit for extracting state information representing a state of the first speaker from the first speech period; and a state detecting unit for detecting the state of the first speaker in the first speech period based on the overlap period or the interval and the extracted state information.
    Type: Grant
    Filed: February 3, 2012
    Date of Patent: April 7, 2015
    Assignee: Fujitsu Limited
    Inventor: Akira Kamano
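The overlap-or-interval measurement above reduces to simple interval arithmetic on the two speech periods. A minimal sketch, assuming periods are given as (start, end) pairs in seconds and that the second speaker's period starts first, as the abstract states:

```python
# Toy sketch: measure either the overlap between two speech periods or
# the silent interval between them. Times are invented examples.
def overlap_or_interval(first_period, second_period):
    """Periods are (start, end) seconds; second_period starts earlier."""
    overlap = min(first_period[1], second_period[1]) - first_period[0]
    if overlap > 0:
        return ("overlap", overlap)     # first speaker started while
                                        # the second was still talking
    return ("interval", -overlap)       # silence between the two turns

print(overlap_or_interval((3.0, 6.0), (1.0, 4.0)))   # ('overlap', 1.0)
print(overlap_or_interval((5.0, 7.0), (1.0, 4.0)))   # ('interval', 1.0)
```

A downstream state detector could then combine this duration with prosodic features extracted from the first speech period.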
  • Patent number: 8990083
    Abstract: A method is provided in one example and includes receiving data propagating in a network environment, and identifying selected words within the data based on a whitelist. The whitelist includes a plurality of designated words to be tagged. The method further includes assigning a weight to the selected words based on at least one characteristic associated with the data, and associating the selected words to an individual. A resultant composite is generated for the selected words that are tagged. In more specific embodiments, the resultant composite is partitioned amongst a plurality of individuals associated with the data propagating in the network environment. A social graph can be generated that identifies a relationship between a selected individual and the plurality of individuals based on a plurality of words exchanged between the selected individual and the plurality of individuals.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: March 24, 2015
    Assignee: Cisco Technology, Inc.
    Inventors: Satish K. Gannu, Ashutosh A. Malegaonkar, Virgil N. Mihailovici
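The whitelist tagging and weighting described above can be sketched in a few lines: only designated words are counted, and each occurrence is weighted by a characteristic of the data (here an invented per-source multiplier) before being attributed to an individual. The whitelist, weights, and names are all hypothetical.

```python
# Toy sketch of whitelist-based tagging: count only designated words,
# weight them by the data's source, and accumulate per individual.
from collections import defaultdict

WHITELIST = {"router", "firewall", "latency"}
SOURCE_WEIGHT = {"email": 1.0, "video": 2.0}    # assumed characteristic

def tag_words(text, source, individual, composite):
    for word in text.lower().split():
        if word in WHITELIST:
            composite[individual][word] += SOURCE_WEIGHT.get(source, 1.0)
    return composite

composite = defaultdict(lambda: defaultdict(float))
tag_words("the firewall dropped packets", "email", "alice", composite)
tag_words("firewall latency spiked", "video", "alice", composite)
print(dict(composite["alice"]))   # {'firewall': 3.0, 'latency': 2.0}
```

A social graph could then be built by comparing these per-individual composites across pairs of people who exchanged the underlying messages.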
  • Patent number: 8983841
    Abstract: A network communication node includes an audio outputter that outputs an audible representation of data to be provided to a requester. The network communication node also includes a processor that determines a categorization of the data to be provided to the requester and that varies a pause between segments of the audible representation of the data in accordance with the categorization of the data to be provided to the requester.
    Type: Grant
    Filed: July 15, 2008
    Date of Patent: March 17, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Gregory Pulz, Steven Lewis, Charles Rajnai
  • Publication number: 20150073792
    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
    Type: Application
    Filed: November 13, 2014
    Publication date: March 12, 2015
    Inventor: Giuseppe RICCARDI
  • Publication number: 20150073793
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for a speech recognition application for directory assistance that is based on a user's spoken search query. The spoken search query is received by a portable device, and the portable device then determines its present location. The device's location is incorporated into a local language model that is used to process the search query. Finally, the portable device outputs the results of the search query based on the local language model.
    Type: Application
    Filed: November 14, 2014
    Publication date: March 12, 2015
    Inventors: Enrico BOCCHIERI, Diamantino Antonio Caseiro
  • Publication number: 20150066507
    Abstract: A sound recognition apparatus can include a sound feature value calculating unit configured to calculate a sound feature value based on a sound signal, and a label converting unit configured to convert the sound feature value into a corresponding label with reference to label data in which sound feature values and labels indicating sound units are correlated. A sound identifying unit is configured to calculate a probability of each sound unit group sequence that a label sequence is segmented for each sound unit group with reference to segmentation data. The segmentation data indicates a probability that a sound unit sequence will be segmented into at least one sound unit group. The sound identifying unit can also identify a sound event corresponding to the sound unit group sequence selected based on the calculated probability.
    Type: Application
    Filed: August 26, 2014
    Publication date: March 5, 2015
    Inventors: Keisuke NAKAMURA, Kazuhiro NAKADAI
  • Patent number: 8972260
    Abstract: In accordance with one embodiment, a method of generating language models for speech recognition includes identifying a plurality of utterances in training data corresponding to speech, generating a frequency count of each utterance in the plurality of utterances, generating a high-frequency plurality of utterances from the plurality of utterances having a frequency that exceeds a predetermined frequency threshold, generating a low-frequency plurality of utterances from the plurality of utterances having a frequency that is below the predetermined frequency threshold, generating a grammar-based language model using the high-frequency plurality of utterances as training data, and generating a statistical language model using the low-frequency plurality of utterances as training data.
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: March 3, 2015
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Zhe Feng, Kui Xu, Lin Zhao
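The frequency-based split above is straightforward to sketch: count utterance occurrences, then route frequent utterances to the grammar-based model and rare ones to the statistical model. The utterances and threshold are invented, and utterances exactly at the threshold are routed to the high-frequency set here (the abstract does not specify that edge case).

```python
# Toy sketch: partition training utterances by frequency so that frequent,
# formulaic utterances train a grammar-based LM and rare, varied ones
# train a statistical LM.
from collections import Counter

def split_training_data(utterances, threshold=3):
    counts = Counter(utterances)
    high = [u for u, c in counts.items() if c >= threshold]
    low = [u for u, c in counts.items() if c < threshold]
    return high, low

data = ["play music"] * 5 + ["call mom"] * 4 + ["find thai food near the park"]
high, low = split_training_data(data)
print(high)  # ['play music', 'call mom'] -> grammar-based LM
print(low)   # ['find thai food near the park'] -> statistical LM
```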
  • Patent number: 8948466
    Abstract: In real biometric systems, false match rates and false non-match rates of 0% do not exist. There is always some probability that a purported match is false, and that a genuine match is not identified. The performance of biometric systems is often expressed in part in terms of their false match rate and false non-match rate, with the equal error rate being when the two are equal. There is a tradeoff between the FMR and FNMR in biometric systems which can be adjusted by changing a matching threshold. This matching threshold can be adjusted automatically, dynamically, and/or by the user so that a biometric system of interest can achieve a desired FMR and FNMR.
    Type: Grant
    Filed: October 11, 2013
    Date of Patent: February 3, 2015
    Assignee: Aware, Inc.
    Inventor: David Benini
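The FMR/FNMR tradeoff above can be demonstrated by sweeping the matching threshold over sample score distributions and finding where the two error rates coincide (the equal error rate). The score samples below are invented, and a real system would use far larger samples and a smarter search than this linear sweep.

```python
# Toy sketch: compute FMR/FNMR at a given matching threshold, then sweep
# thresholds to find the (approximate) equal-error operating point.
def error_rates(threshold, genuine, impostor):
    fnmr = sum(s < threshold for s in genuine) / len(genuine)    # missed matches
    fmr = sum(s >= threshold for s in impostor) / len(impostor)  # false matches
    return fmr, fnmr

def equal_error_threshold(genuine, impostor, steps=101):
    scores = genuine + impostor
    lo, hi = min(scores), max(scores)
    best_t, best_gap = lo, float("inf")
    for i in range(steps):
        t = lo + (hi - lo) * i / (steps - 1)
        fmr, fnmr = error_rates(t, genuine, impostor)
        if abs(fmr - fnmr) < best_gap:
            best_gap, best_t = abs(fmr - fnmr), t
    return best_t

genuine = [0.70, 0.80, 0.85, 0.90, 0.95]     # scores for true matches
impostor = [0.10, 0.20, 0.30, 0.40, 0.75]    # scores for impostors
t = equal_error_threshold(genuine, impostor)
fmr, fnmr = error_rates(t, genuine, impostor)
print(round(t, 2), fmr, fnmr)
```

Raising the threshold from this point lowers the FMR at the cost of a higher FNMR, and vice versa, which is the adjustable tradeoff the abstract describes.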
  • Patent number: 8935151
    Abstract: A source language sentence is tagged with non-lexical tags, such as part-of-speech tags, and is parsed using a lexicalized parser trained in the source language. A target language sentence that is a translation of the source language sentence is tagged with non-lexical labels (e.g., part-of-speech tags) and is parsed using a delexicalized parser that has been trained in the source language to produce k-best parses. The best parse is selected based on the parse's alignment with the lexicalized parse of the source language sentence. The selected best parse can be used to update the parameter vector of a lexicalized parser for the target language.
    Type: Grant
    Filed: December 7, 2011
    Date of Patent: January 13, 2015
    Assignee: Google Inc.
    Inventors: Slav Petrov, Ryan McDonald, Keith Hall
  • Patent number: 8930189
    Abstract: A particular method includes receiving, at a representational state transfer endpoint device, a first user input related to a first speech to text conversion performed by a speech to text transcription service. The method also includes receiving, at the representational state transfer endpoint device, a second user input related to a second speech to text conversion performed by the speech to text transcription service. The method includes processing of the first user input and the second user input at the representational state transfer endpoint device to generate speech to text adjustment information.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: January 6, 2015
    Assignee: Microsoft Corporation
    Inventors: Jeremy Edward Cath, Timothy Edwin Harris, Marc Mercuri, James Oliver Tisdale, III
  • Patent number: 8914277
    Abstract: According to example configurations, a speech-processing system parses an uttered sentence into segments. The speech-processing system translates each of the segments in the uttered sentence into candidate textual expressions (i.e., phrases of one or more words) in a first language. The uttered sentence can include multiple phrases or candidate textual expressions. Additionally, the speech-processing system translates each of the candidate textual expressions into candidate textual phrases in a second language. Based at least in part on a product of confidence values associated with the candidate textual expressions in the first language and confidence values associated with the candidate textual phrases in the second language, the speech-processing system produces a confidence metric for each of the candidate textual phrases in the second language.
    Type: Grant
    Filed: September 20, 2011
    Date of Patent: December 16, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Ding Liu
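The confidence metric above is a product of two stages: recognition confidence in the first language times translation confidence into the second. A minimal sketch with an invented stand-in translation table and invented confidence values:

```python
# Toy sketch: score each target-language candidate by the product of its
# recognition confidence and its translation confidence, then rank.
def rank_translations(recognition_candidates, translate):
    """recognition_candidates: list of (text, confidence) in language 1.
    translate(text): list of (translation, confidence) in language 2."""
    scored = []
    for text, rec_conf in recognition_candidates:
        for trans, tr_conf in translate(text):
            scored.append((trans, rec_conf * tr_conf))
    return sorted(scored, key=lambda x: -x[1])

def toy_translate(text):                      # stand-in translation model
    table = {
        "where is the bank": [("ou est la banque", 0.9)],
        "where is the tank": [("ou est le char", 0.8)],
    }
    return table.get(text, [])

candidates = [("where is the bank", 0.7), ("where is the tank", 0.2)]
ranked = rank_translations(candidates, toy_translate)
print(ranked[0][0], round(ranked[0][1], 2))   # ou est la banque 0.63
```

Because the metric multiplies the two stages, a confidently recognized but ambiguous phrase and a weakly recognized but unambiguous one are compared on a single scale.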
  • Patent number: 8909518
    Abstract: A warping factor estimation system comprises a label information generation unit that outputs voice/non-voice label information, a warp model storage unit in which a probability model representing voice and non-voice occurrence probabilities is stored, and a warp estimation unit that calculates a warping factor in the frequency-axis direction using the probability model representing voice and non-voice occurrence probabilities, the voice and non-voice labels, and a cepstrum.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: December 9, 2014
    Assignee: NEC Corporation
    Inventor: Tadashi Emori
  • Patent number: 8898058
    Abstract: Systems, methods, apparatus, and machine-readable media for voice activity detection in a single-channel or multichannel audio signal are disclosed.
    Type: Grant
    Filed: October 24, 2011
    Date of Patent: November 25, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Jongwon Shin, Erik Visser, Ian Ernan Liu
  • Patent number: 8898061
    Abstract: A particular method includes receiving, at a representational state transfer endpoint device, a first user input related to a first speech to text conversion performed by a speech to text transcription service. The method also includes receiving, at the representational state transfer endpoint device, a second user input related to a second speech to text conversion performed by the speech to text transcription service. The method includes processing of the first user input and the second user input at the representational state transfer endpoint device to generate speech to text adjustment information.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: November 25, 2014
    Assignee: Microsoft Corporation
    Inventors: Jeremy Edward Cath, Timothy Edwin Harris, Marc Mercuri, James Oliver Tisdale, III
  • Patent number: 8892436
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech while taking into account at least one earlier-positioned frame of the first speech.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: November 18, 2014
    Assignees: Samsung Electronics Co., Ltd., Seoul National University Industry Foundation
    Inventors: Ki-wan Eom, Chang-woo Han, Tae-gyoon Kang, Nam-soo Kim, Doo-hwa Hong, Jae-won Lee, Hyung-joon Lim
  • Patent number: 8892996
    Abstract: User input is received, specifying a continuous traced path across a keyboard presented on a touch sensitive display. An input sequence is resolved, including traced keys and auxiliary keys proximate to the traced keys by prescribed criteria. For each of one or more candidate entries of a prescribed vocabulary, a set-edit-distance metric is computed between said input sequence and the candidate entry. Various rules specify when penalties are imposed, or not, in computing the set-edit-distance metric. Candidate entries are ranked and displayed according to the computed metric.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: November 18, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Erland Unruh
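The set-edit-distance above can be illustrated with a small dynamic program: each traced position becomes a *set* of keys (the traced key plus its neighbors), and a substitution costs nothing when the candidate's letter falls inside that set. The keyboard neighborhood table, penalty scheme, and vocabulary below are invented simplifications of the patented criteria.

```python
# Toy sketch of a set-edit-distance for trace input: positions are key
# sets (traced key + adjacent keys); matching within a set is free.
NEIGHBORS = {"q": "wa", "w": "qes", "e": "wrd", "r": "etf", "t": "ryg"}

def key_sets(traced):
    return [set(k) | set(NEIGHBORS.get(k, "")) for k in traced]

def set_edit_distance(sets, word):
    m, n = len(sets), len(word)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if word[j - 1] in sets[i - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion penalty
                          d[i][j - 1] + 1,        # insertion penalty
                          d[i - 1][j - 1] + sub)  # (possibly free) match
    return d[m][n]

# User traced w-r-t; "wet" and "wry" both match at distance 0 via
# neighboring keys, while "mum" costs a full 3 substitutions.
sets = key_sets("wrt")
ranked = sorted(["wet", "wry", "mum"], key=lambda w: set_edit_distance(sets, w))
print(ranked[0])   # wet
```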