Subportions Patents (Class 704/254)
  • Patent number: 9653080
    Abstract: A method and apparatus for voice control of a mobile device are provided. The method establishes a connection between the mobile device and a voice-control module. Responsive to establishing the connection, the mobile device enters into an intermediate mode; and the voice-control module monitors for verbal input comprising a verbal command from among a set of predetermined verbal commands. The voice-control module sends instructions to the mobile device related to the verbal command received; and the mobile device acts on the received instructions. An apparatus, or voice-control module (VCM), for voice control of a mobile device is also provided, wherein the VCM includes a connection module configured for establishing a connection between the VCM and the mobile device; a monitoring module configured for monitoring for a verbal command from among a set of predetermined verbal commands; and a communications module configured for sending instructions to the mobile device related to the verbal command received.
    Type: Grant
    Filed: March 7, 2016
    Date of Patent: May 16, 2017
    Assignee: BlackBerry Limited
    Inventors: Ahmed Abdelsamie, Nicholas Shane Choo, Guowei Zhang, Omar George Joseph Barake, Steven Anthony Lill
  • Patent number: 9644985
    Abstract: A navigation device that provides guidance along a moving route and displays map data on a display screen, on the basis of the map data and the position of a moving object, automatically acquires evaluation information for a point or region from a recognition result and position information and registers it; the device can therefore record an impression (evaluation) of the point or region automatically and safely, without requiring any special operation by the user. Further, by adding the evaluation information to the map data and displaying this map data at the time of use, the navigation device enables the user to revisit a point through which the user wishes to pass again, to bypass a point through which the user never wishes to pass again, and so on, without the user having to consciously memorize an impression of each point.
    Type: Grant
    Filed: October 10, 2012
    Date of Patent: May 9, 2017
    Assignee: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Takumi Takei, Yuki Furumoto
  • Patent number: 9646608
    Abstract: A voice recognition apparatus includes a voice recognizer configured to recognize user utterance, a storage unit configured to store a plurality of tokens, a token network generator configured to generate a plurality of recognition tokens from the recognized user utterance, search for a similar token similar to each of the recognition tokens and a peripheral token having a history used with the recognition token among the plurality of tokens stored in the storage unit, and generate a token network using the recognition token, the similar token, and the peripheral token, and a processor configured to control the token network generator to generate the token network in response to the user utterance being recognized through the voice recognizer, calculate a transition probability between the tokens constituting the token network, and generate text data for corrected user utterance using the calculated transition probability.
    Type: Grant
    Filed: November 13, 2015
    Date of Patent: May 9, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kyung-duk Kim, Kyoung-min Kim
  • Patent number: 9619449
    Abstract: Techniques for encoding node locations are provided. In some examples, a node in a hierarchy such as a document object model (DOM) may be expressed as a location path from the root node of the hierarchy to the node. Each node on the location path may be encoded using a mapping between the type and/or index of the node and a set of characters such as ASCII characters. The encodings of the nodes on the path may be concatenated to form the encoded location path. The encoding seeks to minimize the length of the encoded location path. For example, in an embodiment, a single ASCII character may be used to encode both the type and the index of a node. In an embodiment, breakpoint locations may be encoded using the methods provided, to reduce the storage and computing resources required.
    Type: Grant
    Filed: May 14, 2015
    Date of Patent: April 11, 2017
    Assignee: Oracle International Corporation
    Inventors: Prabhu Thukkaram, Michal Chmielewski, Horst Heistermann
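The single-character encoding idea above can be sketched as follows. This is a rough illustration under assumed parameters (a three-entry node-type alphabet and the printable ASCII range), not the patent's actual mapping; every name here is invented.

```python
# Illustrative sketch: pack a node's (type, index) pair into one printable
# ASCII character, then concatenate per-node characters into an encoded
# location path. The type alphabet and character range are assumptions.

NODE_TYPES = ["element", "text", "comment"]  # assumed 3-type alphabet

def encode_step(node_type, index):
    """Pack a (type, index) pair into a single printable ASCII character."""
    code = NODE_TYPES.index(node_type) + len(NODE_TYPES) * index
    if code > 93:  # printable range '!'..'~' holds 94 values
        raise ValueError("index too large for single-character encoding")
    return chr(ord("!") + code)

def encode_path(path):
    """Concatenate per-node characters into the encoded location path."""
    return "".join(encode_step(t, i) for t, i in path)

def decode_path(encoded):
    """Recover the (type, index) pairs from an encoded location path."""
    return [(NODE_TYPES[(ord(c) - ord("!")) % 3], (ord(c) - ord("!")) // 3)
            for c in encoded]
```

With this scheme a path such as `[("element", 0), ("text", 1)]` encodes to just two characters, and the round trip through `decode_path` restores it.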
  • Patent number: 9600473
    Abstract: Various embodiments described herein facilitate multi-lingual communications. The systems and methods of some embodiments may enable multi-lingual communications through different modes of communications including, for example, Internet-based chat, e-mail, text-based mobile phone communications, postings to online forums, postings to online social media services, and the like. Certain embodiments may implement communications systems and methods that translate text between two or more languages (e.g., spoken), while handling/accommodating for one or more of the following in the text: specialized/domain-related jargon, abbreviations, acronyms, proper nouns, common nouns, diminutives, colloquial words or phrases, and profane words or phrases.
    Type: Grant
    Filed: February 8, 2013
    Date of Patent: March 21, 2017
    Assignee: Machine Zone, Inc.
    Inventors: Gabriel Leydon, Francois Orsini, Nikhil Bojja, Shailen Karur
  • Patent number: 9570069
    Abstract: Systems, methods, and computer program products to detect a keyword in speech, by generating, from a sequence of spectral feature vectors generated from the speech, a plurality of blocked feature vector sequences, and analyzing, by a neural network, each of the plurality of blocked feature vector sequences to detect the presence of the keyword in the speech.
    Type: Grant
    Filed: September 9, 2014
    Date of Patent: February 14, 2017
    Assignee: Disney Enterprises, Inc.
    Inventors: Jill F. Lehman, Pallavi N. Baljekar, Rita Singh
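The blocked-feature-vector approach can be sketched roughly as follows; the block size, stride, and the stand-in scoring function (which replaces the patent's neural network) are all illustrative assumptions.

```python
# Sketch of keyword spotting over "blocked" feature vector sequences:
# slide a fixed-size window over per-frame spectral feature vectors and
# score each block independently for the keyword.

def block_features(frames, block_size, stride):
    """Slice the per-frame feature vectors into overlapping blocks."""
    return [frames[i:i + block_size]
            for i in range(0, len(frames) - block_size + 1, stride)]

def detect_keyword(frames, score_block, threshold=0.5,
                   block_size=4, stride=2):
    """Flag the keyword if any block scores at or above the threshold."""
    blocks = block_features(frames, block_size, stride)
    return any(score_block(block) >= threshold for block in blocks)
```

In use, `score_block` would be the trained network; here any callable returning a score in [0, 1] works.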
  • Patent number: 9558738
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating an acoustic model for use in speech recognition. A system configured to practice the method first receives training data and identifies non-contextual lexical-level features in the training data. Then the system infers sentence-level features from the training data and generates a set of decision trees by node-splitting based on the non-contextual lexical-level features and the sentence-level features. The system decorrelates training vectors, based on the training data, for each decision tree in the set of decision trees to approximate full-covariance Gaussian models, and then can train an acoustic model for use in speech recognition based on the training data, the set of decision trees, and the training vectors.
    Type: Grant
    Filed: March 8, 2011
    Date of Patent: January 31, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Enrico Bocchieri, Diamantino Antonio Caseiro, Dimitrios Dimitriadis
  • Patent number: 9538295
    Abstract: A hearing aid is disclosed. The hearing aid comprises a microphone adapted to receive sound signals, an amplifier configured to amplify signals received by the microphone and output means (e.g. a receiver). The hearing aid is configured to detect if speech is received by the microphone and the hearing aid is configured to provide amplification of the detected sound signals according to a non-speech mode when no speech is detected. The hearing aid is configured to provide amplification of the detected sound signals according to a speech mode when speech is detected. The amplification carried out according to the non-speech mode is different from the amplification carried out according to the speech mode. The invention also discloses a method for amplifying sound signals received by a microphone in a hearing aid.
    Type: Grant
    Filed: August 26, 2014
    Date of Patent: January 3, 2017
    Assignee: OTICON A/S
    Inventor: Karsten Bo Rasmussen
  • Patent number: 9519637
    Abstract: A text processing apparatus includes an environmental information acquisition unit configured to acquire environmental information, a text acquisition unit configured to acquire text, a word extraction unit configured to extract a word from the text, and a joint indication unit configured to convert the word extracted from the text into a converted word using a dictionary that is accessed according to the environmental information acquired by the environmental information acquisition unit and indicate the converted word along with the word extracted from the text.
    Type: Grant
    Filed: August 11, 2014
    Date of Patent: December 13, 2016
    Assignee: Ricoh Company, Ltd.
    Inventors: Atsushi Koinuma, Toshiyuki Tezuka
  • Patent number: 9472183
    Abstract: A method for providing an audible prompt to a user within a vehicle. The method includes retrieving one or more data files from a memory device. The data files define certain characteristics of an audio prompt. The method also includes creating the audio prompt from the data files and outputting the audio prompt as an audio signal.
    Type: Grant
    Filed: March 11, 2014
    Date of Patent: October 18, 2016
    Assignee: Visteon Global Technologies, Inc.
    Inventors: Mark Zeinstra, Richard J. Chutorash, Jeffrey Golden, Jon M. Skekloff
  • Patent number: 9449617
    Abstract: Method and apparatus for segmenting speech by detecting the pauses between the words and/or phrases, and for determining whether a particular time interval contains speech or non-speech, such as a pause.
    Type: Grant
    Filed: March 21, 2016
    Date of Patent: September 20, 2016
    Assignee: SPEECH MORPHING SYSTEMS, INC.
    Inventors: Fathy Yassa, Ben Reaves
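A minimal sketch of interval-level speech/pause classification as described above, using average energy against a fixed threshold; the interval length, threshold, and energy feature are illustrative assumptions, not details taken from the patent.

```python
# Sketch: label each fixed-length interval of audio samples as "speech" or
# "pause" by comparing its mean energy to a threshold. Real segmenters use
# richer features; this only illustrates the interval-labeling structure.

def classify_intervals(samples, interval_len, energy_threshold):
    """Return one 'speech'/'pause' label per interval of samples."""
    labels = []
    for start in range(0, len(samples), interval_len):
        chunk = samples[start:start + interval_len]
        energy = sum(x * x for x in chunk) / len(chunk)
        labels.append("speech" if energy >= energy_threshold else "pause")
    return labels
```

Word boundaries then fall wherever a run of "speech" intervals meets a run of "pause" intervals.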
  • Patent number: 9424837
    Abstract: A method for configuring a speech recognition system comprises obtaining a speech sample utilized by a voice authentication system in a voice authentication process. The speech sample is processed to generate acoustic models for units of speech associated with the speech sample. The acoustic models are stored for subsequent use by the speech recognition system as part of a speech recognition process.
    Type: Grant
    Filed: January 23, 2013
    Date of Patent: August 23, 2016
    Inventors: Habib Emile Talhami, Amit Sadanand Malegaonkar, Renuka Amit Malegaonkar, Clive David Summerfield
  • Patent number: 9390066
    Abstract: The invention consists of new ways of constructing Measuring Matrices (MMs), including time deconvolution of Digital Fourier Transforms (DFTs). Also, windowing functions specifically designed to facilitate time deconvolution may be used, and/or the DFTs may be performed in specific non-periodic ways to reduce artifacts and further facilitate deconvolution. These deconvolved DFTs may be used alone or correlated with other DFTs to produce an MM.
    Type: Grant
    Filed: November 11, 2010
    Date of Patent: July 12, 2016
    Assignee: Digital Harmonic LLC
    Inventors: Paul Reed Smith, Jack W. Smith, Ernestine M. Smith, Frederick M. Slay
  • Patent number: 9384730
    Abstract: A reading accuracy-improving system includes: a reading conversion unit for retrieving a plurality of candidate word strings from speech recognition results to determine the reading of each candidate word string; a reading score calculating unit for determining the speech recognition score for each of one or more candidate word strings with the same reading to determine a reading score; and a candidate word string selection unit for selecting a candidate to output from the plurality of candidate word strings on the basis of the reading score and speech recognition score corresponding to each candidate word string.
    Type: Grant
    Filed: April 14, 2014
    Date of Patent: July 5, 2016
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Masafumi Nishimura, Ryuki Tachibana
  • Patent number: 9342588
    Abstract: A method of creating a statistical classification model for a classifier within a natural language understanding system can include processing training data using an existing statistical classification model. Sentences of the training data correctly classified into a selected class of the statistical classification model can be selected. The selected sentences of the training data can be assigned to a fringe group or a core group according to confidence score. The training data can be updated by associating the fringe group with a fringe subclass of the selected class and the core group with a core subclass of the selected class. A new statistical classification model can be built from the updated training data. The new statistical classification model can be output.
    Type: Grant
    Filed: June 18, 2007
    Date of Patent: May 17, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Rajesh Balchandran, Linda M. Boyer, Gregory Purdy
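The fringe/core relabeling step can be sketched as follows; the tuple layout, subclass naming convention, and confidence cutoff are assumptions made for illustration.

```python
# Sketch of the relabeling pass: correctly classified training sentences
# are assigned to a "core" or "fringe" subclass of their class by
# confidence score, yielding updated labels for rebuilding the model.

def split_by_confidence(examples, confidence_cutoff):
    """examples: (sentence, predicted_class, true_class, confidence) tuples.
    Returns relabeled (sentence, subclass) pairs for correct predictions."""
    updated = []
    for sentence, predicted, true, confidence in examples:
        if predicted != true:
            continue  # only correctly classified sentences are relabeled
        group = "core" if confidence >= confidence_cutoff else "fringe"
        updated.append((sentence, f"{true}:{group}"))
    return updated
```

A new statistical classification model would then be trained on the relabeled pairs, treating each subclass as its own target.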
  • Patent number: 9324319
    Abstract: Method and apparatus for segmenting speech by detecting the pauses between the words and/or phrases, and for determining whether a particular time interval contains speech or non-speech, such as a pause.
    Type: Grant
    Filed: April 25, 2014
    Date of Patent: April 26, 2016
    Assignee: SPEECH MORPHING SYSTEMS, INC.
    Inventors: Fathy Yassa, Ben Reaves
  • Patent number: 9324325
    Abstract: A method and system for converting voice data to text data between users is provided. The method includes receiving voice data from at least one user and determining phoneme data items corresponding to the voice data. Conversion candidate string representations of the phoneme data items are identified by referencing a conversion dictionary defining the conversion candidate string representations for each phoneme data item. The plurality of conversion candidate string representations are scored and a specified conversion candidate string representation is selected as text data based on the scores. The text data is transmitted to a terminal device accessed by the at least one user.
    Type: Grant
    Filed: July 28, 2014
    Date of Patent: April 26, 2016
    Assignee: International Business Machines Corporation
    Inventors: Kensuke Hashimoto, Yohichi Hattori, Taroh Sanui, Hisae Shiiki
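The conversion-dictionary lookup and candidate scoring can be sketched as follows; the dictionary contents, phoneme notation, and scores are invented for illustration, and a simple maximum stands in for the patent's scoring scheme.

```python
# Sketch: map each phoneme data item to candidate string representations
# via a conversion dictionary, score the candidates, and emit the best
# one as text. Entries and scores here are illustrative only.

CONVERSION_DICT = {
    "k-a-t": [("cat", 0.8), ("kat", 0.2)],
    "d-o-g": [("dog", 0.9), ("dawg", 0.1)],
}

def phonemes_to_text(phoneme_items):
    """Select the highest-scoring candidate string for each phoneme item."""
    words = []
    for item in phoneme_items:
        candidates = CONVERSION_DICT.get(item, [(item, 0.0)])
        best, _ = max(candidates, key=lambda cand: cand[1])
        words.append(best)
    return " ".join(words)
```

Unknown phoneme items fall through unchanged, which keeps the sketch total even for out-of-dictionary input.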
  • Patent number: 9311915
    Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.
    Type: Grant
    Filed: September 18, 2013
    Date of Patent: April 12, 2016
    Assignee: Google Inc.
    Inventors: Eugene Weinstein, Pedro J. Mengibar, Johan Schalkwyk
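Combining acoustic features with non-acoustic context can be sketched as a simple feature concatenation; the one-hot application-context vocabulary is an assumption, and the sketch omits the patent's neural network entirely.

```python
# Sketch: encode non-audio context (e.g. which app is active) as a one-hot
# vector and append it to every acoustic frame, so a single model can
# condition its transcription on both. Vocabulary here is illustrative.

def one_hot(label, vocabulary):
    """Encode a categorical context label as a one-hot vector."""
    return [1.0 if entry == label else 0.0 for entry in vocabulary]

def with_context(frames, context_vec):
    """Append the same context vector to every acoustic feature frame."""
    return [list(frame) + list(context_vec) for frame in frames]
```

The augmented frames would then be fed to the network in place of the raw acoustic features.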
  • Patent number: 9286894
    Abstract: Recognition techniques may include the following. On a first processing entity, a first recognition process is performed on a first element, where the first recognition process includes: in a first state machine having M (M>1) states, determining a first best path cost in at least a subset of the M states for at least part of the first element. On a second processing entity, a second recognition process is performed on a second element, where the second recognition process includes: in a second state machine having N (N>1) states, determining a second best path cost in at least a subset of the N states for at least part of the second element. At least one of the following is done: (i) passing the first best path cost to the second state machine, or (ii) passing the second best path cost to the first state machine. The foregoing techniques may include one or more of the following features, either alone or in combination.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: March 15, 2016
    Assignee: Google Inc.
    Inventor: Mitchel Weintraub
  • Patent number: 9282284
    Abstract: Videoconferencing may be provided. A participant may be identified from audio information and in video information. From the video information, a plurality of images may be captured of the participant identified in the video information. A unique identifier may be associated with the captured plurality of images. The unique identifier may correspond to the participant identified from the audio information. The captured plurality of images and the associated unique identifier may be saved in a database.
    Type: Grant
    Filed: May 20, 2013
    Date of Patent: March 8, 2016
    Assignee: Cisco Technology, Inc.
    Inventors: Sachin S. Kajarekar, Mainak Sen
  • Patent number: 9275640
    Abstract: Systems, methods, and apparatus, including computer program products for accepting a predetermined vocabulary-dependent characterization of a set of audio signals, the predetermined characterization including an identification of putative occurrences of each of a plurality of vocabulary items in the set of audio signals, the plurality of vocabulary items included in the vocabulary; accepting a new vocabulary item not included in the vocabulary; accepting putative occurrences of the new vocabulary item in the set of audio signals; and generating, by an analysis engine of a speech processing system, an augmented characterization of the set of audio signals based on the identified putative occurrences of the new vocabulary item.
    Type: Grant
    Filed: November 24, 2009
    Date of Patent: March 1, 2016
    Assignee: Nexidia Inc.
    Inventors: Kenneth King Griggs, Jon A. Arrowood
  • Patent number: 9269355
    Abstract: Features are disclosed for transferring speech recognition workloads between pooled execution resources. For example, various parts of an automatic speech recognition engine may be implemented by various pools of servers. Servers in a speech recognition pool may explore a plurality of paths in a graph to find the path that best matches an utterance. A set of active nodes comprising the last node explored in each path may be transferred between servers in the pool depending on resource availability at each server. A history of nodes or arcs traversed in each path may be maintained by a separate pool of history servers, and used to generate text corresponding to the path identified as the best match by the speech recognition servers.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: February 23, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Hugh Evan Secker-Walker, Naresh Narayanan
  • Patent number: 9251789
    Abstract: A speech recognition system that recognizes speech data is provided. The speech recognition system includes a speech recognition part that performs speech recognition of the speech data, and calculates a likelihood of the speech data with respect to a registered word that is pre-registered, a reliability judgment part that performs reliability judgment on the speech recognition based on the likelihood, and a judgment reference change processing part that changes a judgment reference for the reliability judgment, according to an utterance speed of the speech data.
    Type: Grant
    Filed: June 24, 2013
    Date of Patent: February 2, 2016
    Assignee: SEIKO EPSON CORPORATION
    Inventor: Kiyotaka Morioka
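The speed-dependent judgment reference can be sketched as a threshold switch; the rates and thresholds below are invented, and a real system would likely vary the reference continuously rather than in two steps.

```python
# Sketch: relax the likelihood threshold used for the reliability judgment
# when the utterance is fast, on the assumption that fast speech matches
# registered words less cleanly. All numeric values are illustrative.

def is_reliable(likelihood, syllables_per_second,
                base_threshold=0.8, fast_rate=6.0, relaxed_threshold=0.65):
    """Judge recognition reliability with a speed-adjusted reference."""
    if syllables_per_second > fast_rate:
        threshold = relaxed_threshold  # fast speech: relaxed reference
    else:
        threshold = base_threshold
    return likelihood >= threshold
```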
  • Patent number: 9239829
    Abstract: A speech recognition device includes: a speech recognition unit 23a that performs speech recognition on input speech; a reading information conversion database in which a reading information conversion rule L is registered; a reading information conversion unit 27a that converts a word's reading information between languages based on the rule L; and a speech recognition control unit 24a. When a word in a language different from a predetermined language is included in the recognition subject vocabulary whose recognition subject word information E the speech recognition unit 23a references, the control unit 24a directs the unit 27a to convert the word's reading information in the different language into reading information in the predetermined language, and directs the unit 23a to perform speech recognition with reference to the recognition subject word information of the corresponding word, including the converted reading information in the predetermined language.
    Type: Grant
    Filed: October 1, 2010
    Date of Patent: January 19, 2016
    Assignee: Mitsubishi Electric Corporation
    Inventors: Yoichi Kato, Jun Ishii, Hiroki Sakashita
  • Patent number: 9230555
    Abstract: An apparatus receives an input encoded audio data signal comprising a base layer and at least one enhancement layer. A reference unit (103) generates reference audio data corresponding to audio data of a reference set of layers. A layer unit (105) divides the layers of the input signal into a first subset and a second subset. A sample unit (107) generates sample audio data corresponding to the audio data of the first subset. A comparison unit (109) generates a difference measure by comparing the sample audio data to the reference audio data based on a perceptual model. An output unit (111) then determines if the difference measure meets a similarity criterion and generates an output signal without audio data from a layer of the second subset if the similarity criterion is met and including the audio data of the layer otherwise. The invention may provide reduced data rates without an unacceptable degradation of quality.
    Type: Grant
    Filed: April 1, 2010
    Date of Patent: January 5, 2016
    Assignee: GOOGLE TECHNOLOGY HOLDINGS LLC
    Inventors: Holly Francois, Jonathan A. Gibbs
  • Patent number: 9224391
    Abstract: A method for automatically providing a hypothesis of a linguistic formulation that is uttered by users of a voice service based on an automatic speech recognition system and that is outside a recognition domain of the automatic speech recognition system. The method includes providing a constrained and an unconstrained speech recognition from an input speech signal, identifying a part of the constrained speech recognition outside the recognition domain, identifying a part of the unconstrained speech recognition corresponding to the identified part of the constrained speech recognition, and providing the linguistic formulation hypothesis based on the identified part of the unconstrained speech recognition.
    Type: Grant
    Filed: February 17, 2005
    Date of Patent: December 29, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Daniele Colibro, Claudio Vair, Luciano Fissore, Cosmin Popovici
  • Patent number: 9202466
    Abstract: The invention presents a method for analyzing speech in a spoken dialog system, comprising the steps of: accepting an utterance via at least one means for accepting acoustical signals, in particular a microphone; and analyzing the utterance and obtaining prosodic cues from it using at least one processing engine, wherein the utterance is evaluated based on the prosodic cues to determine the prominence of its parts, and wherein the utterance is analyzed to detect at least one marker feature, e.g. a negative statement, indicating that the utterance contains at least one part intended to replace at least one part of a previous utterance. The part to be replaced in the previous utterance is determined based on the prominence determined for the parts of the previous utterance, the replacement parts are determined based on the prominence of the parts in the current utterance, and the previous utterance is then re-evaluated with the replacement part(s).
    Type: Grant
    Filed: March 18, 2013
    Date of Patent: December 1, 2015
    Assignee: HONDA RESEARCH INSTITUTE EUROPE GMBH
    Inventor: Martin Heckmann
  • Patent number: 9196246
    Abstract: A method for recognizing speech including a sequence of words determines a shape of a gesture and a location of the gesture with respect to a display device showing a set of interpretations of the speech. The method determines a type of the word sequence constraint based on the shape of the gesture and determines a value of the word sequence constraint based on the location of the gesture. Next, the speech is recognized using the word sequence constraint.
    Type: Grant
    Filed: July 22, 2014
    Date of Patent: November 24, 2015
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Bret Harsham, John Hershey
  • Patent number: 9162069
    Abstract: An arrangement and method for hearing implant signal processing is described. A hearing implant signal processor converts a speech signal input into multi-channel electrical stimulation signals for a cochlear implant electrode array. A feature adjustment module adjusts feature resolution of the electrical stimulation signals to produce a corresponding sequence of cochlear stimulation feature vectors. A speech recognition vector pre-processor maps the cochlear stimulation feature vectors into corresponding speech recognition feature vectors. A speech recognition engine evaluates the speech recognition feature vectors with probabilistic state sequence models to produce speech recognition outputs corresponding to the speech signal input.
    Type: Grant
    Filed: December 11, 2013
    Date of Patent: October 20, 2015
    Assignee: MED-EL Elektromedizinische Geraete GmbH
    Inventor: Werner Hemmert
  • Patent number: 9165553
    Abstract: System and method for performing speech recognition using acoustic invariant structure for large-vocabulary continuous speech. An information processing device receives sound as input and performs speech recognition. The information processing device includes: a speech recognition processing unit for outputting a speech recognition score; a structure score calculation unit for calculating a structure score, which, for each hypothesis, is found by applying phoneme-pair-by-pair weighting to the inter-distribution distance likelihood of every phoneme pair comprising the hypothesis and then summing; and a ranking unit for ranking the multiple hypotheses based on the sum of the speech recognition score and the structure score.
    Type: Grant
    Filed: January 18, 2013
    Date of Patent: October 20, 2015
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Masayuki Suzuki, Masafumi Nishimura
  • Patent number: 9159329
    Abstract: A method and system for improving the quality of speech generated from Hidden Markov Model (HMM)-based Text-To-Speech Synthesizers using statistical post-filtering techniques. An example method involves: (a) determining a scale factor that, when applied to a synthesized reference spectral envelope, minimizes a statistical divergence between a natural reference spectral envelope and the synthesized reference spectral envelope, where the synthesized reference spectral envelope is generated by a state of an HMM; (b) for a given synthesized subject spectral envelope generated by the state of the HMM, determining an enhanced synthesized subject spectral envelope based on the determined scale factor; and (c) generating, by a computing device, a synthetic speech signal including the enhanced synthesized subject spectral envelope.
    Type: Grant
    Filed: December 5, 2012
    Date of Patent: October 13, 2015
    Assignee: Google Inc.
    Inventors: Ioannis Agiomyrgiannakis, Florian Alexander Eyben
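The scale-factor step can be sketched with ordinary least squares standing in for the patent's statistical divergence; the substitution of least squares and the envelope values are assumptions made to keep the sketch self-contained.

```python
# Sketch: fit a single scale factor so that the scaled synthesized
# reference spectral envelope best matches the natural one (least-squares
# proxy for the patent's divergence), then reuse that factor to enhance
# other synthesized envelopes from the same HMM state.

def fit_scale(natural, synthesized):
    """Scale s minimizing sum((natural[k] - s * synthesized[k])**2)."""
    numerator = sum(n * s for n, s in zip(natural, synthesized))
    denominator = sum(s * s for s in synthesized)
    return numerator / denominator

def enhance(envelope, scale):
    """Apply the fitted scale factor to a synthesized subject envelope."""
    return [scale * value for value in envelope]
```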
  • Patent number: 9148732
    Abstract: A method for testing hearing aids, particularly in respect of their effect on speech comprehension, is improved. The method has the following steps: a) a test system is provided, b) an audible voice signal in the form of at least one meaningless syllable is selected and presented, c) the meaningless syllable and a number of further meaningless syllables are displayed on a graphical user interface in the test system, d) a heard meaningless syllable is selected from the displayed meaningless syllables by the user, e) the selection made is evaluated, and f) method steps c) to e) are repeated until a termination criterion is encountered. The audible voice signal is selected in each case on the basis of at least one selection made by the person in response to a previously presented voice signal. The method allows particularly fast and informative hearing aid tests.
    Type: Grant
    Filed: March 10, 2011
    Date of Patent: September 29, 2015
    Assignee: Sivantos Pte. Ltd.
    Inventors: Martina Bellanova, Matthias Latzel
  • Patent number: 9129602
    Abstract: Approaches are described for generating an audio signal that mimics speech captured by a computing device. An input audio signal (e.g., a speech signal) can be transformed from the time domain into another domain to generate one or more audio signal segments, where each segment can correspond to a window of time. The device can then determine, for each audio signal segment, a feature characteristic of the audio signal, such as a phoneme. Each of the segments can be mapped, based at least in part on the respective feature characteristic, to a model audio signal. The device can then generate an output audio signal including each model audio signal as determined by the mapping, where the output audio signal is in a sequence associated with the input audio signal.
    Type: Grant
    Filed: December 14, 2012
    Date of Patent: September 8, 2015
    Assignee: Amazon Technologies, Inc.
    Inventors: Isaac Jeremy Shepard, Brian David Fisher
  • Patent number: 9124697
    Abstract: A computerized system for advising one communicant in electronic communication between two or more communicants has apparatus monitoring and recording interaction between the communicants, and software executing from a machine-readable medium and providing analytics, the software functions including rendering speech into text, analyzing the rendered text for topics, performing communicant verification, and detecting changes in communicant emotion. Advice is offered to the one communicant during the interaction, based on results of the analytics.
    Type: Grant
    Filed: May 6, 2013
    Date of Patent: September 1, 2015
    Assignee: GENESYS TELECOMMUNICATIONS LABORATORIES, INC.
    Inventors: Mark Scott, Jim Barnett
  • Patent number: 9075829
    Abstract: A technique extracts an object that is characteristic even though it appears only rarely. A clustering apparatus includes: a similarity degree calculating section that calculates a similarity degree for each combination of any two objects and stores it in a similarity degree table, excluding combinations of an object with itself; a merging object selecting section that selects, as merging objects, two objects whose similarity degree satisfies a predetermined reference; a new object generating section that generates a new object from the merging objects; a merging object removing section that removes from the similarity degree table the similarity degrees between each of the two objects selected as merging objects and each of the other objects; and a new object adding section that calculates a similarity degree between the new object and each of the plurality of objects other than the new object.
    Type: Grant
    Filed: April 8, 2010
    Date of Patent: July 7, 2015
    Assignee: NEC CORPORATION
    Inventor: Yukiko Kuroiwa
  • Patent number: 9047540
    Abstract: Systems, apparatuses, and methods to relate images of words to a list of words are provided. A trellis-based word decoder analyses a set of OCR characters and probabilities using a forward pass across a forward trellis and a reverse pass across a reverse trellis. Multiple paths may result; however, the most likely path from the trellises has the highest probability with valid links. A valid link is determined from the trellis by some dictionary word traversing the link. The most likely path is compared with a list of words to find the word closest to the most likely path.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: June 2, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Pawan Kumar Baheti, Kishor K. Barman, Raj Kumar Krishna Kumar
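    The final matching step, comparing a decoded character string against the word list, can be approximated with plain edit distance; the trellis passes themselves are omitted here, and `closest_word` is a hypothetical name:

    ```python
    def levenshtein(a, b):
        # Classic edit distance; stands in for scoring a decoded path against a word.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,          # deletion
                               cur[j - 1] + 1,       # insertion
                               prev[j - 1] + (ca != cb)))  # substitution
            prev = cur
        return prev[-1]

    def closest_word(decoded, dictionary):
        """Map the most likely trellis path (a character string) to the nearest
        dictionary word; a simplified stand-in for the patent's word matching."""
        return min(dictionary, key=lambda w: levenshtein(decoded, w))
    ```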
  • Patent number: 9047857
    Abstract: Techniques for transitioning an electronic device between device states. In one example, a voice-controlled device is configured to transition from a sleep state to an awake state in response to identifying a user speaking a predefined utterance. The techniques may determine whether the user has spoken the predefined utterance with reference to traditional speech-recognition techniques, as well as with reference to changes in the volume of a user's voice.
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: June 2, 2015
    Assignee: Rawles LLC
    Inventor: William F. Barton
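    A minimal sketch of combining the two evidence sources the abstract mentions, a keyword-spotting score and a rise in voice volume; all thresholds and parameter names are invented for illustration:

    ```python
    def should_wake(asr_score, rms_now, rms_baseline,
                    asr_threshold=0.6, volume_ratio=1.5):
        """Transition to the awake state only when the recognizer's confidence
        for the predefined utterance is high AND the user's voice has risen
        noticeably above its recent baseline volume."""
        volume_jump = rms_baseline > 0 and rms_now / rms_baseline >= volume_ratio
        return asr_score >= asr_threshold and volume_jump
    ```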
  • Patent number: 9037463
    Abstract: A method for speech recognition is described that uses an initial recognizer to perform an initial speech recognition pass on an input speech utterance to determine an initial recognition result corresponding to the input speech utterance, and a reliability measure reflecting a per word reliability of the initial recognition result. For portions of the initial recognition result where the reliability of the result is low, a re-evaluation recognizer is used to perform a re-evaluation recognition pass on the corresponding portions of the input speech utterance to determine a re-evaluation recognition result corresponding to the re-evaluated portions of the input speech utterance. The initial recognizer and the re-evaluation recognizer are complementary so as to make different recognition errors. A final recognition result is determined based on the re-evaluation recognition result if any, and otherwise based on the initial recognition result.
    Type: Grant
    Filed: May 27, 2010
    Date of Patent: May 19, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel Willett, Venkatesh Nagesha
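    The two-pass flow can be sketched as below, assuming the initial recognizer emits per-word reliabilities and `second_pass` stands in for the complementary re-evaluation recognizer:

    ```python
    def two_pass(initial, second_pass, threshold=0.5):
        """`initial` is a list of (word, reliability) pairs from the first
        recognizer; `second_pass(index)` re-decodes the audio portion behind
        word `index`. A toy skeleton of the re-evaluation flow."""
        final = []
        for i, (word, reliability) in enumerate(initial):
            if reliability < threshold:
                final.append(second_pass(i))   # re-evaluate the unreliable portion
            else:
                final.append(word)             # keep the reliable initial result
        return final
    ```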
  • Publication number: 20150134336
    Abstract: The performance of traditional speech recognition systems (as applied to information extraction or translation) decreases significantly with larger domain size and scarce training data, as well as under noisy environmental conditions. This invention mitigates these problems through the introduction of a novel predictive feature extraction method which combines linguistic and statistical information for representation of information embedded in a noisy source language. The predictive features are combined with text classifiers to map the noisy text to one of the semantically or functionally similar groups. The features used by the classifier can be syntactic, semantic, and statistical.
    Type: Application
    Filed: November 12, 2013
    Publication date: May 14, 2015
    Inventors: Jun Huang, Yookyung Kim, Youssef Billawala, Farzad Ehsani, Demitrios Master
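    A bare-bones stand-in for mapping noisy recognized text to one of the semantically similar groups; keyword-overlap features replace the richer syntactic, semantic, and statistical features the abstract names:

    ```python
    def classify_noisy(text, group_keywords):
        """Assign noisy text to the group with the largest keyword overlap.
        `group_keywords` maps group name -> set of cue words; a toy classifier,
        not the patented feature extraction."""
        tokens = set(text.lower().split())
        return max(group_keywords, key=lambda g: len(tokens & group_keywords[g]))
    ```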
  • Publication number: 20150134337
    Abstract: A conversation based search method includes the steps of: proposing an utterance phrase corresponding to a query input by a user; and reforming the query by means of an answer from the user and offering a search result corresponding to the reformed query.
    Type: Application
    Filed: November 11, 2014
    Publication date: May 14, 2015
    Inventors: Hee-Cheol Seo, Injae Lee, Taeil Kim, Youngho Na
  • Patent number: 9031843
    Abstract: A method and apparatus for enabling multimodal tags in a communication device is disclosed. The method comprises receiving a first training signal and receiving a second training signal in conjunction with the first training signal. A multimodal tag is created by discarding redundant or non-discriminative information associated with each of the first and second training signals to represent a combination of the first training signal and the second training signal and a function is associated with the created multimodal tag.
    Type: Grant
    Filed: September 28, 2007
    Date of Patent: May 12, 2015
    Assignee: Google Technology Holdings LLC
    Inventors: Changxue Ma, Harry M. Bliss
  • Publication number: 20150127346
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more feature scores are determined, the feature scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.
    Type: Application
    Filed: November 4, 2014
    Publication date: May 7, 2015
    Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw
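    The select-by-classifier step can be sketched as follows, with `features` and `classifier` as hypothetical stand-ins for the trained components:

    ```python
    def select_alternates(alternates, features, classifier, k=3):
        """Score each alternate for a span of the transcription and keep the
        top `k` for display. `features(alt)` returns a feature vector;
        `classifier(vec)` returns a display-worthiness score."""
        scored = [(classifier(features(alt)), alt) for alt in alternates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [alt for _, alt in scored[:k]]
    ```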
  • Patent number: 9026442
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: May 5, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
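    The weighted-sum restructuring can be illustrated with toy mean-vector models; real systems mix full acoustic densities rather than plain vectors, and the weights here are assumed given:

    ```python
    def restructure_phoneme(native_models, weights):
        """Build a custom acoustic model for one dictionary phoneme as a
        normalized weighted sum of the native-dialect models for all plausible
        phonemes. `native_models` maps phoneme -> mean vector."""
        total = sum(weights.values())
        dims = len(next(iter(native_models.values())))
        custom = [0.0] * dims
        for phone, w in weights.items():
            for d, v in enumerate(native_models[phone]):
                custom[d] += (w / total) * v
        return custom
    ```

    The pronouncing dictionary itself is untouched; only the acoustic model behind each dictionary phoneme is replaced by the blend.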
  • Patent number: 9020818
    Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 28, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
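    The add-or-update decision for a candidate codebook tuple might look like this; the novelty threshold and the averaging update rule are illustrative assumptions, not the patented criteria:

    ```python
    def maybe_add_tuple(codebook, candidate, distance, novelty_threshold=1.0):
        """Add `candidate` only if it carries enough new information relative
        to the nearest existing tuple; otherwise blend it into that tuple."""
        if not codebook:
            codebook.append(candidate)
            return codebook
        nearest = min(range(len(codebook)),
                      key=lambda i: distance(codebook[i], candidate))
        if distance(codebook[nearest], candidate) > novelty_threshold:
            codebook.append(candidate)          # sufficiently new: add it
        else:
            old = codebook[nearest]             # familiar: refresh the existing tuple
            codebook[nearest] = tuple((a + b) / 2 for a, b in zip(old, candidate))
        return codebook
    ```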
  • Publication number: 20150112683
    Abstract: An utterance content estimator estimates a document ID corresponding to an answer to user input analysis results from a document on the basis of an utterance estimating model that is generated by learning a correspondence between hypothetical questions each as to a content of the document and document IDs each of which is an answer to one of the hypothetical questions. A result integrator integrates document estimation results of the utterance estimating model and document search results of search indexes so as to generate final search results.
    Type: Application
    Filed: December 27, 2012
    Publication date: April 23, 2015
    Applicant: Mitsubishi Electric Corporation
    Inventors: Yoichi Fujii, Jun Ishii
  • Patent number: 9009041
    Abstract: A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription is received of an audio recording. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data is received that is associated with the transcribed word. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.
    Type: Grant
    Filed: July 26, 2011
    Date of Patent: April 14, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: George Zavaliagkos, William F. Ganong, III, Uwe H. Jost, Shreedhar Madhavapeddi, Gary B. Clayton
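    A toy version of the re-scoring step, in which a transcribed word is swapped for a personal-vocabulary replacement that scores higher in context; `score` stands in for the patent's re-scoring model and the one-word context window is an assumption:

    ```python
    def rescore(transcript, personal_vocab, score):
        """`transcript` is a list of ASR words; `personal_vocab` maps a word to
        candidate replacements drawn from the user's personal data;
        `score(word, context)` rates a word in its local context."""
        out = []
        for i, word in enumerate(transcript):
            context = transcript[max(0, i - 1):i + 2]
            candidates = personal_vocab.get(word, []) + [word]
            out.append(max(candidates, key=lambda w: score(w, context)))
        return out
    ```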
  • Publication number: 20150095031
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for crowdsourcing verification of word pronunciations. A system performing word pronunciation crowdsourcing identifies spoken words, or word pronunciations in a dictionary of words, for review by a turker. The identified words are assigned to one or more turkers for review. Assigned turkers listen to the word pronunciations, providing feedback on the correctness/incorrectness of the machine-made pronunciation. The feedback can then be used to modify the lexicon, or can be stored for use in configuring future lexicons.
    Type: Application
    Filed: September 30, 2013
    Publication date: April 2, 2015
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. Conkie, Ladan Golipour, Taniya Mishra
  • Patent number: 8996380
    Abstract: Systems and methods of synchronizing media are provided. A client device may be used to capture a sample of a media stream being rendered by a media rendering source. The client device sends the sample to a position identification module to determine a time offset indicating a position in the media stream corresponding to the sampling time of the sample, and optionally a timescale ratio indicating a speed at which the media stream is being rendered by the media rendering source based on a reference speed of the media stream. The client device calculates a real-time offset using a present time, a timestamp of the media sample, the time offset, and optionally the timescale ratio. The client device then renders a second media stream at a position corresponding to the real-time offset, in synchrony with the media stream being rendered by the media rendering source.
    Type: Grant
    Filed: May 4, 2011
    Date of Patent: March 31, 2015
    Assignee: Shazam Entertainment Ltd.
    Inventors: Avery Li-Chun Wang, Rahul Powar, William Michael Mills, Christopher Jacques Penrose Barton, Philip Georges Inghelbrecht, Dheeraj Shankar Mukherjee
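    The real-time offset calculation the abstract describes reduces to simple arithmetic: the matched position plus the wall-clock time elapsed since the sample, scaled by the rendering speed. This reading of the computation is plausible but not the claim language verbatim:

    ```python
    def realtime_offset(time_offset, sample_timestamp, now, timescale_ratio=1.0):
        """Estimated current position in the reference stream, in seconds.
        `time_offset` is the matched position of the sample, `sample_timestamp`
        the time the sample was captured, and `timescale_ratio` the rendering
        speed relative to the reference speed."""
        return time_offset + (now - sample_timestamp) * timescale_ratio
    ```

    The second media stream would then be started at this position to play in synchrony with the first.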
  • Patent number: 8996374
    Abstract: Embodiments of the present invention include an apparatus, method, and system for calculating senone scores for multiple concurrent input speech streams. The method can include the following: receiving one or more feature vectors from one or more input streams; accessing the acoustic model one senone at a time; and calculating separate senone scores corresponding to each incoming feature vector. The calculation uses a single read access to the acoustic model for a single senone and calculates a set of separate senone scores for the one or more feature vectors, before proceeding to the next senone in the acoustic model.
    Type: Grant
    Filed: November 6, 2012
    Date of Patent: March 31, 2015
    Assignee: Spansion LLC
    Inventor: Ojas A. Bapat
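    The one-read-per-senone access pattern can be sketched as an outer loop over senones and an inner loop over the concurrent streams' feature vectors; the toy (mean, weight) senone and the distance-based score are assumptions, since real models use Gaussian mixtures:

    ```python
    def score_streams(acoustic_model, feature_vectors):
        """Score every concurrent stream against each senone with a single read
        of that senone's parameters. `acoustic_model` maps senone id ->
        (mean vector, weight); `feature_vectors` holds one vector per stream."""
        scores = {i: {} for i in range(len(feature_vectors))}
        for senone_id, (mean, weight) in acoustic_model.items():  # one read per senone
            for i, fv in enumerate(feature_vectors):
                # Negative squared distance as a stand-in log-likelihood.
                scores[i][senone_id] = weight - sum((a - b) ** 2
                                                    for a, b in zip(fv, mean))
        return scores
    ```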
  • Patent number: 8990086
    Abstract: A recognition confidence measurement method, medium and system are provided which can more accurately determine whether an input speech signal is an in-vocabulary word, by extracting an optimum number of candidates that match a phoneme string extracted from the input speech signal and estimating a lexical distance between the extracted candidates. A recognition confidence measurement method includes: extracting a phoneme string from a feature vector of an input speech signal; extracting candidates by matching the extracted phoneme string against phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the input speech signal is an in-vocabulary word, based on the lexical distance.
    Type: Grant
    Filed: July 31, 2006
    Date of Patent: March 24, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sang-Bae Jeong, Nam Hoon Kim, Ick Sang Han, In Jeong Choi, Gil Jin Jang, Jae-Hoon Jeong
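    The lexical-distance check between extracted candidates might be sketched as below, using phoneme-string edit distance; the acceptance rule and threshold are illustrative, not the patented decision criterion:

    ```python
    def edit_distance(a, b):
        # Edit distance between two candidate phoneme strings.
        prev = list(range(len(b) + 1))
        for i, x in enumerate(a, 1):
            cur = [i]
            for j, y in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                               prev[j - 1] + (x != y)))
            prev = cur
        return prev[-1]

    def is_in_vocabulary(candidates, max_gap=2):
        """Treat the input as an in-vocabulary word when the best candidate has
        a lexically close neighbor among the other candidates."""
        if len(candidates) < 2:
            return bool(candidates)
        gaps = [edit_distance(candidates[0], c) for c in candidates[1:]]
        return min(gaps) <= max_gap
    ```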