Subportions Patents (Class 704/254)
  • Patent number: 9653080
    Abstract: A method and apparatus for voice control of a mobile device are provided. The method establishes a connection between the mobile device and a voice-control module. Responsive to establishing the connection, the mobile device enters into an intermediate mode; and the voice-control module monitors for verbal input comprising a verbal command from among a set of predetermined verbal commands. The voice-control module sends instructions to the mobile device related to the verbal command received; and the mobile device acts on the received instructions. An apparatus, or voice-control module (VCM), for voice control of a mobile device is also provided, wherein the VCM includes a connection module configured for establishing a connection between the VCM and the mobile device; a monitoring module configured for monitoring for a verbal command from among a set of predetermined verbal commands; and a communications module configured for sending instructions to the mobile device related to the verbal command received.
    Type: Grant
    Filed: March 7, 2016
    Date of Patent: May 16, 2017
    Assignee: BlackBerry Limited
    Inventors: Ahmed Abdelsamie, Nicholas Shane Choo, Guowei Zhang, Omar George Joseph Barake, Steven Anthony Lill
  • Patent number: 9644985
    Abstract: A navigation device that provides guidance along a moving route and displays map data on a display screen, on the basis of the map data and the position of a moving object, automatically acquires evaluation information for a point or region from a recognition result and position information and registers it; the device can therefore record an impression (evaluation) of the point or region automatically and safely, without requiring any special operation by the user. Further, by adding the evaluation information to the map data and displaying this map data at the time of use, the navigation device enables the user to revisit a point through which the user wishes to pass again, to bypass a point through which the user never wishes to pass again, and so on, without the user having to consciously memorize an impression of each point.
    Type: Grant
    Filed: October 10, 2012
    Date of Patent: May 9, 2017
    Assignee: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Takumi Takei, Yuki Furumoto
  • Patent number: 9646608
    Abstract: A voice recognition apparatus includes a voice recognizer configured to recognize user utterance, a storage unit configured to store a plurality of tokens, a token network generator configured to generate a plurality of recognition tokens from the recognized user utterance, search for a similar token similar to each of the recognition tokens and a peripheral token having a history used with the recognition token among the plurality of tokens stored in the storage unit, and generate a token network using the recognition token, the similar token, and the peripheral token, and a processor configured to control the token network generator to generate the token network in response to the user utterance being recognized through the voice recognizer, calculate a transition probability between the tokens constituting the token network, and generate text data for corrected user utterance using the calculated transition probability.
    Type: Grant
    Filed: November 13, 2015
    Date of Patent: May 9, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kyung-duk Kim, Kyoung-min Kim
  • Patent number: 9619449
    Abstract: Techniques for encoding node locations are provided. In some examples, a node in a hierarchy such as a document object model (DOM) may be expressed as a location path from the root node of the hierarchy to the node. Each node on the location path may be encoded using a mapping between the type and/or index of the node and a set of characters such as ASCII characters. The encodings of the nodes on the path may be concatenated to form the encoded location path. The encoding seeks to minimize the length of the encoded location path. For example, in an embodiment, a single ASCII character may be used to encode both the type and the index of a node. In an embodiment, breakpoint locations may be encoded using the methods provided, to reduce the storage and computing resources required.
    Type: Grant
    Filed: May 14, 2015
    Date of Patent: April 11, 2017
    Assignee: Oracle International Corporation
    Inventors: Prabhu Thukkaram, Michal Chmielewski, Horst Heistermann
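The single-character encoding idea above can be sketched as follows. This is a rough illustration under assumed parameters (a three-entry node-type alphabet and the printable ASCII range), not the patent's actual mapping; every name here is invented.

```python
# Illustrative sketch: pack a node's (type, index) pair into one printable
# ASCII character, then concatenate per-node characters into an encoded
# location path. The type alphabet and character range are assumptions.

NODE_TYPES = ["element", "text", "comment"]  # assumed 3-type alphabet

def encode_step(node_type, index):
    """Pack a (type, index) pair into a single printable ASCII character."""
    code = NODE_TYPES.index(node_type) + len(NODE_TYPES) * index
    if code > 93:  # printable range '!'..'~' holds 94 values
        raise ValueError("index too large for single-character encoding")
    return chr(ord("!") + code)

def encode_path(path):
    """Concatenate per-node characters into the encoded location path."""
    return "".join(encode_step(t, i) for t, i in path)

def decode_path(encoded):
    """Recover the (type, index) pairs from an encoded location path."""
    return [(NODE_TYPES[(ord(c) - ord("!")) % 3], (ord(c) - ord("!")) // 3)
            for c in encoded]
```

With this scheme a path such as `[("element", 0), ("text", 1)]` encodes to just two characters, and the round trip through `decode_path` restores it.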
  • Patent number: 9600473
    Abstract: Various embodiments described herein facilitate multi-lingual communications. The systems and methods of some embodiments may enable multi-lingual communications through different modes of communications including, for example, Internet-based chat, e-mail, text-based mobile phone communications, postings to online forums, postings to online social media services, and the like. Certain embodiments may implement communications systems and methods that translate text between two or more languages (e.g., spoken), while handling/accommodating for one or more of the following in the text: specialized/domain-related jargon, abbreviations, acronyms, proper nouns, common nouns, diminutives, colloquial words or phrases, and profane words or phrases.
    Type: Grant
    Filed: February 8, 2013
    Date of Patent: March 21, 2017
    Assignee: Machine Zone, Inc.
    Inventors: Gabriel Leydon, Francois Orsini, Nikhil Bojja, Shailen Karur
  • Patent number: 9570069
    Abstract: Systems, methods, and computer program products to detect a keyword in speech, by generating, from a sequence of spectral feature vectors generated from the speech, a plurality of blocked feature vector sequences, and analyzing, by a neural network, each of the plurality of blocked feature vector sequences to detect the presence of the keyword in the speech.
    Type: Grant
    Filed: September 9, 2014
    Date of Patent: February 14, 2017
    Assignee: Disney Enterprises, Inc.
    Inventors: Jill F. Lehman, Pallavi N. Baljekar, Rita Singh
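The blocked-feature-vector approach can be sketched roughly as follows; the block size, stride, and the stand-in scoring function (which replaces the patent's neural network) are all illustrative assumptions.

```python
# Sketch of keyword spotting over "blocked" feature vector sequences:
# slide a fixed-size window over per-frame spectral feature vectors and
# score each block independently for the keyword.

def block_features(frames, block_size, stride):
    """Slice the per-frame feature vectors into overlapping blocks."""
    return [frames[i:i + block_size]
            for i in range(0, len(frames) - block_size + 1, stride)]

def detect_keyword(frames, score_block, threshold=0.5,
                   block_size=4, stride=2):
    """Flag the keyword if any block scores at or above the threshold."""
    blocks = block_features(frames, block_size, stride)
    return any(score_block(block) >= threshold for block in blocks)
```

In use, `score_block` would be the trained network; here any callable returning a score in [0, 1] works.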
  • Patent number: 9558738
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating an acoustic model for use in speech recognition. A system configured to practice the method first receives training data and identifies non-contextual lexical-level features in the training data. Then the system infers sentence-level features from the training data and generates a set of decision trees by node-splitting based on the non-contextual lexical-level features and the sentence-level features. The system decorrelates training vectors, based on the training data, for each decision tree in the set of decision trees to approximate full-covariance Gaussian models, and then can train an acoustic model for use in speech recognition based on the training data, the set of decision trees, and the training vectors.
    Type: Grant
    Filed: March 8, 2011
    Date of Patent: January 31, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Enrico Bocchieri, Diamantino Antonio Caseiro, Dimitrios Dimitriadis
  • Patent number: 9538295
    Abstract: A hearing aid is disclosed. The hearing aid comprises a microphone adapted to receive sound signals, an amplifier configured to amplify signals received by the microphone and output means (e.g. a receiver). The hearing aid is configured to detect if speech is received by the microphone and the hearing aid is configured to provide amplification of the detected sound signals according to a non-speech mode when no speech is detected. The hearing aid is configured to provide amplification of the detected sound signals according to a speech mode when speech is detected. The amplification carried out according to the non-speech mode is different from the amplification carried out according to the speech mode. The invention also discloses a method for amplifying sound signals received by a microphone in a hearing aid.
    Type: Grant
    Filed: August 26, 2014
    Date of Patent: January 3, 2017
    Assignee: OTICON A/S
    Inventor: Karsten Bo Rasmussen
  • Patent number: 9519637
    Abstract: A text processing apparatus includes an environmental information acquisition unit configured to acquire environmental information, a text acquisition unit configured to acquire text, a word extraction unit configured to extract a word from the text, and a joint indication unit configured to convert the word extracted from the text into a converted word using a dictionary that is accessed according to the environmental information acquired by the environmental information acquisition unit and indicate the converted word along with the word extracted from the text.
    Type: Grant
    Filed: August 11, 2014
    Date of Patent: December 13, 2016
    Assignee: Ricoh Company, Ltd.
    Inventors: Atsushi Koinuma, Toshiyuki Tezuka
  • Patent number: 9472183
    Abstract: A method for providing an audible prompt to a user within a vehicle. The method includes retrieving one or more data files from a memory device. The data files define certain characteristics of an audio prompt. The method also includes creating the audio prompt from the data files and outputting the audio prompt as an audio signal.
    Type: Grant
    Filed: March 11, 2014
    Date of Patent: October 18, 2016
    Assignee: Visteon Global Technologies, Inc.
    Inventors: Mark Zeinstra, Richard J. Chutorash, Jeffrey Golden, Jon M. Skekloff
  • Patent number: 9449617
    Abstract: Method and apparatus for segmenting speech by detecting the pauses between the words and/or phrases, and for determining whether a particular time interval contains speech or non-speech, such as a pause.
    Type: Grant
    Filed: March 21, 2016
    Date of Patent: September 20, 2016
    Assignee: SPEECH MORPHING SYSTEMS, INC.
    Inventors: Fathy Yassa, Ben Reaves
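A minimal sketch of interval-level speech/pause classification as described above, using average energy against a fixed threshold; the interval length, threshold, and energy feature are illustrative assumptions, not details taken from the patent.

```python
# Sketch: label each fixed-length interval of audio samples as "speech" or
# "pause" by comparing its mean energy to a threshold. Real segmenters use
# richer features; this only illustrates the interval-labeling structure.

def classify_intervals(samples, interval_len, energy_threshold):
    """Return one 'speech'/'pause' label per interval of samples."""
    labels = []
    for start in range(0, len(samples), interval_len):
        chunk = samples[start:start + interval_len]
        energy = sum(x * x for x in chunk) / len(chunk)
        labels.append("speech" if energy >= energy_threshold else "pause")
    return labels
```

Word boundaries then fall wherever a run of "speech" intervals meets a run of "pause" intervals.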
  • Patent number: 9424837
    Abstract: A method for configuring a speech recognition system comprises obtaining a speech sample utilized by a voice authentication system in a voice authentication process. The speech sample is processed to generate acoustic models for units of speech associated with the speech sample. The acoustic models are stored for subsequent use by the speech recognition system as part of a speech recognition process.
    Type: Grant
    Filed: January 23, 2013
    Date of Patent: August 23, 2016
    Inventors: Habib Emile Talhami, Amit Sadanand Malegaonkar, Renuka Amit Malegaonkar, Clive David Summerfield
  • Patent number: 9390066
    Abstract: The invention consists of new ways of constructing Measuring Matrices (MMs), including time deconvolution of Digital Fourier Transforms (DFTs). Also, windowing functions specifically designed to facilitate time deconvolution may be used, and/or the DFTs may be performed in specific non-periodic ways to reduce artifacts and further facilitate deconvolution. These deconvolved DFTs may be used alone or correlated with other DFTs to produce an MM.
    Type: Grant
    Filed: November 11, 2010
    Date of Patent: July 12, 2016
    Assignee: Digital Harmonic LLC
    Inventors: Paul Reed Smith, Jack W. Smith, Ernestine M. Smith, Frederick M. Slay
  • Patent number: 9384730
    Abstract: A reading accuracy-improving system includes: a reading conversion unit for retrieving a plurality of candidate word strings from speech recognition results to determine the reading of each candidate word string; a reading score calculating unit for determining the speech recognition score for each of one or more candidate word strings with the same reading to determine a reading score; and a candidate word string selection unit for selecting a candidate to output from the plurality of candidate word strings on the basis of the reading score and speech recognition score corresponding to each candidate word string.
    Type: Grant
    Filed: April 14, 2014
    Date of Patent: July 5, 2016
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Masafumi Nishimura, Ryuki Tachibana
  • Patent number: 9342588
    Abstract: A method of creating a statistical classification model for a classifier within a natural language understanding system can include processing training data using an existing statistical classification model. Sentences of the training data correctly classified into a selected class of the statistical classification model can be selected. The selected sentences of the training data can be assigned to a fringe group or a core group according to confidence score. The training data can be updated by associating the fringe group with a fringe subclass of the selected class and the core group with a core subclass of the selected class. A new statistical classification model can be built from the updated training data. The new statistical classification model can be output.
    Type: Grant
    Filed: June 18, 2007
    Date of Patent: May 17, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Rajesh Balchandran, Linda M. Boyer, Gregory Purdy
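The fringe/core relabeling step can be sketched as follows; the tuple layout, subclass naming convention, and confidence cutoff are assumptions made for illustration.

```python
# Sketch of the relabeling pass: correctly classified training sentences
# are assigned to a "core" or "fringe" subclass of their class by
# confidence score, yielding updated labels for rebuilding the model.

def split_by_confidence(examples, confidence_cutoff):
    """examples: (sentence, predicted_class, true_class, confidence) tuples.
    Returns relabeled (sentence, subclass) pairs for correct predictions."""
    updated = []
    for sentence, predicted, true, confidence in examples:
        if predicted != true:
            continue  # only correctly classified sentences are relabeled
        group = "core" if confidence >= confidence_cutoff else "fringe"
        updated.append((sentence, f"{true}:{group}"))
    return updated
```

A new statistical classification model would then be trained on the relabeled pairs, treating each subclass as its own target.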
  • Patent number: 9324319
    Abstract: Method and apparatus for segmenting speech by detecting the pauses between the words and/or phrases, and for determining whether a particular time interval contains speech or non-speech, such as a pause.
    Type: Grant
    Filed: April 25, 2014
    Date of Patent: April 26, 2016
    Assignee: SPEECH MORPHING SYSTEMS, INC.
    Inventors: Fathy Yassa, Ben Reaves
  • Patent number: 9324325
    Abstract: A method and system for converting voice data to text data between users is provided. The method includes receiving voice data from at least one user and determining phoneme data items corresponding to the voice data. Conversion candidate string representations of the phoneme data items are identified by referencing a conversion dictionary defining the conversion candidate string representations for each phoneme data item. The plurality of conversion candidate string representations are scored and a specified conversion candidate string representation is selected as text data based on the scores. The text data is transmitted to a terminal device accessed by the at least one user.
    Type: Grant
    Filed: July 28, 2014
    Date of Patent: April 26, 2016
    Assignee: International Business Machines Corporation
    Inventors: Kensuke Hashimoto, Yohichi Hattori, Taroh Sanui, Hisae Shiiki
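The conversion-dictionary lookup and candidate scoring can be sketched as follows; the dictionary contents, phoneme notation, and scores are invented for illustration, and a simple maximum stands in for the patent's scoring scheme.

```python
# Sketch: map each phoneme data item to candidate string representations
# via a conversion dictionary, score the candidates, and emit the best
# one as text. Entries and scores here are illustrative only.

CONVERSION_DICT = {
    "k-a-t": [("cat", 0.8), ("kat", 0.2)],
    "d-o-g": [("dog", 0.9), ("dawg", 0.1)],
}

def phonemes_to_text(phoneme_items):
    """Select the highest-scoring candidate string for each phoneme item."""
    words = []
    for item in phoneme_items:
        candidates = CONVERSION_DICT.get(item, [(item, 0.0)])
        best, _ = max(candidates, key=lambda cand: cand[1])
        words.append(best)
    return " ".join(words)
```

Unknown phoneme items fall through unchanged, which keeps the sketch total even for out-of-dictionary input.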
  • Patent number: 9311915
    Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.
    Type: Grant
    Filed: September 18, 2013
    Date of Patent: April 12, 2016
    Assignee: Google Inc.
    Inventors: Eugene Weinstein, Pedro J. Mengibar, Johan Schalkwyk
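Combining acoustic features with non-acoustic context can be sketched as a simple feature concatenation; the one-hot application-context vocabulary is an assumption, and the sketch omits the patent's neural network entirely.

```python
# Sketch: encode non-audio context (e.g. which app is active) as a one-hot
# vector and append it to every acoustic frame, so a single model can
# condition its transcription on both. Vocabulary here is illustrative.

def one_hot(label, vocabulary):
    """Encode a categorical context label as a one-hot vector."""
    return [1.0 if entry == label else 0.0 for entry in vocabulary]

def with_context(frames, context_vec):
    """Append the same context vector to every acoustic feature frame."""
    return [list(frame) + list(context_vec) for frame in frames]
```

The augmented frames would then be fed to the network in place of the raw acoustic features.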
  • Patent number: 9286894
    Abstract: Recognition techniques may include the following. On a first processing entity, a first recognition process is performed on a first element, where the first recognition process includes: in a first state machine having M (M>1) states, determining a first best path cost in at least a subset of the M states for at least part of the first element. On a second processing entity, a second recognition process is performed on a second element, where the second recognition process includes: in a second state machine having N (N>1) states, determining a second best path cost in at least a subset of the N states for at least part of the second element. At least one of the following is done: (i) passing the first best path cost to the second state machine, or (ii) passing the second best path cost to the first state machine. The foregoing techniques may include one or more of the following features, either alone or in combination.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: March 15, 2016
    Assignee: Google Inc.
    Inventor: Mitchel Weintraub
  • Patent number: 9282284
    Abstract: Videoconferencing may be provided. A participant may be identified from audio information and in video information. From the video information, a plurality of images may be captured of the participant identified in the video information. A unique identifier may be associated with the captured plurality of images. The unique identifier may correspond to the participant identified from the audio information. The captured plurality of images and the associated unique identifier may be saved in a database.
    Type: Grant
    Filed: May 20, 2013
    Date of Patent: March 8, 2016
    Assignee: Cisco Technology, Inc.
    Inventors: Sachin S. Kajarekar, Mainak Sen
  • Patent number: 9275640
    Abstract: Systems, methods, and apparatus, including computer program products for accepting a predetermined vocabulary-dependent characterization of a set of audio signals, the predetermined characterization including an identification of putative occurrences of each of a plurality of vocabulary items in the set of audio signals, the plurality of vocabulary items included in the vocabulary; accepting a new vocabulary item not included in the vocabulary; accepting putative occurrences of the new vocabulary item in the set of audio signals; and generating, by an analysis engine of a speech processing system, an augmented characterization of the set of audio signals based on the identified putative occurrences of the new vocabulary item.
    Type: Grant
    Filed: November 24, 2009
    Date of Patent: March 1, 2016
    Assignee: Nexidia Inc.
    Inventors: Kenneth King Griggs, Jon A. Arrowood
  • Patent number: 9269355
    Abstract: Features are disclosed for transferring speech recognition workloads between pooled execution resources. For example, various parts of an automatic speech recognition engine may be implemented by various pools of servers. Servers in a speech recognition pool may explore a plurality of paths in a graph to find the path that best matches an utterance. A set of active nodes comprising the last node explored in each path may be transferred between servers in the pool depending on resource availability at each server. A history of nodes or arcs traversed in each path may be maintained by a separate pool of history servers, and used to generate text corresponding to the path identified as the best match by the speech recognition servers.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: February 23, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Hugh Evan Secker-Walker, Naresh Narayanan
  • Patent number: 9251789
    Abstract: A speech recognition system that recognizes speech data is provided. The speech recognition system includes a speech recognition part that performs speech recognition of the speech data, and calculates a likelihood of the speech data with respect to a registered word that is pre-registered, a reliability judgment part that performs reliability judgment on the speech recognition based on the likelihood, and a judgment reference change processing part that changes a judgment reference for the reliability judgment, according to an utterance speed of the speech data.
    Type: Grant
    Filed: June 24, 2013
    Date of Patent: February 2, 2016
    Assignee: SEIKO EPSON CORPORATION
    Inventor: Kiyotaka Morioka
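The speed-dependent judgment reference can be sketched as a threshold switch; the rates and thresholds below are invented, and a real system would likely vary the reference continuously rather than in two steps.

```python
# Sketch: relax the likelihood threshold used for the reliability judgment
# when the utterance is fast, on the assumption that fast speech matches
# registered words less cleanly. All numeric values are illustrative.

def is_reliable(likelihood, syllables_per_second,
                base_threshold=0.8, fast_rate=6.0, relaxed_threshold=0.65):
    """Judge recognition reliability with a speed-adjusted reference."""
    if syllables_per_second > fast_rate:
        threshold = relaxed_threshold  # fast speech: relaxed reference
    else:
        threshold = base_threshold
    return likelihood >= threshold
```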
  • Patent number: 9239829
    Abstract: A speech recognition device includes: a speech recognition unit 23a that performs speech recognition on input speech; a reading information conversion database in which a reading information conversion rule L is registered; a reading information conversion unit 27a that converts a word's reading information between languages based on the rule L; and a speech recognition control unit 24a. When a word in a language different from a predetermined language is included in the recognition subject vocabulary whose recognition subject word information E the speech recognition unit 23a references, the control unit 24a directs the unit 27a to convert the word's reading information in the different language into reading information in the predetermined language, and directs the unit 23a to perform speech recognition with reference to the recognition subject word information of the corresponding word, including the converted reading information in the predetermined language.
    Type: Grant
    Filed: October 1, 2010
    Date of Patent: January 19, 2016
    Assignee: Mitsubishi Electric Corporation
    Inventors: Yoichi Kato, Jun Ishii, Hiroki Sakashita
  • Patent number: 9230555
    Abstract: An apparatus receives an input encoded audio data signal comprising a base layer and at least one enhancement layer. A reference unit (103) generates reference audio data corresponding to audio data of a reference set of layers. A layer unit (105) divides the layers of the input signal into a first subset and a second subset. A sample unit (107) generates sample audio data corresponding to the audio data of the first subset. A comparison unit (109) generates a difference measure by comparing the sample audio data to the reference audio data based on a perceptual model. An output unit (111) then determines if the difference measure meets a similarity criterion and generates an output signal without audio data from a layer of the second subset if the similarity criterion is met and including the audio data of the layer otherwise. The invention may provide reduced data rates without an unacceptable degradation of quality.
    Type: Grant
    Filed: April 1, 2010
    Date of Patent: January 5, 2016
    Assignee: GOOGLE TECHNOLOGY HOLDINGS LLC
    Inventors: Holly Francois, Jonathan A. Gibbs
  • Patent number: 9224391
    Abstract: A method for automatically providing a hypothesis of a linguistic formulation that is uttered by users of a voice service based on an automatic speech recognition system and that is outside a recognition domain of the automatic speech recognition system. The method includes providing a constrained and an unconstrained speech recognition from an input speech signal, identifying a part of the constrained speech recognition outside the recognition domain, identifying a part of the unconstrained speech recognition corresponding to the identified part of the constrained speech recognition, and providing the linguistic formulation hypothesis based on the identified part of the unconstrained speech recognition.
    Type: Grant
    Filed: February 17, 2005
    Date of Patent: December 29, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Daniele Colibro, Claudio Vair, Luciano Fissore, Cosmin Popovici
  • Patent number: 9202466
    Abstract: The invention presents a method for analyzing speech in a spoken dialog system, comprising the steps of: accepting an utterance via at least one means for accepting acoustical signals, in particular a microphone; and analyzing the utterance and obtaining prosodic cues from it using at least one processing engine, wherein the utterance is evaluated based on the prosodic cues to determine the prominence of its parts, and wherein the utterance is analyzed to detect at least one marker feature, e.g. a negative statement, indicating that the utterance contains at least one part intended to replace at least one part of a previous utterance. The part to be replaced in the previous utterance is determined based on the prominence determined for the parts of the previous utterance, the replacement parts are determined based on the prominence of the parts in the current utterance, and the previous utterance is then re-evaluated with the replacement part(s).
    Type: Grant
    Filed: March 18, 2013
    Date of Patent: December 1, 2015
    Assignee: HONDA RESEARCH INSTITUTE EUROPE GMBH
    Inventor: Martin Heckmann
  • Patent number: 9196246
    Abstract: A method for recognizing speech including a sequence of words determines a shape of a gesture and a location of the gesture with respect to a display device showing a set of interpretations of the speech. The method determines a type of the word sequence constraint based on the shape of the gesture and determines a value of the word sequence constraint based on the location of the gesture. Next, the speech is recognized using the word sequence constraint.
    Type: Grant
    Filed: July 22, 2014
    Date of Patent: November 24, 2015
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Bret Harsham, John Hershey
  • Patent number: 9162069
    Abstract: An arrangement and method for hearing implant signal processing is described. A hearing implant signal processor converts a speech signal input into multi-channel electrical stimulation signals for a cochlear implant electrode array. A feature adjustment module adjusts feature resolution of the electrical stimulation signals to produce a corresponding sequence of cochlear stimulation feature vectors. A speech recognition vector pre-processor maps the cochlear stimulation feature vectors into corresponding speech recognition feature vectors. A speech recognition engine evaluates the speech recognition feature vectors with probabilistic state sequence models to produce speech recognition outputs corresponding to the speech signal input.
    Type: Grant
    Filed: December 11, 2013
    Date of Patent: October 20, 2015
    Assignee: MED-EL Elektromedizinische Geraete GmbH
    Inventor: Werner Hemmert
  • Patent number: 9165553
    Abstract: System and method for performing speech recognition using acoustic invariant structure for large-vocabulary continuous speech. An information processing device receives sound as input and performs speech recognition. The information processing device includes: a speech recognition processing unit for outputting a speech recognition score; a structure score calculation unit for calculating a structure score, which, for each hypothesis, is found by applying phoneme-pair-by-pair weighting to the inter-distribution distance likelihood of every phoneme pair comprising the hypothesis and then summing; and a ranking unit for ranking the multiple hypotheses based on the sum of the speech recognition score and the structure score.
    Type: Grant
    Filed: January 18, 2013
    Date of Patent: October 20, 2015
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Masayuki Suzuki, Masafumi Nishimura
  • Patent number: 9159329
    Abstract: A method and system for improving the quality of speech generated from Hidden Markov Model (HMM)-based Text-To-Speech Synthesizers using statistical post-filtering techniques. An example method involves: (a) determining a scale factor that, when applied to a synthesized reference spectral envelope, minimizes a statistical divergence between a natural reference spectral envelope and the synthesized reference spectral envelope, where the synthesized reference spectral envelope is generated by a state of an HMM; (b) for a given synthesized subject spectral envelope generated by the state of the HMM, determining an enhanced synthesized subject spectral envelope based on the determined scale factor; and (c) generating, by a computing device, a synthetic speech signal including the enhanced synthesized subject spectral envelope.
    Type: Grant
    Filed: December 5, 2012
    Date of Patent: October 13, 2015
    Assignee: Google Inc.
    Inventors: Ioannis Agiomyrgiannakis, Florian Alexander Eyben
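The scale-factor step can be sketched with ordinary least squares standing in for the patent's statistical divergence; the substitution of least squares and the envelope values are assumptions made to keep the sketch self-contained.

```python
# Sketch: fit a single scale factor so that the scaled synthesized
# reference spectral envelope best matches the natural one (least-squares
# proxy for the patent's divergence), then reuse that factor to enhance
# other synthesized envelopes from the same HMM state.

def fit_scale(natural, synthesized):
    """Scale s minimizing sum((natural[k] - s * synthesized[k])**2)."""
    numerator = sum(n * s for n, s in zip(natural, synthesized))
    denominator = sum(s * s for s in synthesized)
    return numerator / denominator

def enhance(envelope, scale):
    """Apply the fitted scale factor to a synthesized subject envelope."""
    return [scale * value for value in envelope]
```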
  • Patent number: 9148732
    Abstract: A method for testing hearing aids, particularly in respect of their effect on speech comprehension, is improved. The method has the following steps: a) a test system is provided, b) an audible voice signal in the form of at least one meaningless syllable is selected and presented, c) the meaningless syllable and a number of further meaningless syllables are displayed on a graphical user interface in the test system, d) a heard meaningless syllable is selected from the displayed meaningless syllables by the user, e) the selection made is evaluated, and f) method steps c) to e) are repeated until a termination criterion is encountered. The audible voice signal is selected in each case on the basis of at least one selection made by the person in response to a previously presented voice signal. The method allows particularly fast and informative hearing aid tests.
    Type: Grant
    Filed: March 10, 2011
    Date of Patent: September 29, 2015
    Assignee: Sivantos Pte. Ltd.
    Inventors: Martina Bellanova, Matthias Latzel
  • Patent number: 9129602
    Abstract: Approaches are described for generating an audio signal that mimics speech captured by a computing device. An input audio signal (e.g., a speech signal) can be transformed from the time domain into another domain to generate one or more audio signal segments, where each segment can correspond to a window of time. The device can then determine, for each audio signal segment, a feature characteristic of the audio signal, such as a phoneme. Each of the segments can be mapped, based at least in part on the respective feature characteristic, to a model audio signal. The device can then generate an output audio signal including each model audio signal as determined by the mapping, where the output audio signal is in a sequence associated with the input audio signal.
    Type: Grant
    Filed: December 14, 2012
    Date of Patent: September 8, 2015
    Assignee: Amazon Technologies, Inc.
    Inventors: Isaac Jeremy Shepard, Brian David Fisher
  • Patent number: 9124697
    Abstract: A computerized system for advising one communicant in electronic communication between two or more communicants has apparatus monitoring and recording interaction between the communicants, and software executing from a machine-readable medium and providing analytics, the software functions including rendering speech into text, analyzing the rendered text for topics, performing communicant verification, and detecting changes in communicant emotion. Advice is offered to the one communicant during the interaction, based on results of the analytics.
    Type: Grant
    Filed: May 6, 2013
    Date of Patent: September 1, 2015
    Assignee: GENESYS TELECOMMUNICATIONS LABORATORIES, INC.
    Inventors: Mark Scott, Jim Barnett
  • Patent number: 9075829
    Abstract: A technique extracts an object that is characteristic even though it appears only rarely. A clustering apparatus includes: a similarity degree calculating section that calculates a similarity degree for each combination of any two objects and stores it in a similarity degree table, excluding combinations of an object with itself; a merging object selecting section that selects, as merging objects, two objects whose similarity degree satisfies a predetermined reference; a new object generating section that generates a new object from the merging objects; a merging object removing section that removes from the similarity degree table the similarity degrees between each of the two objects selected as merging objects and each of the other objects; and a new object adding section that calculates a similarity degree between the new object and each of the plurality of objects other than the new object.
    Type: Grant
    Filed: April 8, 2010
    Date of Patent: July 7, 2015
    Assignee: NEC CORPORATION
    Inventor: Yukiko Kuroiwa
  • Patent number: 9047540
    Abstract: Systems, apparatuses, and methods to relate images of words to a list of words are provided. A trellis-based word decoder analyses a set of OCR characters and probabilities using a forward pass across a forward trellis and a reverse pass across a reverse trellis. Multiple paths may result; however, the most likely path from the trellises has the highest probability with valid links. A valid link is determined from the trellis by some dictionary word traversing the link. The most likely path is compared with a list of words to find the word closest to the most likely path.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: June 2, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Pawan Kumar Baheti, Kishor K. Barman, Raj Kumar Krishna Kumar
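    The final matching step, comparing a decoded character string against the word list, can be approximated with plain edit distance; the trellis passes themselves are omitted here, and `closest_word` is a hypothetical name:

    ```python
    def levenshtein(a, b):
        # Classic edit distance; stands in for scoring a decoded path against a word.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,          # deletion
                               cur[j - 1] + 1,       # insertion
                               prev[j - 1] + (ca != cb)))  # substitution
            prev = cur
        return prev[-1]

    def closest_word(decoded, dictionary):
        """Map the most likely trellis path (a character string) to the nearest
        dictionary word; a simplified stand-in for the patent's word matching."""
        return min(dictionary, key=lambda w: levenshtein(decoded, w))
    ```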
  • Patent number: 9047857
    Abstract: Techniques for transitioning an electronic device between device states. In one example, a voice-controlled device is configured to transition from a sleep state to an awake state in response to identifying a user speaking a predefined utterance. The techniques may determine whether the user has spoken the predefined utterance with reference to traditional speech-recognition techniques, as well as with reference to changes in the volume of a user's voice.
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: June 2, 2015
    Assignee: Rawles LLC
    Inventor: William F. Barton
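    A minimal sketch of combining the two evidence sources the abstract mentions, a keyword-spotting score and a rise in voice volume; all thresholds and parameter names are invented for illustration:

    ```python
    def should_wake(asr_score, rms_now, rms_baseline,
                    asr_threshold=0.6, volume_ratio=1.5):
        """Transition to the awake state only when the recognizer's confidence
        for the predefined utterance is high AND the user's voice has risen
        noticeably above its recent baseline volume."""
        volume_jump = rms_baseline > 0 and rms_now / rms_baseline >= volume_ratio
        return asr_score >= asr_threshold and volume_jump
    ```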
  • Patent number: 9037463
    Abstract: A method for speech recognition is described that uses an initial recognizer to perform an initial speech recognition pass on an input speech utterance to determine an initial recognition result corresponding to the input speech utterance, and a reliability measure reflecting a per word reliability of the initial recognition result. For portions of the initial recognition result where the reliability of the result is low, a re-evaluation recognizer is used to perform a re-evaluation recognition pass on the corresponding portions of the input speech utterance to determine a re-evaluation recognition result corresponding to the re-evaluated portions of the input speech utterance. The initial recognizer and the re-evaluation recognizer are complementary so as to make different recognition errors. A final recognition result is determined based on the re-evaluation recognition result if any, and otherwise based on the initial recognition result.
    Type: Grant
    Filed: May 27, 2010
    Date of Patent: May 19, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel Willett, Venkatesh Nagesha
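    The two-pass flow can be sketched as below, assuming the initial recognizer emits per-word reliabilities and `second_pass` stands in for the complementary re-evaluation recognizer:

    ```python
    def two_pass(initial, second_pass, threshold=0.5):
        """`initial` is a list of (word, reliability) pairs from the first
        recognizer; `second_pass(index)` re-decodes the audio portion behind
        word `index`. A toy skeleton of the re-evaluation flow."""
        final = []
        for i, (word, reliability) in enumerate(initial):
            if reliability < threshold:
                final.append(second_pass(i))   # re-evaluate the unreliable portion
            else:
                final.append(word)             # keep the reliable initial result
        return final
    ```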
  • Publication number: 20150134336
    Abstract: The performance of traditional speech recognition systems (as applied to information extraction or translation) decreases significantly with larger domain size and scarce training data, as well as under noisy environmental conditions. This invention mitigates these problems through the introduction of a novel predictive feature extraction method which combines linguistic and statistical information for representation of information embedded in a noisy source language. The predictive features are combined with text classifiers to map the noisy text to one of the semantically or functionally similar groups. The features used by the classifier can be syntactic, semantic, and statistical.
    Type: Application
    Filed: November 12, 2013
    Publication date: May 14, 2015
    Inventors: Jun Huang, Yookyung Kim, Youssef Billawala, Farzad Ehsani, Demitrios Master
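    A bare-bones stand-in for mapping noisy recognized text to one of the semantically similar groups; keyword-overlap features replace the richer syntactic, semantic, and statistical features the abstract names:

    ```python
    def classify_noisy(text, group_keywords):
        """Assign noisy text to the group with the largest keyword overlap.
        `group_keywords` maps group name -> set of cue words; a toy classifier,
        not the patented feature extraction."""
        tokens = set(text.lower().split())
        return max(group_keywords, key=lambda g: len(tokens & group_keywords[g]))
    ```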
  • Publication number: 20150134337
    Abstract: A conversation based search method includes the steps of: proposing an utterance phrase corresponding to a query input by a user; and reforming the query by means of an answer from the user and offering a search result corresponding to the reformed query.
    Type: Application
    Filed: November 11, 2014
    Publication date: May 14, 2015
    Inventors: Hee-Cheol Seo, Injae Lee, Taeil Kim, Youngho Na
  • Patent number: 9031843
    Abstract: A method and apparatus for enabling multimodal tags in a communication device is disclosed. The method comprises receiving a first training signal and receiving a second training signal in conjunction with the first training signal. A multimodal tag is created by discarding redundant or non-discriminative information associated with each of the first and second training signals to represent a combination of the first training signal and the second training signal and a function is associated with the created multimodal tag.
    Type: Grant
    Filed: September 28, 2007
    Date of Patent: May 12, 2015
    Assignee: Google Technology Holdings LLC
    Inventors: Changxue Ma, Harry M. Bliss
  • Publication number: 20150127346
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more feature scores are determined, the feature scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.
    Type: Application
    Filed: November 4, 2014
    Publication date: May 7, 2015
    Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw
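    The select-by-classifier step can be sketched as follows, with `features` and `classifier` as hypothetical stand-ins for the trained components:

    ```python
    def select_alternates(alternates, features, classifier, k=3):
        """Score each alternate for a span of the transcription and keep the
        top `k` for display. `features(alt)` returns a feature vector;
        `classifier(vec)` returns a display-worthiness score."""
        scored = [(classifier(features(alt)), alt) for alt in alternates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [alt for _, alt in scored[:k]]
    ```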
  • Patent number: 9026442
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: May 5, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
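    The weighted-sum restructuring can be illustrated with toy mean-vector models; real systems mix full acoustic densities rather than plain vectors, and the weights here are assumed given:

    ```python
    def restructure_phoneme(native_models, weights):
        """Build a custom acoustic model for one dictionary phoneme as a
        normalized weighted sum of the native-dialect models for all plausible
        phonemes. `native_models` maps phoneme -> mean vector."""
        total = sum(weights.values())
        dims = len(next(iter(native_models.values())))
        custom = [0.0] * dims
        for phone, w in weights.items():
            for d, v in enumerate(native_models[phone]):
                custom[d] += (w / total) * v
        return custom
    ```

    The pronouncing dictionary itself is untouched; only the acoustic model behind each dictionary phoneme is replaced by the blend.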
  • Patent number: 9020818
    Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 28, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
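    The add-or-update decision for a candidate codebook tuple might look like this; the novelty threshold and the averaging update rule are illustrative assumptions, not the patented criteria:

    ```python
    def maybe_add_tuple(codebook, candidate, distance, novelty_threshold=1.0):
        """Add `candidate` only if it carries enough new information relative
        to the nearest existing tuple; otherwise blend it into that tuple."""
        if not codebook:
            codebook.append(candidate)
            return codebook
        nearest = min(range(len(codebook)),
                      key=lambda i: distance(codebook[i], candidate))
        if distance(codebook[nearest], candidate) > novelty_threshold:
            codebook.append(candidate)          # sufficiently new: add it
        else:
            old = codebook[nearest]             # familiar: refresh the existing tuple
            codebook[nearest] = tuple((a + b) / 2 for a, b in zip(old, candidate))
        return codebook
    ```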
  • Publication number: 20150112683
    Abstract: An utterance content estimator estimates a document ID corresponding to an answer to user input analysis results from a document on the basis of an utterance estimating model that is generated by learning a correspondence between hypothetical questions each as to a content of the document and document IDs each of which is an answer to one of the hypothetical questions. A result integrator integrates document estimation results of the utterance estimating model and document search results of search indexes so as to generate final search results.
    Type: Application
    Filed: December 27, 2012
    Publication date: April 23, 2015
    Applicant: Mitsubishi Electric Corporation
    Inventors: Yoichi Fujii, Jun Ishii
  • Patent number: 9009041
    Abstract: A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription is received of an audio recording. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data is received that is associated with the transcribed word. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.
    Type: Grant
    Filed: July 26, 2011
    Date of Patent: April 14, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: George Zavaliagkos, William F. Ganong, III, Uwe H. Jost, Shreedhar Madhavapeddi, Gary B. Clayton
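    A toy version of the re-scoring step, in which a transcribed word is swapped for a personal-vocabulary replacement that scores higher in context; `score` stands in for the patent's re-scoring model and the one-word context window is an assumption:

    ```python
    def rescore(transcript, personal_vocab, score):
        """`transcript` is a list of ASR words; `personal_vocab` maps a word to
        candidate replacements drawn from the user's personal data;
        `score(word, context)` rates a word in its local context."""
        out = []
        for i, word in enumerate(transcript):
            context = transcript[max(0, i - 1):i + 2]
            candidates = personal_vocab.get(word, []) + [word]
            out.append(max(candidates, key=lambda w: score(w, context)))
        return out
    ```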
  • Publication number: 20150095031
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for crowdsourcing verification of word pronunciations. A system performing word pronunciation crowdsourcing identifies spoken words, or word pronunciations in a dictionary of words, for review by a turker. The identified words are assigned to one or more turkers for review. Assigned turkers listen to the word pronunciations, providing feedback on the correctness/incorrectness of the machine-made pronunciation. The feedback can then be used to modify the lexicon, or can be stored for use in configuring future lexicons.
    Type: Application
    Filed: September 30, 2013
    Publication date: April 2, 2015
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. Conkie, Ladan Golipour, Taniya Mishra
  • Patent number: 8996380
    Abstract: Systems and methods of synchronizing media are provided. A client device may be used to capture a sample of a media stream being rendered by a media rendering source. The client device sends the sample to a position identification module to determine a time offset indicating a position in the media stream corresponding to the sampling time of the sample, and optionally a timescale ratio indicating a speed at which the media stream is being rendered by the media rendering source based on a reference speed of the media stream. The client device calculates a real-time offset using a present time, a timestamp of the media sample, the time offset, and optionally the timescale ratio. The client device then renders a second media stream at a position corresponding to the real-time offset, in synchrony with the media stream being rendered by the media rendering source.
    Type: Grant
    Filed: May 4, 2011
    Date of Patent: March 31, 2015
    Assignee: Shazam Entertainment Ltd.
    Inventors: Avery Li-Chun Wang, Rahul Powar, William Michael Mills, Christopher Jacques Penrose Barton, Philip Georges Inghelbrecht, Dheeraj Shankar Mukherjee
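    The real-time offset calculation the abstract describes reduces to simple arithmetic: the matched position plus the wall-clock time elapsed since the sample, scaled by the rendering speed. This reading of the computation is plausible but not the claim language verbatim:

    ```python
    def realtime_offset(time_offset, sample_timestamp, now, timescale_ratio=1.0):
        """Estimated current position in the reference stream, in seconds.
        `time_offset` is the matched position of the sample, `sample_timestamp`
        the time the sample was captured, and `timescale_ratio` the rendering
        speed relative to the reference speed."""
        return time_offset + (now - sample_timestamp) * timescale_ratio
    ```

    The second media stream would then be started at this position to play in synchrony with the first.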
  • Patent number: 8996374
    Abstract: Embodiments of the present invention include an apparatus, method, and system for calculating senone scores for multiple concurrent input speech streams. The method can include the following: receiving one or more feature vectors from one or more input streams; accessing the acoustic model one senone at a time; and calculating separate senone scores corresponding to each incoming feature vector. The calculation uses a single read access to the acoustic model for a single senone and calculates a set of separate senone scores for the one or more feature vectors, before proceeding to the next senone in the acoustic model.
    Type: Grant
    Filed: November 6, 2012
    Date of Patent: March 31, 2015
    Assignee: Spansion LLC
    Inventor: Ojas A. Bapat
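    The one-read-per-senone access pattern can be sketched as an outer loop over senones and an inner loop over the concurrent streams' feature vectors; the toy (mean, weight) senone and the distance-based score are assumptions, since real models use Gaussian mixtures:

    ```python
    def score_streams(acoustic_model, feature_vectors):
        """Score every concurrent stream against each senone with a single read
        of that senone's parameters. `acoustic_model` maps senone id ->
        (mean vector, weight); `feature_vectors` holds one vector per stream."""
        scores = {i: {} for i in range(len(feature_vectors))}
        for senone_id, (mean, weight) in acoustic_model.items():  # one read per senone
            for i, fv in enumerate(feature_vectors):
                # Negative squared distance as a stand-in log-likelihood.
                scores[i][senone_id] = weight - sum((a - b) ** 2
                                                    for a, b in zip(fv, mean))
        return scores
    ```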
  • Patent number: 8990086
    Abstract: A recognition confidence measurement method, medium and system are provided which can more accurately determine whether an input speech signal is an in-vocabulary word, by extracting an optimum number of candidates that match a phoneme string extracted from the input speech signal and estimating a lexical distance between the extracted candidates. A recognition confidence measurement method includes: extracting a phoneme string from a feature vector of an input speech signal; extracting candidates by matching the extracted phoneme string against phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the input speech signal is an in-vocabulary word, based on the lexical distance.
    Type: Grant
    Filed: July 31, 2006
    Date of Patent: March 24, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sang-Bae Jeong, Nam Hoon Kim, Ick Sang Han, In Jeong Choi, Gil Jin Jang, Jae-Hoon Jeong
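    The lexical-distance check between extracted candidates might be sketched as below, using phoneme-string edit distance; the acceptance rule and threshold are illustrative, not the patented decision criterion:

    ```python
    def edit_distance(a, b):
        # Edit distance between two candidate phoneme strings.
        prev = list(range(len(b) + 1))
        for i, x in enumerate(a, 1):
            cur = [i]
            for j, y in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                               prev[j - 1] + (x != y)))
            prev = cur
        return prev[-1]

    def is_in_vocabulary(candidates, max_gap=2):
        """Treat the input as an in-vocabulary word when the best candidate has
        a lexically close neighbor among the other candidates."""
        if len(candidates) < 2:
            return bool(candidates)
        gaps = [edit_distance(candidates[0], c) for c in candidates[1:]]
        return min(gaps) <= max_gap
    ```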