Preliminary Matching Patents (Class 704/252)
-
Patent number: 10224030
Abstract: In speech processing systems, personalization is added in the Natural Language Understanding (NLU) processor by incorporating external knowledge sources of user information to improve entity recognition performance of the speech processing system. Personalization in the NLU is effected by incorporating one or more dictionaries of entries, or gazetteers, with information personal to a respective user, that provide the user's information to permit disambiguation of semantic interpretation for input utterances and improve the quality of speech processing results.
Type: Grant
Filed: March 14, 2013
Date of Patent: March 5, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Imre Attila Kiss, Arthur Richard Toth, Lambert Mathias
-
Patent number: 10224026
Abstract: An electronic device comprising circuitry configured to record sensor data that is obtained from data sources and to retrieve information from the recorded sensor data using concepts that are defined by a user.
Type: Grant
Filed: March 2, 2017
Date of Patent: March 5, 2019
Assignee: Sony Corporation
Inventors: Aurel Bordewieck, Fabien Cardinaux, Wilhelm Hagg, Thomas Kemp, Stefan Uhlich, Fritz Hohl
-
Patent number: 10152899
Abstract: A training tool, method, and system for measuring crew member communication skills are disclosed, wherein an audio data processing terminal is interfaced with a crew training apparatus, typically a crew-operated vehicle simulator. Audio data corresponding to a conversation between at least two crew members is recorded during a training session and stored. Respective audio data of each crew member is extracted from the stored audio data, and a series of measures for at least one prosodic parameter in each respective audio data extracted is computed. A correlation coefficient of the series of measures is then computed, wherein the correlation coefficient is indicative of a level of prosodic accommodation between the at least two crew members. Specific communication skills, in addition to prosodic accommodation performance, can then be determined or inferred.
Type: Grant
Filed: July 31, 2014
Date of Patent: December 11, 2018
Assignee: Crewfactors Limited
Inventors: Brian Vaughan, Celine De Looze
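The accommodation measure described in this abstract can be illustrated with a minimal sketch: each speaker's audio yields a time series for a prosodic parameter, and the correlation coefficient between the two series indicates the level of accommodation. The per-window pitch values below are purely illustrative, not data from the patent.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-window mean pitch (Hz) for two crew members.
pilot = [118, 122, 130, 127, 135, 140]
copilot = [115, 119, 128, 126, 133, 139]

# A coefficient near 1.0 would indicate strong prosodic accommodation.
accommodation = pearson(pilot, copilot)
```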
-
Patent number: 10134386
Abstract: Systems and methods for identifying content corresponding to a language are provided. The language spoken by a first user is automatically determined with voice recognition circuitry, based on verbal input received from the first user. A database of content sources is cross-referenced to identify a content source associated with a language field value that corresponds to the determined language spoken by the first user. The language field in the database identifies the language in which the associated content source transmits content to a plurality of users. A representation of the identified content source is generated for display to the first user.
Type: Grant
Filed: July 21, 2015
Date of Patent: November 20, 2018
Assignee: Rovi Guides, Inc.
Inventor: Shuchita Mehra
-
Patent number: 10108608
Abstract: A dialog state tracking system. One aspect of the system is the use of multiple utterance decoders and/or multiple spoken language understanding (SLU) engines generating competing results that improve the likelihood that the correct dialog state is available to the system and provide additional features for scoring dialog state hypotheses. An additional aspect is training an SLU engine and a dialog state scorer/ranker (DSR) engine using different subsets from a single annotated training data set. A further aspect is training multiple SLU/DSR engine pairs from inverted subsets of the annotated training data set. Another aspect is web-style dialog state ranking based on dialog state features using discriminative models with automatically generated feature conjunctions. Yet another aspect is using multiple parameter sets with each ranking engine and averaging the rankings. Each aspect independently improves dialog state tracking accuracy, and the aspects may be combined in various combinations for greater improvement.
Type: Grant
Filed: June 12, 2014
Date of Patent: October 23, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jason D. Williams, Geoffrey G. Zweig
-
Patent number: 10043516
Abstract: Systems and processes for operating an automated assistant are disclosed. In one example process, an electronic device provides an audio output via a speaker of the electronic device. While providing the audio output, the electronic device receives, via a microphone of the electronic device, a natural language speech input. The electronic device derives a representation of user intent based on the natural language speech input and the audio output, identifies a task based on the derived user intent, and performs the identified task.
Type: Grant
Filed: December 20, 2016
Date of Patent: August 7, 2018
Assignee: Apple Inc.
Inventors: Harry J. Saddler, Aimee T. Piercy, Garrett L. Weinberg, Susan L. Booker
-
Patent number: 9911409
Abstract: A speech recognition apparatus includes a processor configured to recognize a user's speech using any one or combination of two or more of an acoustic model, a pronunciation dictionary including primitive words, and a language model including primitive words; and to correct word spacing in a result of speech recognition based on a word-spacing model.
Type: Grant
Filed: July 21, 2016
Date of Patent: March 6, 2018
Assignee: Samsung Electronics Co., Ltd.
Inventor: Seokjin Hong
-
Patent number: 9805715
Abstract: A method of recognizing speech commands includes generating a background acoustic model for a sound using a first sound sample, the background acoustic model characterized by a first precision metric. A foreground acoustic model is generated for the sound using a second sound sample, the foreground acoustic model characterized by a second precision metric. A third sound sample is received and decoded by assigning a weight to the third sound sample corresponding to a probability that the sound sample originated in a foreground, using the foreground acoustic model and the background acoustic model. The method further includes determining if the weight meets predefined criteria for assigning the third sound sample to the foreground and, when the weight meets the predefined criteria, interpreting the third sound sample as a portion of a speech command. Otherwise, recognition of the third sound sample as a portion of a speech command is forgone.
Type: Grant
Filed: December 13, 2013
Date of Patent: October 31, 2017
Assignee: Tencent Technology (Shenzhen) Company Limited
Inventors: Shuai Yue, Li Lu, Xiang Zhang, Dadong Xie, Haibo Liu, Bo Chen, Jian Liu
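The foreground/background decision this abstract describes can be sketched with a toy model: two Gaussian "acoustic models" score a sample, and the posterior weight that the sample came from the foreground is compared to a threshold. The Gaussian parameters and threshold below are illustrative assumptions, not the patented models.

```python
import math

def gauss_lik(x, mean, var):
    """Likelihood of scalar x under a 1-D Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def foreground_weight(x, fg=(1.0, 0.25), bg=(-1.0, 1.0)):
    """Posterior weight that x originated in the foreground (equal priors)."""
    lf = gauss_lik(x, *fg)
    lb = gauss_lik(x, *bg)
    return lf / (lf + lb)

THRESHOLD = 0.5  # hypothetical predefined criterion

def is_speech_command_portion(x):
    # Accept the sample as part of a command only if the weight
    # meets the predefined criterion; otherwise forgo recognition.
    return foreground_weight(x) >= THRESHOLD
```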
-
Patent number: 9784765
Abstract: A system and method are provided for graphically actuating a trigger in a test and measurement device. The method includes displaying a visual representation of signal properties for one or more time-varying signals. A graphical user input is received, in which a portion of the visual representation is designated. The method further includes configuring a trigger of the test and measurement device in response to the graphical user input, by setting a value for a trigger parameter of the trigger. The set value for the trigger parameter varies with, and is dependent upon, the particular portion of the visual representation that is designated by the graphical user input. The trigger is then employed in connection with subsequent monitoring of signals within the test and measurement device.
Type: Grant
Filed: November 3, 2009
Date of Patent: October 10, 2017
Assignee: Tektronix, Inc.
Inventors: Kathryn A. Engholm, Cecilia A. Case
-
Patent number: 9704482
Abstract: A method for spoken term detection, comprising: generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system; generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance i; receiving a plurality of keyword queries; and searching the index for a plurality of keyword hits.
Type: Grant
Filed: March 11, 2015
Date of Patent: July 11, 2017
Assignee: International Business Machines Corporation
Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
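The query flow above can be sketched in simplified form. The patent builds a word-loop weighted finite-state transducer per utterance; here a plain inverted index over a time-marked word list stands in for it, just to show how keyword queries map to timed hits. All entries below are made up for illustration.

```python
from collections import defaultdict

# Time-marked word list entries:
# (utterance_id, word, start_time_s, duration_s, confidence)
tmwl = [
    ("utt1", "open", 0.10, 0.30, 0.95),
    ("utt1", "the",  0.42, 0.10, 0.80),
    ("utt1", "door", 0.55, 0.40, 0.90),
    ("utt2", "door", 1.20, 0.35, 0.70),
]

# Build an inverted index: word -> list of timed occurrences.
index = defaultdict(list)
for utt, word, start, dur, conf in tmwl:
    index[word].append((utt, start, dur, conf))

def search(queries):
    """Return the keyword hits for each query word."""
    return {q: index.get(q, []) for q in queries}

hits = search(["door", "window"])
```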
-
Patent number: 9703394
Abstract: In some examples, a method includes outputting a graphical keyboard (120) for display and, responsive to receiving an indication of a first input (124), determining a new character string that is not included in a language model. The method may include adding the new character string to the language model and associating a likelihood value with the new character string. The method may include, responsive to receiving an indication of a second input, predicting the new character string, and, responsive to receiving an indication of a third input that rejects the new character string, decreasing the likelihood value associated with the new character string.
Type: Grant
Filed: October 1, 2015
Date of Patent: July 11, 2017
Assignee: Google Inc.
Inventors: Yu Ouyang, Shumin Zhai
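The add-then-decay behavior described above can be sketched minimally: an out-of-vocabulary string is added to the language model with an initial likelihood, and an explicit rejection of the prediction decreases that likelihood. The initial value and decay factor are illustrative assumptions.

```python
class AdaptiveLM:
    """Toy language model that learns and unlearns new character strings."""

    def __init__(self, initial=0.5, decay=0.5):
        self.likelihood = {}
        self.initial = initial   # hypothetical starting likelihood
        self.decay = decay       # hypothetical penalty on rejection

    def observe_new_string(self, s):
        # First input: add the new string with an initial likelihood.
        if s not in self.likelihood:
            self.likelihood[s] = self.initial

    def reject(self, s):
        # Third input rejects the prediction: decrease its likelihood.
        if s in self.likelihood:
            self.likelihood[s] *= self.decay

lm = AdaptiveLM()
lm.observe_new_string("brb")   # user types an out-of-vocabulary string
lm.reject("brb")               # user rejects the later prediction
```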
-
Patent number: 9697830
Abstract: A method for spoken term detection, comprising: generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system; generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance i; receiving a plurality of keyword queries; and searching the index for a plurality of keyword hits.
Type: Grant
Filed: June 25, 2015
Date of Patent: July 4, 2017
Assignee: International Business Machines Corporation
Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
-
Patent number: 9672201
Abstract: Systems, methods, and apparatus for learning parsing rules and argument identification from crowdsourcing of proposed command inputs. Crowdsourcing techniques are used to generate rules for parsing input sentences. A parse is used to determine whether the input sentence invokes a specific action and, if so, what arguments are to be passed to the invocation of the action.
Type: Grant
Filed: April 27, 2016
Date of Patent: June 6, 2017
Assignee: Google Inc.
Inventors: Jakob D. Uszkoreit, Percy Liang
-
Patent number: 9620109
Abstract: A server and a guide sentence generating method are provided. The method includes receiving user speech, analyzing the user speech, determining a category of the user speech from among a plurality of categories, storing the user speech in the determined category, determining a usage frequency and a popularity of each of the plurality of categories, selecting a category from among the plurality of categories based on the usage frequency and the popularity, and generating a guide sentence corresponding to the selected category.
Type: Grant
Filed: February 18, 2015
Date of Patent: April 11, 2017
Assignee: Samsung Electronics Co., Ltd.
Inventors: In-jee Song, Ji-hye Chung
-
Patent number: 9542936
Abstract: A method including: receiving, on a computer system, a text search query, the query including one or more query words; generating, on the computer system, for each query word in the query, one or more anchor segments within a plurality of speech-recognition-processed audio files, the one or more anchor segments identifying possible locations containing the query word; post-processing, on the computer system, the one or more anchor segments, the post-processing including expanding the one or more anchor segments, sorting the one or more anchor segments, and merging overlapping ones of the one or more anchor segments; and searching, on the computer system, the post-processed one or more anchor segments for instances of at least one of the one or more query words using a constrained grammar.
Type: Grant
Filed: May 2, 2013
Date of Patent: January 10, 2017
Assignee: Genesys Telecommunications Laboratories, Inc.
Inventors: Amir Lev-Tov, Avi Faizakof, Yochai Konig
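The three post-processing steps named above (expand, sort, merge overlapping) amount to a classic interval-merge pass. A minimal sketch, with segment times in seconds and an expansion margin that are purely illustrative:

```python
def postprocess(segments, margin=0.5):
    """Expand anchor segments by a margin, sort them, and merge overlaps."""
    # Expand: widen each (start, end) segment, clamping start at 0.
    expanded = [(max(0.0, s - margin), e + margin) for s, e in segments]
    # Sort by start time.
    expanded.sort()
    # Merge overlapping segments into maximal non-overlapping ones.
    merged = [expanded[0]]
    for s, e in expanded[1:]:
        last_s, last_e = merged[-1]
        if s <= last_e:                      # overlap: extend the last segment
            merged[-1] = (last_s, max(last_e, e))
        else:
            merged.append((s, e))
    return merged

# Hypothetical anchor segments (start, end) for one query word.
anchors = [(3.0, 3.4), (1.0, 1.2), (1.5, 1.8)]
result = postprocess(anchors)
```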
-
Patent number: 9508339
Abstract: A method for updating language understanding classifier models includes receiving, via one or more microphones of a computing device, a digital voice input from a user of the computing device. Natural language processing using the digital voice input is used to determine a user voice request. Upon determining that the user voice request does not match at least one of a plurality of pre-defined voice commands in a schema definition of a digital personal assistant, a GUI of an end-user labeling tool is used to receive a user selection of at least one of the following: at least one intent of a plurality of available intents and/or at least one slot for the at least one intent. A labeled data set is generated by pairing the user voice request and the user selection, and is used to update a language understanding classifier.
Type: Grant
Filed: January 30, 2015
Date of Patent: November 29, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Vishwac Sena Kannan, Aleksandar Uzelac, Daniel J. Hwang
-
Patent number: 9436382
Abstract: Natural language image editing techniques are described. In one or more implementations, a natural language input is converted from audio data using a speech-to-text engine. A gesture is recognized from one or more touch inputs detected using one or more touch sensors. Performance is then initiated of an operation identified from a combination of the natural language input and the recognized gesture.
Type: Grant
Filed: November 21, 2012
Date of Patent: September 6, 2016
Assignee: Adobe Systems Incorporated
Inventors: Gregg D. Wilensky, Walter W. Chang, Lubomira A. Dontcheva, Gierad P. Laput, Aseem O. Agarwala
-
Patent number: 9275139
Abstract: System and method to search audio data, including: receiving audio data representing speech; receiving a search query related to the audio data; compiling, by use of a processor, the search query into a hierarchy of scored speech recognition sub-searches; searching, by use of a processor, the audio data for speech identified by one or more of the sub-searches to produce hits; and combining, by use of a processor, the hits by use of at least one combination function to provide a composite search score of the audio data. The combination function may include an at-least-M-of-N function that produces a high score when at least M of N function inputs exceed a predetermined threshold value. The composite search score may employ a soft time window such as a spline function.
Type: Grant
Filed: September 27, 2012
Date of Patent: March 1, 2016
Assignee: Aurix Limited
Inventor: Keith Michael Ponting
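The at-least-M-of-N combination function described above is simple to sketch: the composite score is high exactly when at least M of the N sub-search scores exceed a threshold. The threshold and the high/low output values below are illustrative assumptions.

```python
def at_least_m_of_n(scores, m, threshold=0.5, high=1.0, low=0.0):
    """Combination function: high score iff >= m inputs exceed the threshold."""
    hits = sum(1 for s in scores if s > threshold)
    return high if hits >= m else low

# Hypothetical scores from N = 4 speech recognition sub-searches.
sub_scores = [0.9, 0.2, 0.7, 0.6]

# Require at least 3 of the 4 sub-searches to exceed the threshold.
composite = at_least_m_of_n(sub_scores, m=3)
```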
-
Patent number: 9251141
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an entity identification model. In one aspect, a method includes obtaining a plurality of complete sentences that each include entity text that references a first entity; for each complete sentence in the plurality of complete sentences: providing a first portion of the complete sentence as input to an entity identification model that determines a predicted entity for the first portion of the complete sentence, the first portion being less than all of the complete sentence; comparing the predicted entity to the first entity; and updating the entity identification model based on the comparison of the predicted entity to the first entity.
Type: Grant
Filed: May 12, 2014
Date of Patent: February 2, 2016
Assignee: Google Inc.
Inventors: Maxim Gubin, Sangsoo Sung, Krishna Bharat, Kenneth W. Dauber
-
Patent number: 9117452
Abstract: A language processing system identifies, from log data, command inputs that parsed to a parsing rule associated with an action. If the command input has a signal indicative of user satisfaction, where the signal is derived from data that is not generated from performance of the action (e.g., user interactions with data provided in response to the performance of another, different action; resources identified in response to the performance of another, different action having a high quality score; etc.), then exception data is generated for the parsing rule. The exception data specifies the particular instance of the sentence parsed by the parsing rule, and precludes invocation of the action associated with the rule.
Type: Grant
Filed: June 25, 2013
Date of Patent: August 25, 2015
Assignee: Google Inc.
Inventors: Jakob D. Uszkoreit, Percy Liang, Daniel M. Bikel
-
Patent number: 9037470
Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to seek to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
Type: Grant
Filed: June 25, 2014
Date of Patent: May 19, 2015
Assignee: West Business Solutions, LLC
Inventors: Mark J. Pettay, Fonda J. Narke
-
Patent number: 9026431
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for semantic parsing with multiple parsers. One of the methods includes obtaining one or more transcribed prompt n-grams from a speech-to-text recognizer, providing the transcribed prompt n-grams to a first semantic parser that executes on the user device and accesses a first knowledge base for results responsive to the spoken prompt, providing the transcribed prompt n-grams to a second semantic parser that accesses a second knowledge base for results responsive to the spoken prompt, the first knowledge base including first data not included in the second knowledge base, receiving a result responsive to the spoken prompt from the first semantic parser or the second semantic parser, wherein the result is selected from the knowledge base associated with the semantic parser that provided the result to the user device, and performing an operation based on the result.
Type: Grant
Filed: July 30, 2013
Date of Patent: May 5, 2015
Assignee: Google Inc.
Inventors: Pedro J. Moreno Mengibar, Diego Melendo Casado, Fadi Biadsy
-
Patent number: 9026447
Abstract: A first communication path for receiving a communication is established. The communication includes speech, which is processed. A speech pattern is identified as including a voice command. A portion of the speech pattern is determined as including the voice command. That portion of the speech pattern is separated from the speech pattern and compared with a second speech pattern. If the two speech patterns match or resemble each other, the portion of the speech pattern is accepted as the voice command. An operation corresponding to the voice command is determined and performed. The operation may perform an operation on a remote device, forward the voice command to a remote device, or notify a user. The operation may create a second communication path that may allow a headset to join in a communication between another headset and a communication device, several headsets to communicate with each other, or a headset to communicate with several communication devices.
Type: Grant
Filed: January 25, 2008
Date of Patent: May 5, 2015
Assignee: CenturyLink Intellectual Property LLC
Inventors: Erik Geldbach, Kelsyn D. Rooks, Sr., Shane M. Smith, Mark Wilmoth
-
Patent number: 9015043
Abstract: A computer-implemented method includes receiving an electronic representation of one or more human voices, recognizing words in a first portion of the electronic representation of the one or more human voices, and sending suggested search terms to a display device for display to a user in a text format. The suggested search terms are based on the recognized words in the first portion of the electronic representation of the one or more human voices. A search query is received from the user, which includes one or more of the suggested search terms that were displayed to the user.
Type: Grant
Filed: October 1, 2010
Date of Patent: April 21, 2015
Assignee: Google Inc.
Inventor: Scott Jenson
-
Patent number: 9009025
Abstract: In some implementations, a digital work provider may provide language model information related to a plurality of different contexts, such as a plurality of different digital works. For example, the language model information may include language model difference information identifying a plurality of sequences of one or more words in a digital work that have probabilities of occurrence that differ from probabilities of occurrence in a base language model by a threshold amount. The language model difference information corresponding to a particular context may be used in conjunction with the base language model to recognize an utterance made by a user of a user device. In some examples, the recognition is performed on the user device. In other examples, the utterance and associated context information are sent over a network to a recognition computing device that performs the recognition.
Type: Grant
Filed: December 27, 2011
Date of Patent: April 14, 2015
Assignee: Amazon Technologies, Inc.
Inventor: Brandon W. Porter
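The difference-information idea above can be sketched as a lookup that overlays a small per-work table on a base language model: only word sequences whose probabilities differ enough from the base are shipped, and everything else falls back to the base model. All probabilities and n-grams below are invented for illustration.

```python
# Hypothetical base language model: n-gram -> probability.
base_lm = {"call me": 1e-4, "ishmael": 1e-6, "the whale": 5e-5}

# Hypothetical difference information for one digital work: only entries
# whose probability differs from the base by a threshold are included.
diff_lm = {"ishmael": 2e-3, "call me ishmael": 1e-3}

def prob(ngram, diff, base, default=1e-8):
    """Look up an n-gram probability: work-specific overlay first,
    then the base model, then a floor for unseen n-grams."""
    if ngram in diff:
        return diff[ngram]
    return base.get(ngram, default)
```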
-
Patent number: 9002710
Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition is performed against a larger, more general vocabulary that is not constrained to the given section.
Type: Grant
Filed: September 12, 2012
Date of Patent: April 7, 2015
Assignee: Nuance Communications, Inc.
Inventors: Alwin B. Carus, Larissa Lapshina, Raghu Vemula
-
Patent number: 8996368
Abstract: A feature transform for speech recognition is described. An input speech utterance is processed to produce a sequence of representative speech vectors. A time-synchronous speech recognition pass is performed using a decoding search to determine a recognition output corresponding to the speech input. The decoding search includes, for each speech vector after some first threshold number of speech vectors, estimating a feature transform based on the preceding speech vectors in the utterance and partial decoding results of the decoding search. The current speech vector is then adjusted based on the current feature transform, and the adjusted speech vector is used in a current frame of the decoding search.
Type: Grant
Filed: February 22, 2010
Date of Patent: March 31, 2015
Assignee: Nuance Communications, Inc.
Inventor: Daniel Willett
-
Patent number: 8990071
Abstract: A method for managing an interaction of a calling party with a communication partner is provided. The method includes automatically determining if the communication partner expects DTMF input. The method also includes translating speech input to one or more DTMF tones and communicating the one or more DTMF tones to the communication partner, if the communication partner expects DTMF input.
Type: Grant
Filed: March 29, 2010
Date of Patent: March 24, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventors: Yun-Cheng Ju, Stefanie Tomko, Frank Liu, Ivan Tashev
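The translation step described above reduces, at its core, to mapping recognized digit words onto DTMF symbols when the partner expects tone input. A minimal sketch; the word list and the example transcript are illustrative, not taken from the patent.

```python
# Hypothetical mapping from recognized digit words to DTMF symbols.
WORD_TO_DTMF = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "star": "*", "pound": "#",
}

def speech_to_dtmf(transcript):
    """Translate a recognized utterance into a DTMF digit string,
    ignoring words that do not name a tone."""
    return "".join(
        WORD_TO_DTMF[w] for w in transcript.lower().split() if w in WORD_TO_DTMF
    )

# E.g. an IVR menu that expects DTMF while the caller speaks:
tones = speech_to_dtmf("press one two three pound")
```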
-
Patent number: 8990086
Abstract: A recognition confidence measurement method, medium, and system are provided that can more accurately determine whether an input speech signal is in-vocabulary, by extracting an optimum number of candidates that match a phone string extracted from the input speech signal and estimating a lexical distance between the extracted candidates. A recognition confidence measurement method includes: extracting a phoneme string from a feature vector of an input speech signal; extracting candidates by matching the extracted phoneme string against phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the input speech signal is in-vocabulary, based on the lexical distance.
Type: Grant
Filed: July 31, 2006
Date of Patent: March 24, 2015
Assignee: Samsung Electronics Co., Ltd.
Inventors: Sang-Bae Jeong, Nam Hoon Kim, Ick Sang Han, In Jeong Choi, Gil Jin Jang, Jae-Hoon Jeong
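The lexical-distance step above can be sketched with a standard edit (Levenshtein) distance over phoneme strings; the patent does not specify the exact metric, so this is one plausible choice, and the phoneme sequences below are invented for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                  # deletion
                cur[j - 1] + 1,               # insertion
                prev[j - 1] + (pa != pb),     # substitution (0 if equal)
            ))
        prev = cur
    return prev[-1]

# Hypothetical phoneme strings for two recognition candidates.
cand1 = ["S", "AH", "M", "S", "AH", "NG"]
cand2 = ["S", "AH", "M", "S", "AO", "N"]

# A small distance between top candidates suggests a dense in-vocabulary
# neighborhood; a large distance can signal an out-of-vocabulary input.
dist = edit_distance(cand1, cand2)
```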
-
Patent number: 8990070
Abstract: A method, system and computer program product for building an expression, including utilizing any formal grammar of a context-free language, displaying an expression on a computer display via a graphical user interface, replacing at least one non-terminal display object within the displayed expression with any of at least one non-terminal display object and at least one terminal display object, and repeating the replacing step a plurality of times for a plurality of non-terminal display objects until no non-terminal display objects remain in the displayed expression, wherein the non-terminal display objects correspond to non-terminal elements within the grammar, and wherein the terminal display objects correspond to terminal elements within the grammar.
Type: Grant
Filed: November 18, 2011
Date of Patent: March 24, 2015
Assignee: International Business Machines Corporation
Inventors: Yigal S. Dayan, Gil Fuchs, Josemina M. Magdalen
-
Patent number: 8983841
Abstract: A network communication node includes an audio outputter that outputs an audible representation of data to be provided to a requester. The network communication node also includes a processor that determines a categorization of the data to be provided to the requester and that varies a pause between segments of the audible representation of the data in accordance with the categorization of the data to be provided to the requester.
Type: Grant
Filed: July 15, 2008
Date of Patent: March 17, 2015
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Gregory Pulz, Steven Lewis, Charles Rajnai
-
Patent number: 8977549
Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
Type: Grant
Filed: September 26, 2013
Date of Patent: March 10, 2015
Assignee: Nuance Communications, Inc.
Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
-
Patent number: 8977547
Abstract: A voice recognition system includes: a voice input unit (11) for inputting a voice uttered a plurality of times; a registering voice data storage unit (12) for storing the voice data uttered the plurality of times and input into the voice input unit (11); an utterance stability verification unit (13) for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit (12), and determining that registration of the voice data is acceptable when the similarity is greater than a threshold T1; and a standard pattern creation unit (14) for creating a standard pattern by using the voice data for which the utterance stability verification unit (13) determines that registration is acceptable.
Type: Grant
Filed: October 8, 2009
Date of Patent: March 10, 2015
Assignee: Mitsubishi Electric Corporation
Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
-
Patent number: 8938382
Abstract: An item of information (212) is transmitted to a distal computer (220), translated to a different sense modality and/or language (222) in substantially real time, and the translation (222) is transmitted back to the location (211) from which the item was sent. The device sending the item is preferably a wireless device, and more preferably a cellular or other telephone (210). The device receiving the translation is also preferably a wireless device, and more preferably a cellular or other telephone, and may advantageously be the same device as the sending device. The item of information (212) preferably comprises a sentence of human speech having at least ten words, and the translation is a written expression of the sentence. All of the steps of transmitting the item of information, executing the program code, and transmitting the translated information preferably occur in less than 60 seconds of elapsed time.
Type: Grant
Filed: March 21, 2012
Date of Patent: January 20, 2015
Assignee: Ulloa Research Limited Liability Company
Inventor: Robert D. Fish
-
Patent number: 8938388
Abstract: Maintaining and supplying a plurality of speech models is provided. A plurality of speech models and metadata for each speech model are stored. A query for a speech model is received from a source. The query includes one or more conditions. The speech model with metadata most closely matching the supplied one or more conditions is determined. The determined speech model is provided to the source. A refined speech model is received from the source, and the refined speech model is stored.
Type: Grant
Filed: July 9, 2012
Date of Patent: January 20, 2015
Assignee: International Business Machines Corporation
Inventors: Bin Jia, Ying Liu, E. Feng Lu, Jia Wu, Zhen Zhang
-
Patent number: 8930187
Abstract: An apparatus for utilizing textual data and acoustic data corresponding to speech data to detect sentiment may include a processor and memory storing executable computer code causing the apparatus to at least perform operations including evaluating textual data and acoustic data corresponding to voice data associated with captured speech content. The computer program code may further cause the apparatus to analyze the textual data and the acoustic data to detect whether the textual data or the acoustic data includes one or more words indicating at least one sentiment of a user that spoke the speech content. The computer program code may further cause the apparatus to assign at least one predefined sentiment to at least one of the words in response to detecting that the word(s) indicates the sentiment of the user. Corresponding methods and computer program products are also provided.
Type: Grant
Filed: January 3, 2012
Date of Patent: January 6, 2015
Assignee: Nokia Corporation
Inventors: Imre Attila Kiss, Joseph Polifroni, Francois Mairesse, Mark Adler
-
Patent number: 8930191
Abstract: Methods, systems, and computer-readable storage media related to operating an intelligent digital assistant are disclosed. A user request is received, the user request including at least a speech input received from a user. In response to the user request, (1) an echo of the speech input based on a textual interpretation of the speech input, and (2) a paraphrase of the user request based at least in part on a respective semantic interpretation of the speech input are presented to the user.
Type: Grant
Filed: March 4, 2013
Date of Patent: January 6, 2015
Assignee: Apple Inc.
Inventors: Thomas Robert Gruber, Harry Joseph Saddler, Adam John Cheyer, Dag Kittlaus, Christopher Dean Brigham, Richard Donald Giuli, Didier Rene Guzzoni, Marcello Bastea-Forte
-
Patent number: 8924214
Abstract: A method for detecting and recognizing speech is provided that remotely detects body motions from a speaker during vocalization with one or more radar sensors. Specifically, the radar sensors include a transmit aperture that transmits one or more waveforms towards the speaker, and each of the waveforms has a distinct wavelength. A receiver aperture is configured to receive the scattered radio frequency energy from the speaker. Doppler signals correlated with the speaker vocalization are extracted with a receiver. Digital signal processors are configured to develop feature vectors utilizing the vocalization Doppler signals, and words associated with the feature vectors are recognized with a word classifier.
Type: Grant
Filed: June 7, 2011
Date of Patent: December 30, 2014
Assignee: The United States of America, as represented by the Secretary of the Navy
Inventors: Jefferson M Willey, Todd Stephenson, Hugh Faust, James P. Hansen, George J Linde, Carol Chang, Justin Nevitt, James A Ballas, Thomas Herne Crystal, Vincent Michael Stanford, Jean W. De Graaf
-
Patent number: 8924197
Abstract: Disclosed are systems, methods, and computer readable media for converting a natural language query into a logical query. The method embodiment comprises receiving a natural language query and converting the natural language query using an extensible engine to generate a logical query, the extensible engine being linked to the toolkit and knowledge base. In one embodiment, a natural language query can be processed in a domain independent method to generate a logical query.
Type: Grant
Filed: October 30, 2007
Date of Patent: December 30, 2014
Assignee: Semantifi, Inc.
Inventors: Sreenivasa Rao Pragada, Viswanath Dasari, Abhijit A Patil
-
Patent number: 8918320
Abstract: An apparatus for generating a review based in part on detected sentiment may include a processor and memory storing executable computer code causing the apparatus to at least perform operations including determining a location(s) of the apparatus and a time(s) that the location(s) was determined responsive to capturing voice data of speech content associated with spoken reviews of entities. The computer program code may further cause the apparatus to analyze textual and acoustic data corresponding to the voice data to detect whether the textual or acoustic data includes words indicating a sentiment(s) of a user speaking the speech content. The computer program code may further cause the apparatus to generate a review of an entity corresponding to a spoken review(s) based on assigning a predefined sentiment to a word(s) responsive to detecting that the word indicates the sentiment of the user. Corresponding methods and computer program products are also provided.
Type: Grant
Filed: January 3, 2012
Date of Patent: December 23, 2014
Assignee: Nokia Corporation
Inventors: Mark Adler, Imre Attila Kiss, Francois Mairesse, Joseph Polifroni
-
Patent number: 8918316
Abstract: The content of a media program is recognized by analyzing its audio content to extract therefrom prescribed features, which are compared to a database of features associated with identified content. The identity of the content within the database that has features that most closely match the features of the media program being played is supplied as the identity of the program being played. The features are extracted from a frequency domain version of the media program by a) filtering the coefficients to reduce their number, e.g., using triangular filters; b) grouping a number of consecutive outputs of triangular filters into segments; and c) selecting those segments that meet prescribed criteria, such as those segments that have the largest minimum segment energy with prescribed constraints that prevent the segments from being too close to each other. The triangular filters may be log-spaced and their output may be normalized.
Type: Grant
Filed: July 29, 2003
Date of Patent: December 23, 2014
Assignee: Alcatel Lucent
Inventors: Jan I Ben, Christopher J Burges, Madjid Sam Mousavi, Craig R. Nohl
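Step c) of the feature extraction, picking segments with the largest minimum energy while keeping them apart, can be sketched as a greedy selection. This is one plausible reading of the criterion; the parameter names and the greedy strategy are assumptions for illustration, not the patent's prescribed algorithm.

```python
# Sketch of the segment-selection idea: given each segment's minimum filter
# energy, greedily pick the segments with the largest minimum energy while
# enforcing a minimum distance between chosen segment indices.

def select_segments(min_energies, num_segments, min_distance):
    """Pick indices of segments with the largest minimum energy,
    no two chosen indices closer than min_distance."""
    order = sorted(range(len(min_energies)),
                   key=lambda i: min_energies[i], reverse=True)
    chosen = []
    for i in order:
        if all(abs(i - j) >= min_distance for j in chosen):
            chosen.append(i)
        if len(chosen) == num_segments:
            break
    return sorted(chosen)

picked = select_segments([0.1, 0.9, 0.8, 0.2, 0.7], num_segments=2, min_distance=2)
```

Here segment 2 is skipped despite its high energy because it sits next to the already-chosen segment 1.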
-
Patent number: 8914287
Abstract: One embodiment may take the form of a voice control system. The system may include a first apparatus with a processing unit configured to execute a voice recognition module and one or more executable commands, and a receiver coupled to the processing unit and configured to receive a first audio file from a remote control device. The first audio file may include at least one voice command. The first apparatus may further include a communication component coupled to the processing unit and configured to receive programming content, and one or more storage media storing the voice recognition module. The voice recognition module may be configured to convert voice commands into text.
Type: Grant
Filed: January 28, 2011
Date of Patent: December 16, 2014
Assignee: EchoStar Technologies L.L.C.
Inventors: Jeremy Mickelsen, Nathan A. Hale, Benjamin Mauser, David A. Innes, Brad Bylund
-
Patent number: 8914290
Abstract: Method and apparatus that dynamically adjusts operational parameters of a text-to-speech engine in a speech-based system. A voice engine or other application of a device provides a mechanism to alter the adjustable operational parameters of the text-to-speech engine. In response to one or more environmental conditions, the adjustable operational parameters of the text-to-speech engine are modified to increase the intelligibility of synthesized speech.
Type: Grant
Filed: May 18, 2012
Date of Patent: December 16, 2014
Assignee: Vocollect, Inc.
Inventors: James Hendrickson, Debra Drylie Scott, Duane Littleton, John Pecorari, Arkadiusz Slusarczyk
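The environment-driven adjustment this abstract describes can be sketched as a mapping from a measured condition (here, ambient noise level) to modified TTS parameters. The parameter names, thresholds, and values below are illustrative assumptions, not values from the patent.

```python
# Hedged sketch: adapt text-to-speech parameters to ambient noise to
# aid intelligibility. Thresholds and parameter names are invented.

def adjust_tts_parameters(noise_db, params=None):
    """Return TTS parameters adjusted for the measured noise level (dB)."""
    params = dict(params or {"volume": 0.5, "rate": 1.0})
    if noise_db > 80:       # very noisy: speak louder and slower
        params.update(volume=1.0, rate=0.8)
    elif noise_db > 60:     # moderately noisy
        params.update(volume=0.75, rate=0.9)
    return params

loud = adjust_tts_parameters(85)
```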
-
Patent number: 8909528
Abstract: A method (and system) of determining confusable list items and resolving this confusion in a spoken dialog system includes receiving user input, processing the user input and determining if a list of items needs to be played back to the user, retrieving the list to be played back to the user, identifying acoustic confusions between items on the list, changing the items on the list as necessary to remove the acoustic confusions, and playing unambiguous list items back to the user.
Type: Grant
Filed: May 9, 2007
Date of Patent: December 9, 2014
Assignee: Nuance Communications, Inc.
Inventors: Ellen Marie Eide, Vaibhava Goel, Ramesh Gopinath, Osamuyimen T. Stewart
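The identify-and-resolve loop in this abstract can be sketched with string similarity as a crude stand-in for acoustic confusability; a real dialog system would compare phonetic representations. The disambiguation step here simply appends an index, purely for illustration.

```python
# Sketch of confusion detection on a playback list. Normalized edit-based
# similarity (difflib) stands in for an acoustic confusability measure;
# the renaming scheme is an invented example of "changing the items".

from difflib import SequenceMatcher

def disambiguate(items, threshold=0.8):
    """Rename any item that is too similar to an earlier kept item."""
    result = []
    for item in items:
        confusable = any(SequenceMatcher(None, item, kept).ratio() >= threshold
                         for kept in result)
        result.append(f"{item} (option {len(result) + 1})" if confusable else item)
    return result

out = disambiguate(["John Smith", "Jon Smith", "Alice Jones"])
```

"Jon Smith" is flagged as confusable with "John Smith" and relabeled, so the played-back list is unambiguous.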
-
Patent number: 8903724
Abstract: A speech recognition device includes, a speech recognition section that conducts a search, by speech recognition, on audio data stored in a first memory section to extract word-spoken portions where plural words transferred are each spoken and, of the word-spoken portions extracted, rejects the word-spoken portion for the word designated as a rejecting object; an acquisition section that obtains a derived word of a designated search target word, the derived word being generated in accordance with a derived word generation rule stored in a second memory section or read out from the second memory section; a transfer section that transfers the derived word and the search target word to the speech recognition section, the derived word being set to the outputting object or the rejecting object by the acquisition section; and an output section that outputs the word-spoken portion extracted and not rejected in the search.
Type: Grant
Filed: February 1, 2012
Date of Patent: December 2, 2014
Assignee: Fujitsu Limited
Inventors: Nobuyuki Washio, Shouji Harada
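The derive-then-filter flow in this abstract can be sketched over text: generate derived forms of a search target word from simple rules, then return spoken matches while dropping forms designated as rejecting objects. The suffix rules and words are invented for the example; the patent operates on recognized audio, not plain strings.

```python
# Illustrative sketch of derived-word search with rejection.
# DERIVATION_RULES is an assumed stand-in for the stored generation rule.

DERIVATION_RULES = ["", "s", "ing", "ed"]  # "play" -> play, plays, playing, played

def derived_words(word):
    return [word + suffix for suffix in DERIVATION_RULES]

def search(transcript_words, target, rejects=()):
    """Return hits for the target or its derived forms, minus rejected forms."""
    candidates = set(derived_words(target)) - set(rejects)
    return [w for w in transcript_words if w in candidates]

hits = search(["please", "playing", "played", "plays"], "play", rejects=["plays"])
```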
-
Patent number: 8903727
Abstract: A machine, system and method for user-guided teaching and modifications of voice commands and actions to be executed by a conversational learning system. The machine includes a system bus for communicating data and control signals received from the conversational learning system to a computer system, a vehicle data and control bus for connecting devices and sensors in the machine, a bridge module for connecting the vehicle data and control bus to the system bus, machine subsystems coupled to the vehicle data and control bus having a respective user interface for receiving a voice command or input signal from a user, a memory coupled to the system bus for storing action command sequences learned for a new voice command and a processing unit coupled to the system bus for automatically executing the action command sequences learned when the new voice command is spoken.
Type: Grant
Filed: March 6, 2013
Date of Patent: December 2, 2014
Assignee: Nuance Communications, Inc.
Inventors: Liam David Comerford, Mahesh Viswanathan
-
Patent number: 8892446
Abstract: Methods, systems, and computer readable storage medium related to operating an intelligent digital assistant are disclosed. A user request is received, the user request including at least a speech input received from the user. The user request is processed to obtain a representation of user intent, where the representation of user intent associates the user request with a task flow operationalizing a requested task, and the task flow is operable to invoke a plurality of services each supporting functions according to a respective plurality of service parameters. Based on the representation of user intent, one or more relevant task parameters are identified from a plurality of task parameters of the task flow. A subset of the plurality of services are selectively invoked during execution of the task flow, where the selectively invoked subset of the plurality of services support functions according to the identified one or more relevant task parameters.
Type: Grant
Filed: December 21, 2012
Date of Patent: November 18, 2014
Assignee: Apple Inc.
Inventors: Adam John Cheyer, Didier Rene Guzzoni, Thomas Robert Gruber, Christopher Dean Brigham
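The selective-invocation idea in this abstract can be sketched as a parameter-overlap check: each service advertises the task parameters it supports, and only services covering a relevant parameter of the interpreted request are called. The service names and parameters below are hypothetical.

```python
# Sketch of selecting the subset of services to invoke for a task flow,
# based on overlap with the identified relevant task parameters.
# Service registry contents are invented for this example.

SERVICES = {
    "restaurant_search": {"cuisine", "location"},
    "weather": {"location", "date"},
    "calendar": {"date", "attendees"},
}

def relevant_services(task_parameters):
    """Return names of services whose supported parameters overlap the task's."""
    return sorted(name for name, supported in SERVICES.items()
                  if supported & task_parameters)

invoked = relevant_services({"cuisine", "location"})
```

For a "find me an Italian restaurant nearby" intent, only the services touching `cuisine` or `location` are invoked; the calendar service is skipped.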
-
Patent number: 8874440
Abstract: A speech detection apparatus and method are provided. The speech detection apparatus and method determine whether a frame is speech or not using feature information extracted from an input signal. The speech detection apparatus may estimate a situation related to an input frame and determine which feature information is required for speech detection for the input frame in the estimated situation. The speech detection apparatus may detect a speech signal using dynamic feature information that may be more suitable to the situation of a particular frame, instead of using the same feature information for each and every frame.
Type: Grant
Filed: April 16, 2010
Date of Patent: October 28, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Chi-youn Park, Nam-hoon Kim, Jeong-mi Cho
-
Patent number: 8868409
Abstract: In some implementations, audio data for an utterance is provided over a network. At a client device and over the network, information is received that indicates candidate transcriptions for the utterance and semantic information for the candidate transcriptions. A semantic parser is used at the client device to evaluate each of at least a plurality of the candidate transcriptions. One of the candidate transcriptions is selected based on at least the received semantic information and the output of the semantic parser for the plurality of candidate transcriptions that are evaluated.
Type: Grant
Filed: January 16, 2014
Date of Patent: October 21, 2014
Assignee: Google Inc.
Inventors: Pedro J. Moreno Mengibar, Fadi Biadsy, Diego Melendo Casado
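The selection step, combining server-provided semantic information with a client-side parser's output, can be sketched as a score combination. The toy parser below just checks candidates against a few command grammars; the patterns, scores, and scoring rule are all assumptions for illustration.

```python
# Hedged sketch: pick a transcription by combining a server semantic score
# with a trivial client-side "semantic parser" stand-in.

import re

COMMAND_PATTERNS = [r"^call \w+$", r"^play .+$", r"^set an? alarm .+$"]

def parses(text):
    """Toy client-side parser: 1.0 if the text fits a known grammar, else 0.0."""
    return 1.0 if any(re.match(p, text) for p in COMMAND_PATTERNS) else 0.0

def select_transcription(candidates):
    """candidates: list of (text, server_semantic_score) pairs."""
    return max(candidates, key=lambda c: c[1] + parses(c[0]))[0]

best = select_transcription([("call bob", 0.6), ("cold bob", 0.7)])
```

Even though "cold bob" has the higher server score, the parseable candidate "call bob" wins once the client-side parser's output is added in.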
-
Patent number: 8818816
Abstract: A voice recognition device includes a voice input unit 11 for inputting a voice of an uttered button name to convert the voice into an electric signal, a voice recognition processing unit 12 for performing a voice recognition process according to a sound signal sent thereto, as the electric signal, from the voice input unit, a button candidate detecting unit 13 for detecting, as a button candidate, a button having a button name which partially matches a voice recognition result acquired by the voice recognition processing unit, a display control unit 15 for, when a plurality of candidate buttons are detected by the button candidate detecting unit, producing a screen showing a state in which at least one of the plurality of button candidates is selected, and a display unit 16 for displaying the screen produced by the display control unit.
Type: Grant
Filed: April 23, 2009
Date of Patent: August 26, 2014
Assignee: Mitsubishi Electric Corporation
Inventors: Yuzuru Inoue, Takayoshi Chikuri, Yuki Furumoto
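The button-candidate step this abstract describes, partial matching of the recognition result against button names, can be sketched as a substring test with the first candidate preselected when several match. The button names are invented for the example.

```python
# Illustrative sketch of partial-match button candidate detection.

def button_candidates(buttons, recognized):
    """Return buttons whose names contain (or are contained in) the result."""
    return [b for b in buttons if recognized in b or b in recognized]

buttons = ["Main Menu", "Menu Settings", "Volume"]
candidates = button_candidates(buttons, "Menu")
# When several candidates match, show them with one preselected.
selected = candidates[0] if candidates else None
```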