Markov Patents (Class 704/256)
  • Patent number: 10713313
    Abstract: An internet question answering method is applied in an internet question answering system. A current-turn user question is received. Based on a discourse labeling model, a discourse of the current-turn user question is labeled online and in real time as “discourse continue” or “discourse end”. Based on the discourse of the current-turn user question, whether to clear at least one discourse of an old boundary is determined. The internet question answering system answers the current-turn user question. A match degree between a next-turn user question and a segment of the dialog log is compared to determine whether the discourse of the current-turn user question is correct, whether to correct the discourse of the current-turn user question, and whether to adjust the discourse labeling model by using the current-turn user question.
    Type: Grant
    Filed: December 29, 2015
    Date of Patent: July 14, 2020
    Assignee: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE
    Inventors: Min-Hsin Shen, Ching-Hsien Li, Chung-Jen Chiu
  • Patent number: 10671428
    Abstract: An exemplary method for using a virtual assistant may include, at an electronic device configured to transmit and receive data, receiving a user request for a service from a virtual assistant; determining at least one task to perform in response to the user request; estimating at least one performance characteristic for completion of the at least one task with the electronic device, based on at least one heuristic; based on the estimating, determining whether to execute the at least one task at the electronic device; in accordance with a determination to execute the at least one task at the electronic device, causing the execution of the at least one task at the electronic device; in accordance with a determination to execute the at least one task outside the electronic device: generating executable code for carrying out the at least one task; and transmitting the executable code from the electronic device.
    Type: Grant
    Filed: May 26, 2016
    Date of Patent: June 2, 2020
    Assignee: Apple Inc.
    Inventor: Nicolas Zeitlin
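    Illustrative sketch (not from the patent): one way the local-versus-remote decision described above could be expressed in code, using an invented latency heuristic, battery cutoff, and task fields.
```python
# Toy sketch of deciding whether a virtual-assistant task runs on-device or is
# shipped elsewhere for execution. The heuristic, thresholds, and task fields
# are illustrative assumptions, not the patent's actual method.
def estimate_latency_seconds(task: dict) -> float:
    # Pretend latency grows with input size and model complexity.
    return 0.01 * task["input_kb"] + 0.5 * task["model_complexity"]

def should_run_locally(task: dict, latency_budget_s: float = 2.0,
                       battery_pct: float = 100.0) -> bool:
    return (estimate_latency_seconds(task) <= latency_budget_s
            and battery_pct > 20.0)

task = {"input_kb": 40, "model_complexity": 2}
if should_run_locally(task, battery_pct=75.0):
    print("execute the task on the device")
else:
    print("generate executable code and transmit it for execution elsewhere")
```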
  • Patent number: 10659488
    Abstract: A statistical model for predicting an expected path length (“EPL”) of the steps of an attacker is described. The model is based on utilizing vulnerability information along with an attack graph. Using the model, it is possible to identify the interaction among vulnerabilities and individual variables or risk factors that drive the EPL. Gaining a better understanding of the relationship between the vulnerabilities and their interactions can provide security administrators with a better view and understanding of their security status. In addition, a number of different attributable variables and their contribution in estimating the EPL can be ranked. Thus, it is possible to utilize the ranking process to take precautions and actions to minimize the EPL.
    Type: Grant
    Filed: February 28, 2018
    Date of Patent: May 19, 2020
    Assignee: University of South Florida
    Inventors: Sasith Maduranga Rajasooriya, Chris Peter Tsokos, Pubudu Kalpani K. Hitigala Kaluarachchilage
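    Illustrative sketch (not from the patent): computing an expected path length once the attack graph has been reduced to an absorbing Markov chain, via the fundamental matrix N = (I - Q)^(-1); all states and transition probabilities below are invented.
```python
# Illustrative only: expected path length (EPL) of attacker steps, modeling an
# attack graph as an absorbing Markov chain. States, transition probabilities,
# and the goal state are invented for this example, not taken from the patent.
import numpy as np

# States: 0 = initial foothold, 1 = user privilege, 2 = admin privilege (goal).
# P[i, j] = probability that one attack step moves the attacker from i to j.
P = np.array([
    [0.4, 0.5, 0.1],
    [0.0, 0.3, 0.7],
    [0.0, 0.0, 1.0],   # goal state is absorbing
])

transient = [0, 1]
Q = P[np.ix_(transient, transient)]              # transitions among transient states
N = np.linalg.inv(np.eye(len(transient)) - Q)    # fundamental matrix
epl = N.sum(axis=1)                              # expected number of steps to the goal

for state, steps in zip(transient, epl):
    print(f"EPL from state {state}: {steps:.2f} steps")
```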
  • Patent number: 10650150
    Abstract: According to the embodiments, a statistical model is developed to estimate the probability of being in a certain stage of a particular vulnerability in its life cycle. The methodology, with the application of Markov chain theory, gives the basis for calculating estimates of the probabilities for different stages of the life cycle of the vulnerability considered. Using the developed method, it is possible to evaluate the risk level of a particular vulnerability at a certain time. These developments provide an advantage in taking measures to avoid exploitation and in introducing patches for the vulnerability before an attacker takes advantage of that particular vulnerability.
    Type: Grant
    Filed: February 28, 2018
    Date of Patent: May 12, 2020
    Assignee: University of South Florida
    Inventors: Sasith Maduranga Rajasooriya, Chris Peter Tsokos, Pubudu Kalpani K Hitigala Kaluarachchilage
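    Illustrative sketch (not from the patent): propagating a stage distribution of a vulnerability life cycle through an invented Markov transition matrix to read off the probability of the exploited stage over time.
```python
# Illustrative only: probability of a vulnerability occupying each life-cycle
# stage after t steps, computed with Markov chain theory. Stage names and
# transition probabilities are invented for this example.
import numpy as np

stages = ["undiscovered", "discovered", "exploited", "patched"]
P = np.array([
    [0.80, 0.20, 0.00, 0.00],
    [0.00, 0.60, 0.25, 0.15],
    [0.00, 0.00, 0.70, 0.30],
    [0.00, 0.00, 0.00, 1.00],   # patched is absorbing
])

v = np.array([1.0, 0.0, 0.0, 0.0])     # initially undiscovered
for t in range(1, 6):
    v = v @ P
    print(f"t={t}: P(exploited) = {v[stages.index('exploited')]:.3f}")
```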
  • Patent number: 10635698
    Abstract: A dialogue system including: an input receiving data relating to a speech or text signal originating from a user; and a processor configured to: update a system state based on the input data using a state tracker model, the system state including probability values associated with each of plural possible values for each of plural categories; identify one or more relevant categories based on at least part of the updated system state information using an identifier model; define a set of information from stored information including plural action functions and categories, excluding categories not identified as relevant; generate a reduced system state, including the probability values associated with one or more of the plural possible values for each relevant category; determine an action based on the reduced system state and the set of information using a policy model; output information specified by the determined action at an output.
    Type: Grant
    Filed: August 28, 2017
    Date of Patent: April 28, 2020
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Alexandros Papangelis, Ioannis Stylianou
  • Patent number: 10623557
    Abstract: Embodiments of the present invention provide a method, system and computer program product for telephonic fraud prevention. In an embodiment of the invention, a method of telephonic fraud prevention includes speech recognizing in real-time, audio of a telephone call from a caller to a called party. Once the real-time audio has been speech recognized into text, the text may be parsed into words and the words compared to a hierarchical semantic tree of terms. On the condition that a match is detected between one of the words and a node of the tree, a level of the tree is identified at which the match is detected and the level is correlated to a degree of risk of the telephone call. Finally, an alert is displayed to the called party reflecting the degree of risk.
    Type: Grant
    Filed: May 30, 2018
    Date of Patent: April 14, 2020
    Assignee: International Business Machines Corporation
    Inventors: Norman Kung, Tao-Hung Jung, Ching-Chun Liu, Yu-siang Chen, Ci-Wei Lan
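    Illustrative sketch (not from the patent): matching recognized words against a term tree and mapping the deepest matched level to a risk degree; the tree contents, levels, and risk labels are invented.
```python
# Toy sketch of correlating the level of a matched term in a hierarchical
# semantic tree with a call-risk degree. The tree contents, levels, and risk
# labels are illustrative assumptions.
SEMANTIC_TREE_LEVEL = {
    "payment": 1,    # level 1: generic financial vocabulary
    "transfer": 2,   # level 2: moving money
    "urgent": 2,
    "wire": 3,       # level 3: high-risk instruction
}
RISK_BY_LEVEL = {0: "none", 1: "low", 2: "medium", 3: "high"}

def call_risk(recognized_text: str) -> str:
    deepest = 0
    for word in recognized_text.lower().split():
        deepest = max(deepest, SEMANTIC_TREE_LEVEL.get(word, 0))
    return RISK_BY_LEVEL[deepest]

print(call_risk("please wire the payment today"))   # -> high
```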
  • Patent number: 10607594
    Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: March 31, 2020
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal
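    Illustrative sketch (not from the patent): one plausible way to pull the pitch contour of selected units toward a desired prosodic curve, using a simple mean/variance match; the contour values are invented.
```python
# Illustrative only: reshape the pitch (f0) contour of selected speech units so
# it follows a desired prosodic curve. Treating the prosodic curve as an f0
# contour and using a mean/variance match are assumptions made for this sketch.
import numpy as np

def match_prosody(actual_f0: np.ndarray, desired_f0: np.ndarray) -> np.ndarray:
    # Resample the desired curve to the length of the actual one, then shift
    # and scale the actual contour toward the desired statistics.
    desired = np.interp(np.linspace(0, 1, len(actual_f0)),
                        np.linspace(0, 1, len(desired_f0)), desired_f0)
    a_mean, a_std = actual_f0.mean(), actual_f0.std() + 1e-8
    return (actual_f0 - a_mean) / a_std * desired.std() + desired.mean()

actual = np.array([110.0, 112.0, 115.0, 113.0])     # f0 of selected units (Hz)
desired = np.array([140.0, 150.0, 160.0])           # desired prosodic curve (Hz)
print(match_prosody(actual, desired).round(1))
```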
  • Patent number: 10573299
    Abstract: Methods and systems for a transportation vehicle are provided. One method includes receiving a user input for a valid communication session by a processor executable, digital assistant at a device on a transportation vehicle; tagging by the digital assistant, the user input words with a grammatical connotation; generating an action context, a filter context and a response context by a neural network, based on the tagged user input; storing by the digital assistant, a key-value pair for a parameter of the filter context at a short term memory, based on an output from the neural network; updating by the digital assistant, the key-value pair at the short term memory after receiving a reply to a follow-up request and another output from the trained neural network; and providing a response to the reply by the digital assistant.
    Type: Grant
    Filed: April 16, 2018
    Date of Patent: February 25, 2020
    Assignee: Panasonic Avionics Corporation
    Inventors: Rawad Hilal, Gurmukh Khabrani, Chin Perng
  • Patent number: 10507121
    Abstract: Described is a system for decoding recorded signals into movement commands of a prosthetic device. Using a biomechanical model and physical action data, biological signal data is related to kinetic data. The physical action data can include position, joint angle, speed, and acceleration of a part of a limb. The biological signal data can include recorded neural signals and recorded muscle signals. The kinetic data can include force, power, torque, and stress. Based on the relationship between the biological signal data and the kinetic data, control commands are generated to achieve an intended position and/or movement of a prosthesis.
    Type: Grant
    Filed: October 17, 2016
    Date of Patent: December 17, 2019
    Assignee: HRL Laboratories, LLC
    Inventors: Heiko Hoffmann, Vincent De Sapio, Darren J. Earl
  • Patent number: 10395646
    Abstract: Described herein are systems and methods for two-stage training of a spoken dialog system. The first stage trains a policy network using external data to produce a semi-trained policy network. The external data includes one or more known fixed dialogs. The second stage trains the semi-trained policy network through interaction to produce a trained policy network. The interaction may be interaction with a user simulator.
    Type: Grant
    Filed: May 12, 2017
    Date of Patent: August 27, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Seyed Mehdi Fatemi Booshehri, Layla El Asri, Hannes Schulz, Jing He, Kaheer Suleman
  • Patent number: 10366336
    Abstract: The present invention relates to a method and apparatus for exploiting human feedback in an intelligent automated assistant. One embodiment of a method for conducting an interaction with a human user includes inferring an intent from data entered by the human user, formulating a response in accordance with the intent, receiving feedback from a human advisor in response to at least one of the inferring and the formulating, wherein the human advisor is a person other than the human user, and adapting at least one model used in at least one of the inferring and the formulating, wherein the adapting is based on the feedback.
    Type: Grant
    Filed: September 1, 2010
    Date of Patent: July 30, 2019
    Assignee: SRI International
    Inventors: Gokhan Tur, Horacio E. Franco, William S. Mark, Norman D. Winarsky, Bart Peintner, Michael J. Wolverton, Neil Yorke-Smith
  • Patent number: 10341785
    Abstract: The application relates to a hearing device comprising a) an input unit for delivering a time varying electric input signal representing an audio signal comprising at least two sound sources, b) a cyclic analysis buffer unit of length A adapted for storing the last A audio samples, c) a cyclic synthesis buffer unit of length L, where L is smaller than A, adapted for storing the last L audio samples, which are intended to be separated into individual sound sources, d) a database having stored recorded sound examples from said at least two sound sources, each entry in the database being termed an atom, the atoms originating from audio samples from first and second buffers corresponding in size to said synthesis and analysis buffer units, where for each atom, the audio samples from the first buffer overlap with the audio samples from the second buffer, and where atoms originating from the first buffer constitute a reconstruction dictionary, and where atoms originating from the second buffer constitute an analysis dictionary.
    Type: Grant
    Filed: October 5, 2015
    Date of Patent: July 2, 2019
    Assignee: OTICON A/S
    Inventors: Thomas Barker, Tuomas Virtanen, Niels Henrik Pontoppidan
  • Patent number: 10249290
    Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.
    Type: Grant
    Filed: June 11, 2018
    Date of Patent: April 2, 2019
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal
  • Patent number: 10176819
    Abstract: A method for converting speech using phonetic posteriorgrams (PPGs). A target speech is obtained and a PPG is generated based on acoustic features of the target speech. Generating the PPG may include using a speaker-independent automatic speech recognition (SI-ASR) system for equalizing different speakers. The PPG includes a set of values corresponding to a range of times and a range of phonetic classes, the phonetic classes corresponding to senones. A mapping between the PPG and one or more segments of the target speech is generated. A source speech is obtained, and the source speech is converted into a converted speech based on the PPG and the mapping.
    Type: Grant
    Filed: June 9, 2017
    Date of Patent: January 8, 2019
    Assignee: The Chinese University of Hong Kong
    Inventors: Lifa Sun, Kun Li, Hao Wang, Shiyin Kang, Mei Ling Helen Meng
  • Patent number: 10140973
    Abstract: Systems, methods, and devices for generating text-to-speech output using previously captured speech are described. Spoken audio is obtained and undergoes speech processing to create text. The resulting text is stored with the spoken audio, with both the text and the spoken audio being associated with the individual that spoke the audio. Various spoken audio and corresponding text are stored over time to create a library of speech units. When the individual sends a text message to a recipient, the text message is processed to determine portions of text, and the portions of text are compared to the library of text associated with the individual. When text in the library is identified, the system selects the spoken audio units associated with the identified stored text. The selected spoken audio units are then used to generate output audio data corresponding to the original text message, with the output audio data being sent to a device of the message recipient.
    Type: Grant
    Filed: September 15, 2016
    Date of Patent: November 27, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Manish Kumar Dalmia, Rafal Kuklinski
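    Illustrative sketch (not from the patent): concatenating a message from speech units previously captured from the sender, with a generic fallback for words the sender never spoke; the library and lookup granularity are invented.
```python
# Illustrative only: assemble output audio for a text message from speech units
# previously captured from the sender, falling back to generic TTS for unseen
# words. The library contents and the word-level lookup are invented here.
speech_library = {
    "see": b"<audio:see>",
    "you": b"<audio:you>",
    "soon": b"<audio:soon>",
}

def synthesize_from_library(message: str, library: dict) -> bytes:
    out = b""
    for word in message.lower().split():
        unit = library.get(word)
        if unit is None:
            unit = f"<generic-tts:{word}>".encode()   # word never spoken by sender
        out += unit
    return out

print(synthesize_from_library("See you very soon", speech_library))
```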
  • Patent number: 10109271
    Abstract: A frame erasure concealment technique for bitstream-based feature extractors in a speech recognition system, particularly suited for use in a wireless communication system, operates to “delete” each frame in which an erasure is declared. The deletions thus reduce the length of the observation sequence, but have been found to provide for sufficient speech recognition based on both single-word and “string” tests of the deletion technique.
    Type: Grant
    Filed: May 19, 2014
    Date of Patent: October 23, 2018
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Richard Vandervoort Cox, Hong Kook Kim
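    Illustrative sketch (not from the patent): the deletion-based concealment idea reduced to code, dropping erased frames from the observation sequence; the feature values are invented.
```python
# Illustrative only: conceal frame erasures by deleting the erased frames from
# the feature observation sequence instead of interpolating them. Feature
# values and erasure flags are invented for this example.
frames = [
    {"mfcc": [1.0, 0.2], "erased": False},
    {"mfcc": [0.9, 0.3], "erased": True},    # erasure declared for this frame
    {"mfcc": [1.1, 0.1], "erased": False},
]

observation_sequence = [f["mfcc"] for f in frames if not f["erased"]]
print(f"kept {len(observation_sequence)} of {len(frames)} frames for recognition")
```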
  • Patent number: 10102849
    Abstract: Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.
    Type: Grant
    Filed: March 24, 2017
    Date of Patent: October 16, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Paul J. Vozila, Wilson Tam, Nils Lenke
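    Illustrative sketch (not from the patent): clustering training sentences on a metadata attribute and fitting a per-cluster unigram model; the attribute, data, and unigram choice are invented simplifications.
```python
# Illustrative only: cluster training sentences by a metadata attribute and
# build a simple unigram language model per cluster. Using "domain" as the
# clustering attribute and unigram counts are assumptions for this sketch.
from collections import Counter, defaultdict

data = [
    {"text": "play some jazz", "domain": "music"},
    {"text": "play the next song", "domain": "music"},
    {"text": "what is the weather today", "domain": "weather"},
]

clusters = defaultdict(list)
for item in data:
    clusters[item["domain"]].append(item["text"])

language_models = {}
for domain, sentences in clusters.items():
    counts = Counter(word for s in sentences for word in s.split())
    total = sum(counts.values())
    language_models[domain] = {w: c / total for w, c in counts.items()}

print(language_models["music"]["play"])   # unigram probability within the cluster
```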
  • Patent number: 10074384
    Abstract: A state estimating apparatus includes: a spectrum calculating unit which calculates a power spectrum for each of a plurality of frequencies on a frame-by-frame basis from a voice signal containing voice of a first speaker and voice of a second speaker transmitted over a telephone line; a band power calculating unit which calculates power of a non-transmission band on a frame-by-frame basis, based on the power spectra of frequencies contained in the non-transmission band among the plurality of frequencies; a transmitted-voice judging unit which determines that any frame whose power in the non-transmission band is greater than a threshold value indicating the presence of voice carries the voice of the first speaker; and a state judging unit which judges whether the state of mind of the first speaker is normal or abnormal, based on the frame judged to carry the voice of the first speaker.
    Type: Grant
    Filed: February 12, 2014
    Date of Patent: September 11, 2018
    Assignee: FUJITSU LIMITED
    Inventor: Shoji Hayakawa
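    Illustrative sketch (not from the patent): flagging frames whose power above the telephone band exceeds a threshold as near-end speech; the sample rate, band edge, and threshold are invented.
```python
# Illustrative only: mark a frame as carrying the first (near-end) speaker's
# voice when its power outside the telephone transmission band exceeds a
# threshold. The sample rate, band edge, and threshold are invented values.
import numpy as np

def frame_is_first_speaker(frame: np.ndarray, sr: int = 16000,
                           band_edge_hz: float = 3400.0,
                           threshold: float = 1e-3) -> bool:
    power = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return power[freqs > band_edge_hz].sum() > threshold

t = np.arange(512) / 16000
wideband_frame = 0.1 * np.sin(2 * np.pi * 5000 * t)   # energy above 3400 Hz
print(frame_is_first_speaker(wideband_frame))          # -> True
```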
  • Patent number: 10056076
    Abstract: According to some embodiments of the present invention there is provided a computerized method for speech processing using a Gaussian Mixture Model. The method comprises the action of receiving, by hardware processor(s), two or more covariance values representing relationships between distributions of speech coefficient values that represent two or more audible input speech signals recorded by a microphone. The method comprises the action of computing two or more eigenvectors and eigenvalues using a principal component analysis of the covariance values, transforming the speech coefficient values using the eigenvectors, and computing two or more second covariance values from the transformed speech coefficient values. The method comprises the action of modifying some of the second covariance values according to the eigenvalues, the covariance values, and two or more indices of the speech coefficient values. The second covariance values are provided to the speech processor comprising the Gaussian Mixture Model.
    Type: Grant
    Filed: September 6, 2015
    Date of Patent: August 21, 2018
    Assignee: International Business Machines Corporation
    Inventor: Hagai Aronowitz
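    Illustrative sketch (not from the patent): applying principal component analysis to feature covariances and recomputing covariances in the transformed space; the data and the decision to keep all components are invented.
```python
# Illustrative only: decorrelate speech feature vectors with PCA derived from
# their covariance before Gaussian Mixture Model processing. The random
# feature data and keeping every component are assumptions for this sketch.
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 13))              # stand-in for MFCC vectors

cov = np.cov(features, rowvar=False)               # first covariance values
eigvals, eigvecs = np.linalg.eigh(cov)             # principal component analysis
transformed = (features - features.mean(axis=0)) @ eigvecs

second_cov = np.cov(transformed, rowvar=False)     # covariances after transform
off_diag = second_cov - np.diag(np.diag(second_cov))
print(f"largest off-diagonal covariance after PCA: {np.abs(off_diag).max():.2e}")
```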
  • Patent number: 9997154
    Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: June 12, 2018
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal
  • Patent number: 9959341
    Abstract: In some embodiments, a system is provided comprising at least one processor programmed to process an input text to identify a plurality of semantic patterns that match the input text, wherein, for at least one semantic pattern of the plurality of semantic patterns: the at least one semantic pattern comprises a plurality of semantic entities identified from the at least one input text, and the plurality of semantic entities occur in a common context within the at least one input text. The at least one processor may be further programmed to use statistical information derived from training data to associate a respective weight with each semantic pattern of the plurality of semantic patterns.
    Type: Grant
    Filed: June 11, 2015
    Date of Patent: May 1, 2018
    Assignee: Nuance Communications, Inc.
    Inventor: Jan Curin
  • Patent number: 9911412
    Abstract: Methods, devices, and computer program products for recognizing and responding to natural language input are described herein. Natural language input is received at a natural language input interface of a computing device and transformed into computer-usable text. A natural language input recognizer obtains evidence from one or more evidence source and generates an evidence graph based on the evidence obtained. Evidence may be obtained asynchronously, and the natural language input recognizer may update the evidence graph upon receipt of additional evidence. The natural language input recognizer generates a set of recognition hypotheses based on the evidence graph and selects one of the recognition hypotheses as a recognition result for the natural language input. Semantic models, evidence models, and response models may be employed to generate the evidence graph and respond to the recognition result selected for the natural language input.
    Type: Grant
    Filed: March 6, 2015
    Date of Patent: March 6, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Martin Labský, Ladislav Kunc, Jan Kleindienst, Tomás Macek, Bart D'hoore
  • Patent number: 9906414
    Abstract: A network reliability evaluation method for an acyclic network and a system thereof are provided. The network reliability evaluation method searches all minimal paths from a source node to a sink node based on nodes, according to the properties of the acyclic network. In conventional network reliability evaluation methods, searching all minimal paths is based on arcs. Therefore, the network reliability evaluation method and the system thereof can effectively reduce the complexity of calculation.
    Type: Grant
    Filed: September 8, 2015
    Date of Patent: February 27, 2018
    Assignee: NATIONAL TSING HUA UNIVERSITY
    Inventor: Wei-Chang Yeh
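    Illustrative sketch (not from the patent): a node-based depth-first enumeration of all source-to-sink paths in a small acyclic network; the graph is invented, and in an acyclic network every source-sink path is minimal.
```python
# Illustrative only: enumerate all minimal paths from a source node to a sink
# node of an acyclic network by a node-based depth-first search. The example
# graph is invented.
def all_minimal_paths(graph: dict, source: str, sink: str, path=None):
    path = (path or []) + [source]
    if source == sink:
        yield path
        return
    for nxt in graph.get(source, []):
        yield from all_minimal_paths(graph, nxt, sink, path)

graph = {"s": ["a", "b"], "a": ["t"], "b": ["a", "t"], "t": []}
for p in all_minimal_paths(graph, "s", "t"):
    print(" -> ".join(p))
```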
  • Patent number: 9865257
    Abstract: A controller for a dialog manager, the dialog manager being configured to receive a representation of an input utterance from a user and control the flow of conversation with the user. The controller is configured to implement a parameterized policy for defining the behavior of a dialog manager. The parameterized policy is configured to operate with an ontology-independent parameter as an input. The controller has a processor for parameterizing an end user ontology such that the parameterized policy can define the behavior of a dialog manager for the end user ontology. The processor is configured to define a slot of the end user ontology in terms of at least one of the ontology-independent parameters such that it is suitable for being used as an input to the parameterized policy.
    Type: Grant
    Filed: April 29, 2016
    Date of Patent: January 9, 2018
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Zhuoran Wang, Yannis Stylianou
  • Patent number: 9842102
    Abstract: A method of generating ontologies for a Virtual Assistant across different languages may include extracting a plurality of tokens in a first language from a plurality of web resources in a web domain that includes the Virtual Assistant. The web resources may be made available in a first language and a second language. The method may also include determining a first part-of-speech (POS) for each of the plurality of tokens, where the first POS may be specific to the first language. The method may additionally include mapping the first POS to a second POS from a standardized set of POS's that are general across the first language and the second language, and generating a plurality of lemmas from the plurality of tokens. The method may further include displaying a network representing the ontology.
    Type: Grant
    Filed: July 7, 2015
    Date of Patent: December 12, 2017
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Margaret Salome, Fabrice Nauze, Christian Kissig, Maria Begona Villada-Moiron
  • Patent number: 9711148
    Abstract: A processing system receives an audio signal encoding an utterance and determines that a first portion of the audio signal corresponds to a predefined phrase. The processing system accesses one or more text-dependent models associated with the predefined phrase and determines a first confidence based on the one or more text-dependent models associated with the predefined phrase, the first confidence corresponding to a first likelihood that a particular speaker spoke the utterance. The processing system determines a second confidence for a second portion of the audio signal using one or more text-independent models, the second confidence corresponding to a second likelihood that the particular speaker spoke the utterance. The processing system then determines that the particular speaker spoke the utterance based at least in part on the first confidence and the second confidence.
    Type: Grant
    Filed: July 18, 2013
    Date of Patent: July 18, 2017
    Assignee: Google Inc.
    Inventors: Matthew Sharifi, Dominik Roblek
  • Patent number: 9659568
    Abstract: A method of decoding an audio signal, and which includes extracting spectral data and a loss signal compensation parameter from an audio signal bitstream; detecting a loss signal based on the spectral data; generating first compensation data corresponding to the loss signal using a random signal based on the loss signal compensation parameter; generating a scale factor by adding a scale factor difference value to a scale factor reference value if the scale factor corresponds to a band quantized to zero, and the scale factor reference value is included in the loss signal compensation parameter; and generating second compensation data by applying the scale factor to the first compensation data.
    Type: Grant
    Filed: December 31, 2008
    Date of Patent: May 23, 2017
    Assignee: LG Electronics Inc.
    Inventors: Jae Hyun Lim, Dong Soo Kim, Hyun Kook Lee, Sung Yong Yoon, Hee Suk Pang
  • Patent number: 9626970
    Abstract: Embodiments of the present invention relate to speaker identification using spatial information. A method of speaker identification for audio content being of a format based on multiple channels is disclosed. The method comprises extracting, from a first audio clip in the format, a plurality of spatial acoustic features across the multiple channels and location information, the first audio clip containing voices from a speaker, and constructing a first model for the speaker based on the spatial acoustic features and the location information, the first model indicating a characteristic of the voices from the speaker. The method further comprises identifying whether the audio content contains voices from the speaker based on the first model. Corresponding system and computer program product are also disclosed.
    Type: Grant
    Filed: December 16, 2015
    Date of Patent: April 18, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Shen Huang, Xuejing Sun
  • Patent number: 9613638
    Abstract: Systems and methods are provided for generating an intelligibility score for speech of a non-native speaker. Words in a speech recording are identified using an automated speech recognizer, where the automated speech recognizer provides a string of words identified in the speech recording, and where the automated speech recognizer further provides an acoustic model likelihood score for each word in the string of words. For a particular word in the string of words, a context metric value is determined based upon a usage of the particular word within the string of words. An acoustic score for the particular word is determined based on the acoustic model likelihood score for the particular word from the automated speech recognizer. An intelligibility score is determined for the particular word based on the acoustic score for the particular word and the context metric value for the particular word.
    Type: Grant
    Filed: February 26, 2015
    Date of Patent: April 4, 2017
    Assignee: Educational Testing Service
    Inventors: Anastassia Loukina, Keelan Evanini
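    Illustrative sketch (not from the patent): one way a per-word acoustic score and a context metric might be combined into an intelligibility score; the squashing, weights, and example values are invented.
```python
# Illustrative only: combine a per-word acoustic likelihood with a context
# metric into an intelligibility score. The squashing function, weights, and
# example values are invented for this sketch.
import math

def intelligibility(acoustic: float, context: float,
                    w_acoustic: float = 0.7, w_context: float = 0.3) -> float:
    return w_acoustic * acoustic + w_context * context

words = [("the", -2.1, 0.9), ("weather", -6.4, 0.4), ("tomorrow", -9.8, 0.2)]
for word, acoustic_loglik, context_metric in words:
    acoustic = 1.0 / (1.0 + math.exp(-acoustic_loglik / 5.0))  # map to (0, 1)
    print(word, round(intelligibility(acoustic, context_metric), 3))
```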
  • Patent number: 9570056
    Abstract: An audio data synthesis method is provided in which a time of a plurality of audio data is adjusted without using a device which can acquire the standard time. Specifically, audio data is obtained based on synchronized recording of the first and second recorders without using standard time. A time difference is calculated between an own terminal and another terminal, based on the time at which output of a first sound from the audio output module is started, a time at which input of a sound corresponding to the audio data to the audio input module is started, a time indicated by the first information, and a time indicated by the second information. Second and third audio data are synthesized after a time difference between the second and third audio data, based on the third sound which is input to the audio input module, is adjusted based on the time difference.
    Type: Grant
    Filed: September 26, 2014
    Date of Patent: February 14, 2017
    Assignee: OLYMPUS CORPORATION
    Inventor: Ryuichi Kiyoshige
  • Patent number: 9560991
    Abstract: A system for establishing a hearing ability model of a hearing ability of a person, includes a data storage configured to store a representation of a distribution of a hearing ability of a population of individuals, and a processor configured to establish a hearing ability model representing a hearing ability of the person based at least in part on (i) information regarding a person's response to a stimulus of a hearing evaluation event, and (ii) the representation of the distribution of the hearing ability of the population.
    Type: Grant
    Filed: April 24, 2009
    Date of Patent: February 7, 2017
    Assignee: GN HEARING A/S
    Inventors: Aalbert de Vries, Svante Sten Johan Stadler, Arne Leijon, Tjeerd Maarten Hein Dijkstra, Alexander Ypma
  • Patent number: 9542927
    Abstract: A method and system is disclosed for building a speech database for a text-to-speech (TTS) synthesis system from multiple speakers recorded under diverse conditions. For a plurality of utterances of a reference speaker, a set of reference-speaker vectors may be extracted, and for each of a plurality of utterances of a colloquial speaker, a respective set of colloquial-speaker vectors may be extracted. A matching procedure, carried out under a transform that compensates for speaker differences, may be used to match each colloquial-speaker vector to a reference-speaker vector. The colloquial-speaker vector may be replaced with the matched reference-speaker vector. The matching-and-replacing can be carried out separately for each set of colloquial-speaker vectors. A conditioned set of speaker vectors can then be constructed by aggregating all the replaced speaker vectors. The conditioned set of speaker vectors can be used to train the TTS system.
    Type: Grant
    Filed: November 13, 2014
    Date of Patent: January 10, 2017
    Assignee: Google Inc.
    Inventors: Ioannis Agiomyrgiannakis, Alexander Gutkin
  • Patent number: 9530412
    Abstract: Systems, methods, and computer-readable storage devices are provided for an event-driven multi-agent architecture that improves via a semi-hierarchical multi-agent reinforcement learning approach. A system receives a user input during a speech dialog between a user and the system. The system then processes the user input, identifying an importance of the user input to the speech dialog based on a user classification and identifying a variable strength turn-taking signal inferred from the user input. An utterance selection agent selects an utterance for replying to the user input based on the importance of the user input, and a turn-taking agent determines whether to output the utterance based on the utterance and the variable strength turn-taking signal. When the turn-taking agent indicates the utterance should be output, the system selects when to output the utterance.
    Type: Grant
    Filed: August 29, 2014
    Date of Patent: December 27, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Ethan Selfridge
  • Patent number: 9530116
    Abstract: Systems, methods, devices, and non-transitory storage media provide a web-based graphical user interface to allow a user to control a softphone of a user device. A context handler operates on the user device and manages a page context of the web-based graphical user interface. A call manager operates as a RESTful web service on the user device. The context handler and the call manager communicate using JavaScript Object Notation with Padding (JSONP) messages. The call manager includes a finite state machine that indicates a current state of the softphone. The call manager generates application programming interface calls to the softphone in response to receiving JSONP messages. The user interface is updated to reflect the current state of the softphone based on execution of user selections by the softphone. The context handler chunks data into multiple JSONP messages if data exceeds a data size limitation of the JSONP.
    Type: Grant
    Filed: May 28, 2013
    Date of Patent: December 27, 2016
    Assignee: VERIZON PATENT AND LICENSING INC.
    Inventors: Manah M. Khalil, Mohammad Y. Salman
  • Patent number: 9524730
    Abstract: A system receives monaural sound which includes speech and background noises. The received sound is divided by frequency and time into time-frequency units (TFUs). Each TFU is classified as speech or non-speech by a processing unit. The processing unit for each frequency range includes at least one of a deep neural network (DNN) or a linear support vector machine (LSVM). The DNN extracts and classifies the features of the TFU and includes a pre-trained stack of Restricted Boltzmann Machines (RBMs), each of which includes a visible and a hidden layer. The LSVM classifies each TFU based on features extracted from the DNN, including those from the visible layer of the first RBM and those from the hidden layer of the last RBM in the stack. The LSVM and DNN are trained with a plurality of training noises. Each TFU classified as speech is output.
    Type: Grant
    Filed: March 29, 2013
    Date of Patent: December 20, 2016
    Assignee: OHIO STATE INNOVATION FOUNDATION
    Inventors: DeLiang Wang, Yuxuan Wang
  • Patent number: 9520123
    Abstract: A system and method for concatenative speech synthesis is provided. Embodiments may include accessing, using one or more computing devices, a plurality of speech synthesis units from a speech database and determining a similarity between the plurality of speech synthesis units. Embodiments may further include retrieving two or more speech synthesis units having the similarity and pruning at least one of the two or more speech synthesis units based upon, at least in part, the similarity.
    Type: Grant
    Filed: March 19, 2015
    Date of Patent: December 13, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Heng Lu, Xu Shao, Wei Zhang, Wenhui Lei, Andrew Breen
  • Patent number: 9484032
    Abstract: The disclosed embodiments illustrate methods and systems for processing multimedia content. The method includes extracting one or more words from an audio stream associated with multimedia content. Each word has associated one or more timestamps indicative of temporal occurrences of said word in said multimedia content. The method further includes creating a word cloud of said one or more words in said multimedia content based on a measure of emphasis laid on each word in said multimedia content and said one or more timestamps associated with said one or more words. The method further includes presenting one or more multimedia snippets, of said multimedia content, associated with a word selected by a user from said word cloud. Each of said one or more multimedia snippets corresponds to said one or more timestamps associated with occurrences of said word in said multimedia content.
    Type: Grant
    Filed: October 27, 2014
    Date of Patent: November 1, 2016
    Assignee: Xerox Corporation
    Inventors: Kuldeep Yadav, Kundan Shrivastava, Om D Deshmukh
  • Patent number: 9449598
    Abstract: Features are disclosed for performing speech recognition on utterances using a grammar and a statistical language model, such as an n-gram model. States of the grammar may correspond to states of the statistical language model. Speech recognition may be initiated using the grammar. At a given state of the grammar, speech recognition may continue at a corresponding state of the statistical language model. Speech recognition may continue using the grammar in parallel with the statistical language model, or it may continue using the statistical language model exclusively. Scores associated with the correspondences between states (e.g., backoff arcs) may be determined heuristically or based on test data.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: September 20, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Ariya Rastrow, Bjorn Hoffmeister, Sri Venkata Surya Siva Rama Krishna Garimella, Rohit Krishna Prasad
  • Patent number: 9393412
    Abstract: An audio-cochlear implant processor device for a hearing impaired listener is described. An input signal decoder decodes an audio input data signal into a corresponding multi-channel audio signal representing multiple audio objects. An audio processor adjusts the multi-channel audio output based on user-specific hearing impairment characteristics to produce a processed audio output to a cochlear implant audio processor with auditory scene analysis (ASA) cues for the hearing impaired listener.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: July 19, 2016
    Assignee: MED-EL Elektromedizinische Geraete GmbH
    Inventors: Stefan Strahl, Cornelia Falch
  • Patent number: 9244901
    Abstract: A method, a computer readable medium, and a system for tagging a natural language application are provided. In the method, utterances are analyzed using one or more rules, and a tag is assigned to the analyzed utterances based on the one or more rules. The analysis of the utterances may include determining a frequency of the utterances, and grouping the utterances by their determined frequency and their assigned tag. The frequency may represent a number of occurrences in natural language where the utterances share semantic meanings, for example. Further, the one or more rules may be prioritized, and each of the utterances may be analyzed using the prioritized rules. In this manner, meaning may be assigned to utterances such that groups of utterances may be tagged simultaneously.
    Type: Grant
    Filed: April 22, 2008
    Date of Patent: January 26, 2016
    Assignee: WEST CORPORATION
    Inventors: Kiyomi Murata, Steven John Schanbacher, Aaron Scott Fisher
  • Patent number: 9224386
    Abstract: Features are disclosed for discriminative training of speech recognition language models. A confusion matrix can be generated from acoustic model training data for use in discriminative training. The confusion matrix can include probabilities for the substitution, insertion, and/or deletion of some or all subword units of a language. Probabilities can be calculated based on the presence or absence of subword units in a processed acoustic model training data audio recording when compared to a correct transcription of the recording. The probabilities can be used to generate erroneous transcriptions in language model training corpora, and the language model can be trained to distinguish the erroneous transcriptions from the correct transcriptions.
    Type: Grant
    Filed: June 22, 2012
    Date of Patent: December 29, 2015
    Assignee: Amazon Technologies, Inc.
    Inventor: Frederick V. Weber
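    Illustrative sketch (not from the patent): sampling erroneous transcriptions from a substitution confusion matrix, as one could when building discriminative training corpora; the phone set and probabilities are invented.
```python
# Illustrative only: corrupt a correct phone sequence into plausible erroneous
# transcriptions using a substitution confusion matrix. The phone set and
# probabilities are invented for this sketch.
import random

confusion = {                      # P(recognized phone | true phone)
    "s": {"s": 0.85, "f": 0.10, "z": 0.05},
    "ae": {"ae": 0.95, "eh": 0.05},
    "t": {"t": 0.90, "d": 0.10},
}

def corrupt(phones, rng):
    out = []
    for p in phones:
        options = confusion.get(p, {p: 1.0})
        out.append(rng.choices(list(options), weights=list(options.values()))[0])
    return out

rng = random.Random(7)
correct = ["s", "ae", "t"]         # the word "sat"
for _ in range(3):
    print(corrupt(correct, rng))
```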
  • Patent number: 9159319
    Abstract: Keyword spotting may be improved by using a competitor model. In some embodiments, audio data is received by a device. At least a portion of the audio data may be compared with a keyword model to obtain a first score. The keyword model may model a keyword. The portion of the audio data may also be compared with a competitor model to obtain a second score. The competitor model may model a competitor word, which may be a word that is similar to the keyword. The device may compare the first score and the second score to determine if a keyword is spoken.
    Type: Grant
    Filed: December 3, 2012
    Date of Patent: October 13, 2015
    Assignee: Amazon Technologies, Inc.
    Inventor: Bjorn Hoffmeister
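    Illustrative sketch (not from the patent): accepting a keyword only when the keyword model outscores a competitor-word model by a margin; the scoring function, templates, and margin are invented.
```python
# Illustrative only: accept a keyword detection when the keyword model's score
# beats a competitor-word model's score by a margin. The scoring function,
# model templates, and margin are invented stand-ins for real acoustic models.
def score(audio_features, model):
    # Stand-in for an acoustic model likelihood; here, a dot product.
    return sum(a * m for a, m in zip(audio_features, model))

keyword_model    = [0.9, 0.8, 0.7]   # models the keyword
competitor_model = [0.8, 0.9, 0.6]   # models a confusable competitor word
MARGIN = 0.05

audio = [0.95, 0.75, 0.70]
if score(audio, keyword_model) - score(audio, competitor_model) > MARGIN:
    print("keyword detected")
else:
    print("rejected as competitor word")
```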
  • Patent number: 9158652
    Abstract: An airborne, gas, or liquid particle sensor with one or more intelligent modules either within the instrument or attached to the instrument is described. These modules comprise sub-systems with local controllers or memory.
    Type: Grant
    Filed: March 15, 2014
    Date of Patent: October 13, 2015
    Assignee: Particles Plus, Inc.
    Inventor: David Pariseau
  • Patent number: 9142209
    Abstract: A method and apparatus receive multiple data pattern analysis requests from a controller and substantially simultaneously perform, with multiple data pattern analysis units, multiple data pattern analyses on one or more portions of a data stream.
    Type: Grant
    Filed: April 22, 2014
    Date of Patent: September 22, 2015
    Assignee: CYPRESS SEMICONDUCTOR CORPORATION
    Inventors: Richard Fastow, Qamrul Hasan
  • Patent number: 9135248
    Abstract: Systems, methods, and devices for determining contexts and determining associated demographic profiles using information received from multiple demographic sensor enabled electronic devices, are disclosed. Contexts can be defined by a description of spatial and/or temporal components. Such contexts can be arbitrarily defined using semantically meaningful and absolute descriptions of time and location. Demographic sensor data is associated with or includes context data that describes the circumstances under which the data was determined. The demographic sensor data can include demographic sensor readings that are implicit indications of a demographic for the context. The sensor data can also include user reported data with explicit descriptions of a demographic for the context. The demographic sensor data can be filtered by context data according to a selected context.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: September 15, 2015
    Assignee: ARRIS Technology, Inc.
    Inventors: Jianguo Li, Mir F. Ali, Paul C. Davis, Dale W. Russell, Di You
  • Patent number: 9105053
    Abstract: An approach is provided for determining a user age range. An age estimator causes, at least in part, acquisition of voice data. Next, the age estimator calculates a first set of probability values, wherein each of the probability values represents a probability that the voice data is in a respective one of a plurality of predefined age ranges, and the predefined age ranges are segments of a lifespan. Then, the age estimator derives a second set of probability values by applying a correlation matrix to the first set of probability values, wherein the correlation matrix associates the first set of probability values with probabilities of the voice data matching individual ages over the lifespan. Then, the age estimator, for each of the predefined age ranges, calculates a sum of the probabilities in the second set of probability values corresponding to the individual ages within the respective predefined age ranges.
    Type: Grant
    Filed: March 23, 2010
    Date of Patent: August 11, 2015
    Assignee: Nokia Technologies Oy
    Inventors: Yang Cao, Feng Ding, Jilei Tian
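    Illustrative sketch (not from the patent): applying a correlation matrix to range probabilities to obtain per-age-bin probabilities and re-summing them per range; the bins, matrix, and probabilities are invented.
```python
# Illustrative only: refine per-age-range probabilities from voice data by
# applying a correlation matrix over coarse age bins and re-summing per range.
# The ranges, bins, matrix entries, and probabilities are invented values.
import numpy as np

first = np.array([0.05, 0.60, 0.30, 0.05])   # P(range) from the voice classifier

# Row r gives P(age bin | classifier chose range r) over ten 10-year bins;
# it is deliberately smoothed so ranges bleed into neighboring bins.
C = np.array([
    [0.45, 0.45, 0.10, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],  # ~0-17
    [0.00, 0.10, 0.40, 0.40, 0.10, 0.00, 0.00, 0.00, 0.00, 0.00],  # ~18-39
    [0.00, 0.00, 0.00, 0.10, 0.40, 0.35, 0.15, 0.00, 0.00, 0.00],  # ~40-64
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.10, 0.30, 0.30, 0.20, 0.10],  # ~65+
])

second = first @ C                            # P(age bin) across the lifespan
bins_per_range = [(0, 2), (2, 4), (4, 6), (6, 10)]
refined = [second[lo:hi].sum() for lo, hi in bins_per_range]
print(np.round(refined, 3))
```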
  • Patent number: 9092757
    Abstract: The disclosure herein relates to a system and method for personalizing an online experience of a user based on the user's attitude. Attitude models that are able to infer or determine attitudes for a user are constructed from user activity data. Based on the attitudes derived from applying the attitude models, attitude profiles are created for the users. As users engage in various types of online interactions, the attitude profiles associated with the users can be utilized to personalize the online experience of the user.
    Type: Grant
    Filed: May 9, 2012
    Date of Patent: July 28, 2015
    Assignee: Yahoo! Inc.
    Inventors: Judd Antin, David Ayman Shamma, Elizabeth Churchill
  • Patent number: 9082401
    Abstract: The present disclosure describes example systems, methods, and devices for generating a synthetic speech signal. An example method may include determining a phonemic representation of text. The example method may also include identifying one or more finite-state machines (“FSMs”) corresponding to one or more phonemes included in the phonemic representation of the text. A given FSM may be a compressed unit of recorded speech that simulates a Hidden Markov Model. The example method may further include determining a selected sequence of models that minimizes a cost function that represents a likelihood that a possible sequence of models substantially matches a phonemic representation of text. Each possible sequence of models may include at least one FSM. The method may additionally include generating a synthetic speech signal based on the selected sequence that includes one or more spectral features generated from at least one FSM included in the selected sequence.
    Type: Grant
    Filed: January 9, 2013
    Date of Patent: July 14, 2015
    Assignee: Google Inc.
    Inventors: Javier Gonzalvo Fructuoso, Alexander Gutkin
  • Patent number: 9077522
    Abstract: A two-party approximation protocol is transformed into a private approximation protocol. A first input x ∈ {0, 1, ..., M}^n and a second input y ∈ {0, 1, ..., M}^n of a two-party approximation protocol approximating a function of the form f(x, y) = Σ_{j=1}^n g(x_j, y_j) are received. Variable B is set as a public upper bound on f(x, y). Variable l is set to l = O*(1). The following is performed until Σ_{j=1}^l z_j ≥ l/t or B < 1, where t is an arbitrary number: (1) a private importance sampling protocol with the first input x, the second input y, and a third input 1^k is executed independently for each j ∈ [l], where k is a security parameter, and the output of the private importance sampling protocol is shares of I_j ∈ [n] ∪ {⊥}; (2) l coin tosses z_1, ..., z_l are independently generated, where z_j = 1 iff I_j ≠ ⊥; and (3) B is divided by 2 if Σ_{j=1}^l z_j ≥ l/t or B < 1 is not satisfied.
    Type: Grant
    Filed: August 8, 2012
    Date of Patent: July 7, 2015
    Assignee: International Business Machines Corporation
    Inventor: David Paul Woodruff
  • Patent number: 9037464
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numeric representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: May 19, 2015
    Assignee: Google Inc.
    Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean