Recognition Patents (Class 704/231)
  • Patent number: 11074913
    Abstract: Various embodiments are provided for understanding user sentiment in a dialog system in a computing environment by a processor. A sentiment of a user may be detected according to a sentiment analysis and user feedback during a dialog with the user. One or more reasons for the sentiment of the user may be identified. Behavior of the dialog system may be adjusted according to the one or more reasons.
    Type: Grant
    Filed: January 3, 2019
    Date of Patent: July 27, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Oznur Alkan, Adi I. Botea, Elizabeth Daly, Matthew Davis, Christian Muise
  • Patent number: 11069345
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech recognition by generating a neural network output from an audio data input sequence, where the neural network output characterizes words spoken in the audio data input sequence. One of the methods includes, for each of the audio data inputs, providing a current audio data input sequence that comprises the audio data input and the audio data inputs preceding the audio data input in the audio data input sequence to a convolutional subnetwork comprising a plurality of dilated convolutional neural network layers, wherein the convolutional subnetwork is configured to, for each of the plurality of audio data inputs: receive the current audio data input sequence for the audio data input, and process the current audio data input sequence to generate an alternative representation for the audio data input.
    Type: Grant
    Filed: December 18, 2019
    Date of Patent: July 20, 2021
    Assignee: DeepMind Technologies Limited
    Inventors: Aaron Gerard Antonius van den Oord, Sander Etienne Lea Dieleman, Nal Emmerich Kalchbrenner, Karen Simonyan, Oriol Vinyals, Lasse Espeholt
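The dilated convolutional subnetwork in the abstract above can be illustrated with a minimal sketch in plain Python. Everything here — the pure-Python causal convolution, the toy weights, and the receptive-field helper — is an illustrative assumption about how stacked dilated causal convolutions behave, not code from the patent.

```python
def dilated_causal_conv(x, weights, dilation):
    """Apply a causal 1-D convolution with the given dilation.

    Each output y[t] depends only on x[t], x[t - dilation],
    x[t - 2*dilation], ... — past samples, never future ones.
    Missing past samples are treated as zero (left zero-padding).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            j = t - i * dilation  # reach further back as i grows
            if j >= 0:
                acc += w * x[j]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Total receptive field of a stack of dilated conv layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

Stacking kernel-size-2 layers with dilations 1, 2, 4, 8 yields a receptive field of 16 samples, which is why such stacks can summarize long audio-input sequences cheaply.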
  • Patent number: 11048750
    Abstract: A conversation topic providing method includes: converting voice data, of a conversation of a user who is on a phone, into text; selecting a keyword, indicating an intention of the user, from the text; obtaining information of interest with respect to the keyword; and determining topics relating to the keyword based on user information.
    Type: Grant
    Filed: August 5, 2014
    Date of Patent: June 29, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hue-yin Kim, Sang-il Lee, Sung-kyu Lee, Seong-seol Hong, Jung-hoon Shin, Yeon-woo Lee
  • Patent number: 11043213
    Abstract: A system and method are disclosed for capturing a segment of speech audio, performing phoneme recognition on the segment of speech audio to produce a segmented phoneme sequence, comparing the segmented phoneme sequence to stored phoneme sequences that represent incorrect pronunciations of words to determine if there is a match, and identifying an incorrect pronunciation for a word in the segment of speech audio. The system builds a library based on the data collected for the incorrect pronunciations.
    Type: Grant
    Filed: December 7, 2018
    Date of Patent: June 22, 2021
    Assignee: SoundHound, Inc.
    Inventors: Katayoun Norouzi, Karl Stahl
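The phoneme-sequence comparison described in this abstract can be sketched as a lookup against a stored library of incorrect pronunciations. The library entries and ARPAbet-style symbols below are made-up examples; the patent does not specify this data layout.

```python
# Hypothetical library: phoneme sequences known to be incorrect
# pronunciations, mapped to the word they mispronounce.
MISPRONUNCIATIONS = {
    ("N", "UW", "K", "Y", "AH", "L", "ER"): "nuclear",        # "nucular"
    ("EH", "K", "S", "P", "R", "EH", "S", "OW"): "espresso",  # "expresso"
}


def find_mispronunciations(phonemes):
    """Scan a segmented phoneme sequence for known incorrect
    pronunciations; return (start_index, word) for each match."""
    hits = []
    for start in range(len(phonemes)):
        for pattern, word in MISPRONUNCIATIONS.items():
            if tuple(phonemes[start:start + len(pattern)]) == pattern:
                hits.append((start, word))
    return hits
```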
  • Patent number: 11042705
    Abstract: According to one embodiment, an electronic device comprises a memory that stores dictionary data, a voice input receiver, and a hardware processor. The dictionary data comprises first dictionary data and updatable second dictionary data. A number of voice commands in the first dictionary data is greater than a number of voice commands in the second dictionary data. The first dictionary data is divided into sub-dictionaries. The hardware processor recognizes the received voice using at least one of the sub-dictionaries or the second dictionary data.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: June 22, 2021
    Assignee: Dynabook Inc.
    Inventor: Midori Nakamae
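A minimal sketch of the two-tier dictionary lookup described above, with exact string matching standing in for acoustic recognition; the class and method names are invented for illustration.

```python
class CommandRecognizer:
    """Two-tier command lookup: a large, fixed first dictionary split
    into sub-dictionaries, plus a small updatable second dictionary."""

    def __init__(self, sub_dictionaries, second_dictionary):
        self.sub_dictionaries = sub_dictionaries  # name -> {command: action}
        self.second_dictionary = dict(second_dictionary)

    def update_second(self, command, action):
        # Only the second dictionary is updatable.
        self.second_dictionary[command] = action

    def recognize(self, utterance, active_subs):
        # Search the currently active sub-dictionaries first ...
        for name in active_subs:
            action = self.sub_dictionaries.get(name, {}).get(utterance)
            if action is not None:
                return action
        # ... then fall back to the updatable second dictionary.
        return self.second_dictionary.get(utterance)
```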
  • Patent number: 11043212
    Abstract: There is disclosed a system that, when in operation, evaluates speech, for example a speech signal generated using a microphone to record an oral utterance.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: June 22, 2021
    Inventor: Peter Bell
  • Patent number: 11043209
    Abstract: Methods and systems for training one or more neural networks for transcription and for transcribing a media file using the trained one or more neural networks are provided. One of the methods includes: segmenting the media file into a plurality of segments; extracting, using a first neural network, audio features of a first and second segment of the plurality of segments; and identifying, using a second neural network, a best-candidate engine for each of the first and second segments based at least on audio features of the first and second segments. A best-candidate engine is a neural network having a highest predicted transcription accuracy among a collection of neural networks.
    Type: Grant
    Filed: January 8, 2019
    Date of Patent: June 22, 2021
    Inventors: Peter Nguyen, David Kettler, Karl Schwamb, Chad Steelberg
  • Patent number: 11036803
    Abstract: An approach is provided that receives a question at a question-answering (QA) system. The received question includes one or more terms, and the question pertains to a subject matter domain that is supported by the QA system. A number of expressions included in a set of question-answer pairs (QA pairs) are analyzed, with the QA pairs being ground truths established in support of the subject matter domain. The analysis identifies whether a selected term from the question is a synonym for any of the expressions. The expressions that are identified as synonyms are then used in a QA pipeline that generates one or more candidate answers to the received question.
    Type: Grant
    Filed: April 10, 2019
    Date of Patent: June 15, 2021
    Assignee: International Business Machines Corporation
    Inventors: Stephen A. Boxwell, Keith G. Frost, Stanley J. Vernier, Kyle M. Brake
  • Patent number: 11031005
    Abstract: A mechanism is described for facilitating continuous topic detection and adaption in audio environments, according to one embodiment. A method of embodiments, as described herein, includes detecting a term relating to a topic in an audio input received from one or more microphones of the computing device including a voice-enabled device; analyzing the term based on the topic to determine an action to be performed by the computing device; and triggering an event to facilitate the computing device to perform the action consistent with the term and the topic.
    Type: Grant
    Filed: December 17, 2018
    Date of Patent: June 8, 2021
    Assignee: INTEL CORPORATION
    Inventors: Georg Stemmer, Andrzej Mialkowski, Joachim Hofer, Piotr Rozen, Tomasz Szmelczynski
  • Patent number: 11030996
    Abstract: The present invention relates to an electronic device and a control method thereof. The electronic device comprises a microphone for acquiring sound; and a control unit for determining whether the acquired sound is a learned sound and outputting information on the acquired sound on the basis of a determination result.
    Type: Grant
    Filed: October 14, 2016
    Date of Patent: June 8, 2021
    Assignee: LG ELECTRONICS INC.
    Inventors: Jiyoung Huh, Jongcheol Shin, Sunryang Kim
  • Patent number: 11024296
    Abstract: Systems and methods are described herein for providing media guidance. Control circuitry may receive a first voice input and access a database of topics to identify a first topic associated with the first voice input. A user interface may generate a first response to the first voice input, and subsequent to generating the first response, the control circuitry may receive a second voice input. The control circuitry may determine a match between the second voice input and an interruption input, such as a period of silence or a keyword or phrase such as “Ahh,” “Umm,” or “Hmm.” The user interface may generate a second response that is associated with a second topic related to the first topic. By interrupting the conversation and changing the subject from time to time, media guidance systems can appear to be more intelligent and human.
    Type: Grant
    Filed: March 11, 2020
    Date of Patent: June 1, 2021
    Assignee: Rovi Guides, Inc.
    Inventors: Charles Dawes, Walter R. Klappert
  • Patent number: 11024312
    Abstract: A voice recognition apparatus includes a communication part configured to communicate with a voice recognition server, a voice receiver configured to receive a user's voice signal, a storage part configured to store guide information comprising at least an example command for voice recognition, and a controller. The controller is configured to generate a guide image comprising at least a part of the example command, transmit the received user's voice signal to the voice recognition server through the communication part in response to receiving the user's voice signal by the voice receiver, and update the stored guide information based on update information received through the communication part.
    Type: Grant
    Filed: March 11, 2020
    Date of Patent: June 1, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jong-cheol Park, Do-wan Kim, Sang-shin Park
  • Patent number: 11024316
    Abstract: Computer-implemented method and system for receiving and processing one or more moment-associating elements. For example, the computer-implemented method includes receiving the one or more moment-associating elements, transforming the one or more moment-associating elements into one or more pieces of moment-associating information, and transmitting at least one piece of the one or more pieces of moment-associating information.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: June 1, 2021
    Assignee: Otter.ai, Inc.
    Inventors: Yun Fu, Simon Lau, Kaisuke Nakajima, Julius Cheng, Sam Song Liang, James Mason Altreuter, Kean Kheong Chin, Zhenhao Ge, Hitesh Anand Gupta, Xiaoke Huang, James Francis McAteer, Brian Francis Williams, Tao Xing
  • Patent number: 11024302
    Abstract: Systems and methods are provided for an automated speech recognition system. A microphone records a keyword spoken by a user, and a front end divides the recorded keyword into a plurality of subunits, each containing a segment of recorded audio, and extracts a set of features from each of the plurality of subunits. A decoder assigns one of a plurality of content classes to each of the plurality of subunits according to at least the extracted set of features for each subunit. A quality evaluation component calculates a score representing a quality of the keyword from the content classes assigned to the plurality of subunits.
    Type: Grant
    Filed: September 15, 2017
    Date of Patent: June 1, 2021
    Assignee: Texas Instruments Incorporated
    Inventors: Tarkesh Pande, Lorin Paul Netsch, David Patrick Magee
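One way to read the quality-scoring step in this abstract: once each subunit of the recorded keyword has been assigned a content class, the keyword's score can be derived from those classes — for instance, the fraction of subunits carrying informative phonetic content. The class names and scoring rule below are illustrative assumptions, not the patent's actual metric.

```python
def keyword_quality(subunit_classes, informative_classes=("vowel", "consonant")):
    """Score a recorded keyword from the content classes assigned to
    its subunits: the fraction carrying informative phonetic content
    (class names here are illustrative stand-ins)."""
    if not subunit_classes:
        return 0.0
    informative = sum(1 for c in subunit_classes if c in informative_classes)
    return informative / len(subunit_classes)
```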
  • Patent number: 11024306
    Abstract: The present disclosure is generally directed to the generation of voice-activated data flows in an interconnected network. The voice-activated data flows can include input audio signals that include a request and are detected at a client device. The client device can transmit the input audio signal to a data processing system, where the input audio signal can be parsed and passed to the data processing system of a service provider to fulfill the request in the input audio signal. The present solution is configured to conserve network resources by reducing the number of network transmissions needed to fulfill a request.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: June 1, 2021
    Assignee: GOOGLE LLC
    Inventors: Gaurav Bhaya, Ulas Kirazci, Bradley Abrams, Adam Coimbra, Ilya Firman, Carey Radebaugh
  • Patent number: 11011167
    Abstract: A communication system includes a pair of speech recognition devices that are capable of communicating with each other, each of the speech recognition devices including a speech input section into which speech is input, a speech recognition section that recognizes speech input to the speech input section, and a speech output section that outputs speech. The communication system also includes an information generation section that generates notification information corresponding to speech recognized by the speech recognition section in one speech recognition device from out of the pair of speech recognition devices, and a speech output control section that performs control to output notification speech corresponding to the notification information at a specific timing from the speech output section of the other speech recognition device from out of the pair of speech recognition devices.
    Type: Grant
    Filed: January 8, 2019
    Date of Patent: May 18, 2021
    Assignee: Toyota Jidosha Kabushiki Kaisha
    Inventors: Hideki Kobayashi, Akihiro Muguruma, Yukiya Sugiyama, Shota Higashihara, Riho Matsuo, Naoki Yamamuro
  • Patent number: 11011162
    Abstract: The technology disclosed relates to performing speech recognition for a plurality of different devices or devices in a plurality of conditions. This includes storing a plurality of acoustic models associated with different devices or device conditions, receiving speech audio including natural language utterances, receiving metadata indicative of a device type or device condition, selecting an acoustic model from the plurality in dependence upon the received metadata, and employing the selected acoustic model to recognize speech from the natural language utterances included in the received speech audio. Each of speech recognition and the storage of acoustic models can be performed locally by devices or on a network-connected server. Also provided is a platform and interface, used by device developers to select, configure, and/or train acoustic models for particular devices and/or conditions.
    Type: Grant
    Filed: June 1, 2018
    Date of Patent: May 18, 2021
    Assignee: SOUNDHOUND, INC.
    Inventors: Mehul Patel, Keyvan Mohajer
  • Patent number: 11004458
    Abstract: Provided are a method and an apparatus for determining an encoding mode for improving the quality of a reconstructed audio signal. A method of determining an encoding mode includes determining one from among a plurality of encoding modes including a first encoding mode and a second encoding mode as an initial encoding mode in correspondence to characteristics of an audio signal, and if there is an error in the determination of the initial encoding mode, generating a modified encoding mode by modifying the initial encoding mode to a third encoding mode.
    Type: Grant
    Filed: October 4, 2019
    Date of Patent: May 11, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki-hyun Choo, Anton Victorovich Porov, Konstantin Sergeevich Osipov, Nam-suk Lee
  • Patent number: 11004454
    Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.
    Type: Grant
    Filed: November 6, 2018
    Date of Patent: May 11, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad
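The cluster-to-profile merge this abstract describes might look like the following sketch, assuming voice profiles and stored utterances are represented as embedding vectors compared by cosine similarity — an assumption for illustration; the patent does not mandate this representation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def update_profile(profile, cluster, threshold=0.9):
    """If a cluster's centroid is substantially similar to the existing
    voice profile, return an updated profile blending the two;
    otherwise keep the original profile unchanged."""
    centroid = mean_vector(cluster)
    if cosine(profile, centroid) >= threshold:
        return mean_vector([profile, centroid])
    return profile
```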
  • Patent number: 11005838
    Abstract: Systems, methods, and other embodiments associated with a monitoring process for event detection and notification transmission are described. In one embodiment, a method includes configuring a monitoring process with a matching rule used to evaluate data sources of an enterprise computing environment to determine if an event has occurred. The example method may also include executing the monitoring process to identify a set of subscribers and establish a trust relationship. The example method may also include, for each subscriber, executing the monitoring process to impersonate the subscriber, execute the matching rule upon data sources accessible to the subscriber to perform a test as to whether the event has occurred, and transmit a message of the event if the event occurred.
    Type: Grant
    Filed: May 15, 2018
    Date of Patent: May 11, 2021
    Assignee: Oracle International Corporation
    Inventors: Michael Tebben, Haiyan Wang, Nicole Laurent, Qiu Zhong, Aaron Johnson, Darryl M. Shakespeare
  • Patent number: 11004445
    Abstract: In one embodiment, a smartwatch includes a processor and a memory storing instructions to be executed in the processor. The instructions are configured to cause the processor to obtain input comprising voice information; determine whether the voice information comprises an interrogative keyword; and determine that the voice information is interrogative information in response to determining that the voice information comprises the interrogative keyword. The instructions are configured to cause the processor to determine whether reply information corresponding to the interrogative information can be obtained from a memory of the smartwatch; and send the interrogative information to a server through a wireless network in response to determining that the reply information corresponding to the interrogative information cannot be obtained from the memory of the smartwatch.
    Type: Grant
    Filed: May 27, 2017
    Date of Patent: May 11, 2021
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Yizu Feng, Bin Li
  • Patent number: 10991369
    Abstract: A system and method for obtaining structured information from a conversation, including receiving a first input from a user, determining a first set of slots filled based on the first input using natural language processing and a non-linear slot filling algorithm, determining a first conversation based on the first set of slots filled, determining a first empty slot associated with the first conversation, prompting the user for a second input, the second input associated with the first empty slot, filling the first empty slot using natural language processing and the non-linear slot filling algorithm, determining that the slots associated with the first conversation are filled, and, responsive to determining that the slots associated with the first conversation are filled, initiating an action associated with the conversation.
    Type: Grant
    Filed: January 31, 2019
    Date of Patent: April 27, 2021
    Inventors: Hristo Borisov, Boyko Karadzhov, Ivan Atanasov, Georgi Varzonovtsev
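A rough sketch of the non-linear slot-filling loop described above: slots may be filled in any order from each user input, and the system prompts for whichever slot is still empty. The keyword-based extraction below is a crude stand-in for the patent's natural language processing, and the slot names are invented.

```python
def extract_slots(text, slot_keywords):
    """Crude stand-in for NLP slot extraction: a slot is filled by the
    word that follows its trigger keyword (keywords are illustrative)."""
    words = text.lower().split()
    found = {}
    for slot, keyword in slot_keywords.items():
        if keyword in words:
            i = words.index(keyword)
            if i + 1 < len(words):
                found[slot] = words[i + 1]
    return found


def next_prompt(slots):
    """Return the first still-empty slot to ask the user about,
    or None when every slot is filled and the action can fire."""
    for name, value in slots.items():
        if value is None:
            return name
    return None
```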
  • Patent number: 10990902
    Abstract: A method, system, and computer program product for learning a recognition model for recognition processing. The method includes preparing one or more examples for learning, each of which includes an input segment, an additional segment adjacent to the input segment and an assigned label. The input segment and the additional segment are extracted from an original training data. A classification model is trained, using the input segment and the additional segment in the examples, to initialize parameters of the classification model so that extended segments including the input segment and the additional segment are reconstructed from the input segment. Then, the classification model is tuned to predict a target label, using the input segment and the assigned label in the examples, based on the initialized parameters. At least a portion of the obtained classification model is included in the recognition model.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: April 27, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Gakuto Kurata
  • Patent number: 10984788
    Abstract: An automatic speech recognition (ASR) system includes at least one processor and a memory storing instructions.
    Type: Grant
    Filed: March 29, 2018
    Date of Patent: April 20, 2021
    Assignee: BlackBerry Limited
    Inventor: Darrin Kenneth John Fry
  • Patent number: 10984801
    Abstract: AM and LM parameters to be used for adapting an ASR model are derived for each audio segment of an audio stream comprising multiple audio programs. A set of identifiers, including a speaker identifier, a speaker domain identifier and a program domain identifier, is obtained for each audio segment. The set of identifiers are used to select most suitable AM and LM parameters for the particular audio segment. The embodiments enable provision of maximum constraints on the AMs and LMs and enable adaptation of the ASR model on the fly for audio streams of multiple audio programs, such as broadcast audio. This means that the embodiments enable selecting AM and LM parameters that are most suitable in terms of ASR performance for each audio segment.
    Type: Grant
    Filed: May 8, 2017
    Date of Patent: April 20, 2021
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Volodya Grancharov, Erlendur Karlsson, Sigurdur Sverrisson, Maxim Teslenko, Konstantinos Vandikas, Aneta Vulgarakis Feljan
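The identifier-driven selection of AM and LM parameters described above can be sketched as a back-off lookup keyed by (speaker, speaker domain, program domain); the table layout, back-off order, and key names are assumptions for illustration.

```python
def select_model_params(tables, speaker_id, speaker_domain, program_domain):
    """Pick the most specific AM/LM parameter set available for an
    audio segment, backing off from the full identifier triple to
    coarser keys, with a global fallback."""
    for key in (
        (speaker_id, speaker_domain, program_domain),  # most specific
        (None, speaker_domain, program_domain),        # any speaker
        (None, None, program_domain),                  # program domain only
    ):
        if key in tables:
            return tables[key]
    return tables[(None, None, None)]  # global fallback
```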
  • Patent number: 10978192
    Abstract: Techniques for documenting a clinical procedure involve transcribing audio data comprising audio of one or more clinical personnel speaking while performing the clinical procedure. Examples of applicable clinical procedures include sterile procedures such as surgical procedures, as well as non-sterile procedures such as those conventionally involving a core code reporter. The transcribed audio data may be analyzed to identify relevant information for documenting the clinical procedure, and a text report including the relevant information documenting the clinical procedure may be automatically generated.
    Type: Grant
    Filed: January 22, 2019
    Date of Patent: April 13, 2021
    Assignee: Nuance Communications, Inc.
    Inventor: Mariana Casella dos Santos
  • Patent number: 10978048
    Abstract: An apparatus comprising one or more processors, a communication circuit, and a memory for storing instructions, which when executed, performs a method of recognizing a user utterance. The method comprises: receiving first data associated with a user utterance, performing a first determination to determine whether the user utterance includes the first data and a specified word, performing a second determination to determine whether the first data includes the specified word, transmitting the first data to an external server, receiving a text generated from the first data by the external server, performing a third determination to determine whether the received text matches the specified word, and determining whether to activate the voice-based input system based on the third determination.
    Type: Grant
    Filed: May 23, 2018
    Date of Patent: April 13, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Tae Jin Lee, Young Woo Lee, Seok Yeong Jung, Chakladar Subhojit, Jae Hoon Jeong, Jun Hui Kim, Jae Geun Lee, Hyun Woong Lim, Soo Min Kang, Eun Hye Shin, Seong Min Je
  • Patent number: 10971160
    Abstract: A user device (e.g., voice assistant device, voice enabled device, smart device, computing device, etc.) may receive/detect audio content (e.g., speech, etc.) that includes a wake word and/or words similar to a wake word. The user device may require a wake word, a portion of the wake word, or words similar to the wake word to be detected prior to interacting with a user. The user device may, based on characteristics of the audio content, determine if the audio content originates from an authorized user. The user device may decrease and/or increase scrutiny applied to wake word detection based on whether audio content originates from an authorized user.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: April 6, 2021
    Assignee: Comcast Cable Communications, LLC
    Inventors: Hans Sayyadi, Nima Bina
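The variable-scrutiny wake-word detection in this abstract can be sketched as an acceptance threshold that shifts with speaker authorization: lower (less scrutiny) for an authorized user's voice, higher (more scrutiny) otherwise. The numeric thresholds and delta below are invented for illustration.

```python
def wake_word_threshold(base_threshold, authorized, delta=0.1):
    """Decrease scrutiny (lower threshold) for an authorized user's
    voice; increase scrutiny (raise threshold) for anyone else."""
    if authorized:
        return max(0.0, base_threshold - delta)
    return min(1.0, base_threshold + delta)


def detect_wake_word(score, base_threshold, authorized):
    """Accept a wake-word candidate if its detector score clears the
    authorization-adjusted threshold."""
    return score >= wake_word_threshold(base_threshold, authorized)
```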
  • Patent number: 10957446
    Abstract: Systems, methods, and computer readable storage medium for providing a genericized medical device architecture common to a plurality of medical devices are disclosed. The architecture may comprise at least one diagnostics module associated with at least one of the plurality of medical devices, wherein the at least one diagnostics module is configured to monitor an operational status of the at least one medical device. At least one hardware abstraction layer may be associated with at least one of the plurality of medical devices, and may be configured to provide abstracted access to hardware of the at least one medical device.
    Type: Grant
    Filed: August 8, 2016
    Date of Patent: March 23, 2021
    Assignee: Johnson & Johnson Surgical Vision, Inc.
    Inventors: Hou Man Chong, Edith W. Fung, Timothy L. Hunter, Deep K. Mehta
  • Patent number: 10957330
    Abstract: Systems and methods for control of vehicles are provided. A computer-implemented method in example embodiments may include receiving, at a computing system comprising one or more processors positioned in a vehicle, voice data from one or more audio sensors positioned in the vehicle. The system can determine whether configuration of a reference voiceprint for a speech processing system of the vehicle is authorized based at least in part on performance data associated with the vehicle. In response to determining that configuration of the reference voiceprint is authorized, a first reference voiceprint based on the reference voice data can be stored and the speech processing system configured to authenticate input voice data for a first set of voice commands based on the reference voiceprint.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: March 23, 2021
    Assignee: GE Aviation Systems Limited
    Inventors: Stefan Alexander Schwindt, Barry Foye
  • Patent number: 10950240
    Abstract: There is provided an information processing device and an information processing method that enable a desired voice recognition result to be easily obtained. The information processing device includes a presentation control unit that controls a separation at a time of presenting a recognition result of voice recognition on the basis of context relating to voice recognition. The present technology can be applied, for example, to an information processing device that independently performs voice recognition, a server that performs voice recognition in response to a call from a client and transmits the recognition result to the client, or the client that requests voice recognition from the server, receives the recognition result from the server, and presents the recognition result.
    Type: Grant
    Filed: August 14, 2017
    Date of Patent: March 16, 2021
    Assignee: SONY CORPORATION
    Inventors: Yuhei Taki, Shinichi Kawano
  • Patent number: 10949283
    Abstract: A computer-implemented method is presented for detecting anomalies in dynamic datasets generated in a cloud computing environment. The method includes monitoring a plurality of cloud servers receiving a plurality of data points, employing a two-level clustering training module to generate micro-clusters from the plurality of data points, each of the micro-clusters representing a set of original data from the plurality of data points, employing a detecting module to detect normal data points, abnormal data points, and unknown data points from the plurality of data points via a detection model, employing an evolving module using a different evolving mechanism for each of the normal, abnormal, and unknown data points to evolve the detection model, and generating a system report displayed on a user interface, the system report summarizing the micro-cluster information.
    Type: Grant
    Filed: November 6, 2018
    Date of Patent: March 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Jia Wei Yang, Fan Jing Meng
  • Patent number: 10943606
    Abstract: Detecting an end-point of a user's voice command or utterance with high accuracy is critical in an automatic speech recognition (ASR)-based human-machine interface. If an ASR system incorrectly detects an end-point of an utterance and transmits the incomplete sentence to other processing blocks for further processing, the processed result is likely to lead to incorrect interpretation. A method includes selecting a first semantic network based on context of the audio signal and more accurately detecting the end-point of the user's utterance included in the audio signal based on the first semantic network and also based on at least one timeout threshold associated with the first semantic network.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: March 9, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Paras Surendra Doshi, Ayush Agarwal, Shri Prakash
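The context-dependent end-pointing described above can be sketched as choosing a trailing-silence timeout per semantic context; the context names and millisecond values below are made up for illustration.

```python
# Illustrative per-context timeout thresholds: milliseconds of trailing
# silence to tolerate before declaring end-of-utterance.
CONTEXT_TIMEOUTS_MS = {
    "phone_dialing": 1500,  # users pause between digit groups
    "quick_command": 500,   # short utterances end decisively
}
DEFAULT_TIMEOUT_MS = 800


def is_end_of_utterance(trailing_silence_ms, context):
    """Declare end-of-utterance only once the silence timeout chosen
    for this context has elapsed."""
    timeout = CONTEXT_TIMEOUTS_MS.get(context, DEFAULT_TIMEOUT_MS)
    return trailing_silence_ms >= timeout
```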
  • Patent number: 10944767
    Abstract: Mechanisms are provided for training a classifier to identify adversarial input data. A neural network processes original input data representing a plurality of non-adversarial original input data and mean output learning logic determines a mean response for each intermediate layer of the neural network based on results of processing the original input data. The neural network processes adversarial input data and layer-wise comparison logic compares, for each intermediate layer of the neural network, a response generated by the intermediate layer based on processing the adversarial input data, to the mean response associated with the intermediate layer, to thereby generate a distance metric for the intermediate layer. The layer-wise comparison logic generates a vector output based on the distance metrics that is used to train a classifier to identify adversarial input data based on responses generated by intermediate layers of the neural network.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: March 9, 2021
    Assignee: International Business Machines Corporation
    Inventors: Gaurav Goswami, Sharathchandra Pankanti, Nalini K. Ratha, Richa Singh, Mayank Vatsa
  • Patent number: 10936641
    Abstract: A faster and more streamlined system for providing summary and analysis of large amounts of communication data is described. System and methods are disclosed that employ an ontology to automatically summarize communication data and present the summary to the user in a form that does not require the user to listen to the communication data. In one embodiment, the summary is presented as written snippets, or short fragments, of relevant communication data that capture the meaning of the data relating to a search performed by the user. Such snippets may be based on theme and meaning unit identification.
    Type: Grant
    Filed: May 21, 2018
    Date of Patent: March 2, 2021
    Assignee: VERINT SYSTEMS LTD.
    Inventors: Roni Romano, Galia Zacay, Rahm Fehr
  • Patent number: 10930271
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using neural networks. A feature vector that models audio characteristics of a portion of an utterance is received. Data indicative of latent variables of multivariate factor analysis is received. The feature vector and the data indicative of the latent variables is provided as input to a neural network. A candidate transcription for the utterance is determined based on at least an output of the neural network.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: February 23, 2021
    Inventors: Andrew W. Senior, Ignacio Lopez Moreno
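A minimal sketch of the input construction described above, assuming the latent variables take the form of an utterance-level i-vector concatenated with per-frame acoustic features (the dimensions are hypothetical):

```python
import numpy as np

def build_nn_input(acoustic_features, latent_vars):
    """Concatenate a per-frame acoustic feature vector with the
    utterance-level latent variables (e.g. an i-vector from
    multivariate factor analysis) to form the network input."""
    return np.concatenate([acoustic_features, latent_vars])

frame = np.random.randn(40)     # e.g. 40 log-mel filterbank values
ivector = np.random.randn(100)  # hypothetical 100-dim latent vector
x = build_nn_input(frame, ivector)
# x is what the abstract calls "the feature vector and the data
# indicative of the latent variables ... provided as input"
```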
  • Patent number: 10930287
    Abstract: In some embodiments, an exemplary inventive system for improving computer speed and accuracy of automatic speech transcription includes at least components of: a computer processor configured to perform: generating a recognition model specification for a plurality of distinct speech-to-text transcription engines; where each distinct speech-to-text transcription engine corresponds to a respective distinct speech recognition model; receiving at least one audio recording representing a speech of a person; segmenting the audio recording into a plurality of audio segments; determining a respective distinct speech-to-text transcription engine to transcribe a respective audio segment; receiving, from the respective transcription engine, a hypothesis for the respective audio segment; accepting the hypothesis to remove a need to submit the respective audio segment to another distinct speech-to-text transcription engine, resulting in the improved computer speed and the accuracy of automatic speech transcription; and ge
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: February 23, 2021
    Inventors: Tejas Shastry, Matthew Goldey, Svyat Vergun
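The routing-and-acceptance flow in the abstract can be sketched as a confidence-gated cascade; the threshold, the engine interface, and the toy engines below are assumptions for illustration, simplifying the patent's model-specification step:

```python
def transcribe(segments, engines, accept_threshold=0.85):
    """Route each audio segment to an engine; accept its hypothesis
    if confident enough, otherwise fall back to the next engine.

    engines: list of callables segment -> (text, confidence).
    """
    results = []
    for seg in segments:
        hypothesis = ""
        for engine in engines:
            text, conf = engine(seg)
            hypothesis = text
            if conf >= accept_threshold:
                break  # accepted: no need to submit to another engine
        results.append(hypothesis)
    return results

# Toy engines with fixed confidences
fast = lambda seg: (f"fast:{seg}", 0.9 if "clean" in seg else 0.5)
slow = lambda seg: (f"slow:{seg}", 0.95)
out = transcribe(["clean_a", "noisy_b"], [fast, slow])
# out == ["fast:clean_a", "slow:noisy_b"]
```

Accepting the first confident hypothesis is what "removes the need to submit the respective audio segment to another engine".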
  • Patent number: 10916242
    Abstract: The present invention relates to the field of intelligent recognition, and discloses an intent recognition method based on a deep learning network, resolving a technical problem that accuracy of intent recognition is not high.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: February 9, 2021
    Assignee: NANJING SILICON INTELLIGENCE TECHNOLOGY CO., LTD.
    Inventors: Huapeng Sima, Ao Yao
  • Patent number: 10908873
    Abstract: A system and method for confirming a voice command of a media playback device is disclosed. The method includes receiving an instruction of a voice command and producing an audio confirmation of the command. A confirmation may be playing a media context item associated with the command, playing a verbal confirmation phrase, or playing a non-verbal audio cue.
    Type: Grant
    Filed: May 7, 2018
    Date of Patent: February 2, 2021
    Assignee: Spotify AB
    Inventors: Emma-Camelia Gosu, Daniel Bromand, Karl Humphreys
  • Patent number: 10892996
    Abstract: Systems and processes for operating an intelligent automated assistant are provided. In one example process, an event associated with an audio input is detected with a first process. In accordance with a detection of the event, a delay value associated with an electronic device is determined. The delay value corresponds to a time required to determine, with a second process, whether the audio input includes a spoken trigger. In accordance with a determination that the delay value exceeds a threshold, the delay value is broadcast during a first advertising session, and a determination is made, during a second advertising session, whether the electronic device is to respond to the audio input. In accordance with a determination that the threshold is not exceeded, a determination is made, during the first advertising session, whether the electronic device is to respond to the audio input or wait for the second advertising session.
    Type: Grant
    Filed: August 31, 2018
    Date of Patent: January 12, 2021
    Assignee: Apple Inc.
    Inventor: Kurt Piersol
  • Patent number: 10885918
    Abstract: A system, method and computer program is provided for generating customized text representations of audio commands. A first speech recognition module may be used for generating a first text representation of an audio command based on a general language grammar. A second speech recognition module may be used for generating a second text representation of the audio command, the second module including a custom language grammar that may include contacts for a particular user. Entity extraction is applied to the second text representation and the entities are checked against a file containing personal language. If the entities are found in the user-specific language, the two text representations may be fused into a combined text representation and named entity recognition may be performed again to extract further entities.
    Type: Grant
    Filed: September 18, 2014
    Date of Patent: January 5, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Wilson Hsu, Kaheer Suleman, Joshua Pantony
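One way to picture the fusion step above: if the custom-grammar hypothesis yields entities found in the user's personal-language file, prefer it over the general hypothesis. This sketch is a deliberately simplified stand-in for the patent's entity extraction and fusion (word-level matching, hypothetical names):

```python
def fuse_transcriptions(general_text, custom_text, personal_terms):
    """Prefer the custom-grammar hypothesis when it contains entities
    found in the user's personal-language file; otherwise keep the
    general-grammar hypothesis."""
    custom_entities = [w for w in custom_text.split()
                       if w.lower() in personal_terms]
    if custom_entities:
        return custom_text
    return general_text

personal = {"kaheer"}  # hypothetical contact name from the user's file
result = fuse_transcriptions("call car hair", "call Kaheer", personal)
# result == "call Kaheer"
```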
  • Patent number: 10887462
    Abstract: A computing system, method and non-transitory computer readable memory are provided, to assist an agent during a client interaction between the agent and a client over a communications channel. An agent station may generate a graphic user interface (GUI) of the client interaction during the client interaction, the GUI displaying a current identified keyword and one or more interaction phases, each interaction phase having a respective current phase score for the client interaction. A keyword and associated keyword information from the client interaction may be received, including phase and corresponding phase score information, and the GUI updated with the currently identified keyword and newly received phase information accounting for the received corresponding phase score information. A situation report may be generated for a designated party, the situation report including an agent identification, and client interaction information including comments entered by the agent relating to the client interaction.
    Type: Grant
    Filed: April 9, 2019
    Date of Patent: January 5, 2021
    Assignee: West Corporation
    Inventors: Daniel A. Coyer, Ryan L. Techlin, Jeremy T. Tellock, Dennis C. White, Shelley A. Wildenberg
  • Patent number: 10872598
    Abstract: Embodiments of a production-quality text-to-speech (TTS) system constructed from deep neural networks are described. System embodiments comprise five major building blocks: a segmentation model for locating phoneme boundaries, a grapheme-to-phoneme conversion model, a phoneme duration prediction model, a fundamental frequency prediction model, and an audio synthesis model. For embodiments of the segmentation model, phoneme boundary detection was performed with deep neural networks using Connectionist Temporal Classification (CTC) loss. For embodiments of the audio synthesis model, a variant of WaveNet was created that requires fewer parameters and trains faster than the original. By using a neural network for each component, system embodiments are simpler and more flexible than traditional TTS systems, where each component requires laborious feature engineering and extensive domain expertise. Inference with system embodiments may be performed faster than real time.
    Type: Grant
    Filed: January 29, 2018
    Date of Patent: December 22, 2020
    Assignee: Baidu USA LLC
    Inventors: Sercan O. Arik, Mike Chrzanowski, Adam Coates, Gregory Diamos, Andrew Gibiansky, John Miller, Andrew Ng, Jonathan Raiman, Shubhahrata Sengupta, Mohammad Shoeybi
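The building blocks named in the abstract compose into a pipeline at inference time (segmentation is a training-time component). The sketch below wires the remaining four stages together with stub models; the interfaces and values are assumptions, not the patent's actual APIs:

```python
def synthesize(text, g2p, duration_model, f0_model, vocoder):
    """Compose the inference pipeline: text -> phonemes -> durations
    -> fundamental-frequency contour -> waveform."""
    phonemes = g2p(text)
    durations = duration_model(phonemes)
    f0 = f0_model(phonemes, durations)
    return vocoder(phonemes, durations, f0)

# Stub models standing in for the trained neural networks
audio = synthesize(
    "hi",
    g2p=lambda t: ["HH", "AY"],
    duration_model=lambda p: [80, 120],      # ms per phoneme
    f0_model=lambda p, d: [180.0, 170.0],    # Hz per phoneme
    vocoder=lambda p, d, f: [0.0] * sum(d),  # placeholder samples
)
```

Keeping each stage behind a narrow interface is what makes the neural pipeline "simpler and more flexible" to swap components in than feature-engineered TTS stacks.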
  • Patent number: 10867600
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword trigger suppression are disclosed. In one aspect, a method includes the actions of receiving, by a microphone of a computing device, audio corresponding to playback of an item of media content, the audio including an utterance of a predefined hotword that is associated with performing an operation on the computing device. The actions further include processing the audio. The actions further include in response to processing the audio, suppressing performance of the operation on the computing device.
    Type: Grant
    Filed: October 31, 2017
    Date of Patent: December 15, 2020
    Assignee: Google LLC
    Inventors: Alexander H. Gruenstein, Johan Schalkwyk, Matthew Sharifi
  • Patent number: 10867595
    Abstract: Described herein are systems and methods for generating natural language sentences with Sequence-to-sequence (Seq2Seq) models with attention. The Seq2Seq models may be implemented in applications, such as machine translation, image captioning, and speech recognition. Performance has further been improved by leveraging unlabeled data, often in the form of a language model. Disclosed herein are “Cold Fusion” architecture embodiments that leverage a pre-trained language model during training. The Seq2Seq models with Cold Fusion embodiments are able to better utilize language information, enjoying faster convergence, better generalization, and almost complete transfer to a new domain while using less labeled training data.
    Type: Grant
    Filed: March 6, 2018
    Date of Patent: December 15, 2020
    Assignee: Baidu USA LLC
    Inventors: Anuroop Sriram, Heewoo Jun, Sanjeev Satheesh, Adam Coates
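The core of a Cold Fusion layer is gating a projection of the language model's logits against the decoder state before predicting the output distribution. The sketch below is a simplified NumPy rendering under assumed dimensions, not the paper's exact parameterization:

```python
import numpy as np

def cold_fusion_step(decoder_state, lm_logits, W_lm, W_gate, W_out):
    """One fusion step: project the LM logits, gate them with the
    decoder state, and combine into fused output logits."""
    h_lm = np.tanh(W_lm @ lm_logits)                    # LM feature
    gate_in = np.concatenate([decoder_state, h_lm])
    gate = 1.0 / (1.0 + np.exp(-(W_gate @ gate_in)))    # sigmoid gate
    fused = np.concatenate([decoder_state, gate * h_lm])
    return W_out @ fused  # fused logits over the vocabulary

rng = np.random.default_rng(1)
d, v = 4, 6  # hypothetical decoder-state and vocabulary sizes
logits = cold_fusion_step(
    rng.normal(size=d), rng.normal(size=v),
    rng.normal(size=(d, v)),
    rng.normal(size=(d, 2 * d)),
    rng.normal(size=(v, 2 * d)),
)
```

Because the gate is learned during training (rather than fixed at decode time as in shallow fusion), the decoder can discount the LM where acoustics disagree with it.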
  • Patent number: 10860801
    Abstract: A method includes extracting a keyword and a slot from a natural language input, where the slot includes information. The method includes determining whether the keyword corresponds to one of a plurality of formation groups. In response to determining that the keyword corresponds to a specific formation group, the method includes updating metadata of the specific formation group with the information of the slot. In response to determining that the keyword does not correspond to any of the formation groups, the method includes determining whether the keyword corresponds to one of a plurality of clusters. In response to determining that the keyword corresponds to a specific cluster, the method includes updating the specific cluster with the information of the slot. In response to determining that the keyword does not correspond to any of the clusters, the method includes creating an additional formation group that includes the keyword and the slot.
    Type: Grant
    Filed: January 15, 2019
    Date of Patent: December 8, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Anil Yadav, Melvin Lobo, Chutian Wang
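The decision flow in the abstract is essentially a three-way routing rule. A minimal sketch, with dict-based groups and clusters standing in for the patent's data structures:

```python
def route_keyword(keyword, slot, formation_groups, clusters):
    """Apply the flow above: update a matching formation group, else
    a matching cluster, else create a new formation group."""
    if keyword in formation_groups:
        formation_groups[keyword].update(slot)  # update group metadata
    elif keyword in clusters:
        clusters[keyword].update(slot)          # update the cluster
    else:
        formation_groups[keyword] = dict(slot)  # new formation group
    return formation_groups, clusters

groups = {"alarm": {"time": "7am"}}
clusters = {"music": {"genre": "jazz"}}
route_keyword("alarm", {"time": "8am"}, groups, clusters)
route_keyword("music", {"artist": "Coltrane"}, groups, clusters)
route_keyword("lights", {"state": "off"}, groups, clusters)
```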
  • Patent number: 10860333
    Abstract: Embodiments of the present disclosure seek to mitigate the timing issues of prior approaches by performing the NVMe device reset and post-reset re-initialization in parallel. In embodiments, the NVMe device reset and re-initialization operations are logically divided into front-end and back-end operations that may be carried out in parallel. Upon receipt of the command from a host to reset, the NVMe device carries out front-end reset operations for resetting the device, and in parallel performing back-end reinitialization operations. Once the front-end reset operations are complete, or after a predetermined period of time, the NVMe device reports to the host that the device reset is complete, while back-end operations continue. Once all reset and reinitialization operations are complete, the NVMe device may continue to conduct I/O instructions from the host.
    Type: Grant
    Filed: October 14, 2019
    Date of Patent: December 8, 2020
    Assignee: WESTERN DIGITAL TECHNOLOGIES, INC.
    Inventor: Shay Benisty
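The timing trick above is that the host sees "reset complete" as soon as the fast front-end work finishes, while slower back-end re-initialization continues in parallel. A thread-based sketch of that control flow (the firmware details are, of course, not Python; this only illustrates the ordering):

```python
import threading
import time

def reset_device(report_complete):
    """Run front-end reset and back-end re-initialization in parallel;
    report completion to the host once the front end finishes, while
    back-end work continues."""
    done = {"backend": False}

    def backend_reinit():
        time.sleep(0.05)  # stands in for slow re-initialization
        done["backend"] = True

    t = threading.Thread(target=backend_reinit)
    t.start()              # back end proceeds in parallel
    # front-end reset (fast) runs on this thread
    report_complete()      # host sees "reset complete" early
    t.join()               # I/O resumes once the back end is done
    return done["backend"]

events = []
ok = reset_device(lambda: events.append("reset-complete"))
```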
  • Patent number: 10853584
    Abstract: Methods, apparatuses, and computer program products are described herein that are configured to express a time in an output text. In some example embodiments, a method is provided that comprises identifying a time period to be described linguistically in an output text. The method of this embodiment may also include identifying a communicative context for the output text. The method of this embodiment may also include determining one or more temporal reference frames that are applicable to the time period and a domain defined by the communicative context. The method of this embodiment may also include generating a phrase specification that linguistically describes the time period based on the descriptor that is defined by a temporal reference frame of the one or more temporal reference frames. In some examples, the descriptor specifies a time window that is inclusive of at least a portion of the time period to be described linguistically.
    Type: Grant
    Filed: April 19, 2019
    Date of Patent: December 1, 2020
    Assignee: ARRIA DATA2TEXT LIMITED
    Inventors: Gowri Somayajulu Sripada, Neil Burnett
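A toy rendering of the descriptor-selection idea: pick the temporal-reference-frame descriptor whose window includes the time period. The frame names and boundaries below are invented for illustration, not taken from the patent:

```python
def describe_time(minutes_since_midnight):
    """Return the descriptor whose window (in minutes since midnight)
    is inclusive of the given time; hypothetical frame boundaries."""
    frames = [("night", 0, 360), ("morning", 360, 720),
              ("afternoon", 720, 1080), ("evening", 1080, 1440)]
    for name, lo, hi in frames:
        if lo <= minutes_since_midnight < hi:
            return name
```

A generator would then build the phrase specification ("in the morning") around the selected descriptor.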
  • Patent number: 10852720
    Abstract: Embodiments are disclosed for an example vehicle or driver assistance system for a vehicle. The example vehicle or driver assistance system includes an in-vehicle computing system of a vehicle, the in-vehicle computing system comprising an external device interface communicatively connecting the in-vehicle computing system to a mobile device, an inter-vehicle system communication module communicatively connecting the in-vehicle computing system to one or more vehicle systems of the vehicle, a processor, and a storage device storing instructions executable by the processor to receive a first command from the mobile device via the external device interface, perform a series of actions on the vehicle system until receiving a second command from the mobile device, both of the first command and the second command recognized by the mobile device based on one or more of voice commands issued by a user of the mobile device, and biometric analysis.
    Type: Grant
    Filed: February 10, 2016
    Date of Patent: December 1, 2020
    Assignee: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED
    Inventor: Yogesh Devidas Dusane
  • Patent number: 10854193
    Abstract: Methods, apparatuses, devices and computer-readable storage media for real-time speech recognition are provided. The method includes: based on an input speech signal, obtaining truncating information for truncating a sequence of features of the speech signal; based on the truncating information, truncating the sequence of features into a plurality of subsequences; and for each subsequence in the plurality of subsequences, obtaining a real-time recognition result through an attention mechanism.
    Type: Grant
    Filed: February 6, 2019
    Date of Patent: December 1, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Xiaoyin Fu, Jinfeng Bai, Zhijie Chen, Mingxin Liang, Xu Chen, Lei Jia
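The truncation step above can be sketched as chunk-wise attention: split the feature sequence at the truncating boundaries and attend within each chunk, emitting one partial result as each chunk closes. The query function and dimensions are illustrative assumptions:

```python
import numpy as np

def truncated_attention(features, boundaries, query_of):
    """Split the feature sequence at truncating boundaries and run
    attention within each subsequence, yielding one context vector
    (partial recognition result) per chunk."""
    results = []
    start = 0
    for end in boundaries + [len(features)]:
        chunk = features[start:end]
        q = query_of(chunk)
        scores = chunk @ q
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                # softmax over the chunk
        results.append(weights @ chunk)         # chunk context vector
        start = end
    return results

feats = np.random.randn(10, 4)                  # 10 frames, 4-dim features
ctx = truncated_attention(feats, [3, 7], query_of=lambda c: c.mean(axis=0))
```

Restricting attention to each subsequence is what allows results to stream out in real time instead of waiting for the full utterance.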