Probability Patents (Class 704/240)
  • Patent number: 11875790
    Abstract: Techniques are disclosed that enable dynamically adapting an automated assistant response using a dynamic familiarity measure. Various implementations process received user input to determine at least one intent, and generate a familiarity measure by processing intent specific parameters and intent agnostic parameters using a machine learning model. An automated assistant response is then determined that is based on the intent and that is based on the familiarity measure. The assistant response is responsive to the user input, and is adapted to the familiarity measure. For example, the assistant response can be more abbreviated and/or more resource efficient as the familiarity measure becomes more indicative of familiarity.
    Type: Grant
    Filed: March 1, 2019
    Date of Patent: January 16, 2024
    Assignee: GOOGLE LLC
    Inventors: Tanya Kraljic, Miranda Callahan, Yi Sing Alex Loh, Darla Sharp
  • Patent number: 11861933
    Abstract: A touch-based method for user authentication includes a training stage and an authentication stage. The training stage includes: generating, by a touch interface, a plurality of training touch parameters; and generating, by a processor, a training heat map according to the plurality of training touch parameters. The authentication stage includes: generating, by the touch interface, a plurality of testing touch parameters; generating, by the processor, a testing heat map according to the plurality of testing touch parameters; comparing, by the processor, the testing heat map with the training heat map to generate an error map; and generating, by the processor, one of pass signal and fail signal according to the error map.
    Type: Grant
    Filed: December 10, 2021
    Date of Patent: January 2, 2024
    Assignees: INVENTEC (PUDONG) TECHNOLOGY CORPORATION, INVENTEC CORPORATION
    Inventor: Trista Pei-Chun Chen
  • Patent number: 11854562
    Abstract: A method (and structure and computer product) to permit zero-shot voice conversion with non-parallel data includes receiving source speaker speech data as input data into a content encoder of a style transfer autoencoder system, the content encoder providing a source speaker disentanglement of the source speaker speech data by reducing speaker style information of the input source speech data while retaining content information and receiving target speaker input speech as input data into a target speaker encoder. The output of the content encoder and the target speaker encoder are combined in a decoder of the style transfer autoencoder, and the output of the decoder provides the content information of the input source speech data in a style of the target speaker speech information.
    Type: Grant
    Filed: May 14, 2019
    Date of Patent: December 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Yang Zhang, Shiyu Chang
  • Patent number: 11783833
    Abstract: A system is provided for modifying how an output is presented via a multi-device synchronous configuration based on detecting a speech characteristic in the user input. For example, if the user whispers a request, then the system may temporarily modify how the responsive output is presented to the user via multiple devices. In one example, the system may lower the volume on all devices presented the output. In another example, the system may present the output via a single device rather than multiple devices. The system may also determine to operate in a alternate output mode based on certain non-audio data.
    Type: Grant
    Filed: June 27, 2022
    Date of Patent: October 10, 2023
    Assignee: Amazon Technologies, Inc.
    Inventor: Ezekiel Wade Sanborn de Asis
  • Patent number: 11776560
    Abstract: A method for processing multiple intents from an audio stream in a virtual reality application may include multiple steps, including: receiving a stream of words as a first utterance; processing the first utterance before the stream of words is fully received; based on the processing, determining a first intent from the first utterance before the stream of words is fully received; determining occurrence of a pause after the first utterance; and receiving a second stream of words as a second utterance, the second stream being received after the determined pause.
    Type: Grant
    Filed: October 13, 2022
    Date of Patent: October 3, 2023
    Assignee: Health Scholars Inc.
    Inventors: Brian Philip Gillett, Akmal Hisyam Idris, James Oliver Lussier, Dustin Richard Parham, Kit Lee Burgess
  • Patent number: 11755834
    Abstract: A computing system is described that includes user interface components configured to receive typed user input; and one or more processors. The one or more processors are configured to: receive, by a computing system and at a first time, a first portion of text typed by a user in an electronic message being edited; predict, based on the first portion of text, a first candidate portion of text to follow the first portion of text; output, for display, the predicted first candidate portion of text for optional selection to append to the first portion of text; determine, at a second time that is after the first time, that the electronic message is directed to a sensitive topic; and responsive to determining that the electronic message is directed to a sensitive topic, refrain from outputting subsequent candidate portions of text for optional selection to append to text in the electronic message.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: September 12, 2023
    Assignee: Google LLC
    Inventors: Paul Roland Lambert, Timothy Youngjin Sohn, Jacqueline Amy Tsay, Gagan Bansal, Cole Austin Bevis, Kaushik Roy, Justin Tzi-jay Lu, Katherine Anna Evans, Tobias Bosch, Yinan Wang, Matthew Vincent Dierker, Gregory Russell Bullock, Ettore Randazzo, Tobias Kaufmann, Yonghui Wu, Benjamin N. Lee, Xu Chen, Brian Strope, Yun-hsuan Sung, Do Kook Choe, Rami Eid Sammouf Al-Rfou'
  • Patent number: 11741967
    Abstract: An automatic speech recognition system and a method thereof are provided. The system includes an encoder and a decoder. The encoder comprises a plurality of encoder layers. At least one encoder layer includes a plurality of encoder sublayers fused into one or more encoder kernels. The system further comprises a first pair of ping-pong buffers communicating with the one or more encoder kernels. The decoder comprises a plurality of decoder layers. At least one decoder layer includes a plurality of decoder sublayers fused into one or more decoder kernels. The decoder receives a decoder output related to the encoder output and generates a decoder output. The encoder sends the decoder output to a beam search kernel.
    Type: Grant
    Filed: January 4, 2021
    Date of Patent: August 29, 2023
    Assignee: KWAI INC.
    Inventors: Yongxiong Ren, Heng Liu, Yang Liu, Lingzhi Liu, Jie Li, Yuanyuan Zhao, Xiaorui Wang
  • Patent number: 11705116
    Abstract: Systems and methods described herein relate to adapting a language model for automatic speech recognition (ASR) for a new set of words. Instead of retraining the ASR models, language models and grammar models, the system only modifies one grammar model and ensures its compatibility with the existing models in the ASR system.
    Type: Grant
    Filed: August 18, 2021
    Date of Patent: July 18, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Ankur Gandhe, Ariya Rastrow, Gautam Tiwari, Ashish Vishwanath Shenoy, Chun Chen
  • Patent number: 11704900
    Abstract: In one embodiment, a method includes, by a client system, receiving, at the client system, a first user input, processing by the client system, the first user input to provide an initial response by identifying one or more entities referenced by the first user input and providing, by the client system, the initial response, where the initial response includes a conversational filler referencing at least one of the one or more identified entities, processing the first user input to provide a complete response by identifying, by the client system, one or more intents and one or more slots associated with the first user input based on a semantic analysis by a natural-language understanding module, and providing, by the client system, the complete response subsequent to the initial response, where the complete response is based on the one or more intents and the one or more slots.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: July 18, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Emmanouil Koukoumidis, Michael Robert Hanson, Mohsen M Agsen
  • Patent number: 11682383
    Abstract: Methods, systems, and apparatus for receiving audio data corresponding to a user utterance and context data, identifying an initial set of one or more n-grams from the context data, generating an expanded set of one or more n-grams based on the initial set of n-grams, adjusting a language model based at least on the expanded set of n-grams, determining one or more speech recognition candidates for at least a portion of the user utterance using the adjusted language model, adjusting a score for a particular speech recognition candidate determined to be included in the expanded set of n-grams, determining a transcription of user utterance that includes at least one of the one or more speech recognition candidates, and providing the transcription of the user utterance for output.
    Type: Grant
    Filed: June 2, 2021
    Date of Patent: June 20, 2023
    Assignee: Google LLC
    Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
  • Patent number: 11676220
    Abstract: In one embodiment, a method includes receiving a user input based on a plurality of modalities at the client system, wherein at least one of the modalities of the user input is a visual modality, determining one or more subjects and one or more attributes associated with the one or more subjects, respectively, based on the visual modality of the user input, resolving one or more entities corresponding to the one or more subjects based on the determined one or more attributes, and presenting a communication content at the client system responsive to the user input, wherein the communication content comprises information associated with executing results of one or more tasks corresponding to the one or more resolved entities.
    Type: Grant
    Filed: January 25, 2021
    Date of Patent: June 13, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Vivek Natarajan, Shawn C. P. Mei, Zhengping Zuo
  • Patent number: 11651775
    Abstract: An exemplary automatic speech recognition (ASR) system may receive an audio input including a segment of speech. The segment of speech may be independently processed by general ASR and domain-specific ASR to generate multiple ASR results. A selection between the multiple ASR results may be performed based on respective confidence levels for the general ASR and domain-specific ASR. As incremental ASR is performed, a composite result may be generated based on general ASR and domain-specific ASR.
    Type: Grant
    Filed: July 26, 2021
    Date of Patent: May 16, 2023
    Assignee: ROVI GUIDES, INC.
    Inventor: Jeffry Copps Robert Jose
  • Patent number: 11587554
    Abstract: The control system includes a calculation unit configured to control a voice interaction system including voice recognition models, in which the calculation unit instructs, when a conversation with a target person is started, the voice interaction system to first perform voice recognition and response generation by one voice recognition model tentatively selected from among the voice recognition models, determines a voice recognition model estimated to be optimal among the voice recognition models held in the voice interaction system based on results of the voice recognition of a speech made by the target person in a voice recognition server, and instructs, when the voice recognition model estimated to be optimal is different from the one voice recognition model tentatively selected, the voice interaction system to switch the voice recognition model to the one estimated to be optimal and to perform voice recognition and response generation.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: February 21, 2023
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventor: Narimasa Watanabe
  • Patent number: 11580005
    Abstract: Provided is an anomaly pattern detection system including an anomaly detection device connected to one or more servers. The anomaly detection device may include an anomaly detector configured to model input data by considering all of the input data as normal patterns, and detect an anomaly pattern from the input data based on the modeling result.
    Type: Grant
    Filed: November 8, 2019
    Date of Patent: February 14, 2023
    Inventor: Jang Won Suh
  • Patent number: 11568254
    Abstract: An electronic apparatus is provided.
    Type: Grant
    Filed: December 26, 2019
    Date of Patent: January 31, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Dongsoo Lee, Sejung Kwon, Parichay Kapoor, Byeoungwook Kim
  • Patent number: 11562739
    Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.
    Type: Grant
    Filed: February 10, 2020
    Date of Patent: January 24, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Andrew Smith, Christopher Schindler, Karthik Ramakrishnan, Rohit Prasad, Michael George, Rafal Kuklinski
  • Patent number: 11545157
    Abstract: Techniques are described for training and/or utilizing an end-to-end speaker diarization model. In various implementations, the model is a recurrent neural network (RNN) model, such as an RNN model that includes at least one memory layer, such as a long short-term memory (LSTM) layer. Audio features of audio data can be applied as input to an end-to-end speaker diarization model trained according to implementations disclosed herein, and the model utilized to process the audio features to generate, as direct output over the model, speaker diarization results. Further, the end-to-end speaker diarization model can be a sequence-to-sequence model, where the sequence can have variable length. Accordingly, the model can be utilized to generate speaker diarization results for any of various length audio segments.
    Type: Grant
    Filed: April 15, 2019
    Date of Patent: January 3, 2023
    Assignee: GOOGLE LLC
    Inventors: Quan Wang, Yash Sheth, Ignacio Lopez Moreno, Li Wan
  • Patent number: 11508356
    Abstract: Disclosed are a speech recognition method and a speech recognition device, in which speech recognition is performed by executing an artificial intelligence (AI) algorithm and/or a machine learning algorithm provided therein. According to an embodiment of the present disclosure, the speech recognition method includes buffering an inputted spoken utterance, determining whether a preset wake-up word is present in the spoken utterance by comparing the buffered spoken utterance to the preset wake-up word, and in response to the preset wake-up word in the spoken utterance, activating a speech recognition function and isolating, from the spoken utterance, a spoken sentence as a voice command without the wake-up word, and processing the spoken sentence and outputting a processing result.
    Type: Grant
    Filed: September 10, 2019
    Date of Patent: November 22, 2022
    Assignee: LG ELECTRONICS INC.
    Inventor: Jong Hoon Chae
  • Patent number: 11442932
    Abstract: Systems and methods for mapping natural language to queries using a query grammar are described. For example, methods may include generating, based on a string, a set of tokens of a database syntax; generating a query graph for the set of tokens using a finite state machine representing a query grammar, wherein nodes of the finite state machine represent token types, directed edges of the finite state machine represent valid transitions between token types in the query grammar, vertices of the query graph correspond to respective tokens of the set of tokens, and directed edges of the query graph represent a transition between two tokens in a sequencing of the tokens; determining, based on the query graph, a sequence of the tokens in the set of tokens, forming a database query; and invoking a search of a database using a query based on the database query to obtain search results.
    Type: Grant
    Filed: July 16, 2019
    Date of Patent: September 13, 2022
    Assignee: ThoughtSpot, Inc.
    Inventors: Nikhil Yadav, Ravi Tandon
  • Patent number: 11425252
    Abstract: Exemplary aspects involve a data-communications apparatus or system communicate over a broadband network with a plurality of remotely-located data-communications circuits respectively associated with a plurality of remotely-situated client entities. The system includes a unified-communications and call center (UC-CC) platform that processes incoming data-communication interactions including different types of digitally-represented communications among which are incoming call, and that is integrated with a memory circuit including a database of information sets. Each of the information sets includes experience data corresponding to past incoming data-communication interactions processed by the platform, and with aggregated and organized data based on data collected in previous incoming interactions.
    Type: Grant
    Filed: July 21, 2021
    Date of Patent: August 23, 2022
    Assignee: 8x8, Inc.
    Inventors: Bryan R. Martin, Matt Taylor, Manu Mukerji
  • Patent number: 11410658
    Abstract: Audio data saved at the end of client interactions are sampled, analyzed for pauses in speech, and sliced into stretches of acoustic data containing human speech between those pauses. The acoustic data are accompanied by machine transcripts made by VoiceAI. A suitable distribution of data useful for training and testing are stipulated during data sampling by applying certain filtering criteria. The resulting datasets are sent for transcription by a human transcriber team. The human transcripts are retrieved, some post-transcription processing and cleaning are performed, and the results are added to datastores for training and testing an acoustic model.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: August 9, 2022
    Assignee: Dialpad, Inc.
    Inventors: Eddie Yee Tak Ma, James Palmer, Kevin James, Etienne Manderscheid
  • Patent number: 11393471
    Abstract: A system is provided for modifying how an output is presented via a multi-device synchronous configuration based on detecting a speech characteristic in the user input. For example, if the user whispers a request, then the system may temporarily modify how the responsive output is presented to the user via multiple devices. In one example, the system may lower the volume on all devices presented the output. In another example, the system may present the output via a single device rather than multiple devices. The system may also determine to operate in a alternate output mode based on certain non-audio data.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: July 19, 2022
    Assignee: Amazon Technologies, Inc.
    Inventor: Ezekiel Wade Sanborn de Asis
  • Patent number: 11388480
    Abstract: An information processing apparatus according to the present technology includes a reception unit, a first generation unit, a collection unit, and a second generation unit. The reception unit receives a content. The first generation unit analyzes the received content and generates one or more pieces of analysis information related to the content. The collection unit collects content information related to the content on a network on the basis of the one or more pieces of generated analysis information. The second generation unit generates an utterance sentence on the basis of at least one of the one or more pieces of analysis information and the collected content information.
    Type: Grant
    Filed: September 28, 2016
    Date of Patent: July 12, 2022
    Inventor: Hideo Nagasaka
  • Patent number: 11367438
    Abstract: An embodiment of the present invention provides an artificial intelligence (AI) apparatus for recognizing a speech of a user, the artificial intelligence apparatus includes a memory to store a speech recognition model and a processor to obtain a speech signal for a user speech, to convert the speech signal into a text using the speech recognition model, to measure a confidence level for the conversion, to perform a control operation corresponding to the converted text if the measured confidence level is greater than or equal to a reference value, and to provide feedback for the conversion if the measured confidence level is less than the reference value.
    Type: Grant
    Filed: May 16, 2019
    Date of Patent: June 21, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Jaehong Kim, Hyoeun Kim, Hangil Jeong, Heeyeon Choi
  • Patent number: 11355113
    Abstract: A method, apparatus, device, and computer readable storage medium for recognizing and decoding a voice based on a streaming attention model are provided. The method may include generating a plurality of acoustic paths for decoding the voice using the streaming attention model, and then merging acoustic paths with identical last syllables of the plurality of acoustic paths to obtain a plurality of merged acoustic paths. The method may further include selecting a preset number of acoustic paths from the plurality of merged acoustic paths as retained candidate acoustic paths. Embodiments of the present disclosure present a concept that acoustic score calculating of a current voice fragment is only affected by its last voice fragment and has nothing to do with earlier voice history, and merge acoustic paths with the identical last syllables of the plurality of candidate acoustic paths.
    Type: Grant
    Filed: March 9, 2020
    Date of Patent: June 7, 2022
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Junyao Shao, Sheng Qian, Lei Jia
  • Patent number: 11341988
    Abstract: A hybrid machine learning-based and DSP statistical post-processing technique is disclosed for voice activity detection. The hybrid technique may use a DNN model with a small context window to estimate the probability of speech by frames. The DSP statistical post-processing stage operates on the frame-based speech probabilities from the DNN model to smooth the probabilities and to reduce transitions between speech and non-speech states. The hybrid technique may estimate the soft decision on detected speech in each frame based on the smoothed probabilities, generate a hard decision using a threshold, detect a complete utterance that may include brief pauses, and estimate the end point of the utterance. The hybrid voice activity detection technique may incorporate a target directional probability estimator to estimate the direction of the speech source. The DSP statistical post-processing module may use the direction of the speech source to inform the estimates of the voice activity.
    Type: Grant
    Filed: September 23, 2019
    Date of Patent: May 24, 2022
    Assignee: APPLE INC.
    Inventors: Ramin Pishehvar, Feiping Li, Ante Jukic, Mehrez Souden, Joshua D. Atkins
  • Patent number: 11341340
    Abstract: Adapters for neural machine translation systems. A method includes determining a set of similar n-grams that are similar to a source n-gram, and each similar n-gram and the source n-gram is in a first language; determining, for each n-gram in the set of similar n-grams, a target n-gram is a translation of the similar n-gram in the first language to the target n-gram in the second language; generating a source encoding of the source n-gram, and, for each target n-gram determined from the set of similar n-grams determined for the source n-gram, a target encoding of the target n-gram and a conditional source target memory that is an encoding of each of the target encodings; providing, as input to a first prediction model, the source encoding and the condition source target memory; and generating a predicted translation of the source n-gram from the first language to the second language.
    Type: Grant
    Filed: October 1, 2019
    Date of Patent: May 24, 2022
    Assignee: Google LLC
    Inventors: Ankur Bapna, Ye Tian, Orhan Firat
  • Patent number: 11334182
    Abstract: In some implementations, data indicating a touch received on a proximity-sensitive display is received while the proximity-sensitive display is presenting one or more items. In one aspect, the techniques describe may involve a process for disambiguating touch selections of hypothesized items, such as text or graphical objects that have been generated based on input data, on a proximity-sensitive display. This process may allow a user to more easily select hypothesized items that the user may wish to correct, by determining whether a touch received through the proximity-sensitive display represents a selection of each hypothesized item based at least on a level of confidence that the hypothesized item accurately represents the input data.
    Type: Grant
    Filed: December 10, 2019
    Date of Patent: May 17, 2022
    Assignee: Google LLC
    Inventors: Jakob Nicolaus Foerster, Diego Melendo Casado, Glen Shires
  • Patent number: 11328731
    Abstract: System and methods for identifying a text word from a spoken utterance are provided. An ensemble BPE system that includes a phone BPE system and a character BPE system receives a spoken utterance. Both BPE systems include a multi-level language model (LM) and an acoustic model. The phone BPE system identifies first words from the spoken utterance and determine a first score for each first word. The first words are converted into character sequences. The character BPE model converts the character sequences into second words and determines a second score for each second word. For each word from the first words that matches a word in the second words the first and second scores are combined. The text word is the word with a highest score.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: May 10, 2022
    Assignee: salesforce.com, inc.
    Inventors: Weiran Wang, Yingbo Zhou, Caiming Xiong
  • Patent number: 11295732
    Abstract: In order to improve the accuracy of ASR, an utterance is transcribed using a plurality of language models, such as for example, an N-gram language model and a neural language model. The language models are trained separately. They each output a probability score or other figure of merit for a partial transcription hypothesis. Model scores are interpolated to determine a hybrid score. While recognizing an utterance, interpolation weights are chosen or updated dynamically, in the specific context of processing. The weights are based on dynamic variables associated with the utterance, the partial transcription hypothesis, or other aspects of context.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: April 5, 2022
    Assignee: SoundHound, Inc.
    Inventors: Steffen Holm, Terry Kong, Kiran Garaga Lokeswarappa
  • Patent number: 11289099
    Abstract: An information processing device including processing circuitry is provided. The processing circuitry configured to receive voice information regarding a voice of a user collected by a specific microphone of a plurality of microphones. The processing circuitry is configured to determine the user identified on a basis of the voice information regarding the voice of the user collected by the specific microphone among the plurality of microphones to be a specific type of user that has performed speech a predefined number of times or more within at least a certain period of time. Further, the processing circuitry is configured to control a message to be output to the user via a speaker corresponding to the specific microphone based on the user being determined to be the specific type of user.
    Type: Grant
    Filed: August 4, 2017
    Date of Patent: March 29, 2022
    Assignee: SONY CORPORATION
    Inventor: Keigo Ihara
  • Patent number: 11276413
    Abstract: Disclosed are an audio signal encoding method and audio signal decoding method, and an encoder and decoder performing the same. The audio signal encoding method includes applying an audio signal to a training model including N autoencoders provided in a cascade structure, encoding an output result derived through the training model, and generating a bitstream with respect to the audio signal based on the encoded output result.
    Type: Grant
    Filed: August 16, 2019
    Date of Patent: March 15, 2022
    Assignees: Electronics and Telecommunications Research Institute, THE TRUSTEES OF INDIANA UNIVERSITY
    Inventors: Mi Suk Lee, Jongmo Sung, Minje Kim, Kai Zhen
  • Patent number: 11243810
    Abstract: The system uses the non-repudiatory persistence of blockchain technology to store all task statuses and results across the distributed computer network in an immutable blockchain database. Coupled with the resiliency of the stored data, the system may determine a sequence of processing tasks for a given processing request and use the sequence to detect and/or predict failures. Accordingly, in the event of a detected system failure, the system may recover the results prior to the failure, minimizing disruptions to processing the request and improving hardware resiliency.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: February 8, 2022
    Assignee: The Bank of New York Mellon
    Inventors: Sanjay Kumar Stribady, Saket Sharma, Gursel Taskale
  • Patent number: 11217252
    Abstract: A method of zoning a transcription of audio data includes separating the transcription of audio data into a plurality of utterances. A that each word in an utterances is a meaning unit boundary is calculated. The utterance is split into two new utterances at a work with a maximum calculated probability. At least one of the two new utterances that is shorter than a maximum utterance threshold is identified as a meaning unit.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: January 4, 2022
    Assignee: VERINT SYSTEMS INC.
    Inventors: Roni Romano, Yair Horesh, Jeremie Dreyfuss
  • Patent number: 11204685
    Abstract: User interfaces may enable users to initiate voice-communications with voice-controlled devices via a Wi-Fi network or other network via an Internet Protocol (IP) address. The user interfaces may include controls to enable users to initiate voice communications, such as Voice over Internet Protocol (VoIP) calls, with devices that do not have connectivity with traditional mobile telephone networks, such as traditional circuit transmissions of a Public Switched Telephone Network (PSTN). For example, the user interface may enable initiating a voice communication with a voice-controlled device that includes network connectivity via a home Wi-Fi network. The user interfaces may indicate availability of devices and/or contacts for voice communications and/or recent activity of devices or contact.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: December 21, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Blair Harold Beebe, Katherine Ann Baker, David Michael Rowell, Peter Chin
  • Patent number: 11178082
    Abstract: Methods, systems, and computer programs are presented for a smart communications assistant with an audio interface. One method includes an operation for getting messages addressed to a user. The messages are from one or more message sources and each message comprising message data that includes text. The method further includes operations for analyzing the message data to determine a meaning of each message, for generating a score for each message based on the respective message data and the meaning of the message, and for generating a textual summary for the messages based on the message scores and the meaning of the messages. A speech summary is created based on the textual summary and the speech summary is then sent to a speaker associated with the user. The audio interface further allows the user to verbally request actions for the messages.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: November 16, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nikrouz Ghotbi, August Niehaus, Sachin Venugopalan, Aleksandar Antonijevic, Tvrtko Tadic, Vashutosh Agrawal, Lisa Stifelman
  • Patent number: 11164584
    Abstract: Systems and methods are provided for application awakening and speech recognition. Such system may comprise a microphone configured to record an audio in an audio queue. The system may further comprise a processor configured to monitor the audio queue for an awakening phrase, in response to detecting the awakening phrase, obtain an audio segment from the audio queue, and transmit the obtained audio segment to a server. The recording of the audio may be continuous from a beginning of the awakening phrase to an end of the audio segment.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: November 2, 2021
    Assignee: Beijing DiDi Infinity Technology and Development Co., Ltd.
    Inventors: Liting Guo, Gangtao Hu
  • Patent number: 11164564
    Abstract: According to certain embodiments, a system comprises interface circuitry and processing circuitry. The processing circuitry receives an input via the interface circuitry. The input is based on an utterance of a user, and the processing circuitry uses a probabilistic engine to determine one or more candidate intents associated with the utterance. The processing circuitry determines a number of the one or more candidate intents that exceed a threshold. If the number of candidate intents that exceed the threshold does not equal one, the processing circuitry uses a deterministic engine to compare the input to a set of regular expression patterns. If the input matches one of the regular expression patterns, the processing circuitry uses the matching regular expression pattern to determine the intent of the utterance. The interface circuitry communicates the intent of the utterance as an output.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: November 2, 2021
    Assignee: Bank of America Corporation
    Inventors: Donatus Asumu, Bhargav Aditya Ayyagari
  • Patent number: 11151332
    Abstract: Embodiments provide for dialog based speech recognition by clustering a plurality of nodes comprising a dialog tree into at least a first cluster and a second cluster; creating a first dataset of natural language sentences for the first cluster and a second dataset of natural language sentences for the second cluster; generating a first specialized language model (LM) associated with the first cluster based on the first dataset; and generating a second specialized LM associated with the second cluster based on the second dataset, wherein the first specialized LM is different from the second specialized LM.
    Type: Grant
    Filed: March 7, 2019
    Date of Patent: October 19, 2021
    Assignee: International Business Machines Business
    Inventors: Julio Nogima, Marcelo C. Grave, Claudio S. Pinhanez
  • Patent number: 11115463
    Abstract: The description relates to predicting terms based on text inputted by a user. One example includes a computing device comprising a processor configured to send, over a communications network, the text to a remote prediction engine. The processor is configured to send the text to a local prediction engine stored at the computing device, and to monitor for a local predicted term from the local prediction engine and a remote predicted term from the remote prediction engine, in response to the sent text. The computing device includes a user interface configured to present a final predicted term to the user such that the user is able to select the final term. The processor is configured to form the final predicted term using either the remote predicted term or the local predicted term on the basis of a time interval running from the time at which the user input the text.
    Type: Grant
    Filed: November 22, 2016
    Date of Patent: September 7, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Adam John Cudworth, Alexander Gautam Primavesi, Piotr Jerzy Holc, Joseph Charles Woodward
  • Patent number: 11094316
    Abstract: A device includes a memory configured to store category labels associated with categories of a natural language processing library. A processor is configured to analyze input audio data to generate a text string and to perform natural language processing on at least the text string to generate an output text string including an action associated with a first device, a speaker, a location, or a combination thereof. The processor is configured to compare the input audio data to audio data of the categories to determine whether the input audio data matches any of the categories and, in response to determining that the input audio data does not match any of the categories: create a new category label, associate the new category label with at least a portion of the output text string, update the categories with the new category label, and generate a notification indicating the new category label.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: August 17, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Erik Visser, Fatemeh Saki, Yinyi Guo, Sunkuk Moon, Lae-Hoon Kim, Ravi Choudhary
  • Patent number: 11087764
    Abstract: A speech recognition apparatus includes a speech detection unit configured to detect a speech input by a user, an information providing unit configured to perform information provision to the user, using either first speech recognition information based on a recognition result of the speech by a first speech recognition unit or second speech recognition information based on a recognition result of the speech by a second speech recognition unit different from the first speech recognition unit, and a selection unit configured to select either the first speech recognition information or the second speech recognition information as speech recognition information to be used by the information providing unit on the basis of an elapsed time from the input of the speech, and change a method of the information provision by the information providing unit.
    Type: Grant
    Filed: November 14, 2017
    Date of Patent: August 10, 2021
    Assignee: Clarion Co., Ltd.
    Inventors: Takeshi Homma, Rui Zhang, Takuya Matsumoto, Hiroaki Kokubo
  • Patent number: 11074908
    Abstract: A method, computer program product, and computer system for identifying, by a computing device, at least one language model component of a plurality of language model components in at least one application associated with automatic speech recognition (ASR) and natural language understanding (NLU) usage. A contribution bias may be received for the at least one language model component. The ASR and NLU may be aligned between the plurality of language model components based upon, at least in part, the contribution bias.
    Type: Grant
    Filed: June 14, 2019
    Date of Patent: July 27, 2021
    Assignee: Nuance Communications, Inc.
    Inventors: Nathan Bodenstab, Matt Hohensee, Dermot Connolly, Kenneth Smith, Vittorio Manzone
  • Patent number: 11056118
    Abstract: A method of speaker identification comprises receiving a speech signal and dividing the speech signal into segments. Following each segment, a plurality of features are extracted from a most recently received segment, and scoring information is derived from the extracted features of the most recently received segment. The scoring information derived from the extracted features of the most recently received segment is combined with previously stored scoring information derived from the extracted features of any previously received segment. The new combined scoring information is stored, and an identification score is calculated using the combined scoring information.
    Type: Grant
    Filed: June 28, 2018
    Date of Patent: July 6, 2021
    Assignee: Cirrus Logic, Inc.
    Inventors: David Martínez González, Carlos Vaquero Avilés-Casco
  • Patent number: 11056104
    Abstract: In an approach for acoustic modeling with a language model, a computer isolates an audio stream. The computer identifies one or more language models based at least in part on the isolated audio stream. The computer selects a language model from the identified one or more language models. The computer creates a text based on the selected language model and the isolated audio stream. The computer creates an acoustic model based on the created text. The computer generates a confidence level associated with the created acoustic model. The computer selects a highest ranked language model based at least in part on the generated confidence level.
    Type: Grant
    Filed: May 26, 2017
    Date of Patent: July 6, 2021
    Assignee: International Business Machines Corporation
    Inventors: Aaron K. Baughman, Stephen C. Hammer, Mauro Marzorati
  • Patent number: 11049045
    Abstract: A classification apparatus includes: a calculation unit that outputs, as a classification result, results of classification by each of a plurality of classifiers with respect to learning data formed of data of at least two classes at a learning time and calculates a combination result value obtained by linear combination, using a combination coefficient, of results of classification by each of the plurality of classifiers with respect to the learning data to output the calculated combination result value as the classification result at a classification time; an extraction unit that extracts a correct solution class and an incorrect solution class for each of the classifiers from the classification result; a difference calculation unit that calculates a difference between the correct solution class and the incorrect solution class for each of the classifiers; a conversion unit that calculates a feature vector using the calculated difference for each of the classifiers; and a combination coefficient setting uni
    Type: Grant
    Filed: November 16, 2016
    Date of Patent: June 29, 2021
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kotaro Funakoshi, Naoto Iwahashi
  • Patent number: 11037551
    Abstract: Methods, systems, and apparatus for receiving audio data corresponding to a user utterance and context data, identifying an initial set of one or more n-grams from the context data, generating an expanded set of one or more n-grams based on the initial set of n-grams, adjusting a language model based at least on the expanded set of n-grams, determining one or more speech recognition candidates for at least a portion of the user utterance using the adjusted language model, adjusting a score for a particular speech recognition candidate determined to be included in the expanded set of n-grams, determining a transcription of user utterance that includes at least one of the one or more speech recognition candidates, and providing the transcription of the user utterance for output.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: June 15, 2021
    Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
  • Patent number: 11018885
    Abstract: In general, the disclosure describes techniques for automatically generating summaries of meetings. A computing system obtains a transcript of a meeting and may produce, based on the transcript of the meeting, a data structure that comprises utterance features. Furthermore, the computing system may determine, based on the transcript of the meeting, temporal bounds of a plurality of activity episodes within the meeting. For each respective activity episode of a plurality of activity episodes, the computing system may determine, based on the utterance features associated with the respective activity episode, a conversational activity type associated with the respective activity episode. Additionally, the computing system may produce an episode summary for the respective activity episode that is dependent on the determined conversational activity type associated with the respective activity episode.
    Type: Grant
    Filed: April 17, 2019
    Date of Patent: May 25, 2021
    Assignee: SRI International
    Inventor: John Niekrasz
  • Patent number: 10991363
    Abstract: An apparatus, method, and computer program product for adapting an acoustic model to a specific environment are defined. An adapted model obtained by adapting an original model to the specific environment using adaptation data, the original model being trained using training data and being used to calculate probabilities of context-dependent phones given an acoustic feature. Adapted probabilities obtained by adapting original probabilities using the training data and the adaptation data, the original probabilities being trained using the training data and being prior probabilities of context-dependent phones. An adapted acoustic model obtained from the adapted model and the adapted probabilities.
    Type: Grant
    Filed: November 6, 2017
    Date of Patent: April 27, 2021
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Bhuvana Ramabhadran, Masayuki Suzuki
  • Patent number: 10957129
    Abstract: Methods, systems, and apparatus for monitoring a sound are described. An audio signal is obtained and the audio signal is analyzed to generate an audio signature. An object type is identified based on the audio signature and an action corresponding to the object type is identified.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: March 23, 2021
    Assignee: eBay Inc.
    Inventor: Sergio Pinzon Gonzales, Jr.