Word Recognition Patents (Class 704/251)
  • Patent number: 11335347
    Abstract: Described herein is a system for sentiment detection in audio data. The system is trained using acoustic information and lexical information to determine a sentiment corresponding to an utterance. In some cases when lexical information is not available, the system (trained on acoustic and lexical information) is configured to determine a sentiment using only acoustic information.
    Type: Grant
    Filed: June 3, 2019
    Date of Patent: May 17, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Gustavo Alfonso Aguilar Alas, Viktor Rozgic, Chao Wang
  • Patent number: 11337061
    Abstract: A system and method for providing anonymous communications from a user to a called party includes obtaining a dedicated phone number and creating a user account for the user and assigning the dedicated phone number to the user account. A provider account is created for a digital assistant using the dedicated phone number and the digital assistant is preprogrammed with the user account. The digital assistant is also preprogrammed with a skill for recognizing a specific utterance (e.g. “Call”). Connectivity is provided between the digital assistant and the Internet, for example, using a wireless access point. The digital assistant listens for the specific utterance and, upon recognizing the specific utterance followed by an identification of the called party, the digital assistant initiates a voice call through the Internet to the called party.
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: May 17, 2022
    Assignee: Ways Investments, LLC
    Inventor: Mark Edward Gray
  • Patent number: 11328714
    Abstract: Processing data for speech recognition by generating hypotheses from input data, assigning each hypothesis a score according to a confidence level value and hypothesis ranking, executing a pass/fail grammar test against each hypothesis, generating replacement hypotheses according to grammar test failures, assigning each replacement hypothesis a score according to a number of hypothesis changes, and providing a set of hypotheses, wherein the set comprises at least one replacement hypothesis.
    Type: Grant
    Filed: January 2, 2020
    Date of Patent: May 10, 2022
    Assignee: International Business Machines Corporation
    Inventors: Andrew R. Freed, Marco Noel, Victor Povar
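The score, test, and replace flow described in the abstract above can be illustrated with a small sketch. Everything in it is a hypothetical stand-in rather than the patented method: the toy grammar (a fixed vocabulary), the repair table, the scoring formulas, and names such as `rescore` are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    score: float
    changes: int = 0  # word substitutions made to produce this hypothesis


# Hypothetical toy grammar: an utterance passes if every word is in VOCAB.
VOCAB = {"turn", "on", "off", "the", "light", "lights"}
# Hypothetical repair table for words that fail the grammar test.
REPAIRS = {"lite": "light", "tern": "turn"}


def passes_grammar(text):
    """Pass/fail grammar test (assumed: simple vocabulary check)."""
    return all(word in VOCAB for word in text.split())


def repair(text):
    """Return (replacement_text, number_of_changes)."""
    fixed, changes = [], 0
    for word in text.split():
        if word not in VOCAB and word in REPAIRS:
            fixed.append(REPAIRS[word])
            changes += 1
        else:
            fixed.append(word)
    return " ".join(fixed), changes


def rescore(nbest):
    """nbest: list of (text, confidence) in rank order, best first."""
    results = []
    for rank, (text, confidence) in enumerate(nbest):
        # Assumed scoring: confidence discounted by rank.
        base = confidence / (1 + rank)
        if passes_grammar(text):
            results.append(Hypothesis(text, base))
            continue
        new_text, changes = repair(text)
        if changes and passes_grammar(new_text):
            # Assumed scoring: more changes means a lower score.
            results.append(Hypothesis(new_text, base / (1 + changes), changes))
    return sorted(results, key=lambda h: h.score, reverse=True)
```

Calling `rescore([("tern on the light", 0.9), ("turn on the lights", 0.6)])` returns two hypotheses, the first of which is a replacement produced by one substitution.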
  • Patent number: 11328096
    Abstract: The present disclosure provides a method, system, and device for distributing a software release. To illustrate, based on one or more files for distribution as a software release, a release bundle is generated that includes release bundle information, such as, for each file of the one or more files, a checksum, meta data, or both. One or more other aspects of the present disclosure further provide sending the release bundle to a node device. After receiving the release bundle at the node device, the node device receives and stores at least one file at a transaction directory. After verification that each of the one or more files is present/available at the node device, the one or more files may be provided to a memory of a node device and meta data included in the release bundle information may be applied to the one or more files transferred to the memory.
    Type: Grant
    Filed: June 10, 2020
    Date of Patent: May 10, 2022
    Assignee: JFROG, LTD.
    Inventor: Yoav Landman
  • Patent number: 11322141
    Abstract: An information processing device includes a communication controller that performs communication control for receiving transmission data transmitted from a client, transmitting the transmission data to a first service providing server that performs a first service process, receiving a first service process result from the first service providing server, transmitting data according to the first service process result to a second service providing server that performs a second service process that is different from a first service, receiving a second service process result from the second service providing server, and transmitting the second service process result to the client. The first service process result is obtained by performing the first service process on the transmission data. The second service process result is obtained by performing the second service process on the data according to the first service process result.
    Type: Grant
    Filed: August 3, 2018
    Date of Patent: May 3, 2022
    Assignee: SONY CORPORATION
    Inventors: Takao Okuda, Takashi Shibuya
  • Patent number: 11322130
    Abstract: The invention relates to a method for airborne-sound acoustic monitoring of an exterior and/or an interior of a vehicle, in which at least one microphone (1) is used to convert airborne sound into an electrical signal (S) and to route it for evaluation purposes to a device for voice and/or sound recognition (2). According to the invention, the electrical signal (S) is subjected to a pre-evaluation in a device for trigger detection (3), and detection of a trigger results in the device for voice and/or sound recognition (2) being moved from an inactive or partially active state to a fully active state by means of the device for trigger detection (3). Further, the invention relates to an apparatus for airborne-sound acoustic monitoring of an exterior and/or an interior of a vehicle and to a vehicle having such an apparatus. The subject matter of the invention is also a computer-readable storage medium.
    Type: Grant
    Filed: March 7, 2019
    Date of Patent: May 3, 2022
    Assignee: Robert Bosch GmbH
    Inventors: Thomas Fleischmann, Udo Hermann, Niko Dorsch
  • Patent number: 11322151
    Abstract: According to embodiments of the disclosure, a method and an apparatus for processing a speech signal, and a computer-readable storage medium are provided. The method includes obtaining a set of speech feature representations of a received speech signal. The method also includes generating a set of source text feature representations based on a text recognized from the speech signal, each source text feature representation corresponding to an element in the text. The method also includes generating a set of target text feature representations based on the set of speech feature representations and the set of source text feature representations. The method also includes determining a match degree between the set of target text feature representations and a set of reference text feature representations predefined for the text, the match degree indicating an accuracy of recognizing the text.
    Type: Grant
    Filed: June 22, 2020
    Date of Patent: May 3, 2022
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD
    Inventors: Chuanlei Zhai, Xu Chen, Jinfeng Bai, Lei Jia
  • Patent number: 11308958
    Abstract: In one aspect, a networked microphone device is configured to (i) receive sound data, (ii) determine, via the wake-word engine, that a first portion of the sound data is representative of a wake word, (iii) determine that a second networked microphone device was added to a media playback system, (iv) transmit the first portion of the sound data to a second networked microphone device, (v) begin determining a command to be performed by the first networked microphone device, (vi) receive an indication of whether the first portion of the sound data is representative of the wake word, and (vii) output a response indicative of whether the first portion of the sound data is representative of the wake word.
    Type: Grant
    Filed: February 7, 2020
    Date of Patent: April 19, 2022
    Assignee: Sonos, Inc.
    Inventor: Connor Kristopher Smith
  • Patent number: 11308937
    Abstract: Embodiments of the present disclosure provide a method and an apparatus for identifying a key phrase in audio, a device and a computer readable storage medium. The method for identifying a key phrase in audio includes obtaining audio data to be identified. The method further includes identifying the key phrase in the audio data using a trained key phrase identification model. The key phrase identification model is trained based on first training data for identifying feature information of words in a first training text and second training data for identifying the key phrase in a second training text. In this way, embodiments of the present disclosure can accurately and efficiently identify key information in the audio data.
    Type: Grant
    Filed: August 2, 2019
    Date of Patent: April 19, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Zhihua Wang, Tianxing Yang, Zhipeng Wu, Bin Peng, Chengyuan Zhao
  • Patent number: 11301507
    Abstract: Systems and methods for searching for a media asset are described. In some aspects, the system includes control circuitry that receives a first search query from a user. The control circuitry identifies media assets related to the first search query from a content database. The control circuitry receives a second search query following the first search query. The control circuitry determines whether a media asset from the media assets is related to the second search query. In response to determining that less than a threshold number of media assets from the media assets are related to the second search query, the control circuitry transmits an instruction requesting the user to repeat the second search query. The control circuitry receives a third search query related to the first search query. The control circuitry determines a media asset from the media assets that is related to the third search query.
    Type: Grant
    Filed: July 29, 2020
    Date of Patent: April 12, 2022
    Assignee: Rovi Guides, Inc.
    Inventors: Sashikumar Venkataraman, Ahmed Nizam Mohaideen Pathurudeen
  • Patent number: 11302306
    Abstract: A sound recognition system including time-dependent analog filtered feature extraction and sequencing. An analog front end (AFE) in the system receives input analog signals, such as signals representing an audio input to a microphone. Features in the input signal are extracted, by measuring such attributes as zero crossing events and total energy in filtered versions of the signal with different frequency characteristics at different times during the audio event. In one embodiment, a tunable analog filter is controlled to change its frequency characteristics at different times during the event. In another embodiment, multiple analog filters with different filter characteristics filter the input signal in parallel, and signal features are extracted from each filtered signal; a multiplexer selects the desired features at different times during the event.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: April 12, 2022
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Zhenyong Zhang, Wei Ma
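The feature extraction described in the abstract above (zero-crossing counts and total energy measured on filtered versions of the signal, with the filter setting changing over time) can be sketched digitally. This is only an illustration: the patent describes analog filters, whereas the one-pole low-pass filter, the frame layout, and every function name below are assumptions.

```python
def zero_crossings(samples):
    """Count sign changes between consecutive samples (0 counts as non-negative)."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))


def energy(samples):
    """Total energy: sum of squared sample values."""
    return sum(s * s for s in samples)


def one_pole_lowpass(samples, alpha):
    """Digital one-pole low-pass filter; alpha in (0, 1], where 1 passes the input through."""
    out, y = [], 0.0
    for s in samples:
        y += alpha * (s - y)
        out.append(y)
    return out


def extract_features(samples, frame_len, alphas):
    """Split the signal into frames and filter each frame with a different
    setting, cycling through `alphas` to mimic a filter retuned over time."""
    feats = []
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    for i, start in enumerate(starts):
        frame = samples[start:start + frame_len]
        alpha = alphas[i % len(alphas)]
        filtered = one_pole_lowpass(frame, alpha)
        feats.append((alpha, zero_crossings(filtered), energy(filtered)))
    return feats
```

Each tuple in the result holds the filter setting used for that frame, the zero-crossing count, and the energy of the filtered frame.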
  • Patent number: 11301885
    Abstract: Embodiments herein provide data clustering and user modeling for next-best-action decisions. Specifically, a modeling tool is configured to: receive indicators within unstructured social data from a plurality of users; analyze the unstructured social data of each of the plurality of users to assign a set of feature vectors to each of the plurality of users, each feature vector corresponding to one or more personality characteristics of each of the plurality of users; and analyze the feature vectors to identify two or more users from the plurality of users sharing a set of similar feature vectors. The modeling tool is further configured to: group the two or more users from the plurality of users sharing the set of similar feature vectors to form a cluster; identify attributes of the cluster; and input the attributes of the cluster into a predictive model to determine an offer corresponding to the cluster.
    Type: Grant
    Filed: September 16, 2019
    Date of Patent: April 12, 2022
    Assignee: International Business Machines Corporation
    Inventors: Norbert Herman, Daniel T. Lambert
  • Patent number: 11295730
    Abstract: A method is described that includes processing text and speech from an input utterance using local overrides of default dictionary pronunciations. Applying this method, a word-level grammar used to process the tokens specifies at least one local word phonetic variant that applies within a specific production rule and, within a local context of the specific production rule, the local word phonetic variant overrides one or more default dictionary phonetic versions of the word. This method can be applied to parsing utterances where the pronunciation of some words depends on their syntactic or semantic context.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: April 5, 2022
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Christopher Wilson, Bernard Mont-Reynaud
  • Patent number: 11295755
    Abstract: A non-transitory computer-readable storage medium storing a program that causes a processor included in a computer mounted on a sound source direction estimation device to execute a process, the process includes calculating a sound pressure difference between a first voice data acquired from a first microphone and a second voice data acquired from a second microphone and estimating a sound source direction of the first voice data and the second voice data based on the sound pressure difference, outputting an instruction to execute a voice recognition on the first voice data or the second voice data in a language corresponding to the estimated sound source direction, and controlling a reference for estimating a sound source direction based on the sound pressure difference, based on a time length of the voice data used for the voice recognition based on the instruction and a voice recognition time length.
    Type: Grant
    Filed: August 5, 2019
    Date of Patent: April 5, 2022
    Assignee: FUJITSU LIMITED
    Inventors: Nobuyuki Washio, Masanao Suzuki, Chisato Shioda
  • Patent number: 11288710
    Abstract: Automatically collecting advertisement bidding order by automatically accessing at least one Internet content site and presenting the Internet content site with at least one virtual user data and at least one IP address representing a geographic location of the virtual user. In response, receiving from the Internet content site advertisement content and bidding data. Presenting the advertisement bidding data to a user, and/or storing the advertisement bidding data.
    Type: Grant
    Filed: October 18, 2019
    Date of Patent: March 29, 2022
    Assignee: BI SCIENCE (2009) LTD
    Inventors: Assaf Toval, Kfir Moyal
  • Patent number: 11282520
    Abstract: Embodiments of the present application provide a method, apparatus and device for interaction of intelligent voice devices, and a storage medium. The method includes: receiving wake-up messages sent by respective awakened intelligent voice devices; determining a forwarding device according to the wake-up messages; sending a forwarding instruction to the forwarding device to enable the forwarding device to receive a user voice request according to the forwarding instruction, where the forwarding instruction includes: type skill information of all intelligent voice devices; and sending a non-response message to the awakened intelligent voice devices other than the forwarding device, which enables the most appropriate response device to execute the cloud result requested by the forwarding device, and the plurality of awakened intelligent voice devices do not respond at the same time so as to avoid confusion, thus making it easier to meet user needs.
    Type: Grant
    Filed: July 16, 2019
    Date of Patent: March 22, 2022
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Gaofei Cheng, Fei Wang, Yan Zhang, Qin Xiong, Leilei Gao
  • Patent number: 11282495
    Abstract: A first neural network model of a user device processes audio data to extract audio embeddings that represent vocal characteristics of a user of an utterance represented in the audio data. The audio embeddings may then be hashed to remove characteristics specific to the user while still maintaining a unique set of characteristics. The hashed embeddings may be sent to a remote system, which may use them to identify the user.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: March 22, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Hongda Mao, George Yu-Chien Lin, Sundararajan Srinivasan, Chu-Cheng Hsieh
  • Patent number: 11270691
    Abstract: A voice interaction system performs a voice interaction with a user. The voice interaction system includes: topic detection means for estimating a topic of the voice interaction and detecting a change in the topic that has been estimated; and ask-again detection means for detecting, when the change in the topic has been detected by the topic detection means, the user's voice as ask-again by the user based on prosodic information on the user's voice.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: March 8, 2022
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Narimasa Watanabe, Sawa Higuchi, Tatsuro Hori
  • Patent number: 11270700
    Abstract: An artificial intelligence device includes a microphone configured to acquire speech including a plurality of languages, and a processor configured to generate, from the speech, text data corresponding to the speech, generate a plurality of pieces of separated data acquired by separating the text data for each language, perform natural language understanding processing corresponding to a language of each of the plurality of pieces of separated data to generate a natural language understanding processing result for each of the plurality of pieces of separated data, acquire command information about a command to be instructed by the speech and slot information about an entity subjected to the command, based on the natural language understanding processing result, perform an operation corresponding to the speech based on the command information and the slot information, and generate a response based on a result of performing the operation.
    Type: Grant
    Filed: February 24, 2020
    Date of Patent: March 8, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Hyun Yu, Byeongha Kim, Yejin Kim, Jonghoon Chae
  • Patent number: 11257491
    Abstract: This application relates generally to modifying visual data based on audio commands and more specifically, to performing complex operations that modify visual data based on one or more audio commands. In some embodiments, a computer system may receive an audio input and identify an audio command based on the audio input. The audio command may be mapped to one or more operations capable of being performed by a multimedia editing application. The computer system may perform the one or more operations to edit the received multimedia data.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: February 22, 2022
    Assignee: ADOBE INC.
    Inventors: Sarah Kong, Yinglan Ma, Hyunghwan Byun, Chih-Yao Hsieh
  • Patent number: 11257492
    Abstract: Embodiments of the present disclosure provide a voice interaction method and apparatus for a customer service. The method includes: receiving customer demand information from a customer demand end, the customer demand information including a customer demand end identifier and a voice demand instruction; performing a speech recognition on the voice demand instruction; and if a demanded service type in the voice demand instruction is identified, sending a service-providing request to a service management system based on the demanded service type, the service-providing request including the customer demand end identifier and the demanded service type. The embodiments of the present disclosure realize the interaction between the customer demand end, the service management system and the customer by adopting the voice interaction method, so that the customer's demand can be quickly and intelligently recognized and the corresponding service can be provided.
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: February 22, 2022
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventor: Xiantang Chang
  • Patent number: 11244681
    Abstract: A system and a method for automating drive-thru orders are provided. In particular, a bridge board is provided that can integrate existing drive-thru hardware with computer devices executing machine learning models to detect and analyze speech from a drive-thru. The system and method employ vehicle tracking to account for vehicle behavior during its time at the drive-thru as well as enhanced vehicle analytics. Additionally, the system and method employ tools for assessing customer speed-of-service. Cameras and vehicle image analysis are used to link drive-thru and/or on-line food/beverage orders with vehicles entering the eatery property to accelerate food/beverage delivery to vehicle occupants.
    Type: Grant
    Filed: May 20, 2021
    Date of Patent: February 8, 2022
    Assignee: XENIAL, INC.
    Inventors: Christopher Siefken, William Wine, Arjun Wadwalkar, Brian Keith Jackson, Andrew Grindstaff
  • Patent number: 11238856
    Abstract: Aspects of the present disclosure relate to ignoring trigger words of a buffered media stream. A buffered media stream of media content is accessed in advance of playing the media stream. One or more trigger words in the media content of the buffered media stream are identified. A time stamp is generated for each of the one or more identified trigger words in relation to a play time of the media content of the buffered media stream. A voice command device is instructed to ignore audio content of the buffered media stream based on the time stamp for each of the one or more identified trigger words while the buffered media stream is played.
    Type: Grant
    Filed: May 1, 2018
    Date of Patent: February 1, 2022
    Assignee: International Business Machines Corporation
    Inventors: Eunjin Lee, Jack Dunning, John J. Wood, Giacomo G. Chiarella, Daniel T. Cunnington
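The time-stamping idea in the abstract above can be sketched as follows. The transcript format (word, start, end in seconds), the trigger-word set, the safety margin, and the function names are all hypothetical assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical trigger words; a real device would use its own wake words.
TRIGGER_WORDS = {"alexa", "computer"}


def trigger_timestamps(transcript):
    """transcript: list of (word, start_s, end_s) taken from the buffered
    stream before playback. Returns the (start, end) span of each trigger word."""
    return [
        (start, end)
        for word, start, end in transcript
        if word.lower() in TRIGGER_WORDS
    ]


def should_ignore(play_time_s, windows, margin_s=0.25):
    """True if the current play time falls inside any trigger-word span,
    widened by an assumed safety margin on both sides."""
    return any(
        start - margin_s <= play_time_s <= end + margin_s
        for start, end in windows
    )
```

During playback, a voice command device could call `should_ignore` with the current play time and skip wake-word handling whenever it returns `True`.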
  • Patent number: 11227250
    Abstract: A method, computer system, and a computer program product for customer representative ratings is provided. The present invention may include receiving a chat transcript with one or more tagged triplets and one or more multi-dimensional success vectors. The present invention may include aggregating the one or more multi-dimensional success vectors. The present invention may include receiving at least one business priority. The present invention may include applying at least one filter to the one or more multi-dimensional success vectors. The present invention may include normalizing the one or more multi-dimensional success vectors based on the at least one applied filter. The present invention may include obtaining a rating.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: January 18, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Steven Ware Jones, Arjun Jauhari, Jennifer A. Mallette, Vivek Salve
  • Patent number: 11222175
    Abstract: A method, system and computer program product for recognizing terms in a specified corpus. In one embodiment, the method comprises providing a set of known terms t ∈ T, each of the known terms t belonging to a set of types Γ(t) = {γ1, . . . }, wherein each of the terms is comprised of a list of words, t = w1, w2, . . . , wn, and the union of all the words for all the terms is a word set W. The method further comprises using the set of terms T and the set of types to determine a set of pattern-to-type mappings p → γ; and using the set of pattern-to-type mappings to recognize terms in the specified corpus and, for each of the recognized terms in the specified corpus, to recognize one or more of the types γ for said each recognized term.
    Type: Grant
    Filed: May 24, 2019
    Date of Patent: January 11, 2022
    Assignee: International Business Machines Corporation
    Inventors: Michael Glass, Alfio M Gliozzo
  • Patent number: 11222635
    Abstract: An electronic device of the present invention comprises: a housing; a touchscreen display; a microphone; at least one speaker; a button disposed on a portion of the housing or set to be displayed on the touchscreen display; a wireless communication circuit; a processor; and a memory. The electronic device is configured to store an application program including a user interface for receiving a text input. When the user interface is not displayed on the touchscreen display, the electronic device receives a user input through the button, receives user speech through the microphone, and then provides data on the user speech to an external server including an automatic speech recognition system and an intelligence system. An instruction for performing a task generated by the intelligence system in response to the user speech is received from the server.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: January 11, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Sang-Ki Kang, Jang-Seok Seo, Kook-Tae Choi, Hyun-Woo Kang, Jin-Yeol Kim, Chae-Hwan Li, Kyung-Tae Kim, Dong-Ho Jang, Min-Kyung Hwang
  • Patent number: 11222623
    Abstract: A speech keyword recognition method includes: obtaining first speech segments based on a to-be-recognized speech signal; and obtaining first probabilities respectively corresponding to the first speech segments by using a preset first classification model. A first probability of a first speech segment is obtained from probabilities of the first speech segment respectively corresponding to pre-determined word segmentation units of a pre-determined keyword.
    Type: Grant
    Filed: May 27, 2020
    Date of Patent: January 11, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Jun Wang, Dan Su, Dong Yu
  • Patent number: 11211046
    Abstract: A mistranscription generated by a speech recognition system is identified. A received utterance is matched to a first utterance member within a set of known utterance members. The matching operation matches fewer than the first plural number of words in the received utterance and the received utterance varies in a first particular manner as compared to a first word in a first slot in the first utterance member. The received utterance is sent to a mistranscription analyzer component which increments evidence that the received utterance is evidence of a mistranscription. Once the incremented evidence for the mistranscription exceeds a threshold, future received utterances containing the mistranscription are treated as though the first word was recognized.
    Type: Grant
    Filed: January 13, 2020
    Date of Patent: December 28, 2021
    Assignee: International Business Machines Corporation
    Inventors: Andrew Aaron, Shang Guo, Jonathan Lenchner, Maharaj Mukherjee
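The evidence-accumulation step in the abstract above can be sketched as a counter keyed on the differing word pair. This is a simplified illustration under stated assumptions: matching is reduced to same-length utterances that differ in exactly one slot, and the class name, threshold value, and method names are hypothetical.

```python
from collections import Counter


class MistranscriptionAnalyzer:
    def __init__(self, known_utterances, threshold=2):
        self.known = [u.split() for u in known_utterances]
        self.threshold = threshold
        # (heard_word, expected_word) -> number of times observed
        self.evidence = Counter()

    def _single_slot_diff(self, words):
        """Return (heard, expected) if the utterance differs from a known
        utterance in exactly one slot, else None."""
        for known in self.known:
            if len(known) != len(words):
                continue
            diffs = [(w, k) for w, k in zip(words, known) if w != k]
            if len(diffs) == 1:
                return diffs[0]
        return None

    def process(self, utterance):
        """Return the utterance, corrected once enough evidence has accumulated."""
        words = utterance.split()
        if words in self.known:
            return utterance
        diff = self._single_slot_diff(words)
        if diff is None:
            return utterance
        heard, expected = diff
        if self.evidence[diff] > self.threshold:
            # Evidence exceeds the threshold: treat the heard word as the expected one.
            return " ".join(expected if w == heard else w for w in words)
        self.evidence[diff] += 1
        return utterance
```

With `threshold=2`, the first three occurrences of a near-miss utterance are passed through unchanged while evidence accumulates, and later occurrences are rewritten to the known form.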
  • Patent number: 11205415
    Abstract: An electronic apparatus includes a memory configured to store first voice recognition information related to a first language and second voice recognition information related to a second language, and a processor configured to obtain a first text corresponding to a received user voice on the basis of the first voice recognition information. The processor, based on an entity name being included in the user voice according to the obtained first text, identifies a segment in the user voice in which the entity name is included, obtains a second text corresponding to the identified segment of the user voice on the basis of the second voice recognition information, and obtains control information corresponding to the user voice on the basis of the first text and the second text.
    Type: Grant
    Filed: October 25, 2019
    Date of Patent: December 21, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Chansik Bok, Jihun Park
  • Patent number: 11200909
    Abstract: A method is disclosed. The proposed method includes: providing an initial speech corpus including plural utterances; based on a condition of maximum a posteriori (MAP), according to respective sequences of syllable duration, syllable duration prosodic state, syllable tone, base-syllable type, and break type of the kth utterance, using a probability of an ISR of the kth utterance x_k to estimate an estimated value x̂_k of x_k; and through the MAP condition, according to respective sequences of syllable duration, syllable duration prosodic state, syllable tone, base-syllable type, and break type of the given lth breath group/prosodic phrase group (BG/PG) of the kth utterance, using a probability of an ISR of the lth BG/PG of the kth utterance x_{k,l} to estimate an estimated value x̂_{k,l} of x_{k,l}, wherein x̂_{k,l} is the estimated value of the local ISR, and a mean of a prior probability model of x̂_{k,l} is x̂_k.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: December 14, 2021
    Assignee: NATIONAL YANG MING CHIAO TUNG UNIVERSITY
    Inventors: Chen-Yu Chiang, Guan-Ting Liou, Yih-Ru Wang, Sin-Horng Chen
  • Patent number: 11200382
    Abstract: This application discloses a prosodic pause prediction method, a prosodic pause prediction device and an electronic device. The specific implementation scheme includes: obtaining a first matrix by mapping a to-be-tested text sequence through a trained embedding layer, where the to-be-tested text sequence includes a to-be-tested input text and an identity of a to-be-tested speaker; inputting the first matrix into a trained attention model, and determining a semantic representation matrix by the trained attention model; and, performing prosodic pause prediction based on the semantic representation matrix and outputting a prosodic pause prediction result of each word in the to-be-tested input text.
    Type: Grant
    Filed: May 8, 2020
    Date of Patent: December 14, 2021
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Zhipeng Nie, Yanyao Bian, Zhanjie Gao, Changbin Chen
  • Patent number: 11194825
    Abstract: A distributed sequential pattern data mining framework mines user data to determine statistically-relevant sequential patterns which are used to correlate the sequential patterns to a particular outcome. The correlation is provided by a statistical model, a binary predictive model and/or a logistic regression model which uses the sequential patterns to learn the behavior of end users during their usage of a software application.
    Type: Grant
    Filed: September 23, 2018
    Date of Patent: December 7, 2021
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.
    Inventors: Shengyu Fu, Sai Tulasi Neppali, Neelakantan Sundaresan, Siyu Yang
  • Patent number: 11189288
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.
    Type: Grant
    Filed: January 15, 2020
    Date of Patent: November 30, 2021
    Assignee: Nuance Communications, Inc.
    Inventors: Michael Johnston, Derya Ozkan
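The last step of the abstract above (averaging gesture coordinates inside a temporal window around a speech event) can be illustrated with a minimal sketch. This is not the patented implementation. The function name, the `(timestamp, x, y)` sample layout, and the half-second window margins are assumptions chosen for illustration.

```python
from statistics import mean

def average_gesture_in_window(gesture_samples, speech_time, before=0.5, after=0.5):
    """Average the (x, y) gesture coordinates near a speech event.

    gesture_samples: iterable of (timestamp, x, y) tuples (hypothetical layout).
    speech_time: time of the detected speech event, in the same units.
    before/after: assumed window margins around the speech event.
    Returns the mean (x, y), or None if no sample falls inside the window.
    """
    start, end = speech_time - before, speech_time + after
    # Keep only the samples whose timestamp lies inside the temporal window.
    in_window = [(x, y) for t, x, y in gesture_samples if start <= t <= end]
    if not in_window:
        return None
    xs, ys = zip(*in_window)
    return (mean(xs), mean(ys))
```

Averaging over the window, rather than taking a single sample, is one simple way to smooth a pointing gesture that is jittery while the user speaks.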
  • Patent number: 11189287
    Abstract: Provided are an optimization method, apparatus, device for a wake-up model and a storage medium, which allow for: acquiring a training set and a verification set; performing an iterative training on the wake-up model according to the training set and the verification set; during the iterative training, periodically updating the training set and the verification set according to the wake-up model and a preset corpus database, and continuing performing the iterative training on the wake-up model according to the updated training set and verification set; and outputting the wake-up model when a preset termination condition is reached. The embodiments of the present disclosure, by periodically updating the training set and the verification set according to the wake-up model and the preset corpus database during an iteration, may improve optimization efficiency and effects of the wake-up model, thereby improving stability and adaptability of the wake-up model and avoiding overfitting.
    Type: Grant
    Filed: December 4, 2019
    Date of Patent: November 30, 2021
    Assignees: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD., SHANGHAI XIAODU TECHNOLOGY CO. LTD.
    Inventor: Yongchao Zhang
  • Patent number: 11188923
    Abstract: Aspects of the disclosure relate to real-time knowledge-based widget prioritization and display. A computing platform may detect, via a computing device, a voice-based interaction between an enterprise agent and a customer. Then, the computing platform may cause, via the computing device, the voice-based interaction to be captured as audio data. The computing platform may then transform the audio data to textual data. Subsequently, the computing platform may identify, in the textual data, a customer query. Then, the computing platform may retrieve, in real-time and based on the voice-based interaction and from a repository of widgets, a first widget, where the first widget includes information at least partially responsive to the customer query. Then, the computing platform may display, to the enterprise agent and via a graphical user interface in use by the enterprise agent, the first widget.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: November 30, 2021
    Assignee: Bank of America Corporation
    Inventors: Gaurav Bansal, Shekhar Singh Mehra, Vinod Maghnani, Sandeep Kumar Chauhan
  • Patent number: 11182555
    Abstract: A sequence processing method and apparatus are provided. The sequence processing method includes determining a word of a first R-node corresponding to a root node based on an input sequence, generating first I-nodes that are connected to the first R-node and include relative position information with respect to the word of the first R-node, determining a word of a second R-node to correspond to each of the first I-nodes, and determining an output sequence corresponding to the input sequence based on the determined words.
    Type: Grant
    Filed: April 9, 2020
    Date of Patent: November 23, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hwidong Na, Min-Joong Lee
  • Patent number: 11183187
    Abstract: The present invention provides a dialog system comprising a speech receiving step in which the dialog system receives input of a speech of a human, a first speech determination step in which the dialog system determines a first speech which is a speech in response to the speech of the human, a first speech presentation step in which the first speech is presented by a first agent, a reaction acquisition step in which the dialog system acquires a reaction of the human to the first speech, a second speech determination step in which the dialog system determines, when the reaction of the human is a reaction indicating that the first speech is not a speech in response to the speech of the human, a second speech which is different from the first speech, and a second speech presentation step in which the second speech is presented by the second agent.
    Type: Grant
    Filed: May 19, 2017
    Date of Patent: November 23, 2021
    Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, OSAKA UNIVERSITY
    Inventors: Hiroaki Sugiyama, Toyomi Meguro, Junji Yamato, Yuichiro Yoshikawa, Hiroshi Ishiguro
  • Patent number: 11176141
    Abstract: An aspect provides a method, including: receiving, at an input component of an information handling device, user input comprising one or more words; identifying, using a processor of the information handling device, an emotion associated with the one or more words; creating, using the processor, an emotion tag including the emotion associated with the one or more words; storing the emotion tag in a memory; analyzing one or more emotion tags; and modifying an operation of an application based on the analyzing. Other embodiments are described and claimed.
    Type: Grant
    Filed: May 16, 2016
    Date of Patent: November 16, 2021
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Suzanne Marion Beaumont, Russell Speight VanBlon, Rod D. Waltermann
  • Patent number: 11176214
    Abstract: Methods, apparatuses, and computer program products are described herein that are configured to express a linguistic description of a set of points within a spatial area in an output text. In some example embodiments, a method is provided that comprises generating one or more descriptors and/or one or more combinations of descriptors that are configured to linguistically describe at least a portion of a set of points within a spatial area. The method of this embodiment may also include scoring each of the one or more descriptors and/or one or more combinations of the one or more descriptors. The method of this embodiment may also include selecting a descriptor or combination of descriptors that has the highest score when compared to other descriptors or combinations of descriptors, provided the descriptor or combination of descriptors satisfies a threshold.
    Type: Grant
    Filed: May 1, 2015
    Date of Patent: November 16, 2021
    Assignee: ARRIA DATA2TEXT LIMITED
    Inventors: Gowri Somayajulu Sripada, Neil Burnett
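The selection step described in the abstract above (pick the highest-scoring descriptor, but only if it satisfies a threshold) can be sketched as follows. This is an illustrative sketch only. The function name, the dictionary of scores, and reading "satisfies a threshold" as "score is at least the threshold" are assumptions, not details from the patent.

```python
def select_descriptor(scored_descriptors, threshold):
    """Pick the best-scoring descriptor if it meets the threshold.

    scored_descriptors: dict mapping a descriptor (or combination of
        descriptors) to its score (hypothetical representation).
    threshold: minimum acceptable score (assumed to mean score >= threshold).
    Returns the chosen descriptor, or None if nothing qualifies.
    """
    if not scored_descriptors:
        return None
    # Find the descriptor with the highest score.
    best, best_score = max(scored_descriptors.items(), key=lambda kv: kv[1])
    # Accept it only if it satisfies the threshold.
    return best if best_score >= threshold else None
```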
  • Patent number: 11176520
    Abstract: A method may include configuring a processor to monitor, in an application, composition of an electronic communication addressed to a second user from a first user, the electronic communication associated with a set of parameters; determine an intent of the electronic communication based on the set of parameters; search an associative data structure to retrieve content associated with the intent, the content previously transmitted to a third user from the first user or content(s) received from a fourth user(s); and present a suggestion in the application to include the retrieved content in the electronic communication.
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: November 16, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Manoj Ramakrishnan
  • Patent number: 11164562
    Abstract: A system for entity-level clarification in conversation services includes a memory having instructions therein. The system also includes at least one processor in communication with the memory. The at least one processor is configured to execute the instructions to receive a conversation services training example set, build an entity usage map using the conversation services training example set, receive a user utterance, and, responsive to a reception of the user utterance, generate a clarification response using the entity usage map. The at least one processor is also configured to execute the instructions to provide the clarification response to a user.
    Type: Grant
    Filed: January 10, 2019
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Carmine M. DiMascio, Donna K. Byron, Benjamin L. Johnson, Florian Pinel
  • Patent number: 11164561
    Abstract: A method and system for building a speech recognizer, and a speech recognition method and system are proposed. The method for building a speech recognizer includes: reading and parsing each grammar file, and building a network of each grammar; reading an acoustic syllable mapping relationship table, and deploying the network of each grammar as a syllable network; performing a merge minimization operation for each syllable network to form a sound element decoding network; forming the speech recognizer by using the sound element decoding network and a language model. The technical solutions of the present disclosure may be applied to exhibit strong extensibility, support an N-Gram language model, support a class model, present flexible use, and adapt for an embedded recognizer in a vehicle-mounted environment.
    Type: Grant
    Filed: August 19, 2019
    Date of Patent: November 2, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Zhijian Wang, Sheng Qian
  • Patent number: 11159685
    Abstract: A display control device includes a display section, a first receiving section, a second receiving section, and a performing section. The display section displays an object. The first receiving section receives non-voice input specifying a first operation on the object. The second receiving section receives voice input specifying a second operation on the object. The performing section performs, on the object, a complex operation specified by the non-voice input and the voice input.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: October 26, 2021
    Assignee: KYOCERA Document Solutions Inc.
    Inventors: Nobuto Fujita, Kenji Kiyose, Sumio Yamada, Takayuki Mashimo, Ryota Seike, Koji Kuroda
  • Patent number: 11151988
    Abstract: Techniques for implementing multiple wakeword detectors on a single device are described. A digital signal processor (DSP) of the device may implement a wakeword detection component to detect when captured speech includes a wakeword. A companion application installed on the device may implement a wakeword detection component trained using speech of a user of the device. If the DSP's wakeword detection component detects a wakeword in speech, the companion application's wakeword detection component may be used to determine whether the wakeword was spoken by the user of the device. If the companion application's wakeword detection component determines the user spoke the wakeword, audio data representing the speech may be sent to at least one server(s) for processing.
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: October 19, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Deepak Yavagal, Ajith Prabhakara, John Gray
  • Patent number: 11137978
    Abstract: An electronic device includes a processor and a memory. The memory may store instructions that cause the processor to display a user interface including items, receive a first user utterance while the user interface is displayed, wherein the first user utterance includes a first request for executing a first task by using at least one item, transmit first data related to the first user utterance to an external server, receive a first response from the external server, wherein the first response includes information on a first sequence of states of the electronic device for executing the first task and further includes numbers and locations of the items in the user interface, and execute the first task including an operation of allowing the application program to select the one or the plurality of items based on the numbers or the locations.
    Type: Grant
    Filed: April 27, 2018
    Date of Patent: October 5, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kwang Yong Lee, Jung Hoe Kim, Soo Bin Park, Kyoung Gu Woo, Seong Min Je
  • Patent number: 11120802
    Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream with the ASR process resulting in the spoken words to which a speaker turn detection (STD) process is applied to identify a number of speaker segments with each speaker segment ending at a word boundary. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.
    Type: Grant
    Filed: November 21, 2017
    Date of Patent: September 14, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kenneth W. Church, Dimitrios B. Dimitriadis, Petr Fousek, Miroslav Novak, George A. Saon
  • Patent number: 11113672
    Abstract: A system and method to provide computer support for a meeting of invitees comprises accessing one or more sensory data streams providing digitized sensory data responsive to an activity of one or more of the invitees during the meeting, the one or more sensory data streams including at least one audio stream. The method also comprises subjecting the at least one audio stream to phonetic and situational computer modeling to recognize a sequence of words in the audio stream and to assign each word to an invitee, subjecting the sequence of words to semantic computer modeling to recognize a sequence of directives in the sequence of words, and releasing one or more output data streams based on the sequence of directives, the one or more output data streams including one or more notifications.
    Type: Grant
    Filed: March 22, 2018
    Date of Patent: September 7, 2021
    Inventors: Robert Alexander Sim, Marcello Mendes Hasegawa, Ryen William White, Mudit Jain, Tomer Hermelin, Adi Gerzi Rosenthal, Sagi Hilleli
  • Patent number: 11114100
    Abstract: Methods, apparatus, and computer readable media are described related to automated assistants that proactively incorporate, into human-to-computer dialog sessions, unsolicited content of potential interest to a user. In various implementations, based on content of an existing human-to-computer dialog session between a user and an automated assistant, an entity mentioned by the user or automated assistant may be identified. Fact(s) related to the entity or to another entity that is related to the entity may be identified based on entity data contained in database(s). For each of the fact(s), a corresponding measure of potential interest to the user may be determined. Unsolicited natural language content may then be generated that includes one or more of the facts selected based on the corresponding measure(s) of potential interest. The automated assistant may then incorporate the unsolicited content into the existing human-to-computer dialog session or a subsequent human-to-computer dialog session.
    Type: Grant
    Filed: August 23, 2019
    Date of Patent: September 7, 2021
    Assignee: GOOGLE LLC
    Inventors: Vladimir Vuskovic, Stephan Wenger, Zineb Ait Bahajji, Martin Baeuml, Alexandru Dovlecel, Gleb Skobeltsyn
  • Patent number: 11113098
    Abstract: The present disclosure relates to the field of a multi-chip system, and provides an interrupt processing method, a master chip, a slave chip, and a multi-chip system. An interrupt processing method is applied to a master chip and includes: when an interrupt transport request sent by a slave chip through an interrupt line is detected, obtaining all current interrupt requests (irq_s_1-irq_s_N) of the slave chip, where the interrupt requests (irq_s_1-irq_s_N) are generated by a first peripheral (4) of the slave chip; obtaining an interrupt subroutine corresponding to each of the interrupt requests (irq_s_1-irq_s_N), and processing the corresponding interrupt request (irq_s_1-irq_s_N) by using the interrupt subroutine. In the embodiments of the present disclosure, all the interrupt requests (irq_s_1-irq_s_N) of the slave chip are mapped to the master chip, so that the interrupt processing flow of the peripheral on the slave chip is simplified.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: September 7, 2021
    Assignee: SHENZHEN GOODIX TECHNOLOGY CO., LTD.
    Inventors: Zhibing Liang, Yifan Li, Zekai Chen
  • Patent number: 11109104
    Abstract: Novel techniques are described for viewer compositing using media playback systems for enhanced media recommendation and consumption. For example, a display device can be in communication with a media recommendation and consumption compositor (MRCC) system. When a group of viewers desires a shared media consumption experience, the MRCC system can detect the group of viewers and can obtain respective viewer profiles, which can be used to generate a composite profile representing a composite of the group of viewers. The MRCC system can determine an available content space indicating the content available for consumption and can compute a content recommendation space as a function of the composite viewer profile and the available content space that defines recommended content options for the composited group of viewers. A recommendation interface can be output to indicate recommended content options for selecting and viewing.
    Type: Grant
    Filed: July 10, 2019
    Date of Patent: August 31, 2021
    Assignee: DISH Network L.L.C.
    Inventor: John Rishea