Speech To Image Patents (Class 704/235)
  • Patent number: 11030400
    Abstract: Methods, programming, and system for identifying one or more variable slots within an utterance are described herein. In a non-limiting embodiment, a first slot-value pair for an utterance may be obtained. The first slot-value pair may include a first slot and a first value associated with the first slot. The first slot may be of a first entity type, where an intent and a data object are estimated based on the utterance. A data structure representing the data object may be identified. Based on the intent, a first variable slot in the data structure associated with the first entity type may be determined, where the first variable slot may be associated with at least one of: multiple values and an adjustable value. Based on the intent, the first value may be assigned to the first variable slot in the data structure.
    Type: Grant
    Filed: February 22, 2018
    Date of Patent: June 8, 2021
    Assignee: Verizon Media Inc.
    Inventors: Prakhar Biyani, Cem Akkaya, Kostas Tsioutsiouliklis
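The slot-filling flow in the abstract above can be sketched roughly as follows. The `fill_variable_slot` helper, the intent-to-slot lookup table, and the meeting example are all illustrative assumptions, not taken from the patent.

```python
# Illustrative sketch of intent-driven variable-slot filling.
# All names and the tiny intent table are assumptions, not from the patent.

def fill_variable_slot(data_object, intent, slot_value_pair, variable_slots):
    """Assign the pair's value to the variable slot matching its entity type."""
    slot, value = slot_value_pair
    entity_type = slot["entity_type"]
    # Look up which field of the data structure is variable for this intent.
    candidates = variable_slots.get(intent, {})
    target = candidates.get(entity_type)
    if target is None:
        return data_object  # no variable slot for this entity type
    updated = dict(data_object)  # leave the original data object untouched
    updated[target] = value
    return updated

# Hypothetical example: a "change_date" intent on a meeting data object.
meeting = {"title": "standup", "date": "2021-06-01"}
variable_slots = {"change_date": {"DATE": "date"}}
pair = ({"name": "new_date", "entity_type": "DATE"}, "2021-06-08")
updated = fill_variable_slot(meeting, "change_date", pair, variable_slots)
```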
  • Patent number: 11030406
    Abstract: A method for expanding an initial ontology via processing of communication data, wherein the initial ontology is a structural representation of language elements comprising a set of entities, a set of terms, a set of term-entity associations, a set of entity-association rules, a set of abstract relations, and a set of relation instances. A method for extracting a set of significant phrases and a set of significant phrase co-occurrences from an input set of documents further includes utilizing the terms to identify relations within a training set of communication data, wherein a relation is a pair of terms that appear in proximity to one another.
    Type: Grant
    Filed: January 27, 2016
    Date of Patent: June 8, 2021
    Assignee: VERINT SYSTEMS LTD.
    Inventors: Daniel Baum, Uri Segal, Ron Wein, Oana Sidi
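The notion of a relation above, a pair of terms appearing in proximity to one another, can be sketched with a simple windowed co-occurrence counter. The window size, whitespace tokenization, and sample documents are illustrative assumptions.

```python
# Minimal sketch of relation extraction as term-pair proximity counting.
# Window size and tokenization are assumptions, not from the patent.

from collections import Counter

def extract_relations(documents, terms, window=3):
    """Count term pairs that co-occur within `window` tokens of each other."""
    relations = Counter()
    term_set = set(terms)
    for doc in documents:
        tokens = doc.lower().split()
        for i, tok in enumerate(tokens):
            if tok not in term_set:
                continue
            # Only look ahead, so each pair is counted once per occurrence.
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                other = tokens[j]
                if other in term_set and other != tok:
                    relations[tuple(sorted((tok, other)))] += 1
    return relations

docs = ["please reset my account password now",
        "the account was locked after a password change"]
rels = extract_relations(docs, ["account", "password"])
```

In the second document the two terms sit more than three tokens apart, so only the first document contributes a relation instance.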
  • Patent number: 11030789
    Abstract: The present invention relates to a method for generating and causing display of a communication interface that facilitates the sharing of emotions through the creation of 3D avatars, and more particularly with the creation of such interfaces for displaying 3D avatars for use with mobile devices, cloud based systems and the like.
    Type: Grant
    Filed: February 17, 2020
    Date of Patent: June 8, 2021
    Assignee: Snap Inc.
    Inventors: Jesse Chand, Jeremy Voss
  • Patent number: 11024310
    Abstract: Various techniques are described herein for supporting voice command control of electronic programming guides (EPGs) and other media content selection systems. The voice input hardware and software components of a remote control device, television receiver, smartphone, virtual assistant, and/or other media device may receive voice commands from a user corresponding to a selection of a media content. In response to the received voice input, the media device may perform a speech-to-text conversion of the voice input, and then perform an analysis of the command text to determine one or more content selections of the user. The analysis may include identifying within the command text one or more television channel names, program names, or other media content names, as well as identifying other instructions, preferences, or other meaningful insights from the command text.
    Type: Grant
    Filed: April 12, 2019
    Date of Patent: June 1, 2021
    Assignee: Sling Media Pvt. Ltd.
    Inventors: Soham Sahabhaumik, Karthik Mahabaleshwar Hegde, Amrit Mishra, Yatish Jayant Naik Raikar
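The command-text analysis step described above, identifying known channel and program names within the converted text, might look like the following sketch. The name lists and the plain substring matching are illustrative assumptions; a real system would use fuzzier matching.

```python
# Rough sketch of post-speech-to-text command analysis for an EPG.
# Channel/program lists and matching strategy are invented for illustration.

def analyze_command(text, channels, programs):
    """Find known channel and program names mentioned in the command text."""
    lowered = text.lower()
    return {"channels": [c for c in channels if c.lower() in lowered],
            "programs": [p for p in programs if p.lower() in lowered]}

result = analyze_command("switch to ESPN and record Monday Night Football",
                         channels=["ESPN", "HBO"],
                         programs=["Monday Night Football"])
```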
  • Patent number: 11023931
    Abstract: Disclosed is a method of receiving an audio stream containing user speech from a first device, generating text based on the user speech, identifying a key phrase in the text, receiving from an advertiser an advertisement related to the identified key phrase, and displaying the advertisement. The method can include receiving from an advertiser a set of rules associated with the advertisement and displaying the advertisement in accordance with the associated set of rules. The method can display the advertisement on one or both of a first device and a second device. A central server can generate text based on the speech. A key phrase in the text can be identified based on a confidence score threshold. The advertisement can be displayed after the audio stream terminates.
    Type: Grant
    Filed: September 26, 2018
    Date of Patent: June 1, 2021
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Patrick Jason Morrison
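The confidence-score threshold mentioned in the abstract can be sketched as a simple filter over recognizer output; the phrases, scores, and threshold value here are made up for illustration.

```python
# Sketch of key-phrase selection gated by a recognition-confidence threshold.
# Scores and the 0.8 default are illustrative assumptions.

def pick_key_phrases(scored_phrases, threshold=0.8):
    """Keep only phrases whose recognition confidence clears the threshold."""
    return [phrase for phrase, score in scored_phrases if score >= threshold]

phrases = [("pizza delivery", 0.92), ("uh", 0.3), ("car insurance", 0.85)]
keys = pick_key_phrases(phrases)
```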
  • Patent number: 11024299
    Abstract: Systems, methods, and computer-readable media are disclosed for providing privacy and intent preserving redactions of text derived from utterance data. Certain embodiments provide new techniques for using MadLib-style replacements to replace one or more terms or phrases in a text string. Example methods may include receiving utterance data and determining a public portion and a private portion of the utterance data. Certain methods include determining a cluster of candidates having a same semantic context as the private portion and identifying from within the cluster of candidates a first candidate. Certain methods include determining a redacted utterance comprising the public portion of the utterance and the first candidate. Certain methods include providing the redacted utterance to downstream systems and processes.
    Type: Grant
    Filed: September 26, 2018
    Date of Patent: June 1, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Thomas Drake, Oluwaseyi Feyisetan, Borja de Balle Pigem, Tom Diethe
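The MadLib-style replacement idea can be sketched as follows: the private portion of an utterance is swapped for another member of a cluster of semantically similar candidates. The cluster contents and the selection rule (first non-matching candidate) are assumptions, not the patent's method for building or choosing from clusters.

```python
# Hedged sketch of MadLib-style redaction: replace a private term with a
# same-context candidate. Clusters and selection rule are illustrative.

def redact(utterance, private_term, clusters):
    """Swap the private term for another candidate from its semantic cluster."""
    for cluster in clusters:
        if private_term in cluster:
            candidate = next(c for c in cluster if c != private_term)
            return utterance.replace(private_term, candidate)
    return utterance  # no cluster found: leave the utterance unchanged

clusters = [["Seattle", "Portland", "Denver"]]
redacted = redact("book a flight to Seattle tomorrow", "Seattle", clusters)
```

The redacted utterance preserves the public portion ("book a flight to ... tomorrow") and hence the intent, while hiding the private destination.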
  • Patent number: 10997606
    Abstract: Systems and methods are provided herein for autonomously determining and resolving a customer's perceived discrepancy during a customer service interaction. The method can include receiving an incoming communication from a customer; extracting, by a Natural Language Processing (NLP) device, a perceived state and an expected state of a product or service based on the incoming communication; determining, by a discrepancy determination device, a discrepancy between the perceived and expected state of the product or service; verifying, by a rule-based platform, the discrepancy; generating a response based on the discrepancy, the response comprising one or more of: a fact pattern response related to the perceived discrepancy and a confirmation or correction of a verified discrepancy; and outputting, for presentation to the customer, the response.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: May 4, 2021
    Assignee: CAPITAL ONE SERVICES, LLC
    Inventors: Alexandra Coman, Erik Mueller
  • Patent number: 10997968
    Abstract: Described herein is a mechanism for improving the accuracy of a language model interpreting short input utterances. A language model operates in a stateless manner, only ascertaining the intents and/or entities associated with a presented input utterance. To increase the accuracy, two language understanding models are trained. One is trained using only input utterances. The second is trained using input utterance-prior dialog context pairs. The prior dialog context is previous intents and/or entities already determined from the utterances in prior turns of the dialog. When input is received, the language understanding model decides whether the input comprises only an utterance or an utterance and prior dialog context. The appropriate trained machine learning model is selected and the intents and/or entities associated with the input determined by the selected machine learning model.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: May 4, 2021
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Nayer Mahmoud Wanas, Riham Hassan Abdel Moneim Mansour, Kareem Saied Abdelhamid Yousef, Youssef Shahin, Carol Ishak Girgis Hanna, Basma Ayman Mohammed Mohammed Emara
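The routing step described above, choosing between the utterance-only model and the utterance-plus-context model, can be sketched as below. Both "models" are stand-in callables that merely label which path ran; the real trained models are not reproduced here.

```python
# Sketch of selecting between two trained language understanding models
# based on whether prior dialog context accompanies the input.
# The stand-in models are illustrative assumptions.

def classify(utterance, prior_context=None,
             utterance_model=None, context_model=None):
    """Route to the context-aware model only when prior context is present."""
    if prior_context:
        return context_model((utterance, prior_context))
    return utterance_model(utterance)

# Stand-in models that just report which path handled the input.
plain = lambda u: ("utterance-only", u)
contextual = lambda pair: ("with-context", pair[0])

# A short follow-up utterance like "yes" is ambiguous without context,
# so it is paired with the intents found in earlier dialog turns.
short = classify("yes", prior_context=["intent:book_flight"],
                 utterance_model=plain, context_model=contextual)
```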
  • Patent number: 10991374
    Abstract: A voice control method, a voice control device, a computer readable storage medium, and a computer device are disclosed. The voice control method comprises: receiving, in an instruction receiving state, a voice instruction of a user for a specific operation; performing voice processing on the voice instruction to obtain voice information; transmitting, to the user, a request to confirm the voice information; receiving, from the user, a response to the request; and performing the specific operation if the response confirms that the voice information is correct.
    Type: Grant
    Filed: November 1, 2018
    Date of Patent: April 27, 2021
    Assignee: BOE TECHNOLOGY GROUP CO., LTD.
    Inventors: Ruibin Xue, Xin Li, Shuai Yang
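The confirm-before-execute flow above amounts to a tiny state machine: recognize, echo back for confirmation, and act only on a positive response. The recognizer and executor here are stubs; only the control flow comes from the abstract.

```python
# Sketch of the voice-control confirmation loop. The prompt wording and
# the stub executor are illustrative assumptions.

def voice_control(recognized_text, confirm_response, execute):
    """Ask the user to confirm recognized text; run the operation only on yes."""
    request = f"Did you say: '{recognized_text}'?"
    if confirm_response == "yes":
        return request, execute(recognized_text)
    return request, None  # user rejected the transcription: do nothing

ran = []
request, result = voice_control("turn on the lights", "yes",
                                execute=lambda cmd: ran.append(cmd) or "done")
```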
  • Patent number: 10984339
    Abstract: A method for building a factual database of concepts and entities that are related to the concepts through a learning process. Training content (e.g., news articles, books) and a set of entities (e.g., Bill Clinton and Barack Obama) that are related to a concept (e.g., Presidents) is received. Groups of words that co-occur frequently in the textual content in conjunction with the entities are identified as templates. Templates may also be identified by analyzing parts-of-speech patterns of the templates. Entities that co-occur frequently in the textual content in conjunction with the templates are identified as additional related entities (e.g., Ronald Reagan and Richard Nixon). To eliminate erroneous results, the identified entities may be presented to a user who removes any false positives. The entities are then stored in association with the concept.
    Type: Grant
    Filed: December 16, 2016
    Date of Patent: April 20, 2021
    Assignee: Verizon Media Inc.
    Inventors: Amit R. Kapur, Steven F. Pearman, James R. Benedetto
  • Patent number: 10984802
    Abstract: A system for determining identity based on a voiceprint and a voice password, and a method thereof, are disclosed. In the method, after a voice signal is received, the judgment result for the voiceprint of the voice signal and the judgment result for the content of the voice signal are used together to determine whether verification passes. This technical solution can confirm that the voice, identified based on its voiceprint, is produced by a real person, so as to improve the security of identity determination.
    Type: Grant
    Filed: August 7, 2018
    Date of Patent: April 20, 2021
    Assignees: INVENTEC (PUDONG) TECHNOLOGY CORPORATION, INVENTEC CORPORATION
    Inventor: Chaucer Chiu
  • Patent number: 10984198
    Abstract: Methods, systems and computer program products for automated testing of dialog systems are provided herein. A computer-implemented method includes receiving selection of a conversation workspace of the automated dialog system and identifying test case inputs to the automated dialog system, the test case inputs comprising example user input for the given conversation workspace that has portions thereof modified and which the automated dialog system maps to a different intent and/or a different entity relative to the example user input. The method further includes generating human-interpretable explanations of mappings of portions of the test case inputs to the different intent and/or entity, generating suggestions for modifying intents, entities and dialog flows of the given conversation workspace such that the test case inputs map to the same intent and/or the same entity as their corresponding example user input, and outputting the suggestions and the human-interpretable explanations to a user.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: April 20, 2021
    Assignee: International Business Machines Corporation
    Inventors: Arpan Losalka, Diptikalyan Saha
  • Patent number: 10977578
    Abstract: A conversation processing method and apparatus based on artificial intelligence, a device, and a computer-readable storage medium are disclosed. The disclosed embodiments collect user feedback information from a conversation service conducted between the user and a model conversation understanding system, then adjust a service state of the model conversation understanding system according to the user feedback information to obtain an adjustment state, so that subsequent conversation services can be executed with the model conversation understanding system based on the adjustment state.
    Type: Grant
    Filed: June 12, 2018
    Date of Patent: April 13, 2021
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Ke Sun, Shiqi Zhao, Dianhai Yu, Haifeng Wang
  • Patent number: 10978047
    Abstract: Embodiments of methods and apparatuses for recognizing a speech are provided. An implementation can include: determining an identity of a target user inputting the speech input signal; extracting a common expression set of the target user from a stored common expression database, the common expression set including a plurality of common expressions; extracting an acoustic feature of the speech input signal and inputting it into an acoustic model to obtain an acoustic model score; judging whether a content of the speech input signal is a common expression of the target user based on the acoustic model score of the speech input signal and acoustic model scores of the common expressions in the stored common expression set of the target user; and if yes, decoding the acoustic feature of the speech input signal using a language model constructed based on common expressions, to obtain a speech recognition result.
    Type: Grant
    Filed: January 22, 2019
    Date of Patent: April 13, 2021
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventor: Chao Tian
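The judging step above, deciding whether an utterance is one of the user's common expressions by comparing acoustic model scores, can be sketched as a tolerance check. The scoring scale, tolerance value, and sample expressions are illustrative assumptions; the patent does not specify this comparison rule.

```python
# Sketch of routing to a common-expression language model when the
# utterance's acoustic score is close to a stored expression's score.
# Scores and tolerance are invented for illustration.

def is_common_expression(utterance_score, common_scores, tolerance=0.05):
    """True if the score is within `tolerance` of any stored common expression."""
    return any(abs(utterance_score - s) <= tolerance for s in common_scores)

common = {"turn on the tv": 0.91, "what's the weather": 0.88}
use_common_lm = is_common_expression(0.90, common.values())
```

When the check passes, decoding would proceed with the smaller language model built from the user's common expressions, which is cheaper and typically more accurate for those phrases.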
  • Patent number: 10970646
    Abstract: Systems and methods are provided for suggesting actions for selected text based on content displayed on a mobile device. An example method can include converting a selection made via a display device into a query, providing the query to an action suggestion model that is trained to predict an action given a query, each action being associated with a mobile application, receiving one or more predicted actions, and initiating display of the one or more predicted actions on the display device. Another example method can include identifying, from search records, queries where a website is highly ranked, the website being one of a plurality of websites in a mapping of websites to mobile applications. The method can also include generating positive training examples for an action suggestion model from the identified queries, and training the action suggestion model using the positive training examples.
    Type: Grant
    Filed: October 1, 2015
    Date of Patent: April 6, 2021
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Daniel Ramage, David Petrou
  • Patent number: 10970040
    Abstract: The present disclosure is directed to systems and methods for the creation of a localized audio message for use in a personal audio device. The system includes: a database of information relating to a pre-determined subject obtained from online media content; one or more processors; and a personal audio device configured to receive a localized audio message. The processors extract a dataset comprising information relating to a pre-determined subject from online media content; generate one or more summaries of the information relating to the pre-determined subject; generate a localized audio message based on the one or more summaries; and send the localized audio message to a personal audio device of a user.
    Type: Grant
    Filed: March 1, 2019
    Date of Patent: April 6, 2021
    Assignee: Bose Corporation
    Inventors: Elizabeth Nielsen, Elio Dante Querze, Marko Orescanin, Naganagouda B. Patil, Marina Shah, Vijayan P. Sarathy, Shanthi Chandrababu, Shuo Zhang, Isaac Julien
  • Patent number: 10964318
    Abstract: A system and method to receive a spoken utterance and convert the spoken utterance into a recognized speech result through an automatic speech recognition service. The recognized speech result is interpreted through a natural language processing module. A normalizer processes the recognized speech result, transforming the recognized speech interpretations into a predefined form for a given automatic speech recognition domain, and further determines whether the recognized speech results for given automatic speech recognition domains are processed by a dedicated dialogue management proxy module or a conversation module.
    Type: Grant
    Filed: January 4, 2018
    Date of Patent: March 30, 2021
    Assignee: BlackBerry Limited
    Inventor: Darrin Kenneth John Fry
  • Patent number: 10965811
    Abstract: A conversation may be monitored in real time using a trained machine learning model. This real-time monitoring may detect attributes of a conversation, such as a conversation type, a state of a conversation, as well as other attributes that help specify a context of a conversation. Contextually appropriate behavioral targets may be provided by the machine learning model to an agent participating in a conversation. In some embodiments, these "behavioral targets" are identified by applying a set of rules to the contemporaneously identified conversation attributes. The behavioral targets may be defined in advance prior to the start of a conversation. In this way, the machine learning model may be trained to associate particular behavioral target(s) with one or more conversation attributes (or collections of attributes). This facilitates the real-time monitoring of a conversation and contemporaneous guidance of an agent with machine-identified behavioral targets.
    Type: Grant
    Filed: July 31, 2020
    Date of Patent: March 30, 2021
    Assignee: CRESTA INTELLIGENCE INC.
    Inventors: Tianlin Shi, Peter Elliot Schmidt-Nielsen, Navjot Matharu, Alexander Donald Roe, JungHa Lee, Syed Zayd Enam
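The rules step above, mapping contemporaneously detected conversation attributes to predefined behavioral targets, can be sketched as a match over a rule table. The rule table and attribute names are invented for illustration.

```python
# Sketch of rule-based behavioral-target selection from detected
# conversation attributes. The rule table is an illustrative assumption.

def behavioral_targets(attributes, rules):
    """Return every target whose required attributes are all present."""
    detected = set(attributes)
    return [target for required, target in rules
            if set(required) <= detected]

rules = [(("sales", "opening"), "build rapport"),
         (("sales", "objection"), "acknowledge concern"),
         (("support",), "confirm issue details")]

# Attributes as identified in real time by the monitoring model.
targets = behavioral_targets({"sales", "objection"}, rules)
```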
  • Patent number: 10956481
    Abstract: Described herein are technologies that facilitate effective use (e.g., indexing and searching) of non-text machine data (e.g., audio/visual data) in an event-based machine-data intake and query system.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: March 23, 2021
    Assignee: SPLUNK Inc.
    Inventor: Adam Oliner
  • Patent number: 10957310
    Abstract: The technology disclosed relates to authoring of vertical applications of natural language understanding (NLU), which analyze text or utterances and construct their meaning. In particular, it relates to new programming constructs and tools and data structures implementing those new applications.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: March 23, 2021
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Seyed Majid Emami, Chris Wilson, Bernard Mont-Reynaud
  • Patent number: 10957305
    Abstract: An information processing method and an electronic device are provided. The method includes: obtaining audio data collected by a slave device; obtaining contextual data corresponding to the slave device; and obtaining a recognition result of recognizing the audio data based on the contextual data. The contextual data characterizes a voice environment of the audio data collected by the slave device.
    Type: Grant
    Filed: June 16, 2017
    Date of Patent: March 23, 2021
    Assignee: LENOVO (BEIJING) CO., LTD.
    Inventor: Weixing Shi
  • Patent number: 10949626
    Abstract: The present disclosure provides a global simultaneous interpretation method and a product thereof. The method includes the following steps: receiving, by a smart phone, a calling request sent by a terminal, connecting the calling request, and establishing a calling connection; receiving, by the smart phone, first voice information transmitted through the calling connection, and, when the first voice information is identified and determined to be in a non-specified language, translating the first voice information into second voice information in a specified language; and playing, by the smart phone, the second voice information through a speaker device.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: March 16, 2021
    Assignee: WING TAK LEE SILICONE RUBBER TECHNOLOGY (SHENZHEN) CO., LTD
    Inventor: Tak Nam Liu
  • Patent number: 10950228
    Abstract: Methods and systems for receiving shouted-out user responses to broadcast entertainment content, and for determining the responsiveness of those responses in relation to the broadcast content. In particular, entertainment broadcasts can be accompanied by mark-up data that represents various events within a given broadcast, which can be compared to the shouted-out responses to determine their accuracy. For example, if a game show was broadcast and an individual started shouting out answers during the broadcast, embodiments disclosed herein could utilize a voice-controlled electronic device that captures the shouted-out answers and passes them on to a language processing system that determines whether they are correct by comparing the answers to the mark-up data.
    Type: Grant
    Filed: June 28, 2017
    Date of Patent: March 16, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Alfred Yong-Hock Tan, Matthew Luker, David Allen Markley
  • Patent number: 10943590
    Abstract: The present invention relates to a washing machine and a server system that recommend a laundry course and washing tip information in consideration of information on the kind of laundry and the degree of contamination input by a user through artificial-intelligence speech recognition, and to a method for controlling such a washing machine and server system. The present invention extracts a cloth-word indicating the kind of clothes in the laundry and a stain-word indicating a kind of contaminant, then determines a laundry course in consideration of the cloth-word. Washing tip information on the stain-word is searched in a pre-stored database, and the found washing tip information is provided to the user. As a result, the user can be notified of a washing method capable of effectively removing the contaminant.
    Type: Grant
    Filed: July 12, 2019
    Date of Patent: March 9, 2021
    Assignee: LG Electronics Inc.
    Inventor: Heungkyu Lee
  • Patent number: 10937414
    Abstract: Systems and methods for text input based on neuromuscular information. The system includes a plurality of neuromuscular sensors, arranged on one or more wearable devices, wherein the plurality of neuromuscular sensors is configured to continuously record a plurality of neuromuscular signals from a user, at least one storage device configured to store one or more trained statistical models, and at least one computer processor programmed to obtain the plurality of neuromuscular signals from the plurality of neuromuscular sensors, provide as input to the one or more trained statistical models, the plurality of neuromuscular signals or signals derived from the plurality of neuromuscular signals, and determine based, at least in part, on an output of the one or more trained statistical models, one or more linguistic tokens.
    Type: Grant
    Filed: May 8, 2018
    Date of Patent: March 2, 2021
    Assignee: Facebook Technologies, LLC
    Inventors: Adam Berenzweig, Alan Huan Du, Jeffrey Scott Seely
  • Patent number: 10936936
    Abstract: A system and method of configuring a graphical control structure for controlling a machine learning-based automated dialogue system includes configuring a root dialogue classification node that performs a dialogue intent classification task for utterance data input; configuring a plurality of distinct dialogue state classification nodes that are arranged downstream of the root dialogue classification node; configuring a graphical edge connection between the root dialogue classification node and the plurality of distinct dialogue state classification nodes that graphically connects each of the plurality of distinct dialogue state classification nodes to the root dialogue classification node, wherein (i) the root dialogue classification node, (ii) the plurality of distinct classification nodes, and (iii) the transition edge connections define a graphical dialogue system control structure that governs an active dialogue between a user and the machine learning-based automated dialogue system.
    Type: Grant
    Filed: November 13, 2019
    Date of Patent: March 2, 2021
    Assignee: Clinc, Inc.
    Inventors: Parker Hill, Jason Mars, Lingjia Tang, Michael A. Laurenzano, Johann Hauswald, Yiping Kang, Yunqi Zhang
  • Patent number: 10936812
    Abstract: An approach is provided that receives words that are input by a user of an application with the words being displayed on a display device. Each of the words are compared to words from a dictionary. Based on the comparisons, words that are not found in the dictionary and only appear a single time are highlighted as being misspelled words. However, words that are not in the dictionary but appear multiple times in the document are highlighted differently to indicate that these words are possible misspelled words with the difference in highlighting allowing the user to easily discern between misspelled and possibly misspelled words.
    Type: Grant
    Filed: January 10, 2019
    Date of Patent: March 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kyle M. Brake, Stephen A. Boxwell, Stanley J. Vernier, Keith G. Frost
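The two-tier highlighting above can be sketched as follows: out-of-dictionary words occurring once are flagged as misspelled, while repeated out-of-dictionary words (likely intentional, e.g. product names) are flagged only as possibly misspelled. The tokenization and sample dictionary are simplified assumptions.

```python
# Sketch of single-occurrence vs. repeated out-of-dictionary word
# classification. Tokenization and the dictionary are illustrative.

from collections import Counter

def classify_words(text, dictionary):
    """Split out-of-dictionary words by whether they appear once or repeatedly."""
    counts = Counter(text.lower().split())
    misspelled, possible = [], []
    for word, n in counts.items():
        if word in dictionary:
            continue
        (misspelled if n == 1 else possible).append(word)
    return misspelled, possible

dictionary = {"the", "report", "covers", "teh".replace("teh", "results"),
              "for", "and"}
missed, maybe = classify_words(
    "the report covers teh results for blorptech and blorptech", dictionary)
```

Here "teh" appears once and is marked misspelled, while the repeated "blorptech" is marked only as possibly misspelled, so a reader can tell the two highlight levels apart.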
  • Patent number: 10930288
    Abstract: Aspects of the disclosure provide systems and methods for facilitating dictation. Speech input may be provided to an audio input device of a computing device. A speech recognition engine at the computing device may obtain text corresponding to the speech input. The computing device may transmit the text to a remotely-located storage device. A login webpage that includes a session identifier may be accessed from a target computing device also located remotely relative to the storage device. The session identifier may be transmitted to the storage device and, in response, a text display webpage may be received at the target computing device. The text display webpage may include the speech-derived text and may be configured to automatically copy the text to a copy buffer of the target computing device. The speech-derived text may also be provided to native applications at target computing devices or NLU engines for natural language processing.
    Type: Grant
    Filed: April 7, 2020
    Date of Patent: February 23, 2021
    Assignee: Nuance Communications, Inc.
    Inventors: Markus Vogel, Andreas Neubacher
  • Patent number: 10932098
    Abstract: A wireless access point supports media conferencing for wireless User Equipment (UE). The wireless access point wirelessly exchanges timing signaling with the wireless UE to synchronize the wireless UE. After the wireless UE is synchronized, the wireless access point wirelessly exchanges connect signaling with the wireless UE to receive an Establishment Cause and a Wireless Network Identifier from the wireless UE. The wireless access point selects a media conferencing Mobility Management Entity (MME) when the Establishment Cause is associated with the media conferencing MME. The wireless access point selects a data MME based on the Wireless Network Identifier when the Establishment Cause is not associated with the media conferencing MME. The wireless access point exchanges network signaling with the data MME or the media conferencing MME. The wireless access point wirelessly exchanges user data with the wireless UE under control of the data MME or the media conferencing MME.
    Type: Grant
    Filed: February 15, 2018
    Date of Patent: February 23, 2021
    Assignee: Sprint Communications Company L.P.
    Inventor: Rajil Malhotra
  • Patent number: 10930270
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing audio waveforms. In some implementations, a time-frequency feature representation is generated based on audio data. The time-frequency feature representation is input to an acoustic model comprising a trained artificial neural network. The trained artificial neural network comprises a frequency convolution layer, a memory layer, and one or more hidden layers. An output that is based on output of the trained artificial neural network is received. A transcription is provided, where the transcription is determined based on the output of the acoustic model.
    Type: Grant
    Filed: August 15, 2019
    Date of Patent: February 23, 2021
    Assignee: Google LLC
    Inventors: Tara N. Sainath, Ron J. Weiss, Andrew W. Senior, Kevin William Wilson
  • Patent number: 10930263
    Abstract: This disclosure describes techniques for replicating characteristics of an actor's or actress's voice across different languages. The disclosed techniques have the practical application of enabling automatic generation of dubbed video content for multiple languages, with particular speakers in each dubbing having the same voice characteristics as the corresponding speakers in the original version of the video content.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: February 23, 2021
    Assignee: Amazon Technologies, Inc.
    Inventor: Hooman Mahyar
  • Patent number: 10930272
    Abstract: A technique for semantic search and retrieval that is event-based, wherein an event is composed of a sequence of observations that are user speech or physical actions. Using a first set of conversations, a machine learning model is trained against groupings of utterances therein to generate a speech act classifier. Observation sequences therein are organized into groupings of events and configured for subsequent event recognition. A set of second (unannotated) conversations is then received. The set of second conversations is evaluated using the speech act classifier and information retrieved from the event recognition to generate event-level metadata that comprises, for each utterance or physical action within an event, one or more associated tags. In response to a query, a search is performed against the metadata. Because the metadata is derived from event recognition, the search is performed against events learned from the set of first conversations.
    Type: Grant
    Filed: October 15, 2020
    Date of Patent: February 23, 2021
    Assignee: Drift.com, Inc.
    Inventors: Jeffrey D. Orkin, Christopher M. Ward, Elias Torres
  • Patent number: 10929009
    Abstract: An electronic device is provided. The electronic device includes a housing, a touch screen display that includes a first edge and a second edge, a microphone, at least one speaker, a wireless communication circuit, a memory, and a processor operably connected with the touch screen display, the microphone, the at least one speaker, the wireless communication circuit, and the memory. The processor is configured to output a home screen including a plurality of application icons in a matrix pattern. The processor is configured to receive an input from the first edge to the second edge. The processor is configured to output a user interface on the touch screen display that includes a plurality of cards and a button that allows a user to call a first operation. To call the first operation, the processor is configured to receive a user input, transmit data and receive a response, and perform a task.
    Type: Grant
    Filed: April 30, 2018
    Date of Patent: February 23, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Young Seok Lim, Hong Seok Kwon, Ho Min Moon, Mi Jung Park, Woo Young Park, Ki Hyoung Son, Won Ick Ahn, Pil Seung Yang, Jae Seok Yoon, Gi Soo Lee, Sun Jung Lee, Jae Hyeok Lee, Hyun Yeul Lee, Hyeon Cheon Jo, Doo Soon Choi, Kyung Wha Hong, Da Som Lee, Yong Joon Jeon
  • Patent number: 10923106
    Abstract: An audio synthesis method adapted to video characteristics is provided. The audio synthesis method according to an embodiment includes: extracting characteristics x from a video in a time-series way; extracting characteristics p of phonemes from a text; and generating an audio spectrum characteristic St used to generate an audio to be synthesized with a video at a time t, based on correlations between an audio spectrum characteristic St-1, which is used to generate an audio to be synthesized with a video at a time t-1, and the characteristics x. Accordingly, an audio can be synthesized according to video characteristics, and speech according to a video can be easily added.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: February 16, 2021
    Assignee: Korea Electronics Technology Institute
    Inventors: Jong Yeol Yang, Young Han Lee, Choong Sang Cho, Hye Dong Jung
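    The correlation step in this abstract resembles attention: the previous audio spectrum feature queries the time-series video features, and the weighted video context is combined with the phoneme feature. The following is a toy sketch under that reading; the function names, shapes, and the tanh combination are assumptions, not the patented model.

    ```python
    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def next_spectrum(s_prev, x, p, W):
        """s_prev: (d,) previous spectrum feature; x: (T, d) video
        features; p: (d,) phoneme feature; W: (d, d) toy projection."""
        scores = x @ s_prev               # correlation of s_{t-1} with each frame
        weights = softmax(scores)         # attention weights over video frames
        context = weights @ x             # weighted video context
        return np.tanh(W @ (context + p))  # toy combination into s_t

    rng = np.random.default_rng(0)
    s_t = next_spectrum(rng.normal(size=4), rng.normal(size=(3, 4)),
                        rng.normal(size=4), rng.normal(size=(4, 4)))
    print(s_t.shape)  # (4,)
    ```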
  • Patent number: 10924605
    Abstract: Systems and methods for providing and facilitating multi-mode communication are disclosed. Users may initiate, receive and/or respond to messages and message notifications on a computing device using multi-mode interactions executed through either a device display or a wearable device such as a headset with enhanced functionality. Contextual prompts guide the user interaction with the computing device using on-board or remote voice recognition text-to-speech and speech-to-text processing and playback. Voice and text data are packaged and transmitted to the network.
    Type: Grant
    Filed: January 11, 2018
    Date of Patent: February 16, 2021
    Assignee: ONVOCAL, INC.
    Inventors: William Wang Graylin, Bogdan Sima, Pichrachana Sun, Andrew Molloy
  • Patent number: 10923118
    Abstract: An audio input method includes: in an audio-input mode, receiving a first audio input by a user, recognizing the first audio to generate a first recognition result, and displaying corresponding verbal content to the user based on the first recognition result; and in an editing mode, receiving a second audio input by the user, recognizing the second audio to generate a second recognition result, converting the second recognition result to an editing instruction, and executing a corresponding operation based on the editing instruction. The audio-input mode and the editing mode are switchable.
    Type: Grant
    Filed: November 17, 2016
    Date of Patent: February 16, 2021
    Assignee: BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO., LTD.
    Inventors: Liping Li, Suhang Wang, Congxian Yan, Lei Yang, Min Liu, Hong Zhao, Jia Yao
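    The two switchable modes can be sketched as a small state machine: input mode appends recognized text, while editing mode maps a recognition result to an edit instruction. The spoken command phrases here are illustrative assumptions.

    ```python
    # Minimal sketch of the audio-input / editing mode flow described above.

    class AudioInput:
        def __init__(self):
            self.mode = "input"
            self.text = []

        def handle(self, recognized):
            if self.mode == "input":
                if recognized == "switch to editing":
                    self.mode = "edit"
                else:
                    self.text.append(recognized)
            else:  # editing mode: convert the result to an editing instruction
                if recognized == "delete last":
                    if self.text:
                        self.text.pop()
                elif recognized == "switch to input":
                    self.mode = "input"

    ai = AudioInput()
    for utterance in ["hello world", "switch to editing", "delete last"]:
        ai.handle(utterance)
    print(ai.text, ai.mode)  # [] edit
    ```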
  • Patent number: 10917519
    Abstract: A method and system to transcribe communications, the method comprising the steps of: obtaining an audio message originating at a first device during a voice communication session between the first device and a second device; providing the audio message to a first speech recognition system to generate a first transcript of the audio message; directing the first transcript to the second device; in response to obtaining an indication that a quality of the first transcript is below a quality threshold, using a second speech recognition system to generate a second transcript based on the audio message while continuing to provide the audio data to the first speech recognition system to generate the first transcript; and, in response to occurrence of an event that indicates the second transcript is to be directed to the second device, directing the second transcript to the second device instead of the first transcript.
    Type: Grant
    Filed: August 9, 2019
    Date of Patent: February 9, 2021
    Assignee: Ultratec, Inc.
    Inventors: Robert M. Engelke, Kevin R. Colwell, Christopher Engelke, Robert P Leistiko
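    The fallback logic above reduces to a threshold-gated switch between two recognizers. This sketch assumes the quality score and switch event arrive as inputs; the threshold value is an illustrative placeholder.

    ```python
    # Sketch: use the first recognizer's transcript until its quality drops
    # below a threshold and a switch event fires, then direct the second
    # recognizer's transcript to the receiving device instead.

    QUALITY_THRESHOLD = 0.8

    def select_transcript(first, second, first_quality, switch_event):
        """Return the transcript to direct to the second device."""
        if first_quality < QUALITY_THRESHOLD and switch_event:
            return second      # second system takes over
        return first           # default: first system's transcript

    print(select_transcript("helo wrld", "hello world", 0.6,
                            switch_event=True))  # hello world
    ```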
  • Patent number: 10916249
    Abstract: A method of processing a speech signal for speaker recognition in an electronic apparatus includes: obtaining a speech signal of a first user; extracting a speech feature comprising a feature value from the speech signal; comparing the speech feature extracted from the speech signal of the first user with a predetermined reference value; selecting a first user feature that corresponds to the speech feature of the first user compared with the reference value; generating a recommended phrase used for speaker recognition based on the first user feature; and outputting the recommended phrase.
    Type: Grant
    Filed: February 1, 2019
    Date of Patent: February 9, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Youngho Han, Keunseok Cho, Jaeyoung Roh, Namhoon Kim, Chiyoun Park, Jongyoub Ryu
  • Patent number: 10917607
    Abstract: This disclosure describes techniques that include modifying text associated with a sequence of images or a video sequence to thereby generate new text and overlaying the new text as captions in the video sequence. In one example, this disclosure describes a method that includes receiving a sequence of images associated with a scene occurring over a time period; receiving audio data of speech uttered during the time period; transcribing into text the audio data of the speech, wherein the text includes a sequence of original words; associating a timestamp with each of the original words during the time period; generating, responsive to input, a sequence of new words; and generating a new sequence of images by overlaying each of the new words on one or more of the images.
    Type: Grant
    Filed: October 14, 2019
    Date of Patent: February 9, 2021
    Assignee: Facebook Technologies, LLC
    Inventors: Vincent Charles Cheung, Marc Layne Hemeon, Nipun Mathur
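    Because each original word carries a timestamp, a replacement word can inherit the timing of the word it replaces and be overlaid on the right frames. The sketch below assumes a simple one-to-one word alignment; the patent covers more general edits.

    ```python
    # Sketch: align replacement caption words to the original words'
    # timestamps for overlay on the image sequence.

    def align_captions(original, new_words):
        """original: list of (word, timestamp); new_words: same-length list."""
        return [(new, ts) for (_, ts), new in zip(original, new_words)]

    original = [("hi", 0.0), ("there", 0.4)]
    print(align_captions(original, ["hello", "friend"]))
    # [('hello', 0.0), ('friend', 0.4)]
    ```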
  • Patent number: 10909162
    Abstract: Described herein are technologies that facilitate effective use (e.g., indexing and searching) of non-text machine data (e.g., audio/visual data) in an event-based machine-data intake and query system.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: February 2, 2021
    Assignee: SPLUNK Inc.
    Inventor: Adam Oliner
  • Patent number: 10910001
    Abstract: A voice recognition device including: a recognizer which recognizes a movement of a mouth of an utterer; a detector which detects noise in the sound around the device; and a controller which controls a voice recognition timing based on the movement of the mouth of the utterer recognized by the recognizer and the noise detected by the detector.
    Type: Grant
    Filed: December 23, 2018
    Date of Patent: February 2, 2021
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Keisuke Shimada
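    The controller's timing decision can be caricatured as a gate over the two signals: recognize only while the mouth is moving and ambient noise is low. The noise threshold is an illustrative assumption.

    ```python
    # Toy sketch of the recognition-timing control: combine the mouth-movement
    # recognizer's output with the noise detector's level.

    NOISE_LIMIT = 0.3

    def should_recognize(mouth_moving, noise_level):
        return mouth_moving and noise_level < NOISE_LIMIT

    print(should_recognize(True, 0.1))   # True
    print(should_recognize(True, 0.5))   # False
    ```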
  • Patent number: 10902831
    Abstract: Methods and apparatus to classify media based on a pitch-independent timbre attribute from a media signal are disclosed. An example apparatus includes means for accessing a media signal; and means for: determining a spectrum of audio corresponding to the media signal; and determining a timbre-independent pitch attribute of audio of the media signal based on an inverse transform of a complex argument of a transform of the spectrum.
    Type: Grant
    Filed: March 17, 2020
    Date of Patent: January 26, 2021
    Assignee: The Nielsen Company (US), LLC
    Inventor: Zafar Rafii
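    A literal reading of the abstract — take a transform of the spectrum, keep only its complex argument, and inverse-transform — can be sketched with an FFT pair: discarding the magnitude of the transformed spectrum discards the component where timbre mainly lives. This is a sketch of that reading, not the claimed method.

    ```python
    import numpy as np

    def pitch_attribute(spectrum):
        """Inverse transform of the complex argument of a transform of the
        magnitude spectrum (phase-only reconstruction)."""
        t = np.fft.fft(spectrum)
        phase_only = np.exp(1j * np.angle(t))  # complex argument, unit magnitude
        return np.fft.ifft(phase_only).real

    # Magnitude spectrum of a pure tone as a toy input.
    spec = np.abs(np.fft.rfft(np.sin(2 * np.pi * 5 * np.arange(64) / 64)))
    attr = pitch_attribute(spec)
    print(attr.shape)  # (33,)
    ```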
  • Patent number: 10891106
    Abstract: Aspects of the subject technology relate to systems and methods for processing voice input data. Voice input data is received from a computing device. An intended task is determined based on the received voice input data. Contextual information related to the intended task is obtained. A plurality of services to be accessed at the computing device is determined based on the intended task and the obtained contextual information. Instructions associated with the plurality of services are provided for transmission to the computing device for execution at the computing device.
    Type: Grant
    Filed: October 13, 2015
    Date of Patent: January 12, 2021
    Assignee: Google LLC
    Inventors: Alexander Friedrich Kuscher, Santhosh Balasubramanian, Tiantian Zha
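    The pipeline in this abstract — voice input to intended task, enriched with context, mapped to services — can be sketched as a small planner. All task and service names below are made up for illustration.

    ```python
    # Hypothetical sketch: derive a task from recognized text, then pick the
    # services to access, adjusting for contextual information.

    SERVICES_BY_TASK = {
        "navigate": ["maps", "traffic"],
        "message": ["contacts", "sms"],
    }

    def plan_services(voice_text, context):
        task = "navigate" if "directions" in voice_text else "message"
        services = list(SERVICES_BY_TASK[task])
        if context.get("driving") and task == "message":
            services.append("tts")  # read messages aloud while driving
        return task, services

    print(plan_services("directions to work", {"driving": True}))
    # ('navigate', ['maps', 'traffic'])
    ```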
  • Patent number: 10891954
    Abstract: Embodiments for managing a voice response system by one or more processors are described. At least one sound is detected. A signal that is representative of at least a portion of the at least one detected sound is received. A voice communication is determined based on the at least one detected sound and the signal. A response to the determined voice communication is determined.
    Type: Grant
    Filed: January 3, 2019
    Date of Patent: January 12, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shikhar Kwatra, Jeremy Fox, Paul Krystek, Sarbajit Rakshit
  • Patent number: 10885898
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data including an utterance, obtaining context data that indicates one or more expected speech recognition results, determining an expected speech recognition result based on the context data, receiving an intermediate speech recognition result generated by a speech recognition engine, comparing the intermediate speech recognition result to the expected speech recognition result for the audio data based on the context data, determining whether the intermediate speech recognition result corresponds to the expected speech recognition result for the audio data based on the context data, and setting an end of speech condition and providing a final speech recognition result in response to determining the intermediate speech recognition result matches the expected speech recognition result, the final speech recognition result including the one or more expected speech recognition results indicated by the context data.
    Type: Grant
    Filed: September 21, 2017
    Date of Patent: January 5, 2021
    Assignee: Google LLC
    Inventors: Petar Aleksic, Glen Shires, Michael Buchanan
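    The endpointing idea above — declare end-of-speech as soon as an intermediate result matches an expected result from context — can be sketched in a few lines. The data shapes are assumptions for illustration.

    ```python
    # Sketch: scan intermediate recognition results against context-derived
    # expected results; the first match sets the end-of-speech condition.

    def endpoint(intermediate_results, expected):
        """Return (final_result, end_of_speech)."""
        for result in intermediate_results:
            if result in expected:
                return result, True
        return (intermediate_results[-1] if intermediate_results else "", False)

    expected = {"call mom", "call home"}
    print(endpoint(["call", "call mo", "call mom"], expected))
    # ('call mom', True)
    ```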
  • Patent number: 10867610
    Abstract: A method for facilitating a remote conference includes receiving a digital video and a computer-readable audio signal. A face recognition machine is operated to recognize a face of a first conference participant in the digital video, and a speech recognition machine is operated to translate the computer-readable audio signal into a first text. An attribution machine attributes the text to the first conference participant. A second computer-readable audio signal is processed similarly, to obtain a second text attributed to a second conference participant. A transcription machine automatically creates a transcript including the first text attributed to the first conference participant and the second text attributed to the second conference participant.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: December 15, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Adi Diamant, Karen Master Ben-Dor, Eyal Krupka, Raz Halaly, Yoni Smolin, Ilya Gurvich, Aviv Hurvitz, Lijuan Qin, Wei Xiong, Shixiong Zhang, Lingfeng Wu, Xiong Xiao, Ido Leichter, Moshe David, Xuedong Huang, Amit Kumar Agarwal
  • Patent number: 10867596
    Abstract: A voice assistant system includes a server apparatus performing voice assistant and a plurality of devices, in which the server apparatus and the devices are communicatively connected to each other. The plurality of devices each records the same user's speech through a microphone, and then transmits recorded data of the same user's speech to the server apparatus. The server apparatus receives the recorded data transmitted from each of the plurality of devices, and then voice-recognizes two or more of the received recorded data in accordance with a predetermined standard to thereby interpret the contents of the user's speech to perform the voice assistant.
    Type: Grant
    Filed: August 9, 2018
    Date of Patent: December 15, 2020
    Assignee: Lenovo (Singapore) PTE. LTD.
    Inventors: Masaharu Yoneda, Kazuhiro Kosugi, Koji Kawakita
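    One plausible reading of "voice-recognizes two or more of the received recorded data in accordance with a predetermined standard" is to recognize the best few recordings and keep the highest-confidence hypothesis. The selection criterion, scores, and recognizer below are illustrative assumptions.

    ```python
    # Sketch: pick the two loudest recordings of the same utterance, recognize
    # both, and keep the higher-confidence transcript.

    def pick_transcript(recordings, recognize):
        """recordings: list of (device_id, volume);
        recognize(device_id) -> (text, confidence)."""
        top = sorted(recordings, key=lambda r: r[1], reverse=True)[:2]
        hyps = [recognize(dev) for dev, _ in top]
        return max(hyps, key=lambda h: h[1])[0]

    fake_asr = {"phone": ("turn on the lights", 0.9),
                "tv": ("turn on the light", 0.7)}
    result = pick_transcript([("phone", 0.8), ("tv", 0.6)],
                             lambda d: fake_asr[d])
    print(result)  # turn on the lights
    ```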
  • Patent number: 10860786
    Abstract: The growing amount of communication data generated by inmates in controlled environments makes a timely and effective investigation and analysis more and more difficult. The present disclosure provides details of a system and method to investigate and analyze the communication data in a correctional facility timely and effectively. Such a system receives both real time communication data and recorded communication data, processes and investigates the data automatically, and stores the received communication data and processed communication data in a unified data server. Such a system enables a reviewer to review, modify and insert markers and comments for the communication data. Such a system further enables the reviewer to search the communication data and create scheduled search reports.
    Type: Grant
    Filed: June 1, 2017
    Date of Patent: December 8, 2020
    Assignee: Global Tel*Link Corporation
    Inventor: Stephen Lee Hodge
  • Patent number: 10861444
    Abstract: Systems and methods are described for determining whether to activate a voice activated device based on a speaking cadence of the user. When the user speaks with a first cadence, the system may determine that the user does not intend to activate the device and may accordingly not trigger the voice activated device. When the user speaks with a second cadence, the system may determine that the user does wish to trigger the device and may accordingly trigger the voice activated device.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: December 8, 2020
    Assignee: Rovi Guides, Inc.
    Inventors: Edison Lin, Rowena Young, Kanchan Sripathy, Reda Harb
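    A cadence-based trigger can be sketched by estimating words per second from word timestamps and activating only inside a deliberate "command" range. The thresholds below are illustrative assumptions, not values from the patent.

    ```python
    # Sketch: trigger the device only when the speaking cadence falls in a
    # target range (too fast likely means conversational speech).

    def should_trigger(word_times, low=1.0, high=3.0):
        if len(word_times) < 2:
            return False
        duration = word_times[-1] - word_times[0]
        cadence = (len(word_times) - 1) / duration  # words per second
        return low <= cadence <= high

    print(should_trigger([0.0, 0.5, 1.0]))   # True  (2 words/s)
    print(should_trigger([0.0, 0.1, 0.2]))   # False (10 words/s)
    ```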
  • Patent number: 10861451
    Abstract: One embodiment provides a method, including: receiving, at an information handling device, an audible command to perform a function; determining, using a processor, at least one aspect associated with the audible command that prevents performance of the function; and providing, based on the determining, a suggested modification to the audible command. Other aspects are described and claimed.
    Type: Grant
    Filed: March 22, 2018
    Date of Patent: December 8, 2020
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: John Carl Mese, Nathan J. Peterson, Russell Speight VanBlon
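    The flow in this last abstract — detect the aspect of a command that blocks execution, then suggest a modification — can be sketched as a lookup over known blockers. Both the blocker patterns and the suggestions are made-up examples.

    ```python
    # Sketch: match an audible command against known blocking aspects and
    # return a suggested modification, or None if the command is fine.

    BLOCKERS = {
        "volume to 200": "volume above maximum; try 'set volume to 100'",
        "play": "no track specified; try 'play <song name>'",
    }

    def suggest(command):
        for aspect, suggestion in BLOCKERS.items():
            if aspect in command:
                return suggestion
        return None

    print(suggest("play"))  # no track specified; try 'play <song name>'
    ```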