Procedures Used During A Speech Recognition Process, E.g., Man-machine Dialogue, Etc. (epo) Patents (Class 704/E15.04)

Recording medium recording program, information processing apparatus, and information processing method for transcription

Patent number: 11798558

Abstract: A method for transcription is performed by a computer. The method includes: accepting input of a voice after causing a display unit to display a sentence including a plurality of words; acquiring first sound information being information concerning sounds corresponding to the sentence; acquiring second sound information being information concerning sounds of the voice accepted in the accepting; specifying a portion in the first sound information having a prescribed similarity to the second sound information; and correcting a character string in the sentence corresponding to the specified portion based on a character string corresponding to the second sound information.

Type: Grant

Filed: June 29, 2020

Date of Patent: October 24, 2023

Assignee: FUJITSU LIMITED

Inventor: Satoru Sankoda
Allowing spelling of arbitrary words

Patent number: 11797763

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice input from a user device; generating a first recognition output; receiving a user selection of one or more terms in the first recognition output; receiving a second voice input spelling a correction of the user selection; determining a corrected recognition output for the selected portion; and providing a second recognition output that merges the first recognition output and the corrected recognition output.

Type: Grant

Filed: July 24, 2021

Date of Patent: October 24, 2023

Assignee: Google LLC

Inventors: Evgeny A. Cherepanov, Gleb Skobeltsyn, Jakob Nicolaus Foerster, Petar Aleksic, Assaf Avner Hurwitz Michaely
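
The merge step described in the abstract of patent 11797763 can be illustrated with a minimal sketch. It assumes the first recognition output is a list of word tokens, that the user selection is a token range, and that the second voice input has already been recognized as a sequence of letters. The function name and the sample sentence are hypothetical and are not taken from the patent.

```python
def merge_spelled_correction(tokens, start, end, spelled_letters):
    """Replace tokens[start:end] with the word the user spelled out.

    tokens          -- words of the first recognition output
    start, end      -- selected token range (end is exclusive)
    spelled_letters -- letters recognized from the second voice input
    """
    # Join the recognized letters into the corrected word.
    corrected = "".join(letter.strip().lower() for letter in spelled_letters)
    # Merge: keep everything outside the selection, swap in the correction.
    return tokens[:start] + [corrected] + tokens[end:]


first_output = "call john smyth tomorrow".split()
# Hypothetical interaction: the user selects token 2 and spells S-M-I-T-H.
second_output = merge_spelled_correction(first_output, 2, 3, ["S", "M", "I", "T", "H"])
print(" ".join(second_output))
```

A real system would likely rescore the spelled letters against a lexicon or language model; the sketch only shows how the two outputs are combined.
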
Configurable output data formats

Patent number: 11798556

Abstract: Configurable core domains of a speech processing system are described. A core domain output data format for a given command is originally configured with default content portions. When a user indicates additional content should be output for the command, the speech processing system creates a new output data format for the core domain. The new output data format is user specific and includes both default content portions as well as user preferred content portions.

Type: Grant

Filed: January 14, 2022

Date of Patent: October 24, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Rohan Mutagi, Felix Wu, Rongzhou Shen, Neelam Satish Agrawal, Vibhunandan Gavini, Pablo Carballude Gonzalez
Knowledge acquisition tool

Patent number: 11797610

Abstract: A natural language interfacing system may use a knowledge acquisition tool to obtain structured representations from user input text. The system may initiate interaction with a request for input and a partial statement with blank text slots labeled by field types. The system may receive input text to fill in a slot of the partial statement and perform semantic parsing on the input text to identify a trigger concept. The system may generate a list of templates defining different semantic frames for the trigger concept. A generated template may include additional generated slots and/or suggested slot-fillers to guide user input. In response to a template selection, the partial statement includes the trigger concept annotated with a semantic frame. This process is repeated by iteratively updating the list of templates until the statement is completed. The statement is mapped to a structured representation including semantic frames.

Type: Grant

Filed: September 15, 2020

Date of Patent: October 24, 2023

Assignee: Elemental Cognition Inc.

Inventors: David Ferrucci, Clifton James McFate, Aditya Kalyanpur, Andrea Bradshaw, David Melville
Managing communications-related data based on interactions between and aggregated data involving data-center communications server and client-specific circuitry

Patent number: 11792300

Abstract: Certain aspects of the disclosure are directed to context aggregation in a data communications network. According to a specific example, user-data communications between a client station and another station participating in data communications via the data communications services can be processed, where the client station is associated with one of a plurality of client entities configured and arranged to interface with a data communications server providing data communications services. Context information can be aggregated for each respective user-data communication between the client station and the participating station, where the context information corresponds to at least one communications-specific characteristic associated with the user-data communications.

Type: Grant

Filed: June 21, 2022

Date of Patent: October 17, 2023

Assignee: 8x8, Inc.

Inventors: Ali Arsanjani, Bryan R. Martin, Manu Mukerji, Venkat Nagaswamy, Marshall Lincoln
Voice recognition method and device

Patent number: 11790903

Abstract: Disclosed is a voice recognition device and method. According to the disclosure, the voice recognition device, upon failing to grasp the intent of the user's utterance from the original utterance which is divided into a head utterance and a tail utterance, figures out the intent from the head utterance to thereby complete the original utterance and provides the result of voice recognition processing on the original utterance. According to an embodiment, the voice recognition device may be related to artificial intelligence (AI) modules, robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.

Type: Grant

Filed: May 7, 2020

Date of Patent: October 17, 2023

Assignee: LG ELECTRONICS INC.

Inventors: Hyun Yu, Byeongha Kim, Yejin Kim
Acoustic event detection

Patent number: 11783850

Abstract: Techniques for detecting certain acoustic events from audio data are described. A system may perform event aggregation for certain types of events before sending, to a device, an output representing that the event is detected. The system may bypass the event aggregation process for certain types of events that the system may detect with a high level of confidence. In such cases, the system may send an output to the device when the event is detected. The system may be used to detect acoustic events representing presence of a person or other harmful circumstances (such as fire, smoke, etc.) in a home, an office, a store, or other types of indoor settings.

Type: Grant

Filed: March 30, 2021

Date of Patent: October 10, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Harshavardhan Sundar, Sheetal Laad, Jialiang Bao, Ming Sun, Chao Wang, Chungnam Chan, Cengiz Erbas, Mathias Jourdain, Nipul Bharani, Aaron David Wirshba
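
The aggregation and bypass behavior described in the abstract of patent 11783850 could look roughly like the following sketch. The abstract does not say how events are aggregated, so the sliding window, the hit count, the class name, and the event-type labels are all assumptions made for illustration.

```python
from collections import deque


class EventNotifier:
    """Decide when a detected acoustic event should be reported to a device."""

    def __init__(self, bypass_types, window=5, min_hits=3):
        # Event types assumed to be detectable with high confidence.
        self.bypass_types = set(bypass_types)
        self.window = window      # number of recent frames to aggregate (assumed)
        self.min_hits = min_hits  # detections required inside the window (assumed)
        self.history = {}         # event type -> recent per-frame detection flags

    def observe(self, event_type, detected):
        """Return True if an output should be sent for this frame."""
        if event_type in self.bypass_types:
            # Bypass aggregation: report as soon as the event is detected.
            return bool(detected)
        frames = self.history.setdefault(event_type, deque(maxlen=self.window))
        frames.append(bool(detected))
        # Aggregated events are reported only after enough recent detections.
        return sum(frames) >= self.min_hits


notifier = EventNotifier(bypass_types={"glass_break"})
print(notifier.observe("glass_break", True))
print([notifier.observe("footsteps", True) for _ in range(3)])
```
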
Detecting and assigning action items to conversation participants in real-time and detecting completion thereof

Patent number: 11783829

Abstract: Described herein is a system for automatically detecting and assigning action items in a real-time conversation and determining whether such action items have been completed. The system detects, during a meeting, a plurality of action items and an utterance that corresponds to a completed action item. Responsive to detecting the utterance, the system generates a similarity score with respect to a first action item of the plurality of action items. The system compares the similarity score to a first threshold. Responsive to determining that the similarity score does not exceed the first threshold, the system generates a second similarity score with respect to a second action item of the plurality of action items. The system compares the second similarity score to a second threshold, which exceeds the first threshold. Responsive to determining that the second similarity score exceeds the second threshold, the system marks the second action item as completed.

Type: Grant

Filed: April 29, 2021

Date of Patent: October 10, 2023

Assignee: Outreach Corporation

Inventors: Rohit Ganpat Mane, Abhishek Abhishek, Krishnamohan Reddy Nareddy, Rajiv Garg
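
The two-threshold comparison in the abstract of patent 11783829 can be sketched as below. The abstract does not name a similarity measure, so token-level Jaccard similarity stands in here as an assumption. Raising the threshold by a fixed step for each later action item is also an assumption that generalizes the "second threshold exceeds the first" condition. All names and values are hypothetical.

```python
def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a stand-in for the real scorer."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def find_completed_item(utterance, action_items, first_threshold=0.5, step=0.1):
    """Return the first action item whose score exceeds its threshold, else None."""
    threshold = first_threshold
    for item in action_items:
        if jaccard(utterance, item) > threshold:
            return item
        # Each later candidate must clear a stricter threshold (assumed rule).
        threshold += step
    return None


items = ["send the budget report", "book the meeting room"]
print(find_completed_item("we did book the meeting room", items))
```

In this example the utterance scores about 0.11 against the first item, which does not exceed 0.5, and about 0.67 against the second item, which exceeds the raised threshold of 0.6, so the second item is marked as completed.
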
Systems and methods for conflict detection based on user preferences

Patent number: 11785310

Abstract: Systems and methods for detecting a conflict between viewing selections of two users before viewing a media asset. In some aspects, the method comprises receiving an audio input through an audio channel, detecting a first utterance from a first user and a second utterance from a second user in the input of the audio channel, parsing the first utterance and the second utterance, analyzing the first utterance and the second utterance to determine context about the first media asset and the second media asset, and presenting a conflict to the users to display on the media asset.

Type: Grant

Filed: August 2, 2021

Date of Patent: October 10, 2023

Assignee: Rovi Guides, Inc.

Inventor: Ti-Shiang Wang
Digital pen with enhanced educational and therapeutic feedback

Patent number: 11772409

Abstract: Disclosed is a digital pen that tracks the user's writing and provides useful feedback based on the user's writing. In one embodiment, the pen may provide feedback when the user has written a misspelled word, invalid mathematical expression, or any noncompliant expression. The pen may also provide feedback relating to the user's handwriting. The feedback may be visual, auditory, or tactile, and may be realtime or delayed. Statistics relating to the user's performance may be tracked, uploaded to external devices, and shared with others. This allows the user and interested parties to track the user's progress over time. The disclosed pen will be useful in educational settings.

Type: Grant

Filed: June 7, 2021

Date of Patent: October 3, 2023

Inventor: Lauren Michelle Neubauer
Speaker-dependent voice-activated camera system

Patent number: 11778303

Abstract: A voice-activated camera system for a computing device. The voice-activated camera system includes a processor, a camera module, a speech recognition module and a microphone for accepting user voice input. The voice-activated camera system is authorized for only a specific user's voice, so that a camera function may be performed when the authorized user speaks the keyword, but the camera function is not performed when an unauthorized user speaks the keyword.

Type: Grant

Filed: June 14, 2021

Date of Patent: October 3, 2023

Inventor: Jesse L. Wobrock
Context-aware hardware-based voice activity detection

Patent number: 11776562

Abstract: Certain aspects of the present disclosure provide a method for performing voice activity detection, including: receiving audio data from an audio source of an electronic device; generating a plurality of model input features using a hardware-based feature generator based on the received audio data; providing the plurality of model input features to a hardware-based voice activity detection model; receiving an output value from the hardware-based voice activity detection model; and determining a presence of voice activity in the audio data based on the output value.

Type: Grant

Filed: May 29, 2020

Date of Patent: October 3, 2023

Assignee: QUALCOMM Incorporated

Inventors: Ren Li, Xiaofei Chen, Murray Jarvis
Multi-factor audio watermarking

Patent number: 11776549

Abstract: Techniques are described herein for multi-factor audio watermarking. A method includes: receiving audio data; processing the audio data to generate predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a threshold that is indicative of the one or more hotwords being present in the audio data; in response to determining that the predicted output satisfies the threshold, processing the audio data using automatic speech recognition to generate a speech transcription feature; detecting a watermark that is embedded in the audio data; and in response to detecting the watermark: determining that the speech transcription feature corresponds to one of a plurality of stored speech transcription features; and in response to determining that the speech transcription feature corresponds to one of the plurality of stored speech transcription features, suppressing processing of a query included in the audio data.

Type: Grant

Filed: December 7, 2020

Date of Patent: October 3, 2023

Assignee: GOOGLE LLC

Inventors: Aleks Kracun, Matthew Sharifi
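
The decision flow in the abstract of patent 11776549 can be laid out as a short sketch. The hotword model, speech recognizer, and watermark detector are passed in as plain callables because the abstract does not specify them. Those callables, the threshold value, and the use of a normalized transcript as the "speech transcription feature" are assumptions.

```python
def should_suppress_query(audio, hotword_model, recognizer, watermark_detector,
                          stored_features, hotword_threshold=0.5):
    """Return True if the query in `audio` should be suppressed."""
    # Factor 0: only continue when a hotword is likely present.
    if hotword_model(audio) < hotword_threshold:
        return False
    # Run speech recognition to obtain a transcription feature (assumed: text).
    feature = recognizer(audio).strip().lower()
    # Factor 1: a watermark must be embedded in the audio.
    if not watermark_detector(audio):
        return False
    # Factor 2: the transcription must match a stored feature.
    return feature in stored_features


# Hypothetical stand-ins for the real components.
stored = {"ok assistant order more pizza"}
suppress = should_suppress_query(
    audio=b"fake-audio",
    hotword_model=lambda a: 0.9,
    recognizer=lambda a: "OK Assistant order more pizza",
    watermark_detector=lambda a: True,
    stored_features=stored,
)
print(suppress)
```

Requiring both the watermark and a transcript match is what makes the check multi-factor: a detected watermark alone does not suppress the query.
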
System and method for validating trust provisioning operation on system-on-chip

Patent number: 11768963

Abstract: A system-on-chip (SoC) includes a memory, a trust provisioning system, a one-time programmable (OTP) element, and a comparator. The memory is configured to store a first secret key before an execution of a trust provisioning operation. The trust provisioning system is configured to receive an encrypted version of a first set of secure assets and one of a second secret key and an encrypted version of the second secret key, and execute the trust provisioning operation on the SoC to store the first set of secure assets and the second secret key in the OTP element. The comparator is configured to compare the first and second secret keys to generate a valid signal that is indicative of a validation of the trust provisioning operation. The first set of secure assets and a second set of secure assets associated with the SoC are accessible based on the valid signal.

Type: Grant

Filed: January 22, 2021

Date of Patent: September 26, 2023

Assignee: NXP USA, INC.

Inventors: Atul Dahiya, Akshay Kumar Pathak
Access to multiple virtual assistants

Patent number: 11763809

Abstract: A speech-processing system may provide access to multiple virtual assistants via one or more voice-controlled devices. Each assistant may leverage language processing and language generation features of the speech-processing system, while handling different commands and/or providing access to different back applications. Each assistant may be associated with its own voice and/or speech style, and thus be perceived as having a particular “personality.” In some situations, a user may invoke a first assistant, e.g., with a wakeword or button press, and provide a command that the speech-processing system may determine will be better handled by a second assistant. The speech-processing system may thus call on a component to generate plan data describing one or more operations for the speech-processing system to execute to hand off the command to the second assistant and provide the user with indications of which assistant will handle the command.

Type: Grant

Filed: December 7, 2020

Date of Patent: September 19, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Naveen Bobbili, David Henry, Mark Vincent Mattione, Richard Du, Jyoti Chhabra
Voice recognition apparatus, voice recognition method, and program

Patent number: 11756552

Abstract: A voice recognition apparatus includes: a communication circuit acquiring a speech sentence that is a result of voice recognition of a speech; a storage storing digit number information indicative of the maximum number of digits; and a control circuit. When the number of digits of a first numerical value indicated by a first numeral included in the speech sentence is larger than the maximum number of digits, the control circuit replaces the first numeral in the speech sentence with a second numeral indicative of a second numerical value having the number of digits equal to or less than the maximum number of digits. The control circuit divides the first numeral into a plurality of numerals and adds numerical values respectively indicated by the plurality of numerals to calculate the second numerical value.

Type: Grant

Filed: October 29, 2020

Date of Patent: September 12, 2023

Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.

Inventor: Natsuki Saeki
Search and knowledge base question answering for a voice user interface

Patent number: 11740863

Abstract: A voice-controlled question answering system that is capable of answering questions using both a knowledge base and a search engine. The knowledge base is used to answer questions when answers to those questions are contained in the knowledge base. If an answer using the knowledge base is unavailable, and if the question is suitable for answering using an unstructured search approach, the system may obtain an answer using a search engine. The search engine results may be processed to obtain an answer to the question suitable for output using a voice user interface.

Type: Grant

Filed: May 1, 2020

Date of Patent: August 29, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Daniel Lewis Spector, Fergus O'Donoghue, Chase Wesley Brown, Jr., Shayne Leon Snow, Brandon Gerald Li Horst, William Folwell Barton
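
The fallback order in the abstract of patent 11740863 (knowledge base first, then search when the question suits it) can be sketched as follows. The dictionary-backed knowledge base, the `search` and `is_search_suitable` callables, and the first-sentence shortening for voice output are all assumptions for illustration.

```python
def shorten_for_voice(text, max_words=30):
    """Trim a search result to its first sentence so it can be spoken."""
    first_sentence = text.split(". ")[0].rstrip(".")
    return " ".join(first_sentence.split()[:max_words]) + "."


def answer_question(question, knowledge_base, search, is_search_suitable):
    """Return (answer, source); source is 'knowledge_base', 'search', or 'none'."""
    key = question.lower().strip().rstrip("?")
    # Prefer the knowledge base when it contains an answer.
    if key in knowledge_base:
        return knowledge_base[key], "knowledge_base"
    # Otherwise fall back to unstructured search, if the question suits it.
    if is_search_suitable(question):
        results = search(question)
        if results:
            return shorten_for_voice(results[0]), "search"
    return None, "none"


# Hypothetical stand-ins for the real knowledge base and search engine.
kb = {"how tall is mount everest": "Mount Everest is about 8,849 meters tall."}
fake_search = lambda q: ["Sourdough needs a starter. It also needs time."]
print(answer_question("How tall is Mount Everest?", kb, fake_search, lambda q: True))
print(answer_question("How do I make sourdough?", kb, fake_search, lambda q: True))
```
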
Systems and methods for determining a response to a user query

Patent number: 11741139

Abstract: Systems and methods are presented for providing a response to a user query. Reception of a user query is detected. An augmentation machine learning model is utilized to determine one or more variations of the user query that correspond to a semantic meaning of the user query. A plurality of response candidates is determined that correspond to the user query by comparing the user query and the one or more variations of the user query to a plurality of documents. A final response candidate is determined from the plurality of response candidates based on utilizing a semantic machine learning model to perform a semantic comparison between the plurality of response candidates and at least the user query.

Type: Grant

Filed: May 11, 2021

Date of Patent: August 29, 2023

Assignee: PayPal, Inc.

Inventors: Yuzhen Zhuo, Sandro Cavallari, Van Hoang Nguyen, Kim Dung Bui, Rey Neo, Harsha Singalreddy, Lei Xu, Hewen Wang, Quan Jin Ferdinand Tang, Chun Kiat Ho
Systems and methods for conversations with devices about media using interruptions and changes of subjects

Patent number: 11735170

Abstract: Systems and methods are described herein for providing media guidance. Control circuitry may receive a first voice input and access a database of topics to identify a first topic associated with the first voice input. A user interface may generate a first response to the first voice input, and subsequent to generating the first response, the control circuitry may receive a second voice input. The control circuitry may determine a match between the second voice input and an interruption input such as a period of silence or a keyword or a phrase, such as “Ahh,” “Umm,” or “Hmm.” The user interface may generate a second response that is associated with a second topic related to the first topic. By interrupting the conversation and changing the subject from time to time, media guidance systems can appear to be more intelligent and human.

Type: Grant

Filed: April 29, 2021

Date of Patent: August 22, 2023

Assignee: ROVI GUIDES, INC.

Inventors: Charles Dawes, Walter R. Klappert
Voice command matching during testing of voice-assisted application prototypes for languages with non-phonetic alphabets

Patent number: 11727929

Abstract: Voice command matching during testing of voice-assisted application prototypes for languages with non-phonetic alphabets is described. A visual page of an application prototype is displayed during a testing phase of the application prototype. A speech-to-text service converts a non-phonetic voice command spoken in a language with a non-phonetic alphabet, captured by at least one microphone during the testing phase of the application prototype, into a non-phonetic text string in the non-phonetic alphabet of the voice command. A phonetic language translator translates the non-phonetic text string of the voice command into a phonetic text string in a phonetic alphabet of the voice command. A comparison module compares the phonetic text string of the voice command to phonetic text strings in the phonetic alphabet of stored voice commands associated with the application prototype to identify a matching voice command. A performance module performs an action associated with the matching voice command.

Type: Grant

Filed: May 2, 2021

Date of Patent: August 15, 2023

Assignee: Adobe Inc.

Inventors: Mark C. Webster, Scott Thomas Werner, Susse Soenderby Jensen, Daniel Cameron Cundiff, Blake Allen Clayton Sawyer
Voice detection optimization based on selected voice assistant service

Patent number: 11727936

Abstract: Systems and methods for optimizing voice detection via a network microphone device (NMD) based on a selected voice-assistant service (VAS) are disclosed herein. In one example, the NMD detects sound via individual microphones and selects a first VAS to communicate with the NMD. The NMD produces a first sound-data stream based on the detected sound using a spatial processor in a first configuration. Once the NMD determines that a second VAS is to be selected over the first VAS, the spatial processor assumes a second configuration for producing a second sound-data stream based on the detected sound. The second sound-data stream is then transmitted to one or more remote computing devices associated with the second VAS.

Type: Grant

Filed: June 7, 2021

Date of Patent: August 15, 2023

Assignee: Sonos, Inc.

Inventors: Connor Kristopher Smith, Kurt Thomas Soto, Charles Conor Sleith
Intuitive computing methods and systems

Patent number: 11715473

Abstract: A smart phone senses audio, imagery, and/or other stimulus from a user's environment, and acts autonomously to fulfill inferred or anticipated user desires. In one aspect, the detailed technology concerns phone-based cognition of a scene viewed by the phone's camera. The image processing tasks applied to the scene can be selected from among various alternatives by reference to resource costs, resource constraints, other stimulus information (e.g., audio), task substitutability, etc. The phone can apply more or less resources to an image processing task depending on how successfully the task is proceeding, or based on the user's apparent interest in the task. In some arrangements, data may be referred to the cloud for analysis, or for gleaning. Cognition, and identification of appropriate device response(s), can be aided by collateral information, such as context. A great number of other features and arrangements are also detailed.

Type: Grant

Filed: September 1, 2020

Date of Patent: August 1, 2023

Assignee: Digimarc Corporation

Inventors: Tony F. Rodriguez, Geoffrey B. Rhoads, Bruce L. Davis
Machine learning based generation of synthetic crowd responses

Patent number: 11705150

Abstract: Systems and methods for generating real-time synthetic crowd responses for events, to augment the experience of event participants, remote viewers, and the like. Various sensors monitor the event in question, and various event properties are derived from their output using an event state model. These event properties, along with various event parameters such as score, time remaining, etc., are then input to a machine learning model that determines a real-time synthetic audience reaction tailored to the immediate state of the event. Reaction parameters are used to generate a corresponding crowd or audience audio signal, which may be broadcast to event participants, viewers, spectators, or anyone who may be interested. This instantaneous, realistic crowd reaction more closely simulates the experience of events with full on-site audiences, enhancing the viewing experience of both event participants and those watching.

Type: Grant

Filed: February 5, 2021

Date of Patent: July 18, 2023

Assignee: NVIDIA Corporation

Inventors: Benjemin Thomas Waine, Amy Rose, Andrew James Woodard
Dynamic wakeword detection

Patent number: 11699433

Abstract: Techniques for using a dynamic wakeword detection threshold are described. A device detects a wakeword in audio data using a first wakeword detection threshold value. Thereafter, the device receives audio including speech. If the device receives the audio within a predetermined duration of time after detecting the previous wakeword, the device attempts to detect a wakeword in second audio data, corresponding to the audio including the speech, using a second, lower wakeword detection threshold value.

Type: Grant

Filed: July 23, 2020

Date of Patent: July 11, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Gengshen Fu, Shiv Naga Prasad Vitaladevuni, Paul McIntyre, Shuang Wu
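
The dynamic threshold in the abstract of patent 11699433 can be sketched with a small stateful detector. The abstract gives no numeric values, so the two thresholds, the window length, and the idea that a separate model supplies a wakeword score per audio segment are assumptions. The class name is hypothetical.

```python
class DynamicWakewordDetector:
    """Use a lower detection threshold shortly after a previous detection."""

    def __init__(self, base_threshold=0.8, lowered_threshold=0.6, window_s=10.0):
        self.base_threshold = base_threshold        # first threshold value
        self.lowered_threshold = lowered_threshold  # second, lower threshold value
        self.window_s = window_s                    # predetermined duration (assumed)
        self.last_detection = None                  # time of the previous detection

    def detect(self, score, now):
        """Return True if `score` counts as a wakeword at time `now` (seconds)."""
        in_window = (self.last_detection is not None
                     and now - self.last_detection <= self.window_s)
        threshold = self.lowered_threshold if in_window else self.base_threshold
        if score >= threshold:
            self.last_detection = now
            return True
        return False


detector = DynamicWakewordDetector()
print(detector.detect(0.70, now=0.0))   # below the base threshold
print(detector.detect(0.85, now=1.0))   # detected with the base threshold
print(detector.detect(0.70, now=5.0))   # inside the window, lower threshold applies
print(detector.detect(0.70, now=30.0))  # window expired, base threshold applies
```
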
Intelligent automated order-based customer dialogue system

Patent number: 11694039

Abstract: Based on a detection that a customer has arrived at an enterprise location to pick up a previously-placed order, an intelligent automated customer dialogue system generates an interface via which an intelligent customer dialogue application dialogues with the customer. The application generates and initially offers, at the interface using natural language, content which is contextual to one or more items of the order, e.g., by using a specially trained intelligent dialogue machine learning model. The application may intelligently respond to the customer's natural language responses and/or requests to refine, augment, or redirect subsequently-offered content and/or dialogue, e.g., by using the model. Offered content (e.g., product information, services, coupons, suggestions, recommendations, etc.) generally provides value-add to the customer as well as maintains customer engagement.

Type: Grant

Filed: January 22, 2021

Date of Patent: July 4, 2023

Assignee: WALGREEN CO.

Inventor: Oliver Derza
Providing educational media content items based on a determined context of a vehicle or driver of the vehicle

Patent number: 11688293

Abstract: Techniques are described for providing educational media content. According to an embodiment, a system for providing, by a processor, educational media content items based on a determined context of a vehicle or driver of the vehicle is described. The system can comprise a context component that can determine a context of a vehicle or a driver of the vehicle, with the context component employing at least one of artificial intelligence or machine learning to facilitate inferring intent of the driver. The system can comprise a vehicle education component that can perform a utility-based analysis in connection with selecting a media content item relating to a feature of the vehicle based on the determined context, the inferred driver intent and the utility-based analysis. Further, the system can comprise a media component that can output the selected media content item.

Type: Grant

Filed: March 29, 2019

Date of Patent: June 27, 2023

Assignee: VOLVO CAR CORPORATION

Inventors: Magnus Rönnäng, Staffan Davidsson
Dialog management with multiple modalities

Patent number: 11688402

Abstract: Features are disclosed for performing functions in response to user requests. Natural Language Understanding (“NLU”) processing may be performed to generate command data that represents a subject of an utterance. The command data may be sent to an application that causes presentation of first output content in a first modality at a first time in response to receiving the command data, and generates second output content in a second modality different from the first modality, wherein the second output content is associated with the first output content. The second output content may be presented in the second modality at a second time subsequent to the first time.

Type: Grant

Filed: July 2, 2020

Date of Patent: June 27, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Nishant Kumar, David Robert Thomas, Sumedha Arvind Kshirsagar, Vikas Jain, Jeff Bradley Beal, Ajay Gopalakrishnan, Shishir Sridhar Bharathi
Editing parameters

Patent number: 11681864

Abstract: In some embodiments, a method is provided for updating an editing parameter for a model for automatically suggesting revisions to text data. The method may include displaying, on a graphical user interface (GUI) of a user device, one or more interactive input elements, wherein each of the one or more input elements is associated with an editing parameter for a model for automatically suggesting revisions to text data. The method may include receiving, via the GUI, an input from a selected input element of the one or more input elements, wherein the input comprises an indication of a value for a selected editing parameter associated with the selected input element. The method may include updating the selected editing parameter for the model based on the value. The method may include using the model with the updated selected editing parameter to apply an edit operation to an obtained text-under-analysis.

Type: Grant

Filed: December 27, 2021

Date of Patent: June 20, 2023

Assignee: BLACKBOILER, INC.

Inventors: Liam Roshan Dunan Emmart, Jonathan Herr, Daniel P. Broderick, Daniel Edward Simonson
Training an artificial intelligence of a voice response system based on non-verbal feedback

Patent number: 11676593

Abstract: Methods, systems, and computer program products for training an artificial intelligence (AI) of a voice response system. Aspects include receiving, by the voice response system from a user, a voice command to perform a requested action and interpreting, by an AI model, the voice command. Aspects also include performing an action based on the interpretation of the voice command and receiving non-verbal feedback from the user. Aspects further include updating the AI model based on a determination that the non-verbal feedback indicates that the user is not satisfied with the action performed.

Type: Grant

Filed: December 1, 2020

Date of Patent: June 13, 2023

Assignee: International Business Machines Corporation

Inventors: Shikhar Kwatra, Paul N. Krystek, Sushain Pandit, Sarbajit K. Rakshit
Dialog shortcuts for interactive agents

Patent number: 11676596

Abstract: In an approach to creation and execution of dialog shortcuts, responsive to detecting initiation of a dialog, an utterance is received from a user. Whether the utterance contains an objective of the user is determined, where the objective is chosen from a group including create a shortcut, execute the shortcut, modify the shortcut, and delete the shortcut. Responsive to determining that the utterance contains the objective, the objective is implemented.

Type: Grant

Filed: March 2, 2021

Date of Patent: June 13, 2023

Assignee: International Business Machines Corporation

Inventors: Danish Contractor, Sachindra Joshi
Systems and methods for automated medical image analysis

Patent number: 11676701

Abstract: Systems and methods are provided for automatically marking locations within a radiograph of one or more dental pathologies, anatomies, anomalies or other conditions determined by automated image analysis of the radiograph by a number of different machine learning models. Image annotation data may be generated based at least in part on obtained results associated with output of the multiple machine learning models, where the image annotation data indicates at least one location in the radiograph and an associated dental pathology, restoration, anatomy or anomaly detected at the at least one location by at least one of the machine learning models. A number of different pathologies may be identified and their locations marked within a single radiograph image.

Type: Grant

Filed: September 5, 2019

Date of Patent: June 13, 2023

Assignee: Pearl Inc.

Inventors: Cambron Neil Carter, Nandakishore Puttashamachar, Rohit Sanjay Annigeri, Joshua Alexander Tabak, Nishita Kailashnath Sant, Ophir Tanz, Adam Michael Wilbert, Mustafa Alammar
Adaptive text-to-speech outputs based on language proficiency

Patent number: 11670281

Abstract: In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determines a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generates audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.

Type: Grant

Filed: January 20, 2021

Date of Patent: June 6, 2023

Assignee: Google LLC

Inventors: Matthew Sharifi, Jakob Nicolaus Foerster
Voice processing system, voice processing method, and storage medium storing voice processing program

Patent number: 11670296

Abstract: A voice processing system includes a command specifier that specifies a command based on a first voice; a command processor that causes the specified command to be executed for a control target; a command determiner that determines whether or not the specified command is a repeated command; and an instruction determiner that determines whether or not a second voice, which corresponds to an execution instruction word indicating an instruction for executing the repeated command, has been received, after the repeated command corresponding to the first voice is executed, when the specified command is the repeated command, wherein when the second voice is received after the repeated command is executed, the command processor causes the repeated command to be repeatedly executed.

Type: Grant

Filed: February 19, 2021

Date of Patent: June 6, 2023

Assignee: SHARP KABUSHIKI KAISHA

Inventors: Keiko Hirukawa, Yuuki Iwamoto, Satoshi Terada
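
Illustrative note on patent 11670296 above: the abstract describes executing a command, checking whether it is a repeated command, and re-executing it when a later voice input matches an execution instruction word. A minimal sketch of that control flow follows. The RepeatController class, the word "again", and the log list standing in for the control target are all assumptions for illustration.

```python
class RepeatController:
    """Toy model of the repeated-command flow; names here are hypothetical."""

    def __init__(self, repeatable, again_word="again"):
        self.repeatable = set(repeatable)  # commands treated as repeated commands
        self.again_word = again_word       # assumed execution instruction word
        self.last_repeatable = None        # most recent repeated command, if any
        self.log = []                      # stands in for the control target

    def hear(self, voice):
        """Process one voice input; return True if a command was executed."""
        word = voice.strip().lower()
        if word == self.again_word:
            if self.last_repeatable is None:
                return False               # nothing eligible to repeat
            self.log.append(self.last_repeatable)
            return True
        self.log.append(word)              # execute the specified command
        # Only a repeated command arms the instruction word for later inputs.
        self.last_repeatable = word if word in self.repeatable else None
        return True
```

In this sketch a non-repeatable command clears the stored state, so saying the instruction word afterwards does nothing. Whether the patented system behaves that way is not stated in the abstract.
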
System and method for mobile device active callback integration

Patent number: 11665283

Abstract: A system and method for mobile device active callback integration, utilizing a callback integration engine operating on a user's mobile device that presents a callback token for integration through the operating system and software applications operating on the device, wherein interacting with the callback token produces a callback object used to execute a callback incorporating device hardware, context, scheduling, and trust information.

Type: Grant

Filed: November 15, 2022

Date of Patent: May 30, 2023

Assignee: VIRTUAL HOLD TECHNOLOGY SOLUTIONS, LLC

Inventors: Matthew DiMaria, Shannon Lekas, Kurt Nelson, Nicholas James Kennedy, Brian R. Galvin, Daniel Bohannon
Wake word detection modeling

Patent number: 11657804

Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.

Type: Grant

Filed: November 5, 2020

Date of Patent: May 23, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
Multi-tier speech processing and content operations

Patent number: 11657807

Abstract: A multi-tier architecture is provided for processing user voice queries and making routing decisions for generating responses, including responses to book browsing requests and other content requests. When an utterance is associated with multiple applications in a given domain, the applications may be organized into a subdomain and a tier of routing decisions may be added to the inter-domain and intra-domain routing decision system. The system uses contextual signals to make subdomain routing decisions, including signals regarding content items that are already in a user's content catalog, consumption status of individual content items in the user's catalog, and the like.

Type: Grant

Filed: June 24, 2021

Date of Patent: May 23, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Ponnu Jacob, Jingqian Zhao, Prathap Ramachandra, Uday Kumar Kollu, Lior Maor Maimon, Sean Gunnar Skaar
Distillation of part experts for whole-body pose estimation

Patent number: 11651608

Abstract: A system for generating whole body poses includes: a body regression module configured to generate a first pose of a body of an animal in an input image by regressing from a stored body anchor pose; a face regression module configured to generate a second pose of a face of the animal in the input image by regressing from a stored face anchor pose; an extremity regression module configured to generate a third pose of an extremity of the animal in the input image by regressing from a stored extremity anchor pose; and a pose module configured to generate a whole body pose of the animal in the input image based on the first pose, the second pose, and the third pose.

Type: Grant

Filed: September 13, 2022

Date of Patent: May 16, 2023

Assignees: NAVER CORPORATION, NAVER LABS CORPORATION

Inventors: Philippe Weinzaepfel, Romain Bregier, Hadrien Combaluzier, Vincent Leroy, Gregory Rogez
Creating a virtual context for a voice command

Patent number: 11646024

Abstract: A method includes determining a plurality of voice assistance systems located in a plurality of environments and receiving, from a headset of a user, a voice command from the user. The voice command lacks an identifier for a first voice assistance system of the plurality of voice assistance systems in a first environment of the plurality of environments. The method also includes predicting, based on the voice command, a subset of the plurality of voice assistance systems for executing the voice command and communicating, to the headset, images of environments of the plurality of environments in which the subset of the plurality of voice assistance systems are located. The method further includes detecting that the user selected, from the images, an image of the first environment that contains the first voice assistance system and in response, communicating the voice command to the first voice assistance system.

Type: Grant

Filed: May 10, 2021

Date of Patent: May 9, 2023

Assignee: International Business Machines Corporation

Inventors: Venkata Vara Prasad Karri, Abhishek Jain, Sarbajit K. Rakshit, Khader Saheb Shaik, Saraswathi Sailaja Perumalla
End-of-turn detection in spoken dialogues

Patent number: 11645473

Abstract: Systems, computer-implemented methods, and computer program products that can facilitate predicting a source of a subsequent spoken dialogue are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a speech receiving component that can receive a spoken dialogue from a first entity. The computer executable components can further comprise a speech processing component that can employ a network that can concurrently process a transition type and a dialogue act of the spoken dialogue to predict a source of a subsequent spoken dialogue.

Type: Grant

Filed: December 23, 2020

Date of Patent: May 9, 2023

Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, THE REGENTS OF THE UNIVERSITY OF MICHIGAN

Inventors: Lazaros Polymenakos, Dimitrios B. Dimitriadis, Zakaria Aldeneh, Emily Mower Provost
Generating and/or utilizing a machine learning model in response to a search request

Patent number: 11645277

Abstract: Implementations relate to providing, in response to a query, machine learning model output that is based on output from a trained machine learning model. The machine learning model output can include a predicted answer to the query, that is predicted based on the trained machine learning model. The machine learning model output can additionally or alternatively include an interactive interface for the trained machine learning model. Some implementations relate to generating a trained machine learning model “on the fly” based on a search query. Some implementations additionally or alternatively relate to storing, in a search index, an association of a machine learning model with a plurality of content items from resource(s) on which the machine learning model was trained.

Type: Grant

Filed: December 11, 2017

Date of Patent: May 9, 2023

Assignee: GOOGLE LLC

Inventors: Steven Ross, Christopher Farrar
Method for dialogue summarization with word graphs

Patent number: 11640493

Abstract: Disclosed is a method for dialogue summarization with word graphs, which is performed by one or more processors of a computing device. The method may include: generating a word graph based on information on a dialogue which is a summary target; extracting at least one keyword based on the generated word graph; generating a plurality of candidate summary sentences based on the generated word graph; and calculating a score associated with at least one keyword for each of the plurality of candidate summary sentences, and selecting at least one of the plurality of candidate summary sentences based on the calculated score.

Type: Grant

Filed: August 3, 2022

Date of Patent: May 2, 2023

Assignee: ActionPower Corp.

Inventors: Seongmin Park, Jihwa Lee
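
Illustrative note on patent 11640493 above: the abstract lists four steps (build a word graph, extract keywords, generate candidate sentences, score and select). The sketch below is one simple way to realize those steps, not the patented method. It assumes a word-adjacency graph, keywords ranked by total edge weight, candidates enumerated as simple paths between sentence-boundary markers, and a score equal to the number of keywords a candidate covers. All function names and parameters are hypothetical.

```python
from collections import defaultdict

START, END = "<s>", "</s>"


def build_word_graph(utterances):
    """Directed graph: graph[u][v] counts how often word v follows word u."""
    graph = defaultdict(lambda: defaultdict(int))
    for utterance in utterances:
        tokens = [START] + utterance.lower().split() + [END]
        for u, v in zip(tokens, tokens[1:]):
            graph[u][v] += 1
    return graph


def extract_keywords(graph, top_k, stopwords=frozenset()):
    """Rank words by the total weight of their incoming and outgoing edges."""
    degree = defaultdict(int)
    for u, successors in graph.items():
        for v, weight in successors.items():
            degree[u] += weight
            degree[v] += weight
    excluded = {START, END} | set(stopwords)
    ranked = sorted(
        (word for word in degree if word not in excluded),
        key=lambda word: (-degree[word], word),  # ties broken alphabetically
    )
    return ranked[:top_k]


def candidate_sentences(graph, max_len):
    """Enumerate simple START -> END paths containing at most max_len words."""
    results = []

    def walk(node, path):
        for nxt in graph.get(node, {}):
            if nxt == END:
                if path:
                    results.append(path)
            elif nxt not in path and len(path) < max_len:
                walk(nxt, path + [nxt])

    walk(START, [])
    return results


def summarize(utterances, top_k=3, max_len=8, stopwords=frozenset()):
    """Pick the candidate covering the most keywords, preferring shorter ones."""
    graph = build_word_graph(utterances)
    keywords = set(extract_keywords(graph, top_k, stopwords))
    candidates = candidate_sentences(graph, max_len)
    best = max(
        candidates,
        key=lambda c: (len(keywords & set(c)), -len(c)),
        default=[],
    )
    return " ".join(best)
```

Because paths can cross from one utterance into another wherever they share a word, a candidate may combine fragments of several turns. Path enumeration grows quickly with graph size, so a practical system would likely need pruning that this sketch omits.
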
Sharing data with a particular audience

Patent number: 11630924

Abstract: Methods, apparatuses, and non-transitory machine-readable media associated with sharing data with a particular audience are described. Examples can include receiving first data at a processing resource, determining whether the first data comprises a combination of bits associated with text or an image, or both, and comparing the combination of bits to second data stored on a memory resource. Examples can include identifying one or more words or one or more images represented by the first data, or both, based on the comparison and assigning to the first data first metadata representative of a first security categorization and a first confidence level and second metadata representative of a second security categorization and a second confidence level. Examples can include transmitting an output that comprises the first data or third data that comprises a modified combination of bits relative to the combination of bits of the first data.

Type: Grant

Filed: August 28, 2020

Date of Patent: April 18, 2023

Assignee: Micron Technology, Inc.

Inventors: Bhagyashree Bokade, Anusha Gunda, Lisa R. Copenspire-Ross
Information processing device and information processing method

Patent number: 11620997

Abstract: Provided is an information processing device that includes a determination unit that determines whether an object that outputs voice is a dialogue target related to voice dialogue based on a result of recognition of an input image, and a dialogue function unit that performs control related to the voice dialogue based on the determination. The dialogue function unit provides a voice dialogue function to the object based on the determination that the object is the dialogue target. Further provided is a method that includes determining whether an object that outputs voice is a dialogue target related to voice dialogue based on a result of recognition of an input image, and performing control related to the voice dialogue based on a result of the determining. The performing of the control further includes providing a voice dialogue function to the object based on the determination that the object is the dialogue target.

Type: Grant

Filed: January 23, 2019

Date of Patent: April 4, 2023

Assignee: SONY CORPORATION

Inventors: Hiromi Kurasawa, Kazumi Aoyama, Yasuharu Asano
Speech recognition error correction apparatus

Patent number: 11620981

Abstract: According to one embodiment, a speech recognition error correction apparatus includes a correction network memory and an error correction circuitry. The error correction circuitry calculates a difference between a speech recognition result string of an error correction target, which is a result of performing speech recognition on a new series of speech data, and a correction network, where a speech recognition result string and a correction result by a user for the speech recognition result string are associated, and when a value indicating the difference is equal to or less than a threshold, performs error correction on a speech recognition error portion in the speech recognition result string of the error correction target by using the correction network to generate a speech recognition error correction result string.

Type: Grant

Filed: September 4, 2020

Date of Patent: April 4, 2023

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Taira Ashikawa, Hiroshi Fujimura, Kenji Iwata
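
Illustrative note on patent 11620981 above: the abstract describes measuring the difference between a new recognition result and stored pairs of recognition results and user corrections, then applying a correction when the difference is at or below a threshold. The sketch below is a simplified stand-in for that idea. It assumes Levenshtein edit distance as the difference measure and a plain dictionary in place of the correction network, and it replaces the whole string rather than only the erroneous portion. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def correct(result, corrections, threshold):
    """Apply the closest stored user correction if it is within `threshold`.

    `corrections` maps an earlier recognition result to the text the user
    corrected it to (an assumed stand-in for the correction network).
    """
    best_key = min(corrections, key=lambda k: edit_distance(result, k), default=None)
    if best_key is not None and edit_distance(result, best_key) <= threshold:
        return corrections[best_key]
    return result  # too different from anything seen before; leave unchanged
```

The threshold trades coverage against safety: a larger value reuses past corrections more often but risks rewriting a result that was actually correct.
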
Information processing device, information processing system, and control method thereof

Patent number: 11615146

Abstract: An information processing device includes a network interface and a processor. The processor is configured to: acquire voice data via the network interface, analyze the acquired voice data, based on a result of the analysis, determine a search condition including one or more keywords for searching for one or more items, perform a search using the determined search condition, generate a first text indicating an item found by the search, and control the network interface to output the generated first text. The processor is further configured to, when two or more items are found by the search, generate a second text suggesting another keyword other than said one or more keywords that have been used for the search, and control the network interface to output the generated second text.

Type: Grant

Filed: February 18, 2021

Date of Patent: March 28, 2023

Assignee: Toshiba Tec Kabushiki Kaisha

Inventors: Shogo Watada, Naoki Sekine
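
Illustrative note on patent 11615146 above: the abstract describes searching with keywords taken from voice input, producing a first text that names the items found, and, when two or more items match, producing a second text that suggests a further keyword. The sketch below illustrates only that search-and-suggest step. It assumes a catalog mapping item names to tag sets, and it suggests the alphabetically first tag that would narrow the results. Both choices, and the function names, are hypothetical.

```python
from collections import Counter


def search(catalog, keywords):
    """Return names of items whose tag set contains every keyword."""
    wanted = set(keywords)
    return [name for name, tags in catalog.items() if wanted <= tags]


def respond(catalog, keywords):
    """Build the first text (items found) and an optional second text (hint)."""
    found = search(catalog, keywords)
    first_text = "Found: " + ", ".join(found) if found else "No items found."
    second_text = None
    if len(found) >= 2:
        # Count tags on the matches that were not already used in the search.
        counts = Counter(
            tag for name in found for tag in catalog[name] if tag not in keywords
        )
        # Keep only tags that would actually shrink the result set.
        narrowing = sorted(tag for tag, n in counts.items() if n < len(found))
        if narrowing:
            second_text = f"Try adding a keyword such as '{narrowing[0]}'."
    return first_text, second_text
```

Tags shared by every match are skipped because adding them would return the same items again.
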
Interactive device

Patent number: 11604831

Abstract: A dialogue device enabling speech capable of improving a degree of intimacy with a user or a user satisfaction is provided. An input information acquiring unit (101) configured to acquire input information from a user, a focus information acquiring unit (103) configured to acquire focus information representing a focus in the input information, a user profile DB (110) configured to store profile information of the user and date and time information at which the profile information is registered in association with each other, a profile information acquiring unit (107) configured to acquire the profile information in accordance with a priority level determined on the basis of the date and time information from a user profile corresponding to the focus information stored in the user profile DB (110), and a speech generating unit (108) configured to generate a speech sentence (speech information) corresponding to the user profile are included.

Type: Grant

Filed: April 25, 2019

Date of Patent: March 14, 2023

Assignee: NTT DOCOMO, INC.

Inventor: Yuiko Tsunomori
Context-aware signal conditioning for vehicle exterior voice assistant

Patent number: 11600278

Abstract: A vehicle includes a plurality of microphones to obtain speech from a person outside the vehicle as an input signal and a sensor system to determine a location and orientation of the person relative to the vehicle. The vehicle also includes a controller to determine characteristics of the input signal and to determine whether to perform speech enhancement on the input signal based on one or more of the characteristics and the location and orientation of the person.

Type: Grant

Filed: April 19, 2021

Date of Patent: March 7, 2023

Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC

Inventors: Alaa M. Khamis, Gaurav Talwar, Romeo D. Garcia, Jr., Carmine F. D'agostino, Neeraj R. Gautama
Methods and systems for correcting transcribed audio files

Patent number: 11594211

Abstract: Methods and systems for correcting transcribed text. One method includes receiving audio data from one or more audio data sources and transcribing the audio data based on a voice model to generate text data. The method also includes making the text data available to a plurality of users over at least one computer network and receiving corrected text data over the at least one computer network from the plurality of users. In addition, the method can include modifying the voice model based on the corrected text data.

Type: Grant

Filed: November 4, 2020

Date of Patent: February 28, 2023

Assignee: III Holdings 1, LLC

Inventor: Paul M. Hager
Real time key conversational metrics prediction and notability

Patent number: 11587552

Abstract: A system and a method are disclosed for alerting a manager device to an occurrence of an event at an agent device during a conversation between the agent device and an external party. In an embodiment, a processor receives transcript data during a conversation between the agent device and the external party. The processor normalizes the transcript data, and inputs the normalized transcript data into a machine learning model, the machine learning model trained to identify an inflection point in the conversation. The processor receives, as output from the machine learning model, a measure of notability of the normalized transcript data. The processor determines whether the measure of notability corresponds to an inflection point, and, responsive to determining that the measure of notability corresponds to an inflection point, alerts the manager device.

Type: Grant

Filed: April 30, 2020

Date of Patent: February 21, 2023

Assignee: Sutherland Global Services Inc.

Inventors: Eric Jee-Keng Dunn, Dmytro Kovalchuk, Brenton William D'Adamo
Method for exiting a voice skill, apparatus, device and storage medium

Patent number: 11580974

Abstract: A method for exiting a voice skill, an apparatus, a device, and a storage medium are provided by embodiments of the present disclosure, wherein a user voice instruction is received; a target exit intention corresponding to the user voice instruction is identified according to the user voice instruction and a grammar rule of a preset exit intention; and a corresponding operation is executed on a current voice skill of a device according to the target exit intention. The embodiments of the present disclosure refine and expand the user's exit intention. After the target exit intention to which the user voice instruction belongs is identified, the corresponding operation is executed according to the target exit intention so as to meet the users' different exit requirements for the voice skills, enhance the fluency and convenience of user interaction with the device and improve the user's exit experience when using the voice skills.

Type: Grant

Filed: June 29, 2020

Date of Patent: February 14, 2023

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventors: Huan Tang, Xiao Zhou, Liangcheng Wu
