Procedures Used During A Speech Recognition Process, E.g., Man-machine Dialogue, Etc. (EPO) Patents (Class 704/E15.04)
  • Patent number: 11798558
    Abstract: A method for transcription is performed by a computer. The method includes: accepting input of a voice after causing a display unit to display a sentence including a plurality of words; acquiring first sound information being information concerning sounds corresponding to the sentence; acquiring second sound information being information concerning sounds of the voice accepted in the accepting; specifying a portion in the first sound information having a prescribed similarity to the second sound information; and correcting a character string in the sentence corresponding to the specified portion based on a character string corresponding to the second sound information.
    Type: Grant
    Filed: June 29, 2020
    Date of Patent: October 24, 2023
    Assignee: FUJITSU LIMITED
    Inventor: Satoru Sankoda
  • Patent number: 11797763
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice input from a user device; generating a first recognition output; receiving a user selection of one or more terms in the first recognition output; receiving a second voice input spelling a correction of the user selection; determining a corrected recognition output for the selected portion; and providing a second recognition output that merges the first recognition output and the corrected recognition output.
    Type: Grant
    Filed: July 24, 2021
    Date of Patent: October 24, 2023
    Assignee: Google LLC
    Inventors: Evgeny A. Cherepanov, Gleb Skobeltsyn, Jakob Nicolaus Foerster, Petar Aleksic, Assaf Avner Hurwitz Michaely
  • Patent number: 11798556
    Abstract: Configurable core domains of a speech processing system are described. A core domain output data format for a given command is originally configured with default content portions. When a user indicates additional content should be output for the command, the speech processing system creates a new output data format for the core domain. The new output data format is user specific and includes both default content portions as well as user preferred content portions.
    Type: Grant
    Filed: January 14, 2022
    Date of Patent: October 24, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Rohan Mutagi, Felix Wu, Rongzhou Shen, Neelam Satish Agrawal, Vibhunandan Gavini, Pablo Carballude Gonzalez
  • Patent number: 11797610
    Abstract: A natural language interfacing system may use a knowledge acquisition tool to obtain structured representations from user input text. The system may initiate interaction with a request for input and a partial statement with blank text slots labeled by field types. The system may receive input text to fill in a slot of the partial statement and perform semantic parsing on the input text to identify a trigger concept. The system may generate a list of templates defining different semantic frames for the trigger concept. A generated template may include additional generated slots and/or suggested slot-fillers to guide user input. In response to a template selection, the partial statement includes the trigger concept annotated with a semantic frame. This process is repeated by iteratively updating the list of templates until the statement is completed. The statement is mapped to a structured representation including semantic frames.
    Type: Grant
    Filed: September 15, 2020
    Date of Patent: October 24, 2023
    Assignee: Elemental Cognition Inc.
    Inventors: David Ferrucci, Clifton James McFate, Aditya Kalyanpur, Andrea Bradshaw, David Melville
  • Patent number: 11792300
    Abstract: Certain aspects of the disclosure are directed to context aggregation in a data communications network. According to a specific example, user-data communications between a client station and another station participating in data communications via the data communications services can be processed, where the client station is associated with one of a plurality of client entities configured and arranged to interface with a data communications server providing data communications services. Context information can be aggregated for each respective user-data communication between the client station and the participating station, where the context information corresponds to at least one communications-specific characteristic associated with the user-data communications.
    Type: Grant
    Filed: June 21, 2022
    Date of Patent: October 17, 2023
    Assignee: 8x8, Inc.
    Inventors: Ali Arsanjani, Bryan R. Martin, Manu Mukerji, Venkat Nagaswamy, Marshall Lincoln
  • Patent number: 11790903
    Abstract: Disclosed is a voice recognition device and method. According to the disclosure, the voice recognition device, upon failing to grasp the intent of the user's utterance from the original utterance which is divided into a head utterance and a tail utterance, figures out the intent from the head utterance to thereby complete the original utterance and provides the result of voice recognition processing on the original utterance. According to an embodiment, the voice recognition device may be related to artificial intelligence (AI) modules, robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.
    Type: Grant
    Filed: May 7, 2020
    Date of Patent: October 17, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Hyun Yu, Byeongha Kim, Yejin Kim
  • Patent number: 11783850
    Abstract: Techniques for detecting certain acoustic events from audio data are described. A system may perform event aggregation for certain types of events before sending, to a device, an output representing that the event is detected. The system may bypass the event aggregation process for certain types of events that the system may detect with a high level of confidence. In such cases, the system may send an output to the device when the event is detected. The system may be used to detect acoustic events representing presence of a person or other harmful circumstances (such as fire, smoke, etc.) in a home, an office, a store, or other types of indoor settings.
    Type: Grant
    Filed: March 30, 2021
    Date of Patent: October 10, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Harshavardhan Sundar, Sheetal Laad, Jialiang Bao, Ming Sun, Chao Wang, Chungnam Chan, Cengiz Erbas, Mathias Jourdain, Nipul Bharani, Aaron David Wirshba
  • Patent number: 11783829
    Abstract: Described herein is a system for automatically detecting and assigning action items in a real-time conversation and determining whether such action items have been completed. The system detects, during a meeting, a plurality of action items and an utterance that corresponds to a completed action item. Responsive to detecting the utterance, the system generates a similarity score with respect to a first action item of the plurality of action items. The system compares the similarity score to a first threshold. Responsive to determining that the similarity score does not exceed the first threshold, the system generates a second similarity score with respect to a second action item of the plurality of action items. The system compares the second similarity score to a second threshold, which exceeds the first threshold. Responsive to determining that the second similarity score exceeds the second threshold, the system marks the second action item as completed.
    Type: Grant
    Filed: April 29, 2021
    Date of Patent: October 10, 2023
    Assignee: Outreach Corporation
    Inventors: Rohit Ganpat Mane, Abhishek Abhishek, Krishnamohan Reddy Nareddy, Rajiv Garg
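
The two-threshold matching described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how similarity is scored, so a token-overlap (Jaccard) score is swapped in, and the names `jaccard`, `find_completed_item`, `first_threshold`, and `step` are hypothetical.

```python
from typing import List, Optional


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1] (an assumed stand-in scorer)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def find_completed_item(
    utterance: str,
    action_items: List[str],
    first_threshold: float = 0.5,  # hypothetical starting threshold
    step: float = 0.1,             # hypothetical increase per later candidate
) -> Optional[str]:
    """Return the first action item whose score exceeds its threshold.

    Each later candidate must clear a higher threshold than the one before,
    mirroring the abstract's second threshold that exceeds the first.
    """
    threshold = first_threshold
    for item in action_items:
        if jaccard(utterance, item) > threshold:
            return item  # this item would be marked as completed
        threshold += step
    return None


items = ["send the budget report to finance", "schedule the design review"]
print(find_completed_item("I finished the design review schedule", items))
```

Raising the bar for later candidates is one way to read the claim; a real system would likely use a learned similarity model rather than token overlap.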
  • Patent number: 11785310
    Abstract: Systems and methods for detecting a conflict between viewing selections of two users before viewing a media asset. In some aspects, the method comprises receiving an audio input through an audio channel, detecting a first utterance from a first user and a second utterance from a second user in the input of the audio channel, parsing the first utterance and the second utterance, analyzing the first utterance and the second utterance to determine context about the first media asset and the second media asset, and presenting a conflict to the users to display on the media asset.
    Type: Grant
    Filed: August 2, 2021
    Date of Patent: October 10, 2023
    Assignee: Rovi Guides, Inc.
    Inventor: Ti-Shiang Wang
  • Patent number: 11772409
    Abstract: Disclosed is a digital pen that tracks the user's writing and provides useful feedback based on the user's writing. In one embodiment, the pen may provide feedback when the user has written a misspelled word, invalid mathematical expression, or any noncompliant expression. The pen may also provide feedback relating to the user's handwriting. The feedback may be visual, auditory, or tactile, and may be realtime or delayed. Statistics relating to the user's performance may be tracked, uploaded to external devices, and shared with others. This allows the user and interested parties to track the user's progress over time. The disclosed pen will be useful in educational settings.
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: October 3, 2023
    Inventor: Lauren Michelle Neubauer
  • Patent number: 11778303
    Abstract: A voice-activated camera system for a computing device. The voice-activated camera system includes a processor, a camera module, a speech recognition module and a microphone for accepting user voice input. The voice-activated camera system is authorized for only a specific user's voice, so that a camera function may be performed when the authorized user speaks the keyword, but the camera function is not performed when an unauthorized user speaks the keyword.
    Type: Grant
    Filed: June 14, 2021
    Date of Patent: October 3, 2023
    Inventor: Jesse L. Wobrock
  • Patent number: 11776562
    Abstract: Certain aspects of the present disclosure provide a method for performing voice activity detection, including: receiving audio data from an audio source of an electronic device; generating a plurality of model input features using a hardware-based feature generator based on the received audio data; providing the plurality of model input features to a hardware-based voice activity detection model; receiving an output value from the hardware-based voice activity detection model; and determining a presence of voice activity in the audio data based on the output value.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: October 3, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Ren Li, Xiaofei Chen, Murray Jarvis
  • Patent number: 11776549
    Abstract: Techniques are described herein for multi-factor audio watermarking. A method includes: receiving audio data; processing the audio data to generate predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a threshold that is indicative of the one or more hotwords being present in the audio data; in response to determining that the predicted output satisfies the threshold, processing the audio data using automatic speech recognition to generate a speech transcription feature; detecting a watermark that is embedded in the audio data; and in response to detecting the watermark: determining that the speech transcription feature corresponds to one of a plurality of stored speech transcription features; and in response to determining that the speech transcription feature corresponds to one of the plurality of stored speech transcription features, suppressing processing of a query included in the audio data.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: October 3, 2023
    Assignee: GOOGLE LLC
    Inventors: Aleks Kracun, Matthew Sharifi
  • Patent number: 11768963
    Abstract: A system-on-chip (SoC) includes a memory, a trust provisioning system, a one-time programmable (OTP) element, and a comparator. The memory is configured to store a first secret key before an execution of a trust provisioning operation. The trust provisioning system is configured to receive an encrypted version of a first set of secure assets and one of a second secret key and an encrypted version of the second secret key, and execute the trust provisioning operation on the SoC to store the first set of secure assets and the second secret key in the OTP element. The comparator is configured to compare the first and second secret keys to generate a valid signal that is indicative of a validation of the trust provisioning operation. The first set of secure assets and a second set of secure assets associated with the SoC are accessible based on the valid signal.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: September 26, 2023
    Assignee: NXP USA, INC.
    Inventors: Atul Dahiya, Akshay Kumar Pathak
  • Patent number: 11763809
    Abstract: A speech-processing system may provide access to multiple virtual assistants via one or more voice-controlled devices. Each assistant may leverage language processing and language generation features of the speech-processing system, while handling different commands and/or providing access to different back applications. Each assistant may be associated with its own voice and/or speech style, and thus be perceived as having a particular “personality.” In some situations, a user may invoke a first assistant, e.g., with a wakeword or button press, and provide a command that the speech-processing system may determine will be better handled by a second assistant. The speech-processing system may thus call on a component to generate plan data describing one or more operations for the speech-processing system to execute to hand off the command to the second assistant and provide the user with indications of which assistant will handle the command.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: September 19, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Naveen Bobbili, David Henry, Mark Vincent Mattione, Richard Du, Jyoti Chhabra
  • Patent number: 11756552
    Abstract: A voice recognition apparatus includes: a communication circuit acquiring a speech sentence that is a result of voice recognition of a speech; a storage storing digit number information indicative of the maximum number of digits; and a control circuit. When the number of digits of a first numerical value indicated by a first numeral included in the speech sentence is larger than the maximum number of digits, the control circuit replaces the first numeral in the speech sentence with a second numeral indicative of a second numerical value having the number of digits equal to or less than the maximum number of digits. The control circuit divides the first numeral into a plurality of numerals and adds numerical values respectively indicated by the plurality of numerals to calculate the second numerical value.
    Type: Grant
    Filed: October 29, 2020
    Date of Patent: September 12, 2023
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventor: Natsuki Saeki
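
As a rough illustration of the digit-limit correction in the abstract above, the sketch below splits an over-long numeral into parts and adds them. The abstract does not state how the numeral is divided, so the splitting rule here (each nonzero digit together with the zeros that follow it) is purely an assumption, as are the names `fit_numeral` and `max_digits`.

```python
import re


def fit_numeral(numeral: str, max_digits: int) -> str:
    """Replace an over-long numeral with the sum of its parts.

    Assumed splitting rule (not from the patent): each nonzero digit plus
    its trailing zeros forms one part, so "30050" becomes "300" and "50".
    """
    if len(numeral) <= max_digits:
        return numeral  # already within the stored digit limit

    parts = re.findall(r"[1-9]0*", numeral)
    candidate = str(sum(int(p) for p in parts))

    # Only substitute when the summed value actually fits the limit.
    return candidate if len(candidate) <= max_digits else numeral


print(fit_numeral("30050", 3))  # "300" + "50" are added
```

The intuition is that a recognizer may render a spoken "three hundred fifty" as the concatenation "30050"; adding the parts recovers a value that fits the expected number of digits.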
  • Patent number: 11740863
    Abstract: A voice-controlled question answering system that is capable of answering questions using both a knowledge base and a search engine. The knowledge base is used to answer questions when answers to those questions are contained in the knowledge base. If an answer using the knowledge base is unavailable, and if the question is suitable for answering using an unstructured search approach, the system may obtain an answer using a search engine. The search engine results may be processed to obtain an answer to the question suitable for output using a voice user interface.
    Type: Grant
    Filed: May 1, 2020
    Date of Patent: August 29, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Daniel Lewis Spector, Fergus O'Donoghue, Chase Wesley Brown, Jr., Shayne Leon Snow, Brandon Gerald Li Horst, William Folwell Barton
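
The knowledge-base-first, search-second flow in the abstract above might look something like the sketch below. Everything named here is hypothetical: the knowledge base is modeled as a plain dictionary, `search` and `is_search_suitable` are caller-supplied callables, and trimming a result to its first sentence is just one assumed way of making it suitable for voice output.

```python
from typing import Callable, Dict, List, Optional


def normalize(question: str) -> str:
    """Lowercase, trim punctuation, and collapse whitespace for KB lookup."""
    return " ".join(question.lower().strip(" ?!.").split())


def first_sentence(text: str) -> str:
    """Keep only the first sentence so the answer is short enough to speak."""
    return text.split(". ")[0].rstrip(".") + "."


def answer_question(
    question: str,
    knowledge_base: Dict[str, str],
    search: Callable[[str], List[str]],
    is_search_suitable: Callable[[str], bool],
) -> Optional[str]:
    # 1. Prefer the knowledge base when it contains an answer.
    kb_answer = knowledge_base.get(normalize(question))
    if kb_answer is not None:
        return kb_answer

    # 2. Otherwise fall back to unstructured search, if the question suits it.
    if is_search_suitable(question):
        results = search(question)
        if results:
            return first_sentence(results[0])

    # 3. No answer available from either source.
    return None


kb = {"what is the capital of france": "Paris."}
print(answer_question("What is the capital of France?", kb,
                      search=lambda q: [], is_search_suitable=lambda q: True))
```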
  • Patent number: 11741139
    Abstract: Systems and methods are presented for providing a response to a user query. Reception of a user query is detected. An augmentation machine learning model is utilized to determine one or more variations of the user query that correspond to a semantic meaning of the user query. A plurality of response candidates is determined that correspond to the user query by comparing the user query and the one or more variations of the user query to a plurality of documents. A final response candidate is determined from the plurality of response candidates based on utilizing a semantic machine learning model to perform a semantic comparison between the plurality of response candidates and at least the user query.
    Type: Grant
    Filed: May 11, 2021
    Date of Patent: August 29, 2023
    Assignee: PayPal, Inc.
    Inventors: Yuzhen Zhuo, Sandro Cavallari, Van Hoang Nguyen, Kim Dung Bui, Rey Neo, Harsha Singalreddy, Lei Xu, Hewen Wang, Quan Jin Ferdinand Tang, Chun Kiat Ho
  • Patent number: 11735170
    Abstract: Systems and methods are described herein for providing media guidance. Control circuitry may receive a first voice input and access a database of topics to identify a first topic associated with the first voice input. A user interface may generate a first response to the first voice input, and subsequent to generating the first response, the control circuitry may receive a second voice input. The control circuitry may determine a match between the second voice input and an interruption input such as a period of silence or a keyword or a phrase, such as “Ahh,” “Umm,” or “Hmm.” The user interface may generate a second response that is associated with a second topic related to the first topic. By interrupting the conversation and changing the subject from time to time, media guidance systems can appear to be more intelligent and human.
    Type: Grant
    Filed: April 29, 2021
    Date of Patent: August 22, 2023
    Assignee: ROVI GUIDES, INC.
    Inventors: Charles Dawes, Walter R. Klappert
  • Patent number: 11727929
    Abstract: Voice command matching during testing of voice-assisted application prototypes for languages with non-phonetic alphabets is described. A visual page of an application prototype is displayed during a testing phase of the application prototype. A speech-to-text service converts a non-phonetic voice command spoken in a language with a non-phonetic alphabet, captured by at least one microphone during the testing phase of the application prototype, into a non-phonetic text string in the non-phonetic alphabet of the voice command. A phonetic language translator translates the non-phonetic text string of the voice command into a phonetic text string in a phonetic alphabet of the voice command. A comparison module compares the phonetic text string of the voice command to phonetic text strings in the phonetic alphabet of stored voice commands associated with the application prototype to identify a matching voice command. A performance module performs an action associated with the matching voice command.
    Type: Grant
    Filed: May 2, 2021
    Date of Patent: August 15, 2023
    Assignee: Adobe Inc.
    Inventors: Mark C. Webster, Scott Thomas Werner, Susse Soenderby Jensen, Daniel Cameron Cundiff, Blake Allen Clayton Sawyer
  • Patent number: 11727936
    Abstract: Systems and methods for optimizing voice detection via a network microphone device (NMD) based on a selected voice-assistant service (VAS) are disclosed herein. In one example, the NMD detects sound via individual microphones and selects a first VAS to communicate with the NMD. The NMD produces a first sound-data stream based on the detected sound using a spatial processor in a first configuration. Once the NMD determines that a second VAS is to be selected over the first VAS, the spatial processor assumes a second configuration for producing a second sound-data stream based on the detected sound. The second sound-data stream is then transmitted to one or more remote computing devices associated with the second VAS.
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: August 15, 2023
    Assignee: Sonos, Inc.
    Inventors: Connor Kristopher Smith, Kurt Thomas Soto, Charles Conor Sleith
  • Patent number: 11715473
    Abstract: A smart phone senses audio, imagery, and/or other stimulus from a user's environment, and acts autonomously to fulfill inferred or anticipated user desires. In one aspect, the detailed technology concerns phone-based cognition of a scene viewed by the phone's camera. The image processing tasks applied to the scene can be selected from among various alternatives by reference to resource costs, resource constraints, other stimulus information (e.g., audio), task substitutability, etc. The phone can apply more or less resources to an image processing task depending on how successfully the task is proceeding, or based on the user's apparent interest in the task. In some arrangements, data may be referred to the cloud for analysis, or for gleaning. Cognition, and identification of appropriate device response(s), can be aided by collateral information, such as context. A great number of other features and arrangements are also detailed.
    Type: Grant
    Filed: September 1, 2020
    Date of Patent: August 1, 2023
    Assignee: Digimarc Corporation
    Inventors: Tony F. Rodriguez, Geoffrey B. Rhoads, Bruce L. Davis
  • Patent number: 11705150
    Abstract: Systems and methods for generating real-time synthetic crowd responses for events, to augment the experience of event participants, remote viewers, and the like. Various sensors monitor the event in question, and various event properties are derived from their output using an event state model. These event properties, along with various event parameters such as score, time remaining, etc., are then input to a machine learning model that determines a real-time synthetic audience reaction tailored to the immediate state of the event. Reaction parameters are used to generate a corresponding crowd or audience audio signal, which may be broadcast to event participants, viewers, spectators, or anyone who may be interested. This instantaneous, realistic crowd reaction more closely simulates the experience of events with full on-site audiences, enhancing the viewing experience of both event participants and those watching.
    Type: Grant
    Filed: February 5, 2021
    Date of Patent: July 18, 2023
    Assignee: NVIDIA Corporation
    Inventors: Benjemin Thomas Waine, Amy Rose, Andrew James Woodard
  • Patent number: 11699433
    Abstract: Techniques for using a dynamic wakeword detection threshold are described. A device detects a wakeword in audio data using a first wakeword detection threshold value. Thereafter, the device receives audio including speech. If the device receives the audio within a predetermined duration of time after detecting the previous wakeword, the device attempts to detect a wakeword in second audio data, corresponding to the audio including the speech, using a second, lower wakeword detection threshold value.
    Type: Grant
    Filed: July 23, 2020
    Date of Patent: July 11, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Gengshen Fu, Shiv Naga Prasad Vitaladevuni, Paul McIntyre, Shuang Wu
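
A minimal sketch of the dynamic threshold idea in the abstract above, assuming the detector already produces a confidence score per audio segment. The class name `WakewordGate`, the threshold values, and the window length are all hypothetical; the abstract only says that a second, lower threshold applies within a predetermined time after a previous detection.

```python
from typing import Optional


class WakewordGate:
    """Applies a lower detection threshold shortly after a prior detection."""

    def __init__(self, base_threshold: float = 0.8,
                 relaxed_threshold: float = 0.6,
                 window_s: float = 10.0) -> None:
        self.base_threshold = base_threshold        # used by default
        self.relaxed_threshold = relaxed_threshold  # used inside the window
        self.window_s = window_s                    # seconds after a detection
        self._last_detection: Optional[float] = None

    def threshold_at(self, now: float) -> float:
        """Pick the threshold that applies at time `now` (in seconds)."""
        recent = (self._last_detection is not None
                  and now - self._last_detection <= self.window_s)
        return self.relaxed_threshold if recent else self.base_threshold

    def detect(self, score: float, now: float) -> bool:
        """Return True if `score` clears the threshold in force at `now`."""
        hit = score >= self.threshold_at(now)
        if hit:
            self._last_detection = now
        return hit


gate = WakewordGate()
print(gate.detect(0.85, now=1.0))  # clears the base threshold
print(gate.detect(0.70, now=5.0))  # clears only the relaxed threshold
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the sketch deterministic and easy to test.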
  • Patent number: 11694039
    Abstract: Based on a detection that a customer has arrived at an enterprise location to pick up a previously-placed order, an intelligent automated customer dialogue system generates an interface via which an intelligent customer dialogue application dialogues with the customer. The application generates and initially offers, at the interface using natural language, content which is contextual to one or more items of the order, e.g., by using a specially trained intelligent dialogue machine learning model. The application may intelligently respond to the customer's natural language responses and/or requests to refine, augment, or redirect subsequently-offered content and/or dialogue, e.g., by using the model. Offered content (e.g., product information, services, coupons, suggestions, recommendations, etc.) generally provides value-add to the customer as well as maintains customer engagement.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: July 4, 2023
    Assignee: WALGREEN CO.
    Inventor: Oliver Derza
  • Patent number: 11688293
    Abstract: Techniques are described for providing educational media content. According to an embodiment, a system for providing, by a processor, educational media content items based on a determined context of a vehicle or driver of the vehicle is described. The system can comprise a context component that can determine a context of a vehicle or a driver of the vehicle, with the context component employing at least one of artificial intelligence or machine learning to facilitate inferring intent of the driver. The system can comprise a vehicle education component that can perform a utility-based analysis in connection with selecting a media content item relating to a feature of the vehicle based on the determined context, the inferred driver intent and the utility-based analysis. Further, the system can comprise a media component that can output the selected media content item.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: June 27, 2023
    Assignee: VOLVO CAR CORPORATION
    Inventors: Magnus Rönnäng, Staffan Davidsson
  • Patent number: 11688402
    Abstract: Features are disclosed for performing functions in response to user requests. Natural Language Understanding (“NLU”) processing may be performed to generate command data that represents a subject of an utterance. The command data may be sent to an application that causes presentation of first output content in a first modality at a first time in response to receiving the command data, and generates second output content in a second modality different from the first modality, wherein the second output content is associated with the first output content. The second output content may be presented in the second modality at a second time subsequent to the first time.
    Type: Grant
    Filed: July 2, 2020
    Date of Patent: June 27, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Nishant Kumar, David Robert Thomas, Sumedha Arvind Kshirsagar, Vikas Jain, Jeff Bradley Beal, Ajay Gopalakrishnan, Shishir Sridhar Bharathi
  • Patent number: 11681864
    Abstract: In some embodiments, a method is provided for updating an editing parameter for a model for automatically suggesting revisions to text data. The method may include displaying, on a graphical user interface (GUI) of a user device, one or more interactive input elements, wherein each of the one or more input elements is associated with an editing parameter for a model for automatically suggesting revisions to text data. The method may include receiving, via the GUI, an input from a selected input element of the one or more input elements, wherein the input comprises an indication of a value for a selected editing parameter associated with the selected input element. The method may include updating the selected editing parameter for the model based on the value. The method may include using the model with the updated selected editing parameter to apply an edit operation to an obtained text-under-analysis.
    Type: Grant
    Filed: December 27, 2021
    Date of Patent: June 20, 2023
    Assignee: BLACKBOILER, INC.
    Inventors: Liam Roshan Dunan Emmart, Jonathan Herr, Daniel P. Broderick, Daniel Edward Simonson
  • Patent number: 11676593
    Abstract: Methods, systems, and computer program products for training an artificial intelligence (AI) of a voice response system. Aspects include receiving, by the voice response system from a user, a voice command to perform a requested action and interpreting, by an AI model, the voice command. Aspects also include performing an action based on the interpretation of the voice command and receiving non-verbal feedback from the user. Aspects further include updating the AI model based on a determination that the non-verbal feedback indicates that the user is not satisfied with the action performed.
    Type: Grant
    Filed: December 1, 2020
    Date of Patent: June 13, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shikhar Kwatra, Paul N. Krystek, Sushain Pandit, Sarbajit K. Rakshit
  • Patent number: 11676596
    Abstract: In an approach to creation and execution of dialog shortcuts, responsive to detecting initiation of a dialog, an utterance is received from a user. Whether the utterance contains an objective of the user is determined, where the objective is chosen from a group including create a shortcut, execute the shortcut, modify the shortcut, and delete the shortcut. Responsive to determining that the utterance contains the objective, the objective is implemented.
    Type: Grant
    Filed: March 2, 2021
    Date of Patent: June 13, 2023
    Assignee: International Business Machines Corporation
    Inventors: Danish Contractor, Sachindra Joshi
  • Patent number: 11676701
    Abstract: Systems and methods are provided for automatically marking locations within a radiograph of one or more dental pathologies, anatomies, anomalies or other conditions determined by automated image analysis of the radiograph by a number of different machine learning models. Image annotation data may be generated based at least in part on obtained results associated with output of the multiple machine learning models, where the image annotation data indicates at least one location in the radiograph and an associated dental pathology, restoration, anatomy or anomaly detected at the at least one location by at least one of the machine learning models. A number of different pathologies may be identified and their locations marked within a single radiograph image.
    Type: Grant
    Filed: September 5, 2019
    Date of Patent: June 13, 2023
    Assignee: Pearl Inc.
    Inventors: Cambron Neil Carter, Nandakishore Puttashamachar, Rohit Sanjay Annigeri, Joshua Alexander Tabak, Nishita Kailashnath Sant, Ophir Tanz, Adam Michael Wilbert, Mustafa Alammar
  • Patent number: 11670281
    Abstract: In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determine a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generate audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.
    Type: Grant
    Filed: January 20, 2021
    Date of Patent: June 6, 2023
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Jakob Nicolaus Foerster
  • Patent number: 11670296
    Abstract: A voice processing system includes a command specifier that specifies a command based on a first voice; a command processor that causes the specified command to be executed for a control target; a command determiner that determines whether or not the specified command is a repeated command; and an instruction determiner that determines whether or not a second voice, which corresponds to an execution instruction word indicating an instruction for executing the repeated command, has been received, after the repeated command corresponding to the first voice is executed, when the specified command is the repeated command, wherein when the second voice is received after the repeated command is executed, the command processor causes the repeated command to be repeatedly executed.
    Type: Grant
    Filed: February 19, 2021
    Date of Patent: June 6, 2023
    Assignee: SHARP KABUSHIKI KAISHA
    Inventors: Keiko Hirukawa, Yuuki Iwamoto, Satoshi Terada
  • Patent number: 11665283
    Abstract: A system and method for mobile device active callback integration, utilizing a callback integration engine operating on a user's mobile device that presents a callback token for integration through the operating system and software applications operating on the device, wherein interacting with the callback token produces a callback object used to execute a callback incorporating device hardware, context, scheduling, and trust information.
    Type: Grant
    Filed: November 15, 2022
    Date of Patent: May 30, 2023
    Assignee: VIRTUAL HOLD TECHNOLOGY SOLUTIONS, LLC
    Inventors: Matthew DiMaria, Shannon Lekas, Kurt Nelson, Nicholas James Kennedy, Brian R. Galvin, Daniel Bohannon
  • Patent number: 11657804
    Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
    Type: Grant
    Filed: November 5, 2020
    Date of Patent: May 23, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
  • Patent number: 11657807
    Abstract: A multi-tier architecture is provided for processing user voice queries and making routing decisions for generating responses, including responses to book browsing requests and other content requests. When an utterance is associated with multiple applications in a given domain, the applications may be organized into a subdomain and a tier of routing decisions may be added to the inter-domain and intra-domain routing decision system. The system uses contextual signals to make subdomain routing decisions, including signals regarding content items that are already in a user's content catalog, consumption status of individual content items in the user's catalog, and the like.
    Type: Grant
    Filed: June 24, 2021
    Date of Patent: May 23, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Ponnu Jacob, Jingqian Zhao, Prathap Ramachandra, Uday Kumar Kollu, Lior Maor Maimon, Sean Gunnar Skaar
  • Patent number: 11651608
    Abstract: A system for generating whole body poses includes: a body regression module configured to generate a first pose of a body of an animal in an input image by regressing from a stored body anchor pose; a face regression module configured to generate a second pose of a face of the animal in the input image by regressing from a stored face anchor pose; an extremity regression module configured to generate a third pose of an extremity of the animal in the input image by regressing from a stored extremity anchor pose; and a pose module configured to generate a whole body pose of the animal in the input image based on the first pose, the second pose, and the third pose.
    Type: Grant
    Filed: September 13, 2022
    Date of Patent: May 16, 2023
    Assignees: NAVER CORPORATION, NAVER LABS CORPORATION
    Inventors: Philippe Weinzaepfel, Romain Bregier, Hadrien Combaluzier, Vincent Leroy, Gregory Rogez
  • Patent number: 11646024
    Abstract: A method includes determining a plurality of voice assistance systems located in a plurality of environments and receiving, from a headset of a user, a voice command from the user. The voice command lacks an identifier for a first voice assistance system of the plurality of voice assistance systems in a first environment of the plurality of environments. The method also includes predicting, based on the voice command, a subset of the plurality of voice assistance systems for executing the voice command and communicating, to the headset, images of environments of the plurality of environments in which the subset of the plurality of voice assistance systems are located. The method further includes detecting that the user selected, from the images, an image of the first environment that contains the first voice assistance system and in response, communicating the voice command to the first voice assistance system.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: May 9, 2023
    Assignee: International Business Machines Corporation
    Inventors: Venkata Vara Prasad Karri, Abhishek Jain, Sarbajit K. Rakshit, Khader Saheb Shaik, Saraswathi Sailaja Perumalla
  • Patent number: 11645473
    Abstract: Systems, computer-implemented methods, and computer program products that can facilitate predicting a source of a subsequent spoken dialogue are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a speech receiving component that can receive a spoken dialogue from a first entity. The computer executable components can further comprise a speech processing component that can employ a network that can concurrently process a transition type and a dialogue act of the spoken dialogue to predict a source of a subsequent spoken dialogue.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: May 9, 2023
    Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, THE REGENTS OF THE UNIVERSITY OF MICHIGAN
    Inventors: Lazaros Polymenakos, Dimitrios B. Dimitriadis, Zakaria Aldeneh, Emily Mower Provost
  • Patent number: 11645277
    Abstract: Implementations relate to providing, in response to a query, machine learning model output that is based on output from a trained machine learning model. The machine learning model output can include a predicted answer to the query, that is predicted based on the trained machine learning model. The machine learning model output can additionally or alternatively include an interactive interface for the trained machine learning model. Some implementations relate to generating a trained machine learning model “on the fly” based on a search query. Some implementations additionally or alternatively relate to storing, in a search index, an association of a machine learning model with a plurality of content items from resource(s) on which the machine learning model was trained.
    Type: Grant
    Filed: December 11, 2017
    Date of Patent: May 9, 2023
    Assignee: GOOGLE LLC
    Inventors: Steven Ross, Christopher Farrar
  • Patent number: 11640493
    Abstract: Disclosed is a method for dialogue summarization with word graphs, which is performed by one or more processors of a computing device. The method may include: generating a word graph based on information on a dialogue which is a summary target; extracting at least one keyword based on the generated word graph; generating a plurality of candidate summary sentences based on the generated word graph; and calculating a score associated with at least one keyword for each of the plurality of candidate summary sentences, and selecting at least one of the plurality of candidate summary sentences based on the calculated score.
    Type: Grant
    Filed: August 3, 2022
    Date of Patent: May 2, 2023
    Assignee: ActionPower Corp.
    Inventors: Seongmin Park, Jihwa Lee
  • Patent number: 11630924
    Abstract: Methods, apparatuses, and non-transitory machine-readable media associated with sharing data with a particular audience are described. Examples can include receiving first data at a processing resource, determining whether the first data comprises a combination of bits associated with text or an image, or both, and comparing the combination of bits to second data stored on a memory resource. Examples can include identifying one or more words or one or more images represented by the first data, or both, based on the comparison and assigning to the first data first metadata representative of a first security categorization and a first confidence level and second metadata representative of a second security categorization and a second confidence level. Examples can include transmitting an output that comprises the first data or third data that comprises a modified combination of bits relative to the combination of bits of the first data.
    Type: Grant
    Filed: August 28, 2020
    Date of Patent: April 18, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Bhagyashree Bokade, Anusha Gunda, Lisa R. Copenspire-Ross
  • Patent number: 11620997
    Abstract: Provided is an information processing device that includes a determination unit that determines whether an object that outputs voice is a dialogue target related to voice dialogue based on a result of recognition of an input image, and a dialogue function unit that performs control related to the voice dialogue based on the determination. The dialogue function unit provides a voice dialogue function to the object based on the determination that the object is the dialogue target. Further provided is a method that includes determining whether an object that outputs voice is a dialogue target related to voice dialogue based on a result of recognition of an input image, and performing control related to the voice dialogue based on a result of the determining. The performing of the control further includes providing a voice dialogue function to the object based on the determination that the object is the dialogue target.
    Type: Grant
    Filed: January 23, 2019
    Date of Patent: April 4, 2023
    Assignee: SONY CORPORATION
    Inventors: Hiromi Kurasawa, Kazumi Aoyama, Yasuharu Asano
  • Patent number: 11620981
    Abstract: According to one embodiment, a speech recognition error correction apparatus includes a correction network memory and an error correction circuitry. The error correction circuitry calculates a difference between a speech recognition result string of an error correction target, which is a result of performing speech recognition on a new series of speech data, and a correction network, where a speech recognition result string and a correction result by a user for the speech recognition result string are associated, and when a value indicating the difference is equal to or less than a threshold, performs error correction on a speech recognition error portion in the speech recognition result string of the error correction target by using the correction network to generate a speech recognition error correction result string.
    Type: Grant
    Filed: September 4, 2020
    Date of Patent: April 4, 2023
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Taira Ashikawa, Hiroshi Fujimura, Kenji Iwata
  • Patent number: 11615146
    Abstract: An information processing device includes a network interface and a processor. The processor is configured to: acquire voice data via the network interface, analyze the acquired voice data, based on a result of the analysis, determine a search condition including one or more keywords for searching for one or more items, perform a search using the determined search condition, generate a first text indicating an item found by the search, and control the network interface to output the generated first text. The processor is further configured to, when two or more items are found by the search, generate a second text suggesting another keyword other than said one or more keywords that have been used for the search, and control the network interface to output the generated second text.
    Type: Grant
    Filed: February 18, 2021
    Date of Patent: March 28, 2023
    Assignee: Toshiba Tec Kabushiki Kaisha
    Inventors: Shogo Watada, Naoki Sekine
  • Patent number: 11604831
    Abstract: A dialogue device enabling speech capable of improving a degree of intimacy with a user or a user satisfaction is provided. An input information acquiring unit (101) configured to acquire input information from a user, a focus information acquiring unit (103) configured to acquire focus information representing a focus in the input information, a user profile DB (110) configured to store profile information of the user and date and time information at which the profile information is registered in association with each other, a profile information acquiring unit (107) configured to acquire the profile information in accordance with a priority level determined on the basis of the date and time information from a user profile corresponding to the focus information stored in the user profile DB (110), and a speech generating unit (108) configured to generate a speech sentence (speech information) corresponding to the user profile are included.
    Type: Grant
    Filed: April 25, 2019
    Date of Patent: March 14, 2023
    Assignee: NTT DOCOMO, INC.
    Inventor: Yuiko Tsunomori
  • Patent number: 11600278
    Abstract: A vehicle includes a plurality of microphones to obtain speech from a person outside the vehicle as an input signal and a sensor system to determine a location and orientation of the person relative to the vehicle. The vehicle also includes a controller to determine characteristics of the input signal and to determine whether to perform speech enhancement on the input signal based on one or more of the characteristics and the location and orientation of the person.
    Type: Grant
    Filed: April 19, 2021
    Date of Patent: March 7, 2023
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Alaa M. Khamis, Gaurav Talwar, Romeo D. Garcia, Jr., Carmine F. D'agostino, Neeraj R. Gautama
  • Patent number: 11594211
    Abstract: Methods and systems for correcting transcribed text. One method includes receiving audio data from one or more audio data sources and transcribing the audio data based on a voice model to generate text data. The method also includes making the text data available to a plurality of users over at least one computer network and receiving corrected text data over the at least one computer network from the plurality of users. In addition, the method can include modifying the voice model based on the corrected text data.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: February 28, 2023
    Assignee: III Holdings 1, LLC
    Inventor: Paul M. Hager
  • Patent number: 11587552
    Abstract: A system and a method are disclosed for alerting a manager device to an occurrence of an event at an agent device during a conversation between the agent device and an external party. In an embodiment, a processor receives transcript data during a conversation between the agent device and the external party. The processor normalizes the transcript data, and inputs the normalized transcript data into a machine learning model, the machine learning model trained to identify an inflection point in the conversation. The processor receives, as output from the machine learning model, a measure of notability of the normalized transcript data. The processor determines whether the measure of notability corresponds to an inflection point, and, responsive to determining that the measure of notability corresponds to an inflection point, alerts the manager device.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: February 21, 2023
    Assignee: Sutherland Global Services Inc.
    Inventors: Eric Jee-Keng Dunn, Dmytro Kovalchuk, Brenton William D'Adamo
  • Patent number: 11580974
    Abstract: A method for exiting a voice skill, an apparatus, a device, and a storage medium are provided by embodiments of the present disclosure, wherein a user voice instruction is received; a target exit intention corresponding to the user voice instruction is identified according to the user voice instruction and a grammar rule of a preset exit intention; and a corresponding operation is executed on a current voice skill of a device according to the target exit intention. The embodiments of the present disclosure refine and expand the user's exit intention. After the target exit intention to which the user voice instruction belongs is identified, the corresponding operation is executed according to the target exit intention so as to meet the users' different exit requirements for the voice skills, enhance the fluency and convenience of user interaction with the device and improve the user's exit experience when using the voice skills.
    Type: Grant
    Filed: June 29, 2020
    Date of Patent: February 14, 2023
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Huan Tang, Xiao Zhou, Liangcheng Wu