Speech To Text Patents (Class 704/235)
  • Patent number: 12294676
    Abstract: A method and system provide for receiving a first request to generate a pipeline flow, identifying a user account based on the first request, associating the pipeline flow with the user account, receiving a second request to process an action associated with the user account, and processing the action, including applying the pipeline flow to select a component configuration based on a parameter.
    Type: Grant
    Filed: August 16, 2022
    Date of Patent: May 6, 2025
    Assignee: Twilio Inc.
    Inventors: Christer Jan Erik Fahlgren, Umair Akeel
  • Patent number: 12293825
    Abstract: In some aspects, a method of using a virtual medical assistant to assist a medical professional is provided, the virtual medical assistant implemented, at least in part, by at least one processor of a host device capable of connecting to at least one network. The method comprises receiving free-form instruction from the medical professional, providing the free-form instruction for processing to assist in identifying from the free-form instruction at least one medical task to be performed, obtaining identification of at least one impediment to performing the at least one medical task, and inferring at least some information needed to overcome the at least one impediment.
    Type: Grant
    Filed: July 7, 2022
    Date of Patent: May 6, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Guido Remi Marcel Gallopyn, Justin Hubbard, Reid W. Coleman
  • Patent number: 12293134
    Abstract: According to some embodiments, a method includes: receiving, by a client device, speech of a user during a screen sharing session; transcribing, by the client device, the speech into text; analyzing, by the client device, the text to identify one or more UI elements referenced within the speech, the one or more UI elements visible within the screen sharing session; and highlighting the one or more UI elements visible on the client device.
    Type: Grant
    Filed: November 24, 2021
    Date of Patent: May 6, 2025
    Inventors: Hao Wu, Taodong Lu, Yu Xin
  • Patent number: 12288031
    Abstract: Filtering user intents corresponding to user utterances is provided. A list of allowed user intents is generated, using a natural language understanding model of a chatbot, based on identifying one or more of a set of user intents corresponding to a user utterance within a filtered user intent mapping table. It is determined whether a user intent having a highest confidence score in the set of user intents corresponding to the user utterance is contained in the list of allowed user intents. In response to determining that the user intent having the highest confidence score in the set of user intents corresponding to the user utterance is contained in the list of allowed user intents, content corresponding to the user intent having the highest confidence score is sent, using the chatbot, to a client device of a user who submitted the user utterance as a response to the user utterance.
    Type: Grant
    Filed: July 13, 2022
    Date of Patent: April 29, 2025
    Assignee: ADP, Inc.
    Inventors: Henry C. Will, IV, Stefan George Wilk
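    The allow-list filtering described in the abstract above can be sketched in Python. The function names, score values, and data shapes below are illustrative assumptions, not taken from the patent:

    ```python
    from typing import Dict, List, Optional

    def respond_to_utterance(
        intent_scores: Dict[str, float],
        allowed_intents: List[str],
        content_by_intent: Dict[str, str],
    ) -> Optional[str]:
        """Return content for the top-scoring intent only if it is allowed.

        intent_scores maps each candidate intent to its NLU confidence;
        allowed_intents stands in for the filtered user intent mapping table.
        """
        if not intent_scores:
            return None
        # Pick the user intent with the highest confidence score.
        top_intent = max(intent_scores, key=intent_scores.get)
        # Only respond when the top intent survives the allow-list filter.
        if top_intent in allowed_intents:
            return content_by_intent.get(top_intent)
        return None
    ```

    The key behavior is that a high-confidence intent outside the allow-list produces no response at all, rather than falling back to the next-best intent.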
  • Patent number: 12288555
    Abstract: An electronic device receives a voice input, and determines whether the voice input is matched with a natural language understanding (NLU) model for determining the presence or absence of a verb. The electronic device further identifies a display context object associated with the voice input based on the voice input being matched with the NLU model. The electronic device calculates a similarity value between the voice input and the display context object, and updates a user interface (UI) depending on the calculated similarity value.
    Type: Grant
    Filed: September 14, 2022
    Date of Patent: April 29, 2025
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Yoonju Lee, Dongwan Kim, Juwhan Kim, Yoonjae Park
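    The utterance-to-on-screen-object matching in the abstract above can be sketched with a simple token-overlap similarity. The Jaccard measure and the 0.5 threshold are illustrative assumptions; the patent does not specify the similarity metric:

    ```python
    from typing import List, Optional

    def similarity(voice_input: str, context_object_label: str) -> float:
        """Token-level Jaccard similarity between an utterance and a UI label."""
        a = set(voice_input.lower().split())
        b = set(context_object_label.lower().split())
        if not a or not b:
            return 0.0
        return len(a & b) / len(a | b)

    def should_update_ui(
        voice_input: str, labels: List[str], threshold: float = 0.5
    ) -> Optional[str]:
        """Return the best-matching display context object if it clears the threshold."""
        best = max(labels, key=lambda lbl: similarity(voice_input, lbl), default=None)
        if best is not None and similarity(voice_input, best) >= threshold:
            return best
        return None
    ```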
  • Patent number: 12284582
    Abstract: The system receives a SIP call from a UE associated with a caller, where the SIP call includes multiple fields, and where a field among the multiple fields indicates that the SIP call is an open-line call. The system receives a first indication that a UE associated with a callee has generated a notification to the callee of the SIP call. Upon identifying the SIP call as an open-line call and receiving the first indication, the system sends, to the UE associated with the caller, a first message indicating that the UE associated with the callee has generated the notification. The first message indicates to the UE associated with the caller to generate a first inaudible notification indicating that the UE associated with the callee generated the notification.
    Type: Grant
    Filed: March 15, 2022
    Date of Patent: April 22, 2025
    Assignee: T-Mobile USA, Inc.
    Inventors: Hsin-Fu Henry Chiang, William Michael Hooker
  • Patent number: 12282745
    Abstract: An intelligent question answering method includes: determining, based on received question information, a target object and a target attribute corresponding to the question information; obtaining an answer knowledge path and an external knowledge path of the target object other than the answer knowledge path from a pre-established knowledge graph based on the target object and the target attribute, the answer knowledge path including target context information for describing the target attribute, and the external knowledge path including external context information for describing another attribute; inputting the answer knowledge path and the external knowledge path into a trained neural network model to obtain a reply text, a training corpus of the neural network model during training including at least comment information of the target object; and outputting the reply text.
    Type: Grant
    Filed: March 14, 2022
    Date of Patent: April 22, 2025
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Xiaoxue Liu, Yuyao Tang, Ninghua Wang, He Liu
  • Patent number: 12283291
    Abstract: Systems, devices, and methods are provided for determining factually consistent generative narrations. A narrative may be generated by performing steps to determine one or more metadata messages for a first portion of a video stream, determine transcribed commentary for a second portion of the video stream, wherein the second portion includes the first portion, and determine a prompt based at least in part on the one or more metadata messages and the transcribed commentary. The prompt may be provided to a generative model that produces an output text. Techniques for performing a factual consistency evaluation may be used to determine a consistency score for the output text that indicates whether the output text is factually consistent with the one or more metadata messages and the transcribed commentary. A narrated highlight video may be generated using the consistent narrative.
    Type: Grant
    Filed: August 16, 2023
    Date of Patent: April 22, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Noah Lirone Sarfati, Ido Yerushalmy, Michael Chertok, Ianir Ideses
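    The factual consistency evaluation in the abstract above can be sketched as a grounding check: what fraction of the generated narration's tokens appear in the metadata messages or transcribed commentary. The token-overlap scoring and the 0.8 threshold are illustrative assumptions, not the patent's evaluation technique:

    ```python
    from typing import List

    def consistency_score(output_text: str, sources: List[str]) -> float:
        """Fraction of output tokens grounded in the metadata/commentary sources."""
        source_tokens = set()
        for s in sources:
            source_tokens |= set(s.lower().split())
        out_tokens = output_text.lower().split()
        if not out_tokens:
            return 0.0
        grounded = sum(1 for t in out_tokens if t in source_tokens)
        return grounded / len(out_tokens)

    def is_consistent(output_text: str, sources: List[str], threshold: float = 0.8) -> bool:
        """Accept the generated narration only if it is sufficiently grounded."""
        return consistency_score(output_text, sources) >= threshold
    ```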
  • Patent number: 12277870
    Abstract: An item generation interface may generate knowledge assessment items directed to a subject area based on a set of model assessment items collectively directed to the subject area. The item generation interface may group the set of model assessment items into a plurality of similar item groups using numeric features corresponding to the model assessment items. Similar item groups may include model assessment items covering conceptually similar concepts within the subject area. A conditioning input may be generated for each of the item groups based on the numeric features corresponding to the model assessment items in the item group. Responsive to providing the conditioning inputs to a transformer-based natural language generation model, the item generation interface may receive raw assessment items from the transformer-based natural language generation model. Knowledge assessment items may be identified from the raw assessment items.
    Type: Grant
    Filed: June 27, 2022
    Date of Patent: April 15, 2025
    Assignee: Prometric LLC
    Inventors: Saad Masood Khan, Jesse Andrew Lewis Hamer, Tiago Lima Almeida, Charles Foster, Geoff Converse, Claudio Souza, Lucas Cezimbra, Sara Vispoel
  • Patent number: 12274940
    Abstract: A method and system for providing gaze-based generation of virtual effects indicators correlated with directional sounds is disclosed. Gaze data is tracked via a camera associated with a client device to identify a point of focus within a three-dimensional virtual environment towards which one or both eyes of the player are focused. When the point of focus indicated by the gaze data does not move towards the source location within the three-dimensional virtual environment when the directional sound is received, this indicates that a virtual effect indicator associated with the directional sound type of the indicated directional sound should be generated.
    Type: Grant
    Filed: April 17, 2024
    Date of Patent: April 15, 2025
    Assignees: Sony Interactive Entertainment LLC, Sony Interactive Entertainment Inc.
    Inventors: Kristie Ramirez, Elizabeth Juenger, Katie Egeland, Sepideh Karimi, Lachmin Singh, Olga Rudi
  • Patent number: 12266363
    Abstract: The present disclosure provides methods, devices, apparatus, and storage medium for performing speech-to-text conversion. The method includes: displaying, by a first device, a first user interface, the first user interface being a display screen of a virtual environment that provides a virtual activity place for a first virtual role controlled by a first user account; displaying, by a second device, a second user interface, the second user interface being a display screen of a virtual environment that provides a virtual activity place for a second virtual role controlled by a second user account; in response to a speech input operation by the first user account performed on the first device, displaying, by the first device, a chat message in a first language, and displaying, by the second device, the chat message in a second language.
    Type: Grant
    Filed: October 13, 2021
    Date of Patent: April 1, 2025
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Peicheng Liu, Xiaohao Liu, Yancan Wang, Dong Ding, Kai Tang, Shan Lin
  • Patent number: 12266345
    Abstract: This disclosure relates generally to ASR and is particularly directed to automatic, efficient, and intelligent detection of transcription bias in ASR models. Contrary to traditional approaches to the testing of ASR bias, the example implementations disclosed herein do not require actual test speeches and corresponding ground-truth texts. Instead, test speeches may be machine-generated from a pre-constructed reference textual passage according to short speech samples of speakers using a neural voice cloning technology. The reference passage may be constructed according to a particular target domain of the ASR model being tested. Bias of the ASR model in various aspects may be identified by analyzing the transcribed text from the machine-generated speeches against the reference textual passage. The underlying principles for bias detection may be applied to evaluation of general transcription effectiveness and accuracy of the ASR model.
    Type: Grant
    Filed: August 25, 2022
    Date of Patent: April 1, 2025
    Assignee: Accenture Global Solutions Limited
    Inventors: Anup Bera, Hemant Palivela
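    Because every cloned speaker reads the same reference passage, bias in the approach above reduces to comparing per-group word error rates against a shared reference. A minimal sketch (the gap metric and grouping are illustrative assumptions, not the patent's analysis):

    ```python
    from typing import Dict

    def word_error_rate(reference: str, hypothesis: str) -> float:
        """Word-level Levenshtein distance, normalized by reference length."""
        ref, hyp = reference.split(), hypothesis.split()
        # dp[i][j] = edit distance between ref[:i] and hyp[:j]
        dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dp[i][0] = i
        for j in range(len(hyp) + 1):
            dp[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1,      # deletion
                               dp[i][j - 1] + 1,      # insertion
                               dp[i - 1][j - 1] + cost)  # substitution
        return dp[len(ref)][len(hyp)] / max(len(ref), 1)

    def bias_gap(reference: str, transcripts_by_group: Dict[str, str]) -> float:
        """Spread between worst and best group WER on the shared reference passage."""
        wers = [word_error_rate(reference, t) for t in transcripts_by_group.values()]
        return max(wers) - min(wers)
    ```

    A large `bias_gap` suggests the ASR model transcribes some speaker groups systematically worse than others.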
  • Patent number: 12266359
    Abstract: Implementations described herein generally relate to generating a modification selectable element that may be provided for presentation to a user in a smart dictation session with an automated assistant. The modification selectable element may, when selected, cause a transcription, that includes textual data generated based on processing audio data that captures a spoken utterance and that is automatically arranged, to be modified. The transcription may be automatically arranged to include spacing, punctuation, capitalization, indentations, paragraph breaks, and/or other arrangement operations that are not specified by the user in providing the spoken utterance. Accordingly, a subsequent selection of the modification selectable element may cause these automatic arrangement operation(s), and/or the textual data locationally proximate to these automatic arrangement operation(s), to be modified.
    Type: Grant
    Filed: September 2, 2022
    Date of Patent: April 1, 2025
    Assignee: GOOGLE LLC
    Inventors: Nicolo D'Ercole, Shumin Zhai, Swante Scholz, Mehek Sharma, Adrien Olczak, Akshay Kannan, Alvin Abdagic, Julia Proskurnia, Viesturs Zarins
  • Patent number: 12265581
    Abstract: Multi-modal search systems with improved search request routing are provided. A device can include a module that identifies, based on content of a search request, provider criterion that indicates factors to be considered in making a routing decision, a criterion processor that determines, based on the provider criterion, a routing decision indicating whether to route the search request to a search engine or a chat engine based, at least in part, respective compute costs of servicing the search request using the search engine and the chat engine, respectively, and respective accuracies of responses provided responsive to the search request using the search engine and the chat engine, respectively, and an output port coupled to receive the search request and to provide the search request to the search engine or the chat engine in accord with the routing decision.
    Type: Grant
    Filed: September 28, 2023
    Date of Patent: April 1, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Ryen William White
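    The routing decision in the abstract above weighs compute cost against expected accuracy for each engine. A minimal sketch, assuming a simple linear accuracy-minus-weighted-cost score (the patent does not specify the scoring function):

    ```python
    def route_request(
        search_cost: float,
        search_accuracy: float,
        chat_cost: float,
        chat_accuracy: float,
        cost_weight: float = 0.5,
    ) -> str:
        """Score each engine as accuracy minus weighted compute cost; route to the winner."""
        search_score = search_accuracy - cost_weight * search_cost
        chat_score = chat_accuracy - cost_weight * chat_cost
        # Ties favor the cheaper search engine.
        return "search" if search_score >= chat_score else "chat"
    ```

    With this scheme, a chat engine must beat the search engine on accuracy by more than its extra compute cost (scaled by `cost_weight`) to win the request.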
  • Patent number: 12266197
    Abstract: Systems and computer-implemented methods disclosed herein relate to detecting errors in manually entered data. In one embodiment, the system can identify a named entity automatically from a conversation between a customer and service agent with a named entity recognition model that employs natural language processing and machine learning to detect a word or string of words in the conversation that corresponds to a named entity category. In another embodiment, the system can determine whether data entered into a field on a service platform by the service agent includes an error by comparing the data entered with the named entity. In another embodiment, the system can transmit an alert to the service agent through the service platform when there is a mismatch between the named entity and the data entered.
    Type: Grant
    Filed: April 28, 2022
    Date of Patent: April 1, 2025
    Assignee: Capital One Services, LLC
    Inventors: Tyler Maiman, Joshua Edwards, Feng Qiu, Michael Mossoba, Alexander Lin, Meredith L Critzer, Guadalupe Bonilla, Vahid Khanagha, Mia Rodriguez, Aysu Ezen Can
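    The mismatch check in the abstract above can be sketched as a normalized comparison between the field the agent typed and the entity recognized from the conversation. The normalization rule and alert format are illustrative assumptions:

    ```python
    from typing import Optional

    def normalize(value: str) -> str:
        """Case-fold and strip punctuation/whitespace before comparing."""
        return "".join(ch for ch in value.lower() if ch.isalnum())

    def check_field(entered_value: str, recognized_entity: str) -> Optional[str]:
        """Return an alert string when the entered field disagrees with the entity."""
        if normalize(entered_value) == normalize(recognized_entity):
            return None
        return (
            f"Mismatch: field says {entered_value!r}, "
            f"conversation says {recognized_entity!r}"
        )
    ```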
  • Patent number: 12260841
    Abstract: Generally discussed herein are techniques, software, apparatuses, and systems configured for presenting information to a user through multiple primary user interfaces. In one or more embodiments, a method can include providing first data to a first primary user interface that causes the first primary user interface to provide a first view of an application state of a software application, providing second data to a second primary user interface that causes the second primary user interface to provide a second view of the application state, changing the application state to a new application state, and in response to changing the application state, providing data to the first and second view modules so as to cause the first primary user interface to provide a first view of the new application state and the second primary user interface to provide a second view of the new application state.
    Type: Grant
    Filed: June 22, 2021
    Date of Patent: March 25, 2025
    Assignee: Hand Held Products, Inc.
    Inventors: Jeffrey Pike, Shawn Zabel, Dennis Doubleday, Brian Bender, Mark David Murawski
  • Patent number: 12260852
    Abstract: A method of training a speech recognition model is provided. The method includes: inputting speech data of each of a plurality of training samples into a teacher model and a to-be-trained speech recognition model separately; obtaining an embedding outputted by the teacher model and encoded data outputted by the to-be-trained speech recognition model; obtaining quantized codebook data by performing a multi-codebook quantization on the embedding; calculating a loss based on the encoded data, the quantized codebook data, and text data in the training sample; and obtaining a trained speech recognition model by stopping training when the loss is less than or equal to a preset loss threshold and/or the number of training iterations is greater than a preset number.
    Type: Grant
    Filed: December 9, 2022
    Date of Patent: March 25, 2025
    Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
    Inventors: Zengwei Yao, Liyong Guo, Povey Daniel, Long Lin, Fangjun Kuang, Wei Kang, Mingshuang Luo, Quandong Wang, Yuxiang Kong
  • Patent number: 12256120
    Abstract: A system may include a memory and a processor in communication therewith configured to perform operations. The operations may include receiving an audio file and a text file related to the audio file, analyzing the audio file to produce an analysis, and determining a portion of the audio file is similar to a segment of the text file. The operations may include identifying a first terminal signal and corresponding the first terminal signal to a first terminal tag in the text file such that the first terminal tag is aligned with the first terminal signal; the first terminal signal identifies a first portion terminal end of the portion and the first terminal tag identifies a first segment terminal end of the segment. The operations may include generating a converted text from the analysis and inserting the segment into the converted text.
    Type: Grant
    Filed: September 28, 2021
    Date of Patent: March 18, 2025
    Assignee: International Business Machines Corporation
    Inventors: Charles Steven Lingafelt, Gina Renee Howard, John E. Moore, Jr., Andrew R. Jones
  • Patent number: 12254874
    Abstract: An automated speech recognition (ASR) transcript of at least a portion of a media content is obtained from an ASR tool. Suggested words are received for corrected words of the ASR transcript of the media content. Features are obtained using at least the suggested words or the corrected words. The features include features relating to sound similarities between the suggested words and the corrected words. The features are input into a machine learning (ML) model to obtain a determination regarding a validity of the suggested words. Responsive to the suggested words constituting a valid suggestion, the suggested words are incorporated into the ASR transcript. At least a portion of the ASR transcript is transmitted to a user device in conjunction with at least a portion of the media content.
    Type: Grant
    Filed: February 20, 2022
    Date of Patent: March 18, 2025
    Assignee: GOOGLE LLC
    Inventors: Dirk Padfield, Noah Murad, Edward Lo, Bryan Huh
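    The sound-similarity features in the abstract above can be sketched with a crude phonetic key plus string similarity. The vowel-dropping key, the feature set, and the threshold rule standing in for the trained ML model are all illustrative assumptions:

    ```python
    from difflib import SequenceMatcher
    from typing import Dict

    def sound_key(word: str) -> str:
        """Crude phonetic key: keep the first letter, drop later vowels (Soundex-like)."""
        w = word.lower()
        return w[:1] + "".join(c for c in w[1:] if c not in "aeiou")

    def suggestion_features(corrected: str, suggested: str) -> Dict[str, float]:
        """Features a validity model could consume: spelling and sound similarity."""
        return {
            "spelling_sim": SequenceMatcher(None, corrected, suggested).ratio(),
            "sound_sim": SequenceMatcher(None, sound_key(corrected), sound_key(suggested)).ratio(),
        }

    def is_valid_suggestion(corrected: str, suggested: str, threshold: float = 0.6) -> bool:
        """Stand-in for the trained model: accept when the words sound alike."""
        return suggestion_features(corrected, suggested)["sound_sim"] >= threshold
    ```

    Homophone-style corrections ("there" vs. "their") score high on sound similarity even when spelling similarity is modest, which is exactly the signal an ASR-correction validator needs.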
  • Patent number: 12256199
    Abstract: Embodiments herein relate to assistive listening devices and systems for providing audio streams to device wearers within sound fields. In an embodiment an assistive listening device is included having a control circuit, an electroacoustic transducer for generating sound in electrical communication with the control circuit, a power supply circuit in electrical communication with the control circuit, and a communications circuit in electrical communication with the control circuit. The control circuit can be configured to issue a communication to an audio communication device or audio provisioning device including at least one of a language preference, a set of hearing requirements, data regarding a presentation delay, and an authorization status identifier, digital code, digital token, or digital key specific to a wearer of the assistive listening device. Other embodiments are also included herein.
    Type: Grant
    Filed: November 17, 2023
    Date of Patent: March 18, 2025
    Assignee: Starkey Laboratories, Inc.
    Inventors: Justin R. Burwinkel, Peter J. Tetrick, Jeffrey Paul Solum, Thomas Scheller
  • Patent number: 12249341
    Abstract: An electronic apparatus includes an interface configured to connect with an external apparatus, and a processor. The processor is configured to, in response to a first user speech received by the electronic apparatus including a trigger word, identify a first noise level corresponding to the first user speech received by the electronic apparatus. The processor is configured to identify a first recognition apparatus among a plurality of recognition apparatuses having a highest first noise level corresponding to the first user speech. The plurality of recognition apparatuses identify the first user speech as the trigger word and include the electronic apparatus and the external apparatus. The processor is configured to perform a control operation to implement a function corresponding to a second user speech in response to identifying a second recognition apparatus as having a highest second noise level corresponding to the second user speech among the plurality of recognition apparatuses.
    Type: Grant
    Filed: October 20, 2021
    Date of Patent: March 11, 2025
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Chansik Bok
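    The device arbitration in the abstract above amounts to each apparatus reporting its measured level for the utterance and only the winner responding. A minimal sketch (device names and levels are illustrative; "level" here is the abstract's noise-level metric for the received speech):

    ```python
    from typing import Dict

    def select_responder(levels_by_device: Dict[str, float]) -> str:
        """Pick, among apparatuses that recognized the trigger word, the one
        reporting the highest level for the utterance."""
        return max(levels_by_device, key=levels_by_device.get)

    def handle_utterance(
        command_levels: Dict[str, float], this_device: str
    ) -> bool:
        """Only the device selected for the command speech executes the function."""
        return select_responder(command_levels) == this_device
    ```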
  • Patent number: 12249342
    Abstract: Systems, methods and non-transitory computer readable media for processing audio and visually presenting information are provided. Audio data captured from an environment of a wearer of a wearable apparatus may be obtained. The audio data may be analyzed to obtain textual information. The audio data may be analyzed to associate different portions of the textual information with different speakers. A head mounted display system may be used to present each portion of the textual information in a presentation region associated with the speaker associated with the portion of the textual information. In one example, it may be determined that a part of the textual information is associated with speech that does not involve the user, and presenting that part may be avoided. In one example, a part of the textual information may be associated with speech produced by the user, and presenting that part may be avoided.
    Type: Grant
    Filed: October 20, 2023
    Date of Patent: March 11, 2025
    Inventor: Ron Zass
  • Patent number: 12248755
    Abstract: A method for dynamically generating training data for a model includes receiving a transcript corresponding to a conversation between a customer and an agent, the transcript comprising a customer input and an agent input. The method includes receiving a logic model including a plurality of responses, each response of the plurality of responses representing a potential reply to the customer input. The method further includes selecting, based on the agent input, a response from the plurality of responses of the logic model. The method includes determining that a similarity score between the selected response and the agent input satisfies a similarity threshold, and, based on determining that the similarity score between the selected response and the agent input satisfies the similarity threshold, training a machine learning model using the customer input and the selected response.
    Type: Grant
    Filed: December 9, 2022
    Date of Patent: March 11, 2025
    Assignee: Google LLC
    Inventors: Joseph Lange, Henry Scott Dlhopolsky, Vladimir Vuskovic
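    The selection-and-threshold step in the abstract above can be sketched with a stdlib string similarity in place of whatever scorer the patent uses. The `SequenceMatcher` ratio and the 0.7 threshold are illustrative assumptions:

    ```python
    from difflib import SequenceMatcher
    from typing import List, Optional, Tuple

    def select_response(agent_input: str, candidate_responses: List[str]) -> str:
        """Pick the logic-model response most similar to what the agent actually said."""
        return max(
            candidate_responses,
            key=lambda r: SequenceMatcher(None, agent_input, r).ratio(),
        )

    def make_training_pair(
        customer_input: str,
        agent_input: str,
        candidate_responses: List[str],
        threshold: float = 0.7,
    ) -> Optional[Tuple[str, str]]:
        """Emit a (customer_input, response) training pair only when the agent's
        reply is close enough to a canonical response from the logic model."""
        best = select_response(agent_input, candidate_responses)
        score = SequenceMatcher(None, agent_input, best).ratio()
        if score >= threshold:
            return (customer_input, best)
        return None
    ```

    The point of the threshold is data hygiene: transcripts where the agent went off-script produce no training pair at all.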
  • Patent number: 12248387
    Abstract: Techniques for identifying user reaction in connection with an online application are disclosed. In one embodiment, a method is disclosed comprising obtaining activity data in connection with a user of an application, generating feature data using the obtained activity data, obtaining a user reaction prediction from a user reaction prediction model trained to output the user reaction prediction in response to the feature data. The user reaction prediction indicates a probability of the user reaction in connection with the application. A determination is made, using the user reaction prediction, whether or not to take a remedial action in connection with the user and the application.
    Type: Grant
    Filed: March 19, 2021
    Date of Patent: March 11, 2025
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: Subham Biswas, Prashanth Veerapandian
  • Patent number: 12249336
    Abstract: Embodiments are provided for building a configurable multilingual model. A computing system obtains a plurality of language-specific automatic speech recognition modules and a universal automatic speech recognition module trained on a multi-language training dataset comprising training data corresponding to each of the plurality of different languages. The computing system then compiles the universal automatic speech recognition module with the plurality of language-specific automatic speech recognition modules to generate a configurable multilingual model that is configured to selectively and dynamically utilize a sub-set of the plurality of language-specific automatic speech recognition modules with the universal automatic speech recognition module to process audio content in response to user input identifying one or more target languages associated with the audio content.
    Type: Grant
    Filed: June 29, 2021
    Date of Patent: March 11, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jinyu Li, Long Zhou, Xie Sun, Shujie Liu
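    The compile-then-select structure in the abstract above can be sketched as a container that always runs the universal module and activates language-specific modules on demand. Module objects are placeholders here; the class and method names are illustrative assumptions:

    ```python
    from typing import Dict, List

    class ConfigurableMultilingualASR:
        """Universal ASR module plus per-language modules; a subset is
        selected dynamically from user input identifying target languages."""

        def __init__(self, universal_module: object, language_modules: Dict[str, object]):
            self.universal = universal_module
            self.language_modules = language_modules

        def active_modules(self, target_languages: List[str]) -> List[object]:
            """The universal module always runs; language modules join on demand."""
            selected = [self.universal]
            for lang in target_languages:
                if lang in self.language_modules:
                    selected.append(self.language_modules[lang])
            return selected
    ```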
  • Patent number: 12249318
    Abstract: A method including: in response to voice information, acquiring a response object to which corresponding voice information is registered and which occupies an interactive medium; and informing the response object to respond to the voice information. The technical solution enables an operating system to automatically determine, from numerous response objects, a response object responding to voice information triggered by a user, in view of the voice information registered at each response object and the occupation of interactive media by each response object. The technical solution removes the restriction that a user must speak a wake-up phrase to activate a required response object each time before speaking voice information. Thus, a user only needs to speak short voice information that shows the intention of use during a process of using a certain response object, thereby improving the convenience of voice interaction.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: March 11, 2025
    Assignee: Alibaba Group Holding Limited
    Inventors: Yang Yang, Xinzheng Li, Kai Wang, Zhijun Yuan
  • Patent number: 12243511
    Abstract: A neural text-to-speech system may be configured to emphasize words. Applying emphasis where appropriate enables the TTS system to better reproduce prosodic characteristics of human speech. Emphasis may make the resulting synthesized speech more understandable and engaging than synthesized speech lacking emphasis. Emphasis may be manually annotated to, and/or predicted from, a source text (e.g., a book). In some implementations, the system may use a generative model such as a variational autoencoder to generate word acoustic embeddings indicating how emphasis is to be reflected in the synthesized speech. A phoneme encoder of the TTS system may process phonemes to generate phoneme embeddings. A decoder may process the word acoustic embeddings and the phoneme embeddings to generate spectrogram data representing the synthesized speech.
    Type: Grant
    Filed: March 31, 2022
    Date of Patent: March 4, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Arnaud Vincent Pierre Yves Joly, Marco Nicolis, Elena Sergeevna Sokolova, Jedrzej Sobanski, Mateusz Aleksander Lajszczak, Arent van Korlaar, Ruizhe Li
  • Patent number: 12243091
    Abstract: Systems and methods are disclosed for providing roadside service through a synchronized interactive voice response (IVR) system and graphical user interface (GUI). One method may include: receiving, based on an incoming phone call from a vehicle user device, a request for roadside service for a disabled vehicle of a vehicle user; sending a text to a phone number of the vehicle user device, wherein the text includes a link to a mobile application for roadside service requests; receiving, from the vehicle user device and via the mobile application, information associated with the request for roadside service; determining, based on a location sensor of the vehicle user device, a location of the disabled vehicle; matching a roadside service provider with the disabled vehicle based on the request for roadside service; and enabling the vehicle user to track the service status of the roadside service provider through the mobile application.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: March 4, 2025
    Assignee: Allstate Insurance Company
    Inventors: Paul R. Turnbull, Anthony Banks, Brinderjit Singh
  • Patent number: 12236956
    Abstract: This disclosure relates to systems and methods for processing content and, particularly, but not exclusively, systems and methods for processing audio content. Systems and methods are described that provide techniques for processing, analyzing, and/or structuring of longer-form content to, among other things, make the content searchable, identify relevant and/or interesting segments within the content, provide for and/or otherwise generate search results and/or coherent shorter-form summaries and/or highlights, enable new shorter-form audio listening experiences, and/or the like. Various aspects of the disclosed systems and methods may further enable relatively efficient transcription and/or indexing of content libraries at scale, while also generating effective formats for users interacting with such libraries to engage with search results.
    Type: Grant
    Filed: November 11, 2021
    Date of Patent: February 25, 2025
    Assignee: A9.com, Inc.
    Inventors: Mari Joller, Ottokar Tilk, Aleksandr Tkatšenko, Johnathan Joseph Groat, Mark Fišel, Kaur Karus
  • Patent number: 12236148
    Abstract: A machine includes a processor and a memory connected to the processor. The memory stores instructions executed by the processor to receive a message and a message parameter indicative of a characteristic of the message, where the message includes a photograph or a video. A determination is made that the message parameter corresponds to a selected gallery, where the selected gallery includes a sequence of photographs or videos. The message is posted to the selected gallery in response to the determination. The selected gallery is supplied in response to a request.
    Type: Grant
    Filed: September 8, 2023
    Date of Patent: February 25, 2025
    Assignee: Snap Inc.
    Inventor: Timothy Sehn
  • Patent number: 12236798
    Abstract: A computer-implemented interactive presentation assessment and valuation system provides a server computer that allows one or more computing devices to access a presentation analyzer, which applies standardized scoring algorithms to the video data or audio data associated with a presentation, correspondingly generates standardized word rate, word clarity, filler word, tone, or eye contact scores, and calculates a presentation score based upon an average or weighted average of the scores.
    Type: Grant
    Filed: October 3, 2018
    Date of Patent: February 25, 2025
    Assignee: Bongo Learn, Inc.
    Inventors: Brian A. Scholz, Bruce E. Fischer
  • Patent number: 12229151
    Abstract: In an embodiment, a method includes matching terms of a received query to a database of provider capabilities. The method further includes determining, for each portion of the received query, a respective provider to execute the portion of the received query. The method further includes sending each portion of the received query to its respective provider. The method further includes combining results from each respective provider to a returned query result.
    Type: Grant
    Filed: June 21, 2022
    Date of Patent: February 18, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Peter Yeh, Ezra Story, Prateek Jain
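The query-federation idea in the abstract above — matching query terms to provider capabilities, dispatching each portion, then merging results — can be sketched as follows. The capability registry, the substring matching rule, and the provider functions are illustrative assumptions, not the patented method.

```python
# Hypothetical capability registry: each term maps to a provider callable
# that answers the portion of the query it is responsible for.
CAPABILITIES = {
    "weather": lambda q: {"weather": "sunny"},
    "traffic": lambda q: {"traffic": "light"},
}

def federated_query(query: str):
    """Route matching portions of the query to providers and merge results."""
    combined = {}
    for term, provider in CAPABILITIES.items():
        if term in query.lower():
            combined.update(provider(query))
    return combined
```

A real system would parse the query into portions rather than substring-match whole terms, but the dispatch-and-combine shape is the same.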
  • Patent number: 12231269
    Abstract: Provided is an active user detection and channel estimation method of a base station in a wireless communication system based on grant-free uplink transmission. The method, according to one embodiment, comprises the steps of: receiving superimposed signals from k active terminals; calculating, by using a first artificial neural network with the received signals as input, an estimated probability of each terminal in a cell of the base station being an active terminal; and estimating channels of the active terminals by using a second artificial neural network with the received signals and an active user detection result value as input.
    Type: Grant
    Filed: November 1, 2021
    Date of Patent: February 18, 2025
    Assignee: Seoul National University R&DB Foundation
    Inventors: Byonghyo Shim, Yongjun Ahn
  • Patent number: 12230257
    Abstract: Various implementations relate to techniques, for controlling smart devices, that are low latency and/or that provide computational efficiencies (client and/or server) and/or network efficiencies. Those implementations relate to generating and/or utilizing cache entries, of a cache that is stored locally at an assistant client device, in control of various smart devices (e.g., smart lights, smart thermostats, smart plugs, smart appliances, smart routers, etc.). Each of the cache entries includes a mapping of text to one or more corresponding semantic representations.
    Type: Grant
    Filed: September 15, 2023
    Date of Patent: February 18, 2025
    Assignee: GOOGLE LLC
    Inventors: David Roy Schairer, Di Lin, Lucas Palmer
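The locally stored cache entries described above map recognized text to semantic representations of device commands. A minimal sketch of such a lookup, with entirely illustrative entry contents and field names, might look like this:

```python
# Hypothetical on-device cache: recognized text -> semantic representation.
CACHE = {
    "turn on the kitchen light": {"device": "kitchen_light", "action": "on"},
    "set thermostat to 70": {"device": "thermostat", "action": "set", "value": 70},
}

def resolve(text: str):
    """Return a cached semantic representation, or None on a cache miss.

    A cache hit avoids a round trip to a server-side interpreter, which is
    the latency/efficiency benefit the abstract describes.
    """
    return CACHE.get(text.strip().lower())
```

On a miss, a real assistant would fall back to full server-side natural language understanding and could then add the result as a new cache entry.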
  • Patent number: 12230277
    Abstract: Disclosed are a method, a device, and a program for selecting a main speaker among speakers included in a sound source or a conversation record based on the sound source or the conversation record including conversation contents of at least one speaker and generating a summary based on the main speaker. A method of generating a summary for a sound source, the method being performed by at least one computing device, includes: generating a speak score for at least one speaker based on the sound source; determining a main speaker of the sound source based on a speak score for said at least one speaker; and generating a summary for the sound source in consideration of the determined main speaker.
    Type: Grant
    Filed: October 7, 2022
    Date of Patent: February 18, 2025
    Assignee: ActionPower Corp.
    Inventors: Seongmin Park, Seungho Kwak
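The main-speaker selection above depends on a "speak score" that the abstract leaves unspecified. As a hedged sketch, one plausible score is total words spoken per speaker; the scoring rule below is an assumption for illustration only.

```python
def main_speaker(utterances):
    """Pick the main speaker from (speaker, text) pairs.

    Scores each speaker by total word count across their utterances
    (an assumed stand-in for the patent's abstract "speak score") and
    returns the highest-scoring speaker.
    """
    scores = {}
    for speaker, text in utterances:
        scores[speaker] = scores.get(speaker, 0) + len(text.split())
    return max(scores, key=scores.get)
```

The summary-generation step would then weight or filter utterances by the selected speaker.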
  • Patent number: 12231600
    Abstract: A method of operating a captioned telephone comprising the steps of dialing a telephone number to a relay gateway, connecting to the gateway over a communication network, performing a test to assess communication network quality, based on the test results, setting up the captioned telephone to operate in at least one of a first mode and a second mode wherein each of the first and second modes is one of (i) a PSTN mode in which data is communicated between the captioned telephone and the relay gateway via analog transmission and (ii) an IP mode in which data is communicated between the captioned telephone and the relay gateway via the Internet.
    Type: Grant
    Filed: February 27, 2024
    Date of Patent: February 18, 2025
    Assignee: Ultratec, Inc.
    Inventors: Robert M. Engelke, Kevin R. Colwell, Troy Vitek
  • Patent number: 12223259
    Abstract: Techniques for managing access to sensitive data in transcriptions are described. A method for managing access to sensitive data in transcriptions may include receiving a request to generate a redacted transcript of content, obtaining a transcript of the content, sending at least a portion of the transcript to a model endpoint to identify sensitive entities in the transcript, receiving an inference response identifying one or more sensitive entities in the transcript, and generating the redacted transcript based at least in part on the transcript and the inference response.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: February 11, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Varun Sembium Varadarajan, Sravan Babu Bodapati, Deepthi Devaiah Devanira, Pu Paul Zhao, Katrin Kirchhoff, Yue Yang
  • Patent number: 12225248
    Abstract: Systems and methods are described to address shortcomings in conventional systems by correcting an erroneous term in on-screen caption text for a media asset. In some aspects, the systems and methods identify the erroneous term in a text segment of the on-screen caption text, and identify one or more video frames of the media asset corresponding to the text segment. The systems and methods further identify a contextual term related to the erroneous term from the one or more video frames. By accessing a knowledge graph, the systems and methods identify a candidate correction based on the contextual term and a portion of the text segment. Lastly, the systems and methods replace the erroneous term with the candidate correction.
    Type: Grant
    Filed: November 14, 2023
    Date of Patent: February 11, 2025
    Assignee: Adeia Guides Inc.
    Inventors: Ajay Kumar Gupta, Abhijit Satchidanand Savarkar
  • Patent number: 12217757
    Abstract: A recording system for a motor vehicle includes a communication interface receiving an audio feed from a telephone. A voice recognition module is communicatively coupled to the communication interface and responds to a command from a human user by converting at least a portion of the audio feed from the telephone into text data, and storing the text data in a memory device.
    Type: Grant
    Filed: December 10, 2021
    Date of Patent: February 4, 2025
    Assignee: Panasonic Automotive Systems Company of America, Division of Panasonic Corporation of North America
    Inventors: Garrett Laskowski, Padhu Kanagaraj
  • Patent number: 12210818
    Abstract: Various embodiments provide for summarization of an interaction, conversation, encounter, and/or the like in at least an abstractive manner. In one example embodiment, a method is provided. The method includes generating, using an encoder-decoder machine learning model, a party-agnostic representation data object for each utterance data object. The method further includes generating an attention graph data object to represent semantic and party-wise relationships between a plurality of utterance data objects. The method further includes modifying, using the attention graph data object, the party-agnostic representation data object for each utterance data object to form a party-wise representation data object for each utterance data object. The method further includes selecting a subset of party-wise representation data objects for each of a plurality of parties.
    Type: Grant
    Filed: May 2, 2022
    Date of Patent: January 28, 2025
    Assignee: OPTUM, INC.
    Inventors: Suman Roy, Vijay Varma Malladi, Ayan Sengupta
  • Patent number: 12211501
    Abstract: Methods and systems for correcting, based on subsequent second speech, an error in an input generated from first speech using automatic speech recognition, without an explicit indication in the second speech that a user intended to correct the input with the second speech, include determining that a time difference between when search results in response to the input were displayed and when the second speech was received is less than a threshold time, and based on the determination, correcting the input based on the second speech. The methods and systems also include determining that a difference in acceleration of a user input device, used to input the first speech and second speech, between when the search results in response to the input were displayed and when the second speech was received is less than a threshold acceleration, and based on the determination, correcting the input based on the second speech.
    Type: Grant
    Filed: November 7, 2022
    Date of Patent: January 28, 2025
    Assignee: Adeia Guides Inc.
    Inventor: Arun Sreedhara
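The timing test at the heart of the abstract above — treating second speech as an implicit correction only when it arrives soon after the search results appear — can be sketched as a simple threshold check. The window length, function name, and use of seconds as the unit are illustrative assumptions; the patent also describes an analogous acceleration threshold on the input device.

```python
# Assumed correction window, in seconds (the patent does not fix a value).
CORRECTION_WINDOW_S = 5.0

def is_implicit_correction(results_shown_at: float, second_speech_at: float,
                           window: float = CORRECTION_WINDOW_S) -> bool:
    """True when the second speech arrived within the window after the
    search results were displayed, so it may be an implicit correction."""
    return 0 <= second_speech_at - results_shown_at < window
```

Speech arriving outside the window would instead be treated as a fresh input rather than a correction of the first.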
  • Patent number: 12210816
    Abstract: A method for modifying a voice-signal transcription generated by an automatic speech recognition engine from a voice signal, prior to a generation of a response by a semantic engine equipped with at least one ontology comprising a vocabulary space, from an injection of a question, comprises the following steps: determining the set of all of the words of the transcription not belonging to the space, and for each word of the determined set: a step of word-embedding of the word in the space; determining semantic distances between the embedding and each of the words of the space; a step of selecting at least one word of the space; and modifying the transcription by replacing the word of the transcription with at least one selected word in order to generate a modified signal transcription.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: January 28, 2025
    Assignee: AIRUDIT
    Inventors: Philippe Lebas, Antoine Sottiau, Mathilde Couraud, Florian Boyer
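The replacement step described above — embedding an out-of-vocabulary word and substituting the nearest in-vocabulary word by semantic distance — can be sketched with cosine distance. The two-dimensional hand-made embeddings below are stand-ins for a real word-embedding model; all names and values are illustrative.

```python
import math

# Toy in-vocabulary embeddings (a real ontology vocabulary would use a
# trained embedding space of much higher dimension).
VOCAB_EMBEDDINGS = {
    "temperature": [0.9, 0.1],
    "pressure": [0.1, 0.9],
}

def cosine_distance(a, b):
    """1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (na * nb)

def nearest_in_vocab(embedding):
    """Return the in-vocabulary word semantically closest to the embedding."""
    return min(VOCAB_EMBEDDINGS,
               key=lambda w: cosine_distance(embedding, VOCAB_EMBEDDINGS[w]))
```

An out-of-vocabulary transcription word would be embedded with the same model and replaced by `nearest_in_vocab` before the question is injected into the semantic engine.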
  • Patent number: 12210836
    Abstract: Systems and methods for identifying data of interest are disclosed. The system may retrieve unstructured data from an internet data source via an alert system or RSS feed. The system may input the unstructured data into various models and scoring systems to determine whether the data is of interest. The models and scoring systems may be executed in order or in parallel. For example, the system may input the unstructured data into a Naïve Bayes machine learning model, a long short-term memory (LSTM) machine learning model, a named entity recognition (NER) model, a semantic role labeling (SRL) model, a sentiment scoring algorithm, and/or a gradient boosted regression tree (GBRT) machine learning model. Based on determining that the unstructured data is of interest, a data alert may be generated and transmitted for manual review or as part of an automated decisioning process.
    Type: Grant
    Filed: June 6, 2023
    Date of Patent: January 28, 2025
    Assignee: American Express Travel Related Services Company, Inc.
    Inventors: Ravi Batra, Sandeep Bose, Mario Fragoso, Ravneet Ghuman, Madhu Sudan Reddy Gudur, Suraj Madnani, Curtis T. Merryweather, Ravi Varma, Vinod Yadav
  • Patent number: 12206820
    Abstract: Systems and methods for processing calls to determine if the call is potentially fraudulent or unwanted. The system extracts a speech signal from an audio signal associated with a call. The system identifies audio characteristics based on analysis of the audio signal. The system generates textual transcript of the audio signal based on automatic speech recognition of the speech signal, which is used to assign text categories for the call based on an automated multi-label textual classification of the textual transcript. The system assigns audio categories for the call based on automated multi-label acoustic classification of the speech signal. The system generates an output label for the call based on a combined analysis of the text categories, the audio categories, and the audio characteristics. The language spoken during the call may be detected and used to generate the textual transcript and to assign the text categories and the audio categories.
    Type: Grant
    Filed: July 29, 2022
    Date of Patent: January 21, 2025
    Assignee: REALNETWORKS LLC
    Inventors: Branimir Dropuljic, Michael J. Bordash
  • Patent number: 12204807
    Abstract: A machine includes a processor and a memory connected to the processor. The memory stores instructions executed by the processor to receive a message and a message parameter indicative of a characteristic of the message, where the message includes a photograph or a video. A determination is made that the message parameter corresponds to a selected gallery, where the selected gallery includes a sequence of photographs or videos. The message is posted to the selected gallery in response to the determination. The selected gallery is supplied in response to a request.
    Type: Grant
    Filed: September 8, 2023
    Date of Patent: January 21, 2025
    Assignee: Snap Inc.
    Inventor: Timothy Sehn
  • Patent number: 12205211
    Abstract: A content enhancement system includes a computing platform having processing hardware and a system memory storing software code. The processing hardware is configured to execute the software code to receive audio-video (A/V) content, to execute at least one of a visual analysis or an audio analysis of the A/V content, and to determine, based on executing the at least one of the visual analysis or the audio analysis, an emotional aspect of the A/V content. The processing hardware is further configured to execute the software code to generate, using the emotional aspect of the A/V content, a sign language translation of the A/V content, the sign language translation including one or more of a gesture, a posture, or a facial expression conveying the emotional aspect.
    Type: Grant
    Filed: October 20, 2021
    Date of Patent: January 21, 2025
    Assignee: Disney Enterprises, Inc.
    Inventors: Marc Brandon, Mark Arana
  • Patent number: 12205575
    Abstract: Implementations disclosed herein are directed to federated learning of machine learning (“ML”) model(s) based on gradient(s) generated at corresponding client devices and a remote system. Processor(s) of the corresponding client devices can process client data generated locally at the corresponding client devices using corresponding on-device ML model(s) to generate corresponding predicted outputs, generate corresponding client gradients based on the corresponding predicted outputs, and transmit the corresponding client gradients to the remote system. Processor(s) of the remote system can process remote data obtained from remote database(s) using global ML model(s) to generate additional corresponding predicted outputs, generate corresponding remote gradients based on the additional corresponding predicted outputs. Further, the remote system can utilize the corresponding client gradients and the corresponding remote gradients to update the global ML model(s) or weights thereof.
    Type: Grant
    Filed: July 5, 2023
    Date of Patent: January 21, 2025
    Assignee: GOOGLE LLC
    Inventors: Françoise Beaufays, Andrew Hard, Swaroop Indra Ramaswamy, Om Dipakbhai Thakkar, Rajiv Mathews
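The gradient-combination step above — updating global model weights from both client gradients and remote gradients — can be sketched with a plain mean and a single SGD-style step. The plain averaging rule and the learning rate are assumptions; the patent does not commit to a specific aggregation scheme.

```python
def update_global_weights(weights, client_grads, remote_grad, lr=0.1):
    """One federated update combining client and remote gradients.

    weights:      current global weights (list of floats)
    client_grads: per-client gradients, each the same shape as weights
    remote_grad:  gradient computed server-side on remote data
    Averages all gradients and applies a single gradient-descent step.
    """
    all_grads = client_grads + [remote_grad]
    n = len(all_grads)
    avg = [sum(g[i] for g in all_grads) / n for i in range(len(weights))]
    return [w - lr * g for w, g in zip(weights, avg)]
```

Including the remote gradient lets server-side data contribute to the same global model that the on-device gradients shape, as the abstract describes.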
  • Patent number: 12198682
    Abstract: An example system includes a processor to receive a summary of a conversation to be generated. The processor can input the summary into a trained summary-grounded conversation generator. The processor can receive a generated conversation from the trained summary-grounded conversation generator.
    Type: Grant
    Filed: September 13, 2021
    Date of Patent: January 14, 2025
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Chulaka Gunasekara, Guy Feigenblat, Benjamin Sznajder, Sachindra Joshi
  • Patent number: 12197885
    Abstract: Techniques are disclosed for translating messages into multiple languages among a group of wireless portable electronic communication devices communicable with one another via a remote cloud-based communications platform server (CPS). The CPS receives a first voice message from a sending portable communication device (PCD) and determines a language setting for the sending PCD. The CPS may then convert the first voice message to a first text message in the language of the sending PCD. The CPS may then determine a logical channel associated with the first voice message and any other PCDs associated with the logical channel set to receive the message sent by the sending PCD.
    Type: Grant
    Filed: May 8, 2024
    Date of Patent: January 14, 2025
    Assignee: Relay, Inc.
    Inventors: Adam Cook, Jared Kashimba, Hunter Davis
  • Patent number: 12200031
    Abstract: A conferencing server receives audio data from devices connected to a conference. The conferencing server generates multiple time-contiguous containers. Each time-contiguous container includes an identifier of an associated device of the devices and one or more payloads of the audio data from the associated device. Each payload has a predefined time length. The conferencing server transmits the multiple time-contiguous containers to a consumer server.
    Type: Grant
    Filed: September 26, 2023
    Date of Patent: January 14, 2025
    Assignee: Zoom Video Communications, Inc.
    Inventors: Sumitha Pillutla, Nicholas Alexander Sieracki
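The container format above — fixed-length audio payloads grouped per device, each container carrying the device identifier — can be sketched as follows. The dictionary field names and the payloads-per-container count are assumptions for the sketch, not the patented wire format.

```python
def build_containers(audio_frames, payloads_per_container=2):
    """Group audio payloads into per-device, time-contiguous containers.

    audio_frames: list of (device_id, payload) tuples in arrival order.
    Each emitted container holds one device's identifier and up to
    payloads_per_container consecutive payloads from that device.
    """
    by_device = {}
    for device_id, payload in audio_frames:
        by_device.setdefault(device_id, []).append(payload)
    containers = []
    for device_id, payloads in by_device.items():
        for i in range(0, len(payloads), payloads_per_container):
            containers.append({"device": device_id,
                               "payloads": payloads[i:i + payloads_per_container]})
    return containers
```

A consumer server receiving these containers can reassemble each device's audio stream from the identifier plus the ordered payloads.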