Speech To Image Patents (Class 704/235)
  • Patent number: 12165638
    Abstract: A method includes receiving audio data corresponding to an utterance spoken by a user and processing, using a first recognition model, the audio data to generate a non-contextual candidate hypothesis as output from the first recognition model. The non-contextual candidate hypothesis has a corresponding likelihood score assigned by the first recognition model. The method also includes generating, using a second recognition model configured to receive personal context information, a contextual candidate hypothesis that includes a personal named entity. The method also includes scoring, based on the personal context information and the corresponding likelihood score assigned to the non-contextual candidate hypothesis, the contextual candidate hypothesis relative to the non-contextual candidate hypothesis.
    Type: Grant
    Filed: April 14, 2022
    Date of Patent: December 10, 2024
    Assignee: Google LLC
    Inventors: Leonid Aleksandrovich Velikovich, Petar Stanisa Aleksic
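
A minimal Python sketch of the contextual rescoring the abstract above describes, assuming a simple additive boost for personal named entities; the Hypothesis class, the example scores, and the entity_bonus heuristic are illustrative assumptions, not the patented method.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    text: str
    score: float  # log-likelihood assigned by the first (non-contextual) recognizer

def score_contextual(non_contextual: Hypothesis,
                     contextual_text: str,
                     personal_entities: set[str],
                     entity_bonus: float = 2.0) -> Hypothesis:
    """Score a contextual hypothesis relative to the non-contextual one.

    Per the abstract, the contextual hypothesis is scored from the non-contextual
    hypothesis's likelihood score plus personal context information -- here, a
    fixed bonus for each personal named entity (e.g. a contact name) it contains.
    """
    boost = sum(entity_bonus for e in personal_entities
                if e.lower() in contextual_text.lower())
    contextual = Hypothesis(contextual_text, non_contextual.score + boost)
    return contextual if contextual.score > non_contextual.score else non_contextual

# Example: generic "call jon" vs. the contact-aware "call Jaan"
best = score_contextual(Hypothesis("call jon", score=-4.1), "call Jaan", {"Jaan"})
print(best.text)  # "call Jaan" -- wins because of the personal-entity boost
```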
  • Patent number: 12167107
    Abstract: In one embodiment, a computer-implemented method for editing navigation of a content item is disclosed. The method may include presenting, via a user interface at a client computing device, time-synchronized text pertaining to the content item; receiving an input of a tag for the time-synchronized text of the content item; storing the tag associated with the time-synchronized text of the content item; and responsive to receiving a request to play the content item: playing the content item via a media player presented in the user interface, and concurrently presenting the time-synchronized text and the tag as a graphical user element in the user interface.
    Type: Grant
    Filed: June 17, 2021
    Date of Patent: December 10, 2024
    Assignee: Musixmatch S. P. A.
    Inventors: Marco Paglia, Paolo Spazzini, Pierpaolo Di Panfilo, Nicolae-Daniel Dima, Emanuele Cantalini, Christian Zanin
  • Patent number: 12165630
    Abstract: A method of training a speech model includes receiving, at a voice-enabled device, a fixed set of training utterances where each training utterance in the fixed set of training utterances includes a transcription paired with a speech representation of the corresponding training utterance. The method also includes sampling noisy audio data from an environment of the voice-enabled device. For each training utterance in the fixed set of training utterances, the method further includes augmenting, using the noisy audio data sampled from the environment of the voice-enabled device, the speech representation of the corresponding training utterance to generate noisy audio samples and pairing each of the noisy audio samples with the corresponding transcription of the corresponding training utterance. The method additionally includes training a speech model on the noisy audio samples generated for each speech representation in the fixed set of training utterances.
    Type: Grant
    Filed: July 21, 2023
    Date of Patent: December 10, 2024
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Victor Carbune
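
The augmentation step above can be pictured with a short numpy sketch: noise sampled from the device's environment is mixed into each fixed training utterance at several signal-to-noise ratios, and every noisy copy keeps the original transcript. The SNR values and mixing scheme are assumptions for illustration only.

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix environment noise into a speech waveform at a target SNR (in dB)."""
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    noise = noise[: len(speech)]
    speech_power = np.mean(speech ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise

def augment_training_set(utterances, environment_noise, snrs=(0, 5, 10, 20)):
    """Expand a fixed set of (waveform, transcript) pairs with noisy copies."""
    augmented = []
    for waveform, transcript in utterances:
        for snr_db in snrs:
            augmented.append((mix_at_snr(waveform, environment_noise, snr_db), transcript))
    return augmented
```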
  • Patent number: 12165081
    Abstract: Various embodiments of the present invention provide methods, apparatus, systems, computing devices, computing entities, and/or the like for performing predictive data analysis operations. For example, certain embodiments of the present invention utilize systems, methods, and computer program products that perform predictive data analysis operations by generating a predicted eligibility score for a predictive entity using a cross-feature-type eligibility prediction machine learning framework.
    Type: Grant
    Filed: March 10, 2022
    Date of Patent: December 10, 2024
    Assignee: Optum Services (Ireland) Limited
    Inventors: Riccardo Mattivi, Venkata Krishnan Mittinamalli Thandapani, Conor Breen, Peter Cogan
  • Patent number: 12164859
    Abstract: Methods for generating a categorized, ranked, condensed summary of a transcript of a conversation, involving obtaining a diarized version of the transcript of the conversation, storing textual monologues from the transcript, determining classifications as to the textual monologues based on a classifier algorithm, associating the classifications with the textual monologues, creating textually-modified rephrasings of the textual monologues based on text and classification thereof, storing the textually-modified rephrasings, aggregating the textually-modified rephrasings based on associated clustering and scoring, and transmitting summary information pertaining to the aggregated textually-modified rephrasings to a user device.
    Type: Grant
    Filed: June 1, 2022
    Date of Patent: December 10, 2024
    Assignee: GONG.IO LTD
    Inventors: Shlomi Medalion, Inbal Horev, Raz Nussbaum, Omri Allouche, Raquel Sitman, Ortal Ashkenazi
  • Patent number: 12154560
    Abstract: A voice controlled apparatus for performing a workflow operation is described. The voice controlled apparatus can include a microphone, a speaker, and a processor. In some examples, the voice controlled apparatus can generate, via the speaker, a voice prompt associated with a task of a workflow and identify, via the microphone, a voice response received from a worker. In this regard, the voice prompt and the voice response can be a part of a voice dialogue. Further, the processor of the voice controlled apparatus can identify a performance status associated with the execution of the task, before providing a next voice prompt subsequent to the voice prompt. In this aspect, the performance status can be identified based on analyzing the voice dialogue using a machine learning model. Furthermore, the voice controlled apparatus can generate a message including a suggestion to improve the performance status of the task.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: November 26, 2024
    Assignee: VOCOLLECT, INC.
    Inventors: Ranganathan Srinivasan, Rekula Dinesh, Kaushik Hazra, Duff H. Gold
  • Patent number: 12153872
    Abstract: The present disclosure relates to systems and methods for automatically linking a note to a transcript of a conference. According to one of the embodiments a computer-implemented method is provided. The method comprises: receiving a transcript of a conference and a note from a conference participant; responsive to receiving the transcript of the conference and the note, applying a natural language processing on a content of the note and on a content of the transcript; identifying a matching content between the content of the note and the content of the transcript; generating a link corresponding to the matching content; and causing to display the link corresponding to the matching content.
    Type: Grant
    Filed: September 20, 2021
    Date of Patent: November 26, 2024
    Assignee: RingCentral, Inc.
    Inventor: Vlad Vendrow
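
A rough sketch of the note-to-transcript linking idea above, with plain token overlap standing in for the unspecified natural language processing step and a hypothetical timestamped-URL scheme for the generated link.

```python
from typing import Optional

def best_matching_segment(note: str, transcript_segments: list[dict]) -> Optional[dict]:
    """Return the transcript segment whose text overlaps most with the note.

    Each segment is assumed to look like {"start": seconds, "text": "..."}.
    """
    note_tokens = set(note.lower().split())
    best, best_overlap = None, 0
    for segment in transcript_segments:
        overlap = len(note_tokens & set(segment["text"].lower().split()))
        if overlap > best_overlap:
            best, best_overlap = segment, overlap
    return best

def link_for(note: str, transcript_segments: list[dict], base_url: str) -> Optional[str]:
    """Build a deep link (hypothetical URL scheme) to the matching moment in the recording."""
    match = best_matching_segment(note, transcript_segments)
    return f"{base_url}?t={int(match['start'])}" if match else None

segments = [{"start": 12.0, "text": "we agreed to ship the beta next Friday"},
            {"start": 95.5, "text": "budget review moved to Q3"}]
print(link_for("ship beta Friday", segments, "https://example.invalid/recording"))
```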
  • Patent number: 12153613
    Abstract: A computer-implemented method for presenting relevant information to a customer service representative of a business may include receiving a digitized data stream corresponding to a spoken conversation between a customer and a representative; converting the data stream to a text stream; determining one or more keywords from the text stream; comparing the one or more keywords with a history of keywords that have previously been searched; and/or searching a database for information related to the one or more keywords that have not been previously searched. As a result of the keyword search, information about topics that the customer is interested in may be located and displayed on a customer service representative display to facilitate the customer service representative timely relaying the information found by the keyword search to enhance the customer experience. Exemplary keywords may relate to insurance and financial services, such as “auto,” “home,” “life,” “insurance,” or “vehicle loan.”
    Type: Grant
    Filed: May 21, 2020
    Date of Patent: November 26, 2024
    Assignee: State Farm Mutual Automobile Insurance Company
    Inventor: Sylvia Hernandez
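
The history-aware keyword search described above reduces to a small set difference; the keyword list and the printed "search" are placeholders for the real database lookup and the representative's display.

```python
INSURANCE_KEYWORDS = {"auto", "home", "life", "insurance", "vehicle loan"}  # illustrative

def new_keywords(text_stream: str, already_searched: set[str]) -> set[str]:
    """Keywords present in the transcribed stream that have not been searched yet."""
    text = text_stream.lower()
    found = {kw for kw in INSURANCE_KEYWORDS if kw in text}
    return found - already_searched

searched_history: set[str] = set()
for utterance in ["I want to ask about auto insurance", "also about a vehicle loan"]:
    for kw in new_keywords(utterance, searched_history):
        searched_history.add(kw)
        print(f"searching knowledge base for: {kw}")  # results would be shown to the representative
```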
  • Patent number: 12147477
    Abstract: Disclosed is a search result display device (1) comprising: a regard prediction unit (12) configured to predict, from a dialogue between a customer and a service person, a regard of the customer; a keyword extraction unit (17) configured to extract a keyword from the regard; and a display controller (13) configured to cause a display (14) to display the dialogue and a search result obtained from the database (21) with the keyword as a search query, wherein when a string has been designated by the service person, the display controller (13) causes the display (14) to display a search result obtained from the database (21) using a search query that incorporates the string, until a search result automatic update instruction is given by the service person.
    Type: Grant
    Filed: August 14, 2019
    Date of Patent: November 19, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Yoshiaki Noda, Setsuo Yamada, Takaaki Hasegawa
  • Patent number: 12148422
    Abstract: Disclosed is an electronic apparatus. The electronic apparatus obtains a first character string comprising a previously defined character from a first user utterance; recognizes, as an input character, a second character string edited from the first character string based on a first editing command, when the first user utterance comprises the first editing command following the first character string; and edits the second character string based on a second editing command, when a second user utterance comprises the second editing command without the first editing command.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: November 19, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jihun Park, Dongheon Seok
  • Patent number: 12149793
    Abstract: An example display device may include a voice signal receiver, a display, at least one memory storing an application supporting a contents providing service and storing instructions, a communication circuit communicating with at least one external server supporting the contents providing service, and at least one processor. The contents providing service may provide contents files of a first type and contents files of a second type.
    Type: Grant
    Filed: June 16, 2023
    Date of Patent: November 19, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jibum Moon, Gyungchan Seol, Kyerim Lee
  • Patent number: 12148423
    Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: November 19, 2024
    Assignee: Google LLC
    Inventors: Michael J. Lebeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti T. Kristjansson
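
The correction flow above can be illustrated with a toy word lattice reduced to per-position alternative lists (a deliberate simplification of a real lattice):

```python
# Hypothetical word lattice: one list of alternatives per word position.
word_lattice = [
    ["I", "eye"],
    ["want", "wont"],
    ["four", "for", "fore"],
    ["apples"],
]

transcript = [alts[0] for alts in word_lattice]        # best path shown to the user
print(" ".join(transcript))                            # "I want four apples"

selected_index = 2                                     # user taps "four"
print("alternates:", word_lattice[selected_index][1:]) # ['for', 'fore']

transcript[selected_index] = "for"                     # user picks an alternate word
print(" ".join(transcript))                            # "I want for apples"
```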
  • Patent number: 12141529
    Abstract: Natural language-based question answering systems and techniques are generally described. In some examples, a natural language processing system may receive first natural language data. The natural language processing system may determine first slot data included in the first natural language data. A set of content items associated with the first slot data may be determined. A first machine learning model may use the first natural language data to generate prediction data associated with a first attribute among a list of attributes of the set of content items. In some examples, a first value associated with the first attribute for a first content item of the set of content items may be determined. Second natural language data may be generated based at least in part on the first value. The second natural language data may include a response to the first natural language data.
    Type: Grant
    Filed: March 22, 2022
    Date of Patent: November 12, 2024
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Yuval Nezri, Eilon Sheetrit, Lital Kuchy, Avihai Mejer
  • Patent number: 12141672
    Abstract: An example method includes receiving, by a computational assistant executing at one or more processors, a representation of an utterance spoken at a computing device; identifying, based on the utterance, a task to be performed by the computational assistant; responsive to determining, by the computational assistant, that complete performance of the task will take more than a threshold amount of time, outputting, for playback by one or more speakers operably connected to the computing device, synthesized voice data that informs a user of the computing device that complete performance of the task will not be immediate; and performing, by the computational assistant, the task.
    Type: Grant
    Filed: September 13, 2023
    Date of Patent: November 12, 2024
    Assignee: GOOGLE LLC
    Inventors: Yariv Adan, Vladimir Vuskovic, Behshad Behzadi
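
At its core the abstract above is a duration check before task execution; in this hedged sketch the time estimate and the speak/perform callables are placeholders, not Google's implementation.

```python
THRESHOLD_SECONDS = 10.0  # illustrative threshold

def handle_task(task_name: str, estimated_seconds: float, speak, perform):
    """Warn the user first when complete performance of the task will not be immediate."""
    if estimated_seconds > THRESHOLD_SECONDS:
        speak(f"Working on {task_name}; this will take a little while. "
              "I'll let you know when it's done.")
    return perform(task_name)

handle_task("vacation booking", estimated_seconds=45.0,
            speak=print, perform=lambda name: f"{name} started")
```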
  • Patent number: 12138452
    Abstract: A noninvasive ergonomic self-use device including a plurality of electrodes and a processor in electrical communication with the electrodes, the processor is configured to switch two or more of the electrodes between at least an ECG mode of operation in which the electrodes receive user body signals and an EPG mode in which the electrodes generate electrical pulses for stimulating the abdominal muscles of the user.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: November 12, 2024
    Assignee: GerdCare Medical Ltd.
    Inventors: Giora Arbel, Mordechay Esh
  • Patent number: 12141043
    Abstract: An utterance test method for an utterance device, an utterance test server, an utterance test system, and a program perform an utterance test on a test device (20). The utterance test system includes at least one utterance device (20) capable of uttering, a terminal device (30), and an utterance test server (10). The utterance test server (10) receives an utterance test start command from the terminal device (30), sets at least one utterance device (20) to be a test device (20) as a target of an utterance test, sets test content of the utterance test, and causes the test device (20) to utter the test content.
    Type: Grant
    Filed: July 14, 2021
    Date of Patent: November 12, 2024
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Hiroki Urabe, Kentaro Nakai, Satoru Matsunaga, Yoshiki Ohashi
  • Patent number: 12141712
    Abstract: This disclosure relates generally to a method and system for extracting contextual information from a knowledge base. The method receives a user query comprising a request to extract contextual information from the user query. Further, the user query is analyzed based on a plurality of predefined parameters to determine sufficiency of information comprised in the user query. The received user query identifies relevant sources among structured, unstructured, or semi-structured data storage repositories. The user query is processed using a fine-grain approach, where a dictionary of one or more keywords with weights is created through the domain ontology builder from the one or more knowledge articles. Furthermore, appropriate contextual information related to the user query is extracted using the fine-grain approach, based on the knowledge articles associated with the trained knowledge base, which comprises the information required by the user query extracted from the knowledge articles.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: November 12, 2024
    Assignee: Tata Consultancy Services Limited
    Inventors: Sanjeev Manchanda, Ajeet Phansalkar, Mahesh Kshirsagar, Kamlesh Pandurang Mhashilkar, Nihar Ranjan Sahoo, Sonam Yashpal Sharma
  • Patent number: 12135938
    Abstract: Systems, methods, apparatuses, and computer program products for natural language processing are provided. One method may include utilizing a trained machine learning model to learn syntax dependency patterns and parts of speech tag patterns of text based on labeled training data. The method may also include contextualizing vector embeddings from a language model for each word in the text, and extracting relationships for a given fragment of the text based on the contextualization. The method may further include resolving relationships between identified verbs based on a plurality of heuristics to identify the syntax dependency patterns, identifying nested relationships, and capturing metadata associated with the nested relationships.
    Type: Grant
    Filed: May 11, 2022
    Date of Patent: November 5, 2024
    Assignee: CORASCLOUD, INC.
    Inventors: Ajay Patel, Alex Sands
  • Patent number: 12136414
    Abstract: Audio signals representing a current utterance in a conversation and a dialog history including at least information associated with past utterances corresponding to the current utterance in the conversation can be received. The dialog history can be encoded into an embedding. A spoken language understanding neural network model can be trained to perform a spoken language understanding task based on input features including at least speech features associated with the received audio signals and the embedding. An encoder can also be trained to encode a given dialog history into an embedding. The spoken language understanding task can include predicting a dialog action of an utterance. The spoken language understanding task can include predicting a dialog intent or overall topic of the conversation.
    Type: Grant
    Filed: August 18, 2021
    Date of Patent: November 5, 2024
    Assignee: International Business Machines Corporation
    Inventors: Samuel Thomas, Jatin Ganhotra, Hong-Kwang Kuo, Sachindra Joshi, George Andrei Saon, Zoltan Tueske, Brian E. D. Kingsbury
  • Patent number: 12136425
    Abstract: A method to transcribe communications includes the steps of obtaining a plurality of hypothesis transcriptions of a voice signal generated by a speech recognition system, determining consistent words that are included in at least first and second of the plurality of hypothesis transcriptions, in response to determining the consistent words, providing the consistent words to a device for presentation of the consistent words to an assisted user, and presenting the consistent words via a display screen on the device, wherein a rate of the presentation of the words on the display screen is variable.
    Type: Grant
    Filed: May 8, 2023
    Date of Patent: November 5, 2024
    Assignee: Ultratec, Inc.
    Inventors: Robert M. Engelke, Kevin R. Colwell, Christopher Engelke
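
One way to picture the "consistent words" step shared by this patent and the related Ultratec entry below (12136426): release a word for display once at least two hypothesis transcriptions agree on it. Position-wise comparison stands in here for real hypothesis alignment, which is a simplifying assumption.

```python
from collections import Counter

def consistent_words(hypotheses: list[list[str]], min_agreement: int = 2) -> list[str]:
    """Return, position by position, words that appear in >= min_agreement hypotheses."""
    length = min(len(h) for h in hypotheses)
    out = []
    for i in range(length):
        counts = Counter(h[i] for h in hypotheses)
        word, votes = counts.most_common(1)[0]
        if votes >= min_agreement:
            out.append(word)
    return out

hyps = [
    "please call the doctor today".split(),
    "please call the doctor to day".split(),
    "please hall the doctor today".split(),
]
print(consistent_words(hyps))  # ['please', 'call', 'the', 'doctor', 'today']
```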
  • Patent number: 12136423
    Abstract: Introduced here are computer programs and associated computer-implemented techniques for facilitating the creation of a master transcription (or simply “transcript”) that more accurately reflects underlying audio by comparing multiple independently generated transcripts. The master transcript may be used to record and/or produce various forms of media content, as further discussed below. Thus, the technology described herein may be used to facilitate editing of text content, audio content, or video content. These computer programs may be supported by a media production platform that is able to generate the interfaces through which individuals (also referred to as “users”) can create, edit, or view media content. For example, a computer program may be embodied as a word processor that allows individuals to edit voice-based audio content by editing a master transcript, and vice versa.
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: November 5, 2024
    Assignee: Descript, Inc.
    Inventors: Kundan Kumar, Vicki Anand
  • Patent number: 12136426
    Abstract: A method to transcribe communications includes the steps of obtaining a plurality of hypothesis transcriptions of a voice signal generated by a speech recognition system, determining consistent words that are included in at least first and second of the plurality of hypothesis transcriptions, in response to determining the consistent words, providing the consistent words to a device for presentation of the consistent words to an assisted user, and presenting the consistent words via a display screen on the device, wherein a rate of the presentation of the words on the display screen is variable.
    Type: Grant
    Filed: December 19, 2023
    Date of Patent: November 5, 2024
    Assignee: Ultratec, Inc.
    Inventors: Robert M. Engelke, Kevin R. Colwell, Christopher Engelke
  • Patent number: 12136416
    Abstract: In one embodiment, a method includes accessing a decoded hypothesis corresponding to an utterance, computing a predicted probability of observing each token in the decoded hypothesis by having a local first machine-learning model process the decoded hypothesis, computing a confidence score for each token in the decoded hypothesis by having a second machine-learning model process the decoded hypothesis, where the confidence score indicates a degree of confidence for the token to be observed at its position, calculating a loss for the computed predicted probabilities of observing tokens in the decoded hypothesis based on the computed confidence scores, and updating parameters of the local first machine-learning model based on the calculated loss.
    Type: Grant
    Filed: July 5, 2022
    Date of Patent: November 5, 2024
    Assignee: Meta Platforms, Inc.
    Inventors: Zhe Liu, Ke Li, Fuchun Peng
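
A hedged sketch of the loss computation described above: the local model's per-token probabilities on the decoded hypothesis are penalized according to a separate confidence model's scores. The exact weighting is an assumption, not the patented formulation.

```python
import math

def confidence_weighted_loss(predicted_probs: list[float],
                             confidence_scores: list[float]) -> float:
    """Negative log-likelihood of each hypothesis token, weighted by confidence.

    predicted_probs[i]   -- local model's probability of token i in the decoded hypothesis
    confidence_scores[i] -- second model's confidence that token i is correct (0..1)
    """
    assert len(predicted_probs) == len(confidence_scores)
    total = 0.0
    for p, c in zip(predicted_probs, confidence_scores):
        total += -c * math.log(max(p, 1e-12))  # low-confidence tokens contribute less
    return total / len(predicted_probs)

loss = confidence_weighted_loss([0.9, 0.4, 0.7], [0.95, 0.3, 0.8])
print(round(loss, 4))  # this scalar would drive the parameter update of the local model
```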
  • Patent number: 12137301
    Abstract: One embodiment provides a method, the method including: detecting, during a meeting comprising at least one participant remote to a user identified as a presenter, communication data and visual data provided by the user to the at least one participant; determining, utilizing a meeting discrepancy system, that content of the visual data does not match content of the communication data; and providing, to the user and utilizing a meeting discrepancy system, a notification indicating the content of the visual data does not match the content of the communication data. Other aspects are claimed and described.
    Type: Grant
    Filed: August 17, 2022
    Date of Patent: November 5, 2024
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Joshua Smith, Matthew Fardig, Travis Ennis, Richard Downey
  • Patent number: 12130848
    Abstract: A virtual assistant server receives conversational inputs as part of a conversation from a customer device and generates a version of a data record of the conversation upon receiving each of the conversational inputs, or upon receiving each output generated by one of a plurality of software modules when that software module receives a system input from the conversation management framework. The virtual assistant server provides each generated version of the data record to a communication orchestrator and receives, for each generated version of the data record, execution instructions from the communication orchestrator. Further, the virtual assistant server communicates with one or more of the software modules based on the received execution instructions and, based on the communicating, provides one or more of the outputs of the software modules to the customer device when those outputs comprise responses to one or more of the conversational inputs.
    Type: Grant
    Filed: August 4, 2023
    Date of Patent: October 29, 2024
    Assignee: Kore.ai, Inc.
    Inventors: Rajkumar Koneru, Prasanna Kumar Arikala Gunalan, Thirupathi Bandam, Jayesh Arunkumar Jain, Vishnu Vardhan Sai Lanka
  • Patent number: 12130857
    Abstract: Method for editorializing digital audiovisual or audio recording content of an oral presentation given by a speaker using a presentation support enriched with tags and recorded in the form of a digital audiovisual file. This method comprises written transcription of the oral presentation with indication of a time code for each word, comparative automatic analysis of this written transcription and of the tagged presentation support, transposition of the time codes from the written transcription to the tagged presentation support, identification of the tags and of the time codes of the presentation support, and marking of the digital audiovisual file with the tags and time codes, so as to generate an enriched digital audiovisual file.
    Type: Grant
    Filed: September 18, 2020
    Date of Patent: October 29, 2024
    Assignee: VERNSTHER
    Inventor: Jennifer Dahan
  • Patent number: 12125479
    Abstract: A system for providing a sociolinguistic virtual assistant includes a communication device, a processing device, and a storage device. The processing device is configured to: process input data using a natural language processing algorithm; categorize the semantic data based on psych-sociological categorizations associated with the at least one user; analyze the command from the at least one user to identify a task associated with the command; generate a response based on identification of the task associated with the command; and execute the task associated with the command using the categorized semantic data to derive a result. A method corresponding to the system is also provided.
    Type: Grant
    Filed: February 8, 2022
    Date of Patent: October 22, 2024
    Assignee: Seam Social Labs Inc
    Inventors: Tiasia O'Brien, Marisa Jean Dinko
  • Patent number: 12125473
    Abstract: Embodiments of this disclosure disclose a speech recognition method, apparatus, and device, and a storage medium. The method in the embodiments of this disclosure includes: adjusting a probability of a relationship between at least one pair of elements in a language recognition model according to a probability of the relationship between the at least one pair of elements in a textual segment; inputting a to-be-recognized speech into a speech recognition model including the language recognition model; and determining, according to the adjusted probability of the relationship between the at least one pair of elements in the language recognition model, a sequence of elements corresponding to the to-be-recognized speech as a speech recognition result.
    Type: Grant
    Filed: March 4, 2021
    Date of Patent: October 22, 2024
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventor: Tao Li
  • Patent number: 12125488
    Abstract: An electronic apparatus is provided. The electronic apparatus includes a microphone, a memory storing at least one instruction, and a processor connected to the microphone and the memory configured to control the electronic apparatus, and the processor, by executing the at least one instruction, may, based on receiving a user voice signal through the microphone, obtain a text corresponding to the user voice signal, identify a plurality of sentences included in the obtained text, identify a domain corresponding to each of the plurality of sentences among a plurality of domains, based on a similarity of a first sentence and a second sentence, among the plurality of sentences, having a same domain being greater than or equal to a threshold value, obtain a third sentence in which the first sentence and the second sentence are combined by using a first neural network model, and perform natural language understanding for the third sentence.
    Type: Grant
    Filed: November 10, 2021
    Date of Patent: October 22, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jiyoun Hong, Hyeonmok Ko, Dayoung Kwon, Jonggu Kim, Seoha Song, Kyenghun Lee, Hojung Lee, Saebom Jang, Pureum Jung, Changho Paeon
  • Patent number: 12124498
    Abstract: A time code to byte conversion system is provided herein that maps time codes to byte ranges such that a user device can retrieve a portion of, but not all of, a media file by specifying a time range. For example, the time code to byte conversion system can play a media file and identify the byte at which each time code begins. The time code to byte conversion system can then store the byte to time code mapping in an index accessible by a media retrieval server. A user device can then provide a time range to the media retrieval server, the media retrieval server can query the index to identify the range of bytes that corresponds to the provided time range, and then the media retrieval server can retrieve the identified range of bytes from a media database for transmission to the user device.
    Type: Grant
    Filed: January 9, 2020
    Date of Patent: October 22, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Jeremiah Dunham, Andrew Tunall, Benjamin Schwartz, Jason LaPier, Justin Abrahms
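
The time-code-to-byte index above maps naturally onto a sorted list plus binary search; the index layout and the lookup below are illustrative assumptions, not Amazon's format.

```python
import bisect

class TimeToByteIndex:
    def __init__(self, entries: list[tuple[float, int]]):
        """entries: (time_code_seconds, starting_byte_offset), sorted by time."""
        self.times = [t for t, _ in entries]
        self.bytes_ = [b for _, b in entries]

    def byte_range(self, start_s: float, end_s: float, file_size: int) -> tuple[int, int]:
        """Map a requested time range to an inclusive byte range of the media file."""
        i = bisect.bisect_right(self.times, start_s) - 1
        j = bisect.bisect_right(self.times, end_s)
        start_byte = self.bytes_[max(i, 0)]
        end_byte = self.bytes_[j] - 1 if j < len(self.bytes_) else file_size - 1
        return start_byte, end_byte

index = TimeToByteIndex([(0.0, 0), (10.0, 180_000), (20.0, 355_000)])
print(index.byte_range(10.0, 19.9, file_size=500_000))  # (180000, 354999)
```

A media retrieval server could use such a lookup to answer a ranged request for seconds 10-20 with an HTTP byte-range fetch instead of transferring the whole file.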
  • Patent number: 12118272
    Abstract: Systems and methods to accept speech input and edit a note upon receipt of an indication to edit are disclosed. Exemplary implementations may: effectuate presentation of a graphical user interface that includes a note, the note including note sections, the note sections including a first note section, the individual note sections including body fields; obtain user input from the client computing platform, the user input representing an indication to edit a first body field of the first note section; obtain audio information representing sound captured by an audio section of the client computing platform, the audio information including value definition information specifying one or more values to be included in the individual body fields; perform speech recognition on the audio information to obtain a first value; and populate the first body field with the first value so that the first value is included in the first body field.
    Type: Grant
    Filed: November 18, 2022
    Date of Patent: October 15, 2024
    Assignee: Suki AI, Inc.
    Inventor: Matt Pallakoff
  • Patent number: 12118514
    Abstract: Systems and methods to generate records within a collaboration environment are described herein. Exemplary implementations may perform one or more of: manage environment state information maintaining a collaboration environment; effectuate presentation of a user interface through which users upload digital assets representing recorded audio and/or video content; obtain input information defining the digital assets input via the user interface; generate transcription information characterizing the recorded audio and/or video content of the digital assets; provide the transcription information as input into a trained machine-learning model; obtain the output from the trained machine-learning model, the output defining one or more new records based on the transcripts; and/or other operations.
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: October 15, 2024
    Assignee: Asana, Inc.
    Inventor: Steve B Morin
  • Patent number: 12118266
    Abstract: Media content can be created and/or modified using a network-accessible platform. Scripts for content-based experiences could be readily created using one or more interfaces generated by the network-accessible platform. For example, a script for a content-based experience could be created using an interface that permits triggers to be inserted directly into the script. Interface(s) may also allow different media formats to be easily aligned for post-processing. For example, a transcript and an audio file may be dynamically aligned so that the network-accessible platform can globally reflect changes made to either item. User feedback may also be presented directly on the interface(s) so that modifications can be made based on actual user experiences.
    Type: Grant
    Filed: February 25, 2022
    Date of Patent: October 15, 2024
    Assignee: Descript, Inc.
    Inventors: Steven Surmacz Rubin, Ulf Schwekendiek, David John Williams
  • Patent number: 12112138
    Abstract: Embodiments provide a software framework for evaluating and troubleshooting real-world task-oriented bot systems. Specifically, the evaluation framework includes a generator that infers dialog acts and entities from bot definitions and generates test cases for the system via model-based paraphrasing. The framework may also include a simulator for task-oriented dialog user simulation that supports both regression testing and end-to-end evaluation. The framework may also include a remediator to analyze and visualize the simulation results, remedy some of the identified issues, and provide actionable suggestions for improving the task-oriented dialog system.
    Type: Grant
    Filed: June 2, 2022
    Date of Patent: October 8, 2024
    Assignee: Salesforce, Inc.
    Inventors: Guangsen Wang, Samson Min Rong Tan, Shafiq Rayhan Joty, Gang Wu, Chu Hong Hoi, Ka Chun Au
  • Patent number: 12112742
    Abstract: Provided are an electronic device for correcting a speech input, and an operating method thereof. The method may include receiving a first speech signal; obtaining first text; obtaining an intent of the first speech signal and a confidence score of the intent, by inputting the first text to a natural language understanding model; identifying a plurality of correction candidate semantic elements capable of being correction targets in the first text; receiving a second speech signal; obtaining second text; identifying whether the second speech signal is a speech signal for correcting the first text; comparing the plurality of correction candidate semantic elements in the first text with a semantic element in the second text, based on the confidence score; and correcting at least one of the plurality of correction candidate semantic elements in the first text.
    Type: Grant
    Filed: November 29, 2021
    Date of Patent: October 8, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jongsun Lee, Jongyoub Ryu, Seonghan Ryu, Eunji Lee, Jaechul Yang, Hyungtak Choi
  • Patent number: 12112743
    Abstract: A speech recognition method includes: obtaining speech data; performing feature extraction on speech data, to obtain speech features of at least two speech segments; inputting the speech features of the at least two speech segments into the speech recognition model, and processing the speech features of the speech segments by using cascaded hidden layers in the speech recognition model, to obtain hidden layer features of the speech segments, a hidden layer feature of an ith speech segment being determined based on speech features of n speech segments located after the ith speech segment in a time sequence and a speech feature of the ith speech segment; and obtaining text information corresponding to the speech data based on the hidden layer features of the speech segments.
    Type: Grant
    Filed: March 30, 2022
    Date of Patent: October 8, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Xilin Zhang, Bo Liu
  • Patent number: 12112750
    Abstract: A method for estimating a user's location in an environment may involve receiving output signals from each microphone of a plurality of microphones in the environment. At least two microphones of the plurality of microphones may be included in separate devices at separate locations in the environment and the output signals may correspond to a current utterance of a user. The method may involve determining multiple current acoustic features from the output signals of each microphone and applying a classifier to the multiple current acoustic features. Applying the classifier may involve applying a model trained on previously-determined acoustic features derived from a plurality of previous utterances made by the user in a plurality of user zones in the environment. The method may involve determining, based at least in part on output from the classifier, an estimate of the user zone in which the user is currently located.
    Type: Grant
    Filed: July 28, 2020
    Date of Patent: October 8, 2024
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Mark R. P. Thomas, Richard J. Cartwright
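
A sketch of the zone-estimation idea above, with per-microphone RMS levels standing in for the unspecified acoustic features and a nearest-centroid model standing in for the trained classifier.

```python
import numpy as np

def rms_features(per_mic_signals: list[np.ndarray]) -> np.ndarray:
    """One feature per microphone: the RMS level of the current utterance."""
    return np.array([np.sqrt(np.mean(s ** 2)) for s in per_mic_signals])

class ZoneClassifier:
    """Nearest-centroid stand-in for the classifier trained on previous utterances."""
    def __init__(self):
        self.centroids: dict[str, np.ndarray] = {}

    def fit(self, labeled_features: dict[str, list[np.ndarray]]):
        """labeled_features maps a user zone (e.g. 'kitchen') to feature vectors."""
        self.centroids = {zone: np.mean(feats, axis=0)
                          for zone, feats in labeled_features.items()}

    def predict(self, features: np.ndarray) -> str:
        """Estimate the zone whose centroid is closest to the current features."""
        return min(self.centroids,
                   key=lambda zone: np.linalg.norm(features - self.centroids[zone]))
```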
  • Patent number: 12112759
    Abstract: A method and apparatus for speaker diarization with early-stop clustering, comprising: segmenting an audio stream into at least one speech segment (710), the audio stream comprising speeches from at least one speaker; clustering the at least one speech segment into a plurality of clusters (720), the number of the plurality of clusters being greater than the number of the at least one speaker; selecting, from the plurality of clusters, at least one cluster of the highest similarity (730), the number of the selected at least one cluster being equal to the number of the at least one speaker; establishing a speaker classification model based on the selected at least one cluster (740); and aligning, through the speaker classification model, speech frames in the audio stream to the at least one speaker (750).
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: October 8, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Liping Chen, Kao-Ping Soong
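
The early-stop clustering recipe above (over-cluster, keep the most self-similar clusters, build a speaker model from them, then re-align every segment) can be sketched with scikit-learn; KMeans and cosine similarity are stand-ins for whatever clustering and similarity measure the patent actually uses, and per-segment speaker embeddings are assumed as input.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

def diarize(segment_embeddings: np.ndarray, num_speakers: int, extra_clusters: int = 3):
    """Return a speaker label for every row of segment_embeddings (num_segments, dim)."""
    # 1) over-cluster: more clusters than speakers ("early stop" before full merging)
    k = num_speakers + extra_clusters
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(segment_embeddings)

    # 2) keep the num_speakers clusters with the highest internal similarity
    def purity(c):
        members = segment_embeddings[labels == c]
        return cosine_similarity(members).mean() if len(members) > 1 else 0.0
    kept = sorted(range(k), key=purity, reverse=True)[:num_speakers]

    # 3) "speaker classification model": centroid of each kept cluster
    centroids = np.stack([segment_embeddings[labels == c].mean(axis=0) for c in kept])

    # 4) align every segment to its closest speaker centroid
    return cosine_similarity(segment_embeddings, centroids).argmax(axis=1)
```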
  • Patent number: 12113934
    Abstract: A computer-implemented method is provided for quantitative performance evaluation of a call agent. The method comprises converting an audio recording of a call between the call agent and a customer to a text-based transcript and identifying at least one topic for categorizing the transcript. The method also includes retrieving a set of criteria associated with the topic. Each criterion correlates to a set of predefined questions for interrogating the transcript to evaluate the performance of the call agent with respect to the corresponding criterion. Each question captures a sub-criterion under the corresponding criterion. The method further includes inputting the predefined questions and the transcript into a trained large language model to obtain scores for respective ones of the predefined questions. Each score measures a degree of satisfaction of the performance of the call agent during the call with respect to the sub-criterion captured by the corresponding predefined question.
    Type: Grant
    Filed: April 4, 2024
    Date of Patent: October 8, 2024
    Assignee: FMR LLC
    Inventors: Bryan Dempsey, Saquib Ilahi, Jenson Joy, Nirupam Sarkar, Murad Maayah, Abigail Parker, Meagan Gilbert, Derek Kaschl
  • Patent number: 12106055
    Abstract: A chatbot system is configured to execute code to perform determining, by the chatbot system, a classification result for an utterance and one or more anchors, each anchor of the one or more anchors corresponding to one or more anchor words of the utterance. For each anchor of the one or more anchors, one or more synthetic utterances are generated, and one or more classification results for the one or more synthetic utterances are determined. A report is generated by the chatbot system comprising a representation of a particular anchor of the one or more anchors, the particular anchor corresponding to a highest confidence value among the one or more anchors. The one or more synthetic utterances may be used to generate a new training dataset for training a machine-learning model. The training dataset may be refined according to a threshold confidence value to filter out datasets for training.
    Type: Grant
    Filed: August 20, 2021
    Date of Patent: October 1, 2024
    Assignee: Oracle International Corporation
    Inventors: Gautam Singaraju, Vishal Vishnoi, Manish Parekh, Alexander Wang
  • Patent number: 12106758
    Abstract: Systems and methods described herein relate to determining whether to incorporate recognized text, that corresponds to a spoken utterance of a user of a client device, into a transcription displayed at the client device, or to cause an assistant command, that is associated with the transcription and that is based on the recognized text, to be performed by an automated assistant implemented by the client device. The spoken utterance is received during a dictation session between the user and the automated assistant. Implementations can process, using automatic speech recognition model(s), audio data that captures the spoken utterance to generate the recognized text. Further, implementations can determine whether to incorporate the recognized text into the transcription or cause the assistant command to be performed based on touch input being directed to the transcription, a state of the transcription, and/or audio-based characteristic(s) of the spoken utterance.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: October 1, 2024
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Alvin Abdagic, Behshad Behzadi, Jacopo Sannazzaro Natta, Julia Proskurnia, Krzysztof Andrzej Goj, Srikanth Pandiri, Viesturs Zarins, Nicolo D'Ercole, Zaheed Sabur, Luv Kothari
  • Patent number: 12106777
    Abstract: Embodiments of the present disclosure provide an audio processing method and an electronic device. The method includes: first obtaining text information corresponding to a to-be-processed audio, where the text information includes a to-be-processed text and a playback period corresponding to each field in the to-be-processed text; then receiving a first input on the to-be-processed text; in response to the first input, determining, as a to-be-processed field, a field indicated by the first input in the to-be-processed text; then receiving a second input on the to-be-processed field; obtaining a target audio segment in response to the second input; and finally modifying an audio segment at a playback period corresponding to the to-be-processed field according to the target audio segment, to obtain a target audio.
    Type: Grant
    Filed: September 8, 2022
    Date of Patent: October 1, 2024
    Assignee: VIVO MOBILE COMMUNICATION CO., LTD.
    Inventor: Jixiang Hu
  • Patent number: 12101199
    Abstract: A conference system is described that associates a first device and a second device to the same user, compares a first input from the first device and a second input from the second device, and modifies a setting of a conference session. The first input and the second input may be a video input or an audio input. The modification may include, for example, noise removal, determination of the user's AV feed device, or removing a background image.
    Type: Grant
    Filed: July 21, 2023
    Date of Patent: September 24, 2024
    Assignee: Capital One Services, LLC
    Inventors: Lee Adcock, Mehulkumar Jayantilal Garnara, Vamsi Kavuri
  • Patent number: 12101279
    Abstract: Systems and methods that offer significant improvements to current virtual agent (VA) conversational experiences are disclosed. The proposed systems and methods are configured to manage conversations in real-time with human customers while accommodating a dynamic goal. The VA includes a goal-driven module with a reinforcement learning-based dialogue manager. The VA is an interactive tool that utilizes both task-specific rewards and sentiment-based rewards to respond to a dynamic goal. The VA is capable of handling dynamic goals with a significantly high success rate. As the system is trained primarily with a user simulator, it can be readily extended for applications across other domains.
    Type: Grant
    Filed: August 27, 2021
    Date of Patent: September 24, 2024
    Assignee: ACCENTURE GLOBAL SOLUTIONS LIMITED
    Inventors: Shubhashis Sengupta, Anutosh Maitra, Roshni Ramesh Ramnani, Sriparna Saha, Abhisek Tiwari, Pushpak Bhattacharyya
  • Patent number: 12100399
    Abstract: Methods, apparatus, systems, and computer-readable media are provided for isolating at least one device, from multiple devices in an environment, for being responsive to assistant invocations (e.g., spoken assistant invocations). A process for isolating a device can be initialized in response to a single instance of a spoken utterance, of a user, that is detected by multiple devices. One or more of the multiple devices can be caused to query the user regarding identifying a device to be isolated for receiving subsequent commands. The user can identify the device to be isolated by, for example, describing a unique identifier for the device. Unique identifiers can be generated by each device of the multiple devices and/or by a remote server device. The unique identifiers can be presented graphically and/or audibly to the user, and the device to be isolated can be identified through user interface input. Any device that is not identified can become temporarily unresponsive to certain commands, such as spoken invocation commands.
    Type: Grant
    Filed: August 28, 2023
    Date of Patent: September 24, 2024
    Assignee: GOOGLE LLC
    Inventors: Vikram Aggarwal, Moises Morgenstern Gali
  • Patent number: 12094464
    Abstract: An utterance analysis device including: a storage that stores a plurality of pieces of related information, each relating to one of a plurality of categories; a control circuit that receives utterance data of an utterer in time-series order and analyzes content of the utterance data by using a plurality of first likelihoods, each of which is a value for identifying a possibility that the acquired utterance data corresponds to each category; and a display processor that displays, under control of the control circuit, display data including link information indicating an association for displaying, from the storage, related information relating to the category of the utterance data.
    Type: Grant
    Filed: December 17, 2021
    Date of Patent: September 17, 2024
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventor: Natsuki Saeki
  • Patent number: 12087286
    Abstract: A computing system obtains features that have been extracted from an acoustic signal, where the acoustic signal comprises spoken words uttered by a user. The computing system performs automatic speech recognition (ASR) based upon the features and a language model (LM) generated based upon expanded pattern data. The expanded pattern data includes a name of an entity and a search term, where the entity belongs to a segment identified in a knowledge base. The search term has been included in queries for entities belonging to the segment. The computing system identifies a sequence of words corresponding to the features based upon results of the ASR. The computing system transmits computer-readable text to a search engine, where the text includes the sequence of words.
    Type: Grant
    Filed: May 6, 2021
    Date of Patent: September 10, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ankur Gupta, Satarupa Guha, Rupeshkumar Rasiklal Mehta, Issac John Alphonso, Anastasios Anastasakos, Shuangyu Chang
  • Patent number: 12086558
    Abstract: A method and system for automated voice casting compares candidate voice samples from candidate speakers in a target language with a primary voice sample from a primary speaker in a primary language. Utterances in the audio samples of the candidate speakers and the primary speaker are identified and typed, and voice samples are generated that meet applicable utterance type criteria. A neural network is used to generate an embedding for the voice samples. A voice sample can include groups of different utterance types, and embeddings are generated for each utterance group in the voice sample and then combined in a weighted form, wherein the resulting embedding emphasizes selected utterance types. Similarities between embeddings for the candidate voice samples relative to the primary voice sample are evaluated and used to select a candidate speaker that is a vocal match.
    Type: Grant
    Filed: March 9, 2021
    Date of Patent: September 10, 2024
    Assignee: Warner Bros. Entertainment Inc.
    Inventors: Aansh Malik, Ha Thanh Nguyen
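
A sketch of the weighted utterance-type embedding comparison described above; the embedding source, the utterance-type names, and the weights are assumptions for illustration.

```python
import numpy as np

def combined_embedding(per_type_embeddings: dict[str, np.ndarray],
                       weights: dict[str, float]) -> np.ndarray:
    """Weighted average of embeddings grouped by utterance type (e.g. 'shouted', 'whispered')."""
    total = sum(weights.get(t, 0.0) for t in per_type_embeddings)
    mix = sum(weights.get(t, 0.0) * e for t, e in per_type_embeddings.items()) / max(total, 1e-9)
    return mix / (np.linalg.norm(mix) + 1e-9)

def rank_candidates(primary: dict[str, np.ndarray],
                    candidates: dict[str, dict[str, np.ndarray]],
                    weights: dict[str, float]) -> list[tuple[str, float]]:
    """Rank candidate speakers by cosine similarity of their combined embedding to the primary's."""
    p = combined_embedding(primary, weights)
    scored = [(name, float(np.dot(p, combined_embedding(embs, weights))))
              for name, embs in candidates.items()]
    return sorted(scored, key=lambda x: x[1], reverse=True)
```

Raising the weight of, say, shouted utterances biases the ranking toward candidates who sound most like the primary speaker when shouting.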
  • Patent number: 12087292
    Abstract: Various embodiments of the teachings herein include methods and systems for providing a speech-based service for the control of room control elements in buildings. Speech instructions are received by means of an audio device. The audio device is configured to analyze the received speech instructions, to convert them into corresponding operating commands for room control elements for the control of, in particular, HVAC devices (e.g. field devices) in a building and to pass them on to the corresponding room control elements. Before the receipt of the speech instructions by the audio device, the identity of the sender (user) of the speech instructions is anonymized by means of an anonymization service.
    Type: Grant
    Filed: January 15, 2019
    Date of Patent: September 10, 2024
    Assignee: SIEMENS SCHWEIZ AG
    Inventors: Kai Rohrbacher, Oliver Zechlin
  • Patent number: 12079270
    Abstract: A system comprising a client computer, a data store comprising a content management repository, a server computer coupled to the client computer by a network, the server computer comprising code for: receiving audio data; converting the audio data to text; extracting a specified string from the text as an extracted string; determining an extracted string attribute for the extracted string; storing a media file containing the audio data as a content object; configuring the content object to be searchable by the extracted string; receiving a search query from the client application; searching a plurality of managed objects based on the search query; and based on determining that the extracted string matches a search string, returning an indication of the first media file, the extracted string and the extracted string attribute in a search result.
    Type: Grant
    Filed: December 9, 2019
    Date of Patent: September 3, 2024
    Assignee: OPEN TEXT HOLDINGS, INC.
    Inventors: Gajendra Babu Bandhu, Sharath Babu Pulumati