Speech To Image Patents (Class 704/235)
-
Patent number: 12387033
Abstract: An information insertion method performed by a computer device relates to the field of information processing. The method includes: displaying an editing interface corresponding to an online document; receiving a contact insertion operation in the editing interface; in response to the contact insertion operation, inserting a contact corresponding to a first account into the online document; and displaying the contact corresponding to the first account in the online document, the contact corresponding to the first account providing a communication portal for instant messaging with the first account through the online document.
Type: Grant
Filed: June 30, 2023
Date of Patent: August 12, 2025
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Tieming Huang, Yang Zhou, Rui Tang, Li Lin, Bin Li
-
Patent number: 12388831
Abstract: A method includes identifying a cluster of users with a plurality of devices, where each user from the cluster of users is associated with at least one device from the plurality of devices. The method also includes identifying an authorized user from the cluster of users to delegate role assignments to a remaining portion of the cluster of users and receiving, from the authorized user, a first role assignment for a first user from the remaining portion of the cluster of users. In response to receiving, from the first user, an audio command, the method also includes determining whether the first user is authorized to provide the audio command to the intelligent virtual assistant based on the first role assignment. In response to determining the first user is authorized to provide the audio command, the method also includes performing the audio command from the first user.
Type: Grant
Filed: September 19, 2022
Date of Patent: August 12, 2025
Assignee: International Business Machines Corporation
Inventors: Raghuveer Prasad Nagar, Sarbajit K. Rakshit, Radha Srinivasan, Sidharth Ullal
-
Patent number: 12380876
Abstract: Systems and processes for generating audio books from text are provided. An example process includes, at an electronic device having one or more processors and memory: receiving a text including at least a first subset and a second subset, wherein at least a portion of the first subset overlaps with at least a portion of the second subset; determining, based on the text, a prosody for a speech output, wherein the prosody is representative of a genre; determining a semantic meaning of the text; and generating, based on the prosody and the semantic meaning, the speech output of the text.
Type: Grant
Filed: October 31, 2022
Date of Patent: August 5, 2025
Assignee: Apple Inc.
Inventors: Ramya Rasipuram, William Beckman, Ladan Golipour, David A. Winarsky, Cheng-Chieh Yeh, Weicheng Zhang
-
Patent number: 12380909
Abstract: A device to perform speech enhancement includes one or more processors configured to process image data to detect at least one of an emotion, a speaker characteristic, or a noise type. The one or more processors are also configured to generate context data based at least in part on the at least one of the emotion, the speaker characteristic, or the noise type. The one or more processors are further configured to obtain input spectral data based on an input signal. The input signal represents sound that includes speech. The one or more processors are also configured to process, using a multi-encoder transformer, the input spectral data and the context data to generate output spectral data that represents a speech enhanced version of the input signal.
Type: Grant
Filed: June 14, 2023
Date of Patent: August 5, 2025
Assignee: QUALCOMM Incorporated
Inventors: Kyungguen Byun, Shuhua Zhang, Lae-Hoon Kim, Erik Visser, Sunkuk Moon, Vahid Montazeri
-
Patent number: 12373083
Abstract: An object-extraction method includes generating multiple partition objects based on an electronic document, and receiving a first user selection of a data element via a user interface of a compute device. In response to the first user selection, and using a machine learning model, a first subset of partition objects from the multiple partition objects is detected and displayed via the user interface. A user interaction, via the user interface, with one of the partition objects is detected, and in response, a weight of the machine learning model is modified, to produce a modified machine learning model. A second user selection of the data element is received via the user interface, and in response and using the modified machine learning model, a second subset of partition objects from the multiple partition objects is detected and displayed via the user interface, the second subset different from the first subset.
Type: Grant
Filed: February 8, 2021
Date of Patent: July 29, 2025
Inventors: Dan G. Tecuci, Ravi Kiran Reddy Palla, Hamid Reza Motahari Nezhad, Vincent Poon, Nigel Paul Duffy, Joseph Nipko
-
Patent number: 12367345
Abstract: Disclosed herein are system, method, and computer program product embodiments for machine learning systems to process incoming call-center calls to provide communication summaries that capture effort levels of statements made during interactive communications. For a given call, the system receives a transcript as the input and generates a textual summary as the output. In order to improve a call summary and customize a summarization task to a call center domain, the technology disclosed herein may employ a classifier that predicts an effort level and attention score for individual utterances within a call transcript, ranks the attention scores and uses selected ones of the ranked utterances in the summary.
Type: Grant
Filed: May 17, 2024
Date of Patent: July 22, 2025
Assignee: Capital One Services, LLC
Inventors: Aysu Ezen Can, Zachary S. Brown, Chris Symons
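The rank-and-select step described in the abstract of patent 12367345 can be illustrated with a minimal sketch. It is not the patented implementation: the function name `select_for_summary`, the parameter `k`, the sample scores, and the choice to restore transcript order after ranking are all assumptions made for illustration, and the scores stand in for whatever a trained classifier would output.

```python
def select_for_summary(utterances, scores, k=2):
    """Pick the k highest-scoring utterances, returned in transcript order.

    `scores` are hypothetical per-utterance attention scores; in a real
    system they would come from a trained classifier.
    """
    # Rank utterance indices by score, highest first.
    ranked = sorted(range(len(utterances)), key=lambda i: scores[i], reverse=True)
    # Keep the top k, then restore the original order (an assumption
    # made here so the summary reads chronologically).
    chosen = sorted(ranked[:k])
    return [utterances[i] for i in chosen]


transcript = [
    "Hi",
    "I called three times about my card",
    "Okay",
    "I still cannot log in",
]
attention = [0.1, 0.9, 0.05, 0.7]  # made-up scores for illustration
summary = select_for_summary(transcript, attention, k=2)
```

With these made-up scores, the two selected utterances are the second and fourth lines of the transcript.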
-
Patent number: 12367347
Abstract: Aspects of the present disclosure provide systems, methods, apparatus, and computer-readable storage media for determining semantic relationships between hashtags in social media messages, particularly for use in generating a hierarchical ontology of hashtags based on co-occurrence frequency and diversity metrics. For example, natural language processing may be performed on a plurality of social media messages to extract hashtags from the social media messages. Co-occurrence frequency counts for at least two hashtags and other hashtags, in addition to an ensemble score based on a combination of one or more diversity metrics, may be determined. A hierarchical ontology may be generated based on the co-occurrence frequency counts and the ensemble scores for the at least two hashtags. Such a hierarchical ontology may group hashtags into communities of common topics that are ordered based on ensemble scores.
Type: Grant
Filed: January 21, 2022
Date of Patent: July 22, 2025
Assignee: Thomson Reuters Enterprise Centre GmbH
Inventors: Spencer Bradley Torene, Blake Stephen Howald
-
Patent number: 12367351
Abstract: A conversion table generation device includes a similar word extraction unit and a conversion table generation unit. The similar word extraction unit is configured to extract similar words similar to first words, for each of the first words included in a word group used in a dialogue. The conversion table generation unit is configured to associate any one of the first words with the extracted similar words, that are similar to the plurality of first words, as second words, on the basis of the priority, and generates a conversion table for voice recognition with the second word as a conversion source and the first word as a conversion destination.
Type: Grant
Filed: December 25, 2019
Date of Patent: July 22, 2025
Assignee: NEC Corporation
Inventor: Shoujirou Moribe
-
Patent number: 12368691
Abstract: Systems, computer program products, and methods are described herein for generating alternative information formats using advanced computational models for data analysis and automated processing.
Type: Grant
Filed: October 24, 2023
Date of Patent: July 22, 2025
Assignee: BANK OF AMERICA CORPORATION
Inventors: Malinda Kieffer, Tanya A. Wilson, Susan J. Moss, Andrzej Grabski, Kiran Boosetty, Donna Lee Phillips, Gerard P. Gay, Robert Ronald Rosseland, Jr., Ravinder Kaur Sodhi, Rahul Kumar Mishra, Samuel M. Moiyallah, Jr.
-
Patent number: 12362954
Abstract: Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and method for mixing participant audio from multiple rooms within a virtual conferencing system. The program and method provide, in association with designing a first room for virtual conferencing, display of a user interface for mixing participant audio from one or more second rooms into an audio channel for the first room; receive indication of user input via the user interface, the user input corresponding to settings for mixing the participant audio from the one or more second rooms; and provide, based on the settings and in association with virtual conferencing within the first room, for mixing the participant audio from one or more second rooms with respect to the audio channel for the first room.
Type: Grant
Filed: September 7, 2023
Date of Patent: July 15, 2025
Assignee: SNAP INC.
Inventors: Andrew Cheng-min Lin, Walton Lin
-
Patent number: 12363216
Abstract: A system for establishing a wireless connection between a mobile device and a vehicle includes a human-machine interface (HMI), a vehicle communication system, where the vehicle communication system includes a wireless connection transceiver, a speaker, a microphone, a controller in electrical communication with the HMI, the vehicle communication system, the speaker, and the microphone. The controller is programmed to activate a wireless connection mode of the controller based at least in part on a signal from the microphone and transmit a vehicle wireless connection identifier using the speaker. The controller is further programmed to confirm a vehicle wireless connection passcode using at least one of the speaker and the microphone and establish a wireless connection between the mobile device and the controller using the wireless connection transceiver in response to confirming the vehicle wireless connection passcode.
Type: Grant
Filed: October 11, 2022
Date of Patent: July 15, 2025
Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: Mohamed A. Layouni, Markus Jochim, Thomas M. Forest
-
Patent number: 12361124
Abstract: The technology described herein identifies malicious URLs using a classifier that is both accurate and fast. Aspects of the technology are particularly well adapted for use as a real-time URL security analysis tool because the technology is able to quickly process a URL and produce a warning when a malicious URL is identified. The rapid processing speed of the technology described herein is produced, in part, by use of only a single input signal, which is the URL itself. The high accuracy produced by the technology described herein is achieved by analyzing the unstructured text on both a character-by-character level and a word-by-word level. The technology described herein uses both character-level and word-level information from the incoming URL.
Type: Grant
Filed: August 14, 2023
Date of Patent: July 15, 2025
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Arunkumar Gururajan, Jack Wilson Stokes, III, Farid Tajaddodianfar
-
Patent number: 12355023
Abstract: A discrete three-dimensional (3-D) processor comprises stacked first and second dice. The first die comprises three-dimensional memory (3D-M) arrays, whereas the second die comprises at least a portion of a logic/processing circuit and an off-die peripheral-circuit component of the 3D-M array(s). The preferred 3-D processor can be used to compute non-arithmetic function/model. In other applications, the preferred 3-D processor may also be a 3-D configurable computing array, a 3-D pattern processor, or a 3-D neuro-processor.
Type: Grant
Filed: October 23, 2022
Date of Patent: July 8, 2025
Assignee: Hong Kong HaiCun Technology Co., Limited
Inventor: Guobiao Zhang
-
Patent number: 12355617
Abstract: In one example, a server system interfaces with a plurality of remotely-situated client entities to provide data communications services. The system uses processing circuitry for: accessing an archive of digital voice data indicative of transcribed audio conversations respectively involving different client stations participating in data communications; correlating a text-based message received by a virtual assistant and associated with one of the different client entities, with at least one intent or at least one topic associated with the archived digital voice data; and automatically configuring the virtual assistant, based on the text-based message being correlated and via the data-processing computer circuitry, to address or otherwise process the received text-based message.
Type: Grant
Filed: April 10, 2024
Date of Patent: July 8, 2025
Assignee: 8x8, Inc.
Inventors: Bryan R. Martin, Matt Taylor, Manu Mukerji
-
Patent number: 12347435
Abstract: This disclosure describes techniques that include facilitating note-taking in various contexts, including when a customer is speaking to an agent of a business. In one example, this disclosure describes a method that includes analyzing, by a computing system, communications between a customer of an organization and an agent of the organization, where the communications include an issue to be addressed by the organization; generating, by the computing system, artifacts of the communication between the customer and the agent; determining, based on the artifacts of the communication, an action to be taken to address the issue; and generating, by the computing system, a user interface providing options associated with addressing the issue.
Type: Grant
Filed: August 9, 2022
Date of Patent: July 1, 2025
Assignee: Wells Fargo Bank, N.A.
Inventors: Ray Joanna Luz Reyes Ramilo, Dennis Emmanuel Montenegro, Manpreet Singh, Ananya Bandyopadhyay, Prakash Jagannathan
-
Patent number: 12347437
Abstract: A method for registering a user to a voice service. The method is implemented by an interface server and includes: obtaining, by using an electronic voice processing device including at least one component for capturing audio samples, at least one voice sample of the user; obtaining, from the at least one voice sample of the user, at least one information item for confirming consent of the user to the conditions for accessing the voice service; obtaining, from the voice sample, at least one information item associated with the user; and computing a reference voice print associated with the user.
Type: Grant
Filed: February 20, 2020
Date of Patent: July 1, 2025
Assignee: BANKS AND ACQUIRERS INTERNATIONAL HOLDING
Inventors: Arnaud Dubreuil, Quiterie D'Avout, Pierre Quentin
-
Patent number: 12348030
Abstract: Disclosed are a power demand side speech interaction method and system. The method includes: obtaining original demand information, the original demand information including user's basic information, user demand information, and a user demand time; converting the original demand information into first information in text format; performing text statistical analysis based on an industry term on the first information in text format, to obtain second information; searching for corresponding user's actual information from a database according to the second information; outputting the user's actual information; searching for a corresponding forecasting model from the database, according to the second information and the user's basic information; calculating, according to a policy limit value of latest policy information in the database, a time for which the model corresponding to the user's basic information reaches the policy limit value; and transmitting an early warning message.
Type: Grant
Filed: January 14, 2022
Date of Patent: July 1, 2025
Assignees: State Grid Lianyungang Power Supply Company, Lianyungang Zhiyuan Electric Power Design Co., Ltd.
Inventors: Bin Yang, Bo Yang, Weitai Kong, Zhi Sun, Jianxin Wang, Wenjun Ruan, Yucheng Ren, Lu Qi, Hao Chen, Yueping Kong, Wei Yu, Hong Li, Guangxi Li, Hao Wu, Xue Sun, Xuewen Sun, Houkai Zhao, Houying Song, Hongxin Yin
-
Patent number: 12347428
Abstract: Systems and methods are provided for determining hint words that improve the accuracy of automated speech recognition (ASR) systems. Hint words are typically determined in the context of a user issuing voice commands in connection with a voice interface system, however, a voice interface system may capture terms from overheard content and/or conversations. A system may determine a sliding window of hint words using a set of qualifier rules. The system may capture audio, e.g., from a conversation or played back content, as a first input and decipher a plurality of words including a qualifying first term added to the hint words. The voice interface system may capture more audio as a second input and decipher a second plurality of words including a qualifying second term. The first term may be removed from the set of hint words, e.g., when the second term is added or after an expiration time.
Type: Grant
Filed: July 30, 2021
Date of Patent: July 1, 2025
Assignee: Adeia Guides Inc.
Inventors: Ankur Anil Aher, Jeffry Copps Robert Jose
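The sliding window of hint words described in the abstract of patent 12347428 can be sketched as a small bounded, time-limited set. This is only an illustration under stated assumptions: the class name `HintWindow`, the `max_terms` and `ttl_seconds` parameters, and the qualifier rule (capitalized words longer than three letters) are hypothetical and are not taken from the patent.

```python
from collections import OrderedDict


class HintWindow:
    """Bounded, time-limited set of hint words (illustrative sketch only)."""

    def __init__(self, max_terms=3, ttl_seconds=60.0):
        self.max_terms = max_terms
        self.ttl = ttl_seconds
        self._terms = OrderedDict()  # term -> time it was last added

    @staticmethod
    def qualifies(word):
        # Hypothetical qualifier rule: capitalized and longer than 3 letters.
        return len(word) > 3 and word[0].isupper()

    def _expire(self, now):
        # Drop terms whose age has reached the expiration time.
        expired = [t for t, added in self._terms.items() if now - added >= self.ttl]
        for term in expired:
            del self._terms[term]

    def observe(self, words, now):
        """Add qualifying words heard at time `now`, evicting the oldest."""
        self._expire(now)
        for word in words:
            if self.qualifies(word):
                self._terms.pop(word, None)  # re-adding refreshes the term
                self._terms[word] = now
                while len(self._terms) > self.max_terms:
                    self._terms.popitem(last=False)  # remove the oldest term

    def hints(self, now):
        self._expire(now)
        return list(self._terms)


window = HintWindow(max_terms=2, ttl_seconds=10.0)
window.observe(["play", "Inception"], now=0.0)
first = window.hints(now=0.0)
window.observe(["Interstellar", "and", "Tenet"], now=5.0)
second = window.hints(now=5.0)   # the first term was pushed out by newer ones
third = window.hints(now=20.0)   # everything has expired by now
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the sketch deterministic and easy to test.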
-
Patent number: 12346500
Abstract: Methods and systems are disclosed for training a machine learning (ML) model to detect inner speech. The system collects, by an electromyograph (EMG) communication device used by a user, a first set of EMG signals over a first time interval. The system generates a first plurality of features based on the first set of EMG signals and generates a first probability associated with presence of inner speech by processing the first plurality of features with a machine learning (ML) model. The system compares the first probability generated by the ML model to a specified threshold and detects presence of the inner speech of the user in response to determining that the first probability generated by the ML model transgresses the specified threshold.
Type: Grant
Filed: April 17, 2023
Date of Patent: July 1, 2025
Assignee: Snap Inc.
Inventors: Mark Kliger, Meir Meshulam, Assif Ziv
-
Patent number: 12347422
Abstract: Systems and methods for indicating communication effectiveness with air traffic control (ATC) are disclosed. The method includes: receiving a transcribed message containing a plurality of words used by an ownship flight crew member in a communication directed to ATC; determining a message intent of the transcribed message from the words used in the communication; identifying a plurality of ideal words that should be used for an ideal message having the same message intent as the transcribed message; comparing the words used in the communication with the words that should have been used in the ideal message; determining based on the comparing whether the words used in the communication conformed to ATC standard phraseology (e.g., ICAO Pilot communication vocabulary); generating an indicator for flight crew that indicates whether the words used in the communication conformed to ATC standard phraseology; and signaling an aircraft display device to display the indicator.
Type: Grant
Filed: January 12, 2022
Date of Patent: July 1, 2025
Assignee: HONEYWELL INTERNATIONAL INC.
Inventors: Naveen Venkatesh Prasad Nama, Chaya Garg, Vasantha Paulraj, Gobinathan Baladhandapani, Hariharan Saptharishi, Sivakumar Kanagarajan
-
Patent number: 12340807
Abstract: A speech recognition apparatus (2000) acquires source data (10) representing an audio signal including an utterance. The speech recognition apparatus (2000) converts the source data (10) into a text string (30). The speech recognition apparatus (2000) generates a concatenated text (40) representing a content of an utterance by concatenating a text (32) included in the text string (30). Herein, texts (32) adjacent to each other in the text string (30) are such that parts of associated audio signals overlap each other on a time axis. At a time of concatenating texts (32) adjacent to each other, the speech recognition apparatus (2000) eliminates a trailing portion of a preceding text (32) and a leading portion of a succeeding text (32).
Type: Grant
Filed: March 9, 2020
Date of Patent: June 24, 2025
Assignee: NEC CORPORATION
Inventors: Shuji Komeiji, Hitoshi Yamamoto
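The concatenation step in the abstract of patent 12340807 (dropping the tail of the preceding text and the head of the succeeding text where their audio overlaps) can be illustrated with a short sketch. The abstract does not say how the cut point is chosen; cutting at the midpoint of the overlapping time range is an assumption made here, as are the `Chunk` type, the per-word timestamps, and the function name `concatenate`.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    start: float                     # chunk start time, in seconds
    end: float                       # chunk end time, in seconds
    words: list[tuple[float, str]]   # (word start time, recognized word)


def concatenate(chunks):
    """Join per-chunk transcripts, trimming both sides of each overlap."""
    out = []
    for i, chunk in enumerate(chunks):
        lo = float("-inf")
        hi = float("inf")
        if i > 0:
            # Midpoint of the overlap with the preceding chunk: words before
            # it (the leading portion of this chunk) are dropped.
            lo = (chunk.start + chunks[i - 1].end) / 2
        if i < len(chunks) - 1:
            # Midpoint of the overlap with the succeeding chunk: words at or
            # after it (the trailing portion of this chunk) are dropped.
            hi = (chunks[i + 1].start + chunk.end) / 2
        out.extend(word for t, word in chunk.words if lo <= t < hi)
    return " ".join(out)


# Two chunks overlapping between 2 s and 4 s. Words cut off at a chunk
# boundary ("fo", "own") are typical recognition errors in that region.
first = Chunk(0.0, 4.0, [(0.0, "the"), (1.0, "quick"), (2.0, "brown"), (3.0, "fo")])
second = Chunk(2.0, 6.0, [(2.0, "own"), (3.0, "fox"), (4.0, "jumps"), (5.0, "over")])
joined = concatenate([first, second])
```

The example shows why trimming both sides helps: the words most likely to be misrecognized sit at the edges of each chunk, and both edges are discarded.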
-
Patent number: 12334053
Abstract: An intelligent expanding similar word model system and a method thereof are provided. The system is operated in a database system host and includes: a character analysis unit, configured to combine a plurality of key word acoustic models with an interference sound key word test set into a key word forward test module; a candidate word generation unit, configured to generate a plurality of candidate word temporary acoustic models; a recognition rate processing unit, configured to generate a first candidate word acoustic model; a false waking-up rate processing unit, configured to generate a second candidate word acoustic model; and an adjustment unit, configured to combine the plurality of key word acoustic models with the second candidate word acoustic model into a similar word acoustic model.
Type: Grant
Filed: October 14, 2022
Date of Patent: June 17, 2025
Assignee: CYBERON CORPORATION
Inventors: Chin-Jung Liu, Shih-Hsun Chen, Chih-Lung Lin
-
Patent number: 12334068
Abstract: Approaches are generally described for corrupted speech detection in voice-based computer interfaces. First input data including first audio data representing a user utterance may be received. First data representing the first audio data may be generated using a first encoder. First text data representing a transcription of the user utterance may be generated. Second data representing the first text data may be generated using a second encoder different from the first encoder. Third data may be generated by combining the first data and the second data. The third data may be sent to a classifier network trained to predict a relevant corruption state for speech processing inputs. The classifier network may determine that the first input data corresponds to a first corruption state.
Type: Grant
Filed: September 29, 2022
Date of Patent: June 17, 2025
Assignee: AMAZON TECHNOLOGIES, INC.
Inventors: Di Wang, Deshen Wang, Lan Ma, Shu Wang, Wenbo Yan, Prathap Ramachandra
-
Patent number: 12334060
Abstract: A system and method that can be implemented in, among other things, a computer-implemented method for intuitive dictation without or with minimal use of other input devices besides a microphone, and without or with minimal use of keywords. The method includes receiving speech audio data from a microphone and in response to receiving the audio data, determining automatically whether a transcription of it is intended as a substitution for a fragment of existing text or as new additional text. The method further includes aligning a representation of the speech audio data with the existing text and based on that, determining the likelihood that a transcription of the speech audio data is intended as a replacement of a fragment of existing text and what that fragment is. The method further includes automatically replacing the fragment with the transcription or inserting/appending the transcription, adjusting the final text for proper punctuation and semantics.
Type: Grant
Filed: March 21, 2023
Date of Patent: June 17, 2025
Inventor: Orlin Todorov
-
Patent number: 12334048
Abstract: A device may receive and convert audio data to text data in real-time, and may detect a network fluctuation that causes missing voice packets. The device may process partial text and context of the text data, with a model, to generate a new phrase, and may generate a response phoneme for the new phrase. The device may utilize a text embedding model to generate a text embedding for the response phoneme, and may process the audio data, with the model, to generate a target voice sequence. The device may utilize an audio embedding model to generate an audio embedding for the target voice sequence, and may combine the text embedding and the audio embedding to generate an embedding input vector. The device may process the embedding input vector, with an audio synthesis model, to generate a final voice response, and may provide the audio data and the final voice response.
Type: Grant
Filed: October 12, 2022
Date of Patent: June 17, 2025
Assignee: Verizon Patent and Licensing Inc.
Inventors: Saurabh Tahiliani, Subham Biswas
-
Patent number: 12334055
Abstract: The amount of future context used in a speech processing application allows for tradeoffs between performance and the delay in providing results to users. Existing speech processing applications may be trained with a specified future context size and perform poorly when used in production with a different future context size. A speech processing application trained using a stochastic future context allows a trained neural network to be used in production with different amounts of future context. During an update step in training, a future-context size may be sampled from a probability distribution, used to mask a neural network, and compute an output of the masked neural network. The output may then be used to compute a loss value and update parameters of the neural network. The trained neural network may then be used in production with different amounts of future context to provide greater flexibility for production speech processing applications.
Type: Grant
Filed: November 18, 2021
Date of Patent: June 17, 2025
Assignee: ASAPP, INC.
Inventors: Kwangyoun Kim, Felix Wu, Prashant Sridhar, Kyu Jeong Han
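The sampling-and-masking part of the update step described in the abstract of patent 12334055 can be sketched in a few lines. This covers only how a future-context size might be drawn and turned into a mask; the network, loss, and parameter update are left out. The candidate sizes in `FUTURE_CONTEXT_CHOICES`, the uniform distribution, and the function names are assumptions made for illustration, not details from the patent.

```python
import random

# Hypothetical candidate future-context sizes, in frames.
FUTURE_CONTEXT_CHOICES = [0, 1, 2, 4]


def sample_future_context(rng):
    """Draw a future-context size; a uniform distribution is assumed here."""
    return rng.choice(FUTURE_CONTEXT_CHOICES)


def future_context_mask(n_frames, future):
    """mask[i][j] is True when frame i may look at frame j.

    Every frame sees all past frames, itself, and at most `future`
    frames ahead.
    """
    return [[j <= i + future for j in range(n_frames)] for i in range(n_frames)]


rng = random.Random(0)
for step in range(3):
    k = sample_future_context(rng)
    mask = future_context_mask(5, k)
    # A real training step would apply `mask` inside the network (for
    # example in its attention layers), compute the loss on the masked
    # output, and update the parameters. That part is omitted here.
```

Because a different mask can be drawn at every update, the same trained network is exposed to several look-ahead settings, which is the flexibility the abstract describes for production use.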
-
Patent number: 12334075
Abstract: Modern automatic speech recognition (ASR) systems can utilize artificial intelligence (AI) models to service ASR requests. The number and scale of AI models used in a modern ASR system can be substantial. The process of configuring and reconfiguring hardware to execute various AI models corresponding to a substantial number of ASR requests can be time consuming and inefficient. Among other features, the described technology utilizes batching of ASR requests, splitting of the ASR requests, and/or parallel processing to efficiently use hardware tasked with executing AI models corresponding to ASR requests. In one embodiment, the compute graphs of ASR tasks are used to batch the ASR requests. The corresponding AI models of each batch can be loaded into hardware, and batches can be processed in parallel. In some embodiments, the ASR requests are split, batched, and processed in parallel.
Type: Grant
Filed: October 14, 2022
Date of Patent: June 17, 2025
Assignee: Deepgram, Inc.
Inventors: Adam Joseph Sypniewski, Joshua Gevirtz, Nikola Lazar Whallon, Anthony John Deschamps, Scott Ivan Stephenson
-
Patent number: 12327571
Abstract: A method may include recording operation of equipment to create an audio file, extracting features from the audio file, inputting the extracted features into a machine learning model, and determining with the machine learning model a score indicative of the operation of the equipment. A system may include an audio sensor to record audio of operation of equipment and generate an audio file, and one or more processors. The one or more processors extract features from the audio file, input the extracted features into a machine learning model, and determine with the machine learning model a score indicative of the operation of the equipment.
Type: Grant
Filed: November 10, 2021
Date of Patent: June 10, 2025
Assignee: Transportation IP Holdings, LLC
Inventor: Naveenkumar Ramaiah
-
Patent number: 12323554
Abstract: A device, system and method for analyzing audio speech signals to detect fraudulent calls to a contact center comprising splitting an audio recording of a call in real-time into a foreground speech signal attributed to a main speaker and a background audio signal, extracting audio features from the foreground speech signal and background audio signal, inputting the extracted audio features into an ensemble model comprising multiple different machine learning models co-trained to cumulatively detect fraud, wherein the multiple different machine learning models include: a speaker audio model to detect audio speech anomalies, a speaker intent model to classify intent of the main speaker, and a prosody model to detect voice intonation of the main speaker. A prediction may be output, by the ensemble model, indicating whether the call is fraudulent.
Type: Grant
Filed: November 10, 2024
Date of Patent: June 3, 2025
Assignee: Morgan Stanley Services Group Inc.
Inventors: Cheryl Fernandes, Mehak Mehta, Aratrika Sarkar, Melissa Kagaju
-
Patent number: 12315494
Abstract: An electronic device may include a user interface, a processor operatively connected to the user interface, and a memory operatively connected to the processor. The memory may store instructions that, when executed, may cause the processor to identify a modified hotword included in the first user input in response to failing to detect a hotword included in a first user input received using the user interface, to monitor a second user input received during a specified time using the user interface, to identify an existing hotword corresponding to the modified hotword using the second user input, to provide response data indicating whether to update the existing hotword using the modified hotword, through the user interface, and to update a hotword model based on a user input to the response data. Moreover, various example embodiments found through the disclosure, as well as other embodiments, are possible.
Type: Grant
Filed: July 22, 2022
Date of Patent: May 27, 2025
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hyunson Seo, Kartik Khandelwal
-
Patent number: 12316516
Abstract: A first UE may receive a V2X message from a second UE. The V2X message may be a suspicious message. The V2X message may include a plurality of values. Each value in the plurality of values may correspond to a respective field in a plurality of fields. The first UE may identify a health score associated with the V2X message. The first UE may adjust at least one value in the plurality of values if the health score is greater than a threshold. Accordingly, a health level of a suspicious V2X message may be analyzed quantitatively. A suspicious V2X message with a sufficiently low health level may be suppressed. Moreover, a correctable field value in a suspicious V2X message may be corrected based on a computed, measured, or prestored value.
Type: Grant
Filed: July 19, 2022
Date of Patent: May 27, 2025
Assignee: QUALCOMM Incorporated
Inventors: Jean-Philippe Monteuuis, Cong Chen, Jonathan Petit, Mohammad Raashid Ansari
-
Patent number: 12315533
Abstract: The disclosure provides technology for enhancing the ability of a computing device to detect when a user has discontinued reading a text source. An example method includes receiving audio data comprising a spoken word associated with a text source, comparing the audio data with data of the text source, determining, based on the comparing, whether a segment of the audio data corresponds to a location of the text source, and responsive to determining that the segment of the audio data does not correspond to a location of the text source, transmitting a signal indicating that a user has discontinued reading the text source, the signal causing to cease the comparing of the audio data with the data of the text source.
Type: Grant
Filed: December 29, 2023
Date of Patent: May 27, 2025
Assignee: Google LLC
Inventors: Chaitanya Gharpure, Evan Fisher, Eric Liu, Peng Yang, Emily Hou, Victoria Fang
-
Patent number: 12300249
Abstract: Methods, systems, and computer-readable media for rapid event voice documentation are provided herein. The rapid event voice documentation system captures verbalized orders and actions and translates that unstructured voice data to structured, usable data for documentation. The voice data captured is tagged with metadata including the name and role of the speaker, a time stamp indicating a time the data was spoken, and a clinical concept identified in the data captured. The system automatically identifies orders (e.g., medications, labs and procedures, etc.), treatments, and assessments/findings that were verbalized during the rapid event to create structured data that is usable by a health information system and ready for documentation directly into an EHR. The system provides all of the captured data including orders, assessment documentation, vital signs and measurements, performed procedures, and treatments, and who performed each, available for viewing and interaction in real time.
Type: Grant
Filed: May 1, 2024
Date of Patent: May 13, 2025
Assignee: CERNER INNOVATION, INC.
Inventors: Allison Michelle Thilges, Neil Curtis Pfeiffer, Eslie Rolland Phillips, III, Geoffrey Harold Simmons
-
Patent number: 12300243
Abstract: In one aspect, a method includes receiving podcast content, generating a transcript of at least a portion of the podcast content, and parsing the podcast content to (i) identify audio segments within the podcast content, (ii) determine classifications for the audio segments, (iii) identify audio segment offsets, and (iv) identify sentence offsets. The method also includes, based on the audio segments, the classifications, the audio segment offsets, and the sentence offsets, dividing the generated transcript into text sentences and, from among the text sentences of the divided transcript, selecting a group of text sentences for use in generating an audio summary of the podcast content. The method also includes, based on timestamps at which the group of text sentences begin in the podcast content, combining portions of audio in the podcast content that correspond to the group of text sentences to generate an audio file representing the audio summary.
Type: Grant
Filed: February 22, 2022
Date of Patent: May 13, 2025
Assignee: Gracenote, Inc.
Inventors: Amanmeet Garg, Aneesh Vartakavi, Joshua Ernest Morris
-
Patent number: 12300225
Abstract: Systems, methods, and computer-readable media for correcting transcriptions created through automatic speech recognition. A transcription of speech created using an automatic speech recognition system can be received. One or more domain-specific contexts associated with the speech can be identified and a text span that includes a mistranscribed entry can be recognized from the speech based on the one or more domain-specific contexts. Additionally, features can be extracted from the mistranscribed entry and the extracted features can be matched against an index of domain-specific entries to identify a correct entry of the mistranscribed entry. Subsequently, the transcription can be corrected by replacing the mistranscribed entry with the correct entry.
Type: Grant
Filed: September 22, 2022
Date of Patent: May 13, 2025
Assignee: Cisco Technology, Inc.
Inventors: Karthik Raghunathan, Arushi Raghuvanshi, Vijay Ramakrishnan Thimmaiyah, Lucien Serapio Carroll, Varsha Ravikumar Embar
-
Patent number: 12299390
Abstract: Generating text suggestions based on context can leverage sources associated with the context to generate more accurate and informed text suggestions. For example, the context can be a user situation, such as the user attending a meeting. Obtaining text from sources associated with the user situation can generate a corpus of text that can be leveraged for generating the context-based text suggestions.
Type: Grant
Filed: June 4, 2021
Date of Patent: May 13, 2025
Assignee: GOOGLE LLC
Inventors: Daniel V. Klein, Igor dos Santos Ramos
-
Patent number: 12293825
Abstract: In some aspects, a method of using a virtual medical assistant to assist a medical professional, the virtual medical assistant implemented, at least in part, by at least one processor of a host device capable of connecting to at least one network is provided. The method comprises receiving free-form instruction from the medical professional, providing the free-form instruction for processing to assist in identifying from the free-form instruction at least one medical task to be performed, obtaining identification of at least one impediment to performing the at least one medical task, and inferring at least some information needed to overcome the at least one impediment.
Type: Grant
Filed: July 7, 2022
Date of Patent: May 6, 2025
Assignee: Microsoft Technology Licensing, LLC.
Inventors: Guido Remi Marcel Gallopyn, Justin Hubbard, Reid W. Coleman
-
Patent number: 12294676
Abstract: A method and system provide for receiving the first request to generate a pipeline flow, identifying a user account based on the first request, associating the pipeline flow with the user account, receiving a second request to process an action associated with the user account, and processing the action including applying the pipeline flow to select the component configuration based on the parameter.
Type: Grant
Filed: August 16, 2022
Date of Patent: May 6, 2025
Assignee: Twilio Inc.
Inventors: Christer Jan Erik Fahlgren, Umair Akeel
-
Patent number: 12293134
Abstract: According to some embodiments, a method includes: receiving, by a client device, speech of a user during a screen sharing session; transcribing, by the client device, the speech into text; analyzing, by the client device, the text to identify one or more UI elements referenced within the speech, the one or more UI elements visible within the screen sharing session; and highlighting the one or more UI elements visible on the client device.
Type: Grant
Filed: November 24, 2021
Date of Patent: May 6, 2025
Inventors: Hao Wu, Taodong Lu, Yu Xin
-
Patent number: 12288031
Abstract: Filtering user intents corresponding to user utterances is provided. A list of allowed user intents is generated, using a natural language understanding model of a chatbot, based on identifying one or more of a set of user intents corresponding to a user utterance within a filtered user intent mapping table. It is determined whether a user intent having a highest confidence score in the set of user intents corresponding to the user utterance is contained in the list of allowed user intents. In response to determining that the user intent having the highest confidence score in the set of user intents corresponding to the user utterance is contained in the list of allowed user intents, content corresponding to the user intent having the highest confidence score is sent, using the chatbot, to a client device of a user who submitted the user utterance as a response to the user utterance.
Type: Grant
Filed: July 13, 2022
Date of Patent: April 29, 2025
Assignee: ADP, Inc.
Inventors: Henry C. Will, IV, Stefan George Wilk
-
Patent number: 12288555
Abstract: An electronic device receives a voice input, and determines whether the voice input is matched with a natural language understanding (NLU) model for determining the presence or absence of a verb. The electronic device further identifies a display context object associated with the voice input based on the voice input being matched with the NLU model. The electronic device calculates a similarity value between the voice input and the display context object, and updates a user interface (UI) depending on the calculated similarity value.
Type: Grant
Filed: September 14, 2022
Date of Patent: April 29, 2025
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Yoonju Lee, Dongwan Kim, Juwhan Kim, Yoonjae Park
-
Patent number: 12284582
Abstract: The system receives a SIP call from a UE associated with a user, where the SIP call includes multiple fields, and where a field among the multiple fields indicates that the SIP call is the open-line call. The system receives a first indication that the UE associated with a callee has generated a notification to the callee of the SIP call. Upon identifying the SIP call as the open-line call and receiving the first indication, the system sends a first message to the UE associated with the caller that the UE associated with the callee has generated the notification. The first message indicates to the UE associated with the caller to generate a first inaudible notification indicating that the UE associated with the callee generated the notification.
Type: Grant
Filed: March 15, 2022
Date of Patent: April 22, 2025
Assignee: T-Mobile USA, Inc.
Inventors: Hsin-Fu Henry Chiang, William Michael Hooker
-
Patent number: 12283291
Abstract: Systems, devices, and methods are provided for determining factually consistent generative narrations. A narrative may be generated by performing steps to determine one or more metadata messages for a first portion of a video stream, determine transcribed commentary for a second portion of the video stream, wherein the second portion includes the first portion, and determine a prompt based at least in part on the one or more metadata messages and the transcribed commentary. The prompt may be provided to a generative model that produces an output text. Techniques for performing a factual consistency evaluation may be used to determine a consistency score for the output text that indicates whether the output text is factually consistent with the one or more metadata messages and the transcribed commentary. A narrated highlight video may be generated using the consistent narrative.
Type: Grant
Filed: August 16, 2023
Date of Patent: April 22, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Noah Lirone Sarfati, Ido Yerushalmy, Michael Chertok, Ianir Ideses
-
Patent number: 12282745
Abstract: An intelligent question answering method includes: determining, based on received question information, a target object and a target attribute corresponding to the question information; obtaining an answer knowledge path and an external knowledge path of the target object other than the answer knowledge path from a pre-established knowledge graph based on the target object and the target attribute, the answer knowledge path including target context information for describing the target attribute, and the external knowledge path including external context information for describing another attribute; inputting the answer knowledge path and the external knowledge path into a trained neural network model to obtain a reply text, a training corpus of the neural network model during training including at least comment information of the target object; and outputting the reply text.
Type: Grant
Filed: March 14, 2022
Date of Patent: April 22, 2025
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Xiaoxue Liu, Yuyao Tang, Ninghua Wang, He Liu
-
Patent number: 12274940
Abstract: A method and system for providing gaze-based generation of virtual effects indicators correlated with directional sounds is disclosed. Gaze data is tracked via a camera associated with a client device to identify a point of focus within a three-dimensional virtual environment towards which one or both eyes of the player are focused. When the point of focus indicated by the gaze data does not move towards the source location within the three-dimensional virtual environment when the directional sound is received, this indicates that a virtual effect indicator associated with the directional sound type of the indicated directional sound should be generated.
Type: Grant
Filed: April 17, 2024
Date of Patent: April 15, 2025
Assignees: Sony Interactive Entertainment LLC, Sony Interactive Entertainment Inc.
Inventors: Kristie Ramirez, Elizabeth Juenger, Katie Egeland, Sepideh Karimi, Lachmin Singh, Olga Rudi
-
Patent number: 12277870
Abstract: An item generation interface may generate knowledge assessment items directed to a subject area based on a set of model items collectively directed to the subject area. The item generation interface may group the set of model assessment items into a plurality of similar item groups using numeric features corresponding to the model assessment items. Similar item groups may include model assessment items covering conceptually similar concepts within the subject area. A conditioning input may be generated for each of the item groups based on the numeric features corresponding to the model assessment items in the item group. Responsive to providing the conditioning inputs to a transformer-based natural language generation model, the item generation interface may receive raw assessment items from the transformer-based natural language generation model. Knowledge assessment items may be identified from the raw assessment items.
Type: Grant
Filed: June 27, 2022
Date of Patent: April 15, 2025
Assignee: Prometric LLC
Inventors: Saad Masood Khan, Jesse Andrew Lewis Hamer, Tiago Lima Almeida, Charles Foster, Geoff Converse, Claudio Souza, Lucas Cezimbra, Sara Vispoel
-
Patent number: 12266345
Abstract: This disclosure relates generally to ASR and is particularly directed to automatic, efficient, and intelligent detection of transcription bias in ASR models. Contrary to a traditional approach to the testing of ASR bias, the example implementations disclosed herein do not require actual test speeches and corresponding ground-truth texts. Instead, test speeches may be machine-generated from a pre-constructed reference textual passage according to short speech samples of speakers using a neural voice cloning technology. The reference passage may be constructed according to a particular target domain of the ASR model being tested. Bias of the ASR model in various aspects may be identified by analyzing transcribed text from the machine-generated speeches and the reference textual passage. The underlying principles for bias detection may be applied to evaluation of general transcription effectiveness and accuracy of the ASR model.
Type: Grant
Filed: August 25, 2022
Date of Patent: April 1, 2025
Assignee: Accenture Global Solutions Limited
Inventors: Anup Bera, Hemant Palivela
-
Patent number: 12266363
Abstract: The present disclosure provides methods, devices, apparatus, and storage medium for performing speech-to-text conversion. The method includes: displaying, by a first device, a first user interface, the first user interface being a display screen of a virtual environment that provides a virtual activity place for a first virtual role controlled by a first user account; displaying, by a second device, a second user interface, the second user interface being a display screen of a virtual environment that provides a virtual activity place for a second virtual role controlled by a second user account; in response to a speech input operation by the first user account performed on the first device, displaying, by the first device, a chat message in a first language, and displaying, by the second device, the chat message in a second language.
Type: Grant
Filed: October 13, 2021
Date of Patent: April 1, 2025
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Peicheng Liu, Xiaohao Liu, Yancan Wang, Dong Ding, Kai Tang, Shan Lin
-
Patent number: 12266197
Abstract: Systems and computer-implemented methods disclosed herein relate to detecting errors in manually entered data. In one embodiment, the system can identify a named entity automatically from a conversation between a customer and service agent with a named entity recognition model that employs natural language processing and machine learning to detect a word or string of words in the conversation that corresponds to a named entity category. In another embodiment, the system can determine whether data entered into a field on a service platform by the service agent includes an error by comparing the data entered with the named entity. In another embodiment, the system can transmit an alert to the service agent through the service platform when there is a mismatch between the named entity and the data entered.
Type: Grant
Filed: April 28, 2022
Date of Patent: April 1, 2025
Assignee: Capital One Services, LLC
Inventors: Tyler Maiman, Joshua Edwards, Feng Qiu, Michael Mossoba, Alexander Lin, Meredith L Critzer, Guadalupe Bonilla, Vahid Khanagha, Mia Rodriguez, Aysu Ezen Can
-
Patent number: 12265581
Abstract: Multi-modal search systems with improved search request routing are provided. A device can include a module that identifies, based on content of a search request, provider criterion that indicates factors to be considered in making a routing decision, a criterion processor that determines, based on the provider criterion, a routing decision indicating whether to route the search request to a search engine or a chat engine based, at least in part, on respective compute costs of servicing the search request using the search engine and the chat engine, respectively, and respective accuracies of responses provided responsive to the search request using the search engine and the chat engine, respectively, and an output port coupled to receive the search request and to provide the search request to the search engine or the chat engine in accord with the routing decision.
Type: Grant
Filed: September 28, 2023
Date of Patent: April 1, 2025
Assignee: Microsoft Technology Licensing, LLC
Inventor: Ryen William White