Patents Examined by Abul K. Azad
  • Patent number: 12387042
    Abstract: A method including transcribing, automatically, an ongoing stream of voice data into text phrases. The method also includes receiving an indication of a selected text phrase in the text phrases. The method also includes converting the selected text phrase to a selected phrase vector. The method also includes generating a subsequent text phrase, after the selected text phrase, from the ongoing stream of voice data, and adding the subsequent text phrase to the text phrases. The method also includes converting the subsequent text phrase to a subsequent phrase vector. The method also includes generating a similarity confidence score from the selected phrase vector and the subsequent phrase vector, using a machine learning model. The method also includes highlighting, responsive to the similarity confidence score exceeding a threshold value, the subsequent text phrase in the text phrases.
    Type: Grant
    Filed: April 18, 2024
    Date of Patent: August 12, 2025
    Assignee: Intuit Inc.
    Inventors: Amir Eftekhari, Roger C. Meike
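The highlighting step in the abstract above can be sketched as follows. This is only an illustration, not the patented method: the abstract says a machine learning model produces the similarity confidence score, whereas this sketch swaps in plain cosine similarity over bag-of-words counts. The names `phrase_vector`, `similarity`, and `highlight_similar`, and the 0.5 threshold, are hypothetical.

```python
import math
from collections import Counter


def phrase_vector(phrase):
    """Toy stand-in for a phrase embedding: bag-of-words counts (assumption)."""
    return Counter(phrase.lower().split())


def similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def highlight_similar(selected, subsequent_phrases, threshold=0.5):
    """Pair each later phrase with a flag: True if its score exceeds the threshold."""
    selected_vec = phrase_vector(selected)
    return [
        (phrase, similarity(selected_vec, phrase_vector(phrase)) > threshold)
        for phrase in subsequent_phrases
    ]


if __name__ == "__main__":
    later = ["please reset my password now", "what is the weather"]
    for phrase, flagged in highlight_similar("reset my password", later):
        print(("[HIGHLIGHT] " if flagged else "            ") + phrase)
```

In a real system the count vectors would presumably be replaced by learned embeddings, and the threshold tuned on labeled data.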
  • Patent number: 12380881
    Abstract: Systems and methods relate to executing a task using a machine learning model based on prompt generation and collaborative interactions with a user. The machine learning model generates a set of questions based on a task request. The user interactively answers the questions. A task processor generates a set of question-answer pairs based on the questions generated by the machine learning model and the answers given by the user. The machine learning model generates a task specific output based on the set of question-answer pairs. The machine learning model represents a large language model with deep learning. The simple question-and-answer prompts enable non-expert users to instruct the machine learning model with information that is sufficient to execute the task without overwhelming the users with the operations. The machine learning model leverages the answers to execute the task with accuracy, thereby providing efficacy of the prompting technique.
    Type: Grant
    Filed: October 21, 2022
    Date of Patent: August 5, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Elnaz Nouri, Swaroop Ranjan Mishra
  • Patent number: 12374326
    Abstract: Techniques for determining when speech is directed at another individual of a dialog, and storing a representation of such user-directed speech for use as context when processing subsequently-received system-directed speech are described. A system receives audio data and/or video data and determines therefrom that speech in the audio data is user-directed. Based on this, the system determines whether the speech is able to be used to perform an action by the system. If the speech is able to be used to perform an action, the system stores a natural language representation of the speech. Thereafter, when the system receives system-directed speech, the system generates a rewrite of a natural language representation of the system-directed speech based on the previously-received user-directed speech. The system then determines output data responsive to the system-directed speech using the rewritten natural language representation.
    Type: Grant
    Filed: April 28, 2023
    Date of Patent: July 29, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Alexandros Potamianos, Arijit Biswas, Bonan Zheng, Anushree Venkatesh, Yohan Jo, Vincent Auvray, Nikolaos Malandrakis, Aaron Challenner, Xinyan Zhao, Angeliki Metallinou, David A Jara, Jiahui Li, Ying Shi, Nikko Strom, Veerdhawal Pande
  • Patent number: 12367890
    Abstract: An electronic device having a circuitry configured to perform audio source separation on an audio input signal to obtain a separated source and configured to perform audio dubbing on the separated source based on replacement conditions to obtain a personalized separated source.
    Type: Grant
    Filed: March 17, 2021
    Date of Patent: July 22, 2025
    Assignee: Sony Group Corporation
    Inventors: Stefan Uhlich, Giorgio Fabbro, Marc Ferras Font, Falk-Martin Hoffmann, Thomas Kemp
  • Patent number: 12361957
    Abstract: A Unified Speech and Audio Codec (USAC) that may process a window sequence based on mode switching is provided. The USAC may perform encoding or decoding by overlapping between frames based on a folding point when mode switching occurs. The USAC may process different window sequences for each situation to perform encoding or decoding, and thereby may improve a coding efficiency.
    Type: Grant
    Filed: January 30, 2024
    Date of Patent: July 15, 2025
    Assignees: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
    Inventors: Seungkwon Beack, Tae Jin Lee, Min Je Kim, Kyeongok Kang, Dae Young Jang, Jeongil Seo, Jin Woo Hong, Chieteuk Ahn, Ho Chong Park, Young-cheol Park
  • Patent number: 12361932
    Abstract: According to one embodiment, a method, computer system, and computer program product for human-machine interfacing is provided. The present invention may include receiving a personality corpus associated with a personality typology comprising multiple personality types; extracting a plurality of utterances from a user; selecting, by a personality model, a personality type associated with the user based on the utterances and the personality corpus; identifying a compatible personality type of the selected personality type; constructing one or more natural language scripts from a word graph associated with the compatible personality type; and transmitting the one or more natural language scripts to the user.
    Type: Grant
    Filed: June 13, 2023
    Date of Patent: July 15, 2025
    Assignee: International Business Machines Corporation
    Inventors: Irene Lizeth Manotas Gutiérrez, Ra'eesa Kabir, Jonathan D. Dunne
  • Patent number: 12361225
    Abstract: Methods for generating and utilizing a multi-modal discourse tree (MMDT) are provided herein. An extended discourse tree (EDT) may be generated (e.g., from a discourse tree (DT) or a communicative DT (CDT)) from a corpus of text. Data records (e.g., records containing numerical data) may be linked to the extended discourse tree to generate a multi-modal discourse tree. The multi-modal discourse tree may link any suitable text/records from disparate sources. For example, entities identified from elementary discourse units of the EDT may be matched to an entity of a data record. Causal links may be identified between EDTs and/or data records. Rhetorical relationships can be identified for each entity/causal link match to incorporate the data records with the EDT to generate an MMDT. The MMDT may be used to classify subsequent input, to generate answers to subsequent questions, to navigate the corpus of text and/or data records, or the like.
    Type: Grant
    Filed: April 8, 2024
    Date of Patent: July 15, 2025
    Assignee: Oracle International Corporation
    Inventor: Boris Galitsky
  • Patent number: 12347416
    Abstract: The present disclosure relates to systems, methods, and products for using machine-learning networks to generate trustworthy audio and face mesh. A system, serving as a digital avatar, generates a trust audio and trust face mesh corresponding to an input text. A method includes generating a set of trust embedding vectors based on a reference audio; generating a text embedding vector based on the input text; generating a conditioned vector based on the set of trust embedding vectors and the text embedding vector; synthesizing an audio representation based on the conditioned vector; generating the trust audio based on the synthesized audio representation; obtaining a speech feature representation based on the trust audio; obtaining an abstract feature vector based on the speech feature representation; and generating positions of vertices based on the abstract feature vector, the positions of vertices being used for generating the trust face mesh.
    Type: Grant
    Filed: December 5, 2022
    Date of Patent: July 1, 2025
    Assignee: Accenture Global Solutions Limited
    Inventors: Lan Guan, Neeraj D Vadhan, Sukryool Kang, Anwitha Paruchuri, Anupam Anurag Tripathi, Sujeong Cha, Thomas Wayne Hancock, Jill Gengelbach-Wylie, Yuan He, Andrew Francis Hickl, Ivan Wong, Surya Raghavendra Vadlamani
  • Patent number: 12340797
    Abstract: Devices and techniques are generally described for inference reduction in natural language processing using semantic similarity-based caching. In various examples, first automatic speech recognition (ASR) data representing a first natural language input may be determined. A cache may be searched using the first ASR data. A first skill associated with the first ASR data may be determined from the cache. In some examples, first intent data representing a semantic interpretation of the first natural language input data may be determined by using a first natural language process associated with the first skill.
    Type: Grant
    Filed: March 6, 2023
    Date of Patent: June 24, 2025
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Kiana Hajebi, Vivek Yadav, Pradeep Natarajan
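A minimal sketch of the cache lookup described in the abstract above, under stated assumptions. The abstract does not say how similarity is measured or how the cache is organized; this sketch uses Jaccard token overlap as a crude stand-in for semantic similarity, and the class `SkillCache`, its methods, and the 0.6 threshold are hypothetical.

```python
def jaccard(a, b):
    """Token-overlap similarity; a stand-in for a semantic similarity measure."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


class SkillCache:
    """Maps previously seen ASR text to the skill that handled it (hypothetical)."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold
        self.entries = []  # list of (asr_text, skill_name) pairs

    def put(self, asr_text, skill):
        self.entries.append((asr_text, skill))

    def lookup(self, asr_text):
        """Return the skill of the most similar cached entry, or None on a miss."""
        best_skill, best_score = None, 0.0
        for cached_text, skill in self.entries:
            score = jaccard(asr_text, cached_text)
            if score > best_score:
                best_skill, best_score = skill, score
        return best_skill if best_score >= self.threshold else None


if __name__ == "__main__":
    cache = SkillCache()
    cache.put("play some jazz music", "music")
    print(cache.lookup("play jazz music"))  # cache hit: skill-specific processing only
    print(cache.lookup("set a timer"))      # cache miss: fall back to full inference
```

On a hit, only the natural language process associated with the returned skill would need to run, which is where the inference reduction named in the abstract would come from.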
  • Patent number: 12334068
    Abstract: Approaches are generally described for corrupted speech detection in voice-based computer interfaces. First input data including first audio data representing a user utterance may be received. First data representing the first audio data may be generated using a first encoder. First text data representing a transcription of the user utterance may be generated. Second data representing the first text data may be generated using a second encoder different from the first encoder. Third data may be generated by combining the first data and the second data. The third data may be sent to a classifier network trained to predict a relevant corruption state for speech processing inputs. The classifier network may determine that the first input data corresponds to a first corruption state.
    Type: Grant
    Filed: September 29, 2022
    Date of Patent: June 17, 2025
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Di Wang, Deshen Wang, Lan Ma, Shu Wang, Wenbo Yan, Prathap Ramachandra
  • Patent number: 12334072
    Abstract: Network source identification via audio signals is provided. A system receives data packets with an input audio signal from a client device. The system identifies a request. The system selects a digital component provided by a digital component provider device. The system identifies audio chimes stored in memory of the client device. The system matches, based on a policy, an identifier of the digital component provider device to a first audio chime stored in the memory of the client device. The system determines, based on a characteristic of the first audio chime, a configuration to combine the digital component with the first audio chime. The system generates an action data structure with the digital component, an indication of the first audio chime, and the configuration. The system transmits the action data structure to the client device to cause the client device to generate an output audio signal.
    Type: Grant
    Filed: October 31, 2023
    Date of Patent: June 17, 2025
    Assignee: GOOGLE LLC
    Inventor: Peter Kraker
  • Patent number: 12321699
    Abstract: A computing platform is configured to: (i) obtain a pool of available words for use in generating four-word passphrases, (ii) generate a candidate batch of four-word passphrases using the pool of available words, (iii) identify one or more duplicate four-word passphrases in the candidate batch and then filter the identified one or more duplicate four-word passphrases out of the candidate set; (iv) based on the filtered candidate batch of four-word passphrases, generate a new batch of four-word passphrases for use on direct mail; and (v) release the new batch of four-word passphrases for use on direct mail.
    Type: Grant
    Filed: August 12, 2022
    Date of Patent: June 3, 2025
    Assignee: Discover Financial Services
    Inventors: Tarun Dadoo, Sharif Refaie, Benjamin Wolff, Ian Huntley, Sanjeev Khatri
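The generate-then-filter flow in the abstract above might look roughly like this. It is a sketch only: the abstract does not specify how words are sampled or what counts as a duplicate, so the helper names, the use of `secrets.choice`, and the optional `already_issued` set are all assumptions.

```python
import secrets


def generate_batch(pool, batch_size, words_per_phrase=4):
    """Draw a candidate batch of passphrases from the pool of available words."""
    return [
        " ".join(secrets.choice(pool) for _ in range(words_per_phrase))
        for _ in range(batch_size)
    ]


def filter_duplicates(candidates, already_issued=frozenset()):
    """Drop repeats within the batch and, optionally, phrases issued earlier.

    Checking against already_issued is an assumption; the abstract only
    mentions duplicates in the candidate batch.
    """
    seen = set(already_issued)
    unique = []
    for phrase in candidates:
        if phrase not in seen:
            seen.add(phrase)
            unique.append(phrase)
    return unique


if __name__ == "__main__":
    pool = ["amber", "birch", "cobalt", "delta", "ember", "fjord"]
    candidates = generate_batch(pool, batch_size=10)
    new_batch = filter_duplicates(candidates)
    print(len(candidates), "candidates,", len(new_batch), "after filtering")
```

Filtering preserves the original order, so the released batch is a stable subsequence of the candidate batch.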
  • Patent number: 12315498
    Abstract: Techniques for action recommendation based on conversational log for real time assistance are described. Pre-generated intent clusters can be used to identify a relevant intent of a user in a given conversation between the user and a contact center agent. Based on the identified intent, certain recommended actions can be performed on the computing device of the contact center agent to facilitate the conversation between the user and the contact center agent. Feedback relating to the conversation and/or the recommended actions can be recorded and used to update the pre-generated intent clusters to improve the quality and relevance of the actions recommended for future conversations.
    Type: Grant
    Filed: December 16, 2022
    Date of Patent: May 27, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Nicholas Sun, Phillip H. Keung, Fan Luo, Wei Niu
  • Patent number: 12301912
    Abstract: A display device for providing a speech recognition service according to an embodiment of the present disclosure can include a storage unit configured to store a user's viewing history, a display unit, and a control unit configured to acquire a plurality of recommended utterance words based on the stored viewing history information, receive a command for requesting the speech recognition service, and display the plurality of acquired recommended utterance words on the display unit according to the received command.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: May 13, 2025
    Assignee: LG ELECTRONICS INC.
    Inventor: Daegun Park
  • Patent number: 12300225
    Abstract: Systems, methods, and computer-readable media for correcting transcriptions created through automatic speech recognition. A transcription of speech created using an automatic speech recognition system can be received. One or more domain-specific contexts associated with the speech can be identified and a text span that includes a mistranscribed entry can be recognized from the speech based on the one or more domain-specific contexts. Additionally, features can be extracted from the mistranscribed entry and the extracted features can be matched against an index of domain-specific entries to identify a correct entry of the mistranscribed entry. Subsequently, the transcription can be corrected by replacing the mistranscribed entry with the correct entry.
    Type: Grant
    Filed: September 22, 2022
    Date of Patent: May 13, 2025
    Assignee: Cisco Technology, Inc.
    Inventors: Karthik Raghunathan, Arushi Raghuvanshi, Vijay Ramakrishnan Thimmaiyah, Lucien Serapio Carroll, Varsha Ravikumar Embar
  • Patent number: 12293775
    Abstract: A voice control method and apparatus, a chip, earphones, and a system. The method includes: recognizing (001) whether a voice signal includes a keyword; in response to the voice signal including the keyword, executing (001a) an instruction corresponding to the keyword or sending the instruction; before recognizing whether the voice signal includes the keyword, determining (002) whether the voice signal is from a target user and, in response to the voice signal being from the target user, starting to recognize (001) whether the voice signal includes the keyword; or during recognizing whether the voice signal includes the keyword, determining (002) whether the voice signal is from the target user and, in response to the voice signal being from a non-target user, stopping recognizing (003a) whether the voice signal includes the keyword. The voice control method reduces the power consumption of voice control and improves the endurance.
    Type: Grant
    Filed: March 15, 2022
    Date of Patent: May 6, 2025
    Assignee: SHENZHEN GOODIX TECHNOLOGY CO., LTD.
    Inventors: Zhiyao Liu, Shuqing Cheng
  • Patent number: 12282516
    Abstract: A method includes extracting a set of candidate keywords from clickstream data and natural language processing of product text for a plurality of search queries. The set of candidate keywords are filtered based on the clickstream data. The set of candidate keywords as filtered are ranked based on the clickstream data. The set of candidate keywords as ranked are clustered to remove near duplicates. The set of candidate keywords as ranked for a respective search query is output.
    Type: Grant
    Filed: May 6, 2022
    Date of Patent: April 22, 2025
    Assignee: Home Depot Product Authority, LLC
    Inventors: Venkata Goutham Simhadri, Janani Balaji, Jeyaprakash Singarayar, Olga Stolpovskaia, Suhail Shaikh
  • Patent number: 12277155
    Abstract: An online system extracts information from a user for use in workflows using a machine learning-based language model. The online system creates a weighted epoch tree comprising epoch nodes, each epoch node associated with a time interval associated with the user. An epoch node has a relevance score determined based on a set of events associated with the user that occurred during a time interval. The online system builds the weighted epoch tree by selecting an epoch node for further exploration based on relevance scores and determining a question relevant to a context represented by the selected epoch node. The online system determines an answer to the question and either adds the answer to an existing node or to new epoch nodes added to the weighted epoch tree. The online system may use the weighted epoch tree for generating a synthetic statement for the user.
    Type: Grant
    Filed: November 22, 2024
    Date of Patent: April 15, 2025
    Inventor: Yashraj Panwar
  • Patent number: 12266373
    Abstract: A method and apparatus for audio processing, an electronic device and a storage medium are provided. The method includes: obtaining an audio encoding result, wherein each element in the audio encoding result has a coordinate in an audio frame number dimension and a coordinate in a text label sequence dimension; in response to an output result of an ith frame in a decoding path being a non-null character, respectively increasing the coordinate in the audio frame number dimension and the coordinate in the text label sequence dimension corresponding to an output position of the ith frame by 1 to obtain an output position of a (i+1)th frame in the decoding path; and determining an output result corresponding to the output position of the (i+1)th frame according to the output result of the ith frame in the decoding path and an element of the (i+1)th frame in the audio encoding result.
    Type: Grant
    Filed: December 9, 2022
    Date of Patent: April 1, 2025
    Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
    Inventors: Mingshuang Luo, Fangjun Kuang, Liyong Guo, Long Lin, Wei Kang, Zengwei Yao, Povey Daniel
  • Patent number: 12260233
    Abstract: Methods and systems are described herein for addressing issues associated with varying graph analytics tools that require different tool-specific coding languages. An artificial intelligence (AI) sub-system of various modules extracts metadata from a dataset and identifies nodes and relationships in the dataset using the metadata. The dataset is matched with a corresponding graph-analytics template in a data store, and a dynamic template modifier modifies the corresponding graph-analytics template. In some examples, the AI system generates smart guided videos with logical breakpoints that are embedded along with templates for quick learning and to build faster graphical analytics. The AI system includes a dynamic template modifier and a cognitive smart AI engine that includes a graph.
    Type: Grant
    Filed: November 2, 2022
    Date of Patent: March 25, 2025
    Assignee: Bank of America Corporation
    Inventors: Siva Paini, Sakshi Bakshi