Patents by Inventor Xuedong Huang

Xuedong Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Federated Artificial Intelligence System For Request Processing Using A Model Chain

Publication number: 20250165803

Abstract: A federated artificial intelligence system executes machine learning models of a model chain in order of increasing computational complexity to determine a lowest computational complexity model to use to serve a quality response to a user request. A first machine learning model of the model chain performs an inference operation to produce first output based on the user request. A scoring machine learning model determines that the first output fails to meet a threshold. Based on such determination, a second machine learning model of the model chain performs a second inference operation to produce second output based on the user request, in which the second machine learning model has a higher computational complexity than the first machine learning model. The scoring machine learning model determines that the second output meets the threshold, and, based on such determination, the second output is transmitted in response to the user request.

Type: Application

Filed: November 21, 2023

Publication date: May 22, 2025

Inventors: Xuedong Huang, Chenguang Zhu
MULTI-GRANULARITY MEETING SUMMARIZATION MODELS

Publication number: 20250111133

Abstract: Generally discussed herein are devices, systems, and methods for. A method can include receiving, from a user through a user interface, a segmentation granularity value indicating a number of events in the transcript to be included in a summary, extracting, by a ranker model and from the transcript, a number of hints equal to the number of events, generating, by a summarizer model that includes a re-trained language model, respective summaries, one for each event, of a portion of the transcript corresponding to the event, and providing the respective summaries as an overall summary of the transcript.

Type: Application

Filed: March 25, 2022

Publication date: April 3, 2025

Inventors: Chenguang Zhu, Yang LIU, Nanshan ZENG, Xuedong HUANG, Ming ZHONG, Yuantao Wang, Wei XIONG
Speaker attributed transcript generation

Patent number: 12243534

Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.

Type: Grant

Filed: April 4, 2022

Date of Patent: March 4, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
Array geometry agnostic multi-channel personalized speech enhancement

Patent number: 12230259

Abstract: Examples of array geometry agnostic multi-channel personalized speech enhancement (PSE) extract speaker embeddings, which represent acoustic characteristics of one or more target speakers, from target speaker enrollment data. Spatial features (e.g., inter-channel phase difference) are extracted from input audio captured by a microphone array. The input audio includes a mixture of speech data of the target speaker(s) and one or more interfering speaker(s). The input audio, the extracted speaker embeddings, and the extracted spatial features are provided to a trained geometry-agnostic PSE model. Output data is produced, which comprises estimated clean speech data of the target speaker(s) that has a reduction (or elimination) of speech data of the interfering speaker(s), without the trained PSE model requiring geometry information for the microphone array.

Type: Grant

Filed: December 17, 2021

Date of Patent: February 18, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Hassan Taherian, Zhuo Chen, Xuedong Huang
Dynamic gradient aggregation for training neural networks

Patent number: 12136034

Abstract: The disclosure herein describes training a global model based on a plurality of data sets. The global model is applied to each data set of the plurality of data sets and a plurality of gradients is generated based on that application. At least one gradient quality metric is determined for each gradient of the plurality of gradients. Based on the determined gradient quality metrics of the plurality of gradients, a plurality of weight factors is calculated. The plurality of gradients is transformed into a plurality of weighted gradients based on the calculated plurality of weight factors and a global gradient is generated based on the plurality of weighted gradients. The global model is updated based on the global gradient, wherein the updated global model, when applied to a data set, performs a task based on the data set and provides model output based on performing the task.

Type: Grant

Filed: July 31, 2020

Date of Patent: November 5, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Dimitrios B. Dimitriadis, Kenichi Kumatani, Robert Peter Gmyr, Masaki Itagaki, Yashesh Gaur, Nanshan Zeng, Xuedong Huang
SYSTEMS AND METHODS FOR REAL-TIME MEETING SUMMARIZATION

Publication number: 20240340193

Abstract: Systems and methods are provided for processing electronic content and generating corresponding output. Electronic content is received from a meeting, including recognizable speech content. This content is then summarized into real-time summary output by processing and encoding the meeting content while selectively alternating between unidirectional attention and bidirectional attention that is applied to the meeting contents.

Type: Application

Filed: April 10, 2023

Publication date: October 10, 2024

Inventors: Chenguang ZHU, Xuedong HUANG, Zong Zong YUAN, Wei XIONG, Nanshan ZENG, Yuantao WANG
QUALITY ASSURANCE FOR DIGITAL TECHNOLOGIES USING LARGE LANGUAGE MODELS

Publication number: 20240330165

Abstract: Systems and methods are provided for implementing quality assurance for digital technologies using language model (“LM”)-based artificial intelligence (“AI”) and/or machine learning (“ML”) systems. In various embodiments, a first prompt is provided to an LM actor or attacker to cause the LM actor or attacker to generate interaction content for interacting with test software. Responses from the test software are then evaluated by an LM evaluator to produce evaluation results. In some examples, a second prompt is generated that includes the responses from the test software along with the evaluation criteria for the test software. When the second prompt is provided to the LM evaluator, the LM evaluator generates the evaluation results.

Type: Application

Filed: April 3, 2023

Publication date: October 3, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventors: Reid Allen PRYZANT, Yin Tat LEE, Chenguang ZHU, Sebastien BUBECK, Ronen ELDAN, Yuwei FANG, Dan ITER, Yichong XU, Yuanzhi LI, Yi ZHANG, Lijuan QIN, Nanshan ZENG, Xuedong HUANG
Processing overlapping speech from distributed devices

Patent number: 12051422

Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.

Type: Grant

Filed: September 13, 2021

Date of Patent: July 30, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
Conducting search sessions utilizing navigation patterns

Patent number: 12045245

Abstract: Systems, methods, and computer-readable storage media are provided for conducting searches utilizing search navigation patterns. Search queries are received that include search terms that are of a particular type. It is recognized that at least one prior search session has been conducted that included a search query having search terms of an equivalent or similar type and followed a particular navigation pattern. Such prior search(es) may have been conducted by the user or by a different user and/or may have a navigation pattern that was affirmatively recorded by the requesting user or that was recorded by the system without explicit contemporaneous user instruction to do so. Upon identifying the navigation pattern associated with the prior search, the system effectively conducts a search session following the navigation pattern.

Type: Grant

Filed: June 24, 2019

Date of Patent: July 23, 2024

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Anoop Gupta, Xuedong Huang
Systems, methods, and computer-readable storage device for generating notes for a meeting based on participant actions and machine learning

Patent number: 12014737

Abstract: Systems, methods, and computer-readable storage devices are disclosed for generating smart notes for a meeting based on participant actions and machine learning. One method including: receiving meeting data from a plurality of participant devices participating in an online meeting; continuously generating text data based on the received audio data from each participant device of the plurality of participant devices; iteratively performing the following steps until receiving meeting data for the meeting has ended, the steps including: receiving an indication that a predefined action has occurred on the first participating device; generating a participant segment of the meeting data for at least the first participant device from a first predetermined time before when the predefined action occurred to when the predefined action occurred; determining whether the receiving meeting data of the meeting has ended; and generating a summary of the meeting.

Type: Grant

Filed: November 18, 2021

Date of Patent: June 18, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Heiko Rahmel, Li-Juan Qin, Xuedong Huang, Wei Xiong
Automated meeting minutes generator

Patent number: 11990132

Abstract: A transcription of audio speech included in electronic content associated with a meeting is created by an ASR model trained on speech-to-text data. The transcription is post-processed by modifying text included in the transcription, for example, by modifying punctuation, grammar, or formatting introduced by the ASR model and by changing or omitting one or more words that were included in both the audio speech and the transcription. After the transcription is post-processed, output based on the post-processed transcription is generated in the form of a meeting summary and/or template.

Type: Grant

Filed: February 28, 2023

Date of Patent: May 21, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Chenguang Zhu, Yu Shi, William Isaac Hinthorn, Nanshan Zeng, Ruochen Xu, Liyang Lu, Xuedong Huang
Microsegment secure speech transcription

Patent number: 11947699

Abstract: Embodiments are provided for securing data access to machine learning training data at a plurality of distributed computing devices. Electronic content including original data that corresponds to a preferred data security level is divided into a plurality of microsegments. The plurality of microsegments is restrictively distributed to a plurality of computing devices which apply transcription labels to the plurality of microsegments. The labeled microsegments are reconstructed into training data which is then used to train a machine learning model while facilitating an improvement in data security of the original data included with the training data from the reconstructed microsegments.

Type: Grant

Filed: April 30, 2021

Date of Patent: April 2, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Hemant Malhotra, Xuedong Huang, Li Jiang, Ivo Jose Garcia Dos Santos, Dong Li, Shuangyu Chang
PRE-TRAINING A UNIFIED NATURAL LANGUAGE MODEL WITH CORRUPTED SPAN AND REPLACED TOKEN DETECTION

Publication number: 20240062018

Abstract: Systems and methods are provided for training and using a novel unified language foundation model. An encoder-decoder natural language model is obtained and various training data is obtained and used for training. The training process integrates a combination of replaced token detection, corrupted span reconstruction, and disentangled attention methodologies to produce a unified encoder-decoder model. The trained model is trained for performing both natural language understanding (NLU) tasks and natural language generation (NLG) tasks. Attention applied to the model is applied discretely to segmented chunks of encoded data during processing to improve the efficiency of applying attention by the model.

Type: Application

Filed: October 20, 2022

Publication date: February 22, 2024

Inventors: Pengcheng HE, Jianfeng GAO, Nanshan ZENG, Xuedong HUANG, Wei XIONG, Baolin PENG
UNIFIED NATURAL LANGUAGE MODEL WITH SEGMENTED AND AGGREGATE ATTENTION

Publication number: 20240062020

Abstract: Systems and methods are provided for training and using a novel unified language foundation model. An encoder-decoder natural language model is obtained and various training data is obtained and used for training. The training process integrates a combination of replaced token detection, corrupted span reconstruction, and disentangled attention methodologies to produce a unified encoder-decoder model. The trained model is trained for performing both natural language understanding (NLU) tasks and natural language generation (NLG) tasks. Attention applied to the model is applied discretely to segmented chunks of encoded data during processing to improve the efficiency of applying attention by the model.

Type: Application

Filed: October 20, 2022

Publication date: February 22, 2024

Inventors: Pengcheng HE, Jianfeng GAO, Nanshan ZENG, Xuedong HUANG, Wei XIONG, Baolin PENG
Audio-visual diarization to identify meeting attendees

Patent number: 11875796

Abstract: A computer implemented method includes receiving information streams on a meeting server from a set of multiple distributed devices included in a meeting, receiving audio signals representative of speech by at least two users in at least two of the information streams, receiving at least one video signal of at least one user in the information streams, associating a specific user with speech in the received audio signals as a function of the received audio and video signals, and generating a transcript of the meeting with an indication of the specific user associated with the speech.

Type: Grant

Filed: April 30, 2019

Date of Patent: January 16, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, William Isaac Hinthorn, Xuedong Huang
COMPUTERIZED INTELLIGENT ASSISTANT FOR CONFERENCES

Publication number: 20230402038

Abstract: A method for facilitating a remote conference includes receiving a digital video and a computer-readable audio signal. A face recognition machine is operated to recognize a face of a first conference participant in the digital video, and a speech recognition machine is operated to translate the computer-readable audio signal into a first text. An attribution machine attributes the text to the first conference participant. A second computer-readable audio signal is processed similarly, to obtain a second text attributed to a second conference participant. A transcription machine automatically creates a transcript including the first text attributed to the first conference participant and the second text attributed to the second conference participant.

Type: Application

Filed: May 15, 2023

Publication date: December 14, 2023

Inventors: Adi DIAMANT, Xuedong HUANG, Karen MASTER BEN-DOR, Eyal KRUPKA, Raz HALALY, Yoni SMOLIN, Ilya GURVICH, Aviv HURVITZ, Lijuan QIN, Wei XIONG, Shixiong ZHANG, Lingfeng WU, Xiong XIAO, Ido LEICHTER, Moshe DAVID, Amit Kumar AGARWAL
AUTOMATED MEETING MINUTES GENERATOR

Publication number: 20230205985

Abstract: A transcription of audio speech included in electronic content associated with a meeting is created by an ASR model trained on speech-to-text data. The transcription is post-processed by modifying text included in the transcription, for example, by modifying punctuation, grammar, or formatting introduced by the ASR model and by changing or omitting one or more words that were included in both the audio speech and the transcription. After the transcription is post-processed, output based on the post-processed transcription is generated in the form of a meeting summary and/or template.

Type: Application

Filed: February 28, 2023

Publication date: June 29, 2023

Inventors: Chenguang ZHU, Yu SHI, William Isaac HINTHORN, Nanshan ZENG, Rouchen XU, Liyang LU, Xuedong HUANG
Customized transcribed conversations

Patent number: 11687736

Abstract: Systems and methods may be used to provide transcription and translation services. A method may include initializing a plurality of user devices with respective language output selections in a translation group by receiving a shared identifier from the plurality of user devices and transcribing the audio stream to transcribed text. The method may include translating the transcribed text to one or more of the respective language output selections when an original language of the transcribed text differs from the one or more of the respective language output selections. The method may include sending, a user device in the translation group, the transcribed text including translated text in a language corresponding to the respective language output selection for the user device. In an example, the method may include customizing the transcription or the translation, such as to a particular topic, location, user, or the like.

Type: Grant

Filed: October 23, 2020

Date of Patent: June 27, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: William D. Lewis, Ivo José Garcia Dos Santos, Tanvi Surti, Arul A. Menezes, Olivier Nano, Christian Wendt, Xuedong Huang
Computerized intelligent assistant for conferences

Patent number: 11688399

Abstract: A method for facilitating a remote conference includes receiving a digital video and a computer-readable audio signal. A face recognition machine is operated to recognize a face of a first conference participant in the digital video, and a speech recognition machine is operated to translate the computer-readable audio signal into a first text. An attribution machine attributes the text to the first conference participant. A second computer-readable audio signal is processed similarly, to obtain a second text attributed to a second conference participant. A transcription machine automatically creates a transcript including the first text attributed to the first conference participant and the second text attributed to the second conference participant.

Type: Grant

Filed: December 8, 2020

Date of Patent: June 27, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Adi Diamant, Karen Master Ben-Dor, Eyal Krupka, Raz Halaly, Yoni Smolin, Ilya Gurvich, Aviv Hurvitz, Lijuan Qin, Wei Xiong, Shixiong Zhang, Lingfeng Wu, Xiong Xiao, Ido Leichter, Moshe David, Xuedong Huang, Amit Kumar Agarwal
MICROSEGMENT SECURE SPEECH TRANSCRIPTION

Publication number: 20230153451

Abstract: Embodiments are provided for securing data access to machine learning training data at a plurality of distributed computing devices. Electronic content including original data that corresponds to a preferred data security level is divided into a plurality of microsegments. The plurality of microsegments is restrictively distributed to a plurality of computing devices which apply transcription labels to the plurality of microsegments. The labeled microsegments are reconstructed into training data which is then used to train a machine learning model while facilitating an improvement in data security of the original data included with the training data from the reconstructed microsegments.

Type: Application

Filed: April 30, 2021

Publication date: May 18, 2023

Inventors: Hemant MALHOTRA, Xuedong HUANG, Li JIANG, Ivo Jose GARCIA DOS SANTOS, Dong LI, Shuangyu CHANG

1 2 3 4 5 … next