Patents by Inventor Nanshan Zeng

Nanshan Zeng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220230628
    Abstract: A system is provided for generating an optimized speech model by training a knowledge module on a knowledge graph. A language module is trained on unlabeled text data and a speech module is trained on unlabeled acoustic data. The knowledge module is integrated with the language module to perform semantic analysis using knowledge-graph-based information. The speech module is then aligned to the language module of the integrated knowledge-language module. The speech module then serves as an optimized speech model that leverages both acoustic and language information in natural language processing tasks.
    Type: Application
    Filed: May 18, 2021
    Publication date: July 21, 2022
    Inventors: Chenguang Zhu, Nanshan Zeng
  • Patent number: 11322148
    Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: May 3, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20220084510
    Abstract: This document relates to machine learning. One example includes a method or technique that can be performed on a computing device. The method or technique can include obtaining a task-adapted generative model that has been tuned using one or more task-specific seed examples. The method or technique can also include inputting dialog acts into the task-adapted generative model and obtaining synthetic utterances that are output by the task-adapted generative model. The method or technique can also include populating a synthetic training corpus with synthetic training examples that include the synthetic utterances. The synthetic training corpus may be suitable for training a natural language understanding model.
    Type: Application
    Filed: September 15, 2020
    Publication date: March 17, 2022
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Nanshan Zeng, Jianfeng Gao
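    The data-generation flow in this abstract can be sketched end to end. The generator below is a hard-coded stub standing in for the task-adapted generative model, and the dialog-act strings and function names are illustrative assumptions; the patent does not name a concrete API.

    ```python
    # Sketch of synthetic-corpus generation; the model call is stubbed out.
    def generate_utterance(dialog_act: str) -> str:
        """Hypothetical stand-in for a task-adapted generative model."""
        templates = {
            "inform(food=italian)": "I'd like some Italian food.",
            "request(address)": "Could you give me the address?",
        }
        return templates.get(dialog_act, "Sorry, can you rephrase?")

    def build_synthetic_corpus(dialog_acts):
        # Each synthetic training example pairs the structured dialog act
        # (the label) with the model-generated utterance (the input text).
        return [{"dialog_act": act, "utterance": generate_utterance(act)}
                for act in dialog_acts]

    corpus = build_synthetic_corpus(["inform(food=italian)", "request(address)"])
    ```

    A real system would replace `generate_utterance` with sampling from a tuned generative model seeded with a handful of task-specific examples.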
  • Patent number: 11257484
    Abstract: According to some embodiments, a multi-layer speech recognition transcript post processing system may include a data-driven, statistical layer associated with a trained automatic speech recognition model that selects an initial transcript. A rule-based layer may receive the initial transcript from the data-driven, statistical layer and execute at least one pre-determined rule to generate a first modified transcript. A machine learning approach layer may receive the first modified transcript from the rule-based layer and perform a neural model inference to create a second modified transcript. A human editor layer may receive the second modified transcript from the machine learning approach layer along with an adjustment from at least one human editor. The adjustment may create, in some embodiments, a final transcript that may be used to fine-tune the data-driven, statistical layer.
    Type: Grant
    Filed: August 21, 2019
    Date of Patent: February 22, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dimitrios Basile Dimitriadis, Xie Chen, Nanshan Zeng, Yu Shi, Liyang Lu
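    The four layers in this abstract compose naturally as a pipeline. The toy functions below are illustrative stand-ins (the normalization rule, the "neural" step, and the scoring are invented for the sketch), not the patented implementations.

    ```python
    # Minimal sketch of the four-layer transcript post-processing flow.
    def statistical_layer(hypotheses):
        # Data-driven layer: select the highest-scoring ASR hypothesis.
        return max(hypotheses, key=lambda h: h["score"])["text"]

    def rule_based_layer(text):
        # Example pre-determined rule: normalize a spoken-form number.
        return text.replace("twenty twenty two", "2022")

    def neural_layer(text):
        # Stand-in for neural model inference (e.g. punctuation restoration).
        return text[0].upper() + text[1:] + "."

    def human_editor_layer(text, adjustment=None):
        # Human edits win; the final transcript can later fine-tune layer 1.
        return adjustment if adjustment is not None else text

    hyps = [{"text": "meeting moved to twenty twenty two", "score": 0.9},
            {"text": "meeting mood to twenty twenty two", "score": 0.4}]
    final = human_editor_layer(neural_layer(rule_based_layer(statistical_layer(hyps))))
    ```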
  • Publication number: 20220036178
    Abstract: The disclosure herein describes training a global model based on a plurality of data sets. The global model is applied to each data set of the plurality of data sets and a plurality of gradients is generated based on that application. At least one gradient quality metric is determined for each gradient of the plurality of gradients. Based on the determined gradient quality metrics of the plurality of gradients, a plurality of weight factors is calculated. The plurality of gradients is transformed into a plurality of weighted gradients based on the calculated plurality of weight factors and a global gradient is generated based on the plurality of weighted gradients. The global model is updated based on the global gradient, wherein the updated global model, when applied to a data set, performs a task based on the data set and provides model output based on performing the task.
    Type: Application
    Filed: July 31, 2020
    Publication date: February 3, 2022
    Inventors: Dimitrios B. Dimitriadis, Kenichi Kumatani, Robert Peter Gmyr, Masaki Itagaki, Yashesh Gaur, Nanshan Zeng, Xuedong Huang
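    The aggregation step in this abstract has a compact numerical form: score each per-client gradient with a quality metric, turn the scores into normalized weight factors, and combine. The inverse-norm metric below is a made-up example; the patent leaves the specific metric open.

    ```python
    import numpy as np

    def aggregate(gradients):
        # Stack per-client gradients: shape (num_clients, num_params).
        grads = np.stack(gradients)
        # Hypothetical quality metric: downweight large-norm gradients.
        norms = np.linalg.norm(grads, axis=1)
        weights = 1.0 / norms
        weights /= weights.sum()          # normalized weight factors
        return weights @ grads            # weighted global gradient

    g = aggregate([np.array([3.0, 4.0]), np.array([0.0, 1.0])])
    # norms -> [5, 1]; weights -> [1/6, 5/6]; g -> [0.5, 1.5]
    ```

    The resulting global gradient then updates the global model exactly as a single-machine gradient step would.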
  • Patent number: 11226969
    Abstract: Techniques for dynamically generating deeplink search results in response to navigational search queries. In an aspect, to address user search queries, a general-purpose search engine is provided in parallel with a dedicated engine for ranking deeplinks. Upon identifying a received query as a navigational query, a parallel query is generated from a common domain and user query, and provided to the dedicated engine. The engine accesses relevant deeplink URLs from a search index, which may be frequently refreshed and updated with the most recent Web contents. Ranking of deeplink URLs may be performed according to an algorithm that processes query-level features and document-level features of the URLs to be ranked. In an aspect, the algorithm may be trained from search engine logs and/or Web browser logs, by calculating a Log-based Normalized Discounted Cumulative Gain (LNDCG) designed to quantify relevance of search results to queries based on user click behavior.
    Type: Grant
    Filed: February 27, 2016
    Date of Patent: January 18, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dong Han Wang, Xiaofei Huang, Jinghua Chen, Nanshan Zeng
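    The LNDCG idea in this abstract can be illustrated with plain NDCG computed over click labels mined from logs (1 = clicked, 0 = skipped). The exact log-based weighting in the patent is not specified here, so this is only the standard formula applied to click data.

    ```python
    import math

    def dcg(relevances):
        # Discounted cumulative gain: later ranks contribute less.
        return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

    def log_based_ndcg(clicks):
        # Normalize by the DCG of the ideal ordering (all clicks first).
        best = dcg(sorted(clicks, reverse=True))
        return dcg(clicks) / best if best > 0 else 0.0

    # Clicked results sat at ranks 2 and 4 in the logged result list.
    score = log_based_ndcg([0, 1, 0, 1])
    ```

    A ranker trained against such a metric is rewarded for moving historically clicked deeplinks toward the top of the list.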
  • Publication number: 20210407516
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
    Type: Application
    Filed: September 13, 2021
    Publication date: December 30, 2021
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20210375289
    Abstract: A transcription of audio speech included in electronic content associated with a meeting is created by an ASR model trained on speech-to-text data. The transcription is post-processed by modifying text included in the transcription, for example, by modifying punctuation, grammar, or formatting introduced by the ASR model and by changing or omitting one or more words that were included in both the audio speech and the transcription. After the transcription is post-processed, output based on the post-processed transcription is generated in the form of a meeting summary and/or template.
    Type: Application
    Filed: May 29, 2020
    Publication date: December 2, 2021
    Inventors: Chenguang Zhu, Yu Shi, William Isaac Hinthorn, Nanshan Zeng, Ruochen Xu, Liyang Lu, Xuedong Huang
  • Publication number: 20210375291
    Abstract: Attributes of electronic content from a meeting are identified and evaluated to determine whether sub-portions of the electronic content should or should not be attributed to a user profile. Upon determining that the sub-portion should be attributed to a user profile, attributes of the sub-portion of electronic content are compared to attributes of stored user profiles. A probability that the sub-portion corresponds to at least one stored user profile is calculated. Based on the calculated probability, the sub-portion is attributed to a stored user profile or a guest user profile.
    Type: Application
    Filed: May 27, 2020
    Publication date: December 2, 2021
    Inventors: Nanshan Zeng, Wei Xiong, Lingfeng Wu, Jun Zhang, Shayin Jing
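    The attribution decision in this abstract reduces to: compare a content sub-portion against stored profiles, compute a match probability, and fall back to a guest profile when nothing is probable enough. The embedding comparison and the similarity-to-probability squashing below are invented for the sketch; the patent does not disclose those details.

    ```python
    import numpy as np

    def attribute(segment_vec, profiles, min_prob=0.7):
        # Cosine similarity between the segment's (hypothetical) voice
        # embedding and each stored profile embedding.
        sims = {name: float(segment_vec @ v /
                            (np.linalg.norm(segment_vec) * np.linalg.norm(v)))
                for name, v in profiles.items()}
        best = max(sims, key=sims.get)
        # Squash similarity [-1, 1] into a probability-like score [0, 1].
        prob = (sims[best] + 1) / 2
        return best if prob >= min_prob else "guest"

    profiles = {"alice": np.array([1.0, 0.0]), "bob": np.array([0.0, 1.0])}
    ```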
  • Patent number: 11138980
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: October 5, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Patent number: 11023690
    Abstract: Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines a preferred language of the user. A transcript from the received audio streams is generated. The meeting system translates the transcript into the preferred language of the user to form a translated transcript. The translated transcript is provided to the distributed device of the user.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: June 1, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20210056956
    Abstract: According to some embodiments, a multi-layer speech recognition transcript post processing system may include a data-driven, statistical layer associated with a trained automatic speech recognition model that selects an initial transcript. A rule-based layer may receive the initial transcript from the data-driven, statistical layer and execute at least one pre-determined rule to generate a first modified transcript. A machine learning approach layer may receive the first modified transcript from the rule-based layer and perform a neural model inference to create a second modified transcript. A human editor layer may receive the second modified transcript from the machine learning approach layer along with an adjustment from at least one human editor. The adjustment may create, in some embodiments, a final transcript that may be used to fine-tune the data-driven, statistical layer.
    Type: Application
    Filed: August 21, 2019
    Publication date: February 25, 2021
    Inventors: Dimitrios Basile Dimitriadis, Xie Chen, Nanshan Zeng, Yu Shi, Liyang Lu
  • Publication number: 20210043207
    Abstract: The present disclosure provides a method and apparatus for processing a message. A statement sentence message and a message processing parameter associated with a user's session message are obtained. One or more first statement sentence nodes that semantically match the statement sentence message are determined in a knowledge map. One or more second statement sentence nodes corresponding to the message processing parameter are obtained from the knowledge map, based on node relationship properties of the first statement sentence nodes. A response is generated based at least in part on statement sentences of the one or more second statement sentence nodes. The generated response is provided to the user.
    Type: Application
    Filed: April 6, 2019
    Publication date: February 11, 2021
    Inventors: Ling Chen, Yu Shi, Yining Chen, Nanshan Zeng, Dong Li
  • Publication number: 20200351603
    Abstract: A computer implemented method includes receiving multiple channels of audio from three or more microphones detecting speech from a meeting of multiple users, localizing speech sources to determine an approximate direction of arrival of speech from a user, using a speech unmixing model to select two channels corresponding to a primary and a secondary microphone, and sending the two selected channels to a meeting server for generation of a speaker attributed meeting transcript.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: William Isaac Hinthorn, Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, Xuedong Huang
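    The two-channel selection step in this abstract can be approximated very crudely: rank microphones by captured speech energy and keep the top two as primary and secondary. The real system uses a learned speech unmixing model with direction-of-arrival cues; the energy heuristic below is only a stand-in.

    ```python
    import numpy as np

    def select_channels(channels):
        # Mean squared amplitude as a rough proxy for speech energy per mic.
        energies = [float(np.mean(np.square(c))) for c in channels]
        order = sorted(range(len(channels)),
                       key=lambda i: energies[i], reverse=True)
        primary, secondary = order[0], order[1]
        return primary, secondary

    # Three microphones; mic 1 is closest to the active speaker.
    mics = [np.array([0.1, 0.1]), np.array([0.9, 0.8]), np.array([0.4, 0.3])]
    ```

    Only the two selected channels are then streamed to the meeting server, reducing bandwidth while keeping the signals most useful for speaker attribution.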
  • Publication number: 20200349230
    Abstract: Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines a preferred language of the user. A transcript from the received audio streams is generated. The meeting system translates the transcript into the preferred language of the user to form a translated transcript. The translated transcript is provided to the distributed device of the user.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200349949
    Abstract: A computer implemented method includes receiving audio streams at a meeting server from two distributed devices that are streaming audio captured during an ad-hoc meeting between at least two users, comparing the received audio streams to determine that the received audio streams are representative of sound from the ad-hoc meeting, generating a meeting instance to process the audio streams in response to the comparing determining that the audio streams are representative of sound from the ad-hoc meeting, and processing the received audio streams to generate a transcript of the ad-hoc meeting.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
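    The stream-comparison step in this abstract rests on a simple observation: two devices in the same room capture correlated audio. The thresholded correlation below is a stand-in for whatever similarity test the patented method actually applies before spinning up a meeting instance.

    ```python
    import numpy as np

    def same_meeting(stream_a, stream_b, threshold=0.8):
        # Normalized cross-correlation of two short audio snippets.
        a = stream_a - stream_a.mean()
        b = stream_b - stream_b.mean()
        corr = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
        return corr > threshold

    room = np.array([0.0, 1.0, 0.5, -0.5, 0.2])
    nearby = room + 0.01   # a second device hearing the same sound
    elsewhere = np.array([0.3, -0.9, 0.1, 0.8, -0.4])
    ```

    In practice the comparison would need to tolerate clock offsets and per-device gain differences, which is presumably part of what the claimed method handles.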
  • Publication number: 20200349950
    Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200349954
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200349953
    Abstract: A computer implemented method includes receiving information streams on a meeting server from a set of multiple distributed devices included in a meeting, receiving audio signals representative of speech by at least two users in at least two of the information streams, receiving at least one video signal of at least one user in the information streams, associating a specific user with speech in the received audio signals as a function of the received audio and video signals, and generating a transcript of the meeting with an indication of the specific user associated with the speech.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, William Isaac Hinthorn, Xuedong Huang
  • Patent number: 10812921
    Abstract: A computer implemented method includes receiving multiple channels of audio from three or more microphones detecting speech from a meeting of multiple users, localizing speech sources to determine an approximate direction of arrival of speech from a user, using a speech unmixing model to select two channels corresponding to a primary and a secondary microphone, and sending the two selected channels to a meeting server for generation of a speaker attributed meeting transcript.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: October 20, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: William Isaac Hinthorn, Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, Xuedong Huang