Patents by Inventor Nanshan Zeng

Nanshan Zeng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220230628
    Abstract: A system is provided for generating an optimized speech model by training a knowledge module on a knowledge graph. A language module is trained on unlabeled text data and a speech module is trained on unlabeled acoustic data. The knowledge module is integrated with the language module to perform semantic analysis using knowledge-graph-based information. The speech module is then aligned to the language module of the integrated knowledge-language module. The speech module then serves as an optimized speech model that leverages both acoustic and language information in natural language processing tasks.
    Type: Application
    Filed: May 18, 2021
    Publication date: July 21, 2022
    Inventors: Chenguang Zhu, Nanshan Zeng
  • Patent number: 11322148
    Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: May 3, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20220084510
    Abstract: This document relates to machine learning. One example includes a method or technique that can be performed on a computing device. The method or technique can include obtaining a task-adapted generative model that has been tuned using one or more task-specific seed examples. The method or technique can also include inputting dialog acts into the task-adapted generative model and obtaining synthetic utterances that are output by the task-adapted generative model. The method or technique can also include populating a synthetic training corpus with synthetic training examples that include the synthetic utterances. The synthetic training corpus may be suitable for training a natural language understanding model.
    Type: Application
    Filed: September 15, 2020
    Publication date: March 17, 2022
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Nanshan Zeng, Jianfeng Gao
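    The data-generation flow in this abstract can be sketched end to end. The generator below is a hard-coded stub standing in for the task-adapted generative model, and the dialog-act strings and function names are illustrative assumptions; the patent does not name a concrete API.

    ```python
    # Sketch of synthetic-corpus generation; the model call is stubbed out.
    def generate_utterance(dialog_act: str) -> str:
        """Hypothetical stand-in for a task-adapted generative model."""
        templates = {
            "inform(food=italian)": "I'd like some Italian food.",
            "request(address)": "Could you give me the address?",
        }
        return templates.get(dialog_act, "Sorry, can you rephrase?")

    def build_synthetic_corpus(dialog_acts):
        # Each synthetic training example pairs the structured dialog act
        # (the label) with the model-generated utterance (the input text).
        return [{"dialog_act": act, "utterance": generate_utterance(act)}
                for act in dialog_acts]

    corpus = build_synthetic_corpus(["inform(food=italian)", "request(address)"])
    ```

    A real system would replace `generate_utterance` with sampling from a tuned generative model seeded with a handful of task-specific examples.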
  • Patent number: 11257484
    Abstract: According to some embodiments, a multi-layer speech recognition transcript post processing system may include a data-driven, statistical layer associated with a trained automatic speech recognition model that selects an initial transcript. A rule-based layer may receive the initial transcript from the data-driven, statistical layer and execute at least one pre-determined rule to generate a first modified transcript. A machine learning approach layer may receive the first modified transcript from the rule-based layer and perform a neural model inference to create a second modified transcript. A human editor layer may receive the second modified transcript from the machine learning approach layer along with an adjustment from at least one human editor. The adjustment may create, in some embodiments, a final transcript that may be used to fine-tune the data-driven, statistical layer.
    Type: Grant
    Filed: August 21, 2019
    Date of Patent: February 22, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dimitrios Basile Dimitriadis, Xie Chen, Nanshan Zeng, Yu Shi, Liyang Lu
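    The four layers in this abstract compose naturally as a pipeline. The toy functions below are illustrative stand-ins (the normalization rule, the "neural" step, and the scoring are invented for the sketch), not the patented implementations.

    ```python
    # Minimal sketch of the four-layer transcript post-processing flow.
    def statistical_layer(hypotheses):
        # Data-driven layer: select the highest-scoring ASR hypothesis.
        return max(hypotheses, key=lambda h: h["score"])["text"]

    def rule_based_layer(text):
        # Example pre-determined rule: normalize a spoken-form number.
        return text.replace("twenty twenty two", "2022")

    def neural_layer(text):
        # Stand-in for neural model inference (e.g. punctuation restoration).
        return text[0].upper() + text[1:] + "."

    def human_editor_layer(text, adjustment=None):
        # Human edits win; the final transcript can later fine-tune layer 1.
        return adjustment if adjustment is not None else text

    hyps = [{"text": "meeting moved to twenty twenty two", "score": 0.9},
            {"text": "meeting mood to twenty twenty two", "score": 0.4}]
    final = human_editor_layer(neural_layer(rule_based_layer(statistical_layer(hyps))))
    ```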
  • Publication number: 20220036178
    Abstract: The disclosure herein describes training a global model based on a plurality of data sets. The global model is applied to each data set of the plurality of data sets and a plurality of gradients is generated based on that application. At least one gradient quality metric is determined for each gradient of the plurality of gradients. Based on the determined gradient quality metrics of the plurality of gradients, a plurality of weight factors is calculated. The plurality of gradients is transformed into a plurality of weighted gradients based on the calculated plurality of weight factors and a global gradient is generated based on the plurality of weighted gradients. The global model is updated based on the global gradient, wherein the updated global model, when applied to a data set, performs a task based on the data set and provides model output based on performing the task.
    Type: Application
    Filed: July 31, 2020
    Publication date: February 3, 2022
    Inventors: Dimitrios B. Dimitriadis, Kenichi Kumatani, Robert Peter Gmyr, Masaki Itagaki, Yashesh Gaur, Nanshan Zeng, Xuedong Huang
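    The aggregation step in this abstract has a compact numerical form: score each per-client gradient with a quality metric, turn the scores into normalized weight factors, and combine. The inverse-norm metric below is a made-up example; the patent leaves the specific metric open.

    ```python
    import numpy as np

    def aggregate(gradients):
        # Stack per-client gradients: shape (num_clients, num_params).
        grads = np.stack(gradients)
        # Hypothetical quality metric: downweight large-norm gradients.
        norms = np.linalg.norm(grads, axis=1)
        weights = 1.0 / norms
        weights /= weights.sum()          # normalized weight factors
        return weights @ grads            # weighted global gradient

    g = aggregate([np.array([3.0, 4.0]), np.array([0.0, 1.0])])
    # norms -> [5, 1]; weights -> [1/6, 5/6]; g -> [0.5, 1.5]
    ```

    The resulting global gradient then updates the global model exactly as a single-machine gradient step would.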
  • Patent number: 11226969
    Abstract: Techniques for dynamically generating deeplink search results in response to navigational search queries. In an aspect, to address user search queries, a general-purpose search engine is provided in parallel with a dedicated engine for ranking deeplinks. Upon identifying a received query as a navigational query, a parallel query is generated from a common domain and user query, and provided to the dedicated engine. The engine accesses relevant deeplink URLs from a search index, which may be frequently refreshed and updated with the most recent Web contents. Ranking of deeplink URLs may be performed according to an algorithm that processes query-level features and document-level features of the URLs to be ranked. In an aspect, the algorithm may be trained from search engine logs and/or Web browser logs, by calculating a Log-based Normalized Discounted Cumulative Gain (LNDCG) designed to quantify relevance of search results to queries based on user click behavior.
    Type: Grant
    Filed: February 27, 2016
    Date of Patent: January 18, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dong Han Wang, Xiaofei Huang, Jinghua Chen, Nanshan Zeng
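    The LNDCG idea in this abstract can be illustrated with plain NDCG computed over click labels mined from logs (1 = clicked, 0 = skipped). The exact log-based weighting in the patent is not specified here, so this is only the standard formula applied to click data.

    ```python
    import math

    def dcg(relevances):
        # Discounted cumulative gain: later ranks contribute less.
        return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

    def log_based_ndcg(clicks):
        # Normalize by the DCG of the ideal ordering (all clicks first).
        best = dcg(sorted(clicks, reverse=True))
        return dcg(clicks) / best if best > 0 else 0.0

    # Clicked results sat at ranks 2 and 4 in the logged result list.
    score = log_based_ndcg([0, 1, 0, 1])
    ```

    A ranker trained against such a metric is rewarded for moving historically clicked deeplinks toward the top of the list.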
  • Publication number: 20210407516
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
    Type: Application
    Filed: September 13, 2021
    Publication date: December 30, 2021
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20210375289
    Abstract: A transcription of audio speech included in electronic content associated with a meeting is created by an ASR model trained on speech-to-text data. The transcription is post-processed by modifying text included in the transcription, for example, by modifying punctuation, grammar, or formatting introduced by the ASR model and by changing or omitting one or more words that were included in both the audio speech and the transcription. After the transcription is post-processed, output based on the post-processed transcription is generated in the form of a meeting summary and/or template.
    Type: Application
    Filed: May 29, 2020
    Publication date: December 2, 2021
    Inventors: Chenguang Zhu, Yu Shi, William Isaac Hinthorn, Nanshan Zeng, Ruochen Xu, Liyang Lu, Xuedong Huang
  • Publication number: 20210375291
    Abstract: Attributes of electronic content from a meeting are identified and evaluated to determine whether sub-portions of the electronic content should or should not be attributed to a user profile. Upon determining that the sub-portion should be attributed to a user profile, attributes of the sub-portion of electronic content are compared to attributes of stored user profiles. A probability that the sub-portion corresponds to at least one stored user profile is calculated. Based on the calculated probability, the sub-portion is attributed to a stored user profile or a guest user profile.
    Type: Application
    Filed: May 27, 2020
    Publication date: December 2, 2021
    Inventors: Nanshan Zeng, Wei Xiong, Lingfeng Wu, Jun Zhang, Shayin Jing
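    The attribution decision in this abstract reduces to: compare a content sub-portion against stored profiles, compute a match probability, and fall back to a guest profile when nothing is probable enough. The embedding comparison and the similarity-to-probability squashing below are invented for the sketch; the patent does not disclose those details.

    ```python
    import numpy as np

    def attribute(segment_vec, profiles, min_prob=0.7):
        # Cosine similarity between the segment's (hypothetical) voice
        # embedding and each stored profile embedding.
        sims = {name: float(segment_vec @ v /
                            (np.linalg.norm(segment_vec) * np.linalg.norm(v)))
                for name, v in profiles.items()}
        best = max(sims, key=sims.get)
        # Squash similarity [-1, 1] into a probability-like score [0, 1].
        prob = (sims[best] + 1) / 2
        return best if prob >= min_prob else "guest"

    profiles = {"alice": np.array([1.0, 0.0]), "bob": np.array([0.0, 1.0])}
    ```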
  • Patent number: 11138980
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: October 5, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Patent number: 11023690
    Abstract: Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines a preferred language of the user. A transcript from the received audio streams is generated. The meeting system translates the transcript into the preferred language of the user to form a translated transcript. The translated transcript is provided to the distributed device of the user.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: June 1, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20210056956
    Abstract: According to some embodiments, a multi-layer speech recognition transcript post processing system may include a data-driven, statistical layer associated with a trained automatic speech recognition model that selects an initial transcript. A rule-based layer may receive the initial transcript from the data-driven, statistical layer and execute at least one pre-determined rule to generate a first modified transcript. A machine learning approach layer may receive the first modified transcript from the rule-based layer and perform a neural model inference to create a second modified transcript. A human editor layer may receive the second modified transcript from the machine learning approach layer along with an adjustment from at least one human editor. The adjustment may create, in some embodiments, a final transcript that may be used to fine-tune the data-driven, statistical layer.
    Type: Application
    Filed: August 21, 2019
    Publication date: February 25, 2021
    Inventors: Dimitrios Basile Dimitriadis, Xie Chen, Nanshan Zeng, Yu Shi, Liyang Lu
  • Publication number: 20210043207
    Abstract: The present disclosure provides a method and apparatus for processing a message. A statement sentence message and a message processing parameter associated with a user's session message are obtained. One or more first statement sentence nodes that semantically match the statement sentence message are determined in a knowledge map. One or more second statement sentence nodes corresponding to the message processing parameter are obtained from the knowledge map, based on node relationship properties of the first statement sentence nodes. A response is generated based at least in part on statement sentences of the one or more second statement sentence nodes. The generated response is provided to the user.
    Type: Application
    Filed: April 6, 2019
    Publication date: February 11, 2021
    Inventors: Ling Chen, Yu Shi, Yining Chen, Nanshan Zeng, Dong Li
  • Publication number: 20200351603
    Abstract: A computer implemented method includes receiving multiple channels of audio from three or more microphones detecting speech from a meeting of multiple users, localizing speech sources to determine an approximate direction of arrival of speech from a user, using a speech unmixing model to select two channels corresponding to a primary and a secondary microphone, and sending the two selected channels to a meeting server for generation of a speaker attributed meeting transcript.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: William Isaac Hinthorn, Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, Xuedong Huang
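    The two-channel selection step in this abstract can be approximated very crudely: rank microphones by captured speech energy and keep the top two as primary and secondary. The real system uses a learned speech unmixing model with direction-of-arrival cues; the energy heuristic below is only a stand-in.

    ```python
    import numpy as np

    def select_channels(channels):
        # Mean squared amplitude as a rough proxy for speech energy per mic.
        energies = [float(np.mean(np.square(c))) for c in channels]
        order = sorted(range(len(channels)),
                       key=lambda i: energies[i], reverse=True)
        primary, secondary = order[0], order[1]
        return primary, secondary

    # Three microphones; mic 1 is closest to the active speaker.
    mics = [np.array([0.1, 0.1]), np.array([0.9, 0.8]), np.array([0.4, 0.3])]
    ```

    Only the two selected channels are then streamed to the meeting server, reducing bandwidth while keeping the signals most useful for speaker attribution.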
  • Publication number: 20200349230
    Abstract: Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines a preferred language of the user. A transcript from the received audio streams is generated. The meeting system translates the transcript into the preferred language of the user to form a translated transcript. The translated transcript is provided to the distributed device of the user.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200349949
    Abstract: A computer implemented method includes receiving audio streams at a meeting server from two distributed devices that are streaming audio captured during an ad-hoc meeting between at least two users, comparing the received audio streams to determine that the received audio streams are representative of sound from the ad-hoc meeting, generating a meeting instance to process the audio streams in response to the comparing determining that the audio streams are representative of sound from the ad-hoc meeting, and processing the received audio streams to generate a transcript of the ad-hoc meeting.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
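    The stream-comparison step in this abstract rests on a simple observation: two devices in the same room capture correlated audio. The thresholded correlation below is a stand-in for whatever similarity test the patented method actually applies before spinning up a meeting instance.

    ```python
    import numpy as np

    def same_meeting(stream_a, stream_b, threshold=0.8):
        # Normalized cross-correlation of two short audio snippets.
        a = stream_a - stream_a.mean()
        b = stream_b - stream_b.mean()
        corr = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
        return corr > threshold

    room = np.array([0.0, 1.0, 0.5, -0.5, 0.2])
    nearby = room + 0.01   # a second device hearing the same sound
    elsewhere = np.array([0.3, -0.9, 0.1, 0.8, -0.4])
    ```

    In practice the comparison would need to tolerate clock offsets and per-device gain differences, which is presumably part of what the claimed method handles.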
  • Publication number: 20200349950
    Abstract: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200349954
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200349953
    Abstract: A computer implemented method includes receiving information streams on a meeting server from a set of multiple distributed devices included in a meeting, receiving audio signals representative of speech by at least two users in at least two of the information streams, receiving at least one video signal of at least one user in the information streams, associating a specific user with speech in the received audio signals as a function of the received audio and video signals, and generating a transcript of the meeting with an indication of the specific user associated with the speech.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, William Isaac Hinthorn, Xuedong Huang
  • Patent number: 10812921
    Abstract: A computer implemented method includes receiving multiple channels of audio from three or more microphones detecting speech from a meeting of multiple users, localizing speech sources to determine an approximate direction of arrival of speech from a user, using a speech unmixing model to select two channels corresponding to a primary and a secondary microphone, and sending the two selected channels to a meeting server for generation of a speaker attributed meeting transcript.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: October 20, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: William Isaac Hinthorn, Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, Xuedong Huang