Patents by Inventor Zhong Meng
Zhong Meng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11951814
Abstract: The embodiments of the disclosure provide an intelligent glass and an intelligent window system, and relate to the technical field of window display. The intelligent glass of the disclosure includes a touch display assembly and a glass assembly. The touch display assembly is communicatively coupled to the glass assembly, and is configured to send a corresponding dimming instruction to the glass assembly based on a received touch instruction, such that the glass assembly adjusts its light transmittance based on the dimming instruction.
Type: Grant
Filed: September 26, 2019
Date of Patent: April 9, 2024
Assignee: BOE TECHNOLOGY GROUP CO., LTD.
Inventors: Yongbo Wang, Chen Meng, Zhong Hu, Yutao Tang, Wenjie Zhong, Dahai Hu, Wei Shi
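A minimal sketch of the control flow described in this abstract: the touch display assembly maps a received touch instruction to a dimming instruction, and the glass assembly adjusts its light transmittance accordingly. All class names and the 0-to-1 dimming scale are illustrative assumptions, not taken from the patent.

```python
# Illustrative sketch only; names and value ranges are assumptions.
class GlassAssembly:
    def __init__(self):
        self.transmittance = 1.0  # fully transparent

    def apply_dimming(self, level):
        """Adjust light transmittance based on a dimming instruction in [0, 1]."""
        self.transmittance = max(0.0, min(1.0, 1.0 - level))

class TouchDisplayAssembly:
    def __init__(self, glass):
        self.glass = glass  # communicatively coupled to the glass assembly

    def on_touch(self, touch_value):
        """Translate a received touch instruction into a dimming instruction."""
        self.glass.apply_dimming(level=touch_value)

glass = GlassAssembly()
TouchDisplayAssembly(glass).on_touch(0.7)
print(glass.transmittance)  # 0.3, i.e. roughly 30% light transmittance
```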
-
Patent number: 11935542
Abstract: A hypothesis stitcher for speech recognition of long-form audio provides superior performance, such as higher accuracy and reduced computational cost. An example disclosed operation includes: segmenting the audio stream into a plurality of audio segments; identifying a plurality of speakers within each of the plurality of audio segments; performing automatic speech recognition (ASR) on each of the plurality of audio segments to generate a plurality of short-segment hypotheses; merging at least a portion of the short-segment hypotheses into a first merged hypothesis set; inserting stitching symbols into the first merged hypothesis set, the stitching symbols including a window change (WC) symbol; and consolidating, with a network-based hypothesis stitcher, the first merged hypothesis set into a first consolidated hypothesis.
Type: Grant
Filed: January 19, 2023
Date of Patent: March 19, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka
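A minimal sketch of the data flow this abstract describes: per-segment hypotheses for one speaker are merged with window-change (WC) symbols at segment boundaries, then consolidated. The patent uses a network-based stitcher; the `consolidate` function below is only a trivial rule-based stand-in, and the `WC_SYMBOL` token is an assumption.

```python
# Illustrative sketch; the real stitcher is network-based, not rule-based.
WC_SYMBOL = "<wc>"  # window-change marker inserted at each segment boundary

def merge_hypotheses(segment_hypotheses):
    """Merge per-segment word hypotheses, inserting WC symbols between segments."""
    merged = []
    for i, words in enumerate(segment_hypotheses):
        if i > 0:
            merged.append(WC_SYMBOL)
        merged.extend(words)
    return merged

def consolidate(merged):
    """Stand-in for the network-based stitcher: drop WC symbols and remove a
    word duplicated across a window boundary by overlapping segments."""
    out, after_wc = [], False
    for tok in merged:
        if tok == WC_SYMBOL:
            after_wc = True
            continue
        if after_wc and out and out[-1] == tok:
            after_wc = False
            continue  # skip word repeated at the boundary
        out.append(tok)
        after_wc = False
    return " ".join(out)

# Example: two overlapping windows for the same speaker.
segments = [["hello", "world", "this"], ["this", "is", "a", "test"]]
print(consolidate(merge_hypotheses(segments)))  # hello world this is a test
```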
-
Patent number: 11915686
Abstract: Embodiments are associated with a speaker-independent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-independent attention-based encoder-decoder model associated with a first output distribution, and a speaker-dependent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-dependent attention-based encoder-decoder model associated with a second output distribution. The second attention-based encoder-decoder model is trained to classify output tokens based on input speech frames of a target speaker and simultaneously trained to maintain a similarity between the first output distribution and the second output distribution.
Type: Grant
Filed: January 5, 2022
Date of Patent: February 27, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong
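A hedged sketch of the adaptation objective this abstract describes: the speaker-dependent (SD) model is trained on the target speaker's frames while a divergence term keeps its output distribution close to that of the frozen speaker-independent (SI) model. The interpolation weight `kld_weight` and the function name are assumptions for illustration.

```python
# Illustrative sketch of KL-regularized speaker adaptation; weights are assumed.
import torch
import torch.nn.functional as F

def adaptation_loss(sd_logits, si_logits, token_targets, kld_weight=0.5):
    # Token classification loss on the target speaker's frames.
    ce = F.cross_entropy(sd_logits, token_targets)
    # Similarity term: KL divergence between the frozen SI output distribution
    # and the SD output distribution (detach the SI outputs).
    kld = F.kl_div(
        F.log_softmax(sd_logits, dim=-1),
        F.softmax(si_logits.detach(), dim=-1),
        reduction="batchmean",
    )
    return (1.0 - kld_weight) * ce + kld_weight * kld

# Toy usage with random tensors standing in for decoder outputs.
vocab, frames = 100, 8
sd_logits = torch.randn(frames, vocab, requires_grad=True)
si_logits = torch.randn(frames, vocab)
targets = torch.randint(0, vocab, (frames,))
adaptation_loss(sd_logits, si_logits, targets).backward()
```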
-
Patent number: 11823702
Abstract: To generate substantially condition-invariant and speaker-discriminative features, embodiments are associated with a feature extractor capable of extracting features from speech frames based on first parameters, a speaker classifier capable of identifying a speaker based on the features and on second parameters, and a condition classifier capable of identifying a noise condition based on the features and on third parameters. The first parameters of the feature extractor and the second parameters of the speaker classifier are trained to minimize a speaker classification loss, the first parameters of the feature extractor are further trained to maximize a condition classification loss, and the third parameters of the condition classifier are trained to minimize the condition classification loss.
Type: Grant
Filed: November 30, 2021
Date of Patent: November 21, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Meng, Yong Zhao, Jinyu Li, Yifan Gong
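A common way to realize the min-max objective in this abstract is a gradient reversal layer: the condition classifier minimizes its loss while the reversed gradient pushes the feature extractor to maximize it. The sketch below shows that pattern; layer sizes, speaker/condition counts, and module names are assumptions, not the patent's implementation.

```python
# Illustrative adversarial-training sketch; sizes and names are assumptions.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad  # flip the gradient flowing back into the feature extractor

feature_extractor = nn.Sequential(nn.Linear(40, 256), nn.ReLU())  # first parameters
speaker_classifier = nn.Linear(256, 1000)    # second parameters, e.g. 1000 speakers
condition_classifier = nn.Linear(256, 4)     # third parameters, e.g. 4 noise conditions

def losses(frames, speaker_ids, condition_ids):
    feats = feature_extractor(frames)
    speaker_loss = nn.functional.cross_entropy(speaker_classifier(feats), speaker_ids)
    # The condition classifier minimizes this loss; through the reversed
    # gradient, the feature extractor maximizes it.
    condition_loss = nn.functional.cross_entropy(
        condition_classifier(GradReverse.apply(feats)), condition_ids)
    return speaker_loss + condition_loss

frames = torch.randn(16, 40)  # 16 speech frames with 40-dim features
loss = losses(frames, torch.randint(0, 1000, (16,)), torch.randint(0, 4, (16,)))
loss.backward()
```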
-
Patent number: 11735190
Abstract: To generate substantially domain-invariant and speaker-discriminative features, embodiments may operate to extract features from input data based on a first set of parameters, generate outputs based on the extracted features and on a second set of parameters, and identify words represented by the input data based on the outputs, wherein the first set of parameters and the second set of parameters have been trained to minimize a network loss associated with the second set of parameters, wherein the first set of parameters has been trained to maximize the domain classification loss of a network comprising 1) an attention network to determine, based on a third set of parameters, relative importances of features extracted based on the first parameters to domain classification and 2) a domain classifier to classify a domain based on the extracted features, the relative importances, and a fourth set of parameters, and wherein the third set of parameters and the fourth set of parameters have been trained to minimize …
Type: Grant
Filed: October 5, 2021
Date of Patent: August 22, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Meng, Jinyu Li, Yifan Gong
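A hedged sketch of the four-module setup this (truncated) abstract describes: an attention network weights the extracted features before the domain classifier sees them, and a gradient reversal layer makes the feature extractor maximize the domain loss while the attention network and domain classifier minimize it. Vocabulary size, domain count, and pooling choice are assumptions made only for illustration.

```python
# Illustrative sketch of attention-based domain-adversarial training.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad

feature_extractor = nn.Sequential(nn.Linear(40, 256), nn.ReLU())  # first set of parameters
word_classifier = nn.Linear(256, 5000)                            # second set (network loss)
attention_net = nn.Linear(256, 1)                                 # third set: relative importances
domain_classifier = nn.Linear(256, 3)                             # fourth set, e.g. 3 domains

def step(frames, word_ids, domain_id):
    feats = feature_extractor(frames)                      # (T, 256) features for one utterance
    word_loss = nn.functional.cross_entropy(word_classifier(feats), word_ids)
    # Relative importance of each frame's features to domain classification.
    weights = torch.softmax(attention_net(feats), dim=0)   # (T, 1)
    pooled = (weights * feats).sum(dim=0, keepdim=True)    # attention-weighted summary
    domain_loss = nn.functional.cross_entropy(
        domain_classifier(GradReverse.apply(pooled)), domain_id)
    return word_loss + domain_loss

frames = torch.randn(50, 40)  # 50 frames of 40-dim speech features
loss = step(frames, torch.randint(0, 5000, (50,)), torch.tensor([1]))
loss.backward()
```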
-
Publication number: 20230215439
Abstract: The disclosure herein describes using a transcript generation model for generating a transcript from a multi-speaker audio stream. Audio data including overlapping speech of a plurality of speakers is obtained and a set of frame embeddings are generated from audio data frames of the obtained audio data using an audio data encoder. A set of words and channel change (CC) symbols are generated from the set of frame embeddings using a transcript generation model. The CC symbols are included between pairs of adjacent words that are spoken by different people at the same time. The set of words and CC symbols are transformed into a plurality of transcript lines, wherein words of the set of words are sorted into transcript lines based on the CC symbols, and a multi-speaker transcript is generated based on the plurality of transcript lines. The inclusion of CC symbols by the model enables efficient, accurate multi-speaker transcription.
Type: Application
Filed: December 31, 2021
Publication date: July 6, 2023
Inventors: Naoyuki KANDA, Takuya YOSHIOKA, Zhuo CHEN, Jinyu LI, Yashesh GAUR, Zhong MENG, Xiaofei WANG, Xiong XIAO
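A simplified sketch of the final step this abstract describes: routing the model's word/CC output into transcript lines. One plausible reading is that each CC symbol switches to another virtual channel so that concurrently speaking people end up on separate lines; the `CC` token value, two-channel assumption, and line format below are illustrative assumptions.

```python
# Illustrative sketch; token value and channel handling are assumptions.
CC = "<cc>"  # channel-change symbol between adjacent words of concurrent speakers

def to_transcript_lines(tokens, num_channels=2):
    """Route words into virtual channels; each CC symbol switches channels so
    overlapping speakers land on separate transcript lines."""
    channels = [[] for _ in range(num_channels)]
    current = 0
    for tok in tokens:
        if tok == CC:
            current = (current + 1) % num_channels
        else:
            channels[current].append(tok)
    return [" ".join(ch) for ch in channels if ch]

# Example: two speakers overlap, so the model emits CC between their words.
tokens = ["how", "are", "you", CC, "i", "am", "fine", CC, "great"]
for i, line in enumerate(to_transcript_lines(tokens)):
    print(f"channel {i}: {line}")
# channel 0: how are you great
# channel 1: i am fine
```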
-
Publication number: 20230154468
Abstract: A hypothesis stitcher for speech recognition of long-form audio provides superior performance, such as higher accuracy and reduced computational cost. An example disclosed operation includes: segmenting the audio stream into a plurality of audio segments; identifying a plurality of speakers within each of the plurality of audio segments; performing automatic speech recognition (ASR) on each of the plurality of audio segments to generate a plurality of short-segment hypotheses; merging at least a portion of the short-segment hypotheses into a first merged hypothesis set; inserting stitching symbols into the first merged hypothesis set, the stitching symbols including a window change (WC) symbol; and consolidating, with a network-based hypothesis stitcher, the first merged hypothesis set into a first consolidated hypothesis.
Type: Application
Filed: January 19, 2023
Publication date: May 18, 2023
Inventors: Naoyuki KANDA, Xuankai CHANG, Yashesh GAUR, Xiaofei WANG, Zhong MENG, Takuya YOSHIOKA
-
Patent number: 11586930
Abstract: Embodiments are associated with conditional teacher-student model training. A trained teacher model configured to perform a task may be accessed and an untrained student model may be created. A model training platform may provide training data labeled with ground truths to the teacher model to produce teacher posteriors representing the training data. When it is determined that a teacher posterior matches the associated ground truth label, the platform may conditionally use the teacher posterior to train the student model. When it is determined that a teacher posterior does not match the associated ground truth label, the platform may conditionally use the ground truth label to train the student model. The models might be associated with, for example, automatic speech recognition (e.g., in connection with domain adaptation and/or speaker adaptation).
Type: Grant
Filed: May 13, 2019
Date of Patent: February 21, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong
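A hedged sketch of the conditional criterion this abstract describes: per training frame, the student learns from the teacher posterior only when the teacher's top prediction matches the ground-truth label, and otherwise learns from the ground-truth label. Function and tensor names are illustrative assumptions.

```python
# Illustrative sketch of a conditional teacher-student loss; names are assumed.
import torch
import torch.nn.functional as F

def conditional_ts_loss(student_logits, teacher_logits, labels):
    teacher_post = F.softmax(teacher_logits, dim=-1)
    student_logp = F.log_softmax(student_logits, dim=-1)
    teacher_correct = teacher_post.argmax(dim=-1).eq(labels)   # per-frame condition
    # Soft-target loss against the teacher posterior (per frame).
    kd = -(teacher_post * student_logp).sum(dim=-1)
    # Hard-target loss against the ground-truth label (per frame).
    ce = F.nll_loss(student_logp, labels, reduction="none")
    # Pick per frame depending on whether the teacher matched the ground truth.
    return torch.where(teacher_correct, kd, ce).mean()

# Toy usage.
frames, classes = 8, 20
student_logits = torch.randn(frames, classes, requires_grad=True)
teacher_logits = torch.randn(frames, classes)
labels = torch.randint(0, classes, (frames,))
conditional_ts_loss(student_logits, teacher_logits, labels).backward()
```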
-
Patent number: 11574639
Abstract: A hypothesis stitcher for speech recognition of long-form audio provides superior performance, such as higher accuracy and reduced computational cost. An example disclosed operation includes: segmenting the audio stream into a plurality of audio segments; identifying a plurality of speakers within each of the plurality of audio segments; performing automatic speech recognition (ASR) on each of the plurality of audio segments to generate a plurality of short-segment hypotheses; merging at least a portion of the short-segment hypotheses into a first merged hypothesis set; inserting stitching symbols into the first merged hypothesis set, the stitching symbols including a window change (WC) symbol; and consolidating, with a network-based hypothesis stitcher, the first merged hypothesis set into a first consolidated hypothesis.
Type: Grant
Filed: December 18, 2020
Date of Patent: February 7, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka
-
Publication number: 20230022609
Abstract: A method of preparing a Polygoni Milletii Rhizome tincture includes: preparing a first mixture; extracting the first mixture with 70-90% ethanol under reflux condition to obtain a first extract solution; preparing a second mixture; extracting the second mixture with 50-70% ethanol to obtain a second extract solution; and mixing the first extract solution with the second extract solution to obtain the Polygoni Milletii Rhizome tincture. A method of preparing a Polygoni Milletii Rhizome poultice includes: preparing a Polygoni Milletii Rhizome mixture; mixing the Polygoni Milletii Rhizome mixture and a skin penetration enhancer in water; mixing a moisturizing agent and a binder in water; adding a thickener in water; mixing methylparaben and ethylparaben in 90% ethanol; mixing all solutions to form a mixture; and applying the mixture on a non-woven fabric cloth and drying to form the Polygoni Milletii Rhizome poultice.
Type: Application
Filed: July 1, 2022
Publication date: January 26, 2023
Inventors: Xiaolin XIE, Dezhu ZHANG, Chengyuan LIANG, Shujun DING, Yuzhi LIU, Zhao MA, Xuhua ZHOU, Zhong MENG, Jianguo MENG
-
Patent number: 11527238
Abstract: A computer device is provided that includes one or more processors configured to receive an end-to-end (E2E) model that has been trained for automatic speech recognition with training data from a source-domain, and receive an external language model that has been trained with training data from a target-domain. The one or more processors are configured to perform an inference of the probability of an output token sequence given a sequence of input speech features. Performing the inference includes computing an E2E model score, computing an external language model score, and computing an estimated internal language model score for the E2E model. The estimated internal language model score is computed by removing a contribution of an intrinsic acoustic model. The processor is further configured to compute an integrated score based at least on the E2E model score, the external language model score, and the estimated internal language model score.
Type: Grant
Filed: January 21, 2021
Date of Patent: December 13, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Meng, Sarangarajan Parthasarathy, Xie Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong
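A minimal worked sketch of the score integration this abstract describes: the external language model score is added to, and the estimated internal language model score subtracted from, the E2E model score in log space. The interpolation weights and function name below are assumptions; how the internal LM score is obtained (e.g., by suppressing the acoustic contribution) is only stated schematically.

```python
# Illustrative log-domain score combination; weights are assumptions.
import math

def integrated_log_score(e2e_log_score, ext_lm_log_score, int_lm_log_score,
                         ext_weight=0.6, int_weight=0.4):
    """log score ~ log P_E2E(y|x) + w_ext * log P_extLM(y) - w_int * log P_intLM(y)."""
    return e2e_log_score + ext_weight * ext_lm_log_score - int_weight * int_lm_log_score

# Toy numbers for one candidate token sequence during rescoring.
e2e = math.log(0.02)        # E2E model score for the hypothesis
ext = math.log(0.05)        # external (target-domain) language model score
internal = math.log(0.10)   # estimated internal (source-domain) language model score
print(integrated_log_score(e2e, ext, internal))
```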
-
Publication number: 20220199091
Abstract: A hypothesis stitcher for speech recognition of long-form audio provides superior performance, such as higher accuracy and reduced computational cost. An example disclosed operation includes: segmenting the audio stream into a plurality of audio segments; identifying a plurality of speakers within each of the plurality of audio segments; performing automatic speech recognition (ASR) on each of the plurality of audio segments to generate a plurality of short-segment hypotheses; merging at least a portion of the short-segment hypotheses into a first merged hypothesis set; inserting stitching symbols into the first merged hypothesis set, the stitching symbols including a window change (WC) symbol; and consolidating, with a network-based hypothesis stitcher, the first merged hypothesis set into a first consolidated hypothesis.
Type: Application
Filed: December 18, 2020
Publication date: June 23, 2022
Inventors: Naoyuki KANDA, Xuankai CHANG, Yashesh GAUR, Xiaofei WANG, Zhong MENG, Takuya YOSHIOKA
-
Publication number: 20220165290
Abstract: To generate substantially condition-invariant and speaker-discriminative features, embodiments are associated with a feature extractor capable of extracting features from speech frames based on first parameters, a speaker classifier capable of identifying a speaker based on the features and on second parameters, and a condition classifier capable of identifying a noise condition based on the features and on third parameters. The first parameters of the feature extractor and the second parameters of the speaker classifier are trained to minimize a speaker classification loss, the first parameters of the feature extractor are further trained to maximize a condition classification loss, and the third parameters of the condition classifier are trained to minimize the condition classification loss.
Type: Application
Filed: November 30, 2021
Publication date: May 26, 2022
Inventors: Zhong MENG, Yong ZHAO, Jinyu LI, Yifan GONG
-
Publication number: 20220139380
Abstract: A computer device is provided that includes one or more processors configured to receive an end-to-end (E2E) model that has been trained for automatic speech recognition with training data from a source-domain, and receive an external language model that has been trained with training data from a target-domain. The one or more processors are configured to perform an inference of the probability of an output token sequence given a sequence of input speech features. Performing the inference includes computing an E2E model score, computing an external language model score, and computing an estimated internal language model score for the E2E model. The estimated internal language model score is computed by removing a contribution of an intrinsic acoustic model. The processor is further configured to compute an integrated score based at least on the E2E model score, the external language model score, and the estimated internal language model score.
Type: Application
Filed: January 21, 2021
Publication date: May 5, 2022
Applicant: Microsoft Technology Licensing, LLC
Inventors: Zhong MENG, Sarangarajan PARTHASARATHY, Xie SUN, Yashesh GAUR, Naoyuki KANDA, Liang LU, Xie CHEN, Rui ZHAO, Jinyu LI, Yifan GONG
-
Publication number: 20220130376
Abstract: Embodiments are associated with a speaker-independent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-independent attention-based encoder-decoder model associated with a first output distribution, and a speaker-dependent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-dependent attention-based encoder-decoder model associated with a second output distribution. The second attention-based encoder-decoder model is trained to classify output tokens based on input speech frames of a target speaker and simultaneously trained to maintain a similarity between the first output distribution and the second output distribution.
Type: Application
Filed: January 5, 2022
Publication date: April 28, 2022
Inventors: Zhong MENG, Yashesh GAUR, Jinyu LI, Yifan GONG
-
Publication number: 20220028399
Abstract: To generate substantially domain-invariant and speaker-discriminative features, embodiments may operate to extract features from input data based on a first set of parameters, generate outputs based on the extracted features and on a second set of parameters, and identify words represented by the input data based on the outputs, wherein the first set of parameters and the second set of parameters have been trained to minimize a network loss associated with the second set of parameters, wherein the first set of parameters has been trained to maximize the domain classification loss of a network comprising 1) an attention network to determine, based on a third set of parameters, relative importances of features extracted based on the first parameters to domain classification and 2) a domain classifier to classify a domain based on the extracted features, the relative importances, and a fourth set of parameters, and wherein the third set of parameters and the fourth set of parameters have been trained to minimize …
Type: Application
Filed: October 5, 2021
Publication date: January 27, 2022
Inventors: Zhong MENG, Jinyu LI, Yifan GONG
-
Patent number: 11232782
Abstract: Embodiments are associated with a speaker-independent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-independent attention-based encoder-decoder model associated with a first output distribution, a speaker-dependent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-dependent attention-based encoder-decoder model associated with a second output distribution, training of the second attention-based encoder-decoder model to classify output tokens based on input speech frames of a target speaker and simultaneously training the speaker-dependent attention-based encoder-decoder model to maintain a similarity between the first output distribution and the second output distribution, and performing automatic speech recognition on speech frames of the target speaker using the trained speaker-dependent attention-based encoder-decoder model.
Type: Grant
Filed: November 6, 2019
Date of Patent: January 25, 2022
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong
-
Patent number: 11219652
Abstract: A method for extracting herbal medicine includes: step one, spray extraction; step two, pressure filtration and concentration; step three, spray and countercurrent precipitation; and step four, concentrating under reduced pressure and drying.
Type: Grant
Filed: August 14, 2020
Date of Patent: January 11, 2022
Assignee: SHAANXI PANLONG PHARMACEUTICAL GROUP LIMITED BY SHARE LTD.
Inventors: Xiaolin Xie, Dezhu Zhang, Jianguo Meng, Yu Wang, Xuhua Zhou, Zhong Meng, Nan Hui, Juan Li
-
Patent number: 11217265
Abstract: To generate substantially condition-invariant and speaker-discriminative features, embodiments are associated with a feature extractor capable of extracting features from speech frames based on first parameters, a speaker classifier capable of identifying a speaker based on the features and on second parameters, and a condition classifier capable of identifying a noise condition based on the features and on third parameters. The first parameters of the feature extractor and the second parameters of the speaker classifier are trained to minimize a speaker classification loss, the first parameters of the feature extractor are further trained to maximize a condition classification loss, and the third parameters of the condition classifier are trained to minimize the condition classification loss.
Type: Grant
Filed: June 7, 2019
Date of Patent: January 4, 2022
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Zhong Meng, Yong Zhao, Jinyu Li, Yifan Gong
-
Patent number: 11170789
Abstract: To generate substantially domain-invariant and speaker-discriminative features, embodiments are associated with a feature extractor to receive speech frames and extract features from the speech frames based on a first set of parameters of the feature extractor, a senone classifier to identify a senone based on the received features and on a second set of parameters of the senone classifier, an attention network capable of determining a relative importance of features extracted by the feature extractor to domain classification, based on a third set of parameters of the attention network, a domain classifier capable of classifying a domain based on the features and the relative importances, and on a fourth set of parameters of the domain classifier; and a training platform to train the first set of parameters of the feature extractor and the second set of parameters of the senone classifier to minimize the senone classification loss, train the first set of parameters of the feature extractor to maximize the dom…
Type: Grant
Filed: July 26, 2019
Date of Patent: November 9, 2021
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Zhong Meng, Jinyu Li, Yifan Gong