Patents by Inventor Jinyu Li

Jinyu Li has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11676006
    Abstract: According to some embodiments, a universal modeling system may include a plurality of domain expert models to each receive raw input data (e.g., a stream of audio frames containing speech utterances) and provide a domain expert output based on the raw input data. A neural mixture component may then generate a weight corresponding to each domain expert model based on information created by the plurality of domain expert models (e.g., hidden features and/or row convolution). The weights might be associated with, for example, constrained scalar numbers, unconstrained scalar numbers, vectors, matrices, etc. An output layer may provide a universal modeling system output (e.g., an automatic speech recognition result) based on each domain expert output after being multiplied by the corresponding weight for that domain expert model.
    Type: Grant
    Filed: May 16, 2019
    Date of Patent: June 13, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Amit Das, Jinyu Li, Changliang Liu, Yifan Gong
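    Illustrative sketch: the abstract above describes a mixture-of-domain-experts arrangement in which a neural mixture component turns the experts' own hidden features into per-expert weights. The PyTorch snippet below is a minimal, hypothetical sketch of that weighting scheme, not the patented implementation; the module names, dimensions, and the choice of a softmax over a single linear layer are assumptions.

        import torch
        import torch.nn as nn

        class DomainExpert(nn.Module):
            """Toy stand-in for one pretrained domain model: frames -> (hidden features, logits)."""
            def __init__(self, feat_dim, hidden_dim, num_classes):
                super().__init__()
                self.encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
                self.head = nn.Linear(hidden_dim, num_classes)

            def forward(self, frames):                      # frames: (batch, time, feat_dim)
                hidden, _ = self.encoder(frames)
                return hidden, self.head(hidden)

        class MixtureOfDomainExperts(nn.Module):
            def __init__(self, experts, hidden_dim, num_classes):
                super().__init__()
                self.experts = nn.ModuleList(experts)
                # Neural mixture component: experts' hidden features -> one weight per expert.
                self.mixture = nn.Linear(hidden_dim * len(experts), len(experts))
                self.output = nn.Linear(num_classes, num_classes)

            def forward(self, frames):
                hiddens, logits = zip(*(expert(frames) for expert in self.experts))
                weights = torch.softmax(self.mixture(torch.cat(hiddens, dim=-1)), dim=-1)
                # Each expert output is multiplied by its weight before the output layer.
                mixed = sum(w.unsqueeze(-1) * l for w, l in zip(weights.unbind(-1), logits))
                return self.output(mixed)                   # (batch, time, num_classes)

        # Example: three domain experts over 80-dim audio features and a 500-token output space.
        model = MixtureOfDomainExperts([DomainExpert(80, 256, 500) for _ in range(3)], 256, 500)
        posteriors = model(torch.randn(2, 100, 80))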
  • Publication number: 20230180565
    Abstract: A pixel array, a display apparatus and a fine metal mask are provided. The pixel array includes first basic pixel units in odd-numbered rows and second basic pixel units in even-numbered rows; each first basic pixel unit includes first, second and third sub-pixel groups sequentially in a row direction; each second basic pixel unit includes the third, first and second sub-pixel groups sequentially in the row direction; the first, second and third sub-pixel groups include two first, two second and two third sub-pixels in a column direction, respectively; the first, second and third sub-pixels have different colors; the first and second basic pixel units in the even-numbered and odd-numbered rows are aligned in the column direction, respectively; in the first and second basic pixel units in two adjacent rows, the third sub-pixel groups in the two rows are staggered with each other in the row direction.
    Type: Application
    Filed: October 29, 2021
    Publication date: June 8, 2023
    Inventors: Peng CAO, Wei ZHANG, Yamin YANG, Jianchao ZHANG, Guang JIN, Jinyu LI
  • Publication number: 20230176238
    Abstract: Provided are a detection substrate and a ray detector. The detection substrate includes: a base substrate; and a plurality of detection pixels located on the base substrate, at least one of which serves as a detection marking pixel. The detection marking pixel includes: a storage capacitor, a first electrode plate of the storage capacitor being coupled to a bias voltage end; a discharge circuit configured to write a signal of the bias voltage end into a second electrode plate of the storage capacitor under the control of a first scanning signal end; and a reading circuit coupled to an external reading circuit, the reading circuit being configured to write the voltage of the second electrode plate of the storage capacitor into the external reading circuit and write a reference signal of the external reading circuit into the second electrode plate of the storage capacitor under the control of a second scanning signal end.
    Type: Application
    Filed: May 12, 2021
    Publication date: June 8, 2023
    Inventors: Jianxing SHANG, Jinyu LI, Kunjing CHUNG, Jingjie SU, Xuecheng HOU, Zhenwu JIANG, Zhenyu WANG, Shenkang WU, Haiyang LU, Jian MA
  • Patent number: 11657799
    Abstract: Techniques performed by a data processing system for training a Recurrent Neural Network Transducer (RNN-T) herein include encoder pretraining by training a neural network-based token classification model using first token-aligned training data representing a plurality of utterances, where each utterance is associated with a plurality of frames of audio data and tokens representing each utterance are aligned with frame boundaries of the plurality of audio frames; obtaining a first cross-entropy (CE) criterion from the token classification model, wherein the CE criterion represents a divergence between expected outputs and reference outputs of the model; pretraining an encoder of an RNN-T based on the first CE criterion; and training the RNN-T with second training data after pretraining the encoder of the RNN-T. These techniques also include whole-network pretraining of the RNN-T.
    Type: Grant
    Filed: April 3, 2020
    Date of Patent: May 23, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Rui Zhao, Jinyu Li, Liang Lu, Yifan Gong, Hu Hu
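    Illustrative sketch: the abstract above covers pretraining an RNN-T encoder as a frame-level token classifier with a cross-entropy criterion before transducer training. The snippet below is a hedged sketch of that pretraining stage only; the data format (one token id per frame, i.e. token-aligned labels), the optimizer, and the function names are assumptions, and the subsequent RNN-T training with a transducer loss is not shown.

        import torch
        import torch.nn as nn

        def pretrain_encoder(encoder, classifier_head, loader, epochs=1, lr=1e-3):
            """encoder: frames (B, T, F) -> hidden (B, T, H); classifier_head: hidden -> token logits."""
            params = list(encoder.parameters()) + list(classifier_head.parameters())
            opt = torch.optim.Adam(params, lr=lr)
            ce = nn.CrossEntropyLoss()
            for _ in range(epochs):
                for frames, frame_tokens in loader:          # frame_tokens: (B, T), one token id per frame
                    logits = classifier_head(encoder(frames))          # (B, T, vocab)
                    loss = ce(logits.reshape(-1, logits.size(-1)), frame_tokens.reshape(-1))
                    opt.zero_grad()
                    loss.backward()
                    opt.step()
            return encoder    # afterwards reused as the RNN-T encoder and trained with the transducer loss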
  • Publication number: 20230154467
    Abstract: A computing system including one or more processors configured to receive an audio input. The one or more processors may generate a text transcription of the audio input at a sequence-to-sequence speech recognition model, which may assign a respective plurality of external-model text tokens to a plurality of frames included in the audio input. Each external-model text token may have an external-model alignment within the audio input. Based on the audio input, the one or more processors may generate a plurality of hidden states. Based on the plurality of hidden states, the one or more processors may generate a plurality of output text tokens. Each output text token may have a corresponding output alignment within the audio input. For each output text token, a latency between the output alignment and the external-model alignment may be below a predetermined latency threshold. The one or more processors may output the text transcription.
    Type: Application
    Filed: January 20, 2023
    Publication date: May 18, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yashesh GAUR, Jinyu LI, Liang LU, Hirofumi INAGUMA, Yifan GONG
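    Illustrative sketch: the abstract above constrains the latency between each output token's alignment and the alignment assigned by an external model to stay under a threshold. The small helper below merely illustrates how such a per-token latency check could be expressed; the frame indices, the 24-frame budget, and the function name are assumptions, and the publication applies the constraint during model training rather than as a post-hoc check.

        def within_latency_budget(output_alignments, external_alignments, max_latency_frames=24):
            """Both arguments: one frame index per emitted token, in emission order."""
            return all(
                out_frame - ext_frame <= max_latency_frames
                for out_frame, ext_frame in zip(output_alignments, external_alignments)
            )

        # Example: every token is emitted within 12 frames of the external alignment,
        # so a 24-frame latency budget is satisfied.
        print(within_latency_budget([10, 34, 57], [5, 28, 45]))   # True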
  • Patent number: 11631399
    Abstract: According to some embodiments, a machine learning model may include an input layer to receive an input signal as a series of frames representing handwriting data, speech data, audio data, and/or textual data. A plurality of time layers may be provided, and each time layer may comprise a uni-directional recurrent neural network processing block. A depth processing block may scan hidden states of the recurrent neural network processing block of each time layer, and the depth processing block may be associated with a first frame and receive context frame information of a sequence of one or more future frames relative to the first frame. An output layer may output a final classification as a classified posterior vector of the input signal. For example, the depth processing block may receive the context frame information from an output of a time layer processing block or another depth processing block of the future frame.
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: April 18, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jinyu Li, Vadim Mazalov, Changliang Liu, Liang Lu, Yifan Gong
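    Illustrative sketch: the abstract above pairs uni-directional recurrent "time" layers with a depth processing block that scans each layer's hidden states and sees a few future context frames. The PyTorch sketch below is one hypothetical reading of that structure, not the patented model: the use of torch.roll for look-ahead (which wraps at the sequence end instead of padding), the tanh depth cell, and all dimensions are assumptions.

        import torch
        import torch.nn as nn

        class TimeDepthModel(nn.Module):
            def __init__(self, feat_dim, hidden_dim, num_layers, num_classes, future_frames=2):
                super().__init__()
                self.hidden_dim, self.future_frames = hidden_dim, future_frames
                dims = [feat_dim] + [hidden_dim] * (num_layers - 1)
                self.time_layers = nn.ModuleList(
                    nn.LSTM(dims[i], hidden_dim, batch_first=True) for i in range(num_layers)
                )
                # Depth block input: current hidden frame, look-ahead frames, previous depth state.
                self.depth_block = nn.Linear(hidden_dim * (2 + future_frames), hidden_dim)
                self.output = nn.Linear(hidden_dim, num_classes)

            def forward(self, frames):                              # frames: (B, T, feat_dim)
                h = frames
                g = frames.new_zeros(frames.size(0), frames.size(1), self.hidden_dim)
                for layer in self.time_layers:
                    h, _ = layer(h)                                 # uni-directional recurrence over time
                    # Depth scan of this time layer's hidden states, with future-frame context.
                    ctx = [h] + [torch.roll(h, shifts=-k, dims=1)
                                 for k in range(1, self.future_frames + 1)]
                    g = torch.tanh(self.depth_block(torch.cat(ctx + [g], dim=-1)))
                return self.output(g)                               # classified posterior vector per frame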
  • Patent number: 11587569
    Abstract: Systems, methods, and devices are provided for generating and using text-to-speech (TTS) data for improved speech recognition models. A main model is trained with keyword-independent baseline training data. In some instances, acoustic and language model sub-components of the main model are modified with new TTS training data. In some instances, the new TTS training data is obtained from a multi-speaker neural TTS system for a keyword that is underrepresented in the baseline training data. In some instances, the new TTS training data is used for pronunciation learning and normalization of keyword-dependent confidence scores in keyword spotting (KWS) applications. In some instances, the new TTS training data is used for rapid speaker adaptation in speech recognition models.
    Type: Grant
    Filed: May 14, 2020
    Date of Patent: February 21, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Guoli Ye, Yan Huang, Wenning Wei, Lei He, Eva Sharma, Jian Wu, Yao Tian, Edward C. Lin, Yifan Gong, Rui Zhao, Jinyu Li, William Maxwell Gale
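    Illustrative sketch: the abstract above uses a multi-speaker neural TTS system to synthesize extra training data for keywords that the baseline corpus under-represents. The snippet below only sketches that data flow; synthesize, the carrier-sentence templates, and the corpus format are hypothetical placeholders, not components named by the patent.

        def augment_with_tts(baseline_corpus, keyword, carrier_templates, tts_speakers, synthesize):
            """Return the baseline (audio, transcript) pairs plus synthetic pairs for the rare keyword."""
            synthetic = []
            for template in carrier_templates:                    # e.g. "hey {keyword} what is the weather"
                text = template.format(keyword=keyword)
                for speaker in tts_speakers:                      # multi-speaker TTS for acoustic variety
                    synthetic.append((synthesize(text, speaker), text))
            return baseline_corpus + synthetic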
  • Patent number: 11586930
    Abstract: Embodiments are associated with conditional teacher-student model training. A trained teacher model configured to perform a task may be accessed and an untrained student model may be created. A model training platform may provide training data labeled with ground truths to the teacher model to produce teacher posteriors representing the training data. When it is determined that a teacher posterior matches the associated ground truth label, the platform may conditionally use the teacher posterior to train the student model. When it is determined that a teacher posterior does not match the associated ground truth label, the platform may conditionally use the ground truth label to train the student model. The models might be associated with, for example, automatic speech recognition (e.g., in connection with domain adaptation and/or speaker adaptation).
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: February 21, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong
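    Illustrative sketch: the abstract above trains the student against the teacher's posterior only when that posterior agrees with the ground-truth label, and against the ground truth otherwise. The loss below is a minimal rendering of that rule under assumed shapes; the agreement test (argmax match) and the reduction are assumptions.

        import torch
        import torch.nn.functional as F

        def conditional_ts_loss(student_logits, teacher_posteriors, labels):
            """student_logits: (N, C); teacher_posteriors: (N, C), already normalized; labels: (N,)."""
            teacher_correct = teacher_posteriors.argmax(dim=-1).eq(labels)          # (N,) bool
            log_p_student = F.log_softmax(student_logits, dim=-1)
            soft_loss = -(teacher_posteriors * log_p_student).sum(dim=-1)           # learn from teacher
            hard_loss = F.cross_entropy(student_logits, labels, reduction="none")   # learn from ground truth
            return torch.where(teacher_correct, soft_loss, hard_loss).mean()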
  • Publication number: 20230041531
    Abstract: A flat panel detector and an imaging system are provided. The flat panel detector includes a plurality of pixel units which include photosensitive pixel units and alignment pixel units. Each photosensitive pixel unit includes a photoelectric sensor configured to convert an incident light into an electrical signal so that a photosensitive pixel unit in which the photoelectric sensor is located has a grayscale that changes according to a real-time change of the incident light. Each alignment pixel unit is configured to have a fixed grayscale, and the fixed grayscale does not change according to the real-time change of the incident light. The alignment pixel units include first alignment pixel units and second alignment pixel units. Each first alignment pixel unit has a first fixed grayscale, and each second alignment pixel unit has a second fixed grayscale different from the first fixed grayscale.
    Type: Application
    Filed: June 11, 2021
    Publication date: February 9, 2023
    Inventors: Jinyu LI, Xuecheng HOU
  • Publication number: 20230037094
    Abstract: The present application relates to an OLED light-emitting unit for use in a top-emission OLED substrate, which includes an anode, a cathode, and an organic functional layer arranged between the anode and the cathode. The anode includes a first metal layer and a second metal layer arranged in sequence, a separation layer is arranged between the first metal layer and the second metal layer, and a thickness of the second metal layer is within a preset threshold range such that metal atoms of the second metal layer are capable of being thermally agglomerated and rearranged under a preset condition to form a concave-convex structure on a surface of the second metal layer.
    Type: Application
    Filed: August 11, 2021
    Publication date: February 2, 2023
    Inventors: Wei ZHANG, Cheng ZENG, Wanmei QING, Jinyu LI, Bing LIAO
  • Patent number: 11562745
    Abstract: A computing system including one or more processors configured to receive an audio input. The one or more processors may generate a text transcription of the audio input at a sequence-to-sequence speech recognition model, which may assign a respective plurality of external-model text tokens to a plurality of frames included in the audio input. Each external-model text token may have an external-model alignment within the audio input. Based on the audio input, the one or more processors may generate a plurality of hidden states. Based on the plurality of hidden states, the one or more processors may generate a plurality of output text tokens. Each output text token may have a corresponding output alignment within the audio input. For each output text token, a latency between the output alignment and the external-model alignment may be below a predetermined latency threshold. The one or more processors may output the text transcription.
    Type: Grant
    Filed: April 6, 2020
    Date of Patent: January 24, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yashesh Gaur, Jinyu Li, Liang Lu, Hirofumi Inaguma, Yifan Gong
  • Publication number: 20220415314
    Abstract: Novel solutions for speech recognition provide contextual spelling correction (CSC) for automatic speech recognition (ASR). Disclosed examples include receiving an audio stream; performing an ASR process on the audio stream to produce an ASR hypothesis; receiving a context list; and, based on at least the ASR hypothesis and the context list, performing spelling correction to produce an output text sequence. A contextual spelling correction (CSC) model is used on top of an ASR model, precluding the need for changing the original ASR model. This permits run-time user customization based on contextual data, even for large-size context lists. Some examples include filtering ASR hypotheses for the audio stream and, based on at least the ASR hypotheses filtering, determining whether to trigger spelling correction for the ASR hypothesis. Some examples include generating text to speech (TTS) audio using preprocessed transcriptions with context phrases to train the CSC model.
    Type: Application
    Filed: August 31, 2022
    Publication date: December 29, 2022
    Inventors: Xiaoqiang WANG, Yanqing LIU, Sheng ZHAO, Jinyu LI
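    Illustrative sketch: the abstract above layers a contextual spelling correction (CSC) model on top of an unmodified ASR model and triggers it only for hypotheses that appear to involve the user's context list. The snippet below sketches that control flow; asr_model, csc_model, and the fuzzy-match trigger are hypothetical stand-ins, not the publication's components.

        import difflib

        def looks_relevant(hypothesis, context_list, cutoff=0.75):
            """Cheap trigger: some hypothesis word closely matches a word from a context phrase."""
            context_words = [w for phrase in context_list for w in phrase.lower().split()]
            return any(
                difflib.get_close_matches(word, context_words, n=1, cutoff=cutoff)
                for word in hypothesis.lower().split()
            )

        def transcribe_with_csc(audio, asr_model, csc_model, context_list):
            hypothesis = asr_model(audio)                           # the original ASR model is unchanged
            if looks_relevant(hypothesis, context_list):            # trigger filtering
                return csc_model(hypothesis, context_list)          # run-time, per-user customization
            return hypothesis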
  • Patent number: 11527238
    Abstract: A computer device is provided that includes one or more processors configured to receive an end-to-end (E2E) model that has been trained for automatic speech recognition with training data from a source-domain, and receive an external language model that has been trained with training data from a target-domain. The one or more processors are configured to perform an inference of the probability of an output token sequence given a sequence of input speech features. Performing the inference includes computing an E2E model score, computing an external language model score, and computing an estimated internal language model score for the E2E model. The estimated internal language model score is computed by removing a contribution of an intrinsic acoustic model. The one or more processors are further configured to compute an integrated score based at least on the E2E model score, the external language model score, and the estimated internal language model score.
    Type: Grant
    Filed: January 21, 2021
    Date of Patent: December 13, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zhong Meng, Sarangarajan Parthasarathy, Xie Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong
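    Illustrative sketch: the abstract above integrates three log scores for each candidate token sequence: the E2E model score, an external (target-domain) language model score, and an estimated internal language model (ILM) score that is subtracted out. The combination below is a common way to write such an interpolation and is only an assumption about the form; the weights are illustrative.

        import math

        def integrated_log_score(e2e_log_score, ext_lm_log_score, est_ilm_log_score,
                                 ext_lm_weight=0.6, ilm_weight=0.3):
            """All arguments are log-probabilities of one candidate token sequence."""
            return e2e_log_score + ext_lm_weight * ext_lm_log_score - ilm_weight * est_ilm_log_score

        # Example: discount the source-domain internal LM before adding the target-domain LM score.
        print(integrated_log_score(math.log(0.4), math.log(0.05), math.log(0.3)))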
  • Publication number: 20220378625
    Abstract: The present disclosure relates to an absorbent article comprising a topsheet, a backsheet, a layer of absorbent material disposed between the topsheet and the backsheet, wherein the layer of absorbent material comprises superabsorbent polymer, and an intermediate layer comprising a nonwoven web. The intermediate layer is disposed between the layer of absorbent material and the backsheet, wherein the intermediate layer has an MD tensile/basis weight no greater than about 0.75 N/5 cm/g/m2, as measured according to the Tensile Strength Test, and a thickness/basis weight no less than about 0.078 mm/g/m2, as measured according to the FTT Test.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 1, 2022
    Applicant: The Procter & Gamble Company
    Inventors: Gueltekin ERDEM, Yi YUAN, Jinyu LI, Hui LIU, Ernesto Gabriel BIANCHI, Abhishek Prakash SURUSHE, Sascha KREISEL
  • Publication number: 20220351718
    Abstract: A computing system is configured to generate a transformer-transducer-based deep neural network. The transformer-transducer-based deep neural network comprises a transformer encoder network and a transducer predictor network. The transformer encoder network has a plurality of layers, each of which includes a multi-head attention network sublayer and a feed-forward network sublayer. The computing system trains an end-to-end (E2E) automatic speech recognition (ASR) model using the transformer-transducer-based deep neural network. The E2E ASR model has one or more adjustable hyperparameters that are configured to dynamically adjust an efficiency or a performance of the E2E ASR model when the E2E ASR model is deployed onto a device or executed by the device.
    Type: Application
    Filed: April 29, 2021
    Publication date: November 3, 2022
    Inventors: Yu WU, Jinyu LI, Shujie LIU, Xie CHEN, Chengyi WANG
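    Illustrative sketch: the abstract above describes a transformer encoder whose layers each contain a multi-head attention sublayer and a feed-forward sublayer, with hyperparameters (e.g. layer count, heads, widths) that can be adjusted for the target device. The PyTorch layer below is a generic rendering of that sublayer structure, not the publication's exact architecture; the residual/layer-norm placement and default sizes are assumptions.

        import torch
        import torch.nn as nn

        class TransformerEncoderLayer(nn.Module):
            def __init__(self, d_model=512, num_heads=8, d_ff=2048, dropout=0.1):
                super().__init__()
                self.attn = nn.MultiheadAttention(d_model, num_heads, dropout=dropout, batch_first=True)
                self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
                self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
                self.drop = nn.Dropout(dropout)

            def forward(self, x, attn_mask=None):                   # x: (batch, frames, d_model)
                a, _ = self.attn(x, x, x, attn_mask=attn_mask)      # multi-head attention sublayer
                x = self.norm1(x + self.drop(a))
                return self.norm2(x + self.drop(self.ff(x)))        # feed-forward network sublayer

        # Stacking more or fewer such layers (or shrinking d_model / num_heads) is one concrete way
        # the efficiency-versus-performance trade-off mentioned in the abstract can be tuned per device.
        encoder = nn.Sequential(*[TransformerEncoderLayer() for _ in range(6)])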
  • Publication number: 20220335249
    Abstract: The disclosure herein describes systems and methods for object data storage. In some examples, the method includes generating a profile for an object in a directory, the profile including a first feature vector corresponding to the object and a global unique identifier (GUID) corresponding to the first feature vector in the profile; generating a search scope, the search scope including at least the GUID corresponding to the profile; generating a second feature vector from a live image scan; matching the generated second feature vector from the live image scan to the first feature vector using the generated search scope; identifying the GUID corresponding to the first feature vector that matches the second feature vector; and outputting information corresponding to the object of the profile identified by the GUID corresponding to the first feature vector.
    Type: Application
    Filed: May 26, 2021
    Publication date: October 20, 2022
    Inventors: William Louis THOMAS, Jinyu LI, Yang CHEN, Youyou HAN OPPENLANDER, Steven John BOWLES, Qingfen LIN
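    Illustrative sketch: the abstract above stores a feature vector per object profile under a GUID, narrows matching to a search scope of GUIDs, and matches a live-scan vector against the stored vectors. The snippet below sketches that lookup with cosine similarity; the directory layout, the similarity measure, and the threshold are assumptions.

        import uuid
        import numpy as np

        def create_profile(directory, feature_vector):
            """Store one object profile; the returned GUID identifies its feature vector."""
            guid = str(uuid.uuid4())
            directory[guid] = np.asarray(feature_vector, dtype=float)
            return guid

        def match(directory, search_scope, live_vector, min_similarity=0.8):
            """Return the in-scope GUID whose stored vector best matches the live-scan vector."""
            live = np.asarray(live_vector, dtype=float)
            best_guid, best_sim = None, min_similarity
            for guid in search_scope:                               # only GUIDs in the search scope
                ref = directory[guid]
                sim = float(ref @ live) / (np.linalg.norm(ref) * np.linalg.norm(live) + 1e-12)
                if sim > best_sim:
                    best_guid, best_sim = guid, sim
            return best_guid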
  • Publication number: 20220296186
    Abstract: The present disclosure discloses a flat panel detector, a driving method thereof and a detection device. The flat panel detector includes: at least one stage of first demultiplexer, wherein signal output terminals of a first stage of first demultiplexer are connected with the scanning signal lines in one-to-one correspondence, signal output terminals of each other stage of first demultiplexer serve as signal input terminals of the previous stage of first demultiplexer, the first demultiplexers are configured to provide signals of the signal input terminals to all the signal output terminals in time division, the signal input terminals and the signal output terminals included in the different stages of first demultiplexers differ from one another, and the quantity of the signal input terminals and the quantity of the signal output terminals are reduced stage by stage from the first stage of first demultiplexer to the last stage of first demultiplexer.
    Type: Application
    Filed: November 1, 2021
    Publication date: September 22, 2022
    Inventors: Fengchun PANG, Xuecheng HOU, Jinyu LI, Zhenwu JIANG
  • Patent number: 11429860
    Abstract: Systems and methods are provided for generating a DNN classifier by "learning" a "student" DNN model from a larger, more accurate "teacher" DNN model. The student DNN may be trained from unlabeled training data because its supervised signal is obtained by passing the unlabeled training data through the teacher DNN. In one embodiment, an iterative process is applied to train the student DNN by minimizing the divergence of the output distributions from the teacher and student DNN models. For each iteration until convergence, the difference in the output distributions is used to update the student DNN model, and output distributions are determined again, using the unlabeled training data. The resulting trained student model may be suitable for providing accurate signal processing applications on devices having limited computational or storage resources such as mobile or wearable devices. In an embodiment, the teacher DNN model comprises an ensemble of DNN models.
    Type: Grant
    Filed: September 14, 2015
    Date of Patent: August 30, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jinyu Li, Rui Zhao, Jui-Ting Huang, Yifan Gong
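    Illustrative sketch: the abstract above trains a student DNN from unlabeled data by minimizing the divergence between teacher and student output distributions, iterating until convergence. The loop below is a minimal sketch of that procedure under assumed interfaces (models map frames to logits); the KL direction, optimizer, and epoch count are assumptions, and an ensemble teacher would simply average its members' posteriors.

        import torch
        import torch.nn.functional as F

        def distill(student, teacher, unlabeled_loader, epochs=5, lr=1e-3):
            opt = torch.optim.Adam(student.parameters(), lr=lr)
            teacher.eval()
            for _ in range(epochs):                                 # in practice: iterate until convergence
                for frames in unlabeled_loader:                     # no transcriptions are needed
                    with torch.no_grad():
                        teacher_post = F.softmax(teacher(frames), dim=-1)     # supervision signal
                    student_logp = F.log_softmax(student(frames), dim=-1)
                    loss = F.kl_div(student_logp, teacher_post, reduction="batchmean")
                    opt.zero_grad()
                    loss.backward()
                    opt.step()
            return student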
  • Patent number: 11407173
    Abstract: A light valve panel and a manufacturing method thereof, a three-dimensional printing system, and a three-dimensional printing method are disclosed. The light valve panel includes a first light valve array substrate and at least one second light valve array substrate, the first light valve array substrate and the at least one second light valve array substrate are arranged in a stack; the first light valve array substrate includes a plurality of first pixel units arranged in an array, and the second light valve array substrate includes a plurality of second pixel units arranged in an array; and an orthographic projection of at least one of the second pixel units on the first light valve array substrate partially overlaps with at least one of the first pixel units.
    Type: Grant
    Filed: March 19, 2020
    Date of Patent: August 9, 2022
    Assignees: BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD., BOE TECHNOLOGY GROUP CO., LTD.
    Inventors: Jinyu Li, Yanchen Li, Haobo Fang, Yu Zhao, Dawei Feng, Dong Wang, Wang Guo, Hailong Wang
  • Patent number: 11376596
    Abstract: A microfluidic chip configured to move a microdroplet along a predetermined path, includes a plurality of probe electrode groups spaced apart along the predetermined path. Each of the plurality of probe electrode groups includes a first probe electrode and a second probe electrode spaced apart from each other. The first probe electrode and the second probe electrode among a plurality of first probe electrodes and a plurality of second probe electrodes are configured to form an electrical loop with the microdroplet to thereby facilitate determining a position of the microdroplet.
    Type: Grant
    Filed: December 25, 2018
    Date of Patent: July 5, 2022
    Assignees: BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD., BOE TECHNOLOGY GROUP CO., LTD.
    Inventors: Mingyang Lv, Yue Li, Jinyu Li, Yanchen Li, Dawei Feng, Dong Wang, Yu Zhao, Shaojun Hou, Wang Guo