Patents by Inventor Jinyu Li
Jinyu Li has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11961334
Abstract: The disclosure herein describes systems and methods for object data storage. In some examples, the method includes generating a profile for an object in a directory, the profile including a first feature vector corresponding to the object and a global unique identifier (GUID) corresponding to the first feature vector in the profile; generating a search scope, the search scope including at least the GUID corresponding to the profile; generating a second feature vector from a live image scan; matching the generated second feature vector from the live image scan to the first feature vector using the generated search scope; identifying the GUID corresponding to the first feature vector that matches the second feature vector; and outputting information corresponding to the object of the profile identified by the GUID corresponding to the first feature vector.
Type: Grant
Filed: May 26, 2021
Date of Patent: April 16, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: William Louis Thomas, Jinyu Li, Yang Chen, Youyou Han Oppenlander, Steven John Bowles, Qingfen Lin
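The matching flow the abstract describes (profiles keyed by GUID, a search scope, and feature-vector matching) can be sketched as below. The patent does not specify a similarity metric; cosine similarity, the 0.9 threshold, and all helper names are illustrative assumptions:

```python
import math
import uuid

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_profile(feature_vector):
    """Pair a feature vector with a freshly generated GUID, as in the claims."""
    return {"guid": str(uuid.uuid4()), "vector": feature_vector}

def match_in_scope(query_vector, search_scope, threshold=0.9):
    """Return the GUID of the best-matching profile in the search scope,
    or None if nothing clears the (assumed) similarity threshold."""
    best_guid, best_score = None, threshold
    for profile in search_scope:
        score = cosine_similarity(query_vector, profile["vector"])
        if score >= best_score:
            best_guid, best_score = profile["guid"], score
    return best_guid
```

A second feature vector extracted from a live image scan would be passed as `query_vector`, and the returned GUID then keys the lookup of the profile's output information.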
-
Publication number: 20240095252
Abstract: Methods and systems are provided for ranking search results and generating a presentation. In some implementations, a search system generates a presentation based on a search query. In some implementations, a search system ranks search results based on data stored in a knowledge graph. In some implementations, a search system identifies a modifying concept such as a superlative in a received search query, and determines ranking properties based on the modifying concept.
Type: Application
Filed: November 28, 2023
Publication date: March 21, 2024
Inventors: Chen Zhou, Chen Ding, David Francois Huynh, JinYu Lou, Yanlai Huang, Hongda Shen, Guanghua Li, Yiming Li, Yangyang Chai
-
Patent number: 11933724
Abstract: Disclosed are a device and a method for complex gas mixture detection based on optical-path-adjustable spectrum detection. The device includes: a light source configured to generate an incident beam and emit it into an optical gas cell; the optical gas cell, including a cavity configured to accommodate a gas sample, a reflection module group configured to reflect the incident beam, and a track arranged in the cavity, where the track is consistent with the light path of the beam in the cavity; a detector module that is connected with the track in a relatively movable manner and is configured to receive light beams and obtain spectral data, where the optical path is changed by moving the detector module relative to the track; and a data acquisition unit configured to acquire the spectral data obtained by the detector module.
Type: Grant
Filed: December 1, 2023
Date of Patent: March 19, 2024
Assignee: Hubei University of Technology
Inventors: Yin Zhang, Xiaoxing Zhang, Ran Zhuo, Zhiming Huang, Guozhi Zhang, Dibo Wang, Shuangshuang Tian, Mingli Fu, Yunjian Wu, Yan Luo, Shuo Jin, Jinyu Pu, Yalong Li
-
Patent number: 11915686
Abstract: Embodiments are associated with a speaker-independent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-independent attention-based encoder-decoder model associated with a first output distribution, and a speaker-dependent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-dependent attention-based encoder-decoder model associated with a second output distribution. The second attention-based encoder-decoder model is trained to classify output tokens based on input speech frames of a target speaker and simultaneously trained to maintain a similarity between the first output distribution and the second output distribution.
Type: Grant
Filed: January 5, 2022
Date of Patent: February 27, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong
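One common way to "maintain a similarity between the first output distribution and the second output distribution" during adaptation is to interpolate the token classification loss with a KL-divergence term against the speaker-independent model. The abstract does not name the similarity measure, so KL divergence and the interpolation weight `rho` are assumptions for illustration:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over output tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def adaptation_loss(sd_dist, si_dist, target_index, rho=0.5):
    """Combine cross-entropy against the target token with a KL term that
    keeps the speaker-dependent (SD) distribution close to the
    speaker-independent (SI) one. rho is an assumed interpolation weight."""
    cross_entropy = -math.log(sd_dist[target_index])
    similarity_term = kl_divergence(si_dist, sd_dist)
    return (1 - rho) * cross_entropy + rho * similarity_term
```

Setting `rho = 0` recovers plain fine-tuning on the target speaker; larger values pull the adapted model's outputs back toward the speaker-independent distribution.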
-
Publication number: 20240052689
Abstract: The present disclosure provides a device for separately arranging vacuum glass supports, which comprises a base, a vibrator, a separation chamber assembly, a separation actuator, a feed tube, and a drive device. The separation chamber assembly is provided with a feed inlet, an accommodation cavity, and a dispensing outlet. The feed inlet communicates with the accommodation cavity. The dispensing outlet communicates with the feed tube. The separation actuator is arranged directly below the accommodation cavity and is capable of moving reciprocally with respect to the accommodation cavity. A recess capable of accommodating one support is provided at the upper edge of the separation actuator. Thus, at each reciprocating movement, the separation actuator transports one support from the accommodation cavity to the dispensing outlet via the recess, which avoids dispensing multiple supports in one operation.
Type: Application
Filed: March 4, 2021
Publication date: February 15, 2024
Applicant: LUOYANG LANDGLASS TECHNOLOGY CO., LTD.
Inventors: Yan ZHAO, Zhangsheng WANG, Jinyu LI, Haiyan WU
-
Patent number: 11862144
Abstract: A computer system is provided that includes a processor configured to store a set of audio training data that includes a plurality of audio segments and metadata indicating a word or phrase associated with each audio segment. For a target training statement of a set of structured text data, the processor is configured to generate a concatenated audio signal that matches a word content of a target training statement by comparing the words or phrases of a plurality of text segments of the target training statement to respective words or phrases of audio segments of the stored set of audio training data, selecting a plurality of audio segments from the set of audio training data based on a match in the words or phrases between the plurality of text segments of the target training statement and the selected plurality of audio segments, and concatenating the selected plurality of audio segments.
Type: Grant
Filed: December 16, 2020
Date of Patent: January 2, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Rui Zhao, Jinyu Li, Yifan Gong
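The compare-select-concatenate loop of the abstract can be sketched as a lookup from text segments into a bank of word- or phrase-labeled audio segments. Greedy longest-phrase-first matching and the `audio_bank` structure are illustrative assumptions, not the patented selection procedure:

```python
def build_concatenated_audio(target_statement, audio_bank):
    """Assemble an audio signal whose word content matches the target
    training statement by concatenating stored audio segments.
    audio_bank maps a word/phrase (the segment's metadata) to its samples."""
    words = target_statement.lower().split()
    max_len = max(len(key.split()) for key in audio_bank)
    samples, i = [], 0
    while i < len(words):
        # Prefer the longest stored phrase starting at position i.
        for span in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + span])
            if phrase in audio_bank:
                samples.extend(audio_bank[phrase])
                i += span
                break
        else:
            raise KeyError(f"no stored audio segment matches {words[i]!r}")
    return samples
```

The resulting concatenated signal pairs with the target statement's text to form a synthetic training example.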
-
Publication number: 20230378100
Abstract: A detection substrate and a manufacturing method thereof, and a flat panel detector and a manufacturing method thereof. The detection substrate includes: a substrate including a detection region, a binding region, a controllable on-off region, and a cutting region; a plurality of detection units including transistors and photosensitive devices located in the detection region, where a transistor includes a gate, a first electrode, and a second electrode, and a photosensitive device is connected to the first or second electrode; a plurality of conductive wires, each with one end connected to the gate and the other end extending to the binding region; a conductive ring disposed in the cutting region; and a plurality of detection wires, each with one end connected to the conductive ring and the other end connected to the conductive wires, where the detection wires pass through the controllable on-off region, and the detection wires located in the controllable on-off region can have a disconnected state.
Type: Application
Filed: March 24, 2021
Publication date: November 23, 2023
Applicants: Beijing BOE Sensor Technology Co., Ltd., BOE TECHNOLOGY GROUP CO., LTD.
Inventors: Bin ZHAO, Shuai XU, Binbin XU, Xuecheng HOU, Jinyu LI, Ye ZHANG, Chuncheng CHE
-
Patent number: 11823702
Abstract: To generate substantially condition-invariant and speaker-discriminative features, embodiments are associated with a feature extractor capable of extracting features from speech frames based on first parameters, a speaker classifier capable of identifying a speaker based on the features and on second parameters, and a condition classifier capable of identifying a noise condition based on the features and on third parameters. The first parameters of the feature extractor and the second parameters of the speaker classifier are trained to minimize a speaker classification loss, the first parameters of the feature extractor are further trained to maximize a condition classification loss, and the third parameters of the condition classifier are trained to minimize the condition classification loss.
Type: Grant
Filed: November 30, 2021
Date of Patent: November 21, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Meng, Yong Zhao, Jinyu Li, Yifan Gong
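The min/max structure of the three training objectives can be written out explicitly. The minus sign on the feature extractor's condition term realizes "maximize the condition classification loss" (as in adversarial/gradient-reversal training); the scaling factor `alpha` is an assumed hyperparameter, not taken from the abstract:

```python
def adversarial_objectives(speaker_loss, condition_loss, alpha=1.0):
    """Per-module training objectives for the setup described above.
    Each module's parameters are updated to minimize its own entry:
    the speaker classifier and condition classifier minimize their
    classification losses, while the feature extractor minimizes the
    speaker loss and simultaneously maximizes the condition loss."""
    return {
        "speaker_classifier": speaker_loss,                          # minimize
        "condition_classifier": condition_loss,                      # minimize
        "feature_extractor": speaker_loss - alpha * condition_loss,  # min / max
    }
```

Driving the condition loss up through the extractor while the condition classifier drives it down is what pushes the extracted features toward condition invariance.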
-
Publication number: 20230317063
Abstract: The computing system trains an end-to-end (E2E) automatic speech recognition (ASR) model, using a transformer-transducer-based deep neural network that comprises a transformer encoder network and a transducer predictor network. The E2E ASR model is trained to have one or more adjustable hyperparameters that are configured to dynamically adjust an efficiency or a performance of the E2E ASR model when the E2E ASR model is deployed onto a device or executed by the device, by identifying one or more conditions of the device associated with computational power of the device and setting at least one of the one or more adjustable hyperparameters based on one or more conditions of the device.
Type: Application
Filed: June 8, 2023
Publication date: October 5, 2023
Inventors: Yu WU, Jinyu LI, Shujie LIU, Xie CHEN, Chengyi WANG
-
Patent number: 11776548
Abstract: Embodiments may include determination, for each of a plurality of speech frames associated with an acoustic feature, of a phonetic feature based on the associated acoustic feature, generation of one or more two-dimensional feature maps based on the plurality of phonetic features, input of the one or more two-dimensional feature maps to a trained neural network to generate a plurality of speaker embeddings, and aggregation of the plurality of speaker embeddings into a speaker embedding based on respective weights determined for each of the plurality of speaker embeddings, wherein the speaker embedding is associated with an identity of the speaker.
Type: Grant
Filed: February 7, 2022
Date of Patent: October 3, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Yong Zhao, Tianyan Zhou, Jinyu Li, Yifan Gong, Jian Wu, Zhuo Chen
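The final aggregation step, combining many speaker embeddings into one using per-embedding weights, is commonly implemented as attentive pooling. The abstract does not say how the weights are produced; here they are softmax-normalized from given scores, which in a real system would come from a learned attention layer:

```python
import math

def attentive_aggregation(embeddings, scores):
    """Aggregate per-frame speaker embeddings into a single utterance-level
    speaker embedding. scores holds one raw attention score per embedding;
    softmax normalization makes the weights sum to one."""
    exp_scores = [math.exp(s) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    dim = len(embeddings[0])
    return [sum(w * emb[d] for w, emb in zip(weights, embeddings))
            for d in range(dim)]
```

With equal scores this reduces to plain average pooling; unequal scores let the model emphasize frames that carry more speaker-discriminative information.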
-
Patent number: 11770959
Abstract: A pixel array, a display apparatus and a fine metal mask are provided. The pixel array includes first basic pixel units in odd-numbered rows and second basic pixel units in even-numbered rows; each first basic pixel unit includes first, second and third sub-pixel groups sequentially in a row direction; each second basic pixel unit includes the third, first and second sub-pixel groups sequentially in the row direction; the first, second and third sub-pixel groups include two first, two second and two third sub-pixels in a column direction, respectively; the first, second and third sub-pixels have different colors; the first and second basic pixel units in the even-numbered and odd-numbered rows are aligned in the column direction, respectively; in the first and second basic pixel units in two adjacent rows, the third sub-pixel groups in the two rows are staggered with each other in the row direction.
Type: Grant
Filed: October 29, 2021
Date of Patent: September 26, 2023
Assignees: Chengdu BOE Optoelectronics Technology Co., Ltd., BOE TECHNOLOGY GROUP CO., LTD.
Inventors: Peng Cao, Wei Zhang, Yamin Yang, Jianchao Zhang, Guang Jin, Jinyu Li
-
Publication number: 20230298566
Abstract: Systems and methods are provided for obtaining, training, and using an end-to-end AST model based on a neural transducer, the end-to-end AST model comprising at least (i) an acoustic encoder which is configured to receive and encode audio data, (ii) a prediction network which is integrated in a parallel model architecture with the acoustic encoder in the end-to-end AST model, and (iii) a joint layer which is integrated in series with the acoustic encoder and prediction network. The end-to-end AST model is configured to generate a transcription in the second language of input audio data in the first language such that the acoustic encoder learns a plurality of temporal processing paths.
Type: Application
Filed: March 15, 2022
Publication date: September 21, 2023
Inventors: Jinyu LI, Jian XUE, Matthew John POST, Peidong WANG
-
Publication number: 20230289536
Abstract: Solutions for on-device streaming inverse text normalization (ITN) include: receiving a stream of tokens, each token representing an element of human speech; tagging, by a tagger that can work in a streaming manner (e.g., a neural network), the stream of tokens with one or more tags of a plurality of tags to produce a tagged stream of tokens, each tag of the plurality of tags representing a different normalization category of a plurality of normalization categories; based on at least a first tag representing a first normalization category, converting, by a first language converter of a plurality of category-specific natural language converters (e.g., weighted finite state transducers, WFSTs), at least one token of the tagged stream of tokens, from a first lexical language form, to a first natural language form; and based on at least the first natural language form, outputting a natural language representation of the stream of tokens.
Type: Application
Filed: March 11, 2022
Publication date: September 14, 2023
Inventors: Yashesh GAUR, Nicholas KIBRE, Issac J. ALPHONSO, Jian XUE, Jinyu LI, Piyush BEHRE, Shawn CHANG
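The two-stage pipeline (a streaming tagger followed by category-specific converters) can be illustrated with a toy number category. The rule-based tagger and dictionary converter below stand in for the neural tagger and WFSTs named in the abstract; they are illustrative assumptions only:

```python
DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}

def tag_tokens(tokens):
    """Stand-in for the streaming tagger: label each spoken token with a
    normalization category."""
    return [(t, "NUMBER" if t in DIGITS else "PLAIN") for t in tokens]

def convert(tagged_tokens):
    """Stand-in for the category-specific converters: contiguous runs of
    NUMBER tokens collapse from lexical form ('one two') to natural form
    ('12'); PLAIN tokens pass through unchanged."""
    out, run = [], []
    for token, tag in tagged_tokens:
        if tag == "NUMBER":
            run.append(DIGITS[token])
        else:
            if run:
                out.append("".join(run))
                run = []
            out.append(token)
    if run:
        out.append("".join(run))
    return " ".join(out)
```

Splitting tagging from conversion is what lets each normalization category (numbers, dates, currency, and so on) get its own specialized converter.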
-
Patent number: 11735190
Abstract: To generate substantially domain-invariant and speaker-discriminative features, embodiments may operate to extract features from input data based on a first set of parameters, generate outputs based on the extracted features and on a second set of parameters, and identify words represented by the input data based on the outputs, wherein the first set of parameters and the second set of parameters have been trained to minimize a network loss associated with the second set of parameters, wherein the first set of parameters has been trained to maximize the domain classification loss of a network comprising 1) an attention network to determine, based on a third set of parameters, relative importances of features extracted based on the first parameters to domain classification and 2) a domain classifier to classify a domain based on the extracted features, the relative importances, and a fourth set of parameters, and wherein the third set of parameters and the fourth set of parameters have been trained to minimize the domain classification loss.
Type: Grant
Filed: October 5, 2021
Date of Patent: August 22, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Meng, Jinyu Li, Yifan Gong
-
Patent number: 11715462
Abstract: A computing system is configured to generate a transformer-transducer-based deep neural network. The transformer-transducer-based deep neural network comprises a transformer encoder network and a transducer predictor network. The transformer encoder network has a plurality of layers, each of which includes a multi-head attention network sublayer and a feed-forward network sublayer. The computing system trains an end-to-end (E2E) automatic speech recognition (ASR) model, using the transformer-transducer-based deep neural network. The E2E ASR model has one or more adjustable hyperparameters that are configured to dynamically adjust an efficiency or a performance of the E2E ASR model when the E2E ASR model is deployed onto a device or executed by the device.
Type: Grant
Filed: April 29, 2021
Date of Patent: August 1, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Yu Wu, Jinyu Li, Shujie Liu, Xie Chen, Chengyi Wang
-
Publication number: 20230215439
Abstract: The disclosure herein describes using a transcript generation model for generating a transcript from a multi-speaker audio stream. Audio data including overlapping speech of a plurality of speakers is obtained and a set of frame embeddings are generated from audio data frames of the obtained audio data using an audio data encoder. A set of words and channel change (CC) symbols are generated from the set of frame embeddings using a transcript generation model. The CC symbols are included between pairs of adjacent words that are spoken by different people at the same time. The set of words and CC symbols are transformed into a plurality of transcript lines, wherein words of the set of words are sorted into transcript lines based on the CC symbols, and a multi-speaker transcript is generated based on the plurality of transcript lines. The inclusion of CC symbols by the model enables efficient, accurate multi-speaker transcription.
Type: Application
Filed: December 31, 2021
Publication date: July 6, 2023
Inventors: Naoyuki KANDA, Takuya YOSHIOKA, Zhuo CHEN, Jinyu LI, Yashesh GAUR, Zhong MENG, Xiaofei WANG, Xiong XIAO
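The last step, sorting words into transcript lines based on the CC symbols, can be sketched for the two-speaker case: each CC symbol between adjacent words switches which line the following words are routed to. The toggle between exactly two channels is a simplifying assumption; the publication covers the general multi-speaker case:

```python
CC = "<cc>"  # assumed textual form of the channel change symbol

def split_into_transcript_lines(output_stream):
    """Sort a model output stream of words and CC symbols into per-channel
    transcript lines (two-speaker sketch)."""
    lines, current = [[], []], 0
    for symbol in output_stream:
        if symbol == CC:
            current = 1 - current  # a CC symbol flips the active channel
        else:
            lines[current].append(symbol)
    return [" ".join(words) for words in lines if words]
```

Because the CC symbols are emitted inline by the same model that emits the words, no separate diarization pass is needed to untangle the overlapped speech.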
-
Patent number: 11691147
Abstract: A digital microfluidic chip and a digital microfluidic system. The digital microfluidic chip comprises: an upper substrate and a lower substrate arranged opposite to each other; multiple driving circuits and multiple addressing circuits disposed between the lower substrate and the upper substrate; and a control circuit, electrically connected to the driving circuits and the addressing circuits. The control circuit is configured to apply, in a driving stage, a driving voltage to each driving circuit, such that a droplet is controlled to move inside a droplet accommodation space according to a set path, to measure, in a detection stage, after a bias voltage is applied to each addressing circuit, a charge loss amount of each addressing circuit, and to determine the position of the droplet according to the charge loss amount. The charge loss amount of each addressing circuit is related to the intensity of received external light.
Type: Grant
Filed: July 26, 2019
Date of Patent: July 4, 2023
Assignees: Beijing BOE Optoelectronics Technology Co., Ltd., BOE Technology Group Co., Ltd.
Inventors: Mingyang Lv, Yue Li, Yanchen Li, Jinyu Li, Dawei Feng, Yu Zhao, Dong Wang, Wang Guo, Hailong Wang, Yue Geng, Peizhi Cai, Fengchun Pang, Le Gu, Chuncheng Che, Haochen Cui, Yingying Zhao, Nan Zhao, Yuelei Xiao, Hui Liao
-
Publication number: 20230186919
Abstract: Systems, methods, and devices are provided for generating and using text-to-speech (TTS) data for improved speech recognition models. A main model is trained with keyword-independent baseline training data. In some instances, acoustic and language model sub-components of the main model are modified with new TTS training data. In some instances, the new TTS training data is obtained from a multi-speaker neural TTS system for a keyword that is underrepresented in the baseline training data. In some instances, the new TTS training data is used for pronunciation learning and normalization of keyword-dependent confidence scores in keyword spotting (KWS) applications. In some instances, the new TTS training data is used for rapid speaker adaptation in speech recognition models.
Type: Application
Filed: February 10, 2023
Publication date: June 15, 2023
Inventors: Guoli YE, Yan HUANG, Wenning WEI, Lei HE, Eva SHARMA, Jian WU, Yao TIAN, Edward C. LIN, Yifan GONG, Rui ZHAO, Jinyu LI, William Maxwell GALE
-
Patent number: 11676006
Abstract: According to some embodiments, a universal modeling system may include a plurality of domain expert models to each receive raw input data (e.g., a stream of audio frames containing speech utterances) and provide a domain expert output based on the raw input data. A neural mixture component may then generate a weight corresponding to each domain expert model based on information created by the plurality of domain expert models (e.g., hidden features and/or row convolution). The weights might be associated with, for example, constrained scalar numbers, unconstrained scalar numbers, vectors, matrices, etc. An output layer may provide a universal modeling system output (e.g., an automatic speech recognition result) based on each domain expert output after being multiplied by the corresponding weight for that domain expert model.
Type: Grant
Filed: May 16, 2019
Date of Patent: June 13, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Amit Das, Jinyu Li, Changliang Liu, Yifan Gong
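The output-layer combination the abstract describes, each expert's output multiplied by its mixture weight and then combined, reduces to a weighted sum in the scalar-weight case. Scalar weights are only one of the options the abstract lists (vectors and matrices are also mentioned), and summation as the combination rule is an assumption:

```python
def universal_output(expert_outputs, weights):
    """Combine domain expert outputs into the universal modeling system
    output: multiply each expert's output vector by its scalar mixture
    weight and sum the results element-wise."""
    dim = len(expert_outputs[0])
    return [sum(w * out[d] for w, out in zip(weights, expert_outputs))
            for d in range(dim)]
```

In the full system, the weights would come from the neural mixture component conditioned on the experts' hidden features, so the blend can shift toward whichever domain expert best matches the current input.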
-
Publication number: 20230180565
Abstract: A pixel array, a display apparatus and a fine metal mask are provided. The pixel array includes first basic pixel units in odd-numbered rows and second basic pixel units in even-numbered rows; each first basic pixel unit includes first, second and third sub-pixel groups sequentially in a row direction; each second basic pixel unit includes the third, first and second sub-pixel groups sequentially in the row direction; the first, second and third sub-pixel groups include two first, two second and two third sub-pixels in a column direction, respectively; the first, second and third sub-pixels have different colors; the first and second basic pixel units in the even-numbered and odd-numbered rows are aligned in the column direction, respectively; in the first and second basic pixel units in two adjacent rows, the third sub-pixel groups in the two rows are staggered with each other in the row direction.
Type: Application
Filed: October 29, 2021
Publication date: June 8, 2023
Inventors: Peng CAO, Wei ZHANG, Yamin YANG, Jianchao ZHANG, Guang JIN, Jinyu LI