Patents by Inventor Jinyu Li
Jinyu Li has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250140238
Abstract: Systems and methods are provided for enhancing the speech modality in a large language model (LLM) and for retaining in-context learning capabilities without overfitting to trained tasks. Systems obtain a first set of training data comprising tuples of a sample of speech combined with synthetically generated pairings of speech comprehension test questions and answers that correspond to the sample of speech and obtain a second set of training data comprising pairings of automatic speech recognition data. Systems generate and align a first set of encodings of the first set of training data and a second set of encodings of the second set of training data. Systems train the LLM on a greater amount of the first set of training data than the second set of training data and use the trained LLM to perform a natural language processing task.
Type: Application
Filed: February 28, 2024
Publication date: May 1, 2025
Inventors: Yashesh GAUR, Jing PAN, Zhuo CHEN, Jian WU, Jinyu LI, Sunit SIVASANKARAN
-
Patent number: 12277927
Abstract: Systems and methods are provided for obtaining, training, and using an end-to-end AST model based on a neural transducer, the end-to-end AST model comprising at least (i) an acoustic encoder which is configured to receive and encode audio data, (ii) a prediction network which is integrated in a parallel model architecture with the acoustic encoder in the end-to-end AST model, and (iii) a joint layer which is integrated in series with the acoustic encoder and prediction network. The end-to-end AST model is configured to generate a transcription in the second language of input audio data in the first language such that the acoustic encoder learns a plurality of temporal processing paths.
Type: Grant
Filed: March 15, 2022
Date of Patent: April 15, 2025
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jinyu Li, Jian Xue, Matthew John Post, Peidong Wang
-
Publication number: 20250099661
Abstract: Disclosed are an oxygenator and an extracorporeal membrane oxygenation device. The oxygenator includes a housing; an oxygenation chamber, arranged in the housing, and having a blood flow pipeline extending through a blood inlet and a blood outlet; and a partition plate, arranged between the housing and the oxygenation chamber, where the partition plate is arranged in a same direction as the upper end cover and divides the interior of the housing into a heat medium chamber and a gas chamber. The oxygenator combines the design of a heat medium chamber and a gas chamber to perform brand-new optimization design on a blood flow path, a gas pipeline and a heat medium pipeline of a membrane lung, so as to obtain the best hemodynamic performance, uniform distribution of internal flow fields and pressure fields, small flow retention zone, low blood flow resistance and high gas blood exchange efficiency and heat exchange efficiency.
Type: Application
Filed: September 21, 2023
Publication date: March 27, 2025
Inventors: Zihua SU, Mingzhou XU, Minghao YUE, Jinian LI, Yawei WANG, Shihang LIN, Yubo FAN, Zengsheng CHEN, Huichao LIU, Shiyao ZHANG, Yake CHENG, Jinyu LI, Wenjie YU
-
Patent number: 12249336
Abstract: Embodiments are provided for building a configurable multilingual model. A computing system obtains a plurality of language-specific automatic speech recognition modules and a universal automatic speech recognition module trained on a multi-language training dataset comprising training data corresponding to each of the plurality of different languages. The computing system then compiles the universal automatic speech recognition module with the plurality of language-specific automatic speech recognition modules to generate a configurable multilingual model that is configured to selectively and dynamically utilize a sub-set of the plurality of language-specific automatic speech recognition modules with the universal automatic speech recognition module to process audio content in response to user input identifying one or more target languages associated with the audio content.
Type: Grant
Filed: June 29, 2021
Date of Patent: March 11, 2025
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jinyu Li, Long Zhou, Xie Sun, Shujie Liu
-
Patent number: 12205596
Abstract: Systems, methods, and devices are provided for generating and using text-to-speech (TTS) data for improved speech recognition models. A main model is trained with keyword independent baseline training data. In some instances, acoustic and language model sub-components of the main model are modified with new TTS training data. In some instances, the new TTS training data is obtained from a multi-speaker neural TTS system for a keyword that is underrepresented in the baseline training data. In some instances, the new TTS training data is used for pronunciation learning and normalization of keyword dependent confidence scores in keyword spotting (KWS) applications. In some instances, the new TTS training data is used for rapid speaker adaptation in speech recognition models.
Type: Grant
Filed: February 10, 2023
Date of Patent: January 21, 2025
Assignee: Microsoft Technology Licensing, LLC
Inventors: Guoli Ye, Yan Huang, Wenning Wei, Lei He, Eva Sharma, Jian Wu, Yao Tian, Edward C. Lin, Yifan Gong, Rui Zhao, Jinyu Li, William Maxwell Gale
-
Publication number: 20250005339
Abstract: Representative embodiments disclose machine learning classifiers used in scenarios such as speech recognition, image captioning, machine translation, or other sequence-to-sequence embodiments. The machine learning classifiers have a plurality of time layers, each layer having a time processing block and a depth processing block. The time processing block is a recurrent neural network such as a Long Short Term Memory (LSTM) network. The depth processing blocks can be an LSTM network, a gated Deep Neural Network (DNN) or a maxout DNN. The depth processing blocks account for the hidden states of each time layer and use summarized layer information for final input signal feature classification. An attention layer can also be used between the top depth processing block and the output layer.
Type: Application
Filed: September 10, 2024
Publication date: January 2, 2025
Inventors: Jinyu LI, Liang LU, Changliang LIU, Yifan GONG
-
Publication number: 20240412736
Abstract: Systems and methods are provided for instantiating, modifying, adapting, and using a factorized neural transducer for multi-speaker automatic speech recognition. The factorized neural transducer includes a vocabulary predictor with multiple hidden states to process speech from different speakers, a non-vocabulary predictor that facilitates the prediction of channel change tokens indicating a speaker change in input speech data, an encoder used to encode acoustic features of the input speech data, and a joint network.
Type: Application
Filed: June 8, 2023
Publication date: December 12, 2024
Inventors: Jian WU, Jinyu LI, Zhuo CHEN, Naoyuki KANDA, Takuya YOSHIOKA
-
Patent number: 12164069
Abstract: Provided are a detection substrate and a ray detector. The detection substrate includes: a base substrate; a plurality of detection pixels located on the base substrate, at least one detection pixel serves as a detection marking pixel. The detection marking pixel includes: a storage capacitor, a first electrode plate of the storage capacitor coupled to a bias voltage end; a discharge circuit configured to write a signal of the bias voltage end into a second electrode plate of the storage capacitor under the control of a first scanning signal end; and a reading circuit coupled to an external reading circuit, the reading circuit configured to write the voltage of the second electrode plate of the storage capacitor into the external reading circuit and write a reference signal of the external reading circuit into the second electrode plate of the storage capacitor under the control of a second scanning signal end.
Type: Grant
Filed: May 12, 2021
Date of Patent: December 10, 2024
Assignees: Beijing BOE Sensor Technology Co., Ltd., BOE Technology Group Co., Ltd.
Inventors: Jianxing Shang, Jinyu Li, Kunjing Chung, Jingjie Su, Xuecheng Hou, Zhenwu Jiang, Zhenyu Wang, Shenkang Wu, Haiyang Lu, Jian Ma
-
Publication number: 20240395240
Abstract: A computer implemented method includes receiving speech data representative of speech in a first language. The speech data is divided into chunks of speech data, each chunk comprising multiple temporally consecutive frames of acoustic information. Each temporally consecutive chunk of data is processed using beam search on each frame to identify candidate language tokens representing a second language different from the first language. A best candidate language token(s) is selected for each chunk as processed. The selected best candidate language token or tokens for each chunk of data is committed as a prefix for a next temporally consecutive chunk of data.
Type: Application
Filed: May 23, 2023
Publication date: November 28, 2024
Inventors: Junkun Chen, Jinyu Li, Peidong Wang, Jian Xue
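The chunk-and-commit loop described in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the application: the `score_fn` callback stands in for the translation model, `BLANK` is a hypothetical "emit nothing" token, and the per-frame pruning is a plain top-k beam.

```python
from typing import Callable, List, Tuple

BLANK = ""  # hypothetical "emit nothing" token; an assumption of this sketch

# Hypothetical model interface: given the tokens so far and one frame,
# return candidate (token, log_probability) pairs.
ScoreFn = Callable[[List[str], object], List[Tuple[str, float]]]


def chunked(frames: List[object], chunk_size: int) -> List[List[object]]:
    """Split frames into temporally consecutive chunks."""
    return [frames[i:i + chunk_size] for i in range(0, len(frames), chunk_size)]


def translate_streaming(frames: List[object], chunk_size: int,
                        beam_width: int, score_fn: ScoreFn) -> List[str]:
    committed: List[str] = []  # prefix carried from chunk to chunk
    for chunk in chunked(frames, chunk_size):
        # Every hypothesis in a chunk starts from the committed prefix.
        beams: List[Tuple[List[str], float]] = [(list(committed), 0.0)]
        for frame in chunk:
            candidates = []
            for tokens, score in beams:
                for tok, logp in score_fn(tokens, frame):
                    extended = tokens if tok == BLANK else tokens + [tok]
                    candidates.append((extended, score + logp))
            if candidates:  # keep the old beams if the model offers nothing
                candidates.sort(key=lambda c: c[1], reverse=True)
                beams = candidates[:beam_width]
        # Commit the best hypothesis as the prefix for the next chunk.
        committed = beams[0][0]
    return committed
```

Committing only at chunk boundaries means a hypothesis can still be revised within a chunk but never after it, which is the trade-off between latency and accuracy that a chunk size would control in a sketch like this.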
-
Patent number: 12149041
Abstract: A terahertz radiator is based on coherent Smith-Purcell radiation amplified by stimulation. The terahertz radiator includes an electron emission source configured to emit electron beams and a pumping source configured to emit pumping signals. The pumping signal interacts with a primary grating structure to obtain preliminarily bunched electrons. The preliminarily bunched electrons interact with the primary grating structure to generate coherent Smith-Purcell radiation. The coherent Smith-Purcell radiation and the pumping signals vertically resonate in a primary resonant cavity structure, so that the electron bunching density is increased, and in turn, the coherent Smith-Purcell radiation is enhanced. A positive feedback process is formed by free electrons and the coherent Smith-Purcell radiation, and the coherent Smith-Purcell radiation amplified by stimulation and periodic bunched electron bunches are obtained.
Type: Grant
Filed: June 23, 2021
Date of Patent: November 19, 2024
Assignee: Tsinghua University
Inventors: Fang Liu, Yuechai Lin, Jinyu Li, Yidong Huang, Kaiyu Cui, Xue Feng, Wei Zhang
-
Patent number: 12148720
Abstract: A detection substrate and manufacturing method thereof, a flat panel detector and manufacturing method thereof. The detection substrate includes: a substrate including a detection region, a binding region, a controllable on-off region, and a cutting region; a plurality of detection units including transistors and photosensitive devices located in the detection region, a transistor includes a gate, a first electrode and a second electrode; a photosensitive device is connected to the first or second electrode; a plurality of conductive wires, one end is connected to the gate, and the other end is extended to the binding region; a conductive ring disposed in the cutting region; a plurality of detection wires, one end is connected to the conductive ring, the other end is connected to the conductive wires, the detection wires are passed through the controllable on-off region; the detection wires located in the controllable on-off region can have a disconnected state.
Type: Grant
Filed: March 24, 2021
Date of Patent: November 19, 2024
Assignees: Beijing BOE Sensor Technology Co., Ltd., BOE TECHNOLOGY GROUP CO., LTD.
Inventors: Bin Zhao, Shuai Xu, Binbin Xu, Xuecheng Hou, Jinyu Li, Ye Zhang, Chuncheng Che
-
Publication number: 20240357868
Abstract: Disclosed are a display panel and a display apparatus. The display panel includes: a base substrate, including a plurality of pixels; and a pixel defining layer, located on the base substrate and having a plurality of pixel openings; wherein the plurality of pixels are in one-to-one correspondence with the plurality of pixel openings; the plurality of pixels include first color pixels (spx1) and second color pixels (spx2); a decay rate of viewing angle brightness of pixel openings (KK1) of the first color pixels (spx1) in unit area is a first decay rate, a decay rate of viewing angle brightness of pixel openings (KK2) of the second color pixels (spx2) in unit area is a second decay rate.
Type: Application
Filed: June 29, 2022
Publication date: October 24, 2024
Inventors: Ruqin ZHANG, Jinyu LI, Zhimeng SHAO, Yige QI, Pingchuan ZENG, Chao KONG, Gaokun HUANG
-
Publication number: 20240337764
Abstract: A flat panel detector and a detecting apparatus. The flat panel detector includes a base substrate, and scanning lines, data lines, signal lines and detecting units in an array on the base substrate; each detecting unit includes a switch sub-circuit and a photosensitive device; control terminals of switch sub-circuits in detecting units in a same row are connected with a same scanning line; first terminals of switch sub-circuits in detecting units in a same column are connected with a same data line; in each detecting unit, a second terminal of the switch sub-circuit is connected with a first terminal of the photosensitive device; second terminals of photosensitive devices of the detecting units in the same column are connected with a same bias signal line, the bias signal lines are divided into groups, the bias signal lines in different groups are mutually insulated and connected with different driving chips.
Type: Application
Filed: August 3, 2022
Publication date: October 10, 2024
Inventors: Jinyu LI, Xuecheng HOU, Zhenyu WANG
-
Publication number: 20240302543
Abstract: The present disclosure provides a photoelectric detector and an electronic device. The photoelectric detector has a pixel region and a peripheral region surrounding the pixel region, includes a base substrate and a plurality of pixel units arranged on the base substrate and positioned in the pixel region; each pixel unit includes a thin film transistor, a photodiode and a storage capacitor; for each pixel unit, a first electrode of the thin film transistor is connected with a first electrode of the photodiode and a first electrode plate of the storage capacitor, a second electrode of the photodiode is connected with a first bias signal line, a second electrode plate of the storage capacitor is connected with a second bias signal line, the first bias signal line is electrically connected with the second bias signal line at a connection node located in the peripheral region.
Type: Application
Filed: April 28, 2022
Publication date: September 12, 2024
Inventors: Guan ZHANG, Jinyu LI, Zhenyu WANG, Zhenwu JIANG
-
Patent number: 12086704
Abstract: Representative embodiments disclose machine learning classifiers used in scenarios such as speech recognition, image captioning, machine translation, or other sequence-to-sequence embodiments. The machine learning classifiers have a plurality of time layers, each layer having a time processing block and a depth processing block. The time processing block is a recurrent neural network such as a Long Short Term Memory (LSTM) network. The depth processing blocks can be an LSTM network, a gated Deep Neural Network (DNN) or a maxout DNN. The depth processing blocks account for the hidden states of each time layer and use summarized layer information for final input signal feature classification. An attention layer can also be used between the top depth processing block and the output layer.
Type: Grant
Filed: November 3, 2021
Date of Patent: September 10, 2024
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Jinyu Li, Liang Lu, Changliang Liu, Yifan Gong
-
Publication number: 20240265924
Abstract: Embodiments are provided for building a configurable multilingual model. A computing system obtains a plurality of language-specific automatic speech recognition modules and a universal automatic speech recognition module trained on a multi-language training dataset comprising training data corresponding to each of the plurality of different languages. The computing system then compiles the universal automatic speech recognition module with the plurality of language-specific automatic speech recognition modules to generate a configurable multilingual model that is configured to selectively and dynamically utilize a sub-set of the plurality of language-specific automatic speech recognition modules with the universal automatic speech recognition module to process audio content in response to user input identifying one or more target languages associated with the audio content.
Type: Application
Filed: June 29, 2021
Publication date: August 8, 2024
Inventors: Jinyu LI, Long ZHOU, Xie SUN, Shujie LIU
-
Publication number: 20240252128
Abstract: A flat panel detector and an imaging system are provided. The flat panel detector includes a plurality of pixel units which include photosensitive pixel units and alignment pixel units. Each photosensitive pixel unit includes a photoelectric sensor configured to convert an incident light into an electrical signal so that a photosensitive pixel unit in which the photoelectric sensor is located has a grayscale that changes according to a real-time change of the incident light. Each alignment pixel unit is configured to have a fixed grayscale, and the fixed grayscale does not change according to the real-time change of the incident light. The alignment pixel units include first alignment pixel units and second alignment pixel units. Each first alignment pixel unit has a first fixed grayscale, and each second alignment pixel unit has a second fixed grayscale different from the first fixed grayscale.
Type: Application
Filed: April 11, 2024
Publication date: August 1, 2024
Inventors: Jinyu LI, Xuecheng HOU
-
Publication number: 20240257815
Abstract: The disclosure herein describes using a transcript generation model for generating a transcript from a multi-speaker audio stream. Audio data including overlapping speech of a plurality of speakers is obtained and a set of frame embeddings are generated from audio data frames of obtained audio data using an audio data encoder. A set of words and channel change (CC) symbols are generated from the set of frame embeddings using a transcript generation model. The CC symbols are included between pairs of adjacent words that are spoken by different people at the same time. The set of words and CC symbols are transformed into a plurality of transcript lines, wherein words of the set of words are sorted into transcript lines based on CC symbols, and a multi-speaker transcript is generated based on the plurality of transcript lines. The inclusion of CC symbols by the model enables efficient, accurate multi-speaker transcription.
Type: Application
Filed: April 10, 2024
Publication date: August 1, 2024
Inventors: Naoyuki KANDA, Takuya YOSHIOKA, Zhuo CHEN, Jinyu LI, Yashesh GAUR, Zhong MENG, Xiaofei WANG, Xiong XIAO
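The step of sorting words into transcript lines based on CC symbols can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: the `<cc>` token spelling, the fixed number of channels, and the rule that each CC symbol advances to the next channel in round-robin order are all hypothetical.

```python
from typing import List

CC = "<cc>"  # hypothetical spelling of the channel change symbol


def split_into_lines(tokens: List[str], num_channels: int = 2) -> List[str]:
    """Sort words into per-channel transcript lines.

    Assumption of this sketch: every CC symbol switches output to the next
    channel (wrapping around), so overlapping speech serialized as one token
    stream can be unzipped back into separate lines.
    """
    channels: List[List[str]] = [[] for _ in range(num_channels)]
    current = 0
    for tok in tokens:
        if tok == CC:
            current = (current + 1) % num_channels
        else:
            channels[current].append(tok)
    # Drop channels that never received a word.
    return [" ".join(words) for words in channels if words]
```

For example, the stream `hello how <cc> I <cc> are <cc> am <cc> you <cc> fine` yields the two lines `hello how are you` and `I am fine`.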
-
Publication number: 20240236522
Abstract: A detection substrate, a noise reduction method therefor and a detection device are disclosed. The detection substrate includes a base substrate, the base substrate including a noise reduction region; a plurality of first photosensitive devices in the noise reduction region; a plurality of reading lines and a plurality of scanning lines, where the plurality of reading lines and the plurality of scanning lines are arranged in different layers from the plurality of first photosensitive devices, and the plurality of reading lines and the plurality of scanning lines are in different layers and crossing over each other; and a plurality of first transistors in the noise reduction region, where the first transistor is disconnected from at least one of the first photosensitive device, the reading line or the scanning line.
Type: Application
Filed: March 22, 2024
Publication date: July 11, 2024
Inventors: Zhi DING, Xuecheng HOU, Zhenyu WANG, Jinyu LI
-
Publication number: 20240212394
Abstract: The disclosure herein describes systems and methods for object data storage. In some examples, the method includes generating a profile for an object in a directory, the profile including a first feature vector corresponding to the object and a global unique identifier (GUID) corresponding to the first feature vector in the profile; generating a search scope, the search scope including at least the GUID corresponding to the profile; generating a second feature vector from a live image scan; matching the generated second feature vector from the live image scan to the first feature vector using the generated search scope; identifying the GUID corresponding to the first feature vector that matches the second feature vector; and outputting information corresponding to the object of the profile identified by the GUID corresponding to the first feature vector.
Type: Application
Filed: March 7, 2024
Publication date: June 27, 2024
Inventors: William Louis THOMAS, Jinyu LI, Yang CHEN, Youyou HAN OPPENLANDER, Steven John BOWLES, Qingfen LIN
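The profile, search scope, and matching steps in this abstract can be sketched as follows. This is a rough illustration, not the application's implementation: the in-memory dictionary used as the directory, cosine similarity as the matching rule, the 0.8 threshold, and the function names are all assumptions made for the example.

```python
import math
import uuid
from typing import Dict, Iterable, List, Optional

Directory = Dict[str, dict]  # GUID -> profile; a stand-in for a real directory


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def make_profile(directory: Directory, name: str, feature: List[float]) -> str:
    """Store a profile with its feature vector and return the new GUID."""
    guid = str(uuid.uuid4())
    directory[guid] = {"name": name, "feature": feature}
    return guid


def match(directory: Directory, scope: Iterable[str], probe: List[float],
          threshold: float = 0.8) -> Optional[str]:
    """Return the GUID in scope whose stored vector best matches the probe."""
    best_guid, best_score = None, threshold
    for guid in scope:  # only profiles listed in the search scope are compared
        profile = directory.get(guid)
        if profile is None:
            continue
        score = cosine(profile["feature"], probe)
        if score >= best_score:
            best_guid, best_score = guid, score
    return best_guid
```

Restricting the comparison to GUIDs in the scope keeps the search small, and the returned GUID can then be used to look up the profile's information in the directory.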