Patents by Inventor Jinyu Li
Jinyu Li has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11961334
Abstract: The disclosure herein describes systems and methods for object data storage. In some examples, the method includes generating a profile for an object in a directory, the profile including a first feature vector corresponding to the object and a global unique identifier (GUID) corresponding to the first feature vector in the profile; generating a search scope, the search scope including at least the GUID corresponding to the profile; generating a second feature vector from a live image scan; matching the generated second feature vector from the live image scan to the first feature vector using the generated search scope; identifying the GUID corresponding to the first feature vector that matches the second feature vector; and outputting information corresponding to the object of the profile identified by the GUID corresponding to the first feature vector.
Type: Grant
Filed: May 26, 2021
Date of Patent: April 16, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: William Louis Thomas, Jinyu Li, Yang Chen, Youyou Han Oppenlander, Steven John Bowles, Qingfen Lin
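The matching flow the abstract describes (profiles keyed by GUID, a search scope, and feature-vector matching) can be sketched as below. The patent does not specify a similarity metric; cosine similarity, the 0.9 threshold, and all helper names are illustrative assumptions:

```python
import math
import uuid

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_profile(feature_vector):
    """Pair a feature vector with a freshly generated GUID, as in the claims."""
    return {"guid": str(uuid.uuid4()), "vector": feature_vector}

def match_in_scope(query_vector, search_scope, threshold=0.9):
    """Return the GUID of the best-matching profile in the search scope,
    or None if nothing clears the (assumed) similarity threshold."""
    best_guid, best_score = None, threshold
    for profile in search_scope:
        score = cosine_similarity(query_vector, profile["vector"])
        if score >= best_score:
            best_guid, best_score = profile["guid"], score
    return best_guid
```

A second feature vector extracted from a live image scan would be passed as `query_vector`, and the returned GUID then keys the lookup of the profile's output information.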
-
Publication number: 20240095252
Abstract: Methods and systems are provided for ranking search results and generating a presentation. In some implementations, a search system generates a presentation based on a search query. In some implementations, a search system ranks search results based on data stored in a knowledge graph. In some implementations, a search system identifies a modifying concept such as a superlative in a received search query, and determines ranking properties based on the modifying concept.
Type: Application
Filed: November 28, 2023
Publication date: March 21, 2024
Inventors: Chen Zhou, Chen Ding, David Francois Huynh, JinYu Lou, Yanlai Huang, Hongda Shen, Guanghua Li, Yiming Li, Yangyang Chai
-
Patent number: 11933724
Abstract: Disclosed are a device and a method for complex gas mixture detection based on optical-path-adjustable spectrum detection. The device includes: a light source configured to generate an incident beam and emit it into an optical gas cell; the optical gas cell, including a cavity configured to accommodate a gas sample, a reflection module group configured to reflect the incident beam, and a track arranged in the cavity, where the track is consistent with the light path of the beam in the cavity; a detector module that is connected with the track in a relatively movable manner and is configured to receive light beams and obtain spectral data, where the optical path is changed by moving the detector module relative to the track; and a data acquisition unit configured to acquire the spectral data obtained by the detector module.
Type: Grant
Filed: December 1, 2023
Date of Patent: March 19, 2024
Assignee: Hubei University of Technology
Inventors: Yin Zhang, Xiaoxing Zhang, Ran Zhuo, Zhiming Huang, Guozhi Zhang, Dibo Wang, Shuangshuang Tian, Mingli Fu, Yunjian Wu, Yan Luo, Shuo Jin, Jinyu Pu, Yalong Li
-
Patent number: 11915686
Abstract: Embodiments are associated with a speaker-independent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-independent attention-based encoder-decoder model associated with a first output distribution, and a speaker-dependent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-dependent attention-based encoder-decoder model associated with a second output distribution. The second attention-based encoder-decoder model is trained to classify output tokens based on input speech frames of a target speaker and simultaneously trained to maintain a similarity between the first output distribution and the second output distribution.
Type: Grant
Filed: January 5, 2022
Date of Patent: February 27, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong
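One common way to "maintain a similarity between the first output distribution and the second output distribution" during adaptation is to interpolate the token classification loss with a KL-divergence term against the speaker-independent model. The abstract does not name the similarity measure, so KL divergence and the interpolation weight `rho` are assumptions for illustration:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over output tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def adaptation_loss(sd_dist, si_dist, target_index, rho=0.5):
    """Combine cross-entropy against the target token with a KL term that
    keeps the speaker-dependent (SD) distribution close to the
    speaker-independent (SI) one. rho is an assumed interpolation weight."""
    cross_entropy = -math.log(sd_dist[target_index])
    similarity_term = kl_divergence(si_dist, sd_dist)
    return (1 - rho) * cross_entropy + rho * similarity_term
```

Setting `rho = 0` recovers plain fine-tuning on the target speaker; larger values pull the adapted model's outputs back toward the speaker-independent distribution.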
-
Publication number: 20240052689
Abstract: The present disclosure provides a device for separately arranging vacuum glass supports, which comprises a base, a vibrator, a separation chamber assembly, a separation actuator, a feed tube, and a drive device. The separation chamber assembly is provided with a feed inlet, an accommodation cavity, and a dispensing outlet. The feed inlet communicates with the accommodation cavity. The dispensing outlet communicates with the feed tube. The separation actuator is arranged directly below the accommodation cavity and is capable of moving reciprocally with respect to the accommodation cavity. A recess capable of accommodating one support is provided at the upper edge of the separation actuator. Thus, at each reciprocating movement, the separation actuator transports one support from the accommodation cavity to the dispensing outlet via the recess, which avoids dispensing multiple supports in one operation.
Type: Application
Filed: March 4, 2021
Publication date: February 15, 2024
Applicant: LUOYANG LANDGLASS TECHNOLOGY CO., LTD.
Inventors: Yan ZHAO, Zhangsheng WANG, Jinyu LI, Haiyan WU
-
Patent number: 11862144
Abstract: A computer system is provided that includes a processor configured to store a set of audio training data that includes a plurality of audio segments and metadata indicating a word or phrase associated with each audio segment. For a target training statement of a set of structured text data, the processor is configured to generate a concatenated audio signal that matches a word content of a target training statement by comparing the words or phrases of a plurality of text segments of the target training statement to respective words or phrases of audio segments of the stored set of audio training data, selecting a plurality of audio segments from the set of audio training data based on a match in the words or phrases between the plurality of text segments of the target training statement and the selected plurality of audio segments, and concatenating the selected plurality of audio segments.
Type: Grant
Filed: December 16, 2020
Date of Patent: January 2, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Rui Zhao, Jinyu Li, Yifan Gong
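The compare-select-concatenate loop of the abstract can be sketched as a lookup from text segments into a bank of word- or phrase-labeled audio segments. Greedy longest-phrase-first matching and the `audio_bank` structure are illustrative assumptions, not the patented selection procedure:

```python
def build_concatenated_audio(target_statement, audio_bank):
    """Assemble an audio signal whose word content matches the target
    training statement by concatenating stored audio segments.
    audio_bank maps a word/phrase (the segment's metadata) to its samples."""
    words = target_statement.lower().split()
    max_len = max(len(key.split()) for key in audio_bank)
    samples, i = [], 0
    while i < len(words):
        # Prefer the longest stored phrase starting at position i.
        for span in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + span])
            if phrase in audio_bank:
                samples.extend(audio_bank[phrase])
                i += span
                break
        else:
            raise KeyError(f"no stored audio segment matches {words[i]!r}")
    return samples
```

The resulting concatenated signal pairs with the target statement's text to form a synthetic training example.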
-
Publication number: 20230378100
Abstract: A detection substrate and a manufacturing method thereof, and a flat panel detector and a manufacturing method thereof. The detection substrate includes: a substrate including a detection region, a binding region, a controllable on-off region, and a cutting region; a plurality of detection units including transistors and photosensitive devices located in the detection region, where a transistor includes a gate, a first electrode, and a second electrode, and a photosensitive device is connected to the first or second electrode; a plurality of conductive wires, each with one end connected to the gate and the other end extending to the binding region; a conductive ring disposed in the cutting region; and a plurality of detection wires, each with one end connected to the conductive ring and the other end connected to the conductive wires, where the detection wires pass through the controllable on-off region, and the detection wires located in the controllable on-off region can have a disconnected state.
Type: Application
Filed: March 24, 2021
Publication date: November 23, 2023
Applicants: Beijing BOE Sensor Technology Co., Ltd., BOE TECHNOLOGY GROUP CO., LTD.
Inventors: Bin ZHAO, Shuai XU, Binbin XU, Xuecheng HOU, Jinyu LI, Ye ZHANG, Chuncheng CHE
-
Patent number: 11823702
Abstract: To generate substantially condition-invariant and speaker-discriminative features, embodiments are associated with a feature extractor capable of extracting features from speech frames based on first parameters, a speaker classifier capable of identifying a speaker based on the features and on second parameters, and a condition classifier capable of identifying a noise condition based on the features and on third parameters. The first parameters of the feature extractor and the second parameters of the speaker classifier are trained to minimize a speaker classification loss, the first parameters of the feature extractor are further trained to maximize a condition classification loss, and the third parameters of the condition classifier are trained to minimize the condition classification loss.
Type: Grant
Filed: November 30, 2021
Date of Patent: November 21, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Meng, Yong Zhao, Jinyu Li, Yifan Gong
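The min/max structure of the three training objectives can be written out explicitly. The minus sign on the feature extractor's condition term realizes "maximize the condition classification loss" (as in adversarial/gradient-reversal training); the scaling factor `alpha` is an assumed hyperparameter, not taken from the abstract:

```python
def adversarial_objectives(speaker_loss, condition_loss, alpha=1.0):
    """Per-module training objectives for the setup described above.
    Each module's parameters are updated to minimize its own entry:
    the speaker classifier and condition classifier minimize their
    classification losses, while the feature extractor minimizes the
    speaker loss and simultaneously maximizes the condition loss."""
    return {
        "speaker_classifier": speaker_loss,                          # minimize
        "condition_classifier": condition_loss,                      # minimize
        "feature_extractor": speaker_loss - alpha * condition_loss,  # min / max
    }
```

Driving the condition loss up through the extractor while the condition classifier drives it down is what pushes the extracted features toward condition invariance.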
-
Publication number: 20230317063
Abstract: The computing system trains an end-to-end (E2E) automatic speech recognition (ASR) model, using a transformer-transducer-based deep neural network that comprises a transformer encoder network and a transducer predictor network. The E2E ASR model is trained to have one or more adjustable hyperparameters that are configured to dynamically adjust an efficiency or a performance of the E2E ASR model when the E2E ASR model is deployed onto a device or executed by the device, by identifying one or more conditions of the device associated with computational power of the device and setting at least one of the one or more adjustable hyperparameters based on one or more conditions of the device.
Type: Application
Filed: June 8, 2023
Publication date: October 5, 2023
Inventors: Yu WU, Jinyu LI, Shujie LIU, Xie CHEN, Chengyi WANG
-
Patent number: 11776548
Abstract: Embodiments may include determination, for each of a plurality of speech frames associated with an acoustic feature, of a phonetic feature based on the associated acoustic feature, generation of one or more two-dimensional feature maps based on the plurality of phonetic features, input of the one or more two-dimensional feature maps to a trained neural network to generate a plurality of speaker embeddings, and aggregation of the plurality of speaker embeddings into a speaker embedding based on respective weights determined for each of the plurality of speaker embeddings, wherein the speaker embedding is associated with an identity of the speaker.
Type: Grant
Filed: February 7, 2022
Date of Patent: October 3, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Yong Zhao, Tianyan Zhou, Jinyu Li, Yifan Gong, Jian Wu, Zhuo Chen
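The final aggregation step, combining many speaker embeddings into one using per-embedding weights, is commonly implemented as attentive pooling. The abstract does not say how the weights are produced; here they are softmax-normalized from given scores, which in a real system would come from a learned attention layer:

```python
import math

def attentive_aggregation(embeddings, scores):
    """Aggregate per-frame speaker embeddings into a single utterance-level
    speaker embedding. scores holds one raw attention score per embedding;
    softmax normalization makes the weights sum to one."""
    exp_scores = [math.exp(s) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    dim = len(embeddings[0])
    return [sum(w * emb[d] for w, emb in zip(weights, embeddings))
            for d in range(dim)]
```

With equal scores this reduces to plain average pooling; unequal scores let the model emphasize frames that carry more speaker-discriminative information.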
-
Patent number: 11770959
Abstract: A pixel array, a display apparatus and a fine metal mask are provided. The pixel array includes first basic pixel units in odd-numbered rows and second basic pixel units in even-numbered rows; each first basic pixel unit includes first, second and third sub-pixel groups sequentially in a row direction; each second basic pixel unit includes the third, first and second sub-pixel groups sequentially in the row direction; the first, second and third sub-pixel groups include two first, two second and two third sub-pixels in a column direction, respectively; the first, second and third sub-pixels have different colors; the first and second basic pixel units in the even-numbered and odd-numbered rows are aligned in the column direction, respectively; in the first and second basic pixel units in two adjacent rows, the third sub-pixel groups in the two rows are staggered with each other in the row direction.
Type: Grant
Filed: October 29, 2021
Date of Patent: September 26, 2023
Assignees: Chengdu BOE Optoelectronics Technology Co., Ltd., BOE TECHNOLOGY GROUP CO., LTD.
Inventors: Peng Cao, Wei Zhang, Yamin Yang, Jianchao Zhang, Guang Jin, Jinyu Li
-
Publication number: 20230298566
Abstract: Systems and methods are provided for obtaining, training, and using an end-to-end AST model based on a neural transducer, the end-to-end AST model comprising at least (i) an acoustic encoder which is configured to receive and encode audio data, (ii) a prediction network which is integrated in a parallel model architecture with the acoustic encoder in the end-to-end AST model, and (iii) a joint layer which is integrated in series with the acoustic encoder and prediction network. The end-to-end AST model is configured to generate a transcription in the second language of input audio data in the first language such that the acoustic encoder learns a plurality of temporal processing paths.
Type: Application
Filed: March 15, 2022
Publication date: September 21, 2023
Inventors: Jinyu LI, Jian XUE, Matthew John POST, Peidong WANG
-
Publication number: 20230289536
Abstract: Solutions for on-device streaming inverse text normalization (ITN) include: receiving a stream of tokens, each token representing an element of human speech; tagging, by a tagger that can work in a streaming manner (e.g., a neural network), the stream of tokens with one or more tags of a plurality of tags to produce a tagged stream of tokens, each tag of the plurality of tags representing a different normalization category of a plurality of normalization categories; based on at least a first tag representing a first normalization category, converting, by a first language converter of a plurality of category-specific natural language converters (e.g., weighted finite state transducers, WFSTs), at least one token of the tagged stream of tokens, from a first lexical language form, to a first natural language form; and based on at least the first natural language form, outputting a natural language representation of the stream of tokens.
Type: Application
Filed: March 11, 2022
Publication date: September 14, 2023
Inventors: Yashesh GAUR, Nicholas KIBRE, Issac J. ALPHONSO, Jian XUE, Jinyu LI, Piyush BEHRE, Shawn CHANG
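The two-stage pipeline (a streaming tagger followed by category-specific converters) can be illustrated with a toy number category. The rule-based tagger and dictionary converter below stand in for the neural tagger and WFSTs named in the abstract; they are illustrative assumptions only:

```python
DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}

def tag_tokens(tokens):
    """Stand-in for the streaming tagger: label each spoken token with a
    normalization category."""
    return [(t, "NUMBER" if t in DIGITS else "PLAIN") for t in tokens]

def convert(tagged_tokens):
    """Stand-in for the category-specific converters: contiguous runs of
    NUMBER tokens collapse from lexical form ('one two') to natural form
    ('12'); PLAIN tokens pass through unchanged."""
    out, run = [], []
    for token, tag in tagged_tokens:
        if tag == "NUMBER":
            run.append(DIGITS[token])
        else:
            if run:
                out.append("".join(run))
                run = []
            out.append(token)
    if run:
        out.append("".join(run))
    return " ".join(out)
```

Splitting tagging from conversion is what lets each normalization category (numbers, dates, currency, and so on) get its own specialized converter.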
-
Patent number: 11735190
Abstract: To generate substantially domain-invariant and speaker-discriminative features, embodiments may operate to extract features from input data based on a first set of parameters, generate outputs based on the extracted features and on a second set of parameters, and identify words represented by the input data based on the outputs, wherein the first set of parameters and the second set of parameters have been trained to minimize a network loss associated with the second set of parameters, wherein the first set of parameters has been trained to maximize the domain classification loss of a network comprising 1) an attention network to determine, based on a third set of parameters, relative importances of features extracted based on the first parameters to domain classification and 2) a domain classifier to classify a domain based on the extracted features, the relative importances, and a fourth set of parameters, and wherein the third set of parameters and the fourth set of parameters have been trained to minimize the domain classification loss.
Type: Grant
Filed: October 5, 2021
Date of Patent: August 22, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Meng, Jinyu Li, Yifan Gong
-
Patent number: 11715462
Abstract: A computing system is configured to generate a transformer-transducer-based deep neural network. The transformer-transducer-based deep neural network comprises a transformer encoder network and a transducer predictor network. The transformer encoder network has a plurality of layers, each of which includes a multi-head attention network sublayer and a feed-forward network sublayer. The computing system trains an end-to-end (E2E) automatic speech recognition (ASR) model, using the transformer-transducer-based deep neural network. The E2E ASR model has one or more adjustable hyperparameters that are configured to dynamically adjust an efficiency or a performance of the E2E ASR model when the E2E ASR model is deployed onto a device or executed by the device.
Type: Grant
Filed: April 29, 2021
Date of Patent: August 1, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Yu Wu, Jinyu Li, Shujie Liu, Xie Chen, Chengyi Wang
-
Publication number: 20230215439
Abstract: The disclosure herein describes using a transcript generation model for generating a transcript from a multi-speaker audio stream. Audio data including overlapping speech of a plurality of speakers is obtained and a set of frame embeddings are generated from audio data frames of the obtained audio data using an audio data encoder. A set of words and channel change (CC) symbols are generated from the set of frame embeddings using a transcript generation model. The CC symbols are included between pairs of adjacent words that are spoken by different people at the same time. The set of words and CC symbols are transformed into a plurality of transcript lines, wherein words of the set of words are sorted into transcript lines based on the CC symbols, and a multi-speaker transcript is generated based on the plurality of transcript lines. The inclusion of CC symbols by the model enables efficient, accurate multi-speaker transcription.
Type: Application
Filed: December 31, 2021
Publication date: July 6, 2023
Inventors: Naoyuki KANDA, Takuya YOSHIOKA, Zhuo CHEN, Jinyu LI, Yashesh GAUR, Zhong MENG, Xiaofei WANG, Xiong XIAO
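The last step, sorting words into transcript lines based on the CC symbols, can be sketched for the two-speaker case: each CC symbol between adjacent words switches which line the following words are routed to. The toggle between exactly two channels is a simplifying assumption; the publication covers the general multi-speaker case:

```python
CC = "<cc>"  # assumed textual form of the channel change symbol

def split_into_transcript_lines(output_stream):
    """Sort a model output stream of words and CC symbols into per-channel
    transcript lines (two-speaker sketch)."""
    lines, current = [[], []], 0
    for symbol in output_stream:
        if symbol == CC:
            current = 1 - current  # a CC symbol flips the active channel
        else:
            lines[current].append(symbol)
    return [" ".join(words) for words in lines if words]
```

Because the CC symbols are emitted inline by the same model that emits the words, no separate diarization pass is needed to untangle the overlapped speech.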
-
Patent number: 11691147
Abstract: A digital microfluidic chip and a digital microfluidic system. The digital microfluidic chip comprises: an upper substrate and a lower substrate arranged opposite to each other; multiple driving circuits and multiple addressing circuits disposed between the lower substrate and the upper substrate; and a control circuit, electrically connected to the driving circuits and the addressing circuits. The control circuit is configured to apply, in a driving stage, a driving voltage to each driving circuit, such that a droplet is controlled to move inside a droplet accommodation space according to a set path, to measure, in a detection stage, after a bias voltage is applied to each addressing circuit, a charge loss amount of each addressing circuit, and to determine the position of the droplet according to the charge loss amount. The charge loss amount of each addressing circuit is related to the intensity of received external light.
Type: Grant
Filed: July 26, 2019
Date of Patent: July 4, 2023
Assignees: Beijing BOE Optoelectronics Technology Co., Ltd., BOE Technology Group Co., Ltd.
Inventors: Mingyang Lv, Yue Li, Yanchen Li, Jinyu Li, Dawei Feng, Yu Zhao, Dong Wang, Wang Guo, Hailong Wang, Yue Geng, Peizhi Cai, Fengchun Pang, Le Gu, Chuncheng Che, Haochen Cui, Yingying Zhao, Nan Zhao, Yuelei Xiao, Hui Liao
-
Publication number: 20230186919
Abstract: Systems, methods, and devices are provided for generating and using text-to-speech (TTS) data for improved speech recognition models. A main model is trained with keyword-independent baseline training data. In some instances, acoustic and language model sub-components of the main model are modified with new TTS training data. In some instances, the new TTS training data is obtained from a multi-speaker neural TTS system for a keyword that is underrepresented in the baseline training data. In some instances, the new TTS training data is used for pronunciation learning and normalization of keyword-dependent confidence scores in keyword spotting (KWS) applications. In some instances, the new TTS training data is used for rapid speaker adaptation in speech recognition models.
Type: Application
Filed: February 10, 2023
Publication date: June 15, 2023
Inventors: Guoli YE, Yan HUANG, Wenning WEI, Lei HE, Eva SHARMA, Jian WU, Yao TIAN, Edward C. LIN, Yifan GONG, Rui ZHAO, Jinyu LI, William Maxwell GALE
-
Patent number: 11676006
Abstract: According to some embodiments, a universal modeling system may include a plurality of domain expert models to each receive raw input data (e.g., a stream of audio frames containing speech utterances) and provide a domain expert output based on the raw input data. A neural mixture component may then generate a weight corresponding to each domain expert model based on information created by the plurality of domain expert models (e.g., hidden features and/or row convolution). The weights might be associated with, for example, constrained scalar numbers, unconstrained scalar numbers, vectors, matrices, etc. An output layer may provide a universal modeling system output (e.g., an automatic speech recognition result) based on each domain expert output after being multiplied by the corresponding weight for that domain expert model.
Type: Grant
Filed: May 16, 2019
Date of Patent: June 13, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Amit Das, Jinyu Li, Changliang Liu, Yifan Gong
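The output-layer combination the abstract describes, each expert's output multiplied by its mixture weight and then combined, reduces to a weighted sum in the scalar-weight case. Scalar weights are only one of the options the abstract lists (vectors and matrices are also mentioned), and summation as the combination rule is an assumption:

```python
def universal_output(expert_outputs, weights):
    """Combine domain expert outputs into the universal modeling system
    output: multiply each expert's output vector by its scalar mixture
    weight and sum the results element-wise."""
    dim = len(expert_outputs[0])
    return [sum(w * out[d] for w, out in zip(weights, expert_outputs))
            for d in range(dim)]
```

In the full system, the weights would come from the neural mixture component conditioned on the experts' hidden features, so the blend can shift toward whichever domain expert best matches the current input.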
-
Publication number: 20230180565
Abstract: A pixel array, a display apparatus and a fine metal mask are provided. The pixel array includes first basic pixel units in odd-numbered rows and second basic pixel units in even-numbered rows; each first basic pixel unit includes first, second and third sub-pixel groups sequentially in a row direction; each second basic pixel unit includes the third, first and second sub-pixel groups sequentially in the row direction; the first, second and third sub-pixel groups include two first, two second and two third sub-pixels in a column direction, respectively; the first, second and third sub-pixels have different colors; the first and second basic pixel units in the even-numbered and odd-numbered rows are aligned in the column direction, respectively; in the first and second basic pixel units in two adjacent rows, the third sub-pixel groups in the two rows are staggered with each other in the row direction.
Type: Application
Filed: October 29, 2021
Publication date: June 8, 2023
Inventors: Peng CAO, Wei ZHANG, Yamin YANG, Jianchao ZHANG, Guang JIN, Jinyu LI