Patents by Inventor Jinyu Li

Jinyu Li has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11200824
    Abstract: The present disclosure provides a shift register unit, a driving method thereof, a gate driver circuit and a display device. The shift register unit includes: a first input subcircuit configured to transmit a first voltage signal to a pull-up node under a control of a first input signal; an output subcircuit configured to transmit a clock signal to a first output end and to transmit a second voltage signal to a second output end; a storage unit having a first end coupled to the pull-up node and a second end coupled to the first output end; a first pull-down subcircuit configured to pull down a voltage of the first output end and the second output end; and a second pull-down subcircuit configured to pull down a voltage of the first output end and the second output end.
    Type: Grant
    Filed: September 22, 2017
    Date of Patent: December 14, 2021
    Assignees: BOE TECHNOLOGY GROUP CO., LTD., BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD.
    Inventors: Jinyu Li, Yue Li, Xi Chen, Yanchen Li, Xingyou Luo, Dawei Feng, Shaojun Hou, Dong Wang, Yu Zhao, Mingyang Lv, Wang Guo
  • Patent number: 11170789
    Abstract: To generate substantially domain-invariant and speaker-discriminative features, embodiments are associated with a feature extractor to receive speech frames and extract features from the speech frames based on a first set of parameters of the feature extractor, a senone classifier to identify a senone based on the received features and on a second set of parameters of the senone classifier, an attention network capable of determining a relative importance of features extracted by the feature extractor to domain classification, based on a third set of parameters of the attention network, a domain classifier capable of classifying a domain based on the features and the relative importances, and on a fourth set of parameters of the domain classifier; and a training platform to train the first set of parameters of the feature extractor and the second set of parameters of the senone classifier to minimize the senone classification loss, train the first set of parameters of the feature extractor to maximize the dom
    Type: Grant
    Filed: July 26, 2019
    Date of Patent: November 9, 2021
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Zhong Meng, Jinyu Li, Yifan Gong
  • Publication number: 20210312923
    Abstract: A computing system including one or more processors configured to receive an audio input. The one or more processors may generate a text transcription of the audio input at a sequence-to-sequence speech recognition model, which may assign a respective plurality of external-model text tokens to a plurality of frames included in the audio input. Each external-model text token may have an external-model alignment within the audio input. Based on the audio input, the one or more processors may generate a plurality of hidden states. Based on the plurality of hidden states, the one or more processors may generate a plurality of output text tokens. Each output text token may have a corresponding output alignment within the audio input. For each output text token, a latency between the output alignment and the external-model alignment may be below a predetermined latency threshold. The one or more processors may output the text transcription.
    Type: Application
    Filed: April 6, 2020
    Publication date: October 7, 2021
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yashesh GAUR, Jinyu LI, Liang LU, Hirofumi INAGUMA, Yifan GONG
  • Publication number: 20210312905
    Abstract: Techniques performed by a data processing system for training a Recurrent Neural Network Transducer (RNN-T) herein include encoder pretraining by training a neural network-based token classification model using first token-aligned training data representing a plurality of utterances, where each utterance is associated with a plurality of frames of audio data and tokens representing each utterance are aligned with frame boundaries of the plurality of audio frames; obtaining a first cross-entropy (CE) criterion from the token classification model, wherein the CE criterion represents a divergence between expected outputs and reference outputs of the model; pretraining an encoder of an RNN-T based on the first CE criterion; and training the RNN-T with second training data after pretraining the encoder of the RNN-T. These techniques also include whole-network pre-training of the RNN-T.
    Type: Application
    Filed: April 3, 2020
    Publication date: October 7, 2021
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Rui ZHAO, Jinyu LI, Liang LU, Yifan GONG, Hu Hu
  • Publication number: 20210304769
    Abstract: Systems, methods, and devices are provided for generating and using text-to-speech (TTS) data for improved speech recognition models. A main model is trained with keyword independent baseline training data. In some instances, acoustic and language model sub-components of the main model are modified with new TTS training data. In some instances, the new TTS training data is obtained from a multi-speaker neural TTS system for a keyword that is underrepresented in the baseline training data. In some instances, the new TTS training data is used for pronunciation learning and normalization of keyword dependent confidence scores in keyword spotting (KWS) applications. In some instances, the new TTS training data is used for rapid speaker adaptation in speech recognition models.
    Type: Application
    Filed: May 14, 2020
    Publication date: September 30, 2021
    Inventors: Guoli Ye, Yan Huang, Wenning Wei, Lei He, Eva Sharma, Jian Wu, Yao Tian, Edward C. Lin, Yifan Gong, Rui Zhao, Jinyu Li, William Maxwell Gale
  • Patent number: 11107460
    Abstract: Embodiments are associated with a speaker-independent acoustic model capable of classifying senones based on input speech frames and on first parameters of the speaker-independent acoustic model, a speaker-dependent acoustic model capable of classifying senones based on input speech frames and on second parameters of the speaker-dependent acoustic model, and a discriminator capable of receiving data from the speaker-dependent acoustic model and data from the speaker-independent acoustic model and outputting a prediction of whether received data was generated by the speaker-dependent acoustic model based on third parameters.
    Type: Grant
    Filed: July 2, 2019
    Date of Patent: August 31, 2021
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Zhong Meng, Jinyu Li, Yifan Gong
  • Publication number: 20210225250
    Abstract: There are provided in the present disclosure a shift register and a driving method thereof, a gate driving circuit and a display apparatus. The shift register of the present disclosure includes: a forward scanning input sub-circuit for pre-charging a potential of a pull-up node by an operation level signal under control of a forward input signal and a forward scanning signal upon scanning forwards; a backward scanning input sub-circuit for pre-charging the potential of the pull-up node by an operation level signal under control of a backward input signal and a backward scanning signal upon scanning backwards; an output sub-circuit for outputting a clock signal through a signal output terminal under control of the potential of the pull-up node; wherein the pull-up node is a connection node of the forward scanning input sub-circuit, the backward scanning input sub-circuit and the output sub-circuit.
    Type: Application
    Filed: February 8, 2018
    Publication date: July 22, 2021
    Applicants: BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD., BOE TECHNOLOGY GROUP CO., LTD.
    Inventors: Wang GUO, Yanchen LI, Yue LI, Jinyu LI, Dawei FENG, Yu ZHAO, Shaojun HOU, Dong WANG, Mingyang LV
  • Publication number: 20210191544
    Abstract: The present disclosure provides a touch panel, an array substrate and a display device. The touch panel includes a touch control electrode unit including a first electrode and a second electrode which are insulated from each other, and the second electrode surrounds the first electrode. In the touch panel in which the second electrode surrounds the first electrode in the touch control electrode unit, since the first electrode and the second electrode are provided independently, touch control driving and sensing are performed on the first electrode and the second electrode respectively when the touch panel is being touched.
    Type: Application
    Filed: February 12, 2018
    Publication date: June 24, 2021
    Inventors: Xingyou LUO, Yue LI, Xi CHEN, Yanchen LI, Jinyu LI, Dawei FENG, Yu ZHAO, Shaojun HOU, Dong WANG, Mingyang LV, Wang GUO
  • Publication number: 20210166595
    Abstract: The present disclosure provides a shift register unit, a driving method thereof, a gate driver circuit and a display device. The shift register unit includes: a first input subcircuit configured to transmit a first voltage signal to a pull-up node under a control of a first input signal; an output subcircuit configured to transmit a clock signal to a first output end and to transmit a second voltage signal to a second output end; a storage unit having a first end coupled to the pull-up node and a second end coupled to the first output end; a first pull-down subcircuit configured to pull down a voltage of the first output end and the second output end; and a second pull-down subcircuit configured to pull down a voltage of the first output end and the second output end.
    Type: Application
    Filed: September 22, 2017
    Publication date: June 3, 2021
    Inventors: Jinyu LI, Yue LI, Xi CHEN, Yanchen LI, Xingyou LUO, Dawei FENG, Shaojun HOU, Dong WANG, Yu ZHAO, Mingyang LV, Wang GUO
  • Publication number: 20210129437
    Abstract: A light valve panel and a manufacturing method thereof, a three-dimensional printing system, and a three-dimensional printing method are disclosed. The light valve panel includes a first light valve array substrate and at least one second light valve array substrate, the first light valve array substrate and the at least one second light valve array substrate are arranged in a stack; the first light valve array substrate includes a plurality of first pixel units arranged in an array, and the second light valve array substrate includes a plurality of second pixel units arranged in an array; and an orthographic projection of at least one of the second pixel units on the first light valve array substrate partially overlaps with at least one of the first pixel units.
    Type: Application
    Filed: March 19, 2020
    Publication date: May 6, 2021
    Applicants: BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD., BOE TECHNOLOGY GROUP CO., LTD.
    Inventors: Jinyu LI, Yanchen LI, Haobo FANG, Yu ZHAO, Dawei FENG, Dong WANG, Wang GUO, Hailong WANG
  • Patent number: 10970372
    Abstract: The use of user-specific data to process a biometric print, such that use of the biometric print is revoked by invalidating the user-specific data. The processed print is generated by performing one-way processing of the biometric print using the user-specific data. The processed print, not the biometric print, is then provided to the authentication system for later authentication of the user. During matching, the user later provides a current biometric, resulting in generation of a current biometric print. For each of multiple users, the user-specific data is obtained for that user, and at least one processed print is generated for each user based on the current biometric print. The current processed prints are used by the authentication system to match against each of the enrolled processed prints. If a match is found, the user is identified as being the user associated with the matching enrolled print.
    Type: Grant
    Filed: November 1, 2018
    Date of Patent: April 6, 2021
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Peter Dawoud Shenouda Dawoud, Rachel Peters, Jinyu Li
  • Patent number: 10964309
    Abstract: A CS CTC model may be initialized from a major language CTC model by keeping network hidden weights and replacing output tokens with a union of major and secondary language output tokens. The initialized model may be trained by updating parameters with training data from both languages, and a LID model may also be trained with the data. During a decoding process for each of a series of audio frames, if silence dominates a current frame then a silence output token may be emitted. If silence does not dominate the frame, then a major language output token posterior vector from the CS CTC model may be multiplied with the LID major language probability to create a probability vector from the major language. A similar step is performed for the secondary language, and the system may emit an output token associated with the highest probability across all tokens from both languages.
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: March 30, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong, Ke Li
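
The per-frame decoding rule in the abstract of patent 10964309 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `decode_frame`, the `<sil>` token, the boolean `silence_dominates` input (the abstract does not say how silence dominance is detected), and the assumption that the secondary-language LID probability equals one minus the major-language probability are all hypothetical.

```python
from typing import Dict, List, Tuple


def decode_frame(
    silence_dominates: bool,
    major_posteriors: Dict[str, float],
    secondary_posteriors: Dict[str, float],
    lid_major_prob: float,
    silence_token: str = "<sil>",  # hypothetical token name
) -> str:
    """Pick one output token for a single audio frame."""
    # If silence dominates the frame, emit the silence token directly.
    if silence_dominates:
        return silence_token

    # Assumption: the LID model covers exactly two languages.
    lid_secondary_prob = 1.0 - lid_major_prob

    # Scale each language's token posteriors by that language's LID probability.
    candidates: List[Tuple[float, str]] = []
    for token, posterior in major_posteriors.items():
        candidates.append((posterior * lid_major_prob, token))
    for token, posterior in secondary_posteriors.items():
        candidates.append((posterior * lid_secondary_prob, token))

    # Emit the token with the highest scaled probability across both languages.
    best_score, best_token = max(candidates, key=lambda pair: pair[0])
    return best_token
```

For example, with major posteriors `{"a": 0.6, "b": 0.4}`, secondary posteriors `{"x": 0.9, "y": 0.1}` and an LID major-language probability of 0.3, the secondary token `"x"` wins because 0.9 × 0.7 exceeds every scaled major-language score.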
  • Publication number: 20210082438
    Abstract: Embodiments may include reception of a plurality of speech frames, determination of a multi-dimensional acoustic feature associated with each of the plurality of speech frames, determination of a plurality of multi-dimensional phonetic features, each of the plurality of multi-dimensional phonetic features determined based on a respective one of the plurality of speech frames, generation of a plurality of two-dimensional feature maps based on the phonetic features, input of the feature maps and the plurality of acoustic features to a convolutional neural network, the convolutional neural network to generate a plurality of speaker embeddings based on the plurality of feature maps and the plurality of acoustic features, aggregation of the plurality of speaker embeddings into a first speaker embedding based on respective weights determined for each of the plurality of speaker embeddings, and determination of a speaker associated with the plurality of speech frames based on the first speaker embedding.
    Type: Application
    Filed: November 13, 2019
    Publication date: March 18, 2021
    Inventors: Yong ZHAO, Tianyan ZHOU, Jinyu LI, Yifan GONG, Jian WU, Zhuo CHEN
  • Publication number: 20210065683
    Abstract: Embodiments are associated with a speaker-independent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-independent attention-based encoder-decoder model associated with a first output distribution, a speaker-dependent attention-based encoder-decoder model to classify output tokens based on input speech frames, the speaker-dependent attention-based encoder-decoder model associated with a second output distribution, training of the second attention-based encoder-decoder model to classify output tokens based on input speech frames of a target speaker and simultaneously training the speaker-dependent attention-based encoder-decoder model to maintain a similarity between the first output distribution and the second output distribution, and performing automatic speech recognition on speech frames of the target speaker using the trained speaker-dependent attention-based encoder-decoder model.
    Type: Application
    Filed: November 6, 2019
    Publication date: March 4, 2021
    Inventors: Zhong MENG, Yashesh GAUR, Jinyu LI, Yifan GONG
  • Publication number: 20210020166
    Abstract: Streaming machine learning unidirectional models is facilitated by the use of embedding vectors. Processing blocks in the models apply embedding vectors as input. The embedding vectors utilize context of future data (e.g., data that is temporally offset into the future within a data stream) to improve the accuracy of the outputs generated by the processing blocks. The embedding vectors cause a temporal shift between the outputs of the processing blocks and the inputs to which the outputs correspond. This temporal shift enables the processing blocks to apply the embedding vector inputs from processing blocks that are associated with future data.
    Type: Application
    Filed: July 19, 2019
    Publication date: January 21, 2021
    Inventors: Jinyu Li, Amit Kumar Agarwal, Yifan Gong, Harini Kesavamoorthy
  • Patent number: 10885900
    Abstract: Improvements in speech recognition in a new domain are provided via the student/teacher training of models for different speech domains. A student model for a new domain is created based on the teacher model trained in an existing domain. The student model is trained in parallel to the operation of the teacher model, with inputs in the new and existing domains respectively, to develop a neural network that is adapted to recognize speech in the new domain. The data in the new domain may exclude transcription labels but rather are parallelized with the data analyzed in the existing domain analyzed by the teacher model. The outputs from the teacher model are compared with the outputs of the student model and the differences are used to adjust the parameters of the student model to better recognize speech in the second domain.
    Type: Grant
    Filed: August 11, 2017
    Date of Patent: January 5, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jinyu Li, Michael Lewis Seltzer, Xi Wang, Rui Zhao, Yifan Gong
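
The comparison of teacher and student outputs in the abstract of patent 10885900 can be illustrated with a small sketch. The abstract does not name a specific difference measure, so the KL divergence used here is an assumption, as are the function names and the simplification of treating the student's output logits as the only adjustable parameters.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(teacher_probs: List[float], student_probs: List[float]) -> float:
    """KL(teacher || student); assumed here as the teacher/student difference measure."""
    return sum(
        t * math.log(t / s)
        for t, s in zip(teacher_probs, student_probs)
        if t > 0.0
    )


def student_step(
    teacher_probs: List[float],
    student_logits: List[float],
    learning_rate: float = 0.5,
) -> List[float]:
    """One gradient-descent step on the student logits.

    The gradient of KL(teacher || softmax(logits)) with respect to the
    logits is (student_probs - teacher_probs), so no transcription labels
    are needed, only the teacher's output on the parallel input.
    """
    student_probs = softmax(student_logits)
    gradient = [s - t for s, t in zip(student_probs, teacher_probs)]
    return [z - learning_rate * g for z, g in zip(student_logits, gradient)]
```

Starting from uniform student logits and a teacher posterior such as `[0.7, 0.2, 0.1]`, a single call to `student_step` moves the student distribution closer to the teacher's, which lowers the KL divergence.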
  • Publication number: 20200363890
    Abstract: A display substrate, a manufacturing method thereof and a touch display device are disclosed. The display substrate includes a plurality of pixel units arranged in an array, every column of pixel units is provided with one corresponding first data line, every adjacent three columns of pixel units constitute one pixel unit group, and one second data line and one touch signal line are provided between every adjacent two pixel unit groups; between every adjacent two pixel unit groups, the second data line and the first data line are located at two sides of the touch signal line, respectively; and for the pixel unit adjacent to the second data line, a coupling capacitance between the pixel unit and the first data line adjacent to the pixel unit is the same as a coupling capacitance between the pixel unit and the second data line adjacent to the pixel unit.
    Type: Application
    Filed: April 24, 2019
    Publication date: November 19, 2020
    Applicants: BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD., BOE TECHNOLOGY GROUP CO., LTD.
    Inventors: Dong WANG, Yue LI, Wang GUO, Mingyang LV, Yu ZHAO, Yanchen LI, Hailong WANG, Hongbo FENG, Jinyu LI
  • Patent number: 10839822
    Abstract: Representative embodiments disclose mechanisms to separate and recognize multiple audio sources (e.g., picking out individual speakers) in an environment where they overlap and interfere with each other. The architecture uses a microphone array to spatially separate out the audio signals. The spatially filtered signals are then input into a plurality of separators, so each signal is input into a corresponding separator. The separators use neural networks to separate out audio sources. The separators typically produce multiple output signals for the single input signals. A post selection processor then assesses the separator outputs to pick the signals with the highest quality output. These signals can be used in a variety of systems such as speech recognition, meeting transcription and enhancement, hearing aids, music information retrieval, speech enhancement and so forth.
    Type: Grant
    Filed: November 6, 2017
    Date of Patent: November 17, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zhuo Chen, Jinyu Li, Xiong Xiao, Takuya Yoshioka, Huaming Wang, Zhenghao Wang, Yifan Gong
  • Publication number: 20200334526
    Abstract: According to some embodiments, a machine learning model may include an input layer to receive an input signal as a series of frames representing handwriting data, speech data, audio data, and/or textual data. A plurality of time layers may be provided, and each time layer may comprise a uni-directional recurrent neural network processing block. A depth processing block may scan hidden states of the recurrent neural network processing block of each time layer, and the depth processing block may be associated with a first frame and receive context frame information of a sequence of one or more future frames relative to the first frame. An output layer may output a final classification as a classified posterior vector of the input signal. For example, the depth processing block may receive the context frame information from an output of a time layer processing block or another depth processing block of the future frame.
    Type: Application
    Filed: May 13, 2019
    Publication date: October 22, 2020
    Inventors: Jinyu LI, Vadim MAZALOV, Changliang LIU, Liang LU, Yifan GONG
  • Publication number: 20200334527
    Abstract: According to some embodiments, a universal modeling system may include a plurality of domain expert models to each receive raw input data (e.g., a stream of audio frames containing speech utterances) and provide a domain expert output based on the raw input data. A neural mixture component may then generate a weight corresponding to each domain expert model based on information created by the plurality of domain expert models (e.g., hidden features and/or row convolution). The weights might be associated with, for example, constrained scalar numbers, unconstrained scalar numbers, vectors, matrices, etc. An output layer may provide a universal modeling system output (e.g., an automatic speech recognition result) based on each domain expert output after being multiplied by the corresponding weight for that domain expert model.
    Type: Application
    Filed: May 16, 2019
    Publication date: October 22, 2020
    Inventors: Amit DAS, Jinyu LI, Changliang LIU, Yifan GONG
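
The weighting step in the abstract of publication 20200334527 can be sketched for the constrained-scalar case. This is a minimal illustration under stated assumptions: the softmax constraint on the weights, the summation of the weighted expert outputs, and the names `softmax` and `combine_experts` are not taken from the abstract, which only says that each expert output is multiplied by its corresponding weight.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw mixture scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_experts(
    expert_outputs: List[List[float]],
    mixture_scores: List[float],
) -> List[float]:
    """Weight each domain expert output by its scalar weight and sum the results.

    expert_outputs holds one output vector per domain expert (all the same length);
    mixture_scores holds one raw score per expert, standing in for the output of
    the neural mixture component.
    """
    weights = softmax(mixture_scores)  # assumption: constrained scalar weights
    dim = len(expert_outputs[0])
    return [
        sum(w * output[d] for w, output in zip(weights, expert_outputs))
        for d in range(dim)
    ]
```

With two experts producing `[1.0, 0.0]` and `[0.0, 1.0]` and equal mixture scores, the combined output is `[0.5, 0.5]`; raising one expert's score shifts the result toward that expert's output.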