Patents by Inventor Jiusheng Chen

Jiusheng Chen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240046037
    Abstract: Systems and methods are provided for training a data model based on training data. The training includes pre-training and fine-tuning the data model based on a combination of an autoregressive (AR) model and a non-autoregressive (NAR) model. Training data may be received and encoded into streams of tokens. During decoding, a pre-trainer generates a continuum of data structures for the combined AR/NAR model, including a main stream and a series of predicting streams. Masked tokens in the predicting streams reference, or attend to, one or more preceding tokens in the main stream or in the preceding predicting streams. A fine-tuner selects streams to generate a trained model according to a target data model, which is determined by balancing an accuracy constraint against an efficiency constraint for predicting tokens. The decoder acts as a bridge between the AR and NAR models in generating the trained data model. (A simplified code sketch follows this entry.)
    Type: Application
    Filed: December 25, 2020
    Publication date: February 8, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Jian JIAO, Yeyun GONG, Nan DUAN, Weizhu CHEN, Kewen TANG, Qiang LOU, Ruofei ZHANG, Yu YAN, Jiusheng CHEN
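    Illustrative sketch: a minimal, hypothetical PyTorch rendering of the main-stream/predicting-stream visibility pattern described in the abstract above. The function name, tensor layout, and the choice to let each masked token also attend to itself are assumptions for illustration, not details taken from the patent.

        import torch

        def build_stream_mask(seq_len: int, num_pred_streams: int) -> torch.Tensor:
            """Visibility mask over [main stream | predicting streams].

            Row i may attend to column j iff mask[i, j] is True. Main-stream
            tokens attend causally (AR); a masked token at position t in a
            predicting stream attends to main-stream tokens 0..t-1 and to
            itself, approximating the cross-stream attention described above.
            """
            total = seq_len * (1 + num_pred_streams)
            mask = torch.zeros(total, total, dtype=torch.bool)
            # Main stream: standard causal, autoregressive visibility.
            mask[:seq_len, :seq_len] = torch.tril(
                torch.ones(seq_len, seq_len, dtype=torch.bool))
            for s in range(num_pred_streams):
                off = (s + 1) * seq_len
                for t in range(seq_len):
                    mask[off + t, :t] = True       # preceding main-stream tokens
                    mask[off + t, off + t] = True  # the masked token itself
            return mask

        print(build_stream_mask(seq_len=4, num_pred_streams=2).shape)  # (12, 12)
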
  • Publication number: 20220318601
    Abstract: Computing technology is described herein that provides an attention mechanism, implemented by a neural network, that generates attention information based on head-specific query information and shared key and value (KV) information, without computing head-specific key information and head-specific value information, and without caching the head-specific key information and the head-specific value information in memory. This manner of operation allows the computing technology to make efficient use of processing and memory resources. In some implementations, the attention mechanism is part of a decoder of an encoder-decoder system, or of a standalone decoder system. In some implementations, the computing technology leverages the attention information to generate synthesized text based on input text. (A simplified code sketch follows this entry.)
    Type: Application
    Filed: April 3, 2021
    Publication date: October 6, 2022
    Inventors: Yu YAN, Jiusheng CHEN, Nikhil BHENDAWADE, Yeyun GONG, Nan DUAN, Ruofei ZHANG
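    Illustrative sketch: a minimal, hypothetical PyTorch version of attention with head-specific queries and a single shared key/value projection (a pattern often called multi-query attention). The shapes, names, and the omission of masking and an output projection are simplifying assumptions, not details from the patent.

        import torch

        def shared_kv_attention(x, w_q, w_k, w_v, num_heads):
            """x: (seq, d_model); w_q: (d_model, num_heads * d_head);
            w_k, w_v: (d_model, d_head) -- one K and one V shared by all heads."""
            seq, _ = x.shape
            d_head = w_k.shape[1]
            q = (x @ w_q).view(seq, num_heads, d_head)  # head-specific queries
            k = x @ w_k   # shared keys,   (seq, d_head)
            v = x @ w_v   # shared values, (seq, d_head)
            # During decoding only k and v need caching, and they are
            # num_heads times smaller than per-head K/V caches would be.
            scores = torch.einsum("qhd,kd->hqk", q, k) / d_head ** 0.5
            out = torch.einsum("hqk,kd->qhd", scores.softmax(dim=-1), v)
            return out.reshape(seq, -1)

        x = torch.randn(5, 64)
        out = shared_kv_attention(x, torch.randn(64, 8 * 16),
                                  torch.randn(64, 16), torch.randn(64, 16),
                                  num_heads=8)
        print(out.shape)  # torch.Size([5, 128])
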
  • Publication number: 20220100676
    Abstract: Systems and methods for dynamically modifying a cache associated with a neural network model of a natural language generator are described. In examples, a neural network model employs a beam search algorithm at a decoder when decoding output and generating predicted output candidates. The decoder utilizes caching techniques to improve the speed at which the neural network model operates. When the amount of memory utilized by one or more caches of the neural network model is determined to exceed a threshold memory size, a layer-specific portion of a cache associated with a layer of the neural network model is identified. The identified layer-specific portion of the cache can be deleted when the amount of memory utilized by the cache exceeds the threshold memory size. In examples, data in the cache is deduplicated and/or deleted. (A simplified code sketch follows this entry.)
    Type: Application
    Filed: February 18, 2021
    Publication date: March 31, 2022
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yu YAN, Jiusheng CHEN, Ruofei ZHANG
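    Illustrative sketch: a hypothetical threshold-based, layer-specific cache-trimming routine in the spirit of the abstract above. The cache layout and the policy of deleting the largest layer's portion first are assumptions for illustration, not the patent's method.

        import torch

        def trim_decoder_cache(cache: dict, max_bytes: int) -> dict:
            """cache maps a layer name to a tensor of cached decoder state.
            While total memory exceeds max_bytes, delete the layer-specific
            portion that uses the most memory."""
            size = lambda t: t.numel() * t.element_size()
            while cache and sum(size(t) for t in cache.values()) > max_bytes:
                largest = max(cache, key=lambda name: size(cache[name]))
                del cache[largest]  # drop one layer-specific portion
            return cache

        cache = {f"layer_{i}": torch.randn(4, 128, 64) for i in range(6)}
        trim_decoder_cache(cache, max_bytes=400_000)
        print(sorted(cache))  # the layer portions that fit within the budget
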
  • Publication number: 20140147013
    Abstract: An Echo PIV analysis process, apparatus and algorithm are developed to reduce noise and analyze DICOM images representing a fluid flow of a plurality of particles. A plurality of DICOM images representing sequential image pairs of a plurality of particles is received, and the sequential image pairs are grouped. The sequential image pairs are correlated to create N cross-correlation maps. An average cross-correlation transformation is applied to each cross-correlation map to create an image pair vector map for each image pair. A maximizing operation is applied to one or more of the N adjacent image pair vector maps to create a modified image pair vector map for those image pairs. The maps are combined to create corresponding temporary vector maps, which are averaged to obtain a mean velocity vector field of the sequential image pairs. (A simplified code sketch follows this entry.)
    Type: Application
    Filed: October 11, 2011
    Publication date: May 29, 2014
    Applicant: The Regents of the University of Colorado, A Body Corporate
    Inventors: Robin Shandas, Fuxing Zhang, Jiusheng Chen, Luciano A. Mazzaro
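    Illustrative sketch: a simplified NumPy/SciPy version of the averaged cross-correlation step described in the abstract above, where correlation maps from several sequential image pairs are averaged before peak detection to suppress uncorrelated noise. The mean-subtraction preprocessing and the peak-to-displacement conversion are assumptions for illustration; DICOM handling and the vector-map post-processing the patent describes are omitted.

        import numpy as np
        from scipy.signal import correlate2d

        def mean_displacement(image_pairs):
            """image_pairs: list of (frame_a, frame_b) 2-D arrays, equal shape.
            Averages the per-pair cross-correlation maps, then returns the
            (dy, dx) offset of the averaged correlation peak."""
            acc = None
            for a, b in image_pairs:
                c = correlate2d(b - b.mean(), a - a.mean(), mode="same")
                acc = c if acc is None else acc + c
            acc /= len(image_pairs)  # averaging suppresses uncorrelated noise
            peak = np.unravel_index(np.argmax(acc), acc.shape)
            center = (acc.shape[0] // 2, acc.shape[1] // 2)
            return peak[0] - center[0], peak[1] - center[1]

        rng = np.random.default_rng(0)
        a = rng.random((32, 32))
        b = np.roll(a, shift=(2, 3), axis=(0, 1))  # "particles" moved by (2, 3)
        print(mean_displacement([(a, b)] * 4))     # (2, 3)
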