Patents by Inventor Chenzheng SU

Chenzheng SU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEMS AND METHODS FOR STATIC CACHED DECODING

Publication number: 20250094775

Abstract: Cached decoding systems and techniques are described. A system (e.g., decoder) receives an input token (e.g., input vector). The system applies a projection tensor (e.g., a projection matrix) to the input token to generate a feature tensor (e.g., a key tensor or a value tensor). The system processes at least the feature tensor and at least one previous feature tensor using at least one attention calculation to generate an output token. The at least one previous feature tensor is retrieved from a buffer. The at least one previous feature tensor can be stored in the buffer after having been previously calculated based on application of the projection tensor to a previous input token (e.g., from a previous iteration before the iteration in which the input token is received).
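
The decoding loop in this abstract can be illustrated with a minimal single-head sketch in plain Python. It is not the claimed implementation: the fixed-size preallocated buffer (suggested only by the word "static" in the title), the separate query projection, the softmax-scaled dot-product attention, and all names (`StaticCachedDecoder`, `matvec`, `max_tokens`) are assumptions made for illustration.

```python
import math


def matvec(matrix, vector):
    """Apply a projection matrix (list of rows) to an input vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class StaticCachedDecoder:
    """Hypothetical single-head decoder that caches key/value tensors."""

    def __init__(self, w_q, w_k, w_v, max_tokens):
        self.w_q, self.w_k, self.w_v = w_q, w_k, w_v
        # Assumption: the buffer is allocated once with a fixed capacity.
        self.keys = [[0.0] * len(w_k) for _ in range(max_tokens)]
        self.values = [[0.0] * len(w_v) for _ in range(max_tokens)]
        self.length = 0  # number of valid cached entries

    def step(self, token):
        """Consume one input token and return one output token."""
        if self.length == len(self.keys):
            raise RuntimeError("cache buffer is full")
        # Project only the new token; earlier keys/values come from the buffer.
        query = matvec(self.w_q, token)
        self.keys[self.length] = matvec(self.w_k, token)
        self.values[self.length] = matvec(self.w_v, token)
        self.length += 1

        # Scaled dot-product attention over the current and cached entries.
        scale = 1.0 / math.sqrt(len(query))
        scores = [
            scale * sum(q * k for q, k in zip(query, self.keys[i]))
            for i in range(self.length)
        ]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        width = len(self.values[0])
        return [
            sum(weights[i] / total * self.values[i][d] for i in range(self.length))
            for d in range(width)
        ]


identity = [[1.0, 0.0], [0.0, 1.0]]
decoder = StaticCachedDecoder(identity, identity, identity, max_tokens=2)
first = decoder.step([1.0, 0.0])   # attends only to itself
second = decoder.step([0.0, 1.0])  # attends to itself and the cached entry
```

Each call to `step` applies the projections to the new token only, so the per-token projection cost stays constant while the attention calculation reuses the previously stored feature tensors.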

Type: Application

Filed: September 15, 2023

Publication date: March 20, 2025

Inventors: Shaojie ZHUO, Ramchalam KINATTINKARA RAMAKRISHNAN, Xiaopeng ZHANG, Yicheng LIN, Chenzheng SU, Liang SHEN

SINGLE SEARCH FOR ARCHITECTURES ON EMBEDDED DEVICES

Publication number: 20240152726

Abstract: A processor-implemented method for a neural architecture search (NAS) starts by generating an over-parameterized super network having multiple layers. The super network has multiple operator types. Each of the layers includes a largest super kernel corresponding to a search space. The method also includes performing gradient descent to evolve a largest super kernel to a small kernel corresponding to the search space in order to generate a range of kernel encodings. The method further includes identifying a subset of kernel encodings from the range of kernel encodings, for each layer of the super network, based on the gradient descent. The method determines a set of candidate architectures based on the subset of kernel encodings, each of the candidate architectures having a different model size. The method selects a target model, from the set of architectures, based on meeting hardware specifications, and then applies the target model.
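
The last two steps of this abstract (building candidate architectures from per-layer kernel encodings, then choosing one that meets hardware specifications) might look roughly like the sketch below. The gradient-descent search itself is omitted. The size model (`k * k * channels` weights per layer), the single `max_params` budget standing in for "hardware specifications", the rule of picking the largest model that fits, and every function name are hypothetical.

```python
from itertools import product


def candidate_architectures(per_layer_kernels):
    """Combine each layer's retained kernel sizes into full architectures."""
    return [tuple(choice) for choice in product(*per_layer_kernels)]


def model_size(architecture, channels):
    """Hypothetical size model: k * k * channels weights per layer."""
    return sum(k * k * channels for k in architecture)


def select_target(per_layer_kernels, channels, max_params):
    """Return the largest candidate within the budget, or None if none fits."""
    feasible = [
        arch
        for arch in candidate_architectures(per_layer_kernels)
        if model_size(arch, channels) <= max_params
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda arch: model_size(arch, channels))


# Two layers, each with kernel sizes 3 and 5 retained by the search.
retained = [[3, 5], [3, 5]]
target = select_target(retained, channels=1, max_params=40)
```

Unlike the method in the abstract, this enumeration can produce candidates of equal size (for example, kernel sizes 3 then 5 versus 5 then 3), so a real selection step would need a further tie-breaking criterion such as measured latency or accuracy.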

Type: Application

Filed: August 1, 2023

Publication date: May 9, 2024

Inventors: Chen FENG, Xiaopeng ZHANG, Shaojie ZHUO, Ramchalam KINATTINKARA RAMAKRISHNAN, Chenzheng SU, Liang SHEN, Zi Wen HAN, Yicheng LIN
