Patents by Inventor Chenzheng SU

Chenzheng SU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250094775
    Abstract: Cached decoding systems and techniques are described. A system (e.g., decoder) receives an input token (e.g., input vector). The system applies a projection tensor (e.g., a projection matrix) to the input token to generate a feature tensor (e.g., a key tensor or a value tensor). The system processes at least the feature tensor and at least one previous feature tensor using at least one attention calculation to generate an output token. The at least one previous feature tensor is retrieved from a buffer. The at least one previous feature tensor can be stored in the buffer after having been previously calculated based on application of the projection tensor to a previous input token (e.g., from a previous iteration before the iteration in which the input token is received).
    Type: Application
    Filed: September 15, 2023
    Publication date: March 20, 2025
    Inventors: Shaojie ZHUO, Ramchalam KINATTINKARA RAMAKRISHNAN, Xiaopeng ZHANG, Yicheng LIN, Chenzheng SU, Liang SHEN
  • Publication number: 20240152726
    Abstract: A processor-implemented method for a neural architecture search (NAS) starts by generating an over-parameterized super network having multiple layers. The super network has multiple operator types. Each of the layers includes a largest super kernel corresponding to a search space. The method also includes performing gradient descent to evolve a largest super kernel to a small kernel corresponding to the search space in order to generate a range of kernel encodings. The method further includes identifying a subset of kernel encodings from the range of kernel encodings, for each layer of the super network, based on the gradient descent. The method determines a set of candidate architectures based on the subset of kernel encodings, each of the candidate architectures having a different model size. The method selects a target model, from the set of architectures, based on meeting hardware specifications, and then applies the target model.
    Type: Application
    Filed: August 1, 2023
    Publication date: May 9, 2024
    Inventors: Chen FENG, Xiaopeng ZHANG, Shaojie ZHUO, Ramchalam KINATTINKARA RAMAKRISHNAN, Chenzheng SU, Liang SHEN, Zi Wen HAN, Yicheng LIN
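
The cached decoding flow summarized in the abstract of publication 20250094775 can be illustrated with a minimal sketch. It is not the claimed implementation: the class name `CachedDecoder`, the helper functions, the separate query projection `w_q`, and the scaled dot-product softmax attention are all assumptions chosen for illustration. The abstract itself specifies only a projection tensor, a feature tensor (key or value), a buffer of previous feature tensors, and an attention calculation.

```python
import math


def matvec(matrix, vec):
    """Apply a projection matrix (list of rows) to an input vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class CachedDecoder:
    """Hypothetical single-head decoder step with key/value buffers."""

    def __init__(self, w_q, w_k, w_v):
        # w_q is an assumption; the abstract names only key/value features.
        self.w_q, self.w_k, self.w_v = w_q, w_k, w_v
        self.key_buffer = []    # key tensors from previous iterations
        self.value_buffer = []  # value tensors from previous iterations

    def step(self, token):
        # Project only the new input token; earlier tokens are not re-projected.
        query = matvec(self.w_q, token)
        key = matvec(self.w_k, token)
        value = matvec(self.w_v, token)

        # Store the new feature tensors so later iterations can reuse them.
        self.key_buffer.append(key)
        self.value_buffer.append(value)

        # Attention over the current and all buffered feature tensors
        # (scaled dot-product attention is an illustrative choice).
        scale = math.sqrt(len(query))
        scores = [
            sum(q * k for q, k in zip(query, cached_key)) / scale
            for cached_key in self.key_buffer
        ]
        weights = softmax(scores)
        return [
            sum(w * cached_value[i] for w, cached_value in zip(weights, self.value_buffer))
            for i in range(len(value))
        ]
```

Calling `step` once per input token grows each buffer by one entry, so every iteration performs a single projection per matrix instead of re-projecting the whole sequence.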