Patents by Inventor Pei-Hsuan HSIEH

Pei-Hsuan HSIEH has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250117626
    Abstract: A computing device is provided, including a processor and a storage device holding instructions that are executable by the processor to implement a base artificial intelligence (AI) model and two or more delta AI models, each delta AI model having lower dimensionality than the base AI model. An inference request including an input prompt is received, the inference request specifying a selected delta AI model of the two or more delta AI models. The input prompt is input to the base AI model to thereby generate a base model result vector. The input prompt is input to the selected delta AI model to thereby generate a delta model result vector. An output vector is generated by combining the base model result vector and the delta model result vector via a combination operation. The output vector is output.
    Type: Application
    Filed: October 9, 2023
    Publication date: April 10, 2025
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Sanjay RAMANUJAN, Ciprian CHISALITA, Pei-Hsuan HSIEH, Derek Edward HYATT, Rakesh KELKAR, Karthik RAMAN
  • Publication number: 20240419493
    Abstract: A method, computer program product, and computing system for processing workload data associated with processing a plurality of requests for an artificial intelligence (AI) model on a processing unit. A maximum number of key-value (KV) cache blocks available for the workload data is determined by simulating the workload data using a simulation engine. A token utilization for the workload data is determined based upon, at least in part, the maximum number of KV cache blocks available for the workload data. Processing unit resources are allocated for the processing unit based upon, at least in part, the token utilization.
    Type: Application
    Filed: June 14, 2023
    Publication date: December 19, 2024
    Inventors: Sanjay RAMANUJAN, Karthik RAMAN, Rakesh KELKAR, Kalyan Kumar BHUKYA, Archit SHUKLA, Pei-Hsuan HSIEH
  • Publication number: 20240411658
    Abstract: This document relates to predicting performance of large artificial intelligence (LAI) models that are too large to be handled by a single computing device. One example can receive a sample workload for a trained LAI model and identify multiple nodes functioning as a cluster to instantiate an instance of the trained LAI model. The example can predict performance characteristics for accomplishing the sample workload on the cluster and can cause at least some of the predicted performance characteristics to be presented on a user interface.
    Type: Application
    Filed: June 9, 2023
    Publication date: December 12, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Sanjay RAMANUJAN, Karthik RAMAN, Rakesh KELKAR, Pei-Hsuan HSIEH
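
Illustrative sketch for publication 20250117626: the abstract describes running one input through a base model and a selected delta model, then combining the two result vectors. The following is a minimal Python sketch of that flow under stated assumptions, not the implementation claimed in the application. It assumes the prompt has already been encoded as a numeric vector, models the base model as a single dense matrix, interprets "lower dimensionality" as a low-rank factorization, and uses element-wise addition as the combination operation. The names DeltaModel, infer, BASE, and DELTAS are hypothetical.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class DeltaModel:
    """Hypothetical low-rank delta model: result = up @ (down @ x).

    down has shape (rank x d_in) and up has shape (d_out x rank), so the
    delta holds far fewer parameters than a full (d_out x d_in) matrix
    when rank is small. Low rank is an assumption made for this sketch.
    """

    def __init__(self, down: Matrix, up: Matrix) -> None:
        self.down = down
        self.up = up

    def __call__(self, x: Vector) -> Vector:
        return matvec(self.up, matvec(self.down, x))


def infer(base: Matrix, deltas: Dict[str, DeltaModel],
          selected: str, x: Vector) -> Vector:
    """Serve one inference request that names a selected delta model."""
    base_result = matvec(base, x)      # base model result vector
    delta_result = deltas[selected](x)  # delta model result vector
    # Combination operation: element-wise addition (an assumption).
    return [b + d for b, d in zip(base_result, delta_result)]


# A shared base model (identity, for readability) and two rank-1 deltas.
BASE: Matrix = [[1.0, 0.0], [0.0, 1.0]]
DELTAS: Dict[str, DeltaModel] = {
    "a": DeltaModel(down=[[1.0, 1.0]], up=[[1.0], [0.0]]),
    "b": DeltaModel(down=[[1.0, -1.0]], up=[[0.0], [2.0]]),
}

if __name__ == "__main__":
    prompt_vector = [3.0, 1.0]
    print(infer(BASE, DELTAS, "a", prompt_vector))
    print(infer(BASE, DELTAS, "b", prompt_vector))
```

In this sketch the base matrix is stored once and shared by every request, while each request pays only for the small delta it selects.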