Patents by Inventor Simo Lin

Simo Lin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250094223
    Abstract: A system and computer-implemented method include receiving a request for allocating graphical processing unit (GPU) resources for performing an operation. The request includes metadata identifying a client identifier (ID) associated with a client, throughput, and latency of the operation. A resource limit is determined for performing the operation based on the metadata. Attributes associated with each GPU resource of a plurality of GPU resources available for assignment are obtained. The attributes associated with each GPU resource are analyzed with respect to the resource limit. A set of GPU resources is identified from the plurality of GPU resources based on the analysis. A dedicated AI cluster is generated by patching the set of GPU resources within a single cluster. The dedicated AI cluster reserves a portion of a computation capacity of a computing system for a period of time and the dedicated AI cluster is allocated to the client associated with the client ID.
    Type: Application
    Filed: May 28, 2024
    Publication date: March 20, 2025
    Applicant: Oracle International Corporation
    Inventors: Ming Fang, Simo Lin, Jinguo Zhang, Wei Gao
  • Publication number: 20250094234
    Abstract: A system and computer-implemented method include accessing a request for allocating graphical processing unit (GPU) resources for performing an operation. The request includes metadata identifying a client identifier associated with a client, throughput, and a latency of the operation. A predicted resource limit for performing the operation is determined based on the metadata. A parameter of GPU resources is obtained. The parameter includes a status indicating whether a GPU resource is occupied for performing another operation. A GPU resource utilization value is determined for each node based on the status. The GPU resource utilization value indicates the amount of utilization of GPU resources of the corresponding node. The GPU resource utilization value of each node is compared with a pre-defined resource utilization threshold value. The GPU resources are re-scheduled based on the predicted resource limit. Further, a set of GPU resources from the re-scheduled GPU resources is allocated for performing the operation.
    Type: Application
    Filed: May 28, 2024
    Publication date: March 20, 2025
    Applicant: Oracle International Corporation
    Inventors: Ming Fang, Yifeng Liu, Simo Lin, Wei Gao
  • Publication number: 20250097013
    Abstract: The present disclosure relates to secure deployment of model weights from a generative artificial intelligence (GenAI) platform to a cloud service. The method includes accessing the model metadata and a set of weights of a GenAI model associated with a GenAI platform. These model weights may be encrypted using a first encryption key that may be provided in the model metadata. These encrypted model weights may be decrypted based on the model metadata by utilizing the first encryption key from the model metadata. Each key may be associated with the specific type of GenAI model. Before storing the model weights from the GenAI platform cloud tenancy to a cloud storage in the GenAI home region, the model weights may be encrypted again by utilizing a second encryption key. This encryption by the cloud may enable independent control over the sensitive information during transit and storage.
    Type: Application
    Filed: May 28, 2024
    Publication date: March 20, 2025
    Applicant: Oracle International Corporation
    Inventors: Ming Fang, Simo Lin, Beiwen Guo, Wei Gao
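
The utilization check described in the abstract of publication 20250094234 can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not say how the utilization value is computed or what follows the comparison, so the sketch assumes utilization is the fraction of occupied GPUs on a node and simply lists the nodes above the threshold. The `Gpu` class, the node names, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    gpu_id: str
    occupied: bool  # status: True if the GPU is performing another operation


def node_utilization(gpus: list[Gpu]) -> float:
    """Fraction of a node's GPUs that are occupied (assumed definition)."""
    if not gpus:
        return 0.0
    return sum(g.occupied for g in gpus) / len(gpus)


def nodes_over_threshold(nodes: dict[str, list[Gpu]], threshold: float) -> list[str]:
    """Names of nodes whose utilization value exceeds the threshold."""
    return [name for name, gpus in nodes.items()
            if node_utilization(gpus) > threshold]


# Hypothetical cluster state: node-a is fully occupied, node-b mostly idle.
nodes = {
    "node-a": [Gpu("a0", True), Gpu("a1", True), Gpu("a2", True), Gpu("a3", True)],
    "node-b": [Gpu("b0", True), Gpu("b1", False), Gpu("b2", False), Gpu("b3", False)],
}
busy = nodes_over_threshold(nodes, threshold=0.8)
```

A scheduler built on this idea might treat the nodes in `busy` as candidates for re-scheduling, but that step is left out here because the abstract does not describe it.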