Abstract: The disclosed embodiments relate to a system that generates and executes a deep neural network (DNN) based on target runtime parameters. During operation, the system receives a trained original model and a set of target runtime parameters for the DNN, wherein the target runtime parameters are associated with one or more of the following for the DNN: desired operating conditions, desired resource utilization, and desired accuracy of results. Next, the system generates a context-specific model based on the original model and the set of target runtime parameters. The system also generates an operational plan for executing both the original model and the context-specific model to meet requirements of the target runtime parameters. Finally, the system controls execution of the original model and the context-specific model based on the operational plan.
Abstract: A system, apparatus and method are provided for securing a neural network (or other artificial intelligence model) against malicious activity, such as piracy, theft of intellectual property, sabotage, etc. One or more security elements or features (e.g., digital watermarks, encryption, obfuscation) are applied to the neural network model during training and/or optimization. As a result, the model is already secured before it is linked or merged with application software for performing inference processing using the model.
Type: Application
Filed: August 22, 2023
Publication date: February 29, 2024
Applicant: Latent AI, Inc.
Inventors: Sek Meng Chai, Jonathan D. Brookshire, Abelardo Lopez-Lagunas
Abstract: The disclosed embodiments relate to a system that optimizes execution of a DNN based on operational performance parameters. During operation, the system collects the operational performance parameters from the DNN during operation of the DNN, wherein the operational performance parameters include parameters associated with operating conditions for the DNN, parameters associated with resource utilization during operation of the DNN, and parameters associated with accuracy of results produced by the DNN. Next, the system uses the operational performance parameters to update the DNN model to improve performance and efficiency during execution of the DNN.
Abstract: Systems, tools and methods are provided for optimizing neural networks (NNs) to run efficiently on target hardware such as central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), etc. The provided software tools are implemented as part of a machine-learning operations (MLOps) workflow for building a neural network, and include optimization algorithms (e.g., for quantization and/or pruning) and compiler processes that reduce memory requirements and processing latency.
Type: Application
Filed: March 17, 2023
Publication date: September 21, 2023
Applicant: Latent AI, Inc.
Inventors: Sek Meng Chai, Jan Ernst, Abelardo Lopez-Lagunas, Ryan M. Dailey
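Illustration: The abstract for this application names quantization as one of the optimization algorithms but does not say how it is performed. As a generic illustration only, and not the method claimed in the application, the following is a minimal sketch of symmetric per-tensor int8 weight quantization in plain Python. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using a single shared scale.

    Generic textbook scheme; an assumption for illustration, not taken
    from the patent application.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid division by zero when all weights are zero (or the list is empty).
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical sample weights.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each weight is stored as one signed byte plus a shared scale instead of a 32-bit float, which is the source of the memory reduction; the price is a rounding error of at most half the scale per weight.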
Abstract: The disclosed embodiments relate to a system that generates and executes a deep neural network (DNN) based on target runtime parameters. During operation, the system receives a trained original model and a set of target runtime parameters for the DNN, wherein the target runtime parameters are associated with one or more of the following for the DNN: desired operating conditions, desired resource utilization, and desired accuracy of results. Next, the system generates a context-specific model based on the original model and the set of target runtime parameters. The system also generates an operational plan for executing both the original model and the context-specific model to meet requirements of the target runtime parameters. Finally, the system controls execution of the original model and the context-specific model based on the operational plan.
Abstract: The disclosed embodiments relate to a system that facilitates dynamic runtime execution of a deep neural network (DNN). During operation, the system receives a model, a set of weights and runtime metadata for the DNN. The system also obtains code to perform inference-processing operations for the DNN. Next, the system compiles code to implement a runtime engine that facilitates throttling operations during execution of the inference-processing operations, wherein the runtime engine conserves computing resources by selecting portions of the inference-processing operations to execute based on the runtime metadata.
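Illustration: The abstract above describes a runtime engine that selects which portions of inference processing to execute based on runtime metadata. One way such throttling could look, purely as a sketch under stated assumptions: the per-stage metadata fields (`name`, `cost`, `optional`), the `budget` parameter, and the function `run_throttled` are all hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, List, Tuple

Stage = Callable[[float], float]


def run_throttled(
    x: float,
    stages: List[Stage],
    metadata: List[Dict],
    budget: float,
) -> Tuple[float, List[str]]:
    """Run stages in order, skipping optional ones that would exceed the budget.

    Required stages always run; the budget only gates optional stages.
    """
    spent = 0.0
    executed: List[str] = []
    for stage, meta in zip(stages, metadata):
        cost = meta["cost"]
        if meta.get("optional", False) and spent + cost > budget:
            continue  # throttle: skip this portion of the computation
        x = stage(x)
        spent += cost
        executed.append(meta["name"])
    return x, executed


# Hypothetical three-stage "model" operating on a single number.
stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
metadata = [
    {"name": "backbone", "cost": 2.0},
    {"name": "refine", "cost": 5.0, "optional": True},
    {"name": "head", "cost": 1.0},
]

low_result, low_ran = run_throttled(1.0, stages, metadata, budget=4.0)
full_result, full_ran = run_throttled(1.0, stages, metadata, budget=10.0)
```

With the smaller budget the optional `refine` stage is skipped, and with the larger budget all three stages run. A real engine would derive costs from measurements rather than fixed numbers.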
Abstract: The disclosed embodiments relate to a system that optimizes execution of a DNN based on operational performance parameters. During operation, the system collects the operational performance parameters from the DNN during operation of the DNN, wherein the operational performance parameters include parameters associated with operating conditions for the DNN, parameters associated with resource utilization during operation of the DNN, and parameters associated with accuracy of results produced by the DNN. Next, the system uses the operational performance parameters to update the DNN model to improve performance and efficiency during execution of the DNN.