Patents Assigned to ALPHAICS CORPORATION
  • Patent number: 10970623
    Abstract: A reinforcement learning processor specifically configured to train reinforcement learning agents in AI systems by way of implementing an application-specific instruction set is disclosed. The application-specific instruction set incorporates ‘Single Instruction Multiple Agents (SIMA)’ instructions. SIMA type instructions are specifically designed to be implemented simultaneously on a plurality of reinforcement learning agents which interact with corresponding reinforcement learning environments. The SIMA type instructions are specifically configured to receive either a reinforcement learning agent ID or a reinforcement learning environment ID as the operand. The reinforcement learning processor is designed for parallelism in reinforcement learning operations. The reinforcement learning processor executes a plurality of threads associated with an operation or task in parallel.
    Type: Grant
    Filed: October 17, 2017
    Date of Patent: April 6, 2021
    Assignee: ALPHAICS CORPORATION
    Inventor: Nagendra Nagaraja
  • Patent number: 10949743
    Abstract: The embodiments herein disclose a system and method for implementing reinforcement learning agents using a reinforcement learning processor. An application-domain specific instruction set (ASI) for implementing reinforcement learning agents and reward functions is created. Further, instructions are created by including at least one of the reinforcement learning agent ID vectors, the reinforcement learning environment ID vectors, and the length of the vector as an operand. The reinforcement learning agent ID vectors and the reinforcement learning environment ID vectors are pointers to a base address of an operations memory. Further, at least one of said reinforcement learning agent ID vector and reinforcement learning environment ID vector is embedded into operations associated with the decoded instruction. The instructions retrieved by an agent ID vector indexed operation are executed using a second processor, and applied onto a group of reinforcement learning agents.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: March 16, 2021
    Assignee: ALPHAICS CORPORATION
    Inventor: Nagendra Nagaraja
  • Patent number: 10372859
    Abstract: The embodiments herein disclose a system and method for designing an SoC by using a reinforcement learning processor. An SoC specification input is received, and a plurality of domains and a plurality of subdomains are created using an application-specific instruction set to generate a chip-specific graph library. An interaction is initiated between the reinforcement learning agent and the reinforcement learning environment using the application-specific instructions. Each of the SoC subdomains from the plurality of SoC subdomains is mapped to a combination of environment, rewards and actions by a second processor. Further, interaction of a plurality of agents with the reinforcement learning environment is initiated for a predefined number of times, and the Q value, V value, R value, and A value are updated in the second memory module. Thereby, an optimal chip architecture for designing the SoC is acquired using the application-domain specific instruction set (ASI).
    Type: Grant
    Filed: January 1, 2018
    Date of Patent: August 6, 2019
    Assignee: ALPHAICS CORPORATION
    Inventor: Nagendra Nagaraja
  • Patent number: 9892223
    Abstract: The embodiments herein disclose a system and method for designing an SoC by synchronizing a hierarchy of SMDPs. Reinforcement Learning is done either hierarchically in several steps or in a single step comprising environment, tasks, agents and experiments, to have access to SoC (System on a Chip) related information. The AI agent is configured to learn from the interaction and plan the implementation of an SoC circuit design. Q values generated for each domain and subdomain are stored in a hierarchical SMDP structure in the form of an SMDP Q table in a big data database. An optimal chip architecture corresponding to a maximum Q value of a top level in the SMDP Q table is acquired and stored in a database for learning and inference. The desired SoC configuration is optimized and generated based on the optimal chip architecture and the generated chip-specific graph library.
    Type: Grant
    Filed: September 7, 2017
    Date of Patent: February 13, 2018
    Assignee: ALPHAICS CORPORATION
    Inventor: Nagendra Nagaraja
  • Patent number: 9792397
    Abstract: The embodiments herein disclose a system and method for designing an SoC using AI and Reinforcement Learning (RL) techniques. Reinforcement Learning is done either hierarchically in several steps or in a single step comprising environment, tasks, agents and experiments, to have access to SoC (System on a Chip) related information. The AI agent is configured to learn from the interaction and plan the implementation of an SoC circuit design. Q values generated for each domain and subdomain are stored in a hierarchical SMDP structure in the form of an SMDP Q table in a big data database. An optimal chip architecture corresponding to a maximum Q value of a top level in the SMDP Q table is acquired and stored in a database for learning and inference. The desired SoC configuration is optimized and generated based on the optimal chip architecture and the generated chip-specific graph library.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: October 17, 2017
    Assignee: ALPHAICS CORPORATION
    Inventor: Nagendra Nagaraja
  • Patent number: 9754221
    Abstract: A reinforcement learning processor specifically configured to execute reinforcement learning operations by way of implementing an application-specific instruction set is envisaged. The application-specific instruction set incorporates ‘Single Instruction Multiple Agents (SIMA)’ instructions. SIMA type instructions are specifically designed to be implemented simultaneously on a plurality of reinforcement learning agents which interact with corresponding reinforcement learning environments. The SIMA type instructions are specifically configured to receive either a reinforcement learning agent ID or a reinforcement learning environment ID as the operand. The reinforcement learning processor uses neural network data paths to communicate with a neural network which in turn uses the actions, state-value functions, Q-values and reward values generated by the reinforcement learning processor to approximate an optimal state-value function as well as an optimal reward function.
    Type: Grant
    Filed: March 9, 2017
    Date of Patent: September 5, 2017
    Assignee: ALPHAICS CORPORATION
    Inventor: Nagendra Nagaraja
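
Illustrative note: several abstracts above describe 'Single Instruction Multiple Agents (SIMA)' instructions that take agent IDs as operands and apply one operation to a group of agents. The Python sketch below is only a loose software analogy of that idea, not the patented instruction set or hardware. The names `AgentState`, `sima_update_q`, and `memory`, the simple Q-value update rule, and the `alpha` step size are all assumptions made for illustration. The loop also runs sequentially, whereas the abstracts describe parallel execution.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical per-agent record held in an 'operations memory'."""
    q_values: dict = field(default_factory=dict)  # (state, action) -> Q value


def sima_update_q(memory, agent_ids, state, action, reward, alpha=0.5):
    """Apply one Q-value update to every agent named in agent_ids.

    One operation is issued with a vector of agent IDs as its operand,
    loosely mirroring the 'single instruction, multiple agents' idea.
    The update rule itself is an assumption, not taken from the patents.
    """
    for agent_id in agent_ids:
        agent = memory[agent_id]  # the ID acts as an index into memory
        old = agent.q_values.get((state, action), 0.0)
        # Move the stored estimate a fraction alpha toward the reward.
        agent.q_values[(state, action)] = old + alpha * (reward - old)


# Four hypothetical agents; the update targets agents 0, 2 and 3 only.
memory = {agent_id: AgentState() for agent_id in range(4)}
sima_update_q(memory, agent_ids=[0, 2, 3], state="s0", action="a1", reward=1.0)
```

In this sketch, agent 1 is left untouched because its ID is not in the operand vector, which is the selective, group-wise behaviour the abstracts attribute to ID-based operands.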