Patents by Inventor James MACGLASHAN
James MACGLASHAN has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12354027
Abstract: A method and system for teaching an artificially intelligent agent, where the agent can be placed in a state that one would like it to learn how to achieve. By giving the agent several examples, it can learn to identify what is important about these example states. Once the agent has the ability to recognize a goal configuration, it can then use that information to learn how to achieve the goal states on its own. An agent may be provided with positive and negative examples to demonstrate a goal configuration. Once the agent has learned certain goal configurations, the agent can learn policies and skills that achieve the learned goal configuration. The agent may create a collection of these policies and skills from which to select based on a particular command or state.
Type: Grant
Filed: April 3, 2018
Date of Patent: July 8, 2025
Assignee: SONY GROUP CORPORATION
Inventors: Mark Bishop Ring, Satinder Baveja, Peter Stone, James MacGlashan, Samuel Barrett, Roberto Capobianco, Varun Kompella, Kaushik Subramanian, Peter Wurman
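The central idea in this abstract, learning to recognize a goal configuration from positive and negative example states and then exposing that recognizer as a reward signal for policy learning, can be pictured with a small sketch. This is a minimal illustration rather than the patented method; the 3-dimensional states, the logistic-regression goal classifier, and the `goal_reward` helper are assumptions made only for the example.

```python
import numpy as np

# Sketch: learn a goal-configuration recognizer from positive/negative example
# states, then expose it as a reward an RL learner could optimize.
# Logistic regression stands in for whatever model the patented system uses.

rng = np.random.default_rng(0)

def make_examples(n, goal):
    """Positive examples cluster around the goal state; negatives do not."""
    pos = goal + 0.1 * rng.normal(size=(n, goal.size))
    neg = rng.uniform(-1.0, 1.0, size=(n, goal.size))
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(n), np.zeros(n)])
    return X, y

def train_goal_classifier(X, y, lr=0.5, steps=2000):
    """Fit logistic-regression weights by gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def goal_reward(state, w, b, threshold=0.5):
    """Reward signal: 1 when the learned recognizer believes the goal holds."""
    p = 1.0 / (1.0 + np.exp(-(state @ w + b)))
    return float(p > threshold)

goal = np.array([0.8, -0.3, 0.5])
X, y = make_examples(200, goal)
w, b = train_goal_classifier(X, y)

print(goal_reward(goal, w, b))                         # at the goal -> 1.0
print(goal_reward(np.array([-0.9, 0.9, -0.9]), w, b))  # far away -> 0.0
```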
-
Patent number: 12277194
Abstract: A task prioritized experience replay (TaPER) algorithm enables simultaneous learning of multiple RL tasks off policy. The algorithm can prioritize samples that were part of fixed-length episodes that led to the achievement of tasks. This enables the agent to quickly learn task policies by bootstrapping over its early successes. Finally, TaPER can improve performance on all tasks simultaneously, which is a desirable characteristic for multi-task RL. Unlike conventional ER algorithms, which are applied to single RL task learning settings, require rewards to be binary or abundant, or require goals to be provided as a parameterized specification, TaPER poses no such restrictions and supports arbitrary reward and task specifications.
Type: Grant
Filed: September 29, 2020
Date of Patent: April 15, 2025
Assignee: SONY GROUP CORPORATION
Inventors: Varun Kompella, James MacGlashan, Peter Wurman, Peter Stone
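One way to read the prioritization described in the abstract is that transitions from episodes that achieved their task are sampled more often than transitions from unsuccessful episodes, letting the agent bootstrap on early successes. The sketch below is a loose illustration of that reading and not the patented TaPER sampling rule; the `success_weight` parameter and the uniform weight for failures are assumptions.

```python
import random
from collections import deque

class TaskPrioritizedReplay:
    """Sketch of a replay buffer that prioritizes transitions from episodes
    that achieved their task. Illustrative only, not the patented TaPER rule."""

    def __init__(self, capacity=10_000, success_weight=10.0):
        self.buffer = deque(maxlen=capacity)   # (transition, achieved_task)
        self.success_weight = success_weight

    def add_episode(self, transitions, achieved_task):
        for t in transitions:
            self.buffer.append((t, achieved_task))

    def sample(self, batch_size):
        weights = [self.success_weight if ok else 1.0 for _, ok in self.buffer]
        picks = random.choices(list(self.buffer), weights=weights, k=batch_size)
        return [t for t, _ in picks]

# Usage: successful episodes dominate the sampled batches while failures are
# still occasionally replayed.
replay = TaskPrioritizedReplay()
replay.add_episode([("s0", "a0", 0.0, "s1"), ("s1", "a1", 1.0, "s2")], achieved_task=True)
replay.add_episode([("s0", "a2", 0.0, "s3")], achieved_task=False)
print(replay.sample(4))
```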
-
Patent number: 12217156
Abstract: A real-time temporal convolution network (RT-TCN) algorithm reuses the output of prior convolution operations in all layers of the network to minimize the computational requirements and memory footprint of a TCN during real-time evaluation. Further, a TCN trained via the fixed-window view, where the TCN is trained using fixed time splices of the input time series, can be executed in real-time continually using RT-TCN.
Type: Grant
Filed: August 20, 2020
Date of Patent: February 4, 2025
Assignee: SONY GROUP CORPORATION
Inventors: Piyush Khandelwal, James MacGlashan, Peter Wurman, Fabrizio Santini
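The abstract's central trick, reusing prior convolution work so that only the newest timestep must be computed at inference time, can be sketched for a single causal 1-D convolution layer. The ring-buffer cache, the layer shapes, and the zero-padded history below are assumptions for illustration; a full RT-TCN applies analogous caching at every layer of the network.

```python
import numpy as np

class StreamingCausalConv1D:
    """Sketch: causal 1-D convolution that caches the last (kernel_size - 1)
    inputs so each new timestep costs one kernel application instead of
    recomputing the whole window. Shapes and API are illustrative only."""

    def __init__(self, kernel, bias=0.0):
        self.kernel = np.asarray(kernel, dtype=float)   # oldest tap first
        self.bias = bias
        self.cache = np.zeros(len(kernel) - 1)          # zero-padded past inputs

    def step(self, x_t):
        window = np.concatenate([self.cache, [x_t]])
        self.cache = window[1:]                         # slide the cache forward
        return float(window @ self.kernel + self.bias)

def offline_causal_conv(x, kernel, bias=0.0):
    """Reference: full fixed-window evaluation with zero left-padding."""
    k = len(kernel)
    padded = np.concatenate([np.zeros(k - 1), x])
    return np.array([padded[t:t + k] @ kernel + bias for t in range(len(x))])

rng = np.random.default_rng(1)
x = rng.normal(size=20)
kernel = np.array([0.25, -0.5, 1.0])

layer = StreamingCausalConv1D(kernel)
streamed = np.array([layer.step(v) for v in x])
print(np.allclose(streamed, offline_causal_conv(x, kernel)))  # True
```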
-
Patent number: 12153385
Abstract: Systems and methods are used to adapt the coefficients of a proportional-integral-derivative (PID) controller through reinforcement learning. The approach for adapting PID coefficients can include an outer loop of reinforcement learning, where the PID coefficients are tuned to changes in the environment, and an inner loop of PID control for quickly reacting to changing errors. The outer loop can learn and adapt as the environment changes and can be configured to run only at a predetermined frequency, i.e., after a given number of steps. The outer loop can use summary statistics about the error terms, along with any other information sensed about the environment, to calculate an observation. This observation can be used to evaluate the next action, for example by feeding it into a neural network representing the policy. The resulting action is the coefficients of the PID controller and the tunable parameters of components such as filters.
Type: Grant
Filed: May 7, 2021
Date of Patent: November 26, 2024
Assignees: SONY GROUP CORPORATION, SONY CORPORATION OF AMERICA
Inventors: Samuel Barrett, James MacGlashan, Varun Kompella, Peter Wurman, Goker Erdogan, Fabrizio Santini
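A minimal way to picture the two loops: an inner PID controller runs on every step, while an outer adaptation step periodically maps summary statistics of recent errors to new PID gains. The toy first-order plant, the particular summary statistics, and the hand-written `gain_policy` below are assumptions made for the sketch; in the described approach a trained neural-network policy would produce the gains.

```python
import numpy as np

class PID:
    """Inner loop: standard PID controller."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def control(self, error, dt=0.1):
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def gain_policy(observation):
    """Outer-loop stand-in: map error statistics to new PID gains.
    A trained policy network would replace this hand-written rule."""
    mean_abs_err, err_var = observation
    kp = 1.0 + 2.0 * mean_abs_err          # push harder when error is large
    ki = 0.1
    kd = 0.05 + 0.5 * np.sqrt(err_var)     # damp more when error is noisy
    return kp, ki, kd

# Toy first-order plant: x' = -0.5 x + u, setpoint = 1.0
x, setpoint, dt = 0.0, 1.0, 0.1
pid = PID(1.0, 0.1, 0.05)
errors = []

for step in range(200):
    error = setpoint - x
    u = pid.control(error, dt)
    x += (-0.5 * x + u) * dt               # simulate one plant step
    errors.append(error)

    if (step + 1) % 50 == 0:               # outer loop: every 50 inner steps
        obs = (np.mean(np.abs(errors[-50:])), np.var(errors[-50:]))
        pid.kp, pid.ki, pid.kd = gain_policy(obs)

print(round(x, 3))  # settles near the setpoint
```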
-
Patent number: 12017148
Abstract: A user interface (UI) for analyzing model training runs and for tracking and visualizing various aspects of machine learning experiments can be used when training an artificially intelligent agent in, for example, a racing game environment. The UI can be web-based and can allow researchers to easily see the status of their experiments. The UI can include an experiment synchronized event viewer that synchronizes visualizations, videos, and timeline/metrics graphs in the experiment. This viewer allows researchers to see how experiments unfold in great detail. The UI can further include an experiment event annotation feature that generates event annotations. These annotations can be displayed via the synchronized event viewer. The UI can also be used to review consolidated results across experiments, including videos. For example, the UI can provide a reusable dashboard that can capture and compare metrics across multiple experiments.
Type: Grant
Filed: May 31, 2022
Date of Patent: June 25, 2024
Assignee: SONY GROUP CORPORATION
Inventors: Rory Douglas, Dion Whitehead, Leon Barrett, Piyush Khandelwal, Thomas Walsh, Samuel Barrett, Kaushik Subramanian, James MacGlashan, Leilani Gilpin, Peter Wurman
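The synchronized event viewer and annotation features described above imply a data model in which metrics, annotations, and video offsets share a common experiment timeline. The sketch below is only one plausible shape for such a record; the field names and the `compare_metric` helper are assumptions, not the patented UI's schema.

```python
from collections import defaultdict

class ExperimentLog:
    """Sketch of a timeline-keyed record a synchronized event viewer could
    render: metrics, annotations, and video offsets all share the same
    training-step axis. Field names are illustrative assumptions."""

    def __init__(self, name):
        self.name = name
        self.metrics = defaultdict(list)     # metric name -> [(step, value)]
        self.annotations = []                # [(step, text)]
        self.video_offsets = []              # [(step, seconds_into_video)]

    def log_metric(self, step, name, value):
        self.metrics[name].append((step, value))

    def annotate(self, step, text):
        self.annotations.append((step, text))

def compare_metric(logs, metric):
    """Dashboard-style comparison of one metric across experiments."""
    return {log.name: max(v for _, v in log.metrics[metric]) for log in logs}

run_a, run_b = ExperimentLog("run_a"), ExperimentLog("run_b")
for step, reward in enumerate([0.1, 0.4, 0.7]):
    run_a.log_metric(step, "lap_reward", reward)
    run_b.log_metric(step, "lap_reward", reward * 0.8)
run_a.annotate(2, "agent learned to brake before the hairpin")

print(compare_metric([run_a, run_b], "lap_reward"))
```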
-
Publication number: 20230381660
Abstract: A user interface (UI) for analyzing model training runs and for tracking and visualizing various aspects of machine learning experiments can be used when training an artificially intelligent agent in, for example, a racing game environment. The UI can be web-based and can allow researchers to easily see the status of their experiments. The UI can include an experiment synchronized event viewer that synchronizes visualizations, videos, and timeline/metrics graphs in the experiment. This viewer allows researchers to see how experiments unfold in great detail. The UI can further include an experiment event annotation feature that generates event annotations. These annotations can be displayed via the synchronized event viewer. The UI can also be used to review consolidated results across experiments, including videos. For example, the UI can provide a reusable dashboard that can capture and compare metrics across multiple experiments.
Type: Application
Filed: May 31, 2022
Publication date: November 30, 2023
Inventors: Rory Douglas, Dion Whitehead, Leon Barrett, Piyush Khandelwal, Thomas Walsh, Samuel Barrett, Kaushik Subramanian, James MacGlashan, Leilani Gilpin, Peter Wurman
-
Patent number: 11816591
Abstract: The Double Actor Critic (DAC) reinforcement-learning algorithm affords stable policy improvement and aggressive neural-net optimization without catastrophic overfitting of the policy. DAC trains models using an arbitrary history of data in both offline and online learning and can be used to smoothly improve on an existing policy learned or defined by some other means. Finally, DAC can optimize reinforcement learning problems with discrete and continuous action spaces.
Type: Grant
Filed: February 25, 2020
Date of Patent: November 14, 2023
Assignees: SONY GROUP CORPORATION, SONY CORPORATION OF AMERICA
Inventor: James Macglashan
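The abstract describes the setting (actor and critic models improved from an arbitrary history of logged data, offline and then online) rather than the update equations. The sketch below is therefore a generic tabular actor-critic trained from a fixed transition log, explicitly not the patented DAC procedure; the toy MDP, the expected-value TD target, and the omission of off-policy corrections are all simplifying assumptions.

```python
import numpy as np

# Generic tabular actor-critic trained from a fixed log of transitions.
# This is NOT the patented DAC update; it only illustrates the setting the
# abstract describes: improving a policy offline from an arbitrary history of
# data. Importance weighting and other off-policy corrections are omitted.

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 2, 2, 0.9

def env_step(s, a):
    reward = 1.0 if a == s else 0.0          # toy MDP: match action to state
    return reward, rng.integers(n_states)

# Arbitrary history: transitions collected by a uniform-random behavior policy.
history, s = [], 0
for _ in range(2000):
    a = rng.integers(n_actions)
    r, s_next = env_step(s, a)
    history.append((s, a, r, s_next))
    s = s_next

theta = np.zeros((n_states, n_actions))      # actor preferences
q = np.zeros((n_states, n_actions))          # critic

def policy(s):
    p = np.exp(theta[s] - theta[s].max())
    return p / p.sum()

def update(s, a, r, s_next, lr=0.05):
    target = r + gamma * policy(s_next) @ q[s_next]   # expected value under actor
    q[s, a] += lr * (target - q[s, a])                # critic TD update
    advantage = q[s, a] - policy(s) @ q[s]
    grad = -policy(s)
    grad[a] += 1.0
    theta[s] += lr * advantage * grad                 # actor policy-gradient step

for _ in range(20):                                   # offline passes over the log
    for transition in history:
        update(*transition)

print(policy(0).round(2), policy(1).round(2))         # prefers a == s in each state
```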
-
Publication number: 20220365493
Abstract: Systems and methods are used to adapt the coefficients of a proportional-integral-derivative (PID) controller through reinforcement learning. The approach for adapting PID coefficients can include an outer loop of reinforcement learning, where the PID coefficients are tuned to changes in the environment, and an inner loop of PID control for quickly reacting to changing errors. The outer loop can learn and adapt as the environment changes and can be configured to run only at a predetermined frequency, i.e., after a given number of steps. The outer loop can use summary statistics about the error terms, along with any other information sensed about the environment, to calculate an observation. This observation can be used to evaluate the next action, for example by feeding it into a neural network representing the policy. The resulting action is the coefficients of the PID controller and the tunable parameters of components such as filters.
Type: Application
Filed: May 7, 2021
Publication date: November 17, 2022
Inventors: Samuel Barrett, James MacGlashan, Varun Kompella, Peter Wurman, Goker Erdogan, Fabrizio Santini
-
Patent number: 11443229
Abstract: A method and system for teaching an artificially intelligent agent includes giving the agent several example states from which it can learn to identify what is important about those states. Once the agent has the ability to recognize a goal configuration, it can then use that information to learn how to achieve the goal states on its own. An agent may be provided with positive and negative examples to demonstrate a goal configuration. Once the agent has learned certain goal configurations, the agent can learn an option to achieve the goal configuration and a distance function that predicts at least one of a distance and a duration to the goal configuration under the learned option. This distance function prediction may be incorporated as a state feature of the agent.
Type: Grant
Filed: August 31, 2018
Date of Patent: September 13, 2022
Assignees: Sony Group Corporation, Sony Corporation of America
Inventors: Mark Bishop Ring, Satinder Baveja, Roberto Capobianco, Varun Kompella, Kaushik Subramanian, James MacGlashan
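The added element relative to the earlier goal-configuration filing is a learned distance/duration predictor: from trajectories gathered while the learned option runs to the goal, a regressor predicts the remaining steps, and that prediction is appended to the agent's state. The toy option, the nearest-neighbour regressor, and the 2-D state below are illustrative assumptions, not the patented construction.

```python
import numpy as np

# Sketch: learn a distance/duration-to-goal predictor from option trajectories,
# then append its prediction to the agent's state as an extra feature.

rng = np.random.default_rng(0)
goal = np.array([1.0, 0.0])
step_size = 0.2

def rollout_option(start):
    """Toy 'option': walk straight toward the goal, recording each state."""
    states, s = [], np.array(start, dtype=float)
    while np.linalg.norm(goal - s) > step_size:
        states.append(s.copy())
        s += step_size * (goal - s) / np.linalg.norm(goal - s)
    states.append(s.copy())
    return states

# Training data: (state, number of option steps remaining until the goal).
X, y = [], []
for _ in range(200):
    traj = rollout_option(rng.uniform(-2, 2, size=2))
    for i, s in enumerate(traj):
        X.append(s)
        y.append(len(traj) - 1 - i)
X, y = np.array(X), np.array(y, dtype=float)

def predict_steps(state):
    """Predicted duration to the goal under the option (1-nearest neighbour)."""
    return y[np.argmin(np.linalg.norm(X - state, axis=1))]

def augmented_state(state):
    """Agent state with the predicted distance/duration appended as a feature."""
    return np.append(state, predict_steps(state))

print(augmented_state(np.array([-1.0, 0.0])))  # last entry: predicted steps to goal
```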
-
Publication number: 20220101064
Abstract: A task prioritized experience replay (TaPER) algorithm enables simultaneous learning of multiple RL tasks off policy. The algorithm can prioritize samples that were part of fixed-length episodes that led to the achievement of tasks. This enables the agent to quickly learn task policies by bootstrapping over its early successes. Finally, TaPER can improve performance on all tasks simultaneously, which is a desirable characteristic for multi-task RL. Unlike conventional ER algorithms, which are applied to single RL task learning settings, require rewards to be binary or abundant, or require goals to be provided as a parameterized specification, TaPER poses no such restrictions and supports arbitrary reward and task specifications.
Type: Application
Filed: September 29, 2020
Publication date: March 31, 2022
Inventors: Varun Kompella, James MacGlashan, Peter Wurman, Peter Stone
-
Publication number: 20220067504
Abstract: Reinforcement learning methods can use actor-critic networks where (1) additional laboratory-only state information is used to train a policy that must act without this additional laboratory-only information in a production setting; and (2) complex resource-demanding policies are distilled into a less-demanding policy that can be more easily run at production with limited computational resources. The production actor network can be optimized using a frozen version of a large critic network, previously trained with a large actor network. Aspects of these methods can leverage actor-critic methods in which the critic network models the action value function, as opposed to the state value function.
Type: Application
Filed: August 26, 2020
Publication date: March 3, 2022
Inventors: Piyush Khandelwal, James MacGlashan, Peter Wurman
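The second half of the abstract, optimizing a small production actor against a frozen copy of a large critic, can be sketched in a few lines. The quadratic stand-in critic, the one-weight linear actor, and the finite-difference gradient (used here in place of backpropagating through a critic network) are assumptions chosen to keep the example self-contained.

```python
import numpy as np

# Sketch of the distillation step: a small "production" actor is optimized
# against a frozen critic that was trained earlier alongside a large actor.

rng = np.random.default_rng(0)

def frozen_critic(state, action):
    """Pretend Q(s, a), already trained and now frozen. Best action is a = 2s."""
    return -(action - 2.0 * state) ** 2

def production_actor(state, w):
    """Small, cheap-to-run policy: a single linear weight."""
    return w * state

def objective(w, states):
    """Average critic value of the production actor's actions."""
    return np.mean(frozen_critic(states, production_actor(states, w)))

# Optimize the small actor against the frozen critic by finite-difference
# gradient ascent (a stand-in for backpropagating through the critic).
w, lr, eps = 0.0, 0.05, 1e-4
states = rng.uniform(-1.0, 1.0, size=256)
for _ in range(500):
    grad = (objective(w + eps, states) - objective(w - eps, states)) / (2 * eps)
    w += lr * grad

print(round(w, 2))  # approaches 2.0, the action the frozen critic prefers
```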
-
Publication number: 20210312258
Abstract: A real-time temporal convolution network (RT-TCN) algorithm reuses the output of prior convolution operations in all layers of the network to minimize the computational requirements and memory footprint of a TCN during real-time evaluation. Further, a TCN trained via the fixed-window view, where the TCN is trained using fixed time splices of the input time series, can be executed in real-time continually using RT-TCN.
Type: Application
Filed: August 20, 2020
Publication date: October 7, 2021
Inventors: Piyush Khandelwal, James MacGlashan, Peter Wurman, Fabrizio Santini
-
Publication number: 20200302323
Abstract: The Double Actor Critic (DAC) reinforcement-learning algorithm affords stable policy improvement and aggressive neural-net optimization without catastrophic overfitting of the policy. DAC trains models using an arbitrary history of data in both offline and online learning and can be used to smoothly improve on an existing policy learned or defined by some other means. Finally, DAC can optimize reinforcement learning problems with discrete and continuous action spaces.
Type: Application
Filed: February 25, 2020
Publication date: September 24, 2020
Inventor: James MACGLASHAN
-
Publication number: 20200218992
Abstract: A method and system for training and/or operating an artificially intelligent agent can use multi-input and/or multi-forecast networks. Multi-forecasts are computational constructs, typically but not necessarily neural networks, whose shared network weights can be used to compute multiple related forecasts. This allows for more efficient training, in terms of the amount of data and/or experience needed, and in some instances for more efficient computation of those forecasts. There are several related and sometimes composable approaches to multi-forecast networks.
Type: Application
Filed: January 2, 2020
Publication date: July 9, 2020
Inventors: Roberto Capobianco, Varun Kompella, Kaushik Subramanian, James Macglashan, Peter Wurman, Satinder Baveja
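The shared-weights idea can be pictured as a single trunk whose output is reused by several forecast heads, so related predictions share both data and computation. In the sketch below the trunk is a fixed random-feature layer and each head is fit by least squares; a real multi-forecast network would also train the shared weights, and the horizons, targets, and trunk width are illustrative assumptions.

```python
import numpy as np

# Sketch of a multi-forecast network: one shared trunk, several forecast heads.

rng = np.random.default_rng(0)
trunk_w = rng.normal(size=(1, 64))          # shared trunk weights
trunk_b = rng.normal(size=64)

def trunk(x):
    """Shared representation, computed once and reused by every head."""
    return np.tanh(x[:, None] @ trunk_w + trunk_b)

# Related forecasts: the same signal at several future horizons.
x = np.linspace(0, 6, 200)
horizons = [0.1, 0.5, 1.0]
targets = [np.sin(x + h) for h in horizons]

features = trunk(x)                          # one trunk pass serves all heads
heads = [np.linalg.lstsq(features, t, rcond=None)[0] for t in targets]

def forecast_all(x_query):
    """Compute every related forecast from a single trunk evaluation."""
    f = trunk(np.atleast_1d(x_query))
    return np.array([f @ w for w in heads]).ravel()

print(forecast_all(2.0).round(3))            # ~ [sin(2.1), sin(2.5), sin(3.0)]
```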
-
Publication number: 20200074349
Abstract: A method and system for teaching an artificially intelligent agent includes giving the agent several example states from which it can learn to identify what is important about those states. Once the agent has the ability to recognize a goal configuration, it can then use that information to learn how to achieve the goal states on its own. An agent may be provided with positive and negative examples to demonstrate a goal configuration. Once the agent has learned certain goal configurations, the agent can learn an option to achieve the goal configuration and a distance function that predicts at least one of a distance and a duration to the goal configuration under the learned option. This distance function prediction may be incorporated as a state feature of the agent.
Type: Application
Filed: August 31, 2018
Publication date: March 5, 2020
Inventors: Mark Bishop RING, Satinder BAVEJA, Roberto CAPOBIANCO, Varun KOMPELLA, Kaushik SUBRAMANIAN, James MACGLASHAN
-
Publication number: 20190303776
Abstract: A method and system for teaching an artificially intelligent agent, where the agent can be placed in a state that one would like it to learn how to achieve. By giving the agent several examples, it can learn to identify what is important about these example states. Once the agent has the ability to recognize a goal configuration, it can then use that information to learn how to achieve the goal states on its own. An agent may be provided with positive and negative examples to demonstrate a goal configuration. Once the agent has learned certain goal configurations, the agent can learn policies and skills that achieve the learned goal configuration. The agent may create a collection of these policies and skills from which to select based on a particular command or state.
Type: Application
Filed: April 3, 2018
Publication date: October 3, 2019
Applicant: COGITAI, INC.
Inventors: Mark Bishop RING, Satinder BAVEJA, Peter STONE, James MACGLASHAN, Samuel BARRETT, Roberto CAPOBIANCO, Varun KOMPELLA, Kaushik SUBRAMANIAN, Peter WURMAN