Patents by Inventor Don Joven Ravoy Agravante
Don Joven Ravoy Agravante has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11734575
Abstract: A computer-implemented method, computer program product, and computer processing system are provided for Hierarchical Reinforcement Learning (HRL) with a target task. The method includes obtaining, by a processor device, a sequence of tasks based on hierarchical relations between the tasks, the tasks constituting the target task. The method further includes learning, by a processor device, a sequence of constraints corresponding to the sequence of tasks by repeating, for each of the tasks in the sequence, reinforcement learning and supervised learning with a set of good samples and a set of bad samples, and by applying an obtained constraint for a current task to a next task.
Type: Grant
Filed: July 30, 2018
Date of Patent: August 22, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Don Joven Ravoy Agravante, Giovanni De Magistris, Tu-Hoa Pham, Ryuki Tachibana
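The constraint-sequencing loop described in this abstract can be sketched as follows. This is an illustrative Python sketch, not the patented method: `learn_constraint` stands in for the combined reinforcement-learning and supervised-learning step on good and bad samples, and the numeric bound that each task tightens and hands to the next task is a hypothetical placeholder for the learned constraint.

```python
def learn_constraint(task, inherited_bound):
    """Stand-in for the RL + supervised-learning step on good/bad samples."""
    # Each task tightens the constraint bound it inherits from the previous task.
    return inherited_bound * 0.5

def learn_constraint_sequence(tasks, initial_bound=1.0):
    """Learn one constraint per task, applying each to the next task in the sequence."""
    bounds = []
    bound = initial_bound
    for task in tasks:  # tasks ordered by their hierarchical relations
        bound = learn_constraint(task, bound)
        bounds.append(bound)
    return bounds

print(learn_constraint_sequence(["reach", "grasp", "place"]))  # [0.5, 0.25, 0.125]
```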
-
Patent number: 11676032
Abstract: A computer-implemented method is provided for training a multi-source sound localization model using labeled simulation data and unlabeled real data. The method includes inputting the labeled simulation data and the unlabeled real data respectively into a multi-source sound localization model of a neural network to obtain a localization heatmap from an output layer of the multi-source sound localization model for each of the labeled simulation data and the unlabeled real data. The method further includes inputting the localization heatmap for each of the labeled simulation data and the unlabeled real data into an output discriminator. The method also includes training the output discriminator so that the output discriminator assigns a domain class label to distinguish simulation data from real data. The method additionally includes training, by a hardware processor, the multi-source sound localization model by a first adversarial loss for the output discriminator with an original localization model loss.
Type: Grant
Filed: February 28, 2020
Date of Patent: June 13, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Guillaume Jean Victor Marie Le Moing, Don Joven Ravoy Agravante, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Tadanobu Inoue, Asim Munawar
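The combined objective in this abstract — an adversarial term for the output discriminator added to the original localization loss — can be illustrated numerically. This is a minimal sketch under assumed conventions (discriminator outputs P(domain = simulation); the localization model is pushed to make real data look simulated, i.e. toward label 1), with a hypothetical `adv_weight` blending factor:

```python
import math

def bce(p, label):
    """Binary cross-entropy of the domain discriminator's output probability."""
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def localization_training_loss(original_loss, p_sim_on_real, adv_weight=0.1):
    # Adversarial term: the localization model is rewarded when the
    # discriminator assigns real recordings to the simulation domain,
    # added on top of the original (simulation-supervised) localization loss.
    adversarial = bce(p_sim_on_real, 1.0)
    return original_loss + adv_weight * adversarial

print(round(localization_training_loss(0.5, 0.5), 4))
```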
-
Publication number: 20230177368
Abstract: A computer-implemented method of integrating an Artificial Intelligence (AI) planner and a reinforcement learning (RL) agent through AI planning annotation in RL (PaRL) includes identifying an RL problem. A received description of a Markov decision process (MDP) having a plurality of states in an RL environment is used to generate an RL task to solve the RL problem. An AI planning model described in a planning language is received, and mapping of state spaces from the MDP states in the RL environment to AI planning states of the AI planning model is performed. The RL task is combined with an AI planning task from the mapping to generate a PaRL task.
Type: Application
Filed: December 8, 2021
Publication date: June 8, 2023
Inventors: Junkyu Lee, Michael Katz, Shirin Sohrabi Araghi, Don Joven Ravoy Agravante, Miao Liu, Tamir Klinger, Murray Scott Campbell
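The state-space mapping step — abstracting low-level MDP states into planner-visible states — can be sketched as follows. This is a toy illustration, not the patented mapping: the state fields (`holding`, `x`), the PDDL-style fact tuples, and the zone discretization are all hypothetical.

```python
def map_to_planning_state(mdp_state):
    """Abstract a low-level MDP state into a set of symbolic planning facts."""
    facts = set()
    if mdp_state["holding"] is not None:
        facts.add(("holding", mdp_state["holding"]))
    else:
        facts.add(("hand-empty",))
    # discretize the continuous position into zones the planner can reason about
    facts.add(("at", "zone-left" if mdp_state["x"] < 0.5 else "zone-right"))
    return facts

print(sorted(map_to_planning_state({"holding": "block-a", "x": 0.8})))
```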
-
Patent number: 11537872
Abstract: A computer-implemented method, computer program product, and computer processing system are provided for obtaining a plurality of bad demonstrations. The method includes reading, by a processor device, a protagonist environment. The method further includes training, by the processor device, a plurality of antagonist agents to fail a task by reinforcement learning using the protagonist environment. The method also includes collecting, by the processor device, the plurality of bad demonstrations by playing the trained antagonist agents on the protagonist environment.
Type: Grant
Filed: July 30, 2018
Date of Patent: December 27, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Tu-Hoa Pham, Giovanni De Magistris, Don Joven Ravoy Agravante, Ryuki Tachibana
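The collection loop in this abstract can be sketched in miniature. This is a loose illustration, not the patented procedure: the 1-D environment, the random-search stand-in for reinforcement learning (picking the candidate policy that minimizes task reward, i.e. fails hardest), and all parameter names are hypothetical.

```python
import random

def rollout(env, policy):
    """Play a policy on the protagonist environment; return (trajectory, reward)."""
    state, trajectory, reward = env["start"], [], 0.0
    for _ in range(env["horizon"]):
        action = policy(state)
        state = state + action
        trajectory.append((state, action))
        reward -= abs(env["goal"] - state)  # staying near the goal scores higher
    return trajectory, reward

def collect_bad_demonstrations(env, n_agents=5, seed=0):
    rng = random.Random(seed)
    demos = []
    for _ in range(n_agents):
        # "Training" an antagonist is reduced here to keeping, among random
        # candidate policies, the one that minimizes task reward (fails the task).
        candidates = [(lambda s, a=rng.uniform(-1, 1): a) for _ in range(10)]
        antagonist = min(candidates, key=lambda p: rollout(env, p)[1])
        # play the trained antagonist to collect a bad demonstration
        demos.append(rollout(env, antagonist)[0])
    return demos

env = {"start": 0.0, "goal": 1.0, "horizon": 3}
bad_demos = collect_bad_demonstrations(env)
print(len(bad_demos), len(bad_demos[0]))  # 5 3
```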
-
Patent number: 11501157
Abstract: A method is provided for reinforcement learning. The method includes obtaining, by a processor device, a first set and a second set of state-action tuples. Each of the state-action tuples in the first set represents a respective good demonstration. Each of the state-action tuples in the second set represents a respective bad demonstration. The method further includes training, by the processor device using supervised learning with the first set and the second set, a neural network which takes as input a state to provide an output. The output is parameterized to obtain each of a plurality of real-valued constraint functions used for evaluation of each of a plurality of action constraints. The method also includes training, by the processor device, a policy using reinforcement learning by restricting actions predicted by the policy according to each of the plurality of action constraints with each of the plurality of real-valued constraint functions.
Type: Grant
Filed: July 30, 2018
Date of Patent: November 15, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Tu-Hoa Pham, Don Joven Ravoy Agravante, Giovanni De Magistris, Ryuki Tachibana
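The action-restriction step can be illustrated with a minimal sketch, assuming the common convention that a real-valued constraint function g(s, a) marks an action feasible when g(s, a) ≤ 0. The fixed `constraint_value` here is a stand-in for the trained network's output, and the nearest-feasible fallback is one illustrative restriction strategy, not necessarily the patented one:

```python
def constraint_value(state, action):
    """Stand-in for the trained network's real-valued constraint output g(s, a)."""
    return abs(action) - 1.0  # the action is feasible iff this value is <= 0

def restrict_action(state, proposed, candidates):
    """Keep a feasible proposal; otherwise fall back to the nearest feasible candidate."""
    if constraint_value(state, proposed) <= 0:
        return proposed
    feasible = [a for a in candidates if constraint_value(state, a) <= 0]
    return min(feasible, key=lambda a: abs(a - proposed))

print(restrict_action(None, 2.5, [-1.0, 0.0, 1.0]))  # 1.0 (proposal infeasible)
print(restrict_action(None, 0.3, [-1.0, 0.0, 1.0]))  # 0.3 (proposal kept)
```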
-
Publication number: 20220309383
Abstract: A method for inferring an operator, including a precondition and an effect of the operator, for a planning problem is disclosed. In the method, a set of examples, each of which includes a base state, an action, and a next state after performing the action in the base state, is prepared. In the method, variable lifting is performed in relation to the set of examples. In the method, a validity label is computed for each example in the set of examples. In the method, a model is trained by using the set of examples with the validity label so that the model is configured to receive an input state and a representation of an input action and output at least validity of the input action for the input state. In the method, the precondition of the operator based on the model and the effect of the operator are outputted.
Type: Application
Filed: March 24, 2021
Publication date: September 29, 2022
Inventors: Corentin Jacques Andre Sautier, Don Joven Ravoy Agravante, Michiaki Tatsubori
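The validity-labelling step and a simple precondition readout can be sketched as follows. This is an assumed STRIPS-style formulation, not the patented model: states are sets of facts, an action carries hypothetical add/delete lists, an example is "valid" when the action's nominal effect explains the observed state change, and the precondition is read out as the facts shared by all valid base states (a stand-in for the trained model).

```python
def validity_label(example):
    """An example is valid when the action's add/delete effect explains the change."""
    base, action, nxt = example
    return nxt == (base - action["delete"]) | action["add"]

def infer_precondition(examples):
    """Precondition = facts present in every valid example's base state."""
    valid_bases = [e[0] for e in examples if validity_label(e)]
    return set.intersection(*valid_bases) if valid_bases else set()

pickup = {"add": {"holding-a"}, "delete": {"hand-empty", "clear-a"}}
examples = [
    ({"hand-empty", "clear-a", "on-table-a"}, pickup,
     {"holding-a", "on-table-a"}),                        # consistent: valid
    ({"hand-empty", "clear-a"}, pickup, {"hand-empty"}),  # inconsistent: invalid
]
print(validity_label(examples[0]), validity_label(examples[1]))  # True False
print(sorted(infer_precondition(examples)))
```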
-
Patent number: 11425496
Abstract: Methods and systems for localizing a sound source include determining a spatial transformation between a position of a reference microphone array and a position of a displaced microphone array. A sound is measured at the reference microphone array and at the displaced microphone array. A source of the sound is localized using a neural network that includes respective paths for the reference microphone array and the displaced microphone array. The neural network further includes a transformation layer that represents the spatial transformation.
Type: Grant
Filed: May 1, 2020
Date of Patent: August 23, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Guillaume Jean Victor Marie Le Moing, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Don Joven Ravoy Agravante, Tadanobu Inoue, Asim Munawar
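The role of the transformation layer can be illustrated geometrically. This sketch assumes a 2-D rigid transform (rotation plus translation) and uses a plain average as a hypothetical stand-in for the network's learned fusion of the two per-array paths:

```python
import math

def apply_transform(point, angle, translation):
    """Map a point from the displaced array's frame into the reference frame."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + translation[0], s * x + c * y + translation[1])

def fuse_estimates(ref_estimate, displaced_estimate, angle, translation):
    # Align the displaced path's estimate via the known spatial transform,
    # then combine; a simple average stands in for the learned fusion.
    mapped = apply_transform(displaced_estimate, angle, translation)
    return tuple((r + m) / 2 for r, m in zip(ref_estimate, mapped))

# arrays rotated 90 degrees apart, no translation
print(fuse_estimates((1.0, 0.0), (0.0, 1.0), math.pi / 2, (0.0, 0.0)))
```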
-
Publication number: 20220198255
Abstract: Methods and systems for training a semantic parser include performing an automated intervention action in a text-based environment. An inverse action is performed in the text-based environment to reverse the intervention action. States of the text-based environment are recorded before and after the intervention action and the inverse action. The recorded states are evaluated to generate training data. A semantic parser neural network model is trained using the training data.
Type: Application
Filed: December 17, 2020
Publication date: June 23, 2022
Inventors: Corentin Jacques Andre Sautier, Don Joven Ravoy Agravante, Michiaki Tatsubori
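The intervention/inverse data-collection loop can be sketched with a toy text world. Everything here is hypothetical (the open/close dynamics, the dictionary state, the effect diff); it only illustrates the pattern of intervening, recording the state change, and undoing the intervention:

```python
def apply_action(env, command):
    """Toy text-world dynamics: 'open X' / 'close X' sets object X's state."""
    verb, obj = command.split()
    env[obj] = (verb == "open")

def collect_example(env, intervention, inverse):
    before = dict(env)                   # state recorded before intervening
    apply_action(env, intervention)      # automated intervention action
    after = dict(env)                    # state recorded after intervening
    apply_action(env, inverse)           # inverse action restores the environment
    effect = {k: v for k, v in after.items() if before.get(k) != v}
    return intervention, effect          # (command text, observed effect) pair

env = {"door": False}
print(collect_example(env, "open door", "close door"))  # ('open door', {'door': True})
print(env)  # the inverse restored it: {'door': False}
```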
-
Publication number: 20210345039
Abstract: Methods and systems for localizing a sound source include determining a spatial transformation between a position of a reference microphone array and a position of a displaced microphone array. A sound is measured at the reference microphone array and at the displaced microphone array. A source of the sound is localized using a neural network that includes respective paths for the reference microphone array and the displaced microphone array. The neural network further includes a transformation layer that represents the spatial transformation.
Type: Application
Filed: May 1, 2020
Publication date: November 4, 2021
Inventors: Guillaume Jean Victor Marie Le Moing, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Don Joven Ravoy Agravante, Tadanobu Inoue, Asim Munawar
-
Publication number: 20210271978
Abstract: A computer-implemented method is provided for training a multi-source sound localization model using labeled simulation data and unlabeled real data. The method includes inputting the labeled simulation data and the unlabeled real data respectively into a multi-source sound localization model of a neural network to obtain a localization heatmap from an output layer of the multi-source sound localization model for each of the labeled simulation data and the unlabeled real data. The method further includes inputting the localization heatmap for each of the labeled simulation data and the unlabeled real data into an output discriminator. The method also includes training the output discriminator so that the output discriminator assigns a domain class label to distinguish simulation data from real data. The method additionally includes training, by a hardware processor, the multi-source sound localization model by a first adversarial loss for the output discriminator with an original localization model loss.
Type: Application
Filed: February 28, 2020
Publication date: September 2, 2021
Inventors: Guillaume Jean Victor Marie Le Moing, Don Joven Ravoy Agravante, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Tadanobu Inoue, Asim Munawar
-
Publication number: 20200034704
Abstract: A computer-implemented method, computer program product, and computer processing system are provided for Hierarchical Reinforcement Learning (HRL) with a target task. The method includes obtaining, by a processor device, a sequence of tasks based on hierarchical relations between the tasks, the tasks constituting the target task. The method further includes learning, by a processor device, a sequence of constraints corresponding to the sequence of tasks by repeating, for each of the tasks in the sequence, reinforcement learning and supervised learning with a set of good samples and a set of bad samples, and by applying an obtained constraint for a current task to a next task.
Type: Application
Filed: July 30, 2018
Publication date: January 30, 2020
Inventors: Don Joven Ravoy Agravante, Giovanni De Magistris, Tu-Hoa Pham, Ryuki Tachibana
-
Publication number: 20200034706
Abstract: A computer-implemented method, computer program product, and computer processing system are provided for obtaining a plurality of bad demonstrations. The method includes reading, by a processor device, a protagonist environment. The method further includes training, by the processor device, a plurality of antagonist agents to fail a task by reinforcement learning using the protagonist environment. The method also includes collecting, by the processor device, the plurality of bad demonstrations by playing the trained antagonist agents on the protagonist environment.
Type: Application
Filed: July 30, 2018
Publication date: January 30, 2020
Inventors: Tu-Hoa Pham, Giovanni De Magistris, Don Joven Ravoy Agravante, Ryuki Tachibana
-
Publication number: 20200034705
Abstract: A method is provided for reinforcement learning. The method includes obtaining, by a processor device, a first set and a second set of state-action tuples. Each of the state-action tuples in the first set represents a respective good demonstration. Each of the state-action tuples in the second set represents a respective bad demonstration. The method further includes training, by the processor device using supervised learning with the first set and the second set, a neural network which takes as input a state to provide an output. The output is parameterized to obtain each of a plurality of real-valued constraint functions used for evaluation of each of a plurality of action constraints. The method also includes training, by the processor device, a policy using reinforcement learning by restricting actions predicted by the policy according to each of the plurality of action constraints with each of the plurality of real-valued constraint functions.
Type: Application
Filed: July 30, 2018
Publication date: January 30, 2020
Inventors: Tu-Hoa Pham, Don Joven Ravoy Agravante, Giovanni De Magistris, Ryuki Tachibana