Patents by Inventor Don Joven Ravoy Agravante
Don Joven Ravoy Agravante has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11734575
Abstract: A computer-implemented method, computer program product, and computer processing system are provided for Hierarchical Reinforcement Learning (HRL) with a target task. The method includes obtaining, by a processor device, a sequence of tasks based on hierarchical relations between the tasks, the tasks constituting the target task. The method further includes learning, by a processor device, a sequence of constraints corresponding to the sequence of tasks by repeating, for each of the tasks in the sequence, reinforcement learning and supervised learning with a set of good samples and a set of bad samples, and by applying an obtained constraint for a current task to a next task.
Type: Grant
Filed: July 30, 2018
Date of Patent: August 22, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Don Joven Ravoy Agravante, Giovanni De Magistris, Tu-Hoa Pham, Ryuki Tachibana
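The constraint-sequencing loop described in this abstract can be sketched as follows. This is an illustrative Python sketch, not the patented method: `learn_constraint` stands in for the combined reinforcement-learning and supervised-learning step on good and bad samples, and the numeric bound that each task tightens and hands to the next task is a hypothetical placeholder for the learned constraint.

```python
def learn_constraint(task, inherited_bound):
    """Stand-in for the RL + supervised-learning step on good/bad samples."""
    # Each task tightens the constraint bound it inherits from the previous task.
    return inherited_bound * 0.5

def learn_constraint_sequence(tasks, initial_bound=1.0):
    """Learn one constraint per task, applying each to the next task in the sequence."""
    bounds = []
    bound = initial_bound
    for task in tasks:  # tasks ordered by their hierarchical relations
        bound = learn_constraint(task, bound)
        bounds.append(bound)
    return bounds

print(learn_constraint_sequence(["reach", "grasp", "place"]))  # [0.5, 0.25, 0.125]
```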
-
Patent number: 11676032
Abstract: A computer-implemented method is provided for training a multi-source sound localization model using labeled simulation data and unlabeled real data. The method includes inputting the labeled simulation data and the unlabeled real data respectively into a multi-source sound localization model of a neural network to obtain a localization heatmap from an output layer of the multi-source sound localization model for each of the labeled simulation data and the unlabeled real data. The method further includes inputting the localization heatmap for each of the labeled simulation data and the unlabeled real data into an output discriminator. The method also includes training the output discriminator so that the output discriminator assigns a domain class label to distinguish simulation data from real data. The method additionally includes training, by a hardware processor, the multi-source sound localization model by a first adversarial loss for the output discriminator with an original localization model loss.
Type: Grant
Filed: February 28, 2020
Date of Patent: June 13, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Guillaume Jean Victor Marie Le Moing, Don Joven Ravoy Agravante, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Tadanobu Inoue, Asim Munawar
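The combined objective in this abstract — an adversarial term for the output discriminator added to the original localization loss — can be illustrated numerically. This is a minimal sketch under assumed conventions (discriminator outputs P(domain = simulation); the localization model is pushed to make real data look simulated, i.e. toward label 1), with a hypothetical `adv_weight` blending factor:

```python
import math

def bce(p, label):
    """Binary cross-entropy of the domain discriminator's output probability."""
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def localization_training_loss(original_loss, p_sim_on_real, adv_weight=0.1):
    # Adversarial term: the localization model is rewarded when the
    # discriminator assigns real recordings to the simulation domain,
    # added on top of the original (simulation-supervised) localization loss.
    adversarial = bce(p_sim_on_real, 1.0)
    return original_loss + adv_weight * adversarial

print(round(localization_training_loss(0.5, 0.5), 4))
```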
-
Publication number: 20230177368
Abstract: A computer-implemented method of integrating an Artificial Intelligence (AI) planner and a reinforcement learning (RL) agent through AI planning annotation in RL (PaRL) includes identifying an RL problem. A received description of a Markov decision process (MDP) having a plurality of states in an RL environment is used to generate an RL task to solve the RL problem. An AI planning model described in a planning language is received, and mapping of state spaces from the MDP states in the RL environment to AI planning states of the AI planning model is performed. The RL task is combined with an AI planning task from the mapping to generate a PaRL task.
Type: Application
Filed: December 8, 2021
Publication date: June 8, 2023
Inventors: Junkyu Lee, Michael Katz, Shirin Sohrabi Araghi, Don Joven Ravoy Agravante, Miao Liu, Tamir Klinger, Murray Scott Campbell
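The state-space mapping step — abstracting low-level MDP states into planner-visible states — can be sketched as follows. This is a toy illustration, not the patented mapping: the state fields (`holding`, `x`), the PDDL-style fact tuples, and the zone discretization are all hypothetical.

```python
def map_to_planning_state(mdp_state):
    """Abstract a low-level MDP state into a set of symbolic planning facts."""
    facts = set()
    if mdp_state["holding"] is not None:
        facts.add(("holding", mdp_state["holding"]))
    else:
        facts.add(("hand-empty",))
    # discretize the continuous position into zones the planner can reason about
    facts.add(("at", "zone-left" if mdp_state["x"] < 0.5 else "zone-right"))
    return facts

print(sorted(map_to_planning_state({"holding": "block-a", "x": 0.8})))
```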
-
Patent number: 11537872
Abstract: A computer-implemented method, computer program product, and computer processing system are provided for obtaining a plurality of bad demonstrations. The method includes reading, by a processor device, a protagonist environment. The method further includes training, by the processor device, a plurality of antagonist agents to fail a task by reinforcement learning using the protagonist environment. The method also includes collecting, by the processor device, the plurality of bad demonstrations by playing the trained antagonist agents on the protagonist environment.
Type: Grant
Filed: July 30, 2018
Date of Patent: December 27, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Tu-Hoa Pham, Giovanni De Magistris, Don Joven Ravoy Agravante, Ryuki Tachibana
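The collection loop in this abstract can be sketched in miniature. This is a loose illustration, not the patented procedure: the 1-D environment, the random-search stand-in for reinforcement learning (picking the candidate policy that minimizes task reward, i.e. fails hardest), and all parameter names are hypothetical.

```python
import random

def rollout(env, policy):
    """Play a policy on the protagonist environment; return (trajectory, reward)."""
    state, trajectory, reward = env["start"], [], 0.0
    for _ in range(env["horizon"]):
        action = policy(state)
        state = state + action
        trajectory.append((state, action))
        reward -= abs(env["goal"] - state)  # staying near the goal scores higher
    return trajectory, reward

def collect_bad_demonstrations(env, n_agents=5, seed=0):
    rng = random.Random(seed)
    demos = []
    for _ in range(n_agents):
        # "Training" an antagonist is reduced here to keeping, among random
        # candidate policies, the one that minimizes task reward (fails the task).
        candidates = [(lambda s, a=rng.uniform(-1, 1): a) for _ in range(10)]
        antagonist = min(candidates, key=lambda p: rollout(env, p)[1])
        # play the trained antagonist to collect a bad demonstration
        demos.append(rollout(env, antagonist)[0])
    return demos

env = {"start": 0.0, "goal": 1.0, "horizon": 3}
bad_demos = collect_bad_demonstrations(env)
print(len(bad_demos), len(bad_demos[0]))  # 5 3
```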
-
Patent number: 11501157
Abstract: A method is provided for reinforcement learning. The method includes obtaining, by a processor device, a first set and a second set of state-action tuples. Each of the state-action tuples in the first set represents a respective good demonstration. Each of the state-action tuples in the second set represents a respective bad demonstration. The method further includes training, by the processor device using supervised learning with the first set and the second set, a neural network which takes as input a state to provide an output. The output is parameterized to obtain each of a plurality of real-valued constraint functions used for evaluation of each of a plurality of action constraints. The method also includes training, by the processor device, a policy using reinforcement learning by restricting actions predicted by the policy according to each of the plurality of action constraints with each of the plurality of real-valued constraint functions.
Type: Grant
Filed: July 30, 2018
Date of Patent: November 15, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Tu-Hoa Pham, Don Joven Ravoy Agravante, Giovanni De Magistris, Ryuki Tachibana
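The action-restriction step can be illustrated with a minimal sketch, assuming the common convention that a real-valued constraint function g(s, a) marks an action feasible when g(s, a) ≤ 0. The fixed `constraint_value` here is a stand-in for the trained network's output, and the nearest-feasible fallback is one illustrative restriction strategy, not necessarily the patented one:

```python
def constraint_value(state, action):
    """Stand-in for the trained network's real-valued constraint output g(s, a)."""
    return abs(action) - 1.0  # the action is feasible iff this value is <= 0

def restrict_action(state, proposed, candidates):
    """Keep a feasible proposal; otherwise fall back to the nearest feasible candidate."""
    if constraint_value(state, proposed) <= 0:
        return proposed
    feasible = [a for a in candidates if constraint_value(state, a) <= 0]
    return min(feasible, key=lambda a: abs(a - proposed))

print(restrict_action(None, 2.5, [-1.0, 0.0, 1.0]))  # 1.0 (proposal infeasible)
print(restrict_action(None, 0.3, [-1.0, 0.0, 1.0]))  # 0.3 (proposal kept)
```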
-
Publication number: 20220309383
Abstract: A method for inferring an operator, including a precondition and an effect of the operator, for a planning problem is disclosed. In the method, a set of examples, each of which includes a base state, an action, and a next state after performing the action in the base state, is prepared. In the method, variable lifting is performed in relation to the set of examples. In the method, a validity label is computed for each example in the set of examples. In the method, a model is trained by using the set of examples with the validity label so that the model is configured to receive an input state and a representation of an input action and output at least validity of the input action for the input state. In the method, the precondition of the operator based on the model and the effect of the operator are outputted.
Type: Application
Filed: March 24, 2021
Publication date: September 29, 2022
Inventors: Corentin Jacques Andre Sautier, Don Joven Ravoy Agravante, Michiaki Tatsubori
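The validity-labelling step and a simple precondition readout can be sketched as follows. This is an assumed STRIPS-style formulation, not the patented model: states are sets of facts, an action carries hypothetical add/delete lists, an example is "valid" when the action's nominal effect explains the observed state change, and the precondition is read out as the facts shared by all valid base states (a stand-in for the trained model).

```python
def validity_label(example):
    """An example is valid when the action's add/delete effect explains the change."""
    base, action, nxt = example
    return nxt == (base - action["delete"]) | action["add"]

def infer_precondition(examples):
    """Precondition = facts present in every valid example's base state."""
    valid_bases = [e[0] for e in examples if validity_label(e)]
    return set.intersection(*valid_bases) if valid_bases else set()

pickup = {"add": {"holding-a"}, "delete": {"hand-empty", "clear-a"}}
examples = [
    ({"hand-empty", "clear-a", "on-table-a"}, pickup,
     {"holding-a", "on-table-a"}),                        # consistent: valid
    ({"hand-empty", "clear-a"}, pickup, {"hand-empty"}),  # inconsistent: invalid
]
print(validity_label(examples[0]), validity_label(examples[1]))  # True False
print(sorted(infer_precondition(examples)))
```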
-
Patent number: 11425496
Abstract: Methods and systems for localizing a sound source include determining a spatial transformation between a position of a reference microphone array and a position of a displaced microphone array. A sound is measured at the reference microphone array and at the displaced microphone array. A source of the sound is localized using a neural network that includes respective paths for the reference microphone array and the displaced microphone array. The neural network further includes a transformation layer that represents the spatial transformation.
Type: Grant
Filed: May 1, 2020
Date of Patent: August 23, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Guillaume Jean Victor Marie Le Moing, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Don Joven Ravoy Agravante, Tadanobu Inoue, Asim Munawar
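The role of the transformation layer can be illustrated geometrically. This sketch assumes a 2-D rigid transform (rotation plus translation) and uses a plain average as a hypothetical stand-in for the network's learned fusion of the two per-array paths:

```python
import math

def apply_transform(point, angle, translation):
    """Map a point from the displaced array's frame into the reference frame."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + translation[0], s * x + c * y + translation[1])

def fuse_estimates(ref_estimate, displaced_estimate, angle, translation):
    # Align the displaced path's estimate via the known spatial transform,
    # then combine; a simple average stands in for the learned fusion.
    mapped = apply_transform(displaced_estimate, angle, translation)
    return tuple((r + m) / 2 for r, m in zip(ref_estimate, mapped))

# arrays rotated 90 degrees apart, no translation
print(fuse_estimates((1.0, 0.0), (0.0, 1.0), math.pi / 2, (0.0, 0.0)))
```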
-
Publication number: 20220198255
Abstract: Methods and systems for training a semantic parser include performing an automated intervention action in a text-based environment. An inverse action is performed in the text-based environment to reverse the intervention action. States of the text-based environment are recorded before and after the intervention action and the inverse action. The recorded states are evaluated to generate training data. A semantic parser neural network model is trained using the training data.
Type: Application
Filed: December 17, 2020
Publication date: June 23, 2022
Inventors: Corentin Jacques Andre Sautier, Don Joven Ravoy Agravante, Michiaki Tatsubori
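The intervention/inverse data-collection loop can be sketched with a toy text world. Everything here is hypothetical (the open/close dynamics, the dictionary state, the effect diff); it only illustrates the pattern of intervening, recording the state change, and undoing the intervention:

```python
def apply_action(env, command):
    """Toy text-world dynamics: 'open X' / 'close X' sets object X's state."""
    verb, obj = command.split()
    env[obj] = (verb == "open")

def collect_example(env, intervention, inverse):
    before = dict(env)                   # state recorded before intervening
    apply_action(env, intervention)      # automated intervention action
    after = dict(env)                    # state recorded after intervening
    apply_action(env, inverse)           # inverse action restores the environment
    effect = {k: v for k, v in after.items() if before.get(k) != v}
    return intervention, effect          # (command text, observed effect) pair

env = {"door": False}
print(collect_example(env, "open door", "close door"))  # ('open door', {'door': True})
print(env)  # the inverse restored it: {'door': False}
```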
-
Publication number: 20210345039
Abstract: Methods and systems for localizing a sound source include determining a spatial transformation between a position of a reference microphone array and a position of a displaced microphone array. A sound is measured at the reference microphone array and at the displaced microphone array. A source of the sound is localized using a neural network that includes respective paths for the reference microphone array and the displaced microphone array. The neural network further includes a transformation layer that represents the spatial transformation.
Type: Application
Filed: May 1, 2020
Publication date: November 4, 2021
Inventors: Guillaume Jean Victor Marie Le Moing, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Don Joven Ravoy Agravante, Tadanobu Inoue, Asim Munawar
-
Publication number: 20210271978
Abstract: A computer-implemented method is provided for training a multi-source sound localization model using labeled simulation data and unlabeled real data. The method includes inputting the labeled simulation data and the unlabeled real data respectively into a multi-source sound localization model of a neural network to obtain a localization heatmap from an output layer of the multi-source sound localization model for each of the labeled simulation data and the unlabeled real data. The method further includes inputting the localization heatmap for each of the labeled simulation data and the unlabeled real data into an output discriminator. The method also includes training the output discriminator so that the output discriminator assigns a domain class label to distinguish simulation data from real data. The method additionally includes training, by a hardware processor, the multi-source sound localization model by a first adversarial loss for the output discriminator with an original localization model loss.
Type: Application
Filed: February 28, 2020
Publication date: September 2, 2021
Inventors: Guillaume Jean Victor Marie Le Moing, Don Joven Ravoy Agravante, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Tadanobu Inoue, Asim Munawar
-
Publication number: 20200034704
Abstract: A computer-implemented method, computer program product, and computer processing system are provided for Hierarchical Reinforcement Learning (HRL) with a target task. The method includes obtaining, by a processor device, a sequence of tasks based on hierarchical relations between the tasks, the tasks constituting the target task. The method further includes learning, by a processor device, a sequence of constraints corresponding to the sequence of tasks by repeating, for each of the tasks in the sequence, reinforcement learning and supervised learning with a set of good samples and a set of bad samples, and by applying an obtained constraint for a current task to a next task.
Type: Application
Filed: July 30, 2018
Publication date: January 30, 2020
Inventors: Don Joven Ravoy Agravante, Giovanni De Magistris, Tu-Hoa Pham, Ryuki Tachibana
-
Publication number: 20200034706
Abstract: A computer-implemented method, computer program product, and computer processing system are provided for obtaining a plurality of bad demonstrations. The method includes reading, by a processor device, a protagonist environment. The method further includes training, by the processor device, a plurality of antagonist agents to fail a task by reinforcement learning using the protagonist environment. The method also includes collecting, by the processor device, the plurality of bad demonstrations by playing the trained antagonist agents on the protagonist environment.
Type: Application
Filed: July 30, 2018
Publication date: January 30, 2020
Inventors: Tu-Hoa Pham, Giovanni De Magistris, Don Joven Ravoy Agravante, Ryuki Tachibana
-
Publication number: 20200034705
Abstract: A method is provided for reinforcement learning. The method includes obtaining, by a processor device, a first set and a second set of state-action tuples. Each of the state-action tuples in the first set represents a respective good demonstration. Each of the state-action tuples in the second set represents a respective bad demonstration. The method further includes training, by the processor device using supervised learning with the first set and the second set, a neural network which takes as input a state to provide an output. The output is parameterized to obtain each of a plurality of real-valued constraint functions used for evaluation of each of a plurality of action constraints. The method also includes training, by the processor device, a policy using reinforcement learning by restricting actions predicted by the policy according to each of the plurality of action constraints with each of the plurality of real-valued constraint functions.
Type: Application
Filed: July 30, 2018
Publication date: January 30, 2020
Inventors: Tu-Hoa Pham, Don Joven Ravoy Agravante, Giovanni De Magistris, Ryuki Tachibana