Patents by Inventor Don Joven Ravoy Agravante

Don Joven Ravoy Agravante has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11734575
    Abstract: A computer-implemented method, computer program product, and computer processing system are provided for Hierarchical Reinforcement Learning (HRL) with a target task. The method includes obtaining, by a processor device, a sequence of tasks based on hierarchical relations between the tasks, the tasks constituting the target task. The method further includes learning, by a processor device, a sequence of constraints corresponding to the sequence of tasks by repeating, for each of the tasks in the sequence, reinforcement learning and supervised learning with a set of good samples and a set of bad samples and by applying an obtained constraint for a current task to a next task.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: August 22, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Don Joven Ravoy Agravante, Giovanni De Magistris, Tu-Hoa Pham, Ryuki Tachibana
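The abstract above describes a loop that alternates reinforcement learning and supervised learning over a task hierarchy, carrying each learned constraint forward to the next task. The Python sketch below only illustrates that loop under invented assumptions: tasks are 1-D targets, "rollouts" are rejection-sampled scalars, and the learned "constraint" is a simple threshold; none of these names or choices come from the patent.

```python
import random

def rollout(constraints, n=200):
    """Sample scalar 'actions'; reject any action that violates a learned constraint."""
    samples = []
    while len(samples) < n:
        a = random.uniform(-1.0, 1.0)
        if all(c(a) for c in constraints):
            samples.append(a)
    return samples

def fit_threshold_constraint(good, bad):
    """Supervised step: accept actions on the 'good' side of the midpoint between means."""
    g = sum(good) / len(good)
    b = sum(bad) / len(bad)
    mid = (g + b) / 2.0
    return (lambda a: a >= mid) if g > b else (lambda a: a <= mid)

def learn_constraint_sequence(task_targets):
    """For each task in the hierarchy: sample under the constraints learned so far,
    split samples into good/bad, fit a new constraint, and carry it to the next task."""
    constraints = []
    for target in task_targets:
        actions = rollout(constraints)
        good = [a for a in actions if abs(a - target) < 0.2]   # good samples for this task
        bad = [a for a in actions if abs(a - target) >= 0.2]   # bad samples for this task
        constraints.append(fit_threshold_constraint(good, bad))
    return constraints

print(len(learn_constraint_sequence([0.3, 0.5, 0.7])), "constraints learned")
```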
  • Patent number: 11676032
    Abstract: A computer-implemented method is provided for training a multi-source sound localization model using labeled simulation data and unlabeled real data. The method includes inputting the labeled simulation data and the unlabeled real data respectively into a multi-source sound localization model of a neural network to obtain a localization heatmap from an output layer of the multi-source sound localization model for each of the labeled simulation data and the unlabeled real data. The method further includes inputting the localization heatmap for each of the labeled simulation data and the unlabeled real data into an output discriminator. The method also includes training the output discriminator so that the output discriminator assigns a domain class label to distinguish simulation data from real data. The method additionally includes training, by a hardware processor, the multi-source sound localization model using a first adversarial loss for the output discriminator together with an original localization model loss.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: June 13, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Guillaume Jean Victor Marie Le Moing, Don Joven Ravoy Agravante, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Tadanobu Inoue, Asim Munawar
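The abstract above describes domain-adversarial training of a sound localization model from labeled simulation data and unlabeled real data. The sketch below, assuming PyTorch, shows the general pattern: a discriminator learns to assign domain labels to output heatmaps, while the localizer is trained with its original supervised loss plus an adversarial loss. The network sizes, toy random data, MSE localization loss, and 0.1 loss weight are assumptions made for illustration, not details from the patent.

```python
import torch
import torch.nn as nn

# Toy localization model and output discriminator; sizes are illustrative only.
feat_dim, map_dim = 32, 16
localizer = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                          nn.Linear(64, map_dim), nn.Sigmoid())
discriminator = nn.Sequential(nn.Linear(map_dim, 32), nn.ReLU(), nn.Linear(32, 1))

opt_loc = torch.optim.Adam(localizer.parameters(), lr=1e-3)
opt_disc = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    sim_x = torch.randn(8, feat_dim)   # labeled simulation features (toy data)
    sim_y = torch.rand(8, map_dim)     # simulation localization heatmaps (toy labels)
    real_x = torch.randn(8, feat_dim)  # unlabeled real-recording features (toy data)

    # 1) Train the discriminator to assign domain class labels (sim = 1, real = 0)
    #    to the localization model's output heatmaps.
    with torch.no_grad():
        sim_map, real_map = localizer(sim_x), localizer(real_x)
    d_loss = bce(discriminator(sim_map), torch.ones(8, 1)) + \
             bce(discriminator(real_map), torch.zeros(8, 1))
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

    # 2) Train the localization model with its original supervised loss on the
    #    simulation data plus an adversarial loss that pushes real-data heatmaps
    #    to be indistinguishable from simulation ones.
    loc_loss = nn.functional.mse_loss(localizer(sim_x), sim_y)
    adv_loss = bce(discriminator(localizer(real_x)), torch.ones(8, 1))
    total = loc_loss + 0.1 * adv_loss
    opt_loc.zero_grad(); total.backward(); opt_loc.step()
```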
  • Publication number: 20230177368
    Abstract: A computer-implemented method of integrating an Artificial Intelligence (AI) planner and a reinforcement learning (RL) agent through AI planning annotation in RL (PaRL) includes identifying an RL problem. A received description of a Markov decision process (MDP) having a plurality of states in an RL environment is used to generate an RL task to solve the RL problem. An AI planning model described in a planning language is received, and state spaces are mapped from the MDP states in the RL environment to AI planning states of the AI planning model. The RL task is combined with an AI planning task obtained from the mapping to generate a PaRL task.
    Type: Application
    Filed: December 8, 2021
    Publication date: June 8, 2023
    Inventors: Junkyu Lee, Michael Katz, Shirin Sohrabi Araghi, Don Joven Ravoy Agravante, Miao Liu, Tamir Klinger, Murray Scott Campbell
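The abstract above centers on mapping low-level MDP states to AI planning states and pairing the RL task with a planning task. The toy Python sketch below illustrates that idea only: the grid-world state, the predicate strings, and the `ParlTask` dataclass are invented for illustration and do not reflect the patented PaRL formulation.

```python
from dataclasses import dataclass

@dataclass
class ParlTask:
    rl_goal: tuple            # low-level MDP goal state for the RL task
    planning_init: frozenset  # abstract planning facts for the initial state
    planning_goal: frozenset  # abstract planning facts that must hold at the goal

def mdp_to_planning_state(pos, has_key):
    """Map an MDP state (grid position, inventory flag) to abstract planning facts."""
    room = "room-a" if pos[0] < 5 else "room-b"
    facts = {f"(in {room})"}
    if has_key:
        facts.add("(holding key)")
    return frozenset(facts)

def make_parl_task(init_state, goal_state):
    """Annotate the RL task with an AI planning task derived from the state mapping."""
    return ParlTask(
        rl_goal=goal_state[0],
        planning_init=mdp_to_planning_state(*init_state),
        planning_goal=mdp_to_planning_state(*goal_state),
    )

task = make_parl_task(init_state=((1, 2), False), goal_state=((7, 3), True))
print(task.planning_init, "->", task.planning_goal)
```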
  • Patent number: 11537872
    Abstract: A computer-implemented method, computer program product, and computer processing system are provided for obtaining a plurality of bad demonstrations. The method includes reading, by a processor device, a protagonist environment. The method further includes training, by the processor device, a plurality of antagonist agents to fail a task by reinforcement learning using the protagonist environment. The method also includes collecting, by the processor device, the plurality of bad demonstrations by playing the trained antagonist agents on the protagonist environment.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: December 27, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tu-Hoa Pham, Giovanni De Magistris, Don Joven Ravoy Agravante, Ryuki Tachibana
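The abstract above describes training antagonist agents to fail a task and then replaying them to collect bad demonstrations. The sketch below is a toy stand-in: the 1-D environment, the reward, and the use of naive random search in place of reinforcement learning are all assumptions made purely for illustration.

```python
import random

def protagonist_reward(action, target=0.5):
    """Reward is high when the action is close to the target."""
    return -abs(action - target)

def train_antagonist(seed, trials=200):
    """Antagonist policy = a constant action chosen to make the reward as low as possible."""
    rng = random.Random(seed)
    candidates = [rng.uniform(-1.0, 1.0) for _ in range(trials)]
    return min(candidates, key=protagonist_reward)   # worst action for the protagonist

def collect_bad_demonstrations(n_antagonists=5, rollouts_per_agent=3):
    demos = []
    for seed in range(n_antagonists):
        policy = train_antagonist(seed)
        for _ in range(rollouts_per_agent):
            # "Play" the trained antagonist on the protagonist environment and record it.
            demos.append((policy, protagonist_reward(policy)))
    return demos

print(collect_bad_demonstrations()[:3])
```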
  • Patent number: 11501157
    Abstract: A method is provided for reinforcement learning. The method includes obtaining, by a processor device, a first set and a second set of state-action tuples. Each of the state-action tuples in the first set represents a respective good demonstration. Each of the state-action tuples in the second set represents a respective bad demonstration. The method further includes training, by the processor device using supervised learning with the first set and the second set, a neural network which takes as input a state to provide an output. The output is parameterized to obtain each of a plurality of real-valued constraint functions used for evaluation of each of a plurality of action constraints. The method also includes training, by the processor device, a policy using reinforcement learning by restricting actions predicted by the policy according to each of the plurality of action constraints with each of the plurality of real-valued constraint functions.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: November 15, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tu-Hoa Pham, Don Joven Ravoy Agravante, Giovanni De Magistris, Ryuki Tachibana
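The abstract above describes two stages: supervised learning of real-valued constraint functions from good and bad demonstrations, then reinforcement learning with policy actions restricted by those constraints. The sketch below, assuming PyTorch, illustrates one possible reading: the linear form of the constraint, the hinge losses, the toy random demonstrations, and the simple projection step are assumptions for illustration, not the patented method.

```python
import torch
import torch.nn as nn

state_dim, act_dim = 4, 1

# The network maps a state to the parameters (w, b) of a linear constraint
# g(s, a) = w(s) * a + b(s); an action is considered admissible when g >= 0.
constraint_net = nn.Sequential(nn.Linear(state_dim, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(constraint_net.parameters(), lr=1e-3)

def constraint_value(state, action):
    w, b = constraint_net(state).unbind(dim=-1)
    return w * action.squeeze(-1) + b

# Stage 1: supervised learning from good / bad (state, action) demonstrations,
# using hinge losses so good pairs satisfy the constraint and bad pairs violate it.
good_s, good_a = torch.randn(64, state_dim), torch.randn(64, act_dim)  # toy data
bad_s, bad_a = torch.randn(64, state_dim), torch.randn(64, act_dim)    # toy data
for _ in range(200):
    loss = torch.relu(1.0 - constraint_value(good_s, good_a)).mean() + \
           torch.relu(1.0 + constraint_value(bad_s, bad_a)).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2 (schematic): restrict an action proposed by the RL policy so that it
# satisfies the learned constraint, here with a naive 1-D search.
def restrict(state, proposed_action):
    with torch.no_grad():
        for a in torch.linspace(proposed_action.item(), -proposed_action.item(), 50):
            if constraint_value(state.unsqueeze(0), a.view(1, 1)) >= 0:
                return a
    return proposed_action

print(restrict(torch.randn(state_dim), torch.tensor(0.8)))
```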
  • Publication number: 20220309383
    Abstract: A method for inferring an operator, including a precondition and an effect of the operator, for a planning problem is disclosed. In the method, a set of examples is prepared, each of which includes a base state, an action, and a next state obtained by performing the action in the base state. In the method, variable lifting is performed in relation to the set of examples. In the method, a validity label is computed for each example in the set of examples. In the method, a model is trained by using the set of examples with the validity labels so that the model is configured to receive an input state and a representation of an input action and to output at least the validity of the input action for the input state. In the method, the precondition of the operator, based on the model, and the effect of the operator are output.
    Type: Application
    Filed: March 24, 2021
    Publication date: September 29, 2022
    Inventors: Corentin Jacques Andre Sautier, Don Joven Ravoy Agravante, Michiaki Tatsubori
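The abstract above infers an operator's precondition and effect from (base state, action, next state) examples. The toy Python sketch below illustrates the idea with grounded set operations: the precondition is estimated as the facts common to all states in which the action applied, and the effect as the common add/delete sets. Variable lifting and the learned validity model are deliberately simplified away, and the predicate names are invented.

```python
examples = [
    # (base state, action, next state)
    ({"at-a", "door-closed"}, "open-door", {"at-a", "door-open"}),
    ({"at-a", "door-closed", "holding-key"}, "open-door", {"at-a", "door-open", "holding-key"}),
]

def infer_operator(examples, action):
    """Estimate precondition and add/delete effects from the examples of one action."""
    relevant = [(s, s2) for s, act, s2 in examples if act == action]
    precondition = set.intersection(*(s for s, _ in relevant))
    add_effects = set.intersection(*(s2 - s for s, s2 in relevant))
    del_effects = set.intersection(*(s - s2 for s, s2 in relevant))
    return precondition, add_effects, del_effects

def is_valid(state, action, operator):
    """Validity label: the action is applicable when the precondition holds in the state."""
    precondition, _, _ = operator
    return precondition <= state

op = infer_operator(examples, "open-door")
print("precondition:", op[0], "add:", op[1], "delete:", op[2])
print("valid in {'at-a'}?", is_valid({"at-a"}, "open-door", op))
```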
  • Patent number: 11425496
    Abstract: Methods and systems for localizing a sound source include determining a spatial transformation between a position of a reference microphone array and a position of a displaced microphone array. A sound is measured at the reference microphone array and at the displaced microphone array. A source of the sound is localized using a neural network that includes respective paths for the reference microphone array and the displaced microphone array. The neural network further includes a transformation layer that represents the spatial transformation.
    Type: Grant
    Filed: May 1, 2020
    Date of Patent: August 23, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Guillaume Jean Victor Marie Le Moing, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Don Joven Ravoy Agravante, Tadanobu Inoue, Asim Munawar
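The abstract above describes a network with one path per microphone array and a transformation layer encoding the spatial transformation between the arrays. The sketch below, assuming PyTorch, is only a guess at how such a two-path model could be wired: the feature sizes, the 2-D rotation-plus-translation transform, and the averaging fusion are illustrative assumptions rather than the patented architecture.

```python
import torch
import torch.nn as nn

class TwoArrayLocalizer(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        # One path per microphone array, each predicting a 2-D source position.
        self.ref_path = nn.Sequential(nn.Linear(feat_dim, 32), nn.ReLU(), nn.Linear(32, 2))
        self.disp_path = nn.Sequential(nn.Linear(feat_dim, 32), nn.ReLU(), nn.Linear(32, 2))

    def forward(self, ref_feats, disp_feats, rotation, translation):
        ref_xy = self.ref_path(ref_feats)     # estimate in the reference-array frame
        disp_xy = self.disp_path(disp_feats)  # estimate in the displaced-array frame
        # Transformation layer: express the displaced-array estimate in the
        # reference frame using the measured spatial transformation.
        disp_xy_ref = disp_xy @ rotation.T + translation
        return (ref_xy + disp_xy_ref) / 2     # simple fusion of the two paths

theta = torch.tensor(0.3)
rotation = torch.stack([torch.stack([torch.cos(theta), -torch.sin(theta)]),
                        torch.stack([torch.sin(theta), torch.cos(theta)])])
translation = torch.tensor([1.0, 0.5])
model = TwoArrayLocalizer()
print(model(torch.randn(4, 64), torch.randn(4, 64), rotation, translation).shape)
```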
  • Publication number: 20220198255
    Abstract: Methods and systems for training a semantic parser include performing an automated intervention action in a text-based environment. An inverse action is performed in the text-based environment to reverse the intervention action. States of the text-based environment are recorded before and after the intervention action and the inverse action. The recorded states are evaluated to generate training data. A semantic parser neural network model is trained using the training data.
    Type: Application
    Filed: December 17, 2020
    Publication date: June 23, 2022
    Inventors: Corentin Jacques Andre Sautier, Don Joven Ravoy Agravante, Michiaki Tatsubori
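The abstract above describes generating semantic-parser training data by executing an automated intervention action and its inverse in a text-based environment and comparing the recorded states. The toy Python sketch below shows that data-generation step only; the miniature environment, the take/drop commands, and the state-difference supervision are invented for illustration, and the actual parser training is omitted.

```python
def make_env():
    return {"inventory": set(), "room": {"apple", "lamp"}}

def apply_command(env, command):
    """A minimal text-based environment supporting 'take X' and its inverse 'drop X'."""
    verb, obj = command.split()
    if verb == "take" and obj in env["room"]:
        env["room"].remove(obj); env["inventory"].add(obj)
    elif verb == "drop" and obj in env["inventory"]:
        env["inventory"].remove(obj); env["room"].add(obj)
    return {k: set(v) for k, v in env.items()}   # snapshot of the state after the action

def state_change(before, after):
    """State difference used as weak supervision for the parser."""
    return {"gained": after["inventory"] - before["inventory"],
            "lost": before["inventory"] - after["inventory"]}

def collect_training_pairs(intervention, inverse):
    env = make_env()
    s0 = {k: set(v) for k, v in env.items()}     # state before the intervention
    s1 = apply_command(env, intervention)        # state after the intervention action
    s2 = apply_command(env, inverse)             # state after the inverse action
    pairs = [(intervention, state_change(s0, s1)), (inverse, state_change(s1, s2))]
    return pairs, s2 == s0                       # did the inverse restore the state?

pairs, restored = collect_training_pairs("take apple", "drop apple")
print(pairs, "| state restored:", restored)
```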
  • Publication number: 20210345039
    Abstract: Methods and systems for localizing a sound source include determining a spatial transformation between a position of a reference microphone array and a position of a displaced microphone array. A sound is measured at the reference microphone array and at the displaced microphone array. A source of the sound is localized using a neural network that includes respective paths for the reference microphone array and the displaced microphone array. The neural network further includes a transformation layer that represents the spatial transformation.
    Type: Application
    Filed: May 1, 2020
    Publication date: November 4, 2021
    Inventors: Guillaume Jean Victor Marie Le Moing, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Don Joven Ravoy Agravante, Tadanobu Inoue, Asim Munawar
  • Publication number: 20210271978
    Abstract: A computer-implemented method is provided for training a multi-source sound localization model using labeled simulation data and unlabeled real data. The method includes inputting the labeled simulation data and the unlabeled real data respectively into a multi-source sound localization model of a neural network to obtain a localization heatmap from an output layer of the multi-source sound localization model for each of the labeled simulation data and the unlabeled real data. The method further includes inputting the localization heatmap for each of the labeled simulation data and the unlabeled real data into an output discriminator. The method also includes training the output discriminator so that the output discriminator assigns a domain class label to distinguish simulation data from real data. The method additionally includes training, by a hardware processor, the multi-source sound localization model using a first adversarial loss for the output discriminator together with an original localization model loss.
    Type: Application
    Filed: February 28, 2020
    Publication date: September 2, 2021
    Inventors: Guillaume Jean Victor Marie Le Moing, Don Joven Ravoy Agravante, Phongtharin Vinayavekhin, Jayakorn Vongkulbhisal, Tadanobu Inoue, Asim Munawar
  • Publication number: 20200034704
    Abstract: A computer-implemented method, computer program product, and computer processing system are provided for Hierarchical Reinforcement Learning (HRL) with a target task. The method includes obtaining, by a processor device, a sequence of tasks based on hierarchical relations between the tasks, the tasks constituting the target task. The method further includes learning, by a processor device, a sequence of constraints corresponding to the sequence of tasks by repeating, for each of the tasks in the sequence, reinforcement learning and supervised learning with a set of good samples and a set of bad samples and by applying an obtained constraint for a current task to a next task.
    Type: Application
    Filed: July 30, 2018
    Publication date: January 30, 2020
    Inventors: Don Joven Ravoy Agravante, Giovanni De Magistris, Tu-Hoa Pham, Ryuki Tachibana
  • Publication number: 20200034706
    Abstract: A computer-implemented method, computer program product, and computer processing system are provided for obtaining a plurality of bad demonstrations. The method includes reading, by a processor device, a protagonist environment. The method further includes training, by the processor device, a plurality of antagonist agents to fail a task by reinforcement learning using the protagonist environment. The method also includes collecting, by the processor device, the plurality of bad demonstrations by playing the trained antagonist agents on the protagonist environment.
    Type: Application
    Filed: July 30, 2018
    Publication date: January 30, 2020
    Inventors: Tu-Hoa Pham, Giovanni De Magistris, Don Joven Ravoy Agravante, Ryuki Tachibana
  • Publication number: 20200034705
    Abstract: A method is provided for reinforcement learning. The method includes obtaining, by a processor device, a first set and a second set of state-action tuples. Each of the state-action tuples in the first set represents a respective good demonstration. Each of the state-action tuples in the second set represents a respective bad demonstration. The method further includes training, by the processor device using supervised learning with the first set and the second set, a neural network which takes as input a state to provide an output. The output is parameterized to obtain each of a plurality of real-valued constraint functions used for evaluation of each of a plurality of action constraints. The method also includes training, by the processor device, a policy using reinforcement learning by restricting actions predicted by the policy according to each of the plurality of action constraints with each of the plurality of real-valued constraint functions.
    Type: Application
    Filed: July 30, 2018
    Publication date: January 30, 2020
    Inventors: Tu-Hoa Pham, Don Joven Ravoy Agravante, Giovanni De Magistris, Ryuki Tachibana