Patents by Inventor Soshi Iba

Soshi Iba has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8886357
    Abstract: It is possible to perform robot motor learning in a quick and stable manner using a reinforcement learning apparatus including: a first-type environment parameter obtaining unit that obtains a value of one or more first-type environment parameters; a control parameter value calculation unit that calculates a value of one or more control parameters maximizing a reward by using the value of the one or more first-type environment parameters; a control parameter value output unit that outputs the value of the one or more control parameters to the control object; a second-type environment parameter obtaining unit that obtains a value of one or more second-type environment parameters; a virtual external force calculation unit that calculates a virtual external force by using the value of the one or more second-type environment parameters; and a virtual external force output unit that outputs the virtual external force to the control object.
    Type: Grant
    Filed: March 28, 2012
    Date of Patent: November 11, 2014
    Assignees: Advanced Telecommunications Research Institute International, Honda Motor Co., Ltd.
    Inventors: Norikazu Sugimoto, Yugo Ueda, Tadaaki Hasegawa, Soshi Iba, Koji Akatsuka
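The split this abstract describes, reward-maximizing control parameters computed from one set of environment parameters and a virtual external force computed separately from another, can be sketched in miniature. This is an illustrative toy under assumed names: the quadratic reward, the candidate set, and the force gain are not from the patent.

```python
import numpy as np

def best_control_params(first_type_env, candidates, reward):
    """Pick the candidate control-parameter vector maximizing the reward
    for the observed first-type environment parameters."""
    return max(candidates, key=lambda c: reward(first_type_env, c))

def virtual_external_force(second_type_env, gain=2.0):
    """Map second-type environment parameters to a corrective force
    that is output to the control object alongside the control params."""
    return -gain * np.asarray(second_type_env, dtype=float)

# Toy reward (an assumption): prefer control params near the env reading.
reward = lambda env, c: -np.sum((np.asarray(c) - np.asarray(env)) ** 2)
candidates = [np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])]

params = best_control_params([0.6, 0.4], candidates, reward)
force = virtual_external_force([0.1, -0.2])
print(params, force)
```

The point of the structure is that the two pathways are decoupled: the reward-maximizing unit never sees the second-type parameters, and the force unit never sees the reward.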
  • Patent number: 8768507
    Abstract: A robot and a behavior control system for the same are capable of ensuring continued stability while carrying out a specified task by a motion of a body of the robot. Time-series changing patterns of first state variables indicating a motional state of an arm are generated according to a stochastic transition model such that at least one of the first state variables follows a first specified motion trajectory for causing the robot to carry out a specified task. Similarly, time-series changing patterns of second state variables indicating a motional state of the body are generated according to the stochastic transition model such that the second state variables satisfy a continuously stable dynamic condition.
    Type: Grant
    Filed: September 20, 2011
    Date of Patent: July 1, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventors: Soshi Iba, Tadaaki Hasegawa
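One way to picture generating time-series patterns from a stochastic transition model subject to a task trajectory and a stability condition is rejection sampling. The random-walk transition model, the stability bound, and all names below are illustrative assumptions, not the patented formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_trajectory(x0, steps, noise):
    """Gaussian random walk as a stand-in stochastic transition model."""
    x = [x0]
    for _ in range(steps):
        x.append(x[-1] + rng.normal(0.0, noise))
    return np.array(x)

def select_trajectory(target, n_samples=200, stability_bound=0.5):
    """Keep the sampled arm trajectory closest to the specified motion
    trajectory whose paired body trajectory stays inside a stability bound."""
    best, best_err = None, np.inf
    for _ in range(n_samples):
        arm = sample_trajectory(target[0], len(target) - 1, 0.05)
        body = sample_trajectory(0.0, len(target) - 1, 0.05)
        if np.max(np.abs(body)) > stability_bound:  # dynamic-stability proxy
            continue
        err = np.sum((arm - target) ** 2)
        if err < best_err:
            best, best_err = arm, err
    return best

target = np.linspace(0.0, 1.0, 20)  # the first specified motion trajectory
arm = select_trajectory(target)
print(arm is not None)
```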
  • Patent number: 8660699
    Abstract: A behavior control system capable of controlling the behavior of an agent (robot) such that the agent securely applies a force to a moving object. The behavior control system calculates the degree of overlapping of a time-series probability density distribution between a predicted position trajectory of an object (ball) and a position trajectory candidate of a counter object (racket). Further, a behavior plan of the agent (robot) is generated such that the counter object is moved according to a desired position trajectory, which is a mean position trajectory or a central position trajectory of a position trajectory candidate of the counter object which has the highest degree of overlapping with the predicted position trajectory of the object among a plurality of position trajectory candidates of the counter object.
    Type: Grant
    Filed: December 17, 2010
    Date of Patent: February 25, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventor: Soshi Iba
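The overlap computation at the heart of this entry, scoring each racket-trajectory candidate by how much its per-step probability density overlaps the predicted ball trajectory, can be sketched with 1-D Gaussians. The Bhattacharyya coefficient used as the overlap measure, and all names and numbers, are assumptions for illustration.

```python
import numpy as np

def gaussian_overlap(mu1, s1, mu2, s2):
    """Bhattacharyya coefficient between two 1-D Gaussians (overlap proxy)."""
    var = (s1 ** 2 + s2 ** 2) / 2.0
    return np.exp(-((mu1 - mu2) ** 2) / (8.0 * var)) * np.sqrt(s1 * s2 / var)

def best_candidate(ball_means, ball_std, candidates, cand_std):
    """Score each counter-object trajectory candidate by summed per-step
    overlap with the predicted object trajectory; return the best index."""
    scores = [
        sum(gaussian_overlap(b, ball_std, c, cand_std)
            for b, c in zip(ball_means, cand))
        for cand in candidates
    ]
    return int(np.argmax(scores))

ball = [0.0, 0.5, 1.0]                      # predicted ball positions
cands = [[0.0, 0.4, 0.9], [1.0, 1.0, 1.0]]  # racket position candidates
idx = best_candidate(ball, 0.1, cands, 0.1)
print(idx)
```

The winning candidate's mean (or central) trajectory then serves as the desired position trajectory for the behavior plan.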
  • Publication number: 20130345865
    Abstract: A system capable of causing an agent to continuously execute a plurality of different subtasks while securing the continuity of behavior of the agent is provided. A plurality of state variable trajectories representing the time series of a state variable of an object are generated according to a stochastic transition model in which the state variable of the object is represented as a random variable. The stochastic transition model is defined so that the transition mode of the state variable is determined according to an execution probability of each subtask in which a probability distribution is represented by a Dirichlet distribution. An operation of the agent is controlled so that the state of the object transits according to one state variable trajectory (desired state variable trajectory) maximizing or optimizing the joint probability of a whole of the stochastic transition model among the plurality of state variable trajectories.
    Type: Application
    Filed: February 22, 2013
    Publication date: December 26, 2013
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Soshi Iba, Akinobu Hayashi
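The role the Dirichlet distribution plays here, supplying subtask execution probabilities that govern the state transitions, can be illustrated with a toy sampler that draws many candidate trajectories and keeps the one with the highest joint-probability proxy. The drift rule, goal states, and gains below are assumptions, not the patented model.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_state_trajectory(x0, steps, subtask_targets, alpha):
    """Draw subtask execution probabilities from a Dirichlet distribution,
    then let the state drift toward the subtask picked at each step.
    Returns the trajectory and a log-joint-probability proxy."""
    probs = rng.dirichlet(alpha)
    x, logp = [x0], 0.0
    for _ in range(steps):
        k = rng.choice(len(alpha), p=probs)       # active subtask
        x.append(x[-1] + 0.5 * (subtask_targets[k] - x[-1]))
        logp += np.log(probs[k])
    return np.array(x), logp

targets = np.array([1.0, -1.0])  # two subtask goal states (assumed)
trajs = [sample_state_trajectory(0.0, 10, targets, [2.0, 2.0])
         for _ in range(50)]
best_traj, best_logp = max(trajs, key=lambda t: t[1])
print(len(best_traj), best_logp)
```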
  • Patent number: 8392346
    Abstract: A reinforcement learning system (1) of the present invention utilizes a value of a first value gradient function (dV1/dt) in the learning performed by a second learning device (122), namely in evaluating a second reward (r2(t)). The first value gradient function (dV1/dt) is a temporal differential of a first value function (V1) which is defined according to a first reward (r1(t)) obtained from an environment and serves as the learning result given by a first learning device (121). An action policy which should be taken by a robot (R) to execute a task is determined based on the second reward (r2(t)).
    Type: Grant
    Filed: November 2, 2009
    Date of Patent: March 5, 2013
    Assignees: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International
    Inventors: Yugo Ueda, Tadaaki Hasegawa, Soshi Iba, Koji Akatsuka, Norikazu Sugimoto
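The two-learner coupling described above, where the time derivative of the first learner's value function V1 enters the second learner's reward r2, amounts to reward shaping. A minimal sketch, with a finite-difference dV1/dt and an assumed weighting; the tabular values are illustrative, not from the patent.

```python
import numpy as np

def shaped_second_reward(r2_raw, v1_prev, v1_curr, dt=0.1, weight=1.0):
    """Evaluate the second reward using the first value gradient dV1/dt,
    approximated by a finite difference of V1 along the trajectory."""
    dv1_dt = (v1_curr - v1_prev) / dt
    return r2_raw + weight * dv1_dt

# Toy V1 along a trajectory: rising V1 means progress on the first task,
# which boosts the second learner's reward at those steps.
v1 = [0.0, 0.2, 0.5]
r2 = [shaped_second_reward(1.0, v1[i], v1[i + 1]) for i in range(2)]
print(r2)
```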
  • Patent number: 8315740
    Abstract: The present invention provides a motion control system to control a motion of a second motion body, by considering an environment which a human contacts and a motion mode appropriate to the environment, and an environment which a robot actually contacts. The motion mode is learned based on an idea that it is sufficient to learn only a feature part of the motion mode of the human without a necessity to learn the others. Moreover, based on an idea that it is sufficient to reproduce only the feature part of the motion mode of the human without a necessity to reproduce the others, the motion mode of the robot is controlled by using the model obtained from the learning result. Thereby, the motion mode of the robot is controlled by using the motion mode of the human as a prototype without restricting the motion mode thereof more than necessary.
    Type: Grant
    Filed: June 12, 2008
    Date of Patent: November 20, 2012
    Assignees: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International
    Inventors: Tadaaki Hasegawa, Yugo Ueda, Soshi Iba, Darrin Bentivegna
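The central idea of this entry, learning and reproducing only the feature part of the human motion while leaving the rest unconstrained, can be sketched with a simple mask. Using local velocity magnitude to mark "feature" samples is an assumption for illustration; the patent does not specify this criterion.

```python
import numpy as np

def feature_mask(human_traj, threshold=0.2):
    """Mark 'feature' samples of a human motion by local velocity magnitude;
    only these parts are imitated, the rest is left to the robot's own plan."""
    vel = np.abs(np.gradient(human_traj))
    return vel > threshold

def reproduce_features(human_traj, robot_default, mask):
    """Follow the human only on feature parts; elsewhere keep the default."""
    return np.where(mask, human_traj, robot_default)

human = np.array([0.0, 0.0, 0.5, 1.0, 1.0])
mask = feature_mask(human)
robot = reproduce_features(human, np.zeros_like(human), mask)
print(mask.tolist(), robot.tolist())
```

The flat segments of the human motion are left free, so the robot's motion uses the human's as a prototype without being over-constrained.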
  • Publication number: 20120253514
    Abstract: It is possible to perform robot motor learning in a quick and stable manner using a reinforcement learning apparatus including: a first-type environment parameter obtaining unit that obtains a value of one or more first-type environment parameters; a control parameter value calculation unit that calculates a value of one or more control parameters maximizing a reward by using the value of the one or more first-type environment parameters; a control parameter value output unit that outputs the value of the one or more control parameters to the control object; a second-type environment parameter obtaining unit that obtains a value of one or more second-type environment parameters; a virtual external force calculation unit that calculates a virtual external force by using the value of the one or more second-type environment parameters; and a virtual external force output unit that outputs the virtual external force to the control object.
    Type: Application
    Filed: March 28, 2012
    Publication date: October 4, 2012
    Inventors: Norikazu Sugimoto, Yugo Ueda, Tadaaki Hasegawa, Soshi Iba, Koji Akatsuka
  • Publication number: 20120078416
    Abstract: A robot and a behavior control system for the same are capable of ensuring continued stability while carrying out a specified task by a motion of a body of the robot. Time-series changing patterns of first state variables indicating a motional state of an arm are generated according to a stochastic transition model such that at least one of the first state variables follows a first specified motion trajectory for causing the robot to carry out a specified task. Similarly, time-series changing patterns of second state variables indicating a motional state of the body are generated according to the stochastic transition model such that the second state variables satisfy a continuously stable dynamic condition.
    Type: Application
    Filed: September 20, 2011
    Publication date: March 29, 2012
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Soshi Iba, Tadaaki Hasegawa
  • Patent number: 8099374
    Abstract: A behavior estimating system is provided. According to the system, an estimated trajectory which provides the basis on which the behavior of an agent is controlled is generated according to a second model which represents a motion of the instructor in which the position of a state variable and the time differential values thereof (a displacing velocity and acceleration) continuously change, in addition to the position of a characteristic point of a reference trajectory which represents a motion of the instructor and a plurality of first models which represent a plurality of shape characteristics of reference trajectories. The behavior manner of the instructor is estimated to be the one corresponding to the first model whose allowed fluctuation, under the condition that the estimated trajectory passes through a characteristic state variable or a range in the vicinity thereof, is the smallest and whose stability is the highest.
    Type: Grant
    Filed: June 18, 2009
    Date of Patent: January 17, 2012
    Assignee: Honda Motor Co., Ltd.
    Inventor: Soshi Iba
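The selection rule in this abstract, picking the first model that needs the least deformation to pass near the observed characteristic point, can be illustrated with a toy classifier over candidate shape models. The models, the single-point constraint, and the fluctuation measure are all assumptions for illustration.

```python
import numpy as np

def estimate_behavior(reference_models, observed_point, t_obs):
    """Pick the behavior whose model trajectory, constrained to pass near the
    observed characteristic point, needs the smallest deformation."""
    best, best_cost = None, np.inf
    for name, traj in reference_models.items():
        fluctuation = np.abs(observed_point - traj[t_obs])  # shift needed
        if fluctuation < best_cost:
            best, best_cost = name, fluctuation
    return best

models = {
    "reach": np.array([0.0, 0.3, 0.6, 0.9]),
    "wave":  np.array([0.0, 0.5, 0.0, 0.5]),
}
behavior = estimate_behavior(models, observed_point=0.58, t_obs=2)
print(behavior)
```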
  • Patent number: 8078321
    Abstract: A behavior control system is capable of causing an agent to carry out a task by smooth motions. The behavior control system makes it possible to reproduce a typical shape characteristic of a reference trajectory, i.e., the characteristic of a motion of an instructor carrying out a task, by using a first model defined on the basis of a plurality of reference trajectories representing the position of a first state variable in a time-series manner. Further, a learning trajectory representing the position of a second state variable in a time-series manner is generated on the basis of a second model, which represents an agent's motion in which the position of the second state variable corresponding to the first state variable and one or a plurality of time differential values (a displacing velocity and acceleration) thereof continuously change, in addition to the first model.
    Type: Grant
    Filed: June 18, 2009
    Date of Patent: December 13, 2011
    Assignee: Honda Motor Co., Ltd.
    Inventor: Soshi Iba
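The two-model structure here, a first model capturing the shape of the reference trajectories and a second model enforcing continuous position and derivatives, resembles tracking a learned shape with a second-order system. A minimal sketch; the mean-shape first model and the spring-damper gains are assumptions, not the patented construction.

```python
import numpy as np

def smooth_learning_trajectory(reference_trajs, k=4.0, d=3.0, dt=0.05):
    """First model: mean shape of the reference trajectories.
    Second model: a second-order system tracking that shape so that
    position and velocity change continuously (smooth motion)."""
    target = np.mean(reference_trajs, axis=0)
    x, v, out = target[0], 0.0, []
    for g in target:
        a = k * (g - x) - d * v  # spring-damper pull toward the shape
        v += a * dt
        x += v * dt
        out.append(x)
    return np.array(out)

refs = [np.linspace(0, 1, 30), np.linspace(0, 1, 30) + 0.05]
traj = smooth_learning_trajectory(refs)
print(len(traj))
```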
  • Publication number: 20110160908
    Abstract: A behavior control system capable of controlling the behavior of an agent (robot) such that the agent securely applies a force to a moving object. The behavior control system calculates the degree of overlapping of a time-series probability density distribution between a predicted position trajectory of an object (ball) and a position trajectory candidate of a counter object (racket). Further, a behavior plan of the agent (robot) is generated such that the counter object is moved according to a desired position trajectory, which is a mean position trajectory or a central position trajectory of a position trajectory candidate of the counter object which has the highest degree of overlapping with the predicted position trajectory of the object among a plurality of position trajectory candidates of the counter object.
    Type: Application
    Filed: December 17, 2010
    Publication date: June 30, 2011
    Applicant: HONDA MOTOR CO., LTD.
    Inventor: Soshi Iba
  • Publication number: 20100114807
    Abstract: A reinforcement learning system (1) of the present invention utilizes a value of a first value gradient function (dV1/dt) in the learning performed by a second learning device (122), namely in evaluating a second reward (r2(t)). The first value gradient function (dV1/dt) is a temporal differential of a first value function (V1) which is defined according to a first reward (r1(t)) obtained from an environment and serves as the learning result given by a first learning device (121). An action policy which should be taken by a robot (R) to execute a task is determined based on the second reward (r2(t)).
    Type: Application
    Filed: November 2, 2009
    Publication date: May 6, 2010
    Applicants: HONDA MOTOR CO., LTD., ADVANCED TELECOMMUNICATIONS RESEARCH INSTITUTE INTERNATIONAL
    Inventors: Yugo Ueda, Tadaaki Hasegawa, Soshi Iba, Koji Akatsuka, Norikazu Sugimoto
  • Publication number: 20090326710
    Abstract: A behavior control system is capable of causing an agent to carry out a task by smooth motions. The behavior control system makes it possible to reproduce a typical shape characteristic of a reference trajectory, i.e., the characteristic of a motion of an instructor carrying out a task, by using a first model defined on the basis of a plurality of reference trajectories representing the position of a first state variable in a time-series manner. Further, a learning trajectory representing the position of a second state variable in a time-series manner is generated on the basis of a second model, which represents an agent's motion in which the position of the second state variable corresponding to the first state variable and one or a plurality of time differential values (a displacing velocity and acceleration) thereof continuously change, in addition to the first model.
    Type: Application
    Filed: June 18, 2009
    Publication date: December 31, 2009
    Applicant: HONDA MOTOR CO., LTD.
    Inventor: Soshi Iba
  • Publication number: 20090326679
    Abstract: A behavior estimating system is provided. According to the system, an estimated trajectory which provides the basis on which the behavior of an agent is controlled is generated according to a second model which represents a motion of the instructor in which the position of a state variable and the time differential values thereof (a displacing velocity and acceleration) continuously change, in addition to the position of a characteristic point of a reference trajectory which represents a motion of the instructor and a plurality of first models which represent a plurality of shape characteristics of reference trajectories. The behavior manner of the instructor is estimated to be the one corresponding to the first model whose allowed fluctuation, under the condition that the estimated trajectory passes through a characteristic state variable or a range in the vicinity thereof, is the smallest and whose stability is the highest.
    Type: Application
    Filed: June 18, 2009
    Publication date: December 31, 2009
    Applicant: HONDA MOTOR CO., LTD.
    Inventor: Soshi Iba
  • Publication number: 20080312772
    Abstract: The present invention provides a motion control system to control a motion of a second motion body by considering an environment which a human contacts and a motion mode appropriate to the environment, and an environment which a robot actually contacts. The motion mode is learned based on an idea that it is sufficient to learn only a feature part of the motion mode of the human without a necessity to learn the others. Moreover, based on an idea that it is sufficient to reproduce only the feature part of the motion mode of the human without a necessity to reproduce the others, the motion mode of the robot is controlled by using the model obtained from the learning result. Thereby, the motion mode of the robot is controlled by using the motion mode of the human as a prototype without restricting the motion mode thereof more than necessary.
    Type: Application
    Filed: June 12, 2008
    Publication date: December 18, 2008
    Applicants: HONDA MOTOR CO., LTD., ADVANCED TELECOMMUNICATIONS RESEARCH INSTITUTE INTERNATIONAL
    Inventors: Tadaaki Hasegawa, Yugo Ueda, Soshi Iba, Darrin Bentivegna