Patents by Inventor Soshi Iba

Soshi Iba has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Reinforcement learning apparatus, control apparatus, and reinforcement learning method

Patent number: 8886357

Abstract: It is possible to perform robot motor learning in a quick and stable manner using a reinforcement learning apparatus including: a first-type environment parameter obtaining unit that obtains a value of one or more first-type environment parameters; a control parameter value calculation unit that calculates a value of one or more control parameters maximizing a reward by using the value of the one or more first-type environment parameters; a control parameter value output unit that outputs the value of the one or more control parameters to the control object; a second-type environment parameter obtaining unit that obtains a value of one or more second-type environment parameters; a virtual external force calculation unit that calculates the virtual external force by using the value of the one or more second-type environment parameters; and a virtual external force output unit that outputs the virtual external force to the control object.

Type: Grant

Filed: March 28, 2012

Date of Patent: November 11, 2014

Assignees: Advanced Telecommunications Research Institute International, Honda Motor Co., Ltd.

Inventors: Norikazu Sugimoto, Yugo Ueda, Tadaaki Hasegawa, Soshi Iba, Koji Akatsuka

Robot and behavior control system for the same

Patent number: 8768507

Abstract: A robot and a behavior control system for the same are capable of ensuring continued stability while carrying out a specified task by a motion of a body of the robot. Time-series changing patterns of first state variables indicating a motional state of an arm are generated according to a stochastic transition model such that at least one of the first state variables follows a first specified motion trajectory for causing the robot to carry out a specified task. Similarly, time-series changing patterns of second state variables indicating a motional state of the body are generated according to the stochastic transition model such that the second state variables satisfy a continuously stable dynamic condition.

Type: Grant

Filed: September 20, 2011

Date of Patent: July 1, 2014

Assignee: Honda Motor Co., Ltd.

Inventors: Soshi Iba, Tadaaki Hasegawa

Behavior control system and robot

Patent number: 8660699

Abstract: A behavior control system capable of controlling the behavior of an agent (robot) such that the agent securely applies a force to a moving object. The behavior control system calculates the degree of overlapping of a time-series probability density distribution between a predicted position trajectory of an object (ball) and a position trajectory candidate of a counter object (racket). Further, a behavior plan of the agent (robot) is generated such that the counter object is moved according to a desired position trajectory, which is a mean position trajectory or a central position trajectory of a position trajectory candidate of the counter object which has the highest degree of overlapping with the predicted position trajectory of the object among a plurality of position trajectory candidates of the counter object.
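The selection step described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each trajectory is one-dimensional and given as a list of per-time-step Gaussian (mean, variance) pairs, and that the "degree of overlapping" is the summed integral of the product of the two densities at each step. The abstract does not specify the overlap measure; the function names and sample values below are hypothetical.

```python
import math

def gaussian_overlap(mean_a, var_a, mean_b, var_b):
    """Integral of the product of two 1-D Gaussian densities (assumed overlap measure)."""
    var_sum = var_a + var_b
    diff = mean_a - mean_b
    return math.exp(-diff * diff / (2.0 * var_sum)) / math.sqrt(2.0 * math.pi * var_sum)

def trajectory_overlap(predicted, candidate):
    """Sum the per-time-step overlap of two trajectories given as [(mean, variance), ...]."""
    return sum(
        gaussian_overlap(ma, va, mb, vb)
        for (ma, va), (mb, vb) in zip(predicted, candidate)
    )

def select_desired_trajectory(predicted, candidates):
    """Return the mean trajectory of the candidate with the highest overlap."""
    best = max(candidates, key=lambda c: trajectory_overlap(predicted, c))
    return [mean for mean, _ in best]

# Hypothetical data: predicted ball positions and two racket trajectory candidates.
ball = [(0.0, 0.04), (0.5, 0.04), (1.0, 0.04)]
racket_candidates = [
    [(1.0, 0.01), (1.0, 0.01), (1.0, 0.01)],  # stays at one position
    [(0.1, 0.01), (0.5, 0.01), (0.9, 0.01)],  # roughly follows the ball
]
desired = select_desired_trajectory(ball, racket_candidates)
```

Here the second candidate tracks the predicted ball positions more closely at every step, so its mean trajectory is the one selected.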

Type: Grant

Filed: December 17, 2010

Date of Patent: February 25, 2014

Assignee: Honda Motor Co., Ltd.

Inventor: Soshi Iba

BEHAVIOR CONTROL SYSTEM

Publication number: 20130345865

Abstract: A system capable of causing an agent to continuously execute a plurality of different subtasks while securing the continuity of behavior of the agent is provided. A plurality of state variable trajectories representing the time series of a state variable of an object are generated according to a stochastic transition model in which the state variable of the object is represented as a random variable. The stochastic transition model is defined so that the transition mode of the state variable is determined according to an execution probability of each subtask in which a probability distribution is represented by a Dirichlet distribution. An operation of the agent is controlled so that the state of the object transits according to one state variable trajectory (desired state variable trajectory) maximizing or optimizing the joint probability of a whole of the stochastic transition model among the plurality of state variable trajectories.
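As a rough illustration of the Dirichlet-distributed execution probabilities mentioned in this abstract, the sketch below draws one probability vector over three subtasks and then picks a subtask according to it. It covers only that sampling step, not the stochastic transition model or the joint-probability maximization. The concentration parameters and function names are hypothetical, and sampling through normalized gamma draws is a standard construction, not something taken from the patent.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def choose_subtask(probabilities, rng):
    """Pick a subtask index according to the given execution probabilities."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical concentration parameters favoring the first of three subtasks.
execution_probs = sample_dirichlet([2.0, 1.0, 1.0], rng)
subtask = choose_subtask(execution_probs, rng)
```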

Type: Application

Filed: February 22, 2013

Publication date: December 26, 2013

Applicant: HONDA MOTOR CO., LTD.

Inventors: Soshi Iba, Akinobu Hayashi

Reinforcement learning system

Patent number: 8392346

Abstract: A reinforcement learning system (1) of the present invention utilizes a value of a first value gradient function (dV1/dt) in the learning performed by a second learning device (122), namely in evaluating a second reward (r2(t)). The first value gradient function (dV1/dt) is a temporal differential of a first value function (V1) which is defined according to a first reward (r1(t)) obtained from an environment and serves as a learning result given by a first learning device (121). An action policy which should be taken by a robot (R) to execute a task is determined based on the second reward (r2(t)).
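One way to picture the role of dV1/dt is the sketch below, which approximates the temporal differential of V1 by a finite difference and adds it to a second reward. The additive form, the weight, and the toy value function are assumptions made for illustration only; the abstract states that the gradient value is used in evaluating r2(t) but does not give the formula.

```python
def value_gradient(v1, state, next_state, dt):
    """Finite-difference approximation of dV1/dt between two consecutive states."""
    return (v1(next_state) - v1(state)) / dt

def second_reward(task_reward, v1, state, next_state, dt, weight=1.0):
    """Assumed form: r2(t) = task reward + weight * dV1/dt."""
    return task_reward + weight * value_gradient(v1, state, next_state, dt)

def v1(state):
    """Hypothetical first value function: larger when closer to a goal at 1.0."""
    return -abs(1.0 - state)

# Moving toward the goal raises V1 over time, so the second reward is positive.
r2_toward = second_reward(0.0, v1, state=0.2, next_state=0.3, dt=0.1)
# Moving away from the goal lowers V1 over time, so the second reward is negative.
r2_away = second_reward(0.0, v1, state=0.3, next_state=0.2, dt=0.1)
```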

Type: Grant

Filed: November 2, 2009

Date of Patent: March 5, 2013

Assignees: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International

Inventors: Yugo Ueda, Tadaaki Hasegawa, Soshi Iba, Koji Akatsuka, Norikazu Sugimoto

Motion control system, motion control method, and motion control program

Patent number: 8315740

Abstract: The present invention provides a motion control system to control a motion of a second motion body, by considering an environment which a human contacts and a motion mode appropriate to the environment, and an environment which a robot actually contacts. The motion mode is learned based on an idea that it is sufficient to learn only a feature part of the motion mode of the human without a necessity to learn the others. Moreover, based on an idea that it is sufficient to reproduce only the feature part of the motion mode of the human without a necessity to reproduce the others, the motion mode of the robot is controlled by using the model obtained from the learning result. Thereby, the motion mode of the robot is controlled by using the motion mode of the human as a prototype without restricting the motion mode thereof more than necessary.

Type: Grant

Filed: June 12, 2008

Date of Patent: November 20, 2012

Assignees: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International

Inventors: Tadaaki Hasegawa, Yugo Ueda, Soshi Iba, Darrin Bentivegna

REINFORCEMENT LEARNING APPARATUS, CONTROL APPARATUS, AND REINFORCEMENT LEARNING METHOD

Publication number: 20120253514

Abstract: It is possible to perform robot motor learning in a quick and stable manner using a reinforcement learning apparatus including: a first-type environment parameter obtaining unit that obtains a value of one or more first-type environment parameters; a control parameter value calculation unit that calculates a value of one or more control parameters maximizing a reward by using the value of the one or more first-type environment parameters; a control parameter value output unit that outputs the value of the one or more control parameters to the control object; a second-type environment parameter obtaining unit that obtains a value of one or more second-type environment parameters; a virtual external force calculation unit that calculates the virtual external force by using the value of the one or more second-type environment parameters; and a virtual external force output unit that outputs the virtual external force to the control object.

Type: Application

Filed: March 28, 2012

Publication date: October 4, 2012

Inventors: Norikazu Sugimoto, Yugo Ueda, Tadaaki Hasegawa, Soshi Iba, Koji Akatsuka

ROBOT AND BEHAVIOR CONTROL SYSTEM FOR THE SAME

Publication number: 20120078416

Abstract: A robot and a behavior control system for the same are capable of ensuring continued stability while carrying out a specified task by a motion of a body of the robot. Time-series changing patterns of first state variables indicating a motional state of an arm are generated according to a stochastic transition model such that at least one of the first state variables follows a first specified motion trajectory for causing the robot to carry out a specified task. Similarly, time-series changing patterns of second state variables indicating a motional state of the body are generated according to the stochastic transition model such that the second state variables satisfy a continuously stable dynamic condition.

Type: Application

Filed: September 20, 2011

Publication date: March 29, 2012

Applicant: HONDA MOTOR CO., LTD.

Inventors: Soshi Iba, Tadaaki Hasegawa

Behavior estimating system

Patent number: 8099374

Abstract: A behavior estimating system is provided. According to the system, an estimated trajectory which provides the basis on which the behavior of an agent is controlled is generated according to a second model which represents a motion of an instructor in which the position and the displacing velocity of the position of a state variable and the time differential values thereof continuously change, in addition to the position of a characteristic point of a reference trajectory which represents a motion of the instructor and a plurality of first models which represent a plurality of shape characteristics of reference trajectories. A behavior manner corresponding to a first model whose fluctuation, which is allowed under a condition that an estimated trajectory passes a characteristic state variable or a range in the vicinity thereof, is the smallest and whose stability is the highest is estimated as the behavior manner of the instructor.

Type: Grant

Filed: June 18, 2009

Date of Patent: January 17, 2012

Assignee: Honda Motor Co., Ltd.

Inventor: Soshi Iba

Behavior control system

Patent number: 8078321

Abstract: A behavior control system is capable of causing an agent to carry out a task by smooth motions. The behavior control system makes it possible to reproduce a typical shape characteristic of a reference trajectory, i.e., the characteristic of a motion of an instructor carrying out a task, by using a first model defined on the basis of a plurality of reference trajectories representing the position of a first state variable in a time-series manner. Further, a learning trajectory representing the position of a second state variable in a time-series manner is generated on the basis of a second model, which represents an agent's motion in which the position of the second state variable corresponding to the first state variable and one or a plurality of time differential values (a displacing velocity and acceleration) thereof continuously change, in addition to the first model.

Type: Grant

Filed: June 18, 2009

Date of Patent: December 13, 2011

Assignee: Honda Motor Co., Ltd.

Inventor: Soshi Iba

BEHAVIOR CONTROL SYSTEM AND ROBOT

Publication number: 20110160908

Abstract: A behavior control system capable of controlling the behavior of an agent (robot) such that the agent securely applies a force to a moving object. The behavior control system calculates the degree of overlapping of a time-series probability density distribution between a predicted position trajectory of an object (ball) and a position trajectory candidate of a counter object (racket). Further, a behavior plan of the agent (robot) is generated such that the counter object is moved according to a desired position trajectory, which is a mean position trajectory or a central position trajectory of a position trajectory candidate of the counter object which has the highest degree of overlapping with the predicted position trajectory of the object among a plurality of position trajectory candidates of the counter object.

Type: Application

Filed: December 17, 2010

Publication date: June 30, 2011

Applicant: HONDA MOTOR CO., LTD.

Inventor: Soshi Iba

REINFORCEMENT LEARNING SYSTEM

Publication number: 20100114807

Abstract: A reinforcement learning system (1) of the present invention utilizes a value of a first value gradient function (dV1/dt) in the learning performed by a second learning device (122), namely in evaluating a second reward (r2(t)). The first value gradient function (dV1/dt) is a temporal differential of a first value function (V1) which is defined according to a first reward (r1(t)) obtained from an environment and serves as a learning result given by a first learning device (121). An action policy which should be taken by a robot (R) to execute a task is determined based on the second reward (r2(t)).

Type: Application

Filed: November 2, 2009

Publication date: May 6, 2010

Applicants: HONDA MOTOR CO., LTD., ADVANCED TELECOMMUNICATIONS RESEARCH INSTITUTE INTERNATIONAL

Inventors: Yugo Ueda, Tadaaki Hasegawa, Soshi Iba, Koji Akatsuka, Norikazu Sugimoto

BEHAVIOR CONTROL SYSTEM

Publication number: 20090326710

Abstract: A behavior control system is capable of causing an agent to carry out a task by smooth motions. The behavior control system makes it possible to reproduce a typical shape characteristic of a reference trajectory, i.e., the characteristic of a motion of an instructor carrying out a task, by using a first model defined on the basis of a plurality of reference trajectories representing the position of a first state variable in a time-series manner. Further, a learning trajectory representing the position of a second state variable in a time-series manner is generated on the basis of a second model, which represents an agent's motion in which the position of the second state variable corresponding to the first state variable and one or a plurality of time differential values (a displacing velocity and acceleration) thereof continuously change, in addition to the first model.

Type: Application

Filed: June 18, 2009

Publication date: December 31, 2009

Applicant: HONDA MOTOR CO., LTD.

Inventor: Soshi Iba

BEHAVIOR ESTIMATING SYSTEM

Publication number: 20090326679

Abstract: A behavior estimating system is provided. According to the system, an estimated trajectory which provides the basis on which the behavior of an agent is controlled is generated according to a second model which represents a motion of an instructor in which the position and the displacing velocity of the position of a state variable and the time differential values thereof continuously change, in addition to the position of a characteristic point of a reference trajectory which represents a motion of the instructor and a plurality of first models which represent a plurality of shape characteristics of reference trajectories. A behavior manner corresponding to a first model whose fluctuation, which is allowed under a condition that an estimated trajectory passes a characteristic state variable or a range in the vicinity thereof, is the smallest and whose stability is the highest is estimated as the behavior manner of the instructor.

Type: Application

Filed: June 18, 2009

Publication date: December 31, 2009

Applicant: HONDA MOTOR CO., LTD.

Inventor: Soshi Iba

MOTION CONTROL SYSTEM, MOTION CONTROL METHOD, AND MOTION CONTROL PROGRAM

Publication number: 20080312772

Abstract: The present invention provides a motion control system to control a motion of a second motion body, by considering an environment which a human contacts and a motion mode appropriate to the environment, and an environment which a robot actually contacts. The motion mode is learned based on an idea that it is sufficient to learn only a feature part of the motion mode of the human without a necessity to learn the others. Moreover, based on an idea that it is sufficient to reproduce only the feature part of the motion mode of the human without a necessity to reproduce the others, the motion mode of the robot is controlled by using the model obtained from the learning result. Thereby, the motion mode of the robot is controlled by using the motion mode of the human as a prototype without restricting the motion mode thereof more than necessary.

Type: Application

Filed: June 12, 2008

Publication date: December 18, 2008

Applicants: HONDA MOTOR CO., LTD., ADVANCED TELECOMMUNICATIONS RESEARCH INSTITUTE INTERNATIONAL

Inventors: Tadaaki Hasegawa, Yugo Ueda, Soshi Iba, Darrin Bentivegna
