Patents by Inventor Hitoshi Yanami
Hitoshi Yanami has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11645574
Abstract: A non-transitory, computer-readable recording medium stores therein a reinforcement learning program that uses a value function and causes a computer to execute a process comprising: estimating first coefficients of the value function represented in a quadratic form of inputs at times earlier than a present time and outputs at the present time and those earlier times, the first coefficients being estimated based on the inputs at the earlier times, the outputs at the present time and the earlier times, and costs or rewards that correspond to the inputs at the earlier times; and determining second coefficients that define a control law, based on the value function that uses the estimated first coefficients, and determining input values at times after estimation of the first coefficients.
Type: Grant
Filed: September 13, 2018
Date of Patent: May 9, 2023
Assignees: FUJITSU LIMITED (Kawasaki, Japan), OKINAWA INSTITUTE OF SCIENCE AND TECHNOLOGY SCHOOL CORPORATION
Inventors: Tomotake Sasaki, Eiji Uchibe, Kenji Doya, Hirokazu Anai, Hitoshi Yanami, Hidenao Iwane
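The first step above, estimating the quadratic-form coefficients of the value function from past inputs, outputs, and their costs, reduces to a regression problem. A minimal sketch in a made-up scalar setting with one past input u and one current output y (the coefficients and data are illustrative, not from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical value function in quadratic form of the previous input u
# and the current output y: V(u, y) = c1*u^2 + c2*u*y + c3*y^2.
true_c = np.array([2.0, -1.0, 0.5])

# Observed past inputs, outputs, and the costs they incurred.
U = rng.normal(size=200)
Y = rng.normal(size=200)
Phi = np.column_stack([U**2, U * Y, Y**2])  # quadratic-form features
costs = Phi @ true_c                        # noiseless for illustration

# Estimate the first coefficients by least squares over the observed data.
c_hat, *_ = np.linalg.lstsq(Phi, costs, rcond=None)
print(np.round(c_hat, 3))
```

With noiseless synthetic data the fit recovers the generating coefficients exactly; in the patented setting the same regression would run over the full quadratic form of multiple lagged inputs and outputs.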
-
Publication number: 20230129842
Abstract: A non-transitory computer-readable recording medium stores a program for causing a computer to execute a process, the process including: generating a plurality of causal relationship candidates, each including a pair of a first answer candidate to each of at least some of first questions and a second answer candidate to one of second questions, based on questionnaire result data; and searching for a solution of a combinatorial optimization problem that minimizes or maximizes the value of an objective function whose value changes according to the causal relationship candidates to be combined, under a constraint condition that at least a predetermined ratio of the plurality of respondents have answers that are the same as the pair of the first answer candidate and the second answer candidate of one of the causal relationship candidates to be combined.
Type: Application
Filed: August 19, 2022
Publication date: April 27, 2023
Applicant: FUJITSU LIMITED
Inventors: Kazuhiro Matsumoto, Masatoshi Ogawa, Hitoshi Yanami, Noriyasu Aso, Hiromitsu Soneda, Katsumi Homma, Natsuki Ishikawa, Hayato Dan
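At toy scale the constrained search above can be brute-forced. A sketch under assumed data, where the objective simply counts the candidates used and the constraint requires that at least 80% of respondents match some combined candidate pair (all answers and numbers are invented for illustration):

```python
from itertools import combinations

# Hypothetical questionnaire results: each respondent's (Q1, Q2) answer pair.
respondents = [("yes", "high"), ("yes", "high"), ("no", "low"),
               ("yes", "low"), ("no", "low"), ("yes", "high")]
# Causal-relationship candidates: (first answer, second answer) pairs.
candidates = [("yes", "high"), ("no", "low"), ("yes", "low")]
min_ratio = 0.8  # predetermined ratio of respondents that must match

def coverage(combo):
    # Fraction of respondents whose answer pair equals some candidate in combo.
    hit = sum(any(r == c for c in combo) for r in respondents)
    return hit / len(respondents)

# Exhaustive search for the smallest combination (objective = size)
# that satisfies the coverage constraint.
best = None
for k in range(1, len(candidates) + 1):
    for combo in combinations(candidates, k):
        if coverage(combo) >= min_ratio and best is None:
            best = combo
    if best:
        break
print(best)
</n:remove>```

Real questionnaires would need the combinatorial solver the abstract describes rather than exhaustive enumeration; the sketch only shows the constraint and objective interacting.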
-
Patent number: 11619915
Abstract: A computer-implemented reinforcement learning method includes: determining a parameter of a reinforcement learner that causes, with a specific probability, the state of a control object to satisfy a constraint condition at a first timing following a second timing at which the state of the control object satisfies the constraint condition, the parameter being determined based on a target probability of satisfying the constraint condition related to the state of the control object and a specific time within which a controller brings a state of the control object that does not satisfy the constraint condition to one that does; and determining a control input to the control object by either the reinforcement learner or the controller, based on whether the state of the control object satisfies the constraint condition at a specific timing.
Type: Grant
Filed: February 21, 2020
Date of Patent: April 4, 2023
Assignee: FUJITSU LIMITED
Inventors: Hidenao Iwane, Junichi Shigezumi, Yoshihiro Okawa, Tomotake Sasaki, Hitoshi Yanami
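The second determining step, switching between the reinforcement learner and the fallback controller according to constraint satisfaction, can be sketched for a one-dimensional state. The dynamics, the constraint bound, the controller gain, and the fixed exploratory input are all assumptions for illustration:

```python
LIMIT = 1.0  # constraint condition: |state| <= LIMIT

def reinforcement_learner(state):
    # Stand-in for the learned policy: a fixed exploratory input.
    return 0.4

def fallback_controller(state):
    # Input that contracts the state to 0.2 of its value, restoring
    # constraint satisfaction within one step (the "specific time").
    return -0.8 * state

def control_input(state):
    # Dispatch rule from the abstract: use the learner while the constraint
    # holds, hand control to the fallback controller otherwise.
    return reinforcement_learner(state) if abs(state) <= LIMIT else fallback_controller(state)

state = 2.0  # start in violation of the constraint
trajectory = [state]
for _ in range(10):
    state += control_input(state)  # integrator dynamics x' = x + u (assumed)
    trajectory.append(state)
print([round(s, 3) for s in trajectory])
```

Whenever exploration pushes the state past the bound, the next input comes from the fallback controller and the state is back inside the constraint one step later.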
-
Patent number: 11573537
Abstract: A non-transitory, computer-readable recording medium stores a program of reinforcement learning by a state-value function. The program causes a computer to execute a process including: calculating a temporal difference (TD) error based on an estimated state-value function, the TD error being calculated by giving a perturbation to each component of a feedback coefficient matrix that provides a policy; calculating, based on the TD error and the perturbation, an estimated gradient function matrix acquired by estimating the gradient function matrix of the state-value function with respect to the feedback coefficient matrix for a state of a controlled object, when state variation of the controlled object in the reinforcement learning is described by a linear difference equation and an immediate cost or an immediate reward of the controlled object is described in a quadratic form of the state and an input; and updating the feedback coefficient matrix using the estimated gradient function matrix.
Type: Grant
Filed: September 13, 2018
Date of Patent: February 7, 2023
Assignees: FUJITSU LIMITED, OKINAWA INSTITUTE OF SCIENCE AND TECHNOLOGY SCHOOL CORPORATION
Inventors: Tomotake Sasaki, Eiji Uchibe, Kenji Doya, Hirokazu Anai, Hitoshi Yanami, Hidenao Iwane
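For a scalar linear system with quadratic cost, the perturbation-based update of the feedback coefficient can be illustrated directly on the closed-form value coefficient. This sketch replaces the TD-error gradient estimate with an exact finite-difference gradient, so it mirrors only the update structure; the system parameters are invented:

```python
# Scalar linear system x' = a*x + b*u with policy u = -f*x and
# immediate cost q*x^2 + r*u^2 (the quadratic-form setting of the abstract).
a, b, q, r = 1.1, 1.0, 1.0, 1.0

def value_coeff(f):
    """Coefficient p of the state-value function V(x) = p*x^2 under the
    closed loop x' = (a - b*f)*x."""
    closed = a - b * f
    assert abs(closed) < 1, "feedback gain must stabilize the loop"
    return (q + r * f * f) / (1 - closed * closed)

def grad_estimate(f, eps=1e-4):
    # Perturbation-based gradient estimate (stand-in for the TD-error version).
    return (value_coeff(f + eps) - value_coeff(f - eps)) / (2 * eps)

f = 0.9  # initial stabilizing feedback coefficient
for _ in range(2000):
    f -= 0.01 * grad_estimate(f)  # gradient update of the feedback coefficient
print(round(f, 3))
```

The gain converges near the LQR-optimal feedback for this system (about 0.703); the patented method obtains the same kind of gradient from TD errors without access to the closed-form value coefficient.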
-
Patent number: 11543789
Abstract: A reinforcement learning method executed by a computer includes: calculating a degree of risk for a state of a controlled object at a current time point with respect to a constraint condition related to the state of the controlled object, the degree of risk being calculated based on a predicted value of the state of the controlled object at a future time point, the predicted value being obtained from model information defining a relationship between the state of the controlled object and a control input to the controlled object; and determining the control input to the controlled object at the current time point from a range defined according to the calculated degree of risk, so that the range becomes narrower as the calculated degree of risk increases.
Type: Grant
Filed: February 21, 2020
Date of Patent: January 3, 2023
Assignee: FUJITSU LIMITED
Inventors: Yoshihiro Okawa, Tomotake Sasaki, Hidenao Iwane, Hitoshi Yanami
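The range-narrowing rule can be sketched in one dimension, assuming a linear prediction model and a risk measure equal to how close the predicted state comes to its bound (all constants are illustrative):

```python
# Model information (assumed): x' = A*x + B*u.
A, B = 0.9, 1.0
X_MAX = 10.0   # constraint condition: state must stay at or below X_MAX
U_FULL = 2.0   # nominal half-width of the admissible input range

def degree_of_risk(x):
    # Predicted future state under zero input, scaled against the bound.
    predicted = A * x
    return max(0.0, min(1.0, predicted / X_MAX))

def input_range(x):
    # The range from which the control input is drawn narrows as risk grows.
    half = U_FULL * (1.0 - degree_of_risk(x))
    return (-half, half)

for x in (0.0, 5.0, 9.0):
    print(x, input_range(x))
```

A riskier current state yields a strictly tighter interval, so exploratory inputs chosen from it cannot push the predicted state far toward the constraint boundary.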
-
Publication number: 20210109491
Abstract: A policy improvement method for reinforcement learning using a state value function, the method including: calculating, when an immediate cost or immediate reward of a control target in the reinforcement learning is defined by a state and an input, an estimated parameter that estimates a parameter of the state value function for the state of the control target; contracting a state space of the control target using the calculated estimated parameter; generating a TD error for the estimated state value function that estimates the state value function in the contracted state space of the control target by perturbing each parameter that defines the policy; generating an estimated gradient that estimates the gradient of the state value function with respect to the parameter that defines the policy, based on the generated TD error and the perturbation; and updating the parameter that defines the policy using the generated estimated gradient.
Type: Application
Filed: September 29, 2020
Publication date: April 15, 2021
Applicant: FUJITSU LIMITED
Inventors: Junichi Shigezumi, Tomotake Sasaki, Hidenao Iwane, Hitoshi Yanami
-
Publication number: 20210063974
Abstract: A method for reinforcement learning performed by a computer is disclosed. The method includes: predicting a state of a target to be controlled in reinforcement learning at each time point at which a state of the target is measured, the time point being included in a period from the time point at which a present action is determined to the time point at which a subsequent action is determined; calculating a degree of risk concerning the state of the target at each time point with respect to a constraint condition, based on a result of the prediction; specifying a search range concerning the present action on the target in accordance with the calculated degree of risk and a degree of impact of the present action on the state of the target at each time point; and determining the present action on the target based on the specified search range.
Type: Application
Filed: August 25, 2020
Publication date: March 4, 2021
Applicant: FUJITSU LIMITED
Inventors: Yoshihiro Okawa, Tomotake Sasaki, Hidenao Iwane, Hitoshi Yanami
-
Publication number: 20200285204
Abstract: A computer-implemented reinforcement learning method includes: determining a parameter of a reinforcement learner that causes, with a specific probability, the state of a control object to satisfy a constraint condition at a first timing following a second timing at which the state of the control object satisfies the constraint condition, the parameter being determined based on a target probability of satisfying the constraint condition related to the state of the control object and a specific time within which a controller brings a state of the control object that does not satisfy the constraint condition to one that does; and determining a control input to the control object by either the reinforcement learner or the controller, based on whether the state of the control object satisfies the constraint condition at a specific timing.
Type: Application
Filed: February 21, 2020
Publication date: September 10, 2020
Applicant: FUJITSU LIMITED
Inventors: Hidenao Iwane, Junichi Shigezumi, Yoshihiro Okawa, Tomotake Sasaki, Hitoshi Yanami
-
Publication number: 20200285208
Abstract: A reinforcement learning method executed by a computer includes: calculating a degree of risk for a state of a controlled object at a current time point with respect to a constraint condition related to the state of the controlled object, the degree of risk being calculated based on a predicted value of the state of the controlled object at a future time point, the predicted value being obtained from model information defining a relationship between the state of the controlled object and a control input to the controlled object; and determining the control input to the controlled object at the current time point from a range defined according to the calculated degree of risk, so that the range becomes narrower as the calculated degree of risk increases.
Type: Application
Filed: February 21, 2020
Publication date: September 10, 2020
Applicant: FUJITSU LIMITED
Inventors: Yoshihiro Okawa, Tomotake Sasaki, Hidenao Iwane, Hitoshi Yanami
-
Publication number: 20200234123
Abstract: A reinforcement learning method executed by a computer includes: calculating, in reinforcement learning that repeatedly executes a learning step for a value function having monotonicity as a characteristic of its value according to a state or an action of a control target, a contribution level of the state or the action of the control target used in the learning step, the contribution level being calculated for each learning step using a basis function used for representing the value function; determining whether to update the value function, based on the value function after each learning step and the contribution level calculated in each learning step; and updating the value function when the determining determines to update the value function.
Type: Application
Filed: January 16, 2020
Publication date: July 23, 2020
Applicant: FUJITSU LIMITED
Inventors: Junichi Shigezumi, Hidenao Iwane, Hitoshi Yanami
-
Publication number: 20200184277
Abstract: A reinforcement learning method is performed by a computer. The method includes: acquiring an input value related to a state and an action of a control target, and a gain of the control target that corresponds to the input value; estimating coefficients of a state-action value function that is a polynomial in a variable representing the action of the control target, or becomes such a polynomial when a value is substituted for a variable representing the state of the control target, based on the acquired input value and the gain; and obtaining an optimum action or an optimum value of the state-action value function with the estimated coefficients by using quantifier elimination.
Type: Application
Filed: December 4, 2019
Publication date: June 11, 2020
Applicant: FUJITSU LIMITED
Inventors: Hidenao Iwane, Tomotake Sasaki, Hitoshi Yanami
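Once a state value is substituted, the state-action value function is a univariate polynomial in the action, and the optimum the abstract obtains via quantifier elimination can, in this special case, be read off the real critical points. A sketch with an invented quartic Q (real quantifier elimination needs a computer algebra system and also covers multivariate and fully symbolic cases):

```python
import numpy as np

# Hypothetical state-action value function, polynomial in the action a after
# the state s is substituted: Q(a) = -a^4 + 2a^2 + s*a.
def q_coeffs(s):
    return np.array([-1.0, 0.0, 2.0, s, 0.0])  # highest degree first

def optimal_action(s):
    # Evaluate Q at the real critical points of its derivative and keep
    # the maximizer (a univariate shortcut for what QE derives symbolically).
    dq = np.polyder(q_coeffs(s))
    crit = np.roots(dq)
    crit = crit.real[np.abs(crit.imag) < 1e-9]  # keep real roots only
    return max(crit, key=lambda a: np.polyval(q_coeffs(s), a))

print(round(optimal_action(0.5), 2))
```

For s = 0.5 the maximizer sits near a ≈ 1.06, which beats every other critical point of this quartic; the quantifier-elimination route would return the same optimum as an exact algebraic expression.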
-
Patent number: 10310587
Abstract: A power-supply control apparatus includes a processor that executes a process. The process includes: calculating, for a first time period, a first predictive value of total power consumption by the power-supply control apparatus and one or more other power-supply control apparatuses to which power is supplied from a power supply; and determining whether to allow a storage battery to be charged in the first time period, based on the first predictive value for the first time period and previous information that is related to the first predictive value and was obtained in a second time period before the first time period.
Type: Grant
Filed: October 26, 2015
Date of Patent: June 4, 2019
Assignees: FUJITSU LIMITED, THE UNIVERSITY OF TOKYO
Inventors: Tomotake Sasaki, Hitoshi Yanami, Junji Kaneko, Shinji Hara
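The charging decision can be sketched as a threshold test, under the assumption that the "previous information" is a list of earlier predictive values and that the new prediction is compared against their mean and a supply capacity (all figures are invented):

```python
# Decide whether this apparatus may charge its storage battery next period.
def allow_charging(predicted_total_kw, previous_predictions_kw, capacity_kw=100.0):
    """Allow charging only if the forecast total consumption leaves headroom
    under the supply capacity and does not exceed the recent norm, with the
    previous predictive values serving as the reference."""
    reference = sum(previous_predictions_kw) / len(previous_predictions_kw)
    return predicted_total_kw < capacity_kw and predicted_total_kw <= reference

history = [80.0, 75.0, 85.0]          # predictive values from earlier periods
print(allow_charging(70.0, history))  # demand below the recent average: charge
print(allow_charging(95.0, history))  # demand above the recent average: defer
```

Deferring the battery charge to low-demand periods flattens the total draw seen by the shared power supply, which is the scheduling effect the patent targets.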
-
Publication number: 20190086876
Abstract: A non-transitory, computer-readable recording medium stores a program of reinforcement learning by a state-value function. The program causes a computer to execute a process including: calculating a temporal difference (TD) error based on an estimated state-value function, the TD error being calculated by giving a perturbation to each component of a feedback coefficient matrix that provides a policy; calculating, based on the TD error and the perturbation, an estimated gradient function matrix acquired by estimating the gradient function matrix of the state-value function with respect to the feedback coefficient matrix for a state of a controlled object, when state variation of the controlled object in the reinforcement learning is described by a linear difference equation and an immediate cost or an immediate reward of the controlled object is described in a quadratic form of the state and an input; and updating the feedback coefficient matrix using the estimated gradient function matrix.
Type: Application
Filed: September 13, 2018
Publication date: March 21, 2019
Applicants: FUJITSU LIMITED, Okinawa Institute of Science and Technology School Corporation
Inventors: Tomotake Sasaki, Eiji Uchibe, Kenji Doya, Hirokazu Anai, Hitoshi Yanami, Hidenao Iwane
-
Publication number: 20190087751
Abstract: A non-transitory, computer-readable recording medium stores therein a reinforcement learning program that uses a value function and causes a computer to execute a process comprising: estimating first coefficients of the value function represented in a quadratic form of inputs at times earlier than a present time and outputs at the present time and those earlier times, the first coefficients being estimated based on the inputs at the earlier times, the outputs at the present time and the earlier times, and costs or rewards that correspond to the inputs at the earlier times; and determining second coefficients that define a control law, based on the value function that uses the estimated first coefficients, and determining input values at times after estimation of the first coefficients.
Type: Application
Filed: September 13, 2018
Publication date: March 21, 2019
Applicants: FUJITSU LIMITED, Okinawa Institute of Science and Technology School Corporation
Inventors: Tomotake Sasaki, Eiji Uchibe, Kenji Doya, Hirokazu Anai, Hitoshi Yanami, Hidenao Iwane
-
Patent number: 9614401
Abstract: A control server according to an embodiment sorts a plurality of notebook PCs into a plurality of groups so that the total remaining energy of the rechargeable batteries in each group is similar to that of the notebook PCs in every other group. The control server then performs local search individually on the sorted groups and generates a control plan for the individual notebook PCs.
Type: Grant
Filed: February 28, 2014
Date of Patent: April 4, 2017
Assignees: FUJITSU LIMITED, THE UNIVERSITY OF TOKYO
Inventors: Hitoshi Yanami, Hidenao Iwane, Tomotake Sasaki, Hirokazu Anai, Junji Kaneko, Shinji Hara, Suguru Fujita
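The balancing step can be approximated with a greedy assignment: taking the PCs in descending order of remaining energy and placing each in the group with the smallest running total keeps the group totals similar. The battery figures here are invented:

```python
# Hypothetical remaining battery energy (Wh) of each notebook PC.
remaining_wh = [50, 42, 38, 33, 29, 27, 21, 16]
n_groups = 2

groups = [[] for _ in range(n_groups)]
totals = [0] * n_groups
for wh in sorted(remaining_wh, reverse=True):
    i = totals.index(min(totals))  # least-loaded group so far
    groups[i].append(wh)
    totals[i] += wh

print(totals)
```

The patent then runs local search within each group to produce the per-PC control plans; the greedy split here only supplies the initial balanced grouping that makes those searches independent.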
-
Publication number: 20160291090
Abstract: A control scheme creation method according to an embodiment includes executing, on a computer, processing that calculates the amount of stored or released energy of each of a plurality of energy storage devices for each of a plurality of periods, based on estimation value information on the amount of energy consumption within a target area and on remaining amount information representing the amount of remaining energy of each of the plurality of energy storage devices. Furthermore, the control scheme creation method includes executing, on the computer, processing that determines storage timing or release timing for each energy storage device for each of the periods, based on the calculated amount of stored or released energy.
Type: Application
Filed: March 4, 2016
Publication date: October 6, 2016
Applicants: FUJITSU LIMITED, THE UNIVERSITY OF TOKYO
Inventors: Tomotake Sasaki, Hitoshi Yanami, Junji Kaneko, Shinji Hara
-
Publication number: 20160154453
Abstract: A power-supply control apparatus includes a processor that executes a process. The process includes: calculating, for a first time period, a first predictive value of total power consumption by the power-supply control apparatus and one or more other power-supply control apparatuses to which power is supplied from a power supply; and determining whether to allow a storage battery to be charged in the first time period, based on the first predictive value for the first time period and previous information that is related to the first predictive value and was obtained in a second time period before the first time period.
Type: Application
Filed: October 26, 2015
Publication date: June 2, 2016
Applicants: FUJITSU LIMITED, The University of Tokyo
Inventors: Tomotake Sasaki, Hitoshi Yanami, Junji Kaneko, Shinji Hara
-
Patent number: 8935131
Abstract: When model expressions of objective functions are generated at the vertexes of a quadrilateral on a plane concerning the P and N channels of transistors in SRAM, an initial number of simulation runs is allocated to each objective function at each designated vertex according to weight values set based on relationships presumed among the objective functions at that vertex. For each objective function at each designated vertex, a first simulation is executed the allocated number of times. A model expression is then generated from the first simulation result, and an evaluation indicator of the approximation accuracy of the model expression is calculated. Then, for each model expression, it is determined whether the model expression influences the yield, and based on its evaluation indicator and the presence or absence of that influence, it is determined whether additional simulation is required for the corresponding objective function.
Type: Grant
Filed: March 25, 2011
Date of Patent: January 13, 2015
Assignee: Fujitsu Limited
Inventors: Hidenao Iwane, Hirokazu Anai, Hitoshi Yanami
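The initial weight-proportional allocation of simulation runs can be sketched with largest-remainder rounding so that the counts stay integral and exhaust the budget (the budget and weights are invented):

```python
# Allocate an initial simulation budget across objective functions in
# proportion to presumed-importance weights, rounding by largest remainder.
def allocate(budget, weights):
    total = sum(weights)
    raw = [budget * w / total for w in weights]
    counts = [int(r) for r in raw]
    # Hand the leftover runs to the largest fractional remainders.
    leftover = budget - sum(counts)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts

print(allocate(100, [3, 2, 1]))  # weights for three objective functions
```

Each objective function then runs its allotted first simulations, after which the accuracy indicator decides whether it earns additional runs.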
-
Patent number: 8843351
Abstract: This method includes: generating a constraint equation from data of an approximate expression of a cost function representing a relationship between a plurality of design parameters and a cost, data of a route in a cost space, and data of a search range in a design parameter space; obtaining a logical expression of a solution for the constraint equation from a quantifier elimination processing unit that carries out processing according to a quantifier elimination method; substituting the coordinates of each of a plurality of points within the search range in the design parameter space into the logical expression of the solution to determine, for each of the plurality of points, whether the logical expression is true or false; and displaying the design parameter space in which a display object including a first point determined to be true is disposed at that point.
Type: Grant
Filed: May 25, 2011
Date of Patent: September 23, 2014
Assignee: Fujitsu Limited
Inventors: Hitoshi Yanami, Hirokazu Anai, Hidenao Iwane
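The substitution step, testing the QE-derived logical expression at each grid point and keeping the true ones for display, can be sketched with an invented solution formula standing in for the QE output:

```python
# Hypothetical logical expression of the solution (stand-in for QE output):
# x^2 + y^2 <= 4 and y >= x, over two design parameters x and y.
def solution_holds(x, y):
    return x * x + y * y <= 4 and y >= x

# Substitute each grid point in the search range into the logical expression
# and collect the points at which it evaluates to true.
grid = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
display_points = [p for p in grid if solution_holds(*p)]
print(len(display_points))
```

The retained points are exactly those where a display object would be drawn in the rendered design parameter space.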
-
Publication number: 20140249793
Abstract: A control server according to an embodiment sorts a plurality of notebook PCs into a plurality of groups so that the total remaining amount of the rechargeable batteries in each group is similar to that of the notebook PCs in every other group. The control server then performs local search individually on the sorted groups and generates a control plan for the individual notebook PCs.
Type: Application
Filed: February 28, 2014
Publication date: September 4, 2014
Applicants: The University of Tokyo, FUJITSU LIMITED
Inventors: Hitoshi Yanami, Hidenao Iwane, Tomotake Sasaki, Hirokazu Anai, Junji Kaneko, Shinji Hara, Suguru Fujita