Patents by Inventor Takayuki Osogami

Takayuki Osogami has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11947323
    Abstract: A computer-implemented method comprising: receiving data associated with an operational control problem; formulating the operational control problem as an optimization problem; recursively generating a sequence of policies of operational control associated with the operational control problem, wherein each subsequent policy in the sequence is constructed by modifying one or more actions at a single state in a preceding policy in the sequence, and wherein the modifying monotonically changes a risk associated with the subsequent policy; constructing, from the sequence of policies, an optimal solution path, wherein each vertex on the optimal solution path represents an optimal solution to the operational control problem; calculating a ratio of reward to risk for each of the vertices on the path; and selecting one of the policies in the sequence to apply to the operational control problem, based, at least in part, on the calculated ratios.
    Type: Grant
    Filed: October 16, 2021
    Date of Patent: April 2, 2024
    Inventors: Alexander Zadorojniy, Takayuki Osogami
  • Patent number: 11875270
    Abstract: A method, a computer program product, and a system of adversarial semi-supervised one-shot training using a data stream. The method includes receiving a data stream based on an observation, wherein the data stream includes unlabeled data and labeled data. The method also includes training a prediction model with the labeled data using stochastic gradient descent based on a classification loss and an adversarial term and training a representation model with the labeled data and the unlabeled data based on a reconstruction loss and the adversarial term. The adversarial term is a cross-entropy between the middle layer output data from the models. The classification loss is a cross-entropy between the labeled data and an output from the prediction model. The method further includes updating a discriminator with middle layer output data from the prediction model and the representation model and based on a discrimination loss, and discarding the data stream.
    Type: Grant
    Filed: December 8, 2020
    Date of Patent: January 16, 2024
    Assignee: International Business Machines Corporation
    Inventors: Takayuki Katsuki, Takayuki Osogami
  • Publication number: 20230385619
    Abstract: A neuromorphic chip includes synaptic cells including respective resistive devices, axon lines, dendrite lines and switches. The synaptic cells are connected to the axon lines and dendrite lines to form a crossbar array. The axon lines are configured to receive input data and to supply the input data to the synaptic cells. The dendrite lines are configured to receive output data and to supply the output data via one or more respective output lines. A given one of the switches is configured to connect an input terminal to one or more input lines and to changeably connect its one or more output terminals to a given one or more axon lines.
    Type: Application
    Filed: August 7, 2023
    Publication date: November 30, 2023
    Inventors: Atsuya Okazaki, Masatoshi Ishii, Junka Okazawa, Kohji Hosokawa, Takayuki Osogami
  • Patent number: 11823083
    Abstract: An n-steps-ahead value of time-series data is estimated by a prediction model configured to output a sum of discounted m-th order differences of adjacent time steps at each time step, wherein an m-th order difference at a corresponding time step is discounted by using a discount factor such that an m-th order difference discount increases as a time step of the m-th order difference increases.
    Type: Grant
    Filed: November 8, 2019
    Date of Patent: November 21, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Takayuki Osogami
  • Patent number: 11790032
    Abstract: In an approach, a processor obtains a target base strategy for selecting actions of a target agent. A processor obtains an adversarial base strategy for selecting adversarial actions of an adversarial agent. A processor calculates, for each candidate action among a plurality of candidate actions of the target agent, a risk measure of the candidate action based on the adversarial base strategy and a payoff to the target agent in a case where the target agent takes the candidate action and the adversarial agent takes an adversarial action based on the adversarial base strategy. A processor generates a target strategy by adjusting the target base strategy based on the risk measure for each candidate action.
    Type: Grant
    Filed: May 26, 2020
    Date of Patent: October 17, 2023
    Assignee: International Business Machines Corporation
    Inventor: Takayuki Osogami
  • Publication number: 20230297915
    Abstract: A computer implemented method determines a policy for risk sensitive decisions. A computer system receives state and action pairs. The computer system, with initial probabilistic discounted entropic risk measure values for the state and action pairs, determines in a recursive manner current probabilistic discounted entropic risk measure values for the state and action pairs based on a risk factor until the current probabilistic discounted entropic risk measure values reach a desired level. The current probabilistic discounted entropic risk measure values are the initial probabilistic discounted entropic risk measure values for a next determination.
    Type: Application
    Filed: March 16, 2022
    Publication date: September 21, 2023
    Inventor: Takayuki Osogami
  • Patent number: 11763139
    Abstract: A neuromorphic chip includes synaptic cells including respective resistive devices, axon lines, dendrite lines and switches. The synaptic cells are connected to the axon lines and dendrite lines to form a crossbar array. The axon lines are configured to receive input data and to supply the input data to the synaptic cells. The dendrite lines are configured to receive output data and to supply the output data via one or more respective output lines. A given one of the switches is configured to connect an input terminal to one or more input lines and to changeably connect its one or more output terminals to a given one or more axon lines.
    Type: Grant
    Filed: January 19, 2018
    Date of Patent: September 19, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Atsuya Okazaki, Masatoshi Ishii, Junka Okazawa, Kohji Hosokawa, Takayuki Osogami
  • Patent number: 11755946
    Abstract: A cumulative reward of a target system type is predicted by training a prediction model by performing an iteration for each time step. The iteration includes recursively updating a matrix by using the weighted difference between an eligibility trace of a current time step and an eligibility trace of a previous time step, recursively updating a vector by using a reward of a subsequent time step and the eligibility trace of the current time step, and recursively updating an eligibility trace of a subsequent time step by using a feature vector of the subsequent time step. Each feature vector is an encoded representation of a state of a training system of the target system type at a corresponding time step. The matrix and the vector are output as the prediction model for estimating the cumulative reward of a target time step of a target system of the target system type.
    Type: Grant
    Filed: November 8, 2019
    Date of Patent: September 12, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Takayuki Osogami
  • Patent number: 11704542
    Abstract: A computer-implemented method is provided for machine prediction. The method includes forming, by a hardware processor, a Convolutional Dynamic Boltzmann Machine (C-DyBM) by extending a non-convolutional DyBM with a convolutional operation. The method further includes generating, by the hardware processor using the convolution operation of the C-DyBM, a prediction of a future event at time t from a past patch of time-series of observations. The method also includes performing, by the hardware processor, a physical action responsive to the prediction of the future event at time t.
    Type: Grant
    Filed: January 29, 2019
    Date of Patent: July 18, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Takayuki Katsuki, Takayuki Osogami, Akira Koseki, Masaki Ono
  • Publication number: 20230206099
    Abstract: An apparatus for implementing a computing system to predict preferences includes at least one processor device operatively coupled to a memory. The at least one processor device is configured to calculate a parameter relating to a density of a prior distribution at each sample of a set of samples associated with the prior distribution. The at least one parameter includes a distance from each sample to at least one neighboring sample. The at least one processor device is further configured to estimate, for the plurality of samples, at least one differential entropy of at least one posterior distribution associated with at least one observation based on the parameter relating to the density of the prior distribution at each sample and the likelihood of observation for each sample. The estimation is performed without sampling the at least one posterior distribution to reduce consumption of resources of the computing system.
    Type: Application
    Filed: March 1, 2023
    Publication date: June 29, 2023
    Inventors: Takayuki Osogami, Rudy Raymond Harry Putra
  • Publication number: 20230185881
    Abstract: A computer-implemented method is provided for offline reinforcement learning with a dataset. The method includes training a neural network which inputs a state-action pair and outputs a respective Q function for each of a reward and one or more safety constraints, respectively. The neural network has a linear output layer and remaining non-linear layers being represented by a feature mapping function. The training includes obtaining the feature mapping function by constructing Q-functions based on the dataset according to an offline reinforcement algorithm. The training further includes tuning, using the feature mapping function, a weight between the reward and the one or more safety constraints, wherein during the obtaining and the tuning steps, an estimate of a Q-function is provided by subtracting an uncertainty from an expected value of the Q-function. The uncertainty is a function to map the state-action pair to an error size.
    Type: Application
    Filed: December 15, 2021
    Publication date: June 15, 2023
    Inventors: Akifumi Wachi, Takayuki Osogami
  • Publication number: 20230124567
    Abstract: A computer-implemented method comprising: receiving data associated with an operational control problem; formulating the operational control problem as an optimization problem; recursively generating a sequence of policies of operational control associated with the operational control problem, wherein each subsequent policy in the sequence is constructed by modifying one or more actions at a single state in a preceding policy in the sequence, and wherein the modifying monotonically changes a risk associated with the subsequent policy; constructing, from the sequence of policies, an optimal solution path, wherein each vertex on the optimal solution path represents an optimal solution to the operational control problem; calculating a ratio of reward to risk for each of the vertices on the path; and selecting one of the policies in the sequence to apply to the operational control problem, based, at least in part, on the calculated ratios.
    Type: Application
    Filed: October 16, 2021
    Publication date: April 20, 2023
    Inventors: Alexander Zadorojniy, Takayuki Osogami
  • Patent number: 11625631
    Abstract: An apparatus for implementing a computing system to predict preferences includes at least one processor device operatively coupled to a memory. The at least one processor device is configured to calculate a parameter relating to a density of a prior distribution at each sample of a set of samples associated with the prior distribution. The at least one parameter includes a distance from each sample to at least one neighboring sample. The at least one processor device is further configured to estimate, for the plurality of samples, at least one differential entropy of at least one posterior distribution associated with at least one observation based on the parameter relating to the density of the prior distribution at each sample and the likelihood of observation for each sample. The estimation is performed without sampling the at least one posterior distribution to reduce consumption of resources of the computing system.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: April 11, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Takayuki Osogami, Rudy Raymond Harry Putra
  • Publication number: 20220410878
    Abstract: A method is provided for choosing an action of an agent in a first team that competes against a second team. The method includes determining an action, based on first, second and third types of local payoff matrices. The method further includes performing the action. The determining step includes representing, by the first type of local payoff matrices, a payoff to the first team due to a pairwise interaction between agent teammates of the first team. The determining step further includes representing, by the second type of local payoff matrices, a payoff to the first team due to a pairwise interaction between agent opponents from the first team and the second team. The determining step also includes representing, by the third type of local payoff matrices, a payoff to the first team due to a pairwise interaction between agent teammates of the second team.
    Type: Application
    Filed: June 23, 2021
    Publication date: December 29, 2022
    Inventor: Takayuki Osogami
  • Patent number: 11531878
    Abstract: Systems and methods for modelling time-series data include testing a testing model with a plurality of hyper-forgetting rates to select a best performing hyper-forgetting rate. A model optimization is tested using the best performing hyper-forgetting rate with the testing model to test combinations of hyper-parameters to select a best performing combination of hyper-parameters. An error of the model is determined using the model optimization. Model parameters are recursively updated according to the least squares regression by determining a pseudo-inverse of a Hessian of the least squares regression at a current time stamp according to a projection of the time-series data at the current time stamp and the pseudo-inverse of the Hessian at a previous time stamp to determine an optimum model parameter. A next step behavior of the time-series data is predicted using the optimum model parameter. The next step behavior is stored in a database for access by a user.
    Type: Grant
    Filed: February 19, 2019
    Date of Patent: December 20, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Takayuki Osogami
  • Patent number: 11501204
    Abstract: An information processing apparatus includes a history acquisition section configured to acquire history data including a history indicating that a plurality of selection subjects have selected selection objects; a learning processing section configured to allow a choice model to learn a preference of each selection subject for a feature and an environmental dependence of selection of each selection object in each selection environment using the history data, where the choice model uses a feature value possessed by each selection object, the preference of each selection subject for the feature, and the environmental dependence indicative of ease of selection of each selection object in each of a plurality of selection environments to calculate a selectability with which each of the plurality of selection subjects selects each selection object; and an output section configured to output results of learning by the learning processing section.
    Type: Grant
    Filed: March 26, 2019
    Date of Patent: November 15, 2022
    Assignee: International Business Machines Corporation
    Inventors: Takayuki Katsuki, Takayuki Osogami
  • Patent number: 11461703
    Abstract: Methods and systems for selecting and performing group actions include selecting parameters for an approximated action-value function, which determines a reward value associated with a particular group action taken from a particular state, using a determinant of a parameter matrix for the action-value function. A group action is selected using the approximated action-value function and the selected parameters. Agents are triggered to perform respective tasks in the group action.
    Type: Grant
    Filed: January 23, 2019
    Date of Patent: October 4, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Takayuki Osogami, Rudy R. Harry Putra
  • Patent number: 11449731
    Abstract: Provided are a computer program product, a learning apparatus and a learning method. The method includes calculating a first propagation value that is propagated from a propagation source node to a propagation destination node in a neural network including nodes, based on node values of the propagation source node at time points and a weight corresponding to passage of time points based on a first attenuation coefficient. The method also includes updating the first attenuation coefficient by using a first update parameter, that is based on a first propagation value, and an error of the node value of the propagation destination node.
    Type: Grant
    Filed: January 16, 2020
    Date of Patent: September 20, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Takayuki Osogami
  • Publication number: 20220284306
    Abstract: A computer-implemented method is provided for data reduction in a memory device for machine learning. The method includes storing, in the memory device, data that has been used for training in a tree-based fitted Q iteration session which learns an action value function with an ensemble of decision trees from the data. The method further includes determining, by a processor device, samples to be removed from the data based on a number of samples which belong to leaf nodes of the decision trees. The method also includes removing, from the memory device, the determined samples from the data to reduce an amount of the data. The method additionally includes learning, by the processor device, a new ensemble of decision trees using the data from which the determined samples have been removed together with new data.
    Type: Application
    Filed: March 4, 2021
    Publication date: September 8, 2022
    Inventors: Takayuki Osogami, Ryo Iwaki, Kohei Miyaguchi
  • Publication number: 20220284281
    Abstract: Methods and systems for learning a policy model include determining an imitation learning expert policy. A policy model neural network is iteratively trained using the determined imitation learning expert policy, including modifying the policy model neural network at each iteration to decrease a difference between an output of the policy model neural network and a target signal that is based on the determined imitation learning expert policy.
    Type: Application
    Filed: March 5, 2021
    Publication date: September 8, 2022
    Inventors: Ryo Iwaki, Takayuki Osogami, Kohei Miyaguchi