Patents by Inventor Tom Ben Zion Zahavy

Tom Ben Zion Zahavy has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240127071
    Abstract: There is provided a computer-implemented method for updating a search distribution of an evolutionary strategies optimizer using an optimizer neural network comprising one or more attention blocks. The method comprises receiving a plurality of candidate solutions, one or more parameters defining the search distribution that the plurality of candidate solutions are sampled from, and fitness score data indicating a fitness of each respective candidate solution of the plurality of candidate solutions. The method further comprises processing, by the one or more attention neural network blocks, the fitness score data using an attention mechanism to generate respective recombination weights corresponding to each respective candidate solution. The method further comprises updating the one or more parameters defining the search distribution based upon the recombination weights applied to the plurality of candidate solutions.
    Type: Application
    Filed: September 27, 2023
    Publication date: April 18, 2024
    Inventors: Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Ben Zion Zahavy, Valentin Clement Dalibard, Christopher Yenchuan Lu, Satinder Singh Baveja, Johan Sebastian Flennerhag
  • Publication number: 20240104389
    Abstract: In one aspect there is provided a method for training a neural network system by reinforcement learning. The neural network system may be configured to receive an input observation characterizing a state of an environment interacted with by an agent and to select and output an action in accordance with a policy aiming to satisfy an objective. The method may comprise obtaining a policy set comprising one or more policies for satisfying the objective and determining a new policy based on the one or more policies. The determining may include one or more optimization steps that aim to maximize a diversity of the new policy relative to the policy set under the condition that the new policy satisfies a minimum performance criterion based on an expected return that would be obtained by following the new policy.
    Type: Application
    Filed: February 4, 2022
    Publication date: March 28, 2024
    Inventors: Tom Ben Zion Zahavy, Brendan Timothy O'Donoghue, Andre da Motta Salles Barreto, Johan Sebastian Flennerhag, Volodymyr Mnih, Satinder Singh Baveja
  • Publication number: 20230144995
    Abstract: A reinforcement learning system, method, and computer program code for controlling an agent to perform a plurality of tasks while interacting with an environment. The system learns options, where an option comprises a sequence of primitive actions performed by the agent under control of an option policy neural network. In implementations the system discovers options which are useful for multiple different tasks by meta-learning rewards for training the option policy neural network whilst the agent is interacting with the environment.
    Type: Application
    Filed: June 7, 2021
    Publication date: May 11, 2023
    Inventors: Vivek Veeriah Jeya Veeraiah, Tom Ben Zion Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado Philip van Hasselt, David Silver, Satinder Singh Baveja
  • Patent number: 10282462
    Abstract: A multi-modal computer classification network system for use in classifying data records is described herein. The system includes a memory device, a first classification computer server, a second classification computer server, and a policy computer server. The memory device includes an item records database and a labeling database. The first classification computer server includes a first classifier program that is configured to select an item record from the item database and generate a first classification record including a first ranked list of class labels. The second classification computer server includes a second classifier program that is configured to generate a second classification record including a second ranked list of class labels. The policy computer server includes a policy network that is programmed to determine a predicted class label based on the first and second ranked lists of class labels.
    Type: Grant
    Filed: October 31, 2016
    Date of Patent: May 7, 2019
    Assignee: WALMART APOLLO, LLC
    Inventors: Alessandro Magnani, Tom Ben Zion Zahavy, Abhinandan Krishnan, Shie Mannor
  • Publication number: 20180121533
    Abstract: A multi-modal computer classification network system for use in classifying data records is described herein. The system includes a memory device, a first classification computer server, a second classification computer server, and a policy computer server. The memory device includes an item records database and a labeling database. The first classification computer server includes a first classifier program that is configured to select an item record from the item database and generate a first classification record including a first ranked list of class labels. The second classification computer server includes a second classifier program that is configured to generate a second classification record including a second ranked list of class labels. The policy computer server includes a policy network that is programmed to determine a predicted class label based on the first and second ranked lists of class labels.
    Type: Application
    Filed: October 31, 2016
    Publication date: May 3, 2018
    Inventors: Alessandro Magnani, Tom Ben Zion Zahavy, Abhinandan Krishnan, Shie Mannor
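
Illustrative sketch for publication number 20240127071: the abstract describes turning fitness scores into recombination weights with an attention mechanism and then updating the search distribution from the weighted candidates. The Python below is only a minimal sketch of that idea under stated assumptions, not the method claimed in the application. It assumes a single scalar self-attention head with hand-picked, unlearned parameters (w_query, w_key, w_value), a diagonal Gaussian search distribution, z-scored fitness as the only input feature, and minimization (lower score is better). All function names are hypothetical.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def standardize(scores):
    """Z-score the fitness values so the attention input is scale-free."""
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    std = math.sqrt(var) or 1.0  # fall back to 1.0 when all scores are equal
    return [(s - mean) / std for s in scores]


def recombination_weights(scores, w_query=1.0, w_key=1.0, w_value=-2.0):
    """One scalar self-attention head over fitness features.

    ASSUMPTION: w_query, w_key and w_value are hand-picked placeholders,
    not learned parameters. Lower score means better (minimization).
    """
    feats = standardize(scores)
    queries = [w_query * f for f in feats]
    keys = [w_key * f for f in feats]
    values = [w_value * f for f in feats]
    logits = []
    for q in queries:
        attn = softmax([q * k for k in keys])  # attend over all candidates
        logits.append(sum(a * v for a, v in zip(attn, values)))
    return softmax(logits)  # one weight per candidate, summing to 1


def update_search_distribution(candidates, mean, scores):
    """Return a new (mean, sigma) for a diagonal Gaussian search distribution."""
    weights = recombination_weights(scores)
    dims = range(len(mean))
    new_mean = [sum(w * x[d] for w, x in zip(weights, candidates)) for d in dims]
    # Weighted spread of the candidates around the previous mean.
    new_sigma = [
        math.sqrt(sum(w * (x[d] - mean[d]) ** 2 for w, x in zip(weights, candidates)))
        for d in dims
    ]
    return new_mean, new_sigma


def sphere(x):
    """Toy objective used only for this illustration."""
    return sum(v * v for v in x)


# Usage: a few generations on the toy objective with a fixed seed.
rng = random.Random(0)
mean, sigma = [3.0, -2.0], [1.0, 1.0]
for _ in range(30):
    candidates = [[rng.gauss(m, s) for m, s in zip(mean, sigma)] for _ in range(16)]
    scores = [sphere(x) for x in candidates]
    mean, sigma = update_search_distribution(candidates, mean, scores)
```

In the application the attention parameters belong to a neural network; here they are fixed constants chosen so that better-scoring candidates receive larger weights, which keeps the sketch short and deterministic apart from the seeded sampling.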