Patents by Inventor Seyed Mehdi Fatemi Booshehri
Seyed Mehdi Fatemi Booshehri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230199031
Abstract: A secured exploration agent for reinforcement learning (RL) is provided. Securitizing an exploration agent includes training the exploration agent to avoid dead-end states and dead-end trajectories. During training, the exploration agent “learns” to identify and avoid dead-end states of a Markov Decision Process (MDP). The secured exploration agent is utilized to safely and efficiently explore the environment, while significantly reducing the training time, as well as the cost and safety concerns associated with conventional RL. The secured exploration agent is employed to guide the behavior of a corresponding exploitation agent. During training, a policy of the exploration agent is iteratively updated to reflect an estimated probability that a state is a dead-end state. The probability, via the exploration policy, that the exploration agent chooses an action that results in a transition to a dead-end state is reduced to reflect the estimated probability that the state is a dead-end state.
Type: Application
Filed: February 17, 2023
Publication date: June 22, 2023
Inventors: Harm Hendrik VAN SEIJEN, Seyed Mehdi FATEMI BOOSHEHRI
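The idea in the abstract above, lowering the chance of picking an action as its estimated dead-end probability rises, can be illustrated with a minimal sketch. This is not the claimed method: the function names `secure_policy` and `update_estimate`, the proportional weighting rule, and the `step_size` parameter are all assumptions chosen for illustration.

```python
import numpy as np

def secure_policy(dead_end_prob):
    """Map per-action dead-end estimates to exploration probabilities.

    Assumption: each action is weighted by (1 - estimated dead-end
    probability), then the weights are normalized.
    """
    p = np.clip(np.asarray(dead_end_prob, dtype=float), 0.0, 1.0)
    weights = 1.0 - p
    total = weights.sum()
    if total == 0.0:
        # Every action looks like a certain dead end: fall back to uniform.
        return np.full(p.shape, 1.0 / p.size)
    return weights / total

def update_estimate(estimate, hit_dead_end, step_size=0.1):
    """Move a running dead-end estimate toward the observed outcome (hypothetical rule)."""
    target = 1.0 if hit_dead_end else 0.0
    return estimate + step_size * (target - estimate)

# Three actions: safe, risky, and certain dead end.
probs = secure_policy([0.0, 0.5, 1.0])
```

Under this assumed rule, the action estimated as a certain dead end receives zero probability, and the risky action is chosen half as often as the safe one.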
-
Publication number: 20230185902
Abstract: Embodiments seek to prevent detection of a sandbox environment by a potential malware application. To this end, execution of the application is monitored, and provide information about the execution to a reinforcement learning machine learning model. The model generates a suggested modification to make to the executing application. The model is provided with information indicating whether the application executed successfully or not, and this information is used to train the model for additional modifications. By modifying the potential malware execution during its execution, detection of a sandbox environment is prevented, and analysis of the potential malware applications features are better understood.
Type: Application
Filed: January 30, 2023
Publication date: June 15, 2023
Inventors: Jugal PARIKH, Geoffrey Lyall McDonald, Mariusz Hieronim JAKUBOWSKI, Seyed Mehdi Fatemi Booshehri, Allan Gordon Lontoc Sepillo, Bradley Noah Faskowitz
-
Patent number: 11616813
Abstract: A secured exploration agent for reinforcement learning (RL) is provided. Securitizing an exploration agent includes training the exploration agent to avoid dead-end states and dead-end trajectories. During training, the exploration agent “learns” to identify and avoid dead-end states of a Markov Decision Process (MDP). The secured exploration agent is utilized to safely and efficiently explore the environment, while significantly reducing the training time, as well as the cost and safety concerns associated with conventional RL. The secured exploration agent is employed to guide the behavior of a corresponding exploitation agent. During training, a policy of the exploration agent is iteratively updated to reflect an estimated probability that a state is a dead-end state. The probability, via the exploration policy, that the exploration agent chooses an action that results in a transition to a dead-end state is reduced to reflect the estimated probability that the state is a dead-end state.
Type: Grant
Filed: August 28, 2019
Date of Patent: March 28, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Harm Hendrik Van Seijen, Seyed Mehdi Fatemi Booshehri
-
Patent number: 11568052
Abstract: Embodiments seek to prevent detection of a sandbox environment by a potential malware application. To this end, execution of the application is monitored, and provide information about the execution to a reinforcement learning machine learning model. The model generates a suggested modification to make to the executing application. The model is provided with information indicating whether the application executed successfully or not, and this information is used to train the model for additional modifications. By modifying the potential malware execution during its execution, detection of a sandbox environment is prevented, and analysis of the potential malware applications features are better understood.
Type: Grant
Filed: May 31, 2020
Date of Patent: January 31, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jugal Parikh, Geoffrey Lyall McDonald, Mariusz H. Jakubowski, Seyed Mehdi Fatemi Booshehri, Allan Gordon Lontoc Sepillo, Bradley Noah Faskowitz
-
Publication number: 20210374241
Abstract: Embodiments seek to prevent detection of a sandbox environment by a potential malware application. To this end, execution of the application is monitored, and provide information about the execution to a reinforcement learning machine learning model. The model generates a suggested modification to make to the executing application. The model is provided with information indicating whether the application executed successfully or not, and this information is used to train the model for additional modifications. By modifying the potential malware execution during its execution, detection of a sandbox environment is prevented, and analysis of the potential malware applications features are better understood.
Type: Application
Filed: May 31, 2020
Publication date: December 2, 2021
Inventors: Jugal Parikh, Geoffrey Lyall McDonald, Mariusz H. Jakubowski, Seyed Mehdi Fatemi Booshehri, Allan Gordon Lontoc Sepillo, Bradley Noah Faskowitz
-
Patent number: 10977551
Abstract: Aspects provided herein are relevant to machine learning techniques, including decomposing single-agent reinforcement learning problems into simpler problems addressed by multiple agents. Actions proposed by the multiple agents are then aggregated using an aggregator, which selects an action to take with respect to an environment. Aspects provided herein are also relevant to a hybrid reward model.
Type: Grant
Filed: June 27, 2017
Date of Patent: April 13, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Harm Hendrik Van Seijen, Seyed Mehdi Fatemi Booshehri, Romain Michel Henri Laroche, Joshua Samuel Romoff
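As a rough illustration of the aggregation step described in the abstract above, the sketch below combines action values proposed by several agents and selects one action. The abstract does not say how the aggregator combines proposals; the weighted sum used here, the function name `aggregate_action`, and the example values are assumptions, not the patented design.

```python
import numpy as np

def aggregate_action(agent_q_values, weights=None):
    """Combine per-agent action values and pick one action.

    agent_q_values: array-like of shape (n_agents, n_actions), where each row
    holds one agent's value estimates for its own simpler sub-problem.
    weights: optional per-agent weights (assumed; defaults to equal weighting).
    """
    q = np.asarray(agent_q_values, dtype=float)
    if weights is None:
        weights = np.ones(q.shape[0])
    # Weighted sum over agents gives one combined value per action.
    combined = np.asarray(weights, dtype=float) @ q
    return int(np.argmax(combined)), combined

# Three agents, three actions: agent 0 prefers action 0,
# agents 1 and 2 both mildly prefer action 1.
action, combined = aggregate_action([
    [1.0, 0.0, 0.0],
    [0.0, 0.6, 0.0],
    [0.0, 0.6, 0.0],
])
```

In this example two mild preferences outweigh one strong preference, so the aggregator selects action 1 even though no single agent values it most highly.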
-
Publication number: 20200076857
Abstract: A secured exploration agent for reinforcement learning (RL) is provided. Securitizing an exploration agent includes training the exploration agent to avoid dead-end states and dead-end trajectories. During training, the exploration agent “learns” to identify and avoid dead-end states of a Markov Decision Process (MDP). The secured exploration agent is utilized to safely and efficiently explore the environment, while significantly reducing the training time, as well as the cost and safety concerns associated with conventional RL. The secured exploration agent is employed to guide the behavior of a corresponding exploitation agent. During training, a policy of the exploration agent is iteratively updated to reflect an estimated probability that a state is a dead-end state. The probability, via the exploration policy, that the exploration agent chooses an action that results in a transition to a dead-end state is reduced to reflect the estimated probability that the state is a dead-end state.
Type: Application
Filed: August 28, 2019
Publication date: March 5, 2020
Inventors: Harm Hendrik VAN SEIJEN, Seyed Mehdi FATEMI BOOSHEHRI
-
Patent number: 10395646
Abstract: Described herein are systems and methods for two-stage training of a spoken dialog system. The first stage trains a policy network using external data to produce a semi-trained policy network. The external data includes one or more known fixed dialogs. The second stage trains the semi-trained policy network through interaction to produce a trained policy network. The interaction may be interaction with a user simulator.
Type: Grant
Filed: May 12, 2017
Date of Patent: August 27, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Seyed Mehdi Fatemi Booshehri, Layla El Asri, Hannes Schulz, Jing He, Kaheer Suleman
-
Publication number: 20180165602
Abstract: Aspects provided herein are relevant to machine learning techniques, including decomposing single-agent reinforcement learning problems into simpler problems addressed by multiple agents. Actions proposed by the multiple agents are then aggregated using an aggregator, which selects an action to take with respect to an environment. Aspects provided herein are also relevant to a hybrid reward model.
Type: Application
Filed: June 27, 2017
Publication date: June 14, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Harm Hendrik VAN SEIJEN, Seyed Mehdi FATEMI BOOSHEHRI, Romain Michel Henri LAROCHE, Joshua Samuel ROMOFF
-
Publication number: 20180165603
Abstract: Aspects provided herein are relevant to machine learning techniques, including decomposing single-agent reinforcement learning problems into simpler problems addressed by multiple agents. Actions proposed by the multiple agents are then aggregated using an aggregator, which selects an action to take with respect to an environment. Aspects provided herein are also relevant to a hybrid reward model.
Type: Application
Filed: June 27, 2017
Publication date: June 14, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Harm Hendrik VAN SEIJEN, Seyed Mehdi FATEMI BOOSHEHRI, Romain Michel Henri LAROCHE, Joshua Samuel ROMOFF
-
Publication number: 20170330556
Abstract: Described herein are systems and methods for two-stage training of a spoken dialogue system. The first stage trains a policy network using external data to produce a semi-trained policy network. The external data includes one or more known fixed dialogues. The second stage trains the semi-trained policy network through interaction to produce a trained policy network. The interaction may be interaction with a user simulator.
Type: Application
Filed: May 12, 2017
Publication date: November 16, 2017
Applicant: Maluuba Inc.
Inventors: Seyed Mehdi Fatemi Booshehri, Layla El Asri, Hannes Schulz, Jing He, Kaheer Suleman