Patents by Inventor Gaurav Jagtiani

Gaurav Jagtiani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

In-place recovery of fatal system errors at virtualization hosts

Patent number: 12277040

Abstract: In-place recovery of fatal system errors at virtualization hosts. A device identifies an occurrence of a fatal system error in the first instance of a host operating system (OS) executing in a computer system. The device determines to perform an in-place recovery for the fatal system error. The device performs the in-place recovery, including pausing the execution of a virtual machine (VM) by the first instance of the host OS, preserving a state of the VM within system memory of the computer system, and resuming the execution of the VM by a second instance of the host OS executing in the computer system based on the state of the VM that is preserved within the system memory of the computer system.

Type: Grant

Filed: June 7, 2023

Date of Patent: April 15, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Binit Ranjan Mishra, Mukhtar Ahmed, Christina Marianne Curlette, Steven Adrian West, Gaurav Jagtiani, Naga Kiran Govindaraju, James George Cavalaris, Drew Douglas Cross, Jason Stewart Wohlgemuth, James Anthony Schwartz, Jr., Jennifer Marie Bourlier, Sri Harsha Kanukuntla, Emma Sutherland Boyd, Scott Chao-Chueh Lee, Vijaybalaji Madhanagopal, Terence Kwok Tak Chan, Yuri Dotsenko, Peter Hanpeng Jiang, Aacer Hatem Daken, Emily Nicole Wilson, Emily Cara Clemens, Cody Dean Hartwig, Raz Meir Aloni, Sharon Scarlet Tang, Minsang Kim, Shen Wang

ACCELERATED FATAL SYSTEM ERROR RECOVERY OF CONTAINER HOST

Publication number: 20250004882

Abstract: A computer system identifies an event from a management system log associated with a first container host. The presence of the event in the management system log is indicative that the first container host identified a fatal system error at the first container host. Based on the event, the computer system determines that a first instance of a container that is provisioned at the first container host has been isolated to the first container host. Based on the first instance of the container having been isolated to the first container host, the computer system instructs a second container host to provision a second instance of the container at the second container host.

Type: Application

Filed: June 28, 2023

Publication date: January 2, 2025

Inventors: Shekhar AGRAWAL, Abhay Sudhir KETKAR, Gaurav JAGTIANI, Binit Ranjan MISHRA, Emma Sutherland BOYD, Scott Chao-Chueh LEE, James Anthony SCHWARTZ, JR., Hari R. PULAPAKA, Karan MEHRA, Shailesh Padmakar JOSHI, Jason Stewart WOHLGEMUTH, David WIMMEL
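The flow in the abstract above (find the fatal-error event in a management log, identify containers stranded on that host, provision replacements elsewhere) can be illustrated with a minimal Python sketch. The event name, the log-entry shape, the placement map, and the round-robin target selection are all assumptions made for illustration; the application does not specify them, and this sketch treats every container placed on a failed host as isolated to it.

```python
FATAL_EVENT = "FatalSystemError"  # hypothetical event name, not from the application


def hosts_with_fatal_error(log_entries):
    """Return the hosts whose management log contains the fatal-error event."""
    return {e["host"] for e in log_entries if e["event"] == FATAL_EVENT}


def plan_reprovisioning(log_entries, placements, healthy_hosts):
    """Map each container on a failed host to a healthy target host.

    placements maps container name -> host it is currently provisioned on.
    Targets are chosen round-robin, which is an assumption of this sketch.
    """
    failed = hosts_with_fatal_error(log_entries)
    targets = [h for h in healthy_hosts if h not in failed]
    plan = {}
    for container, host in sorted(placements.items()):
        if host in failed and targets:
            plan[container] = targets[len(plan) % len(targets)]
    return plan


log = [
    {"host": "host-a", "event": "FatalSystemError"},
    {"host": "host-b", "event": "Heartbeat"},
]
placements = {"web": "host-a", "db": "host-b", "cache": "host-a"}
print(plan_reprovisioning(log, placements, ["host-b", "host-c"]))
```

Containers on hosts without the event (here `db`) are left where they are; only the instances on the failed host receive a second instance elsewhere.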

IN-PLACE RECOVERY OF FATAL SYSTEM ERRORS AT VIRTUALIZATION HOSTS

Publication number: 20240338282

Abstract: In-place recovery of fatal system errors at virtualization hosts. A device identifies an occurrence of a fatal system error in the first instance of a host operating system (OS) executing in a computer system. The device determines to perform an in-place recovery for the fatal system error. The device performs the in-place recovery, including pausing the execution of a virtual machine (VM) by the first instance of the host OS, preserving a state of the VM within system memory of the computer system, and resuming the execution of the VM by a second instance of the host OS executing in the computer system based on the state of the VM that is preserved within the system memory of the computer system.

Type: Application

Filed: June 7, 2023

Publication date: October 10, 2024

Inventors: Binit Ranjan MISHRA, Mukhtar AHMED, Christina Marianne CURLETTE, Steven Adrian WEST, Gaurav JAGTIANI, Naga Kiran GOVINDARAJU, James George CAVALARIS, Drew Douglas CROSS, Jason Stewart WOHLGEMUTH, James Anthony SCHWARTZ, JR., Jennifer Marie BOURLIER, Sri Harsha KANUKUNTLA, Emma Sutherland BOYD, Scott Chao-Chueh LEE, Vijaybalaji MADHANAGOPAL, Terence Kwok Tak CHAN, Yuri DOTSENKO, Peter Hanpeng JIANG, Aacer Hatem DAKEN, Emily Nicole WILSON, Emily Cara CLEMENS, Cody Dean HARTWIG, Raz Meir ALONI, Sharon Scarlet TANG, Minsang KIM, Shen WANG

Capacity aware cloud environment node recovery system

Patent number: 12028223

Abstract: A computer implemented method includes receiving telemetry data corresponding to capacity health of nodes in a cloud based computing system. The received telemetry data is processed via a prediction engine to provide predictions of capacity health at multiple dimensions of the cloud based computing system. Node recoverability information is received and node recovery execution is initiated as a function of the representations of capacity health and node recoverability information.

Type: Grant

Filed: June 6, 2022

Date of Patent: July 2, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Shandan Zhou, Sam Prakash Bheri, Karthikeyan Subramanian, Yancheng Chen, Gaurav Jagtiani, Abhay Sudhir Ketkar, Hemant Malik, Thomas Moscibroda, Shweta Balkrishna Patil, Luke Rafael Rodriguez, Dalianna Victoria Vaysman

UTILIZING DEVICE SIGNALS TO IMPROVE THE RECOVERY OF VIRTUAL MACHINE HOST DEVICES

Publication number: 20240201767

Abstract: The present disclosure relates to utilizing a host failure recovery system to efficiently and accurately determine the health of host devices. For example, the host failure recovery system detects when a host server is failing by utilizing a power failure detection model that determines whether a host server is operating in a healthy power state or an unhealthy power state. In particular, the host failure recovery system utilizes a multi-layer power failure detection model that determines power-draw failure events on a host device. The failure detection model determines, with high confidence, the health of a host device based on power-draw signals and/or usage characteristics of the host device. Additionally, the host failure recovery system can initiate a quick recovery of a failing host device.

Type: Application

Filed: December 20, 2022

Publication date: June 20, 2024

Inventors: Emma Sutherland BOYD, Shekhar AGRAWAL, Amruta Bhalchandra PATHAK, Yu YAO, Aravind Narayanan KRISHNAMOORTHY, Derek James BOYER, Binit Ranjan MISHRA, Gaurav JAGTIANI, Abhay Sudhir KETKAR, Tri Minh TRAN

Capacity Aware Cloud Environment Node Recovery System

Publication number: 20230396511

Abstract: A computer implemented method includes receiving telemetry data corresponding to capacity health of nodes in a cloud based computing system. The received telemetry data is processed via a prediction engine to provide predictions of capacity health at multiple dimensions of the cloud based computing system. Node recoverability information is received and node recovery execution is initiated as a function of the representations of capacity health and node recoverability information.

Type: Application

Filed: June 6, 2022

Publication date: December 7, 2023

Inventors: Shandan ZHOU, Sam Prakash BHERI, Karthikeyan SUBRAMANIAN, Yancheng CHEN, Gaurav JAGTIANI, Abhay Sudhir KETKAR, Hemant MALIK, Thomas MOSCIBRODA, Shweta Balkrishna PATIL, Luke Rafael RODRIGUEZ, Dalianna Victoria VAYSMAN

Deferred server recovery in computing systems

Patent number: 10810096

Abstract: Various techniques for deferred server recovery are disclosed herein. In one embodiment, a method includes receiving a notification of a fault from a host in the computing system. The host is performing one or more computing tasks for one or more users. The method can then include determining whether recovery of the fault in the received notification is deferrable on the host. In response to determining that the fault in the received notification is deferrable, the method includes setting a time delay to perform a pending recovery operation on the host at a later time and disallowing additional assignment of computing tasks to the host.

Type: Grant

Filed: May 21, 2018

Date of Patent: October 20, 2020

Assignee: Microsoft Technology Licensing, LLC

Inventors: Nic Allen, Gaurav Jagtiani
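As a rough illustration of the method in the abstract above, the following Python sketch handles a fault notification by either recovering immediately or deferring recovery and closing the host to new work. The fault names, the `Host` fields, and the one-hour default delay are hypothetical; the patent abstract does not say which faults are deferrable or how long the delay is.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical fault categories treated as deferrable in this sketch.
DEFERRABLE_FAULTS = {"disk_degraded", "pending_firmware_update"}


@dataclass
class Host:
    name: str
    accepts_new_tasks: bool = True
    pending_recovery_at: Optional[float] = None  # scheduled time, in seconds


def handle_fault(host: Host, fault: str, now: float, delay_s: float = 3600.0) -> str:
    """Decide between immediate and deferred recovery for a reported fault."""
    if fault not in DEFERRABLE_FAULTS:
        return "recover_now"
    # Deferrable: schedule the recovery for later and stop assigning new tasks.
    host.pending_recovery_at = now + delay_s
    host.accepts_new_tasks = False
    return "deferred"


host = Host("host-01")
print(handle_fault(host, "disk_degraded", now=0.0), host)
```

Tasks already running on the host are untouched in the deferred case; only new assignments are blocked until the pending recovery runs.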

PERFORMING ACTIONS OPPORTUNISTICALLY IN CONNECTION WITH REBOOT EVENTS IN A CLOUD COMPUTING SYSTEM

Publication number: 20200150972

Abstract: A method for opportunistically performing an action in a cloud computing system may include detecting a reboot event corresponding to a computing entity in the cloud computing system. The computing entity may be, for example, a host machine in the cloud computing system or a virtual machine in the cloud computing system. The method may also include causing the computing entity to be held in a stopped state and performing the action while the computing entity is being held in the stopped state, thereby eliminating a need to perform the action at a future time subsequent to the reboot event. The nature of the action is such that it would affect the computing entity if the action were performed subsequent to the reboot event. The method may also include causing the computing entity to be started after the action has been performed.

Type: Application

Filed: November 9, 2018

Publication date: May 14, 2020

Inventors: Abhay Sudhir KETKAR, Gaurav JAGTIANI, Ajay MANI, Richard Thomas RUSSO, Shweta Balkrishna PATIL, James Cameron WHITE
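The sequence in the abstract above (detect a reboot event, hold the entity stopped, perform the action, then start the entity) might look like the following Python sketch. The `Entity` class, its state strings, and the list of pending actions are hypothetical names introduced only for this illustration.

```python
class Entity:
    """A host machine or virtual machine, reduced to a name and a state."""

    def __init__(self, name):
        self.name = name
        self.state = "running"
        self.applied = []


def on_reboot_event(entity, pending_actions):
    """Hold the entity stopped, run the pending actions, then start it."""
    entity.state = "stopped"            # hold in the stopped state
    while pending_actions:
        action = pending_actions.pop(0)
        action(entity)                  # performed while the entity is stopped
    entity.state = "running"            # start only after the actions are done


vm = Entity("vm-42")
pending = [lambda e: e.applied.append("update-1")]
on_reboot_event(vm, pending)
print(vm.state, vm.applied, pending)
```

Because the queue is drained during the reboot that was happening anyway, no separate interruption is needed later for those actions.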

Determining an optimal timeout value to minimize downtime for nodes in a network-accessible server set

Patent number: 10547516

Abstract: Methods, systems, and computer program products are described herein for minimizing the downtime for nodes in a network-accessible server set. The downtime may be minimized by determining an optimal timeout value for which a fabric controller waits to perform a recovery action. The optimal timeout value may be determined for each cluster in the network-accessible server set. The optimal timeout value advantageously reduces the overall downtime for customer workloads running on a node for which contact has been lost. The optimal timeout value for each cluster may be based on a predictive model based on the observed historical patterns of the nodes within that cluster. In the event that an optimal timeout value is not determined for a particular cluster (e.g., due to a lack of observed historical patterns), the fabric controller may fall back to a less than optimal timeout value.

Type: Grant

Filed: June 30, 2017

Date of Patent: January 28, 2020

Assignee: Microsoft Technology Licensing, LLC

Inventors: Sathyanarayana Singh, Gaurav Jagtiani, Rohit Pandey, Durmus Ugur Karatay, Gil Lapid Shafriri
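To make the idea in the abstract above concrete, here is a minimal Python sketch of choosing a per-cluster timeout from historical data, with a fallback when history is lacking. It substitutes a simple empirical average for the patent's predictive model, and the cost model, constants, and function names are all assumptions: each past outage is represented by how long the node took to reconnect on its own (`math.inf` if it never did), and a forced recovery is assumed to add a fixed amount of downtime.

```python
import math

DEFAULT_TIMEOUT_S = 600.0   # hypothetical fallback timeout
RECOVERY_COST_S = 900.0     # hypothetical downtime added by a recovery action
MIN_SAMPLES = 5             # hypothetical threshold for "enough history"


def expected_downtime(timeout_s, self_recovery_times, recovery_cost_s=RECOVERY_COST_S):
    """Average downtime over past outages if the controller waits timeout_s."""
    total = 0.0
    for d in self_recovery_times:
        # Node came back in time: downtime is d. Otherwise wait, then recover.
        total += d if d <= timeout_s else timeout_s + recovery_cost_s
    return total / len(self_recovery_times)


def choose_timeout(self_recovery_times, candidates):
    """Pick the candidate timeout with the lowest average historical downtime."""
    if len(self_recovery_times) < MIN_SAMPLES:
        return DEFAULT_TIMEOUT_S  # fall back when history is insufficient
    return min(candidates, key=lambda t: expected_downtime(t, self_recovery_times))


history = [30, 45, 60, 120, math.inf, 50]
print(choose_timeout(history, [60, 120, 300, 600]))
```

A short timeout triggers costly recoveries for nodes that would have come back by themselves, while a long one wastes time on nodes that never return; the search balances the two for each cluster's history.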

Healing cloud services during upgrades

Patent number: 10496503

Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.

Type: Grant

Filed: November 13, 2017

Date of Patent: December 3, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao
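The healing scenario in the abstract above can be sketched in Python as follows: look up what the faulted virtual machine was allocated on the first node, reserve the same amounts on a second node, and re-instantiate the machine there. The dictionary layout for nodes and the capacity check are assumptions made for this sketch, not details taken from the patent.

```python
def heal(nodes, faulted_vm, source, target):
    """Move a faulted VM's resource allocation from source node to target node.

    nodes maps node name -> {"free": {resource: amount}, "vms": {vm: resources}}.
    Returns True if the VM was re-instantiated on the target node.
    """
    resources = nodes[source]["vms"].pop(faulted_vm)   # what the VM was allocated
    free = nodes[target]["free"]
    if any(free.get(k, 0) < v for k, v in resources.items()):
        nodes[source]["vms"][faulted_vm] = resources   # not enough room: undo
        return False
    for k, v in resources.items():
        free[k] -= v                                   # allocate on the target
    nodes[target]["vms"][faulted_vm] = resources       # re-instantiate there
    return True


nodes = {
    "node-1": {"free": {"cpu": 0}, "vms": {"vm-a": {"cpu": 2}}},
    "node-2": {"free": {"cpu": 4}, "vms": {}},
}
print(heal(nodes, "vm-a", "node-1", "node-2"), nodes["node-2"])
```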

DETERMINING AN OPTIMAL TIMEOUT VALUE TO MINIMIZE DOWNTIME FOR NODES IN A NETWORK-ACCESSIBLE SERVER SET

Publication number: 20190007278

Abstract: Methods, systems, and computer program products are described herein for minimizing the downtime for nodes in a network-accessible server set. The downtime may be minimized by determining an optimal timeout value for which a fabric controller waits to perform a recovery action. The optimal timeout value may be determined for each cluster in the network-accessible server set. The optimal timeout value advantageously reduces the overall downtime for customer workloads running on a node for which contact has been lost. The optimal timeout value for each cluster may be based on a predictive model based on the observed historical patterns of the nodes within that cluster. In the event that an optimal timeout value is not determined for a particular cluster (e.g., due to a lack of observed historical patterns), the fabric controller may fall back to a less than optimal timeout value.

Type: Application

Filed: June 30, 2017

Publication date: January 3, 2019

Inventors: Sathyanarayana SINGH, Gaurav JAGTIANI, Rohit PANDEY, Durmus Ugur KARATAY, Gil Lapid SHAFRIRI

DEFERRED SERVER RECOVERY IN COMPUTING SYSTEMS

Publication number: 20180267872

Abstract: Various techniques for deferred server recovery are disclosed herein. In one embodiment, a method includes receiving a notification of a fault from a host in the computing system. The host is performing one or more computing tasks for one or more users. The method can then include determining whether recovery of the fault in the received notification is deferrable on the host. In response to determining that the fault in the received notification is deferrable, the method includes setting a time delay to perform a pending recovery operation on the host at a later time and disallowing additional assignment of computing tasks to the host.

Type: Application

Filed: May 21, 2018

Publication date: September 20, 2018

Inventors: Nic Allen, Gaurav Jagtiani

Deferred server recovery in computing systems

Patent number: 10007586

Abstract: Various techniques for deferred server recovery are disclosed herein. In one embodiment, a method includes receiving a notification of a fault from a host in the computing system. The host is performing one or more computing tasks for one or more users. The method can then include determining whether recovery of the fault in the received notification is deferrable on the host. In response to determining that the fault in the received notification is deferrable, the method includes setting a time delay to perform a pending recovery operation on the host at a later time and disallowing additional assignment of computing tasks to the host.

Type: Grant

Filed: March 10, 2016

Date of Patent: June 26, 2018

Assignee: Microsoft Technology Licensing, LLC

Inventors: Nic Allen, Gaurav Jagtiani

Healing cloud services during upgrades

Patent number: 9940210

Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.

Type: Grant

Filed: June 26, 2015

Date of Patent: April 10, 2018

Assignee: Microsoft Technology Licensing, LLC

Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao

HEALING CLOUD SERVICES DURING UPGRADES

Publication number: 20180067830

Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.

Type: Application

Filed: November 13, 2017

Publication date: March 8, 2018

Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao

DEFERRED SERVER RECOVERY IN COMPUTING SYSTEMS

Publication number: 20170199795

Abstract: Various techniques for deferred server recovery are disclosed herein. In one embodiment, a method includes receiving a notification of a fault from a host in the computing system. The host is performing one or more computing tasks for one or more users. The method can then include determining whether recovery of the fault in the received notification is deferrable on the host. In response to determining that the fault in the received notification is deferrable, the method includes setting a time delay to perform a pending recovery operation on the host at a later time and disallowing additional assignment of computing tasks to the host.

Type: Application

Filed: March 10, 2016

Publication date: July 13, 2017

Inventors: Nic Allen, Gaurav Jagtiani

HEALING CLOUD SERVICES DURING UPGRADES

Publication number: 20150293821

Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.

Type: Application

Filed: June 26, 2015

Publication date: October 15, 2015

Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao

Healing cloud services during upgrades

Patent number: 9141487

Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.

Type: Grant

Filed: January 15, 2013

Date of Patent: September 22, 2015

Assignee: Microsoft Technology Licensing, LLC

Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao

HEALING CLOUD SERVICES DURING UPGRADES

Publication number: 20140201564

Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.

Type: Application

Filed: January 15, 2013

Publication date: July 17, 2014

Applicant: Microsoft Corporation

Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao