Patents by Inventor Gaurav Jagtiani

Gaurav Jagtiani has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12277040
    Abstract: In-place recovery of fatal system errors at virtualization hosts. A device identifies an occurrence of a fatal system error in the first instance of a host operating system (OS) executing in a computer system. The device determines to perform an in-place recovery for the fatal system error. The device performs the in-place recovery, including pausing the execution of a virtual machine (VM) by the first instance of the host OS, preserving a state of the VM within system memory of the computer system, and resuming the execution of the VM by a second instance of the host OS executing in the computer system based on the state of the VM that is preserved within the system memory of the computer system.
    Type: Grant
    Filed: June 7, 2023
    Date of Patent: April 15, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Binit Ranjan Mishra, Mukhtar Ahmed, Christina Marianne Curlette, Steven Adrian West, Gaurav Jagtiani, Naga Kiran Govindaraju, James George Cavalaris, Drew Douglas Cross, Jason Stewart Wohlgemuth, James Anthony Schwartz, Jr., Jennifer Marie Bourlier, Sri Harsha Kanukuntla, Emma Sutherland Boyd, Scott Chao-Chueh Lee, Vijaybalaji Madhanagopal, Terence Kwok Tak Chan, Yuri Dotsenko, Peter Hanpeng Jiang, Aacer Hatem Daken, Emily Nicole Wilson, Emily Cara Clemens, Cody Dean Hartwig, Raz Meir Aloni, Sharon Scarlet Tang, Minsang Kim, Shen Wang
  • Publication number: 20250004882
    Abstract: A computer system identifies an event from a management system log associated with a first container host. The presence of the event in the management system log is indicative that the first container host identified a fatal system error at the first container host. Based on the event, the computer system determines that a first instance of a container that is provisioned at the first container host has been isolated to the first container host. Based on the first instance of the container having been isolated to the first container host, the computer system instructs a second container host to provision a second instance of the container at the second container host.
    Type: Application
    Filed: June 28, 2023
    Publication date: January 2, 2025
    Inventors: Shekhar AGRAWAL, Abhay Sudhir KETKAR, Gaurav JAGTIANI, Binit Ranjan MISHRA, Emma Sutherland BOYD, Scott Chao-Chueh LEE, James Anthony SCHWARTZ, JR., Hari R. PULAPAKA, Karan MEHRA, Shailesh Padmakar JOSHI, Jason Stewart WOHLGEMUTH, David WIMMEL
  • Publication number: 20240338282
    Abstract: In-place recovery of fatal system errors at virtualization hosts. A device identifies an occurrence of a fatal system error in the first instance of a host operating system (OS) executing in a computer system. The device determines to perform an in-place recovery for the fatal system error. The device performs the in-place recovery, including pausing the execution of a virtual machine (VM) by the first instance of the host OS, preserving a state of the VM within system memory of the computer system, and resuming the execution of the VM by a second instance of the host OS executing in the computer system based on the state of the VM that is preserved within the system memory of the computer system.
    Type: Application
    Filed: June 7, 2023
    Publication date: October 10, 2024
    Inventors: Binit Ranjan MISHRA, Mukhtar AHMED, Christina Marianne CURLETTE, Steven Adrian WEST, Gaurav JAGTIANI, Naga Kiran GOVINDARAJU, James George CAVALARIS, Drew Douglas CROSS, Jason Stewart WOHLGEMUTH, James Anthony SCHWARTZ, JR., Jennifer Marie BOURLIER, Sri Harsha KANUKUNTLA, Emma Sutherland BOYD, Scott Chao-Chueh LEE, Vijaybalaji MADHANAGOPAL, Terence Kwok Tak CHAN, Yuri DOTSENKO, Peter Hanpeng JIANG, Aacer Hatem DAKEN, Emily Nicole WILSON, Emily Cara CLEMENS, Cody Dean HARTWIG, Raz Meir ALONI, Sharon Scarlet TANG, Minsang KIM, Shen WANG
  • Patent number: 12028223
    Abstract: A computer implemented method includes receiving telemetry data corresponding to capacity health of nodes in a cloud based computing system. The received telemetry data is processed via a prediction engine to provide predictions of capacity health at multiple dimensions of the cloud based computing system. Node recoverability information is received and node recovery execution is initiated as a function of the representations of capacity health and node recoverability information.
    Type: Grant
    Filed: June 6, 2022
    Date of Patent: July 2, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shandan Zhou, Sam Prakash Bheri, Karthikeyan Subramanian, Yancheng Chen, Gaurav Jagtiani, Abhay Sudhir Ketkar, Hemant Malik, Thomas Moscibroda, Shweta Balkrishna Patil, Luke Rafael Rodriguez, Dalianna Victoria Vaysman
  • Publication number: 20240201767
    Abstract: The present disclosure relates to utilizing a host failure recovery system to efficiently and accurately determine the health of host devices. For example, the host failure recovery system detects when a host server is failing by utilizing a power failure detection model that determines whether a host server is operating in a healthy power state or an unhealthy power state. In particular, the host failure recovery system utilizes a multi-layer power failure detection model that determines power-draw failure events on a host device. The failure detection model determines, with high confidence, the health of a host device based on power-draw signals and/or usage characteristics of the host device. Additionally, the host failure recovery system can initiate a quick recovery of a failing host device.
    Type: Application
    Filed: December 20, 2022
    Publication date: June 20, 2024
    Inventors: Emma Sutherland BOYD, Shekhar AGRAWAL, Amruta Bhalchandra PATHAK, Yu YAO, Aravind Narayanan KRISHNAMOORTHY, Derek James BOYER, Binit Ranjan MISHRA, Gaurav JAGTIANI, Abhay Sudhir KETKAR, Tri Minh TRAN
  • Publication number: 20230396511
    Abstract: A computer implemented method includes receiving telemetry data corresponding to capacity health of nodes in a cloud based computing system. The received telemetry data is processed via a prediction engine to provide predictions of capacity health at multiple dimensions of the cloud based computing system. Node recoverability information is received and node recovery execution is initiated as a function of the representations of capacity health and node recoverability information.
    Type: Application
    Filed: June 6, 2022
    Publication date: December 7, 2023
    Inventors: Shandan ZHOU, Sam Prakash BHERI, Karthikeyan SUBRAMANIAN, Yancheng CHEN, Gaurav JAGTIANI, Abhay Sudhir KETKAR, Hemant MALIK, Thomas MOSCIBRODA, Shweta Balkrishna PATIL, Luke Rafael RODRIGUEZ, Dalianna Victoria VAYSMAN
  • Patent number: 10810096
    Abstract: Various techniques for deferred server recovery are disclosed herein. In one embodiment, a method includes receiving a notification of a fault from a host in the computing system. The host is performing one or more computing tasks for one or more users. The method can then include determining whether recovery of the fault in the received notification is deferrable on the host. In response to determining that the fault in the received notification is deferrable, the method includes setting a time delay to perform a pending recovery operation on the host at a later time and disallowing additional assignment of computing tasks to the host.
    Type: Grant
    Filed: May 21, 2018
    Date of Patent: October 20, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nic Allen, Gaurav Jagtiani
  • Publication number: 20200150972
    Abstract: A method for opportunistically performing an action in a cloud computing system may include detecting a reboot event corresponding to a computing entity in the cloud computing system. The computing entity may be, for example, a host machine in the cloud computing system or a virtual machine in the cloud computing system. The method may also include causing the computing entity to be held in a stopped state and performing the action while the computing entity is being held in the stopped state, thereby eliminating a need to perform the action at a future time subsequent to the reboot event. The nature of the action is such that it would affect the computing entity if the action were performed subsequent to the reboot event. The method may also include causing the computing entity to be started after the action has been performed.
    Type: Application
    Filed: November 9, 2018
    Publication date: May 14, 2020
    Inventors: Abhay Sudhir KETKAR, Gaurav JAGTIANI, Ajay MANI, Richard Thomas RUSSO, Shweta Balkrishna PATIL, James Cameron WHITE
  • Patent number: 10547516
    Abstract: Methods, systems, and computer program products are described herein for minimizing the downtime for nodes in a network-accessible server set. The downtime may be minimized by determining an optimal timeout value for which a fabric controller waits to perform a recovery action. The optimal timeout value may be determined for each cluster in the network-accessible server set. The optimal timeout value advantageously reduces the overall downtime for customer workloads running on a node for which contact has been lost. The optimal timeout value for each cluster may be based on a predictive model based on the observed historical patterns of the nodes within that cluster. In the event that an optimal timeout value is not determined for a particular cluster (e.g., due to a lack of observed historical patterns), the fabric controller may fall back to a less than optimal timeout value.
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: January 28, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sathyanarayana Singh, Gaurav Jagtiani, Rohit Pandey, Durmus Ugur Karatay, Gil Lapid Shafriri
  • Patent number: 10496503
    Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.
    Type: Grant
    Filed: November 13, 2017
    Date of Patent: December 3, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao
  • Publication number: 20190007278
    Abstract: Methods, systems, and computer program products are described herein for minimizing the downtime for nodes in a network-accessible server set. The downtime may be minimized by determining an optimal timeout value for which a fabric controller waits to perform a recovery action. The optimal timeout value may be determined for each cluster in the network-accessible server set. The optimal timeout value advantageously reduces the overall downtime for customer workloads running on a node for which contact has been lost. The optimal timeout value for each cluster may be based on a predictive model based on the observed historical patterns of the nodes within that cluster. In the event that an optimal timeout value is not determined for a particular cluster (e.g., due to a lack of observed historical patterns), the fabric controller may fall back to a less than optimal timeout value.
    Type: Application
    Filed: June 30, 2017
    Publication date: January 3, 2019
    Inventors: Sathyanarayana SINGH, Gaurav JAGTIANI, Rohit PANDEY, Durmus Ugur KARATAY, Gil Lapid SHAFRIRI
  • Publication number: 20180267872
    Abstract: Various techniques for deferred server recovery are disclosed herein. In one embodiment, a method includes receiving a notification of a fault from a host in the computing system. The host is performing one or more computing tasks for one or more users. The method can then include determining whether recovery of the fault in the received notification is deferrable on the host. In response to determining that the fault in the received notification is deferrable, the method includes setting a time delay to perform a pending recovery operation on the host at a later time and disallowing additional assignment of computing tasks to the host.
    Type: Application
    Filed: May 21, 2018
    Publication date: September 20, 2018
    Inventors: Nic Allen, Gaurav Jagtiani
  • Patent number: 10007586
    Abstract: Various techniques for deferred server recovery are disclosed herein. In one embodiment, a method includes receiving a notification of a fault from a host in the computing system. The host is performing one or more computing tasks for one or more users. The method can then include determining whether recovery of the fault in the received notification is deferrable on the host. In response to determining that the fault in the received notification is deferrable, the method includes setting a time delay to perform a pending recovery operation on the host at a later time and disallowing additional assignment of computing tasks to the host.
    Type: Grant
    Filed: March 10, 2016
    Date of Patent: June 26, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nic Allen, Gaurav Jagtiani
  • Patent number: 9940210
    Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.
    Type: Grant
    Filed: June 26, 2015
    Date of Patent: April 10, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao
  • Publication number: 20180067830
    Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.
    Type: Application
    Filed: November 13, 2017
    Publication date: March 8, 2018
    Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao
  • Publication number: 20170199795
    Abstract: Various techniques for deferred server recovery are disclosed herein. In one embodiment, a method includes receiving a notification of a fault from a host in the computing system. The host is performing one or more computing tasks for one or more users. The method can then include determining whether recovery of the fault in the received notification is deferrable on the host. In response to determining that the fault in the received notification is deferrable, the method includes setting a time delay to perform a pending recovery operation on the host at a later time and disallowing additional assignment of computing tasks to the host.
    Type: Application
    Filed: March 10, 2016
    Publication date: July 13, 2017
    Inventors: Nic Allen, Gaurav Jagtiani
  • Publication number: 20150293821
    Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.
    Type: Application
    Filed: June 26, 2015
    Publication date: October 15, 2015
    Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao
  • Patent number: 9141487
    Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.
    Type: Grant
    Filed: January 15, 2013
    Date of Patent: September 22, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao
  • Publication number: 20140201564
    Abstract: Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources.
    Type: Application
    Filed: January 15, 2013
    Publication date: July 17, 2014
    Applicant: Microsoft Corporation
    Inventors: Gaurav Jagtiani, Abhishek Singh, Ajay Mani, Akram Hassan, Thiruvengadam Venketesan, Saad Syed, Sushant Pramod Rewaskar, Wei Zhao