Patents by Inventor Sudhir G. Rao
Sudhir G. Rao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8286026
Abstract: A cluster recovery and maintenance technique for a server cluster having plural nodes implementing a server tier in a client-server computing architecture. A first group of N active nodes each run a software stack comprising a cluster management tier and a cluster application tier that actively provides services on behalf of one or more client applications running in a client application tier on the clients. A second group of M spare nodes each run a software stack comprising a cluster management tier and a cluster application tier that does not actively provide client application services. First and second zones in the cluster are determined in response to an active node membership change involving one or more active nodes departing from or being added to the first group as a result of an active node failing or becoming unreachable or as a result of a maintenance operation involving an active node.
Type: Grant
Filed: February 13, 2012
Date of Patent: October 9, 2012
Assignee: International Business Machines Corporation
Inventors: Sudhir G. Rao, Bruce M. Jackson
-
Publication number: 20120166866
Abstract: A cluster recovery and maintenance technique for a server cluster having plural nodes implementing a server tier in a client-server computing architecture. A first group of N active nodes each run a software stack comprising a cluster management tier and a cluster application tier that actively provides services on behalf of one or more client applications running in a client application tier on the clients. A second group of M spare nodes each run a software stack comprising a cluster management tier and a cluster application tier that does not actively provide client application services. First and second zones in the cluster are determined in response to an active node membership change involving one or more active nodes departing from or being added to the first group as a result of an active node failing or becoming unreachable or as a result of a maintenance operation involving an active node.
Type: Application
Filed: February 13, 2012
Publication date: June 28, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sudhir G. Rao, Bruce M. Jackson
-
Patent number: 8195976
Abstract: A cluster recovery and maintenance technique for use in a server cluster having plural nodes implementing a server tier in a client-server computing architecture. A first group of N active nodes each run a software stack comprising a cluster management tier and a cluster application tier that actively provides services on behalf of client applications running in a client application tier. A second group of M spare nodes each run a software stack comprising a cluster management tier and a cluster application tier that does not actively provide services on behalf of client applications. First and second zones in the cluster are determined in response to an active node membership change involving active nodes departing from or being added to the first group as a result of an active node failing or becoming unreachable or as a result of a maintenance operation involving an active node.
Type: Grant
Filed: June 29, 2005
Date of Patent: June 5, 2012
Assignee: International Business Machines Corporation
Inventors: Sudhir G. Rao, Bruce M. Jackson
-
Publication number: 20110258306
Abstract: The acquisition of a lock among nodes of a divided cluster is disclosed. A method is performable by each of at least one node of the cluster. A node waits for a delay corresponding to its identifier. The node asserts intent to acquire the lock by writing its identifier to X and Y variables where another node has failed to acquire the lock. The node waits for another node to acquire the lock where the other node has written to X, and proceeds where Y remains equal to its own identifier. The node waits for another node to acquire the lock where the other node has written to a Z variable, and writes its own identifier to Z and proceeds where the other node has failed. The node writes a value to Y indicating that it is acquiring the lock, and maintains acquisition by periodically writing to Z.
Type: Application
Filed: June 25, 2011
Publication date: October 20, 2011
Inventors: Sudhir G. Rao, Myung M. Bae, Thomas K. Clark, Douglas Griffith, Roger L. Haskin, Shah Mohammad Rezaul Islam, Felipe Knop, Soumitra Sarkar, Frank B. Schmuck, Theodore B. Vojnovich, Yi Zhou, Robert Curran
-
Patent number: 7991753
Abstract: The acquisition of a lock among nodes of a divided cluster is disclosed. A method is performable by each of at least one node of the cluster. A node waits for a delay corresponding to its identifier. The node asserts intent to acquire the lock by writing its identifier to X and Y variables where another node has failed to acquire the lock. The node waits for another node to acquire the lock where the other node has written to X, and proceeds where Y remains equal to its own identifier. The node waits for another node to acquire the lock where the other node has written to a Z variable, and writes its own identifier to Z and proceeds where the other node has failed. The node writes a value to Y indicating that it is acquiring the lock, and maintains acquisition by periodically writing to Z.
Type: Grant
Filed: May 21, 2004
Date of Patent: August 2, 2011
Assignee: International Business Machines Corporation
Inventors: Sudhir G. Rao, Myung M. Bae, Thomas K. Clark, Douglas Griffith, Roger L. Haskin, Shah Mohammad Rezaul Islam, Felipe Knop, Soumitra Sarkar, Frank B. Schmuck, Theodore B. Vojnovich, Yi Zhou, Robert Curran
-
Patent number: 7941690
Abstract: A method and system for localizing and resolving a fault in a cluster environment. The cluster is configured with at least one multi-homed node, and at least one gateway for each network interface. Heartbeat messages are sent between peer nodes and the gateway in predefined periodic intervals. In the event of loss of a heartbeat message by any node or gateway, an ICMP echo is issued to each node and gateway in the cluster for each network interface. If neither a node loss nor a network loss is validated in response to the ICMP echo, an application level ping is issued to determine if the fault associated with the absence of the heartbeat message is a transient error condition or an application software fault.
Type: Grant
Filed: July 5, 2007
Date of Patent: May 10, 2011
Assignee: International Business Machines Corporation
Inventors: Sudhir G. Rao, Bruce M. Jackson, Mark C. Davis, Srikanth N. Sridhara
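The escalation described in this abstract (missed heartbeat, then ICMP echo, then application-level ping) can be pictured with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Fault` categories, the `localize_fault` function, and the injected `icmp_echo` and `app_ping` probe callables. The sketch also simplifies by probing only the suspect node and the gateways rather than every node in the cluster, and the rule it uses to tell a node loss from a network loss is an assumption.

```python
from enum import Enum
from typing import Callable, Dict


class Fault(Enum):
    NETWORK_LOSS = "network loss"
    NODE_LOSS = "node loss"
    APPLICATION_FAULT = "application software fault"
    TRANSIENT = "transient error condition"


def localize_fault(
    suspect: str,
    gateways: Dict[str, str],
    icmp_echo: Callable[[str, str], bool],
    app_ping: Callable[[str], bool],
) -> Fault:
    """Classify a missed heartbeat from `suspect`.

    gateways maps each network interface name to its gateway address.
    icmp_echo(target, interface) and app_ping(node) are hypothetical
    probes supplied by the caller; each returns True on a reply.
    """
    # Step 1: ICMP echo to the gateway and the suspect on every interface.
    gateway_ok = {iface: icmp_echo(gw, iface) for iface, gw in gateways.items()}
    node_ok = {iface: icmp_echo(suspect, iface) for iface in gateways}

    # Assumption: a silent gateway points at the network, not the node.
    if not all(gateway_ok.values()):
        return Fault.NETWORK_LOSS
    # Assumption: gateways answer but the node is silent on every interface.
    if not any(node_ok.values()):
        return Fault.NODE_LOSS

    # Step 2: neither loss was validated, so escalate to an
    # application-level ping to separate the two remaining cases.
    return Fault.TRANSIENT if app_ping(suspect) else Fault.APPLICATION_FAULT
```

Passing the probes in as callables keeps the sketch runnable without raw-socket privileges; a real deployment would substitute actual ICMP and application-protocol checks.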
-
Patent number: 7870230
Abstract: A system, method and computer program product for use in a server cluster having plural server nodes implementing a server tier in a client-server computing architecture in order to determine which of two or more partitioned server subgroups has a quorum. A determination is made of relative priorities of the subgroups and a quorum is awarded to the subgroup having a highest relative priority. The relative priorities are determined by policy rules that evaluate comparative server node application state information. The server node application state information may include one or more of client connectivity, application priority, resource connectivity, processing capability, memory availability, and input/output resource availability, etc. The policy rules evaluate the application state information for each subgroup and can assign different weights to different types of application state information. An interface may be provided for receiving policy rules specified by a cluster application.
Type: Grant
Filed: July 15, 2005
Date of Patent: January 11, 2011
Assignee: International Business Machines Corporation
Inventors: Sudhir G. Rao, Bruce M. Jackson, Soumitra Sarkar
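As a rough illustration of weighted policy rules awarding quorum to the highest-priority subgroup, the following Python sketch scores each partitioned subgroup by a weighted sum of per-node application state. The `NodeState` fields, the weight values, the function names, and the alphabetical tie-break are all assumptions made for this example; the abstract does not specify a scoring formula.

```python
from dataclasses import dataclass
from typing import Dict, List, Mapping


@dataclass(frozen=True)
class NodeState:
    # Hypothetical application state reported by one server node.
    client_connections: int
    memory_free_gb: float
    io_paths: int


# Hypothetical policy: one weight per type of application state.
DEFAULT_WEIGHTS: Mapping[str, float] = {
    "client_connections": 3.0,
    "memory_free_gb": 1.0,
    "io_paths": 2.0,
}


def subgroup_priority(nodes: List[NodeState], weights: Mapping[str, float]) -> float:
    """Weighted sum of every node's state values in one subgroup."""
    return sum(weights[field] * getattr(node, field)
               for node in nodes for field in weights)


def award_quorum(subgroups: Dict[str, List[NodeState]],
                 weights: Mapping[str, float] = DEFAULT_WEIGHTS) -> str:
    """Return the name of the subgroup with the highest priority.

    Names are sorted first so that a tie goes to the alphabetically
    first subgroup (an assumption, chosen only for determinism).
    """
    return max(sorted(subgroups),
               key=lambda name: subgroup_priority(subgroups[name], weights))


partition = {
    "A": [NodeState(client_connections=10, memory_free_gb=4.0, io_paths=2)],
    "B": [NodeState(2, 16.0, 2), NodeState(1, 8.0, 1)],
}
winner = award_quorum(partition)
```

With the default weights, subgroup A scores 38 and subgroup B scores 39, so B is awarded quorum; raising the weight on client connectivity would shift the result toward A, which is the kind of policy tuning the abstract attributes to the cluster application.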
-
Publication number: 20100115338
Abstract: A method and system for localizing and resolving a fault in a cluster environment. The cluster is configured with at least one multi-homed node, and at least one gateway for each network interface. Heartbeat messages are sent between peer nodes and the gateway in predefined periodic intervals. In the event of loss of a heartbeat message by any node or gateway, an ICMP echo is issued to each node and gateway in the cluster for each network interface. If neither a node loss nor a network loss is validated in response to the ICMP echo, an application level ping is issued to determine if the fault associated with the absence of the heartbeat message is a transient error condition or an application software fault.
Type: Application
Filed: July 5, 2007
Publication date: May 6, 2010
Inventors: Sudhir G. Rao, Bruce M. Jackson, Mark C. Davis, Srikanth N. Sridhara
-
Patent number: 7475296
Abstract: A method and system for capturing a state of a distributed computer system is provided. The state is captured in response to an error or event message received by one of the clients and/or server nodes of the system. In response to receipt of the error or event message, the recipient initiates transmission of a special protocol message to affected members of the system. Upon receipt of the message, all recipients will conduct a freeze of their respective operating system image. Depending upon the characteristics of the error or event, the message may be transmitted to a selection of members of the system, or the entire system.
Type: Grant
Filed: May 20, 2004
Date of Patent: January 6, 2009
Assignee: International Business Machines Corporation
Inventors: Sudhir G. Rao, Pradeep Satyanarayana
-
Patent number: 7284147
Abstract: A method and system for localizing and resolving a fault in a cluster environment. The cluster is configured with at least one multi-homed node, and at least one gateway for each network interface. Heartbeat messages are sent between peer nodes and the gateway in predefined periodic intervals. In the event of loss of a heartbeat message by any node or gateway, an ICMP echo is issued to each node and gateway in the cluster for each network interface. If neither a node loss nor a network loss is validated in response to the ICMP echo, an application level ping is issued to determine if the fault associated with the absence of the heartbeat message is a transient error condition or an application software fault.
Type: Grant
Filed: August 27, 2003
Date of Patent: October 16, 2007
Assignee: International Business Machines Corporation
Inventors: Sudhir G. Rao, Bruce M. Jackson, Mark C. Davis, Srikanth N. Sridhara
-
Patent number: 7197632
Abstract: A method and system for maintaining a discovery record and a cluster bootstrap record is provided. The discovery record enables shared storage system discovery and the cluster bootstrap record enables cluster discovery and cooperative cluster startup. The cluster bootstrap record is updated in response to a change in the cluster membership. The update is performed by a cluster leader in the form of a transactionally consistent I/O update to the cluster bootstrap record on disk and a distributed cache update across the cluster (30, 50). The update is aborted (80) in the event of a failure in the cluster leaving the cluster bootstrap record in a consistent state. In the event of a disastrous cluster and/or storage system failure, the discovery record may be recovered (228) from a restored storage system (214) and the cluster bootstrap record may be reset to install a new cluster in the old cluster's place (232).
Type: Grant
Filed: April 29, 2003
Date of Patent: March 27, 2007
Assignee: International Business Machines Corporation
Inventors: Sudhir G. Rao, Bruce M. Jackson
-
Publication number: 20040221149
Abstract: A method and system for maintaining a discovery record and a cluster bootstrap record is provided. The discovery record enables shared storage system discovery and the cluster bootstrap record enables cluster discovery and cooperative cluster startup. The cluster bootstrap record is updated in response to a change in the cluster membership. The update is performed by a cluster leader in the form of a transactionally consistent I/O update to the cluster bootstrap record on disk and a distributed cache update across the cluster (30, 50). The update is aborted (80) in the event of a failure in the cluster leaving the cluster bootstrap record in a consistent state. In the event of a disastrous cluster and/or storage system failure, the discovery record may be recovered (228) from a restored storage system (214) and the cluster bootstrap record may be reset to install a new cluster in the old cluster's place (232).
Type: Application
Filed: April 29, 2003
Publication date: November 4, 2004
Inventors: Sudhir G. Rao, Bruce M. Jackson