Patents by Inventor Sudhir G. Rao

Sudhir G. Rao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Fault-tolerance and fault-containment models for zoning clustered application silos into continuous availability and high availability zones in clustered systems during recovery and maintenance

Patent number: 8286026

Abstract: A cluster recovery and maintenance technique for a server cluster having plural nodes implementing a server tier in a client-server computing architecture. A first group of N active nodes each run a software stack comprising a cluster management tier and a cluster application tier that actively provides services on behalf of one or more client applications running in a client application tier on the clients. A second group of M spare nodes each run a software stack comprising a cluster management tier and a cluster application tier that does not actively provide client application services. First and second zones in the cluster are determined in response to an active node membership change involving one or more active nodes departing from or being added to the first group as a result of an active node failing or becoming unreachable or as a result of a maintenance operation involving an active node.

Type: Grant

Filed: February 13, 2012

Date of Patent: October 9, 2012

Assignee: International Business Machines Corporation

Inventors: Sudhir G. Rao, Bruce M. Jackson

Fault-Tolerance And Fault-Containment Models For Zoning Clustered Application Silos Into Continuous Availability And High Availability Zones In Clustered Systems During Recovery And Maintenance

Publication number: 20120166866

Abstract: A cluster recovery and maintenance technique for a server cluster having plural nodes implementing a server tier in a client-server computing architecture. A first group of N active nodes each run a software stack comprising a cluster management tier and a cluster application tier that actively provides services on behalf of one or more client applications running in a client application tier on the clients. A second group of M spare nodes each run a software stack comprising a cluster management tier and a cluster application tier that does not actively provide client application services. First and second zones in the cluster are determined in response to an active node membership change involving one or more active nodes departing from or being added to the first group as a result of an active node failing or becoming unreachable or as a result of a maintenance operation involving an active node.

Type: Application

Filed: February 13, 2012

Publication date: June 28, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sudhir G. Rao, Bruce M. Jackson

Fault-tolerance and fault-containment models for zoning clustered application silos into continuous availability and high availability zones in clustered systems during recovery and maintenance

Patent number: 8195976

Abstract: A cluster recovery and maintenance technique for use in a server cluster having plural nodes implementing a server tier in a client-server computing architecture. A first group of N active nodes each run a software stack comprising a cluster management tier and a cluster application tier that actively provides services on behalf of client applications running in a client application tier. A second group of M spare nodes each run a software stack comprising a cluster management tier and a cluster application tier that does not actively provide services on behalf of client applications. First and second zones in the cluster are determined in response to an active node membership change involving active nodes departing from or being added to the first group as a result of an active node failing or becoming unreachable or as a result of a maintenance operation involving an active node.

Type: Grant

Filed: June 29, 2005

Date of Patent: June 5, 2012

Assignee: International Business Machines Corporation

Inventors: Sudhir G. Rao, Bruce M. Jackson

Lock acquisition among nodes of divided cluster

Publication number: 20110258306

Abstract: The acquisition of a lock among nodes of a divided cluster is disclosed. A method is performable by each of at least one node of the cluster. A node waits for a delay corresponding to its identifier. The node asserts intent to acquire the lock by writing its identifier to X and Y variables where another node has failed to acquire the lock. The node waits for another node to acquire the lock where the other node has written to X, and proceeds where Y remains equal to its own identifier. The node waits for another node to acquire the lock where the other node has written to a Z variable, and writes its own identifier to Z and proceeds where the other node has failed. The node writes a value to Y indicating that it is acquiring the lock, and maintains acquisition by periodically writing to Z.

Type: Application

Filed: June 25, 2011

Publication date: October 20, 2011

Inventors: Sudhir G. Rao, Myung M. Bae, Thomas K. Clark, Douglas Griffith, Roger L. Haskin, Shah Mohammad Rezaul Islam, Felipe Knop, Soumitra Sarkar, Frank B. Schmuck, Theodore B. Vojnovich, Yi Zhou, Robert Curran

Lock acquisition among nodes of divided cluster

Patent number: 7991753

Abstract: The acquisition of a lock among nodes of a divided cluster is disclosed. A method is performable by each of at least one node of the cluster. A node waits for a delay corresponding to its identifier. The node asserts intent to acquire the lock by writing its identifier to X and Y variables where another node has failed to acquire the lock. The node waits for another node to acquire the lock where the other node has written to X, and proceeds where Y remains equal to its own identifier. The node waits for another node to acquire the lock where the other node has written to a Z variable, and writes its own identifier to Z and proceeds where the other node has failed. The node writes a value to Y indicating that it is acquiring the lock, and maintains acquisition by periodically writing to Z.

Type: Grant

Filed: May 21, 2004

Date of Patent: August 2, 2011

Assignee: International Business Machines Corporation

Inventors: Sudhir G. Rao, Myung M. Bae, Thomas K. Clark, Douglas Griffith, Roger L. Haskin, Shah Mohammad Rezaul Islam, Felipe Knop, Soumitra Sarkar, Frank B. Schmuck, Theodore B. Vojnovich, Yi Zhou, Robert Curran

Reliable fault resolution in a cluster

Patent number: 7941690

Abstract: A method and system for localizing and resolving a fault in a cluster environment. The cluster is configured with at least one multi-homed node, and at least one gateway for each network interface. Heartbeat messages are sent between peer nodes and the gateway in predefined periodic intervals. In the event of loss of a heartbeat message by any node or gateway, an ICMP echo is issued to each node and gateway in the cluster for each network interface. If neither a node loss nor a network loss is validated in response to the ICMP echo, an application level ping is issued to determine if the fault associated with the absence of the heartbeat message is a transient error condition or an application software fault.

Type: Grant

Filed: July 5, 2007

Date of Patent: May 10, 2011

Assignee: International Business Machines Corporation

Inventors: Sudhir G. Rao, Bruce M. Jackson, Mark C. Davis, Srikanth N. Sridhara
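
The escalation described in the abstract above (missed heartbeat, then ICMP echo per interface, then an application-level ping) can be illustrated with a minimal sketch. It is not the patented method: the function name, the `Fault` labels, the injected `icmp_echo` and `app_ping` callables, and the rules used here to tell a network loss from a node loss are all assumptions made for illustration.

```python
from enum import Enum
from typing import Callable, Iterable, Tuple


class Fault(Enum):
    NETWORK_LOSS = "network loss"
    NODE_LOSS = "node loss"
    TRANSIENT = "transient error"
    APPLICATION_FAULT = "application software fault"


def classify_missed_heartbeat(
    peer: str,
    links: Iterable[Tuple[str, str]],
    icmp_echo: Callable[[str, str], bool],
    app_ping: Callable[[str], bool],
) -> Fault:
    """Classify a missed heartbeat from `peer`.

    `links` holds hypothetical (interface, gateway) pairs. `icmp_echo(target,
    interface)` and `app_ping(peer)` are injected probes returning True on a
    reply, so no real network access happens in this sketch.
    """
    links = list(links)
    # Probe every gateway and the peer on every network interface.
    gateways_ok = [icmp_echo(gateway, iface) for iface, gateway in links]
    peer_ok = [icmp_echo(peer, iface) for iface, _ in links]

    # Assumption: an unreachable gateway is treated as a network loss.
    if not all(gateways_ok):
        return Fault.NETWORK_LOSS
    # Assumption: a peer silent on every interface is treated as a node loss.
    if not any(peer_ok):
        return Fault.NODE_LOSS
    # Neither loss was validated, so fall back to an application-level ping.
    return Fault.TRANSIENT if app_ping(peer) else Fault.APPLICATION_FAULT
```

Passing the probes in as callables keeps the decision logic testable without raw sockets; a real deployment would supply probes backed by actual ICMP and application requests.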

Policy-based cluster quorum determination

Patent number: 7870230

Abstract: A system, method and computer program product for use in a server cluster having plural server nodes implementing a server tier in a client-server computing architecture in order to determine which of two or more partitioned server subgroups has a quorum. A determination is made of relative priorities of the subgroups and a quorum is awarded to the subgroup having a highest relative priority. The relative priorities are determined by policy rules that evaluate comparative server node application state information. The server node application state information may include one or more of client connectivity, application priority, resource connectivity, processing capability, memory availability, and input/output resource availability, etc. The policy rules evaluate the application state information for each subgroup and can assign different weights to different types of application state information. An interface may be provided for receiving policy rules specified by a cluster application.

Type: Grant

Filed: July 15, 2005

Date of Patent: January 11, 2011

Assignee: International Business Machines Corporation

Inventors: Sudhir G. Rao, Bruce M. Jackson, Soumitra Sarkar
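
The weighted policy evaluation described in the abstract above can be sketched as follows. This is an illustrative assumption, not the claimed implementation: the `SubgroupState` fields, the weight values, and the linear scoring rule are hypothetical, and tie-breaking is not addressed by the abstract (here the first subgroup listed wins a tie).

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class SubgroupState:
    """Hypothetical application state summary for one partitioned subgroup."""
    name: str
    client_connections: int
    reachable_resources: int
    free_memory_gb: float


# Hypothetical policy weights; a cluster application could supply its own.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "client_connections": 1.0,
    "reachable_resources": 5.0,
    "free_memory_gb": 0.1,
}


def priority(state: SubgroupState, weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of the state fields named in `weights`."""
    return sum(weight * getattr(state, field) for field, weight in weights.items())


def award_quorum(
    subgroups: Sequence[SubgroupState],
    weights: Dict[str, float] = DEFAULT_WEIGHTS,
) -> SubgroupState:
    """Return the subgroup with the highest relative priority."""
    return max(subgroups, key=lambda s: priority(s, weights))
```

With these example weights, a subgroup holding 4 client connections, 3 reachable resources, and 16 GB of free memory outranks one holding 10 connections, 1 resource, and 8 GB; changing the weights to favor client connectivity reverses the outcome.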

Reliable Fault Resolution In A Cluster

Publication number: 20100115338

Abstract: A method and system for localizing and resolving a fault in a cluster environment. The cluster is configured with at least one multi-homed node, and at least one gateway for each network interface. Heartbeat messages are sent between peer nodes and the gateway in predefined periodic intervals. In the event of loss of a heartbeat message by any node or gateway, an ICMP echo is issued to each node and gateway in the cluster for each network interface. If neither a node loss nor a network loss is validated in response to the ICMP echo, an application level ping is issued to determine if the fault associated with the absence of the heartbeat message is a transient error condition or an application software fault.

Type: Application

Filed: July 5, 2007

Publication date: May 6, 2010

Inventors: Sudhir G. Rao, Bruce M. Jackson, Mark C. Davis, Srikanth N. Sridhara

Serviceability and test infrastructure for distributed systems

Patent number: 7475296

Abstract: A method and system for capturing a state of a distributed computer system is provided. The state is captured in response to an error or event message received by one of the clients and/or server nodes of the system. In response to receipt of the error or event message, the recipient initiates transmission of a special protocol message to affected members of the system. Upon receipt of the message, all recipients will conduct a freeze of their respective operating system image. Depending upon the characteristics of the error or event, the message may be transmitted to a selection of members of the system, or the entire system.

Type: Grant

Filed: May 20, 2004

Date of Patent: January 6, 2009

Assignee: International Business Machines Corporation

Inventors: Sudhir G. Rao, Pradeep Satyanarayana

Reliable fault resolution in a cluster

Patent number: 7284147

Abstract: A method and system for localizing and resolving a fault in a cluster environment. The cluster is configured with at least one multi-homed node, and at least one gateway for each network interface. Heartbeat messages are sent between peer nodes and the gateway in predefined periodic intervals. In the event of loss of a heartbeat message by any node or gateway, an ICMP echo is issued to each node and gateway in the cluster for each network interface. If neither a node loss nor a network loss is validated in response to the ICMP echo, an application level ping is issued to determine if the fault associated with the absence of the heartbeat message is a transient error condition or an application software fault.

Type: Grant

Filed: August 27, 2003

Date of Patent: October 16, 2007

Assignee: International Business Machines Corporation

Inventors: Sudhir G. Rao, Bruce M. Jackson, Mark C. Davis, Srikanth N. Sridhara

Storage system and cluster maintenance

Patent number: 7197632

Abstract: A method and system for maintaining a discovery record and a cluster bootstrap record is provided. The discovery record enables shared storage system discovery and the cluster bootstrap record enables cluster discovery and cooperative cluster startup. The cluster bootstrap record is updated in response to a change in the cluster membership. The update is performed by a cluster leader in the form of a transactionally consistent I/O update to the cluster bootstrap record on disk and a distributed cache update across the cluster (30, 50). The update is aborted (80) in the event of a failure in the cluster leaving the cluster bootstrap record in a consistent state. In the event of a disastrous cluster and/or storage system failure, the discovery record may be recovered (228) from a restored storage system (214) and the cluster bootstrap record may be reset to install a new cluster in the old cluster's place (232).

Type: Grant

Filed: April 29, 2003

Date of Patent: March 27, 2007

Assignee: International Business Machines Corporation

Inventors: Sudhir G. Rao, Bruce M. Jackson
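
The idea in the abstract above, that an aborted update leaves the cluster bootstrap record in a consistent state, can be illustrated with a small sketch. It swaps in a common technique (write to a temporary file, then atomically rename) rather than reproducing the patented mechanism; the function name, the JSON record layout, and the in-memory `cache` dictionary standing in for the distributed cache are assumptions.

```python
import json
import os
import tempfile
from typing import Dict, Iterable


def update_bootstrap_record(path: str, members: Iterable[str], cache: Dict) -> None:
    """Replace the on-disk record, then refresh the cache, all-or-nothing.

    Assumption: the record is a small JSON file listing cluster members, and
    `cache` is a local stand-in for the cluster-wide distributed cache.
    """
    record = {"members": sorted(members)}
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new record next to the old one so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(record, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Atomic swap: readers see either the old record or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Abort: discard the partial write; the previous record is untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Only after the disk commit succeeds is the cache brought up to date.
    cache.clear()
    cache.update(record)
```

Ordering the cache refresh after the disk commit means a failure at any earlier point leaves both the file and the cache reflecting the previous membership.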

Storage system and cluster maintenance

Publication number: 20040221149

Abstract: A method and system for maintaining a discovery record and a cluster bootstrap record is provided. The discovery record enables shared storage system discovery and the cluster bootstrap record enables cluster discovery and cooperative cluster startup. The cluster bootstrap record is updated in response to a change in the cluster membership. The update is performed by a cluster leader in the form of a transactionally consistent I/O update to the cluster bootstrap record on disk and a distributed cache update across the cluster (30, 50). The update is aborted (80) in the event of a failure in the cluster leaving the cluster bootstrap record in a consistent state. In the event of a disastrous cluster and/or storage system failure, the discovery record may be recovered (228) from a restored storage system (214) and the cluster bootstrap record may be reset to install a new cluster in the old cluster's place (232).

Type: Application

Filed: April 29, 2003

Publication date: November 4, 2004

Inventors: Sudhir G. Rao, Bruce M. Jackson