Abstract: A method for monitoring the availability of a data processing system is proposed. For example, the system runs a management application, which involves the periodic transmission of blocks of data from multiple local computers to a central computer. In the method of the invention, whenever a block of data must be transmitted by a generic local computer, an expected transmission delay of a next block of data is estimated; this information is then attached to the block of data. As a result, the central computer receiving the updated block of data can calculate an expected receiving time of the next block of data accordingly. If the next block of data is not received in due time, the central computer determines a failure of the local computer. Preferably, the central computer also scans a subset of ports of the local computer, so as to ascertain whether the problem is due to a temporary unavailability of the application or to an actual crash of the local computer.
Type:
Grant
Filed:
September 20, 2005
Date of Patent:
May 19, 2009
Assignee:
International Business Machines Corporation
Inventors:
Salvatore D'Alo, Arcangelo Di Balsamo, Alessandro Donatelli
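The deadline logic described in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the names `Block`, `CentralMonitor`, `diagnose`, the `grace` allowance, and the injected `probe` callable are all hypothetical, and time is passed in explicitly so the example stays deterministic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Block:
    sender: str        # identifier of the local computer (hypothetical field)
    payload: bytes     # the management data itself
    next_delay: float  # sender's own estimate, in seconds, of when its next block follows


class CentralMonitor:
    """Tracks, per local computer, the time by which the next block is expected."""

    def __init__(self, grace: float = 0.0) -> None:
        # 'grace' is an assumed tolerance for network jitter, not part of the abstract.
        self.grace = grace
        self.deadline: Dict[str, float] = {}

    def receive(self, block: Block, now: float) -> None:
        # Expected receiving time = arrival time + delay announced by the sender.
        self.deadline[block.sender] = now + block.next_delay + self.grace

    def overdue(self, now: float) -> List[str]:
        # Local computers whose next block did not arrive in due time.
        return sorted(s for s, d in self.deadline.items() if now > d)


def diagnose(ports: Iterable[int], probe: Callable[[int], bool]) -> str:
    """Scan a subset of ports; any response suggests the machine itself is still up."""
    if any(probe(p) for p in ports):
        return "application_unavailable"
    return "computer_down"
```

In a real deployment `probe` would attempt a connection to the given port on the local computer; here it is left to the caller so that the sketch makes no claims about any particular networking API.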
Abstract: An arbitrary metric stream is processed initially at an interim sampling rate to derive a plurality of samples. The samples are analyzed, preferably to determine an estimate of the effective bandwidth of the metric stream. As a result of the analysis, an improved sampling rate is determined and adopted for further sampling. In a preferred embodiment, the improved sampling rate is a function of the effective bandwidth.
Type:
Grant
Filed:
September 15, 2006
Date of Patent:
April 21, 2009
Assignee:
International Business Machines Corporation
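The abstract above does not say how the effective bandwidth is estimated or how the improved rate depends on it. As one possible reading, the sketch below assumes that the effective bandwidth is the lowest frequency below which 95% of the non-DC spectral energy lies, and that the improved rate is the Nyquist rate for that bandwidth times a safety margin. The function names, the energy fraction, the margin, and `min_rate` are all assumptions made for illustration.

```python
import cmath
import math
from typing import Sequence


def effective_bandwidth(samples: Sequence[float], rate: float,
                        energy_fraction: float = 0.95) -> float:
    """Frequency (Hz) below which 'energy_fraction' of the non-DC energy lies."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC component
    # Plain O(n^2) discrete Fourier transform over the positive frequencies.
    power = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0:
        return 0.0  # constant stream: no measurable bandwidth
    acc = 0.0
    for k, p in enumerate(power, start=1):
        acc += p
        if acc >= energy_fraction * total:
            return k * rate / n
    return rate / 2


def improved_rate(samples: Sequence[float], interim_rate: float,
                  margin: float = 1.25, min_rate: float = 0.0) -> float:
    """Assumed rule: sample at twice the effective bandwidth, times a margin."""
    bw = effective_bandwidth(samples, interim_rate)
    return max(min_rate, 2.0 * margin * bw)
```

For example, a 2 Hz sinusoid sampled for one second at an interim rate of 64 Hz has an estimated bandwidth of 2 Hz under this definition, so the improved rate with the default margin is 5 Hz.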
Abstract: A solution for distributing the workload across the servers (105) in a fail-over cluster (for example, based on the MSCS) is proposed. A fail-over cluster is aimed at providing high availability; for this purpose, a resource service (205) automatically moves each resource (220) that exhibits some sort of failure to another server in the cluster. The proposed solution adds a monitor (240) that periodically measures a responsiveness of each resource. If the responsiveness of a resource is lower than a threshold value, the monitor queries a metrics provider (245) to determine the workload of all the servers in the cluster. The monitor then causes the resource service to move that resource to the server having the lowest workload in the cluster.
Type:
Grant
Filed:
September 13, 2005
Date of Patent:
October 28, 2008
Assignee:
International Business Machines Corporation
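One pass of the monitor described in the abstract above might look like the following sketch. It does not use any MSCS API: the resource service, the metrics provider, and the responsiveness measurement are stand-in callables, and the name `rebalance_once` is hypothetical. Skipping the move when the resource already sits on the least-loaded server is an added assumption.

```python
from typing import Callable, Dict, List, Tuple


def rebalance_once(
    placement: Dict[str, str],                        # resource -> server hosting it
    measure_responsiveness: Callable[[str], float],   # stand-in for the monitor's probe
    get_workloads: Callable[[], Dict[str, float]],    # stand-in for the metrics provider
    move: Callable[[str, str], None],                 # stand-in for the resource service
    threshold: float,
) -> List[Tuple[str, str]]:
    """Run one monitoring pass and return the (resource, target) moves performed."""
    moves: List[Tuple[str, str]] = []
    for resource in sorted(placement):
        if measure_responsiveness(resource) >= threshold:
            continue  # resource is responsive enough; leave it where it is
        # Only query the metrics provider when a resource is actually too slow.
        workloads = get_workloads()
        target = min(workloads, key=workloads.get)  # server with the lowest workload
        if target != placement[resource]:           # assumption: no move if already there
            move(resource, target)
            placement[resource] = target
            moves.append((resource, target))
    return moves
```

A periodic scheduler would call `rebalance_once` at whatever interval the monitor uses; the interval itself is not specified in the abstract.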