Abstract: A programmable device employs address and data corruption logic for data written to a first memory. A first signature is computed from the data stored in the first memory and stored in a second memory. When data is read from the first memory, the first signature stored in the second memory is read and compared with a second signature computed from the data read from the first memory. If the first and second signatures do not match, an error condition is indicated.
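The write/read signature check described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not name a signature function, so CRC-32 is assumed here as a stand-in, and the class name `SignedMemory`, the exception `SignatureMismatch`, and the dictionary-backed memories are all hypothetical.

```python
import zlib


class SignatureMismatch(Exception):
    """Raised when the stored and recomputed signatures differ."""


class SignedMemory:
    """Hypothetical model of a first (data) and second (signature) memory."""

    def __init__(self):
        self.first_memory = {}   # address -> data bytes
        self.second_memory = {}  # address -> signature of that data

    @staticmethod
    def signature(data: bytes) -> int:
        # Assumption: CRC-32 stands in for the unspecified signature function.
        return zlib.crc32(data)

    def write(self, address: int, data: bytes) -> None:
        self.first_memory[address] = data
        # The first signature is computed from the data as stored.
        stored = self.first_memory[address]
        self.second_memory[address] = self.signature(stored)

    def read(self, address: int) -> bytes:
        data = self.first_memory[address]
        first_sig = self.second_memory[address]
        # The second signature is recomputed from the data just read.
        second_sig = self.signature(data)
        if first_sig != second_sig:
            raise SignatureMismatch(f"signature mismatch at {address:#x}")
        return data
```

Under these assumptions, a value that is altered in `first_memory` after it was written causes `read` to raise `SignatureMismatch` instead of returning the corrupted bytes.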
Abstract: A system and method for using continuous failure predictions for proactive failure management in distributed cluster systems includes a sampling subsystem configured to continuously monitor and collect operation states of different system components. An analysis subsystem is configured to build classification models to perform on-line failure predictions. A failure prevention subsystem is configured to take preventive actions on failing components based on failure warnings generated by the analysis subsystem.
Type: Grant
Filed: April 5, 2007
Date of Patent: June 1, 2010
Assignee: International Business Machines Corporation
Inventors: Shu-Ping Chang, Xiaohui Gu, Spyridon Papadimitriou, Philip Shi-lung Yu
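The three subsystems named in the failure-management abstract (sampling, analysis, prevention) can be pictured with a small sketch. Everything concrete in it is an assumption made for illustration: the abstract does not specify the classification model, so a nearest-centroid classifier is swapped in, and the names `Sample`, `Analyzer`, `prevent`, and `run_once`, the two-value metric tuples, and the "migrate workload" action are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One observation from the (hypothetical) sampling subsystem."""
    component: str
    metrics: tuple  # assumed features, e.g. (cpu_load, memory_use)


def centroid(rows):
    # Column-wise mean of a list of metric tuples.
    return tuple(mean(col) for col in zip(*rows))


class Analyzer:
    """Nearest-centroid classifier, a stand-in for the unspecified model."""

    def __init__(self, healthy_rows, failing_rows):
        self.healthy = centroid(healthy_rows)
        self.failing = centroid(failing_rows)

    @staticmethod
    def _dist2(a, b):
        # Squared Euclidean distance between two metric tuples.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def predicts_failure(self, sample):
        to_failing = self._dist2(sample.metrics, self.failing)
        to_healthy = self._dist2(sample.metrics, self.healthy)
        return to_failing < to_healthy


def prevent(sample):
    # Hypothetical preventive action taken on a failure warning.
    return f"migrate workload off {sample.component}"


def run_once(samples, analyzer):
    # One pass: classify each sample, act on those predicted to fail.
    return [prevent(s) for s in samples if analyzer.predicts_failure(s)]
```

A usage example under the same assumptions:

```python
analyzer = Analyzer(
    healthy_rows=[(0.2, 0.3), (0.3, 0.2)],
    failing_rows=[(0.9, 0.95), (0.95, 0.9)],
)
samples = [Sample("node-a", (0.25, 0.25)), Sample("node-b", (0.92, 0.9))]
actions = run_once(samples, analyzer)  # only node-b is flagged
```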