Abstract: A system operation management server computer (server), an agent computer (agent), and related methods are disclosed for configuring and managing the operation of systems in a system category. In some embodiments, the server and a plurality of agents are programmed to collaborate, through reinforcement learning, on exploring and improving the operation of systems in the system category.
Abstract: A system operation management server computer (server) and related methods are disclosed. The server is programmed to learn the features of, and relations among, computer devices from various types of data related to the computer devices, and to build a knowledge graph (KG) that represents the IT infrastructure. The server is also programmed to manage a collection of issue resolution rules, each mapping the states of certain computer devices that characterize a known issue to a known resolution of that issue. In response to receiving a support bundle that contains data related to a target computer system that has encountered an unknown issue, the server is programmed to determine, based on the KG, which issue resolution rules are applicable to the support bundle, and to transmit a recommendation for resolving the unknown issue.
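
The rule-matching step described in this abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the bundle layout (`devices`, `links`), the `IssueResolutionRule` class, the functions `build_kg`, `rule_applies`, and `recommend`, and the sample device states are all hypothetical, and the KG is reduced to a plain adjacency dictionary. The sketch assumes a rule applies when some device and its directly connected neighbors together exhibit every state the rule lists.

```python
from dataclasses import dataclass


@dataclass
class IssueResolutionRule:
    """Hypothetical rule: device states that characterize a known issue,
    paired with the known resolution of that issue."""
    issue: str
    resolution: str
    conditions: list  # list of (device_type, state) pairs; all must hold


def build_kg(bundle):
    """Build a toy knowledge graph: device id -> type, state, and neighbors."""
    kg = {}
    for dev in bundle["devices"]:
        kg[dev["id"]] = {
            "type": dev["type"],
            "state": dev["state"],
            "neighbors": set(),
        }
    # Links are treated as undirected relations between devices.
    for a, b in bundle["links"]:
        kg[a]["neighbors"].add(b)
        kg[b]["neighbors"].add(a)
    return kg


def rule_applies(rule, kg):
    """Assumed criterion: some device plus its direct neighbors
    jointly satisfy every (type, state) condition of the rule."""
    for node in kg.values():
        group = [node] + [kg[n] for n in node["neighbors"]]
        observed = {(d["type"], d["state"]) for d in group}
        if all(cond in observed for cond in rule.conditions):
            return True
    return False


def recommend(rules, bundle):
    """Return the resolutions of all rules applicable to the support bundle."""
    kg = build_kg(bundle)
    return [r.resolution for r in rules if rule_applies(r, kg)]


# Illustrative data only; none of these states or rules come from the source.
rules = [
    IssueResolutionRule(
        issue="uplink failure",
        resolution="Re-seat or replace the switch uplink cable",
        conditions=[("switch", "port_down"), ("host", "unreachable")],
    ),
    IssueResolutionRule(
        issue="datastore full",
        resolution="Expand the datastore",
        conditions=[("storage", "full")],
    ),
]

bundle = {
    "devices": [
        {"id": "sw1", "type": "switch", "state": "port_down"},
        {"id": "h1", "type": "host", "state": "unreachable"},
        {"id": "st1", "type": "storage", "state": "healthy"},
    ],
    "links": [("sw1", "h1"), ("h1", "st1")],
}

recommendations = recommend(rules, bundle)
print(recommendations)
```

With this sample bundle, only the first rule matches, because the switch and the host are linked and show both required states, while the storage device is healthy. A fuller system would presumably learn the graph from heterogeneous data rather than reading it directly from the bundle, as the abstract indicates.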