COMPUTER INFRASTRUCTURE SECURITY MANAGEMENT
A mapping system is provided that makes use of security data collected from various data sources. Following appropriate pre-processing, the mapping system analyses the security data to provide estimated values for parameters in a security model, the security model in turn being based on one or more mathematical representations.
Ensuring the security of a computing or information technology (IT) infrastructure is important for an organisation. There are many threats and vulnerabilities. These may originate from internal and external sources on technical and administrative levels. Typically an organisation will have suitable policies and controls to identify and mitigate threats and vulnerabilities. For example, they may employ computer security professionals and/or install security systems to monitor the computing infrastructure and provide security alerts. These latter systems are often referred to as security information and event management (SIEM) systems.
Any security solution needs to be suitable for the organisation. However, all organisations vary. Amongst others, organisations may vary in size; in hardware infrastructure; in geographical distribution; and in operational culture. To take account of these variations, expensive and time-consuming solutions are often required. Due consideration also needs to be given to the ever-changing nature of security threats: new technologies are developed, cryptographic systems are deciphered and most infrastructures are continually developing. For example, the explosive growth of mobile devices presents new challenges that may not have been anticipated when implementing legacy security systems.
Various features and advantages of the present disclosure will be apparent from the detailed description which follows, taken in conjunction with the accompanying drawings, which together illustrate, by way of example only, features of the present disclosure, and wherein;
Certain examples described herein provide a system that analyses low-level raw data such as that produced by monitoring and/or event information systems operating on a computing infrastructure. This raw data is processed into a form that can be analysed, e.g. statistically and/or numerically. Following analysis, more meaningful data is derived from the original raw data that can be directly used in security risk analysis systems, for example systems that use modelling and simulations. The transformed data, and the result of any further modelling and simulation, may be used to inform policy decisions relating to the modification, upgrade or replacement of security systems for the computing infrastructure. Modelling and simulation may include any of predictions, speculative analysis and scenario planning.
As used in the following description, a computing infrastructure may comprise one or more computing devices coupled to one or more heterogeneous networks. The computing devices may comprise, amongst others, workstations, laptops, mobile and tablet devices, routers, servers, networked storage, sensors, networked appliances, access points, and gateways. The networks may use a variety of wired or wireless access mechanisms including telecommunications technologies. The networks may be arranged in local, metropolitan and/or global configurations with multiple sites and specifications. They may comprise a mix of private and public networks.
In block 105, a system security model is defined. This involves modelling the behaviour of processes relating to the computing infrastructure. A mathematical representation of a process based on the probabilities of discrete events occurring in relation to the process may be used. In this stage any of architectural, policy, business process, and behavioural constraints, amongst others, which are inherent in the security problem are captured and formalized. For example, in a security model representing possible electronic attacks on the computing infrastructure, threat environment characteristics such as potential attacker behaviour, threat vectors and probabilities and other externalities that may influence an internal business process or human behaviour in the organisation are identified and captured as events. In other approaches, reports, rather than models, may be issued based on a review of the computing infrastructure, typically by auditors or consultants.
In block 110, the security model of block 105 is used in a simulation to generate results 115 that can be analysed at block 120 for a deeper understanding of a security risk or proposed implementation. In a generic case, a security model comprises a mathematical representation that is defined by parameters and variables. The mathematical representation may be probabilistic, e.g. a process may be represented using probabilities or probability distributions. For example, in a basic case, if the mathematical representation uses linear regression, i.e. comprises a linear function of the form y=mx+c, the parameters of the representation are the multiplier ‘m’ and the constant ‘c’ and the variables are ‘y’ and ‘x’, the ‘y’ variable being the result of the function. A simulation may be performed by varying a variable range (e.g. the range of ‘x’ values) or one or more parameters. In another example, user requests to a server may be modelled using queuing theory (e.g. using Poisson processes or exponential distributions); simulations may be run to determine the effect on user waiting time of adding more servers. Certain parameters may be fixed by the computing infrastructure and its environment and others may be variable depending on a proposed implementation. A simulation may explore the value space for a variable parameter. That is to say, using a security model, behaviour of the computing infrastructure is simulated by exploring one or more search spaces in order to provide results 115 which can be representative of multiple output configurations for the computing infrastructure. A simulation may comprise Monte Carlo methods and/or may provide statistically verifiable results. If the result of the simulations at block 110 does not accurately reflect the actual behaviour of the processes then the security model may be refined, i.e. the method may return to block 105 as shown by the dotted line.
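By way of illustration only, the queuing example above may be sketched as follows. The sketch assumes Poisson arrivals, exponentially distributed service times and first-come-first-served handling; the function name mean_wait and all rate values are hypothetical and are not taken from any described implementation.

```python
import random


def mean_wait(servers, arrival_rate, service_rate, requests=20000, seed=1):
    """Estimate the mean time a request waits before service begins.

    Assumes Poisson arrivals, exponential service times and
    first-come-first-served handling (illustrative assumptions only).
    """
    rng = random.Random(seed)
    free_at = [0.0] * servers  # time at which each server next becomes idle
    clock = 0.0
    total_wait = 0.0
    for _ in range(requests):
        clock += rng.expovariate(arrival_rate)  # next arrival time
        idx = min(range(servers), key=free_at.__getitem__)  # earliest free server
        start = max(clock, free_at[idx])
        total_wait += start - clock
        free_at[idx] = start + rng.expovariate(service_rate)
    return total_wait / requests


# Explore the value space of the variable parameter "number of servers".
waits = {n: mean_wait(n, arrival_rate=0.8, service_rate=1.0) for n in (1, 2, 3)}
```

Here the arrival and service rates play the role of parameters fixed by the infrastructure and its environment, while the number of servers is the variable parameter explored by the simulation.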
At block 120, the outcome of the simulation is analysed. For example, the requirements of existing or proposed security policies for the computing infrastructure may be compared with the results 115. Examples of policy requirements may comprise a requirement that all new patches must be applied within 30 days, a requirement that all user requests must be acknowledged within one second or less or a requirement that user data relating to ex-employees must be deleted or archived within 10 days. These requirements may be measured using security metrics. In this case, “patching” refers to the installation of an update to a piece of software such as an operating system or application. If there are outcomes within the results that match existing decisions, policies or configurations for the computing infrastructure, for example through simulation it may be determined that three server devices are required to meet an acknowledgement time of one second or less, then actions based on the analysis 120 may be taken, for example three server devices may be purchased. If there are no simulation outcomes that match existing policies, for example it may be determined that an acknowledgement time of one second or less is impossible to achieve, then the identification of the security risks or the security model may need to be refined, as represented by the dotted lines to blocks 100 and 105. Additionally block 105 may need to be revisited as new problems arise.
Following the first join/change role event 205, a provisioning request 210 is generated for the provision of access to one or more applications and/or computing devices forming part of the computing infrastructure. There is then a configuration/deployment phase 215 in which the access rights are determined, verified and deployed for the user. For example, a computing department within the organisation can generate the desired security or access credentials for the user in response to the request 210, and communicate those credentials to the user, or someone else in the user's hierarchy (such as a manager for example). Once the security model is defined a set of metrics 225 can be used to monitor various phases of the process. For example, the time taken to process a request 210 can be monitored, as well as whether or not the configuration and/or deployment phase 215 was successful. In particular, the following sub-processes can be modelled and monitored: a loss of a provisioning request; a waiting time to approval of a request; a loss of a deployment request; a waiting time for deployment; and misconfiguration of a new user, e.g. unsuccessful deployment.
Similarly, following a second leave/role change event 255, if access privileges should be downgraded or revoked, a de-provisioning request 260 can be used to fulfil the changes. Accordingly, a configuration/deployment phase 265 determines the access rights which should be changed as a result of the request 260, and executes the changes by, for example, revoking a security credential for the user or downgrading/changing a security credential so that access privileges for the user are less privileged than they were, or only permit access to limited or different systems than before the change was deployed. A set of metrics 270 can be used to monitor various sub-processes or phases associated with the request 260. For example, the time taken to process the request can be monitored, as well as whether or not the configuration and/or deployment phase 265 was successful.
A security model may be used in two contexts: in a first context the security model serves as a guide for audit measurements involving the computing infrastructure; in a second context the security model serves as a framework for simulation using the measurements. For example, in the first context existing software operating on the computing infrastructure may be adapted so that a measurement is made relating to each of a number of sub-processes or phases, i.e. relating to metrics 225 or 270. Using measurements for multiple events, appropriate probability distributions that model the measurements are determined for simulation. For example, the loss of a provisioning request may be modelled as a Bernoulli process using a binomial probability distribution. The parameters of the binomial probability distribution may be derived from the metrics 225. In the second context referred to above, a simulation of a provisioning request 210 may comprise a random sampling of the modelled Bernoulli process. Hence, security models can be used for at least: one, conveying, in a scientific manner, current security, risk and threat circumstances by using suitable metrics calculated using the model; and two, speculative (i.e. so-called “what-if”) analysis and scenario planning, for example, by exploring variants of processes and/or exploring different model assumptions. In the second case, simulations convey results and predictions via security metrics, the results and predictions being dependent on input parameters for the security model.
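As a minimal sketch of the two contexts, assuming hypothetical audit counts, the loss probability of a provisioning request may be estimated from measurements and then used to randomly sample the modelled Bernoulli process. The function names and the figures of 12 lost requests out of 400 are illustrative assumptions only.

```python
import random


def estimate_loss_probability(lost, total):
    """First context: derive the Bernoulli parameter from measured counts."""
    return lost / total


def simulate_lost_requests(p_loss, n_requests, rng):
    """Second context: sample n_requests independent Bernoulli trials."""
    return sum(1 for _ in range(n_requests) if rng.random() < p_loss)


rng = random.Random(42)
p_loss = estimate_loss_probability(lost=12, total=400)  # hypothetical audit counts
runs = [simulate_lost_requests(p_loss, 400, rng) for _ in range(1000)]
mean_lost = sum(runs) / len(runs)  # average number of lost requests per run
```

The number of lost requests in each run follows a binomial distribution with the estimated probability, so the spread of the values in runs indicates the variation that may be expected between periods of 400 requests.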
As demonstrated by the above examples, a security model may include one or more mathematical representations of a set of internal and external processes or components to represent aspects of the computing infrastructure, its environment and a security risk under consideration. External components may correspond to a threat environment and can include the rate of discovery of vulnerabilities, a speed to develop exploits, a speed to develop patches and signatures, attacker behaviour etc. Internal components can include specific tasks undertaken in security operations, a speed with which these tasks are undertaken, internal threats, human behaviour, human-system interactions, information and process flows, decision points, errors and process failures and specific security solutions and mechanisms and their properties.
The examples described below allow parameter values for security models to be determined based on collected data.
The example shown in
In the example of
A first component of the mapping system is the parameter processing engine 510. The parameter processing engine 510 coordinates the collection of raw data from data sources. In the example of
Following pre-processing, the parameter processing engine 510 performs data analysis of the pre-processed data based on available configuration information. The data analysis may comprise fitting the pre-processed data to a number of mathematical representations. The data analysis may involve, amongst others, one or more of data fitting, statistical analysis and numerical analysis. Data fitting may comprise curve fitting, e.g. constructing a curve or mathematical function that has the best fit to a number of data points provided by the pre-processed data. The data fitting may be subject to one or more constraints. Curve fitting may involve either interpolation, where an exact fit to the data is required, or smoothing, in which a “smoothing” function is constructed that approximately fits the pre-processed data. Numerical analysis applies algorithms that use numerical approximation. One or more statistical libraries may be used to perform the data analysis. For example, the pre-processed data may be analysed to determine, amongst others: if there are any frequency components; if the data is represented by one or more probability distributions, such as normal or Gaussian distributions, Bernoulli and binomial distributions, Pareto distributions, Poisson and exponential distributions; if the data is represented by one or more predefined lines, curves or multi-dimensional equations such as take-up curves; and if the pre-processed data displays any patterns such as clustering or discrete probabilities. If the pre-processed data does not match any library functions, e.g. if the confidence level for each fit is below a predetermined threshold, a bespoke or composite function may be fitted to the pre-processed data. For example, a user or the mapping system may combine different curve equations until a confidence level exceeds a predetermined threshold. The result of this analysis is a range of estimated parameter values.
A selected mathematical representation may also be output. Confidence levels for each determination or fit may also be provided. Estimated parameters may be stored in an estimated parameter database 516, such that the development and evolution of the estimated parameters over time may be analysed. In certain implementations, if pre-processed data cannot be fitted using smoothing algorithms, then an interpolation algorithm is applied. A range of interpolation methods may be used, for example depending on the data. These interpolation methods may include interpolation using Gaussian processes. Interpolation may be particularly important for certain security data, as there may not always be enough data to apply smoothing algorithms and accurately fit data with high confidence levels. Interpolation may also be applied where data fitting provides an erroneous result.
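One possible way of comparing candidate probability distributions is sketched below. It is an assumption of this sketch, not a statement of the described implementation, that parameters are estimated from the sample mean and standard deviation and that the confidence score is one minus the Kolmogorov-Smirnov distance between the data and the fitted distribution. The data are synthetic and all names are hypothetical.

```python
import math
import random
import statistics


def ks_distance(samples, cdf):
    """Largest gap between the empirical CDF of samples and a fitted CDF."""
    xs = sorted(samples)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )


def fit_candidates(samples):
    """Return {name: (estimated parameters, fitted CDF)} for two candidates."""
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)
    rate = 1.0 / mean  # maximum-likelihood estimate for an exponential

    def exponential_cdf(x):
        return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

    normal = statistics.NormalDist(mean, std)
    return {
        "exponential": ({"rate": rate}, exponential_cdf),
        "normal": ({"mean": mean, "std": std}, normal.cdf),
    }


rng = random.Random(7)
waiting_times = [rng.expovariate(0.5) for _ in range(500)]  # synthetic stand-in data

candidates = fit_candidates(waiting_times)
# Hypothetical confidence score: 1 minus the KS distance (higher is better).
scores = {name: 1.0 - ks_distance(waiting_times, cdf)
          for name, (params, cdf) in candidates.items()}
best = max(scores, key=scores.get)
estimated_parameters = candidates[best][0]
```

The selected representation, its estimated parameters and the score for each candidate correspond to the outputs described above; a different goodness-of-fit measure could be substituted without changing the structure.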
A number of components may be provided for the configuration of the mapping system. The mapping system of
In certain examples, the configuration and management module 545 enables a configuration user 502 to specify a security model to be used. A security model need not be associated with a particular modelling and simulation tool or system; it may be a chosen mathematical representation. In certain configurations, a security model need not be selected; a mathematical representation may be selected based on the processing performed by the parameter processing engine 510. After selecting the security model, a list of parameters used by the security model may be displayed to the configuration user. The current values of each listed parameter may also be displayed. This may be implemented based on the parsing of a security model data file or via interaction (e.g. function calls) with a modelling and simulation system or tool. If a security model comprises multiple parameters, the configuration user is able to specify which parameters are to be estimated or re-estimated based on data collected from data sources. Finally, the configuration and management module 545 enables the configuration user to configure the way parameters are to be processed and how estimates are to be provided by the system. In certain implementations, the configuration user 502 may be distinguished from an end-user 504; for example, the end-user may be able to view outputs such as experimental results, security metrics and graphs, but may not be able to configure the mapping system; similarly a configuration user 502 may be able to configure the mapping system but not view outputs. In other implementations both users may have similar permissions.
A first system module coordinated by the workflow manager 520 is parameter processing configuration module 511. This uses configuration information from the configuration and management module 545, for example that supplied by a configuration user 502 and/or configuration database 544. The parameter processing configuration module 511 configures the parameter processing engine 510 and/or one or more of the plug-in processors 562, 564, 556. It retrieves configuration information that specifies how to collect data from one or more data sources, such as event and alert data sources 572, 574 and 576 and other data sources 578, how to process this data, for example which plug-in modules are to be used, and how to estimate parameter values, for example which of one or more probability distributions to fit. Parameter processing configuration module 511 may control the pre-processing configurations defined using the exemplary first and second interfaces of
A second system module coordinated by the workflow manager 520 is confidence analysis module 512. In certain examples, this module determines a confidence level that is representative of the suitability of a particular mathematical model. For example, the confidence level may be calculated based on an error between prepared data and a fitted line or equation or based on a statistical deviation or other statistical measure. Certain smoothing algorithms may also generate a confidence level when applied to pre-processed data. In particular implementations, algorithms within this module graphically display estimated parameter values for selected mathematical representations, for example via dashboard interface 530 and display 532, and provide a classification based on calculated confidence levels. A classification scheme with two or more classifications may be used. The classifications may be based on threshold levels. For example, one classification may be that the mathematical representation is “useable” or “unuseable”, a mathematical representation being useable if a calculated confidence level is above a particular threshold. Another classification may use a colour-coded system, for example a red, amber and green “traffic-light” colour scheme. The confidence analysis module 512 may present a user with a number of different mathematical representations and the confidence levels for data derived from a number of data sources; the user may then select a particular representation to define the security model and for use in a modelling and simulation system 440. The confidence analysis module 512 may be arranged to record the present selection, and any previous selections, in one or more of configuration database 544 and estimated parameter database 516.
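The threshold-based classifications may, for example, be sketched as follows. The threshold values of 0.6 and 0.85, the assumption that confidence levels lie between 0 and 1, and the function names are all hypothetical choices made for illustration.

```python
def is_useable(confidence, threshold=0.6):
    """Two-class scheme: useable if the confidence level is above the threshold."""
    return confidence > threshold


def traffic_light(confidence, amber_threshold=0.6, green_threshold=0.85):
    """Three-class colour-coded scheme based on two threshold levels."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence is assumed to lie between 0 and 1")
    if confidence >= green_threshold:
        return "green"
    if confidence >= amber_threshold:
        return "amber"
    return "red"


# Classify a set of candidate representations by their confidence levels.
confidence_levels = {"exponential": 0.92, "normal": 0.70, "pareto": 0.35}
classified = {name: traffic_light(level) for name, level in confidence_levels.items()}
```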
A third system module coordinated by the workflow manager 520 is modelling and simulation mapping module 513. This module controls how parameter values estimated by the parameter processing engine 510 are mapped into existing modelling and simulation systems, for example modelling and simulation system 440. This module may make use of modelling and simulation interface 550 that provides a set of capabilities, such as function calls, application program interfaces (APIs) or data wrappers, to interact with existing modelling and simulation systems and tools 554, 556. For example, if a modelling and simulation system uses security models defined in a particular structured programming language, the modelling and simulation mapping module 513 may write estimated parameter values to configuration files for the security models, including placing the estimated parameter values in a format that can be read by the structured programming language and used by the modelling and simulation systems and tools. In certain implementations, modelling and simulation interface 550 outputs mapped parameters to particular modelling and simulation systems. These systems may provide a programming language and general framework to represent and run executable security models. They may also provide a framework to perform Monte Carlo simulations using the aforementioned security models.
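The format expected by any particular modelling and simulation system is not specified here. Purely as an illustration, the sketch below assumes a hypothetical tool that reads its parameters from a JSON document; the layout of that document and the function name render_parameter_file are assumptions.

```python
import json


def render_parameter_file(model_name, estimates):
    """Serialise estimated parameters into a hypothetical JSON layout.

    estimates maps a parameter name to a (value, confidence) pair.
    """
    payload = {
        "model": model_name,
        "parameters": {
            name: {"value": value, "confidence": confidence}
            for name, (value, confidence) in sorted(estimates.items())
        },
    }
    return json.dumps(payload, indent=2)


# Example: map two estimated parameters for a hypothetical patching model.
text = render_parameter_file(
    "patching",
    {"patch_rate": (0.0465, 0.91), "loss_probability": (0.03, 0.88)},
)
```

A mapping module for a different tool would replace only the serialisation step, leaving the estimation of the parameter values unchanged.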
A fourth system module coordinated by the workflow manager 520 is modelling and simulation simulation module 514. This module controls interactions between the mapping system and modelling and simulation systems. This may be achieved using modelling and simulation interface 550. For example, the modelling and simulation simulation module 514 may instruct a particular modelling and simulation system or tool 554, 556 to carry out an experimental simulation, using a selected security model and estimated parameter values from the parameter processing engine 510. In certain implementations, the modelling and simulation simulation module 514 uses the modelling and simulation mapping module 513 to place estimated parameter values in the correct form for simulation using a selected security model. A security model may be selected by a configuration user 502, for example using configuration and management module 545, or may be selected based on the suitability of a security model following analysis, for example a security model that best fits measured and/or pre-processed data may be selected. In certain cases, if analysis shows that existing security models do not accurately model the data, a new security model may be generated based on a new or revised underlying mathematical representation. This new security model may then be used in the instructed simulation. The results of experimental simulations are stored in a results database 552.
A fifth system module coordinated by the workflow manager 520 is experimental outcome module 515. This module processes experimental results from simulations performed by one or more modelling and simulation systems or tools. The experimental outcome module 515 is arranged to retrieve experimental results stored in results database 552, in certain cases via modelling and simulation interface 550. The experimental outcome module 515 processes the experimental results so that, in one case, they can be displayed to an operational user 504 using display 532 and dashboard interface 530. This may involve processing the experimental results such that they can be graphically rendered, e.g. charted. It may also involve statistical summaries of the experimental results. Graphical rendering is performed by graphical rendering module 522. The graphical rendering module 522 is configurable and expandable based on the types of graphical results that might be required over time. For example, a “plug-in” or modular approach similar to that used for the plug-in processors 562, 564, 556 may be used. The graphical rendering module 522, and the display of experimental results in general, may be configured by a configuration user 502 via the configuration and management module 545.
Dashboard interface 530 provides an interface for the display of data to operational users. Operational users may be, amongst others, computer security professionals, members of a security operations centre, managers within the organisation associated with the computing infrastructure, business managers, governance managers, decision makers, risk assessors and other persons that are involved with the operation of the organisation. The dashboard interface 530 may provide a graphical framework, for example using a web-centric programming language or user interface libraries, to enable the display of information on display 532. This graphical framework may enable the modular display of the output of the graphical rendering module 522. It may also enable the output of a historical results module 526 to be displayed. Historical results module 526 enables display of historical simulation outcomes, for example previous experiment results from results database 552. This enables current estimated parameter values based on data sources 572 to 578 to be compared with parameter values estimated from simulations. These comparisons and historical results may be graphically displayed as output 536. Comparative analysis, possibly involving historical analysis, may be performed between a particular organisation and/or infrastructure and aggregated and/or anonymised data for a number of organisations and/or infrastructures. For example, this comparative analysis may be performed by a security operations centre that monitors a plurality of computing infrastructures for a plurality of organisations or customers. A result navigation module 524 enables operational users 504 to interact with displayed results, for example returning more detailed results when a user clicks on displayed data or switching between different graphs or chart types. In certain implementations, a user is able to define thresholds for the display of security metrics.
One advantage of examples described herein is that operational users 504 can review one or more of: security metrics based on historical ‘real-world’ data measured from the computing infrastructure; security metrics based on predictions developed using simulations; security metrics based on historical simulation data, e.g. previous predictions or results from simulations performed in the past; and security metrics based on different combinations of this data. A trends module 534 may be provided that displays how values of one or more security metrics vary with time (i.e. time-series data). The trends module 534 may be configured to use security metric values based on historical ‘real-world’ data for past values and security metric values based on simulations that use estimated parameter values for future values. For example, the security metric may be the percentage of computing devices in the computing infrastructure that have been patched after thirty days. Measurements from data sources may be used to calculate this metric for past data, for example over the last six months. The same measurements may also be processed by the parameter processing engine 510 to determine estimated parameter values for a fitted take-up curve, e.g. to determine values for parameters in a probability distribution equation that defines a take-up curve. The estimated parameter values may then be used in simulations that repeatedly take random samples based on the probability distribution equation, for example Monte Carlo simulations. These simulations may then be used to estimate future values of the security metric, e.g. each day for the next six months may be taken as an independent trial. If values for certain parameters within a mathematical representation are varied, e.g. those relating to possible changes to the computing infrastructure, while other parameters retain their estimated values, accurate “what-if” scenarios can be explored, with the predicted changes in security metric values displayed together with security metric values based on actual collected data from the past.
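The patching example may be sketched as follows, under the assumption (made only for this illustration) that the take-up curve is exponential, so that patch delays are exponentially distributed with a rate estimated from the mean observed delay. The observed delays, the device and trial counts and the function names are hypothetical.

```python
import random
import statistics


def fraction_patched_within(delays_days, limit_days=30):
    """Security metric: fraction of devices patched within limit_days."""
    return sum(d <= limit_days for d in delays_days) / len(delays_days)


def simulate_metric(rate, devices, trials, rng, limit_days=30):
    """Monte Carlo estimate of the metric for an exponential take-up curve."""
    results = []
    for _ in range(trials):
        delays = [rng.expovariate(rate) for _ in range(devices)]
        results.append(fraction_patched_within(delays, limit_days))
    return statistics.fmean(results)


# Past value of the metric, from hypothetical measured patch delays in days.
observed = [3, 7, 12, 18, 25, 31, 40, 9, 15, 55]
past_metric = fraction_patched_within(observed)

# Estimated parameter: rate of the assumed exponential take-up curve.
rate = 1.0 / statistics.fmean(observed)

rng = random.Random(0)
baseline = simulate_metric(rate, devices=200, trials=200, rng=rng)
# "What-if" scenario: a proposed change doubles the patching rate.
what_if = simulate_metric(rate * 2, devices=200, trials=200, rng=rng)
```

In this sketch past_metric would be plotted for past dates, while baseline and what_if would be plotted as predicted future values under the current and the proposed configuration respectively.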
Further details of some of the functions of the mapping system will now be described, with reference to a number of examples.
The data shown in
A number of exemplary user interfaces will now be described so as to illustrate some of the functions of the mapping system. These user interfaces may comprise, for example, one possible implementation of the configuration interface 540 to enable a user to configure the parameter processing engine 510, plug-in processors 562, 564 and 556, and data sources 572 to 578 by way of the configuration and management module 545 and parameter processing configuration module 511. These exemplary user interfaces are provided to facilitate explanation of certain features of certain examples; they are not to be seen as limiting. Implementations may use different user interfaces and these may offer configuration and display options that differ from those shown in these examples. Furthermore, any applied graphical user interface may be changed or developed with successive versions of an implementation. They may or may not be used with the previously described examples of
In this example, data sources may be one of two types: primary data sources and derived data sources. Primary data sources are sources of raw data directly provided from the computing infrastructure or by external systems, such as monitoring systems, audit applications, threat management applications and external databases. Examples of primary data sources comprise: a list of patched systems in a given time period along with the patch identifiers; and a list of people joining or leaving an organisation in a given time period along with the user identifiers. Derived data sources are sources of intermediate data obtained by processing data contained in raw data sources and/or other derived data sources. Derived data sources are distinguished from data generated by SIEM solutions in that derived data sources provide data that is processed to provide relevant data sets that support data analysis such as statistical assessment. For example, a derived data source may correlate a list of patched systems and their patch times with an approval date to deploy a patch and additional information about the patch, such as data defined by one or more Common Vulnerability Scoring System (CVSS) databases. The data shown in
Data sources may be added using data source configuration panel 720. For example, a primary data source may be added to the configuration using control 722 and a derived data source may be added to the configuration using control 724. Once a data source has been added it is shown as being available in available data sources panel 715. Panels 725, 735 and 740 and panels 745, 750 and 755 respectively enable further configuration of a selected primary and derived data source. Panels 725 and 745 allow for configuration of a selected data source, for example the selection of a physical or logical database, whether data should be appended to an existing data source, whether a header should be read etc. Panels 735 and 750 allow for configuration of the sampling frequency of a data source, for example whether data is taken from the data source every n seconds, minutes, hours etc. In the present example, the original data source is not modified, as the data source may be required for the successful operation of the computing infrastructure. Hence, data from a data source is sampled and stored in a buffer file or table. This buffer file or table further allows for the aggregation of data over a pre-determined time period. In certain implementations the data collected from both raw and derived data sources is stored in a SQL database, with tables automatically created and managed by the mapping system. Specifically, in this case, the manipulation of data sources and the definition of “derived data sources” are managed by means of explicit, annotated SQL queries. Programming scripts and/or graphical programming approaches may also be used in a similar manner. For example, visual programming that uses “query by example” may be used, wherein a user graphically selects suitable data. Panels 740 and 755 enable a subset of available fields from a data source to be selected.
For a derived data source, panel 755 also allows for correlations between different data sources to be configured. The mapping system manages the synchronisation between raw data sources and derived data sources using dependency relationships. A data feed control panel 760 is also provided to start and stop the feeding of data from the configured data sources.
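By way of illustration only, the definition of a derived data source by means of an annotated SQL query might be sketched as follows. The sketch uses Python with an in-memory SQLite database; the table names (`patched_systems`, `patch_approvals`), the column names, the view name and the sample rows are all hypothetical and are not taken from the described implementation. It correlates a list of patched systems with patch approval dates to obtain a time-to-patch data set.

```python
import sqlite3

# Hypothetical buffer tables holding data sampled from two primary data sources.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patched_systems (system_id TEXT, patch_id TEXT, patched_on TEXT);
    CREATE TABLE patch_approvals (patch_id TEXT, approved_on TEXT);
""")
conn.executemany(
    "INSERT INTO patched_systems VALUES (?, ?, ?)",
    [("ws-01", "P-100", "2012-01-10"),
     ("ws-02", "P-100", "2012-01-15"),
     ("ws-01", "P-101", "2012-02-03")],
)
conn.executemany(
    "INSERT INTO patch_approvals VALUES (?, ?)",
    [("P-100", "2012-01-05"), ("P-101", "2012-02-01")],
)

# Derived data source: whole days between patch approval and deployment.
# The primary tables are only read, never modified.
conn.execute("""
    CREATE VIEW time_to_patch AS
    SELECT s.system_id,
           s.patch_id,
           CAST(julianday(s.patched_on) - julianday(a.approved_on) AS INTEGER)
               AS days_to_patch
    FROM patched_systems AS s
    JOIN patch_approvals AS a ON a.patch_id = s.patch_id
""")

rows = conn.execute(
    "SELECT system_id, patch_id, days_to_patch FROM time_to_patch "
    "ORDER BY patch_id, system_id"
).fetchall()
print(rows)
```

Defining the derived data source as a view means it is recomputed from the buffered primary data whenever it is read, which is one simple way of expressing the dependency relationship between raw and derived data sources.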
As illustrated by the example of
First a parameter is selected. Project and model details are displayed in panel 810. In certain implementations, a control is provided that lists all parameters for the selected security model. This list may be provided by parsing a structured language file that defines the security model as described above. In other implementations no project may be selected; for example, a list of stand-alone security metrics to estimate may be provided, these metrics relating to a particular organisation or being common to multiple organisations. Once a parameter is selected, a user specifies its assumed parameter type. In the example of
Panel 910 in
The exemplary implementation of
The interface of
As seen in
Following parameter estimation, certain examples of the mapping system enable a user to determine how to use estimated parameter values. In one implementation a user is provided with the option to use the new estimated parameter values by injecting them into a predefined security model used by a modelling and simulation system, replacing any previous assumptions of the parameter values. For example, the mapping system may be used to replace previously assumed values in the security models described with regard to
Further defined interfaces may also enable a user to configure the translation of estimated parameter values to a format or notation suitable for use in one or more security models. In general, the mapping system allows a user to have full control of the overall process. The level of interaction required from a user may vary according to the implementation: in some implementations a user may decide to configure and supervise the entire parameter estimation process, for example in an iterative manner; in other implementations the parameter estimation process may be automated, with a user setting initial configuration information. Any combination of the two approaches may be applied. In an automated case, parameter estimation may be regularly scheduled, for example to calculate new parameters based on new collected data every month, quarter or year. The mapping system may be configured such that parameter values within a security model are only automatically replaced based on a threshold comparison of the calculated confidence level. Automated reports for operational users that present security metrics based on estimated parameter values may also be regularly scheduled as an output phase of regular parameter processing.
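As a minimal sketch of the threshold comparison described above, the following Python example replaces an assumed parameter value only when the confidence level of the new estimate meets a configured minimum. The `Estimate` class, the `apply_estimates` function, the parameter names and the threshold of 0.9 are hypothetical, and confidence is assumed here to be expressed as a value between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    name: str
    value: float
    confidence: float  # assumed to lie in [0, 1]


def apply_estimates(model_params, estimates, min_confidence=0.9):
    """Return a copy of model_params with sufficiently confident estimates applied."""
    updated = dict(model_params)
    for est in estimates:
        # Only overwrite parameters the model already defines, and only
        # when the estimate meets the confidence threshold.
        if est.name in updated and est.confidence >= min_confidence:
            updated[est.name] = est.value
    return updated


assumed = {"mean_days_to_patch": 30.0, "weekly_incident_rate": 2.0}
estimates = [
    Estimate("mean_days_to_patch", 12.5, confidence=0.95),
    Estimate("weekly_incident_rate", 3.1, confidence=0.60),
]
updated = apply_estimates(assumed, estimates)
print(updated)
```

In this sketch the first estimate replaces the assumed value, while the second is rejected and the previous assumption is retained.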
Once parameter estimates are inserted or injected within a security model for a modelling and simulation system or tool, either the user or the mapping system can start a simulation step. The modelling and simulation system may form part of the mapping system or may be separate. The simulation step may be based on predefined sampling frequencies as defined in configuration information. In one implementation, the modelling and simulation system uses the underlying mathematical representation to carry out Monte Carlo trials. In this case, the experimental outcome of these trials may be processed and displayed to a user based on predefined graphical templates. Specifically, the mapping system interfaces with these systems or tools to start the simulation activity, for example via a function call or API. In the implementation of
In summary, the described examples relate to the management of a computing infrastructure. In particular, but not exclusively, the examples relate to the generation of security metrics for use in management of the computing infrastructure, the generation of the security metrics being based on security data derived from the computing infrastructure.
Certain examples solve a problem of how to improve risk assessment and support for decisions associated with the security of a computing infrastructure within organisations. To be effective, this risk assessment needs to factor in one or more of organisation processes, people behaviours, critical systems and business solutions. In the past, security risk assessment and decision support has been considered separately from the day-to-day management and control of the computing architecture. This has led to a knowledge “gap”, i.e. high-level decisions concerning the structure and configuration of a computing infrastructure, for example whether to use this or that system or whether to restrict the coupling of personal mobile devices to an organisation's networks, are made without considering an actual or predicted behaviour of the computing infrastructure based on data recorded from the computing infrastructure itself. It is also difficult to convey information regarding security threats to strategic personnel, for example, information based on day-to-day computing infrastructure operations. There is also a lack of consistency regarding language and measurements. For example, risk assessment activities to identify threats and mitigate them with suitable policies and controls have been, in the past, business-driven and made on an ad-hoc basis by outside consultants or based on an audit when something goes wrong. This is very expensive, complex to achieve and at best only provides an untraceable snapshot of the operation of a computing infrastructure. On the other hand, organisations typically invest in monitoring and security information and event management (SIEM) solutions to collect large amounts of information from their computing infrastructure, for compliance and governance purposes. However, these SIEM solutions are driven by day-to-day, low-level operational concerns, i.e. via a bottom-up approach driven by computing objectives; they are not used for high-level decision making in relation to the computing infrastructure.
Certain examples address this problem by introducing a mapping system that makes use of information collected from various data sources, including SIEM solutions and threat management systems. Following appropriate pre-processing, the mapping system analyses this information to provide estimated values for parameters in a security model, the security model in turn being based on one or more mathematical representations. In certain cases, the security model may comprise a mathematical representation or probability model; in other cases the security model may be complex, i.e. model multiple sub-processes and be generated by a modelling and simulation system. In certain implementations, the mapping system also transforms the estimated parameter values for use in a security model that forms part of a modelling and simulation system or tool, for example by overwriting previously assumed parameter values in data files used by external modelling and simulation software. Any security model with updated parameters based on the estimated parameter values may be used to generate security metrics that can inform decisions concerning the security of the computing architecture, for example in security risk assessment and decision support at the business and technical security level.
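The estimation of a parameter value from pre-processed data might be sketched as follows. Purely as an assumption for illustration, the time taken to patch a system is modelled here with an exponential distribution; the description does not prescribe this choice. The function name `fit_exponential` and the sample data are hypothetical. The rate is estimated by maximum likelihood, and an approximate 95% confidence interval is derived from the asymptotic standard error of that estimate.

```python
import math


def fit_exponential(samples):
    """Maximum-likelihood fit of an exponential distribution.

    Returns the estimated rate and an approximate 95% confidence interval
    using the asymptotic standard error rate / sqrt(n). The normal
    approximation is crude for small sample sizes.
    """
    n = len(samples)
    total = sum(samples)
    if n == 0 or total <= 0 or any(x < 0 for x in samples):
        raise ValueError("need non-negative samples with a positive sum")
    rate = n / total  # MLE: reciprocal of the sample mean
    half_width = 1.96 * rate / math.sqrt(n)
    return rate, (rate - half_width, rate + half_width)


# Hypothetical pre-processed data: days between patch approval and deployment.
days_to_patch = [5, 10, 2, 7, 6]
rate, (low, high) = fit_exponential(days_to_patch)
print(f"rate={rate:.4f} per day, mean={1 / rate:.1f} days, "
      f"95% CI=({low:.4f}, {high:.4f})")
```

The width of the interval gives one possible basis for a confidence measure: more collected data narrows the interval around the estimated value.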
Hence, certain examples have an advantage of providing a technical system and method to bridge the gap existing between higher-level risk assessment, e.g. relating to system-wide properties of a computing infrastructure and specified user behaviour, and lower-level governance and compliance, e.g. based on logging and monitoring systems.
Certain examples also address the problem of accommodating the variations in organisations and security threats when implementing a security solution. Certain examples address this problem by providing objective measurements of the nature of a computing infrastructure that can inform decisions and prevent ill-advised choices and configurations. Simulations can use estimated parameter values based on actual collected data to ensure accuracy and relevance. This also avoids potential errors that may be introduced using other approaches, for example where data is collected manually by experts or consultants based on interviews and the like.
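A simulation that uses estimated parameter values might, as a rough sketch, take the form of the Monte Carlo trials below. The model is hypothetical and deliberately simple: the time to patch a vulnerability and the time until an exploit appears are both assumed to be exponentially distributed, and the security metric is the estimated probability that an exploit appears before the patch is deployed. The function name, the rates and the number of trials are illustrative assumptions only.

```python
import random


def estimate_exposure_probability(patch_rate, exploit_rate,
                                  trials=100_000, seed=0):
    """Estimate the probability that an exploit arrives before the patch."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    exposed = 0
    for _ in range(trials):
        time_to_patch = rng.expovariate(patch_rate)
        time_to_exploit = rng.expovariate(exploit_rate)
        if time_to_exploit < time_to_patch:
            exposed += 1
    return exposed / trials


# Hypothetical rates: mean of 6 days to patch, mean of 30 days to exploit.
p = estimate_exposure_probability(patch_rate=1 / 6, exploit_rate=1 / 30)
print(f"estimated exposure probability: {p:.3f}")
```

For this particular model the result can be checked analytically, since the probability equals exploit_rate / (exploit_rate + patch_rate), which is 1/6 for the rates shown. More complex security models generally have no such closed form, which is where simulation becomes useful.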
There is also an advantage of supporting an ongoing assessment of security risks and computing infrastructure operation. For example, information can be continuously collected from raw or derived data sources, allowing parameter values to be estimated for any time period from a current time. This can be contrasted with approaches that require manual ad-hoc recording for a specified time period or SIEM solutions that provide real-time alerts but do not record information that may be used to determine longer-term historic trends and patterns. The approach of the described examples also factors in changes to the computing infrastructure. This may not be the case with one-off audits or reports. For example, if a number of computer workstations are added to a computing infrastructure this will be incorporated into the collected data and hence into estimated parameter values. However, assumed parameters derived from a one-off analysis of real-time data six months previously may be used for the expanded system without thought, giving a skewed insight into operation that can lead to business and technical problems.
Another advantage is that security metrics and/or estimated parameter values for any number of specified time periods can be compared. For example, by providing an indication of how security metrics and/or estimated parameter values change over time, users can see how the behaviour of a computing infrastructure evolves over time. Security metrics for different users, systems, computing infrastructures and organisations may also be compared using benchmarking functionality. It also enables users within an organisation to identify trends of relevance for security risk assessment and decision support. Through the inclusion of simulation results, time-series data may illustrate historic and predicted security metric values, allowing a complete view of trends and enabling users to identify potential problems.
Another advantage is that security metrics can be linked to the actual collected data, for example for auditing or reference purposes. For example, if an operational user clicks on a displayed security metric or chart, they may have the option to view the estimated parameter values, pre-processed data and/or raw data the metric is based on. The configuration information for the mapping system may comprise a relational database that associates the various data records and metrics.
The mapping system also has an advantage of being suitable for provision as a software service. Through the mapping system's modular approach, it may receive data from a remote computing infrastructure and/or inject estimated parameter values into remote security analytic tools. For example, the mapping system may be implemented on a server remote from an organisation's computing infrastructure, whereas SIEM solutions and/or security analytic tools may be operated by a security operations centre within the organisation. In implementations where the mapping system has access to estimated parameter data for multiple organisations, it may be arranged to output a score for a particular organisation in relation to other ones of the multiple organisations. For example, if all organisations use a common security model then the estimated parameter values, or security metrics generated based on said estimated parameter values, may be compared. Anonymised data may be used to respect the privacy of each organisation.
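One possible form for such a score, given only as a sketch, is the percentage of other organisations whose value for a common security metric is worse than that of the organisation in question. The function name `benchmark_score`, the choice of metric and the peer values below are hypothetical; the peer values are assumed to be anonymised, so that only the values themselves, and not the identities of the organisations, are available.

```python
def benchmark_score(own_value, peer_values, lower_is_better=True):
    """Percentage of peers whose metric value is strictly worse than own_value."""
    if not peer_values:
        raise ValueError("at least one peer value is required")
    if lower_is_better:
        worse = sum(1 for v in peer_values if v > own_value)
    else:
        worse = sum(1 for v in peer_values if v < own_value)
    return 100.0 * worse / len(peer_values)


# Hypothetical anonymised metric: mean days to patch for four other organisations.
peers = [4.0, 9.5, 12.0, 20.0]
score = benchmark_score(6.0, peers)
print(score)
```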
The above examples are to be understood as illustrative examples of the invention. Further examples of the invention are envisaged and certain variations have been discussed in the description above. It is to be understood that any feature described in relation to any one example may be used alone, or in combination with other features described, and may also be used in combination with one or more features of any other of the examples, or any combination of any other of the examples. Furthermore, equivalents and modifications not described above may also be employed without departing from the scope of the invention, which is defined in the accompanying claims.
Claims
1. A system for analysing a computing infrastructure, comprising:
- at least one data pre-processing module to receive security data from one or more data sources over a time period and to pre-process said security data, the security data comprising at least data relating to the operation of the computing infrastructure; and
- a parameter processing engine to perform data analysis on pre-processed data from the at least one data pre-processing module and determine, based on the data analysis, values for one or more parameters of a security model, the security model comprising at least a mathematical representation of a process relating to security of the computing infrastructure.
2. The system of claim 1, wherein the at least one data pre-processing module comprises at least one of:
- a data buffer to buffer the security data over the time period;
- a command processor to perform one or more operations using a data processing language; and
- a derived data source generator to generate pre-processed data based upon security data received from two or more data sources.
3. The system of claim 1, wherein the one or more data sources comprise one or more of:
- a security monitoring system for the computing infrastructure;
- a database identifying at least one of possible vulnerabilities, possible threats, or a combination thereof for the computing infrastructure;
- a security audit database for the computing infrastructure; and
- one or more log files for the computing infrastructure.
4. The system of claim 1, wherein the parameter processing engine comprises one or more of:
- a data processor to fit the pre-processed data to one or more mathematical representations, including deriving fitted parameter values for said one or more mathematical representations;
- a confidence processor to determine confidence values for one or more mathematical representations fitted by the data processor;
- a comparator to compare one or more determined values of the parameters with equivalent stored values of the parameters; and
- a data analyser to determine one or more statistical measures representative of the pre-processed data.
5. The system of claim 1, wherein the parameter processing engine comprises a data processor to determine a mathematical representation that best fits the pre-processed data, said mathematical representation being selected as the mathematical representation for the process relating to the security of the computing infrastructure.
6. The system of claim 1, comprising:
- a simulator to perform one or more simulations using the security model and the values of the one or more parameters so as to generate predicted security metric values.
7. The system of claim 6, comprising:
- a display interface to output time-series data for one or more security metrics based on the results of the simulator and the parameter processing engine, the time-series data comprising one or more of past and future time values.
8. The system of claim 1, comprising:
- a graphical user interface to display a graphical representation of one or more security metrics, the one or more security metrics being generated based on the determined values for the one or more parameters of the security model.
9. The system of claim 1, wherein the mathematical representation comprises at least one discrete-event process model, discrete events being modelled using one or more probability distributions, the one or more parameters comprising parameters within said probability distributions.
10. The system of claim 1, comprising:
- a parameter injector to write the values of the one or more parameters determined by the parameter processing engine to a data file.
11. The system of claim 10, wherein the parameter injector replaces one or more previous values of said one or more parameters in the data file.
12. A method of analysing a computing infrastructure, the method comprising:
- receiving security data from one or more data sources over a time period, the security data comprising at least data relating to the operation of the computing infrastructure;
- pre-processing the security data to produce pre-processed data; and
- performing data analysis of said pre-processed data to determine values for one or more parameters of a security model, the security model comprising at least a mathematical representation of a process relating to security of a computing infrastructure.
13. The method of claim 12, wherein the step of pre-processing the security data comprises at least one of:
- buffering the security data over the time period;
- performing one or more operations using a data processing language; and
- generating pre-processed data based upon security data received from two or more data sources.
14. The method of claim 12, wherein the data sources comprise one or more of:
- a security monitoring system for the computing infrastructure;
- a database identifying at least one of possible vulnerabilities, possible threats, or a combination thereof for the computing infrastructure;
- a security audit database for the computing infrastructure; and
- one or more log files for the computing infrastructure.
15. The method of claim 12, wherein performing data analysis of said pre-processed data comprises one or more of:
- fitting the pre-processed data to one or more mathematical representations, including deriving fitted parameter values for said one or more mathematical representations;
- determining confidence values for one or more mathematical representations fitted to the pre-processed data;
- comparing one or more determined values of the parameters with equivalent stored values of the parameters; and
- determining one or more statistical measures representative of the pre-processed data.
16. The method of claim 12, wherein the step of performing data analysis comprises:
- determining a mathematical representation that best fits the pre-processed data, said mathematical representation being selected as the mathematical representation for the process relating to the security of the computing infrastructure.
17. The method of claim 12, comprising:
- performing one or more simulations using the security model and the values of the one or more parameters so as to generate predicted security metric values.
18. The method of claim 17, comprising:
- using the results of the one or more simulations and the results of the data analysis to output time-series data for one or more security metrics, the time-series data comprising one or more of past and future time values.
19. The method of claim 12, comprising:
- generating a security metric using the determined values for the one or more parameters; and
- comparing the security metric with one or more other security metrics generated based on data for other computing infrastructures.
20. The method of claim 12, wherein the mathematical representation comprises at least one discrete-event process model, discrete events being modelled using one or more probability distributions, the one or more parameters comprising parameters within the probability distributions.
Type: Application
Filed: Feb 22, 2012
Publication Date: Nov 13, 2014
Inventors: Marco Casassa Mont (Bristol), Yolanta Beresnevichiene (Bristol), Shane Sullivan (Bristol), Richard Brown (Bristol)
Application Number: 14/371,610
International Classification: G06F 21/50 (20060101);