Automating Metrics Calculations

Dr. Mohit Bansal
Published: 26-10-2017
The previous chapters tackled some of the more theoretical concepts related to security metrics: why we ought to be measuring security, and what sorts of things we ought to measure. This chapter’s intent is more practical: to describe how to gather the data we are looking for. Because much of the data we seek are, in most organizations, stored inside a vast array of databases, system logs, spreadsheets, and brains, any discussion of “how” must discuss the mechanical processes that enable us to gather data on a large scale. Thus, this chapter concerns itself with one thing: automation.

Automation, in the context of scorecards and metrics, has many benefits but can deliver them only when the associated processes and systems are well defined—specifically, well enough understood to tell a machine how to collect data, compute and communicate information, and—we hope—get us closer to the goal of insight.

Merriam-Webster defines automation as “the controlled operation of an apparatus, process, or system by mechanical or electronic devices that take the place of human organs of observation, effort, and decision.” In our narrative, we will focus mostly on drilling into the issue of “organs of observation and effort.” Toward this end, we will discuss the following topics:

• The benefits of automating metrics computation
• Functional requirements for an effective and efficient automation system
• Logical and physical models of a metrics automation system
• The technologies and interfaces between the software that automates metrics computation and the rest of the security environment
• Phases in the life cycle of an automation program
• The role automation can play, and when it should be tempered with human intervention

To illustrate the process of implementing security metrics automation, at the end of the chapter we discuss a disguised example of one company’s experience with setting up an
automated metrics management system.

AUTOMATION BENEFITS

I have mentioned the virtues of automation in earlier chapters. Good automation delivers the following major benefits:

• Accuracy: The collection, computation, and communication processes are executed precisely to specification.
• Repeatability: Any result can be reproduced, thereby enhancing trust that the measurements and scores are not biased or erroneous.
• Increased measurement frequency: Having computers do the work instead of humans shortens operations that would normally take a long time.
• Reliability: Operations that would normally be error-prone or tedious are more efficient and predictable when computers, instead of humans, perform them.
• Transparency: The automation steps used to derive the metrics are readily apparent and accurately documented.
• Auditability: The processing associated with each metric, as well as any revisions to its definition, is recorded and can be reviewed by authorized auditors.

An explanation of each benefit follows.

ACCURACY

Computers excel at following instructions. Automation alone does not automatically result in accuracy, however. The trick, as alluded to by the quote from the anonymous developerdude at the start of this chapter, is to understand enough about the desired solution to faithfully translate that knowledge into correct instructions—in the form of object models, work flows, and software. Once the correct instructions exist, computers can effectively and efficiently carry them out over and over again.

Accuracy is a prerequisite for trust. And trust is a prerequisite for effectively leveraging what metrics are telling you. Betsy Nichols, the CTO of Clear Point Metrics, relates a story about a meeting she attended whose purpose was to go over a metrics spreadsheet that had been created by one of the participants. Everyone received a hard copy and began to scrutinize the information it contained.
After a few minutes, one of the participants pointed to cell M43 and said, “This doesn’t look right,” at which point the author appeared to be stuck for an answer. This was enough to derail the rest of the discussion—the credibility of virtually every cell in the spreadsheet came under question. The meeting deteriorated into a debate about how the spreadsheet was computed as opposed to what insight it was designed to facilitate. Trust disappeared and did not come back.

Avoiding loss of trust takes two key ingredients. The first is consensus regarding the data, computations, models, assumptions, and possibly even the publishing format of results. The second is high-fidelity automation that faithfully captures the essential performance attributes of a process. Automation cannot deliver consensus on the how of metrics or scorecard definition, but it can lend structure to the discussion and rigor to the specifications. Trust flows from accuracy once everyone agrees how the metrics should be created and is confident that automation can carry out the measurement process in the expected manner. After establishing that the measurement process is accurate, subsequent discussions become about what, not how.

REPEATABILITY

Repeatability is another key to trust. If two measurements of the same target consistently yield the same result, faith in the veracity of the measurement—both the technique and the result—is sustained. But lack of confidence in the repeatability of a measurement can destroy trust in an instant.

When most people think about automation, repeatability is the thing that gets them excited. Not just because it saves some poor soul from an otherwise thankless task—that is plain enough—but also because it eliminates the “middleman.” By automating the measurement of security processes, one can go directly to the primary source of data, such as a server, firewall, or employee directory.
One can also pull data from a secondary source that derives its data from multiple primary sources, such as a vulnerability management system.

Repeatability eliminates subjectivity—the enemy of any good metrics program. One enterprise we know implemented a metric regarding adherence to a policy that required shredding of sensitive documents. Their measurement technique consisted of gathering all managers and asking them to grade themselves on how well they felt they had shredded: A, B, C, D, or F. The criteria for receiving a grade tended to change across “grading periods,” and there was no way to check whether the managers were consistently grading themselves. Subjectivity—that is, the individual bias, mood, mental status, or state of caffeine deprivation of each manager—was baked into each grade, and nobody had any way of distilling it out. Such arbitrariness and lack of consistent scoring criteria are eliminated when data-gathering processes are automated.

But repeatability is not just about making a manual process go faster or without human interaction. Repeatability must also be sensitive to the fact that many security processes are dynamic and that the underlying data change in real time. (One can argue that the attributes of a security process that are the most interesting are the dynamic ones.) Change is the enemy of repeatability. So the key to repeatability in automation is to combine the real-time operation of taking a measurement with the ability to remember when the result was measured. Given memory of time-stamped measured values, automation can ensure that repeatability of historical metrics computation is achieved.

In short, we can define metrics “repeatability” as “calculating something using the same data sources and data—over a similar sample period—and arriving at the same result.” But there are often practical limits to repeatability.
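The “memory of time-stamped measured values” described above can be sketched in a few lines. This is a minimal illustration, not any product’s design; the in-memory list stands in for what would be a database table in a real system, and the metric name is hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical in-memory store of time-stamped metric samples.
samples = []

def record_measurement(metric_name, value):
    """Take a measurement now and remember when it was taken."""
    samples.append({
        "metric": metric_name,
        "value": value,
        "measured_at": datetime.now(timezone.utc),
    })

def history(metric_name):
    """Reproduce every historical result for a metric, in order."""
    return [(s["measured_at"], s["value"])
            for s in samples if s["metric"] == metric_name]

record_measurement("open_vulns", 42)
record_measurement("open_vulns", 38)
assert [v for _, v in history("open_vulns")] == [42, 38]
```

Because each sample carries its measurement time, any historical metric can be re-derived later without re-querying the (possibly changed) source system.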
For example, a Clear Point Metrics customer implemented a Security Event Management system. This system calculates metrics such as “Top Ten Attackers” and “Top Ten Attacked Servers.”[1] These metrics are computed on an hourly basis. All of the raw events used to compute these metrics are thrown away after 30 days due to storage limits on the central event repository server. For a given month, the “Top Ten Attackers” or “Top Ten Attacked Servers” for a given hour can be recomputed from raw event data, but computations of these metrics at a time over one month ago are not repeatable. Does the customer view this as a critical defect? No. If it were, the company would add more memory or would redesign the data-gathering process to achieve repeatability for periods longer than a month. For this particular customer’s needs, management judged that it was not worth the effort.

[1] I discounted the value of these metrics in Chapter 3, “Diagnosing Problems and Measuring Technical Security.” That said, these metrics meant something to the customer.

In summary, automation can deliver repeatability when an adequate storage component is designed into the system. Storage adequacy is defined in terms of time and capacity to be consistent with the goals of the metrics.

INCREASED MEASUREMENT FREQUENCY

A happy by-product of automation is periodicity. The benefits of repeatedly and regularly taking measurements and publishing the results are numerous. First, a process that regularly generates a picture of “where we are” facilitates regular review. Data that are always fresh are more likely to be examined regularly. Second, obtaining regular snapshots over time helps people understand trends, uncover potential cause-and-effect relationships, and spot problems more quickly. (This is Twain’s notion of poetic history; data rhyme from time to time.)
Finally, regular and repeated observations across time can be used to establish an accurate benchmark of current status and to set realistic goals for the future.

Nichols tells of a company that was very proud of its metrics program. Every six months, the security team embarked on a massive manual data-gathering exercise to collect a long list of metrics. The fruit of their labor was a report full of scorecards and metrics, presented at the end of the period. Although this sounds commendable, it is worth noting that all of their information was six months stale as soon as it became available. Certainly, a cycle time of six months is better than a cycle time of one year. Any information is better than none—usually. However, most companies believe that monthly cycle times are a minimum requirement for most strategic planning—and certainly for tactical and operational planning. In some cases, weekly or daily cycle times are required. Thus, although this company may have had the most ingenious, most insightful security measurement program ever conceived, the staleness of the data diluted its value.

When automation can generate a metric automatically, the process under measurement and the objectives of the metrics dictate the cycle time (or measurement sampling frequency), rather than the capacity of humans to perform the work manually. This is almost always a primary driver to automate—to achieve suitable sampling rates with affordable consumption of human labor.

When processes are particularly volatile, the need for increased sampling frequency becomes even more critical. The Nyquist-Shannon sampling theorem from information theory tells us that when a process changes at a steady frequency of F, one must take measurements (or samples) of the process at a rate of at least twice this rate (2F) to perfectly model the original process from the samples.[2] For instance, a process that changes four times per day must be sampled at least eight times per day to capture every change. If one is not interested in modeling absolutely positively every change, one can measure at a much lower frequency. But, of course, not all processes change at a constant frequency—far from it. In this case it is often desirable to understand not only the average rate of change in a target process but also the variation in that rate. A high variation in the rate of change indicates that the process may have distinct bursts of change followed by relative stability. This requires sampling rates that are at least twice the highest possible (or burst) change frequency.

In short, automation enables humans to dial up the frequency of measurement processes. For volatile processes, organizations gain additional insights by having the capacity and flexibility to change sampling frequency or adjust measurement cycle time.

[2] See C. E. Shannon, “Communication in the Presence of Noise,” Proceedings of the Institute of Radio Engineers, vol. 37, no. 1, pp. 10-21, Jan. 1949.

RELIABILITY

Reliability refers to the consistent operation of a metrics collection process over time. Consistent, repeated processes in line with specifications increase trust because they allow analysts to derive conclusions from collections of trustworthy results. Reliability guarantees that results are gathered regularly following an agreed-upon schedule, despite the inevitable roadblocks that, without automation, might cause the measurement effort to be postponed or canceled.

Well-managed manual processes can also be highly reliable, but automation often brings a higher level of assurance. Reliability is enhanced via resiliency and failover features built into the automation software. Automation can also automatically retry failed steps or interpolate to fill in missing data.

TRANSPARENCY

I have written about the importance of transparency in previous chapters—particularly with respect to scorecard design. Most humans who read scorecards with a critical eye want to know how the mystery numbers were calculated.
Thus, scorecards whose methodologies are relatively transparent help increase the level of understanding and acceptance. This key principle of good scorecard design applies equally to automation. In the anecdote we shared earlier about the dysfunctional spreadsheet meeting, it was not just the seeming sketchiness of the results that bothered the skeptic. The lack of transparency in the spreadsheet itself contributed. After all, spreadsheet printouts display formula results, not the formulae themselves.

Of course, if the preferred metrics automation tool is a spreadsheet, simply printing worksheet formulae will not increase transparency much—they tend to be cryptic to untrained eyes. It is not always practical to publish all of a metric’s implementation specifications. However, with other automation techniques, such as an enterprise-class data-management system, one has more flexibility. For more mature automation programs, the system’s design should explicitly include mechanisms to publish and distribute metadata about a metric.

By offering an open channel for communication of metrics metadata, there is no mystery as to the details underlying each metric result or scorecard edition. Metadata should describe what data were used to calculate the metric, how they were obtained, when they were obtained, what models and assumptions were used in the computation, what errors were encountered, what version of the business logic was used, and the name of the author of the business logic. Furthermore, the metadata should be easy to read and should be delivered via browsable catalogs of metric definitions.

AUDITABILITY

Auditability applies to all phases of a metric’s life. Auditability is what guarantees that all “interesting” events in the life of a metric are memorialized in a chronological log where authorized individuals can see what happened.
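A minimal sketch of such a chronological log, combined with the visible revision identifiers discussed later in this section: every change to a metric definition is appended to an audit trail and assigned a new revision number. Metric names, logic strings, and field names here are illustrative only.

```python
from datetime import datetime, timezone

# Append-only audit trail and per-metric revision counters (sketch).
audit_log = []
revisions = {}

def record_change(metric, new_logic, author):
    """Log a definition change and assign it a visible revision id."""
    rev = revisions.get(metric, 0) + 1
    revisions[metric] = rev
    audit_log.append({
        "event": "definition_changed",
        "metric": metric,
        "revision": rev,
        "author": author,
        "logic": new_logic,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return rev

rev1 = record_change("patch_latency", "avg(close - open)", "alice")
rev2 = record_change("patch_latency", "median(close - open)", "bob")
assert (rev1, rev2) == (1, 2)
```

Because every generated result can later be tagged with the revision that produced it, an auditor can reconstruct exactly which business logic was in force at any point in time.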
Examples of “interesting” events include:

• The creation of the metric definition
• Changes or updates to the definition of a scorecard or metric—its business logic or runtime parameters
• The time and date when the metric was put into production to begin generating sample measurements
• The time and date when the scorecard was put into production to begin generating editions for distribution
• Errors encountered when computing the metric
• Changes to the schedules for computing metrics or generating scorecards
• Changes to the external systems used to provide the raw data for metric computation

Because of the pervasiveness of auditability across all phases in the life cycle of a metric, audit requirements can drive lots of decisions about the design and architecture of a metrics automation system. For example, the requirement to audit changes to the business logic associated with a metric drives features into an automation system that are very similar to those for software versioning. A repository for business logic (centralized or distributed) is required that can, at a minimum, assign visible revision identifiers to a metric and be able to associate generated results with the version that created them.

CAN WE USE (INSERT YOUR FAVORITE TOOL HERE) TO AUTOMATE METRICS?

Automation of metrics-gathering processes can bring substantial benefits. To do it right, enterprises need to select their tools with care. As with many “new” data-gathering initiatives, it is tempting to view metrics automation as a generic collection activity that can be fulfilled with generic tools: spreadsheets, business intelligence products, and security event and incident management (SIEM) software. These are not good choices, because metrics activities have specialized requirements. Let’s talk about each of these in turn.
SPREADSHEETS

In the previous discussion of automation benefits, I noted that spreadsheet printouts do not always offer much in the way of transparency because the printout shows numbers, not formulae. The astute reader might reasonably conclude that I do not think much of spreadsheets as a tool for automating metrics calculations. I do not, actually. The spreadsheet is a fabulous prototyping tool, but it’s not well suited to real automation tasks. Anybody who has forced spreadsheet software to do highly unnatural things, as I have, knows that scalability is not its strong suit. Spreadsheets are likewise limited in the areas of external connectivity and data integration, query capabilities, auditing, version control, unattended operation, and automatic exhibit generation. However, spreadsheets are a good choice for:

• Exploring data samples (using, for example, Excel’s PivotTable feature) to identify good candidate metrics
• Prototyping a new metric using data sampled from an external table or flat file
• Consolidating data gathered from questionnaires or by other manual collection methods
• Piloting a departmental metrics program

If you would like to realize the full benefits of metrics automation, spreadsheets are a poor option. You will inevitably compromise accuracy, repeatability, increased measurement frequency, reliability, transparency, or auditability. Of course, for smaller-scale metrics automation efforts, some or all of these benefits may not be needed.

BUSINESS INTELLIGENCE TOOLS

Business intelligence (BI) and data-mining tools such as SAS, Cognos, and Crystal Reports are a better alternative to spreadsheets. But these, too, have limitations. Many companies have tried to use business intelligence and data-mining tools for metrics and scorecard automation, but they concluded they were trying to fit a square peg in a round hole.
A key challenge with BI tools is that they are largely oriented toward helping business analysts perform ad hoc explorations of large data sets. They are less well suited for managing metrics collections over time, because they do not necessarily provide versioning and tracking mechanisms for business logic and metadata. For example, individuals with access to the data warehouse can effectively create and change business logic and metrics unfettered, without formal tracking of these changes. And as you might expect, unmanaged changes to formulae might lead to unreliable results—and decrease trust in the overall system.

A second limitation concerns the availability—or lack thereof—of adapters and “glue code” to extract data from external security devices and control systems. Most business intelligence tools have varying abilities to pull in data from general-purpose enterprise systems like SAP, PeopleSoft, and Documentum, and also from XML files, relational database files, and flat files. Cognos, for example, can extract data from all these sources; it also provides facilities for merging and scrubbing source data. This is great news, but it does not necessarily help you if the critical data you need reside in, say, the bowels of a Cisco router or in a particular part of your company’s Active Directory tree. In those cases, you would need to supplement the standard BI facilities with custom code.

I do not mean to bash business intelligence and data-mining software packages, because they can be tremendously powerful tools for discovering patterns in data. But keep their appropriate uses in perspective. They were not intended to serve as security metrics automation systems—at least in the ways we need them to. Most BI tools were designed to perform ad hoc data exploration and visualization, not to automate and manage collections of metrics.
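The kind of ad hoc exploration that BI tools and spreadsheet pivot tables excel at can be sketched in a few lines of plain Python. The vulnerability rows below are illustrative sample data, not any product’s schema; the point is that this rollup is easy, while versioning, scheduling, and auditing it over time is what these tools do not manage.

```python
from collections import defaultdict

# Count open vulnerabilities by business unit and severity --
# the pivot-table-style rollup used to scout candidate metrics.
rows = [
    {"bu": "Retail",   "severity": "high"},
    {"bu": "Retail",   "severity": "low"},
    {"bu": "Treasury", "severity": "high"},
    {"bu": "Retail",   "severity": "high"},
]

pivot = defaultdict(int)
for r in rows:
    pivot[(r["bu"], r["severity"])] += 1

assert pivot[("Retail", "high")] == 2
assert pivot[("Treasury", "high")] == 1
```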
SECURITY EVENT AND INCIDENT MANAGEMENT (SIEM) PRODUCTS

A third class of tool that can be used for metrics automation includes the software packages used to warehouse detailed security management information from network hardware and security operations. Often referred to as Security Event Management (SEM) or Security Incident Management (SIM) products, or more broadly as SIEM, these packages typically have enterprise-wide scope and keep detailed information about security operations. Leading SIEM vendors include ArcSight, Cisco (Protego), Intellitactics, NetForensics, and Novell (E-Security).

A typical data center is awash in configuration, fault, performance, and usage accounting data. SIEM packages are architected to handle all of these. One SIEM vendor’s product that we know of, for example, is designed to accumulate over 100GB of data per day. The reason for this is that operations personnel need to be able to sense “significant” events within seconds or minutes (at worst) of their occurrence. The result is that very high-granularity, high-frequency data is a by-product of normal data center operations. Such data can be leveraged to support strategic metrics, in addition to the minute-to-minute operational role that it supports now.

SIEM systems, in our view, have their uses but are unlikely candidates to evolve into strategic metrics automation systems. SIEM systems are excellent at many things, but they fall short on several of the things we want to see in an automation system:

• Real-time focus: Strategic metric time frames run to days, weeks, and months—rarely hours, minutes, or seconds, which is what SIEM systems typically focus on.
• Nonaggregated results: Strategic metrics focus on characterizing behavior to summarize hundreds or thousands of operational observations—not discrete events.
• Anomaly detection instead of process measurement: For strategic metrics, mean and standard deviation values are suitable quantities, whereas SIEM metrics deal with individual events or clusters of associated “sympathetic” events.
• Operational orientation: Strategic metrics tend to need information from multiple operational subsystems, while SIEM systems often focus on a specific management area such as network performance, server configuration, or event detection and correlation.
• Lack of connectivity to nonsecurity data sources: Measurements reported by SIEM systems are based on data collected by the operational system itself. In contrast, strategic security metrics tend to integrate data collected from multiple external primary or secondary data sources outside the scope of SIEM, such as HR management systems. These primary or secondary data sources are often called element management systems due to their singular focus on one relatively narrow area.
• Reporting: The focus of an operational system is to provide adequate management functionality. Publication of measurements is viewed as a reporting feature that is typically quite limited when compared with more “mainline” management features. But a strategic metrics system’s whole purpose is creation, computation, and communication of quantitative data from disparate but related sources; it is much more than mere “reporting.”

TECHNICAL REQUIREMENTS FOR AUTOMATION SOFTWARE

This section discusses requirements for both enterprise-class metrics automation software and the supporting management discipline for maximizing the value of security metrics.

The purpose of metrics automation software is to enhance the maturity of an organization’s use of measurement and analysis to transform raw data into insight.
Such a system delivers on this objective by providing a trusted environment that enforces and perpetuates regular, repeatable, and auditable metric results collection, computation, persistence, and publication. Strategies for improvement are the focus. Disputes over data quality or processing algorithm validity must be minimized, if not eliminated. The following are key functional requirements for automating metrics:

• Design environment: A graphical user interface to design and implement metrics, scorecards, and associated business logic, without programming, is required. Security analysts are the target audience for this design activity, not IT data center operations staff or software developers.
• Metrics life cycle support: A robust, standards-based environment is required to ensure that metric results are collected from authoritative sources, run on schedule, use agreed-upon computational techniques, provide traceable continuity across time, and conform to standards for regularity, accuracy, and auditable change control. This framework must simultaneously deliver trust, reliability, and scalability.
• Business context mappings: A facility for placing metric results into the context of the business and business processes under measurement is required. This is a critical prerequisite for strategic application of metric results to yield the insight necessary for improving business process effectiveness and efficiency.
• Content management work flow: Metrics are valuable content. Their associated business logic represents one form of content, while metric results and scorecard editions represent another. Both forms of content will expand as time passes—more metrics will be created, and they will be deployed into production to create more metric results and scorecard editions.
• Flexible results publication: An adaptable mechanism to support the communication of metric results in the form of scorecard editions is required.
Policies associated with metric results distribution include entitlement (who can see what), visualization (what it should look like), dissemination (e-mail, website, static, dynamic), annotation (human-generated interpretation), and subscriptions for automatically “pushed” alerts and notifications. Specifically, media such as PDF files, e-mail messages, and preexisting corporate intranets must be supported while leveraging tools already in place. “Yet another dashboard” is not a desirable solution.

As the use of metrics for security grows, I expect that certain conventions and best practices will emerge, not unlike what has happened in the Network and Enterprise Systems Management disciplines with the establishment of Service Level Agreements and universally accepted metrics for Quality of Service. Automation systems that formally manage the content associated with metrics will facilitate the establishment of “metrics exchanges” that will have the potential to drive measurement across industries and administrative boundaries. This is essential for enterprise business services that span departments, supply chains, enterprises, and business ecosystems.

To meet our functional and usage requirements, Table 7-1 shows the key technical requirements for automation of metrics in detail.

Table 7-1 Technical Requirements for Metrics Automation

Data portability
  Description: Metrics should be data-source-agnostic. The data sources needed to drive them should be defined in a manner that allows “late binding” to the specific provider(s) of this data.
  Benefit: The same metric can be used in more than one IT environment. Thus, partners can share their metrics, despite having different antivirus, vulnerability, HR, and network management products, and so on.
Identity and Access Management (IAM) portability
  Description: Identity and access management for metrics should integrate seamlessly with preexisting enterprise directory services.
  Benefit: Administrative overhead of users and entitlements is reduced, with a consequent increase in metric information security.

Abstraction of external dependencies
  Description: A metric should be packaged as a self-contained entity. External dependencies such as data sources, computational functions, or scorecard charting capabilities must be explicitly identified in the form of interface definitions, not hard-coded implementations.
  Benefit: Management of metrics is greatly simplified, and sharing of metrics between independent designers is facilitated.

Separation of design from production
  Description: The environment for designing metrics should be physically and logically distinct from the environment in which the metrics execute to generate results.
  Benefit: Introduction of new metrics and their resulting data into production operations can be carefully controlled. Specifically, the system should version metrics business logic and should ensure that the collected results are consistently generated.

Separation of computation from publication
  Description: All compute logic for a metric should be encapsulated within the metric business logic and placed under strict version control. Incorporation of computational operations within any publication function should be discouraged.
  Benefit: Separation ensures consistency and auditability of all published metric results. It also ensures that all metric results are available to the widest possible range of publication facilities without any dependence on computational idiosyncrasies embedded in publication logic.

Mining at the edge
  Description: Metrics should extract precisely their required data at the point of generation (the data source).
  Benefit: Sensitive data is isolated, typically to a small number of nodes. Retained data is limited strictly to the results of metric computation—not all the detailed data that may have been used to create it. External data sources are asked to provide precisely the data needed to compute a metric, thus drastically reducing the amount of data used when compared with classical data warehousing extract-transform-load strategies.

Open results
  Description: Metric results should be broadly accessible to a wide variety of consumers.
  Benefit: Metric results can drive other related processes, such as risk management, compliance, and audit functions.

View portability
  Description: Metric visualization should be publisher-agnostic.
  Benefit: The mechanism used to publish results can leverage preexisting infrastructure such as enterprise intranets or management consoles.

A few of these items merit further explanation. Several of the items in Table 7-1, such as the separation of computation from publication and the separation of design from production, refer to the need to keep different organizational duties separate. The person operating the production servers used to collect and compute metrics usually isn’t the same person who designs scorecards.

On a related note, different system owners have varying needs for confidentiality and integrity of data within the scope of the systems they operate. When gathering data for analyzing password quality, for example, it makes no sense to pull passwords out of individual systems and store them in a central repository. Thus, in many cases organizations should try to “mine at the edge”—do preliminary computations via a process that runs on the source system directly. After initial mining has finished, the rough results can be forwarded to the automation system’s repository. This strategy has the additional benefit of reducing the amount of data stored by the metrics automation system.

A third theme running through the IAM portability, view portability, and data portability requirements concerns interoperability.
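The password-quality scenario is a good concrete sketch of mining at the edge: a collector running on the source system summarizes quality locally and forwards only the aggregate, never the passwords themselves. The policy check shown (minimum length) is deliberately simplistic and the sample values are illustrative.

```python
# Runs on the source system; only the summary leaves it.
def summarize_password_quality(passwords, min_length=12):
    """Return aggregate counts only -- never the raw credentials."""
    weak = sum(1 for p in passwords if len(p) < min_length)
    return {"checked": len(passwords), "weak": weak}

# Simulated local credential store (illustrative values only).
local_store = ["hunter2", "correct horse battery staple", "p@ssw0rd"]
summary = summarize_password_quality(local_store)
assert summary == {"checked": 3, "weak": 2}
```

Only the summary dictionary crosses the network; the central metrics repository never sees a password.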
Most organizations have purchased many systems over the years for managing users, reporting on activities, and storing data. They will undoubtedly purchase more in the future. Thus, an effective metrics automation system should integrate with—but not depend on—the vagaries and idiosyncrasies of particular data sources.

The preceding list of formal technical requirements focuses primarily on the most desirable technical attributes we need. I have deliberately glossed over one of the most important aspects of metrics automation: the data model used to represent security events. Thus, the next section discusses security metrics requirements from the standpoint of the data model. After that, we turn to the external system interfaces (data sources and sinks) involved in metrics collection.

DATA MODEL

To effectively implement a metrics program, we need a model to frame the myriad quantitative values that threat, exposure, countermeasure, and asset metrics provide. Our model should explain our view of the IT security environment and should help us frame and answer the following questions:

• Which metrics are the most critical to measure?
• Which ones are the drivers, or independent variables?
• Which ones are just reflections of changes in the drivers—namely, dependent variables?
• What is the sensitivity relationship between independent (driver) metrics and the metrics that reflect results?
• Are sensitivities linear, logarithmic, exponential, or sinusoidal?

Figure 7-1 depicts a logical model of the most basic of IT security processes, describing how threats, exposures, and countermeasures interact as part of a system of controls. My model is largely a synthesis of a model used by Clear Point Metrics and several other models from vendors, educators, and standards bodies. It provides a framework for identifying independent variables, dependent variables, and their relationships. No doubt you have seen something similar to this model elsewhere.
Others exist that can be equally illuminating.

[Figure 7-1 Logical Model of IT Security Controls (Level 1): threats exploit exposures; countermeasures discover threats, reduce their likelihood, and eliminate exposures.]

The model shows a simple block diagram with three interacting forces: threats, exposures, and countermeasures. Threats are things that can happen to, or are the result of proactive acts against, one or more target assets. Vulnerabilities are characteristics of target assets that make them more prone to attack by a threat or make an attack more likely to succeed or have impact. Threats exploit vulnerabilities, the results of which are exposures to the assets. Countermeasures are designed to prevent threats from happening or to mitigate their impact when they do. Underlying each of the three preceding concepts (threats, exposures, and countermeasures) are assets—namely, the targets of threats, the possessors of exposures, or the beneficiaries of countermeasures. Assets are the things we were supposed to be protecting in the first place.

Merrill Lynch's Alberto Cardona uses a football analogy to explain the differences between assets, vulnerabilities, threats, and countermeasures: "Your asset is the quarterback. His weak knee is the vulnerability. The primary threat is the other team. Countermeasures include knee-pads, lots of painkillers, and a strong offensive line."³

3 Adapted from correspondence between the author and Cardona, October 2006.

To map the data model to metrics automation a bit more formally, let us drill down a bit more. Figure 7-2 shows the model, decomposed further for each of the three areas.
[Figure 7-2 Logical Model of IT Security Controls (Level 2): threats lead to attacks (which have a frequency metric); attacks exploit vulnerabilities, resulting in exposures with a quantifiable impact metric; assets have a value metric; deterrent controls reduce the likelihood of attacks, detective controls (which have an accuracy metric) discover attacks and trigger corrective controls (which have an effectiveness metric) that decrease impact, and preventative controls eliminate exposures.]

THREATS

• First, consider the Threats portion of the model. Threats lead to attacks. Attacks result in exploits directed at specific exposures.
• Attacks are detected and managed by components within the IT environment, including, for example, commercial products from the SIEM market segment. Managed Security Services Providers (MSSPs) specialize in this part of the ecosystem. (I will discuss the operational aspects of attack detection later, in our discussion of the physical view.)
• For measuring attacks, the most basic quantity is frequency, or attack arrival rate, measured in terms of the number of detected events per unit of time. Other measures include success or failure rates.
• Simple metrics that can be derived from frequency measurements include the mean event rate, variance, standard deviation, maximum, and minimum over several time periods.

For the attacks the model measures, various "dimensions" can be added to these quantities by dividing the observed events into subgroups. These subgroups can be based on event attributes such as event severity, event type, target asset, or attacker. These subgroups can be further refined. For example, target assets can be partitioned by asset value, business service supported, operating system, or owning business unit.

The implications of tagging each attack with asset and event attributes are straightforward.
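As a sketch of the frequency-derived metrics just listed, the fragment below computes the mean, standard deviation, maximum, and minimum event rates per severity subgroup. The event feed and period labels are invented for illustration; in practice the rows would come from an IDS or SIEM data source.

```python
import statistics
from collections import defaultdict

# Toy detected-event feed: (time period, severity).
events = [
    ("W01", "high"), ("W01", "low"), ("W01", "low"),
    ("W02", "high"), ("W02", "high"),
    ("W03", "low"),
]

periods = sorted({p for p, _ in events})

# Count events per period within each severity subgroup.
counts = defaultdict(lambda: defaultdict(int))
for period, severity in events:
    counts[severity][period] += 1

# Derive simple metrics per subgroup, zero-filling quiet periods so that
# periods with no events still count toward the statistics.
metrics = {}
for severity, by_period in counts.items():
    series = [by_period.get(p, 0) for p in periods]
    metrics[severity] = {
        "mean": statistics.mean(series),
        "stdev": statistics.pstdev(series),
        "max": max(series),
        "min": min(series),
    }
```

The same grouping step generalizes to any of the "dimensions" mentioned above (event type, target asset, attacker) simply by changing the key used to bucket events.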
By associating attacks with assets (which are in turn associated with organizations, business services, or business units), we can create metrics that show the mean and standard deviation of event incidence for:

• Business units, ranked by asset value
• Affected business service
• Targeted operating system
• Severity of attack

Moreover, using forecasting models such as linear regression, one can develop projections for future incident frequency. Using correlation models, one can identify potential interdependencies between attack frequency or severity and other measured factors.

EXPOSURES

Let us examine the Exposures portion of the exhibit. For purposes of discussion, we define exposures as instances of negative characteristics of assets, brought on by a vulnerability that applies to that asset. Exposures can exist for lots of reasons—because of the asset's location (it is Internet-facing, for example), the functions it performs, the technology it uses, its user base, or the workload it is subjected to.

Standards organizations such as the Mitre Corporation and FIRST.org, Inc. play an important role in formalizing methods for modeling exposures. Mitre and FIRST oversee, respectively, the Common Vulnerabilities and Exposures (CVE) dictionary and the Common Vulnerability Scoring System (CVSS). Beyond these two efforts, a broad collection of organizations (some for-profit, some not) maintains vulnerability databases. In this category are services such as ICAT/NVD,⁴ BugTraq,⁵ CERT,⁶ and the X-Force database from Internet Security Systems.

The Common Vulnerabilities and Exposures website⁷ defines the term "exposure" slightly differently. It defines it as security-related facts about an asset that might be classified as vulnerabilities by some, but not necessarily by everyone. (Vulnerability, like beauty or ugliness, is in the eye of the beholder.) For this discussion, let us treat vulnerabilities and exposures as synonymous.
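A linear-regression projection of the kind mentioned above needs nothing more than least squares over a short time series. The weekly incident counts below are invented for illustration.

```python
# Minimal least-squares trend line over weekly incident counts, used to
# project the next period. Counts are illustrative.
counts = [12, 15, 14, 18, 21, 20]          # incidents per week
xs = list(range(len(counts)))              # week indices 0..5

n = len(counts)
mean_x = sum(xs) / n
mean_y = sum(counts) / n

# Ordinary least squares: slope = S_xy / S_xx.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Projection for the next (seventh) week.
forecast_next = intercept + slope * n
```

A positive slope here quantifies a rising incident trend; the same arithmetic applied per subgroup yields trend lines by severity, business unit, and so on.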
The point is that when an attack successfully exploits a vulnerability, there is an impact. Models that attempt to measure the impact of exposures tend to be quite specific to the asset, the vulnerability, and the attack. Impact is often measured in dollars or in percent degradation. Converting impact into a quantitative measure typically takes additional information, such as revenue per transaction, cost per hour of unavailability, baseline throughput, or mean service time. More complex models take into account the network of interdependencies between assets that commonly comprise a business service.

Many third-party tools map exposures to network assets, including vulnerability scanners like Qualys, Foundstone, and Nessus. Modeling exposures to business services, however, requires higher-order models than a vulnerability scanner can provide.

COUNTERMEASURES

Countermeasures thwart attacks. For the purposes of our model, four types of control techniques are used by countermeasures:

• Deterrent controls reduce the likelihood of an attack.
• Preventive controls reduce exposure.
• Corrective controls reduce the impact of a successful attack.
• Detective controls discover attacks and trigger preventive or corrective controls.

ISPs that aggressively block phishing sites are an example of a deterrent control in that they lower the likelihood of identity theft attacks. Firewalls are an example of a preventive control because they block bad traffic. Antivirus software is an example of a corrective control because it removes detected infections. Examples of detective controls include Intrusion Detection Systems (IDSs) and SIEM systems. However, most companies use their IDS and SIEM systems only to detect attacks; they do not typically use them to trigger corrective controls, other than perhaps updating an event display or cutting a trouble ticket.

4 See the U.S. National Vulnerability Database.
5 See BugTraq.
6 See CERT.
7 See the Common Vulnerabilities and Exposures website.
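To make the impact side concrete, here is a minimal sketch (all dollar figures and rates are invented) that converts an outage into dollars using cost per hour of unavailability, revenue per transaction, and baseline throughput, the kind of additional information noted above. It also shows how a corrective control that shortens an outage reduces the computed impact.

```python
# Hypothetical impact model for a successful attack on a business service.
# All figures are invented for illustration.
COST_PER_HOUR_DOWN = 5_000.0     # fixed cost of unavailability, $/hour
REVENUE_PER_TXN = 2.50           # revenue per transaction, $
BASELINE_TXN_PER_HOUR = 1_200    # normal throughput, transactions/hour

def outage_impact(hours_down):
    """Dollar impact of a full outage lasting `hours_down` hours."""
    downtime_cost = COST_PER_HOUR_DOWN * hours_down
    lost_revenue = REVENUE_PER_TXN * BASELINE_TXN_PER_HOUR * hours_down
    return downtime_cost + lost_revenue

# A corrective control (say, automated malware removal) that cuts a
# three-hour outage to thirty minutes reduces the modeled impact.
impact_without_control = outage_impact(3.0)
impact_with_control = outage_impact(0.5)
```

Real models would add the interdependency effects the text mentions (one degraded asset dragging down a whole business service), but the basic unit economics look like this.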
For deterrent, preventive, and corrective controls, metrics that quantify effectiveness and efficiency are most important. Both of these ideas can be expressed as percentages—for example, the percentage of attacks thwarted or the percentage of throughput lost. A well-known and hotly debated metric in this space is accuracy, defined as 1.0 minus the percentage of false alarms. The term false positive is often used to refer to the detection of an attack that turns out not to be one.

ASSETS

The preceding discussion naturally suggests a potential logical model of the IT security environment—essential for automation. If we can successfully model the threats, exposures, and countermeasures of the security environment we are measuring, we are that much closer to automating the measurement process.

Although we did not break them into their own section of the diagram, assets are central to our understanding of our logical model. It is worth taking a few moments to elaborate. First, assets are not just targets of attack; they can also be involved in delivering or countering the attack.

As I mentioned in Chapter 4, "Measuring Program Effectiveness," estimating asset value is a difficult, if not impossible, endeavor. No consensus exists on a methodology for assigning dollar values to assets. Many IT assets are merely part of the infrastructure and are not directly involved in the moneymaking or value-delivering parts of the organization; commodity servers, commercial software products, and networking gear, for example, are sold generically to millions of customers. Many security products that provide countermeasures, attack detection, and vulnerability scanning focus on this type of asset. Any individual running instance of these products probably will not have a direct business value, although a collection of these will, in aggregate. I do not take a particularly hard-and-fast position about the right way to model asset values.
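Since individual infrastructure components rarely carry direct business value while collections of them do, one way to sketch "aggregate" assets is a containment hierarchy whose value rolls up from its parts. The class, asset names, and dollar values below are hypothetical, intended only to show the rollup.

```python
# Hypothetical asset containment hierarchy: an aggregate asset's value is
# its own directly attributed value plus the values of the assets it
# contains, recursively.

class Asset:
    def __init__(self, name, own_value=0.0, children=None):
        self.name = name
        self.own_value = own_value       # directly attributed value, $
        self.children = children or []   # contained assets

    def total_value(self):
        """Roll up value across the containment hierarchy."""
        return self.own_value + sum(c.total_value() for c in self.children)

# An "aggregate" asset: a business service composed of commodity parts
# that individually carry little or no direct business value.
payments = Asset("payment service", own_value=250_000.0, children=[
    Asset("web tier", children=[Asset("www1"), Asset("www2")]),
    Asset("database server", own_value=40_000.0),
])

service_value = payments.total_value()
```

Aliases and network interdependencies would need more machinery than simple containment, which is why the text argues that asset-modeling tools must stay customizable.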
However, data models for automation need flexibility for naming and grouping them. Assets are, in themselves, hierarchies of contained assets that coexist in networks of other, interdependent assets. It is important to allow flexibility for such fuzzy concepts as "aggregate" assets, containment hierarchies, and aliases. Models exist for understanding basic containment and dependency relationships, but a lot of customization is always required to map metrics about individual components into metrics about the system they comprise. For this reason, tools to compose and tune asset relationships and asset values will always be a requirement for any complete automation system.

The preceding discussion is not meant to be the definitive treatise on security modeling. No one model is appropriate for all companies to use. The security and risk management market has created many alternative versions, and I am sure you have your favorite. This model was intended to get you started. Next, we turn to a discussion about how the metrics model drives requirements for automating metrics collection.

DATA SOURCES AND SINKS

At a high level, security metrics obtain measurements from data produced by a collection of external data sources, apply some business logic, and finally publish results (such as scorecards) to an external sink. The word "sink" is just a fancy, slightly formal way of saying "destination." The American Oxford Dictionary defines sink as "a body or process that acts to absorb or remove energy or a particular component from a system."

Framing sources and sinks in terms of work flow, at one end of the work flow is a collection of data providers (sources); at the other end are results publishers (sinks). The business logic associated with computation and visualization transforms raw data into insight. In this section we focus on the external data providers and data publishers.
These are the integration points between an automated metrics and scorecard system and its surrounding ecosystem. Figure 7-3 depicts the ecosystem in which metrics operate. Notice that, given our focus in this book, we have (naturally) placed Security Metrics at the center of the universe.

Note that the automated security metrics system at the center is both a consumer and a producer of services and data with the other members of the ecosystem. From a technology point of view, this closed-loop symmetry is at the heart of an effective metrics and scorecard automation system.
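The source-to-sink work flow can be sketched as a tiny pipeline in which both ends are interface definitions rather than hard-coded implementations, echoing the "abstraction of external dependencies" requirement from earlier in the chapter. Every class and function name here is hypothetical.

```python
# Minimal sketch of a metrics work flow: data providers (sources) feed
# metric business logic, which publishes results to sinks. Sources and
# sinks are abstract interfaces so concrete systems can be swapped in.

from abc import ABC, abstractmethod

class DataSource(ABC):
    @abstractmethod
    def fetch(self) -> list:
        """Return raw measurements from the external system."""

class ResultSink(ABC):
    @abstractmethod
    def publish(self, result: dict) -> None:
        """Deliver a computed metric result to its destination."""

class StaticSource(DataSource):
    """Stand-in for a scanner, directory, or log feed."""
    def __init__(self, rows):
        self.rows = rows
    def fetch(self):
        return self.rows

class MemorySink(ResultSink):
    """Stand-in for a scorecard, intranet page, or console."""
    def __init__(self):
        self.published = []
    def publish(self, result):
        self.published.append(result)

def run_metric(source: DataSource, sink: ResultSink) -> None:
    rows = source.fetch()
    # Business logic: reduce raw data to a published metric result.
    result = {
        "events": len(rows),
        "high_severity": sum(1 for r in rows if r == "high"),
    }
    sink.publish(result)

sink = MemorySink()
run_metric(StaticSource(["high", "low", "high"]), sink)
```

Because `run_metric` depends only on the two interfaces, the same metric logic can move between design and production environments with different concrete sources and sinks plugged in.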
