How to Optimize Cybersecurity Using Measurement
Digital transformation exposes more value, to more people, through more channels, at higher velocities. The capabilities behind this transformation are software-defined. The people who are making it happen are software developers. It’s an all-encompassing evolution in enterprise strategy and technology that demands a change for security.
Digital transformation requires the role of security to evolve. Tactically, it requires security to be developer-executed. That means enabling development teams to fulfill security requirements, while not impacting their velocity. Security’s work shifts to optimizing processes and ensuring that requirements are working.
The new questions for the digitally transformed security team include:
- “Are security capabilities scaling as software accelerates?”
- “Is the software-defined attack surface shrinking quickly and affordably enough?”
- “Is my risk surface growing beyond my risk tolerance and do I need to make a change?”
Measurement is the key to answering these questions. Once you can measure, you can work on optimization, which is the future of digitally transformed security work.
The Motivation for Measurement
Small cracks in the security status quo appeared about 10 years ago. The emergence of public clouds and DevOps was the early cause. Operational roles began to morph as infrastructure became more software-defined. APIs took the place of boxes, networks and the people who supported them. In the process, the speed with which developers could expose value and risk to customers increased exponentially.
More recently, cloud-native development—with containerized microservices, Kubernetes and serverless architectures—added rocket fuel to the speed of digital change. Today, COVID-19 is another factor pressuring global organizations to rapidly adopt these new technologies to better compete.
Now in full force, digital transformation is requiring security to adapt to meet the needs of a software-defined world. We must shift more security responsibility onto development teams. When I say development teams, I mean DevOps, Site Reliability Engineers (SRE), developers and the like. It is security’s job to enable development teams to take on more security responsibility.
Shifting Security Left Reduces Drag and Augments Accountability
From a sheer resource and technology perspective, security is outgunned in comparison to development teams. Not only are security teams outnumbered; they are typically uninformed about the particulars of the specific applications they theoretically protect.
What is, and isn’t, a security issue is context-dependent. Expecting a security practitioner to have enough context across dozens—if not hundreds or thousands—of microservices doesn’t scale. When security tries to be responsible and stand in the middle of the flow of software, there is really only one outcome: Drag.
Drag runs directly counter to the velocity goals of digital transformation. Digital transformation exists to optimize the delivery of value with more speed. That is why security must be in the optimization game. One of the chief ways it does this is by getting out of the way and enabling security responsibilities (execution) to shift to those creating value. Remove the middleperson, remove the extra steps, remove the weakest link, process-wise, and you remove drag.
So where does that leave security? Security is accountable to the system as a whole. That means governing security capabilities executed by others. It means ensuring those capabilities are scaling with the increased velocity of development. And it means responding to meaningful degradations in capability performance. It’s a natural evolution and response to a maturing ecosystem.
The Methods of Measurement
The methods of security measurement fall into two related classes: cybersecurity risk management and metrics.
Cybersecurity risk management makes forecasts about the likelihood and impact of risk. It works at the portfolio or enterprise level, and it combines subject matter expert (SME) judgments with relevant data. Think of it as the method for measuring and controlling for “risk surface.”
Metrics measure operations for the purpose of capability optimization. Metrics can work with small and big data alike, and can also incorporate SME input in the more advanced applications. Metrics in this context measure capability effectiveness at controlling “attack surface.”
Cybersecurity Risk Management: Managing Risk Surface
Cybersecurity risk management is measurement at the enterprise risk level. An example might be forecasting the likelihood and impact of having one or more breaches over a three-year period. This could be applied to a business unit, portfolio or some crown jewel set of applications.
The various forecasts can also be combined to provide an enterprise view of risk. Since mathematically sound methods of measurement underpin these approaches, adding portfolios of risk together for a unified view is simple.
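One way to picture combining forecasts is a small Monte Carlo simulation. The sketch below is illustrative only: the portfolio names, probabilities and loss ranges are made up, breaches are assumed to be independent, and impact is drawn uniformly between a low and a high bound, which is a simplification (skewed distributions such as the lognormal are a common alternative).

```python
import random

def simulate_portfolio_losses(portfolios, trials=10_000, seed=7):
    """Return one simulated total loss per trial, summed across portfolios."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        total = 0.0
        for p in portfolios:
            # A breach occurs in this trial with the forecast probability.
            if rng.random() < p["breach_probability"]:
                # Assumption: impact is uniform between the two bounds.
                total += rng.uniform(p["loss_low"], p["loss_high"])
        totals.append(total)
    return totals

def probability_of_exceeding(totals, threshold):
    """Share of simulated trials whose total loss exceeds the threshold."""
    return sum(1 for t in totals if t > threshold) / len(totals)

# Hypothetical forecasts for two portfolios over the same period.
portfolios = [
    {"name": "payments", "breach_probability": 0.10,
     "loss_low": 1_000_000, "loss_high": 5_000_000},
    {"name": "internal tools", "breach_probability": 0.25,
     "loss_low": 50_000, "loss_high": 500_000},
]

totals = simulate_portfolio_losses(portfolios)
print("P(any loss):", probability_of_exceeding(totals, 0))
print("P(loss > $1M):", probability_of_exceeding(totals, 1_000_000))
```

Because every trial sums losses across all portfolios, adding another portfolio to the enterprise view is just another entry in the list.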
Using modern cybersecurity risk management methods, leaders determine what investments best reduce risk with the best return on investment (ROI). Results specifically help leaders make decisions about investing in capabilities (people, process and technologies) and risk transfer (cyber insurance). Priority is given to investments that “reduce the risk curve” to a tolerable level with the least spend.
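As a rough illustration of that prioritization, the sketch below ranks candidate investments by how much expected loss each removes per dollar spent. The investment names, costs and probabilities are hypothetical, and a single expected-loss number is a deliberate simplification of a full risk curve.

```python
def expected_loss(probability, impact):
    """Single-point expected loss: likelihood times impact."""
    return probability * impact

def rank_investments(baseline_probability, impact, options):
    """Sort options by expected-loss reduction per dollar, best first."""
    baseline = expected_loss(baseline_probability, impact)
    scored = []
    for option in options:
        residual = expected_loss(option["residual_probability"], impact)
        reduction = baseline - residual
        scored.append((reduction / option["cost"], option["name"]))
    return sorted(scored, reverse=True)

# Hypothetical inputs: a 10% breach likelihood with a $2M impact.
options = [
    {"name": "code scanning in CI", "residual_probability": 0.06, "cost": 200_000},
    {"name": "secrets management", "residual_probability": 0.08, "cost": 50_000},
]

ranked = rank_investments(0.10, 2_000_000, options)
for score, name in ranked:
    print(f"{name}: {score:.2f} of expected loss removed per dollar")
```

In this made-up case the cheaper control ranks first even though it removes less risk in absolute terms, which is the "least spend" trade-off described above.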
Security Metrics: Managing Attack Surface
Metrics will tell us if our capabilities are scaling, accelerating or slowing. Scaling simply means that security capabilities are keeping up with the speed of digital transformation. Accelerating means the rate of control is improving. And slowing means capabilities are degrading.
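A minimal sketch of that three-way classification follows. It assumes the capability is tracked as one ratio per period (for example, work closed over work created), and it compares only the two most recent periods. The function name and the tolerance value are illustrative, not part of any standard.

```python
def classify_trend(ratios, tolerance=0.02):
    """Label a capability from its per-period ratios, oldest first.

    Only the direction of the latest change is considered, not whether
    the level itself is adequate.
    """
    if len(ratios) < 2:
        raise ValueError("need at least two periods to compare")
    change = ratios[-1] - ratios[-2]
    if change > tolerance:
        return "accelerating"   # rate of control is improving
    if change < -tolerance:
        return "slowing"        # capability is degrading
    return "scaling"            # holding steady as volume changes

print(classify_trend([0.85, 0.90]))  # accelerating
print(classify_trend([0.90, 0.80]))  # slowing
print(classify_trend([0.90, 0.91]))  # scaling
```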
Scaling, accelerating and slowing have value in relationship to objectives. An objective is a goal for your capability. Achieving an objective requires making changes to capabilities. You know those changes are working if you see acceleration toward your goal.
But we aren’t always sure what we need to change to achieve our goals. When we are uncertain, we need to figure out which predictor, or set of predictors, is the strongest indicator of success. What should we alter in our capability that best achieves our goal, with the least cost and disruption? The process of analyzing metrics in relation to an outcome is optimization.
An Intro to BOOM Metrics: Baseline Objectives and Optimization Measurements
BOOM is a metrics framework for measuring and optimizing capabilities.
The baseline portion of BOOM has six fundamental metric building blocks. They are: Basic Counts, Burndown Ratios, Arrival Rates, Departure Rates, Survival Rates, and Escape Rates:
- Basic Counts are the most atomic measures. They consist of hits and misses, ons and offs, ins and outs, and so on. For example, “There are 500 critical vulnerabilities more than 50 days old.”
- Burndown Ratios are a cumulative measure of security work efficiency. You can think of them as tickets closed over tickets created. You might say, “Critical vulnerability remediation burndown for Q2 was 90%, a 5% improvement on Q1.”
- Arrival Rates measure how much security risk materializes in a given time frame. This could be threats or vulnerabilities. They are designed to be a predictive metric. You might say, “Based on the last two quarters, we expect 20 critical incidents a month on average. Given the rate and volume of software releases expected, we forecast this to expand to 50 a month on average over the next quarter.”
- Departure Rates are the mirror image of arrival rates. Arrival rates measure how many items we expect to appear in a given time frame, and departure rates measure how many we should expect to disappear. In a digitally transformed environment things come and go rapidly. It is important to measure both arrivals and departures to determine if capabilities will continue to scale, accelerate or slow.
- Survival Rates measure how long threats and vulnerabilities live, using methods from survival analysis. In its simplest form, you can say things like, “50% of critical vulnerabilities live for 5 days or longer, 10% for 30 days, and 1% for 365 days.”
- Escape Rates measure the frequency with which risks change state over a time period. Change includes movement in location, such as pre-production to production, north to south, or internal to external. You might say something like, “Over the last quarter 10% of known critical vulnerabilities escaped into production. This is a 5% improvement on the previous quarter.”
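To make the six baselines concrete, here is a small sketch that computes each one from a list of vulnerability records. The `Vuln` record, its fields, the integer day numbers and the sample data are all hypothetical. The survival calculation is a naive share of findings that lived a given number of days, not a full survival-analysis estimator.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Vuln:
    opened_day: int              # day the finding was created
    closed_day: Optional[int]    # day it was remediated, None if still open
    escaped: bool = False        # True if it reached production

def basic_count(vulns, as_of, min_age):
    """Basic count: findings still open and older than min_age days."""
    return sum(1 for v in vulns
               if (v.closed_day is None or v.closed_day > as_of)
               and as_of - v.opened_day > min_age)

def arrivals(vulns, start, end):
    """Findings created in the window [start, end)."""
    return sum(1 for v in vulns if start <= v.opened_day < end)

def departures(vulns, start, end):
    """Findings closed in the window [start, end)."""
    return sum(1 for v in vulns
               if v.closed_day is not None and start <= v.closed_day < end)

def burndown_ratio(vulns, start, end):
    """Findings closed over findings created in the same window."""
    created = arrivals(vulns, start, end)
    return departures(vulns, start, end) / created if created else 0.0

def survival_rate(vulns, as_of, days):
    """Naive share of findings that lived at least `days` days."""
    # Only findings old enough to have possibly lived that long count.
    eligible = [v for v in vulns if as_of - v.opened_day >= days]
    if not eligible:
        return 0.0
    def lifetime(v):
        still_open = v.closed_day is None or v.closed_day > as_of
        return (as_of if still_open else v.closed_day) - v.opened_day
    return sum(1 for v in eligible if lifetime(v) >= days) / len(eligible)

def escape_rate(vulns):
    """Share of known findings that escaped into production."""
    return sum(1 for v in vulns if v.escaped) / len(vulns) if vulns else 0.0

# Hypothetical sample data, with days counted from the start of a quarter.
vulns = [
    Vuln(0, 3),
    Vuln(1, 40, escaped=True),
    Vuln(5, None),
    Vuln(10, 12),
    Vuln(20, None),
]

print("open > 50 days:", basic_count(vulns, as_of=60, min_age=50))
print("arrivals per day:", arrivals(vulns, 0, 30) / 30)
print("departures per day:", departures(vulns, 0, 30) / 30)
print("burndown:", burndown_ratio(vulns, 0, 30))
print("lived >= 5 days:", survival_rate(vulns, as_of=60, days=5))
print("escape rate:", escape_rate(vulns))
```

Arrival and departure rates are shown here as simple per-day averages over a window. Forecasting them forward, as in the arrival-rate example above, would need a model on top of these counts.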
One or more baselines are assembled in various configurations under a measurement objective. Objectives are simply goals not dissimilar in concept to the now popular Objectives and Key Results (OKRs). A general objective might be, “Reduce the time to discover publicly exposed services that don’t adhere to least-privilege policy.”
The Future of Security Measurement
A digitally transformed security measurement system will consist of descriptive, predictive and prescriptive analytics capabilities. It answers, “what happened, what will happen, and what should I do now?”
Descriptive analytics is the real-time visibility and history of what happened operationally. Predictive analytics forecasts what will happen given what has transpired (input from descriptive analytics). From there, we analytically weigh the various factors involved in prediction to understand what will give us the best security ROI to control both attack and risk surfaces. We call that last step “prescriptive analytics.”
Digitally transformed security shifts the responsibility for executing security capabilities to development teams. Now the role of security becomes accountability for assuring those capabilities are scaling (if not accelerating). The soul of accountability is measurement.
Ultimately, measurement enables both security process automation and enterprise risk decision-making. It gives security a meaningful role in a digitally transforming organization, and gives it parity with the velocity of change without getting in the way. This is the future of digitally transformed security.
Richard Seiersen is Chief Executive Officer of Soluble.ai.