How to think about cloud security governance
When customers first move to the cloud, their instinct might be to build a cloud security governance model based on one or more regulatory frameworks that are relevant to their industry. Although this can be a helpful first step, it’s also critically important that organizations understand what the control objectives for their workloads should be.
In this post, we discuss what you need to do both organizationally and technically with Amazon Web Services (AWS) to build an efficient and effective governance model. People who are taking their first steps in cloud can use this post to guide their thinking. It can also act as useful context for folks who have been running in the cloud for a while to evaluate their current governance approach.
But before you can build that model, it’s important to understand what governance is and to consider why you need it. Governance is how an organization ensures the consistent application of policies across all teams. The best way to implement consistent governance is by codifying as much of the process as possible. Security governance in particular is used to support business objectives by defining policies and controls to manage risk.
Moving to the cloud provides you with an opportunity to deliver features faster, react to the changing world in a more agile way, and return some decision making to the hands of the people closest to the business. In this fast-paced environment, it’s important to have a way to maintain consistency, scaleability, and security. This is where a strong governance model helps.
Creating the right governance model for your organization may seem like a complex task, but it doesn’t have to be.
Many customers use a standard framework that’s relevant to their industry to inform their decision-making process. Some frameworks that are commonly used to develop a security governance model include: NIST Cybersecurity Framework (CSF), Information Security Registered Assessors Program (IRAP), Payment Card Industry Data Security Standard (PCI DSS), or ISO/IEC 27001:2013
Some of these standards provide requirements that are specific to a particular regulator, or region and others are more widely applicable—you should choose one that fits the needs of your organization.
While frameworks are useful to set the context for a security program and give guidance on governance models, you shouldn’t build either one only to check boxes on a particular standard. It’s critical that you should build for security first and then use the compliance standards as a way to demonstrate that you’re doing the right things.
After you’ve selected a framework to use, the next considerations are controls. A control is a technical- or process-based implementation that’s designed to ensure that the likelihood or consequences of an identified risk are reduced to a level that’s acceptable to the organization’s risk appetite. Examples of controls include firewalls, logging mechanisms, access management tools, and many more.
Controls will evolve over time; sometimes they do so very quickly in the early stages of cloud adoption. During this rapid evolution, it’s easy to focus purely on the implementation of a control rather than the objective of it. However, if you want to build a robust and useful governance model, you must not lose sight of control objectives.
Consider the example of the firewall. When you use a firewall, you implement a control. The objective is to make sure that only traffic that should reach your environment is able to reach it. Although a firewall is one way to meet this objective, you can achieve the same outcome with a layered approach using Amazon Virtual Private Cloud (Amazon VPC) Security Groups, AWS WAF and Amazon VPC network access control lists (ACLs). Splitting the control implementation into multiple places can enable workload owners to have greater flexibility in how they configure resources while the baseline posture is delivered automatically.
Not all areas of a business necessarily have the same cloud maturity level, or use the same methods to deploy or run workloads. As a security architect, your job is to help those different parts of the business deliver outcomes in the way that is appropriate for their maturity or particular workload.
The best way to help drive this goal is for the security part of your organization to clearly communicate the necessary control objectives. As a security architect, it’s easier to have a discussion about the things that need tweaking in an application if the objectives are well communicated. It is much harder if the workload owner doesn’t know they have to meet certain security expectations.
What is the job of security?
At AWS, we talk to customers across a range of industries. One thing that consistently comes up in conversation is how to help customers understand the role of their security team in a distributed cloud-aware environment. The answer is always the same: we as security people are here to help the business deploy and run applications securely. Our job is to guide and educate the rest of the organization on the best way to meet the business objectives while meeting the security, risk, and compliance requirements.
So how do you do this?
Technology and culture are both important to an organization’s security posture, and they enable each other. AWS is a good example of an organization that has a strong culture of security ownership. One thing that all customers can take away from AWS: security is everyone’s job. When you understand that, it becomes easier to build the mechanisms that make the configuration and operation of appropriate security control objectives a reality.
The cloud environment that you build goes a long way to achieving this goal in two key ways. First, it provides guardrails and automated guidance for people building on the platform. Second, it allows solutions to scale.
One of the challenges organizations encounter is that there are more developers than there are security people. The traditional approach of point-in-time risk and control assessments performed by a human looking at an architecture diagram doesn’t scale. You need a way to scale that knowledge and capability without increasing the number of people. The best way to achieve this is to codify as much as possible, early in the build and release process.
One way to do this is to run the AWS platform as a product in its own right. Team members should be able to submit feature requests, and there should be metrics on the features that are enabled through the platform. The more security capability that teams building workloads can inherit from the platform, the less they have to implement at the workload level and the more time they can spend on product features. There will always be some security control objectives that can only be delivered by specific configuration at the workload level; this should build on top of what’s inherited from the cloud platform. Your security team and the other teams need to work together to make sure that the capabilities provided by the cloud platform are available to help people build and release securely.
One part of the governance model that we like to highlight is the concept of platform onboarding. The idea of this part of the governance model is to quickly and consistently get to a baseline set of controls that enable you to use a service safely in a particular environment. A good example here is to give developers access to evaluate a service in an experimentation account. To support this process, you don’t want to spend a long time building controls for every possible outcome. The best approach is to take advantage of the foundational controls that are delivered by the cloud platform as the starting point. Things like federation, logging, and service control policies can be used to provide guard rails that enable you to use services quickly. When the services are being evaluated, your security team can work together with your business to define more specific controls that make sense for the actual use cases.
AWS Well-Architected Framework
The cloud platform you use is the foundation of many of the security controls. These guard rails of federation, logging, service control polices, and automated response apply to workloads of all types. The security pillar in the AWS Well-Architected Framework builds on other risk management and compliance frameworks, provides you with best practices, and helps you to evaluate your architectures. These best practices are a great place to look for what you should do when building in the cloud. The categories—identity and access management, detection, infrastructure protection, data protection, and incident response—align with the most important areas to focus on when you build in AWS.
For example, identity is a foundational control in a cloud environment. One of the AWS Well-Architected security best practices is “Rely on a centralized identity provider.” You can use AWS Single Sign-On (AWS SSO) for this purpose or an equivalent centralized mechanism. If you centralize your identity provider, you can perform identity lifecycle management on users, provide them with access to only the resources that are required, and support users who move between teams. This can apply across the multiple AWS accounts in your AWS environment. AWS Organizations uses service control policies to enable you to use a subset of AWS services in particular environments; this is an identity-centric way of providing guard rails.
In addition to federating users, it’s important to enable logging and monitoring services across your environment. This allows you to generate an event when something unexpected happens, such as a user trying to call AWS Key Management Service (AWS KMS) to decrypt data that they should have access to. Securely storing logs means that you can perform investigations to determine the causes of any issues you might encounter. AWS customers who use Amazon GuardDuty and AWS CloudTrail, and have a set of AWS Config rules enabled, have access to security monitoring and logging capabilities as they build their applications.
The layer cake model
When you think about cloud security, we find it useful to use the layer cake as a good mental model. The base of the cake is the understanding of the below-the-line capability that AWS provides. This includes self-serving the compliance documentation from AWS Artifact and understanding the AWS shared responsibility model.
The middle of the cake is the foundational controls, including those described previously in this post. This is the most important layer, because it’s where the most controls are and therefore where the most value is for the security team. You could describe it as the “solve it once, consume it many times” layer.
The top of the cake is the application-specific layer. This layer includes things that are more context dependent, such as the correct control objectives for a certain type of application or data classification. The work in the middle layer helps support this layer, because the middle layer provides the mechanisms that make it easier to automatically deliver the top layer capability.
The middle and top layers are not just technology layers. They also include the people and process parts of the equation. The technology is just there to support the processes.
One thing to be aware of is that you shouldn’t try to define every possible control for a service before you allow your business to use the service. Make use of the various environments in your organization—experimenting, development, testing, and production—to get the services in the hands of developers as quickly as possible with the minimum guardrails to avoid accidental misconfiguration. Then, use the time when the services are being assessed to collaborate with the developers on control implementation. Control implementations can then be rolled into the middle layer of the cake, and the services can be adopted by other parts of the business.
This is also the ideal time to apply practical threat modelling techniques so you can understand what threats and risks you must address. Working with your business to define recommended implementation patterns also helps provide context for how services are typically used. This means you can focus on the controls that are most relevant.
The architecture, platform, or cloud center of excellence (CoE) teams can help at this stage. They can likely make a quick determination of whether an AWS service fits in with your organization’s architectural direction. This quick triage helps the security team focus their efforts in helping get services safely in the hands of the business without being seen as blocking adoption. A good mechanism for streamlining the use of new services is to make sure the backlog is well communicated, typically on a platform team wiki. This helps the security and non-security parts of your organization prioritize their time on services that deliver the most business value. A consistent development approach means that the services that are used are probably being used in more places across the organization. This helps your organization get the benefits of scale as consistent approaches to control implementation are replicated between teams.
Simplicity, metrics, and culture
The world moves fast. You can’t just define a security posture and control objectives, and then walk away. New services are launched that make it easier to do more complex things, business priorities change, and the threat landscape evolves. How do you keep up with all of it?
The answer is a combination of simplicity, metrics, and culture.
Simplicity is hard, but useful. For example, if you have 100 application teams all building in a different way, you have a large number of different configurations that you must ensure are sensibly defined. Ideally, you do this programmatically, which means that the work to define and maintain that set of security controls is significant. If you have 100 application teams using only 10 main patterns, it’s easier to build controls. This has the added benefit of reducing the complexity at the operations end, which applies to both the day-to-day operations and to incident responses. Simplification of your control environment means that your monitoring is less complex, troubleshooting is easier, and people have time to focus on the development of new controls or processes.
Metrics are important because you can make informed decisions based on data. A good example of the usefulness of metrics is patching. Patching is one of the easiest ways to improve your security posture. Having metrics on patch age, presented where this information is most important in your environment, enables you to focus on the most valuable areas. For example, infrastructure on your edge is more important to keep patched than infrastructure that is behind multiple layers of controls. You should patch everything, but you need to make it easy for application teams to do so as part of their build and release cycles. Exposing metrics to teams and leadership helps your organization learn from high performing areas in the business. These could be teams that are regularly meeting the patching expectations or have low instances of needing to remediate penetration testing findings. Metrics and data about your control effectiveness enables you to provide assurance internally and externally that you’re meeting your control objectives.
This brings us to culture. Security as an enabler is something that we think is the most important concept to take away from this post. You must build capabilities that enable people in your organization to have the secure configuration or design choice be the easiest option. This is the role of security. You should also make sure that, when there are problems, your security team works with the business to help everyone learn the cause and improve for next time.
AWS has a culture that uses trouble ticketing for everything. If our employees think they have a security problem, we tell them to open a ticket; if they’re not sure that they have a security problem, we tell them to open a ticket anyway to get guidance. This kind of culture encourages people to communicate and help means so we can identify and fix issues early. Issues that aren’t as severe thought can be downgraded quickly. This culture of ticketing gives us data to inform what we build, which helps people be more secure. You can get started with a system like this in your own environment, or look to extend the capability if you’ve already started.
Take our recommendation to turn on GuardDuty across all your accounts. We recommend that the resulting high and medium alerts are sent to a ticketing system. Look at how you resolve those issues and use that to prioritize the next two weeks of work. Now you can build automation to fix the issues and, more importantly, build to prevent the issues from happening in the first place. Ask yourself, “What information did I need to diagnose the problem?” Then, build automation to enrich the findings so your tickets have that context. Iterate on the automation to understand the context. For example, you may want to include information to show whether the environment is production or non-production.
Note that having production-like controls in non-production environments means that you reduce the chance of deployment failures. It also gets teams used to working within the security guardrails. This increased rigor earlier on in the process, and helps your change management team, too.
It doesn’t matter what security frameworks or standards you use to inform your business, and you might not even align with a particular industry standard. What does matter is building a governance model that empowers the people in your organization to consistently make good security decisions and provides the capability for your security team to enable this to happen. To get started or continue to evolve your governance model, follow the AWS Well-Architected security best practices. Then, make sure that the platform you implement helps you deliver the foundational security control objectives so that your business can spend more of its time on the business logic and security configuration that is specific to its workloads.
The technology and governance choices you make are the first step in building a positive security culture. Security is everyone’s job, and it’s key to make sure that your platform, automation, and metrics support making that job easy.
The areas of focus we’ve talked about in this post are what allow security to be an enabler for business and to ultimately help you better help your customers and earn their trust with everything you do.
If you have feedback about this post, submit comments in the Comments section below.
Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.