AWS Security Blog

Considerations for security operations in the cloud

Cybersecurity teams are often made up of different functions. Typically, these can include Governance, Risk & Compliance (GRC), Security Architecture, Assurance, and Security Operations, to name a few. Each function has its own specific tasks, but works towards a common goal—to partner with the rest of the business and help teams ship and run workloads securely.

In this blog post, I’ll focus on the role of the security operations (SecOps) function, and in particular, the considerations that you should look at when choosing the most suitable operating model for your enterprise and environment. This becomes particularly important when your organization starts to adapt and operate more workloads in the cloud.

Operational teams that manage business processes are the backbone of organizations—they pave the way for efficient running of a business and provide a solid understanding of which day-to-day processes are effective. Typically, these processes are defined within standard operating procedures (SOPs), also known as runbooks or playbooks, and business functions are centralized around them—think Human Resources, Accounting, IT, and so on. This is also true for cybersecurity and SecOps, which typically has operational oversight of security for the entire organization.

Teams adopt an operating model that inherently leans toward a delegated ownership of security when scaling and developing workloads in the cloud. The emergence of this type of delegation might cause you to re-evaluate your currently supported model, and when you do this, it’s important to understand what outcome you are trying to get to. You want to be able to quickly respond to and resolve security issues. You want to help application teams own their own security decisions. You also want to have centralized visibility of the security posture of your organization. This last objective is key to being able to identify where there are opportunities for improvement in tooling or processes that can improve the operation of multiple teams.

Three ways of designing the operating model for SecOps are as follows:

  • Centralized – A more traditional model where SecOps is responsible for identifying and remediating security events across the business. This can also include reviewing general security posture findings for the business, such as patching and security configuration issues.
  • Decentralized – Responsibility for responding to and remediating security events across the business has been delegated to the application owners and individual business units, and there is no central operations function. Typically, there will still be an overarching security governance function that takes more of a policy or principles view.
  • Hybrid – A mix of both approaches, where SecOps still has a level of responsibility and ownership for identifying and orchestrating the response to security events, while the responsibility for remediation is owned by the application owners and individual business units.

As you can see from these descriptions, the main distinction between the different models is in the team that is responsible for remediation and response. I’ll discuss the benefits and considerations of each model throughout this blog post.

The strategies and operating models that I talk about throughout this blog post will focus on the role of SecOps and organizations that operate in the cloud. It’s worth noting that these operating models don’t apply to any particular technology or cloud provider. Each model has its own benefits and challenges to consider; overall, you should aim to adopt an operating model that gets to the best business outcome, while managing risk and providing a path for continuous improvement.

Background: the centralized model

As you might expect, the most familiar and well-understood operating model for SecOps is a centralized one. Traditionally, SecOps has developed gradually from internal security staff who have a very good understanding of the mostly static on-premises infrastructure and corporate assets, such as employee laptops, servers, and databases.

Centralizing in this way provides organizations with a familiar operating model and structure. Over time, operating in this model across an industry has allowed teams to develop reliable SOPs for common security events. Analysts who deal with these incidents have a good understanding of the infrastructure, the environment, and the steps that are needed to resolve incidents. Every incident gives opportunities to update the SOPs and to share this knowledge and the lessons learned with the wider industry. This continuous feedback cycle has provided benefits to SecOps teams for many years.

When security issues occur, understanding the division of responsibility between the various teams in this model is extremely important for quick resolution and remediation. The Responsibility Assignment Matrix, also known as the RACI model, has defined roles—Responsible, Accountable, Consulted, and Informed. Utilizing a model like this will help align each employee, department, and business unit so that they are aware of their role and contact points when incidents do occur, and can use defined playbooks to quickly act upon incidents.

The pressure can be high during a security event, and incidents that involve production systems carry additional weight. Typically, in a centralized model, security events flow into a central queue that a security analyst will monitor. A common approach is the Security Operations Center (SOC), where events from multiple sources are displayed on screens and also trigger activity in the queue. Security incidents are acted upon by an experienced team that is well versed in SOPs and understands the importance of time sensitivity when dealing with such incidents. Additionally, a centralized SecOps team usually operates in a 24/7 model, which might be achieved by having teams in multiple time zones or with help from an MSSP (Managed Security Service Provider). Whichever strategy is followed, having experienced security analysts deal with security incidents is a great benefit, because experience helps to ensure efficient and thorough remediation of issues.

So, with context and background set—how does a centralized SOC look and feel when it operates in the cloud, and what are its challenges?

Centralized SOC in the cloud: the advantages

Cloud providers offer many solutions and capabilities for SOCs that operate in a centralized model. For example, you can monitor your organization’s cloud security posture as a whole, which allows for key performance indicator (KPI) benchmarking, both internally and industry wide. This can then help your organization target security initiatives, training, and awareness on lower-scoring areas.

Security orchestration, automation, and response (SOAR) is a phrase commonly used across the security industry, and the cloud unlocks this capability. Combining both native and third-party security services and solutions with automation facilitates quick resolution of security incidents. The use of SOAR means that only incidents that need human intervention are actually reviewed by the analysts. After investigation, if automation can be introduced on that alert, it’s quickly applied. Having a central place for automating alerts helps the organization to have a consistent and structured approach to the response for security events and gives analysts more time to focus on activities like threat hunting.

Additionally, such threat-hunting operations require a central security data lake or similar technology. As a result, the SecOps team helps to drive the centralization of data across the business, which is a traditional cybersecurity function.

Centralized SOC in the cloud: organizational considerations

Some KPIs that a traditional SOC would typically use are time to detect (TTD), time to acknowledge (TTA), and time to resolve (TTR). These have been good metrics that SecOps managers can use to understand and benchmark how well the SecOps team is performing, both internally and against industry benchmarks. As your organization starts to take advantage of the breadth and depth available within the cloud, how does this change the KPIs that you need to track? As stated earlier, the cloud makes it easier to track KPIs through increased visibility of your cloud footprint—although you should evaluate traditional KPIs to understand whether they still make sense to use. Some additional KPIs that should be considered are metrics that show increasing automation, reduction in human access, and the overall improvement in security posture.

Organizations should consider scaling factors for operational processes and capability in the centralized SOC model. Once benefits from adopting the cloud have been realized, organizations typically expand and scale up their cloud footprint aggressively. For a centralized SecOps team, this could cause a challenging battle between the wider business, which wants to expand, and the SOC, which needs the ability to fully understand and respond to issues in the environment. For example, most organizations will put together small proof of concepts (POCs) to showcase new architectures and their benefits, and these POCs may become available as blueprints for the wider organization to consume. When new blueprints are implemented, the centralized SecOps team should implement and rely on its automation capabilities to verify that the correct alerting, monitoring, and operational processes are in place.

Decentralization: all ownership with the application teams

Moving or designing workloads in the cloud provides organizations with many benefits, such as increased speed and agility, built-in native security, and the ability to launch globally in minutes. When looking at the decentralized model, business units should incorporate practices into their development pipelines to benefit from the security capabilities of the cloud. This is sometimes referred to as a shift left or DevSecOps approach—essentially building security best practices into every part of the development process, and as early as possible.

Placing the ownership of the SecOps function on the business units and application owners can provide some benefits. One immediate benefit is that the teams that create applications and architectures have first-hand knowledge and contextual awareness of their products. This knowledge is critical when security events occur, because understanding the expected behavior and information flows of workloads helps with quick remediation and resolution of issues. Having teams work on security incidents in the ways that best fit their operational processes can also increase speed of remediation.

Decentralization: organizational considerations

When considering the decentralized approach, there are some organizational considerations that you should be aware of:

Dedicated security analysts within a central SecOps function deal with security incidents day in and day out; they study the industry, have a keen eye on upcoming threats, and are also well versed in high-pressure situations. By decentralizing, you might lose the consistent, level-headed experience they offer during a security incident. Embedding security champions who have industry experience into each business unit can help ensure that security is considered throughout the development lifecycle and that incidents are resolved as quickly as possible.

Contextual information and root cause analysis from past incidents are vital data points. Having a centralized SecOps team makes it much simpler to get a broad view of the security issues affecting the whole organization, which improves the ability to take a signal from one business unit and apply that to other parts of the organization to understand if they are also vulnerable, and to help protect the organization in the future.

Decentralizing the SecOps responsibility completely can cause you to lose these benefits. As mentioned earlier, effective communication and an environment to share data is key to verifying that lessons learned are shared across business units—one way of achieving this effective knowledge sharing could be to set up a Cloud Center of Excellence (CCoE). The CCoE helps with broad information sharing, but the minimization of team hand-offs provided by a centralized SecOps function is a strong organizational mechanism to drive consistency.

Traditionally, in the centralized model, the SOC has 24/7 coverage of applications and critical business functions, which can require a large security staff. The need for 24/7 operations still exists in a decentralized model, and having to provide that capability in each application team or business unit can increase costs while making it more difficult to share information. In a decentralized model, having greater levels of automation across organizational processes can help reduce the number of humans needed for 24/7 coverage.

Blending the models: the hybrid approach

Most organizations end up using a hybrid operating model in one way or another. This model combines the benefits of the centralized and decentralized models, with clear responsibility and division of ownership between the business units and the central SecOps function.

This best-of-both-worlds scenario can be summarized by the statement “global monitoring, local response.” This means that the SecOps team and wider cybersecurity function guides the entire organization with security best practices and guardrails while also maintaining visibility for reporting, compliance, and understanding the security posture of the organization as a whole. Meanwhile, local business units have the tools, knowledge, and expertise available to confidently own remediation of security events for their applications.

In this hybrid model, you split delegation of ownership into two parts. First, the operational capability for security is centrally owned. This centrally owned capability builds upon the partnership between the application teams and the security organization, via the CCoE. This gives the benefits of consistency, tooling expertise, and lessons learned from past security incidents. Second, the resolution of day-to-day security events and security posture findings is delegated to the business units. This empowers the people closest to the business problem to own service improvement in ways that best suit that team’s way of working, whether that’s through ChatOps and automation, or through the tools available in the cloud. Examples of the types of events you might want to delegate for resolution are items such as patching, configuration issues, and workload-specific security events. It’s important to provide these teams with a well-defined escalation route to the central security organization for issues that require specialist security knowledge, such as forensics or other investigations.

A RACI is particularly important when you operate in this hybrid model. Making sure that there is a clear set of responsibilities between the business units and the SecOps team is crucial to avoid confusion when security incidents occur.

Conclusion

The cloud has the ability to unlock new capabilities for your organization. Increased security, speed, and agility and are just some of the benefits you can gain when you move workloads to the cloud. The traditional centralized SecOps model offers a consistent approach to security detection and response for your organization. Decentralization of the response provides application teams with direct exposure to the consequences of their design decisions, which can speed up improvement. The hybrid model, where application teams are responsible for the resolution of issues, can improve the time to fix issues while freeing up SecOps to continue their works. The hybrid operating model compliments the capabilities of the cloud, and enables application owners and business units to work in ways that best suit them while maintaining a high bar for security across the organization.

Whichever operating model and strategy you decide to embark on, it’s important to remember the core principles that you should aim for:

  • Enable effective risk management across the business
  • Drive security awareness and embed security champions where possible
  • When you scale, maintain organization-wide visibility of security events
  • Help application owners and business units to work in ways that work best for them
  • Work with application owners and business units to understand the cyber landscape

The cloud offers many benefits for your organization, and your security organization is there to help teams ship and operate securely. This confidence will lead to realized productivity and continued innovation—which is good for both internal teams and your customers.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Stuart Gregg

Stuart Gregg

Stuart enjoys providing thought leadership and being a trusted advisor to customers. In his spare time Stuart can be seen either eating snacks, running marathons or dabbling in the odd Ironman.