AWS Architecture Blog

Field Notes: How FactSet Balances Developer Velocity with Governance using AWS IAM

This post was co-written by FactSet’s Cloud Infrastructure team, Gaurav Jain, Nathan Goodman, Geoff Wang, Daniel Cordes, Sunu Joseph and AWS Solution Architects, Amit Borulkar and Tarik Makota.

At FactSet, the goal for the cloud platform on AWS is to combine high developer velocity with enterprise governance. They wanted application teams to have a frictionless experience that balances agility, governance, and ease of use. They achieved this using micro-accounts, where each AWS account is allocated to one project and is owned by a single team. By using the AWS account as an isolation boundary, each application team can be granted a broad set of permissions to explore and innovate within their own accounts.

Source Control as source of truth

One of FactSet’s core design principles for the identity model in micro-accounts is that all configuration must be stored in source control. For them, source control is the source of truth that drives automation through a GitOps workflow. They start with a default configuration, which they then copy into an account-specific folder under source control. Each AWS account is provisioned from its account-specific folder in source control. Once an AWS account is provisioned, they can adjust the configuration in source control to grant or deny permissions per account, depending on the pace at which a team matures and adopts AWS services.

They maintain compliance across account, IAM role, and IAM policy configuration by using automation to detect drift and correct it. This helps them maintain these roles across thousands of micro-accounts. Having source control as the source of truth allows them to start every account with a standard default configuration that follows approved best practices, while still allowing per-account adjustments that limit the scope of impact of an AWS Identity and Access Management (IAM) policy change. This is a common paradigm in several areas described later in this post.
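
To make the drift-detection idea concrete, here is a minimal sketch in Python. It is not FactSet's automation: the function name, the role names, and the shape of the configuration dictionaries are all assumptions made for illustration. It only shows the comparison step, where the desired state comes from source control and the actual state is read from the account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftReport:
    missing: list      # defined in source control but absent from the account
    unexpected: list   # present in the account but not in source control
    changed: list      # present in both, but the contents differ


def detect_drift(desired: dict, actual: dict) -> DriftReport:
    """Compare desired configuration (source control) with actual configuration."""
    missing = sorted(desired.keys() - actual.keys())
    unexpected = sorted(actual.keys() - desired.keys())
    changed = sorted(
        name for name in desired.keys() & actual.keys()
        if desired[name] != actual[name]
    )
    return DriftReport(missing, unexpected, changed)


# Hypothetical role names and policy labels, for illustration only.
desired = {
    "ReadOnly": {"policy": "read-only-v1"},
    "Developer": {"policy": "developer-v2"},
    "DevOps": {"policy": "devops-v1"},
}
actual = {
    "ReadOnly": {"policy": "read-only-v1"},
    "Developer": {"policy": "developer-v1"},  # edited out of band
    "Scratch": {"policy": "admin"},           # created out of band
}

report = detect_drift(desired, actual)
print(report)
```

A correction step would then recreate the missing items, revert the changed ones, and remove or flag the unexpected ones, always treating source control as authoritative.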

Standard Policies and Roles

Their micro-accounts have a layered authorization model, starting with IAM roles and Service Control Policies (SCPs) as the foundation for what is permitted inside an account. They use IAM roles to delegate access to users, applications, or services. SCPs are policies that they use to manage permissions in their organization. SCPs give them central control over the maximum available permissions for all accounts in their organization. Each micro-account comes with standard IAM roles and SCPs, both for interactive use and for automation (Figure 1).

Figure 1 – Authorization model for a micro-account

Service Control Policies

Service Control Policies are an important part of their governance. They provide enterprise-level control by defining guardrails, or setting limits, on any permissions that may be granted within a given micro-account.  Some examples of SCPs for micro-accounts that they use are:

  • Policies to ensure critical infrastructure and logging resources are protected in each micro-account
  • Policies to gate different AWS technologies until they are approved for use within the enterprise
  • Policies to ensure that people are operating in approved AWS Regions (see the following JSON policy)
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyOtherRegions",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {
                    "aws:RequestedRegion": [
                        "region1",
                        "region2"
                    ]
                },
                "ArnNotLike": {
                    "aws:PrincipalArn": [
                        "arn:aws:iam::*:role/SupportInteractiveRole",
                        "arn:aws:iam::*:role/SupportAutomationRole"
                    ]
                }
            }
        }
    ]
}

The preceding policy is an example Service Control Policy that denies access to all Regions except those in an approved list.

Note – Carefully consider which global AWS services need to be exempted from the example Service Control Policy in your environment. Global services have endpoints in the us-east-1 Region and need to be exempted. Visit Example Service Control Policies for more information.

In order to realize the benefit of SCPs while retaining maximum flexibility, they create SCPs from a template, store them in source control, and attach them at the account level per micro-account (Figure 2).
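
As a rough illustration of creating an SCP from a template, the following Python sketch renders the Region policy shown earlier from two inputs. The function name and the example Region names are assumptions, not FactSet's tooling. The sketch uses the `ArnNotLike` operator for the role exemption, which is a choice made here, and it does not add the global-service exemptions mentioned in the note above.

```python
import json


def render_region_scp(approved_regions, exempt_role_names):
    """Build a Region-restricting SCP document as a Python dict."""
    if not approved_regions:
        raise ValueError("at least one approved Region is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOtherRegions",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    # Deny only when the request targets a non-approved Region...
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(approved_regions)
                    },
                    # ...and the caller is not one of the exempt support roles.
                    "ArnNotLike": {
                        "aws:PrincipalArn": [
                            f"arn:aws:iam::*:role/{name}"
                            for name in sorted(exempt_role_names)
                        ]
                    },
                },
            }
        ],
    }


# Example Regions chosen for illustration; role names come from the policy above.
scp = render_region_scp(
    ["us-east-1", "eu-west-1"],
    ["SupportInteractiveRole", "SupportAutomationRole"],
)
print(json.dumps(scp, indent=4))
```

Rendering the document from parameters and committing the result to an account-specific folder keeps each account's SCP reviewable in source control.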

With this flexibility, there are a few things that need to be controlled:

  • Access to the top-level management account must be granted to a dedicated member account that runs the automation.
  • That member account needs a dedicated role it can assume in the management account, since the management account is the only place where SCPs can be managed.
  • SCP service quotas – as the number of accounts grows (they have thousands), the AWS Organizations quota for the number of policies can quickly be reached. In their case, accounts start out with the same SCP and most of them are never changed. The automation process calculates an MD5 hash of the contents of each SCP in source control to compare policies across thousands of accounts. For accounts that have the same policy content, they attach the same policy object in AWS Organizations. This minimizes the number of distinct policies in AWS Organizations.

Figure 2 – Service Control Policy attachments
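
The hash-based deduplication described in the list above could be sketched as follows. This is an illustration under assumptions: the account IDs are made up, and hashing a canonical JSON rendering (sorted keys, no extra whitespace) is a choice made here so that formatting differences do not produce different hashes. The source only says that an MD5 hash of the SCP contents is compared. MD5 serves purely as a content fingerprint, not as a security control.

```python
import hashlib
import json
from collections import defaultdict


def policy_fingerprint(policy: dict) -> str:
    """MD5 of a canonical JSON rendering of the policy document."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def group_accounts_by_policy(account_policies: dict) -> dict:
    """Map each distinct policy fingerprint to the accounts that share it."""
    groups = defaultdict(list)
    for account_id, policy in account_policies.items():
        groups[policy_fingerprint(policy)].append(account_id)
    return dict(groups)


default_scp = {"Version": "2012-10-17", "Statement": [{"Sid": "Default"}]}
custom_scp = {"Version": "2012-10-17", "Statement": [{"Sid": "Custom"}]}

# Hypothetical account IDs; two accounts share the default policy.
account_policies = {
    "111111111111": default_scp,
    "222222222222": {"Statement": [{"Sid": "Default"}], "Version": "2012-10-17"},
    "333333333333": custom_scp,
}

groups = group_accounts_by_policy(account_policies)
for fingerprint, accounts in groups.items():
    print(fingerprint, accounts)
```

Each fingerprint would then correspond to one policy object in AWS Organizations, attached to every account in its group.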

IAM Roles

They provision each micro-account with three standard interactive IAM roles for use solely by the application team.

  • Read-only – for listing and reading all data and configuration within the account. This is often used by non-technical members of the team. Examples would include reading from an Amazon Simple Queue Service (Amazon SQS) queue or viewing configuration of an existing Amazon Elastic Compute Cloud (Amazon EC2) instance.
  • Developer – for reading/writing data within the account. This is intended to be the “daily driver” role for regular development against resources that have already been provisioned. Examples of this would include reading/writing items to an existing Amazon SQS queue or powering an existing Amazon EC2 instance on or off.
  • DevOps – for CRUD activity to manage infrastructure within the account. This is the most permissive of the roles and is essentially a power-user with some limited IAM permissions. Examples of this would include creating a new Amazon SQS queue or shutting down an Amazon EC2 instance.

Note – In production environments, the Developer and DevOps interactive roles can only be assumed through the break-glass mechanism. See the Break-glass section later in this post.

Support Interactive Roles

In their environment, certain teams like Cloud Administrators, Information Security, and Infrastructure play a fundamental role in supporting the application teams. Some examples of this are: performing security audits, managing billing, debugging database issues or networking problems.

Every micro-account has a set of standard interactive support roles that give these support teams access to all AWS accounts, so they can support and collaborate with application teams. Role-based access helps ensure least privilege for the central support teams as they do their work in each account and support the AWS account owners.

User-owned Groups

They allow interactive roles in every micro-account to be assumed by specific Active Directory groups. These groups are owned by the individuals responsible for each project, who have the knowledge and ability to maintain the group membership themselves. This was done intentionally to lower the operational overhead on central cloud teams and to let application owners control who has access to their accounts.

Developer Automation Roles

Each micro-account also comes with standard automation roles that can be used by applications:

  • Service-Start (execution) – A role that can read/write data within the account. Its policy is similar in spirit to that of the Developer interactive role. The trust policy is configured to allow many commonly used services, such as AWS Lambda, Amazon EC2, Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), and AWS CodeBuild, to assume this role. It is intended to help application developers get started in AWS quickly and easily.
  • Remote-Start (execution) – A limited role that can read/write Amazon Simple Storage Service (Amazon S3) from outside the boundaries of AWS, that is, from on-premises data centers. The trust policy is configured for their external IdP and an authorization database that dictates which application service principal can assume this role.
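
For readers unfamiliar with trust policies, the sketch below builds one that lets a set of AWS services assume a role, in the spirit of the Service-Start role. The helper name and the exact list of service principals are assumptions. The post names the services but not the principals FactSet trusts, and the right principal can depend on how a service is used (for example, ECS tasks use `ecs-tasks.amazonaws.com`).

```python
import json

# Assumed service principals for the services named above; adjust to your usage.
SERVICE_PRINCIPALS = [
    "lambda.amazonaws.com",
    "ec2.amazonaws.com",
    "ecs-tasks.amazonaws.com",
    "eks.amazonaws.com",
    "codebuild.amazonaws.com",
]


def build_trust_policy(service_principals):
    """Return a trust policy allowing the given services to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": sorted(service_principals)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


trust_policy = build_trust_policy(SERVICE_PRINCIPALS)
print(json.dumps(trust_policy, indent=4))
```

The trust policy only controls who may assume the role. What the role can do is still governed by its permissions policy and by the SCPs on the account.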

Support Automation roles

As with the interactive support roles, central cloud teams also support application teams through automation, for tasks such as enforcing policy, deploying and maintaining infrastructure, and monitoring accounts for governance, inventory, and security.

One example of this is the automated patching of EC2 instances in the application teams’ accounts. The central cloud infrastructure team uses automation to assume a specific automation role in each target account to perform the patching workflow. Having these roles, and specific teams supporting these common workflows for their several hundred micro-accounts, removes many maintenance responsibilities from the owning teams.

Permission Boundaries

One of the features of the AWS IAM service that they found very beneficial from a flexibility and operations perspective is safe delegation of IAM role/policy editing using permission boundaries. They use permission boundaries for IAM to control the maximum set of permissions that any application team can grant in each account.

The standard DevOps role includes a policy statement that allows IAM role and policy CRUD actions if the required permission boundary is attached, as shown in the following policy. This has enabled individual application teams to define and provision new custom roles/policies as needed, while the central cloud team still gets to enforce governance across the enterprise. The permission boundary has the same policy content as the DevOps role IAM policy.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPowerUser",
            "Effect": "Allow",
            "NotAction": [
                "iam:*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowIAMRoleCreation",
            "Effect": "Allow",
            "Action": [
                "iam:CreateRole",
                "iam:DeleteRolePolicy",
                "iam:AttachRolePolicy",
                "iam:DetachRolePolicy",
                "iam:PutRolePolicy",
                "iam:PutRolePermissionsBoundary"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "iam:PermissionsBoundary": "arn:aws:iam::[AWS Account ID]:policy/boundary-policy"
                }
            }
        }
    ]
}

The preceding policy is an example DevOps IAM policy that requires a permission boundary.
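
To illustrate the effect of the `iam:PermissionsBoundary` condition, here is a deliberately simplified Python model of the two statements in the policy. It is not IAM's evaluation logic, and the function names are hypothetical. Here `boundary_arn` stands for the permission boundary on the role being created or modified, and the guarded actions are taken from the statement above.

```python
# IAM actions the example policy allows only when the boundary is in place.
GUARDED_ACTIONS = {
    "iam:CreateRole",
    "iam:DeleteRolePolicy",
    "iam:AttachRolePolicy",
    "iam:DetachRolePolicy",
    "iam:PutRolePolicy",
    "iam:PutRolePermissionsBoundary",
}


def required_boundary_arn(account_id: str) -> str:
    return f"arn:aws:iam::{account_id}:policy/boundary-policy"


def request_allowed(action: str, account_id: str, boundary_arn=None) -> bool:
    """Simplified model of the example DevOps policy."""
    if not action.startswith("iam:"):
        return True  # covered by the AllowPowerUser statement (NotAction iam:*)
    if action not in GUARDED_ACTIONS:
        return False  # other IAM actions are not granted by this policy
    return boundary_arn == required_boundary_arn(account_id)


account = "123456789012"  # placeholder account ID
print(request_allowed("sqs:CreateQueue", account))
print(request_allowed("iam:CreateRole", account))
print(request_allowed("iam:CreateRole", account, required_boundary_arn(account)))
```

In this model, a team can create a custom role only if that role carries the boundary, so the new role can never exceed what the boundary permits.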

Single authentication/authorization/credentials API

They handle authentication and authorization of interactive and programmatic logins through a single central API that communicates with an IdP and AWS Security Token Service (AWS STS) on the backend to provide the credentials needed to access the AWS account (Figure 3).

Figure 3 – Central Authentication/Authorization/Credentials API

They provide three main interfaces for users to access their accounts. All these interfaces use the same central API to get credentials for the user.

Identity Web Portal

They created a web portal to provide a user-friendly UI. It lets users see which accounts and roles they have access to in a single view. From the web portal, they can easily get console access to any of their AWS accounts. Other helpful features include a view of break-glass status and one-click functionality to export credentials for use in the AWS CLI.

Figure 4 – Identity Web Portal

Identity CLI

To enhance the experience of developers who would prefer not to swap between a command line interface and a web browser, they built a CLI tool for developers to perform the same process of authentication, authorization and getting temporary credentials for a given account and role. This helps improve developer efficiency when working with multiple AWS accounts and goes hand in hand with using the AWS CLI.

Identity API

An API interface is also essential for programmatic access, which service accounts use to assume the Remote-Start (execution) role in an account. Credentials for the service account are passed into the API to get valid temporary credentials from AWS STS.

Authentication

Interactive users provide their corporate credentials through the web portal or CLI, which in turn calls the central API. The user request is then authenticated with the IdP. Similarly, they support authenticating service accounts for programmatic access to developer automation roles such as Remote-Start (execution).

Authorization

For individual users, they keep mappings in a central database. These mappings determine which IAM roles a user can assume, and in which accounts. The source of truth for the mappings is in source control, and they enforce it regularly to correct drift and apply new changes.

For application service principals, they authorize each principal to assume a single role in a single account. As with individual users, these mappings are kept in source control and enforced regularly across all accounts.

Break-glass

During the authorization phase, they have implemented the ability to grant elevated access permissions when necessary (break-glass). This is a form of conditional access where they check that all requirements are met before the user is considered fully authorized. The break-glass feature is required to assume any interactive IAM role that can make changes, as well as for any corresponding OS-level access for Amazon EC2, in production AWS environments. Having a central API for authentication and authorization made this feature easy to implement, as there is a single gate.
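
The single-gate idea can be sketched as one function that combines the user-to-role mapping with the break-glass check. This is an illustrative model only: the data shapes, the environment labels, and how an active break-glass is recorded are assumptions. The set of change-capable roles follows the earlier note that the Developer and DevOps roles require break-glass in production.

```python
from dataclasses import dataclass

# Interactive roles that can make changes and so need break-glass in production.
CHANGE_ROLES = {"Developer", "DevOps"}


@dataclass(frozen=True)
class Account:
    account_id: str
    environment: str  # hypothetical labels, e.g. "production" or "development"


def is_authorized(user, account, role, grants, active_breakglass):
    """Return True only if every requirement for this request is met."""
    # Step 1: the user must be mapped to this account and role.
    if (account.account_id, role) not in grants.get(user, set()):
        return False
    # Step 2: change-capable roles in production also need an active break-glass.
    if account.environment == "production" and role in CHANGE_ROLES:
        return (user, account.account_id) in active_breakglass
    return True


prod = Account("111111111111", "production")  # made-up account ID
grants = {"alice": {("111111111111", "ReadOnly"), ("111111111111", "DevOps")}}

print(is_authorized("alice", prod, "ReadOnly", grants, set()))
print(is_authorized("alice", prod, "DevOps", grants, set()))
print(is_authorized("alice", prod, "DevOps", grants, {("alice", "111111111111")}))
```

Because every interface goes through the same check, adding a new requirement means changing one function rather than each client.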

Credentials

One of their design principles for the identity model was to make IAM roles and temporary credentials the primary IAM principal type used with AWS, instead of long-lived IAM user credentials. Application teams have the flexibility to choose the length of time for their authenticated session, up to a defined maximum duration. Once the user is authenticated and authorized for a given account and role, they create valid temporary credentials using AWS Security Token Service (AWS STS). With these temporary AWS credentials, they can give their users console access or let them export the credentials for use with the AWS CLI or SDKs.
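
As a small illustration of a session length capped at a defined maximum, the following sketch clamps a requested duration before it would be passed to AWS STS. The function name and the one-hour default maximum are assumptions, since the post does not state FactSet's actual limit. The 900-second floor reflects the minimum session duration that AWS STS accepts for AssumeRole.

```python
STS_MIN_SECONDS = 900  # minimum DurationSeconds accepted by AssumeRole


def session_duration(requested_seconds: int, max_seconds: int = 3600) -> int:
    """Clamp a requested session length to the allowed range."""
    if max_seconds < STS_MIN_SECONDS:
        raise ValueError("maximum duration is below the STS minimum")
    return max(STS_MIN_SECONDS, min(requested_seconds, max_seconds))


print(session_duration(1800))
print(session_duration(7200))
print(session_duration(60))
```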

Considerations

Building a custom identity system can be complex and may not be for everyone. They started by using the AWS Single Sign-On service when it was initially released, when they had a small number of AWS accounts to manage. As FactSet’s multi-account strategy became clear, the scale of micro-accounts drove them to look for solutions in the following areas:

  • Need for just-in-time access and temporary credentials
  • Need for an API to automate provisioning access for thousands of AWS accounts
  • Need for integration with a third-party Identity Provider
  • Need for multiple methods to get temporary credentials for developer productivity (UI, CLI, API)
  • Need for a conditional access or break-glass process to control access to production environments
  • Need for integration into a central logging stack

The business requirements for FactSet outweighed the cost of engineering a new custom solution, as well as any required on-going engineering maintenance and operations. Since that time, many things have improved in the AWS SSO service. Some of the features they required have been added over time.

For others that may consider building their own solution, perhaps start by using source control as the source of truth for IAM policy and role configuration. FactSet recommends this as it makes management of resources at scale much easier. For the other parts, it would be good to consider the pros and cons of developing a custom authentication/authorization API.

Conclusion

Identity plays a critical part in securely operating in the cloud. In FactSet’s case, having the account be the delineation between different application teams greatly simplified permissions. The micro-account strategy we outlined rests on the premise that each account’s IAM roles and policies are driven from an account-specific location in source control and are applied with automation. We hope this overview of the identity model used with FactSet’s AWS micro-accounts gives you some ideas on how to balance developer velocity with security and governance at scale.

In their own words, FactSet creates flexible, open data and software solutions for tens of thousands of investment professionals around the world, which provides instant access to financial data and analytics that investors use to make crucial decisions. At FactSet, we are always working to improve the value that our products provide.

Recommended Reading:

Field Notes: How FactSet Uses ‘microAccounts’ to Reduce Developer Friction and Maintain Security at Scale

Field Notes provides hands-on technical guidance from AWS Solutions Architects, consultants, and technical account managers, based on their experiences in the field solving real-world business problems for customers.

 

Daniel Cordes

Daniel Cordes is a Principal Software Architect with FactSet. He manages cloud automation efforts for internal developers. He holds master’s degrees from Columbia University and the City University of New York.

Gaurav Jain

Gaurav Jain is Director of Cloud Platform at FactSet Research Systems. He is responsible for implementing FactSet’s cloud adoption and migration strategy. His background includes developing data APIs and database technologies for financial data. He holds an MSE in Computer Engineering from the University of Michigan and an MBA from NYU Stern.

Nathan Goodman

Nathan Goodman is a Principal Systems Architect on FactSet’s Public Cloud enablement team. He has over 20 years of industry experience building highly scalable and resilient systems, both on premises and in the cloud.

Geoff Wang

Geoff is a Principal Systems Engineer in FactSet’s Cloud Team, focused on helping developers on FactSet’s journey to the cloud. He has over 15 years of experience in system architecture and administration and holds a BSE in Electrical Engineering from the University of Michigan.

Sunu Joseph

Sunu Joseph is an Associate Director in FactSet’s Cloud Team focused on improving developer workflows during cloud adoption and migration. He has 11 years of industry experience with a background in developing database technologies to store financial data and building software solutions for the aerospace industry. He holds an MS in Computer Science from the University of Edinburgh.

Tarik Makota

Tarik Makota is a Principal Solutions Architect with Amazon Web Services. He provides technical guidance, design advice and thought leadership to AWS’ customers across the US Northeast. He holds an M.S. in Software Development and Management from Rochester Institute of Technology.

Amit Borulkar

Amit is a Solutions Architect with Amazon Web Services (AWS), focused on helping customers craft highly resilient and scalable cloud architectures that address their business problems. He holds a master’s degree in Computer Science from North Carolina State University.