Category: AWS Partner Solutions Architect (SA) Guest Post


Architecting Microservices Using Weave Net and Amazon EC2 Container Service

In the past, it was far more common to have services that were relatively static. You might remember having a database server that existed on a fixed IP, and you probably had a static DNS mapping to that server. If this information changed, it could be a problem.

However, in the cloud we accept change as a constant, especially when building microservice architectures based on containers. Containers often live only for minutes or hours, and they may change hosts and ports each time they are spun up. When building containerized microservices, a major component of the architecture includes underlying “plumbing” that allows components to communicate across many hosts in a cluster. Generally, the tools allowing this communication also help us deal with the dynamic nature of containers by automatically tracking and replicating information about changes across the cluster. These tools provide functionality commonly known as “service discovery.” Today, we’re going to talk about a popular service discovery option, Weave Net, by APN Technology Partner Weaveworks.

We’re fans of Weave Net for a few reasons, but the most attractive thing about Weave Net is its simplicity. Weave Net allows you to reference containers by name, as you would reference them locally with Docker links, across the entire cluster. That’s it.

Weave Net is also unique among service discovery options because it doesn’t use an external database or a consensus algorithm, but instead uses a gossip protocol to share grouped updates about changes in the cluster. This has interesting implications for partition tolerance, where availability is prioritized over consistency.

Let’s talk CAP theorem!

In a world where the network is unreliable (read: the one we live in), your distributed system can be highly available and partition tolerant, or strongly consistent and partition tolerant, but not highly available, strongly consistent, and partition tolerant all at once. Service discovery options that use consensus algorithms prioritize consistency, so during network partition events they must sacrifice availability to avoid inconsistent cluster state.

Weave Net, on the other hand, uses a data structure called a CRDT (conflict-free replicated data type), where nodes in the cluster make only local updates and replication happens by merging updates across the rest of the cluster. This means the data structure is eventually consistent, but the lack of a strong consistency requirement allows it to be extremely fast and highly available. (See the fast data path discussion on the Weaveworks website.) It's important to consider the needs of your system: if your application prioritizes availability or doesn't require strong consistency, Weave Net is an excellent service discovery option.

Microservice design with Weave Net and Amazon ECS

Within Amazon EC2 Container Service (Amazon ECS), you can use task definitions to deploy containers to your ECS cluster. A task definition can include one or many containers, but the key concept to remember is that when you run a task definition, all the containers in that task get placed on a single instance. Ask yourself, “Does my application require a specific combination of containers to run together on the same host?” You might say, “Yes, this is necessary,” if your containers need to share a local Amazon Elastic Block Store (Amazon EBS) volume, for example.

You might also answer in the affirmative if your containers need to be linked. However, Weave Net gives you more flexibility when designing your application. Because Weave Net automatically handles container communication across the cluster, you have a bit more freedom when building task definitions. You’re freed from needing to use local Docker links, so you don’t have to place all the containers that make up your application on the same instance.

As a result, you can define a single container per task definition. Consider our sample application: a Python Flask application on the front end and Redis on the back end. Because these components have different scheduling and scaling requirements, we should manage them independently by building a separate task definition for each container. This one-to-one model is far simpler to think about and to manage than multiple containers defined in a single task definition.

Let's unpack our sample application architecture a little bit. Our front-end application is a simple Python hit counter. In the application code, we build the connection object for our Redis back end. We know that we are going to name the Redis container redis, so we can write our code with this in mind. Weave Net will take care of the rest.
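The original application code isn't reproduced here, but a minimal sketch of such a hit counter, assuming the Flask and redis-py libraries, might look like the following. The only service discovery "configuration" is the host name redis, which Weave Net resolves to the back-end container:

# app.py -- illustrative sketch, not the actual errordeveloper/hit-counter source
from flask import Flask
from redis import Redis

app = Flask(__name__)

# Weave Net resolves the container name "redis" to the back-end container,
# so no IP addresses or discovery configuration are needed here.
redis = Redis(host="redis", port=6379)

@app.route("/")
def hit():
    count = redis.incr("hits")
    return "Hello! This page has been viewed {} time(s).\n".format(count)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)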

Now we need to build two ECS task definitions. These definitions are really straightforward, since we’re adding only a single container to each task.

Here’s the back-end Redis task (notice that my container name is redis):

{
    "ContainerDefinitions": [
        {
            "Essential": true,
            "Name": "redis",
            "Image": "redis",
            "Cpu": 10,
            "Memory": 300
        }
    ],
    "Volumes": []
}

Here’s the front-end hit counter task:

{
    "ContainerDefinitions": [
        {
            "PortMappings": [
                {
                    "HostPort": 80,
                    "ContainerPort": 5000
                }
            ],
            "Essential": true,
            "Name": "hit-counter",
            "Image": "errordeveloper/hit-counter",
            "Command": [
                "python",
                "app.py"
            ],
            "Cpu": 10,
            "Memory": 300
        }
    ],
    "Volumes": []
}

Next, to scale these components individually, we’ll take these task definitions and wrap a higher-level ECS scheduling construct, known as a service, around them. A service lets us do things like define how many tasks we want running in the cluster, scale the number of active tasks, automatically register the containers to an ELB, and maintain a quota of healthy containers. In our architecture, which has one container per task and one task per service, we can use the service scheduler to determine how many Redis containers and hit counter applications we’d like to run. If we run more than one container per service, Weave Net automatically handles load balancing among the containers in the service via “round robin” DNS responses.
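For example, creating the front-end service could look roughly like the following sketch, which uses the AWS SDK for Python (boto3); the service name and task definition revision are hypothetical:

import boto3

ecs = boto3.client("ecs", region_name="eu-west-1")

# Run three copies of the hit-counter task as a long-running service.
# The service name and task definition family:revision are placeholders.
ecs.create_service(
    cluster="WeaveSKO-EcsCluster-1USVF4UXK0IET",
    serviceName="hit-counter-service",
    taskDefinition="hit-counter:1",
    desiredCount=3,
)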

The end result is an ECS cluster that has three container instances and two services—one front-end hit-counter service scaled to three tasks, and a back-end Redis service with one task running.

$ aws ecs describe-clusters --cluster WeaveSKO-EcsCluster-1USVF4UXK0IET --region eu-west-1
{
    "clusters": [
        {
            "status": "ACTIVE",
            "clusterName": "WeaveSKO-EcsCluster-1USVF4UXK0IET",
            "registeredContainerInstancesCount": 3,
            "pendingTasksCount": 0,
            "runningTasksCount": 4,
            "activeServicesCount": 2,
            "clusterArn": "arn:aws:ecs:eu-west-1:<account-id>:cluster/WeaveSKO-EcsCluster-1USVF4UXK0IET"
        }
    ]
}

$ aws ecs list-services --cluster arn:aws:ecs:eu-west-1:<account-id>:cluster/WeaveSKO-EcsCluster-1USVF4UXK0IET --region eu-west-1
{
    "serviceArns": [
        "arn:aws:ecs:eu-west-1:<account-id>:service/WeaveSKO-EcsBackendDataService-1KTM3UFB3LKIO",
        "arn:aws:ecs:eu-west-1:<account-id>:service/WeaveSKO-EcsFrontendAppService-1A5RQTSV7LMWE"
    ]
}

Weave Net technical deep dive

How does Weave Net provide so much flexibility, with essentially no configuration, when you're designing microservices?

Weave Net starts by implementing an overlay network between cluster hosts. Each host has a network bridge, and containers are connected to the bridge with a virtual Ethernet pair, through which they are assigned an IP address and netmask.

Weave Net discovers peers by querying the Auto Scaling API, so you don’t need to configure the overlay network yourself. On each host, Weave Net also runs a component called the Docker API proxy, which sits between the Docker daemon and the Docker client to record events on the host, like a container starting or stopping. The proxy also handles automatic registration of new containers to the overlay bridge.

Weave Net builds an in-memory database of local configuration information that it records from local Docker activity, and it chooses a random subset of peers with which to exchange topology information (using the gossip protocol). When it's time to forward communications between hosts, packets are encapsulated in a tunnel header by the Linux kernel. The Weave Net router communicates with the kernel's Open vSwitch datapath module to tell the kernel how to process packets. This approach allows packets to be processed entirely in the kernel, rather than being copied out to the Weave Net router in user space, avoiding costly transitions between user space and kernel space.

These features result in greatly reduced complexity when designing microservices. Weave Net also provides other niceties, like multicast support and round-robin DNS for container lookups. Take a look as we run a dig query against the hit-counter service. You can see that the query returns the addresses of the three containers that run the service, in random order:

$ docker run 2opremio/weaveecsdemo dig +short hit-counter
10.32.0.3
10.36.0.3
10.40.0.3

$ docker run 2opremio/weaveecsdemo dig +short hit-counter
10.40.0.3
10.32.0.3
10.36.0.3

Weave Scope

The last thing I'd like to point out about Weaveworks is the useful Weave Scope component. Weave Scope is a monitoring and troubleshooting tool that provides a bird's-eye view of the cluster, and it can be run either as a hosted/SaaS tool or locally on your cluster instances. It displays all the containers in the system and the relationships among them:

We can also drill down into a specific container to see vital telemetry information:

Lastly, we can open a shell directly into the container. In this screen illustration, we’re taking a look at the output of env inside the container:

Conclusion

When you build microservice-based applications, deciding how individual microservices will communicate with one another requires a lot of thought. Weave Net greatly reduces the complexity involved in standing up microservices and increases flexibility in design patterns. In this blog post, we’ve explored Weave Net features and described how you might use them to design an application in Amazon ECS.

If you’d like to take a look at Weave Net and Weave Scope yourself, you can stand up your own copy of the example architecture from this post by using the following AWS CloudFormation template, which was developed and is maintained by Weaveworks: https://s3.amazonaws.com/weaveworks-cfn-public/integrations/ecs-baseline.json. Keep in mind that this template will spin up resources that could cost you money. To learn more about Weaveworks, visit their website at http://weave.works or take a look at their technical deep dive.

Modeling SaaS Tenant Profiles on AWS

Most software companies understand the importance of being customer focused. The adoption of agile software development practices was heavily influenced by the needs of software organizations to connect to, and focus on, the continually evolving needs of their customers.

Now, take this basic reality and imagine how it manifests itself in the world of SaaS (software as a service) applications. With SaaS, your customers may be running in a fully shared environment, and the stakes are even higher. The real challenge for SaaS providers is to leverage the power and value of having a shared environment while accommodating the diversity of their customers’ needs. Striking a balance between these often-competing forces is essential to SaaS solutions.

Building an application that accommodates this level of flexibility and diversity requires diligence. The implications can be very pervasive and can affect multiple dimensions of your SaaS strategy. Ultimately, how you approach this opportunity is likely to directly influence the agility of your offering and its ability to align with the needs of your customers. The approach you choose will certainly have some impact on your ability to respond to market dynamics and competitive challenges.

The need for tenant-driven agility also highlights the intertwined nature of business and technology in SaaS environments. SaaS demands a design, architecture, and deployment vision that enables and supports the ability of your business to offer customers options that align with their specific needs.

In this blog, we’ll look at some of the considerations that go into capturing data on your tenants with an emphasis on identifying some of the key areas that could shape your system’s architecture. The landscape of possibilities here is broad, and my hope is to provide a glimpse of some of the factors that you may include in a broader assessment of your tenant profile.

Assessing Tenant Needs

Developing a tenant profile is all about understanding the value systems and domain forces that will influence a customer’s willingness to adopt your SaaS solution. Your ability to understand and categorize customer needs can help determine how and where you might need to support tenant variations in your underlying design and architecture.

Here are some of the typical questions that SaaS teams examine as part of this assessment:

  • What are some of the security considerations that might influence your customer’s willingness to run in a multi-tenant environment?
  • Are there specific standards or compliance criteria that must be met by some or all of your tenants?
  • Are there tenants who may have specific data governance requirements?
  • Are there other SaaS solutions that might be used in combination with this application?
  • Are there tenants who will demand some level of isolation from other tenants?
  • Are there tenants who will be willing to run in a fully shared environment?
  • Are there optimizations, workflows, or product features that could be offered as value-added options?
  • Are there any significant variations in how each tenant may need to respond to spikes in user load/activity?
  • Do one or more tenants require customization of the application’s flow, business rules, appearance, and so on?
  • Are there specific SLA requirements/expectations that may be imposed by some tenants?

This list is only meant to serve as a starting point. The nature of these questions is to tease out important traits that may distinguish the needs of specific tenants. Working through this list (and adding your questions) should give you a good sense of the range of options you may need to support in your solution.

Developing Tenant Personas

Once you have a well-understood list of tenant criteria, your next goal would be to determine how these criteria might be clustered together to represent different personas or families of tenants in your target audience.

This approach mirrors the agile concept of defining personas. The idea is to clearly capture and articulate the characteristics and requirements associated with each tenant type. As a part of this exercise, teams may assign tenant persona names that can be used as a common reference point for the broader team. The image below provides a simple example of how you might go about defining a persona.

Identifying personas should not be a heavyweight process. Invest enough energy to test yourself and the boundaries of your tenants’ requirements, but don’t iterate on this too long. You simply want to gather enough knowledge to ensure that you have identified the likely points of inflection in your SaaS architecture.

Aligning Architecture and Personas

The overarching goal of defining personas is to determine what values will shape the design of your SaaS application. Once you’ve identified your target personas, you’ll have the data points you need to determine the dimensions of flexibility and customizability you need to address with your underlying design.

While the personas for each SaaS application will vary, there are some common themes to consider as you begin to align your personas with your architecture. These themes often include isolation strategies, tenant policy management, metering models, and data access patterns. The sections that follow will explore how each of these areas can influence the design of your solution.

Tenant Isolation

Often, you can address the compliance and security requirements of your tenants by applying one or more isolation strategies. Tenants may require complete isolation from other tenants, or they may need isolation only at one particular layer or dimension of their environment.

The introduction of isolated tenants creates natural opportunities to define tiers of your product offering. Tenants who demand more isolation will easily understand how the value of that isolation justifies the costs associated with a higher-priced tier. Isolation also equips SaaS organizations with a mechanism for offsetting the increased costs associated with providing isolated infrastructure.

While you may have tenants who demand isolation, this does not mean every tenant will require an isolated solution. The challenge is to strike a balance between addressing the edge cases while continuing to offer shared, multi-tenant solutions where possible. The result is often a hybrid model where some tenants reside in a fully shared space while others are provisioned with some level of isolation.

The following diagram provides two examples of hybrid isolation models you can implement on Amazon Web Services (AWS). The image on the left depicts an approach that offers some tenants full isolation and others a shared environment. Tenants 1 and 2 presumably had requirements that warranted isolation and opted into this model, while the remaining tenants were willing to run in a shared environment.

The diagram on the right shows how you might implement a hybrid model, where isolation is achieved at different tiers of the architecture. The web tier is delivered as a shared environment while the application tiers are isolated for each tenant.

These are just two examples of isolation patterns. AWS provides constructs that can be leveraged to address a fairly diverse set of tenant isolation needs. The AWS stack includes account, networking, security, and storage constructs that enable a wide range of strategies for partitioning a tenant’s footprint. We won’t dig into each individual strategy in this post, but more information is available in the Tenant Isolation Architectures whitepaper.

Centralized Tenant Policies

One common way to address the varying needs of your tenant personas is to introduce a centralized model for managing tenant policies. Tenant policies enable you to customize many dimensions of the tenant experience. Performance, workflows, appearance, feature access, service-level agreements (SLAs)—all of these options can be potentially configured through tenant policies.

The design of your application directly influences the level of flexibility you’ll have in building and applying these policies. First, you’ll need to introduce a service that manages tenant configuration information. The diagram above provides a conceptual view of how these policies would be managed, acquired, and applied. Each tenant’s experience could be altered or shaped at any of the tiers of your application design based on the configuration of tenant policies.

Where this concept gets its power is in the number of levers and knobs you'll have available to configure tenant workflows, performance, and so on. An application that has been decomposed into finer-grained services allows for more granular control over the tenant experience, letting you support a wider range of tenant personas. The presence of these policies also tends to put you in a better position to respond to new tenant needs that surface as your product evolves.
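As a rough sketch of the idea, a service might resolve a tenant's policy at request time like this (the table name, attributes, and defaults are assumptions, not part of a prescribed schema):

import boto3

dynamodb = boto3.resource("dynamodb")
policy_table = dynamodb.Table("TenantPolicies")  # hypothetical policy store

DEFAULT_POLICY = {
    "tier": "standard",
    "max_requests_per_minute": 600,
    "features": ["core"],
}

def get_tenant_policy(tenant_id):
    """Fetch the policy for a tenant, falling back to defaults."""
    response = policy_table.get_item(Key={"tenant_id": tenant_id})
    return response.get("Item", DEFAULT_POLICY)

# Each tier of the application can then branch on the policy:
policy = get_tenant_policy("tenant-123")
if "advanced-reporting" in policy["features"]:
    pass  # enable the value-added reporting workflow for this tenant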

The introduction of these tenant policies may also influence how you apply AWS constructs in your environment. You can imagine, for example, that a tenant policy might correlate to an Identity and Access Management (IAM) policy to control the visibility and control that is given to a tenant or one of the tenant’s users. Tenant policies also might shape the custom metrics that you capture with CloudWatch.

Metering Tenant Activity

In many cases, tenant personas are driven directly by consumption. To support these personas, you’ll need to design and instrument your application with mechanisms that will allow you to capture and aggregate a rich set of consumption metrics. These metrics should span all the moving parts of your system, allowing you to construct a comprehensive view of the infrastructure footprint being imposed by each tenant.

To create a complete metering picture of tenant activity, you'll need to acquire data from multiple sources. Amazon CloudWatch is a good starting point for surfacing tenant metrics: it natively supplies consumption data spanning most of the core metrics you'll want to include in your tenant profile, and it also supports the publication of custom metrics that correlate to your application's features and functions. You might, for example, be able to determine that the use of a particular application feature directly correlates to a spike in disk or memory utilization.
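As a hedged sketch, publishing a tenant-scoped custom metric with the AWS SDK for Python (boto3) might look like this; the namespace, metric name, and dimension are illustrative choices rather than required conventions:

import boto3

cloudwatch = boto3.client("cloudwatch")

def record_tenant_storage(tenant_id, bytes_written):
    """Publish a custom consumption metric, dimensioned by tenant."""
    cloudwatch.put_metric_data(
        Namespace="SaaS/TenantConsumption",  # hypothetical namespace
        MetricData=[
            {
                "MetricName": "StorageBytesWritten",
                "Dimensions": [{"Name": "TenantId", "Value": tenant_id}],
                "Value": float(bytes_written),
                "Unit": "Bytes",
            }
        ],
    )

record_tenant_storage("tenant-123", 524288)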

In cases where you have isolated tenant infrastructure, you’ll want to think about how you might use tagging or separate accounts to identify resources that are associated with a specific tenant.

The last bit of metering you'll need to consider is how (or if) you plan to apply policies based on tenant consumption. You'll need to determine how your system will respond as tenants near or exceed consumption thresholds. You may choose to wire these policies up to Amazon Simple Notification Service (Amazon SNS) and Amazon Simple Email Service (Amazon SES), for example. In some rare cases, SaaS providers might even disable portions of their application's functionality when a tenant exceeds a threshold. Amazon API Gateway's throttling capabilities could be a good fit for this scenario. One common approach is to allow tenants to burst over their consumption threshold periodically, and to enforce constraints only when they begin to exceed their limits on a more regular basis.
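One way to wire a consumption threshold to a notification, sketched here with boto3, is a CloudWatch alarm on a tenant-dimensioned custom metric that publishes to an SNS topic; the threshold, period, topic ARN, and names below are assumptions for illustration:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical alarm: notify an SNS topic when a tenant's hourly storage
# consumption stays above its plan threshold for three consecutive periods.
cloudwatch.put_metric_alarm(
    AlarmName="tenant-123-storage-threshold",
    Namespace="SaaS/TenantConsumption",
    MetricName="StorageBytesWritten",
    Dimensions=[{"Name": "TenantId", "Value": "tenant-123"}],
    Statistic="Sum",
    Period=3600,
    EvaluationPeriods=3,
    Threshold=5 * 1024 ** 3,  # 5 GB per hour, purely illustrative
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:tenant-consumption"],
)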

Data Access Optimization

Storage is not an area that gets a lot of attention when we think about personas and how we might distinguish tenants. Instead, we tend to view storage as an all-or-nothing proposition where all tenants get the same behavior. However, given the wide array of storage options that are available on AWS, it makes good sense to consider how you might leverage these different storage models to support the different tiers of your SaaS system.

This concept connects directly to the tenant policies described in the previous section. Through these policies, you can offer tenants different models for accessing and storing data. Combine this with the rich set of storage options available on AWS, and you have a fairly diverse set of options at your fingertips. You might support completely different storage services for different tenants (Amazon RDS, DynamoDB, Amazon S3, Amazon Glacier, and so on). You might support separate policies for how data is archived and moved from one storage model to the next. Or you might have separate throughput policies for each tenant. The range of options is very broad.

Cost is often a key motivator for optimizing data access. If your solution has a heavy storage footprint that is a significant portion of your tenant costs, then developing storage optimization strategies for different personas is an opportunity to align your costs with your tenant tiers.

Support Automation

Support represents another area where there is often variation among SaaS tenants. It’s natural for organizations to offer different levels of SLAs to different types of tenants.

To maximize your ability to support more demanding SLAs, think about how and where you may want to introduce automation that triggers support activity. Your focus should be on proactively detecting issues and triggering the creation of support tickets. Each organization is going to have its own processes and models for driving support. The key is to determine what level of automation is needed, and which mechanisms can detect issues and trigger the support model behind the SLAs you've defined for each of your tenant personas.

AWS provides mechanisms that can assist with automating your support policies. You can use Amazon CloudWatch Events, for example, to trigger a support workflow. AWS Lambda functions can also be applied here to introduce any custom logic needed to automate your support process.

Applying YAGNI

The software development and agile communities have often advocated the YAGNI (you aren’t gonna need it) principle. This principle promotes the idea of designing only for the set of features and capabilities that you know are required today. The goal is to avoid investing in flexibility and generality that hasn’t yet earned its way into your solution.

You should consider this rule of thumb when you look at your personas and the needs you’re addressing with your solution. Yes, where possible, you’d like to accommodate generality that will promote and enable a range of tenant personas. At the same time, you don’t want to introduce abstractions or complexity that is not of immediate need or value to your current customers. You must find a way to strike a balance between architectural and design models that are enabling your business, and those that are simply laying a foundation for flexibility that may never be required.

As you think about decomposing your application into services and abstracting out tenant policies, you’ll likely find yourself right in the heart of the YAGNI challenge. Certainly, the more you lean toward more granular services and a more configuration-driven model, the more prepared you will be to support tenant variations. It’s all about finding the boundaries of what’s immediately useful, what’s just a good principle, and what might be overkill for your current needs.

Making Agility a Priority

As a SaaS solution provider, you should always consider agility and how your personas may or may not be influencing your ability to respond rapidly to customer feedback and changes in market dynamics. So, while you certainly want to embrace the varying requirements of your tenant personas, you also want to be sure that the level of customization and variation you’re accommodating doesn’t undermine the fundamental agility goals of your business.

This is especially relevant as you look at tenant isolation schemes. Typically, supporting isolation also means supporting more complex and potentially more fragile provisioning and deployment models. It also tends to limit your ability to roll out new features and capabilities rapidly. It can even affect the efficiency of your management and monitoring environments.

Leveraging AWS Services

AWS brings a rich collection of services to the SaaS domain, and provides a variety of different strategies to address the current and emerging needs of your SaaS customers. You can apply networking and account constructs to achieve multiple flavors of tenant isolation. You can use the continually evolving list of AWS storage options to offer tenants a range of different experiences. With Amazon Elastic Compute Cloud (Amazon EC2), Amazon EC2 Container Service (Amazon ECS), and AWS Lambda, you also have a diverse set of compute models that can influence the profile and footprint of your SaaS solution.

Summary

In this post, I’ve outlined the value of understanding the profiles of your tenants and aligning your design with the key points of variation for each tenant persona. Ultimately, your goal is to build a solution that offers clear, distinguished boundaries of value to each persona. You can then use these boundaries as a natural tool for creating product tiers that match the business needs of your customers with the broader cost considerations of your SaaS offering.

Whether you’re starting from scratch or evolving an existing SaaS solution, you should always have some sense of the current and evolving needs of your tenants. Even if you’re not formally tiering your product into separate offerings, you can still use your awareness of the range of tenant requirements to shape your architectural direction. Insights into your tenant profiles can lead you to new and interesting ways to leverage AWS services and capabilities in patterns that you may not have originally anticipated.

Securely Accessing Customer AWS Accounts with Cross-Account IAM Roles

As a Partner Solutions Architect, I look at a lot of AWS Partner Network (APN) Partner software and services. I like trying new things and experiencing the exciting solutions that our APN Partners are building. Security is job zero at AWS, so when I work with our APN Partners, there’s one thing I look for above all others, and that’s to understand if the APN Partner is following best practices to protect customer data and any customer AWS account they may access. If it seems like the APN Partner’s product will need to access a customer account, I’ll check to see how the APN Partner is getting credentials from the customer. If a partner is asking customers for AWS Identity and Access Management (IAM) access keys and secret keys, I halt my investigation and focus on helping the partner fix this approach.

It’s not that I have a problem with partners accessing customer accounts—APN Partners can add incredible functionality and value to the resources in an AWS account. For example, they can analyze AWS CloudTrail logs, or help optimize costs by monitoring a customer’s Amazon Elastic Compute Cloud (Amazon EC2) usage. The problem here is how the APN Partner is accessing the AWS account. IAM access keys and secret keys could be used anywhere, by anyone who has them. If a customer gives these keys to an APN Partner, they need to be able to trust that the APN Partner is adhering to best practices to protect those keys. This should really resonate with APN Partners, who need to store and protect their customers’ keys, but lack control over how customers manage those keys. Using IAM access keys and secret keys for cross-account access is not ideal for anyone. Fortunately, there is a better way.

Cross-account IAM roles allow customers to securely grant access to AWS resources in their account to a third party, like an APN Partner, while retaining the ability to control and audit who is accessing their AWS account. Cross-account roles reduce the amount of sensitive information APN Partners need to store for their customers, so that they can focus on their product instead of managing keys. In this blog post, I explain some of the risks of sharing IAM keys, how you can implement cross-account IAM roles, and how cross-account IAM roles mitigate risks for customers and for APN Partners, particularly those who are software as a service (SaaS) providers.

The problem(s) with sharing IAM keys

On AWS, access keys and secret keys are credentials that allow access to the AWS APIs in an account. They can be associated with an IAM user in an account or with the root user of an account. Sharing these keys with external parties can create a lot of headaches for everyone involved. The root of the problem with sharing IAM keys is that they can be used until explicitly revoked by the customer, and that they can be used from any computer with Internet access (servers, laptops, mobile phones, and so on). If an APN Partner's product needs to use IAM access and secret keys from a customer's account, here are some important questions that both the APN Partner and the customer should be able to confidently answer:

  • Are keys being managed securely? Are they encrypted when they are transmitted and stored? Who has access to the keys? What processes protect the keys from being exfiltrated from the partner’s systems?
  • Are keys being rotated frequently? If you are a customer, will the APN Partner tell you when to rotate your keys? As an APN Partner, how can you make sure all your customers frequently rotate their keys? How do you coordinate key rotation with your customers to minimize downtime?
  • Can you control who has access to the customer’s AWS account? For both customers and APN Partners, how will you know who uses a key, and from where?
  • Is the access policy associated with the key too permissive? For APN Partners that have a database of keys, how many of those keys provide too much access, or are root account keys?

APN Partners could build solutions that address these considerations, and customers could take on more work to ensure that their keys are being handled in a secure way. However, this involves a lot of undifferentiated heavy lifting that cross-account IAM roles can handle for both parties.

How cross-account IAM roles work

An IAM role is an AWS identity with an access policy that determines what the role can and can't do in AWS. A role is designed to be assumed by another AWS identity that has already authenticated to AWS. When an identity assumes a role, it receives temporary credentials and the same access policy as the role. You may be familiar with how roles work if you have used EC2 instance profiles or have set up an AWS Lambda function.

A cross-account IAM role is an IAM role that includes a trust policy that allows AWS identities in another AWS account to assume the role.  Put simply, I can create a role in one AWS account that delegates specific permissions to another AWS account. Let’s take a look at the overall process as it applies to APN Partner software that needs to access a customer account:

  1. A customer creates an IAM role in their account with an access policy for accessing the resources that the APN Partner requires. They specify that the role can be assumed by the partner's AWS account by providing the APN Partner's AWS account ID in the trust policy for that role.
  2. The customer gives the Amazon Resource Name (ARN) of the role to the APN Partner. The ARN is the fully qualified name of the role.
  3. When the APN Partner’s software needs to access the customer’s account, the software calls the AssumeRole API in the AWS Security Token Service (STS) with the ARN of the role in the customer’s account. STS returns a temporary AWS credential that allows the software to do its work.

Customers can include conditional checks on the trust policy associated with an IAM role to limit how third parties can assume the role. An example of this would be the external ID check. The external ID is a string defined in the trust policy that the partner must include when assuming a role.  External IDs are a good way to improve the security of cross-account role handling in a SaaS solution, and should be used by APN Partners who are implementing a SaaS product that uses cross-account roles.
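From the APN Partner's side, step 3 looks roughly like the following sketch using the AWS SDK for Python (boto3); the role ARN and external ID are placeholders supplied by the customer:

import boto3

sts = boto3.client("sts")

# The customer provides the role ARN; the external ID is agreed on out of
# band and must also appear in the role's trust policy.
response = sts.assume_role(
    RoleArn="arn:aws:iam::<customer-account-id>:role/PartnerAccessRole",
    RoleSessionName="partner-product-session",
    ExternalId="example-external-id",
)

creds = response["Credentials"]  # temporary credentials issued by STS

# Use the temporary credentials to act in the customer's account,
# for example to read objects from Amazon S3.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)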

How cross-account roles mitigate risks

Using cross-account roles addresses and mitigates a number of risks, so it’s worth taking a closer look at how cross-account roles help address the security questions we listed earlier.

  • Are keys being managed securely? The role allows the partner to get temporary credentials only when they need them. Unlike access and secret keys, these credentials don't need to be stored, so the partner doesn't need to be concerned with managing keys.
  • Are keys being rotated frequently? Credentials generated by STS expire after an hour by default. Many of our software development kits (SDKs) have credential providers that handle this automatically, so neither the APN Partner nor the customer needs to manage credential rotation manually.
  • Can you control who has access to the customer’s AWS account? The role in the customer’s account can be assumed only by an authenticated AWS identity in the partner’s account. The customer knows that only the APN Partner is accessing their resources, and the APN Partner can focus solely on managing and protecting the IAM roles and users in their own account.
  • Is the access policy associated with the key too permissive? A role can’t have root key permissions, and since the cross-account role’s trust policy specifies the partner’s account, it is more likely that the permissions in the role’s access policy will reflect the partner’s requirements. APN Partners can encode cross-account IAM roles in AWS CloudFormation templates to make sure that customers are giving them exactly the permissions they need.

Using AWS CloudFormation to help customers create roles

Providing documentation to customers about how to create roles is straightforward, but if an APN Partner wants to make this simpler for their customers, they can package the role in an AWS CloudFormation template. This approach lets customers deploy a role into their account quickly without having to copy and paste trust and access policy documents.

I've provided an example AWS CloudFormation template that creates a cross-account role with an external ID for accessing Amazon S3. If you're an APN Partner who needs to access customer accounts, you can create templates like this with the access and trust policies you need, and instruct your customers to instantiate the template in their account. This template has parameters for the account number to make testing easy; in practice, partners will want to hard-code their account number in the trust policy in the template. The template outputs the role ARN that the customer can give to you to allow access to their account.

Where to go from here

In this blog post, I’ve explained why using IAM keys to provide AWS account access to third parties is not ideal, and talked about how APN Partners can implement cross-account IAM roles in their products. To learn more, take a look at AWS documentation on IAM roles, cross-account IAM roles, and external IDs. If you’re an APN Partner who wants to discuss this practice in more detail, please feel free to email me: rocamora@amazon.com.

Looking for implementation details? Take a look at our SDK documentation that explains how to use AWS SDKs to build this into your product. Here are links to documentation about assuming roles for the Java, Ruby, Golang, .NET, and Python SDKs. Most partners won’t need this, but if you want to see the low-level details, take a look at the STS API documentation. I also recommend that you audit your own AWS accounts using a credential report to see if you are providing cross-account access with an IAM user.

If you’re a consulting partner or an MSP, you probably find yourself needing to access your customers’ AWS accounts through the AWS Management Console. You can use cross-account IAM roles for this as well by using the switch roles feature of the AWS Management Console. This gives you access to your customers’ accounts without having to manage users, passwords, or keys.

Finally, if you’re an AWS customer and work with an APN Partner who is requiring keys, ask them how you can use cross-account roles with their products, and don’t hesitate to share this post with them.

Terraform: Beyond the Basics with AWS

This is a guest post co-authored by Josh Campbell and Brandon Chavis, two of our Partner Solutions Architect (SA) team members. 

What is Terraform?

Terraform by HashiCorp, an APN Technology Partner and AWS DevOps Competency Partner, is an “infrastructure as code” tool similar to AWS CloudFormation that allows you to create, update, and version your AWS infrastructure. Terraform has a great set of features that make it worth adding to your toolbelt, including:

  • Friendly custom syntax, but also has support for JSON.
  • Visibility into changes before they actually happen.
  • Built-in graphing feature to visualize the infrastructure.
  • Understands resource relationships. For example, failures are isolated to dependent resources, while non-dependent resources are still created, updated, or destroyed.
  • Open source project with a community of hundreds of contributors who add features and updates.
  • The ability to break down the configuration into smaller chunks for better organization, re-use, and maintainability. The last part of this article goes into this feature in detail.

New to Terraform?

This article assumes you have some familiarity with Terraform already.

We recommend that you review the HashiCorp documentation for getting started to understand the basics of Terraform.  Conveniently, their documentation uses AWS as the example cloud infrastructure of choice!

Keeping Secrets

You can provide Terraform with an AWS access key directly through the provider, but we recommend that you use a credential profile already configured by one of the AWS SDKs.  This prevents you from having to maintain secrets in multiple locations or accidentally committing these secrets to version control.  In either scenario, you’ll want to be sure to read our best practices for maintaining good security habits.  Alternatively, you can run Terraform from one or more control servers that use an IAM instance profile.

Each instance profile should include a policy that provides the appropriate level of permissions for each role and use case.  For example, a development group may get a control server with an attached profile that enables them to run Terraform plans to create needed resources like Elastic Load Balancers and Auto Scaling Groups, but not resources outside the group’s scope like Redshift clusters or additional IAM roles.  You’ll need to plan your control instances carefully based on your needs.

To use an instance or credential profile with Terraform, inside your AWS provider block simply remove the access_key and secret_key declarations and any other variables that reference access and secret keys.  Terraform will automatically know to use the instance or credential profile for all actions.

If you plan to share your Terraform files publicly, you’ll want to use a terraform.tfvars file to store sensitive data or other data you don’t want to make public.  Make sure this file is excluded from version control (for example, by using .gitignore).  The file can be in the root directory and might look something like this:

region = "us-west-2"
keypair_name = "your_keypair_name"
corp_ip_range = "192.168.1.0/24"
some_secret = "your_secret"

Building Blocks

An advantage of using an infrastructure as code tool is that your configurations also become your documentation.  Breaking down your infrastructure into components makes it easier to read and update your infrastructure as you grow. This, in turn, helps make knowledge sharing and bringing new team members up to speed easier.

Because Terraform allows you to segment chunks of infrastructure code into multiple files (more on this below), it’s up to you to decide on a logical structure for your plans.  With this in mind, one best practice could be to break up Terraform files by microservice, application, security boundary, or AWS service component. For example, you might have one group of Terraform files that build out an ECS cluster for your inventory API and another group that builds out the Elastic Beanstalk environment for your production front-end web application.

Additionally, Terraform supports powerful constructs called modules that allow you to re-use infrastructure code.  This enables you to provide infrastructure as building blocks that other teams can leverage.  For example, you might create a module for creating EC2 instances that uses only the instance types your company has standardized on.  A service team can then include your module and automatically be in compliance.  This approach creates enablement and promotes self-service.

Organizing Complex Services with Modules

Modules are logical groupings of Terraform configuration files.  Modules are intended to be shared and re-used across projects, but can also be used within a project to help better structure a complex service that includes many infrastructure components.  Unfortunately, Terraform does not recursively load files in subdirectories, so you have to use modules to add structure to your project instead of using one large file or multiple smaller files in the same directory.  You can then execute these modules from a single configuration file (we’ll use main.tf for this example) in the parent directory where your sub-directories (modules) are located.  Let’s examine this concept a bit closer.

Modules run sequentially, so you must understand your order of dependencies.  For example, a module to create a launch configuration must run before a module that creates an Auto Scaling group, if the Auto Scaling group depends on the newly created launch configuration.  Each module should be ordered top (first to run) to bottom (last to run) in your main.tf file (more on this later) appropriately.

Terraform allows you to reference output variables from one module for use in different modules. The benefit is that you can create multiple, smaller Terraform files grouped by function or service as opposed to one large file with potentially hundreds or thousands of lines of code.  To use Terraform modules effectively, it is important to understand the interrelationship between output variables and input variables.   At a high level, these are the steps you would take to make an object in one module available to another module:

  1. Define an output variable inside a resource configuration (module_A). The scope of a resource configuration's details is local to the module until it is declared as an output.
  2. Declare the use of module_A’s output variable in the configuration of another module, module_B. Create a new key name in module_B and set the value equal to the output variable from module_A.
  3. Finally, create a variables.tf file for module_B. In this file, create an input variable with the same name as the key you defined in module_B in step 2. This variable is what allows dynamic configuration of resource(s) in a module. Because this variable is limited to module_B in scope, you need to repeat this process for any other module that needs to reference module_A’s output.

As an example, let’s say we’ve created a module called load_balancers that defines an Elastic Load Balancer. After declaring the resource, we add an output variable for the ELB’s name:

output "elb_name" {
  value = "${aws_elb.elb.name}"
}

You can then reference this ELB name from another module using ${module.load_balancers.elb_name}. Each module (remember that a module is just a set of configuration files in their own directory) that wants to use this variable must have its own variables.tf file with an input variable of elb_name defined.

Sample Directory Layout Using Local Modules for Organization

There are a few things to note about this layout:

  • main.tf is the file that will invoke each module. It provides no configuration itself aside from declaring the AWS provider.
  • Each subdirectory is a module. Each module contains its own variables.tf file.
  • We’re using version control to store our infrastructure configuration. Because this is a public repository, we’ve asked Git to not store our .tfstate files since they contain sensitive information.  In production, you’ll want to store these files in private version control, such as AWS CodeCommit or Amazon S3, where you can control access.
  • The site module contains security groups and a VPC. These are resources that have no other dependencies, and most other resources depend on these.  You could create separate security_groups and vpcs modules as well. This module is the first to be called in main.tf.

Following Along with an Example

In this section, we’ll walk you through an example project that creates an infrastructure with several components, including an Elastic Load Balancer and an Auto Scaling group, which will be our focus.

Main.tf

Looking at main.tf you will see that there are several modules defined. Let’s focus on the autoscaling_groups module first:

module "autoscaling_groups" {
  source           = "./autoscaling_groups"
  public_subnet_id = "${module.site.public_subnet_id}"
  webapp_lc_id     = "${module.launch_configurations.webapp_lc_id}"
  webapp_lc_name   = "${module.launch_configurations.webapp_lc_name}"
  webapp_elb_name  = "${module.load_balancers.webapp_elb_name}"
}

The first thing to notice is the line source = "./autoscaling_groups".  This simply tells Terraform that the source files for this module are in the autoscaling_groups subdirectory.  Modules can be local folders as they are above, or they can come from other sources like an S3 bucket or different Git repository. This example assumes you will run all Terraform commands from the parent directory where main.tf exists.

The autoscaling_groups Module

If you then examine the autoscaling_groups directory, you'll notice that it includes two files: variables.tf and webapp-asg.tf.  Terraform will run any .tf files it finds in the module directory, so you can name these files whatever you want.  Now look at line 20 of autoscaling_groups/webapp-asg.tf:

load_balancers = ["${var.webapp_elb_name}"]

Here we’re setting the load_balancers parameter to an array that contains a reference to the variable webapp_elb_name.  If you look back at main.tf, you’ll notice that this name is also part of the configuration of the autoscaling_groups module.  Looking in autoscaling_groups/variables.tf, you’ll see this variable declared with empty curly braces ({}).  This is the magic behind using outputs from other modules as input variables.

The load_balancers Module

To bring it together, examine load_balancers/webapp-elb.tf and find this section:

output "webapp_elb_name" {
  value = "${aws_elb.webapp_elb.name}"
}

Here we’re telling Terraform to output a variable named webapp_elb_name, whose value is equal to our ELB name as determined by Terraform after the ELB is created for us.

Summary

In this example:

  1. We created an output variable for the load_balancers module named webapp_elb_name in load_balancers/webapp-elb.tf.
  2. In main.tf under the autoscaling_groups module configuration, we set the webapp_elb_name key to the output variable of the same name from the load_balancers module as described above. This is how we reference output variables between modules with Terraform.
  3. Next, we defined this input variable in autoscaling_groups/variables.tf by simply declaring variable webapp_elb_name {}. Terraform automatically knows to set the value of webapp_elb_name to the output variable from the load_balancers module, because we declared it in the configuration of our autoscaling_groups module in step 2.
  4. Finally, we’re able to use the webapp_elb_name variable within autoscaling_groups/webapp-asg.tf.

Collaborating with Teams

Chances are, if you’re using Terraform to build production infrastructure, you’re not working alone. If you need to collaborate on your Terraform templates, the best way to sync is by using Atlas by HashiCorp. Atlas allows your infrastructure templates to be version controlled, audited, and automatically deployed based on workflows you configure. There’s a lot to talk about when it comes to Atlas, so we’ll save the deep dive for our next blog post.

Wrapping up

We hope we’ve given you a good idea of how you can leverage the flexibility of Terraform to make managing your infrastructure less difficult. By using modules that logically correlate to your actual application or infrastructure configuration, you can improve agility and increase confidence in making changes to your infrastructure. Take a look at Terraform by HashiCorp today: https://www.terraform.io/.

Multi-Tenant Storage with Amazon DynamoDB

Tod Golding is an AWS Partner Solutions Architect (SA). He works closely with our SaaS Partner ecosystem. 

If you’re designing a true multi-tenant software as a service (SaaS) solution, you’re likely to devote a significant amount of time to selecting a strategy for effectively partitioning your system’s tenant data. On Amazon Web Services (AWS), your partitioning options mirror much of what you see in the wild. However, if you’re looking at using Amazon DynamoDB, you’ll find that the global, managed nature of this NoSQL database presents you with some new twists that will likely influence your approach.

Before we dig into the specifics of the DynamoDB options, let’s look at the traditional models that are generally applied to achieve tenant data partitioning. The list of partitioning solutions typically includes the following variations:

  • Separate database – each tenant has a fully isolated database with its own representation of the data
  • Shared database, separate schema – tenants all reside in the same database, but each tenant can have its own representation of the data
  • Shared everything – tenants all reside in the same database and all leverage a universal representation of the data

These options all have their strengths and weaknesses. If, for example, you’d like to support the ability for tenants to have their own data customizations, you might want to lean toward a model that supports separate schemas. If that’s not the case, you’ll likely prefer a more unified schema. Security and isolation requirements are also key factors that could shape your strategy. Ultimately, the specific needs of your solutions will steer you toward one or more of these approaches. In some cases, where a system is decomposed into more granular services, you may see situations where multiple strategies are applied. The requirements of each service may dictate which flavor of partitioning best suits that service.

With this as a backdrop, let’s look at how these partitioning models map to the different partitioning approaches that are available with DynamoDB.

Linked Account Partitioning (Separate Database)

This model is by far the most extreme of the available options. Its focus is on providing each tenant with its own table namespace and footprint within DynamoDB. While this seems like a fairly basic goal, it is not easily achieved. DynamoDB does not have the notion of an instance or some distinct, named construct that can be used to partition a collection of tables. In fact, all the tables that are created by DynamoDB are global to a given region.

Given these scoping characteristics, the best option for achieving this level of isolation is to introduce separate linked AWS accounts for each tenant. To leverage this approach, you need to start by enabling the AWS Consolidated Billing feature. This option allows you to have a parent payer account that is then linked to any number of child accounts.

Once the linked account mechanism is established, you can then provision a separate linked account for each new tenant (shown in the following diagram). These tenants would then have distinct AWS account IDs and, in turn, have a scoped view of DynamoDB tables that are owned by that account.

While this model has its advantages, it is often cumbersome to manage. It introduces a layer of complexity and automation to the tenant provisioning lifecycle. It also seems impractical and unwieldy for environments where there might be a large collection of tenants. Caveats aside, there are some nice benefits that are natural byproducts of this model. Having this hard line between accounts makes it a bit simpler to manage the scope and schema of each tenant’s data. It also provides a rather natural model for evaluating and metering a tenant’s usage of AWS resources.

Tenant Table Name Partitioning (Shared Database, Separate Schema)

The linked account model represents a more concrete separation of tenant data. A less invasive approach is to introduce a table naming scheme that adds a unique tenant context to each DynamoDB table. The following diagram represents a simplified version of this approach, prepending a tenant ID (T1, T2, and T3) to each table name to identify the tenant's ownership of the table.
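A hedged sketch of provisioning a tenant-scoped table under this naming scheme, using the AWS SDK for Python (boto3), might look like this; the key schema and throughput values are illustrative:

import boto3

dynamodb = boto3.client("dynamodb")

def create_tenant_table(tenant_id, base_name="Customers"):
    """Create a table whose name carries the owning tenant's ID."""
    table_name = "{}_{}".format(tenant_id, base_name)  # e.g., T1_Customers
    dynamodb.create_table(
        TableName=table_name,
        AttributeDefinitions=[
            {"AttributeName": "customer_id", "AttributeType": "S"},
        ],
        KeySchema=[
            {"AttributeName": "customer_id", "KeyType": "HASH"},
        ],
        ProvisionedThroughput={
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5,
        },
    )
    return table_name

create_tenant_table("T1")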

This model embraces all the freedoms that come with an isolated tenant scheme, allowing each tenant to have its own unique data representation. With this level of granularity, you’ll also find that this aligns your tenants with other AWS constructs. These include:

  • The ability to apply AWS Identity and Access Management (IAM) roles at the table level allows you to constrain table access to a given tenant role.
  • Amazon CloudWatch metrics can be captured at the table level, simplifying the aggregation of tenant metrics for storage activity.
  • Provisioned throughput (IOPS) is applied at the table level, allowing you to create distinct scaling policies for each tenant.

Provisioning also can be somewhat simpler under this model since each tenant’s tables can be created and managed independently.

The downside of this model tends to be more on the operational and management side. Clearly, with this approach, your operational views of a tenant will require some awareness of the tenant table naming scheme in order to filter and present information in a tenant-centric context. The approach also adds a layer of indirection to any code you might have that is metering tenant consumption of DynamoDB resources.

Tenant Index Partitioning (Shared Everything)

Index-based partitioning is perhaps the most agile and common technique that is applied by SaaS developers. This approach places all the tenant data in the same table(s) and partitions it with a DynamoDB index. This is achieved by populating the hash key of an index with a tenant’s unique ID. This essentially means that the keys that would typically be your hash key (Customer ID, Account ID, etc.) are now represented as range keys.  The following example provides a simplified view of an index that introduces a tenant ID as a hash key. Here, the customer ID is now represented as a range key.
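Expressed in code, that key scheme might look like the following boto3 sketch. The table name, attribute names, and throughput values are illustrative assumptions rather than prescriptions from this post.

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")

# A shared table where TenantId is the hash (partition) key and
# CustomerId is the range (sort) key.
table = dynamodb.create_table(
    TableName="Customer",
    KeySchema=[
        {"AttributeName": "TenantId", "KeyType": "HASH"},
        {"AttributeName": "CustomerId", "KeyType": "RANGE"},
    ],
    AttributeDefinitions=[
        {"AttributeName": "TenantId", "AttributeType": "S"},
        {"AttributeName": "CustomerId", "AttributeType": "S"},
    ],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)
table.wait_until_exists()  # table creation is asynchronous

# Every read is scoped by the tenant's hash key, so one tenant's
# query never returns another tenant's items.
customers = table.query(
    KeyConditionExpression=Key("TenantId").eq("T1")
)["Items"]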

This model, where the data for every tenant resides in a shared representation, simplifies many aspects of the multi-tenant model. It promotes a unified approach to managing and migrating the data for all tenants without requiring a table-by-table processing of the information. It also enables a simpler model for performing tenant-wide analytics of the data. This can be extremely helpful in assessing and profiling trends in the data.

Of course, there are also limitations with this model. Chief among these is the inability to have more granular, tenant-centric control over access, performance, and scaling. However, some may view this as an advantage since it allows you to have a more global set of policies that respond to the load of all tenants instead of absorbing the load of maintaining policies on a tenant-by-tenant basis. When you choose your partitioning approach, you’ll likely strike a balance between these tradeoffs.

Another consideration here is that this approach could be viewed as creating a single point of failure. Any problem with the shared table could affect the entire population of tenants.

Abstracting Client Access

Each technique outlined in this blog post requires some awareness of tenant context. Every attempt to access data for a tenant requires acquiring a unique tenant identifier and injecting that identifier into any requests to manage data in DynamoDB.

Of course, in most cases, end-users of the data should have no direct knowledge that their provider is a tenant of your service. Instead, the solution you build should introduce an abstraction layer that acquires and applies the tenant context to any DynamoDB interactions.

This data access layer will also enhance your ability to add security checks and business logic outside of your partitioning strategies, with minimal impact to end-users.
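A minimal sketch of such a data access layer, assuming the shared-table (index partitioning) model and illustrative class, table, and attribute names, might look like this:

import boto3
from boto3.dynamodb.conditions import Key

class TenantDataAccess:
    """Illustrative data access layer that injects the tenant context
    into every DynamoDB call so callers never handle tenant IDs."""

    def __init__(self, tenant_id, table_name="Customer"):
        self._tenant_id = tenant_id
        self._table = boto3.resource("dynamodb").Table(table_name)

    def get_customer(self, customer_id):
        # Security checks and business logic can be layered here
        # without exposing the partitioning strategy to callers.
        return self._table.get_item(
            Key={"TenantId": self._tenant_id, "CustomerId": customer_id}
        ).get("Item")

    def list_customers(self):
        return self._table.query(
            KeyConditionExpression=Key("TenantId").eq(self._tenant_id)
        )["Items"]

In practice, the tenant ID would typically be resolved from the caller’s identity (for example, a claim in an authentication token) rather than supplied directly by end-users.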

Supporting Multiple Environments

As you think about partitioning, you may also need to consider how the presence of multiple environments (development, QA, production, etc.) might influence your approach. Each partitioning model we’ve discussed here would require an additional mechanism to associate tables with a given environment.

The strategy for addressing this problem varies based on the partitioning scheme you’ve adopted. The linked account model is the least affected, since the provisioning process will likely just create separate accounts for each environment. However, with table name and index-based partitioning, you’ll need to introduce an additional qualifier to your naming scheme that will identify the environment associated with each table.

The key takeaway is that you need to think about whether and how environments might also influence your entire build and deployment lifecycle. If you’re building for multiple environments, the context of those environments likely needs to be factored into your overall provisioning and naming scheme.

Microservice Considerations

With the shift toward microservice architectures, teams are decomposing their SaaS solutions into small, autonomous services. A key tenet of this architectural model is that each service must encapsulate, manage, and own its representation of data. This means that each service can leverage whichever partitioning approach best aligns with the requirements and performance characteristics of that service.

The other factor to consider is how microservices might influence the identity of your DynamoDB tables. With each service owning its own storage, the provisioning process needs assurance that the tables it’s creating for a given service are guaranteed to be unique. This typically translates into adding some notion of the service’s identity into the actual name of the table. A catalog manager service, for example, might have a table that is an amalgam of the tenant ID, the service name, and the logical table name. This may or may not be necessary, but it’s certainly another factor you’ll want to keep in mind as you think about the naming model you’ll use when tables are being provisioned.
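For illustration only, a small helper along these lines could compose a unique physical table name from the environment qualifier discussed earlier, the tenant ID, the owning service, and the logical table name. The ordering and the delimiter are arbitrary choices, not a prescribed convention.

def physical_table_name(environment, tenant_id, service, logical_name):
    """Compose a unique physical table name from its qualifying parts."""
    return "_".join([environment, tenant_id, service, logical_name])

# e.g., 'prod_T1_catalog-manager_Product'
print(physical_table_name("prod", "T1", "catalog-manager", "Product"))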

Agility vs. Isolation

It’s important to note that there is no single preferred model among the solutions outlined in this blog post. Each model has its merits and applicability to different problem domains. That being said, it’s also important to consider agility when you’re building SaaS solutions. Agility is fundamental to the success of many SaaS organizations, and it’s essential that teams consider how each partitioning model might influence their ability to continually deploy and evolve both their applications and their business.

Each variation outlined here highlights some of the natural tension that exists in SaaS design. In picking a partitioning strategy, you must balance the simplicity and agility of a fully shared model with the security and variability offered by more isolated models.

The good news is that DynamoDB supports all the mechanisms you’ll need to implement each of the common partitioning models. As you dig deeper into DynamoDB, you’ll find that it actually aligns nicely with many of the core SaaS values. As a managed service, DynamoDB allows you to shift the burden of management, scale, and availability directly to AWS. The schemaless nature of DynamoDB also enables a level of flexibility and agility that is crucial to many SaaS organizations.

Kicking the Tires

The best way to really understand the merits of each of these partitioning models is to simply dig in and get your hands dirty. It’s important to examine the overall provisioning lifecycle of each partitioning approach and determine how and where it would fit into a broader build and deployment lifecycle. You’ll also want to look more carefully at how these partitioning models interact with AWS constructs. Each approach has nuances that can influence the experience you’ll get with the console, IAM roles, CloudWatch metrics, billing, and so on. Naturally, the fundamentals of how you’re isolating tenants and the requirements of your domain are also going to have a significant impact on the approach you choose.

Are you building SaaS on AWS? Check out the AWS SaaS Partner Program, an APN Program providing Technology Partners with support to build, launch, and grow SaaS solutions on AWS.

Data Democratization with APN Technology Partner Calgary Scientific

Christopher Crosbie MPH, MS and AWS Partner Network (APN) Solutions Architect

As a Healthcare and Life Science-focused Partner Solutions Architect, I have an opportunity to meet with a variety of APN partners who are using cloud computing to enhance the healthcare and life science industries. I also gain insight into the way our APN partners and customers make use of cloud technologies in the health tech field by attending key industry events.

For more than 100 years, Radiological Society of North America (RSNA) has hosted an annual meeting to bring together an international community of radiologists, medical physicists, and other medical professionals. This conference presents the latest developments and upcoming innovations in the technically advanced field of radiology imaging. It’s a great place to first learn about upcoming enhancements to the healthcare industry.

Massive file sizes and long retention requirements have made the cloud an attractive storage option for radiology IT. The shift away from traditional storage systems has been subtle in the industry thus far, but it’s certainly not new. Radiology vendors have seen the benefits experienced by industry leaders such as Philips Healthcare, who have already embraced and succeeded in the cloud.

Nevertheless, this year’s RSNA conference seemed to be a turning point for the cloud storage option in radiology equipment – for the first time, the cloud seemed to be a ubiquitous storage solution option throughout the innovations and developments presented at the conference sessions. Now that “cloud-ready” storage equipment is reaching clinical users, the industry is beginning to see cloud-native applications built on top of imaging data that unlock information in ways not previously anticipated or even thought possible.

One of the leaders and most innovative companies unlocking the potential of cloud-stored imaging data is AWS Partner Network (APN) Advanced Technology Partner and AWS Healthcare Competency Partner Calgary Scientific with their ResolutionMD application. This technology is built on top of the PureWeb software platform that provides advanced web, mobility, and cloud enablement solutions for industries looking for secure access to their data or graphics-intensive applications while continuing to use their existing systems.

When Calgary Scientific demonstrates its products at conferences like RSNA, the presentation starts with a very high-powered server directly connected to the conference environment. The presenter then switches from the local server, runs the same demo using AWS, and asks the audience, “Did you notice a difference?” The answer is always a resounding “No.” With no degradation in user experience, the discussion turns to the benefits that the cloud and AWS offer clinical users.

I wanted to learn more about ResolutionMD and how Calgary Scientific was able to migrate this highly compute-intensive, complex, and yet responsive application over to AWS, so I sat down with Dan Pigat, VP Products – Cloud & Collaboration, Calgary Scientific. However, my line of questioning on complex system migration was quickly cut short when Dan chuckled at my premise and informed me that their migration to the cloud was simple: “It just worked.” So Dan and I began to discuss the impact of cloud capabilities on end users in the many industries Calgary Scientific serves.

About the PureWeb software platform

To understand the success of Calgary Scientific’s ResolutionMD, it’s important to know the history of PureWeb, the software underlying the ResolutionMD technology. Dan took me through a quick crash course.

The PureWeb SDK came into existence soon after the launches of AWS (March 2006) and the Apple iPhone (June 2007). “In a way, these three technologies have grown up together,” Dan tells me. “Our focus was on adapting existing 2D and 3D applications to run on the cloud and then access them on any web browser or mobile device, connecting these two worlds.”

The initial target for this software was the medical industry – a sector with the most rigorous demands for access, visually rich data, privacy, security, and scalability. The ResolutionMD product was developed on top of PureWeb to provide clinical grade medical imaging to the mobile world. Clinicians can use it on web and mobile devices to access the same quality medical imaging that historically was only available on expensive equipment in hospital and laboratory settings. ResolutionMD is FDA-cleared and gaining fast acceptance in the industry, with Calgary Scientific having established more than 50 partnerships with companies like Siemens, Fuji and McKesson due to ResolutionMD’s seamless connectivity into existing Picture Archiving and Communication Systems (PACS).

The PureWeb software platform has recently been expanding into new industries such as design, manufacturing and energy. Despite the vast differences between these industries and their underlying technological needs, “their trajectory with cloud adoption is surprisingly similar,” Dan tells me. He sees a lot of parallels in their use of the cloud’s capabilities.

Moving from expert opinion to interactions 

The role of a radiologist is often stereotyped as a solitary one – an expert sitting in a dark hospital basement reviewing images on specialized monitors who then creates reports for physicians. While this representation may be an exaggeration, radiology does diverge from other medical specialties in that it depends entirely on visual perception. Communication of the diagnostic interpretation is, therefore, a critical component of the radiologist’s expertise. Failure to communicate their findings accurately is the fourth most frequent allegation against radiologists in medical malpractice claims.

“With today’s technology, we can connect everyone to the same tools and data plus provide better ways to communicate,” Dan explains. Radiology does not need to be a secluded art. By leveraging cloud technology via applications like ResolutionMD, physicians can use their iPads, Android phones or any web browser from wherever they happen to be to review images alongside radiologists in real time. “This collaboration generates better interpretations and diagnosis,” says Dan. The same images can even be shared with patients on their own devices. This access, in conjunction with an expert clinical team, enables patients to be direct participants in the management of their own care.

This democratization of the data is not limited to the field of radiology. Dan goes on to explain that he is seeing similar excitement and potential from the manufacturing and construction industries. Colleagues on distant ends of a project can use the cloud and PureWeb to access Computer Aided Design (CAD) files in the same way images are now available in healthcare. CAD drawings have evolved from being one expert’s view on a project into a conversation between designers and those on the manufacturing floor or job site about how to best adapt designs into the real world.

Dan is reminded of a conversation he once had with a construction customer who was frustrated with the number of times the acronym, V.I.F. (a shorthand for Verify In Field), showed up on CAD drawings. “He was excited that cloud technologies morphed those V.I.F.s into a collaborative discussion in which all the stakeholders could contribute to finding the best possible project design,” he recalled.

Unlocking the data

The story of cloud and imaging may have started with a need to simply “put the data away.” With more capable CT and MRI scanners producing ever-larger files, data storage can be a daunting challenge. Many hospitals reach a saturation point where they need to move data to the cloud simply to meet requirements, such as retaining Medicare managed care program provider records for 10 years.

This essential requirement of storing the data cheaply and securely is being met with cloud services such as Amazon Simple Storage Service (Amazon S3) and Amazon Glacier. Not only are these more cost-effective options, but applications like ResolutionMD can open and enable full diagnostic reads of these studies directly from the cloud. The data is no longer locked away in offsite locations or on backup tape; cloud storage keeps this data accessible and available.

Dan’s excitement over what PureWeb customers are doing with this newfound ability is apparent as he lists applications that range from monitoring cancer tumor growth to tracking changes in oil and gas reservoirs. “We’ve only just started to uncover the myriad of opportunities available with these technologies,” he speculates.

Focus on the humans

As I asked Dan for some closing thoughts, he reminded me to focus on the end user and to try not to solely get wrapped up in the cool details of the technology itself. This resonated with me, as I’m admittedly a technologist who finds himself quickly enthralled by the underlying workings of the technology innovations that AWS drives.

However, Dan also had very good reason to present this reminder. He took me back to the early days of the cloud and PureWeb, when access over a network was slower than their dedicated systems. Dan says that many of the technologists at the time were disappointed by this and somewhat dismissed the cloud as an alternative. The clinical users, on the other hand, absolutely loved the technology. Even though interaction was slower on a mobile device, that time was nominal compared to alternatives such as driving thirty minutes to the hospital to respond to a page, which can cause a critical difference in care.

As the technologies have matured, Dan believes we’ve actually crossed a tipping point where large files rendered by powerful remote GPUs can outperform local rendering on lower-spec PCs. But it’s still about people and giving them better access to the tools they need to do their jobs.

The result of this progress can be measured with examples like ResolutionMD, which provides significantly faster image access compared to standard of care image viewers. While this is possible due to advances in Amazon EC2 GPU instance types and web standards, we need to look at the actual impact on patients. Dan references a study done by a major healthcare institution which verified that the use of ResolutionMD on mobile devices resulted in an average 11-minute reduction in time to diagnosis for critical groups of patients (such as stroke victims where time equates to brain loss). With results like that, the impact of cloud and mobile technologies in healthcare becomes very apparent.

Speaking with innovative APN partners like Dan of Calgary Scientific makes me excited about the shift that is taking place with AWS and our APN partners across all industries. If recent innovations in radiology imaging technology are any indication, the future for both imaging and healthcare technology is extremely bright.

Take a look at the following demonstrative AWS reference diagram for a cloud-based ResolutionMD deployment:

 

To learn more about Calgary Scientific, visit the company’s AWS Partner Directory page.

How to Prepare Your Business and Technical Teams on AWS

Erik Farr is a Partner Solutions Architect (SA) with the AWS Partner Network (APN). 

My Amazon Web Services (AWS) journey started in late 2012, when I opened a cloud consulting practice at a firm that was an AWS Partner Network (APN) Premier Partner. As an APN Partner, I found AWS training courses and certifications invaluable, augmenting them with additional AWS resources and tools to ensure that my team was properly prepared.

Now that I work at AWS as a Solutions Architect, APN Partners often ask me how—and when—to start training their staff. AWS has a wealth of great resources. Business and Technical Partner Learning Plans, for example, are designed to help you and your team ramp up on the AWS platform, deepen your knowledge and skills, and better serve your customers in your first weeks and months.

What follows is my own unique learning plan, designed to cater to the full lifecycle of AWS enablement from entry level to global expert. Remember, this is just one way to grow. Think of this plan as a template to augment your own personalized approach. While I focus mostly on AWS skills, it’s important to supplement these with traditional IT skills like Linux/Windows OS management, networking, scripting, and development. This plan also focuses on stages, not titles (although you could map it to roles), and doesn’t include the familiar “time in stage” guidance, because everyone starts at a different place and grows at a different pace.

The early stage is meant to provide a baseline level of knowledge of AWS; all people on your team should have this foundation to ensure future success. The middle stage is relevant for the bulk of your workforce. It focuses on gaining certifications and real-world experience, expanding knowledge of AWS into edge cases, and beginning a professional relationship with the AWS Solutions Architects, Partner Development Managers, and/or Account Managers your company is already working with. The advanced stage is the peak tier and where your top employees should strive to be. Completing the activities outlined in the advanced stage allows real differentiation for both your company and the individual.

Early Stage

This stage of the journey is for people with little or no exposure to the AWS cloud. Typically these are people who are new to your organization or team (e.g., fresh from school or transfers from a non-cloud team) and are ready to begin the AWS enablement process. Having your entire staff complete this stage ensures teams will have a solid understanding of the AWS cloud and its value proposition.

Recommended Activities to Complete

In the list below, I’ll link to a number of webpages to direct you to introductory information on training courses and resources. Be sure to register for all training activities through the APN Portal training tab. APN Partners can take online accreditation courses at no cost and receive a 20% discount on AWS-delivered public classes registered through the APN Portal.

Sign up for the APN Portal, an exclusive online resource for APN Partners.
  • Go to the APN Portal.
  • Follow the link to create a new account, and be sure to sign up with your company email (e.g., name@company.com).

Complete the Business or Technical Partner Accreditation based on your role, followed by the TCO Accreditation course and other online courses.

Create a personal AWS account.
  • Sign up using the AWS Free Tier.
  • Learn the layout and functionality of the console.

Enroll in One Day Essentials and similar courses as they pertain to your role. This instructor-led introduction to AWS products, services, and common solutions provides the basic fundamentals you need to become more proficient in identifying AWS services.

Identify AWS official blogs, presentations, and videos specific to your role or interests to gain insights into new services, architectures, and whitepapers.

Gain foundational knowledge about key AWS services with the Introduction to AWS series, either during your normal daily role or with self-study using online labs. These free online videos and self-paced labs help you get started with core AWS services, terminology, and key concepts such as Amazon EC2, Amazon S3, and Elastic Load Balancing. More than 75 lab topics are also available to help you get hands-on practice working with AWS services and use cases.

Learn how to set up a new VPC: create private and public subnets (defining their CIDR blocks), modify routing tables, and add routes to and from the Internet through a NAT server.
Learn how to create Amazon Elastic Compute Cloud (Amazon EC2) instances (with multiple operating systems) with Amazon Elastic Block Store (Amazon EBS) volumes attached, how to place them in private and public subnets, and how to log in to the various operating systems (Linux and Windows) with keys and passwords. Assign Elastic IP addresses to instances and troubleshoot when and why they can be accessed externally.
Understand how to create, modify, and deploy Amazon Machine Images (AMIs) of existing instances.
Set up Amazon Simple Storage Service (Amazon S3) buckets and put/get objects from Amazon EC2 instances and local PCs (see the sketch after this list).
Set up Elastic Load Balancing (ELB) load balancers and distribute traffic between Amazon EC2 instances; make sure security groups are understood and configured using good security practices.
Understand the basics of AWS CloudFormation, and how to create (in JSON), launch, and delete CloudFormation stacks.
Install, configure, and run the AWS Command Line Interface (AWS CLI) on both Windows and Linux Amazon EC2 instances.
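As a concrete starting point for the Amazon S3 item above, here is a minimal boto3 sketch that creates a bucket, uploads an object, and reads it back. The bucket name is a placeholder; bucket names are globally unique, and outside us-east-1 you also need to pass a CreateBucketConfiguration with your Region.

import boto3

s3 = boto3.client("s3")
bucket = "my-practice-bucket-1234"  # placeholder; choose your own unique name

# Create a bucket, upload an object, then download it again.
s3.create_bucket(Bucket=bucket)
s3.put_object(Bucket=bucket, Key="hello.txt", Body=b"Hello from the AWS Free Tier!")

response = s3.get_object(Bucket=bucket, Key="hello.txt")
print(response["Body"].read().decode("utf-8"))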

Middle Stage

The middle stage of the journey focuses on certifications and specialization.  The people in this stage aren’t new to cloud, and typically already have a strong understanding of AWS, either because they completed the Early Stage training or have previous cloud experience.  After an individual has completed this stage, they should feel highly capable of using AWS services and be an anchor resource for a new or complex cloud project.

Recommended Activities to Complete

In the list below, I’ll link to a number of webpages to direct you to introductory information on training courses and resources. Be sure to register for all training activities through the APN Portal training tab. APN Partners can take online accreditation courses at no cost and receive a 20% discount on AWS-delivered public classes registered through the APN Portal.

Act as a mentor for early stage employees. Help them with AWS use cases and best practices.

For Solutions Architects: Take the Architecting on AWS course and prepare for the AWS Certified Solutions Architect – Associate exam. This training is designed to teach Solutions Architects how to optimize their use of the AWS Cloud by understanding AWS services and how these services fit into a cloud-based solution. Then study for the “AWS Certified Solutions Architect – Associate” certification and take and pass the exam.

For Operations: Enroll in Systems Operations on AWS and prepare for the AWS Certified SysOps Administrator – Associate exam. This course is designed to teach those in a Systems Administrator or Developer Operations (DevOps) role how to create automatable and repeatable deployments of networks and systems on the AWS platform. Then study for the “AWS Certified SysOps Administrator – Associate” certification and take and pass the exam.

For Developers: Enroll in Developing on AWS and prepare for the AWS Certified Developer – Associate exam. The Developing on AWS course is designed to help individuals design and build secure, reliable, and scalable AWS-based applications. Then study for the “AWS Certified Developer – Associate” certification and take and pass the exam.

Become proficient in the sales cycle for an AWS cloud engagement.
  • Assist with the development of proposals and the capture of reusable content for AWS bids.
  • Conduct AWS discussions/workshops with current or prospective technical customers, develop and own proposal content, and identify lessons learned from previous proposals or engagements.

Develop skills relevant to an area of specialty within AWS.
  • Identify your passion for an area of specialty, where applicable, on AWS. For example: Big Data, IoT, Mobility, Managed Services, DevOps, Security, etc.
  • Establish a core competency in an area of specialty.
    • Attend internal and external community events (e.g., AWS Meet-ups).
    • Publish blog content and comment on online communities related to the chosen competency.
    • Work on projects and proposals associated with the chosen competency.

Begin to individually build relationships with AWS professionals and like-minded individuals.

Begin to identify ISV solutions that integrate with, or are built on, the AWS platform and that address specific customer use cases.
  • We recommend that you take a look at our AWS Competency Partners. APN Partners who’ve attained an AWS Competency have demonstrated technical proficiency and proven customer success in specialized solution areas.

Advanced Stage

The advanced stage of the journey moves past associate certifications and focuses on thought leadership and specialization at local, regional, and global levels. These people are already considered subject matter experts on AWS and typically have vast amounts of real world AWS experience over many years. The people in this stage are very technically proficient on the AWS platform and have effectively devoted their career to working with the AWS cloud. They will continually learn new services as they come out, and develop solutions using AWS cloud native architectures.

Recommended Activities to Complete

In the list below, I’ll link to a number of webpages to direct you to introductory information on training courses and resources. Be sure to register for all training activities through the APN Portal training tab. APN Partners can take online accreditation courses at no cost and receive a 20% discount on AWS-delivered public classes registered through the APN Portal.

Act as a role model for all employees and, in some cases, external people. Provide mentorship on AWS use cases, best practices, and areas of specialty for all levels of employees.

Enroll in advanced training courses and study for Professional-level certification exams as applicable to your role.

Be the go-to person for the sales and/or delivery of a cloud engagement.
  • Lead AWS discussions with current or prospective technical or business customers, drive AWS sales, and lead proposal processes locally and regionally.
  • Act as the lead cloud architect on large and complex engagements in your locale or region.
  • Be a pre-sales and delivery expert with AWS.
    • Be able to drive sales conversations from inception to SOW.
    • Serve as technical lead on multi-year global datacenter transformation projects.

Create thought leadership within your area of specialty with AWS.
  • Solidify your reputation as an expert with AWS and within the chosen area of specialty.
    • Run a blog on the specialty; post often, and comment on other blogs to establish thought leadership.
    • Publish whitepapers on solutions or concepts related to the specialty.
    • Participate in and host internal speaking engagements to drive the agenda and help win additional work.
    • Lead or be heavily involved in local AWS communities and meet-ups.
  • Stretch into additional areas of specialty and gain credibility by showing thought leadership.

Collaborate with AWS professionals and like-minded individuals.
  • Identify topics for joint blog/whitepaper writing with AWS professionals.
  • Develop and deliver content for AWS local/regional events (such as regional AWS Summits).
  • Develop and deliver content for AWS global events (such as AWS re:Invent).

In summary, this guide is meant to assist you in developing a training and growth path for your AWS-focused employees. By following these recommendations, I’m confident that you can build a solid AWS foundation for your team, whether your company has tens or thousands of people delivering AWS projects.

Best of luck, and feel free to contact me for additional information or clarifications: erikfarr@amazon.com

 

The 10 Most Popular APN Blog Posts of 2015

We launched the APN Blog in late November 2014, and I’ve loved developing and growing this blog as an outlet of news and information about the APN. It’s also been quite a treat to get to work with teams across AWS and with a number of APN Partners around the globe to highlight great APN Partner stories and solutions on AWS. We’ve only just begun, and I can’t wait to share more with you throughout 2016.

As we wrap up the year, I want to share with you some of our most popular posts from 2015. In case you missed ’em, here’s a recap of 10 of our most popular posts. Happy reading! See you in 2016.

Active Directory Single Sign-On (SSO) on AWS with Bitium

Bitium, an APN Technology Partner, offers an enterprise-grade solution for single sign-on, application and user management, password management, directory integration, and security and compliance.  By utilizing the Bitium software as a service (SaaS) solution, we are going to demonstrate the integration of Microsoft Active Directory (AD) logins with the Amazon Web Services (AWS) Management Console via the Bitium SAML implementation.

A popular request when implementing a new system is, “can we use our existing directory for authentication?” Running independent user lists can become quite a hassle.  For example, when an employee switches departments, when you add a new staff member, or when a staff member takes on additional responsibilities, you can either update the information in your single central source for AAA (authentication, authorization, accounting), or you can maintain independent lists of users with varying settings for items such as password requirements, password aging, and when to audit users.

Wouldn’t it be easier if everything just worked together?

In this post, we are going to run through the process of deploying the Bitium SaaS offering as an authentication solution. Using this application, you’ll be able to administer AWS Management Console access directly from your Active Directory administration console.  Many organizations currently rely upon Active Directory for their corporate directory solution, and while this post focuses on that one form of directory integration, Bitium provides solutions from LDAP integrations to third party SAML integrations. This expands the capabilities to define authentication sources as any IdP (identity provider) that uses standardized SAML, LDAP, Active Directory, or Google Apps methods of authentication.  For example, the Bitium website explains the Bitium features that enable the integration of Google Apps with AWS.


Announcing Atlassian Bitbucket Support for AWS CodeDeploy

Shortly after making Atlassian Bamboo and AWS CodeDeploy integration available to support Continuous Integration (CI) and Continuous Delivery (CD) workflows on AWS, AWS Partner Network (APN) member Atlassian has added another key integration with AWS Code services. The integration of Atlassian products and AWS Code services is a compelling story for companies that deploy software on AWS at every stage of the build, test, and deployment lifecycle.

We’re happy to announce Atlassian Bitbucket support for AWS CodeDeploy, so you can now push code to Amazon EC2 instances directly from the Bitbucket UI. This is a great example of simplifying deployments, especially if you prefer “a-human-presses-a-button” control over your deployments.

As an example, I’m a developer and I want to deploy a change to my PHP website that runs on a cluster of Amazon EC2 web servers. First, I update my code in my Bitbucket repository. Then, to minimize the context switching that would come from logging into my CI platform, or logging directly into my EC2 hosts to run a manual deployment process, I can take advantage of CodeDeploy’s flexibility by deploying my code to my EC2 instances directly from the Bitbucket UI.

Let’s take a look at how this works!

First, we’ll need a sample application in Bitbucket. Grab our sample application and push it to Bitbucket: https://s3.amazonaws.com/aws-codedeploy-us-east-1/samples/latest/SampleApp_Linux.zip

Next, install the CodeDeploy add-on through the Settings menu in Bitbucket. Then, under my hello-world app’s repository, I can choose CodeDeploy Settings to configure CodeDeploy:

Bitbucket needs the ability to stage your code artifacts in an Amazon S3 bucket for CodeDeploy to pull, so step one of this setup process is to create an AWS Identity and Access Management (IAM) role with the following policy:

{"Version": "2012-10-17","Statement": [{"Effect": "Allow","Action":
["s3:ListAllMyBuckets","s3:PutObject"],"Resource":
"arn:aws:s3:::*"},{"Effect": "Allow","Action":
["codedeploy:*"],"Resource": "*"}]}

The setup will ask for the ARN of the IAM role so Atlassian can assume a role in your account, push code to your S3 bucket on your behalf, and do a deployment using CodeDeploy.

Once you’ve provided the role ARN, you’ll also be able to tell Bitbucket which S3 bucket to use for storing deployment artifacts and which CodeDeploy application to deploy to:

If you haven’t set up CodeDeploy yet, that’s okay—it’s easy to get started. Step one is to make sure you have an EC2 instance running the CodeDeploy Agent. Make sure you tag the instance with something that is identifiable, because tags are one way that CodeDeploy identifies the instances it should add to the deployment group. Once you have an instance running, sign in to the CodeDeploy console and choose Create New Application. In CodeDeploy, an application is a namespace that AWS CodeDeploy uses to correlate attributes, such as what code should be deployed and from where.

After you’ve created your application, you can specify a deployment group, which is a collection of EC2 instances that CodeDeploy will execute on for each deployment.
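For reference, the same CodeDeploy building blocks can be exercised programmatically. The boto3 sketch below starts a deployment of a revision that has already been staged in S3, which is essentially what the Bitbucket add-on automates for you; the application, deployment group, bucket, and key names are placeholders.

import boto3

codedeploy = boto3.client("codedeploy")

# Deploy a revision that has already been uploaded to S3.
response = codedeploy.create_deployment(
    applicationName="hello-world",
    deploymentGroupName="Production",
    revision={
        "revisionType": "S3",
        "s3Location": {
            "bucket": "my-deployment-artifacts",
            "key": "SampleApp_Linux.zip",
            "bundleType": "zip",
        },
    },
)
print("Started deployment:", response["deploymentId"])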

Now that the basics for CodeDeploy are configured, we need to tell CodeDeploy how to deploy to our instances by using an appspec.yml file.  Because the instances in my deployment group are just Apache 2.0 web servers, my AppSpec file tells CodeDeploy how to update the default index.html with my new code:

Now we’re ready to push code to my deployment group. From within my repo’s “Production” branch, I can simply choose Deploy to AWS:

Now I can check on the status of the deployment in the CodeDeploy console:

Finally, let’s see if the deployment was successful by viewing the DNS address of my instance in a browser:

So there you have it—a simple mechanism for pushing code directly to EC2 instances by using Atlassian Bitbucket and AWS CodeDeploy. We’re really excited to offer this integration, so check out the new Bitbucket add-on here! I also encourage you to check out Atlassian’s post on the integration.


Editor’s note on March 14th, 2016: We’ve received a few questions on two points covered in this blog post. We hope to provide clarity with the following information from the author. 

One major ask has been to use this feature for automated deployments in your CI/CD workflow. This Bitbucket integration is simply intended to streamline pushing code artifacts to instances more quickly; it isn’t intended to serve as your primary CI/CD workflow. If you’re looking to build a full continuous integration and continuous deployment workflow, take a look at Atlassian Bamboo. Bamboo is designed for CI, builds, and testing, and you can use the Bamboo CodeDeploy plugin to push code to instances.

https://utoolity.atlassian.net/wiki/display/TAWS/Using+the+AWS+CodeDeploy+Deployment+task+in+Bamboo/

The second most common question has been around the use of roles. There are actually two roles required for the setup in this post. One role, which we talk about explicitly in the post, is the cross-account IAM role, which allows Atlassian to take actions in your account (in this case, uploading to Amazon S3). The other role, which we don’t directly call out, is the CodeDeploy service role; we assume you’ll create it as part of the CodeDeploy setup process. This role allows the CodeDeploy service to act on your behalf, for example to query the instances in the deployment group. You can create this role by following this documentation:

http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-create-service-role.html