AWS Partner Network (APN) Blog

Category: AWS Elastic Beanstalk

How We Built a SaaS Solution on AWS, by CrowdTangle

by Kate Miller | on 03 NOV 2016 | in Amazon DynamoDB, Amazon RDS, Amazon Redshift, APN Partner Highlight, APN Technology Partners, AWS Elastic Beanstalk, Database, SaaS on AWS, Startups

The following is a guest post from Matt Garmur, CTO at CrowdTangle, a startup and APN Technology Partner that makes it easy for you to keep track of what’s happening on social media. Enjoy!

Horses were awesome.

If you had a messenger service 150 years ago, using horses was so much better than the alternative, walking. Sure, you had to hire people to take care of horses, feed them, and clean up after them, but the speed gains you got were easily worth the cost. And over time, your skills at building a business let you create systems that could handle each of these contingencies extremely efficiently.

And then cars came around, and you were out of luck.

Not immediately, of course. The first car on the street didn’t put you out of business. Even as cars got more mainstream, you still had the benefit of experience over startup car services. But once the first company grew up that was built with the assumption that cars existed, despite all your knowledge, you were in big trouble.

At CrowdTangle, we build some of the best tools in the world for helping people keep track of what’s happening on social media. We have a team of engineers and account folks helping top media companies, major league sports teams, and others find what they care about in real time (and we’re hiring!). Importantly, we started our company in 2011, which meant that AWS had been around for 5 years, and we could, and did, confidently build our entire business around the assumption that it would exist.

AWS was our car.

It may seem like an exaggeration, but it’s not. We were able to build an entirely different type of organization on AWS than we could have built five years prior. Specifically, it has impacted us in four critical ways: business model, hiring, projections and speed, which of course are all different ways of saying, “cost,” and thus, “survival.”

First is the business model. When we started developing our company, we didn’t consider producing physical media to hold our software, nor did we consider installing it on-premises. By making our model Software as a Service (SaaS), we got a lot of immediate benefits: we were able to allow users to try our product with no more effort than going to a website; we could push features and fixes dozens of times a day; and we could know that everyone would get the same controlled experience. But had we taken on the hosting ourselves, we would have needed a significant capital outlay at the start simply to deliver our product. Having AWS to begin on without those initial costs made SaaS a viable option for our growing startup.

Next is hiring. AWS has Amazon Relational Database Service (Amazon RDS), a managed database service, which means I don’t need to hire a DBA, since it’s coder-ready (and on Intel Xeon E5s, so we’re certainly not sacrificing quality). AWS has Elastic Beanstalk, a service that makes it simple for us to deploy our application on AWS, which means I can set up separate environments for front- and back-end servers, and scale them independently at the push of a button. Amazon DynamoDB, the company’s managed NoSQL database service, relieves me of the need to have four full-time engineers on staff keeping my database ring up and running. We keep terabytes of real-time data, get single-digit millisecond response times, and from my perspective, it takes care of itself. My team can focus on what matters for driving the growth of our business, because we don’t need to spend a single hire on keeping the lights on.

Third is projections. If you’re in the horse world, your purchasing model for computers is to run as close to capacity as possible until it’s clear you need a capital outlay. Then you research the new machine, contact your supplier, spend a lot of money at once, wait for shipping, install it, and when it goes out of service, try to resell it and recover some of the cost. In the car world, if I think we might need more machinery, even for a short time, I request an instance, have it available immediately, and start paying pennies or dollars by the hour. If I’m done with that instance? Terminate and I stop paying for it. If I need a bigger instance? I simply provision a bigger instance on the spot.

Finally, I want to talk about speed. Because of our choice to build our solution on AWS, we have a lean team that can provision resources faster, and can constantly work on fun projects rather than having to focus on simple maintenance. Not only can we move quickly on the scoped projects, but we can do cheap R&D for the moonshots. Every new project could be a bust or our next million-dollar product, but they start the same — have an idea, clone an existing environment, put your project branch on it, trot it out for clients to play with, and spin it down when done.

We recently decided that an aggregation portion of our system was slower than we liked, and we researched moving it to Amazon Redshift. To do so, we spun up a small Redshift instance (note: no projections), did initial testing, then copied our entire production database into Redshift (note: R&D speed). “Production” testing proved the benefits, so now we have an entire secondary Amazon Kinesis-Redshift managed pipeline for our system (note: no hiring, despite adding systems), and the speed increase has opened the door for new products that weren’t possible for us under the prior method. How much would that experimentation cost in the horse world? What would it have taken to execute? Would any of those projects have been small enough to be worth taking a chance on? We place small bets all the time, and that’s what helps us remain a leader in our field.

Your next competitor will have grown up in the age of cars. How can you compete when you have horses?

To learn more about CrowdTangle, click here.

The content and opinions in this blog are those of the third party author and AWS is not responsible for the content or accuracy of this post.

How AWS Partners Can Optimize Healthcare by Orchestrating HIPAA Workloads on AWS

by Niranjan Hira | on 06 JUL 2016 | in Amazon ECS, AWS Elastic Beanstalk, AWS Lambda, AWS Partner Solutions Architect (SA) Guest Post, Healthcare

Christopher Crosbie was also a contributor on this post.

In his new book, The Death of Cancer, world-renowned oncologist Dr. Vincent T. DeVita Jr. laments: “… it illustrates what has been, for me, a source of perennial frustration: at this date, we are not limited by the science; we are limited by our ability to make good use of the information and treatments we already have.”[1] This frustration with the sluggish pace of technology adoption in healthcare is uncomfortably familiar. To make matters worse, illegal access of medical records continues at an alarming rate; as the Financial Times reported in December 2015, over 100m health records were hacked in 2015.[2] Not only have we been slow to employ new technologies, we have also come to tolerate a surprising dearth of privacy.

In this post, we will explore some ways AWS partners can use AWS orchestration services in conjunction with a DevSecOps methodology to deliver technology solutions that help to optimize healthcare while maintaining HIPAA compliance to safeguard patient privacy.[3]

HIPAA-eligible Services and Orchestration Services

Let’s start with HIPAA-eligible services and how they can be used together with orchestration services for healthcare applications. Customers who are subject to HIPAA and store Protected Health Information (PHI) on AWS must designate their account as a HIPAA account. Customers may use any AWS service in these accounts, but only services defined as HIPAA-eligible services in the Business Associates Agreement (BAA) should be used to process, store, or transmit PHI.

AWS follows a standards-based risk management program to ensure that the HIPAA-eligible services specifically support the security, control, and administrative processes required under HIPAA. Using these services to store and process PHI allows AWS and AWS customers to address the HIPAA requirements applicable to our utility-based operating model.

AWS is constantly seeking customer feedback for new services to add to the AWS HIPAA compliance program. Currently there are nine HIPAA-eligible services, including Amazon DynamoDB, Amazon Elastic Block Store (Amazon EBS), Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic MapReduce (EMR), Elastic Load Balancing (ELB), Amazon Glacier, Amazon Relational Database Service (RDS) (MySQL and Oracle engines), Amazon Redshift, and Amazon Simple Storage Service (Amazon S3) (excluding Amazon S3 Transfer Acceleration).

Just because a service is not HIPAA-eligible doesn’t mean that you can’t use it for healthcare applications. In fact, many services you would use as part of a typical DevSecOps architecture pattern are only used to automate and schedule automation activities, and therefore do not store, process, or transmit PHI. As long as only HIPAA-eligible services are used to store, process, or transmit PHI, you may be able to use orchestration services such as AWS CloudFormation, AWS Elastic Beanstalk, Amazon EC2 Container Service (Amazon ECS), and AWS OpsWorks to assist with HIPAA compliance and security by automating activities that safeguard PHI.
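The rule above can be sketched as a simple design check. This is only an illustration, not an AWS tool: the set below copies the nine services named in this post (the list changes over time), and the `design` mapping and the function name are hypothetical.

```python
# Services named as HIPAA-eligible in this post. Caveats from the post:
# RDS covers the MySQL and Oracle engines only, and S3 excludes
# S3 Transfer Acceleration.
HIPAA_ELIGIBLE = {
    "Amazon DynamoDB",
    "Amazon EBS",
    "Amazon EC2",
    "Amazon EMR",
    "Elastic Load Balancing",
    "Amazon Glacier",
    "Amazon RDS",
    "Amazon Redshift",
    "Amazon S3",
}


def services_needing_review(design):
    """Return services that touch PHI but are not in HIPAA_ELIGIBLE.

    `design` maps a service name to True if that service stores, processes,
    or transmits PHI, and to False if it only orchestrates.
    """
    return sorted(
        name
        for name, touches_phi in design.items()
        if touches_phi and name not in HIPAA_ELIGIBLE
    )


# Hypothetical design: the orchestration services never see PHI.
design = {
    "AWS CloudFormation": False,
    "AWS Elastic Beanstalk": False,
    "Amazon EC2": True,
    "Amazon EBS": True,
}
print(services_needing_review(design))
```

An empty result means every service marked as touching PHI is on the eligible list; it says nothing about whether the marking itself is accurate, which remains a design review question.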

Let’s walk through a few example scenarios using AWS Elastic Beanstalk, Amazon ECS, and AWS Lambda to demonstrate how AWS partners have used orchestration services to optimize their healthcare application to meet their own HIPAA eligibility requirements.

AWS Elastic Beanstalk Example

Consider an internal-facing IIS web application that is deployed using AWS Elastic Beanstalk.

Figure 1: Elastic Beanstalk Application

Set up the network

Let’s start with a simple AWS CloudFormation template that does a few things for us (the template uses one Availability Zone for illustration). First, it sets up an Amazon VPC so that EC2 instances launched in this VPC are launched with their tenancy attribute set to dedicated. It then creates a public subnet and a private subnet with the necessary routing and networking components so that instances in the private subnet can connect to the Internet as needed.

Here’s the AWS CloudFormation template that sets up networking: https://github.com/awslabs/apn-blog/blob/master/orchestrating/vpc-with-public-and-private-subnets.json
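To give a feel for the shape of such a template without opening the link, here is a minimal sketch that builds a comparable structure in Python and prints it as JSON. It is not the linked template: the logical names, CIDR ranges, and Availability Zone are assumptions, and the routing, Internet gateway, and NAT pieces are left out.

```python
import json


def network_template(az="us-west-2a"):
    """Build a minimal CloudFormation template: a dedicated-tenancy VPC
    with one public and one private subnet (routing omitted)."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "Vpc": {
                "Type": "AWS::EC2::VPC",
                "Properties": {
                    "CidrBlock": "10.0.0.0/16",
                    # Instances launched into this VPC default to dedicated tenancy.
                    "InstanceTenancy": "dedicated",
                },
            },
            "PublicSubnet": {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    "VpcId": {"Ref": "Vpc"},
                    "CidrBlock": "10.0.0.0/24",
                    "AvailabilityZone": az,
                    "MapPublicIpOnLaunch": True,
                },
            },
            "PrivateSubnet": {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    "VpcId": {"Ref": "Vpc"},
                    "CidrBlock": "10.0.1.0/24",
                    "AvailabilityZone": az,
                    "MapPublicIpOnLaunch": False,
                },
            },
        },
    }


template = network_template()
print(json.dumps(template, indent=2))
```

The key line for this scenario is `InstanceTenancy`, since it is what makes instances in the VPC launch as Dedicated Instances.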

Set up the application

Now we have a logical network where we can launch Dedicated Instances so that they are only accessible internally. Assuming we also have an Amazon EBS snapshot that we can use as the base image for our encrypted EBS volume and the Elastic Beanstalk application bundle that we wish to deploy, we can use an AWS CloudFormation template to easily set up this application. First, we set up a new Elastic Beanstalk application (we could also use the .NET Sample Application from the AWS Elastic Beanstalk tutorial); then, we make our bundle available as an application version; finally, we launch an Elastic Beanstalk environment so we can interact with our new web service that needs to process PHI.

Here’s the AWS CloudFormation template: https://github.com/awslabs/apn-blog/blob/master/orchestrating/eb-iis-app-in-vpc.json

We would also configure the Elastic Load Balancer for end-to-end encryption so secure connections are terminated at the load balancer and traffic from the load balancer to the back-end instances is re-encrypted. Another option here would be to have the load balancer relay HTTPS traffic without decryption, but then the load balancer would not be able to optimize routing or report response metrics because it would not be able to “see” requests.
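One way to express the end-to-end encryption choice is through Elastic Beanstalk option settings. The sketch below assumes the environment uses a Classic Load Balancer configured through the `aws:elb:listener:443` namespace; the certificate ARN is a placeholder, and the instances are assumed to serve HTTPS on port 443 themselves. It is not taken from the linked template.

```python
def https_listener_options(certificate_arn):
    """Return option settings for an HTTPS listener that re-encrypts
    traffic to the back-end instances (HTTPS in, HTTPS out)."""
    namespace = "aws:elb:listener:443"
    settings = {
        "ListenerProtocol": "HTTPS",          # client to load balancer
        "SSLCertificateId": certificate_arn,  # certificate presented to clients
        "InstancePort": "443",                # load balancer to instance
        "InstanceProtocol": "HTTPS",
    }
    return [
        {"Namespace": namespace, "OptionName": name, "Value": value}
        for name, value in settings.items()
    ]


# Placeholder ARN for illustration only.
options = https_listener_options(
    "arn:aws:iam::123456789012:server-certificate/example-cert"
)
for option in options:
    print(option)
```

A list in this shape could be used as the `OptionSettings` of an Elastic Beanstalk environment; setting `InstanceProtocol` to `HTTP` instead would terminate encryption at the load balancer, which is not what this scenario calls for.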

Note how we used the power of AWS CloudFormation and Elastic Beanstalk but left our application (which is running on EC2 Dedicated Instances with encrypted EBS volumes) with the responsibility to store, process, and transmit PHI.

Did it work?

The new application is visible on the Elastic Beanstalk console with a URI for the load balancer. We then log into the bastion host and use the browser there to confirm the application is running.

Amazon ECS Example

In this second scenario, we look at Amazon EC2 Container Service (Amazon ECS) and Amazon EC2 Container Registry (Amazon ECR). Consider a PHP application that runs on Docker.

Since ECS is only orchestrating and scheduling application containers that are deployed and run on a cluster of EC2 instances, ECS can be used under the AWS BAA because the actual PHI is managed on EC2 (a HIPAA-eligible service). We must still ensure that EC2 instances processing, storing or transmitting PHI are launched with dedicated tenancy and that PHI is encrypted at rest and in transit. In addition, we must ensure that no PHI leaks into any configuration or metadata such as a task definition. As an example, this means the definition of what applications to start must not contain any PHI, and the exit string of a failed container must not contain PHI as this data is persisted in the ECS service itself, outside of EC2.
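A lightweight guard against PHI leaking into a task definition could look like the sketch below. The patterns are purely illustrative assumptions (a US Social Security number format and a few suspicious variable-name fragments); a real check would follow your own PHI inventory, and passing this scan does not demonstrate that a task definition is free of PHI.

```python
import re

# Illustrative heuristics only, not a complete PHI detector.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
SUSPECT_NAME_PARTS = ("PATIENT", "SSN", "DOB")


def find_suspect_settings(task_definition):
    """Return (container, setting, reason) tuples for environment variables
    and command arguments that look like they might carry PHI."""
    findings = []
    for container in task_definition.get("containerDefinitions", []):
        cname = container.get("name", "<unnamed>")
        for env in container.get("environment", []):
            name, value = env["name"], env["value"]
            if any(part in name.upper() for part in SUSPECT_NAME_PARTS):
                findings.append((cname, name, "suspect variable name"))
            elif SSN_PATTERN.search(value):
                findings.append((cname, name, "value looks like an SSN"))
        for arg in container.get("command", []):
            if SSN_PATTERN.search(arg):
                findings.append((cname, "command", "argument looks like an SSN"))
    return findings


# Hypothetical task definition with one questionable environment variable.
sample_task_definition = {
    "family": "aws-phi-demo",
    "containerDefinitions": [
        {
            "name": "web",
            "environment": [
                {"name": "LOG_LEVEL", "value": "info"},
                {"name": "PATIENT_NAME", "value": "example"},
            ],
            "command": ["apache2ctl", "-D", "FOREGROUND"],
        }
    ],
}
print(find_suspect_settings(sample_task_definition))
```

A check like this fits naturally into a build pipeline, run before a task definition is registered.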

Similarly, we can use ECR to house our Docker images as long as we can ensure that the images themselves do not contain any PHI. As an example, images that require cached PHI or PHI as “seed” data must not be added to ECR. A custom registry on dedicated tenancy EC2 instances with encrypted volumes might serve as an alternative.

Set up ECR & ECS

We followed the Getting Started Guide to create the IAM roles that are required and then created a repository for Docker images.

# from local console using aws-shell (https://github.com/awslabs/aws-shell)
aws> ecr create-repository --region us-west-2 --repository-name aws-phi-demo

Set up the network

This AWS CloudFormation template creates a VPC with dedicated tenancy for use with ECS, a private subnet, and a public subnet. It creates the corresponding route tables, Internet gateway, and managed NAT gateway. It also adds the appropriate network routes. Then, it creates an ECS security group for container instances in the VPC. Finally, it creates a Linux bastion host. For illustration, we also use this instance as our Docker-admin instance to manage Docker images (you can, of course, build and manage Docker images in other ways).

Here’s the template: https://github.com/awslabs/apn-blog/blob/master/orchestrating/ecs-vpc-with-public-and-private-subnets.json

Set up Docker

Next, we created a Docker image for the application.

# log in to the Docker-admin instance (get IP address from the Outputs tab)
$ ssh -i <path-to-private-key> ec2-user@<docker-admin-ip-address>

# install packages needed by new image (for illustration, we'll use a simple php web app)
$ sudo yum install git aws-cli
$ git clone https://github.com/awslabs/ecs-demo-php-simple-app
$ cd ecs-demo-php-simple-app/

# optional - update Dockerfile to use ubuntu:14.04
# change FROM to ubuntu:14.04
# change source location to /var/www/html (from /var/www)
# change to use apache2ctl (instead of apache)

# build the Docker image from the Dockerfile and tag it as aws-phi-demo for the default registry (AWS ECR)
# substitute ${awsAccountId} with your AWS Account ID (e.g., 123456789012)
$ export awsAccountId=...
$ docker build -t aws-phi-demo .

# verify that the image was created and tagged correctly (it is pushed to ECR in a later step)
$ docker images

# run the newly created Docker image mapping ports as needed
$ docker run -p 80:80 -p 443:443 aws-phi-demo

# confirm that the site is available (e.g., use another terminal session on the docker-admin instance)
$ curl --verbose "http://localhost/"

# stop the Docker container (docker ps, docker stop from the other terminal)

We pushed the image so it became available in ECR.

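# get login command for docker (need to set up credentials first or run this somewhere else)
# note: AWS CLI v2 removed get-login; there, pipe `aws ecr get-login-password` into `docker login --password-stdin`
$ aws ecr get-login --region us-west-2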

# login using docker login command (provided above)

# tag the image
$ docker tag aws-phi-demo:latest \
    ${awsAccountId}.dkr.ecr.us-west-2.amazonaws.com/aws-phi-demo:latest

# push the image to the new repo
$ docker push ${awsAccountId}.dkr.ecr.us-west-2.amazonaws.com/aws-phi-demo:latest

# confirm the image is now available (e.g., from aws-shell on the local console)
aws> ecr list-images --repository-name aws-phi-demo

Start up a cluster

This AWS CloudFormation template creates an ECS cluster with an internal-facing load balancer. It then uses the ECR reference to set up a task definition and a service for the Docker image that has been provided. Finally, it sets up a launch configuration and an Auto Scaling Group to launch some container instances to host the application.

Note: Container instances are created with encrypted volumes, so data is protected at rest (Docker creates LVM volumes from the EBS volumes provided).

Here’s the template: https://github.com/awslabs/apn-blog/blob/master/orchestrating/ecs-php-app-in-vpc.json
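The "ECR reference" that the task definition needs is just the image URI used in the `docker tag` and `docker push` steps. As a rough sketch (not the linked template), the helper below assembles that URI and places it in a minimal container definition; the family name, memory size, port mapping, and the sample account ID are assumptions.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build the ECR image URI in the same form used for docker tag/push."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def build_task_definition(image_uri):
    """Return a minimal ECS task definition body for a single web container.
    Nothing in here may contain PHI, since ECS persists it outside EC2."""
    return {
        "family": "aws-phi-demo",
        "containerDefinitions": [
            {
                "name": "aws-phi-demo",
                "image": image_uri,
                "memory": 256,
                "essential": True,
                "portMappings": [{"containerPort": 80, "hostPort": 80}],
            }
        ],
    }


# 123456789012 is a placeholder account ID.
uri = ecr_image_uri("123456789012", "us-west-2", "aws-phi-demo")
print(uri)
print(build_task_definition(uri)["containerDefinitions"][0]["image"])
```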

Did it work?

We created an internal-facing application, so we used the Docker-admin instance to confirm it’s available.

# from the Docker-admin instance, confirm that the app is available using the load balancer, for example...
$ curl --verbose "http://internal-ecs-demo-EcsElasti-11YT610RK4VGN-499766265.us-west-2.elb.amazonaws.com"

As with the previous example, we see how another set of orchestration services helps to automate application activities while relying on HIPAA-eligible services to store, process, and transmit PHI.

AWS Lambda Example

Consider the scenario where PHI is posted to an S3 bucket and we need an application similar to the ones above to process the data and store results back to S3. Can we use S3 event notifications with a Lambda function? Yes, we can, as long as the Lambda function does not store, process, or transmit PHI.
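To make the constraint concrete, here is a minimal sketch of an orchestration-only handler. It is written in Python for illustration and is not the function packaged in this post. It reads only the bucket name and object key from the S3 event, never the object bytes, and hands an S3 URI to a PHI service running on HIPAA-eligible infrastructure. `notify_phi_service` is a hypothetical placeholder, and the sketch assumes object keys themselves contain no PHI.

```python
from urllib.parse import unquote_plus


def s3_uris_from_event(event):
    """Extract s3:// URIs from an S3 event notification without
    touching the objects themselves."""
    uris = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        uris.append(f"s3://{bucket}/{key}")
    return uris


def notify_phi_service(uri):
    """Hypothetical hook: a real version would call the PHI-processing
    service (for example, over an internal HTTPS endpoint)."""
    print(f"would ask the PHI service to process {uri}")


def handler(event, context):
    """Lambda entry point: forward locations only, never PHI content."""
    uris = s3_uris_from_event(event)
    for uri in uris:
        notify_phi_service(uri)
    return {"forwarded": len(uris)}


# Trimmed-down sample event for local experimentation.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "phi-demo"},
                "object": {"key": "data/test-001.phi"}}}
    ]
}
print(handler(sample_event, None))
```

Because the function never calls `GetObject` on the data, the PHI stays within S3 and the service that processes it.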

Set up an S3 bucket

Let’s start with a bucket.

# from aws-shell
# create the bucket
aws> s3 mb --region us-west-2 s3://phi-demo

# attach a policy that requires encrypted uploads (using SSE-S3)
aws> s3api put-bucket-policy --bucket phi-demo --policy \
'{
   "Version":"2012-10-17",
   "Id":"PutObjPolicy",
   "Statement":[{
         "Sid":"DenyUnEncryptedObjectUploads",
         "Effect":"Deny",
         "Principal":"*",
         "Action":"s3:PutObject",
         "Resource":"arn:aws:s3:::phi-demo/*",
         "Condition":{
            "StringNotEquals":{
               "s3:x-amz-server-side-encryption":"AES256"
            }
         }
      }
   ]
}'

# confirm the policy is in effect
aws> s3 cp ip-ranges.json "s3://phi-demo/" 
upload failed: ./ip-ranges.json to s3://phi-demo/ip-ranges.json A client error 
(AccessDenied) occurred when calling the PutObject operation: Access Denied

# use SSE-S3 encryption this time
aws> s3 cp --sse AES256  ip-ranges.json "s3://phi-demo/" 
upload: ./ip-ranges.json to s3://phi-demo/ip-ranges.json

# confirm SSE is turned on
aws> s3api head-object --bucket phi-demo --key "ip-ranges.json"

# add a "folder" for data
aws> s3api put-object --bucket phi-demo --key "data/"
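The bucket policy's Deny statement is easy to misread, so here is a small sketch of how its condition behaves. It only mimics the `StringNotEquals` test on the `x-amz-server-side-encryption` header and is not a policy evaluator; the function name is made up for illustration.

```python
def upload_denied(headers, required="AES256"):
    """Mimic the Deny statement in the bucket policy.

    StringNotEquals also evaluates to true when the header is absent,
    so an upload that does not request SSE at all is denied.
    """
    return headers.get("x-amz-server-side-encryption") != required


# No SSE header: denied, matching the AccessDenied error in the walkthrough.
print(upload_denied({}))
# SSE-S3 requested: allowed through.
print(upload_denied({"x-amz-server-side-encryption": "AES256"}))
# SSE-KMS requested: also denied by this particular policy.
print(upload_denied({"x-amz-server-side-encryption": "aws:kms"}))
```

The last case is worth noting: as written, the policy accepts only SSE-S3, so it would need adjusting if you later switch to KMS-managed keys.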

Set up AWS Lambda

Then, we need to set up a new Lambda function that is invoked when an object is added to the bucket.

# create the IAM Role we need and attach the access policy to it
aws> iam create-role --role-name lambda_s3_exec_role --assume-role-policy-document \
'{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}'

aws> iam put-role-policy --role-name lambda_s3_exec_role \
--policy-name s3_access --policy-document \
'{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::phi-demo/*"
      ]
    }
  ]
}'

# make function source available (as an example, invoke PHI service with S3 URI)
# use the s3-get-object blueprint as a guide, but remember that the function
# must not retrieve the PHI bytes
aws> s3 cp orchestrate-phi-demo.zip s3://phi-demo/src/lambda/orchestrate-phi-demo.zip

# create a function (replace ${account} with the correct account number)
aws> lambda create-function --cli-input-json \
'{
    "FunctionName": "orchestrate-phi-demo", 
    "Runtime": "nodejs", 
    "Role": "arn:aws:iam::${account}:role/lambda_s3_exec_role", 
    "Handler": "index.handler", 
    "Code": {
        "S3Bucket": "phi-demo", 
        "S3Key": "src/lambda/orchestrate-phi-demo.zip"
    }, 
    "Description": "Function that will orchestrate phi-demo", 
    "Timeout": 300, 
    "MemorySize": 128, 
    "Publish": true
}'

# allow S3 to invoke the function (include a source-account so that only the
# intended bucket, owned by your account, can trigger it; replace ${account}
# with the correct account number)
aws> lambda add-permission --region us-west-2 --function-name orchestrate-phi-demo \
  --action "lambda:InvokeFunction" --principal s3.amazonaws.com \
  --statement-id 20160502-phi-demo-lambda --source-arn arn:aws:s3:::phi-demo \
  --source-account ${account}

# configure Notifications so the function is invoked when s3 Objects are created
# (replace ${account} with the correct account number)
aws> s3api put-bucket-notification-configuration --bucket phi-demo --region us-west-2 --notification-configuration \
'{
    "LambdaFunctionConfigurations": [
        {
            "Id": "orchestrate-phi-demo-nc",
            "LambdaFunctionArn": "arn:aws:lambda:us-west-2:${account}:function:orchestrate-phi-demo",
            "Events": [
                "s3:ObjectCreated:*"
            ],
            "Filter": {
                "Key": {
                    "FilterRules": [
                        {
                            "Name": "prefix",
                            "Value": "data/"
                        },
                        {
                            "Name": "suffix",
                            "Value": ".phi"
                        }
                    ]
                }
            }
        }
    ]
}'
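The filter rules in the notification configuration mean that only keys matching both the prefix and the suffix trigger the function. A minimal sketch of that matching logic, with the values copied from the configuration and a made-up function name:

```python
def matches_filter(key, prefix="data/", suffix=".phi"):
    """Return True if an object key satisfies both filter rules."""
    return key.startswith(prefix) and key.endswith(suffix)


# A data file under data/ with the .phi suffix triggers the function.
print(matches_filter("data/test-001.phi"))
# The "folder" placeholder object does not.
print(matches_filter("data/"))
# Neither does the function's own source bundle stored in the same bucket.
print(matches_filter("src/lambda/orchestrate-phi-demo.zip"))
```

This is why the function source can live in the same bucket without causing the function to be invoked on its own upload.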

Did it work?

When a new object is added under data/ with the .phi suffix, S3 invokes the function, which in turn invokes the corresponding PHI service.

# add some object
aws> s3 cp ./test.phi s3://phi-demo/data/test-001.phi

Use CloudWatch Logs to confirm that the function executed successfully.

Conclusion

We have just reviewed three simple examples of how HIPAA-eligible services can be used with orchestration services to safeguard patient privacy while delivering solutions that can help optimize healthcare applications. AWS partners have tools at their disposal to address the need to manage HIPAA-compliant workloads. Healthcare customers should not fear non-eligible services just because they are non-eligible. Some services don’t typically interact with PHI so they may not need to be “eligible”.

[1] DeVita, Vincent T., and Elizabeth DeVita-Raeburn. The Death of Cancer: After Fifty Years on the Front Lines of Medicine, a Pioneering Oncologist Reveals Why the War on Cancer Is Winnable–and How We Can Get There. Page 32. New York: Sarah Crichton Books; Farrar, Straus, and Giroux, 2015.

[2] Scannell, Kara, and Gina Chon. “Cyber Security: Attack of the Health Hackers – FT.com.” Financial Times. N.p., 21 Dec. 2015. Web. 05 July 2016.

[3] Note that this post is not intended to guarantee HIPAA compliance – customers should ensure that any AWS services are used consistent with their HIPAA compliance programs, that risks are addressed in their risk analysis and risk management programs, and that they consult with qualified HIPAA counsel or consultants when appropriate.