AWS Government, Education, & Nonprofits Blog

Imagine: A Better World – A Global Nonprofit Conference Recap

To further the achievement of the United Nations’ Sustainable Development Goals, we presented Imagine: A Better World – a global nonprofit conference where over 270 nonprofit leaders from around the globe convened at the Amazon Meeting Center in Seattle for a unique and collaborative learning experience.

There were three high-level conference themes:

  1. Overcoming global challenges through technology
  2. Increasing scale and reach through effective marketing and fundraising
  3. Powering mission and marketing efforts through Amazon’s social good community

AWS hosted the Smart Technology and the Sustainable Development Goals track, where attendees learned best practices and engaged in interactive dialogues around technology’s role in ensuring everyone has the opportunity to live a life of dignity on a healthy planet.

AWS, the American Heart Association, AARP Foundation, and Global Citizen took the stage to deliver keynotes outlining their vision, their work, and the possibilities for changing the world.

The track included concurrent sessions where Amazon specialists and nonprofit thought leaders shared learnings and best practices in social media, digital presence, and community engagement. Networking events closed out each day, giving all participants the opportunity to collaborate with one another and engage with the various teams across Amazon dedicated to driving impact in the social sector. These included AWS Open Data, AmazonSmile, Amazon Media Group, AWS Cloud Credits for Research, We Power Tech, Amazon Business, Amazon Pay, Amazon Web Services, AWS Educate, and Merch by Amazon.

Learn more about how AWS can help your organization.

Security Assurance Package Submitted to the Government of Canada

Amazon Web Services has reached a major milestone in its ability to drive cloud transformation for the Government of Canada (GC). This week, we demonstrated to GC that we are ready to meet the Cloud Security Profile for Protected B / Medium Integrity / Medium Availability (PBMM). This means that AWS aligns with how GC IT Services leverages internationally recognized, widely accepted accreditations, empowering adopting organizations to benefit from independently validated audits and assessments without redundant work.

Over the past several months, AWS worked closely with GC to understand its cloud security requirements and authorization needs. AWS proved valuable to the GC due to its extensive experience with the following international and national security accreditations: ISO 27001, Service Organization Controls (SOC), FedRAMP Moderate and High, and most recently the U.S. Defense Department’s Impact Level 5 for sensitive controlled unclassified information.

By aligning with prevailing security accreditations and addressing any residual requirements, AWS demonstrated how its capabilities meet, and in many cases, exceed GC’s security watermark as validated by an independent assessor. Governments such as the UK and Australia have already reaped the benefits of accreditation reciprocity.

Such an approach will allow GC to authorize cloud technology in a secure, compliant manner to transform the delivery of high-value services to Canadian citizens.

Get Your University Ready for NIST 800-171

The deadline to implement National Institute of Standards and Technology (NIST) Special Publication 800-171 is fast approaching. Beginning in January 2018, if you have not taken action, you may miss out on government funding that stipulates its implementation.

In 2015, NIST published Special Publication (SP) 800-171 – Protecting Controlled Unclassified Information in Non-federal Information Systems and Organizations – introducing standards for non-federal entities, such as academic institutions working under a government contract. NIST 800-171 distills the security controls from a larger NIST publication, NIST 800-53, to help non-federal entities apply controlled unclassified information (CUI) controls to their environments. When NIST 800-171 was published, it specified a grace period that ends on December 31, 2017; compliance with the framework is therefore mandatory beginning in 2018.

Many universities are turning to AWS to leverage the robust controls in place to maintain security and data protection in the cloud and be compliant with NIST 800-171 rather than overhauling their existing environment or data center facilities. For example, as Purdue University mentions in a recent article published on Educause, AWS allowed them to create a separate domain for controlled research without negatively impacting their existing facilities.

AWS makes compliance easy by providing free NIST 800 Quick Starts. The Quick Start is a reference deployment guide that discusses architectural considerations and steps for deploying NIST 800-53 and 800-171 on the AWS Cloud. In addition, the Quick Starts include an AWS CloudFormation template that automates the heavy lifting required to deploy the reference architecture. Also, the Quick Starts include a security controls matrix, which maps the architecture components to the requirements specified in NIST 800-53 and NIST 800-171.

To get started, view the Quick Start guide in HTML or PDF. To launch the Quick Start, either click the link in your browser or paste the URL into the CloudFormation console in US-East-1 as shown below:


If you need assistance with an enterprise implementation of the capabilities introduced through this Quick Start, AWS Professional Services offers an Enterprise Accelerator – Compliance service to guide and assist with the training, customization, and implementation of deployment and maintenance processes.

Please contact your AWS Account Manager for further information, or send an inquiry to:

How to Buy: Cloud Procurement Made Easy

When it comes to cloud computing, the purchase process can seem daunting. Fortunately, AWS can help. Our experts at AWS have assisted many government IT leaders in selecting the right acquisition approach for their agency.

Download the on-demand webinar on How to Buy: Cloud Procurement Made Easy.

  • Find out how the cloud alleviates upfront costs with a pay-as-you-go model (you pay only for what you use).
  • Discover steps for structuring your cloud procurement strategy.
  • Examine the different purchase models available to support your agency-specific needs.
  • Learn how to shift from capital expenditure to operational spending.
  • Uncover partners and contract vehicles available to government agencies for cloud procurements.

For more on how to buy cloud, visit the How to Buy section of our website.

Recap of the AWS Public Sector Summit – Canberra

We just wrapped the AWS Public Sector Summit in Canberra, Australia where 900+ attendees participated in workshops, roundtables, bootcamps, breakout sessions, and a keynote delivered by Teresa Carlson, Vice President of Worldwide Public Sector at AWS.

Teresa was joined onstage by Australia Post, Geoscience Australia, and an adviser to the Australian government, who shared how they use the AWS Cloud to strengthen cyber security, improve service delivery to the public, and innovate faster.

Watch the keynote video on-demand.

Throughout the packed day, attendees could opt for sessions spanning Data and Analytics, Security, Industry & Innovation, and Developer tracks, based on their business and technical interests.

A few featured sessions include:

  • How Novel Compute Technology Transforms Medical and Life Science Research: Genomic research has leapfrogged to the forefront of big data and cloud solutions. This session outlined how to deal with “big” (many samples) and “wide” (many features per sample) data on Apache Spark. Attendees also learned best practices for keeping runtime constant by using automatically scalable microservices such as AWS Lambda, as well as how AWS technology has powered research at CSIRO.
  • Terraforming Geoscience with Infracode: Geoscience Australia welds science and technology with tools such as Terraform on AWS, to examine the geology and geography of Australia. The organization gave us an inside look at how it secures Australia’s natural resources, builds Earth Observation infrastructure, and analyzes geoscientific data. Learn how Geoscience Australia is taking advantage of this and other innovations – including Packer and CI/CD – to drive change, improve developer experience, and deliver value to users.
  • Robots: The Fading Line Between Real and Virtual Worlds: Our Summit audience got to witness how live, virtual 3D worlds rendered with Amazon Lumberyard – a free, cross-platform 3D game engine – interconnect with IoT devices in the real world. This session illustrated how AWS IoT can be used to remotely control physical objects such as Sphero robots over Bluetooth. Attendees observed how AWS IoT and AWS Lambda empower users to create bi-directional communication with moving robots, which can detect collisions in a virtual world created through Amazon’s game engine. Learn how voice commands control physical and virtual robots using AWS IoT through the Alexa Skills Kit and Amazon Echo.

View all breakout sessions videos.

Interested in attending more AWS Summits? Find them in cities near you.

Building a Serverless Analytics Solution for Cleaner Cities

Many local administrations deal with air pollution through the collection and analysis of air quality data. With AWS, cities can now create a serverless analytics solution using AWS Glue and Amazon QuickSight for processing and analyzing a high volume of sensor data. This allows you to start quickly without worrying about servers, virtual machines, or instances, so you can focus on your core business logic to help your organization meet its analytics objectives.

You can pick up data from on-premises databases or Amazon Simple Storage Service (Amazon S3) buckets, transform it, and then store it for analytics in your Amazon Redshift data warehouse or in another S3 bucket in a query-optimized format. In this blog, we will read PM2.5 sensor data, transform the dataset, and visualize the results.

Figure 1: High level architecture of serverless analytics for air pollution analysis.

Data Collection

In a real-life scenario, you would start with data acquisition by leveraging services like AWS Greengrass, AWS IoT, or Amazon Kinesis Firehose. In this example, our data is stored in S3 buckets and we will be working with two data sources: raw sensor data in CSV files and sensor metadata in JSON and CSV format.

AWS Glue, a managed ETL (extract, transform, and load) service, works with a data catalog containing databases and tables. The air pollution data is stored as files in S3 buckets. Those files are analyzed, manually or automatically, in a given context (a database), and the detected data formats are used to create schemata (tables). The data catalog also contains the processes that inspect the data (crawlers) and the data parsers (classifiers).

Figure 2: How data is organized in the S3 bucket

Creating a Crawler

With the AWS Glue crawler, schemata can be defined automatically. Point the crawler at your data and it will recognize the format based on built-in classifiers; you can write your own custom classifiers if needed. In cases where you cannot use the crawler, the data structure can also be defined manually.

Figure 3: Adding a Glue Crawler

Switch to “crawlers” in the “data catalog” section. For automatically analyzing data structures, click “add crawler” and follow the wizard:

  1. Provide a name and an IAM role for the crawler. Check the documentation for the permissions a crawler needs in order to function. You can also provide a description and custom classifiers at this stage if you need them.
  2. Enter one or more data sources. For S3 data, choose the bucket and, if necessary, the folder. By default, the crawler will analyze every subfolder from the starting point and you need to provide an exclude pattern if this is not desired.
  3. Define a schedule for when to execute the crawler. This is useful if the source data is expected to change. Since our data won’t change structurally, we’ll choose “run on demand.”
  4. Choose the output of a crawler – the database. Use an existing one or let the crawler create one. Remember that the crawler will inspect the files in the S3 bucket and every distinct data structure identified will show up as a table in this database.
  5. Review and confirm.
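The wizard's choices correspond roughly to the parameters of the Glue CreateCrawler API. A minimal sketch in plain Python, shaped as the request payload boto3's Glue client would take (the bucket, role, and database names here are hypothetical):

```python
def build_crawler_payload(name, role_arn, s3_path, database):
    # Sketch of a Glue CreateCrawler request. With boto3 you would pass this
    # dict to glue_client.create_crawler(**payload). Step 3 (the schedule) is
    # omitted entirely: with no Schedule, the crawler runs on demand.
    return {
        "Name": name,                                    # step 1: crawler name
        "Role": role_arn,                                # step 1: IAM role with Glue + S3 permissions
        "Targets": {"S3Targets": [{"Path": s3_path}]},   # step 2: data source
        "DatabaseName": database,                        # step 4: output database
    }

payload = build_crawler_payload(
    "sensordata-crawler",
    "arn:aws:iam::123456789012:role/GlueCrawlerRole",
    "s3://glue-article-cleanercities/sensordata/",
    "airquality",
)
print(payload["Targets"]["S3Targets"][0]["Path"])
```

Building the payload separately from the API call also makes the configuration easy to review or version-control before the crawler is created.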

Adding a Custom Classifier

Figure 4: Showing subtle difference in input data formats

We could launch the crawler now, but our sensors provide values in both floating-point and integer format, and the crawler would recognize these as two different tables. You can see how this works by default and then delete the tables afterwards. Or, to import the files as one data structure, we can introduce our own data classifier for CSV-based data.

On the “data catalog” tab, choose section “classifiers” and add a new classifier by defining a Grok pattern and additional custom patterns if needed. In our example, data is structured as CSV with two columns. The first column is a timestamp; the second is a number. All analyzed files that match this pattern will end up in the same table.
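As an analogy only – Glue's Grok classifiers use named patterns such as %{TIMESTAMP_ISO8601} rather than raw Python regexes – the idea of a single pattern covering both numeric variants can be sketched with the standard `re` module:

```python
import re

# One pattern that matches a timestamp column plus a number that may be an
# integer or a float -- so rows of both shapes classify into the same table.
ROW = re.compile(r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(?P<value>\d+(?:\.\d+)?)$")

rows = [
    "2017-08-01 12:00:00,23",      # integer reading
    "2017-08-01 12:05:00,23.5",    # floating-point reading
]
matches = [ROW.match(r) for r in rows]
print(all(m is not None for m in matches))
```

Both rows match the one pattern, which is exactly the behavior we want the custom classifier to give us.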


Figure 5: Adding a classifier

Working with Partitions

We had a hierarchical structure in our data: there were folders for the week number of the data, and folders for the sensor location. Glue is able to use this information for the datasets. While our data files contain only two components (timestamp and value), the crawler created a schema containing four columns.

Figure 6: Glue working with partitions

Glue automatically added the information contained in the structure of our data store as fields in our dataset. The names of the columns are also taken from the folder names in our data store. Instead of just having a folder named “Frankfurt”, we had one called “loc=Frankfurt”, which is interpreted as a column name and value. If you do not follow that naming scheme, the additional fields will still be introduced, but with generic names, which can be renamed in the transformation part of our Glue-based ETL.
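The `column=value` folder convention is easy to demonstrate outside of Glue. This small parser (illustrative only, not Glue code) shows how partition columns fall out of the S3 key:

```python
def parse_partitions(s3_key):
    """Extract Hive-style 'column=value' partition folders from an S3 key."""
    parts = {}
    for segment in s3_key.split("/")[:-1]:   # skip the file name itself
        if "=" in segment:
            column, value = segment.split("=", 1)
            parts[column] = value
    return parts

# A key following the naming scheme described above:
print(parse_partitions("sensordata/week=32/loc=Frankfurt/readings.csv"))
```

Each matching folder becomes a column with the value read from the path, which is why the crawler's schema gained two extra columns.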

Defining a database and table manually

The destination for our data is also S3, this time in JSON files. We can’t use a crawler to automatically define this because our destination files do not exist yet. Go to the “data catalog” section, select “databases,” and select “add database.” You need to provide a name – details will follow in the tables. To define them, select your database, click on “Tables in <your-database-name>” and start the table wizard by clicking “add tables.” Following the wizard, you can provide the data location, its format, and manually define the data schema. In our example, we will only have a simple transformation and Glue will be able to create the destination table.

Figure 7: Glue, table wizard.
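For reference, the same table definition could be expressed through the Glue API. Here is a hypothetical sketch of a CreateTable payload for JSON output (the column names, SerDe, and path are assumptions; with boto3, this dict would be passed to the Glue client's create_table call):

```python
def build_table_payload(database, table, s3_location):
    # Sketch of a Glue CreateTable request for a JSON-backed destination table.
    # The columns and SerDe shown here are illustrative assumptions.
    return {
        "DatabaseName": database,
        "TableInput": {
            "Name": table,
            "StorageDescriptor": {
                "Location": s3_location,
                "Columns": [
                    {"Name": "timestamp", "Type": "string"},
                    {"Name": "pm25", "Type": "int"},
                ],
                "SerdeInfo": {
                    "SerializationLibrary": "org.openx.data.jsonserde.JsonSerDe"
                },
            },
        },
    }

payload = build_table_payload("airquality", "sensordata_out",
                              "s3://glue-article-cleanercities/output/")
```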

Moving and transforming data

In AWS Glue, a job handles moving and transforming data. A Glue job is based on a script written in a PySpark dialect, which allows us to execute built-in and custom-defined transformations while moving the data. We will strip the decimal digits from our PM2.5 values and add a city code to our records.
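The two transformations themselves are simple. Stripped of the PySpark machinery, their logic can be sketched in plain Python (the city-code mapping here is a made-up example):

```python
def transform_record(record, city_codes):
    """Strip decimal digits from the PM2.5 value and attach a city code."""
    return {
        "timestamp": record["timestamp"],
        "pm25": int(float(record["pm25"])),            # e.g. "23.5" becomes 23
        "code": city_codes.get(record["loc"], "unknown"),
    }

city_codes = {"Frankfurt": "FRA"}                      # hypothetical mapping
out = transform_record(
    {"timestamp": "2017-08-01 12:00:00", "pm25": "23.5", "loc": "Frankfurt"},
    city_codes,
)
print(out)
```

In the generated job script, the same logic is expressed through ApplyMapping and related Glue transforms rather than per-record Python.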

Select “jobs” in the ETL section and start the wizard for adding a new Job. We will keep our example simple and let Glue generate a script.

Figure 8: Authoring a job in Glue

Select data source(s) and data target. You can filter the list of tables to find the one that our crawler created. For the target, you can search for an existing one or create one now. In the case of S3 buckets, make sure that they really exist – otherwise, your script might fail.

We disabled the “job bookmark.” Glue tracks which data has already been processed by maintaining a bookmark. If you are testing or debugging a script, it’s a good idea to have this disabled; otherwise, nothing will happen after your data has been processed once.

Figure 9: Choosing targets

Now that Glue knows the source and target, it will offer us a mapping between them. You can add columns and change data types.

Figure 10: Mapping source to target in Glue

In the background, Glue will generate the script as needed. You will be able to change it from here. Insert pre-defined transformation templates and adapt them to your data structures.

Figure 11: Editing your ETL job in Glue

Running this simple ETL script takes our data into the destination bucket in the format and with the column names we defined. It’s possible to have more than one destination: you can decide to write your data to both JSON files and CSV files within one ETL job.

Figure 12: Glue transformations in action

In a second transformation, we join the sensor measurement data with the sensor metadata to add each city’s location code to our records. For this, we need to access a second data source and join both sources during the ETL job. You can find the complete job script at s3://glue-article-cleanercities/resources/.

For this, please create and run a new crawler for the file s3://glue-article-cleanercities/sensordata-metadata/sensor_location.csv. Edit the resulting table schema to provide some meaningful column names like “city” and “code” instead of “col0” and “col1.”

Now, edit the job script to include this new data source. Select the “source” button and enter the respective values in the inserted code snippet. Then join both tables on the fields “loc” and “city.” Place the cursor after the “ApplyMapping” snippet, push “Transform” and select “Join.”

## @type: Join
## @args: [keys1 = ["loc"], keys2 = ["city"]]
## @return: join1
## @inputs: [frame1 = applymapping1, frame2 = datasource1]
join1 = Join.apply(frame1 = applymapping1, frame2 = datasource1, keys1 = ["loc"], keys2 = ["city"])
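Outside of the Glue runtime, the effect of this Join.apply call can be mimicked with an ordinary inner join. A plain-Python sketch with made-up sample records:

```python
# Measurements carry a location name; the metadata maps city names to codes.
measurements = [
    {"timestamp": "2017-08-01 12:00:00", "pm25": 23, "loc": "Frankfurt"},
]
metadata = [
    {"city": "Frankfurt", "code": "FRA"},
]

# Equivalent of joining on keys1=["loc"], keys2=["city"]: an inner join
# keyed by city name, merging matching records from both sources.
by_city = {m["city"]: m for m in metadata}
joined = [
    {**rec, **by_city[rec["loc"]]}
    for rec in measurements
    if rec["loc"] in by_city
]
print(joined[0]["code"])
```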

And that’s it. Starting the job again will result in records that contain values of both data sources.

Figure 13: Results from both sources

Analyzing Data in S3 Buckets

There are many options to process and analyze data on AWS, such as Amazon Redshift or Amazon Athena. For a quick visual analysis of our data, let’s use Amazon QuickSight. QuickSight is able to consume data from S3 buckets and allows you to create diagrams in a few clicks. You can share your analyses with your team, and they can be applied to new data quickly and easily.

Importing and visualizing sensor data with a serverless architecture can be a first step to improve air quality for citizens. Taking the next steps, such as correlating air quality with factors like weather or time of day, training artificial intelligence systems and influencing traffic flows based on machine learning predictions, can be achieved by leveraging AWS services without having to manage a single server.

Whether it is air pollution monitoring and analysis, threat detection, emergency response, continuous regulatory compliance, or any other public sector big data and analytics use case, AWS services can help you build a complex analytics workload quickly, easily, and cost effectively.

A post by Ralph Winzinger, Solutions Architect at AWS, and Pratim Das, Specialist Solutions Architect– Analytics at AWS.

Amazon Web Services Achieves DoD Impact Level 5 Provisional Authorization

We are pleased to announce that Amazon Web Services has achieved a Provisional Authorization (PA) by the Defense Information Systems Agency (DISA) for Impact Level (IL) 5 workloads, as defined in the Department of Defense (DoD) Cloud Computing (CC) Security Requirements Guide (SRG), in the AWS GovCloud (US) Region.

Now DoD customers and contractors working with the DoD can leverage AWS’s PA to meet DoD CC SRG IL5, along with IL2 and IL4, compliance requirements. This further bolsters AWS’s position as an industry leader in supporting the DoD’s critical mission of protecting our security. AWS services support a variety of DoD workloads, including workloads containing sensitive controlled unclassified information (CUI) and National Security Systems (NSS) information. AWS services are already being used for a number of cutting-edge, mission-critical DoD workloads, such as the Global Positioning System Next Generation Operational Control System (GPS OCX), a critical navigation information system that supports global cyber protection and analysis of satellite data.

AWS GovCloud (US) for IL5

To enable extremely high security levels for our customers, AWS employs a robust set of security technologies and practices, including encryption and access control features that exceed DoD security requirements.

AWS GovCloud (US) can scale to handle large-scale compute and storage intensive workloads and meet DoD CC SRG IL5 requirements. The AWS GovCloud (US) Region is composed of multiple data centers that can handle the scale of high capacity, mission-critical workloads, including High Performance Compute, Big Data, or ERP workloads. Importantly, customers leveraging AWS’s DoD CC SRG IL 5 PA will enjoy the same cost savings as other customers in the AWS GovCloud (US) Region, making it an affordable and secure compute and storage environment for DoD workloads.

Modernizing Defense in the Cloud 

At the 2016 AWS re:Invent, Air Force Lt. Gen. Samuel Greaves shared his perspective on how the small, highly empowered Defense Digital Service (DDS) team is breaking down innovation barriers with AWS GovCloud (US). Watch the full video here.

The AWS Cloud brought about a cultural shift in the Air Force around delivering ground software capabilities. The DDS team encouraged the Air Force to use commercial cloud for testing the service’s GPS OCX, which would control the newest version of the DoD’s global positioning system satellites.

The program’s engineers need regular and reliable test environments to more rapidly test the software. The solution: build test environments in AWS GovCloud (US).

“We deployed our first ever national security system, or Impact Level 5, to AWS GovCloud (US). We are working on automatic builds and deployment. But the real impact is that when we are done, we are going to take something that took three weeks down to 15 minutes,” said Chris Lynch, Director of the DDS.

No other cloud provider was able to meet the IL5 security requirements and scale of the US Air Force’s GPS OCX program. That program required 200+ Dedicated Hosts running upwards of 1,000 individual Virtual Machines. Each Virtual Machine needed at least eight vCPUs and 32GB RAM. When the Air Force looked at other cloud providers, none of them were able to immediately handle the compute scale while also meeting the DoD CC SRG IL5 requirements. Not only did AWS meet the Air Force’s needs, but the Air Force also experienced a 30% cost savings for storage costs.

Architect for Compliance in AWS GovCloud (US)

Cloud computing can help defense organizations increase innovation, efficiency, agility, and resiliency—all while reducing costs.

Learn how to architect for compliance in the AWS Cloud and see how your organization can leverage the agility, cost savings, scalability, and flexibility of the cloud, while meeting the most stringent regulatory and compliance requirements, including Federal Risk and Authorization Management Program (FedRAMP), ITAR, CJIS, HIPAA, and DoD SRG Levels 2, 4, and 5. Join us for the webinar and hear best practices and practical use cases for using AWS GovCloud (US) to comply with a variety of regulatory regimes.

This webinar helps DoD customers, partners, and systems integrators adopt cloud architecture best practices, so they can confidently and securely use AWS for storage, compute, database, Big Data, and other AWS services that offer flexibility and cost savings in the cloud. Register now.

Five Key Trends for Education in Canada

Since the launch of the AWS Canada (Central) Region in December 2016, Canadian educational institutions have been using the AWS Cloud to help facilitate teaching and learning, launch student analytics initiatives, and manage IT operations.

From data center migration to workforce development, primary schools and universities have been focused on five key trends:

  1. Data residency is top of mind: Customers can run their applications and workloads in the Canada (Central) Region with two availability zones. End users based in Canada can leverage the Canada Region to avoid up-front expenses, long-term commitments, and scaling challenges associated with maintaining and operating their own infrastructure. Echo360, an APN Partner, recently announced that they moved to the Canada Region to address data privacy and security concerns for Canadian colleges and universities.
  2. Website migration is a great place to start: The University of Alberta, a leading research university and the Province’s fourth largest employer, recently moved hundreds of websites and resources that comprise its digital environment to the AWS Cloud. “Power and scalability was extremely important to us,” said Jennifer Chesney, Associate Vice-President, University Digital Strategy. “We needed to move our digital brand to an infrastructure management partner that could provide the university the highest quality optimization of our complex environment. We are excited about the Canadian Region, as it opens up further cloud possibilities for Canadian organizations.” Learn how to get started on the cloud.
  3. Train students for cloud computing careers: To fuel the pipeline of technologists entering the workforce, the British Columbia Institute of Technology (BCIT), one of Canada’s largest post-secondary polytechnics, is constantly looking for ways to bring innovative technology to their students. Whether it is equipping students with cloud computing resources through AWS Educate or partnering with technology companies to understand the skills necessary to succeed post-graduation, BCIT is an example of the importance of industry and education working together to meet the increasing demand for cloud employees. “We brought in our first cloud computing class four years ago, and recently we made the decision to only teach AWS. Instead of breadth, we decided to give students depth on AWS so they can easily transition into the workforce,” said Dr. Bill Klug, Cloud Computing Option Head & Instructor, British Columbia Institute of Technology (BCIT). Bring the cloud to your classroom with AWS Educate.
  4. Simplifying Access to Learning Resources: Amazon WorkSpaces is a fully managed, secure Desktop-as-a-Service (DaaS) solution that helps higher education institutions and primary school districts give students and instructors consistent access to teaching and learning software on any device. The University of Maryland University College uses Amazon WorkSpaces in a virtual environment giving students a desktop ready for their learning needs. Watch the webinar.
  5. Partners can help institutions migrate: On-demand compute, storage, and database services help higher education, primary, and research IT teams build secure environments for mission-critical applications, freeing them to focus on student success. As more educational institutions move to the cloud, it is important for our customers to be able to identify specialized APN Partners to assist them in this segment. Congratulations to Advanced APN Partner D2L, which has achieved the AWS Education Competency designation.

Learn more about AWS in Canada here.

Secure Network Connections: An Evaluation of the US Trusted Internet Connections Program

Access the first guide in the new AWS Government Handbook Series: Secure Network Connections: An evaluation of the US Trusted Internet Connections program.

As a global first mover, the U.S. Government has invested considerable time in developing approaches to network perimeter security. However, while these approaches have served the traditional IT space, additional innovation and iteration are necessary to better align with newer, non-traditional technologies such as the cloud.

This document discusses the following:

  • A summary of lessons learned from AWS’s work with various government agencies, including the Department of Homeland Security (DHS).
  • The various federal-wide secure network connections programs, particularly the “Trusted Internet Connections” (TIC) initiative.
  • The AWS policy position and recommendations for how governments can consider establishing or enhancing their cloud-based network perimeter monitoring capabilities.

Download the AWS Government Handbook.

We are encouraged by the government’s evolution toward innovative, cloud-adaptive solutions for achieving network perimeter monitoring objectives in the cloud. We are committed to ongoing collaboration with governments worldwide that are evaluating the merits, best practices, and lessons learned from the TIC program.

Look for the next handbook in the series coming later this year.

Automatically Discover, Classify, and Protect Your Data

In our post, Building a Cloud-Specific Incident Response Plan, we walked through a hypothetical incident response (IR) managed on AWS with the Johns Hopkins University Applied Physics Laboratory (APL). With the recent launch of Amazon Macie, a new data classification and security service, you have additional controls to understand the type of data stored in your Amazon Simple Storage Service (Amazon S3) buckets. Amazon Macie can also help you meet your compliance objectives, with the ability to set up automated mechanisms to track and report security incidents.

Amazon Macie is a security service that uses machine learning to automatically discover, classify, and protect sensitive data in AWS. Amazon Macie recognizes sensitive data such as personally identifiable information (PII) or intellectual property, and provides you with dashboards and alerts that give visibility into how this data is being accessed or stored. The fully managed service continuously monitors data access activity for anomalies, and generates detailed alerts when it detects risk of unauthorized access or inadvertent data leaks.

Benefits of Amazon Macie for public sector organizations include:

  • Superior Visibility of Your Data – Amazon Macie makes it easy for security administrators to have management visibility into data storage environments, beginning with Amazon S3, with additional AWS data stores coming soon.
  • Simple to Set Up, Easy to Manage – Getting started with Amazon Macie is fast and easy. Log into the AWS console, select the Amazon Macie service, and provide the AWS accounts you would like to protect.
  • Data Security Automation Through Machine Learning – Amazon Macie uses machine learning to automate the process of discovering, classifying, and protecting data stored in AWS. This helps you better understand where sensitive information is stored and how it’s being accessed, including user authentications and access patterns.
  • Custom Alert Monitoring with CloudWatch – Amazon Macie can send all findings to Amazon CloudWatch Events. This allows you to build custom remediation and alert management workflows for your existing security ticketing systems.
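As an illustration, a CloudWatch Events rule that routes Macie findings to your own tooling matches on the service's event source. A minimal sketch of such an event pattern (your downstream targets and any additional filtering are up to you):

```python
import json

# Minimal CloudWatch Events pattern matching events emitted by Amazon Macie.
# With boto3 this would be passed as EventPattern=json.dumps(pattern) to
# events_client.put_rule(Name="macie-alerts", EventPattern=...), with a
# target such as an SNS topic or Lambda function attached to the rule.
pattern = {"source": ["aws.macie"]}
print(json.dumps(pattern))
```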

Customers including Edmunds, Netflix, and Autodesk are using Amazon Macie to provide insights that will help them tackle security challenges. Learn more about how to get started with Amazon Macie. If you are a first-time user of Amazon Macie, we recommend that you begin by reading the Macie documentation.