AWS Open Source Blog

re:Invent Open Source Recap

When we published our post on November 22nd about open source sessions and events coming up at re:Invent, there was a lot we couldn’t tell you – because it hadn’t yet been announced! And there were some exciting open source-related announcements at re:Invent, including FreeRTOS, EKS, Fargate, and SageMaker. Right after the announcements, new sessions were added to the re:Invent catalog, including overviews and deep dives on these new releases.

Below is a recap of open source-related sessions and workshops, with links to videos and slides. This list is still not exhaustive: there were many more sessions on open source software used in Machine Learning. Also see Your Guide to Machine Learning at re:Invent 2017 for more on Apache MXNet, Gluon, TensorFlow, and related tools and topics. (That post is being updated with links to videos as they become available; this one will be, too.)

Talks About Open Source

ARC213 – Open Source at AWS—Contributions, Support, and Engagement

Adrian Cockcroft and Zaheda Bhorat on what AWS is doing in open source, our projects, and how we can collaborate.

CON205 – Birds of a Feather: Containers and Open Source at AWS (video)

Thought leaders from CNCF, Docker, the Kubernetes community, and AWS wear funny hats and discuss the cloud’s direction for growth and enablement of the open source community, how AWS is integrating open source code into our container services, and our contributions to open source projects.

FreeRTOS

IOT212 – NEW LAUNCH! Amazon FreeRTOS: IoT Operating System for Microcontrollers (video)

A deeper look at the newly-announced Amazon FreeRTOS, an operating system for microcontrollers that makes small, low-power edge devices easy to program, deploy, secure, connect, and manage. Amazon FreeRTOS is based on the FreeRTOS kernel, a popular open source operating system for microcontrollers, and extends it with software libraries that make it easy to securely connect your small, low-power devices to AWS cloud services like AWS IoT Core or to more powerful edge devices and gateways running AWS Greengrass.

IOT403 – NEW LAUNCH! AWS Greengrass and Amazon FreeRTOS: Connectivity and Security at the Edge (video)

How customers can use Amazon FreeRTOS on microcontrollers with Greengrass at the edge. This session walks through connecting your devices running Amazon FreeRTOS, connecting devices to Greengrass, and how these two services can work together to solve customer use cases. It also covers security best practices for key management and TLS implementation across Amazon FreeRTOS and Greengrass.

EKS

CON215 – NEW LAUNCH! Introducing Amazon EKS (video)

Amazon Elastic Container Service for Kubernetes (Amazon EKS) is a new managed service for running Kubernetes on AWS. This session provides an overview of Amazon EKS, why we built it, and how it works.

CON409 – Deep Dive into Amazon EKS (video)

Get a sneak peek into how Amazon EKS works, from provisioning nodes and launching pods to integrations with AWS services such as Elastic Load Balancing and Auto Scaling.

Fargate

CON214 – NEW LAUNCH! Introducing AWS Fargate (video)

AWS Fargate is a technology for Amazon ECS and EKS that allows you to run containers without having to manage servers or clusters. Join us to learn more about how Fargate works, why we built it, and how you can get started using it to run containers today.

CON333 – Deep Dive into AWS Fargate (video)

AWS Fargate makes running containerized workloads on AWS easier than ever before. This session will provide a technical background for using Fargate with your existing containerized services, including best practices for building images, configuring task definitions, task networking, secrets management, and monitoring.

SageMaker

MCL365 – NEW LAUNCH! Introducing Amazon SageMaker (video)

Amazon SageMaker is a fully-managed service that enables data scientists and developers to quickly and easily build, train, and deploy machine learning models at scale. This session will introduce you to the features of Amazon SageMaker, including a one-click training environment, highly-optimized machine learning algorithms with built-in model tuning, and deployment without engineering effort.

MCL341 – NEW LAUNCH! Infinitely Scalable Machine Learning Algorithms with Amazon AI (video)

In machine learning, training large models on massive amounts of data usually improves results. Our customers report, however, that training such models and deploying them is either operationally prohibitive or outright impossible for them. Amazon AI Algorithms is designed to solve this problem. It is a collection of distributed streaming ML algorithms that scale to any amount of data. They are fast and efficient because they distribute across CPU/GPU machines and share a collective distributed state via a highly-optimized parameter server. They scale to an infinite amount of data because they operate in the streaming model: they require only one pass over the data and never increase their resource consumption, so training can be paused, resumed, and snapshotted, and algorithms can even consume Amazon Kinesis streams directly, providing an “always on” training mechanism. They are production ready: trained models are automatically containerized and usable in production using Amazon SageMaker hosting. Finally, we provide a convenient SDK which allows scientists to create new algorithms which operate in this model and enjoy all the benefits above.

This talk discusses our design choices and some of the inner workings of the system. It will also describe the distributed streaming model and its numerous benefits to machine learning practitioners. We will show how to invoke large-scale learning from Amazon SageMaker or Amazon EMR, and host the solution.
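As a rough illustration of the streaming model described in this abstract (one pass over the data, fixed-size state, and training that can be snapshotted and resumed), here is a minimal Python sketch. It is not the Amazon AI Algorithms implementation or SDK: `StreamingLinearModel`, `example_stream`, and `run_demo` are hypothetical names invented for this example, and a one-feature linear model trained by online SGD on synthetic data stands in for a real algorithm.

```python
from dataclasses import dataclass, asdict
from typing import Iterable, Iterator, Tuple


@dataclass
class StreamingLinearModel:
    """Hypothetical one-feature linear model trained by online SGD.

    The whole training state is four numbers, so memory use does not
    grow with the number of examples seen.
    """
    weight: float = 0.0
    bias: float = 0.0
    seen: int = 0
    learning_rate: float = 0.05

    def update(self, x: float, y: float) -> None:
        # One SGD step on squared error for a single example.
        error = (self.weight * x + self.bias) - y
        self.weight -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error
        self.seen += 1

    def consume(self, stream: Iterable[Tuple[float, float]]) -> None:
        # Single pass: each example is used once and then discarded.
        for x, y in stream:
            self.update(x, y)

    def snapshot(self) -> dict:
        # The full state needed to pause training.
        return asdict(self)

    @classmethod
    def restore(cls, state: dict) -> "StreamingLinearModel":
        # Rebuild a model from a snapshot to resume training.
        return cls(**state)


def example_stream(start: int, stop: int) -> Iterator[Tuple[float, float]]:
    # Synthetic data drawn from y = 2x + 1 (an assumption for the demo).
    for i in range(start, stop):
        x = (i % 10) / 10.0
        yield x, 2.0 * x + 1.0


def run_demo() -> Tuple[StreamingLinearModel, StreamingLinearModel]:
    # Uninterrupted training over the whole stream.
    full = StreamingLinearModel()
    full.consume(example_stream(0, 1000))

    # Train on the first half, snapshot, then resume on the second half.
    first_half = StreamingLinearModel()
    first_half.consume(example_stream(0, 500))
    resumed = StreamingLinearModel.restore(first_half.snapshot())
    resumed.consume(example_stream(500, 1000))
    return full, resumed
```

Calling `run_demo()` returns two models whose snapshots are identical, because the resumed run applies the same updates in the same order as the uninterrupted one. That property is what makes pause-and-resume safe in a single-pass setting.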

MCL408 – NEW LAUNCH! Training and Hosting Machine Learning Models on Amazon SageMaker

Amazon SageMaker provides a hosted Jupyter Notebook environment, serverless, distributed model training, and auto-scaling hosting to generate predictions in real time. Join us for a chalk talk that will demonstrate how to train and host machine learning models using the spectrum of methods SageMaker supports. We will both leverage Amazon’s custom-built algorithms and show how you can bring your own algorithms with Docker containers. We will illustrate how to interact with SageMaker through the hosted Jupyter Notebook as well as integrate with existing Spark EMR pipelines.

Kubernetes

CON308 – Mastering Kubernetes on AWS (video)

Insights and experiences running Kubernetes on AWS.

CON213 – Hands-on Deployment of Kubernetes on AWS

Use Kubernetes and Kops (Kubernetes Operations) to create, deploy, manage and scale a Kubernetes cluster on AWS. Learn how to deploy your microservices-based applications and use service discovery for them.

Community Sessions

DVC202 – Community Knowledge Sharing for AWS (video)

The Open Guide to AWS is an open source writing project which has become one of the most popular AWS resources on the web. It’s both a written resource on GitHub, with over 100 contributors, and a large Slack group. Each has become a forum for trading practical knowledge not covered in standard documentation. We talk about the Guide and how it started, share lessons on seeding initial content, the editorial process, and how to foster a healthy extended community and encourage social engagement.

DVC201 – Build AWS Skills Through Community-Led User Groups (video)

Did you know that there are over 300 AWS User Groups worldwide? Join this panel discussion featuring AWS community leaders from around the world, and learn the value of attending community-led AWS Meetups in your region. Community leaders share their experiences, talk through how local communities help developers solve problems and achieve their goals, and discuss the benefits of participating in peer-to-peer AWS knowledge sharing and networking activities.

The above two sessions were part of the re:Invent Developer Community Day, six community-led sessions where AWS enthusiasts share technical insights on trending topics based on first-hand experiences and knowledge shared within local AWS communities.

Other Open Source Topics

ABD202 – Best Practices for Building Serverless Big Data Applications (video)

Explore the concepts behind and benefits of serverless architectures for big data, and when and how you can use serverless technologies to streamline data processing, minimize infrastructure management, and improve agility and robustness. The session also shares a reference architecture using a combination of cloud and open source technologies.

ABD403 – Best Practices for Distributed Machine Learning and Predictive Analytics Using Amazon EMR and Open-Source Tools (video)

Common use cases and design patterns for predictive analytics using Amazon EMR (Elastic MapReduce). We address accessing data from a data lake, extraction and preprocessing with Apache Spark, analytics and machine learning code development with notebooks (Jupyter, Zeppelin), data visualization using Amazon QuickSight, and other operational topics.

DAT401 – The Boss: A Petascale Database for Large-Scale Neuroscience (video)

The Boss uses AWS to provide a cloud-native spatial database with an innovative storage hierarchy and auto-scaling capability. We provide an overview of the Boss, and how the Johns Hopkins University Applied Physics Laboratory (APL) used Amazon DynamoDB, AWS Lambda, and AWS Step Functions for several high-throughput components of the system. We discuss both the challenges and successes with serverless technologies.

DEV315 – GitHub to AWS Lambda: Developing, Testing, and Deploying Serverless Apps (video)

A hands-on demo of how to use GitHub as the core of a DevOps toolchain. Learn how to leverage AWS integrations with Jenkins, the AWS CLI, and open source software to build, test, and deploy a service to AWS Lambda. (Session sponsored by GitHub, Inc.)

DEV332 – Using AWS to Achieve Both Autonomy and Governance at 3M (video)

Nathan Scott, Senior Consultant at AWS, and James Martin, Automation Engineering Manager at 3M, on how they have achieved both autonomy and governance through self-service automation tools on AWS. Includes a demo from Casey Lee, Chief Architect at Stelligent, on the tools used to accomplish this for 3M, including AWS Service Catalog, AWS CloudFormation, AWS CodePipeline, and Cloud Custodian, an open source tool for managing AWS accounts.

ENT318 – Leveraging a Cloud Policy Framework – From Zero to Well Governed (video)

An open source “cloud policy framework” enables users to leverage a community that can help define and tune best practice policies, and help SaaS vendors and ISVs capture the best way to manage an application and share it with customers. (Session sponsored by CloudHealth Technologies)

LFS304 – Born in the AWS Cloud: How Eagle Genomics Uses AWS to Process Billions of DNA Sequence Reads (video)

With the increasing use of genomic sequencing for scientific discovery, the rate-limiting step for researchers is not in obtaining genetic code, but in having the capacity for storage and computing power to analyze it. Learn how Eagle Genomics built a cloud solution that uses an open-source workflow engine (eHive), Docker containers to process jobs, and a REST service to manage pipeline runs, all to help customers process genetic sequences up to 20 times faster without additional costs.

SRV302 – Building CI/CD Pipelines for Serverless Applications (video)

A method for automating the deployment of serverless applications running on AWS Lambda, including how you can model and express serverless applications using the open-source AWS Serverless Application Model (AWS SAM).

SRV424 – Massively Parallel Data Processing with PyWren and AWS Lambda

How to achieve fast processing speeds using an open-source project called PyWren to massively parallelize data analytics jobs across hundreds or thousands of AWS Lambda functions.
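The fan-out pattern this session describes (split a job into independent pieces, run one function over each piece in parallel, then combine the results) can be sketched locally. The example below is a stand-in under stated assumptions: it uses a thread pool from Python's standard `concurrent.futures` module in place of AWS Lambda workers, it does not use PyWren's API, and the helper names `chunked`, `sum_of_squares`, and `parallel_sum_of_squares` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def chunked(items: List[int], size: int) -> List[List[int]]:
    # Split the input into independent pieces, one per worker invocation.
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk: List[int]) -> int:
    # The per-chunk work; in the session's setting this would run remotely.
    return sum(n * n for n in chunk)


def parallel_sum_of_squares(items: List[int],
                            chunk_size: int = 100,
                            workers: int = 8) -> int:
    # Map the function over all chunks in parallel, then reduce locally.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunked(items, chunk_size)))
    return sum(partials)
```

The point of the pattern is that each chunk is processed without reference to any other chunk, so the number of workers can grow with the size of the job.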

DEV337 – Deploy a Data Lake with AWS CloudFormation

Learn to build AWS CloudFormation templates using proven methods and best practices. Deploy a fully functional data lake architecture using AWS services like Amazon RDS and open source components like Apache Zeppelin.

GPSWKS301 – GPS: Comprehensive Big Data Analytics Architecture Made Easy

A modern Big Data architecture involves extending your on-premises data management to AWS, implementing a data pipeline to stream real-time data into the Amazon Redshift cloud data warehouse, performing data transformation, discovery, and predictive analytics through machine learning, visualizing complex information, and being notified so you can respond to business events. This session is for APN Consulting Partners and organizations looking for ways to accelerate and modernize their Big Data projects. You will learn how to deploy and integrate AWS services with third-party solutions in AWS Marketplace. Reduce your time to market by combining AWS services, open source software, and ready-to-run solutions on AWS.

ARC318 – Building .NET-based Serverless Architectures and Running .NET Core Microservices in Docker Containers on AWS (video)

Common approaches to refactoring legacy .NET applications to microservices and AWS serverless architectures, modern approaches to .NET-based architectures on AWS, and running .NET Core microservices in Docker containers natively on Linux in AWS while examining the use of the AWS SDK and the .NET Core platform.

Deirdré Straughan

Deirdré has been communicating about technology, and helping others to do so, for 30 years. She has written one book (so far); edited two more (so far); produced and delivered technical training; produced hundreds of videos and live streams of technical talks; written, edited, and managed blogs; and managed events. She has been applying this skill set to cloud computing since 2010, and to open source for even longer. She joined AWS in 2017. You can find her at @deirdres on Twitter.