AWS Security Profile: Byron Cook, Director of the AWS Automated Reasoning Group

Author

Byron Cook leads the AWS Automated Reasoning Group, which automates proof search in mathematical logic and builds tools that provide AWS customers with provable security. Byron has pushed boundaries in this field, delivered real-world applications in the cloud, and fostered a sense of community amongst its practitioners. In recognition of Byron’s contributions to cloud security and automated reasoning, the UK’s Royal Academy of Engineering elected him as one of 7 new Fellows in computing this year.

I recently sat down with Byron to discuss his new Fellowship, the work that it celebrates, and how he and his team continue to use automated reasoning in new ways to provide higher security assurance for customers in the AWS cloud.

Congratulations, Byron! Can you tell us a little bit about the Royal Academy of Engineering, and the significance of being a Fellow?

Thank you. I feel very honored! The Royal Academy of Engineering is focused on engineering in the broad sense; for example, aeronautical, biomedical, materials, etc. I’m one of only 7 Fellows elected this year that specialize in computing or logic, making the announcement really unique.

As for what the Royal Academy of Engineering is: the UK has Royal Academies for key disciplines such as music, drama, etc. The Royal Academies focus financial support and recognition on these fields, and gives a location and common meeting place. The Royal Academy of Music, for example, is near Regent’s Park in West London. The Royal Academy of Engineering’s building is in Carlton Place, one of the most exclusive locations in central London near Pall Mall and St. James’ Park. I’ve been to a number of lectures and events in that space. For example, it’s where I spoke ten years ago when I was the recipient of the Roger Needham prize. Some examples of previously elected Fellows include Sir Frank Whittle, who invented the jet engine; radar pioneer Sir George MacFarlane, and Sir Tim Berners-Lee, who developed the world-wide web.

Can you tell us a little bit about why you were selected for the award?

The letter I received from the Royal Academy says it better than I could say myself:

“Byron Cook is a world-renowned leader in the field of formal verification. For over 20 years Byron has worked to bring this field from academic hypothesis to mechanised industrial reality. Byron has made major research contributions, built influential tools, led teams that operationalised formal verification activities, and helped establish connections between others that have dramatically accelerated growth of the area. Byron’s tools have been applied to a wide array of topics, e.g. biological systems, computer operating systems, programming languages, and security. Byron’s Automated Reasoning Group at Amazon is leading the field to even greater success”.

Formal verification is the one term here that may be foreign to you, so perhaps I should explain. Formal verification is the use of mathematical logic to prove properties of systems. Euclid, for example, used formal verification in ~300 BC to prove that the Pythagorean theorem holds for all possible right-angled triangles. Today we are using formal verification to prove things about all possible configurations of a computer program might reach. When I founded Amazon’s Automated Reasoning Group, I named it that because my ambition was to automate all of the reasoning performed during formal verification.

Can you give us a bit of detail about some of the “research contributions and tools” mentioned in the text from Royal Academy of Engineering?

Probably my best-known work before joining Amazon was on the Terminator tool. Terminator was designed to reason at compile-time about what a given computer program would eventually do when running in production. For example, “Will the program eventually halt?” This is the famous “Halting problem,” proved undecidable in the 1930s. The Terminator tool piloted a new approach to the problem which is popular now, based on the idea of incrementally improving the best guess for a proof based on failed proof attempts. This was the first known approach capable of scaling termination proving to industrial problems. My colleagues and I used Terminator to find bugs in device drivers that could cause operating systems to become unresponsive. We found many bugs in device drivers that ran keyboards, mice, network devices, and video cards. The Terminator tool was also the basis of BioModelAnaylzer. It turns out that there’s a connection between diseases like Leukemia and the Halting problem: Leukemia is a termination bug in the genetic-regulatory pathways in your blood. You can think of it in the same way you think of a device driver that’s stuck in an infinite loop, causing your computer to freeze. My tools helped answer fundamental questions that no tool could solve before. Several pharmaceutical companies use BioModelAnaylzer today to understand disease and find new treatment options. And these days, there is an annual international competition with many termination provers that are much better than the Terminator. I think that this is what Royal Academy is talking about when they say I moved the area from “academic hypothesis to mechanized industrial reality.”

I have also worked on problems related to the question of P=NP, the most famous open problem in computing theory. From 2000-2006, I built tools that made NP feel equal to P in certain limited circumstances to try and understand the problem better. Then I focused on circumstances that aligned with important industrial problems, like proving the absence of bugs in microprocessors, flight control software, telecommunications systems, and railway control systems. These days the tools in this space are incredibly powerful. You should check out the software tools CVC4 or Z3.

And, of course, there’s my work with the Automated Reasoning Group, where I’ve built a team of domain experts that develop and apply formal verification tools to a wide variety of problems, helping make the cloud more secure. We have built tools that automatically reason about the semantics of policies, networks, cryptography, virtualization, etc. We reason about the implementation of Amazon Web Services (AWS) itself, and we’ve built tools that help customers prove the correctness of their AWS-based implementations.

Could you go into a bit more detail about how this work connects to Amazon and its customers?

AWS provides cloud services globally. Cloud is shorthand for on-demand access to IT resources such as compute, storage, and analytics via the Internet with pay-as-you-go pricing. AWS has a wide variety of customers, ranging from individuals to the largest enterprises, and practically all industries. My group develops mathematical proof tools that help make AWS more secure, and helps AWS customers understand how to build in the cloud more securely.

I first became an AWS customer myself when building BioModelAnaylzer. AWS allowed us working on this project to solve major scientific challenges (see this Nature Scientific Report for an example) using very large datacenters, but without having to buy the machines, maintain the machines, maintain the rooms that the machines would sit in, the A/C system that would keep them cool, etc. I was also able to easily provide our customers with access to the tool via the cloud, because it’s all over the internet. I just pointed people to the end-point on the internet and, presto, they were using the tool. About 5 years before developing BioModelAnalyzer, I was developing proof tools for device drivers and I gave a demo of the tool to my executive leadership. At the end of the demo, I was asked if 5,000 machines would help us do more proofs. Computationally, the answer was an obvious “yes,” but then I thought a minute about the amount of overhead required to manage a fleet of 5,000 machines and reluctantly replied “No, but thank you very much for the offer!” With AWS, it’s not even a question. Anyone with an Amazon account can provision 5,000 machines for practically nothing. In less than 5 minutes, you can be up and running and computing with thousands of machines.

What I love about working at AWS is that I can focus a very small team on proving the correctness of some aspect of AWS (for example, the cryptography) and, because of the size and importance of the customer base, we make much of the world meaningfully more secure. Just to name a few examples: s2n (the Amazon TLS implementation); the AWS Key Management Service (KMS), which allows customers to securely store crypto keys; and networking extensions to the IoT operating system Amazon FreeRTOS, which customers use to link cloud to IoT devices, such as robots in factories. We also focus on delivering service features that help customers prove the correctness of their AWS-based implementations. One example is Tiros, which powers a network reachability feature in Amazon Inspector. Another example is Zelkova, which powers features in services such as Amazon S3, AWS Config, and AWS IoT Device Defender.

When I think of mathematical logic I think of obscure theory and messy blackboards, not practical application. But it sounds like you’ve managed to balance the tension between theory and practical industrial problems?

I think that this is a common theme that great scientists don’t often talk about. Alan Turing, for example, did his best work during the war. John Snow, who made fundamental contributions to our understanding of germs and epidemics, did his greatest work while trying to figure out why people were dying in the streets of London. Christopher Stratchey, one of the founders of our field, wrote:

“It has long been my personal view that the separation of practical and theoretical work is artificial and injurious. Much of the practical work done in computing, both in software and in hardware design, is unsound and clumsy because the people who do it have not any clear understanding of the fundamental design principles in their work. Most of the abstract mathematical and theoretical work is sterile because it has no point of contact with real computing.”

Throughout my career, I’ve been at the intersection of practical and theoretical. In the early days, this was driven by necessity: I had two children during my PhD and, frankly, I needed the money. But I soon realized that my deep connection to real engineering problems was an advantage and not a disadvantage, and I’ve tried through the rest of my career to stay in that hot spot of commercially applicable problems while tackling abstract mathematical topics.

What’s next for you? For the Automated Reasoning Group? For your scientific field?

The Royal Academy of Engineering kindly said that I’ve brought “this field from academic hypothesis to mechanized industrial reality.” That’s perhaps true, but we are very far from done: it’s not yet an industrial standard. The full power of automated reasoning is not yet available to everyone because today’s tools are either difficult to use or weak. The engineering challenge is to make them both powerful and easy to use. With that I believe that they’ll become a key part of every software engineer’s daily routine. What excites me is that I believe that Amazon has a lot to teach me about how to operationalize the impossible. That’s what Amazon has done over and over again. That’s why I’m at Amazon today. I want to see these proof techniques operating automatically at Amazon scale.

Links:
• Provable security webpage
• Lecture: Fundamentals for Provable Security at AWS
• Lecture: The evolution of Provable Security at AWS
• Lecture: Automating compliance verification using provable security
• Lecture: Byron speaks about Terminator at University of Colorado
• https://biomodelanalyzer.org/

If you have feedback about this post, let us know in the Comments section below.

Want more AWS Security news? Follow us on Twitter.