AWS Open Source Blog
How using hyper in curl can help make the internet safer
In February, Josh Aas from Internet Security Research Group, Daniel Stenberg from curl, and I (from hyper and Amazon Web Services) hosted a joint webinar to discuss memory safety and the internet, and how using hyper in curl can help make the internet safer. Because curl is open source and permissively licensed, it is found in everything from IoT devices to satellites, and in most Linux distributions. Curl is one of those foundational libraries that we take for granted, but that influences our lives online.
In this post, we provide highlights from the webinar, which is also available on YouTube.
The not-so-safe internet
Memory safety is a property of a programming language or system that prevents a program from accessing memory incorrectly, such as treating a buffer as bigger than it is, or interpreting data through a pointer whose memory has since been freed and reused for a different value. Lack of memory safety is a serious, persistent threat to internet infrastructure, and has caused significant damage to individuals and organizations alike. For example:
- Microsoft estimates that 70% of all vulnerabilities in their products over the past decade have been caused by a lack of memory safety.
- Google found that 70% of Chrome’s serious security bugs are memory safety problems.
- The Android team reports that 90% of their vulnerabilities are memory safety issues.
- Mozilla notes that 74% of Firefox’s security bugs in its style component are memory safety bugs.
- A recent study indicates 60-70% of iOS and macOS vulnerabilities are related to memory safety.
- An analysis by Project Zero discovered that more than 80% of exploited 0-days in the wild were due to a lack of memory safety.
These vulnerabilities can result in real-life privacy violations, financial losses, denial of public services, and human rights impacts.
A significant contributor to this proliferation of vulnerabilities is that many tools are written in programming languages that do little, if anything, to protect against memory safety bugs. And although “memory safe” languages have existed for a long time, they are frequently overlooked because of performance or interoperability requirements.
This is usually the point in the conversation when the Rust programming language gets mentioned. Rust is a newer language that enforces memory safety while offering performance and interoperability comparable to C.
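To make the idea of enforced memory safety concrete, here is a minimal sketch of how Rust handles two of the problems described earlier: reading past the end of a buffer, and touching a value after it has been released. The function names (`read_byte`, `consume`, `demo`) are made up for illustration and do not come from curl or hyper.

```rust
/// Reads the byte at `index`, returning `None` instead of reading past the end.
fn read_byte(buf: &[u8], index: usize) -> Option<u8> {
    // `get` performs a bounds check; an out-of-range index yields `None`
    // rather than whatever happens to sit next to the buffer in memory.
    buf.get(index).copied()
}

/// Takes ownership of the buffer; its memory is freed when this function returns.
fn consume(buf: Vec<u8>) -> usize {
    buf.len()
}

/// Illustrative walk-through that combines both functions.
fn demo() -> (Option<u8>, Option<u8>, usize) {
    let buf = vec![10, 20, 30];
    let in_bounds = read_byte(&buf, 1);
    let out_of_bounds = read_byte(&buf, 3);
    let len = consume(buf);
    // Using `buf` after this point would be rejected at compile time
    // ("borrow of moved value"), which rules out use-after-free and
    // double free for this value.
    (in_bounds, out_of_bounds, len)
}
```

Calling `demo()` returns `(Some(20), None, 3)`: the out-of-range read is reported as `None`, and the compiler refuses any later use of the moved buffer.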
This blog post is not, however, a rally cry to “rewrite it (all) in Rust.” Rewriting all the things is not realistic, nor does every piece of code have the same impact on humanity. Instead, let’s briefly walk through a practical effort to bring memory safety to critical parts of the internet. This effort encourages projects to replace libraries or modular functionality with memory safe libraries, rather than embark upon ground-up rewrites. It breaks up the work into manageable pieces and delivers value incrementally.
curl is everywhere
curl is an ideal candidate to start this work. The potential impact is enormous, because curl is everywhere. According to the curl site:
curl is used in command lines or scripts to transfer data. curl is also used in cars, television sets, routers, printers, audio equipment, mobile phones, tablets, settop boxes, media players and is the internet transfer engine for thousands of software applications in over ten billion installations. curl is used daily by virtually every internet-using human on the globe.
curl is also written in C.
In a recent blog post, Daniel Stenberg notes that about half of curl’s vulnerabilities are C mistakes, in other words, memory safety bugs. The mistakes include buffer overreads, buffer overflows, use-after-free, and double frees. And although curl handles a whole bunch of protocols, HTTP was the second largest area of memory safety vulnerabilities.
Can we make curl safer?
Doing so would make the internet safer, so it’s worth the effort. And that’s what we set out to do. Curl’s API and ABI are the stable “armored door” that doesn’t break. But curl already supports selecting different “backends” for its internal implementation details: it can be configured with different backends for TLS, DNS, compression, and other components. We’d just need to provide a safer HTTP backend option.
hyper has entered the chat
Hyper is a safe, correct, and fast HTTP library for the Rust language. Hyper is open source, and used by AWS, Buoyant, Discord, Google, Microsoft, Mozilla, and many more. It has both client and server APIs, and provides support for HTTP/1 and HTTP/2.
The relevant parts here are its memory safety and its ability to interoperate relatively easily with C. A Rust library doesn’t bring a dependency on a new runtime, and there’s no overhead when calling Rust functions from C or vice versa. It just requires engineering work to expose a C-compatible API.
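As a rough illustration of what exposing a C-compatible API from Rust involves, the sketch below exports one function using the C calling convention and an unmangled symbol name. The function `example_header_name_len` is hypothetical and is not part of hyper’s actual C API; it only shows the general shape of such a boundary.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Hypothetical exported function (not hyper's API).
/// Returns the length in bytes of a NUL-terminated header name,
/// or -1 if the pointer is null or the bytes are not valid UTF-8.
///
/// # Safety
/// `name` must be null or point to a valid NUL-terminated string.
// `unsafe(no_mangle)` needs Rust 1.82 or newer; older toolchains spell it `#[no_mangle]`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn example_header_name_len(name: *const c_char) -> c_int {
    // C callers can pass NULL, so check before touching the pointer.
    if name.is_null() {
        return -1;
    }
    // SAFETY: the caller promises a valid NUL-terminated string.
    let c_str = unsafe { CStr::from_ptr(name) };
    match c_str.to_str() {
        Ok(s) => s.len() as c_int,
        Err(_) => -1,
    }
}
```

On the C side, a matching declaration would be `int example_header_name_len(const char *name);`. Much of the design effort sits in details like the null check: C permits inputs that safe Rust code would never produce, so the boundary has to validate them explicitly.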
The hyper developers immediately knew this was something that needed to be done. Considering how much curl is used, this was an opportunity to make the internet safer. And hyper’s Rust users would benefit as well, because any bugs fixed or edge cases handled as a result of this work would be fixed for them, too.
We designed a C API for hyper. Most of the work involved recognizing differences in assumptions between the Rust and C languages. After those parts were ironed out, the API worked. Hyper also added a couple of options that allow curl to make the differences between backends nearly indistinguishable for its users.
Code in both curl and hyper has been merged to their respective main development branches. Curl can be configured to compile with hyper as its HTTP backend, although it will be in an experimental status until all of curl’s HTTP support works the same as with its internal C backend and it has been tried out more extensively. Most of the code already works, and issues that remain are more complex HTTP features that still need to be updated to support different backends. The best way to track the work is via curl’s extensive test suite.
Quantitatively, 95% of 800 or so HTTP unit tests are ported and passing with hyper configured as the backend. That means a lot of the standard features in curl are already working. With the hyper backend, curl has support for HTTP/1 and HTTP/2. It works with HTTPS using any of the TLS backends, and even with HTTP(S) proxies. The HTTP requests over the wire are identical with either backend.
A few more features need to be piped together in curl to get the whole test suite passing. In hyper, there’s a desire to improve handling of panics and out-of-memory aborts, and other such details around exposing a C library from Rust. Also, developers need to adapt or fix anything that curl uncovers as it finishes enabling its unit tests.
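One common way to keep a Rust panic from crossing into C code is to catch it at the boundary and turn it into an error code. The sketch below shows that general technique using the standard library’s `catch_unwind`; it is not a description of how hyper handles panics, and every name in it (`ExampleResult`, `EXAMPLE_OK`, `EXAMPLE_PANICKED`, `example_average`) is hypothetical.

```rust
use std::panic;

/// Hypothetical status codes returned to the C caller.
pub const EXAMPLE_OK: i32 = 0;
pub const EXAMPLE_PANICKED: i32 = -1;

/// Hypothetical C-compatible result: a status code plus a value.
#[repr(C)]
pub struct ExampleResult {
    pub code: i32,
    pub value: i32,
}

/// Internal Rust logic that can panic (integer division by zero panics).
fn average(total: i32, count: i32) -> i32 {
    total / count
}

/// Hypothetical exported wrapper: converts a panic into an error code
/// instead of letting it unwind into the C caller.
// `unsafe(no_mangle)` needs Rust 1.82 or newer; older toolchains spell it `#[no_mangle]`.
#[unsafe(no_mangle)]
pub extern "C" fn example_average(total: i32, count: i32) -> ExampleResult {
    // `catch_unwind` only helps when panics unwind; it does nothing if the
    // crate is built with `panic = "abort"`, and it does not intercept
    // aborts caused by failed memory allocations.
    match panic::catch_unwind(|| average(total, count)) {
        Ok(value) => ExampleResult { code: EXAMPLE_OK, value },
        Err(_) => ExampleResult { code: EXAMPLE_PANICKED, value: 0 },
    }
}
```

With this wrapper, `example_average(10, 2)` yields code `EXAMPLE_OK` and value `5`, while `example_average(10, 0)` yields `EXAMPLE_PANICKED` rather than unwinding through C stack frames. Out-of-memory aborts are a separate problem that this pattern does not address.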
This is far-reaching work, and it’s hit an exciting milestone. To learn more, watch the webinar, which provides more detail and includes a question and answer section. Because both projects are under open source licenses, anyone is welcome to join us working on curl and hyper.