Nytro
Posted April 24, 2019

Modern C++ Won't Save Us
2019-04-21 by alex_gaynor

I'm a frequent critic of memory unsafe languages, principally C and C++, and how they induce an exceptional number of security vulnerabilities. My conclusion, based on reviewing evidence from numerous large software projects using C and C++, is that we need to be migrating our industry to memory safe by default languages (such as Rust and Swift). One of the responses I frequently receive is that the problem isn't C and C++ themselves, developers are simply holding them wrong.

In particular, I often receive defenses of C++ of the form, "C++ is safe if you don't use any of the functionality inherited from C" [1] or similarly that if you use modern C++ types and idioms you will be immune from the memory corruption vulnerabilities that plague other projects.

I would like to credit C++'s smart pointer types, because they do significantly help. Unfortunately, my experience working on large C++ projects which use modern idioms is that these are not nearly sufficient to stop the flood of vulnerabilities. My goal for the remainder of this post is to highlight a number of completely modern C++ idioms which produce vulnerabilities.

Hide the reference use-after-free

The first example I'd like to describe, originally from Kostya Serebryany, is how C++'s std::string_view can make it easy to hide use-after-free vulnerabilities:

    #include <iostream>
    #include <string>
    #include <string_view>

    int main() {
      std::string s = "Hellooooooooooooooo ";
      std::string_view sv = s + "World\n";
      std::cout << sv;
    }

What's happening here is that s + "World\n" allocates a new std::string, and then is converted to a std::string_view. At this point the temporary std::string is freed, but sv still points at the memory that used to be owned by it. Any future use of sv is a use-after-free vulnerability. Oops!
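For contrast with the std::string_view example above, here is a minimal sketch (C++17) of one way to avoid the dangling view: keep the concatenated string in a named object that outlives the view. The helper names make_greeting and safe_view_length are hypothetical and are not from the original post; this shows one possible pattern, not a general fix for the underlying language problem.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: builds the greeting and returns an owning std::string.
std::string make_greeting() {
    std::string s = "Hellooooooooooooooo ";
    return s + "World\n";
}

// Hypothetical helper: takes a view only while the owning string is alive.
std::size_t safe_view_length() {
    std::string owner = make_greeting();  // named object keeps the buffer alive
    std::string_view sv = owner;          // borrows from `owner`, not from a temporary
    assert(sv.data() == owner.data());    // the view points into `owner`'s buffer
    return sv.size();                     // valid: `owner` is still in scope here
}
```

The view is only as safe as the programmer's discipline here: nothing in the language stops someone from returning sv out of safe_view_length, which would bring the original bug straight back.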
C++ lacks the facilities for the compiler to be aware that sv captures a reference to something where the reference lives longer than the referent. The same issue impacts std::span, also an extremely modern C++ type.

Another fun variant involves using C++'s lambda support to hide a reference:

    #include <memory>
    #include <iostream>
    #include <functional>

    std::function<int(void)> f(std::shared_ptr<int> x) {
      return [&]() { return *x; };
    }

    int main() {
      std::function<int(void)> y(nullptr);
      {
        std::shared_ptr<int> x(std::make_shared<int>(4));
        y = f(x);
      }
      std::cout << y() << std::endl;
    }

Here the [&] in f causes the lambda to capture values by reference. Then in main x goes out of scope, destroying the last reference to the data, and causing it to be freed. At this point y contains a dangling pointer. This occurs despite our meticulous use of smart pointers throughout. And yes, people really do write code that handles std::shared_ptr<T>&, often as an attempt to avoid additional increments and decrements on the reference count.

std::optional<T> dereference

std::optional represents a value that may or may not be present, often replacing magic sentinel values (such as -1 or nullptr). It offers methods such as value(), which extracts the T it contains and throws an exception if the optional is empty. However, it also defines operator* and operator->. These methods also provide access to the underlying T, but they do not check whether the optional actually contains a value. The following code, for example, simply returns an uninitialized value:

    #include <optional>

    int f() {
      std::optional<int> x(std::nullopt);
      return *x;
    }

If you use std::optional as a replacement for nullptr this can produce even more serious issues! Dereferencing a nullptr gives a segfault (which is not a security issue, except in older kernels). Dereferencing a nullopt, however, gives you an uninitialized value as a pointer, which can be a serious security issue.
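To make the difference between the checked and unchecked accessors concrete, here is a small sketch (C++17) that uses only the standard std::optional interface: value(), value_or(), has_value(), and std::bad_optional_access. The helper names read_or_default, value_throws, and checked_deref are hypothetical, chosen for illustration.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: checked read that falls back to a default when empty.
int read_or_default(const std::optional<int>& x, int fallback) {
    return x.value_or(fallback);
}

// Hypothetical helper: reports whether value() throws for this optional.
bool value_throws(const std::optional<int>& x) {
    try {
        (void)x.value();  // throws std::bad_optional_access when x is empty
        return false;
    } catch (const std::bad_optional_access&) {
        return true;
    }
}

// Hypothetical helper: explicit check before using the unchecked operator*.
// Note that assert is compiled out under NDEBUG, so this guards debug builds only.
int checked_deref(const std::optional<int>& x) {
    assert(x.has_value());
    return *x;
}
```

None of this changes the default: operator* and operator-> remain the shortest spelling, and nothing forces callers to go through the checked paths.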
While having a T* with an uninitialized value is also possible, these are much less common than dereferencing a pointer that was correctly initialized to nullptr.

And no, this doesn't require you to be using raw pointers. You can get uninitialized/wild pointers with smart pointers as well:

    #include <optional>
    #include <memory>

    std::unique_ptr<int> f() {
      std::optional<std::unique_ptr<int>> x(std::nullopt);
      return std::move(*x);
    }

std::span<T> indexing

std::span<T> provides an ergonomic way to pass around a reference to a contiguous slice of memory and a length. This lets you easily write code that works over multiple different types; a std::span<uint8_t> can point to memory owned by a std::vector<uint8_t>, a std::array<uint8_t, N>, or even a raw pointer.

Failure to correctly check bounds is a frequent source of security vulnerabilities, and in many senses span helps out with this by ensuring you always have a length handy.

Like all STL data structures, span's operator[] method does not perform any bounds checks. This is regrettable, since operator[] is the most ergonomic and default way people use data structures. std::vector and std::array can at least theoretically be used safely because they offer an at() method which is bounds checked (in practice I've never seen this done, but you could imagine a project adopting a static analysis tool which simply banned calls to std::vector<T>::operator[]). span does not offer an at() method, or any other method which performs a bounds checked lookup.

Interestingly, both Firefox and Chromium's backports of std::span do perform bounds checks in operator[], and thus they'll never be able to safely migrate to std::span.

Conclusion

Modern C++ idioms introduce many changes which have the potential to improve security: smart pointers better express expected lifetimes, std::span ensures you always have a correct length handy, std::variant provides a safer abstraction for unions.
However, modern C++ also introduces some incredible new sources of vulnerabilities: lambda capture use-after-free, uninitialized-value optionals, and un-bounds-checked span.

My professional experience writing relatively modern C++, and auditing Rust code (including Rust code that makes significant use of unsafe) is that the safety of modern C++ is simply no match for memory safe by default languages like Rust and Swift (or Python and JavaScript, though I find it rare in life to have a program that makes sense to write in either Python or C++).

There are significant challenges to migrating existing, large, C and C++ codebases to a different language -- no one can deny this. Nonetheless, the question simply must be how we can accomplish it, rather than if we should try. Even with the most modern C++ idioms available, the evidence is clear that, at scale, it's simply not possible to hold C++ right.

[1] I understood this to be referring to raw pointers, arrays-as-pointers, manual malloc/free, and other similar features. However I think it's worth acknowledging that given that C++ explicitly incorporated C into its specification, in practice most C++ code incorporates some of these elements.

Hi, I'm Alex. I'm currently at a startup called Alloy. Before that I was an engineer working on Firefox security and before that at the U.S. Digital Service. I'm an avid open source contributor and live in Washington, DC.

Source: https://alexgaynor.net/2019/apr/21/modern-c++-wont-save-us/