
Modern C++ Won't Save Us


2019-04-21 by alex_gaynor

I'm a frequent critic of memory-unsafe languages, principally C and C++, and how they induce an exceptional number of security vulnerabilities. My conclusion, based on reviewing evidence from numerous large software projects using C and C++, is that we need to be migrating our industry to memory-safe-by-default languages (such as Rust and Swift). One of the responses I frequently receive is that the problem isn't C and C++ themselves; developers are simply holding them wrong. In particular, I often receive defenses of C++ of the form, "C++ is safe if you don't use any of the functionality inherited from C" [1] or, similarly, that if you use modern C++ types and idioms you will be immune from the memory corruption vulnerabilities that plague other projects.

I would like to credit C++'s smart pointer types, because they do significantly help. Unfortunately, my experience working on large C++ projects which use modern idioms is that these are not nearly sufficient to stop the flood of vulnerabilities. My goal for the remainder of this post is to highlight a number of completely modern C++ idioms which produce vulnerabilities.

Hide the reference use-after-free

The first example I'd like to describe, originally from Kostya Serebryany, is how C++'s std::string_view can make it easy to hide use-after-free vulnerabilities:

#include <iostream>
#include <string>
#include <string_view>

int main() {
  std::string s = "Hellooooooooooooooo ";
  std::string_view sv = s + "World\n";
  std::cout << sv;
}

What's happening here is that s + "World\n" allocates a new std::string, and then is converted to a std::string_view. At this point the temporary std::string is freed, but sv still points at the memory that used to be owned by it. Any future use of sv is a use-after-free vulnerability. Oops! C++ lacks the facilities for the compiler to be aware that sv captures a reference to something where the reference lives longer than the referent. The same issue impacts std::span, also an extremely modern C++ type.
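
For this particular case, one way to avoid the dangling view is to give the concatenated string a named owner that outlives the view. The following is a minimal sketch, assuming C++17; the function name greeting and the variable joined are illustrative and not part of the original example:

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: builds the greeting through a string_view whose
// backing std::string is guaranteed to be alive while the view is used.
std::string greeting() {
    std::string s = "Hellooooooooooooooo ";
    // Name the result of the concatenation so it is not a temporary.
    std::string joined = s + "World\n";
    // sv now points into joined, which lives until the end of this function.
    std::string_view sv = joined;
    // Copy the viewed characters out before joined is destroyed.
    return std::string(sv);
}
```

This only fixes the one call site. Nothing in the language prevents the original one-line version from compiling, which is the point of the example.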

Another fun variant involves using C++'s lambda support to hide a reference:

#include <memory>
#include <iostream>
#include <functional>


std::function<int(void)> f(std::shared_ptr<int> x) {
    return [&]() { return *x; };
}

int main() {
    std::function<int(void)> y(nullptr);
    {
        std::shared_ptr<int> x(std::make_shared<int>(4));
        y = f(x);
    }
    std::cout << y() << std::endl;
}

Here the [&] in f causes the lambda to capture x by reference. Note that the captured reference is to f's own by-value parameter, which is destroyed as soon as f returns, so the closure dangles from that point on. Then in main x goes out of scope, destroying the last reference to the data and causing it to be freed, so calling y() reads through a dangling reference. This occurs despite our meticulous use of smart pointers throughout. And yes, people really do write code that handles std::shared_ptr<T>&, often as an attempt to avoid additional increments and decrements on the reference count.
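
For comparison, capturing the std::shared_ptr by value makes the closure a co-owner of the int. The sketch below assumes C++11 or later; make_reader and read_after_scope_exit are hypothetical names introduced for illustration:

```cpp
#include <functional>
#include <memory>

// Hypothetical variant of f: [x] copies the shared_ptr into the closure,
// so the closure shares ownership instead of referring to the parameter.
std::function<int()> make_reader(std::shared_ptr<int> x) {
    return [x]() { return *x; };
}

// Mirrors main from the example above: the original shared_ptr goes out of
// scope before the call, but the closure's copy keeps the int alive.
int read_after_scope_exit() {
    std::function<int()> y;
    {
        auto x = std::make_shared<int>(4);
        y = make_reader(x);
    }
    return y();
}
```

The cost is one extra reference count increment per capture, which is exactly the overhead that tempts people into capturing by reference.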

std::optional<T> dereference

std::optional represents a value that may or may not be present, often replacing magic sentinel values (such as -1 or nullptr). It offers methods such as value(), which extracts the T it contains and throws an exception if the optional is empty. However, it also defines operator* and operator->. These operators also provide access to the underlying T, but they do not check whether the optional actually contains a value.

The following code, for example, simply returns an uninitialized value (formally, it is undefined behavior):

#include <optional>

int f() {
    std::optional<int> x(std::nullopt);
    return *x;
}

If you use std::optional as a replacement for nullptr this can produce even more serious issues! Dereferencing a nullptr gives a segfault (which is not a security issue, except in older kernels). Dereferencing a nullopt however, gives you an uninitialized value as a pointer, which can be a serious security issue. While having a T* with an uninitialized value is also possible, these are much less common than dereferencing a pointer that was correctly initialized to nullptr.

And no, this doesn't require you to be using raw pointers. You can get uninitialized/wild pointers with smart pointers as well:

#include <optional>
#include <memory>

std::unique_ptr<int> f() {
    std::optional<std::unique_ptr<int>> x(std::nullopt);
    return std::move(*x);
}
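
The checked alternatives do exist, they are just not the most convenient spelling. A minimal sketch, assuming C++17 with exceptions enabled; the helper names value_or_default and value_throws are hypothetical:

```cpp
#include <optional>

// Hypothetical helper: returns the contained value, or fallback when empty.
int value_or_default(const std::optional<int>& x, int fallback) {
    return x.value_or(fallback);
}

// Hypothetical helper: reports whether value() threw on this optional.
bool value_throws(const std::optional<int>& x) {
    try {
        (void)x.value();  // checked access: throws if x is empty
        return false;
    } catch (const std::bad_optional_access&) {
        return true;
    }
}
```

Unlike operator*, both value() and value_or() have defined behavior on an empty optional: the first throws std::bad_optional_access and the second returns the fallback.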

std::span<T> indexing

std::span<T> provides an ergonomic way to pass around a reference to a contiguous slice of memory and a length. This lets you easily write code that works over multiple different types; a std::span<uint8_t> can point to memory owned by a std::vector<uint8_t>, a std::array<uint8_t, N>, or even a raw pointer. Failure to correctly check bounds is a frequent source of security vulnerabilities, and in many senses span helps out with this by ensuring you always have a length handy.

Like all STL data structures, span's operator[] method does not perform any bounds checks. This is regrettable, since operator[] is the most ergonomic and default way people use data structures. std::vector and std::array can at least theoretically be used safely because they offer an at() method which is bounds checked (in practice I've never seen this done, but you could imagine a project adopting a static analysis tool which simply banned calls to std::vector<T>::operator[]). span does not offer an at() method, or any other method which performs a bounds checked lookup.

Interestingly, both Firefox and Chromium's backports of std::span do perform bounds checks in operator[], and thus they'll never be able to safely migrate to std::span.

Conclusion

Modern C++ idioms introduce many changes which have the potential to improve security: smart pointers better express expected lifetimes, std::span ensures you always have a correct length handy, and std::variant provides a safer abstraction for unions. However, modern C++ also introduces some incredible new sources of vulnerabilities: lambda capture use-after-free, uninitialized-value optionals, and un-bounds-checked span.

My professional experience writing relatively modern C++, and auditing Rust code (including Rust code that makes significant use of unsafe), is that the safety of modern C++ is simply no match for memory-safe-by-default languages like Rust and Swift (or Python and JavaScript, though I find it rare in life to have a program that makes sense to write in either Python or C++).

There are significant challenges to migrating existing, large, C and C++ codebases to a different language -- no one can deny this. Nonetheless, the question simply must be how we can accomplish it, rather than if we should try. Even with the most modern C++ idioms available, the evidence is clear that, at scale, it's simply not possible to hold C++ right.

[1] I understood this to be referring to raw pointers, arrays-as-pointers, manual malloc/free, and other similar features. However, I think it's worth acknowledging that, given that C++ explicitly incorporated C into its specification, in practice most C++ code incorporates some of these elements.

Hi, I'm Alex. I'm currently at a startup called Alloy. Before that I was an engineer working on Firefox security and before that at the U.S. Digital Service. I'm an avid open source contributor and live in Washington, DC.

Source: https://alexgaynor.net/2019/apr/21/modern-c++-wont-save-us/
