Nytro
Posted February 11, 2016

Serialization Security Bugs Explained

If you’re in information security you’ve probably heard a lot about serialization bugs. They are becoming increasingly common, and I wanted to give a basic overview of how they work and why they’re an issue.

The parsing problem

So much of security comes down to parsing. It’s the primary reason we need input validation, and the reason that software like antivirus and network protocol analyzers can have so many security issues.

The job of a parser is to take input from somewhere else and run it through your own software. That should frighten you. It’s like a CDC employee using the ‘open and lick’ method to test petri dish samples.

Bottom line: if you’re going to parse something, you have to get intimate with it. And that brings us to serialization.

Serialization

Serialization is the process of capturing a data structure or an object’s state into a (serial) format that can be efficiently stored or transmitted for later consumption.

So you can take an object, capture its state, and then keep it in memory, write it to disk, or send it over the network. At some later point the serialized data can be retrieved and consumed, restoring the object’s state.

Example

A basic example of serialization might be to take the following PHP array:

    $array = array("a" => 1, "b" => 2, "c" => array("a" => 1, "b" => 2));

And to serialize it into this:

    a:3:{s:1:"a";i:1;s:1:"b";i:2;s:1:"c";a:2:{s:1:"a";i:1;s:1:"b";i:2;}}

Each element carries a type tag and, where relevant, a length or count: a:3 is an array of three entries, s:1:"a" is a one-character string, and i:1 is the integer 1. At its core, serialization is a type of encoding.

The crux

So this brings us to the core issue: deserialization requires parsing.

In order to go from that serialized format to usable data, some software package needs to unpack that content, figure it out, and then consume it. This is precisely what parsers are so bad at, and doing it poorly can lead to all manner of flaws, up to and including arbitrary code execution.
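To make the code execution point concrete, here is a minimal sketch in Python (not the PHP of the example above), using the standard pickle module. It assumes a hypothetical class named Payload and a hypothetical helper deserialize_untrusted; neither comes from the article. It shows one way this class of bug can arise: the serialized stream itself names a callable that the deserializer invokes while unpacking. A harmless arithmetic expression stands in for whatever an attacker would actually run.

```python
import pickle


class Payload:
    """Hypothetical attacker-built object; the name is illustrative only."""

    def __reduce__(self):
        # pickle records "call this callable with these arguments" in the stream.
        # eval on a harmless expression stands in for a real attack command.
        return (eval, ("6 * 7",))


def deserialize_untrusted(blob: bytes):
    # Hypothetical helper: unsafe on purpose. pickle.loads invokes the
    # callables named inside the stream while rebuilding the object.
    return pickle.loads(blob)


# The "attacker" serializes the payload; the "victim" only ever sees bytes.
blob = pickle.dumps(Payload())
result = deserialize_untrusted(blob)

# The victim never gets a Payload back: it gets whatever the call returned.
print(result)  # 42
```

The victim code never calls eval itself; the call happens inside the deserializer, which is why untrusted input should not be handed to a format like this. A data-only format that can only rebuild plain values such as strings, numbers, lists, and maps narrows what the unpacking step can be made to do.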
Summary

Parsing untrusted input is hard.
Serialization takes data and encodes it into opaque formats for transfer and storage.
To make use of that content, parsers must unpack and consume it.
It’s extremely hard to do this correctly, and if you do it wrong it could mean code execution.
This applies to almost any language that uses serialization, but some languages (like Java) are in worse shape than others.

Notes

The Wikipedia article on serialization has a good set of language implementations.
Here’s a writeup on a Java serialization bug.

Link: https://danielmiessler.com/study/serialization-security-bugs-explained/