With only the binary executable of a program, it is useful to discover the program's data structures and infer their syntactic and semantic definitions. Such knowledge is highly valuable in a variety of security and forensic applications. Although there exist efforts in program data structure inference, the existing solutions are not suitable for our targeted application scenarios. In this paper, we propose a reverse engineering technique to automatically reveal program data structures from binaries. Our technique, called REWARDS, is based on dynamic analysis. More specifically, each memory location accessed by the program is tagged with a timestamped type attribute. Following the program's runtime data flow, this attribute is propagated to other memory locations and registers that share the same type. During the propagation, a variable's type gets resolved if it is involved in a type-revealing execution point or type sink. More importantly, besides the forward type propagation, REWARDS involves a backward type resolution procedure where the types of some previously accessed variables get recursively resolved starting from a type sink. This procedure is constrained by the timestamps of relevant memory locations to disambiguate variables re-using the same memory location. In addition, REWARDS is able to reconstruct in-memory data structure layout based on the type information derived. We demonstrate that REWARDS provides unique benefits to two applications: memory image forensics and binary fuzzing for vulnerability discovery.
The full paper can be downloaded here [PDF]