We develop scalable techniques for generating large numbers of valid binaries, which provide a good evaluation dataset for binary analysis tasks such as type inference, function boundary detection, and CFG recovery.

Datasets

Our techniques have produced the following binary datasets:

Cornucopia dataset

Cornucopia is our first approach, in which we use feedback-guided techniques to generate a large number of semantically equivalent binaries for a given program. You can access the dataset here.

Current Projects

Please refer to our projects page for the list of projects and their details.