Improving Handling of Custom Inputs #2422
Changes from 20 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
[package] | ||
name = "baby_fuzzer_custom_input" | ||
version = "0.1.0" | ||
authors = ["Valentin Huber <[email protected]>"] | ||
edition = "2021" | ||
|
||
[profile.dev] | ||
panic = "abort" | ||
|
||
[profile.release] | ||
panic = "abort" | ||
lto = true | ||
codegen-units = 1 | ||
opt-level = 3 | ||
debug = true | ||
|
||
[dependencies] | ||
libafl = { path = "../../../libafl/" } | ||
libafl_bolts = { path = "../../../libafl_bolts/" } | ||
serde = "*" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# Baby fuzzer | ||
|
||
This is a minimalistic fuzzer demonstrating how to employ mapping mutators to use default mutators on custom inputs. Custom inputs are necessary when the input to your program is a combination of parts, especially when those parts have different data types. Check multipart inputs if you have an input consisting of multiple parts of the same datatype and you don't need your mutation scheduler to be able to select which mutation is performed on which part. | ||
|
||
The fuzzer runs on a single core until a crash occurs and then exits. The tested program is a simple Rust function without any instrumentation. For real fuzzing, you will want to add some sort of coverage or other feedback. | ||
|
||
You can run this example using `cargo run`. |
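The mapping idea described in the README can be illustrated without LibAFL. The following is a minimal, std-only sketch and not LibAFL's actual API: `Outcome`, `ByteMutator`, `FlipFirstByte`, `Demo`, `Mapped`, and `payload_mut` are hypothetical names introduced only for illustration. It shows a mutator that only understands `Vec<u8>` being reused on a custom input by pairing it with a function that selects the part to mutate.

```rust
/// Outcome of a mutation attempt (hypothetical stand-in for a result type).
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Mutated,
    Skipped,
}

/// A mutator that only knows how to mutate plain byte vectors.
pub trait ByteMutator {
    fn mutate_bytes(&mut self, bytes: &mut Vec<u8>) -> Outcome;
}

/// Inverts all bits of the first byte; skips empty vectors.
pub struct FlipFirstByte;

impl ByteMutator for FlipFirstByte {
    fn mutate_bytes(&mut self, bytes: &mut Vec<u8>) -> Outcome {
        match bytes.first_mut() {
            Some(b) => {
                *b = !*b;
                Outcome::Mutated
            }
            None => Outcome::Skipped,
        }
    }
}

/// A custom input made of parts with different types (assumed for this sketch).
pub struct Demo {
    pub payload: Vec<u8>,
    pub flag: bool,
}

/// Wraps a byte mutator together with a function selecting the part to mutate.
pub struct Mapped<M, I> {
    inner: M,
    mapper: fn(&mut I) -> &mut Vec<u8>,
}

impl<M: ByteMutator, I> Mapped<M, I> {
    pub fn new(inner: M, mapper: fn(&mut I) -> &mut Vec<u8>) -> Self {
        Self { inner, mapper }
    }

    /// Mutates only the selected part; the rest of the input is untouched.
    pub fn mutate(&mut self, input: &mut I) -> Outcome {
        self.inner.mutate_bytes((self.mapper)(input))
    }
}

/// Selects the byte-vector part of `Demo`.
pub fn payload_mut(input: &mut Demo) -> &mut Vec<u8> {
    &mut input.payload
}
```

With this shape, the byte mutator never needs to know about `Demo`; only the small selector function is specific to the custom input.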
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,147 @@ | ||
use std::{ | ||
borrow::Cow, | ||
hash::{DefaultHasher, Hash, Hasher}, | ||
}; | ||
|
||
use libafl::{ | ||
corpus::CorpusId, | ||
generators::Generator, | ||
inputs::Input, | ||
prelude::{MutationResult, Mutator}, | ||
state::HasRand, | ||
Error, SerdeAny, | ||
}; | ||
use libafl_bolts::{rands::Rand, Named}; | ||
use serde::{Deserialize, Serialize}; | ||
|
||
/// The custom [`Input`] type used in this example, consisting of a byte array part, a byte array that is not always present, and a boolean | ||
/// | ||
/// Imagine these could be used to model command line arguments for a bash command, where | ||
/// - `byte_array` is binary data that is always needed like what is passed to stdin, | ||
/// - `optional_byte_array` is binary data passed as a command line arg, and it is only passed if it is not `None` in the input, | ||
/// - `boolean` models the presence or absence of a command line flag that does not require additional data | ||
#[derive(Serialize, Deserialize, Clone, Debug, Hash, SerdeAny)] | ||
pub struct CustomInput { | ||
pub byte_array: Vec<u8>, | ||
pub optional_byte_array: Option<Vec<u8>>, | ||
pub boolean: bool, | ||
} | ||
|
||
/// Hash-based implementation | ||
impl Input for CustomInput { | ||
fn generate_name(&self, _id: Option<CorpusId>) -> String { | ||
let mut hasher = DefaultHasher::new(); | ||
self.hash(&mut hasher); | ||
format!("{:016x}", hasher.finish()) | ||
} | ||
} | ||
|
||
impl CustomInput { | ||
/// Returns a mutable reference to the byte array | ||
pub fn byte_array_mut(&mut self) -> &mut Vec<u8> { | ||
Review comment: This could just return a
Reply: See comments above. Yes, it could. But it's more logic I can do centrally instead of having to do it at every custom input.
Reply: Why is it more logic?
Reply: Because of the additional |
||
&mut self.byte_array | ||
} | ||
|
||
/// Returns an immutable reference to the byte array wrapped in [`Some`] | ||
pub fn byte_array_optional(&self) -> Option<&Vec<u8>> { | ||
Some(&self.byte_array) | ||
} | ||
|
||
/// Returns a mutable reference to the optional byte array | ||
pub fn optional_byte_array_mut(&mut self) -> &mut Option<Vec<u8>> { | ||
&mut self.optional_byte_array | ||
} | ||
|
||
/// Returns an immutable reference to the optional byte array | ||
pub fn optional_byte_array_optional(&self) -> Option<&Vec<u8>> { | ||
self.optional_byte_array.as_ref() | ||
} | ||
} | ||
|
||
/// A generator for [`CustomInput`] used in this example | ||
pub struct CustomInputGenerator { | ||
pub max_len: usize, | ||
} | ||
|
||
impl CustomInputGenerator { | ||
/// Creates a new [`CustomInputGenerator`] | ||
pub fn new(max_len: usize) -> Self { | ||
Self { max_len } | ||
} | ||
} | ||
|
||
impl<S> Generator<CustomInput, S> for CustomInputGenerator | ||
where | ||
S: HasRand, | ||
{ | ||
fn generate(&mut self, state: &mut S) -> Result<CustomInput, Error> { | ||
let byte_array = generate_bytes(self.max_len, state); | ||
let optional_byte_array = state | ||
.rand_mut() | ||
.coinflip(0.5) | ||
.then(|| generate_bytes(self.max_len, state)); | ||
let boolean = state.rand_mut().coinflip(0.5); | ||
|
||
Ok(CustomInput { | ||
byte_array, | ||
optional_byte_array, | ||
boolean, | ||
}) | ||
} | ||
} | ||
|
||
/// Generate a [`Vec<u8>`] of a length between 1 (incl.) and `length` (incl.) filled with random bytes | ||
fn generate_bytes<S: HasRand>(length: usize, state: &mut S) -> Vec<u8> { | ||
let rand = state.rand_mut(); | ||
let len = rand.between(1, length); | ||
let mut vec = Vec::new(); | ||
vec.resize_with(len, || rand.next() as u8); | ||
vec | ||
} | ||
|
||
/// [`Mutator`] that toggles the optional byte array of a [`CustomInput`], i.e. sets it to [`None`] if it is not, and to a random byte array if it is [`None`] | ||
pub struct ToggleOptionalByteArrayMutator { | ||
length: usize, | ||
} | ||
|
||
impl ToggleOptionalByteArrayMutator { | ||
/// Creates a new [`ToggleOptionalByteArrayMutator`] | ||
pub fn new(length: usize) -> Self { | ||
Self { length } | ||
} | ||
} | ||
|
||
impl<S> Mutator<CustomInput, S> for ToggleOptionalByteArrayMutator | ||
where | ||
S: HasRand, | ||
{ | ||
fn mutate(&mut self, state: &mut S, input: &mut CustomInput) -> Result<MutationResult, Error> { | ||
input.optional_byte_array = match input.optional_byte_array { | ||
None => Some(generate_bytes(self.length, state)), | ||
Some(_) => None, | ||
}; | ||
Ok(MutationResult::Mutated) | ||
} | ||
} | ||
|
||
impl Named for ToggleOptionalByteArrayMutator { | ||
fn name(&self) -> &Cow<'static, str> { | ||
&Cow::Borrowed("ToggleOptionalByteArrayMutator") | ||
} | ||
} | ||
|
||
/// [`Mutator`] that toggles the boolean field in a [`CustomInput`] | ||
pub struct ToggleBooleanMutator; | ||
|
||
impl<S> Mutator<CustomInput, S> for ToggleBooleanMutator { | ||
fn mutate(&mut self, _state: &mut S, input: &mut CustomInput) -> Result<MutationResult, Error> { | ||
input.boolean = !input.boolean; | ||
Ok(MutationResult::Mutated) | ||
} | ||
} | ||
|
||
impl Named for ToggleBooleanMutator { | ||
fn name(&self) -> &Cow<'static, str> { | ||
&Cow::Borrowed("ToggleBooleanMutator") | ||
} | ||
} |
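The hash-based naming used by `generate_name` can be tried in isolation. The sketch below uses only the standard library; `Parts` and `name_of` are hypothetical names that mirror the fields of `CustomInput` but are not part of the PR. It assumes the same scheme as the diff: hash all fields with `DefaultHasher` and format the 64-bit result as 16 hexadecimal digits.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for the custom input (hypothetical, mirrors the three fields).
#[derive(Hash, Clone, Debug)]
pub struct Parts {
    pub byte_array: Vec<u8>,
    pub optional_byte_array: Option<Vec<u8>>,
    pub boolean: bool,
}

/// Derives a file-name-friendly identifier from the hash of all fields.
pub fn name_of(input: &Parts) -> String {
    let mut hasher = DefaultHasher::new();
    input.hash(&mut hasher);
    // Zero-padded to 16 hex digits, since the hash is a u64.
    format!("{:016x}", hasher.finish())
}
```

Because the name depends only on the field values, two inputs with identical parts map to the same name, which is what makes it usable for deduplicating corpus entries.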
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,155 @@ | ||
mod input; | ||
|
||
#[cfg(windows)] | ||
use std::ptr::write_volatile; | ||
use std::{path::PathBuf, ptr::write}; | ||
|
||
use input::{ | ||
CustomInput, CustomInputGenerator, ToggleBooleanMutator, ToggleOptionalByteArrayMutator, | ||
}; | ||
use libafl::{ | ||
corpus::{InMemoryCorpus, OnDiskCorpus}, | ||
events::SimpleEventManager, | ||
executors::{inprocess::InProcessExecutor, ExitKind}, | ||
feedbacks::{CrashFeedback, MaxMapFeedback}, | ||
fuzzer::{Fuzzer, StdFuzzer}, | ||
monitors::SimpleMonitor, | ||
mutators::{ | ||
mapped_havoc_mutations, optional_mapped_havoc_mutations, scheduled::StdScheduledMutator, | ||
}, | ||
observers::StdMapObserver, | ||
schedulers::QueueScheduler, | ||
stages::mutational::StdMutationalStage, | ||
state::StdState, | ||
}; | ||
use libafl_bolts::{ | ||
current_nanos, | ||
rands::StdRand, | ||
tuples::{tuple_list, Append, Merge}, | ||
}; | ||
|
||
/// Coverage map with explicit assignments due to the lack of instrumentation | ||
static mut SIGNALS: [u8; 16] = [0; 16]; | ||
static mut SIGNALS_PTR: *mut u8 = unsafe { SIGNALS.as_mut_ptr() }; | ||
|
||
/// Assign a signal to the signals map | ||
fn signals_set(idx: usize) { | ||
if idx > 2 { | ||
println!("Setting signal: {idx}"); | ||
} | ||
unsafe { write(SIGNALS_PTR.add(idx), 1) }; | ||
} | ||
|
||
#[allow(clippy::similar_names, clippy::manual_assert)] | ||
pub fn main() { | ||
// The closure that we want to fuzz | ||
// The pseudo program under test uses all parts of the custom input | ||
// We are manually setting bytes in a pseudo coverage map to guide the fuzzer | ||
let mut harness = |input: &CustomInput| { | ||
signals_set(0); | ||
if input.byte_array == vec![b'a'] { | ||
signals_set(1); | ||
if input.optional_byte_array == Some(vec![b'b']) { | ||
signals_set(2); | ||
if input.boolean { | ||
#[cfg(unix)] | ||
panic!("Artificial bug triggered =)"); | ||
|
||
// panic!() raises a STATUS_STACK_BUFFER_OVERRUN exception which cannot be caught by the exception handler. | ||
// Here we make it raise STATUS_ACCESS_VIOLATION instead. | ||
// Extending the windows exception handler is a TODO. Maybe we can refer to what winafl code does. | ||
// https://github.com/googleprojectzero/winafl/blob/ea5f6b85572980bb2cf636910f622f36906940aa/winafl.c#L728 | ||
#[cfg(windows)] | ||
unsafe { | ||
write_volatile(0 as *mut u32, 0); | ||
} | ||
} | ||
} | ||
} | ||
ExitKind::Ok | ||
}; | ||
|
||
// Create an observation channel using the signals map | ||
let observer = unsafe { StdMapObserver::from_mut_ptr("signals", SIGNALS_PTR, SIGNALS.len()) }; | ||
|
||
// Feedback to rate the interestingness of an input | ||
let mut feedback = MaxMapFeedback::new(&observer); | ||
|
||
// A feedback to choose if an input is a solution or not | ||
let mut objective = CrashFeedback::new(); | ||
|
||
// create a State from scratch | ||
let mut state = StdState::new( | ||
// RNG | ||
StdRand::with_seed(current_nanos()), | ||
// Corpus that will be evolved, we keep it in memory for performance | ||
InMemoryCorpus::new(), | ||
// Corpus in which we store solutions (crashes in this example), | ||
// on disk so the user can get them after stopping the fuzzer | ||
OnDiskCorpus::new(PathBuf::from("./crashes")).unwrap(), | ||
// States of the feedbacks. | ||
// The feedbacks can report the data that should persist in the State. | ||
&mut feedback, | ||
// Same for objective feedbacks | ||
&mut objective, | ||
) | ||
.unwrap(); | ||
|
||
// The Monitor trait defines how the fuzzer stats are displayed to the user | ||
let mon = SimpleMonitor::new(|s| println!("{s}")); | ||
|
||
// The event manager handles the various events generated during the fuzzing loop | ||
// such as the notification of the addition of a new item to the corpus | ||
let mut mgr = SimpleEventManager::new(mon); | ||
|
||
// A queue policy to get testcases from the corpus | ||
let scheduler = QueueScheduler::new(); | ||
|
||
// A fuzzer with feedbacks and a corpus scheduler | ||
let mut fuzzer = StdFuzzer::new(scheduler, feedback, objective); | ||
|
||
// Create the executor for an in-process function with just one observer | ||
let mut executor = InProcessExecutor::new( | ||
&mut harness, | ||
tuple_list!(observer), | ||
&mut fuzzer, | ||
&mut state, | ||
&mut mgr, | ||
) | ||
.expect("Failed to create the Executor"); | ||
|
||
// Generator of custom inputs whose byte arrays have a maximum length of 1 | ||
let mut generator = CustomInputGenerator::new(1); | ||
|
||
// Generate 8 initial inputs | ||
state | ||
.generate_initial_inputs(&mut fuzzer, &mut executor, &mut generator, &mut mgr, 8) | ||
.expect("Failed to generate the initial corpus"); | ||
|
||
// Merging multiple lists of mutators that mutate a sub-part of the custom input | ||
// This collection could be expanded with default or custom mutators as needed for the input | ||
// First, mutators for the simple byte array | ||
let mutations = mapped_havoc_mutations( | ||
CustomInput::byte_array_mut, | ||
&CustomInput::byte_array_optional, | ||
) | ||
// Then, mutators for the optional byte array, these return MutationResult::Skipped if the part is not present | ||
.merge(optional_mapped_havoc_mutations( | ||
CustomInput::optional_byte_array_mut, | ||
&CustomInput::optional_byte_array_optional, | ||
)) | ||
// A custom mutator that sets the optional byte array to None if present, and generates a random byte array of length 1 if it is not | ||
.append(ToggleOptionalByteArrayMutator::new(1)) | ||
// Finally, a custom mutator that toggles the boolean part of the input | ||
.append(ToggleBooleanMutator); | ||
|
||
// Scheduling layer for the mutations | ||
let mutator = StdScheduledMutator::new(mutations); | ||
// Defining the mutator stage | ||
let mut stages = tuple_list!(StdMutationalStage::new(mutator)); | ||
|
||
// Run the fuzzer | ||
fuzzer | ||
.fuzz_loop(&mut stages, &mut executor, &mut state, &mut mgr) | ||
.expect("Error in the fuzzing loop"); | ||
} |
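The comments in the mutator setup note that the optional mapped mutators skip when the part is absent, while the toggle mutator switches between present and absent. A minimal std-only sketch of those two behaviours follows; `Outcome`, `mutate_optional`, and `toggle_optional` are hypothetical helpers written for this illustration and are not LibAFL functions.

```rust
/// Outcome of a mutation attempt (hypothetical stand-in for a result type).
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Mutated,
    Skipped,
}

/// Applies `mutate` to the bytes if the part is present, otherwise skips.
pub fn mutate_optional<F>(part: &mut Option<Vec<u8>>, mutate: F) -> Outcome
where
    F: FnOnce(&mut Vec<u8>) -> Outcome,
{
    match part.as_mut() {
        Some(bytes) => mutate(bytes),
        None => Outcome::Skipped,
    }
}

/// Sets the part to `None` if present, or to `fresh` if absent.
pub fn toggle_optional(part: &mut Option<Vec<u8>>, fresh: Vec<u8>) -> Outcome {
    *part = if part.is_none() { Some(fresh) } else { None };
    Outcome::Mutated
}
```

Combining both kinds of mutators matters: without a toggle, an input whose optional part starts as `None` could never gain one, and every byte-level mutation on that part would be skipped.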
Review comment: Maybe we should re-export the og `havoc_mutations` to not break all code out there, what do you think?
Reply: Hmm. That's a fair point. For code we control, I'd much rather have the import from the source (`mutators::scheduled` has nothing to do with havoc_mutations anymore). Unfortunately, deprecating re-exports is not supported yet (see issue). So I guess we're stuck with just re-exporting them and add a comment telling people to do the other thing and explaining why the re-export is there. Maybe link the issue as well.
Opinions?
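For reference, the re-export approach discussed here could look roughly like the sketch below. The module layout is an assumption made for illustration (`mutators::havoc` as the new home, `mutators::scheduled` as the old path) and the function body is a placeholder; the real LibAFL paths and signatures may differ.

```rust
pub mod mutators {
    /// Assumed new home of the function (hypothetical module name).
    pub mod havoc {
        /// Placeholder body; the real function returns a tuple of mutators.
        pub fn havoc_mutations() -> Vec<&'static str> {
            vec!["bit_flip", "byte_flip"]
        }
    }

    pub mod scheduled {
        // Kept only for backwards compatibility: existing code imports
        // `havoc_mutations` from here. New code should import it from
        // `mutators::havoc` instead. A deprecation attribute cannot be
        // relied on for re-exports, hence this comment.
        pub use super::havoc::havoc_mutations;
    }
}
```

Both paths resolve to the same function, so existing imports keep compiling while the comment steers new code to the source module.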