Skip to content

Expressivity Gap: concurrent ABA-safe stack (atomics + provenance = pain) #480

Open
@RalfJung

Description

@RalfJung

Alice Ryhl brings up the following data structure: a concurrent stack where all elements can only be popped at once.

Stack declaration:

static STACK_TOP: AtomicPtr<Elem>;

struct Elem {
    next: *mut Elem,
}

Push:

let elem = malloc();
elem.next = STACK_TOP.load(Acquire);
while let Err(new_next) = STACK_TOP.cas(elem.next, elem, AcqRel) {
    elem.next = new_next;
}

Pop all:

let next = STACK_TOP.swap(null);
while !next.is_null() {
    let curr = next;
    next = curr.next;
    use(curr);
    free(curr);
}

This can run into ABA situations in push if, between the load and the cas, the stack gets popped, freed, new elements get pushed, and they re-use the same address. Then STACK_TOP will contain an Elem whose next points to the right element but still uses the provenance of the old element. Adding any code between the load and the cas cannot fix this since the new element might bet allocated after all of that code runs, and from_exposed_addr cannot "grab" a provenance that does not even exist yet.

IMO what should be done here is that next should be declared to have type usize, and one just passes the pointers through exposed_addr/from_exposed_addr when they enter/leave the stack. However some people seem to consider it very important that the stack uses a normal pointer type and can be used in the rest of the code without being aware that there are concurrency shenanigans going on. I am not sure why that is so important, given in particular that I think this will at best be an utter nightmare to specify, but here are some options I have seen or thought of:

  1. I have seen a proposal to fix this for C++ with some sot of operation that "refreshes the provenance of a pointer and every other pointer reachable through it". That operation sounds like a nightmare for program verification since it can change state that is very far away from where that operation happens; we don't currently have any operation that reaches transitively through pointers and I think that kind of locality is important.
  2. Make it so that from_exposed_addr actually works here. For instance, rather than angelically guessing the provenance when from_exposed_addr get called, we could generate a fresh symbolic provenance and gather constraints on how that provenance is used, and then raise UB when the constraints become unsatisfiable. (This is basically the "udi" part of PNVI-ae-udi.) Unfortunately I have no great idea how to do this with a complex aliasing model.
  3. Introduce a kind of provenance that has the following effect: when doing a load through a pointer with such provenance, all provenance that was loaded is refreshed by passing the pointer through |p| from_exposed_addr(p.exposed_addr()), and the loaded provenance also recursively obtains this effect. However, now loads through such pointers cannot be DCE'd, since they have the side-effect of exposing some provenance, so this doesn't seem practical either.

So to me (2) is the only realistic option I have seen so far -- which conveniently would also get rid of angelic choice again, if we can make it work...

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions