So yeah, topological sorting is one element, but that global stack is a data race! You need to test set inclusion AND insert into it in an ordered way. A global mutex is gross. Doing it lock-free could maybe work with a lock-free concurrent priority queue plus a pair of monotonic generation counters for the priorities (the one being processed, then the next), plus some memo of updates so that a conflicting re-update is invalidated by violating the generation constraint. I see no fewer than 3 CAS operations, so updates across a highly contended system get fairly hairy. But still, a naive approach is good enough for the 99%, so let there be glitches!
Yeah, this is in JavaScript. It's inherently single-threaded in almost all contexts (the exception being things like Node.js shared memory, where you're intentionally bypassing core semantics for performance, and correctness is entirely on you).
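For what it's worth, here's a rough sketch of why the "test inclusion AND insert in order" step needs no lock in ordinary single-threaded JS. Everything here is made up for illustration: `Scheduler`, `schedule`, `flush`, and the `level` field (assumed to be a node's precomputed topological depth) aren't from any particular library.

```javascript
// Hypothetical sketch: a dedupe-and-order update queue for a reactive graph.
// Assumes each node carries a precomputed topological `level`.
class Scheduler {
  constructor() {
    this.pending = new Set(); // fast membership test
    this.queue = [];          // kept sorted by ascending level
  }

  schedule(node) {
    // Check-then-insert is safe without a lock here: on a single thread,
    // no other JS code can run between these statements.
    if (this.pending.has(node)) return false;
    this.pending.add(node);
    // Insert after the last entry whose level is <= node.level.
    let i = this.queue.length;
    while (i > 0 && this.queue[i - 1].level > node.level) i--;
    this.queue.splice(i, 0, node);
    return true;
  }

  flush() {
    const order = [];
    while (this.queue.length > 0) {
      const node = this.queue.shift();
      this.pending.delete(node);
      node.update(this); // an update may schedule dependents
      order.push(node.name);
    }
    return order;
  }
}

// Diamond graph: a -> b, a -> c, b -> d, c -> d
const d = { name: "d", level: 2, update() {} };
const b = { name: "b", level: 1, update(s) { s.schedule(d); } };
const c = { name: "c", level: 1, update(s) { s.schedule(d); } };
const a = { name: "a", level: 0, update(s) { s.schedule(b); s.schedule(c); } };

const scheduler = new Scheduler();
scheduler.schedule(a);
const order = scheduler.flush();
console.log(order); // [ 'a', 'b', 'c', 'd' ]
```

Note that `d` only runs once even though both `b` and `c` schedule it, and it runs after both. With workers sharing memory, the `has`/`add`/`splice` sequence would no longer be atomic, and that's where the CAS juggling would come in.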