@@ -218,12 +218,18 @@ struct DAE : public Pass {
218218 }
219219 }
220220
221- // For each function, the set of callers. This is somewhat expensive to
222- // compute, so we don't do it in every iteration. Rarely, a call may vanish
223- // (due to applying a constant param that then lets the optimizer remove it),
224- // and in such cases we are over-approximating the set of callers here, which
225- // can lead to a little wasted work, but this is still more efficient than
226- // computing callers precisely in each iteration.
221+ // For each function, the set of callers. This is used to propagate changes,
222+ // e.g. if we remove a return value from a function, the calls might benefit
223+ // from optimization. It is ok if this is an over-approximation, that is, if
224+ // we think there are more callers than there are, as it would just lead to
225+ // unneeded extra scanning of calling functions (in the example just given, if
226+ // a caller did not actually call, they would not benefit from optimization,
227+ // but no harm is done, and no optimization is missed). Such over-
228+ // approximation can happen in later optimization iterations: We may manage to
229+ // remove a call from a function to another (say, after applying a constant
230+ // param, we see the call is not reached). This is somewhat rare, and the cost
231+ // of computing this map is significant, so we compute it once at the start
232+ // and then use that possibly-over-approximating data.
227233 std::vector<std::unordered_set<Name>> callers;
228234
229235 bool iteration (Module* module , DAEFunctionInfoMap& infoMap) {
@@ -276,9 +282,8 @@ struct DAE : public Pass {
276282 }
277283 }
278284
285+ // See comment above, we compute callers once and never again.
279286 if (callers.empty ()) {
280- // This is faster, but confirm no regress... maybe recompute once if no other opts kick in, from outside?
281- // a ctually thisis fine... it only does more unneeded work at worst! comment up top. but test on more code, and with dae-opt too
282287 callers.resize (numFunctions);
283288 for (auto & [func, info] : infoMap) {
284289 for (auto & [name, calls] : info.calls ) {
0 commit comments