Hello. I am sharing a crazy idea we had this week.
Based on the advice given in the "performances" section of the SLiM manual, I assume that the Eidos script lines written in every block/callback/function of a model (or at least some intermediate, pre-parsed representation of them) are interpreted at runtime to figure out which internal functions to call, check whether values have correct types, handle the bookkeeping of variables, etc.
Although this provides incredible model flexibility, it must come with some overhead, which I assume is twofold:
1. The cost of interpretation itself (type checking, bookkeeping of variables, dynamic call resolution, etc.), which I have some assumptions about:
   - Easy to measure.
   - Small compared to the time spent within the actual C++ functions "doing the job" (so it's okay).
   - Irreducible: already optimal, or at least very nearly so.
   - Predictable and linear: the more Eidos script lines, the more overhead.
2. The cost of missed optimisations. Lexical variables and dynamic call resolution make it impossible for a compiler to statically analyze the model and inline function calls, prune dead branches, unroll loops, optimize variables and type checks away, etc. I assume that this fraction of the overhead is:
   - Variable, unpredictable, and highly nonlinear: the overhead depends on how easy it would be for a compiler to optimize a given user model.
   - Difficult to measure, because the performance of a model written in Eidos script would have to be compared to that of an alternate implementation of the same model written in a pure low-level language, in a way that depends on the quality of this alternate implementation.
   - Possibly large or very large, especially for models involving numerous Eidos variables/lines and/or tight Eidos loops.
Addressing 2. requires a huge amount of work, which I would break down into the following:
- Select a couple of typical SLiM recipes and implement them in some alternate pure low-level language (C/C++/Rust..) to compare performance against their Eidos versions. The alternate implementations should be free from the Eidos library, without dynamic call resolution. No flexibility required.
- If the performance is nearly identical, then stop here. There is no point in experimenting further: interpreted Eidos is already optimal.
- If there is some juicy performance improvement, then we have found that Eidos script needs a compiler. Write this compiler, featuring one Eidos recipe at a time, until the whole language is covered.
The (experimental) compiler would not need to interfere with the software at all. It would just read a simulation model file written in Eidos script and output an alternate, dedicated, rigid implementation of the same model in some lower-level language (C/C++/Rust..). That output could then be compiled down to optimized machine code by existing low-level compilers (gcc/clang/rustc..) depending on the user's architecture, etc.
The work is huge because an Eidos -> Rust compiler, for example, would need to embed a complete, modular representation of the SLiM simulation model, so as to be able to generate minimal code corresponding to any given user-defined model. Fortunately, the SLiM simulation model is very well defined, to the point that it already has a very precise and rigorous specification: the SLiM manual and the grammar of Eidos script itself <3 These two documents constitute a very strong basis to build such a tool upon IMO, because they mean that the SLiM model is already polished, ergonomic, meaningful, and free of the kind of uncertain quirks that would make writing a compiler a nightmare. Therefore my opinion is that this work, although huge, would (if juicy) also be pleasant, instructive, and fun.
The first reason I am opening this issue is to share these thoughts with the community here, asking whether there is any precedent and whether more is known about the possible magnitude of 2.
The second reason is that I would personally be very excited to take a shot at this, as it falls within the combined scope of my interests, the interests of people in my lab, my experience, and my job. Although I don't have any bandwidth to get into this right now, I'd be happy to free some up by 2025 if this turned out to be interesting to you :)