Non deterministic macros and id consistency problem #12627

Closed
@cooldome

Description

Some macros are non-deterministic in the sense that what they do can differ even when their arguments remain identical, because their behaviour is driven by external state, usually files on disk.
The classic example is a macro that generates a file on the first run and then reuses the generated file on subsequent runs. Putting generated code into a file has several advantages: the code gets lineinfo and is therefore debuggable, and compilation is faster if the macro is expensive.

I have a macro that is invoked 100+ times and generates a file on each invocation. It uses symBodyHash to check whether the file is still up to date or needs to be regenerated on subsequent runs.

There is one problem. On the first run the macro generates a lot of code and hence consumes a lot of Nim ids (idgen.nim), while on the second run it does barely anything and consumes only a few ids. This affects the internal state of the compiler and the sym ids in all templates expanded after the macro invocation. If you compare the generated CPP code, you will find that only the ids of genSyms and some proc hashes have changed. Nim keeps recompiling the whole project on every run because the ids keep changing: the number of ids my macro consumes in one invocation shifts the ids seen by everything expanded after it, including all further invocations.
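The drift can be illustrated with a toy model. This is only a sketch: `IdGen`, `freshId`, `runMacro` and `compile` are hypothetical names, not the compiler's actual idgen.nim, and the id counts (1500 and 10) are made-up numbers.

```nim
# Toy model of a global id counter; not the compiler's real idgen.nim.
type IdGen = object
  next: int

proc freshId(g: var IdGen): int =
  ## Hands out the next id and advances the counter.
  result = g.next
  inc g.next

proc runMacro(g: var IdGen; idsConsumed: int) =
  ## Stands in for a macro expansion that allocates `idsConsumed` ids.
  for _ in 1 .. idsConsumed:
    discard g.freshId()

proc compile(macroCost: int): int =
  ## Returns the id given to the first symbol created after the macro call.
  var g = IdGen(next: 0)
  g.runMacro(macroCost)
  result = g.freshId()

# Hypothetical costs: 1500 ids when the file is generated, 10 when it is reused.
let firstRun = compile(1500)
let secondRun = compile(10)
echo "first run:  ", firstRun
echo "second run: ", secondRun
doAssert firstRun != secondRun
```

In this model the symbol created right after the macro gets a different id in each run, which is the same effect that changes genSym names in the generated code.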

I have found a rather easy solution to the problem: add an idSynchronizationPoint(2000) call to the semAfterMacroCall function in sem.nim. This keeps the generated ids identical across all runs. idSynchronizationPoint(1000) wasn't enough. FYI, idSynchronizationPoint skips ids forward to the next round id divisible by its argument.
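A small sketch of the skipping arithmetic shows why the size of the range matters. It assumes the semantics described above (jump to the next id divisible by the argument); `syncPoint` is a hypothetical helper, not the compiler's function, and the counter values are made-up numbers.

```nim
proc syncPoint(id, idRange: int): int =
  ## Next id strictly greater than `id` that is divisible by `idRange`
  ## (assumed semantics, based on the description above).
  (id div idRange + 1) * idRange

# Hypothetical counter values right after the macro call:
const afterGenerate = 1500   # first run, file generated
const afterReuse = 10        # later run, file reused

# A range of 1000 leaves the two runs on different ids.
doAssert syncPoint(afterGenerate, 1000) == 2000
doAssert syncPoint(afterReuse, 1000) == 1000

# A range of 2000 brings both runs to the same id.
doAssert syncPoint(afterGenerate, 2000) == 2000
doAssert syncPoint(afterReuse, 2000) == 2000
```

Under these assumptions the two runs only converge when both counters fall between the same pair of multiples, so a range smaller than the spread in id consumption cannot work, and even a larger one can fail if the two values straddle a multiple.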

Possible solutions.

  1. Just add idSynchronizationPoint(2000) to semAfterMacroCall.
    We might expose ourselves to running out of ids in macro-heavy code, but that problem may be more theoretical than practical. Has anyone ever exhausted all ids?
    It might be the case that 2000 is not enough in the future and we will have to increase it.

  2. Expose the idSynchronizationPoint function to the user in macros.nim and document it really well. Users can call it at the bottom of a non-deterministic macro to keep Nim's internal state consistent.

  3. Go fancy: introduce a new syntax/pragma for non-deterministic macros to make the compiler aware that it needs to sync ids after such a macro call. Something like this:

macro x(a: untyped): untyped {.sideEffect.}

The problem of choosing how big the skip range should be remains, but we would have far fewer skips to make.

I like the second solution. While it feels undesirable to open up the internal state of the compiler, it is always possible to make the exposed function do nothing and mark it as deprecated once we have a better solution.
Selecting the argument then becomes the user's problem.
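From the user's side, the second solution might look roughly like the sketch below. The `idSynchronizationPoint` defined here is only a do-nothing stand-in so that the example compiles; no such function is exposed in macros.nim today. The macro name `generated` and the argument 2000 are likewise just illustrative.

```nim
import std/macros

proc idSynchronizationPoint(idRange: int) {.compileTime.} =
  ## Hypothetical stand-in for the proposed API; it does nothing here.
  discard

macro generated(body: untyped): untyped =
  ## Stands in for a macro whose work depends on files on disk.
  result = body
  # Proposed usage: sync ids as the last step of the macro.
  idSynchronizationPoint(2000)

generated:
  let answer = 42

doAssert answer == 42
```

A do-nothing body is also what the function could be reduced to later, once it is deprecated in favour of a better mechanism.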
