Skip to content

Conversation

erikmd
Copy link
Collaborator

@erikmd erikmd commented Sep 8, 2021

This PR is intended to Reproduce and Fix #414

…atch guard

Aim: mitigate a TOC/TOU race condition which occurs when
Learnocaml_process_exercise_repository.n_processes > 1,
triggered by a command such as "learn-ocaml build -j 2".

This isn't the only possible fix/workaround, just a concise one
relying on error handling.
Instead of tweaking Lwt_utils.mkdir_p, we add a call to this function
for the "www/static" directory before running the parallel build jobs.

MINOR-DRAWBACK: "www/static" is created even if there is no exercise!
@erikmd
Copy link
Collaborator Author

erikmd commented Sep 8, 2021

Hi all,

I've tried 2 different fixes in this PR (tweak Lwt_utils.mkdir_p (in my reverted commit) or just calling Lwt_utils.mkdir_p once more).

Both fix the issue mentioned in issue #414: Avoid Cannot process exercises: Unix.Unix_error(Unix.EEXIST, "mkdir", "./www/static")

However, the issue is more pervasive than what I thought at first sight: despite this potential fix,
the parallel processing is still broken… because we get randomly, with the 2-exercises demo repositiory, one of these outcomes:

  1. OK
Learnocaml v.0.12 running.
Updating app at ./www
demo2                        [OK]
demo                         [OK]
  1. Partial build and freeze (the build process hangs):
Updating app at ./www
demo2                        [OK]
  1. something yet more strange:
Learnocaml v.0.12 running.
Updating app at ./www
Grader error: Lwt.Resolution_loop.Canceled
demo                       [FAILED]
demo2                        [OK]
demo                         [OK]

So, I'm going to postpone this for the time being − fixing this parallel build issue would be nice, but will require more investigation in src/repo/learnocaml_process_exercise_repository.ml

(FTR, the outputs above mention "0.12" but the tested version was obviously the commits from this PR;
anyway, I think "the ambient version number" should be bumped in master anytime soon, and after each release)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

learn-ocaml build -j 2 is broken

1 participant