@kgerheiser kgerheiser commented Dec 11, 2020

setup-cron.sh sets the configuration variables and is the entry point for cron. It downloads hpc-stack and runs build-and-test.sh with the options set in setup-cron.sh.

build-and-test.sh then builds the develop branch of hpc-stack (along with the debug version of ESMF), checks out the ufs-weather-model, and runs the regression test.

A hash is saved after each build so that hpc-stack is rebuilt only when there are changes.
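
For illustration, a minimal sketch of such a hash check, assuming the hash is stored in a plain-text file. The function names `needs_rebuild` and `save_hash` and the file layout are hypothetical; they are not taken from the scripts in this PR.

```bash
#!/bin/bash
# Hypothetical helper: succeed (return 0) when a rebuild is needed.
# $1 = current hash, $2 = file holding the hash of the last build
needs_rebuild() {
    local current_hash=$1
    local hash_file=$2

    # No saved hash yet means nothing has been built.
    [[ -f "$hash_file" ]] || return 0

    local last_hash
    last_hash=$(<"$hash_file")
    [[ "$current_hash" != "$last_hash" ]]
}

# Hypothetical helper: record the hash after a build.
# $1 = hash, $2 = file to write it to
save_hash() {
    printf '%s\n' "$1" > "$2"
}
```

The current hash could come from something like `git rev-parse HEAD` run in the hpc-stack checkout, with `save_hash` called only once the build has finished.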

A nicely formatted email is sent out with the results of the test, for example:

==============================================================================
Mon Dec 14 03:33:19 UTC 2020
hpc-stack hash: bbbea9dec4768febe3feb7a71a0b1268a5cc80b7
ufs hash: 9bd6b59b3c5a230558430f238e32ca60d2844e57

hpc-stack build: SUCCESS
hpc-stack log: /home/Kyle.Gerheiser/.hpc-stack/logs/hpc-stack_12-14-2020-00:53.log
UFS regression test: PASS
ufs log: /home/Kyle.Gerheiser/.hpc-stack/logs/ufs_12-14-2020-02:29.log

==============================================================================
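
A report in this layout could be produced by a small function like the one below. This is only a sketch: the function name `format_report`, its argument order, and the width of the rule are assumptions, not the PR's actual code.

```bash
#!/bin/bash
# Hypothetical helper: print a report in the layout shown above.
# Arguments: stack hash, ufs hash, build status, build log,
#            test status, test log
format_report() {
    local rule
    # Rule width is an arbitrary choice for this sketch.
    rule=$(printf '=%.0s' {1..78})
    cat <<EOF
$rule
$(date -u)
hpc-stack hash: $1
ufs hash: $2

hpc-stack build: $3
hpc-stack log: $4
UFS regression test: $5
ufs log: $6

$rule
EOF
}
```

On a machine where a `mail` command is available, the output could be piped to it, for example `format_report ... | mail -s "nightly build" "$address"`, where `$address` is a placeholder.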

Fixes #81

@kgerheiser kgerheiser marked this pull request as draft December 11, 2020 18:09

@edwardhartnett edwardhartnett left a comment

Awesome!

@kgerheiser kgerheiser marked this pull request as ready for review December 14, 2020 04:47
@kgerheiser

Ready for review now.

I tested it out on Hera over the weekend. I fixed some odds and ends, added the UFS tests, and added a nicely formatted email with the results.

I'll install it across our machines this week.

It will be installed in a public nightly-develop folder every day (if there are changes in hpc-stack) alongside the versioned release so that anyone can try out the latest and greatest.

@aerorahul aerorahul left a comment

some suggestions for modularizing

@kgerheiser

How about this?

I factored out the testing into its own script, and created another script test-applications.sh which calls the individual test scripts.
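
As a rough illustration of that structure, a driver that calls individual test scripts might look like the sketch below. The function name `run_application_tests` and the PASS/FAIL output format are hypothetical and are not taken from test-applications.sh.

```bash
#!/bin/bash
# Hypothetical driver: run each test script given as an argument and
# report PASS or FAIL. Returns non-zero if any script fails.
run_application_tests() {
    local failures=0
    local script
    for script in "$@"; do
        if bash "$script"; then
            echo "$(basename "$script" .sh): PASS"
        else
            echo "$(basename "$script" .sh): FAIL"
            failures=$((failures + 1))
        fi
    done
    [[ $failures -eq 0 ]]
}
```

Keeping one script per application means a new downstream test can be added by passing one more script to the driver.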

@aerorahul aerorahul left a comment

I like this much better.
One suggestion is to also turn on set -x. It produces a lot of output, but it is worth it for debugging.
However, do use set +x around any module commands.
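
One way to apply this is a small wrapper that switches tracing off for a single command and back on afterwards. The wrapper name `quietly` is hypothetical; writing `set +x` and `set -x` by hand around each module command works just as well.

```bash
#!/bin/bash
set -x   # trace every command for debugging

# Hypothetical wrapper: run one command with tracing turned off,
# then turn tracing back on and pass the command's status through.
quietly() {
    { set +x; } 2>/dev/null   # hide the trace of "set +x" itself
    local status=0
    "$@" || status=$?
    set -x
    return $status
}

# Example (commented out because `module` only exists where
# Environment Modules or Lmod is installed):
# quietly module load hpc
```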

kgerheiser commented Dec 14, 2020

I removed the if-block and fixed the typo.

I added a special return code to build-hpc-stack.sh for when the stack isn't built because the hash hasn't changed, so that the top-level script knows not to run the tests. Is there a better way to handle that case?
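
A sketch of how a top-level script could tell "skipped" apart from "failed" using such a return code. The value `3`, the variable `EXIT_NO_CHANGES`, and the function `run_build_step` are hypothetical; the code actually used in build-hpc-stack.sh is not shown in this conversation.

```bash
#!/bin/bash
# Hypothetical exit code meaning "nothing to build".
EXIT_NO_CHANGES=3

# Hypothetical dispatcher: run the build command given as arguments
# and print what the top-level script should do next.
run_build_step() {
    local status=0
    "$@" || status=$?
    case $status in
        0)                  echo "built: run tests" ;;
        "$EXIT_NO_CHANGES") echo "unchanged: skip tests" ;;
        *)                  echo "failed: report error" ;;
    esac
    return $status
}
```

A call such as `run_build_step ./build-hpc-stack.sh` would then go on to the tests only when the status is 0.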

I enabled -x in build-hpc-stack.sh and redirected it to a log.

I don't want to get an email every day with all that output. It's available in the logs.
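
For reference, one way to keep `set -x` output out of the normal output stream is bash's `BASH_XTRACEFD` variable (bash 4.1 or later). This is a sketch of that technique under assumed names, not necessarily how build-hpc-stack.sh does its redirection; the log path here is a temporary file chosen for the example.

```bash
#!/bin/bash
# Hypothetical log location for this sketch.
trace_log=$(mktemp)

# Send `set -x` output to the log instead of stderr.
exec 9>>"$trace_log"
BASH_XTRACEFD=9
set -x

echo "building..."   # traced into the log; only the message is printed

set +x
unset BASH_XTRACEFD  # trace output goes back to stderr; fd 9 is closed
```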

@kgerheiser

Done

@aerorahul aerorahul left a comment

looks great.
Thanks for setting this up.

@@ -0,0 +1,76 @@
#!/bin/bash -l

I noticed you are using -l so that $MODULESHOME/init/bash is sourced, as is the case on HPCs.
When a sudo executes this script, there will be no login shell.

My recommendation is to remove -l and do a source $MODULESHOME/init/sh wherever modules are being loaded/used/etc.

I'll try it out, but I don't think $MODULESHOME will even be set. You'll have to source an absolute path like /apps/contrib/lmod/lmod/init/bash, which is a pain to set for each machine.
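
A sketch of how both cases could be handled in one place: use `$MODULESHOME` when it is set and fall back to a per-machine path otherwise. The function name `init_modules` and its argument are hypothetical and not part of this PR.

```bash
#!/bin/bash
# Hypothetical helper: make the `module` command available in a
# non-login shell. $1 = per-machine fallback init script (optional)
init_modules() {
    local fallback_init=${1:-}

    # Already available (for example, in a login shell): nothing to do.
    if type module >/dev/null 2>&1; then
        return 0
    fi

    if [[ -n "${MODULESHOME:-}" && -f "$MODULESHOME/init/bash" ]]; then
        source "$MODULESHOME/init/bash"
    elif [[ -n "$fallback_init" && -f "$fallback_init" ]]; then
        source "$fallback_init"
    else
        echo "module init script not found" >&2
        return 1
    fi
}
```

The fallback path would still have to be configured once per machine, for example alongside the other variables in setup-cron.sh.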

…n on HPC systems

There are several scripts: setup-cron.sh, build-hpc-stack.sh, test-applications.sh, and test-ufs.sh.

Configure variables in setup-cron.sh and then have cron run that.

When run, setup-cron.sh builds hpc-stack and then runs tests from downstream applications like the ufs-weather-model.

An email is sent with the results, and logs are saved to the log directory for more information.
@kgerheiser kgerheiser merged commit fb3ed1d into NOAA-EMC:develop Dec 16, 2020
@kgerheiser kgerheiser deleted the feature/cron-ci branch December 16, 2020 03:17