@Flamefire
Contributor

As discussed in Slack and confirmed in pytorch/pytorch#44948, building with glog and gflags is not officially tested and is known to cause problems, at least on POWER architectures.

Hence their removal.

Also updated to use the PyTorch EasyBlock, promote Ninja to a runtime dependency, and use the EasyBuild-installed protobuf (same as the easyconfigs for versions 1.4+).
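
For illustration, the dependency change described above might look roughly like the following in an easyconfig (easyconfigs use Python syntax). This is only a sketch: the version strings are placeholders chosen for this example and are not taken from the PR; only the package names come from the description. It also assumes that leaving out a generic `easyblock` line lets EasyBuild pick the software-specific PyTorch EasyBlock by name.

```python
# Sketch of the relevant part of a PyTorch 1.3.1 easyconfig after this change.
# All version strings below are placeholders (assumptions), not values from the PR.
name = 'PyTorch'
version = '1.3.1'

# No generic "easyblock = ..." line: assumed to select the PyTorch EasyBlock by name.

dependencies = [
    ('Ninja', '1.9.0'),      # promoted from build to runtime dependency (placeholder version)
    ('protobuf', '3.10.0'),  # EasyBuild-installed protobuf (placeholder version)
    # glog and gflags are intentionally no longer listed
]

# Self-check for this sketch only: the removed packages are not dependencies anymore.
removed = {'glog', 'gflags'}
dependency_names = {dep_name for dep_name, _ in dependencies}
assert not removed & dependency_names
```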

@Flamefire force-pushed the pytorch1.3.1-remove-glog-gflags branch from bcf3231 to e7c4c54 on September 22, 2020 16:07
@Flamefire
Contributor Author

Test report by @Flamefire
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in this PR)
taurusml23 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/6873e198834692d0330c406d50957ca7 for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
taurusi5539.taurus.hrsk.tu-dresden.de - Linux RHEL 7.8, x86_64, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, Python 2.7.5
See https://gist.github.com/e9c6d1f22c5470de77de8f637d680d3c for a full test report.

@boegel
Member

boegel commented Sep 23, 2020

@boegelbot please test @ generoso

@boegel
Member

boegel commented Sep 23, 2020

@Flamefire failing tests on POWER?

@Flamefire
Contributor Author

Yes, the test exclusions need to be added too. Let's see what it needs -.-

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on generoso

PR test command 'EB_PR=11325 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_11325 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 7858

Test results coming soon (I hope)...

Details

- notification for comment with ID 697297773 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegel changed the title from "Use EasyBlock for PyTorch 1.3.1 and remove glog&gflags" to "Use EasyBlock for PyTorch 1.3.1 and remove glog & gflags" on Sep 23, 2020
@Flamefire
Contributor Author

Test report by @Flamefire
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in this PR)
taurusml22 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/de6c405c90509105ee46c4350f1fd80e for a full test report.

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
generoso-x-3 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/b17a2e40bc1d52d1376b6a7778f6fcff for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in this PR)
taurusml18 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/5baa52b3844a57f7a3dc10f9ff6f057c for a full test report.

@boegel
Member

boegel commented Sep 23, 2020

@Flamefire More? RuntimeError: test_nn failed!

@Flamefire
Contributor Author

Yep, kinda expected. I'm adding the exclusions one by one, and the list will likely end up the same as for 1.4.0. Still need to test them...
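
To make the idea of per-architecture test exclusions concrete, here is a minimal sketch of how such a list could be turned into arguments for the test runner. It assumes a dict keyed by architecture name, with `''` meaning "all architectures", and assumes the runner accepts `-x` followed by test names. The helper `build_exclude_args` and the entry `test_example_flaky` are hypothetical; `test_nn` is used only because it is the failure mentioned in this thread, not because the PR is known to exclude it.

```python
def build_exclude_args(excluded_tests, arch):
    """Return a '-x ...' argument string covering tests excluded everywhere plus on `arch`."""
    # The '' key holds tests excluded on every architecture.
    tests = list(excluded_tests.get('', []))
    # Append architecture-specific exclusions, skipping duplicates.
    for test in excluded_tests.get(arch, []):
        if test not in tests:
            tests.append(test)
    return '-x ' + ' '.join(tests) if tests else ''


# Hypothetical exclusion table for illustration only.
excluded_tests = {
    '': ['test_example_flaky'],
    'POWER': ['test_nn'],
}

power_args = build_exclude_args(excluded_tests, 'POWER')
x86_args = build_exclude_args(excluded_tests, 'x86_64')
print(power_args)
print(x86_args)
```

Keeping the architecture-specific entries separate means a test that only fails on POWER still runs on x86_64.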

@Flamefire
Contributor Author

Test report by @Flamefire
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in this PR)
taurusml15 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/8839ded3134029d6aaf37c7b4f11e5dd for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in this PR)
taurusml30 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/367eed6c55331269e8b222c855f6b2ea for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
taurusml13 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/59c27294935613e3016790b6f8a6e8c3 for a full test report.

@lexming
Contributor

lexming commented Sep 26, 2020

Test report by @lexming
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in this PR)
node157.hydra.os - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, Python 2.7.5
See https://gist.github.com/9b5f92b8854fc91b24949f3d98ae91bf for a full test report.

@lexming
Contributor

lexming commented Sep 26, 2020

Test report by @lexming
FAILED
Build succeeded for 1 out of 2 (2 easyconfigs in this PR)
node157.hydra.os - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, Python 2.7.5
See https://gist.github.com/3a8b3597ec5cedf39d12934a41384d64 for a full test report.

@Flamefire
Contributor Author

@lexming No idea why the CUDA test fails. Another PyTorch bug? Ignore? Retry?

@lexming
Contributor

lexming commented Oct 1, 2020

Test report by @lexming
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
node379.hydra.os - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz, Python 2.7.5
See https://gist.github.com/75fbbf933f58b71b4086ee3f0f723fa3 for a full test report.

@lexming
Contributor

lexming commented Oct 1, 2020

Test report by @lexming
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
node101.hydra.os - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, Python 2.7.5
See https://gist.github.com/ae3fcc91d5465ba473e2c39b1885d69e for a full test report.

Contributor

@lexming lexming left a comment

LGTM

@lexming
Contributor

lexming commented Oct 1, 2020

@Flamefire now it worked; the randomness of PyTorch tests is always baffling. Going in, thanks!

@lexming lexming merged commit 7976898 into easybuilders:develop Oct 1, 2020
@Flamefire Flamefire deleted the pytorch1.3.1-remove-glog-gflags branch October 2, 2020 08:40
4 participants