@tanmoy1989
Contributor

@tanmoy1989 tanmoy1989 commented Aug 12, 2024

…ore-13.2.0.eb, dill-0.3.8-GCCcore-13.2.0.eb, flatbuffers-python-23.5.26-GCCcore-13.2.0.eb, grpcio-1.57.0-GCCcore-13.2.0.eb, ml_dtypes-0.4.0-gfbf-2023b.eb, nsync-1.29.2-GCCcore-13.2.0.eb
@boegel boegel changed the title {devel}[GCCcore/13.2.0] TensorFlow v2.15.1, Bazel v6.3.1, dill v0.3.8, ... {devel}[foss/2023b] TensorFlow v2.15.1, Bazel v6.3.1, dill v0.3.8, ... Aug 14, 2024
@boegel boegel added the update label Aug 14, 2024
@boegel boegel added this to the 4.x milestone Aug 14, 2024
@boegel
Member

boegel commented Aug 14, 2024

@tanmoy1989 A bunch of patch files for Bazel are missing?

@tanmoy1989
Contributor Author

@boegel: thanks, done!

}),
('Werkzeug', '3.0.2', {
'source_tmpl': SOURCELOWER_TAR_GZ,
'source_tmpl': '%(namelower)s-%(version)s.tar.gz',
Contributor

Why this change?

Contributor Author

@tanmoy1989 tanmoy1989 Sep 16, 2024

Honestly, the change was neither intentional nor manual. It appeared when I ran "eb --inject-checksums=sha256 --force Tensorflow-2.15.1-foss-2023b.eb" to include the two new patches, probably due to reformatting by EasyBuild/Python?

Contributor

I opened an issue for that: easybuilders/easybuild-framework#4695
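
For context, here is a minimal sketch of why the two `source_tmpl` values in the diff above should resolve to the same file name. It assumes that `SOURCELOWER_TAR_GZ` is EasyBuild's template constant for `%(namelower)s-%(version)s.tar.gz`. The `resolve_source_tmpl` helper is hypothetical and only mimics the template substitution; it is not part of EasyBuild's API.

```python
# Assumed value of EasyBuild's SOURCELOWER_TAR_GZ template constant.
SOURCELOWER_TAR_GZ = '%(namelower)s-%(version)s.tar.gz'


def resolve_source_tmpl(tmpl, name, version):
    """Hypothetical helper: fill in the name/version templates of a source_tmpl."""
    values = {'name': name, 'namelower': name.lower(), 'version': version}
    return tmpl % values


# The constant and its expanded string form give the same source file name.
constant_form = resolve_source_tmpl(SOURCELOWER_TAR_GZ, 'Werkzeug', '3.0.2')
expanded_form = resolve_source_tmpl('%(namelower)s-%(version)s.tar.gz', 'Werkzeug', '3.0.2')
print(constant_form, expanded_form)
```

Under that assumption the rewrite is cosmetic: the download target is unchanged, only the readability of the easyconfig suffers.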

@pavelToman
Collaborator

Is there any progress on this PR? I would use dill-0.3.8 in my PR.

@pavelToman
Collaborator

@boegelbot please test @ generoso

@tanmoy1989
Contributor Author

@pavelToman Thanks for triggering the bot. I am not aware of any further progress; I am just waiting for someone to review it.

@akesandgren
Contributor

Test report by @akesandgren
SUCCESS
Build succeeded for 7 out of 7 (5 easyconfigs in total)
b-cn1613.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, AMD EPYC 7313 16-Core Processor, Python 3.10.12
See https://gist.github.com/akesandgren/4a18f42922387d91999f7b2e16bd9af3 for a full test report.

'TensorFlow-2.15.1_fix-pybind11-build.patch',
'TensorFlow-2.15.1_fix-AVX512-eigen-compilation.patch',
'TensorFlow-2.15.1_fix-AVX512-eigen-compilation-gcc13.patch',
'TensorFlow-2.15.1_upgrade-ml_dtypes-dependency-version.patch',
Contributor

Please add the disable-avx512-extensions.patch

@akesandgren
Contributor

@tanmoy1989 Please also fix the permissions on Bazel-6.3.1_add-symlinks-in-runfiles.patch; it should not be executable.
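
The permission fix requested above can be checked locally before pushing. The following is a minimal sketch using only the Python standard library; the helper names `is_executable` and `clear_executable` are hypothetical, and the path you pass in is whatever patch file you want to inspect.

```python
import os
import stat

# All three execute bits (user, group, other).
EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def is_executable(path):
    """Return True if any execute bit is set on the file."""
    return bool(os.stat(path).st_mode & EXEC_BITS)


def clear_executable(path):
    """Remove all execute bits, leaving the read/write bits untouched."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~EXEC_BITS)
```

Calling `clear_executable` on a patch file with mode 755 leaves it at 644. Git records the executable bit, so the mode change still has to be committed.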

@tanmoy1989
Contributor Author

tanmoy1989 commented Jun 19, 2025

@akesandgren thanks, I changed the file permissions! I also added the patch file to the TensorFlow EC as you mentioned.

@akesandgren
Contributor

Doh, you need to update the checksum for Bazel-6.3.1_add-symlinks-in-runfiles.patch since you changed the content, or just re-add that space you removed
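
To illustrate why removing a single space invalidates the recorded checksum, here is a small sketch with `hashlib`. The patch content below is made up for the example, and `sha256_of_file` is a hypothetical helper for computing the value to put in the easyconfig; re-running `eb --inject-checksums` as mentioned earlier in this thread is the other way to regenerate it.

```python
import hashlib


def sha256_of_bytes(data):
    """Return the hex SHA256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical patch lines that differ only by one trailing space.
with_space = b'+ln -s target link \n'
without_space = b'+ln -s target link\n'
checksums_differ = sha256_of_bytes(with_space) != sha256_of_bytes(without_space)
print(checksums_differ)
```

Any byte-level change to a patch, whitespace included, produces a different digest, so either the checksum in the easyconfig or the file content has to be brought back in line.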

@akesandgren
Contributor

Test report by @akesandgren
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
b-cn1613.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, AMD EPYC 7313 16-Core Processor, Python 3.10.12
See https://gist.github.com/akesandgren/e903f1d6629f2b751a432f3ca6581116 for a full test report.

@bedroge
Contributor

bedroge commented Jun 23, 2025

Test report by @bedroge
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
node24 - Linux Rocky Linux 8.10, x86_64, AMD EPYC 7763 64-Core Processor (zen3), Python 3.6.8
See https://gist.github.com/bedroge/82f0479848647d1f0dc8f8ce04c58191 for a full test report.

akesandgren previously approved these changes Jun 24, 2025
Contributor

@akesandgren akesandgren left a comment

LGTM

@akesandgren
Contributor

I withdraw that approval: you still haven't removed tensorboard from the exts_list and replaced grpcio with the external tensorboard.
@tanmoy1989 Please try that with either 2.18.0 or the 2.15.1 that we have in PR #23080.

@akesandgren akesandgren dismissed their stale review June 24, 2025 08:47

Still not using external tensorboard

@tanmoy1989
Contributor Author

@akesandgren: Sure, just to confirm what I need to do:

  1. Remove tensorboard from exts_list
  2. Remove grpcio/1.67.1 from the dependencies
  3. Add tensorboard/2.15.1 from PR #23080 ({lib}[gfbf/2023b] tensorboard v2.15.1) to the dependencies

Could you please let me know whether that is everything, or whether I have missed anything?

@akesandgren
Contributor

@tanmoy1989 Quoting from my previous comment on this:
"Then remove all extensions that tensorboard already has. See the existing TensorFlow-2.15.1-foss-2023a.eb for which ones it doesn't need when using tensorboard as a standalone dependency"
Or look at which extensions are used in the respective standalone tensorboard ECs and remove them from TensorFlow.
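
The clean-up described above amounts to filtering TensorFlow's `exts_list` against the extensions the standalone tensorboard easyconfig already ships. Here is a rough sketch of that comparison; `drop_provided_extensions` is a hypothetical helper, and the lists are illustrative only (in particular, `example-ext` is made up and the real tensorboard extension list has to be read from its easyconfig).

```python
def drop_provided_extensions(exts_list, provided_names):
    """Return the entries of exts_list whose name is not already provided.

    Names are compared case-insensitively; no other normalization is done.
    """
    provided = {name.lower() for name in provided_names}
    return [ext for ext in exts_list if ext[0].lower() not in provided]


# Illustrative exts_list entries in the (name, version, options) form used by easyconfigs.
tensorflow_exts = [
    ('Werkzeug', '3.0.2', {}),
    ('tensorboard', '2.15.1', {}),
    ('example-ext', '1.0', {}),  # hypothetical extension that only TensorFlow needs
]

# Assumed for this example: names taken from the standalone tensorboard easyconfig.
tensorboard_provides = ['Werkzeug', 'tensorboard']

remaining = drop_provided_extensions(tensorflow_exts, tensorboard_provides)
print([ext[0] for ext in remaining])
```

Whatever remains is what stays in TensorFlow's `exts_list`, with tensorboard itself moved to the dependencies.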

@tanmoy1989
Contributor Author

@akesandgren sorry, I missed that! Hopefully, I have now updated everything! Thanks for checking!

@akesandgren akesandgren changed the title {devel}[foss/2023b] TensorFlow v2.15.1, Bazel v6.3.1, dill v0.3.8, flatbuffers-python v23.5.26, grpcio v1.57.0, ml_dtypes v0.4.0, nsync v1.29.2 {devel}[foss/2023b] TensorFlow v2.15.1, Bazel v6.3.1, flatbuffers-python v23.5.26, ml_dtypes v0.4.0 Jun 24, 2025
@akesandgren
Contributor

Test report by @akesandgren
FAILED
Build succeeded for 3 out of 4 (4 easyconfigs in total)
b-cn1611.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, AMD EPYC 7313 16-Core Processor, 1 x NVIDIA NVIDIA A100 80GB PCIe, 555.58.02, Python 3.10.12
See https://gist.github.com/akesandgren/0f586a1cbedd1749078fac5b555227a8 for a full test report.

@akesandgren
Contributor

Test report by @akesandgren
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
b-cn1611.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, AMD EPYC 7313 16-Core Processor, 1 x NVIDIA NVIDIA A100 80GB PCIe, 555.58.02, Python 3.10.12
See https://gist.github.com/akesandgren/ba0f0029d13f10b814d256b57685c98a for a full test report.

@akesandgren
Contributor

Hmmm, since we already have tensorboard 2.18.0 as a dependency for other packages in 2023b, can you try changing to that?
It's fairly likely to work, but if it doesn't, we'll drop back to 2.15.1 and add an exception for tensorboard.

@akesandgren
Contributor

Test report by @akesandgren
FAILED
Build succeeded for 4 out of 5 (4 easyconfigs in total)
b-cn1611.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, AMD EPYC 7313 16-Core Processor, 1 x NVIDIA NVIDIA A100 80GB PCIe, 555.58.02, Python 3.10.12
See https://gist.github.com/akesandgren/fe226c4ef4368d394acfc987ca7da1a1 for a full test report.

@akesandgren
Contributor

OK, TF 2.15 can't use tensorboard 2.18, so we'll need to go back to 2.15.1 and add an exception for that.
@boegel I don't remember how to do that. Can you assist here?

@Thyre Thyre added the 2023b label Aug 18, 2025