Skip to content

Conversation

@FransUrbo
Copy link
Contributor

@FransUrbo FransUrbo commented Dec 1, 2025

Motivation and Context

A previous commit (e865e78), the local keyword was removed because of bashism.

Removing bashisms are correct, however this caused multiple functions using the same variable to overwrite each other.
So make function variables unique in the (now) global namespace.

Plus additional shell scripting best practices changes and code cleanups.

Fixes: #17963

Description

Prevent the previously local variables, now global, from being overwritten by another function by renaming them to something unique.

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the OpenZFS code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • I have run the ZFS Test Suite with this change applied.
  • All commit messages are properly formatted and contain Signed-off-by.
  • This PR have been tested on Debian GNU/Linux Trixie as well as Testing (as of 12/Dec 2025).

@FransUrbo FransUrbo force-pushed the fix_global_variables branch from 5295b17 to 6f50e21 Compare December 1, 2025 18:44
Copy link
Contributor Author

@FransUrbo FransUrbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm perfectly happy to drop this commit.

I can't remember where I saw it, or when, but it's more than ten years ago, that "good shell script practices" was to write variables as ${variable}, not $variable.

Weather that's true or not, it IS annoying to me to see both side by side - PICK ONE!! :D :D.

Copy link
Member

@robn robn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good grief, what a mess. Good work tracking all this down.

One low-key suggestion: I wonder if it would be useful to prefix the "local" variables with an underscore, just to make them easier to see?

I have not tested and don't have this problem, but based on the description and changes I can't obviously see how it could do anything worse, assuming the original logic was fine. Approving on that basis.

Thanks!

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, this script clearly needed some attention. Thanks for the cleanup.

I haven't manually tested this either, but reading through it makes sense and is definitely an improvement.

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Dec 3, 2025
@FransUrbo
Copy link
Contributor Author

Thanx guys, but for complete transparency, I haven't tested it myself!

Now, I could setup a new initrd (which I've done in the past obviously, but it was quite a few years ago :) and do the test on say a Debian GNU/Linux box, but if this is used in many other places - it have at least at some point been used by Debian/kFreeBSD (which is no more unfortunately), Ubuntu and Gentoo and others.. I can't test them all!

However, I do (vaguely!) remember that we had a talk about this being in the ZFS repo itself, instead of letting each dist/os do their own. Our compromise was contrib - "no support, but if you want to use it, here it is". ?

Not meaning we should abandon it completely, but my thinking here is that we do the best "ocular inspection" we can, and if we all think it's ok, then we merge it and we fix any problems users/dists report.. ?

I know for a fact that the ONLY bug that's been reported (or at least the only one I've seen!) about this is that Debian GNU/Linux one, and it was for the testing dist - which is known to be a bit .. "fritzzy" at times :).

@FransUrbo
Copy link
Contributor Author

I'm perfectly happy to test this on at least one system if you want me to, I don't really have much else to do now for the next two weeks :).

One low-key suggestion: I wonder if it would be useful to prefix the "local" variables with an underscore, just to make them easier to see?

I think that would be very sensible thing to do, I'll update the commits.

@FransUrbo FransUrbo force-pushed the fix_global_variables branch from 6f50e21 to b38eec5 Compare December 3, 2025 10:40
@github-actions github-actions bot removed the Status: Accepted Ready to integrate (reviewed, tested) label Dec 3, 2025
@FransUrbo FransUrbo force-pushed the fix_global_variables branch 3 times, most recently from 251022f to 4dc3052 Compare December 3, 2025 11:35
@FransUrbo
Copy link
Contributor Author

I'm starting to wonder if this is actually the problem!

I can't reproduce the problem on Trixie (which, from what I can see, uses the same version as Testing).

@FransUrbo
Copy link
Contributor Author

I've asked the original Debian GNU/Linux submitter for more information, since I can't reproduce the problem (on Trixie, which is almost identical to the version in Testing).

The original bug report doesn't actually state that it is a problem with the initrd, I guess they only assumed it was?

We can hold off on this PR, since it doesn't actually solve the problem. Although, while trying to reproduce it, I noticed another edge case, and some code cleanups that may or may not be appropriate.

Let's wait and see what happens, if this can be reproduced in a better way.

@behlendorf
Copy link
Contributor

Let's wait and see what happens, if this can be reproduced in a better way.

That sounds good. I'd also be okay with applying all of the mechanical cleanup you did in 9873d41 as a starting point. That won't fix the bug, but if you can test it on at least one platform and it passes visual inspection it's pretty low risk.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Dec 4, 2025
@FransUrbo
Copy link
Contributor Author

Yeah, I've tested the new version, with all the commits applied in the PR.

It at least didn't break anything :). I do think it's prudent to apply the PR (eventually), just to make things easier to maintain.
But then we/I need to reword things here and there.

I'll see if I can get a VM installed with Debian GNU/Linux Testing in the same way just to make sure. I'm starting to believe that it's a problem "elsewhere". As in, not in the initrd.

@FransUrbo FransUrbo force-pushed the fix_global_variables branch 4 times, most recently from 18e102a to 3836534 Compare December 4, 2025 23:30
@FransUrbo
Copy link
Contributor Author

Can't reproduce the original problem on Testing either..

But the latest version of the script works fine, on both Trixie and Testing/Forky/Sid. So there's that.

Now we need to wait for the original bug reporter to come back with more information.

@FransUrbo
Copy link
Contributor Author

Since I can't reproduce the original bug on either Trixie or Testing, and it's been a week since I asked for more information on that Debian GNU/Linux bug report, but still nothing.

Doesn't really mean that much, it took me four months until I found out about it and had a look :).

So what do we do about this PR? Should we reword it, merge everything into one commit and call it "code cleanup and good practices reworking"? Or does anyone have another suggestion?

I do believe that the changes are worthwhile in the long run, even if it doesn't actually solve anything (if there indeed IS something to be "solved"!). They should help future changes, improvements and maintenance..

PS. Do we, OpenZFS, have a shell coding style practices? I know there is one (or used to be one for ZFSOnLinux anyway :) for C, but is there one for shell scripts?

@behlendorf
Copy link
Contributor

Well even if this change doesn't fix whatever the mystery bug is I think the code cleanup is definitely worthwhile. I'd actually even prefer that keep the patch stack broken up with each commit clearly describing the one thing it's cleaning up.

Do we, OpenZFS, have a shell coding style practices?

Not exactly. We don't have a style guide, but as a minimum bar for non-test scripts we do want them to cleanly pass shellcheck. The CI will run the make shellcheck target to enforce that. We have excluded some legacy scripts from that check and this might have been one of them. If you can make sure it's now getting checked by make shellcheck and is passing I'm fine with merging this cleanup.

@FransUrbo
Copy link
Contributor Author

No, the shellsheck is run, it got triggered a few times but I fixed, rebased and force pushed..

I'll just update the initial commit message then, the one that talks about "the bug" and make it more nondescript :).

In a previous commit (e865e78), the
`local` keyword was removed in functions because of bashism.

Removing bashisms is correct, however this could cause variable overwrites,
since several functions use the same variable name.

So this commit make function variables unique in the (now) global name
space.

The problem from the original bug report (see openzfs#17963) could not be duplicated,
but it is still sane to make sure that variables stay unique.

Fixes: openzfs#17963
Signed-off-by: Turbo Fredriksson <[email protected]>
It's considered good practice to:
1) Wrap the variable name in `{}`.
   As in `${variable}` instead of `$variable`.
2) Put variables in `"`.

Also some minor error message tuning.

Signed-off-by: Turbo Fredriksson <[email protected]>
This just to make them easier to see.

Signed-off-by: Turbo Fredriksson <[email protected]>
When a pool is degraded, or needs special action, the `zpool import`
(without pool to import) line will report:
```
  pool: rpool
    id: 01234567890123456789
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
   [..]
```
If the import with the pool name fails, it is supposed to try importing
using the pool ID.

However, the script is also getting the `action` line (and probably `scrub:`
if/when that's available):
  pool; The pool can be imported using its name or numeric identifier.;config:;
which causes issues on consequent import attempts.

Cleanup the information by rewriting the `sed` command line.

Signed-off-by: Turbo Fredriksson <[email protected]>
The file `/etc/default/zfs` is already sourced by the `/etc/zfs/zfs-functions`,
so no need to source it again.

Signed-off-by: Turbo Fredriksson <[email protected]>
The `ZFS_INITRD_ADDITIONAL_DATASETS` variable is used in the initrd
script to boot additional OS file systems besides the root file system.
But it wasn't included as an example in the config files.

The `ZFS_POOL_EXCEPTIONS` *was* included in the example defaults file,
but it was not exported, so not available in the initrd.

Signed-off-by: Turbo Fredriksson <[email protected]>
More code standard changes, where if/then is on different lines.
To have it on the same, or on different lines, can be argued, but
we need to pick one, and try not to mix how to do things.

Signed-off-by: Turbo Fredriksson <[email protected]>
There's no real documenation (which should probably be written!),
so instead document the code the best we can on what's going and
with the mounting of file systems to make future updates easier.

Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Turbo Fredriksson <[email protected]>
@FransUrbo FransUrbo force-pushed the fix_global_variables branch from 3836534 to 49059d0 Compare December 12, 2025 21:33
The `type` command is an optional feature in POSIX, so shouldn't be
used.

Instead, use `command -v`, which commit
  e865e78
did, but it missed this file.

Signed-off-by: Turbo Fredriksson <[email protected]>
@FransUrbo FransUrbo force-pushed the fix_global_variables branch from 49059d0 to f172c41 Compare December 12, 2025 21:46
@FransUrbo
Copy link
Contributor Author

FransUrbo commented Dec 12, 2025

I found another file with bashisms that I fixed "while I'm at it" :): etc/zfs/zfs-functions.

However, the checkstyle check does go through that file, but the bashisms wasn't triggered! So I'm not sure we can trust it.. ?

Also, when running the make checkstyle myself in my Testing VM, I get:

[...]
SHELLCHECK scripts/mancheck.sh
SHELLCHECK scripts/paxcheck.sh
scripts/paxcheck.sh:13:6: note: Avoid x-prefix in comparisons as it no longer serves a purpose. [SC2268]
scripts/paxcheck.sh:29:6: note: Avoid x-prefix in comparisons as it no longer serves a purpose. [SC2268]
make: *** [Makefile:14395: shellcheck-here-scripts^paxcheck.sh] Error 1

I'm running shellcheck v0.11.0.

EDIT: There's quite a few triggers by shellcheck, so I'll add that as a commit as well.

@FransUrbo
Copy link
Contributor Author

Let me know how you want to proceed regarding the commits.

Newer versions of `shellcheck` and `checkbashism` finds more than
previous, so fix those.

Signed-off-by: Turbo Fredriksson <[email protected]>
@FransUrbo
Copy link
Contributor Author

I'm happy with what it looks like now, let me know how to proceed.

The last shellcheck/checkbashism commit seem to work on the older (?) versions the CI is running, it passes successfully.

I've also tested these last versions of all the changes I made on my VM, and it works just fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Code Review Needed Ready for review and testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

initramfs-scripts: more elegant canmount property handling?

4 participants