Skip to content

Embed binary files as base64 encoded strings #3421

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 14, 2025

Conversation

jandubois
Copy link
Member

They need to be broken into smaller lines; yqlib doesn't do this, and yamlfmt fails with a buffer overflow when it tries to keep existing line breaks.

Fixes #3403

@jandubois jandubois added this to the v1.1.0 milestone Apr 11, 2025
@jandubois jandubois requested a review from a team April 11, 2025 06:51
@jandubois
Copy link
Member Author

I'm going to switch this to draft because I just thought of 2 changes I want to make:

  • Add an INFO level log message when a file is base64 encoded
  • Check max line length for scripts / text data files, and encode them as well if they are too long

@jandubois jandubois marked this pull request as draft April 11, 2025 16:08
@jandubois jandubois marked this pull request as ready for review April 11, 2025 16:56
@jandubois
Copy link
Member Author

jandubois commented Apr 11, 2025

I'm going to switch this to draft because I just thought of 2 changes I want to make:

  • Add an INFO level log message when a file is base64 encoded
  • Check max line length for scripts / text data files, and encode them as well if they are too long

Implemented:

cat tmpl.yaml
base: template://alpine
provision:
- mode: data
  file: sample.data
  path: /tmp/dataperl -E 'say "A" x 66000' > sample.datalimactl tmpl copy --embed tmpl.yaml /dev/null
WARN[0000] `base` is experimental
WARN[0000] `provision[*].file` and `probes[*].file` are experimental
INFO[0000] File "/Users/jan/suse/lima/sample.data" is being base64 encoded: line 1 (offset 0) is longer than 65000 characterscp /bin/cat sample.datalimactl tmpl copy --embed tmpl.yaml /dev/null
WARN[0000] `base` is experimental
WARN[0000] `provision[*].file` and `probes[*].file` are experimental
INFO[0000] File "/Users/jan/suse/lima/sample.data" is being base64 encoded: unprintable character '\x00' at offset 4

@jandubois
Copy link
Member Author

jandubois commented Apr 11, 2025

Full example:

cat tmpl.yaml
base: template://alpine
provision:
- mode: system
  file: my.scriptecho -e 'echo "Hello world\a\n"' > my.scripthexdump -C my.script
00000000  65 63 68 6f 20 22 48 65  6c 6c 6f 20 77 6f 72 6c  |echo "Hello worl|
00000010  64 07 0a 22 0a                                    |d..".|
00000015limactl start -y tmpl.yaml
WARN[0000] `base` is experimental
WARN[0000] `provision[*].file` and `probes[*].file` are experimental
INFO[0000] File "/Users/jan/suse/lima/my.script" is being base64 encoded: unprintable character '\a' at offset 17
grep -A2 provision ~/.lima/tmpl/lima.yaml
provision:
- mode: system
  script: !!binary ZWNobyAiSGVsbG8gd29ybGQHCiIKlimactl shell tmpl sudo grep Hello /var/log/cloud-init-output.log
Hello world

@jandubois
Copy link
Member Author

Sorry, did one more force-push to move emcodeScriptReason just behind the definition of maxLineLength, and before binaryScript. Feels more logical to me this way.

@jandubois jandubois force-pushed the binary-strings branch 2 times, most recently from 63f9a5d to 3833782 Compare April 12, 2025 18:49
@jandubois
Copy link
Member Author

Not that it really matters, but I've decided that maxLineLength should include the newline at the end, so I've updated the PR once more.

They need to be broken into smaller lines; yqlib doesn't do this,
and yamlfmt fails with a buffer overflow when it tries to keep
existing line breaks.

Signed-off-by: Jan Dubois <[email protected]>
// yamlfmt will fail with a buffer overflow while trying to retain line breaks if the line
// is longer than 64K. We will encode all text files that have a line that comes close.
// maxLineLength is a constant; it is only a variable for the benefit of the unit tests.
var maxLineLength = 65000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it's a good idea to declare maxLineLength as var instead of const.
For unit tests, we can use strings.Repeat to construct a very long string.

Suggested change
var maxLineLength = 65000
const maxLineLength = 65000

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was using strings.Repeat before, but changed it in response to a request from @AkihiroSuda to add a test for `maxLineLength'.

It is not an exported variable, and the comment right above says it is only mutable so unit tests can mock it. Unit tests should use constant data for "expected" results. Otherwise a thinko in the implementation of the function that is being tested can carry over in the test function constructing the test data and cancel out the error in the implementation. Not that it matters in a trivial case like this.

I think it is fine as is.

@jandubois jandubois requested a review from AkihiroSuda April 13, 2025 19:43
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@AkihiroSuda AkihiroSuda merged commit bda2786 into lima-vm:master Apr 14, 2025
31 checks passed
@jandubois jandubois deleted the binary-strings branch April 14, 2025 05:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Embedding a larger binary file doesn't work
3 participants