Avoid leaking keys by mistake with --upload-test-report #4877

Merged
merged 12 commits into easybuilders:develop from feature-exclude_env_from_report
May 21, 2025

Conversation

Crivella
Contributor

@Crivella Crivella commented May 14, 2025

Reasoning

While setting EASYBUILD_TEST_REPORT_ENV_FILTER is a viable way to avoid publishing sensitive information from the environment when running a build with --upload-test-report, there should also be a way for easyblocks that read keys from the environment to tell EasyBuild which variables are sensitive, so that they are excluded automatically.

This would guard against mistakes in EASYBUILD_TEST_REPORT_ENV_FILTER, and against leaks when a PR author is asked to run a build with --upload-test-report because the reviewer does not have access to a key for testing.

The changes in this PR make it much more difficult to leak private keys/secrets from the environment when running a build with --upload-test-report.

Changes introduced

  • Added filtering of environment variables based on their value, using regex patterns
    • Added a default list of regexes for known token patterns, which are excluded automatically from the reports
  • Added filtering of environment variables based on their name, using string patterns
    • Added a default list of string patterns; a variable is excluded automatically from the report if its name contains any of these patterns (case-insensitive)
    • Added a function exclude_env_from_report_add to add a string pattern to be excluded from reports
    • Added a function exclude_env_from_report_clear to clear the custom list of patterns to be excluded (does not modify the default list)
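
The two kinds of filtering listed above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from this PR: the function name `filter_environment`, the constant names, the `TOKEN`/`SECRET` entries and the token-shaped regex are all made up for the example (only `LICENSE`/`LICENCE` are visible in the diff context further down).

```python
import re

# Name patterns: LICENSE/LICENCE show up in the PR's default list,
# TOKEN and SECRET are assumed here purely for illustration.
EXCLUDE_NAME_PATTERNS = ['LICENSE', 'LICENCE', 'TOKEN', 'SECRET']
# Value regexes: an assumed example, shaped like a classic GitHub personal access token.
EXCLUDE_VALUE_REGEXES = [re.compile(r'^ghp_[A-Za-z0-9]{36}$')]


def filter_environment(env, extra_name_patterns=()):
    """Return a copy of env without variables that look sensitive (hypothetical helper)."""
    name_patterns = [p.upper() for p in EXCLUDE_NAME_PATTERNS + list(extra_name_patterns)]
    kept = {}
    for name, value in env.items():
        # name-based exclusion: case-insensitive substring match on the variable name
        if any(pattern in name.upper() for pattern in name_patterns):
            continue
        # value-based exclusion: the value itself looks like a known token format
        if any(regex.search(value) for regex in EXCLUDE_VALUE_REGEXES):
            continue
        kept[name] = value
    return kept


env = {
    'PATH': '/usr/bin',
    'github_token': 'not-token-shaped',      # dropped because of its name
    'DEPLOY_CREDENTIAL': 'ghp_' + 'a' * 36,  # dropped because of its value
    'MY_API_KEY': 'abc123',                  # dropped only via the extra pattern
}
print(sorted(filter_environment(env, extra_name_patterns=['api_key'])))  # ['PATH']
```

Passing the extra patterns as an argument, instead of storing them in module state, is a choice made for this sketch only; the PR as described keeps a custom list that is filled through exclude_env_from_report_add.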

Possible list of missing improvements

  • Add two functions to add to and clear a list of regex patterns for the values
  • Add a build option to force ignoring the default exclude patterns

Possible unexpected behavior

Right now a matching environment variable is excluded from all reports.
Another possibility would be to exclude it only from uploaded gists.

  • This excludes the matching variables globally, not only for the ECs run with the easyblock in question, and could lead to other variables being excluded from reports in a Bundle build or similar

@Crivella
Contributor Author

Were the parts like

        for pattern in patterns[:2]:
            self.assertIn(pattern, res['full'])

supposed to test res['full'] again, or are they supposed to test res['overview']?

@Crivella
Contributor Author

As suggested in #4877 (comment)
I've also tried replacing those full with overview and the test suite still passes, so I guess that was the intended use case (otherwise the same patterns were being tested twice with nothing in between to justify it)

@Crivella Crivella marked this pull request as ready for review May 15, 2025 14:51
@Crivella Crivella changed the title Added possibility to exclude environment variables from a report on-demand Avoid leaking keys by mistake with --upload-test-reports May 15, 2025
@Crivella Crivella changed the title Avoid leaking keys by mistake with --upload-test-reports Avoid leaking keys by mistake with --upload-test-report May 15, 2025
ocaisa
ocaisa previously requested changes May 16, 2025
@branfosj
Member

Should we also add a warning message, something like:

The test report generated by `--upload-test-report` will be uploaded as a GitHub Gist. This will include most environment variables. EasyBuild attempts to filter out ones that include tokens, secret keys, etc., but this filtering may not be perfect. You can add additional environment filtering using `--test-report-env-filter` - see https://docs.easybuild.io/integration-with-github/#github_test_report_env_filter

Also, if it is possible to tell the user to check what will appear, then we could include instructions of generating a report to see what is in it.

@Crivella
Contributor Author

Fixed (0e29f98) the failing test that was introduced by the change at

    def test_test_report_env_filter(self):
        """Test use of --test-report-env-filter."""

        def toy(extra_args=None):
            """Build & install toy, return contents of test report."""
            eb_file = os.path.join(os.path.dirname(__file__), 'easyconfigs', 'test_ecs', 't', 'toy', 'toy-0.0.eb')
            args = [
                eb_file,
                '--force',
                '--debug',
            ]
            if extra_args is not None:
                args.extend(extra_args)
            with self.mocked_stdout_stderr():
                self.eb_main(args, do_build=True, raise_error=True, verbose=True)

            software_path = os.path.join(self.test_installpath, 'software', 'toy', '0.0')
            test_report_path_pattern = os.path.join(software_path, 'easybuild', 'easybuild-toy-0.0*test_report.md')
            test_report_txt = read_file(glob.glob(test_report_path_pattern)[0])
            return test_report_txt

        # define environment variables that should (not) show up in the test report
        test_var_secret = 'THIS_IS_JUST_A_SECRET_ENV_VAR_FOR_EASYBUILD'
        os.environ[test_var_secret] = 'thisshouldremainsecretonrequest'
        test_var_secret_regex = re.compile(test_var_secret)
        test_var_public = 'THIS_IS_JUST_A_PUBLIC_ENV_VAR_FOR_EASYBUILD'
        os.environ[test_var_public] = 'thisshouldalwaysbeincluded'
        test_var_public_regex = re.compile(test_var_public)

        # default: no filtering
        test_report_txt = toy()
        self.assertTrue(test_var_secret_regex.search(test_report_txt))
        self.assertTrue(test_var_public_regex.search(test_report_txt))

        # filter out env vars that match specified regex pattern
        filter_arg = "--test-report-env-filter=.*_SECRET_ENV_VAR_FOR_EASYBUILD"
        test_report_txt = toy(extra_args=[filter_arg])
        res = test_var_secret_regex.search(test_report_txt)
        self.assertFalse(res, "No match for %s in %s" % (test_var_secret_regex.pattern, test_report_txt))
        self.assertTrue(test_var_public_regex.search(test_report_txt))

        # make sure that used filter is reported correctly in test report
        filter_arg_regex = re.compile(r"--test-report-env-filter='.\*_SECRET_ENV_VAR_FOR_EASYBUILD'")
        tup = (filter_arg_regex.pattern, test_report_txt)
        self.assertTrue(filter_arg_regex.search(test_report_txt), "%s in %s" % tup)

@Crivella
Contributor Author

Also, if it is possible to tell the user to check what will appear, then we could include instructions of generating a report to see what is in it.

Is there a way to upload a test report of an already performed run?
Otherwise I think the best solution would be to add a new option for a dry-run mode of --upload-test-report to show what the report would look like

@Crivella
Contributor Author

I am not sure the failed CI is related to this PR?

@ocaisa
Member

ocaisa commented May 16, 2025

Tests are taking crazy long, I think if you sync with develop things will improve (there were a few PRs related to this today)

@Crivella Crivella force-pushed the feature-exclude_env_from_report branch from 0e29f98 to 04476d3 Compare May 16, 2025 13:22
@Crivella
Contributor Author

Rebased onto develop

'LICENSE',
'LICENCE',
]
DEFAULT_EXCLUDE_FROM_REPORT_RGX = [
Member

We use regex rather than rgx quite consistently across the EasyBuild codebase, so I prefer also using it here.

Also, the name of the constant should be a bit more descriptive, it's too vague/broad, so something like:

Suggested change
DEFAULT_EXCLUDE_FROM_REPORT_RGX = [
DEFAULT_EXCLUDE_FROM_TEST_REPORT_VALUE_REGEX = [

Similar above for DEFAULT_EXCLUDE_FROM_REPORT, which I would rename to something like DEFAULT_EXCLUDE_FROM_TEST_REPORT_ENV_VAR_NAMES

Constants are here to stay, they effectively become part of the EasyBuild framework API, so we better try and make sure they have names that don't leave much room for guessing...

Contributor Author

done in 9cf65b5

_exclude_env_from_report.append(key.upper())


def exclude_env_from_report_clear():
Member

This is only used in the tests, it's fine to twiddle with internal things there, so I wouldn't define a custom function for this, and just play with _exclude_env_from_report directly in the tests?

Likewise for exclude_env_from_report_add

@@ -58,6 +59,48 @@

_log = fancylogger.getLogger('testing', fname=False)

_exclude_env_from_report = []
Member

It seems like the only reason that this is added is so we have something to play with in the tests?

That's a bad pattern, I think it's better to pull tricks in the tests (like changing constants in place and then restoring their original value in the tearDown of the tests) rather than introducing global variables that are only there to play with in the test suite.
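
A minimal sketch of the "change a constant in place, restore it in tearDown" approach suggested here, assuming a stand-in constant and helper (`DEFAULT_EXCLUDE_NAMES` and `is_excluded` are hypothetical names for this example, not the framework's):

```python
import unittest

# Stand-in for a module-level constant in the framework (hypothetical name and contents).
DEFAULT_EXCLUDE_NAMES = ['SECRET', 'TOKEN']


def is_excluded(name):
    """Check whether a variable name contains one of the excluded patterns."""
    return any(pattern in name.upper() for pattern in DEFAULT_EXCLUDE_NAMES)


class ExcludeTest(unittest.TestCase):
    def setUp(self):
        # remember the original contents so every test starts from the same state
        self.orig_exclude_names = DEFAULT_EXCLUDE_NAMES[:]

    def tearDown(self):
        # restore in place, so other modules holding a reference see the original list
        DEFAULT_EXCLUDE_NAMES[:] = self.orig_exclude_names

    def test_custom_pattern(self):
        self.assertFalse(is_excluded('MY_API_KEY'))
        DEFAULT_EXCLUDE_NAMES.append('API_KEY')
        self.assertTrue(is_excluded('MY_API_KEY'))
```

Restoring with slice assignment rather than rebinding the name keeps the same list object alive, which matters if other code imported the constant directly.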

Contributor Author

My idea was for these functions to also be used outside of the framework, e.g. for an easyblock to exclude from reports a specific environment variable that is known to contain a secret

Contributor Author

I am not sure if there would be a better solution to implement excluding variables on demand.
In case we want more discussion around that, I think the default excludes should be included ASAP, so I could split this PR in two if needed

Member

The fact that _exclude_env_from_report is a global variable here is a problem, since that means it'll be shared across multiple easyblocks used in a single EasyBuild session...

environment += ["%s = %s" % (key, value)]

environment = list(filter(
    lambda x: not any(y in x.upper() for y in DEFAULT_EXCLUDE_FROM_REPORT + _exclude_env_from_report),
Member

Can't we also tackle that part above?

I.e. only add a key-value pair if the key (env var name) doesn't have a partial match with anything in DEFAULT_EXCLUDE_FROM_TEST_REPORT_ENV_VAR_NAMES?

The combination of filter with a lambda expression seems way too involved for what's going on here, I'd rather see a 2nd simple condition with a continue above?
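
The suggestion above could look roughly like this: a plain loop with a second `continue` condition on the variable name, instead of filtering the finished list with a lambda. This is a sketch under assumptions, not the code that landed in 4cf9c83; `environment_lines` and its parameters are hypothetical names.

```python
import re


def environment_lines(env, exclude_name_patterns, env_filter=None):
    """Build 'key = value' report lines, skipping excluded variables (hypothetical helper)."""
    lines = []
    for key in sorted(env):
        # first condition: user-supplied regex, in the spirit of --test-report-env-filter
        if env_filter is not None and env_filter.search(key):
            continue
        # second condition: partial, case-insensitive match on the variable name
        if any(pattern in key.upper() for pattern in exclude_name_patterns):
            continue
        lines.append("%s = %s" % (key, env[key]))
    return lines


env = {'A_SECRET': '1', 'HOME': '/home/user', 'PATH': '/bin'}
print(environment_lines(env, ['SECRET'], env_filter=re.compile('^HOME$')))  # ['PATH = /bin']
```

One difference worth noting: as far as the quoted fragment shows, the filter/lambda version tested the whole "key = value" string, so a pattern could also match inside a value, whereas this loop looks at the name only.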

Contributor Author

done in 4cf9c83

@boegel
Member

boegel commented May 21, 2025

Also, if it is possible to tell the user to check what will appear, then we could include instructions of generating a report to see what is in it.

Is there a way to upload a test report of an already performed run? Otherwise I think the best solution would be to add a new option for a dry-run mode of --upload-test-report to show what the report would look like

No, there's not, but there's --dump-test-report so you can inspect it locally.

Member

@boegel boegel left a comment

lgtm

@boegel boegel dismissed ocaisa’s stale review May 21, 2025 15:25

requested changes made

@boegel boegel merged commit 4b57587 into easybuilders:develop May 21, 2025
37 checks passed
@Crivella Crivella deleted the feature-exclude_env_from_report branch May 21, 2025 15:28
4 participants