Skip to content

Conversation

rfredette
Copy link
Contributor

@rfredette rfredette commented Apr 20, 2023

This PR makes the router parse the certificates supplied in ROUTER_MUTUAL_TLS_AUTH_CA, download CRLs from any CRL distribution points that are found, and write them to a file that is supplied to HAProxy for mTLS certificate validation.

It also includes a fix for an issue where, when a CRL is distributed in a certificate that is not its issuer, the CRL would sometimes be missing or two copies of the CRL would be present.

@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 20, 2023
@openshift-ci-robot
Copy link
Contributor

@rfredette: This pull request references Jira Issue OCPBUGS-9464, which is invalid:

  • expected the bug to target the "4.14.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

The CRLs on disk don't include enough information to accurately determine which CA certificate they were downloaded for. Instead of re-parsing the CRLs on disk, keep them in a map indexed by the CA's
subject key, so we can avoid unnecessary downloads and incorrect attribution.

This change requires #447 , so until that merges, this includes the changes from that PR as well. As of this writing, only the commit "Cache current CRLs in memory instead of re-parsing from disk" (36ebc38) is new.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested review from gcs278 and Miciah April 20, 2023 22:06
@rfredette
Copy link
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 20, 2023
@openshift-ci-robot
Copy link
Contributor

@rfredette: This pull request references Jira Issue OCPBUGS-9464, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.14.0) matches configured target version for branch (4.14.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @lihongan

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@frobware
Copy link
Contributor

/assign

@rfredette rfredette changed the title OCPBUGS-9464: Cache current CRLs in memory instead of re-parsing from disk OCPBUGS-6661, OCPBUGS-9464: Handle mTLS CRLs, and fix accidental CRL duplication May 17, 2023
@rfredette rfredette force-pushed the ocpbugs-9464-duplicate-crls branch from bef0660 to e701475 Compare May 17, 2023 23:03
@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. and removed jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. labels May 17, 2023
@openshift-ci-robot
Copy link
Contributor

@rfredette: This pull request references Jira Issue OCPBUGS-9464, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.14.0) matches configured target version for branch (4.14.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @lihongan

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

This PR makes the router parse the certificates supplied in ROUTER_MUTUAL_TLS_AUTH_CA, download CRLs from any CRL distribution points that are found, and write them to a file that is supplied to HAProxy for mTLS certificate validation.

It also includes a fix for an issue where, when a CRL is distributed in a certificate that is not its issuer, the CRL would sometimes be missing or two copies of the CRL would be present.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

{{- with (env "ROUTER_MUTUAL_TLS_AUTH_CA") }} ca-file {{. }} {{ else }} ca-file /etc/ssl/certs/ca-bundle.trust.crt {{ end }}
{{- with (env "ROUTER_MUTUAL_TLS_AUTH_CRL") }} crl-file {{. }} {{ end }}
{{- if (ne (env "ROUTER_MUTUAL_TLS_AUTH_CRL") "") }}
{{- with (env "ROUTER_MUTUAL_TLS_AUTH_CA") }} ca-file {{ . }} {{- else }} ca-file /etc/ssl/certs/ca-bundle.trust.crt {{ end }}
Copy link
Contributor

@frobware frobware May 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some changes here:

  • previously: {{. }}, now: {{ . }} -- innocuous
  • previously: {{ else }}, now {{- else }} -- but now we trim all the whitespace before the action

Is the last change intentional? If so, for what reason?
My comment holds true for the other template change below too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not an intentional change; I will update this to get rid of that drift.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return nil, fmt.Errorf("http.Get failed: %w", err)
}
defer resp.Body.Close()
crlBytes, err := io.ReadAll(resp.Body)
Copy link
Contributor

@frobware frobware May 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we ReadAll() we should check the response code:

if resp.StatusCode != http.StatusOK {
	return nil, fmt.Errorf("received non-OK status code: %d", resp.StatusCode)
}

The error message may be more helpful when a GET fails (e.g., 404).

What happens if the status code is a redirect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if block == nil {
break
}
clientCAData = data
Copy link
Contributor

@frobware frobware May 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In clientCAData are we only concerned with certificates?

if block.Type != "CERTIFICATE" {
    return nil, time.Time{}, false, fmt.Errorf("client CA bundle contains a non-certificate PEM block")
}

Or log and skip.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only care about certificates; any other PEM block type shouldn't matter. I think log & skip any non-certificate content is the way to go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

caUpdateChannel <- struct{}{}
}
if err := r.watchVolumeMountDir(filepath.Dir(caPath), caReloadFn); err != nil {
log.V(0).Error(err, "failed to establish watch on mTLS certificate directory")
return nil
Copy link
Contributor

@frobware frobware May 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this is existing behaviour, why does this not return the error?

Copy link
Contributor

@frobware frobware May 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Miciah I noticed that writeDefaultCert doesn't return an error for its calls to watchVolumeMountDir() should the call fail. Do you know of any history for why this returns nil?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to return an error I'd propose we do that change in 4.14+ only for the moment as we want the backports to discretely fix OCPBUGS-6661 and OCPBUGS-9464 only.

rfredette added 3 commits May 19, 2023 15:11
The CRLs on disk don't include enough information to accurately
determine which CA certificate they were downloaded for. Instead of
re-parsing the CRLs on disk, keep them in a map indexed by the CA's
subject key, so we can avoid unnecessary downloads and incorrect
attribution.
Allows easier backporting to releases reliant on go 1.18 or older
@rfredette rfredette force-pushed the ocpbugs-9464-duplicate-crls branch from e701475 to 6593d76 Compare May 19, 2023 19:11
@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 19, 2023

@rfredette: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@frobware
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 22, 2023
@Miciah
Copy link
Contributor

Miciah commented May 22, 2023

/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 22, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 22, 2023
@openshift-merge-robot openshift-merge-robot merged commit 5426a9b into openshift:master May 22, 2023
@openshift-ci-robot
Copy link
Contributor

@rfredette: Jira Issue OCPBUGS-9464: Some pull requests linked via external trackers have merged:

The following pull requests linked via external trackers have not merged:

These pull request must merge or be unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with /jira refresh.

Jira Issue OCPBUGS-9464 has not been moved to the MODIFIED state.

In response to this:

This PR makes the router parse the certificates supplied in ROUTER_MUTUAL_TLS_AUTH_CA, download CRLs from any CRL distribution points that are found, and write them to a file that is supplied to HAProxy for mTLS certificate validation.

It also includes a fix for an issue where, when a CRL is distributed in a certificate that is not its issuer, the CRL would sometimes be missing or two copies of the CRL would be present.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rfredette
Copy link
Contributor Author

/cherry-pick release-4.13
/cherry-pick release-4.12
/cherry-pick release-4.11

@openshift-cherrypick-robot

@rfredette: new pull request created: #485

In response to this:

/cherry-pick release-4.13
/cherry-pick release-4.12
/cherry-pick release-4.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants