fix: #1218 Proxy implementation for Lambdas (basic UT for aws-sdk v2) #1256

Closed
wants to merge 19 commits into from

Contributor

@axel3rd axel3rd commented Oct 11, 2021

PR proposal for fixing #1218.


As a first minor review point, perhaps https_proxy could be renamed HTTPS_PROXY for consistency in environment variable naming (even though proxy variables are generally named in lowercase):

image


Tested in real conditions; the EC2 launch works fine:

image

PS: The end-to-end use case is still in progress; templates/user-data.sh should be fully customized to handle proxy usage.

@axel3rd axel3rd marked this pull request as draft October 11, 2021 10:00
@axel3rd
Contributor Author

axel3rd commented Oct 14, 2021

As a first minor review point, perhaps https_proxy could be renamed HTTPS_PROXY for consistency in environment variable naming (even though proxy variables are generally named in lowercase):

Done
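
For context, a minimal sketch of how a Lambda could pick up this setting from its environment. The helper name `resolveHttpsProxy` and the lowercase fallback are assumptions made for illustration, not code from this PR. The environment is passed in as a plain object so the snippet does not depend on Node.js typings.

```typescript
// Hypothetical helper, not taken from the PR: returns the proxy URL to use,
// or undefined when no proxy is configured.
type Env = { [name: string]: string | undefined };

function resolveHttpsProxy(env: Env): string | undefined {
  // Prefer the uppercase name chosen in this PR; the lowercase fallback
  // is an assumption based on the usual naming of proxy variables.
  const candidates = [env["HTTPS_PROXY"], env["https_proxy"]];
  for (const value of candidates) {
    if (value !== undefined && value.trim() !== "") {
      return value.trim();
    }
  }
  return undefined;
}
```

In a Lambda this could be called as `resolveHttpsProxy(process.env)`; when it returns `undefined`, no proxy agent needs to be configured.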

@axel3rd axel3rd marked this pull request as ready for review October 18, 2021 08:05
@axel3rd axel3rd changed the title from "Fix #1218 (part, without UT) : Proxy implementation for Lambdas" to "Fix #1218 (without UT) : Proxy implementation for Lambdas" Oct 18, 2021
@npalm npalm self-requested a review October 19, 2021 21:01
Member

@npalm npalm left a comment

Thanks for the contribution.

  • Why did you only add the proxy to the runner lambda, and not to the other two?
  • See the other comments inline in the code.

@axel3rd
Contributor Author

axel3rd commented Oct 21, 2021

why did you only add the proxy to the runner lambda? And not to the other two?

[comment updated after initially misunderstanding the question]

@npalm: The webhook and runner-binaries-syncer lambdas are not attached to the VPC defined by the lambda_subnet_ids template parameter, so they can access the AWS APIs directly. In detail:

  • webhook: can access SQS without any problem.
  • runner-binaries-syncer:
    • (this is a supposition; I only ran real tests with an up-to-date actions-runner-linux-x64-xxx.tar.gz downloaded into the lambdas-download/ directory from an internal repository, due to proxy constraints)
    • The GHES URL does not seem to be implemented (runner-binaries-syncer/variables.tf) => if the latest version is retrieved from GitHub.com, S3 can be updated
    • But yes, templates/user-data.sh should be customized for proxy usage in this case (as should the SSM access 😞)

@axel3rd axel3rd changed the title from "Fix #1218 (without UT) : Proxy implementation for Lambdas" to "Fix #1218 : Proxy implementation for Lambdas (without UT for aws-sdk v2)" Oct 26, 2021
@axel3rd axel3rd requested a review from npalm October 26, 2021 17:12
@axel3rd axel3rd changed the title from "Fix #1218 : Proxy implementation for Lambdas (without UT for aws-sdk v2)" to "Fix #1218 : Proxy implementation for Lambdas (basic UT for aws-sdk v2)" Nov 30, 2021
@axel3rd axel3rd marked this pull request as draft November 30, 2021 16:39
@axel3rd axel3rd marked this pull request as ready for review November 30, 2021 18:37
@axel3rd
Contributor Author

axel3rd commented Nov 30, 2021

@npalm: Ready for review (basic unit tests added for AWS SDK v2; not all AWS API requests are mocked).

@npalm
Member

npalm commented Dec 7, 2021

Thx, will check this week. I have no option to test the proxy.

@npalm
Member

npalm commented Dec 9, 2021

@ScottGuymer would you have a bit of time to check the PR?

@axel3rd
Contributor Author

axel3rd commented Dec 9, 2021

To explain why a proxy is needed, and linked with #1303: a WIP/draft of a future template example for this use case.

Contributor

@ScottGuymer ScottGuymer left a comment

Hi

Thanks for the contribution.

I would suggest adding in the configuration of the https_proxy to the user-data scripting by default so it does not need to be customised directly for all usages of this feature.

This adds a little extra complexity to the overall setup, which is fine if this is a widely used way of working. Is it not possible to configure your VPC in such a way that it is able to speak to the AWS API without the need for a proxy? I can understand why you may want to block it generally, but in this case it really is a key part of the solution, and having to proxy all calls could introduce new failure modes on the instances in the future.

@@ -138,6 +138,7 @@ resource "aws_launch_template" "runner" {
start_runner = templatefile(local.userdata_start_runner[var.runner_os], {})
ghes_url = var.ghes_url
ghes_ssl_verify = var.ghes_ssl_verify
https_proxy = var.lambda_https_proxy
Contributor

This is an interesting change.

In a recent PR (#1444) we have started to pass less through here to be templated into the user-data file. This is to prevent this becoming a place that requires a lot of extra values to be passed into a template at build time. We switched to allowing SSM parameters that can be dynamically queried from the user-data script on boot. Which makes it more extensible as you could add more parameters outside of this module and then customise your user-data file accordingly to use them without needing to pass them specifically here (or other template gymnastics).

This setting, however, would be required in order to allow SSM to be used at all, so it might be a worthwhile addition to this static list.

Contributor Author

I should study #1444 to better understand its impact on this PR.

One thing to keep in mind: when the EC2 VM is in a "routed VPC", the AWS role permissions don't work => to access SSM or S3 (to download the runner, if it is not packed in the image), some AWS credentials have to be configured.
It would be nice if they could come from a TF_VAR_ environment variable.

Contributor Author

I am afraid that on a routed VPC, the proxy problem will be the same in start-runner.sh as in user-data.sh.
A prebuilt image made with Packer would be hard to use in this case 😢.

Contributor

There is already a requirement for the instance to communicate with instance metadata endpoints and SSM for example. How do you currently handle that?

Contributor Author

@ScottGuymer: See the related PR #1539 for a global sample.

When the VM is on a routed VPC (no direct internet access without a proxy), it can reach the metadata endpoint without the proxy, but any AWS API endpoint such as SSM has to go through the proxy, because the AWS "env-runner-role" doesn't work in this case (or I am missing something in AWS usage... I'm not an expert).

Contributor

Do you have an example of your VPC setup so that we can see why these things won't work?

I'm not sure what setup would mean that an instance would not be able to use its own identity to use the AWS CLI to retrieve the correct information. I can see that the setup might need https_proxy set so that the calls sent by AWS CLI can be proxied.

Contributor Author

Do you have an example of your VPC setup so that we can see why these things won't work?

To be honest, this kind of network configuration is too complex for me. I understand the network architecture (fw, palo, proxy, reverse-proxy, ...) and its usage, but I don't know how to express or read it as an AWS VPC configuration.
But that no longer matters, see below.

what setup would mean that an instance would not be able to use its own identity to use the AWS CLI to retrieve the correct information

With this sentence, you gave me a wonderful Christmas present 🎁!

After carefully reading AWS HTTP proxy configuration, using the role seems possible when a proxy is used, but 169.254.169.254 is important.

In my upcoming sample, access to 169.254.169.254 (metadata) is possible when the proxy is not configured.

=> The solution (trivial, once understood) is to run export no_proxy=169.254.169.254 and then the AWS CLI works like a charm.

In summary: the proxy is required in the user-data template, but AWS credentials are not.

And it really is a 🎁, because this limitation (AWS API usage via CLI/SDK/etc. from a routed VPC) was a pain point for many other items.
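
To illustrate why a no_proxy entry for 169.254.169.254 is enough, here is a simplified sketch of the kind of check an HTTP client makes before sending a request through the proxy. The function name `bypassesProxy` and the matching rules (comma-separated list, exact host or domain suffix) are assumptions for illustration; real tools such as the AWS CLI or curl differ in the details.

```typescript
// Hypothetical helper: returns true when `host` should be contacted directly
// because it matches an entry of a comma-separated no_proxy value.
function bypassesProxy(host: string, noProxy: string | undefined): boolean {
  if (noProxy === undefined) {
    return false;
  }
  const target = host.toLowerCase();
  const entries = noProxy
    .split(",")
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry !== "");
  return entries.some((entry) => {
    if (entry === "*") {
      return true; // wildcard: never use the proxy
    }
    // Treat ".corp.example" and "corp.example" the same way.
    const bare = entry.charAt(0) === "." ? entry.slice(1) : entry;
    if (target === bare) {
      return true; // exact match, e.g. the metadata IP address
    }
    // Domain suffix match: "ghes.corp.example" matches "corp.example".
    const suffix = "." + bare;
    return target.length > suffix.length && target.slice(-suffix.length) === suffix;
  });
}
```

With `no_proxy=169.254.169.254`, the metadata endpoint is reached directly (so the instance role credentials can be fetched), while a host such as an SSM endpoint still goes through `https_proxy`.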

Contributor

OK, so the combination of https_proxy and no_proxy enables the use of both the metadata endpoint and the CLI from the instance itself. This means there should be no specific need to customise the user-data for this use case. Other customisations may be needed using the pre or post install hooks, and this is possible (for instance, we set some specific network routing config on the instance).

If we add the proxy settings into the instance with the standard user-data does this make #1539 no longer needed?

Contributor Author

If we add the proxy settings into the instance with the standard user-data does this make #1539 no longer needed?

This sample is meant to detail the "particular" usage of GHES + routed VPC + runners at enterprise level. It could still be useful, IMO.

@axel3rd
Contributor Author

axel3rd commented Dec 20, 2021

@ScottGuymer: Sorry for my delayed response; I have been working full time on a "logger problem" for ~10 days.

I would suggest adding in the configuration of the https_proxy to the user-data scripting by default so it does not need to be customised directly for all usages of this feature.

I don't agree with that: when the EC2 VMs are in a "routed VPC" and GHES is in the same VPC or on the corporate network, the runner (on the VM) should be able to reach GitHub directly, without a proxy.

If the runner communicates with GitHub through the proxy, a reverse proxy (to secure internet access) could come into play and complicate the process (for example: authentication required, which doesn't currently support GitHub OAuth).

=> "Proxy by default" introduces more problems IMO (vs. configuring it only for the requests that need it).

Is it not possible to configure your VPC in such a way that it is able to speak to the AWS API without the need for a proxy?

I would like to... but there is currently no way to negotiate that with the security team.
For any resource on the corporate network (on-premises or routed VPC):

  • (egress) every resource that needs to access the internet must use the proxy
  • (ingress) all internet access to these resources must go through a reverse proxy (with authentication in some cases)
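
The "proxy only where needed" position described above can be sketched as generating the two proxy variables for a boot script: everything goes through the proxy except an explicit list of direct hosts. The function name `proxyExports`, the host names and the proxy URL are hypothetical, and the sketch assumes the values contain no shell metacharacters.

```typescript
// Hypothetical helper: builds shell `export` lines for a boot script.
// Assumes the proxy URL and host names contain no shell metacharacters.
function proxyExports(proxyUrl: string, directHosts: string[]): string[] {
  // Drop empty entries and duplicates while keeping the original order.
  const hosts: string[] = [];
  for (const raw of directHosts) {
    const host = raw.trim();
    if (host !== "" && hosts.indexOf(host) === -1) {
      hosts.push(host);
    }
  }
  const lines = ["export https_proxy=" + proxyUrl];
  if (hosts.length > 0) {
    lines.push("export no_proxy=" + hosts.join(","));
  }
  return lines;
}

// Example with made-up values: the metadata endpoint and an internal
// GHES host stay direct, everything else goes through the proxy.
const exportsForRunner = proxyExports("http://proxy.corp.example:3128", [
  "169.254.169.254",
  "ghes.corp.example",
]);
```

Keeping the direct hosts in an explicit list makes the exceptions visible in one place, which matches the egress rule above: proxy by default for internet access, direct only for hosts inside the routed network.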

You can see some more extensive examples of mocking dependencies in [...]

Are the current unit tests on SSM & lambda (root dir) sufficient, or should I look into the Axios sample?

@axel3rd
Contributor Author

axel3rd commented Dec 23, 2021

Set to draft even though all checks pass, because when tested for real after the last develop merge, there is a problem on the webhook runner (even though it is not part of these changes):

image

@axel3rd
Contributor Author

axel3rd commented Dec 23, 2021

Set to draft even though all checks pass, because when tested for real after the last develop merge, there is a problem on the webhook runner (even though it is not part of these changes)

This was due to using the webhook.zip lambda from the previous release. It works fine with a rebuild from develop.

@axel3rd axel3rd marked this pull request as ready for review December 23, 2021 13:42
@axel3rd
Contributor Author

axel3rd commented Jan 3, 2022

@ScottGuymer: Happy new year! I think I have answered all requests/questions. Don't hesitate to ask for more details if that is not the case.

@axel3rd axel3rd requested a review from ScottGuymer January 6, 2022 11:51
@@ -42,6 +42,7 @@
"@types/express": "^4.17.11",
"@types/node": "^17.0.7",
"aws-sdk": "^2.1048.0",
"proxy-agent": "^5.0.0",
Member

Checked the library: it is tiny, but has not been maintained for the last half year. https://www.npmjs.com/package/proxy-agent

Contributor Author

last half year not maintained

Agreed, but it is widely used. I'm not sure there is a better solution 😢.

@axel3rd
Contributor Author

axel3rd commented Jan 14, 2022

@npalm / @ScottGuymer: Sorry to bother you or to seem insistent, but what are the next steps for this PR? Its status is currently "Changes requested", but I'm not sure there are concrete updates for me to make.
Let me know, many thanks.

@github-actions
Contributor

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale label Jun 19, 2022
@axel3rd
Contributor Author

axel3rd commented Jun 20, 2022

github-actions bot added the Stale label yesterday

Please let me know if any changes are required. It would be nice if this proxy feature could be supported, IMO.

@github-actions github-actions bot removed the Stale label Jun 21, 2022
@github-actions
Contributor

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale label Aug 26, 2022
@axel3rd
Contributor Author

axel3rd commented Aug 30, 2022

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

This PR is still relevant, IMO.

@github-actions github-actions bot removed the Stale label Aug 31, 2022
@github-actions
Contributor

github-actions bot commented Oct 1, 2022

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions
Contributor

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale label Nov 12, 2022
@mcaulifn
Contributor

down stale bot, down

@github-actions github-actions bot removed the Stale label Nov 13, 2022
@github-actions
Contributor

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.


4 participants