-
Notifications
You must be signed in to change notification settings - Fork 655
fix: #1218 Proxy implementation for Lambdas (basic UT for aws-sdk v2) #1256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the conribution.
- why did you only add the proxy to the runner lambda? And not to the other two?
- see other commands inline in the code
[comment updated after initially missing the sense of question] @npalm : The
|
@npalm : Ready for review (basic unit tests added for AWS SDK v2 ; all AWS request API are not mocked). |
Thx, will check this week. I have no option to test the proxy. |
@ScottGuymer would you have a bit of time to check the PR? |
For understanding why a proxy, and linked with #1303, a WIP/draft of a future template example of this use case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi
Thanks for the contribution.
I would suggest adding in the configuration of the https_proxy to the user-data scripting by default so it does not need to be customised directly for all usages of this feature.
This adds a little extra complexity to the overall setup, which is fine if this is a widely used way of working. Is it not possible to configure your VPC in such a way that it is able to speak to the AWS API without the need for a proxy? I can understand why you may want to block it generally but in this case it really is a key part of the solution and having to proxy all calls could introduce new failure modes on the instances in the future..
modules/runners/main.tf
Outdated
@@ -138,6 +138,7 @@ resource "aws_launch_template" "runner" { | |||
start_runner = templatefile(local.userdata_start_runner[var.runner_os], {}) | |||
ghes_url = var.ghes_url | |||
ghes_ssl_verify = var.ghes_ssl_verify | |||
https_proxy = var.lambda_https_proxy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting change.
In a recent PR (#1444) we have started to pass less through here to be templated into the user-data file. This is to prevent this becoming a place that requires a lot of extra values to be passed into a template at build time. We switched to allowing SSM parameters that can be dynamically queried from the user-data script on boot. Which makes it more extensible as you could add more parameters outside of this module and then customise your user-data file accordingly to use them without needing to pass them specifically here (or other template gymnastics).
This setting however would be required in order to allow SSM to even be used at all, so it might be worthwhile addition to this static list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should learn #1444 to better understand the impact with this PR.
One item to have in mind when EC2 VM is in a "routed VPC" ; the AWS role permissions doesn't work => to access SSM or S3 (to download runner, if not packed in image) some AWS credentials should be configured.
It could be nice if it could come from TF_VAR_
environment variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am scared than on a routed VPC, the problem with proxy will be the same in start-runner.sh than user-data.sh.
Prebuild image with packer would be hard to use in this case 😢.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already a requirement for the instance to communicate with instance metadata endpoints and SSM for example. How do you currently handle that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ScottGuymer : Relate PR #1539 for global sample.
When VM is on a routed VPC (no direct internet access without proxy), VM can communicate with metadata without proxy, but any AWS API endpoint like SSM should use proxy, because the AWS "env-runner-role" doesn't work in this case (or I miss something in AWS usage ... I'm not an expert).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an example of your VPC setup so that we can see why these things won't work?
I'm not sure what setup would mean that an instance would not be able to use its own identity to use the AWS CLI to retrieve the correct information. I can see that the setup might need https_proxy
set so that the calls sent by AWS CLI can be proxied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an example of your VPC setup so that we can see why these things won't work?
To be honest, this kind of network configuration is too complex for me. I understand the network architecture (fw, palo, proxy, reverse-proxy, ...) and usage ; but I don't know how to configure/understand it in AWS VPC configuration.
But now we don't care, see below.
what setup would mean that an instance would not be able to use its own identity to use the AWS CLI to retrieve the correct information
By these sentence, you gave me a wonderful Christmas present 🎁 !
Reading carefully AWS HTTP proxy configuration, the usage of role seems possible when proxy is used ; but 169.254.169.254
is important.
From my incoming sample, the access to 169.254.169.254
(metadata) is possible when proxy is not configured.
=> The (trivial, after understanding) solution is: export no_proxy=169.254.169.254
; and aws cli is working like a charm.
In synthesis: The proxy is required in user-data template, but not AWS credentials.
And it is really a 🎁, because this limitation (AWS API usage via cli/sdk/etc from routed VPN) was a pain points for many other items.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK so the combination of https_proxy
and no_proxy
enables the use of both the meta data endpoint and the CLI from the instance itself. Meaning there should be no specific need to customise the user-data for this use case, there may be other customisations needed using the pre or post install hooks and this is possible (for instance we set some specific network routing config on the instance).
If we add the proxy settings into the instance with the standard user-data does this make #1539 no longer needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add the proxy settings into the instance with the standard user-data does this make #1539 no longer needed?
This sample is for detail the "particular" usage of GHES + VPC routed + runners at enterprise level. It could be always useful IMO.
@ScottGuymer: Sorry for my response delay, full time on a "logger problem" since ~10 days.
Not agree with that ; when EC2 VM are in "routed VPC" and GitHub GHES in this same VPC or on corporate network, the runner (on VM) should be in capacity to request GitHub directly without proxy. If runner communicate to GitHub with proxy, some reverse proxy (to secure internet access) could be in game and complexify the process (for sample: authentication required, which doesn't support GitHub OAuth currently). => "Proxy by default" introduces more problem IMO (vs configured for requests which need it).
I would like ... but no way to deal on that with security team currently.
Are the current unit tests on SSM & lambda (root dir) sufficient or should I investigate on Axios sample ? |
Due to previous release |
@ScottGuymer : Happy new year ! I think have answered to all request/question. Don't hesitate for any complements if it is not the case. |
@@ -42,6 +42,7 @@ | |||
"@types/express": "^4.17.11", | |||
"@types/node": "^17.0.7", | |||
"aws-sdk": "^2.1048.0", | |||
"proxy-agent": "^5.0.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked the library, tiny but last half year not maintained https://www.npmjs.com/package/proxy-agent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
last half year not maintained
Agree, but wide used. Not sure to have another better solution 😢.
@npalm / @ScottGuymer : Sorry to bother you or seems to be insistent, but what could be the following of this PR ? Currently its status is Changes requested but I'm not sure to have concrete updates todo on it. |
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions. |
Please let me know if any changes are required. It would be nice if this proxy support could be supported IMO. |
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions. |
This PR is always relevant IMO. |
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions. |
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions. |
down stale bot, down |
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions. |
PR proposal for fixing #1218.
As first minor review, perhaps
https_proxy
could be renamedHTTPS_PROXY
for coherence in environment variables naming (even if proxy variables is generally named in lowercase):Tested in "real condition", working fine about EC2 launch:
PS: End2end use case is always in progress,
templates/user-data.sh
should be completely customized to manage proxy usage.