Validate duplicate provider local names in required_providers#31218
Validate duplicate provider local names in required_providers#31218
required_providers#31218Conversation
f6ddbc8 to
ce5758f
Compare
apparentlymart
left a comment
There was a problem hiding this comment.
This seems like a nice improvement to me that should give some better feedback about a situation that was already invalid but checked and reported in a confusing way.
I do have a few questions about the overall impact of this change:
-
Were you able to verify that there weren't any situations before where someone might have been able to "get away with" declaring multiple provider local names in this way without hitting some sort of error?
I cannot imagine any case that should've worked, because declaring two providers with different local names and then configuring against them both means exactly the same thing as declaring two provider configurations with the same local name and alias. But given that this was unambiguous, I do wonder if there's a possibility that something might have crept through that could be arguably covered by the v1.0 compatibility promises. 😬
-
How does this interact with the various implied provider requirements Terraform infers for backward compatibility? For example:
terraform { required_providers { notaws = { source = "hashicorp/aws" } } } provider "notaws" { # ... } provider "aws" { # ... }
I think your
TestProviderConfigTransformer_duplicateLocalNametest is asserting that this is allowed, but that seems counter-intuitive to me since it seems like it's still effectively declaring thehashicorp/awsprovider under two local names, with one of them just being implied by our back-compat rules rather than written out explicitly by the user.Might we need an additional rule in
ProviderConfigTransformerto treat the above as invalid, or does that already get taken care of by some logic I can't see clearly in this diff?
|
Hmm, actually that test example came from my first iteration where this was warning, but I found that because there were multiple points where providers were located only by their full name, you couldn't configure them without possibly running into errors. But now I see the problem here, it's technically possible to have a completely working configuration if you're not configuring the provider at all (or maybe not even using the provider!). There would be no reason to write such a config in the first place, but maybe a module could end up there through some refactoring? 🤔 I think I'm going to loosen this back up to a warning to be on the safe side. |
Adding multiple local names for the same provider type in required_providers was not prevented, which can lead to ambiguous behavior in Terraform. Providers are always indexed by the providers fully qualified name, so duplicate local names cannot be differentiated.
We can skip providers which already have a node in the graph for their type.
ce5758f to
d7238d5
Compare
| // was required via a different name. | ||
| impliedLocalName := req.Type.Type | ||
| // We have to search through the configs for a match, since the keys contains any aliases. | ||
| for _, pc := range mod.ProviderConfigs { |
There was a problem hiding this comment.
There's also the related situation of a provider requirement being implied by a resource or data block which doesn't set the provider argument and whose resource type name starts with a prefix that doesn't match any explicitly-declared identifier. Should we catch that one too, or does it end up not mattering because the resource ends up belonging to the correct provider configuration anyway?
(Specifically: I don't recall if that backward-compatibility behavior causes us to implicitly declare a new provider configuration with the local name matching the prefix, or if it instead just hooks it up to whatever provider configuration it finds that has the assumed provider source address. If it's the latter then I guess it probably doesn't really matter, although it's still potentially confusing to have e.g. aws_instance automatically associated with provider "notaws" just because required_providers declared notaws as being hashicorp/aws. 🤔)
There was a problem hiding this comment.
While it does work out that the implied provider based on the resource prefix will just get hooked up to the correct provider, even if it has a different local name, perhaps it's best to warn about that case too? I didn't delve into that here because it doesn't cause any known issues, and it's not as straightforward as dealing with the provider configs alone.
There was a problem hiding this comment.
I agree that it seems a little unclear what we should do here... I do like the consistency of treating a provider dependency implied by a resource the same way as a provider dependency implied by a provider configuration, since it seems like both of those things are a common source of confusion for folks anyway. But if it is particularly gnarly to do it then I would agree probably not worth it.
I have some memory of a function somewhere which takes something describing a provider and encapsulates the logic for deciding which provider configuration it ought to belong to, which if we could find it might allow all of these situations together as one rule (a provider = aws.foo can also imply a dependency on hashicorp/aws if there isn't already an aws in scope), but if we'd end up having to duplicate all of the logic for detecting implicit provider requirements in here then I'd say that's probably too high a cost for just a little consistency. (Unfortunately I don't remember where this function is, assuming I'm not misremembering it existing in the first place, but maybe you have seen it on your travels while working on this... 🤔)
There was a problem hiding this comment.
Checking this final case wasn't that bad, we've already collected the required_providers local names, so we just need to iterate over the resources looking for implied providers which aren't in that set.
The newest commit adds a diagnostic like:
│ Warning: Duplicate required provider
...
│ Provider "null" was implicitly required via resource "null_resource.a", but listed in
│ required_providers as "nope". Either the local name in required_providers must match the resource
│ name, or the "nope" provider must be assigned within the resource block.
This diagnostic output only uses the local names, because this situation can only happen with providers from the default namespace with a default provider name inferred from the resource name.
23eb9bd to
35f082a
Compare
apparentlymart
left a comment
There was a problem hiding this comment.
Sorry... more noodling here as I dug into the new logic you wrote. Each time I read new code in this PR it reminds me of things I'd forgotten about the provider resolution logic, and so I'm sorry I'm kinda just dribbling context slowly over the course of many round-trips. 😖
| "Provider %q was implicitly required via resource %q, but listed in required_providers as %q. Either the local name in required_providers must match the resource name, or the %q provider must be assigned within the resource block.", | ||
| localName, r.Addr(), prevLocalName, prevLocalName, |
There was a problem hiding this comment.
Subjectively, I think it'd help a little to make that first %q be defAddr instead of just localName, even though we know that it'll always just be hashicorp/ followed by the localName anyway, just to add a little nudge here for those who aren't yet aware of this implied dependency mechanism and might benefit from us linking it with what they can see in the terraform init and terraform providers output.
Probably we'll refine this message some more once we see what sorts of questions people ask about it, since it's hard to guess right now what a typical recipient of this message might know or not know, so otherwise I think this is a fine place to start!
There was a problem hiding this comment.
I actually had that at first, but changed it as I cut the message down for brevity ;) THat's a fair point though, and will put it back.
| // We're looking for resources with no specific provider reference | ||
| if r.ProviderConfigRef != nil { | ||
| continue | ||
| } |
There was a problem hiding this comment.
Aren't we also looking for resources that explicitly specify a provider local name that wasn't declared?
provider = aws.fooI believe (though I haven't tested) that if we encounter the above without there being an aws in required_providers then we treat it as an implied requirement for hashicorp/aws, just as we would if it were not set at all but the type name begins with "aws".
Resource.ProviderConfigAddr seems to be the definition of which local name a particular resource is associated with, and so maybe we can use that here and then take the LocalName field of the result as the local name to use below.
There was a problem hiding this comment.
Hmm, that is validated already which is why I didn't check it here, but not until core tries to connect the providers in the graph.
This additional check was only intended to get the case of an implied local name not matching the required_providers local name, in which which case ProviderConfigAddr hides the fact that provider wasn't set. I guess we could also add more validation to resource provider arguments during config loading, but that was outside of the goal of this PR.
| continue | ||
| } | ||
|
|
||
| defAddr := addrs.NewDefaultProvider(localName) |
There was a problem hiding this comment.
I think this could instead be a call to Module.ImpliedProviderForUnqualifiedType, which I believe aims to encapsulate the logic for "what is the fallback to use if this doesn't match an explicit provider requirement?".
That does also consult the required providers for the module, and so calling it directly wouldn't allow distinguishing between a reference to an explicit declaration vs. a backward-compatibility fallback. So maybe then the better answer would be to skip over that layer and go directly to addrs.ImpliedProviderForUnqualifiedType, which at least deals with the special case that terraform means terraform.io/builtin/terraform rather than hashicorp/terraform. Or maybe we could change the signature of Module.ImpliedProviderForUnqualifiedType to return an additional result that signals explicitly whether this was taken from an explicit required_providers declaration or just a compatibility fallback, in the interests of keeping all of the logic using mostly the same codepaths. 🤔
Generally I'd encourage leaning on these existing methods as much as they make sense, because they are there to encapsulate all of the little oddities that arose in the process of developing the provider source scheme, and so if we skip over them and try to reimplement their logic out here it's likely that we'll miss some edge case that we've forgotten about in the meantime and make this not behave consistently with the logic elsewhere.
There was a problem hiding this comment.
I guess we could add another return parameter to ImpliedProviderForUnqualifiedType to show how it was resolved, but I purposely did avoid that here. The validation is only looking for undeclared implied provider names which are named something different in required_providers. This means we have to skip the automatic resolution mechanisms like ImpliedProviderForUnqualifiedType, since those are what would connect us to the provider declaration that we are trying to avoid.
I was kind of view this from the perspective that the methods we present like Module.ImpliedProviderForUnqualifiedType rely on the underly structured data being valid, but here we are doing the actual validation of that data.
Though here I did mean to use addrs.ImpliedProviderForUnqualifiedType, but ended grabbing the older NewDefaultProvider function. That shouldn't make any difference in this case since you can't add the internal terraform provider into required_providers, but yes, we should continue to use them more consistently.
There was a problem hiding this comment.
I don't remember us making any special effort to treat source = "terraform.io/builtin/terraform" as invalid, so as far as I know you can include it there if you really want to give it a local name other than terraform, though of course that would be a pretty bizarre thing to do.
There was a problem hiding this comment.
oh yeah, you can, not sure why I thought we disallowed that 🤔
35f082a to
69f0c7d
Compare
69f0c7d to
57c0deb
Compare
apparentlymart
left a comment
There was a problem hiding this comment.
This looks good to me! Thanks for indulging me on my gradual trip through reloading the context about how all of this works in order to actually review it. 🙄
|
Reminder for the merging maintainer: if this is a user-visible change, please update the changelog on the appropriate release branch. |
|
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions. |
Adding multiple local names for the same provider type in
required_providerswas not prevented, which can lead to ambiguous behavior in Terraform. Providers are always located via the provider's fully qualified name (even using the full name as a map key in many places), so duplicate local names cannot be differentiated. This can lead terraform to non-deterministically configure the incorrect provider in some cases.Because it is currently possible that a configuration with providers which themselves do not need explicit configuration may have been working with duplicates, we cannot turn these into errors. Adding multiple entries for the same provider now results in a warning like so:
We can also check for the situation where a user required a provider by a name different from the default, but attempted to configure that provider via the default local name. This can help prevent the case where a provider configuration may apparently not always be taking effect within a module. The warning here is worded slightly differently:
fixes #31196