Skip to content

Conversation

@EronWright
Copy link
Contributor

@EronWright EronWright commented Jun 4, 2025

Overview

This PR fixes a regression causing the provider to inadvertently drop the sub-resources following a pulumi refresh. This leads to flapping attributes as described in #4145, upon a subsequent pulumi up or simply another pulumi refresh.

The problem manifests only for externally-managed sub-resources (as is generally recommended). For example, a NetworkSecurityGroup resource with separate (not inline) SecurityRule resources.

In more detail, refresh causes two erroneous changes to the checkpointed state of a given sub-resource property (e.g. privateEndpointConnections or securityRules):

  1. sets the output property to [], even if there's a meaningful value (the discovered sub-resources).
  2. sets the input property to [], even though the program has no inline resources.

From this point, a subsequent pulumi up or pulumi refresh will show a spurious change.

Closes #4145

Background

This PR essentially reverts some undesirable changes in 642a38b.

In #3054, the Read method was altered to remove sub-resources from the checkpointed state (here). The property map named outputsWithoutIgnores seems to have been mistaken to contain outputs, but is best understood as an intermediate form of the inferred inputs, which is why the sub-resources are stripped.

image

PR #3054 also switches from removing to resetting the properties, with the unexpected effect of adding the property into the inputs. This causes flapping in a subsequent refresh because it confuses findUnsetPropertiesToMaintain to think that the sub-resources are inline.

image

I believe that #3054 was effective at solving #3049 because it incorporated Mikhail's insight about the SDK shape, here. The changes that we now seek to revert were not germane to #3049.

Example 1

Deploy the following program twice, setting the revision tag to 2 after the first deployment. Twice is needed because this program creates an security rule within the NSG, but the NSG's state will not reflect the security rule until after an update to the NSG. Refresh would not be effective because that's the very problem this PR seeks to fix.

using Pulumi;
using Pulumi.AzureNative.Resources;
using Pulumi.AzureNative.Network;
// using Pulumi.AzureNative.Network.Inputs;
using System.Collections.Generic;

return await Pulumi.Deployment.RunAsync(() =>
{
    // Create an Azure Resource Group
    var resourceGroup = new ResourceGroup("resourceGroup");

    var _name = "eron";

    var nsg = new NetworkSecurityGroup($"data-center-{_name}-nsg", new NetworkSecurityGroupArgs
    {
        ResourceGroupName = resourceGroup.Name,
        Tags =
        {
            { "revision", "1" },
        },
    });

    var rule = new SecurityRule($"NSG-AllowVirtualNetworkRdpInbound-{_name}", new SecurityRuleArgs
    {
        SecurityRuleName = "AllowVirtualNetworkRdpInbound",
        Priority = 120,
        Direction = SecurityRuleDirection.Inbound,
        Access = SecurityRuleAccess.Allow,
        Protocol = SecurityRuleProtocol.Tcp,
        SourcePortRange = "*",
        DestinationPortRange = "3389",
        SourceAddressPrefix = "VirtualNetwork",
        DestinationAddressPrefix = "VirtualNetwork",
        ResourceGroupName = resourceGroup.Name,
        NetworkSecurityGroupName = nsg.Name,
        Description = "Allows internal RDP Traffic"
    });
    
    // Export the primary key of the Storage Account
    return new Dictionary<string, object?>
    {
        // ["nsg"] = nsg.ID()
    };
});

Export the state, run pulumi refresh, and export again. You should observe that the NSG's inputs and outputs have been destabilized. Try refreshes and ups to see flapping.

See attachments, for a series of state exports:

Example 2

This example is closer to what's described in #4145. The program creates a private endpoint, so that the EventHub namespace has a private endpoint connection that we may observe to be 'flappy'.

import * as pulumi from "@pulumi/pulumi";
import * as resources from "@pulumi/azure-native/resources";
import * as eventhub from "@pulumi/azure-native/eventhub";
import * as network from "@pulumi/azure-native/network";

// Configuration for the subnet ID
const config = new pulumi.Config();
const subnetId = config.require("subnetId");

// Create an Azure Resource Group
const resourceGroup = new resources.ResourceGroup("eventhub-rg");

const namespace = new eventhub.v20230101preview.Namespace("namespace", {
    isAutoInflateEnabled: true,
    maximumThroughputUnits: 40,
    minimumTlsVersion: "1.2",
    publicNetworkAccess: "Enabled",
    resourceGroupName: resourceGroup.name,
    sku: {
        capacity: 1,
        name: "Standard",
        tier: "Standard"
    },
    tags: {
        "Name": "mynamespace",
        "revision": "3"
    }
});

const privateEndpoint = new network.PrivateEndpoint("privateEndpoint", {
    resourceGroupName: resourceGroup.name,
    location: resourceGroup.location,
    privateEndpointName: "myPrivateEndpoint",
    subnet: {
        id: subnetId,
    },
    privateLinkServiceConnections: [{
        name: "myPrivateLinkServiceConnection",
        privateLinkServiceId: namespace.id,
        groupIds: ["namespace"],
    }],
});

const privateEndpointConnections = namespace.privateEndpointConnections;

repro.zip

@EronWright
Copy link
Contributor Author

EronWright commented Jun 4, 2025

This change is part of the following stack:

Change managed by git-spice.

@EronWright EronWright requested a review from danielrbradley June 4, 2025 23:50
@EronWright EronWright marked this pull request as ready for review June 5, 2025 00:47
@EronWright EronWright requested review from a team and mikhailshilkov June 5, 2025 00:48
Base automatically changed from issue-4094 to master June 5, 2025 22:16
@github-actions
Copy link

github-actions bot commented Jun 5, 2025

Does the PR have any schema changes?

Looking good! No breaking changes found.
No new resources/functions.

@codecov
Copy link

codecov bot commented Jun 5, 2025

Codecov Report

Attention: Patch coverage is 72.41379% with 8 lines in your changes missing coverage. Please review.

Project coverage is 58.55%. Comparing base (65208e8) to head (484dce4).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
provider/pkg/provider/provider.go 72.41% 6 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4178      +/-   ##
==========================================
+ Coverage   58.09%   58.55%   +0.46%     
==========================================
  Files          84       84              
  Lines       13546    13551       +5     
==========================================
+ Hits         7869     7935      +66     
+ Misses       5084     5010      -74     
- Partials      593      606      +13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@EronWright
Copy link
Contributor Author

EronWright commented Jun 6, 2025

I did another round of testing against the repro that was provided in #4145, and confirmed that the issue is reproducable and is corrected by this PR.

Copy link
Contributor

@danielrbradley danielrbradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The revert makes sense that the original change was probably in error due to confusing the need for the empty value when sending to Azure vs needing the value not existing for preserving sub-resources.

Let's add the full repro as an e2e test to ensure this doesn't regress again in the future. This can be added in the provider_e2e_test.go) with the full up, refresh, refresh, up, refresh loop and asserting on the up.Outputs() and the inputs via ExportStack().

@EronWright EronWright merged commit 7d74c23 into master Jun 6, 2025
23 checks passed
@EronWright EronWright deleted the issue-4145 branch June 6, 2025 17:12
@pulumi-bot
Copy link
Contributor

This PR has been shipped in release v3.5.1.

1 similar comment
@pulumi-bot
Copy link
Contributor

This PR has been shipped in release v3.5.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pulumi Refresh for EventHub Namespace Flapping Attribute

5 participants