Add configuration drift detection and optional reconciliation to ComputeEngineInsertInstanceOperator #61830

Merged
shahar1 merged 1 commit into apache:main from
SameerMesiah97:61829-ComputeEngineInsertInstanceOperator-Drift-Detection
Mar 9, 2026

Contributor

@SameerMesiah97 SameerMesiah97 commented Feb 13, 2026

Description

This change enhances ComputeEngineInsertInstanceOperator to detect configuration differences when an instance already exists.

Previously, the operator treated instance presence as success and returned without validating that the existing resource matched the requested configuration. As a result, changes to fields such as machine_type were not detected on subsequent DAG runs.

This update introduces configuration (machine_type) comparison logic when an instance is found. Detected differences are logged. An optional recreate_if_different flag allows users to explicitly request deletion and recreation of the instance when configuration drift is detected.

To support this behavior, two helper methods were introduced: _detect_instance_drift, which compares the existing instance with the requested body, and _create_instance, which encapsulates instance creation logic used by both the initial and recreation paths.
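The comparison step described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `detect_instance_drift`, the dict-shaped inputs, and the returned mapping are assumptions made for the example, and both sides are assumed to already hold short machine type names. Only the restriction to `machine_type` comes from the description.

```python
from __future__ import annotations

import logging

log = logging.getLogger(__name__)


def detect_instance_drift(existing: dict, requested: dict) -> dict[str, tuple]:
    """Return {field: (existing_value, requested_value)} for each compared field that differs."""
    # Only machine_type is compared, mirroring the scope described in this PR.
    compared_fields = ("machine_type",)
    drift = {}
    for field in compared_fields:
        wanted = requested.get(field)
        # A field the caller did not request is not treated as drift.
        if wanted is not None and existing.get(field) != wanted:
            drift[field] = (existing.get(field), wanted)
    # Surface every detected difference in the logs.
    for field, (current, wanted) in drift.items():
        log.warning("Drift in %s: existing=%s requested=%s", field, current, wanted)
    return drift
```

Returning a mapping rather than a boolean is one possible design choice: it would let further fields be added to `compared_fields` later without changing callers.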

Rationale

The previous behavior relied solely on presence-based idempotence and did not validate configuration consistency across DAG runs. This change surfaces configuration differences and provides an opt-in mechanism for reconciliation, while preserving default behavior.

Drift detection is intentionally limited to machine_type for now. Machine type changes are deterministic, high-impact, and straightforward to compare, whereas other fields (e.g. disks or networking) introduce normalization and defaulting complexity. The implementation is structured to allow incremental expansion of drift detection in future updates.

Notes

  • The execute method has been refactored to accommodate configuration drift logging and the new recreate_if_different flag while preserving default behavior. The refactor extracts instance creation into a helper and introduces structured comparison logic without altering presence-based idempotence unless the new flag is explicitly enabled.
  • A helper method _extract_machine_type has been introduced to extract the short machine type name from a full machine type resource string (for example, a path ending in machineTypes/n1-standard-1).
  • Minor corrections have been made to comments where applicable, and redundant comments have been removed.
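As an illustration of the machine type extraction mentioned in the notes, a helper along these lines would reduce a full URL or partial path to the short name. The function below is a hypothetical sketch and may differ from the `_extract_machine_type` helper added in this PR. It assumes the machine type name is always the last path segment.

```python
def extract_machine_type(value: str) -> str:
    """Return the short machine type name from a URL, a partial path, or a bare name."""
    # Take the last path segment; a bare name contains no "/" and is returned unchanged.
    return value.rstrip("/").rsplit("/", 1)[-1]


# Usage: all three input forms reduce to the same short name.
full_url = "https://www.googleapis.com/compute/v1/projects/p/zones/us-central1-a/machineTypes/e2-medium"
partial_path = "zones/us-central1-a/machineTypes/e2-medium"
print(extract_machine_type(full_url))      # e2-medium
print(extract_machine_type(partial_path))  # e2-medium
print(extract_machine_type("e2-medium"))   # e2-medium
```

Normalizing both the existing and the requested value this way before comparing avoids reporting drift when the two sides merely use different forms of the same machine type.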

Tests

  • Added a unit test verifying that configuration drift is detected and logged.
  • Added unit and system tests verifying that the instance is deleted and recreated when recreate_if_different=True.

Documentation

  • Added documentation for the new recreate_if_different parameter in the operator docstring.
  • Updated the execute method docstring to clarify presence-based idempotence and drift handling behavior.

Backwards Compatibility

There is a behavioral difference in that configuration drift is now logged by default when detected. Additionally, users may opt into reconciliation behavior via recreate_if_different=True, which will delete and recreate the instance when differences are found. Existing DAGs will otherwise continue to behave as before unless the new flag is explicitly enabled.
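The resulting behavior can be sketched with an in-memory stand-in for the Compute Engine API. Everything here is illustrative: `FakeComputeClient`, `ensure_instance`, and the returned status strings are hypothetical names, not part of the operator or the Google provider. The `recreate_if_different` parameter only mirrors the flag name used in this description.

```python
from __future__ import annotations

import logging

log = logging.getLogger(__name__)


class FakeComputeClient:
    """In-memory stand-in for a Compute Engine client (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.instances: dict[str, dict] = {}
        self.calls: list[str] = []

    def get(self, name: str) -> dict | None:
        return self.instances.get(name)

    def insert(self, name: str, body: dict) -> None:
        self.calls.append("insert")
        self.instances[name] = dict(body)

    def delete(self, name: str) -> None:
        self.calls.append("delete")
        del self.instances[name]


def ensure_instance(client: FakeComputeClient, name: str, body: dict,
                    recreate_if_different: bool = False) -> str:
    """Create the instance if absent; on drift, log it and optionally delete and recreate."""
    existing = client.get(name)
    if existing is None:
        client.insert(name, body)
        return "created"
    if existing.get("machine_type") == body.get("machine_type"):
        return "unchanged"
    # Drift is always logged, even when reconciliation is disabled.
    log.warning("machine_type drift on %s: existing=%s requested=%s",
                name, existing.get("machine_type"), body.get("machine_type"))
    if not recreate_if_different:
        return "drift_logged"  # default path: the existing instance is left untouched
    client.delete(name)
    client.insert(name, body)  # same creation path as the initial insert
    return "recreated"
```

With the flag left at its default, a second call with a different `machine_type` only logs the difference. The stored instance changes only when the flag is passed as `True`.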

Closes: #61829

@boring-cyborg boring-cyborg bot added the area:providers and provider:google labels Feb 13, 2026
@SameerMesiah97 SameerMesiah97 force-pushed the 61829-ComputeEngineInsertInstanceOperator-Drift-Detection branch 4 times, most recently from 7359542 to 4cca068 February 13, 2026 20:25
Contributor

@shahar1 shahar1 left a comment

Well done! Minor comments, but overall looks great.
Could you please create a system test Dag for this specific behavior (when the flag is true)? It seems worth having.

CC: @VladaZakharova @MaksYermak

@VladaZakharova
Contributor

Thank you for raising this PR.
Can you please send a screenshot of the green system tests? Please also make sure to separately run the case where the machine already exists, because the current system test doesn't include this.
Also, correct me if I am wrong: is this change ONLY for differences in machine type? Will no other parameters be detected?

@SameerMesiah97 SameerMesiah97 force-pushed the 61829-ComputeEngineInsertInstanceOperator-Drift-Detection branch from 4cca068 to b0cdbf4 February 18, 2026 19:29
@SameerMesiah97
Contributor Author

> Thank you for raising this PR. Can you please send a screenshot of the green system tests?

Please find below the results of the system tests example_compute.py and example_compute_recreate_drift.py (this is a new test I have added):

[Two screenshots: system test results for example_compute.py and example_compute_recreate_drift.py]

Note that the failure indicated in the second screenshot is from another test example_compute_igm.py, and was unrelated to the changes introduced in the PR. It later succeeded when I re-ran it after deleting some Google Cloud resources. Please find below the screenshots proving this:

[Two screenshots of the re-run system test results]

Note that the first screenshot is the original image from which the results of example_compute.py and example_compute_recreate_drift.py have been cropped. Kindly tell me if this is sufficient for you. But keep in mind that these tests are rather flaky and take a very long time to run. I would appreciate assistance here with running them if you are able.

> Please also make sure to separately run the case where the machine already exists, because the current system test doesn't include this.

As mentioned above, the new system test example_compute_recreate_drift.py should cover this scenario.

> Also, correct me if I am wrong: is this change ONLY for differences in machine type? Will no other parameters be detected?

That is correct.

…tanceOperator

Detect and log machine type configuration differences when an instance already
exists instead of relying solely on presence-based idempotence.
Introduce `recreate_if_machine_type_different` flag to optionally delete and
recreate instances when drift is detected. Refactor execute logic
to support drift comparison and shared instance creation helper.
Add unit and system tests for drift logging and recreation behavior.
@SameerMesiah97 SameerMesiah97 force-pushed the 61829-ComputeEngineInsertInstanceOperator-Drift-Detection branch from b0cdbf4 to 6dd6f40 February 19, 2026 23:55
@SameerMesiah97
Contributor Author

Requesting review for this.

@shahar1 shahar1 merged commit db46429 into apache:main Mar 9, 2026
90 checks passed
jason810496 pushed a commit to jason810496/airflow that referenced this pull request Mar 10, 2026
thejoeejoee pushed a commit to thejoeejoee/airflow that referenced this pull request Mar 10, 2026
dominikhei pushed a commit to dominikhei/airflow that referenced this pull request Mar 11, 2026
Pyasma pushed a commit to Pyasma/airflow that referenced this pull request Mar 13, 2026

Development

Successfully merging this pull request may close these issues.

ComputeEngineInsertInstanceOperator ignores configuration changes when instance already exists

3 participants