Airflow 3.x support #618

sbernauer opened this issue Apr 29, 2025 · 8 comments
@sbernauer
Member

sbernauer commented Apr 29, 2025

Relevant Slack thread: https://stackable-workspace.slack.com/archives/C071M36AF45/p1745911389456239

Which new version of Apache Airflow should we support?

3.0.x

Additional information

https://airflow.apache.org/blog/airflow-three-point-oh-is-here/
https://airflow.apache.org/docs/apache-airflow/stable/installation/upgrading_to_airflow3.html

Breaking changes

  • SubDAGs: Replaced by TaskGroups, Assets, and Data Aware Scheduling.
    • ✅ Nothing we can do here
  • Sequential Executor: Replaced by LocalExecutor, which can be used with SQLite for local development use cases.
    • ✅ Nothing we can do here
  • SLAs: Deprecated and removed; Will be replaced by forthcoming Deadline Alerts.
    • ✅ Nothing we can do here
  • Subdir: Used as an argument on many CLI commands, --subdir or -S has been superseded by DAG bundles.
    • 🔴 Don't know, probably nothing we can do here
  • Some Airflow context variables: The following keys are no longer available in a task instance’s context and will cause DAG errors if not replaced: tomorrow_ds, tomorrow_ds_nodash, yesterday_ds, yesterday_ds_nodash, prev_ds, prev_ds_nodash, prev_execution_date, prev_execution_date_success, next_execution_date, next_ds_nodash, next_ds, execution_date
    • ✅ Nothing we can do here
  • The catchup_by_default dag parameter is now False by default.
    • ✅ Nothing we can do here
  • The create_cron_data_intervals configuration is now False by default. This means that the CronTriggerTimetable will be used by default instead of the CronDataIntervalTimetable
    • ✅ Nothing we can do here
  • Simple Auth is now default auth_manager. To continue using FAB as the Auth Manager, please install the FAB provider and set auth_manager to FabAuthManager:
    • 🔴 We need to do something. Either keep using FabAuthManager (technical debt?) or switch to Simple Auth. Let's hope it can do OIDC properly ;)
    • 🔴 The OPA authorizer probably needs touching as well
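
For DAG authors, most of the removed context keys listed above can be derived from `logical_date`. A minimal sketch, assuming the run has a `logical_date` and that it has already been read from the task context; the helper names are hypothetical and not part of Airflow:

```python
from datetime import datetime, timedelta, timezone


def ds(logical_date: datetime) -> str:
    """Format a date as YYYY-MM-DD, the same shape as the `ds` key."""
    return logical_date.strftime("%Y-%m-%d")


def yesterday_ds(logical_date: datetime) -> str:
    """Stand-in for the removed `yesterday_ds` key."""
    return ds(logical_date - timedelta(days=1))


def tomorrow_ds(logical_date: datetime) -> str:
    """Stand-in for the removed `tomorrow_ds` key."""
    return ds(logical_date + timedelta(days=1))


def nodash(ds_value: str) -> str:
    """Stand-in for the `*_nodash` variants."""
    return ds_value.replace("-", "")


# Illustrative value only; in a task this would come from the context.
example = datetime(2025, 5, 7, 12, 27, tzinfo=timezone.utc)
print(yesterday_ds(example), nodash(tomorrow_ds(example)))
```

This only covers the date-arithmetic keys; the `prev_*` and `next_*` keys depend on the schedule and need the data interval instead.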

Changes required

  • Consider whether we want some sort of automated backup mechanism. Probably not, as this would balloon the issue. But some shell scripts to copy/paste would be awesome!
  • As we are already using Python 3.12 we are good to go
  • (possible change required) Deprecated env-vars should be replaced/updated. See Update environment variable that is used for sql_alchemy_conn #319
  • Start-up commands are different for 3.x. See here. Maybe we should move these out into a start-up script, much like we do for HBase.
    comment(sbernauer): I don't like shell scripts in docker-images, they are hard to maintain and keep versions in sync. I think e.g. in Hive we have an if {} else {} in the operator and different start commands. I don't think this is ugly at all.
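
The if/else approach could look roughly like the sketch below. It is written in Python for illustration only (the operator itself is not Python) and the function name is hypothetical. It relies on the fact that Airflow 3.x starts the UI/API component with `airflow api-server` where 2.x used `airflow webserver`:

```python
def webserver_command(airflow_version: str) -> list[str]:
    """Pick the start command for the webserver role by major version."""
    major = int(airflow_version.split(".")[0])
    if major >= 3:
        # Airflow 3.x: the webserver is replaced by the API server.
        return ["airflow", "api-server"]
    # Airflow 2.x and earlier.
    return ["airflow", "webserver"]


print(webserver_command("2.10.5"))
print(webserver_command("3.0.1"))
```

Keeping the branch in the operator means the image needs no version-specific start script.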

Implementation checklist

  • Update the Docker image
  • Update documentation to include supported version(s)
  • Update and test getting started guide with updated version(s)
  • Update operator to support the new version (if needed)
  • Update integration tests to use the new versions (in addition to or replacing old versions)
  • Update examples to use new versions
@adwk67
Member

adwk67 commented Apr 29, 2025

Constraints file: https://raw.githubusercontent.com/apache/airflow/constraints-3.0.0/constraints-3.12.txt.
Move extras out into a separate file so they can be applied in a version-specific way: https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html
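
As a rough illustration of version-specific constraints and extras, the sketch below builds the constraints URL from the Airflow and Python versions and assembles a `pip install` argument list. The function names and the extras chosen in the example are assumptions for illustration, not what the image build does today:

```python
def constraints_url(airflow_version: str, python_version: str) -> str:
    """Build the constraints file URL for a given Airflow/Python pair."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def pip_install_args(
    airflow_version: str, python_version: str, extras: list[str]
) -> list[str]:
    """Assemble pip arguments with per-version extras and constraints."""
    extras_part = f"[{','.join(extras)}]" if extras else ""
    return [
        "pip",
        "install",
        f"apache-airflow{extras_part}=={airflow_version}",
        "--constraint",
        constraints_url(airflow_version, python_version),
    ]


# Illustrative extras only; the real list would live in the separate file.
print(" ".join(pip_install_args("3.0.0", "3.12", ["celery", "cncf.kubernetes"])))
```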

@adwk67
Member

adwk67 commented Apr 30, 2025

Some breaking changes are only found at provider level e.g. https://airflow.apache.org/docs/apache-airflow-providers-fab/stable/changelog.html#breaking-changes

@adwk67
Member

adwk67 commented Apr 30, 2025

The SimpleAuthManager is meant for testing etc.: sticking with the FabAuthManager should be OK and not represent a "step backwards".
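
Sticking with FAB on 3.x means setting `auth_manager` explicitly, since it is no longer the default. A minimal sketch of the extra environment the operator could inject, assuming the FAB provider is installed in the image; the function name is hypothetical:

```python
# Class path of the FAB auth manager shipped by the FAB provider package.
FAB_AUTH_MANAGER = (
    "airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager"
)


def auth_manager_env(airflow_version: str) -> dict[str, str]:
    """Return env vars needed to keep using FAB as the auth manager."""
    major = int(airflow_version.split(".")[0])
    if major >= 3:
        # 3.x defaults to Simple Auth, so FAB has to be selected explicitly.
        return {"AIRFLOW__CORE__AUTH_MANAGER": FAB_AUTH_MANAGER}
    # Nothing extra is set for 2.x in this sketch.
    return {}


print(auth_manager_env("3.0.1"))
```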

@adwk67
Member

adwk67 commented May 5, 2025

FabAuthManager requires AIRFLOW__DATABASE__SQL_ALCHEMY_CONN to be set; otherwise the default SQLite database is assumed: apache/airflow#49229. See also https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html#database-uri. Start scripts also need updating e.g. here.

We seem to need to set a value for REMOTE_TASK_LOG, mentioned here: apache/airflow#49863
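
To catch the silent SQLite fallback for AIRFLOW__DATABASE__SQL_ALCHEMY_CONN early, a start-up check along these lines might help. This is only a sketch; the function name and the example URI are made up:

```python
import os

DB_CONN_VAR = "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN"


def check_database_uri(env: dict[str, str]) -> str:
    """Fail fast if the DB URI is unset or points at SQLite."""
    uri = env.get(DB_CONN_VAR, "")
    if not uri:
        raise ValueError(f"{DB_CONN_VAR} is not set; SQLite would be assumed")
    if uri.startswith("sqlite"):
        raise ValueError(f"{DB_CONN_VAR} points at SQLite: {uri}")
    return uri


if __name__ == "__main__":
    try:
        print("database URI ok:", check_database_uri(dict(os.environ)))
    except ValueError as err:
        print("database URI problem:", err)
```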

@adwk67 adwk67 moved this from Next to Development: In Progress in Stackable Engineering May 5, 2025
@adwk67
Member

adwk67 commented May 5, 2025

@adwk67
Member

adwk67 commented May 7, 2025

Currently hitting "no host supplied" error.
Hostname is not being written to the database:

airflow=> SELECT dag_id, task_id, run_id, hostname FROM task_instance LIMIT 10;
           dag_id           |  task_id  |                      run_id                       | hostname 
----------------------------+-----------+---------------------------------------------------+----------
 example_trigger_target_dag | run_this  | manual__2025-05-07T12:27:37.629455+00:00_4KP8Myk3 | 
 example_trigger_target_dag | bash_task | manual__2025-05-07T12:27:37.629455+00:00_4KP8Myk3 | 
(2 rows)

airflow=>

Although both of these work within the webserver container:

python -c "from airflow.utils.net import getfqdn; print(getfqdn())"
python -c "from airflow.utils.net import get_hostname; print(get_hostname())"

This was not caused by hostname resolution, but by two missing settings: AIRFLOW__CORE__EXECUTION_API_SERVER_URL must point to wherever the webserver is running, and AIRFLOW__API_AUTH__JWT_SECRET must be set. Once both are set, the logs are accessible:

Image
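
The two settings could be generated along these lines. This is a sketch under assumptions: the host name, port 8080 and the `/execution/` path are illustrative (they mirror Airflow's local default and need adjusting to the actual service), and the function name is hypothetical. The secret is passed in so that all components can share the same value:

```python
import secrets


def execution_api_env(
    webserver_host: str, jwt_secret: str, port: int = 8080
) -> dict[str, str]:
    """Env vars that let task processes reach and authenticate to the API."""
    return {
        # Assumed URL shape; adjust host, port and path to the deployment.
        "AIRFLOW__CORE__EXECUTION_API_SERVER_URL": (
            f"http://{webserver_host}:{port}/execution/"
        ),
        # Must be identical for every component that signs or verifies tokens.
        "AIRFLOW__API_AUTH__JWT_SECRET": jwt_secret,
    }


# Generate the secret once and distribute it, e.g. via a Kubernetes Secret.
shared_secret = secrets.token_urlsafe(32)
env = execution_api_env("airflow-webserver", shared_secret)
print(env["AIRFLOW__CORE__EXECUTION_API_SERVER_URL"])
```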

@NickLarsenNZ
Member

FYI, there is a 3.13 constraints file if we want to use Python 3.13 (it isn't currently available in dnf though).

@adwk67
Member

adwk67 commented May 15, 2025

A LimitRange cannot be used with the KubernetesExecutor as, in 3.0.1, the executor template is extended to add an initContainer (not overridable) that is used to copy the command to a shared volume. This will be replaced in a later patch.

Status: Development: In Progress