Upgrade from Apache Airflow® 2 to 3
Airflow 3 is a major release of Apache Airflow® that includes a completely new UI and significant architectural changes, improving Airflow’s security posture and enabling new features. While Airflow developers took great care to keep as much backward compatibility as possible, making the upgrade process as efficient and smooth as it can be, there are some breaking changes that you need to be aware of. Additionally, the Airflow project has tools to help you upgrade your DAG code and Airflow configuration to be compatible with Airflow 3.
This guide provides a checklist for upgrading from Airflow 2 to Airflow 3, including:
Tip
This guide covers important breaking changes between Airflow 2 and Airflow 3, as well as upgrading instructions for open-source Airflow. Astronomer customers should also refer to the Astro documentation for specific upgrade instructions. For more in-depth instructions on how to upgrade from Airflow 2 to Airflow 3, see the free eBook Practical guide: Upgrade from Apache Airflow 2 to Airflow 3.
To get the most out of this guide, you should have an understanding of:
The following checklist provides a high-level overview of the steps you need to take to upgrade your Airflow 2 environment to Airflow 3. The steps are described in more detail in the sections below.
The astro dev upgrade-test command contains ruff check options.
Check your Astro CLI version with astro version. You need to be at least on version 1.34.0 to run Airflow 3. You can upgrade the Astro CLI with brew upgrade astro.
Update the base image in your Dockerfile to FROM astrocrpublic.azurecr.io/runtime:<astro-runtime-version> (see the Astro Runtime release notes for the latest version available).
Info
If you are still using Airflow 1, we highly recommend upgrading to Airflow 2 as soon as possible. Support for Airflow 1 ended on June 17, 2021, so no further updates are being made, and potential security issues in Airflow 1 are not being addressed. After upgrading to Airflow 2, upgrade to Airflow 2.6.3+; then upgrade to Airflow 3 as explained in this chapter. For information on upgrading from Airflow 1 to Airflow 2, see the Airflow documentation.
In Airflow 3, several deprecated parameters and import paths have been removed. If you have been using deprecated parameters or import paths in your DAG code, you need to update them to be compatible with Airflow 3. ruff is a Python linter and code transformation tool that can check your Airflow DAG code for compatibility with Airflow 3. There are two sets of ruff rules available for upgrading from Airflow 2 to Airflow 3:
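To use the ruff linter, install the latest version of ruff, for example with pip install --upgrade ruff.
Then run the ruff linter on your Airflow DAG code, for example with ruff check --select AIR301 <path-to-your-dag-code>. Depending on your ruff version, the Airflow 3 rules may still be preview rules that additionally require the --preview flag.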
You can add --fix to the command to automatically fix issues that ruff finds. Note that not all issues can be fixed automatically.
After running this command, your terminal shows a list of the issues that ruff found in your code, each with a suggestion for how to fix it. For example, if you have a DAG that uses the fail_stop DAG parameter, which was renamed to fail_fast, ruff flags the parameter and suggests fail_fast as the replacement.
Tip
If you are using the Astro CLI, the ruff check is included in the astro dev upgrade-test command. See the Astro CLI documentation for more information.
Info
The ruff linter is a great tool to help you update your DAG code for Airflow 3. However, it cannot detect all potential compatibility issues. After running the ruff linter, you should still read through the breaking changes section of this guide and the Airflow release notes to ensure that your code is compatible with Airflow 3.
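To make the kind of finding described above concrete, the following sketch uses only the Python standard library to flag two renamed keyword arguments mentioned in this guide (fail_stop and schedule_interval). It is illustrative only: it is not how ruff works internally, the names RENAMED_KWARGS and find_renamed_kwargs are hypothetical, and it flags the keywords on any call rather than only on DAG definitions.

```python
import ast

# Renames taken from this guide; the real ruff rule sets cover far more cases.
RENAMED_KWARGS = {
    "fail_stop": "fail_fast",
    "schedule_interval": "schedule",
}


def find_renamed_kwargs(source: str) -> list:
    """Return (line, old_name, new_name) for each flagged keyword argument."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            # keyword.arg is None for **kwargs, which never matches the dict.
            if keyword.arg in RENAMED_KWARGS:
                findings.append((node.lineno, keyword.arg, RENAMED_KWARGS[keyword.arg]))
    return findings


# The DAG source is only parsed, never imported, so Airflow need not be installed.
example_dag_source = '''
from airflow import DAG

with DAG(dag_id="example", schedule_interval="@daily", fail_stop=True):
    pass
'''

for line, old, new in find_renamed_kwargs(example_dag_source):
    print(f"line {line}: replace {old!r} with {new!r}")
```

A check like this only catches what it was written to look for, which is why reading the breaking changes remains necessary even after a clean linter run.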
In Airflow 3, some changes have been made to configuration options. For upgrading purposes, four categories of changes are relevant:
Changed default values: for example, [scheduler].catchup_by_default has changed from True to False.
Renamed or moved options: for example, [webserver].web_server_host, which has been renamed and moved to [api].host.
Removed options: for example, [webserver].error_logfile.
Newly invalid values: for example, 0 used to be a valid input to [core].parallelism, but now a positive integer is required.
You can learn more about all valid configuration options in the Airflow configuration reference.
The airflow config lint command of the Airflow CLI checks your Airflow configuration for compatibility with Airflow 3, and airflow config update makes the necessary changes to your configuration file for it to be compatible with Airflow 3.
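As a rough illustration of the four categories of configuration changes, the sketch below checks an airflow.cfg-style text for the examples named in this guide, using the standard library configparser module. It is not the implementation of airflow config lint and covers only these few options; check_config and the sample configuration are hypothetical.

```python
import configparser

# Example options taken from this guide, one or two per category of change.
MOVED = {("webserver", "web_server_host"): ("api", "host")}
REMOVED = [("webserver", "error_logfile")]


def check_config(cfg_text: str) -> list:
    """Return human-readable findings for a few Airflow 3 config changes."""
    # interpolation=None because real airflow.cfg files contain literal '%' signs.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    issues = []
    for (section, option), (new_section, new_option) in MOVED.items():
        if parser.has_option(section, option):
            issues.append(f"[{section}].{option} moved to [{new_section}].{new_option}")
    for section, option in REMOVED:
        if parser.has_option(section, option):
            issues.append(f"[{section}].{option} was removed")
    if parser.has_option("core", "parallelism"):
        if parser.getint("core", "parallelism") <= 0:
            issues.append("[core].parallelism must be a positive integer")
    if not parser.has_option("scheduler", "catchup_by_default"):
        issues.append("[scheduler].catchup_by_default is unset; the default is now False")
    return issues


sample_cfg = """
[core]
parallelism = 0

[webserver]
web_server_host = 0.0.0.0
error_logfile = -
"""

for issue in check_config(sample_cfg):
    print(issue)
```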
Astro CLI users first need to export their airflow.cfg file and potentially make a change for it to parse through the linter correctly:
Run astro dev start to start your Astro CLI project.
Run astro dev run config list | awk '/^\[core\]/ {found=1} found' > airflow.cfg to export your current configuration file. This command removes any additional lines at the beginning of the file that might cause a parsing error in the linter.
Copy the airflow.cfg file into your scheduler container with docker cp airflow.cfg <scheduler container>:/usr/local/airflow.
Open a shell in the container with astro dev bash.
Run airflow config lint inside the container to check your configuration file for compatibility with Airflow 3.
After running the airflow config lint command, you will see a list of issues that were found in your configuration file.
Being a new major version, Airflow 3 comes with a number of breaking changes that can affect some of your DAGs, depending on which features you are using. This section lists the most important breaking changes that you need to be aware of when upgrading from Airflow 2 to Airflow 3.
Tip
The list of breaking changes in this guide focuses on the most relevant ones and is not exhaustive. For a full list of changes between Airflow 2 and Airflow 3, see the Airflow release notes.
In Airflow 2, all tasks had direct access to the Airflow metadata database. This access was removed in Airflow 3, greatly improving Airflow’s security posture. If you are accessing the Airflow metadata database directly in any of your task or trigger code, such as by using the SQLAlchemy connection environment variable, that process will error in Airflow 3. This includes custom operators making such a connection.
Recommendation: Directly accessing the Airflow metadata database from within tasks is an antipattern because it can lead to accidentally modifying or dropping information that is vital to Airflow’s functioning, up to and including corruption of your entire Airflow instance. To interact with and retrieve information about your Airflow instance, use the Airflow REST API instead.
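As a minimal sketch of that recommendation, the code below builds a REST API request for the DAG runs of one DAG instead of querying the metadata database. Several details are assumptions to adapt to your deployment: the base URL, the dagRuns endpoint path under /api/v2, and bearer-token authentication (how you obtain the token depends on your auth manager). The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: address of your Airflow API server


def build_dag_runs_request(dag_id: str, token: str, limit: int = 5) -> urllib.request.Request:
    """Build a GET request for the most recent runs of one DAG."""
    # Assumption: the v2 REST API lists runs under /api/v2/dags/<dag_id>/dagRuns.
    url = f"{BASE_URL}/api/v2/dags/{dag_id}/dagRuns?limit={limit}"
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def fetch_dag_runs(dag_id: str, token: str) -> dict:
    """Send the request and decode the JSON body; needs a reachable Airflow instance."""
    request = build_dag_runs_request(dag_id, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Separating request construction from sending keeps the URL and header logic easy to inspect without a running Airflow instance.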
In Airflow 3, the following changes were made to scheduling parameters and utilities:
The deprecated schedule_interval and timetable parameters were removed in favor of schedule. The default schedule is now None.
catchup is now set to False by default at the configuration level. This means that if you do not set a value for catchup, Airflow will not try to catch up on missed runs. You can enable the Airflow 2 behavior by setting the [scheduler].catchup_by_default configuration option to True.
The days_ago function was removed in favor of pendulum.today('UTC').add(days=-N, ...).
If you pass a raw cron string to your DAG’s schedule, for example 0 0 * * *, Airflow 2 interpreted it with the CronDataIntervalTimetable timetable under the hood by default. Airflow 3 uses the CronTriggerTimetable timetable instead. You can restore the Airflow 2 behavior by setting the [scheduler].create_cron_data_intervals configuration option to True. For more information on the differences between the two timetables, see the Timetables comparisons in the Airflow documentation.
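For the days_ago replacement listed above, the guide recommends pendulum. If you prefer not to depend on pendulum in helper code, a standard-library stand-in could look like the following sketch. The function name utc_days_ago is hypothetical, and the assumption is that you want midnight UTC of the current day minus N days, which is what the pendulum expression above computes.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def utc_days_ago(n: int, now: Optional[datetime] = None) -> datetime:
    """Return midnight UTC of the current day, shifted back by n days."""
    # `now` is injectable so the function can be tested deterministically.
    current = now if now is not None else datetime.now(timezone.utc)
    midnight = current.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return midnight - timedelta(days=n)


# Example: a start date two days before today's midnight (UTC).
print(utc_days_ago(2))
```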
In Airflow 2, the logical_date attribute of the DAG run was equivalent to data_interval_start; in Airflow 3, it is equivalent to the run_after date. The run_id takes its timestamp from the moment in time when the DAG run is queued. It is also now possible to pass None as the logical_date. The deprecated execution_date attribute was removed. This change is mostly relevant to users who utilize a time-dependent context element in the logic of their DAGs, for example to partition their data in a SQL query. See Schedule DAGs in Apache Airflow® for more information.
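The effect on date-based partitioning can be sketched with plain datetime arithmetic. This is a simplified model under stated assumptions, not Airflow's timetable code: it assumes a daily schedule, and logical_date_for and partition_key are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone

INTERVAL = timedelta(days=1)  # assumption: a daily schedule such as 0 0 * * *


def logical_date_for(run_after: datetime, airflow2_style: bool) -> datetime:
    """Model the logical_date of a scheduled run that becomes due at run_after."""
    if airflow2_style:
        # Airflow 2: logical_date equals data_interval_start, one interval earlier.
        return run_after - INTERVAL
    # Airflow 3: logical_date equals the run_after date.
    return run_after


def partition_key(logical_date: datetime) -> str:
    """A date partition as it might be templated into a SQL query."""
    return logical_date.strftime("%Y-%m-%d")


due = datetime(2025, 1, 2, tzinfo=timezone.utc)
print(partition_key(logical_date_for(due, airflow2_style=True)))   # 2025-01-01
print(partition_key(logical_date_for(due, airflow2_style=False)))  # 2025-01-02
```

In this model, a query that selects a partition by logical_date reads a different day after the upgrade unless the DAG logic is adjusted or the Airflow 2 scheduling behavior is restored through configuration.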
Airflow 3 introduces other improvements and changes that may affect you if you use any of the related features. The following list summarizes the most important ones:
Airflow 3 uses the v2 version of the Airflow REST API. If you are using the REST API to interact with Airflow, you’ll likely need to update your scripts. See the Airflow REST API documentation for more information.
Some elements were removed from the Airflow context, including execution_date. See the Airflow context guide for more information.
The default auth manager is SimpleAuthManager. If you need FAB integration, install the FAB provider. For more information on auth managers, see the Airflow documentation. Support for FAB-based plugins is limited in Airflow 3.0 but will be available in a future release.
Depending on how you run Airflow, you may find that some Airflow providers (such as FTP, HTTP, and IMAP) that used to be preinstalled in your image/package for Airflow 2 are not preinstalled in Airflow 3. Pip-install the needed providers in your Airflow environment. See the Airflow documentation for a list of officially supported providers.