Operators are one of the building blocks of Airflow DAGs. There are many different types of operators available in Airflow. The PythonOperator can execute any Python function and is functionally equivalent to using the @task decorator, while other operators contain pre-created logic to perform a specific task, such as executing a Bash script (BashOperator) or running a SQL query in a relational database (SQLExecuteQueryOperator). Operators are used alongside other building blocks, such as decorators and hooks, to create tasks in a DAG written with the task-oriented approach. Operator classes can be imported from Airflow provider packages.
In this guide, you’ll learn the basics of using operators in Airflow.
To view the operators available in different Airflow provider packages, go to the Airflow Registry.
To get the most out of this guide, you should have an understanding of:
Operators are Python classes that encapsulate the logic to do a unit of work. You can think of an operator as a wrapper around a unit of work: it defines the actions to complete and abstracts away most of the code you would otherwise need to write. When you create an instance of an operator in a DAG and provide it with its required parameters, it becomes a task.
A base set of operators is contained in the Airflow standard provider package, which is pre-installed when using the Astro CLI. Other operators are contained in specialized provider packages, often centered around a specific technology or service. For example, the Airflow Snowflake Provider package contains operators for interacting with Snowflake, while the Airflow Google provider package contains operators for interacting with Google Cloud services. There are also several packages that contain operators that can be used with a set of services:
The following are some of the most frequently used Airflow operators. Only a few of the possible parameters are shown. Refer to the Airflow Registry for a full list of parameters for each operator.
PythonOperator: Executes a Python function. It is functionally equivalent to using the @task decorator. See Introduction to the TaskFlow API and Airflow decorators.
BashOperator: Executes a bash script. See also the Using the BashOperator guide.
KubernetesPodOperator: Executes a task defined as a Docker image in a Kubernetes Pod. See Use the KubernetesPodOperator.
SQLExecuteQueryOperator: Executes a SQL query against a relational database.
EmptyOperator: A no-op operator that does nothing. This is useful for creating placeholder tasks in a DAG.
All operators inherit from the abstract BaseOperator class, which contains the logic to execute the work of the operator within the context of a DAG.
Arguments of the BaseOperator class can be passed to all operators. The most common arguments are:
task_id: A unique identifier for the task. This is required for all operators.
retries: The number of times to retry the task if it fails. This is optional and defaults to 0. See Rerun Airflow DAGs and tasks.
pool: The name of the pool to use for the task. This is optional and defaults to None. See Airflow pools.
execution_timeout: The maximum time to wait for the task to complete. This is optional and defaults to None. It is a good practice to set this value to prevent tasks from running indefinitely.
You can set these arguments and other BaseOperator arguments (other than task_id, which must be unique per task) at the DAG level for all tasks in a DAG by using the default_args dictionary. You can override these values for individual tasks by setting the same arguments in the task definition.
Operators typically only require a few parameters. Keep the following considerations in mind when using Airflow operators:
The Airflow standard provider package includes basic operators such as PythonOperator and BashOperator. These operators are automatically available in your Airflow environment if you are using the Astro CLI. All other operators are part of provider packages, some of which you must install separately, depending on which Airflow distribution you are using.
It is common to use the @task decorator for most tasks and add operators for tasks where a specialized operator exists for the use case. The example above shows a DAG with one operator (BashOperator) and one @task decorated task.
If no operator exists for your use case, you can use a @task decorated task or PythonOperator, or extend an operator to meet your needs. For more information about customizing operators, see Custom hooks and operators.
Operators that support deferrable mode can be switched to it by setting the deferrable parameter to True.