WEBINAR

Data-Aware Scheduling with the Astro Python SDK

Recorded On December 6, 2022

  • Tamara Fingerlin
  • Benji Lampel

Airflow 2.4 introduced the Datasets feature, which enables data-aware scheduling:

  • The DAG author (you) can tell Airflow that a task updates a Dataset: outlets=[Dataset("s3://my_bucket")]
  • DAGs can be scheduled to run on these updates to Datasets: schedule=[Dataset("s3://my_bucket")]
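
The trigger logic behind these two parameters can be sketched in plain Python. This is a toy model under stated assumptions, not Airflow's implementation: ToyDag, ToyScheduler, and task_succeeded are hypothetical names, the Dataset class here is a local stand-in rather than the Airflow class, and "s3://my_other_bucket" is an invented second URI. It assumes that a DAG scheduled on several Datasets runs once each of them has been updated at least once since its last run.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    """Toy stand-in for a Dataset: nothing more than a URI string."""
    uri: str


@dataclass
class ToyDag:
    """Hypothetical consumer DAG that waits on a list of Datasets."""
    dag_id: str
    schedule: list
    pending: set = field(default_factory=set)  # URIs updated since the last run
    runs: int = 0


class ToyScheduler:
    """Hypothetical scheduler that reacts to Dataset updates."""

    def __init__(self, dags):
        self.dags = dags

    def task_succeeded(self, outlets):
        """Record updates to `outlets`; return ids of the DAGs this triggers."""
        updated = {d.uri for d in outlets}
        triggered = []
        for dag in self.dags:
            wanted = {d.uri for d in dag.schedule}
            hits = wanted & updated
            if not hits:
                continue  # this DAG does not consume any updated Dataset
            dag.pending |= hits
            if dag.pending == wanted:  # every consumed Dataset has been updated
                dag.runs += 1
                dag.pending.clear()
                triggered.append(dag.dag_id)
        return triggered


bucket = Dataset("s3://my_bucket")
other = Dataset("s3://my_other_bucket")  # hypothetical second Dataset

single = ToyDag("consumer_single", schedule=[bucket])
both = ToyDag("consumer_both", schedule=[bucket, other])
scheduler = ToyScheduler([single, both])

# A producer task declared with outlets=[bucket] finishes successfully.
first = scheduler.task_succeeded(outlets=[bucket])
# Later, a task declared with outlets=[other] finishes successfully.
second = scheduler.task_succeeded(outlets=[other])
```

In this sketch the first update triggers only consumer_single, while consumer_both waits until its second Dataset has also been updated. In a real DAG file the same relationship is declared with the outlets and schedule parameters shown above.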

You can find all the needed resources in this GitHub repository.
