
About Google Dataproc

[Google Dataproc](https://cloud.google.com/dataproc/) is a highly scalable managed service for running [Apache Spark](/integrations/apache-spark/), Apache Flink, Presto, and many other open source tools, fully integrated with [Google Cloud](/docs/astro/ci-cd-templates/gcs/). Use Google Dataproc to run compute-intensive Astro tasks that process large amounts of data for data science and [ETL](/solutions/etl-elt/) workloads.


Use Case

A common use case for orchestrating Google Dataproc jobs with Astro is gaining insights from large amounts of data through distributed machine learning. Astro offers specialized [operators](/docs/learn/what-is-an-operator/) that interact with Google Dataproc asynchronously, so a task does not hold compute resources while it waits for a Dataproc job to finish, which makes your pipeline more cost-effective.
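As a rough illustration of why asynchronous waiting can lower cost, the sketch below uses only Python's standard `asyncio` library, not Airflow or the Dataproc client. The job names, the `fetch_job_state` helper, and the poll counts are all hypothetical stand-ins. The sketch only shows the general pattern of one event loop waiting on several long-running jobs at once instead of dedicating a blocked worker to each.

```python
import asyncio


async def fetch_job_state(job_id: str, polls_left: dict) -> str:
    """Hypothetical stand-in for a remote job-status call (not a real Dataproc API)."""
    await asyncio.sleep(0.01)  # simulate network latency without blocking the loop
    polls_left[job_id] -= 1
    return "DONE" if polls_left[job_id] <= 0 else "RUNNING"


async def wait_for_job(job_id: str, polls_left: dict, poll_interval: float = 0.01):
    """Poll until the job reports DONE, yielding control between polls."""
    while True:
        state = await fetch_job_state(job_id, polls_left)
        if state == "DONE":
            return job_id, state
        # While this coroutine sleeps, the event loop can service other jobs.
        await asyncio.sleep(poll_interval)


async def wait_for_all(jobs: dict):
    """Wait on every job concurrently; results keep the input order."""
    polls_left = dict(jobs)  # copy so the caller's dict is not mutated
    return await asyncio.gather(*(wait_for_job(j, polls_left) for j in jobs))


def run_demo():
    # Hypothetical jobs mapped to the number of polls each needs before finishing.
    jobs = {"train-model-a": 3, "train-model-b": 1, "etl-nightly": 2}
    return asyncio.run(wait_for_all(jobs))
```

Calling `run_demo()` returns one `(job_id, state)` pair per job. In this sketch a single process tracks all three jobs, which is the same idea behind freeing worker capacity while a remote job runs.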
