Apache Airflow®, Airflow, and the Airflow logo are trademarks of the Apache Software Foundation. Copyright © Astronomer 2026. All rights reserved.


Logs configuration


Astro Private Cloud (APC) provides centralized logging through Vector and Elasticsearch. Task logs, platform logs, and audit logs are collected by Vector and indexed in Elasticsearch for searchability and troubleshooting.

Architecture

Vector runs as a DaemonSet collecting logs from all pods. Logs are shipped to Elasticsearch for storage and indexing. For log visualization, you can connect your own tools (Kibana, Grafana, OpenSearch Dashboards, etc.) to query Elasticsearch.

Accessing logs

Airflow UI

Task logs are accessible directly in the Airflow webserver UI:

  1. Navigate to the Dag.
  2. Click a task instance.
  3. Click Log.

Elasticsearch API

Query logs directly via Elasticsearch:

    # Search for errors in the last hour
    curl -X GET "https://elasticsearch.<base-domain>/_search" \
      -H "Content-Type: application/json" \
      -d '{
        "query": {
          "bool": {
            "must": [
              { "match": { "log_level": "ERROR" } },
              { "range": { "@timestamp": { "gte": "now-1h" } } }
            ]
          }
        }
      }'
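
The same search can be issued from a script. The following is a minimal Python sketch using only the standard library. The function names `build_error_query` and `search_logs` are illustrative, the base URL is a placeholder for your own endpoint, and authentication is omitted because it depends on how your cluster is exposed.

```python
import json
import urllib.request


def build_error_query(window: str = "now-1h") -> dict:
    """Build the same bool query as the curl example: ERROR logs since `window`."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"log_level": "ERROR"}},
                    {"range": {"@timestamp": {"gte": window}}},
                ]
            }
        }
    }


def search_logs(base_url: str, query: dict, timeout: float = 10.0) -> dict:
    """POST the query to <base_url>/_search and return the decoded JSON response.

    base_url is an assumption, for example "https://elasticsearch.<base-domain>".
    """
    request = urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_search",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping the query builder separate from the HTTP call lets you inspect or unit test the request body without a reachable cluster.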

BYO visualization

APC does not include a log visualization UI. Connect your preferred tool to Elasticsearch:

  • Kibana: Deploy separately and point to the Elasticsearch endpoint.
  • Grafana: Use the Elasticsearch data source.
  • OpenSearch Dashboards: Use Elasticsearch API compatibility.

Vector configuration

Vector is the log collection agent in APC 1.0.

Enable Vector

    tags:
      logging: true

    global:
      vectorEnabled: true

    vector:
      resources:
        requests:
          cpu: "250m"
          memory: "512Mi"
        limits:
          cpu: "1000m"
          memory: "1Gi"

Custom log parsing

Add custom transforms to parse Airflow log formats:

    vector:
      customConfig: |
        [transforms.parse_airflow]
        type = "remap"
        inputs = ["kubernetes_logs"]
        source = '''
        .parsed = parse_regex!(.message, r'^\[(?P<timestamp>.+)\] \{(?P<logger>.+)\} (?P<level>\w+) - (?P<message>.+)$')
        '''
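
Before deploying a parsing transform, it can help to check the pattern against a real log line. The sketch below applies the same regular expression with Python's `re` module. The sample line and the helper name `parse_airflow_line` are hypothetical; substitute a line copied from your own task logs.

```python
import re
from typing import Optional

# Same pattern as the parse_regex! call above, in Python's `re` syntax.
AIRFLOW_LINE = re.compile(
    r"^\[(?P<timestamp>.+)\] \{(?P<logger>.+)\} (?P<level>\w+) - (?P<message>.+)$"
)


def parse_airflow_line(line: str) -> Optional[dict]:
    """Return the named capture groups, or None when the line doesn't match."""
    match = AIRFLOW_LINE.match(line)
    return match.groupdict() if match else None


# Hypothetical sample line; replace it with one from your own deployment.
sample = "[2026-02-01T00:00:00.000+0000] {taskinstance.py:1234} INFO - Task started"
fields = parse_airflow_line(sample)
print(fields["level"], "|", fields["logger"], "|", fields["message"])
# INFO | taskinstance.py:1234 | Task started
```

Lines that don't follow this layout, such as multi-line tracebacks, return `None` here, so test a few of those as well.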

Logging sidecar

APC supports either DaemonSet or sidecar logging on a data plane cluster, but not both simultaneously. To use sidecar logging, you must first disable the Vector DaemonSet, then enable the sidecar:

    global:
      vectorEnabled: false
      loggingSidecar:
        enabled: true
        name: sidecar-log-consumer
        repository: quay.io/astronomer/ap-vector
        tag: 0.52.0
        resources:
          requests:
            cpu: "100m"
            memory: "386Mi"

Elasticsearch configuration

Enable Elasticsearch

    tags:
      logging: true

    elasticsearch:
      common:
        persistence:
          enabled: true

      client:
        replicas: 2
        heapMemory: "2g"
        resources:
          requests:
            cpu: "1"
            memory: "2Gi"
          limits:
            cpu: "2"
            memory: "4Gi"

      data:
        replicas: 3
        heapMemory: "2g"
        resources:
          requests:
            cpu: "1"
            memory: "2Gi"
          limits:
            cpu: "2"
            memory: "4Gi"
        persistence:
          size: "100Gi"

      master:
        replicas: 3
        heapMemory: "2g"
        resources:
          requests:
            cpu: "1"
            memory: "2Gi"

Index lifecycle management

Configure log retention:

    elasticsearch:
      indexLifecycleManagement:
        enabled: true
        policies:
          - name: airflow-logs
            phases:
              hot:
                actions:
                  rollover:
                    max_size: 50gb
                    max_age: 7d
              delete:
                min_age: 30d
                actions:
                  delete: {}
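
In Elasticsearch ILM, the `min_age` of a phase that follows a rollover is measured from the rollover time, not from index creation. Assuming these chart values map directly onto an Elasticsearch ILM policy (an assumption, not something this page confirms), the oldest log line in an index can outlive the 30-day figure. A small sketch of the arithmetic, using a hypothetical helper name and ignoring the ILM poll interval:

```python
def max_log_age_days(rollover_max_age_days: int, delete_min_age_days: int) -> int:
    """Upper bound on how old a log line can be when its index is deleted.

    A line written just after index creation waits up to `rollover_max_age_days`
    for the age-based rollover, then `delete_min_age_days` more before deletion.
    A size-based rollover (max_size) can only shorten the first term.
    """
    return rollover_max_age_days + delete_min_age_days


# Values from the policy above: rollover max_age 7d, delete min_age 30d.
worst_case = max_log_age_days(7, 30)
print(worst_case)  # 37
```

If a compliance rule requires deletion within a fixed number of days, size `min_age` with the rollover window in mind.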

External logging

Forward to external Elasticsearch

Send logs to your own Elasticsearch cluster:

    global:
      customLogging:
        enabled: true
        scheme: https
        host: "elasticsearch.example.com"
        port: "9200"
        secret: "es-credentials"

Forward to S3

Archive logs to object storage:

    vector:
      sinks:
        s3:
          type: "aws_s3"
          inputs: ["kubernetes_logs"]
          bucket: "my-logs-bucket"
          region: "us-west-2"
          compression: "gzip"
          encoding:
            codec: "json"
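
Archived objects can be read back with a few lines of Python. The sketch below assumes each object is gzip-compressed and holds newline-delimited JSON, one event per line. That fits the `compression: "gzip"` and `codec: "json"` settings above only if the sink's framing is newline-delimited, so verify it against your Vector version. The function name is illustrative, and downloading the object from S3 is left out.

```python
import gzip
import json
from typing import Iterator


def iter_archived_events(blob: bytes) -> Iterator[dict]:
    """Yield one dict per non-empty line of a gzip-compressed NDJSON object."""
    text = gzip.decompress(blob).decode("utf-8")
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)
```

Pass in the raw bytes of one downloaded object, for example `list(iter_archived_events(data))`.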

Forward to external systems

Configure Vector to send to any destination:

    vector:
      sinks:
        # Splunk
        splunk:
          type: "splunk_hec"
          inputs: ["kubernetes_logs"]
          endpoint: "https://splunk.example.com:8088"
          token: "${SPLUNK_TOKEN}"

        # Datadog
        datadog:
          type: "datadog_logs"
          inputs: ["kubernetes_logs"]
          default_api_key: "${DATADOG_API_KEY}"

        # Generic HTTP
        http:
          type: "http"
          inputs: ["kubernetes_logs"]
          uri: "https://logs.example.com/v1/logs"
          encoding:
            codec: "json"

Deployment log settings

Task log retention

Configure the log groomer sidecar to manage disk usage on Airflow pods:

    # In deployment values
    scheduler:
      logGroomerSidecar:
        enabled: true
        retentionDays: 15
        frequencyMinutes: 15

    workers:
      logGroomerSidecar:
        enabled: true
        retentionDays: 15
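
To reason about what a retention window keeps, the following sketch selects files whose modification time falls outside `retentionDays`. It illustrates the idea only. This page doesn't describe how the groomer chooses files, and the function name and the mapping of file names to timestamps are hypothetical.

```python
from typing import Dict, List

SECONDS_PER_DAY = 86_400


def expired_logs(files: Dict[str, float], retention_days: int, now: float) -> List[str]:
    """Return the names of files last modified before the retention cutoff.

    `files` maps a file name to its modification time in epoch seconds.
    """
    cutoff = now - retention_days * SECONDS_PER_DAY
    return sorted(name for name, mtime in files.items() if mtime < cutoff)
```

With `retention_days=15`, a file modified 16 days ago is selected and one modified yesterday is kept.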

Log level configuration

    env:
      - name: AIRFLOW__LOGGING__LOGGING_LEVEL
        value: "INFO"
      - name: AIRFLOW__LOGGING__FAB_LOGGING_LEVEL
        value: "WARNING"

Querying logs

Elasticsearch query examples

Find task failures:

    {
      "query": {
        "bool": {
          "must": [
            { "match": { "log_level": "ERROR" } },
            { "match": { "kubernetes.labels.component": "worker" } }
          ]
        }
      }
    }

Search specific Dag:

    {
      "query": {
        "bool": {
          "must": [
            { "match": { "dag_id": "my_dag" } },
            { "match": { "task_id": "my_task" } }
          ]
        }
      }
    }

Filter by time range:

    {
      "query": {
        "range": {
          "@timestamp": {
            "gte": "2026-02-01T00:00:00",
            "lt": "2026-02-02T00:00:00"
          }
        }
      }
    }
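
These filters are often combined. The sketch below builds one `bool` query that matches a Dag, optionally a task, and optionally a time range. The function name and parameters are illustrative, and the field names are the ones used in the examples above.

```python
from typing import Optional


def build_task_log_query(
    dag_id: str,
    task_id: Optional[str] = None,
    start: Optional[str] = None,
    end: Optional[str] = None,
) -> dict:
    """Combine Dag/task matches with an optional @timestamp range in one bool query."""
    must = [{"match": {"dag_id": dag_id}}]
    if task_id is not None:
        must.append({"match": {"task_id": task_id}})

    query: dict = {"query": {"bool": {"must": must}}}

    time_range = {}
    if start is not None:
        time_range["gte"] = start
    if end is not None:
        time_range["lt"] = end
    if time_range:
        # Range clauses go in `filter`: they don't affect scoring and can be cached.
        query["query"]["bool"]["filter"] = [{"range": {"@timestamp": time_range}}]
    return query
```

For example, `build_task_log_query("my_dag", "my_task", "2026-02-01T00:00:00", "2026-02-02T00:00:00")` produces a request body that can be sent to the `_search` endpoint.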

Common log fields

Field                          Description
kubernetes.namespace_name      Deployment namespace
kubernetes.labels.component    Component (scheduler, worker, etc.)
kubernetes.pod_name            Pod name
dag_id                         DAG identifier
task_id                        Task identifier
log_level                      DEBUG, INFO, WARNING, ERROR
@timestamp                     Log timestamp

Troubleshooting

Logs aren’t appearing

  1. Check that Vector is running:

    kubectl get pods -n astronomer -l app=vector

  2. Check Elasticsearch health:

    kubectl exec -n astronomer elasticsearch-0 -- \
      curl -s localhost:9200/_cluster/health

  3. Check the Vector logs for errors:

    kubectl logs -n astronomer -l app=vector --tail=100
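
The `_cluster/health` response is JSON with a `status` field of `green`, `yellow`, or `red`. As a rough aid for reading it, the sketch below turns a response body into a one-line summary. The function name is illustrative, and you would pass it the output of the health check command.

```python
import json


def summarize_cluster_health(raw: str) -> str:
    """Turn a _cluster/health JSON response into a one-line summary."""
    health = json.loads(raw)
    status = health.get("status", "unknown")
    unassigned = health.get("unassigned_shards", 0)
    if status == "green":
        return "green: all primary and replica shards are assigned"
    if status == "yellow":
        return f"yellow: all primaries assigned, {unassigned} shard(s) unassigned"
    if status == "red":
        return f"red: at least one primary shard is unassigned ({unassigned} unassigned)"
    return f"unknown: unexpected status {status!r}"
```

A `red` status means some data is unavailable, which can explain missing logs even when Vector is healthy.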

High disk usage

  1. Enable index lifecycle management.
  2. Reduce retention period.
  3. Increase Elasticsearch storage.
  4. Forward logs to external storage (S3).

Slow queries

  1. Add index patterns for common searches.
  2. Increase Elasticsearch resources.
  3. Reduce log verbosity.

Security

Access control

Elasticsearch access is restricted to platform components. For external access, configure authentication:

    elasticsearch:
      auth:
        enabled: true
        secretName: "es-credentials"

Log redaction

Redact sensitive data before indexing:

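    vector:
      transforms:
        redact:
          type: "remap"
          # A remap transform needs an input; kubernetes_logs is the source used elsewhere on this page.
          inputs: ["kubernetes_logs"]
          # YAML block scalar holding the VRL program.
          source: |
            .message = replace(string!(.message), r'password=\S+', "password=***")
            .message = replace(.message, r'api_key=\S+', "api_key=***")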
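
Redaction patterns are easy to get subtly wrong, so it's worth testing them against sample messages before rolling them out. The sketch below applies the same two patterns with Python's `re` module. The sample message and the helper name `redact` are hypothetical.

```python
import re

# Same patterns as the remap transform above.
REDACTIONS = [
    (re.compile(r"password=\S+"), "password=***"),
    (re.compile(r"api_key=\S+"), "api_key=***"),
]


def redact(message: str) -> str:
    """Replace every match of each pattern with its masked form."""
    for pattern, replacement in REDACTIONS:
        message = pattern.sub(replacement, message)
    return message


print(redact("login user=alice password=hunter2 api_key=abc123 ok"))
# login user=alice password=*** api_key=*** ok
```

Because `\S+` stops at the first whitespace character, a quoted value that contains spaces is only partly masked. Extend the patterns if your logs include such values.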

Best practices

  • Set appropriate retention based on compliance requirements.
  • Choose log levels deliberately: avoid DEBUG in production.
  • Enable log groomer to prevent disk exhaustion on Airflow pods.
  • Forward logs externally for long-term retention and compliance.
  • Monitor Elasticsearch health and disk usage.
  • Use your preferred visualization tool: deploy Kibana, Grafana, or another tool separately.