Apache Airflow®, Airflow, and the Airflow logo are trademarks of the Apache Software Foundation. Copyright © Astronomer 2026. All rights reserved.


Debug an Astro Private Cloud installation


Use this guide when your Astro Private Cloud (APC) control plane or data plane Pods are not progressing to a healthy state after installation.

Ensure platform components are reaching full availability

Work through the following checks, from controllers down to individual containers, to isolate possible causes when Pods don’t reach the READY state.

1. Verify controllers and ReplicaSets

  1. List Deployments, StatefulSets, and ReplicaSets in your namespace and confirm the latest ReplicaSet or StatefulSet shows the expected number of available replicas:

    $ kubectl get deployment,statefulset,replicaset -n <astronomer namespace>
  2. Identify the most recent ReplicaSet for the component that is failing, with results sorted by creation timestamp:

    $ kubectl get replicaset -n <astronomer namespace> --sort-by=.metadata.creationTimestamp
  3. Inspect the returned ReplicaSet for status and events that may be preventing Pods from launching:

    $ kubectl describe replicaset <replicaset-name> -n <astronomer namespace>

    Resolve issues such as insufficient resources, pull errors, or missing secrets, then re-check the ReplicaSet until .status.availableReplicas matches .spec.replicas.
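
    The final comparison in this step can be scripted. The following is a minimal sketch rather than part of the APC tooling: replica_status is a hypothetical helper name, and it assumes its input is the pair of values printed by a kubectl jsonpath query such as the one shown in the comment.

```shell
# Hypothetical helper: compare desired and available replica counts.
# Expects "<spec.replicas> <status.availableReplicas>" on stdin, e.g. from:
#   kubectl get replicaset <replicaset-name> -n <astronomer namespace> \
#     -o jsonpath='{.spec.replicas} {.status.availableReplicas}'
# Prints "<available>/<desired>" and succeeds only when the two match.
replica_status() {
  desired=""
  available=""
  # read returns non-zero when the input has no trailing newline; the
  # variables are still assigned, so that status is ignored.
  read -r desired available || true
  # Assumption: a missing availableReplicas value means zero available.
  available="${available:-0}"
  echo "${available}/${desired}"
  [ -n "$desired" ] && [ "$desired" -eq "$available" ]
}
```

    A non-zero exit status means the ReplicaSet is still short of its desired count, so the helper can be polled in a loop until it succeeds.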

2. Examine Pods and namespace events

  1. List Pod status:

    $ kubectl get pods -n <astronomer namespace>
  2. Describe a failing Pod to view events, container status, and scheduling details:

    $ kubectl describe pod <pod-name> -n <astronomer namespace>
  3. Review recent events in the namespace for additional context:

    $ kubectl get events -n <astronomer namespace> --sort-by=.lastTimestamp

3. Inspect container logs

If a Pod continues to restart or is stuck in CrashLoopBackOff, gather logs for each container:

$ kubectl logs <pod-name> -c <container-name> -n <astronomer namespace>

If the container restarts quickly, use --previous to view logs from the last attempt:

$ kubectl logs <pod-name> -c <container-name> -n <astronomer namespace> --previous

Use the collected errors to adjust your configuration, for example, by fixing database credentials or registry access. After remediation, re-run kubectl get pods to confirm all Pods report READY status. If problems persist, collect the relevant logs and events and contact Astronomer support.
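
When a Pod runs several containers, the two kubectl logs commands above can be wrapped in a loop. This is a sketch under stated assumptions: dump_pod_logs is a hypothetical helper name, and it only covers the regular containers listed under .spec.containers, not init containers.

```shell
# Hypothetical helper: print current and previous logs for every regular
# container in one Pod.
# Usage: dump_pod_logs <pod-name> <astronomer namespace>
dump_pod_logs() {
  pod="$1"
  ns="$2"
  # Space-separated names of the Pod's regular containers.
  containers=$(kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.spec.containers[*].name}')
  for c in $containers; do
    echo "=== $pod/$c (current) ==="
    kubectl logs "$pod" -c "$c" -n "$ns" || true
    echo "=== $pod/$c (previous) ==="
    # --previous fails when the container has not restarted; ignore that.
    kubectl logs "$pod" -c "$c" -n "$ns" --previous 2>/dev/null || true
  done
}
```

Redirecting the output to a file, for example dump_pod_logs <pod-name> <astronomer namespace> > pod-logs.txt, gives you a single artifact to attach if you need to contact Astronomer support.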

Houston Pods stuck in CrashLoopBackOff

Houston (API) connects directly to the control-plane database during startup. If the Pods restart repeatedly:

  1. List Pods to verify their status:

    $ kubectl get pods -n <astronomer namespace>
  2. Test connectivity to the database from inside the cluster:

    $ kubectl run psql --rm -it --restart=Never --namespace <astronomer namespace> \
        --image bitnami/postgresql --command -- \
        psql $(kubectl get secret -n <astronomer namespace> <platform-release-name>-houston-backend \
        --template='{{.data.connection | base64decode }}' | sed 's/?.*//g')

    If the connection times out, investigate networking or firewall rules between Kubernetes nodes and the Postgres host.

  3. Confirm the astronomer-bootstrap secret contains the correct connection string:

    $ kubectl get secret astronomer-bootstrap -n <astronomer namespace> -o yaml

    Decode the connection value and fix any typos. After updating the secret, delete the Houston and Grafana Pods so they pick up the change.
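
    Decoding and re-encoding the connection value by hand is error-prone, mostly because of stray newlines. The sketch below shows one way to do both. decode_connection and encode_connection are hypothetical helper names, and the key name connection is taken from the step above.

```shell
# Hypothetical helper: decode the base64 value of the "connection" key
# read from stdin.
decode_connection() {
  base64 --decode
  echo  # finish with a newline so the terminal prompt stays readable
}

# Hypothetical helper: encode a corrected connection string for the
# Secret's data field.
# printf '%s' avoids the trailing newline that echo would add, and tr
# strips the line wrapping that some base64 builds insert.
encode_connection() {
  printf '%s' "$1" | base64 | tr -d '\n'
}
```

    For example, kubectl get secret astronomer-bootstrap -n <astronomer namespace> -o jsonpath='{.data.connection}' | decode_connection prints the current value. The output includes the database password, so avoid pasting it into tickets or chat.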

x509 “certificate signed by unknown authority” while pulling images

If image pulls fail with a certificate error, such as when syncing registry certificates, restart the Houston Pods followed by the platform registry Pod. Ensure any custom certificate authorities are configured under global.privateCaCerts and applied via helm upgrade.

Houston worker showing NATS timeout errors after installation

After installing or upgrading APC, you might encounter issues where Deployments appear in the Astro CLI and database, but their Kubernetes namespaces are not created. Houston logs might show UnhandledPromiseRejectionWarning: NatsError: TIMEOUT.

This occurs when the NATS JetStream cluster has not yet elected a metadata leader before the Houston worker Pods attempt to set up streams and consumers.

To resolve:

  1. Verify Houston worker Pods are showing NATS timeout errors:

    $ kubectl logs -l component=houston-worker -n <astronomer namespace>
  2. Restart the Houston worker Pods to allow them to reconnect after the NATS leader election completes:

    $ kubectl rollout restart deployment <platform-release-name>-houston-worker -n <astronomer namespace>
  3. Confirm Deployment namespaces are created:

    $ kubectl get namespaces

After the Houston worker Pods restart, they should create the necessary Kubernetes resources for your Deployments.
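
To compare the error volume before and after the restart, the log output can be piped through a small counter. This is a sketch only: count_nats_timeouts is a hypothetical helper name, and it matches the error text quoted earlier in this section.

```shell
# Hypothetical helper: count log lines on stdin that contain the NATS
# timeout error quoted above.
count_nats_timeouts() {
  # grep -c prints 0 but exits non-zero when nothing matches; mask that
  # so the helper is safe to use in scripts that stop on errors.
  grep -c 'NatsError: TIMEOUT' || true
}
```

For example, kubectl logs -l component=houston-worker -n <astronomer namespace> --tail=-1 | count_nats_timeouts. The --tail=-1 flag matters here because kubectl logs returns only the last 10 lines per Pod when a label selector is used.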

Houston worker showing NatsError: 503 after installation

After installing or upgrading APC, Houston and Houston worker Pods may start successfully but silently fail to connect to NATS JetStream. Houston logs might show repeated UnhandledPromiseRejectionWarning: NatsError: 503 entries shortly after startup.

This occurs when Houston and Houston worker start and attempt to initialize JetStream connections before the NATS JetStream subsystem has finished initializing. Both components call jetstreamManager() during startup, which requires JetStream to be fully ready — not just the NATS TCP port. When this API call is made during the JetStream initialization window, NATS returns a 503 “No Responders” error. Because neither component retries on 503, they continue running with broken or missing JetStream connections and silently drop all deployment events. As a result, Deployments may appear in the Astro CLI and database but their Kubernetes namespaces are never created.

To resolve:

  1. Verify that all NATS Pods are healthy and JetStream has finished initializing:

    $ kubectl get pods -l app=<platform-release-name>-nats -n <astronomer namespace>
  2. Wait until all NATS Pods report READY status before continuing. You can also inspect the NATS monitoring endpoint directly from inside the Pod to confirm JetStream is responding:

    $ kubectl exec -n <astronomer namespace> <platform-release-name>-nats-0 -- \
        curl -s http://localhost:8222/jsz

    The response should contain JetStream stream and consumer statistics. If the request fails or returns an error, wait and retry before proceeding.

  3. Restart both Houston and Houston worker Pods:

    $ kubectl rollout restart deployment \
        <platform-release-name>-houston \
        <platform-release-name>-houston-worker \
        -n <astronomer namespace>
  4. Confirm that Houston worker has established active JetStream subscriptions:

    $ kubectl logs -l component=houston-worker -n <astronomer namespace> | grep -i "Running"

    You should see a Running log line for each of the eight JetStream worker subjects, for example, NATS houston-upsert-deployment-for-create Running....

After both Pods restart and JetStream subscriptions are established, deployment operations resume normally.
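
Checking for eight Running lines by eye is easy to get wrong when several worker Pods log the same subjects. The following sketch counts distinct subjects instead. count_running_subjects is a hypothetical helper name, and the pattern assumes that every Running line follows the format of the example above (NATS <subject> Running...).

```shell
# Hypothetical helper: count distinct JetStream subjects that have a
# "Running" log line on stdin.
# Assumes lines contain text shaped like:
#   NATS houston-upsert-deployment-for-create Running...
count_running_subjects() {
  grep -o 'NATS [^ ]* Running' \
    | awk '{print $2}' \
    | sort -u \
    | wc -l \
    | tr -d ' '
}
```

For example, kubectl logs -l component=houston-worker -n <astronomer namespace> --tail=-1 | count_running_subjects should print 8 once all subscriptions are established. Without --tail=-1, kubectl logs returns only the last 10 lines per Pod when a label selector is used, which can hide the startup lines.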