Dillon Browne

Scale Beyond GitHub Actions



I’ve spent years building CI/CD pipelines at scale, and I keep seeing the same pattern: teams adopt GitHub Actions for everything, hit walls, and struggle to escape. The truth is, GitHub Actions isn’t designed for complex orchestration - and pretending otherwise costs teams months of productivity. Let me show you when to move beyond GitHub Actions and which orchestration tools actually solve these problems.

The GitHub Actions Comfort Trap

GitHub Actions is brilliant for what it does. Integration is seamless, YAML is familiar, and the marketplace offers thousands of pre-built actions. I’ve built dozens of workflows myself - from simple test runners to complex multi-stage deployments.

But here’s the thing: GitHub Actions is a CI/CD tool, not an orchestration platform. The distinction matters more than you’d think.

In my work with enterprise clients, I’ve seen teams stretch GitHub Actions far beyond its intended use case. They’re running data pipelines, orchestrating microservice deployments across multiple clouds, and managing complex infrastructure provisioning - all through increasingly convoluted YAML files.

The problems start small: a few minutes of queue time here, some flaky reruns there. Then suddenly you’re debugging workflow failures at 2 AM, trying to untangle dependencies across 15 job files.

Identify GitHub Actions Breaking Points

State Management is Fundamentally Limited

GitHub Actions treats each workflow run as ephemeral. You can pass artifacts between jobs, but there’s no built-in concept of stateful orchestration.

I learned this the hard way on a Kubernetes migration project. We needed to coordinate deployment ordering across 40+ microservices with complex dependencies. The workflow file grew to 800 lines of YAML with intricate needs chains.

Here’s a simplified version of what we tried:

jobs:
  deploy-database:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4  # each job runs on a fresh runner, so each needs the manifests
      - name: Deploy PostgreSQL
        run: |
          kubectl apply -f k8s/database/
          kubectl wait --for=condition=ready pod -l app=postgres --timeout=300s

  deploy-cache:
    needs: deploy-database
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy Redis
        run: kubectl apply -f k8s/cache/

  deploy-api:
    needs: [deploy-database, deploy-cache]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy API services
        run: kubectl apply -f k8s/api/

  # ... 37 more jobs

This approach fails for several reasons:

  1. No rollback state: If deploy-api fails, you need manual intervention to restore previous versions
  2. No partial retries: A transient network error means rerunning the entire pipeline
  3. No dynamic dependency resolution: Adding a new service requires editing the YAML dependency graph
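The third gap is the one orchestration platforms actually close: they resolve the dependency graph at run time instead of baking it into YAML. Here's an illustrative sketch (not any particular tool's API) of what dynamic dependency resolution looks like when it's code rather than a hand-edited needs graph:

```typescript
// Illustrative only: derive the deploy order from a dependency map,
// the way an orchestrator resolves a DAG at runtime.
type DepGraph = Record<string, string[]>;

function deployOrder(graph: DepGraph): string[] {
  const order: string[] = [];
  const visiting = new Set<string>();
  const done = new Set<string>();

  function visit(service: string): void {
    if (done.has(service)) return;
    if (visiting.has(service)) throw new Error(`dependency cycle at ${service}`);
    visiting.add(service);
    for (const dep of graph[service] ?? []) visit(dep);
    visiting.delete(service);
    done.add(service);
    order.push(service);
  }

  Object.keys(graph).forEach(visit);
  return order;
}

// Adding a new service is one map entry, not a YAML edit across 40 jobs.
const graph: DepGraph = {
  database: [],
  cache: ['database'],
  api: ['database', 'cache'],
};

console.log(deployOrder(graph)); // dependencies always come first
```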

Concurrency Limits Hit Fast

GitHub enforces hard limits on concurrent workflows and jobs that vary by plan and runner type, and these limits can change over time (see the official GitHub Actions usage limits for current details). In practice, smaller plans allow on the order of a few dozen concurrent jobs, while enterprise tiers typically top out in the low hundreds.

I hit this ceiling on a monorepo with 50 microservices. Each push triggered integration tests for all services - that’s 50 parallel jobs right there. Add in frontend builds, security scans, and infrastructure validation, and we were constantly queued.

The math doesn’t work:

  • 50 microservices × 3 environments (dev, staging, prod) = 150 deployments
  • Queue time: 5-15 minutes per deployment
  • Total pipeline time: hours instead of minutes
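To make that concrete, here's the back-of-the-envelope model as code. The 40-job concurrency cap and the 10-minute average queue time are assumptions for illustration; plug in your own plan's numbers:

```typescript
// Rough model of why a concurrency cap turns minutes into hours.
const deployments = 50 * 3;      // 50 services x 3 environments
const concurrencyCap = 40;       // assumed plan-level cap; yours will differ
const minutesPerDeploy = 15;
const queueMinutes = 10;         // mid-range of the 5-15 minute queue times above

const waves = Math.ceil(deployments / concurrencyCap);
const totalMinutes = waves * (minutesPerDeploy + queueMinutes);

console.log(`${deployments} deployments run in ${waves} waves: roughly ${totalMinutes} minutes`);
```

With these inputs the pipeline serializes into waves, and the total climbs well past an hour even before flaky reruns are counted.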

Complex Conditionals Become Unmaintainable

GitHub Actions uses a limited expression language for conditionals. Once you need business logic beyond “if PR, then test,” you end up writing bash scripts that shuttle state through step outputs.

Here’s actual code I wrote to conditionally deploy based on changed files:

#!/bin/bash
# Runs inline in a workflow `run:` step, so the ${{ }} expressions
# expand before the shell ever sees the script.
set -e

CHANGED_FILES=$(git diff --name-only ${{ github.event.before }} ${{ github.sha }})

# Determine which services changed
DEPLOY_AUTH=false
DEPLOY_API=false
DEPLOY_WEB=false

if echo "$CHANGED_FILES" | grep -q "^services/auth/"; then
  DEPLOY_AUTH=true
fi

if echo "$CHANGED_FILES" | grep -q "^services/api/"; then
  DEPLOY_API=true
fi

if echo "$CHANGED_FILES" | grep -q "^services/web/"; then
  DEPLOY_WEB=true
fi

# Set outputs for matrix strategy
echo "auth=$DEPLOY_AUTH" >> "$GITHUB_OUTPUT"
echo "api=$DEPLOY_API" >> "$GITHUB_OUTPUT"
echo "web=$DEPLOY_WEB" >> "$GITHUB_OUTPUT"

Then reference these in a matrix strategy with nested conditionals. The workflow file became an unreadable mess.

Choose the Right Orchestration Tool

I now use this decision framework:

Stick with GitHub Actions if:

  • You have < 10 deployment targets
  • Workflows complete in < 30 minutes
  • Dependencies are linear or simple fan-out
  • State between runs doesn’t matter

Move to orchestration when:

  • You need dynamic DAGs (directed acyclic graphs)
  • Partial workflow retries are essential
  • You’re orchestrating multi-cloud resources
  • Workflow state needs to persist across runs
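For teams that like checklists as code, here's the same framework as a function; the thresholds come straight from the lists above, and the field names are my own:

```typescript
// Encodes the decision framework above. Field names are illustrative.
interface PipelineProfile {
  deploymentTargets: number;
  workflowMinutes: number;
  needsDynamicDag: boolean;
  needsPartialRetries: boolean;
  multiCloud: boolean;
  needsPersistentState: boolean;
}

function needsOrchestration(p: PipelineProfile): boolean {
  // Any of these capabilities is a hard requirement GitHub Actions can't meet.
  if (p.needsDynamicDag || p.needsPartialRetries || p.multiCloud || p.needsPersistentState) {
    return true;
  }
  // Otherwise, scale is the deciding factor.
  return p.deploymentTargets >= 10 || p.workflowMinutes >= 30;
}
```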

Deploy with Better Orchestration

For teams hitting these limits, I typically recommend one of these paths:

Argo Workflows for Kubernetes-Native Teams

Argo Workflows is purpose-built for orchestration. It understands state, handles complex DAGs, and integrates natively with Kubernetes.

Here’s the same deployment logic, expressed in Argo:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: microservice-deploy-
spec:
  entrypoint: deploy-all
  
  templates:
  - name: deploy-all
    dag:
      tasks:
      - name: deploy-database
        template: deploy-service
        arguments:
          parameters:
          - name: service
            value: database
      
      - name: deploy-cache
        template: deploy-service
        arguments:
          parameters:
          - name: service
            value: cache
        dependencies: [deploy-database]
      
      - name: deploy-api
        template: deploy-service
        arguments:
          parameters:
          - name: service
            value: api
        dependencies: [deploy-database, deploy-cache]

  - name: deploy-service
    inputs:
      parameters:
      - name: service
    container:
      image: bitnami/kubectl:latest
      command: [sh, -c]
      args:
      - |
        kubectl apply -f k8s/{{inputs.parameters.service}}/
        kubectl wait --for=condition=ready pod -l app={{inputs.parameters.service}} --timeout=300s

The key difference: Argo maintains workflow state in etcd. You can retry individual steps, pause workflows, and inspect state at any point. Failed workflows don’t disappear into GitHub’s logs.

Temporal for Complex Business Logic

When workflows involve human approvals, external API calls, or long-running processes, I reach for Temporal.

Temporal workflows are code, not YAML:

import { proxyActivities, sleep } from '@temporalio/workflow';

const { deployService, notifySlack, waitForApproval } = proxyActivities({
  startToCloseTimeout: '5 minutes',
});

export async function deploymentWorkflow(services: string[]): Promise<void> {
  // Deploy infrastructure services first
  await deployService('database');
  await deployService('cache');
  
  // Wait for approval before production deployment
  const approved = await waitForApproval('production-deploy');
  
  if (!approved) {
    await notifySlack('Deployment cancelled by operator');
    return;
  }
  
  // Deploy application services in parallel
  await Promise.all(
    services.map(service => deployService(service))
  );
  
  // Verify health before finishing
  await sleep('2 minutes');
  await notifySlack('Deployment completed successfully');
}

This workflow can run for days if needed. It survives process restarts, handles retries with exponential backoff, and maintains complete history.
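The activities the workflow proxies (deployService, notifySlack, waitForApproval) are plain functions that a Temporal worker registers and retries on your behalf. A minimal sketch with placeholder bodies; the real implementations would shell out to kubectl, post to Slack, and poll an approval store:

```typescript
// Placeholder activity implementations. Temporal retries these
// automatically according to the configured retry policy.
export async function deployService(name: string): Promise<string> {
  // Real version: run kubectl/helm and wait for the rollout to settle.
  return `deployed ${name}`;
}

export async function notifySlack(message: string): Promise<void> {
  // Real version: POST to a Slack webhook.
  console.log(`[slack] ${message}`);
}

export async function waitForApproval(gate: string): Promise<boolean> {
  // Real version: block on a signal or poll an approval record.
  console.log(`waiting on approval gate: ${gate}`);
  return true;
}
```

A worker process then registers these alongside the workflow code (via the Temporal SDK's Worker setup) and executes them on a task queue.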

Optimize with Hybrid CI/CD

I don’t advocate abandoning GitHub Actions entirely. Instead, I use a hybrid strategy:

GitHub Actions handles:

  • PR validation (linting, unit tests, security scans)
  • Building and pushing container images
  • Triggering orchestration workflows

External orchestration handles:

  • Multi-stage deployments
  • Infrastructure provisioning
  • Data pipelines
  • Anything requiring stateful coordination

Here’s the integration pattern I use:

# .github/workflows/trigger-deployment.yml
name: Trigger Deployment

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Build and push image
        run: |
          docker build -t myapp:${{ github.sha }} .
          docker push myapp:${{ github.sha }}
      
      - name: Trigger Argo Workflow
        run: |
          argo submit -n production \
            --from workflowtemplate/deploy-microservices \
            -p image-tag=${{ github.sha }} \
            -p environment=production

GitHub Actions does what it’s good at (build, test), then hands off to Argo for complex orchestration.

Cost Considerations

One objection I hear: “But Argo/Temporal requires infrastructure.”

True. But let’s do the math:

GitHub Actions costs (for our 50-service example):

  • Enterprise plan: $21/user/month + $0.008/minute for Actions
  • 50 services × 10 deployments/day × 15 minutes = 7,500 minutes/day
  • Monthly cost: ~$1,800 in Actions minutes alone

Self-hosted Argo Workflows:

  • 3 controller pods on existing Kubernetes cluster
  • Resource cost: ~$50/month in compute
  • Maintenance time: ~4 hours/month

The ROI is clear, especially at scale.
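The Actions figure is easy to sanity-check (a 30-day month and the Linux per-minute rate from above are assumed):

```typescript
// Reproduce the Actions-minutes estimate from the list above.
const minutesPerDay = 50 * 10 * 15;   // services x deploys/day x minutes each
const monthlyMinutes = minutesPerDay * 30;
const costPerMinute = 0.008;          // Linux runner rate used in the estimate
const monthlyCost = monthlyMinutes * costPerMinute;

console.log(`~$${Math.round(monthlyCost)} per month in Actions minutes`); // ~$1,800
```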

Migrate Workflows Incrementally

When I help teams migrate, I follow this pattern:

  1. Start with new workflows: Don’t rewrite everything. Build new complex workflows in your orchestration tool.

  2. Identify the most painful workflows: Which workflows have the most failures? Longest queue times? Migrate those first.

  3. Keep GitHub Actions as the trigger: Developers stay in their familiar flow. They push code, GitHub Actions builds it, then hands off.

  4. Migrate incrementally: Service by service, workflow by workflow. No big-bang migrations.

The Real Question

The question isn’t “Should I use GitHub Actions?” It’s “What’s the right tool for each part of my pipeline?”

GitHub Actions excels at event-driven automation tightly coupled to your repository. Use it for that. But when you need stateful orchestration, dynamic workflows, or complex dependencies, admit the limits and reach for purpose-built tools.

I’ve seen teams waste months fighting GitHub Actions’ constraints when a weekend of Argo setup would solve their problems. Don’t be that team.

The best CI/CD architecture uses each tool for its strengths. GitHub Actions for repository automation. Argo or Temporal for orchestration. Your job is to know where the boundary lies.

Key Takeaways

  • GitHub Actions is CI/CD, not orchestration - know the difference
  • State management and concurrency limits hit faster than you think
  • Complex conditionals in YAML are a code smell
  • Hybrid approaches work: Actions for builds, orchestration tools for deployments
  • Migration is incremental - don’t rewrite everything at once
  • The right tool depends on your scale and complexity

If you’re spending more time debugging GitHub Actions YAML than shipping features, it’s time to reevaluate your CI/CD orchestration strategy. Your future self will thank you.
