Dillon Browne

Build Immutable Infrastructure Without SSH

Learn how immutable infrastructure eliminates SSH while boosting security and deployment speed. Practical patterns for Kubernetes and cloud-native systems.

The traditional SSH-into-servers workflow is dying, and that’s a good thing. After years of managing production infrastructure across multiple cloud providers, I’ve learned that the best way to secure a server is to make it impossible to log into.

This isn’t about adding more security layers or implementing bastion hosts. It’s a fundamental shift in how we think about infrastructure: treating servers as immutable, disposable units rather than pets we nurture and modify over time.

Why SSH Access Is a Liability

Every SSH session is a potential security incident waiting to happen. In my experience building cloud infrastructure, I’ve seen several patterns emerge:

  • Configuration drift from manual changes that bypass CI/CD
  • Audit trail gaps when troubleshooting requires root access
  • Lateral movement opportunities for attackers who compromise credentials
  • Knowledge silos when only specific team members can “fix” production

The uncomfortable truth is that SSH access often masks deeper problems: poor observability, slow deployment pipelines, or infrastructure that isn’t truly reproducible.

The Immutable Infrastructure Approach

Immutable infrastructure means your servers never change after deployment. Need to update configuration? Deploy a new server with the new configuration and destroy the old one. Found a bug? Deploy a fixed version rather than patching in place.

This approach eliminates entire classes of problems. Consider what a traditional in-place update looks like:

# Traditional mutable approach (dangerous)
# ssh_connect, run_command, and restart_service are placeholder helpers
def update_server(server_id):
    ssh_connect(server_id)
    run_command("apt-get update && apt-get upgrade -y")
    restart_service("nginx")
    # What if this fails halfway through?
    # What if the config drifted before this?

Compare that to the immutable approach:

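# Immutable approach
# create_from_image, health_check, destroy, and load_balancer are placeholder helpers
class DeploymentError(Exception):
    pass

def deploy_new_version(old_server):
    # Build new server from a versioned image
    new_server = create_from_image("app-v2.3.4")

    # Health check before the new server takes traffic
    if not health_check(new_server):
        destroy(new_server)
        raise DeploymentError("Health check failed")

    # Swap: register the new server, then drain the old one
    load_balancer.add_target(new_server)
    load_balancer.remove_target(old_server)

    # Cleanup
    destroy(old_server)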

The immutable version is more code, but it’s far more reliable. Every deployment starts from the same image, can be tested before it takes traffic, and can be reversed by redeploying the previous image version.
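To make the swap concrete, here is a minimal in-memory sketch of the same flow that you can run locally. `FakeLoadBalancer`, `deploy`, and the health-check callable are hypothetical stand-ins I made up for illustration, not a real cloud API.

```python
from dataclasses import dataclass, field
from typing import Callable


class DeploymentError(Exception):
    """Raised when a replacement server fails its health check."""


@dataclass
class FakeLoadBalancer:
    """In-memory stand-in for a real load balancer (hypothetical)."""
    targets: set[str] = field(default_factory=set)

    def add_target(self, server: str) -> None:
        self.targets.add(server)

    def remove_target(self, server: str) -> None:
        self.targets.discard(server)


def deploy(lb: FakeLoadBalancer, old: str, new: str,
           health_check: Callable[[str], bool]) -> None:
    """Swap `old` for `new`; leave `old` serving if `new` is unhealthy."""
    if not health_check(new):
        # The old server was never touched, so there is nothing to roll back.
        raise DeploymentError(f"{new} failed its health check")
    lb.add_target(new)      # the new server starts taking traffic first
    lb.remove_target(old)   # then the old one is taken out of rotation


lb = FakeLoadBalancer({"app-v2.3.3"})
deploy(lb, "app-v2.3.3", "app-v2.3.4", health_check=lambda s: True)
print(lb.targets)  # {'app-v2.3.4'}
```

A failed health check raises before the load balancer is touched, so the old version keeps serving.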

Implement SSH-Less Infrastructure Patterns

In my consulting work, I’ve helped several teams transition to SSH-less infrastructure. Here’s a practical pattern using Kubernetes and infrastructure-as-code:

Deploy Immutable Containers

Container images are immutable by design: once built and tagged, an image’s contents don’t change. Run containers from those images and never patch them in place:

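# Dockerfile with all configuration baked in
FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
# --omit=dev replaces the deprecated --only=production flag
RUN npm ci --omit=dev

COPY . .

# No SSH daemon is installed and the app needs no shell at runtime.
# Alpine still ships a busybox shell; a shell-less base image removes that too.
USER node
CMD ["node", "server.js"]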

Use Declarative Configuration

Everything that would traditionally require SSH lives in version control:

# kubernetes/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: myregistry.io/api:v2.3.4
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: url
        resources:
          limits:
            memory: "512Mi"
            cpu: "500m"

Want to change the memory limit? Modify the YAML, commit it, and let your CI/CD pipeline apply the change. No SSH required.
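One way to keep that pipeline honest is a pre-merge check that rejects manifests which would undermine immutability. The sketch below works on an already-parsed manifest (a plain dict, as a YAML loader would return). The function name and the two rules (pinned image tag, resource limits present) are my assumptions, not a standard tool.

```python
def manifest_violations(deployment: dict) -> list[str]:
    """Return human-readable policy violations for a Deployment manifest."""
    problems = []
    containers = deployment["spec"]["template"]["spec"]["containers"]
    for c in containers:
        image = c.get("image", "")
        # Look only after the last "/" so a registry port is not mistaken for a tag.
        name_and_tag = image.rsplit("/", 1)[-1]
        tag = name_and_tag.split(":", 1)[1] if ":" in name_and_tag else ""
        if tag in ("", "latest"):
            problems.append(f"{c['name']}: image tag must be pinned, got {image!r}")
        if "limits" not in c.get("resources", {}):
            problems.append(f"{c['name']}: resource limits are missing")
    return problems


manifest = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "api", "image": "myregistry.io/api:v2.3.4",
         "resources": {"limits": {"memory": "512Mi", "cpu": "500m"}}},
        {"name": "sidecar", "image": "myregistry.io/proxy:latest"},
    ]}}}
}
for problem in manifest_violations(manifest):
    print(problem)
```

Here the `api` container passes and the hypothetical `sidecar` container is flagged twice: once for the floating `latest` tag and once for missing limits.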

Implement Observability from Day One

Without SSH, observability becomes critical. I always implement comprehensive logging upfront:

// Go application with structured logging
package main

import (
    "log/slog"
    "os"
)

func main() {
    logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
    
    logger.Info("server starting",
        "version", os.Getenv("APP_VERSION"),
        "environment", os.Getenv("ENVIRONMENT"),
    )
    
    // All logs go to stdout/stderr
    // Collected by your logging infrastructure
    // No SSH needed to tail logs
}

Every piece of diagnostic information you’d traditionally SSH in to find should be available through your observability stack: metrics, logs, traces, and events.

Handling the “But What If…” Scenarios

The most common objection I hear: “What if something goes wrong and I need to debug?” Here’s how I handle common scenarios without SSH:

Debugging Production Issues

Instead of SSHing in:

# Old way
ssh production-server
tail -f /var/log/app.log | grep ERROR

Use your observability platform:

# New way - query your logging infrastructure
kubectl logs -l app=api --tail=100 | grep ERROR

# Or use your logging service
curl -X POST https://logs.company.com/api/query \
  -d '{"query": "level:ERROR AND app:api", "time": "last 1h"}'
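Because the application emits one JSON object per line, the `grep ERROR` step can be swapped for a filter that understands the fields. A minimal sketch, assuming the default `slog` JSON keys (`time`, `level`, `msg`); the sample lines and the `route` field are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def errors_only(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed log records whose level is ERROR, skipping non-JSON lines."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a stack trace or output from another process
        if isinstance(record, dict) and record.get("level") == "ERROR":
            yield record


sample = [
    '{"time":"2024-01-01T00:00:00Z","level":"INFO","msg":"server starting"}',
    '{"time":"2024-01-01T00:00:05Z","level":"ERROR","msg":"db timeout","route":"/orders"}',
    "panic: not json",
]
for rec in errors_only(sample):
    print(rec["msg"], rec.get("route"))  # db timeout /orders
```

In practice `lines` would be `sys.stdin`, fed by piping `kubectl logs -l app=api --tail=100` into the script.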

Emergency Hotfixes

In rare emergencies, I use kubectl debug to attach an ephemeral container, but with strict controls:

# Ephemeral debug container (it goes away when the pod is replaced)
kubectl debug -it pod/api-server-abc123 \
  --image=busybox \
  --target=api \
  -- /bin/sh

# The API request is recorded when Kubernetes audit logging is enabled
# The original container's image and spec remain unchanged

The key difference: this is a separate ephemeral container that leaves the application’s image and container spec untouched. With Kubernetes audit logging enabled, the request that created it is recorded as well.
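One way to enforce "strict controls" is to route every debug session through a small wrapper that refuses to run without a stated reason and emits an audit record first. The sketch below only builds the command and the record. The function name, the record fields, the user name, and the ticket ID are all hypothetical.

```python
import json
import shlex
from datetime import datetime, timezone


def debug_session(pod: str, target: str, user: str, reason: str,
                  image: str = "busybox") -> tuple[list[str], str]:
    """Build the kubectl debug command and a JSON audit record for it."""
    if not reason.strip():
        raise ValueError("a reason is required for every debug session")
    cmd = ["kubectl", "debug", "-it", f"pod/{pod}",
           f"--image={image}", f"--target={target}", "--", "/bin/sh"]
    audit = json.dumps({
        "event": "debug_session",
        "user": user,
        "reason": reason,
        "command": shlex.join(cmd),
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return cmd, audit


cmd, audit = debug_session("api-server-abc123", "api",
                           user="alice", reason="heap growth, ticket OPS-1")
print(audit)            # ship this to your log pipeline first
print(shlex.join(cmd))  # then hand cmd to subprocess.run(cmd)
```

Keeping command construction separate from execution makes the wrapper easy to test and lets the audit record go out before anyone gets a shell.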

The Security Benefits

Eliminating SSH access dramatically reduces your attack surface:

  1. Fewer credentials to steal: No SSH keys to leak and no passwords to brute-force
  2. Less room for privilege escalation: No interactive shell from which to move from an application user to root
  3. Cleaner audit trail: All changes flow through CI/CD and are version controlled
  4. Faster incident response: Compromise recovery is just redeploying known-good images
  5. Compliance made easier: Immutable infrastructure simplifies SOC 2 and ISO 27001 audits

In one project, we eliminated 70% of our security findings by removing SSH access and implementing immutable deployments. The CISO’s team loved it because every change was traceable through Git history.

Migrate to Immutable Infrastructure

You don’t have to switch overnight. Here’s how I typically migrate teams:

Phase 1: Shadow with Immutable Deploys (2-4 weeks)

  • Keep SSH access available
  • Start deploying new versions as full replacements
  • Build confidence in the new process

Phase 2: Emergency-Only SSH (4-8 weeks)

  • Require manager approval for SSH access
  • Log all SSH sessions
  • Post-mortems for any SSH usage to improve automation

Phase 3: SSH Removal (ongoing)

  • Disable SSH on new resources
  • Gradually decommission old infrastructure
  • Use ephemeral debug containers for rare edge cases
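During Phase 2, the logged SSH sessions double as a to-do list: the most common reasons for logging in are the first things to automate. A minimal sketch, assuming each session record carries a free-text `reason` field; the record shape and sample data are made up.

```python
from collections import Counter


def automation_backlog(sessions: list[dict]) -> list[tuple[str, int]]:
    """Rank the reasons people used SSH, most frequent first."""
    counts = Counter(s["reason"] for s in sessions)
    return counts.most_common()


sessions = [
    {"user": "alice", "host": "web-1", "reason": "restart stuck worker"},
    {"user": "bob", "host": "web-2", "reason": "read application logs"},
    {"user": "alice", "host": "web-3", "reason": "restart stuck worker"},
]
for reason, count in automation_backlog(sessions):
    print(f"{count}x {reason}")
```

When the list stays empty for a few weeks, that is a reasonable signal to move on to Phase 3.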

Codify Infrastructure with Terraform

The foundation of SSH-less infrastructure is comprehensive IaC. Here’s a Terraform pattern I use:

# terraform/compute.tf
resource "aws_launch_template" "app" {
  name_prefix   = "app-"
  image_id      = data.aws_ami.app_latest.id
  instance_type = "t3.medium"
  
  # No SSH key specified
  # key_name = "production-key"  # Commented out intentionally
  
  metadata_options {
    http_endpoint = "enabled"
    http_tokens   = "required"  # IMDSv2 only
  }
  
  user_data = base64encode(templatefile("${path.module}/user-data.sh", {
    app_version = var.app_version
    environment = var.environment
  }))
  
  tag_specifications {
    resource_type = "instance"
    tags = {
      Name        = "app-${var.environment}"
      Version     = var.app_version
      Immutable   = "true"
      ManagedBy   = "terraform"
    }
  }
}

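Notice the commented-out key_name. That’s intentional: new instances have no SSH key pair provisioned at all. Omitting key_name alone doesn’t stop sshd from running, so pair it with security groups that never open port 22 and an AMI that has no authorized keys baked in.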
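Network rules are worth checking alongside the launch template. The sketch below flags ingress rules that would admit TCP port 22. It assumes rule dicts shaped like the `IpPermissions` entries EC2 returns for a security group, and leaves fetching them out. The function name is my own.

```python
def allows_ssh(ip_permissions: list[dict]) -> bool:
    """True if any ingress rule admits TCP port 22."""
    for rule in ip_permissions:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and all ports
            return True
        if protocol != "tcp":
            continue
        if rule.get("FromPort", 0) <= 22 <= rule.get("ToPort", 65535):
            return True
    return False


https_only = [{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443}]
wide_range = [{"IpProtocol": "tcp", "FromPort": 0, "ToPort": 1024}]
print(allows_ssh(https_only))  # False
print(allows_ssh(wide_range))  # True
```

The second case is the one that tends to slip through review: a broad port range that includes 22 without ever mentioning it.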

Build Comprehensive Monitoring

Without SSH, your observability stack needs to answer every question you’d traditionally SSH in to investigate:

// TypeScript: Comprehensive application metrics
import express from 'express';
import { Registry, Histogram, Gauge } from 'prom-client';

const app = express();
const registry = new Registry();

const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  registers: [registry]
});

const activeConnections = new Gauge({
  name: 'active_database_connections',
  help: 'Number of active database connections',
  registers: [registry]
});

// Export metrics endpoint (scraped by Prometheus)
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', registry.contentType);
  res.end(await registry.metrics());
});

With proper metrics, you can answer questions like:

  • What’s the request latency distribution?
  • How many active connections do we have?
  • What’s the error rate by endpoint?

All without ever SSHing into a server.
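As a worked example of the third question, the sketch below computes the error rate per route from request counts labeled like the histogram above (`method`, `route`, `status_code`). In Prometheus you would express this as a query over the histogram's `_count` series. The sample numbers and the choice to treat only 5xx as errors are assumptions.

```python
from collections import defaultdict


def error_rate_by_route(samples: list[tuple[dict, float]]) -> dict[str, float]:
    """Fraction of requests per route that returned a 5xx status."""
    total = defaultdict(float)
    errors = defaultdict(float)
    for labels, count in samples:
        total[labels["route"]] += count
        if labels["status_code"].startswith("5"):
            errors[labels["route"]] += count
    return {route: errors[route] / total[route] for route in total if total[route]}


samples = [
    ({"method": "GET", "route": "/orders", "status_code": "200"}, 90.0),
    ({"method": "GET", "route": "/orders", "status_code": "500"}, 10.0),
    ({"method": "GET", "route": "/health", "status_code": "200"}, 50.0),
]
print(error_rate_by_route(samples))  # {'/orders': 0.1, '/health': 0.0}
```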

Cost and Performance Wins

Immutable infrastructure isn’t just about security—it has operational benefits:

Faster deployments: My teams went from 45-minute deployment windows (with manual SSH steps) to 5-minute automated rollouts.

Lower costs: Spot instances and auto-scaling work better when servers are truly stateless and disposable.

Better reliability: Configuration drift issues disappeared entirely. Every server is identical to every other server at the same version.

When SSH Might Still Make Sense

I’m pragmatic about this. There are still scenarios where SSH access is reasonable:

  • Development environments: Local development often benefits from direct access
  • Legacy applications: Some systems can’t easily be containerized
  • Specific compliance requirements: Some regulations mandate certain access patterns
  • Highly regulated air-gapped networks: Where typical cloud-native tooling isn’t available

But even in these cases, treat SSH as an escape hatch, not the primary interface.

The Cultural Shift

The hardest part of eliminating SSH isn’t technical—it’s cultural. Engineers who’ve spent years SSHing into servers need to relearn their troubleshooting workflows.

I’ve found success by:

  1. Pairing on incidents: Junior engineers learn to debug without SSH by watching seniors use observability tools
  2. Runbooks that don’t mention SSH: Document procedures using kubectl, cloud CLIs, and observability platforms
  3. Celebrating wins: Track and share how immutable infrastructure prevented issues or accelerated recovery

One team I worked with measured “time to recovery” before and after. Post-migration, their P1 incident recovery time dropped by 60% because they could simply redeploy known-good versions instead of debugging and manually fixing servers.
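If you want to track the same number for your own team, the arithmetic is simple. A minimal sketch; the recovery times below are made up for illustration and are not that team's data.

```python
from statistics import mean


def recovery_change(before_minutes: list[float], after_minutes: list[float]) -> float:
    """Percentage change in mean time to recovery (negative means faster)."""
    baseline = mean(before_minutes)
    return (mean(after_minutes) - baseline) / baseline * 100


# Made-up P1 recovery times in minutes, for illustration only.
before = [120, 90, 150]
after = [40, 50, 54]
print(f"{recovery_change(before, after):.0f}%")  # -60%
```

With only a handful of incidents the mean is noisy, so it helps to report the incident count alongside the percentage.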

Conclusion

Removing SSH access from production infrastructure is counter-intuitive but powerful. It forces better practices: comprehensive observability, automation, and treating infrastructure as code.

The transition requires upfront investment in tooling and cultural change, but the payoff is substantial: better security, faster deployments, and more reliable systems.

In my infrastructure consulting work, every team that’s made this transition has told me the same thing: “We can’t imagine going back to SSH-based workflows.” The combination of improved security posture and operational efficiency makes it a one-way door.

If you’re still SSHing into production servers daily, consider it a signal that your infrastructure automation and observability need improvement. The goal isn’t to make SSH harder—it’s to make it unnecessary.
