Build Immutable Infrastructure Without SSH
Learn how immutable infrastructure eliminates SSH while boosting security and deployment speed. Practical patterns for Kubernetes and cloud-native systems.
The traditional SSH-into-servers workflow is dying, and that’s a good thing. After years of managing production infrastructure across multiple cloud providers, I’ve learned that the best way to secure a server is to make it impossible to log into.
This isn’t about adding more security layers or implementing bastion hosts. It’s a fundamental shift in how we think about infrastructure: treating servers as immutable, disposable units rather than pets we nurture and modify over time.
Why SSH Access Is a Liability
Every SSH session is a potential security incident waiting to happen. In my experience building cloud infrastructure, I’ve seen several patterns emerge:
- Configuration drift from manual changes that bypass CI/CD
- Audit trail gaps when troubleshooting requires root access
- Lateral movement opportunities for attackers who compromise credentials
- Knowledge silos when only specific team members can “fix” production
The uncomfortable truth is that SSH access often masks deeper problems: poor observability, slow deployment pipelines, or infrastructure that isn’t truly reproducible.
The Immutable Infrastructure Approach
Immutable infrastructure means your servers never change after deployment. Need to update configuration? Deploy a new server with the new configuration and destroy the old one. Found a bug? Deploy a fixed version rather than patching in place.
This approach eliminates entire classes of problems:
```python
# Traditional mutable approach (dangerous)
def update_server(server_id):
    ssh_connect(server_id)
    run_command("apt update && apt upgrade")
    restart_service("nginx")
    # What if this fails halfway through?
    # What if the config drifted before this?
```
Compare that to the immutable approach:
```python
# Immutable approach
def deploy_new_version(old_server_id):
    # Build new server from base image
    new_server = create_from_image("app-v2.3.4")

    # Health check
    if not health_check(new_server):
        destroy(new_server)
        raise DeploymentError("Health check failed")

    # Atomic swap
    load_balancer.add_target(new_server)
    load_balancer.remove_target(old_server_id)

    # Cleanup
    destroy(old_server_id)
```
The immutable version is more code, but it’s far more reliable. Every deployment is identical, testable, and reversible.
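The `health_check` step above deserves care in practice: a freshly created server may need several seconds to warm up, so a single probe is not enough. Here is a minimal retry-with-backoff sketch in Python; the `probe` callable, attempt counts, and delays are illustrative assumptions, not part of the original pseudocode:

```python
import time

def health_check(probe, attempts=5, base_delay=0.1):
    """Return True if probe() succeeds within the allowed attempts.

    probe: a zero-argument callable that returns True once the new
    server answers its health endpoint (e.g. a wrapper around an
    HTTP GET). Uses exponential backoff between attempts.
    """
    for attempt in range(attempts):
        if probe():
            return True
        time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    return False

# Example: a server that only becomes healthy on the third probe
state = {"calls": 0}
def flaky_probe():
    state["calls"] += 1
    return state["calls"] >= 3

print(health_check(flaky_probe))  # True, after three attempts
```

Only when this gate passes does the atomic swap happen; a server that never becomes healthy is destroyed without ever receiving traffic.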
Implement SSH-Less Infrastructure Patterns
In my consulting work, I’ve helped several teams transition to SSH-less infrastructure. Here’s a practical pattern using Kubernetes and infrastructure-as-code:
Deploy Immutable Containers
Containers are naturally immutable. Once built, they don’t change:
```dockerfile
# Dockerfile with all configuration baked in
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .
# No SSH daemon, no shell required
USER node
CMD ["node", "server.js"]
```
Use Declarative Configuration
Everything that would traditionally require SSH lives in version control:
```yaml
# kubernetes/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myregistry.io/api:v2.3.4
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url
          resources:
            limits:
              memory: "512Mi"
              cpu: "500m"
```
Want to change the memory limit? Modify the YAML, commit it, and let your CI/CD pipeline apply the change. No SSH required.
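The pipeline step that applies the change can be tiny. As a sketch, a GitHub Actions-style workflow that re-applies the manifests whenever they change on the main branch; the workflow name, secret, and paths here are assumptions for illustration, not from the original:

```yaml
# .github/workflows/deploy.yaml (hypothetical)
name: apply-kubernetes-manifests
on:
  push:
    branches: [main]
    paths: ["kubernetes/**"]

jobs:
  apply:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure cluster access
        run: echo "${{ secrets.KUBECONFIG }}" > kubeconfig  # cluster credentials live in CI, not on laptops
      - name: Apply manifests declaratively
        run: kubectl apply -f kubernetes/ --kubeconfig kubeconfig
```

The important property is that the cluster credentials belong to the pipeline, so Git history is the only path a change can take into production.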
Implement Observability from Day One
Without SSH, observability becomes critical. I always implement comprehensive logging upfront:
```go
// Go application with structured logging
package main

import (
	"log/slog"
	"os"
)

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	logger.Info("server starting",
		"version", os.Getenv("APP_VERSION"),
		"environment", os.Getenv("ENVIRONMENT"),
	)
	// All logs go to stdout/stderr
	// Collected by your logging infrastructure
	// No SSH needed to tail logs
}
```
Every piece of diagnostic information you’d traditionally SSH in to find should be available through your observability stack: metrics, logs, traces, and events.
Handling the “But What If…” Scenarios
The most common objection I hear: “What if something goes wrong and I need to debug?” Here’s how I handle common scenarios without SSH:
Debugging Production Issues
Instead of SSHing in:
```bash
# Old way
ssh production-server
tail -f /var/log/app.log | grep ERROR
```
Use your observability platform:
```bash
# New way - query your logging infrastructure
kubectl logs -l app=api --tail=100 | grep ERROR

# Or use your logging service
curl -X POST https://logs.company.com/api/query \
  -d '{"query": "level:ERROR AND app:api", "time": "last 1h"}'
```
Emergency Hotfixes
In rare emergencies, I use kubectl exec, but with strict controls:
```bash
# Temporary debug container (terminates after use)
kubectl debug -it pod/api-server-abc123 \
  --image=busybox \
  --target=api \
  -- /bin/sh

# This is audited, logged, and temporary
# The original container remains immutable
```
The key difference: this is a separate ephemeral container that doesn’t modify the running application. It’s also fully logged and audited.
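One way to make those "strict controls" concrete is RBAC: gate ephemeral debug containers behind a dedicated role held only by an on-call group. The sketch below is an assumption about how you might wire this up (the namespace and group name are illustrative), not a prescribed setup:

```yaml
# rbac/break-glass-debug.yaml (hypothetical)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: break-glass-debug
  namespace: production
rules:
  # kubectl debug with --target patches the pod's ephemeral containers
  - apiGroups: [""]
    resources: ["pods/ephemeralcontainers"]
    verbs: ["patch"]
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: break-glass-debug
  namespace: production
subjects:
  - kind: Group
    name: oncall-engineers  # assumption: a group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: break-glass-debug
  apiGroup: rbac.authorization.k8s.io
```

Because every use of this role flows through the API server, the audit log records exactly who attached a debug container, to which pod, and when.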
The Security Benefits
Eliminating SSH access dramatically reduces your attack surface:
- No credential theft: No SSH keys to steal or passwords to brute-force
- No privilege escalation: Can’t escalate from application user to root
- Perfect audit trail: All changes flow through CI/CD and are version controlled
- Faster incident response: Compromise recovery is just redeploying known-good images
- Compliance made easier: Immutable infrastructure simplifies SOC 2 and ISO 27001 audits
In one project, we eliminated 70% of our security findings by removing SSH access and implementing immutable deployments. The CISO’s team loved it because every change was traceable through Git history.
Migrate to Immutable Infrastructure
You don’t have to switch overnight. Here’s how I typically migrate teams:
Phase 1: Shadow with Immutable Deploys (2-4 weeks)
- Keep SSH access available
- Start deploying new versions as full replacements
- Build confidence in the new process
Phase 2: Emergency-Only SSH (4-8 weeks)
- Require manager approval for SSH access
- Log all SSH sessions
- Post-mortems for any SSH usage to improve automation
Phase 3: SSH Removal (ongoing)
- Disable SSH on new resources
- Gradually decommission old infrastructure
- Use ephemeral debug containers for rare edge cases
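For Phase 3, disabling the daemon on new instances can be a few lines of user data rather than an image change. A hypothetical cloud-init sketch (service names vary by distribution, so this is a starting point, not a drop-in):

```yaml
#cloud-config (hypothetical user data for new instances)
runcmd:
  - systemctl stop ssh sshd || true   # service name differs by distro
  - systemctl mask ssh sshd           # cannot be re-enabled without rebuilding the image
  - rm -f /root/.ssh/authorized_keys  # drop any keys the base image provisioned
```

Masking the unit, rather than merely stopping it, means the only way to get SSH back is to rebuild and redeploy, which is exactly the workflow you want to encourage.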
Codify Infrastructure with Terraform
The foundation of SSH-less infrastructure is comprehensive IaC. Here’s a Terraform pattern I use:
```hcl
# terraform/compute.tf
resource "aws_launch_template" "app" {
  name_prefix   = "app-"
  image_id      = data.aws_ami.app_latest.id
  instance_type = "t3.medium"

  # No SSH key specified
  # key_name = "production-key"  # Commented out intentionally

  metadata_options {
    http_endpoint = "enabled"
    http_tokens   = "required" # IMDSv2 only
  }

  user_data = base64encode(templatefile("${path.module}/user-data.sh", {
    app_version = var.app_version
    environment = var.environment
  }))

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name      = "app-${var.environment}"
      Version   = var.app_version
      Immutable = "true"
      ManagedBy = "terraform"
    }
  }
}
```
Notice the commented-out key_name. That’s intentional. New instances don’t have SSH keys provisioned at all.
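You can state the same intent at the network layer: a security group that simply never opens port 22. This is a sketch under assumptions (the resource name, `var.vpc_id`, and CIDRs are illustrative):

```hcl
# terraform/network.tf (hypothetical)
resource "aws_security_group" "app" {
  name_prefix = "app-"
  vpc_id      = var.vpc_id # assumption: defined elsewhere in the module

  # Application traffic only -- there is no rule for port 22 at all
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Even if someone later provisions a key out of band, the network path to the SSH daemon never exists.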
Build Comprehensive Monitoring
Without SSH, your observability stack needs to answer every question you’d traditionally SSH in to investigate:
```typescript
// TypeScript: Comprehensive application metrics
import express from 'express';
import { Registry, Histogram, Gauge } from 'prom-client';

const app = express();
const registry = new Registry();

const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  registers: [registry],
});

const activeConnections = new Gauge({
  name: 'active_database_connections',
  help: 'Number of active database connections',
  registers: [registry],
});

// Export metrics endpoint (scraped by Prometheus)
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', registry.contentType);
  res.end(await registry.metrics());
});
```
With proper metrics, you can answer questions like:
- What’s the request latency distribution?
- How many active connections do we have?
- What’s the error rate by endpoint?
All without ever SSHing into a server.
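With the metrics above exported, each of those questions becomes a short PromQL query. These are shown against the metric names from the snippet; exact label names depend on how your instrumentation records requests:

```promql
# p95 request latency per route over the last 5 minutes
histogram_quantile(0.95,
  sum(rate(http_request_duration_seconds_bucket[5m])) by (le, route))

# Error rate by endpoint (5xx share of all requests)
sum(rate(http_request_duration_seconds_count{status_code=~"5.."}[5m])) by (route)
  / sum(rate(http_request_duration_seconds_count[5m])) by (route)

# Current database connections
active_database_connections
```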
Cost and Performance Wins
Immutable infrastructure isn’t just about security—it has operational benefits:
Faster deployments: My teams went from 45-minute deployment windows (with manual SSH steps) to 5-minute automated rollouts.
Lower costs: Spot instances and auto-scaling work better when servers are truly stateless and disposable.
Better reliability: Configuration drift issues disappeared entirely. Every server is identical to every other server at the same version.
When SSH Might Still Make Sense
I’m pragmatic about this. There are still scenarios where SSH access is reasonable:
- Development environments: Local development often benefits from direct access
- Legacy applications: Some systems can’t easily be containerized
- Specific compliance requirements: Some regulations mandate certain access patterns
- Highly regulated air-gapped networks: Where typical cloud-native tooling isn’t available
But even in these cases, treat SSH as an escape hatch, not the primary interface.
The Cultural Shift
The hardest part of eliminating SSH isn’t technical—it’s cultural. Engineers who’ve spent years SSHing into servers need to relearn their troubleshooting workflows.
I’ve found success by:
- Pairing on incidents: Junior engineers learn to debug without SSH by watching seniors use observability tools
- Runbooks that don’t mention SSH: Document procedures using kubectl, cloud CLIs, and observability platforms
- Celebrating wins: Track and share how immutable infrastructure prevented issues or accelerated recovery
One team I worked with measured “time to recovery” before and after. Post-migration, their P1 incident recovery time dropped by 60% because they could simply redeploy known-good versions instead of debugging and manually fixing servers.
Conclusion
Removing SSH access from production infrastructure is counter-intuitive but powerful. It forces better practices: comprehensive observability, automation, and treating infrastructure as code.
The transition requires upfront investment in tooling and cultural change, but the payoff is substantial: better security, faster deployments, and more reliable systems.
In my infrastructure consulting work, every team that’s made this transition has told me the same thing: “We can’t imagine going back to SSH-based workflows.” The combination of improved security posture and operational efficiency makes it a one-way door.
If you’re still SSHing into production servers daily, consider it a signal that your infrastructure automation and observability need improvement. The goal isn’t to make SSH harder—it’s to make it unnecessary.