Wild infrastructure is killing your team

October 27, 2025 · 4 min read

Resources created outside IaC, managed by hand, documented nowhere. It's faster until it isn't.

DevOps · IaC · Cloud · FinOps

Someone needed a quick fix. AWS console, click click, EC2 instance. Maybe an S3 bucket. Security group with port 22 open to the world.

Six months later, it’s still running. Nobody knows what it does. Nobody knows who created it. Not in Terraform. Not documented. Just… there.

This is wild infrastructure.

What it looks like

Console-created resources. Quick fixes that become permanent. Test environments never deleted. That Lambda someone deployed “just to try something”.

No state file tracks it. No PR reviewed it. It just appeared.

Symptoms:

Cloud bills with mystery resources
Security groups nobody remembers
“Don’t touch that, I think it’s important”

Why people do it

Speed. Always speed.

“I’ll add it to code later.” (They won’t.) “It’s just temporary.” (It isn’t.)

The console is right there. Click, click, done. Terraform feels slow when you need something now.

The test works, becomes the solution, lives forever in the shadows.

The real cost

FinOps

No cost allocation tags. You can’t attribute the spend to a team or project. It just bleeds money.

I’ve seen $15,000/month in forgotten resources during audits. RDS instances running for years with no connections. EBS volumes attached to nothing!

You can’t optimize what you can’t see.

Security

That security group for “quick testing”? Still open. That IAM role with admin permissions for “debugging”? Still active.

No security review. No policy compliance. No patches when vulnerabilities drop.

Your security is only as strong as your weakest resource. And you don’t know where it is!

Knowledge loss

Creator leaves or forgets. When it breaks, you debug blind. No commit history. No documentation.

Teams spend days reverse-engineering what a manually created Lambda does because the author left.

Drift

IaC says one thing. Reality says another. terraform plan shows drift everywhere.

You stop trusting your state. You stop running applies. IaC becomes documentation instead of the source of truth.
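One way to keep drift visible instead of ignoring it is to summarize a saved plan. The sketch below assumes you have exported a plan with terraform show -json and reads its resource_changes list, returning every address whose planned actions are anything other than no-op. The function name and the sample data are illustrative, not part of Terraform.

```python
import json


def pending_changes(plan_json: str) -> dict[str, list[str]]:
    """Map resource address -> planned actions, skipping no-op entries."""
    plan = json.loads(plan_json)
    changes = {}
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if actions and actions != ["no-op"]:
            changes[rc["address"]] = actions
    return changes


# Hypothetical, trimmed-down plan JSON; a real export has many more fields.
SAMPLE_PLAN = json.dumps({
    "resource_changes": [
        {"address": "aws_s3_bucket.logs", "change": {"actions": ["no-op"]}},
        {"address": "aws_security_group.web", "change": {"actions": ["update"]}},
        {"address": "aws_instance.old", "change": {"actions": ["delete"]}},
    ]
})

for address, actions in pending_changes(SAMPLE_PLAN).items():
    print(f"{address}: {', '.join(actions)}")
```

A short list like this is easier to act on than a wall of plan output. Note that it reports every difference between code and reality, whichever side changed.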

The “faster” lie

Day 1: Create resource manually. 5 minutes. Feel productive.

Week 2: Teammate asks what that resource does. 15 minutes explaining.

Month 1: Need to replicate in staging. 2 hours recreating from memory.

Month 3: Security audit. 4 hours documenting what you can find.

Month 6: Resource breaks. 8 hours debugging with no context.

Year 1: Cloud cost review. Nobody knows if it’s needed. Keep it just in case.

That 5 minutes “saved” cost more than 14 hours. And it keeps costing!

Writing Terraform on day 1 would have taken 30 minutes. PR review, 15 minutes. Total: 45 minutes. With documentation, version history, and reproducibility.
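To make the comparison concrete, here is the tally of the figures above. It counts only the durations listed in the timeline and leaves out the open-ended Year 1 cost, so treat it as a lower bound. The dictionary labels are just paraphrases of the steps.

```python
# Minutes spent at each step of the manual timeline above.
MANUAL_MINUTES = {
    "Day 1: create by hand": 5,
    "Week 2: explain to a teammate": 15,
    "Month 1: recreate in staging": 2 * 60,
    "Month 3: document for the audit": 4 * 60,
    "Month 6: debug with no context": 8 * 60,
}

# Minutes for the IaC path on day 1.
IAC_MINUTES = {"Write Terraform": 30, "PR review": 15}


def total_minutes(steps: dict[str, int]) -> int:
    """Sum the minutes across all steps."""
    return sum(steps.values())


def as_hours_minutes(minutes: int) -> str:
    """Format a minute count as 'Xh YYm'."""
    hours, rest = divmod(minutes, 60)
    return f"{hours}h {rest:02d}m"


manual = total_minutes(MANUAL_MINUTES)
iac = total_minutes(IAC_MINUTES)

print("manual:", as_hours_minutes(manual))  # manual: 14h 20m
print("iac:   ", as_hours_minutes(iac))     # iac:    0h 45m
print("ratio: ", round(manual / iac, 1))    # ratio:  19.1
```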

How to fix it

Make IaC the easy path

If Terraform feels slow, fix your workflow. Don’t abandon IaC.

Speed up PR reviews for infrastructure changes
Use modules for common patterns
Set up CI/CD that applies on merge
Make terraform apply a one-command operation

When IaC is faster than clicking, people use IaC.

Detect and import

Use tools to find wild infrastructure:

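# AWS: find resources with no tags
# Note: this API only returns resources that are, or once were, tagged.
# Resources that never had a tag need a per-service check.
aws resourcegroupstaggingapi get-resources --resources-per-page 100 | jq '.ResourceTagMappingList | map(select(.Tags | length == 0))'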

# Or use dedicated tools
steampipe query "select * from aws_ec2_instance where tags is null"
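If you would rather post-process the output in a script, the same filter is a few lines of Python. This is a minimal sketch that assumes you already have the get-resources responses as dictionaries (from saved CLI output or an SDK); the function name and the sample ARNs are made up for illustration.

```python
from typing import Iterable


def untagged_arns(pages: Iterable[dict]) -> list[str]:
    """Return the ARN of every resource whose Tags list is empty or missing."""
    arns = []
    for page in pages:
        for mapping in page.get("ResourceTagMappingList", []):
            if not mapping.get("Tags"):
                arns.append(mapping["ResourceARN"])
    return arns


# Hypothetical sample shaped like a get-resources response.
SAMPLE_PAGES = [
    {
        "ResourceTagMappingList": [
            {
                "ResourceARN": "arn:aws:ec2:eu-west-1:123456789012:instance/i-1234567890abcdef0",
                "Tags": [],
            },
            {
                "ResourceARN": "arn:aws:s3:::billing-exports",
                "Tags": [{"Key": "Owner", "Value": "finance"}],
            },
        ]
    },
    {
        "ResourceTagMappingList": [
            {"ResourceARN": "arn:aws:lambda:eu-west-1:123456789012:function:try-something", "Tags": []},
        ]
    },
]

for arn in untagged_arns(SAMPLE_PAGES):
    print(arn)
```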

Then import what matters into Terraform. The target resource block (here aws_instance.mystery) must already exist in your configuration:

terraform import aws_instance.mystery i-1234567890abcdef0

Delete what doesn’t matter. Most of it doesn’t.

Tag everything

Enforce tagging at the organization level. Every resource needs:

Owner (team or person)
Project
Environment
Created date

AWS Service Control Policies can block resource creation without required tags. Use them.
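As a rough sketch of what such a policy can look like, the snippet below generates an SCP document that denies ec2:RunInstances when a required tag is missing from the request. It emits one Deny statement per tag, because keys inside a single condition are ANDed and one combined statement would only fire when every tag is missing. The tag key names, the Sid naming, and the choice to cover only EC2 instance launches are assumptions; other services need their own actions, and not every action supports request-tag conditions.

```python
import json

# Assumed tag key names, based on the list above.
REQUIRED_TAGS = ["Owner", "Project", "Environment", "CreatedDate"]


def build_tag_scp(
    required_tags: list[str],
    action: str = "ec2:RunInstances",
    resource: str = "arn:aws:ec2:*:*:instance/*",
) -> dict:
    """Build a policy with one Deny statement per required tag."""
    statements = []
    for tag in required_tags:
        # Sid values must be alphanumeric, so drop any other characters.
        sid_suffix = "".join(ch for ch in tag if ch.isalnum())
        statements.append({
            "Sid": f"DenyWithout{sid_suffix}",
            "Effect": "Deny",
            "Action": action,
            "Resource": resource,
            # "Null": "true" matches when the tag key is absent from the request.
            "Condition": {"Null": {f"aws:RequestTag/{tag}": "true"}},
        })
    return {"Version": "2012-10-17", "Statement": statements}


print(json.dumps(build_tag_scp(REQUIRED_TAGS), indent=2))
```

Generating the policy from a list keeps the required tags in one place, which helps when the same list also drives your audits.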

Regular audits

Monthly: review cloud costs by tag. Anything untagged gets investigated.

Quarterly: full inventory reconciliation. Compare IaC state with actual resources.
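The reconciliation itself is a set difference. The sketch below assumes you have the output of terraform show -json as a dictionary and a list of resource IDs discovered in the account by whatever inventory tool you use. It also assumes both sides use the same ID format (for example EC2 instance IDs), which does not hold for every resource type. Function names and sample data are hypothetical.

```python
from typing import Iterable


def state_resource_ids(state: dict) -> set[str]:
    """Collect the id of every managed resource, including child modules."""
    ids: set[str] = set()

    def walk(module: dict) -> None:
        for res in module.get("resources", []):
            # Skip data sources; only managed resources are owned by Terraform.
            if res.get("mode", "managed") != "managed":
                continue
            resource_id = res.get("values", {}).get("id")
            if resource_id:
                ids.add(resource_id)
        for child in module.get("child_modules", []):
            walk(child)

    walk(state.get("values", {}).get("root_module", {}))
    return ids


def reconcile(state: dict, live_ids: Iterable[str]) -> dict[str, list[str]]:
    """Compare IDs in state with IDs found in the account."""
    managed = state_resource_ids(state)
    live = set(live_ids)
    return {
        "unmanaged": sorted(live - managed),  # exists in the cloud, not in code
        "missing": sorted(managed - live),    # exists in code, not in the cloud
    }


# Hypothetical, trimmed-down state and inventory.
SAMPLE_STATE = {
    "values": {
        "root_module": {
            "resources": [
                {"address": "aws_instance.web", "mode": "managed", "values": {"id": "i-0aaa"}},
                {"address": "data.aws_ami.base", "mode": "data", "values": {"id": "ami-0123"}},
            ],
            "child_modules": [
                {"resources": [
                    {"address": "module.db.aws_instance.db", "mode": "managed", "values": {"id": "i-0bbb"}},
                ]},
            ],
        }
    }
}
LIVE_IDS = ["i-0aaa", "i-0ccc"]

print(reconcile(SAMPLE_STATE, LIVE_IDS))
```

Everything under "unmanaged" is a candidate for import or deletion; everything under "missing" means someone deleted a managed resource by hand.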

Kill zombie resources aggressively. If nobody knows what it does after two weeks of asking, delete it. If it was important, someone will notice.

Culture shift

This is the hard part. You need everyone to believe that IaC isn’t bureaucracy, it’s insurance.

Every manual change is tech debt. Every console click is a future incident. The 5 minutes you save today costs hours later.

Make it a team norm: if it’s not in code, it doesn’t exist.

When manual is okay

Almost never, but:

True one-time debugging that gets deleted in the same session
Initial exploration of a new service you’ve never used
Break-glass emergency fixes (that get codified immediately after)

Even then, document it. A Slack message saying “I’m creating a test instance, will delete in 1 hour” is better than nothing.

The goal

100% of your infrastructure in code. Not 90%. Not 95%. All of it.

You should be able to terraform destroy everything and terraform apply it back. Your cloud account should be reproducible from a git repo.

This isn’t perfectionism. It’s survival. Wild infrastructure grows until it consumes your ability to understand, secure, and pay for your own systems.

Tame it or it’ll tame you.