AI for Cloud Cost Optimization

April 30, 2026

Scanned a dev AWS environment. Sent findings to Claude via Bedrock. Got back actionable analysis on where money is being wasted.

Scanners:     NAT Gateway, ALB, EIP, EKS Nodes, Cluster
Lookback:     7 days
Findings:     5 resources flagged
Est. waste:   ~$120–130/month

How it works

A Go service runs as a Kubernetes CronJob. On each run, scanners pull CloudWatch metrics for each resource type — NAT gateways, ALBs, EIPs, EKS nodes, and the cluster. Findings are batched by resource type and sent to Claude via the Bedrock Converse API. The model reasons about each resource — whether the spend is justified, why, and what to do about it — and returns actionable recommendations.

The environment

Same platform as the EKS writeup: a small dev setup, not a large fleet. That's intentional. If Claude can surface actionable signal here, it should be even more useful at scale.

Region:      us-east-2
Cluster:     dev-eks-cluster — 2 × t3a.small (Karpenter)
App:         Go service, SQS consumer, DynamoDB
Networking:  1 NAT Gateway, 3 EIPs (all associated)
Ingress:     1 ALB (AWS Load Balancer Controller)

What Claude found

NAT Gateway — ~$32–35/month

nat-07d90a77be768a55a | BytesOutToDestination (7-day): ~335 MB (~48 MB/day)

335 MB over 7 days is extremely low throughput. At ~$32.40/month fixed cost, this
traffic volume does not justify the expense.

1. Replace with VPC Endpoints — S3 and DynamoDB traffic can route for free
   via Gateway Endpoints. SSM needs an Interface Endpoint, which is billed
   hourly, so the NAT can go only once every remaining dependency is covered.
2. Replace with NAT Instance — t4g.nano (~$3/month) saves ~$30/month for dev traffic.
3. Delete if unused — if no active workload depends on it, remove it.
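The fixed-cost comparison behind these options can be checked with a few lines of arithmetic. This sketch assumes a 720-hour month, an hourly NAT Gateway rate of $0.045 (which matches the ~$32.40/month quoted above), and a t4g.nano rate of $0.0042/hour. Both rates are assumptions to verify against current pricing for the region, and data processing charges are ignored.

```go
package main

import "fmt"

// hoursPerMonth assumes a 30-day month.
const hoursPerMonth = 720.0

// monthlyFixed converts an hourly rate in USD into a monthly fixed cost.
func monthlyFixed(hourlyUSD float64) float64 {
	return hourlyUSD * hoursPerMonth
}

func main() {
	natGateway := monthlyFixed(0.045)   // assumed NAT Gateway hourly rate
	natInstance := monthlyFixed(0.0042) // assumed t4g.nano hourly rate

	fmt.Printf("NAT Gateway fixed cost: $%.2f/month\n", natGateway)
	fmt.Printf("NAT instance cost:      $%.2f/month\n", natInstance)
	fmt.Printf("Monthly saving:         $%.2f\n", natGateway-natInstance)
}
```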

Application Load Balancer — ~$16–18/month

k8s-default-goapping-2282c37413 | RequestCount (7-day): 5,185 (~30 req/hour)

~$16–18/month fixed cost regardless of traffic. At 30 requests/hour this ALB is
cost-inefficient for what it's serving.

1. Consolidate — if multiple low-traffic services exist, share one ALB using
   host/path-based routing instead of a dedicated ALB per service.
2. Delete if unused — remove the ALB and the associated Kubernetes ingress resource.
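The request rate is simple to derive from the 7-day count, and the same numbers give a rough cost per thousand requests. In this sketch the function names are illustrative, and the $16/month figure is the lower bound of the range quoted above, not a measured bill.

```go
package main

import "fmt"

// perHour converts a total over a lookback window into an hourly rate.
func perHour(total float64, lookbackDays int) float64 {
	return total / float64(lookbackDays*24)
}

// costPerThousand spreads a fixed monthly cost over projected monthly requests.
// It assumes a 30-day month with the same traffic as the lookback window,
// and a non-zero request total.
func costPerThousand(monthlyUSD, total float64, lookbackDays int) float64 {
	monthlyRequests := total * 30 / float64(lookbackDays)
	return monthlyUSD / (monthlyRequests / 1000)
}

func main() {
	fmt.Printf("%.1f req/hour\n", perHour(5185, 7))
	fmt.Printf("$%.2f per 1,000 requests at $16/month\n", costPerThousand(16, 5185, 7))
}
```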

EKS Nodes — instance cost

i-099569743bf2795f3 | t3a.small | CPU utilization (7-day): 0%
i-0f8b2793ff006a058 | t3.small  | CPU utilization (7-day): 0%

A 0% CPU reading on both nodes over 7 days would normally indicate idle or abandoned
instances. Here, Container Insights was not enabled on the cluster, so the reading
reflects missing telemetry, not confirmed zero utilization. Enable Container Insights
to get accurate per-node data before acting.
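One way a scanner could avoid reporting missing telemetry as 0% is to track whether any datapoints came back at all. The sketch below assumes the scanner receives a slice of CPU datapoints that is empty when the metric was never published. The Utilization type and both function names are hypothetical.

```go
package main

import "fmt"

// Utilization separates "no data" from a measured average of zero.
type Utilization struct {
	AvgPercent float64
	HasData    bool
}

// summarize averages the datapoints, or reports that none were returned.
func summarize(datapoints []float64) Utilization {
	if len(datapoints) == 0 {
		return Utilization{} // HasData is false: utilization is unknown
	}
	sum := 0.0
	for _, d := range datapoints {
		sum += d
	}
	return Utilization{AvgPercent: sum / float64(len(datapoints)), HasData: true}
}

// describe renders a finding line without implying idleness when data is missing.
func describe(id string, u Utilization) string {
	if !u.HasData {
		return id + ": no datapoints returned, utilization unknown"
	}
	return fmt.Sprintf("%s: average CPU %.1f%%", id, u.AvgPercent)
}

func main() {
	fmt.Println(describe("i-099569743bf2795f3", summarize(nil)))
	fmt.Println(describe("i-0f8b2793ff006a058", summarize([]float64{2.0, 4.0})))
}
```

Passing the unknown state through to the prompt lets the model ask for better telemetry instead of recommending that a node be removed.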

EKS Cluster — ~$73/month

dev-eks-cluster | Running nodes: 2

EKS control plane costs ~$73/month regardless of node count or workload.
For a dev environment, creating the cluster on-demand (spin up via Terraform,
tear down when idle) would cut this cost to the hours the cluster is running.
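The effect of running the cluster only when needed can be estimated from the hourly control plane rate. This sketch assumes $0.10/hour, which matches the ~$73/month figure over a 730-hour month, and a purely hypothetical schedule of 8 hours a day on 22 working days.

```go
package main

import "fmt"

// controlPlaneHourlyUSD is an assumed rate; check current EKS pricing.
const controlPlaneHourlyUSD = 0.10

// controlPlaneCost returns the control plane charge for the hours a cluster exists.
func controlPlaneCost(hoursUp float64) float64 {
	return controlPlaneHourlyUSD * hoursUp
}

func main() {
	alwaysOn := controlPlaneCost(730)     // cluster kept up all month
	workHours := controlPlaneCost(8 * 22) // hypothetical on-demand schedule

	fmt.Printf("Always on:      $%.2f/month\n", alwaysOn)
	fmt.Printf("Working hours:  $%.2f/month\n", workHours)
	fmt.Printf("Monthly saving: $%.2f\n", alwaysOn-workHours)
}
```

The saving depends entirely on how many hours the cluster stays up, and on whether the time spent recreating it is acceptable for the team.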

What's next