The Million-User Tax
I audited a startup's AWS stack. $2,300/month. 200 users. Here's what it cost them, and how to stop paying it.
The hardest conversation I have with technical founders isn’t about the product. It’s about the stack.
Not because the stack is broken. Usually it’s impressive. The problem is that it’s built for a company that doesn’t exist yet, at a cost the actual company can’t afford.
A founder I know reached out a few months ago. He was building an AI startup and wanted to talk through the direction. We got on a call and he shared his screen.
The architecture diagram was beautiful.
Twelve microservices. Kubernetes on EKS with eight worker nodes. Multi-AZ RDS with two read replicas. Private VPC, NAT gateways, three load balancers. A dedicated GPU inference endpoint on a g4dn.xlarge. Separate CI/CD pipelines per service. The kind of diagram you’d hold up as a textbook example of production architecture.
I asked how many users they had.
Two hundred monthly active users.
What That’s Actually Costing You
Let me show you the numbers, because this is where most founders go quiet.
EKS cluster with eight t3.large worker nodes running around the clock: $560/month. Multi-AZ RDS plus two read replicas: $400/month. Dedicated GPU inference endpoint: $790/month. Two NAT gateways plus the data transfer between twelve microservices constantly talking to each other: $180/month. Three load balancers, CloudWatch, logging: $220/month. Container registry, secrets management, storage: $150/month.
$2,300/month. For 200 monthly actives.
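A quick sanity check on those line items. The figures are the rounded estimates from the audit above, not exact AWS list prices:

```python
# Recomputing the monthly bill from the audit's line items.
# All figures are rounded estimates, not AWS list prices.
line_items = {
    "EKS cluster (8x t3.large, 24/7)": 560,
    "Multi-AZ RDS + 2 read replicas": 400,
    "Dedicated GPU inference endpoint": 790,
    "NAT gateways + inter-service data transfer": 180,
    "Load balancers, CloudWatch, logging": 220,
    "Container registry, secrets, storage": 150,
}

total = sum(line_items.values())
per_user = total / 200  # 200 monthly active users

print(f"Monthly total: ${total}")        # $2300
print(f"Cost per MAU:  ${per_user:.2f}")  # $11.50
```

Eleven and a half dollars of infrastructure per monthly active user, before a single engineering hour is counted.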
That’s still not the full number. Three months of engineering had gone into building this. Getting Kubernetes stable, configuring the service mesh, writing twelve separate CI/CD pipelines. At $150 per engineering hour, that’s over $54,000 in engineering time that produced zero features users ever saw.
Going forward, 35 to 40% of engineering capacity was going toward keeping the infrastructure alive. Not shipping. Not talking to customers. Keeping the cluster running.
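To make that concrete, here's the arithmetic. The $150/hour rate and the 35 to 40% figure are from the audit; the four-person team size is my own illustrative assumption:

```python
# Sunk engineering cost of the build-out, at the audit's $150/hour rate.
HOURLY_RATE = 150
build_hours = 360  # ~3 months at roughly 30 hours/week on infra alone
sunk_cost = build_hours * HOURLY_RATE
print(f"Sunk cost: ${sunk_cost:,}")  # $54,000

# Ongoing tax, assuming (hypothetically) a 4-person team at 160 hrs/month each.
team_hours_per_month = 4 * 160
infra_share = 0.375  # midpoint of the 35-40% range
ongoing_tax = team_hours_per_month * infra_share * HOURLY_RATE
print(f"Monthly capacity tax: ${ongoing_tax:,.0f}")  # $36,000/month not spent shipping
```

The cloud bill is the visible number. The capacity tax is the one that actually kills the roadmap.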
The Three Reasons You Build This Way
If you’re a technical founder, you probably recognize the thinking that led here. It’s not carelessness. It’s three very understandable pressures converging at once.
The growth scenario. What if you go viral? One press mention, 50,000 signups by Tuesday. AI startups feel this pressure especially hard — the expectation of sudden, dramatic traction makes “underscaling” feel like an existential risk. So you build for the peak you haven’t reached. Just in case.
Engineering identity. You’ve worked on systems with real traffic. You know what good architecture looks like. Multi-AZ is correct. Zero single points of failure is correct. The issue isn’t your standards — it’s applying Big Tech standards to a product that hasn’t found its first hundred users yet, let alone its first million.
Fear of the rewrite. “If I don’t build it right now, I’ll have to redo it later.” Here’s the uncomfortable truth: you’ll rebuild it anyway. The architecture that works for 200 users is different from the one that works for 200,000. That rewrite is coming whether you like it or not. The only question is whether you’re still alive when it arrives.
What You Actually Need at This Stage
A single monolith, or two services at most, on ECS Fargate. One RDS instance — no Multi-AZ, no replicas, just a scheduled daily snapshot to S3. AI inference through Bedrock API calls, pay-per-request. No dedicated GPU endpoint burning $790 a month whether it handles one call or ten thousand. One load balancer. One pipeline.
Monthly cost: $180 to $220.
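The GPU endpoint is the clearest example of why pay-per-request wins at this stage. A break-even sketch, where the $790/month figure is from the audit but the per-request price is a hypothetical placeholder (check current Bedrock pricing for your model):

```python
# Break-even: dedicated GPU endpoint vs. pay-per-request inference.
# $790/month is the audit figure; the per-call cost below is a
# hypothetical blended estimate, NOT a real Bedrock price.
GPU_ENDPOINT_MONTHLY = 790
COST_PER_REQUEST = 0.01  # hypothetical blended cost per inference call

break_even = GPU_ENDPOINT_MONTHLY / COST_PER_REQUEST
print(f"Break-even: {break_even:,.0f} requests/month")  # 79,000

# At 200 MAU making ~50 calls each per month:
monthly_requests = 200 * 50
print(f"Pay-per-request bill: ${monthly_requests * COST_PER_REQUEST:.0f}")  # $100
```

Under these assumptions you'd need roughly 79,000 inference calls a month before the dedicated endpoint pays for itself. At 200 users, the fixed box loses by an order of magnitude.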
That’s not cutting corners. That’s right-sizing. The rule I use: don’t build for the next order of magnitude until you’re at 80% capacity on your current one. At 200 MAU, infrastructure is nowhere near your bottleneck. Your bottleneck is learning — and you can’t learn if your engineers are debugging service mesh issues at 2am.
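The 80% rule fits in a one-liner. "Capacity" here is whatever unit your actual bottleneck is measured in: users, requests per second, database connections.

```python
# The 80% rule: don't provision the next order of magnitude
# until the current one is 80% utilized.
def ready_to_scale(current_load: float, provisioned_capacity: float) -> bool:
    """True once utilization crosses the 80% threshold."""
    return current_load / provisioned_capacity >= 0.8

# 200 MAU on a stack comfortably sized for ~2,000: nowhere near the line.
print(ready_to_scale(200, 2_000))    # False
print(ready_to_scale(1_700, 2_000))  # True
```

The point isn't the threshold itself. It's that the decision to scale should be triggered by measured load, not by a hypothetical press mention.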
What Happened After
Two weeks to simplify. Twelve microservices into a monolith. GPU endpoint out, Bedrock API calls in. Multi-AZ and read replicas removed, replaced with a backup script. Three load balancers to one.
Cloud bill: $2,300 down to $190/month. A 92% reduction.
Shipping velocity: from one feature every five to six weeks to something new every week. Twelve pipelines became one. Twelve failure points became one.
The infrastructure stopped being the product. The product became the product.
The Million-User Tax
There’s a name for what was happening. I call it the Million-User Tax — the cost you pay when you build for users you don’t have yet.
It shows up in your cloud bill. In your engineering roadmap. In the customer conversations that didn’t happen because your team was firefighting infrastructure instead.
The tax is paid in velocity. In features not shipped. In runway that quietly shortens while your stack sits ready for a scale event that hasn’t arrived.
Before your next sprint, ask yourself: what is this stack actually built for?
If the answer isn’t “the next 200 users,” you’re probably paying it.
I’m putting together the Right-Sized Stack guide — a one-page reference with exactly what to run, what to skip, and when to upgrade as you grow from 0 to 100K users, with real cost estimates at each stage. Subscribers get it first. Subscribe at macrostack.dev.
If you want a second pair of eyes on your architecture before you scale, I help founders with this at fusionone.tech.



