SaaS development: architecture, cost, and how to build one
The moment a SaaS product crosses its first hundred enterprise accounts, the architectural decisions made in sprint one stop being abstract and start costing real money.
A row-level discriminator that worked fine at 50 tenants begins saturating a shared connection pool; a single Kubernetes namespace that felt clean now makes noisy-neighbour problems a weekly incident-review fixture (Oracle Database 18c Documentation – Optimizing Real-World Performance with Static Connection Pools). The teams that scale past that inflection point aren't the ones who chose the trendiest stack; they're the ones who mapped tenancy model, data isolation, and service decomposition to their actual growth curve before writing the first migration. This guide is that map.
TL;DR: What shapes a SaaS build
Multi-tenant architecture, total cost of ownership, and SOC 2 compliance are the three decisions that shape every SaaS build. Get any one wrong early and you're refactoring under production load (Binadox - Multi-Tenant vs Single-Tenant SaaS: Cost Analysis Guide 2025).
We've shipped SaaS engagements across a range of product companies, from Booksy's 30,000-item catalog to Applift's 80M+ actions per month, and the recurring trade-off is always tenancy model versus operational cost. A pooled schema cuts cloud spend but complicates per-tenant security controls; a siloed database-per-tenant instance simplifies SOC 2 audit scope but can double infrastructure bills within 18 months of growth (Multi-Tenancy Risks in Healthcare Cloud Systems | Censinet). This guide covers architecture patterns, realistic cost drivers, and AI integration so you can make those calls before the first sprint.
What SaaS development actually means
SaaS development is the practice of building software that runs on shared cloud infrastructure and is delivered to users over a network, with the vendor operating the application rather than the customer installing or hosting it. That delivery model, not the features, is what separates SaaS products from on-premise software, and it changes every architectural decision you make.
With on-premise licensing, a customer receives a file, installs it on their own servers, and manages their own data. The developer ships code; the customer runs it. SaaS inverts that relationship entirely. Your team operates the application on behalf of every account, which means multi-tenant architecture and API-first design are not optional extras; they are the foundation. A single application instance must serve thousands of tenants securely, enforce per-tenant data boundaries, and stay available while your team ships continuous updates.
Think of it this way: on-premise is selling a car; SaaS is running a taxi network. You own the fleet, keep the vehicles updated, and carry responsibility for every journey. That operational burden is also the business model: recurring revenue in exchange for continuous delivery, security, and reliability.
Tenancy models and data isolation: The foundational choice
Multi-tenant architecture is the foundational choice in SaaS application development, and getting it wrong at schema level is expensive to undo later. The decision determines your security posture, cloud cost structure, query performance under load, and how quickly you can onboard new customers.
The silo-bridge-pool spectrum
The silo-bridge-pool data isolation model describes three points on a spectrum, each with different trade-offs at the database and connection-pool level.
| Model | PostgreSQL implementation | Isolation | Cost | Complexity |
|---|---|---|---|---|
| Silo | Separate DB instance per tenant | Highest | Highest | Low per-tenant, high ops |
| Bridge | Shared instance, schema-per-tenant | High | Medium | Moderate |
| Pool | Shared schema, row-level discriminator (tenant_id) | Lowest | Lowest | High: every query must filter by tenant_id |
Silo (separate DB instance): Each tenant gets their own PostgreSQL instance. Blast radius from a data breach or a runaway query is contained. Backups, restores, and compliance evidence (think SOC 2 audits) map cleanly to a single tenant (Skedda – SOC 2 Type 2 Compliance: Costs, Timeline & Auditor Selection). The cost is real: at AWS RDS pricing, running hundreds of separate instances adds up fast, and your ops team carries the overhead of each one. This model suits regulated industries (healthcare, finance) where a single tenant's contract requires hard data residency guarantees.
Bridge (schema-per-tenant): One PostgreSQL cluster hosts many schemas. A search_path switch routes each connection to the right schema. You still get meaningful isolation: a query such as SELECT * FROM orders sees only the tables in the schema on its search_path, so a forgotten WHERE clause cannot return another tenant's rows. The noisy-neighbour risk is real, though: a tenant running an unindexed analytical query consumes CPU, memory, and I/O that every other tenant shares. Per-tenant quota enforcement via pg_stat_activity monitoring and statement timeouts is not optional here.
Pool (row-level discriminator): All tenants share tables; every row carries a tenant_id foreign key. PostgreSQL's query planner can use the discriminator efficiently only if your indexes lead with it, for example a composite (tenant_id, created_at) index rather than a single-column index on created_at. Miss that and a tenant-scoped query degrades into a scan across every tenant's rows. Row-level security (RLS) policies, available since PostgreSQL 9.5, reduce the risk of accidental cross-tenant data exposure, but they add overhead on every query, and testing coverage has to be thorough (pgDash – Exploring Row Level Security In PostgreSQL).
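The two guard rails of the pooled model can be sketched in a few lines. This is a minimal illustration, not PostgreSQL RLS: it uses Python's built-in sqlite3 module as a stand-in so it runs without a database server, and the `orders` table, the index name, and the `TenantScopedOrders` class are all hypothetical. It shows a repository that always injects the tenant filter, plus a composite index that leads with tenant_id.

```python
import sqlite3


class TenantScopedOrders:
    """Hypothetical repository that adds the tenant filter to every query."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: int) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, created_at: str, total_cents: int) -> None:
        self._conn.execute(
            "INSERT INTO orders (tenant_id, created_at, total_cents) "
            "VALUES (?, ?, ?)",
            (self._tenant_id, created_at, total_cents),
        )

    def recent(self, limit: int = 10) -> list[tuple[str, int]]:
        # Callers cannot forget the tenant filter: it is applied here.
        rows = self._conn.execute(
            "SELECT created_at, total_cents FROM orders "
            "WHERE tenant_id = ? ORDER BY created_at DESC LIMIT ?",
            (self._tenant_id, limit),
        )
        return rows.fetchall()


def make_db() -> sqlite3.Connection:
    """Create an in-memory pooled-schema table (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, "
        "created_at TEXT NOT NULL, total_cents INTEGER NOT NULL)"
    )
    # Composite index: discriminator first, then the sort column.
    conn.execute(
        "CREATE INDEX idx_orders_tenant_created "
        "ON orders (tenant_id, created_at)"
    )
    return conn


def query_plan(conn: sqlite3.Connection) -> str:
    """Return the planner's description of the tenant-scoped query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT created_at, total_cents FROM orders "
        "WHERE tenant_id = ? ORDER BY created_at DESC LIMIT ?",
        (1, 10),
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)
```

Checking `query_plan` in a test suite is one cheap way to notice when a tenant-scoped query stops using the composite index; on PostgreSQL the equivalent check would use EXPLAIN, and RLS policies would sit underneath the repository as a second line of defence.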
Connection pool and noisy-neighbour trade-offs
PgBouncer or RDS Proxy sits in front of PostgreSQL in any pool or bridge deployment. In a pooled deployment, a single tenant executing a long-running report can exhaust the connection limit before other tenants get a slot. We've seen this collapse a staging environment during load testing on a bridge-model SaaS application. Per-tenant connection caps, enforced at the proxy layer, are the standard fix.
Migration path matters too. Several SaaS products start pooled for speed of development, then discover that a single enterprise customer demands schema-level isolation as a procurement condition. Retrofitting a bridge model onto a pooled schema is possible (PostgreSQL's pg_dump --schema tooling helps), but plan for weeks of migration work, not days. We saw this kind of pressure on Hive, which scaled to 2,000+ enterprise accounts and had to harden tenant isolation as those accounts arrived.
The right tenancy model depends on your customer segment, your compliance requirements, and the operational maturity of your infrastructure team, not on a generic best practice. Pick the model that matches your current largest enterprise requirement, not the one that feels cleanest on a whiteboard.
Microservices vs. modular monolith: When each wins
A modular monolith is the right default for most early-stage SaaS products; microservices earn their complexity only when independent deployability of specific domains becomes a genuine operational constraint.
The distinction matters more than teams typically acknowledge. A well-structured modular monolith (bounded domains, explicit internal APIs, no cross-module database joins) often beats a microservices setup on latency and throughput at small-to-mid scale, because it avoids the distributed-systems overhead that Kubernetes-managed microservices introduce: service mesh latency, distributed tracing, inter-service auth, and independent CI/CD pipelines per service.
That default starts to break down at tenant scale, and this is where your tenancy model constrains your decomposition choices, a point that rarely appears in architecture guides.
If you are running silo isolation (separate database instances per tenant), you are already paying the operational overhead of multiple data stores. Decomposing into microservices at that point adds orchestration cost on top of provisioning cost. The economics look different with a pooled schema: a modular monolith sharing a single PostgreSQL connection pool across tenants stays operationally lean until per-domain scaling requirements diverge, for example, when your billing service (think Stripe webhook processing) saturates CPU while your core application remains idle.
| Factor | Modular monolith | Microservices |
|---|---|---|
| Team size | Up to ~15 engineers | 15+ with domain teams |
| Deployment cadence | Weekly or faster, single artifact | Per-service, needs Kubernetes or equivalent |
| Tenancy model fit | Pooled and bridge models | Silo models with high per-tenant load |
| Operational cost | Low | High: service mesh, observability, on-call surface |
| Migration path | Strangler-fig extraction | N/A, already decomposed |
For SaaS application development teams already on a modular monolith, the practical migration path is strangler-fig: extract one high-load bounded context as a standalone service, keep the rest of the application intact, and validate whether the complexity trade-off was worth it before extracting further.
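The strangler-fig step described above comes down to a routing rule in front of the monolith. Here is a minimal sketch, assuming path-prefix routing; the `StranglerRouter` class, the URLs, and the `/billing` prefix are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StranglerRouter:
    """Hypothetical router: extracted prefixes go to new services,
    everything else still goes to the monolith."""

    monolith_url: str
    extracted: dict[str, str] = field(default_factory=dict)

    def extract(self, path_prefix: str, service_url: str) -> None:
        """Register one bounded context as a standalone service."""
        self.extracted[path_prefix.rstrip("/")] = service_url

    def upstream_for(self, path: str) -> str:
        """Pick the upstream for a request path (longest prefix wins)."""
        matches = [
            prefix
            for prefix in self.extracted
            if path == prefix or path.startswith(prefix + "/")
        ]
        if not matches:
            return self.monolith_url
        return self.extracted[max(matches, key=len)]
```

Because the default branch always returns the monolith, rolling back an extraction is a matter of removing one prefix, which keeps the validation step cheap.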
Event-driven architecture fits both patterns but changes the coupling model. In a monolith, internal events are in-process and synchronous by default. Moving to an event bus (Kafka, AWS EventBridge) decouples producers and consumers across module or service boundaries, which is what makes modular-to-microservices migration tractable without a full rewrite. We have found on SaaS delivery engagements that teams who invest in an internal event model early move to service extraction in weeks rather than quarters.
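An internal event model inside a monolith can start very small. The sketch below is an in-process, synchronous bus, assuming modules communicate only through named events; the `Event` and `InProcessEventBus` names and the `invoice.paid` event are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Event:
    """Hypothetical event envelope shared by all modules."""

    name: str
    tenant_id: str
    payload: dict


class InProcessEventBus:
    """Synchronous, in-process publish/subscribe."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, name: str, handler: Callable[[Event], None]) -> None:
        self._handlers[name].append(handler)

    def publish(self, event: Event) -> int:
        """Call every handler for the event; return how many ran."""
        handlers = list(self._handlers.get(event.name, []))
        for handler in handlers:
            handler(event)
        return len(handlers)
```

Producers depend only on `publish` and consumers only on `subscribe`, so replacing the in-process dispatch with a message broker later changes the bus implementation rather than every call site.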
Cloud-native building blocks: Containers, Kubernetes, serverless
Kubernetes, containers, and serverless functions each solve a different SaaS scalability problem, and choosing the wrong one for a given workload inflates both cost and operational complexity.
Containers (Docker) give you reproducible, environment-agnostic packaging. For multi-tenant SaaS application development, this matters because tenant workloads behave identically across staging and production, and per-tenant resource limits are enforceable at the container layer. The trade-off: containers are always-on, so idle tenants still consume memory and account for baseline spend.
Kubernetes adds orchestration: bin-packing, health-checks, rolling deployments, and horizontal pod autoscaling. On AWS (EKS specifically), Kubernetes lets you enforce per-tenant quota at the namespace level, a practical noisy-neighbour mitigation that row-level discriminator isolation alone cannot give you.
The cost, however, is real. Control-plane overhead, the learning curve for developer teams unfamiliar with YAML-heavy config, and the network complexity of service meshes mean Kubernetes earns its place only once you have multiple independently-scaling domains. We typically recommend deferring EKS adoption until a SaaS product clears 50+ enterprise tenants or three separate services with divergent scaling profiles.
Serverless (AWS Lambda, Google Cloud Run) shifts the cost model entirely: you pay per invocation, not per reserved capacity. For event-driven SaaS workflows (file processing, webhook dispatch, scheduled reports), this looks attractive. The constraint is cold-start latency. Cold starts typically occur in under 1% of invocations, and their duration varies from under 100ms to over a second (AWS Lambda documentation, 2024); on a JVM-based runtime it can run to several hundred milliseconds. For synchronous user-facing paths in a SaaS application, that latency is usually unacceptable without provisioned concurrency, which partially erodes the cost advantage.
Think of the three as a stack, not a choice: containers for the core application, Kubernetes for orchestration when independent deployability is real, and serverless at the edges where workloads are spiky and latency tolerance is high.
SaaS tech stack by layer: Rationale over trend
Stack choices should follow architecture decisions, not the reverse. Pick PostgreSQL as your default relational store, choose Stripe for billing, wire OAuth 2.0 into your identity layer, and design every service boundary around an API-first contract, then revisit only when a specific constraint (throughput ceiling, latency budget, licensing cost) forces a change (AWS Database Blog, Understanding statistics in PostgreSQL).
Frontend. React or Next.js for most SaaS application development. Next.js adds server-side rendering and file-based routing with minimal overhead; use it unless your frontend developers already have deep Vue or Angular expertise and the migration cost is unjustifiable.
Backend. Node.js for I/O-heavy, event-driven workloads; Go or Elixir where connection count and throughput matter at the network layer. For a new SaaS application, a modular monolith is the right default: microservices add per-service deployment pipelines, distributed tracing, and network latency before you understand where your service boundaries actually belong.
Database. PostgreSQL handles row-level security, JSONB columns, and schema-per-tenant multi-tenancy without switching engines. Resist the urge to introduce a second store until PostgreSQL demonstrably cannot serve the use case.
Auth, billing, search, observability: build vs. buy
| Layer | Build | Buy |
|---|---|---|
| Auth | Only if OAuth 2.0 / OIDC compliance is deeply custom | Auth0, Cognito, Clerk |
| Billing | Almost never | Stripe (usage-based + subscriptions) |
| Search | If semantic/vector search is core IP | Elasticsearch, Typesense, pgvector |
| Observability | Never from scratch | Datadog, Grafana Cloud, OpenTelemetry |
The SaaS products we've seen waste the most engineering time are those that roll their own billing engine. Stripe's subscription and usage-record APIs cover the vast majority of SaaS pricing models, and the total cost of ownership for a home-built billing system (including SOC 2 audit scope, edge-case proration logic, and dunning flows) consistently exceeds the Stripe fee by a wide margin in our delivery work.
API-first design means every feature ships as an API endpoint before any UI is built. This is not a documentation preference: it is an architecture constraint that keeps your SaaS application testable, partner-integrable, and ready for future mobile or embedded surfaces. REST is the right default; GraphQL earns its complexity only when clients have genuinely divergent field-selection needs across a large content graph.
Adding AI to your SaaS: RAG, copilots, and inference cost
Retrieval-augmented generation is the most practical AI pattern for vertical SaaS products right now: it grounds LLM responses in your tenant's own data without the cost and compliance risk of fine-tuning a model per account.
The architecture is straightforward: chunk and embed your customer's content (documents, tickets, CRM records) into a vector store, retrieve the top-k semantically similar passages at query time, and inject them into the prompt context before the model generates a response. The retrieval step is what separates a useful in-app copilot from one that hallucinates product details. For embedding model selection, start with text-embedding-3-small from OpenAI or a self-hosted alternative such as bge-m3; the latter runs on a single A10G GPU and eliminates per-token API charges for high-volume SaaS applications.
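The retrieve-then-inject flow can be sketched without any external service. This is a toy under stated assumptions: `toy_embed` is a bag-of-words counter standing in for a real embedding model, and every function name and the prompt wording are hypothetical. Only the shape of the pipeline (embed, rank by cosine similarity, keep top-k, build the prompt) is the point.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    query_vec = toy_embed(query)
    ranked = sorted(
        chunks, key=lambda c: cosine(query_vec, toy_embed(c)), reverse=True
    )
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Inject retrieved passages into the prompt ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

In a real system the chunk vectors would be computed once at ingestion time and stored in the vector store rather than re-embedded on every query, as this sketch does for brevity.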
The inference cost question is real. API-based inference via OpenAI or Anthropic is fast to ship but expensive at scale: GPT-4o is listed at $5.00 per 1M input tokens and $15.00 per 1M output tokens (OpenAI API Pricing (May 2026): Every Model, Every Cost). Self-hosted inference with vLLM or TGI on AWS g5.xlarge instances reduces marginal cost significantly once request volume exceeds a few million tokens per day, but adds GPU fleet management overhead. In our SaaS delivery work we typically recommend API-based inference through the first 10,000 active users, then modelling the crossover point before committing to self-hosted.
Streaming response patterns matter for perceived performance. Chunked token streaming via server-sent events keeps the UI responsive even when generation takes two to four seconds; users tolerate latency when text is visibly arriving. Without streaming, a four-second blank wait produces churn-inducing frustration.
Data privacy is where the tenancy model intersects with AI. If your architecture uses row-level isolation with a shared PostgreSQL schema, you must enforce strict tenant-scoped vector queries: a retrieval bug that leaks one tenant's embedded content into another's context is a SOC 2 incident, not just a software defect. Siloed vector stores (one collection per tenant in Qdrant, or one table per tenant with pgvector) eliminate that risk at the cost of higher storage overhead. For vertical AI SaaS products handling sensitive data (healthcare, legal, finance), siloed embedding storage is the only defensible choice regardless of the cost premium.
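The crossover point can be modelled with simple arithmetic. The sketch below uses the per-token prices quoted above as constants; the request volumes, token counts per request, and the self-hosted monthly figure are hypothetical inputs you would replace with your own numbers.

```python
from decimal import Decimal

# Prices quoted in the text, in USD per one million tokens.
INPUT_USD_PER_MILLION = Decimal("5.00")
OUTPUT_USD_PER_MILLION = Decimal("15.00")
MILLION = Decimal(1_000_000)


def api_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """API cost for a given number of input and output tokens."""
    return (
        Decimal(input_tokens) / MILLION * INPUT_USD_PER_MILLION
        + Decimal(output_tokens) / MILLION * OUTPUT_USD_PER_MILLION
    )


def monthly_api_cost_usd(
    requests_per_day: int,
    input_tokens_per_request: int,
    output_tokens_per_request: int,
    days: int = 30,
) -> Decimal:
    """Monthly API spend at a steady request rate."""
    per_request = api_cost_usd(input_tokens_per_request, output_tokens_per_request)
    return per_request * requests_per_day * days


def breakeven_requests_per_day(
    self_hosted_monthly_usd: Decimal,
    input_tokens_per_request: int,
    output_tokens_per_request: int,
    days: int = 30,
) -> Decimal:
    """Daily request volume at which API spend equals a fixed
    self-hosted monthly cost (a hypothetical figure you supply)."""
    per_request = api_cost_usd(input_tokens_per_request, output_tokens_per_request)
    return self_hosted_monthly_usd / (per_request * days)
```

For example, 10,000 requests per day at 1,500 input and 500 output tokens each works out to $0.015 per request, or $4,500 over 30 days. The break-even function ignores the engineering time of running a GPU fleet, so treat its result as a lower bound on the volume needed to justify self-hosting.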
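On the wire, server-sent events are plain text frames separated by a blank line. The generator below is a minimal sketch of the framing only, assuming tokens arrive from some upstream iterator; the JSON payload shape and the `done` event name are hypothetical choices, not part of any provider's API.

```python
import json
from typing import Iterable, Iterator


def sse_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Wrap each generated token in a server-sent event frame."""
    for token in tokens:
        # JSON-encode the token so embedded newlines cannot break framing.
        yield f"data: {json.dumps({'token': token})}\n\n"
    # Hypothetical terminal event so the client knows generation finished.
    yield "event: done\ndata: {}\n\n"
```

A web framework would return this generator as the body of a response with the `text/event-stream` content type, and the browser's EventSource API would surface each frame as it arrives.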
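The siloed approach can be illustrated with an in-memory store that keeps one collection per tenant, so a search has no code path that reaches another tenant's vectors. This is a sketch under stated assumptions: `TenantVectorStore` is a hypothetical class, the vectors are toy two-dimensional lists, and a real deployment would map each collection onto the vector database's own isolation unit.

```python
import math


def _cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TenantVectorStore:
    """Hypothetical store with one isolated collection per tenant."""

    def __init__(self) -> None:
        self._collections: dict[str, list[tuple[list[float], str]]] = {}

    def add(self, tenant_id: str, vector: list[float], text: str) -> None:
        self._collections.setdefault(tenant_id, []).append((vector, text))

    def search(self, tenant_id: str, query: list[float], k: int = 3) -> list[str]:
        """Rank only the calling tenant's collection; unknown tenants get []."""
        candidates = self._collections.get(tenant_id, [])
        ranked = sorted(
            candidates, key=lambda item: _cosine(query, item[0]), reverse=True
        )
        return [text for _, text in ranked[:k]]
```

Making `tenant_id` a required argument of `search`, rather than an optional filter, is the design choice that matters: a caller cannot issue an unscoped query by omission.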
LLM inference cost is often underestimated at the scoping stage. Think of it like cloud egress: invisible until it isn't, then suddenly the largest line item on the infrastructure bill.
SaaS development cost: What drives the number
Total cost of ownership for a SaaS application breaks into four layers: build, infrastructure, compliance, and third-party services, and the ratio between them shifts dramatically as the product matures.
Build cost is the most variable. An MVP scoped to a single user workflow, a pooled PostgreSQL schema, and a Stripe billing integration typically runs $25,000-$150,000 over a 3-6 month engagement with a 4-8 person senior team (Clutch.co 2024 / Developex 2026); a production-grade SaaS application with multi-tenant data isolation, role-based access control, and async job queues sits in a different bracket entirely. The two biggest scope multipliers we see in practice: tenancy model (a siloed, database-per-tenant architecture costs roughly two to three times as much to build and operate as a shared-schema model at equivalent user counts) and the decision to pursue SOC 2 compliance before launch rather than after.
SOC 2 compliance deserves its own line in any SaaS budget. Achieving a Type II report typically adds $15,000-$100,000+ in audit and readiness costs (Thoropass, Secureframe, Vanta, 2025) before you count ongoing monitoring tooling and annual renewal. For B2B SaaS products selling to enterprise accounts, this is not optional: it directly affects sales cycle length.
Infrastructure starts small. A Kubernetes-hosted application on AWS or Google Cloud for a few hundred active accounts might run a few hundred dollars per month, but the number scales with tenant count, data volume, and any inference workloads added for AI features. Token throughput for LLM inference can become the dominant cloud line item faster than most teams think.
Third-party services compound quietly. Stripe's processing fees, a transactional email provider, a logging and observability stack, and a vector store for RAG features each look trivial; together they add up. Third-party software and APIs can run around 10% of gross revenue in variable costs plus roughly $1,500 in fixed monthly fees at scale (Financial Models Lab, 2024).
The honest framing for total cost of ownership: the build number is a one-time problem; the infrastructure, compliance, and tooling costs are a recurring tax that grows with your user base. Budget for both.
SaaS compliance and security: SOC 2, GDPR, HIPAA, PCI DSS
SOC 2 compliance, GDPR, HIPAA, and PCI DSS each carry architectural consequences that go well beyond adding a checkbox audit: they actively constrain which tenancy model you can choose and how you authenticate users.
SOC 2 (Type II) is the baseline for B2B SaaS products selling into mid-market and enterprise accounts. Per the AICPA SOC 2 standard documentation, the Trust Services Criteria require documented controls across security, availability, and confidentiality. In practice, this means audit logging at the database and API layer, OAuth 2.0 for identity federation, and encryption at rest and in transit. SOC 2 does not mandate silo tenancy, but enterprise buyers increasingly expect it, because a pooled schema with row-level discriminators is harder to audit cleanly: a misconfigured query can theoretically expose cross-tenant data.
GDPR reaches any SaaS application with EU users, regardless of where your servers sit. The right-to-erasure requirement alone forces a data architecture decision: if you use a pooled PostgreSQL schema, deleting one user's records without corrupting aggregate analytics or audit trails requires careful partitioning from day one. Retrofitting this is expensive. Think of it as a schema design constraint, not a legal afterthought.
HIPAA applies to SaaS products handling protected health information (PHI). It requires a Business Associate Agreement with every cloud provider in your stack and strongly favours siloed tenancy: AWS, Azure, and Google Cloud all offer HIPAA-eligible service tiers, but you need to account for the added infrastructure cost of separate data stores per covered entity. On one healthcare engagement, we found that moving from a pooled to a siloed model added roughly 30-40% to the monthly cloud bill, offset partially by the ability to charge enterprise healthcare customers a compliance premium.
PCI DSS applies if your SaaS application handles cardholder data directly. Most SaaS products avoid full PCI scope by using Stripe or a similar payment processor that tokenises card data before it touches your network. Stripe's developer documentation covers how tokenisation removes most PCI DSS obligations from the application layer.
The compliance-to-tenancy mapping looks like this:
| Framework | Minimum viable tenancy | Key architectural trigger |
|---|---|---|
| SOC 2 Type II | Pooled (with audit log isolation) | Per-tenant audit trails, OAuth 2.0 IdP |
| GDPR | Pooled (with erasure partitioning) | Right-to-erasure, data residency controls |
| HIPAA | Siloed preferred | PHI isolation, BAA with cloud provider |
| PCI DSS | Pooled acceptable (with tokenisation) | Offload card data to Stripe or equivalent |
The cost implication is real: SOC 2 Type II readiness adds meaningfully to initial development and audit fees (see the cost ranges above), and HIPAA's siloed model can double per-tenant infrastructure spend at low customer counts before volume amortises the overhead.
SaaS pricing models: Per-seat to usage-based
Usage-based billing fits SaaS applications where value delivery is measurable per event: API calls, active seats, data processed, or messages sent. Per-seat pricing is simpler to forecast but penalises adoption; usage-based pricing aligns cost with value but demands metering infrastructure from day one.
Stripe Billing handles both models, but the back-end work differs sharply. Per-seat requires only account-level seat counts. Usage-based billing requires an event pipeline: emit a metered event per chargeable action, aggregate it server-side, and report totals to Stripe's usage records API each billing period. Without that pipeline, you cannot charge accurately or audit disputes. Think of it as a logging system with financial consequences: the data integrity requirements look closer to an audit trail than a standard application log.
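The emit-and-aggregate half of that pipeline can be sketched in memory. This is a minimal illustration, assuming each chargeable action carries a unique event ID so retries are not billed twice; `UsageEvent`, `UsageMeter`, and the field names are hypothetical, and the final reporting call to the billing provider is deliberately left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """Hypothetical metered event for one chargeable action."""

    event_id: str   # idempotency key: the same ID is never counted twice
    tenant_id: str
    metric: str     # e.g. "api_calls"
    quantity: int
    period: str     # billing period label, e.g. "2025-06"


class UsageMeter:
    """Aggregates events into per-tenant, per-metric totals."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self._totals: dict[tuple[str, str, str], int] = defaultdict(int)

    def record(self, event: UsageEvent) -> bool:
        """Count an event once; return False for a duplicate delivery."""
        if event.event_id in self._seen:
            return False
        self._seen.add(event.event_id)
        key = (event.tenant_id, event.metric, event.period)
        self._totals[key] += event.quantity
        return True

    def totals_for_period(self, period: str) -> dict[tuple[str, str], int]:
        """Totals to report at period end, keyed by (tenant, metric)."""
        return {
            (tenant, metric): qty
            for (tenant, metric, p), qty in self._totals.items()
            if p == period
        }
```

In production the seen-ID set and the totals would live in durable storage, since the audit-trail requirement described above means an event must survive a process restart.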
| Model | Best fit | Revenue predictability | Metering required | Example |
|---|---|---|---|---|
| Flat-rate | Simple tools, early traction | High | No | Basecamp |
| Per-seat | Team software, B2B SaaS | High | No | Notion, Linear |
| Usage-based | API-first products, infra, AI | Low to medium | Yes | Twilio, Snowflake |
| Tiered | Multi-segment products | Medium | Partial | HubSpot, Intercom |
| Freemium | High-volume consumer/PLG | Low initially | Partial | Dropbox, Figma |
61% of SaaS companies use some form of usage-based pricing (OpenView Partners State of Usage-Based Pricing Report 2023).
Total cost of ownership shifts with model choice. Usage-based products need metering infrastructure, billing reconciliation logic, and anomaly detection; budget roughly one engineer-sprint per billing dimension at build time. Freemium apps add cloud cost for non-paying users. Per-seat is cheapest to operate but caps net revenue retention below usage-based peers.
SaaS development FAQs
What is SaaS development and how does it differ from building on-premise software?
How much does SaaS development cost, MVP vs. full product?
How long does it take to build a SaaS product?
How does SaaS development relate to cloud computing?
How do I choose a SaaS development partner?
How do I build a SaaS MVP without over-engineering it?
How do I add AI features to an existing SaaS product?
Build your SaaS with a team that has done it before
SaaS teams that have shipped multi-tenant architecture and maintained SOC 2 compliance under production load make different decisions from those reading about it for the first time, and those decisions compound across every sprint.
Netguru has delivered 2,500+ projects across 50+ countries, with 400+ engineers who have worked on SaaS application development at every scale. We think the clearest signal of readiness is what a team has already shipped: Applift processing 80M+ actions per month, Booksy's catalog search across 30,000+ items, and Hive scaling to 2,000+ enterprise accounts are the kinds of outcomes that come from having solved tenancy, security, and cloud infrastructure in practice, not in theory.
If you are scoping a SaaS build and want a team that has navigated those trade-offs before, get an estimate for your project.
