Temporal Cloud vs Self-Hosted Temporal: A CTO Decision Guide

When teams evaluate Temporal, they often start with the wrong question.

The debate is usually framed as infrastructure choice: self-hosted or managed. But for a CTO, the more important question is operational ownership. Do you want your team to build on Temporal, or do you also want your team to run Temporal?

That distinction matters because the developer experience is fundamentally similar in both models. Your engineers still write workflows, run workers, define retries and timeouts, and build long-running business processes. Even with Temporal Cloud, your applications and workers can remain in your own cloud or data center, connecting to a managed Temporal namespace rather than handing application execution over to a third party.
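To make that concrete, here is a minimal sketch of what changes on the worker side between the two models. `TemporalConnection`, the hostnames, the namespace and account IDs, and the certificate paths are all hypothetical placeholders, not values from Temporal's documentation; the real endpoint format and authentication method (for example mTLS or API keys) should be taken from your own Temporal Cloud account.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class TemporalConnection:
    """Connection settings a worker process needs (hypothetical helper type)."""

    target: str                          # host:port of the Temporal frontend
    namespace: str                       # namespace the worker polls
    tls_cert_path: Optional[str] = None  # client certificate, if mTLS is used
    tls_key_path: Optional[str] = None   # client private key, if mTLS is used


def self_hosted() -> TemporalConnection:
    # Assumed: a cluster your own team operates inside your network.
    return TemporalConnection(
        target="temporal-frontend.internal:7233",
        namespace="orders",
    )


def temporal_cloud() -> TemporalConnection:
    # Assumed: placeholder namespace, account ID, and certificate paths.
    return TemporalConnection(
        target="orders.a1b2c.tmprl.cloud:7233",
        namespace="orders.a1b2c",
        tls_cert_path="/etc/temporal/client.pem",
        tls_key_path="/etc/temporal/client.key",
    )


def changed_settings(a: TemporalConnection, b: TemporalConnection) -> List[str]:
    """Names of the settings that differ between two deployments."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


if __name__ == "__main__":
    print(changed_settings(self_hosted(), temporal_cloud()))
```

Under these assumptions, everything that differs is connection configuration. Workflow definitions, activity code, and the machines the workers run on are not part of the diff.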

That is why this is not really a hosting decision. It is a decision about where your engineering time should go.

For most organizations, that answer should be obvious: product delivery, workflow design, business logic, and resilience at the application layer. Not operating yet another distributed control plane.

If you are still assessing where Temporal fits in the first place, Xgrid’s Building Enterprise-Grade Workflows with Temporal is the right place to start. It frames durable execution the way enterprise teams actually experience it: as a reliability and architecture problem, not just a developer convenience.

The real tradeoff

Self-hosted Temporal gives you maximum control. You own the deployment model, the persistence layer, the visibility stack, the upgrade process, the failure domains, and the operating standards around all of it.

Temporal Cloud gives you a managed control plane. Temporal handles the service layer while your team keeps ownership of the workflows, workers, and business systems that actually create value. Every standard Temporal Cloud namespace includes synchronous replication across three availability zones, automatic failover, and a 99.9% SLA. For many companies, that removes an entire category of operational burden before it ever lands on the platform team’s roadmap. 
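As a rough illustration of what a 99.9% SLA permits, the arithmetic can be sketched as follows. The function name and the 30-day and 365-day windows are assumptions made for illustration; how downtime is actually measured and credited is defined by Temporal's SLA terms, not by this calculation.

```python
from decimal import Decimal


def downtime_budget_minutes(sla_percent: str, days: int) -> Decimal:
    """Minutes of downtime an availability target permits over a window.

    Decimal is used so that 100 - 99.9 is exactly 0.1 rather than a
    floating-point approximation.
    """
    total_minutes = Decimal(days) * 24 * 60
    allowed_fraction = (Decimal(100) - Decimal(sla_percent)) / Decimal(100)
    return total_minutes * allowed_fraction


if __name__ == "__main__":
    print("30-day window:", downtime_budget_minutes("99.9", 30), "minutes")
    print("365-day window:", downtime_budget_minutes("99.9", 365), "minutes")
```

A 99.9% target leaves roughly 43 minutes of allowable downtime in a 30-day window. A self-hosted team that wants to match that baseline has to meet the same budget with its own on-call rotation, database operations, and upgrade process.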

That difference sounds subtle on paper. In practice, it changes the economics of adoption.

Self-hosted can be a strong fit when control is a hard requirement. If you have strict internal deployment boundaries, specialized platform requirements, or a mature infrastructure team that already operates stateful distributed systems well, self-hosting may make sense.

But that is not the same as saying it is the better default.

When Self-Hosting Temporal Makes Sense

There are legitimate reasons to self-host Temporal, and serious engineering organizations do it successfully.

The first is control. Self-hosting lets your team shape persistence, visibility, networking, and operational policy around internal standards. That can matter in environments where infrastructure ownership is part of the governance model rather than just an implementation detail.

The second is flexibility at the operational edge. Temporal Cloud is opinionated in ways that benefit reliability, but those opinions are still constraints. Temporal Cloud enforces namespace and configuration limits, defaults retention to 30 days unless adjusted, and puts guardrails around custom search attributes and other namespace-level settings. Self-hosted gives you more room to tune the system around your own preferences and lifecycle requirements.

| Area | Temporal Cloud | Self-Hosted Temporal |
| --- | --- | --- |
| Throughput Limits | Default 500 Actions/sec, 2,000 requests/sec, 4,000 ops/sec per namespace | No fixed platform limit; depends on your infrastructure capacity |
| Retention | Default 30 days (configurable 1–90 days) | Fully configurable; can retain or archive indefinitely |
| Workflow History | 51,200 events or 50 MB per Workflow Execution | Same core limit applies |
| Custom Search Attributes | Limited per namespace and cannot be removed once created | Can be created and removed by operators |
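The limits in the table can be turned into a quick first-pass check against your own workload. This is a sketch only: the `Workload` type and `cloud_fit_issues` function are hypothetical, the two constants are copied from the table above, and the peak rate and retention figures are inputs you would have to estimate yourself.

```python
from dataclasses import dataclass
from typing import List

# Values copied from the comparison table; treat them as defaults that may change.
CLOUD_DEFAULT_ACTIONS_PER_SEC = 500
CLOUD_MIN_RETENTION_DAYS = 1
CLOUD_MAX_RETENTION_DAYS = 90


@dataclass(frozen=True)
class Workload:
    """Estimated needs for one namespace (hypothetical planning inputs)."""

    peak_actions_per_sec: float
    retention_days: int


def cloud_fit_issues(workload: Workload) -> List[str]:
    """List the ways a workload exceeds the Temporal Cloud defaults in the table."""
    issues = []
    if workload.peak_actions_per_sec > CLOUD_DEFAULT_ACTIONS_PER_SEC:
        issues.append(
            f"peak of {workload.peak_actions_per_sec} Actions/sec exceeds "
            f"the default of {CLOUD_DEFAULT_ACTIONS_PER_SEC}"
        )
    if not (CLOUD_MIN_RETENTION_DAYS <= workload.retention_days <= CLOUD_MAX_RETENTION_DAYS):
        issues.append(
            f"retention of {workload.retention_days} days is outside the "
            f"{CLOUD_MIN_RETENTION_DAYS}-{CLOUD_MAX_RETENTION_DAYS} day range"
        )
    return issues


if __name__ == "__main__":
    print(cloud_fit_issues(Workload(peak_actions_per_sec=120, retention_days=30)))
    print(cloud_fit_issues(Workload(peak_actions_per_sec=800, retention_days=365)))
```

An empty list means the workload fits the defaults shown in the table. A non-empty list is not an automatic argument for self-hosting, but it does mark the points that need a conversation about limits or archival before committing.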

The third is platform maturity. If your organization already has excellent SRE practices, strong database operations, disciplined observability, and clear ownership across upgrades, failover, and security posture, then running Temporal yourself may not feel like a burden. It may just feel like another internal platform service.

That is a valid position.

But it only remains valid if you are honest about everything that comes with it.

The hidden tax of self-hosting

The biggest mistake teams make with self-hosted Temporal is underestimating the long tail of ownership.

Initial deployment is not the hard part. The hard part is everything after that: visibility tuning, upgrade coordination, backup and disaster recovery strategy, cluster sizing, security hardening, observability design, operational debugging, and all the routine platform work that slowly accumulates around any critical distributed system.

What makes this difficult is that Temporal quickly becomes mission-critical infrastructure. Once core business workflows depend on it, every operational decision carries real consequences. A schema migration can affect running workflows. An overloaded visibility store can slow down debugging during incidents. A poorly planned upgrade can stall workers or introduce subtle compatibility issues between SDKs and the server. Even something as simple as history growth or retry behavior can quietly create pressure on persistence and visibility backends. In other words, you are not just running a service—you are operating the durability layer for long-running business processes. That responsibility compounds over time, and it is where many teams realize that running Temporal well requires significantly more platform maturity than getting it running in the first place.
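History growth is easy to estimate ahead of time. The sketch below uses the 51,200-event limit from the comparison table and assumes a looping workflow that appends a fixed number of events per iteration. That per-iteration count and the headroom percentage are hypothetical inputs you would measure from your own workflow histories.

```python
HISTORY_EVENT_LIMIT = 51_200  # events per Workflow Execution, from the table


def iterations_before_limit(events_per_iteration: int,
                            headroom_percent: int = 0) -> int:
    """Loop iterations that fit in one execution's history.

    headroom_percent reserves part of the limit as a safety margin, so the
    workflow can roll over into a fresh execution (Temporal's Continue-As-New)
    before it actually reaches the ceiling.
    """
    if events_per_iteration <= 0:
        raise ValueError("events_per_iteration must be positive")
    if not 0 <= headroom_percent < 100:
        raise ValueError("headroom_percent must be in [0, 100)")
    # Integer arithmetic avoids floating-point rounding in the budget.
    budget = HISTORY_EVENT_LIMIT * (100 - headroom_percent) // 100
    return budget // events_per_iteration


if __name__ == "__main__":
    # Assumed: each loop iteration appends 6 history events.
    print(iterations_before_limit(6))
    print(iterations_before_limit(6, headroom_percent=20))
```

The same limit applies in both deployment models. The difference is that when self-hosting, large histories also put load on a persistence store your team has to size and operate.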

Temporal’s own production readiness guidance makes the contrast clear. Self-hosted Temporal does not provide RBAC or audit logging out of the box. Temporal Cloud adds RBAC, SSO support, audit logging, encryption, and compliance-oriented controls including SOC 2 Type II and HIPAA support. That is not just a security talking point. It is an operations and governance difference that becomes very real once the platform is tied to critical workflows.

Observability is another area where self-hosted responsibility compounds quickly. Temporal gives you durable execution, but visibility into production behavior is still something you have to operationalize correctly. That is one reason Xgrid’s Temporal Observability in Production: The Gap Nobody Talks About Until It Costs You resonates so strongly with engineering leaders: the workflow is the unit of truth, and if your instrumentation is fragmented across workers, retries, logs, and spans, operating Temporal at scale becomes much harder than it looked in a proof of concept.

The pattern is consistent: self-hosted gives you freedom, but it also gives you responsibility for problems that rarely help differentiate your business.

Why Temporal Cloud Is the Default Choice for Most Teams

Temporal Cloud is the stronger default because it removes the part of Temporal that most companies should not want to own.

The reliability baseline is already compelling. Standard namespaces include three-zone synchronous replication, automatic failover, and a 99.9% SLA. Temporal Cloud also supports private connectivity via AWS PrivateLink and Google Private Service Connect, which means workers can reach Temporal over private network paths rather than requiring public internet exposure. Audit logs can be streamed into AWS Kinesis or GCP Pub/Sub, making the platform easier to align with enterprise security operations.

That matters because most companies do not gain strategic advantage from being exceptionally good at running workflow control planes. They gain advantage by shipping faster, reducing failure rates in business processes, and scaling durable automation without dragging platform teams into another permanent operational commitment.

Temporal Cloud also improves the governance story. Usage and billing are surfaced at the namespace level, so leadership can tie costs back to actual workloads more directly instead of hiding orchestration costs inside generalized infrastructure overhead.

There is even a performance argument in Cloud’s favor. Temporal has published benchmarks showing strong latency performance for Temporal Cloud due to architectural optimizations in the managed service. However, latency characteristics ultimately depend on deployment topology. In some cases, self-hosted deployments with co-located workers and databases can achieve lower end-to-end latency by eliminating additional network round trips. In practice, the performance difference is usually less about the platform choice itself and more about how workers, persistence, and the Temporal service are deployed relative to each other. Vendor benchmarks should always be read critically, but they are still important because they dismantle one of the most common assumptions used to justify self-hosting by default.

In other words, the case for Cloud is not just that it is easier.

It is that it lets your best engineers stay focused on the work that actually matters.

The migration question is no longer a blocker

One of the traditional arguments for self-hosting has been reversibility. Teams worry that once they choose Cloud, they are locked into an operating model that will be hard to unwind.

That concern is weaker than it used to be.

Temporal now documents an automated migration path for moving from self-hosted deployments to Temporal Cloud, including zero-downtime namespace handover using workflow replication for open workflows. For long-running workflows, this can reduce migration risk significantly, although migration planning still matters for retention differences, worker cutover, and workflow history handling. Note that the automated migration capability is currently in pre-release. That does not make migration trivial, but it does change the strategic calculus. Choosing Cloud is no longer the same thing as giving up optionality. (Temporal Docs)

That is especially relevant for enterprises modernizing legacy orchestration. If you are moving away from cron-heavy scripts, fragile workers, or ad hoc queue-based coordination, Xgrid’s Don’t Rewrite Your Domain: How to Migrate Long-Running Processes to Temporal Safely is a useful follow-on read. It shows how to move long-running processes into Temporal without turning migration into a rewrite program.

A practical rule for CTOs

If infrastructure control is itself a hard requirement, self-hosted Temporal is a valid choice.

If it is not, Temporal Cloud is probably the better decision.

That recommendation is not ideological. It is organizational.

Choose self-hosted when you genuinely need the control and already have the platform maturity to own the consequences. Choose Cloud when you want durable execution in production without expanding your operational surface area more than necessary.
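The rule can be written down as a small decision function. This is a hedged sketch: the function name and the two boolean inputs are hypothetical simplifications, and the third branch is one reading of the article's argument rather than something it states outright.

```python
def recommend_deployment(control_is_hard_requirement: bool,
                         has_platform_maturity: bool) -> str:
    """Encode the rule of thumb above as two yes/no questions."""
    if not control_is_hard_requirement:
        # No hard control requirement: avoid the extra operational surface.
        return "Temporal Cloud"
    if has_platform_maturity:
        # Control is required and the team can own the consequences.
        return "Self-hosted Temporal"
    # Assumed interpretation: control is required but the team is not yet
    # ready to operate it, so the maturity gap has to be closed first.
    return "Self-hosted Temporal, after closing the platform maturity gap"


if __name__ == "__main__":
    for control in (False, True):
        for maturity in (False, True):
            print(control, maturity, "->", recommend_deployment(control, maturity))
```

Note that platform maturity only enters the decision once control is a hard requirement. A mature platform team without that requirement still lands on Cloud under this rule.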

That is also why Temporal tends to become more valuable as systems become more complex. In AI systems, long-running agents, human-in-the-loop operations, and multi-service business workflows, durability matters more, not less. The orchestration layer becomes increasingly important, but that still does not mean you should run it yourself unless you have a good reason. Xgrid’s Agentic AI with Temporal: From Prototype to Production, The Memory Problem: Why Long-Running AI Agents Crash in Production, and Temporal vs Airflow vs Argo all reinforce the same broader point: the orchestration decision is really a systems design and operating model decision.

Bottom line

Self-hosted Temporal is not the wrong choice. But it should be a deliberate one.

For most CTOs, Temporal Cloud is the better default because it preserves what makes Temporal powerful while removing a category of infrastructure work that most teams should not want to own. Your engineers still build workflows. Your workers still run in your environment. Your systems still benefit from durable execution. But your platform team avoids inheriting another control plane that needs to be tuned, upgraded, secured, and defended indefinitely. (Temporal Docs)

That is the strategic difference.

And for most organizations, that is enough to make the decision.

Frequently Asked Questions

Is Temporal Cloud the same as open-source Temporal?

No. Temporal Cloud is the managed version of Temporal operated by the Temporal team. It runs the same core workflow engine but removes the need to manage infrastructure, upgrades, scaling, and reliability operations yourself.

Does Temporal Cloud run my workflows?

No. Your workers and application code still run in your own infrastructure. Temporal Cloud only manages the workflow orchestration service that stores workflow state and coordinates execution.

Is self-hosting Temporal difficult?

Deploying Temporal is straightforward, but operating it reliably in production is more complex. Teams must manage databases, scaling, upgrades, observability, and disaster recovery as more workflows become mission-critical.

When should a company choose self-hosted Temporal?

Self-hosting usually makes sense when infrastructure control is a hard requirement: for example, in regulated environments, in organizations with strong platform engineering teams, or where infrastructure must follow strict internal standards.

When is Temporal Cloud the better choice?

Temporal Cloud is typically better when teams want durable workflow orchestration without operating the infrastructure. It allows engineering teams to focus on application logic while Temporal manages scaling, reliability, and upgrades.

Can you migrate from self-hosted Temporal to Temporal Cloud?

Yes. Temporal provides migration approaches that allow teams to move namespaces and replicate open workflows to Temporal Cloud. Migration still requires planning, but it reduces the risk of being locked into a single deployment model.

What is Temporal Cloud used for?

Temporal Cloud is used to run reliable workflow orchestration in production without managing the underlying Temporal infrastructure. Teams use it for microservice coordination, long-running business processes, and durable AI or automation workflows.

Can Temporal run on Kubernetes?

Yes. Self-hosted Temporal is commonly deployed on Kubernetes clusters using Helm charts or infrastructure automation. Many organizations run Temporal alongside application services in Kubernetes environments.

Do Temporal workers run in the cloud?

Temporal workers can run anywhere—in Kubernetes, virtual machines, or cloud infrastructure. Even when using Temporal Cloud, workers typically run inside the organization’s own environment.
