Your Django or Flask backend is not keeping up with growth. Latency is rising, the Cloud bill is exploding, and the GIL is blocking what should be concurrent. We migrate to Go with the strangler pattern: no big bang, and the business keeps running every sprint.
Not a blind rewrite, not a language purist's crusade. A domain-by-domain migration with before-and-after metrics, and API compatibility maintained for as long as the change takes.
−40%
Typical Cloud Run cost after migrating the hot path
10×
Throughput per instance with goroutines vs uvicorn workers
<100 ms
Typical cold-start, vs 1–3 s for a Python process with heavy deps
0
Big-bangs. Progressive strangler-pattern migration
When to migrate
Not every Python backend needs migrating. These are the concrete symptoms where Go gives back more than it costs.
The GIL serialises real concurrency. As you push RPS, p99 spikes even with CPU at 30%. Adding uvicorn workers buys time; it does not solve the problem.
To serve the same throughput, Python needs 3–10× more instances than Go. On Cloud Run with autoscaling, that is direct invoice. Migrating the hot path cuts cost without changing logic.
Python + Django/FastAPI + ML deps takes 1–3 s to boot. On Cloud Run with scale-to-zero, that means failed requests or timeouts. A Go binary boots in <100 ms.
asyncio works but its mental model is complex and breaks inside sync libraries. Celery + Redis for async tasks is operationally expensive. In Go, a goroutine + channel covers 90% of those cases.
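For illustration, a minimal sketch of that goroutine + channel pattern replacing a background task queue for simple jobs (the `runTasks` name is ours, not a library API):

```go
package main

import (
	"fmt"
	"sync"
)

// runTasks fans n tasks out to a bounded set of workers over a
// channel: the Go-native stand-in for a Celery task + Redis queue.
func runTasks(n, workers int) int {
	tasks := make(chan int)
	var wg sync.WaitGroup
	var mu sync.Mutex
	processed := 0

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range tasks { // drain the queue until it is closed
				mu.Lock()
				processed++
				mu.Unlock()
			}
		}()
	}

	for i := 0; i < n; i++ {
		tasks <- i // enqueue: blocks while all workers are busy
	}
	close(tasks)
	wg.Wait()
	return processed
}

func main() {
	fmt.Println("processed:", runTasks(5, 3))
}
```

No broker, no serialisation layer, no separate worker deployment: the queue is a language primitive.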
mypy helps but is not enforced. In Go, the compiler rejects type mismatches before they reach CI. The bug curve per release flattens visibly.
Python Docker images with NumPy, PyTorch and friends easily exceed 1 GB. In Go, a static binary on a scratch or distroless image lands around 30 MB. Faster deploys, smaller attack surface.
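As a sketch, the kind of multi-stage Dockerfile behind that footprint (the `./cmd/api` path is illustrative):

```dockerfile
# Build stage: compile a fully static binary
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /app ./cmd/api

# Final stage: distroless base, only the binary ships
FROM gcr.io/distroless/static-debian12
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```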
When NOT to migrate
Honest comparison
Each language wins where it should. This is the comparison we hand to CTOs on the fence — no stack tribalism.
| Dimension | Python | Go | Verdict |
|---|---|---|---|
| p99 latency at high concurrency | GIL-limited · uvicorn workers as workaround | Stable under load · native goroutines | Go |
| Cold-start (Cloud Run / Lambda) | 1–3 s typical with dependencies | <100 ms with optimised binary | Go |
| Image footprint | 300 MB – 2 GB | <30 MB with multi-stage | Go |
| Prototyping speed | Unbeatable · notebook → script → service | More boilerplate · always compiles | Python |
| ML / data science ecosystem | De-facto standard · NumPy, PyTorch, scikit | Limited · ONNX runtime for inference | Python |
| Typing and compile-time error catching | mypy / pyright are optional · runtime errors | Strict compiler · explicit errors | Go |
| Concurrency | asyncio works but breaks in sync libs · GIL | Goroutines + channels · first-class language feature | Go |
| Talent availability | Abundant at every level | More niche · but high average level | Tie |
| Cloud Run deployment | Reasonable · constant Dockerfile tweaks | Idiomatic · scratch/distroless image | Go |
| Monthly cost at equal throughput | Baseline | Typically −40% to −60% | Go |
How we migrate
The new Go service progressively absorbs traffic via proxy or feature flags. The API stays compatible. The Python monolith stays alive until the last endpoint has been migrated.
01
Audit of the existing monolith: endpoints, per-endpoint latency, test coverage, critical dependencies, existing observability. We identify the 3–5 hot endpoints that justify starting the migration. We define the target SLO.
1–2 wks
02
The new Go service exposes exactly the same APIs as the Python monolith. OpenAPI / Protobuf as a signed contract, contract testing in CI. No client changes — the frontend or mobile apps do not touch code.
1 wk
03
We rewrite the hottest endpoint or set of endpoints in Go. Same DB, same Redis, same queue. Integration tests against real fixtures. Initial deployment at 0% traffic — full shadow run to validate behavioural parity.
3–6 wks
04
Routing 1% → 5% → 25% → 100% of traffic to the Go service via proxy (Envoy, Cloud Load Balancer) or per-user feature flags. Metrics compared at every step: latency, error rate, response parity. Rollback in seconds if anything drifts.
2–3 wks per domain
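The per-user routing in step 04 can be sketched as a deterministic hash bucket: a simplified stand-in for what Envoy or a feature-flag service does in production (the `routeToGo` name and the percentage are illustrative).

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// goTrafficPercent is the share of users routed to the new Go
// service; bumped 1 -> 5 -> 25 -> 100 as metrics stay green.
var goTrafficPercent uint32 = 25

// routeToGo deterministically buckets a user into [0,100), so the
// same user always hits the same backend during the rollout.
func routeToGo(userID string) bool {
	h := fnv.New32a()
	h.Write([]byte(userID))
	return h.Sum32()%100 < goTrafficPercent
}

func main() {
	for _, u := range []string{"alice", "bob", "carol"} {
		fmt.Printf("%s -> go service: %v\n", u, routeToGo(u))
	}
}
```

Determinism matters: a user must not bounce between backends mid-session, and rollback is a single variable change.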
05
We repeat the cycle per domain. When the last Python endpoint has migrated, we shut down the monolith. Operational documentation, runbooks, Grafana / Cloud Monitoring dashboards handed over to the customer's team.
Scope-dependent
What we migrate
Not everything always migrates. Sometimes only the hot path. These are the most frequent patterns we see in production.
The Django/Flask serving the mobile app or SPA. Usually the highest-traffic path and the easiest to migrate — squeezes the most out of goroutines and immediately reduces Cloud cost.
What Celery + Redis does today gets absorbed by a Go service with Pub/Sub or NATS. Bounded goroutine pool, dead-letter queue, controlled backpressure. 10× throughput per instance without touching the business logic.
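A minimal sketch of that bounded pool + dead-letter pattern, assuming the messages have already been pulled from Pub/Sub or NATS (the `consume` and `handle` names are ours):

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// consume processes messages with at most `workers` goroutines in
// flight. Failed messages land in a dead-letter slice instead of
// being lost; the semaphore send provides backpressure.
func consume(msgs []string, workers int) (done, dead []string) {
	var mu sync.Mutex
	var wg sync.WaitGroup
	sem := make(chan struct{}, workers) // bounds in-flight goroutines

	for _, m := range msgs {
		sem <- struct{}{} // backpressure: blocks while the pool is full
		wg.Add(1)
		go func(m string) {
			defer wg.Done()
			defer func() { <-sem }()
			err := handle(m)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				dead = append(dead, m) // dead-letter queue
			} else {
				done = append(done, m)
			}
		}(m)
	}
	wg.Wait()
	return done, dead
}

// handle stands in for the real business logic.
func handle(m string) error {
	if m == "poison" {
		return errors.New("unprocessable message")
	}
	return nil
}

func main() {
	done, dead := consume([]string{"a", "b", "poison", "c"}, 2)
	fmt.Println(len(done), "processed,", len(dead), "dead-lettered")
}
```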
The orchestration part (function calling, retries, streaming) in Go; pure ML (training, fine-tuning) stays in Python. Common hybrid pattern — we apply it at CityXerpa.
Splitting a Django monolith by domain (auth, billing, catalogue, notifications). Each domain becomes an independent Go microservice with its own DB. Strangler pattern per domain, leaving the rest untouched.
Where p99 < 100 ms is contractual. Python does not get there; Java/JVM works but weighs more. Go fits exactly: right latency and footprint, cheaper talent than Java.
Endpoints that receive webhooks (Stripe, Twilio, GitHub) and trigger downstream processing. Go makes these endpoints saturation-proof with goroutines + Pub/Sub to decouple the hot path.
Technical FAQ
It depends on the size and fragmentation of the monolith. A medium-sized Python service with 8–15 hot endpoints migrates in 3–5 months with the strangler pattern, with no downtime and the business running. Large monoliths with hundreds of endpoints are 9–18-month projects, but ROI usually shows in month 3 once the first hot domain has migrated.
Unit tests, yes — they are language-specific. Integration and end-to-end tests (the ones that hit the API) NO: those are precisely the tests that validate the new Go service returns the same response as the old Python one. Keeping them running across the entire migration is the main safety net.
Common case. Three options: (1) Dribba runs the migration end-to-end and hands over docs + 2 weeks of pairing sessions; (2) hybrid model — our Go engineers embedded with your Python team to transfer knowledge while migrating; (3) intensive prior training (1–2 weeks) for the team and ongoing accompaniment. Python engineers with experience pick up Go in 4–8 weeks — the learning curve is short.
Fully. The exposed API does not change while the migration runs. Same endpoints, same schemas, same error codes. The frontend or mobile apps notice nothing. If you want to evolve the API to v2 after the migration, we do that as a separate project.
By default, only the application layer. The DB stays (usually Postgres). The Go service uses pgx or sqlc to hit the same tables. If the DB is itself part of the problem (poor model, heavy queries, missing indexes), we flag it during discovery and propose improvements as a separate track.
We do not migrate them. PyTorch, TensorFlow, scikit-learn, NumPy/pandas stay in Python — Go does not compete there. The split is: HTTP service layer in Go, ML layer in Python (minimal FastAPI or gRPC server). The Go service calls Python via internal gRPC. Common, effective pattern.
Four metrics at every traffic step: (1) p50/p95/p99 latency of the new Go service vs the old Python; (2) error rate per endpoint; (3) response parity — byte-for-byte response comparison for 1000 random requests; (4) Cloud Run cost per request. Grafana or Cloud Monitoring dashboards with automatic alerts if any metric drifts >5%.
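Metric (3) can be sketched as a simple drift check over sampled response pairs, a simplified stand-in for the real shadow-comparison tooling (the `parityDrift` name is ours):

```go
package main

import (
	"bytes"
	"fmt"
)

// parityDrift returns the fraction of sampled requests whose Go
// response differs byte-for-byte from the legacy Python response.
func parityDrift(pythonResp, goResp [][]byte) float64 {
	mismatches := 0
	for i := range pythonResp {
		if !bytes.Equal(pythonResp[i], goResp[i]) {
			mismatches++
		}
	}
	return float64(mismatches) / float64(len(pythonResp))
}

func main() {
	py := [][]byte{[]byte(`{"ok":true}`), []byte(`{"n":1}`), []byte(`{"n":2}`)}
	gr := [][]byte{[]byte(`{"ok":true}`), []byte(`{"n":1}`), []byte(`{"n":3}`)}
	drift := parityDrift(py, gr)
	fmt.Printf("drift: %.1f%%, alert: %v\n", drift*100, drift > 0.05)
}
```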
Hot-path migration (3–5 endpoints in a medium Python service) ranges €60,000–120,000 over 3–5 months. Full migration of a monolith to a Go microservices architecture ranges €150,000–400,000 over 9–18 months. Fixed price per domain after Discovery. The bill usually amortises in 6–12 months from Cloud savings alone.
Has your Python monolith stopped scaling?
We deliver a report with: hot endpoints, test parity, strangler plan per domain, cost estimate and ROI projection. The plan is yours to keep even if you never hire us.
100% senior in-house team in Barcelona and Andorra · Reply within 24 h