Concurrency battle! Which language wins?

2026-05-20

Every language claims it can “do many things at once.” Almost none of them mean the same thing by it. Some run code on every core simultaneously; some only look like they do; one of our contenders has no idea what a thread even is and somehow still ships to a billion devices a day.

Before the bell rings, the one distinction that decides the whole fight:

Concurrency is structuring a program as independent tasks that can make progress in overlapping time windows. One barista taking five orders and juggling them.
Parallelism is physically running more than one task at the same instant — five baristas. You need multiple cores, and crucially, a runtime that will actually use them.

A language can be brilliant at concurrency and terrible at parallelism (hello Ruby, Python, JavaScript). That’s not always a flaw: most web work is I/O-bound — waiting on a database, a network call, a disk — and during that wait there’s no CPU work to parallelise anyway.

Concurrency is about dealing with many things; parallelism is about doing many things. The fight below is really "how much of each, and how safely?"
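
To make the barista picture concrete, here is a minimal Python sketch of concurrency without parallelism, using `asyncio`. The order names and the helpers `make_order` and `one_barista` are invented for illustration, and `asyncio.sleep` stands in for waiting on the espresso machine.

```python
import asyncio

ORDERS = ["latte", "mocha", "tea", "flat white", "espresso"]


async def make_order(name: str, wait_s: float, log: list) -> str:
    log.append(f"start {name}")
    # Stand-in for an I/O wait: the task is suspended here and the
    # single thread is free to pick up another order.
    await asyncio.sleep(wait_s)
    log.append(f"done {name}")
    return name


async def one_barista(orders: list, wait_s: float = 0.05):
    log = []
    # gather() schedules every order as a task on the same event loop
    # and returns the results in the order the coroutines were passed in.
    finished = await asyncio.gather(
        *(make_order(order, wait_s, log) for order in orders)
    )
    return finished, log


finished, log = asyncio.run(one_barista(ORDERS))
print(log)
```

Every order starts before any of them finishes, yet only one thread ever runs. Five baristas would need separate processes, or a runtime that schedules threads across cores.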

The scorecard at a glance

Language	Concurrency model	True CPU parallelism in one process?	Core primitive	Shared mutable state?	Sweet spot
Ruby (MRI)	Threads under a GVL; Ractors/processes for parallelism	✗ threads · ✓ Ractors/processes	`Thread`, `Fiber`, `Ractor`	Yes (but GVL hides many races)	I/O-bound web apps
Python (CPython)	Threads under the GIL; `asyncio`; `multiprocessing`	✗ classic · ✓ free-threaded 3.13+/processes	`threading`, `asyncio`, `multiprocessing`	Yes	I/O-bound, scripting, data (C ext)
Elixir (BEAM)	Actor model — preemptive lightweight processes	✓ scheduler per core	process + message passing	No — share-nothing	Millions of connections, fault-tolerance
Java (JVM)	1:1 OS threads + shared memory; virtual threads (Loom)	✓	`Thread`, virtual threads, `java.util.concurrent`	Yes (JMM, locks)	CPU-bound + high-throughput servers
C++	Native OS threads, manual everything	✓	`std::thread`, `std::async`, atomics, coroutines	Yes (a data race is undefined behaviour)	Maximum performance, systems, low latency
JavaScript	Single-thread event loop + async I/O; Workers	✗ in one thread · ✓ via Workers	`Promise`/`async`, `Worker`, `SharedArrayBuffer`	Not within one thread; Workers pass messages unless you opt into `SharedArrayBuffer`	I/O-bound, UIs, glue code
HTML 😄	None. It is a markup language.	✗ (it does not compute)	`<marquee>`, arguably	Nothing to share	Winning by refusing to play


Now the contenders, one at a time.

Ruby — one pass through the GVL

MRI Ruby has a Global VM Lock: only one thread can execute Ruby code at a time, no matter how many cores you own. The lock exists to keep the interpreter’s internals (object allocation, GC, C extensions) safe without fine-grained locking everywhere.

The GVL is released during blocking I/O, so threads still overlap database and network waits — which is exactly what a web request spends most of its time doing. For CPU parallelism you reach for processes (Puma cluster, Sidekiq) or experimental Ractors; JRuby and TruffleRuby drop the GVL entirely.

Verdict: great at I/O concurrency, no in-process CPU parallelism. The “win” is that for typical Rails workloads you rarely notice — you scale CPU with processes, not threads.

Python — the GIL’s near-identical twin

CPython’s Global Interpreter Lock (GIL) is the counterpart to Ruby’s GVL: one bytecode-executing thread at a time, released around I/O and inside many C extensions (NumPy releases it during heavy math, which is why scientific Python feels parallel). The story is so similar that the diagram above could be relabelled and reused.

Same lock, same escape hatches: asyncio for I/O concurrency, multiprocessing for CPU parallelism, C extensions that drop the lock, and now an optional free-threaded (no-GIL) build, experimental in 3.13 and officially supported from 3.14.
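
A minimal sketch of the first escape hatch, threads overlapping blocking waits, might look like this. `fake_query` and the table names are hypothetical, and `time.sleep` stands in for a database round trip (CPython releases the GIL while a thread sleeps, as it does during real blocking I/O).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(name: str, wait_s: float = 0.1) -> str:
    # Blocking wait: the GIL is released while this thread sleeps,
    # so the other threads can start their own waits in the meantime.
    time.sleep(wait_s)
    return f"{name}: ok"


def run_overlapped(names: list):
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # map() returns results in input order, not completion order.
        results = list(pool.map(fake_query, names))
    return results, time.perf_counter() - started


results, elapsed = run_overlapped(["users", "orders", "invoices", "audit"])
print(results, f"{elapsed:.2f}s")
```

Four 0.1-second waits should finish in roughly the time of one, not four. Swap the sleep for a tight arithmetic loop and, on a GIL build, the threads would take turns instead; that is the case where `concurrent.futures.ProcessPoolExecutor` or `multiprocessing` earns its overhead.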

Verdict: ties Ruby. The differentiator is momentum — the free-threaded build means Python may quietly leave this weight class.

Elixir — a scheduler on every core, a process for everything

Now it gets interesting. Elixir runs on the BEAM (the Erlang VM), built for telephone switches that must never go down. Its concurrency unit is the process: not an OS process, not a thread, but a featherweight green process costing only a few KB. You can spawn millions. Each has its own heap (so GC is per-process and never stops the world), they share nothing, and they talk only by sending immutable messages to each other’s mailboxes.

Share-nothing + preemptive scheduling + per-process heaps = concurrency that scales to millions of connections and shrugs off crashes. This is the model that WhatsApp (on Erlang) and Discord (on Elixir) lean on for exactly this reason.
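
The mailbox idea can be sketched in Python, with loud caveats: this is an analogy, not how the BEAM works. A Python thread is far heavier than a BEAM process, it has no heap of its own, and nothing here supervises or restarts it. `CounterActor` and `ask_total` are hypothetical names for illustration.

```python
import queue
import threading


class CounterActor:
    """Private state, reachable only by sending messages to a mailbox."""

    def __init__(self) -> None:
        self._mailbox = queue.Queue()
        self._count = 0  # read and written only by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message: tuple) -> None:
        # Messages are tuples, so the sender cannot mutate them afterwards.
        self._mailbox.put(message)

    def _run(self) -> None:
        # One message at a time, in arrival order: no locks around _count.
        while True:
            message = self._mailbox.get()
            if message[0] == "add":
                self._count += message[1]
            elif message[0] == "get":
                message[1].put(self._count)  # reply into the caller's queue
            elif message[0] == "stop":
                break


def ask_total(actor: CounterActor) -> int:
    reply = queue.Queue()
    actor.send(("get", reply))
    return reply.get(timeout=5)


def demo() -> int:
    actor = CounterActor()

    def sender() -> None:
        for _ in range(100):
            actor.send(("add", 1))

    senders = [threading.Thread(target=sender) for _ in range(4)]
    for s in senders:
        s.start()
    for s in senders:
        s.join()
    total = ask_total(actor)  # queued after every "add", so it sees them all
    actor.send(("stop",))
    return total


total = demo()
print(total)
```

Four senders hammer the actor at once, but the count is only ever touched by one thread, so there is nothing to lock. On the BEAM you get this shape for free, plus preemption and crash isolation that this sketch does not attempt.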

Verdict: the heavyweight champion of concurrency. It won’t out-crunch C++ on a tight numeric loop, but for “hold a million live connections and never fall over,” nothing here is close.

Java — real threads, shared memory, and now virtual threads

The JVM gives you honest 1:1 OS threads: schedule them on every core, get genuine parallelism. The price is shared mutable memory, governed by the Java Memory Model — you coordinate with synchronized, volatile, locks, and the excellent java.util.concurrent toolbox (thread pools, concurrent collections, CompletableFuture). Powerful, fast, and absolutely able to deadlock or race if you’re careless.
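
The coordination cost of shared mutable memory looks much the same in any language. As a rough Python analogue of a `synchronized` block (not Java code, and `SharedCounter` and `hammer` are hypothetical names), a lock around a read-modify-write might be sketched as:

```python
import threading


class SharedCounter:
    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Without the lock, two threads could both read the same old value
        # and one of the two updates would be lost.
        with self._lock:
            self.value += 1


def hammer(counter: SharedCounter, threads: int = 8, per_thread: int = 10_000) -> int:
    def work() -> None:
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value


total = hammer(SharedCounter())
print(total)
```

The lock makes the total come out right every time. Forgetting it is the kind of bug that passes every test and then shows up in production under load.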

Virtual threads let you write straightforward blocking code that scales like async — the runtime unmounts a virtual thread from its carrier whenever it blocks. Java effectively bought Elixir's concurrency ergonomics while keeping shared-memory parallelism.

Verdict: the best all-rounder. Real parallelism and, since Loom, cheap massive concurrency. The catch is the one constant of shared-memory threading: you can still write the bug.

C++ — maximum power, zero seatbelts

std::thread is a thin wrapper over an OS thread. No GVL, no GIL, no runtime, no garbage collector pausing you — your threads hit the metal on every core. You get atomics, mutexes, condition variables, a formal memory model (since C++11), and coroutines (C++20). You also get to personally guarantee there are no data races, because a data race in C++ is undefined behaviour — not a crash, not an exception, but “the compiler may do anything.”

If the workload is CPU-bound and every nanosecond counts — trading engines, game engines, codecs — C++ wins on raw throughput. Modern tooling (ThreadSanitizer, RAII locks, std::jthread) softens the edges, but the safety is on you.

Verdict: the raw-power champion. Highest ceiling, lowest guard rails. You win the benchmark and accept the responsibility.

JavaScript — one thread, an event loop, and zero data races

JavaScript runs your code on a single thread. It feels concurrent because almost nothing blocks: I/O is handed off (to the browser’s Web APIs, or libuv in Node), and when it finishes, a callback is queued and the event loop picks it up between turns. Promises and async/await are sugar over this queue. Because only one thread touches your variables, two pieces of your code can never write the same variable at the same instant, so data races on shared state are impossible by construction, a genuinely underrated win. (Logical races between `await` points can still happen; memory corruption cannot.)

One thread keeps your logic race-free; offloaded I/O keeps it responsive. For CPU-bound work you spin up Workers, which, like Elixir processes, communicate by message passing by default; `SharedArrayBuffer` is the opt-in exception that brings shared memory back.
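
Python’s `asyncio` uses the same single-threaded event-loop design, so it can stand in for JavaScript in a small sketch of the race-free property. The names `bump` and `main` and the task counts are arbitrary choices for illustration.

```python
import asyncio
import threading


async def bump(state: dict, thread_ids: set, times: int) -> None:
    for _ in range(times):
        thread_ids.add(threading.get_ident())
        # No await inside this read-modify-write, so no other task can
        # run in the middle of it. No lock is needed.
        state["hits"] += 1
        # Hand control back to the event loop between turns.
        await asyncio.sleep(0)


async def main(tasks: int = 50, times: int = 100):
    state = {"hits": 0}
    thread_ids = set()
    await asyncio.gather(*(bump(state, thread_ids, times) for _ in range(tasks)))
    return state["hits"], thread_ids


hits, thread_ids = asyncio.run(main())
print(hits, len(thread_ids))
```

Fifty interleaved tasks update one shared counter with no lock, and the count is exact because only one thread ever ran. The guarantee stops at each `await`: read a value, await something, then write it back, and another task may have changed it in between.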

Verdict: punches far above its single thread for I/O-bound and UI work, with a safety guarantee the shared-memory languages envy. It just can’t crunch numbers on every core without leaving the comfort of one event loop.

HTML — the undefeated champion (by forfeit)

HTML is not a programming language. It has no variables, no loops, no threads, no locks. It cannot deadlock. It has shipped exactly zero race conditions in its entire history.

The only contender that wins by never entering the ring. There's a serious point hiding in the joke: the most reliable concurrent code is the concurrency you don't write — push it down into a runtime (BEAM), a browser, or a database that has already solved it.

So who wins?

There’s no single belt; it depends on what you’re fighting for.

[Figure: raw CPU-parallel power plotted against how safe and ergonomic the concurrency is. Caption: No universal winner; pick the corner that matches the job.]

A cheat sheet for picking your fighter:

Hold a huge number of live connections and never go down → Elixir. Built for it; the others bolt it on.
CPU-bound throughput with a mature ecosystem → Java (or C++ if every nanosecond counts and you trust your team’s discipline).
Squeeze the absolute most out of the hardware → C++, eyes open.
I/O-bound web apps, fast to build → Ruby, Python, or JavaScript — the GVL/GIL/single-thread “limitation” rarely bites, because you’re waiting on the database, and you scale CPU with processes.
A UI or anything in the browser → JavaScript, gratefully accepting its race-free single thread.

The honest conclusion: most “concurrency problems” in everyday backends are really I/O-overlap problems, and almost every language here handles those fine. True CPU parallelism is the rarer need — and the moment you reach for it, the language’s model (shared memory vs. share-nothing) matters far more than its raw speed.

See it in motion. Web servers are where most of us actually meet this fight: processes (workers) for CPU parallelism, threads for I/O overlap, a connection pool to the database, all behind a load balancer. Tune the worker and thread counts and watch throughput, the GVL, and the database queue react live in the Scaling Web Servers simulator.
