Running 1 Million Threads in Java

Shazin Sadakath
4 min read · Oct 31, 2022

Before we dive into creating a million threads, let’s have a look at the history of Java threads.

Java threads have been around since Java 1.0 to provide concurrency within Java applications. Early JVMs implemented them as “Green Threads”, which were threads scheduled and managed entirely by the JVM rather than by the OS. At the time of Java 1.0’s release, CPUs were mostly single core and many OSes had limited or no support for Kernel/Platform threads (more on this later). This was a Many to One implementation, where many Java threads were multiplexed onto a single OS thread. Each of these Java threads still had its own call stack, which used a lot of memory.

But as CPUs and OSes became more advanced and added support for Kernel/Platform level threads, the Java implementation of threads also changed. Java later adopted a One to One implementation, where a Java thread is effectively a thin wrapper over a Kernel/Platform thread. A Kernel/Platform thread is created and managed by the OS. Apart from heap storage, each Java thread wrapping a Kernel/Platform thread may consume 1 megabyte of memory or more, and it is expensive to create. To reduce this cost, thread pools were used to reuse Java threads.

This put a practical upper bound on how many threads a Java application could use for concurrency. The following rule of thumb is commonly used to size a thread pool for CPU-bound work.

Thread Pool Size = Number of CPU Cores + 1
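As a rough illustration of the rule of thumb above, the following sketch sizes a fixed thread pool from the number of available cores and reuses its threads for a batch of small tasks. The class and method names (PoolSizingSketch, poolSize, runTasks) are made up for this example, and the sizing rule is only a starting point, not a guarantee of the best pool size for every workload.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolSizingSketch {

    // Rule of thumb from the text: number of CPU cores + 1.
    static int poolSize(int cores) {
        return cores + 1;
    }

    // Runs `tasks` small jobs on a fixed-size pool and returns how many completed.
    static int runTasks(int tasks) {
        int size = poolSize(Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(size);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            // The same few pooled threads are reused for every task.
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown(); // stop accepting new tasks; queued tasks still run
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Pool size = " + poolSize(cores));
        System.out.println("Completed = " + runTasks(100));
    }
}
```

With a pool like this, the number of tasks can be far larger than the number of threads, but a task that blocks still occupies one pooled thread for as long as it waits.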

This was always a limitation in Java, as other languages and toolkits such as Go (goroutines), Erlang (processes) and Akka (actors) offer lighter-weight models for concurrent programming.

To address this limitation, Project Loom was born to support “Virtual Threads” in Java, which are lighter than the existing Kernel/Platform wrapped threads. These Virtual threads are again managed by the JVM rather than by the underlying OS, similar to the Green threads of Java 1.0. The striking difference between Green threads and Virtual threads, though, is that a Virtual thread has a dynamic call stack which can grow and shrink, whereas Green threads had fixed call stacks which consumed memory.
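At the API level, the two kinds of thread can be told apart with Thread.isVirtual(). The sketch below starts one thread of each kind through the Thread.ofPlatform() and Thread.ofVirtual() builders; it assumes a JDK where virtual threads are available (a preview in Java 19, final from Java 21), and the class and helper names are made up for illustration.

```java
class ThreadKindsSketch {

    // Starts a thread from the given builder, waits for it to finish,
    // and reports whether it was a virtual thread.
    static boolean startAndCheck(Thread.Builder builder) {
        Thread t = builder.start(() ->
                System.out.println("Running in " + Thread.currentThread()));
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return t.isVirtual();
    }

    public static void main(String[] args) {
        boolean platform = startAndCheck(Thread.ofPlatform().name("platform-demo"));
        boolean virtual = startAndCheck(Thread.ofVirtual().name("virtual-demo"));
        System.out.println("platform-demo is virtual? " + platform);
        System.out.println("virtual-demo is virtual? " + virtual);
    }
}
```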

To implement Virtual threads, Project Loom introduces three new concepts within the JVM.

  1. Scheduler: This is a ForkJoinPool whose parallelism by default equals the number of cores in the CPU.
  2. Carrier Thread: A Carrier thread is a Kernel/Platform thread which is used to execute the tasks of Virtual threads. The Scheduler has a pool of Carrier threads. A Virtual thread’s call stack is mounted onto a Carrier thread while it runs there and unmounted when it stops running.
  3. Continuation: This is the JVM mechanism that lets a Virtual thread yield and later resume where it left off, so it can switch between running and idling based on what it is doing.
    Ex: When a Virtual thread makes a blocking call such as a database query or an HTTP request, it may yield until it gets a response so that other Virtual threads can execute.
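One way to see the Scheduler and Carrier threads at work is to record which carrier each Virtual thread is mounted on before and after a blocking call. The sketch below does this by parsing Thread.toString(), which on current JDKs ends with the carrier name after an "@" for a mounted virtual thread. That format is an implementation detail rather than a documented contract, so treat the parsing as an assumption; the class and method names are also made up for this example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class CarrierSketch {

    // Extracts the text after '@' from a thread's toString(), which is
    // assumed (not guaranteed) to be the carrier name for a mounted
    // virtual thread.
    static String carrierOf(Thread t) {
        String s = t.toString();
        int at = s.indexOf('@');
        return at >= 0 ? s.substring(at + 1) : "unknown";
    }

    // Starts the given number of virtual threads and collects the distinct
    // carrier names they report.
    static Set<String> observeCarriers(int virtualThreads) {
        Set<String> carriers = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(virtualThreads);
        for (int i = 0; i < virtualThreads; i++) {
            Thread.startVirtualThread(() -> {
                carriers.add(carrierOf(Thread.currentThread())); // currently mounted here
                try {
                    Thread.sleep(5); // blocking call: the virtual thread can yield
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                carriers.add(carrierOf(Thread.currentThread())); // may differ after resuming
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return carriers;
    }

    public static void main(String[] args) {
        Set<String> carriers = observeCarriers(10_000);
        System.out.println("Virtual threads started: 10000");
        System.out.println("Distinct carriers seen: " + carriers.size());
        System.out.println(carriers);
    }
}
```

The number of distinct carriers should stay small, roughly in line with the number of cores, no matter how many Virtual threads are started.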

So let’s see Virtual threads in action and compare them against Platform/Kernel threads. Java 19 was released with a preview of Project Loom’s Virtual threads, which is what I used to carry out this demonstration. Initially I ran an application which tries to create 1 million Platform/Kernel threads.

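import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class PlatformThreadsDemo {
    public static void main(String[] args) throws InterruptedException {
        // Shared counter incremented once by every thread.
        AtomicInteger number = new AtomicInteger(0);
        long start = System.currentTimeMillis();
        // The latch lets the main thread wait until every thread has finished.
        CountDownLatch countDownLatch = new CountDownLatch(1_000_000);
        for (int i = 0; i < 1_000_000; i++) {
            // Each iteration creates and starts one Platform/Kernel thread.
            Thread normalThread = new Thread(() -> {
                System.out.println("Hello, World from Regular Thread : " + Thread.currentThread().getName());
                number.incrementAndGet();
                try {
                    Thread.sleep(5); // short idle period
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                countDownLatch.countDown();
            });
            normalThread.start();
        }
        countDownLatch.await();
        long end = System.currentTimeMillis();
        System.out.println("Time Taken = " + (end - start));
        System.out.println("Number = " + number.get());
    }
}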

This resulted in a thread count of 113 within the JVM and took over 100,000 milliseconds (more than a minute and a half) to finish on my MacBook Pro laptop with an M1 Pro CPU and 16 gigabytes of memory.

Then I tested similar code using Virtual threads. Because Virtual threads are a preview feature in Java 19, I had to use the --enable-preview flag both when compiling and when running.

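import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadsDemo {
    public static void main(String[] args) throws InterruptedException {
        // Shared counter incremented once by every thread.
        AtomicInteger number = new AtomicInteger(0);
        // The latch lets the main thread wait until every thread has finished.
        CountDownLatch countDownLatch = new CountDownLatch(1_000_000);
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1_000_000; i++) {
            // Each iteration creates and starts one Virtual thread.
            Thread.startVirtualThread(() -> {
                System.out.println("Hello, World from Virtual Thread");
                number.incrementAndGet();
                try {
                    Thread.sleep(5); // short idle period
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                countDownLatch.countDown();
            });
        }
        countDownLatch.await();
        long end = System.currentTimeMillis();
        System.out.println("Time Taken = " + (end - start));
        System.out.println("Number = " + number.get());
    }
}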

This ran in only 13,000 milliseconds (13 seconds) on the same machine, using fewer threads (46).

As a control, I ran another Java application that creates no threads and simply sleeps, to check the default number of threads, which was around 22.
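The thread counts quoted here can be obtained in several ways, for example with a profiler or a monitoring tool. One possible approach, shown as an assumption rather than as the method used for the numbers above, is to ask the JVM through ThreadMXBean. On recent JDKs this bean reports Platform threads only, so Virtual threads do not inflate the figure. The class and method names are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class ThreadCountSketch {

    // Number of live threads currently known to the JVM's thread MXBean.
    static int liveThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        return bean.getThreadCount();
    }

    // Highest live thread count seen since the JVM started (or since the peak was reset).
    static int peakThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        return bean.getPeakThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("Live threads = " + liveThreads());
        System.out.println("Peak threads = " + peakThreads());
    }
}
```

Calling peakThreads() at the end of each demo would give a figure comparable across the Platform thread run, the Virtual thread run and the control.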

This demonstration shows that Virtual threads do not map One to One to Kernel/Platform threads and, in some cases, can execute faster because they share a pool of Carrier threads to execute their tasks. Both tasks included an I/O operation (printing to the console) and idle time (the sleep).

Conclusion

This clearly doesn’t mean one can blindly replace all existing threads with Virtual threads. Yet it opens up a new way of thinking, where creating a Virtual thread per task becomes possible again in Java. This means a Web server which uses Virtual threads can create a Virtual thread per request with far less worry about running out of JVM memory.
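A thread-per-task style can be sketched with Executors.newVirtualThreadPerTaskExecutor(), which starts a new Virtual thread for every submitted task. The handle method below is a hypothetical stand-in for real request handling, and the class name is made up; the sketch assumes a JDK where virtual threads are available (a preview in Java 19, final from Java 21).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerRequestSketch {

    // Hypothetical request handler: simulates blocking I/O, then returns a response.
    static String handle(int requestId) {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "response-" + requestId;
    }

    // Handles every request on its own virtual thread and returns the responses in order.
    static List<String> serve(int requests) {
        List<Future<String>> futures = new ArrayList<>();
        List<String> responses = new ArrayList<>();
        // try-with-resources: close() waits for all submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                int id = i;
                Callable<String> task = () -> handle(id);
                futures.add(executor.submit(task)); // one new virtual thread per task
            }
            for (Future<String> future : futures) {
                responses.add(future.get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return responses;
    }

    public static void main(String[] args) {
        List<String> responses = serve(10_000);
        System.out.println("Handled " + responses.size() + " requests");
    }
}
```

Unlike a fixed thread pool, there is no pool size to tune here: the executor does not reuse threads, and each blocked task only holds on to its own cheap Virtual thread.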

This opens up new avenues in implementing concurrency in Java and I am pretty excited to see its full release possibly in Java 20 or later.
