- Mihai Surdeanu

Long JVM pause detector in Java

Today, we are going to implement a simple JVM pause detector thread that can be used to detect long GC pauses inside your application.

First of all: what is a JVM pause? Sometimes the JVM goes through long garbage collection cycles that result in one or more stop-the-world events, during which all application threads are suspended. Fewer JVM pauses mean more time for application threads to run on the processor.

Secondly: why should we detect these JVM pauses? With a detector thread, we can capture these events programmatically and feed them into our monitoring stack. You can also take the time to enable GC logging for your application by following this tutorial.
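As a side note, the JVM also exposes cumulative GC statistics through the standard `java.lang.management` API, which can help confirm whether a detected pause lines up with GC activity. The snippet below is only a minimal sketch: the class name `GcTotals` and the method `totalGcMillis` are made up for illustration, and the reported collection time is approximate and is not necessarily equal to stop-the-world time for concurrent collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the detector shown in this post.
final class GcTotals {

    // Sums the approximate accumulated collection time (ms) over all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            final long time = gc.getCollectionTime(); // -1 if the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        final long before = totalGcMillis();
        System.gc(); // only a hint; the JVM may ignore it
        final long after = totalGcMillis();
        System.out.println("GC time accumulated: " + (after - before) + " ms");
    }
}
```

Sampling this value before and after a reported pause gives a rough indication of how much of the pause was spent in garbage collection.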

A possible implementation of a LongJVMPauseDetector could look like this:

package org.myalerts.component;

import lombok.extern.slf4j.Slf4j;

import java.util.concurrent.atomic.AtomicReference;

@Slf4j
public final class LongJVMPauseDetector {

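    // How often the worker wakes up, in milliseconds.
    private static final int PRECISION = 50;

    // Minimum oversleep, in milliseconds, that is reported as a long pause.
    private static final int THRESHOLD = 500;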

    private final AtomicReference<Thread> workerRef = new AtomicReference<>();

    private long lastWakeUpTime;

    public void start() {
        final var worker = new Thread("jvm-pause-detector-worker") {

            @Override public void run() {
                // System.nanoTime() is monotonic, so wall-clock adjustments cannot fake a pause.
                lastWakeUpTime = System.nanoTime();

                while (true) {
                    try {
                        Thread.sleep(PRECISION);

                        final var now = System.nanoTime();
                        // Time spent beyond the requested sleep, in milliseconds.
                        final var pause = (now - lastWakeUpTime) / 1_000_000L - PRECISION;
                        lastWakeUpTime = now;

                        if (pause >= THRESHOLD) {
                            // TODO: Emit a monitoring event here that carries the detected pause (in ms).
                        }
                    } catch (InterruptedException e) {
                        if (workerRef.compareAndSet(this, null)) {
                            log.error(getName() + " has been interrupted.", e);
                        } else if (log.isDebugEnabled()) {
                            log.debug(getName() + " has been stopped.");
                        }

                        break;
                    }
                }
            }
        };

        if (!workerRef.compareAndSet(null, worker)) {
            log.warn(LongJVMPauseDetector.class.getSimpleName() + " already started!");
            return;
        }

        worker.setDaemon(true);
        worker.start();

        if (log.isDebugEnabled()) {
            log.debug("LongJVMPauseDetector was successfully started");
        }
    }

    public void stop() {
        final var worker = workerRef.getAndSet(null);

        if (worker != null && worker.isAlive() && !worker.isInterrupted()) {
            worker.interrupt();
        }
    }

}

There is a small TODO line above. This is where you plug in the monitoring event you want to generate. As you can see, the event is generated only if the detected pause is greater than or equal to 500 ms. In theory, if there is no GC activity, this thread wakes up roughly every 50 ms because of the Thread.sleep(PRECISION); line. If a full GC starts during the sleep phase and lasts 2 minutes, the detector thread cannot run until it finishes; the next time it is scheduled, it measures a pause far above 500 ms and an alert is raised. Because the detector only measures how late it woke up, any other stall that delays the thread (for example, heavy CPU contention) is reported the same way.
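To make the arithmetic concrete, here is a small sketch of the pause calculation and of one way the TODO could be filled in. It is not the only option: the names `PauseCheck`, `pauseMillis`, `report`, `PRECISION_MS` and `THRESHOLD_MS` are hypothetical, the listener is a plain `LongConsumer` standing in for whatever monitoring client you use, and the timestamps are assumed to be in nanoseconds (for example, taken from `System.nanoTime()`).

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongConsumer;

// Hypothetical helper that isolates the detector's arithmetic so it can be tested.
final class PauseCheck {

    static final long PRECISION_MS = 50;   // requested sleep between wake-ups
    static final long THRESHOLD_MS = 500;  // oversleep that counts as a long pause

    // Oversleep in ms: time elapsed between two wake-ups minus the requested sleep.
    static long pauseMillis(long lastWakeUpNanos, long nowNanos) {
        return TimeUnit.NANOSECONDS.toMillis(nowNanos - lastWakeUpNanos) - PRECISION_MS;
    }

    // Notifies the listener only when the pause reaches the threshold.
    // Returns true if an event was emitted.
    static boolean report(long pauseMs, LongConsumer listener) {
        if (pauseMs < THRESHOLD_MS) {
            return false;
        }
        listener.accept(pauseMs);
        return true;
    }

    public static void main(String[] args) {
        final LongConsumer listener =
                ms -> System.out.println("Long JVM pause detected: " + ms + " ms");

        final long start = 0L;
        // Quiet case: the thread woke up 52 ms after the previous wake-up.
        final long afterQuietSleep = TimeUnit.MILLISECONDS.toNanos(52);
        // Full GC case: 50 ms of sleep plus a 2-minute stall.
        final long afterFullGc = TimeUnit.MILLISECONDS.toNanos(50 + 120_000);

        report(pauseMillis(start, afterQuietSleep), listener); // 2 ms, below threshold, silent
        report(pauseMillis(start, afterFullGc), listener);     // prints "Long JVM pause detected: 120000 ms"
    }
}
```

Keeping the calculation in a pure method like this makes the threshold logic easy to unit test without actually sleeping or forcing a garbage collection.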

Happy coding!

Other Related Posts:

About Maps in Java (Despre Map-uri în Java)

Hello everyone,

A few weeks ago we discussed several list implementations in Java and highlighted some of their most important similarities and differences. Today we stay on the topic of collections, but we will talk about Maps.

14th Feb 2022 - Mihai Surdeanu