- Mihai Surdeanu

Microbenchmarking with Java

Hello guys,

Today, we are going to continue the series of articles about the Java ecosystem. More specifically, we are going to talk about JMH (Java Microbenchmark Harness).

First of all, what is JMH?

JMH is a tool for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM. The tool is provided by the OpenJDK project, and its source code is available here.

How to setup a project with JMH?

You can easily set up a Maven project from your IDE. If you are using IntelliJ, even better, because there is also a plugin available. This project will be a new, standalone project that depends on your existing code. This approach is preferred to ensure that the benchmarks are correctly initialized and produce reliable results. In addition, you can go one step further and create multiple benchmarks that can be integrated into your deployment pipeline. In other words, you can also use this to write some performance scenarios to make sure there is no degradation over time for some specific code. If you are dealing with a real-time application, you will find this tool quite useful.

To get started, we can actually keep working with Java 11 and simply define the dependencies:

<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>1.35</version>
</dependency>
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>1.35</version>
</dependency>

The latest versions of the JMH Core and JMH Annotation Processor can be found in Maven Central.

To write a new benchmark, you only need experience with Java and its annotations. Nothing special.

A trivial example - string concatenation

Let's suppose that we want to concatenate all numbers from 0 (inclusive) to 1000 (exclusive) into a string and print it to the console. There are multiple approaches for this. The first and most trivial one is to use a local variable and a for loop to accumulate the string value. The second one is to use StringBuilder. The last one is fancier, because it uses something added in Java 8: streams.

Having all this in mind, the question is quite simple: which one will be the fastest at accomplishing our task?

How to set up all cases?

  1. The first step is to create a new Maven project and add the JMH dependencies.
  2. The next step is to create a BenchmarkRunner class:
public class BenchmarkRunner {

    public static void main(String[] args) throws Exception {
        // Delegate to the JMH entry point, which discovers and runs all benchmarks.
        org.openjdk.jmh.Main.main(args);
    }

}
  3. The next step is to set up the actual benchmark:
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Warmup;

@Fork(value = 1, warmups = 1)
@Warmup(iterations = 1)
@Measurement(iterations = 1)
@OutputTimeUnit(value = TimeUnit.NANOSECONDS)
public class StringConcatBenchmark {

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    public void trivialAppend() {
        String value = "";
        for (int i = 0; i < 1_000; i++) {
            value += i;
        }
        System.out.println(value);
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    public void builderAppend() {
        StringBuilder value = new StringBuilder();
        for (int i = 0; i < 1_000; i++) {
            value.append(i);
        }
        System.out.println(value);
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    public void streamAppend() {
        System.out.println(IntStream.range(0, 1_000).boxed().map(Object::toString).collect(Collectors.joining()));
    }

}
  4. The last step is to run BenchmarkRunner from the command line or from your IDE (an alternative, programmatic runner is sketched below).
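
If you prefer to select the benchmarks programmatically instead of passing arguments to org.openjdk.jmh.Main, you can also use the JMH Runner API. Here is a minimal sketch, reusing the BenchmarkRunner name just for illustration:

import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class BenchmarkRunner {

    public static void main(String[] args) throws RunnerException {
        // Run only the benchmarks whose names match StringConcatBenchmark.
        Options options = new OptionsBuilder()
                .include(StringConcatBenchmark.class.getSimpleName())
                .build();
        new Runner(options).run();
    }

}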

The results are printed to the console in a couple of seconds:

Benchmark                            Mode        Score  Units
StringConcatBenchmark.builderAppend  avgt    39285.889  ns/op
StringConcatBenchmark.streamAppend   avgt    51488.468  ns/op
StringConcatBenchmark.trivialAppend  avgt   245048.328  ns/op

As you can see, the winner is StringBuilder: the naive += approach creates a new String on every iteration, while StringBuilder keeps appending to the same internal buffer. I forgot to mention that I'm running the benchmark using OpenJDK 11.
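
A side note: the benchmark methods above print inside the measured code, so the console I/O is part of the reported score. If you want to measure only the concatenation itself, a common pattern is to return the result so that JMH consumes it and the JIT cannot optimize the work away. A minimal sketch of the StringBuilder variant (the method name is just for illustration, and the println would move outside the benchmark):

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    public String builderAppendNoPrint() {
        StringBuilder value = new StringBuilder();
        for (int i = 0; i < 1_000; i++) {
            value.append(i);
        }
        // Returning the value lets JMH consume it, so the work cannot be eliminated as dead code.
        return value.toString();
    }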

More details about some annotations / terms

  • @Fork: The JMH benchmark is run for a number of forks (separate JVM processes). In my case, it is just a single fork.
  • @Warmup(iterations = 1): For each fork, a number of iterations are configured as warmups. This is to get the JVM to warm up the code we are measuring, which is important to avoid variations caused by the JVM initialization phase.
  • @Measurement(iterations = 1): How many iterations of the current benchmark will be measured? Again, in my case it is just one measurement.
  • @OutputTimeUnit: Specifies the output time unit. By default it is set to seconds.
  • @Benchmark: Marks the actual benchmark method.
  • @BenchmarkMode: Represents the mode of the benchmark (see the sketch after this list). Supported values are:
    • Throughput (default mode): Measures the number of times a method is executed in a certain amount of time.
    • AverageTime: Measures the average time for a method execution.
    • SampleTime: Samples the time for each operation and shows different percentiles (50, 90, 99), plus min and max values.
    • SingleShotTime: Measures the time for one single operation.
    • All: Incorporates all of the above modes.
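
As a quick illustration of @BenchmarkMode, a single method can even be measured in several modes at once. A minimal sketch (the class and method names are just for illustration):

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;

public class BenchmarkModeExample {

    // Reports both the throughput (operations per microsecond) and the
    // average time (microseconds per operation) for the same method.
    @Benchmark
    @BenchmarkMode({Mode.Throughput, Mode.AverageTime})
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    public String concatTwoNumbers() {
        return String.valueOf(1_000) + String.valueOf(2_000);
    }

}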

Where can you find good documentation about this tool?

There is no official documentation. Why? Because there are multiple samples provided, and they are quite nicely organized and explained.

You can find more samples to use as a source of inspiration here. This is the official samples repository, but you can surely find other sources on the internet as well.

Good luck!
