Microbenchmarking Scala with JMH

There are often several ways to write a piece of code, and sometimes the choice comes down to which implementation runs fastest. A shootout between the candidates can settle the question. The Java Microbenchmark Harness (JMH) tool can help us get an experimental answer, and the sbt-jmh plugin makes it very easy to run JMH benchmarks on Scala or Java code in an sbt project.

toHexString

Suppose we need to implement a method to convert an Array[Byte] to its hexadecimal representation. There are multiple ways we could do this. A concise approach is to use Scala's formatted string interpolation.

scala> def toHexString(bytes: Array[Byte]) =
           bytes.map(b => f"$b%02x").mkString
toHexString: (bytes: Array[Byte])String
scala> toHexString("Scala".getBytes)
res1: String = 5363616c61

String interpolation looks convenient, but maybe we should try a version not using string interpolation to see if there's a cost to using it.

scala> def toHexString(bytes: Array[Byte]) =
           bytes.map(b => "%02x".format(b)).mkString
toHexString: (bytes: Array[Byte])String
scala> toHexString("Scala".getBytes)
res3: String = 5363616c61

As a third option, we might consult StackOverflow to see how others solve this problem and maybe get better performance.

scala>   def toHexString(bytes: Array[Byte]) = {
     |     val hexArray: Array[Byte] = Array(
     |           '0', '1', '2', '3', '4',
     |           '5', '6', '7', '8', '9',
     |           'A', 'B', 'C', 'D', 'E',
     |           'F')
     |     val hexChars = Array.fill(bytes.size * 2)(0.toByte)
     |     for {
     |       j <- 0 to bytes.length - 1
     |       v = bytes(j) & 0xFF
     |     } {
     |       hexChars(j * 2) = hexArray(v >>> 4)
     |       hexChars(j * 2 + 1) = hexArray(v & 0x0F)
     |     }
     |     new String(hexChars)
     |   }
toHexString: (bytes: Array[Byte])String
scala> toHexString("Scala".getBytes)
res1: String = 5363616C61

I know which one I'd choose for readability, but let's see which one performs the best.
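Before timing anything, it is worth confirming that the three versions actually compute the same thing. The REPL output above already shows one difference: the lookup-table version emits uppercase digits, while the other two emit lowercase. The sketch below compares them ignoring case. The object name HexCheck and the helper agree are made up for this example, and the lookup table is built from a string rather than from char literals, which is an assumption made only to keep the snippet short.

```scala
import java.nio.charset.StandardCharsets.US_ASCII

object HexCheck {
  def toHexStringInterp(bytes: Array[Byte]): String =
    bytes.map(b => f"$b%02x").mkString

  def toHexStringFormat(bytes: Array[Byte]): String =
    bytes.map(b => "%02x".format(b)).mkString

  def toHexString(bytes: Array[Byte]): String = {
    // Same 16-entry lookup table as above, built from a string.
    val hexArray = "0123456789ABCDEF".getBytes(US_ASCII)
    val hexChars = new Array[Byte](bytes.length * 2)
    for (j <- bytes.indices) {
      val v = bytes(j) & 0xFF                  // treat the byte as unsigned
      hexChars(j * 2) = hexArray(v >>> 4)      // high nibble
      hexChars(j * 2 + 1) = hexArray(v & 0x0F) // low nibble
    }
    new String(hexChars, US_ASCII)
  }

  // Hypothetical helper: true when all three variants agree, ignoring case.
  def agree(bytes: Array[Byte]): Boolean = {
    val expected = toHexString(bytes).toLowerCase
    toHexStringInterp(bytes) == expected && toHexStringFormat(bytes) == expected
  }

  def main(args: Array[String]): Unit = {
    // Every possible byte value, including the negative ones.
    val allBytes = (-128 to 127).map(_.toByte).toArray
    assert(agree(allBytes))
    assert(agree("Scala".getBytes))
    println("all three variants agree (ignoring case)")
  }
}
```

If the case of the digits matters to callers, one of the versions would need a toLowerCase or toUpperCase call, and that extra work would belong in the benchmark too.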

JMH

To use JMH via the sbt-jmh plugin, we need an sbt project whose project/plugins.sbt file contains the following line:

addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.2.4")

and enable it in your project in build.sbt:

enablePlugins(JmhPlugin)

Now we need to create a class that contains methods to benchmark. We tell JMH to benchmark a method by using the Benchmark annotation. We can then configure how the method will be tested using the BenchmarkMode and OutputTimeUnit annotations.

// Must not be in default package
package com.chariotsolutions.jmh.sample

import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.BenchmarkMode
import org.openjdk.jmh.annotations.Mode
import org.openjdk.jmh.annotations.OutputTimeUnit
import java.util.concurrent.TimeUnit

/* Default settings for benchmarks in this class */
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@BenchmarkMode(Array(Mode.Throughput))
class TestHexString {

  // Each benchmark converts a fresh random array with one implementation.
  @Benchmark
  def interpolation: Unit = toHexStringInterp(randomArray)

  @Benchmark
  def format: Unit = toHexStringFormat(randomArray)

  @Benchmark
  def stringManip: Unit = toHexString(randomArray)

  // Formatted string interpolation
  def toHexStringInterp(bytes: Array[Byte]) =
    bytes.map(b => f"$b%02x").mkString

  // Lookup table and direct byte manipulation
  def toHexString(bytes: Array[Byte]) = {
    val hexArray: Array[Byte] = Array(
      '0', '1', '2', '3', '4',
      '5', '6', '7', '8', '9',
      'A', 'B', 'C', 'D', 'E',
      'F')
    val hexChars = Array.fill(bytes.size * 2)(0.toByte)
    for {
      j <- 0 to bytes.length - 1
      v = bytes(j) & 0xFF
    } {
      hexChars(j * 2) = hexArray(v >>> 4)
      hexChars(j * 2 + 1) = hexArray(v & 0x0F)
    }
    new String(hexChars)
  }

  // String.format without interpolation
  def toHexStringFormat(bytes: Array[Byte]) =
    bytes.map(b => "%02x".format(b)).mkString

  // Builds a new 20-byte random array on every call, so this cost is
  // included in each measured operation.
  def randomArray: Array[Byte] = {
    val a = Array.fill(20)(0.toByte)
    scala.util.Random.nextBytes(a)
    a
  }
}

Note that this class must not be in the default package; otherwise JMH will not run correctly and the sbt session will die. Also note that I put the OutputTimeUnit and BenchmarkMode annotations at the class level to set defaults for all of my benchmark methods. The three methods marked with the Benchmark annotation simply call the appropriate toHexString method with a random array of bytes.
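One thing to keep in mind with this class is that each benchmark builds its random array inside the measured call and discards the result by returning Unit. A common JMH pattern is to prepare the input in a State object and to return the computed value, so that the JIT compiler cannot treat the work as dead code. The following is a minimal sketch of that pattern, not the code that produced the results in this post. The class name TestHexStringState and the field name input are made up for illustration, and only two of the three variants are shown.

```scala
package com.chariotsolutions.jmh.sample

import org.openjdk.jmh.annotations.{Benchmark, BenchmarkMode, Mode, OutputTimeUnit, Scope, Setup, State}
import java.util.concurrent.TimeUnit

// Hypothetical variant of the benchmark class: the input is created once
// in a setup method instead of inside every measured call.
@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@BenchmarkMode(Array(Mode.Throughput))
class TestHexStringState {

  // Replaced with random data by setup() before measurement starts.
  var input: Array[Byte] = Array.empty[Byte]

  @Setup
  def setup(): Unit = {
    input = new Array[Byte](20)
    scala.util.Random.nextBytes(input)
  }

  // Returning the String hands the result to JMH instead of discarding it.
  @Benchmark
  def interpolation: String = input.map(b => f"$b%02x").mkString

  @Benchmark
  def format: String = input.map(b => "%02x".format(b)).mkString
}
```

Numbers from a class like this are not directly comparable with numbers from the original class, because the array allocation and random fill are no longer part of each operation.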

Running JMH

We can run our benchmark using jmh:run -i 20 -wi 10 -f1 -t1 in sbt. In this command, -i 20 says that we want to run each benchmark with 20 measurement iterations, -wi 10 says to run 10 warmup iterations, -f1 says to fork once for each benchmark, and -t1 says to run on one thread. Increasing the number of threads would show whether the throughput of our benchmark method scales up. Increasing the number of forks lets us verify performance across multiple JVM instances. If no values are provided, the JMH version used here defaults to 20 warmup iterations, 20 measurement iterations, 1 thread, and 10 forks; newer JMH releases use different defaults. All of these values could be set via annotations in the test code as well. I'm using one fork here to minimize execution time, but more than one fork should usually be used for accurate results.
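As a rough illustration of the annotation route, the sketch below expresses the same four settings with JMH's Warmup, Measurement, Fork and Threads annotations. The class name AnnotatedHexString and its single benchmark method are made up for this example.

```scala
package com.chariotsolutions.jmh.sample

import org.openjdk.jmh.annotations.{Benchmark, Fork, Measurement, Threads, Warmup}

// Hypothetical class showing annotation equivalents of
// jmh:run -i 20 -wi 10 -f1 -t1
@Warmup(iterations = 10)      // like -wi 10
@Measurement(iterations = 20) // like -i 20
@Fork(1)                      // like -f1
@Threads(1)                   // like -t1
class AnnotatedHexString {

  @Benchmark
  def format: String =
    "Scala".getBytes.map(b => "%02x".format(b)).mkString
}
```

With the settings in the source, plain jmh:run is enough, and options given on the command line take precedence over the annotations.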

Running this should produce logging output as the test executes. Once it's completed we should see a summary of the results like this:

[info] # Run complete. Total time: 00:01:31
[info]
[info] Benchmark                     Mode  Cnt     Score    Error   Units
[info] TestHexString.format         thrpt   20    63.825 ±  0.863  ops/ms
[info] TestHexString.interpolation  thrpt   20    62.952 ±  1.090  ops/ms
[info] TestHexString.stringManip    thrpt   20  1355.426 ± 14.119  ops/ms
[success] Total time: 92 s, completed Sep 30, 2015 6:06:51 AM

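We can see that there is no meaningful difference between string interpolation and using format() directly: their scores are within a couple of percent of each other and their error ranges overlap. The more complicated string manipulation approach, however, delivers roughly 21 times the throughput (about 1355 ops/ms versus about 64 ops/ms). If speed is a high concern when creating the hex string, we clearly should be using that approach instead of the other two.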
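The arithmetic behind that reading is simple enough to check by hand, but here it is as a small sketch. The scores and errors are copied from the summary table above; the names ReadResults, Result, overlaps and speedup are made up, and treating each score plus or minus its error as a plain interval is a simplification.

```scala
object ReadResults {
  // One row of the JMH summary table (units: ops/ms).
  final case class Result(name: String, score: Double, error: Double) {
    def low: Double = score - error
    def high: Double = score + error
  }

  val format        = Result("format", 63.825, 0.863)
  val interpolation = Result("interpolation", 62.952, 1.090)
  val stringManip   = Result("stringManip", 1355.426, 14.119)

  // True when the two score-plus-or-minus-error ranges share any value.
  def overlaps(a: Result, b: Result): Boolean =
    a.low <= b.high && b.low <= a.high

  // Throughput ratio of a over b.
  def speedup(a: Result, b: Result): Double = a.score / b.score

  def main(args: Array[String]): Unit = {
    println(s"format vs interpolation overlap: ${overlaps(format, interpolation)}")
    println(f"stringManip vs format speedup: ${speedup(stringManip, format)}%.1fx")
  }
}
```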

We've seen an example of using JMH to quickly create and execute microbenchmarks that compare the performance characteristics of different implementations. For more in-depth information, check out the sbt-jmh samples. There is also a Jenkins plugin that would allow you to run JMH benchmarks as part of your CI workflow.