MUnit is a new Scala testing library

February 1, 2020

Hello world! I'm excited to announce the first release of MUnit, a new Scala testing library with a focus on actionable errors and extensible APIs. You may be thinking, "Why create yet another Scala testing library?" It's a good question, and this post is my attempt to explain the motivations for creating MUnit.

Like many other existing testing libraries, MUnit has no external Scala dependencies and is published for a wide range of compiler versions and platforms.

Scala Version	JVM	Scala.js (0.6.x, 1.x)	Native (0.4.x)
2.11.x	✅	✅	✅
2.12.x	✅	✅	n/a
2.13.x	✅	✅	n/a
0.21.x (Dotty)	✅	n/a	n/a

MUnit tries to distinguish itself by focusing on the following features:

Tests as values: test cases are represented as normal data structures that you can manipulate and abstract over.
Rich filtering capabilities: MUnit provides fine-grained control over what tests are enabled for which environments.
Actionable errors: the formatting of failed test cases is optimized for giving you as much information as possible to understand how to fix the test case.
Tooling integrations: MUnit is implemented as a JUnit runner and tries to build on top of existing JUnit functionality where possible.
Insightful test reports: the MUnit sbt plugin allows you to analyze historical data about your tests to answer questions like "is this test suite flaky?" and "which tests are slowing down my CI?".
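
To make the "tests as values" idea concrete, here is a minimal model in plain Scala. It is only a sketch: SketchTest, prefixNames, runnable and parserTests are hypothetical names invented for illustration, not MUnit's actual classes or methods. The point is that once a test is an ordinary value, you can generate, rename and filter tests with regular collection operations.

```scala
object TestsAsValuesSketch {
  // Hypothetical stand-in for a test case: a name, a set of tags and a body.
  // This is NOT MUnit's Test class, only a model of the idea.
  final case class SketchTest(name: String, tags: Set[String], body: () => Unit) {
    def tag(t: String): SketchTest = copy(tags = tags + t)
  }

  // Generate one test per input instead of writing each test by hand.
  def parserTests(inputs: List[String]): List[SketchTest] =
    inputs.map(in => SketchTest(s"parse-$in", Set.empty, () => assert(in.toInt >= 0)))

  // Rename a whole batch of tests, for example to mark the platform.
  def prefixNames(prefix: String, tests: List[SketchTest]): List[SketchTest] =
    tests.map(t => t.copy(name = s"$prefix.${t.name}"))

  // Drop every test that carries the (hypothetical) "ignore" tag.
  def runnable(tests: List[SketchTest]): List[SketchTest] =
    tests.filterNot(_.tags.contains("ignore"))
}
```

For example, TestsAsValuesSketch.parserTests(List("1", "2")) builds two tests from data, and tagging one of them with "ignore" removes it from the list returned by runnable.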

TL;DR

To use MUnit, first add a dependency in your build.

// build.sbt
libraryDependencies += "org.scalameta" %% "munit" % "0.4.3"
testFrameworks += new TestFramework("munit.Framework")

Next, write a test case:

// src/test/scala/com/MySuite.scala
class MySuite extends munit.FunSuite {
  test("hello") {
    assert(41 == 42)
  }
}

Check out the getting started guide.

Rich filtering capabilities

Using tags, MUnit provides an extensible way to enable or disable tests based on static and dynamic conditions.

For example, the MUnit codebase itself is cross-built against 11 different combinations of Scala compiler versions (2.11, 2.12, 2.13, Dotty) and platforms (JVM, JS, Native). Our CI also runs tests on JDK 8/11 and Linux/Windows. Inevitably, some test cases end up getting disabled in certain environments.

Imagine that we have a test case that for some reason should only run on Windows in Scala 2.13. We can implement a custom Windows213 tag with the following code:

import scala.util.Properties
import munit._
object Windows213 extends Tag("Windows213")
class MySuite extends FunSuite {
  override def munitTestTransforms = super.munitTestTransforms ++ List(
    new TestTransform("Windows213", { test =>
      val isIgnored =
        test.tags(Windows213) && !(
          Properties.isWin &&
            Properties.versionNumberString.startsWith("2.13")
        )
      if (isIgnored) test.tag(Ignore)
      else test
    })
  )

  test("windows-213".tag(Windows213)) {
    // Only runs when operating system is Windows and Scala version is 2.13
  }
  test("normal test") {
    // Always runs like a normal test.
  }
}

By encoding the environment requirements in the test implementation, we prevent the situation where users run sbt test commands that are invalid for their active operating system or Scala version.

Check out the filtering tests guide to learn more about how to enable and disable tests with MUnit.

Actionable errors

The design goal for MUnit error messages is to give you as much context as possible to address the test failure. Let's consider a few concrete examples.

Demo showing source location for failed assertion

In the image above, you can cmd+click on the .../test/scala/munit/DemoSuite.scala:7 path to open the failing line of code in your editor. Because the failing line of code is highlighted, you also immediately gain some understanding of why the test might be failing.

Demo showing diff between values of a case class

In the image above, the failing assertEquals() displays a diff comparing two values of a User case class. The "Obtained" section includes copy-paste friendly syntax of the obtained value, which can be helpful in the common situation when a failing test case should have passed because the expected behavior of your program has changed.

Demo showing diff between multiline strings

In the image above, the failing assertNoDiff() prints the obtained string as a stripMargin-formatted multiline string. The assertNoDiff() assertion is helpful for comparing multiline strings while ignoring non-visible differences such as Windows/Unix newlines, ANSI color codes and leading/trailing whitespace.
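
To illustrate what "ignoring non-visible differences" can mean in practice, here is a minimal sketch in plain Scala. It is not MUnit's implementation of assertNoDiff(): normalize and sameIgnoringInvisible are hypothetical helpers, and the exact normalization rules (strip ANSI color codes, convert Windows newlines, trim the ends of the whole string) are assumptions made for the example.

```scala
object NoDiffSketch {
  // Matches ANSI color sequences such as ESC[31m and ESC[0m.
  private val AnsiColor = "\u001b\\[[0-9;]*m"

  // Hypothetical normalization, NOT MUnit's actual rules:
  // remove color codes, use Unix newlines, trim leading/trailing whitespace.
  def normalize(text: String): String =
    text
      .replaceAll(AnsiColor, "")
      .replace("\r\n", "\n")
      .trim

  // Two strings count as equal when their normalized forms are identical.
  def sameIgnoringInvisible(obtained: String, expected: String): Boolean =
    normalize(obtained) == normalize(expected)
}
```

With this sketch, a colored string that uses Windows newlines compares equal to its plain Unix counterpart, while strings that differ in visible characters still compare as different.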

Demo showing how to include clues in error messages

In the image above, the clue(a) helpers are used to enrich the error message with additional information that is displayed when the assertion fails.

Demo showing highlighted stack traces

In the image above, stack trace elements that come from library dependencies such as the standard library are grayed out, making it easier to find the stack trace elements that are relevant to your code. This can be helpful when debugging large exception stack traces. This feature is inspired by the pretty-printing of stack traces in utest.
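
The idea behind graying out library frames can be sketched in a few lines of plain Scala. This is an illustration only, not how MUnit or utest implement it: the libraryPrefixes list, isLibraryFrame and render are hypothetical, and the choice of package prefixes is an assumption.

```scala
object StackTraceSketch {
  // Hypothetical list of package prefixes treated as "library" frames.
  val libraryPrefixes: List[String] = List("scala.", "java.", "sun.", "jdk.")

  def isLibraryFrame(element: StackTraceElement): Boolean =
    libraryPrefixes.exists(prefix => element.getClassName.startsWith(prefix))

  // ANSI escape codes: 90 selects "bright black" (gray), 0 resets the style.
  private val Gray = "\u001b[90m"
  private val Reset = "\u001b[0m"

  // Library frames are wrapped in gray, user frames are printed unchanged.
  def render(element: StackTraceElement): String =
    if (isLibraryFrame(element)) s"$Gray    at $element$Reset"
    else s"    at $element"
}
```

Mapping render over the result of getStackTrace on an exception would then print user code at normal brightness and dim everything from the listed packages.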

Check out the writing assertions guide to learn more about how to write assertions with helpful error messages.

Tooling integrations

The tooling side of a testing library is as important as the library APIs. MUnit is implemented as a JUnit runner, which means that any existing tool that knows how to run a JUnit test suite knows how to run MUnit test suites.

For example, IntelliJ already detects MUnit test suites even though it has no custom logic to support MUnit.

Demo showing IntelliJ running MUnit tests

Likewise, build tools such as Gradle and Pants can integrate with MUnit using their existing JUnit integrations.

Insightful test reports

MUnit has an sbt plugin to store structured JSON data about test results in Google Cloud. The data can then be used to generate HTML reports based on historical test data.

The image below shows test cases in the Metals codebase sorted by how frequently they fail on the master branch.

Click on image to open full report

The Metals test suite ignores failures in tests that are tagged as flaky. However, it's clear that DefinitionLspSuite.missing-compiler-plugin is not flaky: it consistently fails on every run. On the other hand, PantsLspSuite.basic has only failed once out of eleven test runs, so it seems to be genuinely flaky.
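
The distinction between a test that always fails and one that is flaky can be computed directly from recorded results. Here is a minimal sketch in plain Scala, assuming each record is just a test name plus a pass/fail flag. TestRun, Verdict and classify are hypothetical names and are not part of the MUnit sbt plugin or its report format.

```scala
object FlakinessSketch {
  // Hypothetical record of one historical test run.
  final case class TestRun(testName: String, passed: Boolean)

  sealed trait Verdict
  case object Stable extends Verdict // never failed
  case object Flaky extends Verdict  // failed in some runs but not all
  case object Broken extends Verdict // failed in every run

  // Group the history by test name and classify each test by its failures.
  def classify(runs: Seq[TestRun]): Map[String, Verdict] =
    runs.groupBy(_.testName).map { case (name, history) =>
      val failures = history.count(run => !run.passed)
      val verdict: Verdict =
        if (failures == 0) Stable
        else if (failures == history.size) Broken
        else Flaky
      name -> verdict
    }
}
```

Under this model, a test with eleven failures out of eleven runs is classified as Broken, while a test with one failure out of eleven runs is classified as Flaky. A real report would likely also want a threshold and a time window, which this sketch leaves out.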

The Metals codebase has ~1.5k test cases, some of which run against up to seven different Scala compiler versions. It's not ideal that some test cases fail non-deterministically, but this is normal as the project grows and we support more build tools, Scala versions and features. While there is no silver bullet for avoiding flaky test failures, having data about how frequently a test fails is at least a starting point for addressing the problem.

Check out the generating test reports guide to learn how to configure your build to upload test reports to Google Cloud using the MUnit sbt plugin. The plugin is implemented as an sbt TestsListener, so it should work with any testing library (including ScalaTest, utest, ...), although it has so far only been tested against MUnit.

Credits

I want to thank @gabro for implementing Dotty support, porting the Metals codebase to MUnit and sharing tons of valuable feedback. Without your initial interest in MUnit I probably would not have polished the project for a proper release.

Conclusion

MUnit is a new Scala testing library with a focus on actionable errors and extensible APIs. MUnit is already used in several Scalameta projects including scalameta/scalameta, scalameta/metals and scalameta/mdoc.

Most of the ideas in this post are not new. The features in MUnit are heavily inspired by existing testing libraries including ScalaTest, utest, JUnit and ava (a JavaScript testing library). However, I'm not aware of a testing library that combines all the features presented in this post in one solution, and I hope that explains why MUnit exists.

Happy testing ✌️