A Whirlwind Tour of Scalameta


For any Scalameta-related questions, don't hesitate to ask on our Gitter channel: https://gitter.im/scalameta/scalameta

Setup


Library

You can use Scalameta as a library. The scalameta umbrella package includes modules for trees, tokens, parsing, pretty printing, the semantic API and more:
libraryDependencies += "org.scalameta" %% "scalameta" % "3.7.3"
Optionally, for extra utilities, you can use Contrib:
libraryDependencies += "org.scalameta" %% "contrib" % "3.7.3"

Tutorial

The examples mentioned in this tutorial are available in a single repository that you can clone and run locally.
  1. Clone the accompanying repo.
  2. Run sbt test to make sure everything works.
  3. Open the file Playground.scala
    package scalaworld
    
    import scala.meta._
    import org.scalameta.logger // useful for debugging
    
    class Playground extends org.scalatest.FunSuite {
      import scala.meta._
    
      test("part 1: tokens") {
        val tokens = "val x = 2".tokenize.get
        logger.elem(tokens.syntax)
        logger.elem(tokens.structure)
      }
    
      test("part 2: trees") {
        val tree = "val x = 2".parse[Stat].get
        logger.elem(tree.syntax)
        logger.elem(tree.structure)
      }
    
    }
  4. To test playground on every edit, run sbt ~library/test.
  5. Set up the project in your favorite IDE, for example IntelliJ, ENSIME or vim.

Ammonite REPL

To experiment with Scalameta interactively, you can run the following in the Ammonite REPL:
import $ivy.`org.scalameta:scalameta_2.12:3.7.3`, scala.meta._

Tokens


Make sure you have Scalameta installed as a library from Setup. You can decide to run these examples from the console or from sbt, for example in the tutorial repo.

This whole workshop will assume you have this import in scope:

scala> import scala.meta._
import scala.meta._

Here's how to tokenize a small expression.

scala> "val x = 2".tokenize.get
res0: scala.meta.tokens.Tokens = Tokens(, val,  , x,  , =,  , 2, )
Let's discuss the most interesting methods on tokens.

Tokens.syntax

The simplest method we can call is Tokens.syntax. The method returns a string representation of the actual code behind the tokens, that is, how the code looks to a developer.

scala> "val x = 2".tokenize.get.syntax
res0: String = val x = 2

Tokens.toString() uses .syntax behind the scenes. However, you should never rely on toString() when manipulating Scalameta structures; prefer to call .syntax explicitly. The reason may not be obvious right now, but it will make more sense soon.

Tokens.structure

Another useful method is Tokens.structure. The method shows details that may be relevant to us as metaprogrammers.

scala> "val x = 2".tokenize.get.structure
res0: String = Tokens(BOF [0..0), val [0..3),   [3..4), x [4..5),   [5..6), = [6..7),   [7..8), 2 [8..9), EOF [9..9))

.structure is often useful for debugging and testing.

Tokens vs. Token

The class Tokens is a wrapper around a sequence of Token objects. There are multiple subtypes of Token, while there is only one type Tokens.

scala> "val x = 2".tokenize.get.head
res0: scala.meta.tokens.Token =

BOF stands for "Beginning of file". Let's see what other kinds of tokens are in the string:
scala> "val x = 2".tokenize.get.
  map(x => f"${x.structure}%10s -> ${x.productPrefix}").
  mkString("\n")
res0: String =
BOF [0..0) -> BOF
val [0..3) -> KwVal
    [3..4) -> Space
  x [4..5) -> Ident
    [5..6) -> Space
  = [6..7) -> Equals
    [7..8) -> Space
  2 [8..9) -> Int
EOF [9..9) -> EOF

Even spaces get their own tokens. The [0..3) part indicates that the val token starts at offset 0 and ends at offset 3.
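Because whitespace and the BOF/EOF markers are ordinary tokens, you often want to skip them when inspecting code. The following is a minimal sketch, assuming the Token.Space, Token.BOF and Token.EOF types reported by productPrefix above; the name significant is only for illustration.

```scala
import scala.meta._

// Tokenize the same snippet used above.
val tokens = "val x = 2".tokenize.get

// Keep only the tokens that carry code: drop spaces and the BOF/EOF markers.
val significant: List[String] = tokens.toList.filter {
  case _: Token.Space | _: Token.BOF | _: Token.EOF => false
  case _ => true
}.map(_.syntax)

println(significant) // List(val, x, =, 2)
```

Other trivia such as tabs, newlines and comments have their own token types and would need their own cases.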

==

What does token equality look like?

scala> "foobar".tokenize.get(1) == "foobar kas".tokenize.get(1)
res0: Boolean = false
Huh, why are they not the same?

Token equality is implemented with reference equality. You need to be explicit if you actually mean syntactic (.syntax), or structural (.structure) equality.

The tokens are syntactically equal.
scala> "foobar".tokenize.get(1).syntax == "foobar kas".tokenize.get(1).syntax
res0: Boolean = true
Even if we move the tokens around
scala> "kas foobar".tokenize.get(3).syntax == "foobar kas".tokenize.get(1).syntax
res0: Boolean = true
The tokens are also structurally equal.
scala> "foobar".tokenize.get(1).structure == "foobar kas".tokenize.get(1).structure
res0: Boolean = true
However, they are not structurally equal if we move them around.
scala> "kas foobar".tokenize.get(3).structure == "foobar kas".tokenize.get(1).structure
res0: Boolean = false

.get

Tokenization can sometimes fail, for example in this case:

scala> """ val str = "unclosed literal """.tokenize
res0: scala.meta.tokenizers.Tokenized =
<input>:1: error: unclosed string literal
 val str = "unclosed literal
           ^

If you prefer, you can safely pattern match on the tokenize result:

scala> """ val str = "closed literal" """.tokenize match {
  case tokenizers.Tokenized.Success(tokenized) => tokenized
  case tokenizers.Tokenized.Error(e, _, _) => ???
}
res0: scala.meta.tokens.Tokens = Tokens(,  , val,  , str,  , =,  , "closed literal",  , )
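The ??? in the error case above is only a placeholder. As a rough sketch of how both outcomes could be handled, the hypothetical helper countTokens below turns the result into an Either, assuming the second field of Tokenized.Error is the error message.

```scala
import scala.meta._

// Hypothetical helper: Right(number of tokens) on success, Left(error message) on failure.
def countTokens(code: String): Either[String, Int] =
  code.tokenize match {
    case tokenizers.Tokenized.Success(tokens) => Right(tokens.length)
    case tokenizers.Tokenized.Error(_, message, _) => Left(message)
  }

val ok = countTokens("val x = 2")                               // BOF and EOF are counted too
val failed = countTokens(""" val str = "unclosed literal """)  // same input as the failing example

println(ok)
println(failed)
```

Returning an Either keeps the failure visible to the caller instead of throwing, which is what .get does.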

Conclusion

Scalameta tokens are the foundation of Scalameta. Sometimes you don't have access to a parsed AST, and then your best shot is to work with tokens.

In the following chapter we will discuss another exciting data structure: the incredible scala.meta.Tree.

Trees


Reminder. We assume you have this import in scope:

scala> import scala.meta._
import scala.meta._

q"Quasiquotes"

The easiest way to get started with Scalameta trees is using quasiquotes.

scala> q"case class User(name: String, age: Int)"
res0: meta.Defn.Class = case class User(name: String, age: Int)

Quasiquotes can be composed

scala> val method = q"def `is a baby` = age < 1"
method: meta.Defn.Def = def `is a baby` = age < 1

scala> q"""
case class User(name: String, age: Int) {
  $method
}
"""
res0: meta.Defn.Class = case class User(name: String, age: Int) { def `is a baby` = age < 1 }

Quasiquotes can also be used to deconstruct trees with pattern matching

scala> q"def `is a baby` = age < 1" match {
  case q"def $name = $body" =>
    s"You ${name.syntax} if your ${body.syntax}"
}
res0: String = You `is a baby` if your age < 1

NOTE. Quasiquotes currently ignore comments:

scala> q"val x = 2 // assignment".syntax
res0: String = val x = 2
If you need comments, you can use .parse[T]
scala> "val x = 2 // assignment".parse[Stat].get.syntax
res0: String = val x = 2 // assignment

.parse[T]

If the contents that you want to parse are only known at runtime, you can't use quasiquotes. For example, this happens when you need to parse file contents.

Here's how to parse a compilation unit.

scala> "object Main extends App { println(1) }".parse[Source].get
res0: scala.meta.Source = object Main extends App { println(1) }

Pro tip. You can also call .parse[T] on a File, just like this

scala> new java.io.File("readme/ParseMe.scala").parse[Source]
res0: scala.meta.parsers.Parsed[scala.meta.Source] =
class ParseMe { println("I'm inside a file") }

If we try to parse a statement as a compilation unit we will fail.

scala> "val x = 2".parse[Source]
res0: scala.meta.parsers.Parsed[scala.meta.Source] =
<input>:1: error: expected class or object definition
val x = 2
^

We need to explicitly parse it as a statement (Stat).

scala> "val x = 2".parse[Stat].get
res0: scala.meta.Stat = val x = 2

We can also parse a case statement:

scala> "case Foo(bar) if bar > 2 => println(bar)".parse[Case].get
res0: scala.meta.Case = case Foo(bar) if bar > 2 => println(bar)
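Calling .get throws when parsing fails, so for input that is only known at runtime you may prefer to inspect the result. Here is a minimal sketch that mirrors the tokenizer pattern match; parseStat is a hypothetical helper, and it assumes Parsed.Error carries the message as its second field.

```scala
import scala.meta._

// Hypothetical helper: Right(tree) on success, Left(error message) on failure.
def parseStat(code: String): Either[String, Stat] =
  code.parse[Stat] match {
    case parsers.Parsed.Success(tree) => Right(tree)
    case parsers.Parsed.Error(_, message, _) => Left(message)
  }

val good = parseStat("val x = 2")
val bad = parseStat("val x = ") // incomplete statement, so parsing fails

println(good.isRight)
println(bad.isLeft)
```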

Scalameta has dozens of parsers. However, .parse[Stat] and .parse[Source] are usually all you need. A comprehensive list of the parsers and their quasiquote syntax is documented in the quasiquote overview: https://github.com/scalameta/scalameta/blob/master/notes/quasiquotes.md.

Dialects

I didn't tell the whole story when I said you need to pass in a type argument to parse statements. You also need to pass in a dialect! However, Scalameta will by default pick the Scala211 dialect for you if you don't provide one explicitly.

With the sbt dialects, we can parse vals as top-level statements.

scala> dialects.Sbt0137(
  "lazy val core = project.settings(commonSettings)"
).parse[Source].get
res0: scala.meta.Source = lazy val core = project.settings(commonSettings)

We can even parse multiple top level statements

scala> dialects.Sbt0137(
  """
  lazy val core = project.settings(commonSettings)

  lazy val extra = project.dependsOn(core)
  """
).parse[Source].get
res0: scala.meta.Source =

  lazy val core = project.settings(commonSettings)

  lazy val extra = project.dependsOn(core)

For the remainder of the workshop, we will only work with the Scala211 dialect.
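To make the effect of the dialect concrete, the sketch below parses the same build snippet twice: once with the default dialect and once with dialects.Sbt0137. The helper parsesAsSource is hypothetical and only reports whether parsing succeeded.

```scala
import scala.meta._

val buildCode = "lazy val core = project.settings(commonSettings)"

// Hypothetical helper: true if the parse result is a success.
def parsesAsSource(parsed: parsers.Parsed[Source]): Boolean = parsed match {
  case parsers.Parsed.Success(_) => true
  case _ => false
}

// The default dialect rejects a top-level val in a compilation unit.
val withDefault = parsesAsSource(buildCode.parse[Source])

// The sbt dialect accepts it.
val withSbt = parsesAsSource(dialects.Sbt0137(buildCode).parse[Source])

println(withDefault -> withSbt)
```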

Tree.syntax

Just like with tokens, we can also run .syntax on trees.

scala> "foo(bar)".parse[Stat].get.syntax
res0: String = foo(bar)
However, Scalameta can also do this even if you manually construct the tree
scala> Term.Apply(
  Term.Name("foo"),
  Term.Name("bar") :: Nil
).syntax
res0: String = foo(bar)

We never gave Scalameta parentheses, but it still figured out that we needed them. Pretty cool, huh?

Tree.structure

Just like with tokens, we can also run .structure on trees.

scala> "foo(bar)".parse[Stat].get.structure
res0: String = Term.Apply(Term.Name("foo"), List(Term.Name("bar")))

.structure ignores any syntactic trivia like whitespace and comments

scala> "foo  ( /* this is a comment */ bar  ) // eol".parse[Stat].get.structure
res0: String = Term.Apply(Term.Name("foo"), List(Term.Name("bar")))

This can be useful for example in debugging, testing or equality checking.

Tree.collect

You can call collect on a scala.meta.Tree just like on regular collections.

scala> source"""sealed trait Op[A]
    object Op extends B {
      case class Foo(i: Int) extends Op[Int]
      case class Bar(s: String) extends Op[String]
    }""".collect { case cls: Defn.Class => cls.name }
res0: List[meta.Type.Name] = List(Foo, Bar)
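The same approach works for other node types and for trees parsed at runtime. As a small sketch, the snippet below collects the names of all methods with a body in a made-up object; the source text and the names mathSource and defNames are assumptions for illustration.

```scala
import scala.meta._

// Parse at runtime instead of using the source"..." quasiquote.
val mathSource = """
object Math {
  def add(a: Int, b: Int) = a + b
  def sub(a: Int, b: Int) = a - b
  val zero = 0
}
""".parse[Source].get

// Defn.Def matches only defs with an implementation; the val is skipped.
val defNames: List[String] = mathSource.collect { case d: Defn.Def => d.name.value }

println(defNames) // List(add, sub)
```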

Tree.transform

You can rewrite a scala.meta.Tree with .transform.

scala> q"myList.filter(_ > 3 + a).headOption // comments are removed :(".transform {
  case q"$lst.filter($cond).headOption" => q"$lst.find($cond)"
}
res0: scala.meta.Tree = myList.find(_ > 3 + a)

.transform does not preserve syntactic details such as comments and formatting. Some work has been done on source-aware transformation (see #457), but it is not finished yet.
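A transformation does not have to use quasiquote patterns; matching on tree nodes directly also works. The following sketch renames every occurrence of the identifier x to y. The expression and the name renamed are chosen only for illustration.

```scala
import scala.meta._

// Replace each Term.Name("x") node with Term.Name("y"); other nodes are left as they are.
val renamed = q"x + x * 2".transform {
  case Term.Name("x") => Term.Name("y")
}

println(renamed.syntax) // y + y * 2
```

Note that this is a purely syntactic rename: it knows nothing about scopes, so a shadowed x would be renamed as well.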

Tree.==

Just like with tokens, tree equality is by default by reference:

scala> q"foo(bar)" == q"foo(bar)"
res0: Boolean = false
This means you need to be explicit if you mean syntactic equality
scala> q"foo(bar)".syntax == q"foo(bar)".syntax
res0: Boolean = true

or structural equality

scala> q"foo(bar)".structure == q"foo(bar)".structure
res0: Boolean = true
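The three notions of equality can be compared side by side on trees that differ only in formatting. This is a minimal sketch; sameStructure and sameSyntax are hypothetical helpers, not part of Scalameta.

```scala
import scala.meta._

// Hypothetical helpers that make the intended kind of equality explicit.
def sameStructure(a: Tree, b: Tree): Boolean = a.structure == b.structure
def sameSyntax(a: Tree, b: Tree): Boolean = a.syntax == b.syntax

val compact = "foo(bar)".parse[Stat].get
val spaced = "foo ( bar )".parse[Stat].get

println(sameStructure(compact, spaced)) // true: whitespace is not part of the structure
println(sameSyntax(compact, spaced))    // false: the source text differs
println(compact == spaced)              // false: different references
```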

Comprehensive trees

A key feature of Scalameta trees is that they comprehensively cover all corners of the Scala syntax. A side effect of this is that the Scalameta tree hierarchy contains a lot of types. For example, there is a different tree node for an abstract def (Decl.Def)

scala> q"def add(a: Int, b: Int)" // Decl.Def
res0: meta.Decl.Def = def add(a: Int, b: Int): Unit
and a def with an implementation (Defn.Def)
scala> q"def add(a: Int, b: Int) = a + b" // Defn.Def
res0: meta.Defn.Def = def add(a: Int, b: Int) = a + b

Fortunately, most of the time you won't need to worry about this. Quasiquotes help you create/match/compose/deconstruct the correct instances. However, occasionally you may need to debug the types of the trees you have.
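One way to check which node type you are holding is to pattern match on it. The sketch below distinguishes the two kinds of def from above; describe is a hypothetical helper, and the fallback relies on productPrefix to report the node name.

```scala
import scala.meta._

// Hypothetical helper: names the node type of a few common definitions.
def describe(tree: Tree): String = tree match {
  case _: Decl.Def => "Decl.Def (abstract def)"
  case _: Defn.Def => "Defn.Def (def with an implementation)"
  case other => other.productPrefix
}

val abstractDef = describe(q"def add(a: Int, b: Int): Int")
val concreteDef = describe(q"def add(a: Int, b: Int) = a + b")

println(abstractDef)
println(concreteDef)
```

For anything more detailed, printing tree.structure shows the full node hierarchy.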

For your convenience, we've compiled together the most common types in this handy diagram:

Semantic API


The semantic API offers operations to query information from the Scala compiler, such as name resolution (println => _root_.scala.Predef.println) and symbol signatures (_root_.com.Example.main([Ljava/lang/String;)V. => def main(args: Array[String]): Unit). These operations can, for example, be used by tools like Scalafix, Metadoc and Metals.

SemanticDB

SemanticDB is a data model for semantic information about programs in Scala and other languages. SemanticDB decouples production and consumption of semantic information, establishing documented means for communication between tools.

The storage format used for the SemanticDB is defined using Protocol Buffers, or "protobuf" for short. The full schema is available here. Files containing SemanticDB binary data use the .semanticdb file extension by convention.

At the time of writing, SemanticDB has an experimental status and the associated APIs are considered internal. If you'd like to learn more, ask on our Gitter channel: https://gitter.im/scalameta/scalameta

Contrib


Scalameta contrib is a module that provides common utilities for handling Scalameta data structures.

To use contrib, import scala.meta.contrib._.

Contrib exposes some collection-like methods on Tree.

scala> source"""
class A
trait B
object C
object D
""".find(_.is[Defn.Object])
res0: Option[scala.meta.Tree] = Some(object C)

scala> source"""
class A
trait B
object C {
  val x = 2
  val y = 3
}
object D
""".collectFirst { case q"val y = $body" =>  body.structure }
res1: Option[String] = Some(Lit.Int(3))

scala> source"""
class A
trait B
object C {
  val x = 2
  val y = 3
}
object D
""".exists(_.is[Defn.Def])
res2: Boolean = false

Contrib has an Equal typeclass for comparing trees by structural or syntactic equality.

scala> q"val x = 2".isEqual(q"val x = 1")
res0: Boolean = false

scala> (q"val x = 2": Stat).isEqual("val x = 2 // comment".parse[Stat].get)
res1: Boolean = true

scala> (q"val x = 2": Stat).isEqual[Syntactically]("val x = 2 // comment".parse[Stat].get)
res2: Boolean = false

scala> q"lazy val x = 2".mods.exists(_.isEqual(mod"lazy"))
res3: Boolean = true

scala> q"lazy val x = 2".contains(q"3")
res4: Boolean = false

scala> q"lazy val x = 2".contains(q"2")
res5: Boolean = true

Contrib has an AssociatedComments helper to extract leading and trailing comments of tree nodes.

scala> val code: Source = """
/** This is a docstring */
trait MyTrait // leading comment
""".parse[Source].get
code: scala.meta.Source =

/** This is a docstring */
trait MyTrait // leading comment

scala> val comments = AssociatedComments(code)
comments: scala.meta.contrib.AssociatedComments =
AssociatedComments(
  Leading =
    trait [28..33) => List(/**∙This∙is∙a∙docstring∙*/)

  Trailing =

)

scala> val myTrait = code.find(_.is[Defn.Trait]).get
myTrait: scala.meta.Tree = trait MyTrait

scala> comments.leading(myTrait) -> comments.trailing(myTrait)
res0: (Set[meta.tokens.Token.Comment], Set[meta.tokens.Token.Comment]) = (Set(/** This is a docstring */),Set())

FAQ


Macros?

Originally, Scalameta was founded to become a better macro system for Scala, but over time we shifted focus to developer tools and spun off the new macro system into a separate project: scalacenter/macros.

What is the quasiquote for X?

Here is an overview of quasiquote syntax: https://github.com/scalameta/scalameta/blob/master/notes/quasiquotes.md.

Can I use Scalameta with Scala.js?

Yes, the main Scalameta modules support Scala.js.

Can I use Scalameta with Scala Native?

No, but we're almost there.

I'd like to use the Semantic API as a replacement for scalac's presentation compiler. Is this doable, and is it an intended usage?

In principle, this should be doable, but it requires custom development and is not supported at the moment.

How do I use the Semantic API?

See scalacenter/scalafix.g8.

Does Scalameta integrate with Zinc in order to achieve the Semantic API?

No, Zinc and Scalameta are unrelated. The Scalameta Semantic API is enabled with a scalac compiler plugin called semanticdb-scalac. semanticdb-scalac is designed to accommodate incremental compilation in order to play nicely with Zinc.

Where can I ask more questions?