For any Scalameta-related questions, don't hesitate to ask on our Gitter channel.
The scalameta umbrella package includes modules for trees, tokens, parsing,
pretty printing, the semantic API and more:
libraryDependencies += "org.scalameta" %% "scalameta" % "3.7.4"
Optionally, for extra utilities, you can use Contrib:
libraryDependencies += "org.scalameta" %% "contrib" % "3.7.4"
Run sbt test to make sure everything works.
Playground.scala
package scalaworld
import scala.meta._
import org.scalameta.logger // useful for debugging
class Playground extends org.scalatest.FunSuite {
  import scala.meta._
  test("part 1: tokens") {
    val tokens = "val x = 2".tokenize.get
    logger.elem(tokens.syntax)
    logger.elem(tokens.structure)
  }
  test("part 2: trees") {
    val tree = "val x = 2".parse[Stat].get
    logger.elem(tree.syntax)
    logger.elem(tree.structure)
  }
}
Run sbt ~library/test to rerun the tests on every file change.
In the Ammonite REPL:
import $ivy.`org.scalameta:scalameta_2.12:3.7.4`, scala.meta._
Make sure you have Scalameta installed as a library from Setup. You can decide to run these examples from the console or from sbt, for example in the tutorial repo.
This whole workshop will assume you have this import in scope:
scala> import scala.meta._
import scala.meta._
Here's how to tokenize a small expression.
scala> "val x = 2".tokenize.get
res0: scala.meta.tokens.Tokens = Tokens(, val, , x, , =, , 2, )
Let's discuss the most interesting methods on tokens.
The simplest method we can call is Tokens.syntax.
The method returns a string representation of the actual code behind
the tokens, that is, how the code should look to a developer.
scala> "val x = 2".tokenize.get.syntax
res0: String = val x = 2
Tokens.toString() uses .syntax behind the scenes.
However, you should never rely on toString() when manipulating
Scalameta structures. Prefer to call .syntax explicitly.
It may not be obvious why right now, but it will make more sense soon.
Another useful method is Tokens.structure.
The method shows details that may be relevant to us as metaprogrammers.
scala> "val x = 2".tokenize.get.structure
res0: String = Tokens(BOF [0..0), val [0..3), [3..4), x [4..5), [5..6), = [6..7), [7..8), 2 [8..9), EOF [9..9))
.structure is often useful for debugging and testing.
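As a small illustration of working with the individual tokens, the sketch below filters a token stream by token type. The object name TokenKinds is hypothetical, and the snippet assumes the Scalameta 3.7.4 token classes (Token.Ident, Token.Space, Token.BOF, Token.EOF) that appear in the output later in this section.

```scala
import scala.meta._

object TokenKinds {
  // Tokenize once; .get throws if tokenization fails.
  val tokens: List[Token] = "val x = 2".tokenize.get.toList

  // Keep only identifier tokens, using a type pattern on Token.Ident.
  val identifiers: List[String] =
    tokens.collect { case t: Token.Ident => t.syntax }

  // Drop whitespace tokens and the BOF/EOF markers, keep everything else.
  val significant: List[String] =
    tokens.filter {
      case _: Token.Space | _: Token.BOF | _: Token.EOF => false
      case _                                            => true
    }.map(_.syntax)
}
```

Converting to a List first keeps the example independent of the exact collection type that Tokens implements.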
The class Tokens is a wrapper around a sequence of Token objects.
There are multiple subtypes of Token, while there is only one type Tokens.
scala> "val x = 2".tokenize.get.head
res0: scala.meta.tokens.Token =
BOF stands for "Beginning of file".
Let's see what other kinds of token types are in the string
scala> "val x = 2".tokenize.get.
map(x => f"${x.structure}%10s -> ${x.productPrefix}").
mkString("\n")
res0: String =
BOF [0..0) -> BOF
val [0..3) -> KwVal
[3..4) -> Space
x [4..5) -> Ident
[5..6) -> Space
= [6..7) -> Equals
[7..8) -> Space
2 [8..9) -> Int
EOF [9..9) -> EOF
Even spaces get their own tokens.
The [0..3) part indicates that the val token starts at
offset 0 (inclusive) and ends at offset 3 (exclusive).
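To make the offsets concrete, here is a minimal sketch that slices the original string with each token's offsets. It assumes that Token exposes start and end as Int offsets, as in Scalameta 3.7.4; the object name TokenOffsets is made up for this example.

```scala
import scala.meta._

object TokenOffsets {
  val code = "val x = 2"
  val tokens: List[Token] = code.tokenize.get.toList

  // start is inclusive and end is exclusive, so slicing the original
  // string with them recovers the text of each token.
  val slices: List[String] =
    tokens.map(t => code.substring(t.start, t.end))

  def main(args: Array[String]): Unit =
    tokens.foreach { t =>
      println(s"${t.productPrefix}: [${t.start}..${t.end})")
    }
}
```

Because BOF and EOF are zero-width, concatenating all slices gives back the original input.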
What does token equality look like?
scala> "foobar".tokenize.get(1) == "foobar kas".tokenize.get(1)
res0: Boolean = false
Huh, why are they not the same?
Token equality is implemented with reference equality.
You need to be explicit if you actually mean syntactic (.syntax),
or structural (.structure) equality.
scala> "foobar".tokenize.get(1).syntax == "foobar kas".tokenize.get(1).syntax
res0: Boolean = true
Even if we move the tokens around
scala> "kas foobar".tokenize.get(3).syntax == "foobar kas".tokenize.get(1).syntax
res0: Boolean = true
The tokens are also structurally equal.
scala> "foobar".tokenize.get(1).structure == "foobar kas".tokenize.get(1).structure
res0: Boolean = true
However, they are not structurally equal if we move them around.
scala> "kas foobar".tokenize.get(3).structure == "foobar kas".tokenize.get(1).structure
res0: Boolean = false
Tokenization can sometimes fail, for example in this case:
scala> """ val str = "unclosed literal """.tokenize
res0: scala.meta.tokenizers.Tokenized =
<input>:1: error: unclosed string literal
val str = "unclosed literal
^
If you prefer, you can safely pattern match on the tokenize result
scala> """ val str = "closed literal" """.tokenize match {
case tokenizers.Tokenized.Success(tokenized) => tokenized
case tokenizers.Tokenized.Error(e, _, _) => ???
}
res0: scala.meta.tokens.Tokens = Tokens(, , val, , str, , =, , "closed literal", , )
Scalameta tokens are the foundation of Scalameta. Sometimes you don't have access to a parsed AST, and then your best shot is to work with tokens.
In the following chapter we will discuss another exciting data structure: the incredible scala.meta.Tree.
Reminder. We assume you have this import in scope:
scala> import scala.meta._
import scala.meta._
The easiest way to get started with Scalameta trees is using quasiquotes.
scala> q"case class User(name: String, age: Int)"
res0: meta.Defn.Class = case class User(name: String, age: Int)
Quasiquotes can be composed
scala> val method = q"def `is a baby` = age < 1"
method: meta.Defn.Def = def `is a baby` = age < 1
scala> q"""
case class User(name: String, age: Int) {
$method
}
"""
res0: meta.Defn.Class = case class User(name: String, age: Int) { def `is a baby` = age < 1 }
Quasiquotes can also be used to deconstruct trees with pattern matching
scala> q"def `is a baby` = age < 1" match {
case q"def $name = $body" =>
s"You ${name.syntax} if your ${body.syntax}"
}
res0: String = You `is a baby` if your age < 1
NOTE. Quasiquotes currently ignore comments:
scala> q"val x = 2 // assignment".syntax
res0: String = val x = 2
If you need comments, you can use .parse[T]
scala> "val x = 2 // assignment".parse[Stat].get.syntax
res0: String = val x = 2 // assignment
If the contents that you want to parse are only known at runtime, you can't use quasiquotes. For example, this happens when you need to parse file contents.
Here's how to parse a compilation unit.
scala> "object Main extends App { println(1) }".parse[Source].get
res0: scala.meta.Source = object Main extends App { println(1) }
Pro tip. You can also call .parse[T] on a File,
just like this
scala> new java.io.File("readme/ParseMe.scala").parse[Source]
res0: scala.meta.parsers.Parsed[scala.meta.Source] =
class ParseMe { println("I'm inside a file") }
If we try to parse a statement as a compilation unit we will fail.
scala> "val x = 2".parse[Source]
res0: scala.meta.parsers.Parsed[scala.meta.Source] =
<input>:1: error: expected class or object definition
val x = 2
^
We need to explicitly parse it as a statement (Stat).
scala> "val x = 2".parse[Stat].get
res0: scala.meta.Stat = val x = 2
We can also parse a case statement:
scala> "case Foo(bar) if bar > 2 => println(bar)".parse[Case].get
res0: scala.meta.Case = case Foo(bar) if bar > 2 => println(bar)
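When the kind of input is not known in advance, the parse result can be pattern matched instead of calling .get. The sketch below tries Source first and falls back to Stat. The helper name parseAny is hypothetical, and the field order of Parsed.Error (position, message, exception) is an assumption based on Scalameta 3.7.4.

```scala
import scala.meta._
import scala.meta.parsers.Parsed

object SafeParse {
  // Try to parse a compilation unit first, then fall back to a single
  // statement. If both attempts fail, return the second error message.
  def parseAny(code: String): Either[String, Tree] =
    code.parse[Source] match {
      case Parsed.Success(tree) => Right(tree)
      case _: Parsed.Error =>
        code.parse[Stat] match {
          case Parsed.Success(tree)        => Right(tree)
          case Parsed.Error(_, message, _) => Left(message)
        }
    }
}
```

For example, SafeParse.parseAny("val x = 2") should end up on the Stat branch, since parsing a val as a Source fails as shown above.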
Scalameta has dozens of parsers.
.parse[Stat] and .parse[Source] are
usually all you need. However, a comprehensive list of the parsers
and their quasiquote syntax is documented here.
I didn't tell the whole story when I said you need to pass in a type
argument to parse statements.
You also need to pass in a dialect!
However, Scalameta will by default pick the Scala211 dialect
for you if you don't provide one explicitly.
With the SBT dialects, we can parse vals as top-level statements.
scala> dialects.Sbt0137(
"lazy val core = project.settings(commonSettings)"
).parse[Source].get
res0: scala.meta.Source = lazy val core = project.settings(commonSettings)
We can even parse multiple top level statements
scala> dialects.Sbt0137(
"""
lazy val core = project.settings(commonSettings)
lazy val extra = project.dependsOn(core)
"""
).parse[Source].get
res0: scala.meta.Source =
lazy val core = project.settings(commonSettings)
lazy val extra = project.dependsOn(core)
For the remainder of the workshop, we will only work with the Scala211
dialect.
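Combining the dialect with the tree API, a rough sketch of listing the vals defined in an sbt build could look as follows. The object name SbtVals is hypothetical; the snippet assumes that Defn.Val exposes its left-hand-side patterns as pats and that .collect is available on trees through import scala.meta._, as in Scalameta 3.7.4.

```scala
import scala.meta._

object SbtVals {
  val build: String =
    """
    lazy val core = project.settings(commonSettings)
    lazy val extra = project.dependsOn(core)
    """

  // Parse with the Sbt0137 dialect, which accepts top-level vals.
  val source: Source = dialects.Sbt0137(build).parse[Source].get

  // Collect the left-hand-side patterns of every val definition.
  val names: List[String] =
    source.collect { case v: Defn.Val => v.pats.map(_.syntax) }.flatten
}
```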
Just like with tokens, we can also run .syntax on trees.
scala> "foo(bar)".parse[Stat].get.syntax
res0: String = foo(bar)
However, Scalameta can also do this even if you manually construct the tree
scala> Term.Apply(
Term.Name("foo"),
Term.Name("bar") :: Nil
).syntax
res0: String = foo(bar)
We never gave Scalameta parentheses, but it still figured out we needed them. Pretty cool, huh?
Just like with tokens, we can also run .structure on trees.
scala> "foo(bar)".parse[Stat].get.structure
res0: String = Term.Apply(Term.Name("foo"), List(Term.Name("bar")))
.structure ignores any syntactic trivia like whitespace and comments
scala> "foo ( /* this is a comment */ bar ) // eol".parse[Stat].get.structure
res0: String = Term.Apply(Term.Name("foo"), List(Term.Name("bar")))
This can be useful for example in debugging, testing or equality checking.
You can collect on scala.meta.Tree just like on regular collections.
scala> source"""sealed trait Op[A]
object Op extends B {
case class Foo(i: Int) extends Op[Int]
case class Bar(s: String) extends Op[String]
}""".collect { case cls: Defn.Class => cls.name }
res0: List[meta.Type.Name] = List(Foo, Bar)
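Since collect visits nested nodes as well, it can also pick up names deep inside an expression. A minimal sketch, assuming the Term.Name(value) extractor from Scalameta 3.7.4; the object name CollectNames is only illustrative.

```scala
import scala.meta._

object CollectNames {
  val tree: Stat = "foo(bar, baz(qux))".parse[Stat].get

  // collect applies the partial function to every node of the tree,
  // so names nested inside the inner call are found as well.
  val names: List[String] =
    tree.collect { case Term.Name(value) => value }
}
```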
Transform scala.meta.Tree with .transform.
scala> q"myList.filter(_ > 3 + a).headOption // comments are removed :(".transform {
case q"$lst.filter($cond).headOption" => q"$lst.find($cond)"
}
res0: scala.meta.Tree = myList.find(_ > 3 + a)
.transform does not preserve syntactic details such as comments
and formatting. Some work has been done on source-aware transformation
(see #457), but it still requires a bit more work.
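Quasiquote patterns are not required for .transform; plain tree constructors work too. The following sketch renames every term named x to y. The helper name rename is hypothetical and the snippet assumes the Term.Name extractor from Scalameta 3.7.4.

```scala
import scala.meta._

object RenameTerm {
  // Replace every term name `x` with `y`; all other nodes are kept.
  def rename(tree: Tree): Tree =
    tree.transform { case Term.Name("x") => Term.Name("y") }

  val renamed: String = rename("x + x * 2".parse[Stat].get).syntax
}
```

The result is pretty-printed from the new tree, which is why the original formatting is not carried over.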
Just like with tokens, tree equality is by default by reference:
scala> q"foo(bar)" == q"foo(bar)"
res0: Boolean = false
This means you need to be explicit if you mean syntactic equality
scala> q"foo(bar)".syntax == q"foo(bar)".syntax
res0: Boolean = true
or structural equality
scala> q"foo(bar)".structure == q"foo(bar)".structure
res0: Boolean = true
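The difference between the two notions shows up once formatting differs. In this sketch (the object and helper names are made up for illustration) two parsed trees differ only in whitespace, so they are expected to be structurally equal but not syntactically equal, because parsed trees keep their original formatting in .syntax.

```scala
import scala.meta._

object TreeEquality {
  val compact: Stat = "foo(bar)".parse[Stat].get
  val spaced: Stat  = "foo ( bar )".parse[Stat].get

  // Compare the source text of two trees.
  def sameSyntax(a: Tree, b: Tree): Boolean = a.syntax == b.syntax

  // Compare the shape of two trees, ignoring whitespace and comments.
  def sameStructure(a: Tree, b: Tree): Boolean = a.structure == b.structure
}
```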
A key feature of Scalameta trees is that they comprehensively cover
all corners of the Scala syntax.
A side effect of this is that the Scalameta tree hierarchy contains a
lot of types.
For example, there is a different tree node for an abstract def (Decl.Def)
scala> q"def add(a: Int, b: Int)" // Decl.Def
res0: meta.Decl.Def = def add(a: Int, b: Int): Unit
and a def with an implementation (Defn.Def)
scala> q"def add(a: Int, b: Int) = a + b" // Defn.Def
res0: meta.Defn.Def = def add(a: Int, b: Int) = a + b
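When you do need to tell these node types apart, a type pattern is usually enough. A minimal sketch, assuming both Decl.Def and Defn.Def expose a name field of type Term.Name as in Scalameta 3.7.4; the helper name describe is hypothetical.

```scala
import scala.meta._

object DefKinds {
  // Distinguish abstract declarations from definitions with a body.
  def describe(stat: Stat): String = stat match {
    case d: Decl.Def => s"abstract def ${d.name.value}"
    case d: Defn.Def => s"concrete def ${d.name.value}"
    case other       => s"other: ${other.productPrefix}"
  }

  val abstractDef: String = describe(q"def add(a: Int, b: Int): Int")
  val concreteDef: String = describe(q"def add(a: Int, b: Int) = a + b")
}
```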
Fortunately, most of the time you won't need to worry about this. Quasiquotes help you create/match/compose/deconstruct the correct instances. However, occasionally you may need to debug the types of the trees you have.
For your convenience, we've compiled together the most common types in this handy diagram:

The semantic API offers operations such as name resolution
(println => _root_.scala.Predef.println) and
symbol signatures (_root_.com.Example.main([Ljava/lang/String;)V. =>
def main(args: Array[String]): Unit).
These operations can, for example, be used by tools like
Scalafix, Metadoc and Metals.
SemanticDB is a data model for semantic information about programs in Scala and other languages. SemanticDB decouples production and consumption of semantic information, establishing documented means for communication between tools.
The storage format used for the SemanticDB is defined using
Protocol Buffers,
or "protobuf" for short. The full schema is available
here.
Files containing SemanticDB binary data use the .semanticdb file
extension by convention.
At the time of writing, SemanticDB has an experimental status and
the associated APIs are considered internal. If you'd like to learn more,
ask on our Gitter channel.
To use contrib, import scala.meta.contrib._.
Contrib exposes some collection-like methods on Tree.
scala> source"""
class A
trait B
object C
object D
""".find(_.is[Defn.Object])
res0: Option[scala.meta.Tree] = Some(object C)
scala> source"""
class A
trait B
object C {
val x = 2
val y = 3
}
object D
""".collectFirst { case q"val y = $body" => body.structure }
res1: Option[String] = Some(Lit.Int(3))
scala> source"""
class A
trait B
object C {
val x = 2
val y = 3
}
object D
""".exists(_.is[Defn.Def])
res2: Boolean = false
Contrib has an Equal typeclass for comparing trees by structural or
syntactic equality.
scala> q"val x = 2".isEqual(q"val x = 1")
res0: Boolean = false
scala> (q"val x = 2": Stat).isEqual("val x = 2 // comment".parse[Stat].get)
res1: Boolean = true
scala> (q"val x = 2": Stat).isEqual[Syntactically]("val x = 2 // comment".parse[Stat].get)
res2: Boolean = false
scala> q"lazy val x = 2".mods.exists(_.isEqual(mod"lazy"))
res3: Boolean = true
scala> q"lazy val x = 2".contains(q"3")
res4: Boolean = false
scala> q"lazy val x = 2".contains(q"2")
res5: Boolean = true
Contrib has an AssociatedComments helper to extract leading
and trailing comments of tree nodes.
scala> val code: Source = """
/** This is a docstring */
trait MyTrait // leading comment
""".parse[Source].get
code: scala.meta.Source =
/** This is a docstring */
trait MyTrait // leading comment
scala> val comments = AssociatedComments(code)
comments: scala.meta.contrib.AssociatedComments =
AssociatedComments(
Leading =
trait [28..33) => List(/**∙This∙is∙a∙docstring∙*/)
Trailing =
)
scala> val myTrait = code.find(_.is[Defn.Trait]).get
myTrait: scala.meta.Tree = trait MyTrait
scala> comments.leading(myTrait) -> comments.trailing(myTrait)
res0: (Set[meta.tokens.Token.Comment], Set[meta.tokens.Token.Comment]) = (Set(/** This is a docstring */),Set())