Quasiquotes in Scala

on waitingforcode.com

Apache Spark inspired not only last week's post about closures but also the one you're reading now, about quasiquotes: a mysterious experimental Scala feature whose existence we can hardly suspect during the first months of work with the language.

The first section looks at quasiquotes from a bird's-eye view. The next one presents some API details. The third part covers the string interpolators provided with quasiquotes. Finally, the last section presents some concepts and operations related to the topic of this post.

Generally about quasiquotes

In simple terms, Scala quasiquotes are a way to transform text into executable code. We simply write some code as a quoted string, pass it to one of scala.tools.reflect.ToolBox's methods and immediately get the result. More precisely, quasiquotes build a structure called an Abstract Syntax Tree (AST) that represents the constructs defined in the code, such as methods, variables, statements and so forth. One of the use cases for this structure is macros.

Abstract Syntax Tree

An AST is a data structure used by compilers to represent the structure of the compiled code. It's abstract because it doesn't represent every element of the code. Among the ignored elements we can find grouping parentheses, semicolons, whitespace and comments. And it's a syntax tree because, as already mentioned, it represents the syntactic structure of the source code.

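A simplified AST for the expression a = 5 + 3 has an assignment node at its root, with the identifier a as the left child and an addition node, holding the literals 5 and 3, as the right child.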
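To make that shape concrete, the same expression can be quoted and inspected with Scala's runtime reflection. This is only a minimal sketch, assuming scala-reflect is on the classpath; the object name AstSketch and the description value are illustrative and not part of the original post.

```scala
import scala.reflect.runtime.universe._

object AstSketch {
  // Quote the assignment; nothing is evaluated, we only obtain its tree.
  val assignment: Tree = q"a = 5 + 3"

  // Walk the tree: an Assign node at the root, an identifier on the left,
  // and a method application (5.+(3)) on the right.
  val description: String = assignment match {
    case Assign(Ident(TermName(name)),
                Apply(Select(Literal(Constant(left)), op), List(Literal(Constant(right))))) =>
      // decodedName turns the encoded operator name back into "+"
      s"$name = $left ${op.decodedName} $right"
    case other =>
      s"unexpected shape: ${showRaw(other)}"
  }
}
```

Printing showRaw(AstSketch.assignment) is a quick way to see the full node names when exploring other expressions.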

Macro

Code can be defined inside functions. A macro is a kind of function referenced by name: everywhere the macro's name is used, the compiler replaces it with the code defined inside the macro. Consider a simplified, language-agnostic macro defined like this:

  define macro print_alert():
      print("Warning !")
  

Every place in the code where the macro is used is then substituted with the macro's content (print("Warning !")). That means that the following code:

  print("Hello world")
  use macro print_alert()
  

is compiled to:

  print("Hello world")
  print("Warning !")
  

There are several advantages to using quasiquotes for code generation. They're checked at compile time, which ensures that only appropriate ASTs or literals are substituted into them. Moreover, they're easier to use since they directly return a Scala AST; otherwise we would have to generate it from the code with the Scala parser at runtime. Finally, they benefit from the Scala compiler's optimizations.

Scala quasiquotes API

Technically, we can use quasiquotes with the scala-reflect module. At runtime, the required members are exposed through the scala.reflect.runtime package (typically with import scala.reflect.runtime.universe._). Two objects are the most important. The first is the implicit class scala.reflect.api.Quasiquotes.Quasiquote, which defines all the available interpolators explained in the next section. The second is the already mentioned ToolBox, which exposes a lot of useful methods to execute quasiquotes. The first of them, compile(tree: u.Tree), compiles an AST. It shows pretty well the type safety enforced at compilation time:

it should "ensure type safety at compilation time" in {
  val codeToCompile = q"val nr: String = 1"

  val typeError = intercept[ToolBoxError] {
    MadeToolbox.compile(codeToCompile)
  }

  typeError.toString should include("reflective compilation has failed:")
  typeError.toString should include("type mismatch;")
  typeError.toString should include("found   : Int(1)")
  typeError.toString should include("required: String")
}
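The tests in this post rely on a MadeToolbox value whose definition is not shown. A minimal sketch of how such a toolbox could be created follows, assuming it is a plain runtime toolbox obtained from the current mirror and that the scala-compiler module (which contains scala.tools.reflect.ToolBox) is on the classpath. The enclosing object name is hypothetical.

```scala
import scala.reflect.runtime.currentMirror
import scala.reflect.runtime.universe._
import scala.tools.reflect.ToolBox

object ToolboxSetup {
  // Assumption: the post's MadeToolbox is an ordinary runtime toolbox.
  val MadeToolbox = currentMirror.mkToolBox()

  // compile returns a thunk; the quoted code runs only when the thunk is invoked.
  val compiled: () => Any = MadeToolbox.compile(q"10 + 30 + 40")

  // eval compiles and runs the tree in one step.
  val evaluated: Any = MadeToolbox.eval(q"10 + 30 + 40")
}
```

Creating the toolbox once and reusing it across tests is a reasonable choice, since every compile or eval call starts a reflective compilation.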

Another interesting method is def eval(tree: u.Tree): Any, which not only compiles the code but also executes it. compile, on the other hand, returns a function that must be invoked to get the result. The difference between compile and eval is shown in the following test case:

it should "show the difference between eval and compile" in {
  val codeToCompile = q"10 + 30 + 40"

  val evaluatedResult = MadeToolbox.eval(codeToCompile)
  val compiledResult = MadeToolbox.compile(codeToCompile)

  evaluatedResult shouldEqual 80
  val resultOfCompiledEvaluation = compiledResult.asInstanceOf[() => Int]()
  resultOfCompiledEvaluation shouldEqual 80
}

Internally, the nodes of an AST constructed with quasiquotes are represented by implementations of scala.reflect.api.Trees#Tree. It means that each quasiquote string is represented as one of them. For instance, a simple literal expression is represented as scala.reflect.internal.Trees.Literal, definitions are backed by ValDef (for both val and var) or DefDef, and case clauses by the CaseDef case class. The next test shows an AST representation based on Tree implementations:

it should "show AST representation with leaves" in {
  val codeToCompile = q"""val result = "a" + "b""""

  val leaves = showRaw(codeToCompile)

  leaves shouldEqual "ValDef(Modifiers(), TermName(\"result\"), TypeTree(), Apply(Select(Literal(Constant(\"a\")), " +
    "TermName(\"$plus\")), List(Literal(Constant(\"b\")))))"
}

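The above tree, in a more human-friendly form, looks like this:

  ValDef
    Modifiers()
    TermName("result")
    TypeTree()
    Apply
      Select
        Literal(Constant("a"))
        TermName("$plus")
      List
        Literal(Constant("b"))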

Interpolators

Using quasiquotes consists of defining the expression with the help of one of the 5 available string interpolators:

  • q - it should be used to define expressions and definitions, such as literals, selections, identifiers, exception throwing or loop declarations. Scala's official documentation about quasiquote syntax contains a full list of the supported items. The following example shows the use case for a method:
    behavior of "q"
    it should "quote a method" in {
      val quotedMethod = q"def add(nr: Int) = nr + 30"
    
      val rawRepresentation = showRaw(quotedMethod)
    
      rawRepresentation shouldEqual "DefDef(Modifiers(), TermName(\"add\"), List(), List(List(ValDef(Modifiers(PARAM), TermName(\"nr\"), " +
        "Ident(TypeName(\"Int\")), EmptyTree))), TypeTree(), Apply(Select(Ident(TermName(\"nr\")), TermName(\"$plus\")), " +
        "List(Literal(Constant(30)))))"
    }
    
  • tq - we should use this interpolator when working with types, for instance with type identifiers, type projections, type selections or type declarations:
    behavior of "tq"
    it should "create valid Int type" in {
      val intType = tq"Int"
    
      intType.toString shouldEqual "Int"
    }
    
    it should "use tq to declare a type used in a case class" in {
      val paramType = tq"Int"
      val caseClass = q"case class Number(nr: $paramType)"
    
      showCode(caseClass) shouldEqual "case class Number(nr: Int)"
    }
    
  • cq - this interpolator has only one use case, the case clause:
    behavior of "cq"
    it should "case from pattern matching" in {
      val caseOfInt = cq" _: Int => 1"
    
      showRaw(caseOfInt) shouldEqual "CaseDef(Typed(Ident(termNames.WILDCARD), Ident(TypeName(\"Int\"))), EmptyTree, Literal(Constant(1)))"
    }
    
  • pq - we can use it for all kinds of patterns, such as the extractor pattern, binding pattern, wildcard pattern or literal pattern. The first two are shown below:
    behavior of "pq"
    
    it should "build extractor pattern" in {
      val extractor = pq"Seq(1, 2, 3, 4, 5, 6, 7)"
    
      val pq"Seq($nr1,$nr2,..$remainingNrs)" = extractor
    
      nr1.toString() shouldEqual "1"
      nr2.toString() shouldEqual "2"
      remainingNrs.toString() shouldEqual "List(3, 4, 5, 6, 7)"
    }
    
    it should "build binding pattern" in {
      val bindingPattern = pq"""lettersList @ Seq("a", "b","c")"""
      val pq"$varName @ $varValue" = bindingPattern
    
      varName.toString shouldEqual "lettersList"
      varValue.toString shouldEqual "Seq(\"a\", \"b\", \"c\")"
    }
    
  • fq - this interpolator deals with for-comprehension enumerators, such as generators, value definitions or guards:
    behavior of "fq"
    
    it should "generate for loop enumerator" in {
      val forLoopEnumerator = fq"a <- 1 to 10"
      val container = q"val container = new scala.collection.mutable.ListBuffer[Int]()"
    
      val evaluatedForLoop = MadeToolbox.eval(q"$container; for ($forLoopEnumerator) { container.append(a) }; container")
    
      val evaluationResult = evaluatedForLoop.asInstanceOf[ListBuffer[Int]]
      evaluationResult should have size 10
      evaluationResult should contain allOf (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    }
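The q interpolator also covers literals, mentioned in the first bullet. A minimal sketch of quoting one and looking at its tree follows; the object and value names are illustrative only.

```scala
import scala.reflect.runtime.universe._

object LiteralSketch {
  // A quoted literal becomes a Literal node wrapping a Constant.
  val quotedLiteral: Tree = q"42"

  // Raw textual form of the tree.
  val raw: String = showRaw(quotedLiteral)

  // The wrapped value can be recovered by ordinary pattern matching.
  val value: Option[Any] = quotedLiteral match {
    case Literal(Constant(v)) => Some(v)
    case _                    => None
  }
}
```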
    

Quasiquotes concepts and operations

Quasiquotes bring some important concepts and operations. Among the former, we distinguish hygiene. Generated code is considered hygienic only when it guarantees the absence of name collisions between regular and generated code. The following 2 test cases illustrate referential transparency and hygiene in the narrow sense:

it should "collide with already defined method" in {
  def add(nr: Int) = nr + 1
  val codeToCompile = MadeToolbox.parse(
    """
      | def add(nr: Int): Int = nr + 2
    """.stripMargin
  )
  MadeToolbox.eval(codeToCompile)

  val sum = add(5)

  sum shouldEqual 6
}

it should "collide in narrow sense" in {
  val initialTree = q"val nr = 10; nr"
  val q"$definition; $value" = initialTree
  val collisionCodeToCompile = q"$definition; { val nr = 20; println(nr); $value }"

  val result = MadeToolbox.eval(collisionCodeToCompile)

  result shouldEqual 20
}

Among the operations, the most important one is quoting, which constructs the tree representation of the quoted expression, exactly as here:

it should "quote a method" in {
  val quotedMethod = q"def add(nr: Int) = nr + 30"

  val rawRepresentation = showRaw(quotedMethod)

  rawRepresentation shouldEqual "DefDef(Modifiers(), TermName(\"add\"), List(), List(List(ValDef(Modifiers(PARAM), TermName(\"nr\"), " +
    "Ident(TypeName(\"Int\")), EmptyTree))), TypeTree(), Apply(Select(Ident(TermName(\"nr\")), TermName(\"$plus\")), " +
    "List(Literal(Constant(30)))))"
}

Another interesting operation is unquoting. It splices already existing trees into a quasiquote, or extracts them from one, thanks to the $ (or ${...}) interpolation syntax:

it should "show unquoting" in {
  val q"def add($param) = nr + 30" = q"def add(nr: Int) = nr + 30"

  param.toString shouldEqual "val nr: Int = _"
}
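The test above uses unquoting to take a tree apart. Going in the other direction, unquoting splices an existing tree into a new one, and the ..$ form splices a whole list of trees. The following is a minimal sketch under the assumption that a runtime toolbox is available; the names UnquotingSketch and toolbox are hypothetical.

```scala
import scala.reflect.runtime.currentMirror
import scala.reflect.runtime.universe._
import scala.tools.reflect.ToolBox

object UnquotingSketch {
  private val toolbox = currentMirror.mkToolBox()

  // Splice a previously quoted body into a method definition.
  val body: Tree = q"nr + 30"
  val method: Tree = q"def add(nr: Int) = $body"

  // Splice the definition into a block that also calls it, then run the block.
  val result: Any = toolbox.eval(q"$method; add(12)")

  // ..$ splices every element of a list of trees as a separate argument.
  val elements: List[Tree] = List(q"1", q"2", q"3")
  val listResult: Any = toolbox.eval(q"List(..$elements)")
}
```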

Two other operations form a pair of reversible actions called lifting and unlifting. The former is a way to unquote custom data types in quasiquotes. The latter does the inverse: it translates quasiquotes back into custom data types. Both are shown in the examples below:

it should "show lifting" in {
  val text = "a,b,c"
  val liftedExpression = q"""$text.split(",").mkString("-")"""

  val liftedExpressionResult = MadeToolbox.eval(liftedExpression)

  showRaw(liftedExpression) shouldEqual "Apply(Select(Apply(Select(Literal(Constant(\"a,b,c\")), TermName(\"split\")), " +
    "List(Literal(Constant(\",\")))), TermName(\"mkString\")), List(Literal(Constant(\"-\"))))"
  liftedExpressionResult shouldEqual "a-b-c"
}

it should "show unlifting" in {
  val q"""${text: String}.split(",").mkString("-")""" = q""""a,b,c".split(",").mkString("-")"""
  
  text shouldEqual "a,b,c"
}

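Please note that for this to work, implicit Liftable and Unliftable instances, implementing respectively the apply and unapply methods, must be available for the lifted and unlifted types. Standard types such as String or Int come with built-in instances.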
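For a custom type, both instances have to be written by hand. The sketch below follows the pattern used in the official quasiquotes guide, with a hypothetical Point case class that does not appear in the original post. Because the generated tree refers to Point by its simple name, it is only meant to be inspected and unlifted here; evaluating it with a toolbox would require a fully qualified reference to the class.

```scala
import scala.reflect.runtime.universe._

object LiftingSketch {
  // Hypothetical custom data type used only for illustration.
  final case class Point(x: Int, y: Int)

  // Liftable: how a Point value becomes a tree when it is unquoted.
  implicit val liftPoint: Liftable[Point] = Liftable[Point] { p =>
    q"Point(${p.x}, ${p.y})"
  }

  // Unliftable: how a matching tree is turned back into a Point value.
  implicit val unliftPoint: Unliftable[Point] = Unliftable[Point] {
    case q"Point(${x: Int}, ${y: Int})" => Point(x, y)
  }

  // Lifting: the Point is converted to a tree by liftPoint.
  val lifted: Tree = q"${Point(1, 2)}"

  // Unlifting: the type ascription selects unliftPoint.
  val q"${restored: Point}" = lifted
}
```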

Scala quasiquotes are one of the basic ways to implement macros. As shown throughout the article, they provide different string interpolators (q, fq, ...) to define different elements of syntax trees. Thanks to them and the ToolBox object, we can easily transform text into evaluated bytecode. Please note, though, that quasiquotes are still marked as an experimental feature. But since they're not a Scala-exclusive construct (Haskell and Elixir have them too), it's good to know that "something" like this exists and what purpose it serves.
