How to write advanced SQL queries without escaping characters in Apache Spark SQL?

I appreciate Apache Spark SQL because you can use it either as a data engineer, with some programmatic logic, or as a data analysts only by writing SQL queries. And sometimes writing these queries can be painful - especially when you need to escape special characters for a multi-line query.

Fortunately, Scala provides different String interpolators that you can use to make your queries more readable. One of them are triple quotes """....""" that will ignore the quotes included in the String. It should improve the readability. You can see it by comparing the following 2 queries written without and with that interpolator:

case class VisitBrowser(name: String, version: String, lang: String)

"Firefox query" should "return 2 users" in {
  import sparkSession.implicits._
  val usersVisits = Seq(
    ("u1", VisitBrowser("Firefox", "1.23", "en-EN")),
    ("u2", VisitBrowser("Firefox", "1.23", "fr-FR")),
    ("u3", VisitBrowser("Firefox", "1.23", "en-EN")),
    ("u4", VisitBrowser("Chrome", "16.23", "en-EN"))
  ).toDF("user_id", "browser")
  val userClicks = Seq(
    ("u1"), ("u2")

  val firefoxVisitsWithClicks = sparkSession.sql(
    """SELECT v.* FROM visits v
      |JOIN clicks c ON c.user_click_id = v.user_id
      |WHERE = "Firefox"""".stripMargin
  val firefoxVisitsWithClicksNoTripleQuotes = sparkSession.sql(
    "SELECT v.* FROM visits v JOIN clicks c ON " +
    "c.user_click_id = v.user_id WHERE = \"Firefox\""
  ) => row.getAs[String]("user_id")).collect() should contain allOf("u1", "u2") => row.getAs[String]("user_id")).collect() should contain allOf("u1", "u2")

If you would like to include some variables in the triple quotes, you should use a raw interpolator before, like here:

    val variable = "X"
    val textWithVariable = raw"""The variable is ${variable}""".stripMargin
    textWithVariable shouldEqual "The variable is X"
    val textWithoutVariable = """The variable is ${variable}""".stripMargin
    textWithoutVariable shouldEqual "The variable is ${variable}"