Structural types

Versions: Scala 2.12.1

Duck typing and Scala aren't the words going well together. However with a special kind of types we can achieve similar effect that in dynamically typed languages.

Data Engineering Design Patterns

Looking for a book that defines and solves most common data engineering problems? I wrote one on that topic! You can read it online on the O'Reilly platform, or get a print copy on Amazon.

I also help solve your data engineering problems 👉 contact@waitingforcode.com 📩

This post goes through the concept of structural types in Scala. Its first section shows the most basic points about it. The second one gives a little bit more examples of what we can do with these types. The last section has a micro-benchmark showing whether using structural type is penalizing or not.

Structural types defined

Since structural types are strongly related to duck typing, let's explain first this kind of "typing". Duck typing comes from dynamically typed languages and it's based on behavior rather than on static types:

function print_string(printable_instance):
  printable_instance.print()

As you can see, there is no expected type declaration at compile time. Thanks to that we can pass as parameter any object, as long as it implements print() method. In Scala stuctural type is a kind of specification that given object must met in order to be accepted as, for instance, input parameter. This specification has the form of { ...specification... } and it may look like:

describe("structural type") {
  it("should be used as method parameter") {
    object DotConcatenator {
      def concatenate(l1: String, l2: String): String = s"${l1}.${l2}"
    }
    object DashConcatenator{
      def concatenate(letter1: String, letter2: String): String = s"${letter1}-${letter2}"
    }

    def concatenateLetters(concatenateFunction: {def concatenate(letter1: String, letter2: String): String},
                            letters: Seq[String]): String = {
      letters.reduce((letter1, letter2) => concatenateFunction.concatenate(letter1, letter2))
    }

    val concatenatedWithDot = concatenateLetters(DotConcatenator, Seq("a", "b", "c"))
    val concatenatedWithDash = concatenateLetters(DashConcatenator, Seq("a", "b", "c"))

    concatenatedWithDot shouldEqual "a.b.c"
    concatenatedWithDash shouldEqual "a-b-c"
  }
}

As you can see, here we expect that the object provides a concatenate method taking 2 strings as parameters, regardless their names. We don't care about any other methods because they're not used in the implementation. Aside from methods, the structural type can also define fields:

it("should define values") {
  object SmallMultiplier {
    val MultiplyValue = 2
  }
  object BigMultiplier {
    val MultiplyValue = 4
  }

  def multiply(multiplier: {val MultiplyValue: Int}, nrToMultiply: Int): Int = {
    nrToMultiply * multiplier.MultiplyValue
  }

  val smallMultiplierResult = multiply(SmallMultiplier, 5)
  smallMultiplierResult shouldEqual 10
  val bigMultiplierResult = multiply(BigMultiplier, 5)
  bigMultiplierResult shouldEqual 20
}

You may wonder why should we need such construct in strongly typed languages as Scala ? Well, it brings some advantages. First, we don't need to create an enormous class hierarchy if we want to share some code among object behaving similarly. Thus, it brings flexibility by reducing codebase complexity. Also, despite lack of typing in the name, it's type-safe - Scala will check if passed type respects defined the specification. In the other side, it's much slower than regular methods because it uses reflection. The difference is especially visible for the case of loops and is shown in the last section.

Going deeper

The possibility to define method and fields in structural types isn't a single feature. We can also define more than one method or field:

it("should be used with 2 methods") {
  type MathOperator = {
    def add(nr1: Int, nr2: Int): Int
    def multiply(nr1: Int, nr2: Int): Int
  }
  def addAndMultiplyByTwo(mathOperator: MathOperator, nr1: Int, nr2: Int): Int = {
    mathOperator.multiply(mathOperator.add(nr1, nr2), 2)
  }

  val result = addAndMultiplyByTwo(new {
    def add(nr1: Int, nr2: Int) = nr1 + nr2
    def multiply(nr1: Int, nr2: Int) = nr1 * nr2
  }, 3, 5)

  result shouldEqual 16
}

In addition, structural type can be a part of first-order type:

it("should be used in first-order type") {
  object DotConcatenator {
    def concatenate(l1: String, l2: String): String = s"${l1}.${l2}"
  }
  object DashConcatenator{
    def concatenate(letter1: String, letter2: String): String = s"${letter1}-${letter2}"
  }

  class Concatenator[T <: {def concatenate(letter1: String, letter2: String): String}](concatenator: T) {
    def concatenateLetters(letter1: String, letter2: String): String = concatenator.concatenate(letter1, letter2)
  }

  val concatenatorsWithDots = new Concatenator(DotConcatenator)
  concatenatorsWithDots.concatenateLetters("a", "b") shouldEqual "a.b"

  val concatenatorWithDashes = new Concatenator(DashConcatenator)
  concatenatorWithDashes.concatenateLetters("a", "b") shouldEqual "a-b"
}

And if you bother about the readability problem by declaring structural types just with {}, you should know that they can be aliased:

it("should be created with new keyword from typed alias") {
  type Sum = {
    def add(nr1: Int, nr2: Int): Int
  }
  val sumImplementation = new {
    def add(nr1: Int, nr2: Int) = nr1 + nr2
  }
  def addNumbers(addFunc: Sum, nr1: Int, nr2: Int): Int = {
    addFunc.add(nr1, nr2)
  }

  val sumResult = addNumbers(sumImplementation, 1, 2)

  sumResult shouldEqual 3
}

Impact on loops

In order to estimate the impact of structural types on loops, I created a small micro-benchmark with JMH for the following code:

package com.waitingforcode.duck_typing

import java.util.concurrent.TimeUnit

import org.openjdk.jmh.annotations._

@OutputTimeUnit(TimeUnit.MILLISECONDS)
@BenchmarkMode(Array(Mode.All))
class DuckTypingMicroBenchmark {

  @Benchmark
  def verify_duck_typing: Unit = {
    type MathOperator = {
      def add(nr1: Int, nr2: Int): Int
      def multiply(nr1: Int, nr2: Int): Int
    }
    def addAndMultiplyBy2(mathOperator: MathOperator, nr1: Int, nr2: Int): Int = {
      mathOperator.multiply(mathOperator.add(nr1, nr2), 2)
    }
    for (nr <- 0 to 10000000) {
      addAndMultiplyBy2(MathOperatorAdding1ToEachValue, nr, nr+2)
    }
  }

  @Benchmark
  def verify_trait(): Unit = {
    def addAndMultiplyBy2(mathOperator: MathOperatorTrait, nr1: Int, nr2: Int): Int = {
      mathOperator.multiply(mathOperator.add(nr1, nr2), 2)
    }
    for (nr <- 0 to 10000000) {
      addAndMultiplyBy2(MathOperatorAdding1ToEachValueFromTrait, nr, nr+2)
    }
  }

}

trait MathOperatorTrait {
  def add(nr1: Int, nr2: Int): Int
  def multiply(nr1: Int, nr2: Int): Int
}

object MathOperatorAdding1ToEachValueFromTrait extends MathOperatorTrait {
  override def add(nr1: Int, nr2: Int): Int = nr1 + nr2 + 1 + 1

  override def multiply(nr1: Int, nr2: Int): Int = (nr1 + 1) * (nr2 + 1)
}

object MathOperatorAdding1ToEachValue {
  def add(nr1: Int, nr2: Int): Int = nr1 + nr2 + 1 + 1
  def multiply(nr1: Int, nr2: Int): Int = (nr1 + 1) * (nr2 + 1)
}

SBT files look like:

// build.sbt
import sbt.Keys._

lazy val root = (project in file(".")).
  enablePlugins(JmhPlugin).
  settings(
    name := "micro_benchmark",
    version := "1.0",
    scalaVersion := "2.12.6"
  )

// build.properties
sbt.version = 1.0.2

// plugins.sbt
logLevel := Level.Warn
addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.3.2")

The microbenchmark ran with sbt jmh:run -i 20 -wi 10 -f1 -t1 -rf json command where: 20 is the number of measurement iterations, 10 is the number of warm-up iterations, f1 is the number of forks (benchmark ran in a subprocess) and t1 is the number of threads used in the benchmark. The last 2 options are related to the output format. After benchmarking the methods we can clearly see that structural types perform worse for all measurements:

throughput - trait-based code is able to perform 5 times more operations per milliseconds than the one based on "duck-typed" version.

Benchmark                                                                 Mode  Cnt    Score    Error   Units
DuckTypingMicroBenchmark.verify_duck_typing                              thrpt   20    0.005 +/-  0.001  ops/ms
DuckTypingMicroBenchmark.verify_trait                                    thrpt   20    0.027 +/-  0.003  ops/ms

average time - also regarding to the average execution time, the version using traits outperforms structural types:

Benchmark                                                                 Mode  Cnt    Score    Error   Units

DuckTypingMicroBenchmark.verify_duck_typing                               avgt   20  214.825 +/- 19.933   ms/op
DuckTypingMicroBenchmark.verify_trait                                     avgt   20   52.278 +/- 12.948   ms/op

The difference is almost of the same order of magnitude.

sampling time - this metric continuously calls benchmarked methods and randomly samples the time needed for the call. Here the differences are also important and the longest execution time of verify_trait method doesn't reach the time of structural types test:

Benchmark                                                                 Mode  Cnt    Score    Error   Units
DuckTypingMicroBenchmark.verify_duck_typing                             sample   68  334.654 +/- 51.313   ms/op
DuckTypingMicroBenchmark.verify_duck_typing:verify_duck_typing·p0.00    sample       225.444            ms/op
DuckTypingMicroBenchmark.verify_duck_typing:verify_duck_typing·p0.50    sample       292.553            ms/op
DuckTypingMicroBenchmark.verify_duck_typing:verify_duck_typing·p0.90    sample       515.008            ms/op
DuckTypingMicroBenchmark.verify_duck_typing:verify_duck_typing·p0.95    sample       625.423            ms/op
DuckTypingMicroBenchmark.verify_duck_typing:verify_duck_typing·p0.99    sample       821.035            ms/op
DuckTypingMicroBenchmark.verify_duck_typing:verify_duck_typing·p0.999   sample       821.035            ms/op
DuckTypingMicroBenchmark.verify_duck_typing:verify_duck_typing·p0.9999  sample       821.035            ms/op
DuckTypingMicroBenchmark.verify_duck_typing:verify_duck_typing·p1.00    sample       821.035            ms/op
DuckTypingMicroBenchmark.verify_trait                                   sample  467   43.999 +/-  3.304   ms/op
DuckTypingMicroBenchmark.verify_trait:verify_trait·p0.00                sample        26.149            ms/op
DuckTypingMicroBenchmark.verify_trait:verify_trait·p0.50                sample        34.341            ms/op
DuckTypingMicroBenchmark.verify_trait:verify_trait·p0.90                sample        76.546            ms/op
DuckTypingMicroBenchmark.verify_trait:verify_trait·p0.95                sample        91.200            ms/op
DuckTypingMicroBenchmark.verify_trait:verify_trait·p0.99                sample       118.269            ms/op
DuckTypingMicroBenchmark.verify_trait:verify_trait·p0.999               sample       180.617            ms/op
DuckTypingMicroBenchmark.verify_trait:verify_trait·p0.9999              sample       180.617            ms/op
DuckTypingMicroBenchmark.verify_trait:verify_trait·p1.00                sample       180.617            ms/op

single shot invocation time - it calls benchmarked method once and retrieves its execution time. Since we configured our micro-benchmark to run in 20 iterations, this metric was taken 20 times. And unsurprisingly, structural type performs much worse than its trait-based adversary:

Benchmark                                                                 Mode  Cnt    Score    Error   Units
DuckTypingMicroBenchmark.verify_duck_typing                                 ss   20  247.599 +/- 79.390   ms/op
DuckTypingMicroBenchmark.verify_trait                                       ss   20   63.493 +/- 21.201   ms/op

Structural types are a convenient way to imitate duck-typing in strongly typed language as Scala. Unlike classical class hierarchy inheritance, structural types prefer to define a part of behavior that given object must provide. It reduces the complexity of hierarchies but in other side may lead to performance issues because of reflection used at resolution time. As proven in the 3rd part, when used with loops, structural types perform much worse than regular type-based constraints.

Consulting

With nearly 16 years of experience, including 8 as data engineer, I offer expert consulting to design and optimize scalable data solutions. As an O’Reilly author, Data+AI Summit speaker, and blogger, I bring cutting-edge insights to modernize infrastructure, build robust pipelines, and drive data-driven decision-making. Let's transform your data challenges into opportunities—reach out to elevate your data engineering game today!

👉 contact@waitingforcode.com
🔗 past projects

TAGS: #One Scala feature per week