Checked exceptions don't exist in Scala. However, thanks to functional data structures, we can handle expected errors differently.
This post focuses on 2 functional methods for dealing with errors in Scala. The first one uses the Option type and is described in the first section. The second method is even more idiomatic since it uses a more expressive type called Either.
But before diving into these 2 methods, let's clarify the difference between an exception, covered in the >>TODO: post <<, and an error. An exception is something exceptional that we don't expect to happen, for instance a service returning a "temporarily unavailable" HTTP code. An error, on the other hand, is something that we expect to happen and that we know how to deal with. A missing entry in a map is one example. Another substantial difference is functional purity, which exceptions break and errors respect: a function that throws does not simply return a value for its input, while a function that returns the error as a value does. Also, very often the program doesn't know how to recover from an exception, while it does for an error.
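To make the distinction more concrete, here is a minimal sketch built on the missing map entry mentioned above. The `loginsById` map and both function names are hypothetical and exist only for illustration. The first function returns the absence as a value, the second one lets the standard `Map.apply` throw a `NoSuchElementException`:

```scala
import scala.util.Try

object ExpectedVsExceptional {
  // Hypothetical lookup table, used only for this illustration
  val loginsById: Map[Int, String] = Map(1 -> "user_1", 2 -> "user_2")

  // Expected error: the possible absence is visible in the return type
  // and nothing is thrown for an unknown id
  def findLogin(id: Int): Option[String] = loginsById.get(id)

  // Exception: Map.apply throws NoSuchElementException for an unknown id
  // and the signature says nothing about it
  def findLoginUnsafe(id: Int): String = loginsById(id)

  def main(args: Array[String]): Unit = {
    println(findLogin(3)) // prints None
    println(Try(findLoginUnsafe(3)).isFailure) // prints true
  }
}
```

The caller of `findLogin` cannot ignore the missing case without an explicit decision, while the caller of `findLoginUnsafe` learns about it only at runtime.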
Functional errors with Option
The first method letting us deal with errors is optional values. Scala represents them as instances of the scala.Option implementations: Some and None. The former marks the presence of a value while the latter marks its absence. We use optional values when we only want to know whether an operation produced a value. If it did, the function returns Some. Otherwise, it returns None.
Handling possibly missing values with Option also plays an important role in code documentation. If a method can return Some or None, the return type forces the caller to handle both situations:
describe("option") {
  it("should enforce caller to think about missing values") {
    def loadUser(login: String): Option[String] = {
      if (login == "a") {
        None
      } else {
        Some(login)
      }
    }
    val loadedAUser = loadUser("a")
    // Option in the returned type enforces the clients to think about "what to do if the value is missing"
    // Thanks to that the risk of NPE is reduced. Unfortunately we can't predict in advance
    // all possible places that can return null so the risk will still exist.
    val loadedUserData = loadedAUser.getOrElse("missing")
    loadedUserData shouldEqual "missing"
  }
}
Used in for comprehensions, options can play the role of circuit breakers because they are biased towards Some. It means that as soon as one of the options is None, none of the subsequent functions will be executed:
it("should be used as circuit breaker") {
  var calledUserLogin = false
  var calledUserFavoritePages = false
  var calledUsersPersonalData = false
  def loadUserLogin: Option[String] = {
    calledUserLogin = true
    Some("login")
  }
  def loadUserFavoritePages: Option[Seq[String]] = {
    calledUserFavoritePages = true
    None
  }
  def loadUserPersonalData: Option[String] = {
    calledUsersPersonalData = true
    Some("a,b,c")
  }

  val userInformation = for {
    userLogin <- loadUserLogin
    userFavoritePages <- loadUserFavoritePages
    userPersonalData <- loadUserPersonalData
  } yield {
    s"User=${userLogin} / ${userFavoritePages.mkString(",")} / ${userPersonalData}"
  }

  userInformation shouldBe None
  calledUserLogin shouldBe true
  calledUserFavoritePages shouldBe true
  // The None returned by loadUserFavoritePages stops the chain here
  calledUsersPersonalData shouldBe false
}
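The circuit breaker behavior is easier to understand once the for comprehension is written as the flatMap and map calls the compiler translates it into. The sketch below is a hand-written equivalent of the test above, with hypothetical names and a list recording which loaders were invoked. Since `flatMap` on `None` never calls the function passed to it, the last loader is not reached:

```scala
import scala.collection.mutable.ListBuffer

object OptionShortCircuit {
  // Returns the final result together with the names of the invoked loaders
  def run(): (Option[String], List[String]) = {
    val calls = ListBuffer.empty[String]
    def loadLogin: Option[String] = { calls += "login"; Some("login") }
    def loadFavoritePages: Option[Seq[String]] = { calls += "pages"; None }
    def loadPersonalData: Option[String] = { calls += "personal"; Some("a,b,c") }

    // Hand-written equivalent of the 3-generator for comprehension
    val result = loadLogin.flatMap { login =>
      loadFavoritePages.flatMap { pages =>
        // Never evaluated here because loadFavoritePages returned None
        loadPersonalData.map { personal =>
          val pageList = pages.mkString(",")
          s"User=$login / $pageList / $personal"
        }
      }
    }
    (result, calls.toList)
  }
}
```

Calling `OptionShortCircuit.run()` returns `None` together with the list of the first 2 loaders only.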
Options have some drawbacks and their use should depend on the context. As mentioned, if we only want to know whether a value exists or not, they're good candidates. On the other hand, if we want to know why a given function didn't retrieve the requested data, we will probably need another structure, for instance Try or Either.
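As a quick illustration of what Option loses, here is a minimal sketch using scala.util.Try. The `parseAge` and `describeAge` functions are hypothetical. Try keeps the thrown exception as a value, so the caller can still inspect the reason for the failure, and it can be converted to an Option or an Either when needed:

```scala
import scala.util.{Failure, Success, Try}

object TryExample {
  // Hypothetical parser: String.toInt throws NumberFormatException
  // for a non-numeric input and Try captures it as a Failure
  def parseAge(raw: String): Try[Int] = Try(raw.trim.toInt)

  def describeAge(raw: String): String = parseAge(raw) match {
    case Success(age) => s"age=$age"
    case Failure(error) => s"invalid age: ${error.getClass.getSimpleName}"
  }

  def main(args: Array[String]): Unit = {
    println(describeAge("42"))
    println(describeAge("abc"))
    // toOption drops the failure reason, toEither (Scala 2.12+) keeps it on the left
    println(parseAge("abc").toOption)
    println(parseAge("abc").toEither.isLeft)
  }
}
```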
Functional errors with Either
When we don't want to propagate exceptions but still want to keep the reason for the failure, we can wrap the function call with Try or catch the exception inside the called function and return an Either. Unlike Option, this type has 2 cases, called Left and Right, and each of them carries a value. In error handling, by convention, the left value represents the failure and the right one the success. Using Either is a good choice for recoverable errors, for instance if we want to retry an operation after a failure or if we can simply substitute the error with a temporary value:
describe("either") {
  trait UserRetrievalError {
    def reason: String
  }
  class DatabaseProblemError extends UserRetrievalError {
    override val reason: String = "DB_ERROR"
  }
  class UserNotFoundError extends UserRetrievalError {
    override val reason: String = "USER_NOT_FOUND"
  }
  case class LoadError(errorType: AnyRef)
  case class User(id: Int, login: String)
  // Simulates a user lookup: fails with the given error or returns a fixed user
  def loadUser(errorType: Option[UserRetrievalError]): Either[LoadError, User] = {
    errorType match {
      case Some(error) => Left(LoadError(error))
      case None => Right(User(1, "user_1"))
    }
  }

  it("should be used to handle missing value with an error") {
    // Here we simply map the error to some label
    val loadedUser = loadUser(Some(new UserNotFoundError()))
    val userResponse = loadedUser match {
      case Left(error) => {
        error.errorType match {
          case _: DatabaseProblemError => "database problem error"
          case _: UserNotFoundError => "user not found"
        }
      }
      case Right(user) => s"User found: ${user}"
    }
    userResponse shouldEqual "user not found"
  }

  it("should be used to reexecute a method") {
    // Here we suppose we have a single retry for the DatabaseProblemError code
    def handleLoadedUserResponseWithOneRetry(response: Either[LoadError, User]): Option[User] = {
      // Inspect the response received as parameter instead of loading the user again
      response match {
        case Left(error) => {
          error.errorType match {
            // The retry is simulated with a call that succeeds
            case _: DatabaseProblemError => loadUser(None).toOption
            case _ => None
          }
        }
        case Right(user) => Some(user)
      }
    }
    val loadFirstResult = loadUser(Some(new DatabaseProblemError()))
    val userWithRetry = handleLoadedUserResponseWithOneRetry(loadFirstResult)
    userWithRetry shouldBe defined
    userWithRetry.get shouldEqual User(1, "user_1")
  }
}
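The tests above simulate the failure with a parameter. The other case mentioned earlier, catching the exception at the called function's level, could look like the sketch below. The `ParseError` class and the `parsePort` and `parsePorts` functions are hypothetical names invented for this example. Since Scala 2.12, Either is right-biased, so it can be chained in a for comprehension exactly like Option, and the first Left stops the chain:

```scala
object EitherFromException {
  // Hypothetical error type keeping the reason of the failure
  final case class ParseError(input: String, reason: String)

  // The exception is caught inside the function and never reaches the caller
  def parsePort(raw: String): Either[ParseError, Int] =
    try {
      val port = raw.trim.toInt
      if (port >= 1 && port <= 65535) Right(port)
      else Left(ParseError(raw, "out of range"))
    } catch {
      case _: NumberFormatException => Left(ParseError(raw, "not a number"))
    }

  // Returns the first error encountered or both parsed ports
  def parsePorts(from: String, to: String): Either[ParseError, (Int, Int)] =
    for {
      start <- parsePort(from)
      end <- parsePort(to)
    } yield (start, end)
}
```

With this sketch, `parsePorts("80", "x")` returns a Left describing the second input, whereas with Option we would only get None.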
Please notice, however, that Either is not reserved for error handling. It can be used for other purposes. To get an idea, we can take a look at one of the most popular Scala projects, Apache Spark. After analyzing the places where Either is declared, we can figure out that it's used to manage blocks (it returns different types depending on whether the block was cached or not) or to manage data in memory (it returns either the chunk of non-persisted values or the number representing the total size written to memory).
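The second use case can be sketched as follows. This is a deliberately simplified and hypothetical model, not Spark's actual code or API: the `store` function and its `freeSlots` parameter are assumptions made for illustration. Neither side of the Either is an error here, both are valid outcomes the caller has to handle:

```scala
object EitherAsTwoOutcomes {
  // Hypothetical model, NOT Spark's API:
  // Right = number of values written to memory when everything fits,
  // Left = the values that could not be persisted
  def store(values: Seq[Int], freeSlots: Int): Either[Seq[Int], Long] =
    if (values.size <= freeSlots) Right(values.size.toLong)
    else Left(values.drop(freeSlots))

  def report(values: Seq[Int], freeSlots: Int): String =
    store(values, freeSlots) match {
      case Right(written) => s"stored $written values in memory"
      case Left(remaining) => s"${remaining.size} values left to store elsewhere"
    }
}
```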
Option and Either are, alongside Try, data types helping us to deal with things that can go wrong in the code. The important difference between them is that Option can only tell whether there is a value, while Either can also tell why a given value is missing. Both make the application easier to understand because of their documentation role. If a function returns Option or Either, we immediately know that the client code will need to deal with potential problems. And if the absence of a value is unrecoverable, we can still throw an exception to fail fast, even though it's not the way preferred by functional programming purists. After all, programming is the art of being pragmatic.