Waiting for code

on waitingforcode.com

Stream safely in Scala

Scala Stream offers something we have not a habit to see in other languages - lazy computation of the values alongside the memoization. However it's sometimes misleading and some people think about Streams as about iterators, i.e. a data structure computing and forgetting about the results. Such thinking can often lead to memory problems, especially with infinite streams. Continue Reading →

Java, SAM and Scala

During the years Java was much more verbose than its JVM-based colleagues (Clojure, Groovy, Scala, ...). But it changed with Java 8 and its Lambda expressions. However the years of the verbosity brought some patterns into Java-based applications. One of them is known as SAM and fortunately it can be easily used with Scala. Continue Reading →

Enums in Scala

Some people coming from Java are often confused about enumerations. In Java, they're often used not only to enumerate things as in dictionaries but also to create singletons. In Scala, the latter use case is much more reserved to objects and apparently, only the former one remains. We'll see through this post whether it's true or not. Continue Reading →

Edge partitioning strategies

Previously we've learned about the vertices and edges representations in Apache Spark GraphX. At this moment to not introduce too many new concepts at once, we deliberately omitted the discovery of edges partitioning. Luckily, a new week comes and it lets us discuss that. Continue Reading →

Regular expressions in Scala

Even though the regular expressions look similarly in a lot of languages, each of them brings some own constructs. Scala is not an exception for this rule and we'll try to see it in this post. Continue Reading →

Defining schemas in Apache Spark SQL with builder design pattern

Schemas are one of the key parts of Apache Spark SQL and its distinction point with old RDD-based API. When we deal with data coming from a structured data source as a relational database or schema-based file formats, we can let the framework to resolve the schema for us. But the things complicate when we're working with semi-structured data as JSON and we must define the schema by hand. Continue Reading →