Peeking Under The Covers in Scala

The Scala compiler often performs some magic for us, and normally we’re happy about that. But there are times when it would be useful to see what it’s doing under the covers. This post explores one technique for peeking under the covers: asking the compiler to show its translation of our code.

Understanding For Expression Translation

What output does the following code produce?

object Main extends App {
  for{
    x <- 1 to 5
    _ = print("hi")
  } print(x)
}

If you said hihihihihi12345, you’re right. But how can we explain that output? It would be very easy to look at this code and expect hi1hi2hi3hi4hi5 instead. We could try to apply the translation rules defined in the Scala Language Specification. Or we could just ask the compiler to show us the translation it is performing. The compiler processes source code through a series of phases (25 in the version used here), which can be listed with scalac -Xshow-phases. The first of these phases is called parser. It performs basic desugaring, which is enough for our needs. We can have the compiler show us the program after the parser phase using the setting -Xprint:parser.

Running the command scalac -Xprint:parser Main.scala will produce the output:

[[syntax trees at end of                    parser]] // Main.scala
package <empty> {
  object Main extends App {
    def <init>() = {
      super.<init>();
      ()
    };
    1.to(5).map(((x) => {
      val x$1 = print("hi");
      scala.Tuple2(x, x$1)
    })).foreach(((x$2) => x$2: @scala.unchecked match {
      case scala.Tuple2((x @ _), _) => print(x)
    }))
  }
}

There’s a lot of extra stuff in there! Stripping away the implementation details, the compiler is turning the for expression into the following expression:

(1 to 5).map(x => {
  print("hi")
  x
}).foreach(x => print(x))

Now it makes perfect sense what happens in the for expression! The map expression is fully evaluated before foreach is applied to the result of map, so every hi is printed before the first number. Having seen this, the rules offered in the Scala Language Specification should be easier to comprehend. We could use this technique to check the translation of any other for expression.
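To check this reasoning without reading compiler output, the translation can be written out by hand and compared with the original for expression. The following is only a sketch: it assumes a Scala 2 compiler (later compiler versions may translate `_ = ...` bindings differently), and the object name, method names, and the log buffer are made up for illustration. Events are recorded in a buffer rather than printed so that the ordering is easy to compare.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical helper object, not part of the original program.
object ForTranslationSketch {

  // The for expression from the post, recording events instead of printing.
  def viaFor(): String = {
    val log = ListBuffer.empty[String]
    for {
      x <- 1 to 5
      _ = log.append("hi")
    } log.append(x.toString)
    log.mkString
  }

  // A hand-written version of the parser-phase translation:
  // map builds a tuple per element, then foreach unpacks it.
  def viaMapForeach(): String = {
    val log = ListBuffer.empty[String]
    (1 to 5)
      .map { x =>
        val ignored = log.append("hi") // plays the role of the `_ = ...` binding
        (x, ignored)
      }
      .foreach { case (x, _) => log.append(x.toString) }
    log.mkString
  }

  // The same for expression over a lazy view: map is no longer evaluated
  // up front, so each "hi" is produced just before its number.
  def viaView(): String = {
    val log = ListBuffer.empty[String]
    for {
      x <- (1 to 5).view
      _ = log.append("hi")
    } log.append(x.toString)
    log.mkString
  }
}
```

Under those assumptions, viaFor() and viaMapForeach() should both return hihihihihi12345, while viaView() should return hi1hi2hi3hi4hi5, which suggests that the ordering comes from the strictness of map on a Range rather than from the for syntax itself.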

Peeking At REPL Magic

We can also use this technique to understand how the REPL works. Students in my classes often find the REPL a bit confusing because it appears to let them violate rules of Scala, like assigning a new value to a val. In fact it doesn’t, but it’s not obvious what’s actually happening. If we run scala -Xprint:parser, we’ll enter the REPL and see a burst of compiler output on the console. Once it stops, type in val x = 0 and press Enter. The next output should be:

scala> val x = 0
[[syntax trees at end of                    parser]] // <console>
package $line3 {
  object $read extends scala.AnyRef {
    def <init>() = {
      super.<init>();
      ()
    };
    object $iw extends scala.AnyRef {
      def <init>() = {
        super.<init>();
        ()
      };
      object $iw extends scala.AnyRef {
        def <init>() = {
          super.<init>();
          ()
        };
        val x = 0
      }
    }
  }
}
[[syntax trees at end of                    parser]] // <console>
package $line3 {
  object $eval extends scala.AnyRef {
    def <init>() = {
      super.<init>();
      ()
    };
    lazy val $result = $line3.$read.$iw.$iw.x;
    lazy val $print: String = {
      $line3.$read.$iw.$iw;
      "".$plus("x: Int = ").$plus(scala.runtime.ScalaRunTime.replStringOf($line3.$read.$iw.$iw.x, 1000))
    }
  }
}
x: Int = 0

The REPL is defining a package called $line3 and an object called $read with nested $iw objects. The innermost $iw contains our definition of x as 0 (line 19 of the output above). The REPL also creates a $line3.$eval object that produces the text printed to the console. Now type in x + 3 and the output should be:

scala> x + 3
[[syntax trees at end of                    parser]] // <console>
package $line4 {
  object $read extends scala.AnyRef {
    def <init>() = {
      super.<init>();
      ()
    };
    object $iw extends scala.AnyRef {
      def <init>() = {
        super.<init>();
        ()
      };
      import $line3.$read.$iw.$iw.x;
      object $iw extends scala.AnyRef {
        def <init>() = {
          super.<init>();
          ()
        };
        val res0 = x.$plus(3)
      }
    }
  }
}
... $eval omitted ...
res0: Int = 3

Again the REPL has created a set of nested objects, this time $line4.$read.$iw.$iw. But notice on line 14 of the output that the outer $iw is importing $line3.$read.$iw.$iw.x, the x that we defined previously. Now type in val x = 100 and the output shows that $line5.$read.$iw.$iw is created with a definition of x as 100. Typing in x + 3 once again shows the following output:

scala> x + 3
[[syntax trees at end of                    parser]] // <console>
package $line6 {
  object $read extends scala.AnyRef {
    def <init>() = {
      super.<init>();
      ()
    };
    object $iw extends scala.AnyRef {
      def <init>() = {
        super.<init>();
        ()
      };
      import $line5.$read.$iw.$iw.x;
      object $iw extends scala.AnyRef {
        def <init>() = {
          super.<init>();
          ()
        };
        val res1 = x.$plus(3)
      }
    }
  }
}
... $eval omitted ...
res1: Int = 103

The REPL has defined $line6.$read.$iw.$iw to hold our expression. But now the outer $iw is importing x from $line5 rather than $line3! So, the value of our first x has not changed – it’s still there in $line3 – but the REPL has changed from using the $line3 version of x to the $line5 version. It’s a neat illusion that gives us some flexibility in the REPL while preserving Scala language rules.
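The same illusion can be imitated in an ordinary source file. The sketch below is a simplification, not the REPL's actual wrapper code: the names line3 through line6 are hypothetical stand-ins for the $lineN packages, and the $read and $iw nesting is collapsed into a single object per line.

```scala
// Hypothetical stand-ins for the REPL's per-line wrappers.
object ReplShadowingSketch {

  // scala> val x = 0
  object line3 { val x = 0 }

  // scala> x + 3   (the most recent x lives in line3)
  object line4 {
    import line3.x
    val res0 = x + 3
  }

  // scala> val x = 100   (a new val; line3.x is untouched)
  object line5 { val x = 100 }

  // scala> x + 3   (now the import points at line5)
  object line6 {
    import line5.x
    val res1 = x + 3
  }
}
```

Here ReplShadowingSketch.line4.res0 is 3 and ReplShadowingSketch.line6.res1 is 103, while ReplShadowingSketch.line3.x is still 0. No val is ever reassigned; only the import changes.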

Conclusion

I have shown two scenarios where -Xprint:parser can be useful, but there are certainly more along the same lines, such as seeing when apply methods are being called (that one needs the output of a later phase, for example -Xprint:typer, because apply calls are inserted during type checking). I wouldn’t expect to use this option frequently, but it’s certainly handy to know about when it’s needed.
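As a small illustration of the apply case, here is a sketch of the sugar involved. The Doubler class and the other names are invented for this example. When a value is used in function-call position, the compiler rewrites the call into an explicit apply; since that rewrite depends on types, it should show up in the printed tree of a phase after type checking rather than in the parser output.

```scala
// Hypothetical example class, used only to show the apply sugar.
object ApplySugarSketch {
  final class Doubler {
    def apply(n: Int): Int = n * 2
  }

  val doubler = new Doubler

  // Sugared form: looks like a method call on a value.
  val sugared: Int = doubler(21)

  // What the compiler turns it into.
  val explicit: Int = doubler.apply(21)
}
```

Both sugared and explicit evaluate to 42. Compiling a file like this with a print setting for a later phase is one way to see the two forms converge.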