We’re using Databinder Dispatch a lot in the Cloud Services Engineering group at VMware, and late last week I was discussing it with one of my colleagues, a very senior (not average!) Java developer. I showed him a snippet of Dispatch code and said I wouldn’t expect anyone to understand it on first reading. He seemed surprised by that, unfortunately because he seemed to believe that team members were expected to understand Dispatch code on first reading. Then Dave Pollak’s excellent Yes, Virginia, Scala is Hard post appeared, calling me out by name. :-) While it’s extremely flattering that Dave thinks I’m a statistical outlier with respect to programming language expertise, his comment, along with my disappointment at finding that a very capable colleague apparently felt pressure to understand something that I expect no one to understand immediately, impels me to try to address the question of Scala’s complexity.
There are at least two obvious forks in the road that I could follow in attempting this. Luckily for me, Michael Fogus followed one fork with a vengeance in his wonderful Scala is for drivers post, which nicely covers a cost:benefit analysis of Scala, acknowledging the initial learning curve but, crucially, pointing out that it abates over time, unlike some of the alternatives. This gives me the opportunity to pursue the other fork: what is likely to be mysterious to the average, or even not-so-average, Java developer, and perhaps to offer some suggestions as to some key facts about Scala that might help shorten the time horizon of the learning curve, or lessen the slope, or if I’m very lucky, both.
Here’s some code I whipped up to demonstrate some points:
First, the purpose of the code: it prints an XML document containing the first 500 artists (people, rather than bands) on MusicBrainz. I had originally intended to go farther, allowing the user to name an artist, and printing a list of that artist’s contemporaries, which is why there’s a dependency on Jorge Ortiz’s scala-time lurking in there, but just getting a list of artists from MusicBrainz was sufficient to demonstrate the points I want to make.
So what hints can I offer even very senior Java developers faced with code like this, and how do the issues they address show up in the code?
1. Scala has no operators.
If I had to pick one thing to get new Scala developers to internalize, this would be it with a bullet. Looking at the Scala code above, we see quite a lot of this: /, <:<, ->, \, <<?, <>, +, and ++ all look like they’re operators in the sense that they’re not Scala syntax such as <-, not things we’ve named ourselves, and they’re surrounded by whitespace. But if they aren’t operators, what are they?
They’re methods. But don’t methods come after .’s? Not necessarily.
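To make that concrete, here is a minimal sketch of a class with a method named /. The Path class is a hypothetical stand-in I made up for illustration; it is not how Dispatch actually implements its request builders. The point is only that the "operator" form and the dotted form are the same method call.

```scala
object OperatorsAreMethods {
  // Hypothetical class (not part of Dispatch): "/" is an ordinary method name.
  final case class Path(segments: List[String]) {
    def /(segment: String): Path = Path(segments :+ segment)
    override def toString: String = segments.mkString("/")
  }

  // Written the way it usually appears in Scala code...
  val infix: Path = Path(List("ws")) / "1" / "artist"

  // ...and the same calls written the way a Java developer expects a method call to look.
  val dotted: Path = Path(List("ws"))./("1")./("artist")

  // Even + on an Int is a method call as far as the language is concerned.
  val sum1: Int = 1 + 2
  val sum2: Int = (1).+(2)
}
```

Both infix and dotted end up as the same Path value, and sum1 and sum2 are the same Int.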
2. Scala has dot-, parenthesis-, and semicolon-inference.
Methods-as-operators only work if you can eliminate . and () pairs with relative ease. Scala’s rules for this are actually pretty simple, but only if you know them, which Java developers obviously don’t by definition. Paul Phillips does a great job explaining in this post.
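As a rough illustration of those inference rules (a sketch on a plain List, with artist names I picked arbitrarily), the following pairs of lines mean the same thing. Only the punctuation differs.

```scala
object InferenceDemo {
  val names = List("Bach", "Byrd", "Cage")

  // Fully punctuated, Java style: dot, parentheses, and a trailing semicolon.
  val explicitContains: Boolean = names.contains("Byrd");

  // The same call with the dot and parentheses left out; the line end closes the statement.
  val inferredContains: Boolean = names contains "Byrd"

  // Here the dot is dropped; the remaining parentheses just group the function literal.
  val explicitLengths: List[Int] = names.map(n => n.length)
  val inferredLengths: List[Int] = names map (n => n.length)
}
```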
Back to Dispatch’s methods. My colleague made a very helpful comment upon seeing << used to pass POST parameters in some Dispatch code: “Scala works with Java, and in Java << is a bit-shift left operator.” The problem is, he’s right. It took me a couple of days to see what the two issues are that make a method like << confusing to a Java developer. The second is related to the “no operators” observation, so I’ll state it as a…
Corollary to 1: in languages with operators, the operator defines what can appear around it; in Scala, what’s around the method defines what the method can be and what it does.
A good example of this is the expression params + ("offset" -> offset.toString). A Java developer probably thinks of + as an operator that works on the obvious numeric types. params, though, is a Map. To make matters even more confusing, it’s an immutable Map, so even if + is a method on it, what could it possibly mean? A look at the ScalaDocs for Map shows that + is indeed a method on Map, which simply takes a new key/value pair (giving us a hint that -> might be used to construct a pair) and returns a new Map with the key/value pair in it. So if params is OK to pass to something expecting a Map[String, String], so is params + ("offset" -> offset.toString).
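Here is that expression in isolation, as a small sketch. The starting contents of params and the value of offset are placeholders I made up; only the "offset" key comes from the expression above.

```scala
object MapPlusDemo {
  // Placeholder starting parameters (assumed for illustration).
  val params: Map[String, String] = Map("type" -> "xml", "limit" -> "100")
  val offset = 100

  // -> builds an ordinary two-element tuple.
  val pair: (String, String) = "offset" -> offset.toString

  // + returns a new Map; params itself is left untouched.
  val withOffset: Map[String, String] = params + ("offset" -> offset.toString)
}
```

After this runs, params still has two entries and withOffset has three, which is all that + on an immutable Map can sensibly mean.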
The first confusing point, though, is the idea that Scala “works with” Java. To narrow in on that a little, Scala “works with” Java in the sense that there’s a high degree of interoperability among Java and Scala classes, interfaces, objects, traits, and methods—things we might think of as “first class” from the point of view of the JVM. But Java’s operators are an artifact of Java syntax, and Scala doesn’t use Java syntax. There’s no reason this should be obvious to a Java developer.
But what is this thing that’s taking params + ("offset" -> offset.toString)? It’s <<?, another method that looks like an operator. Since it’s a method, it must be defined, somehow, on api. api is itself an odd construction, seeming to start with a call to a function :/ on a hostname, followed by / methods with path elements. And that’s exactly what it is.
3. If it looks like a function application, it is one.
I talked about this in Constructive Criticism but it bears repeating, because it’s very important: a token followed by ( followed by some expression followed by ) is a function application, period, the end. That means, yes, in val foo = Map("bread" -> "butter"); foo("bread"), foo("bread") is a function application, i.e. a Map is a function. That seems weird to Java developers, who aren’t used to…
4. All functions are objects; objects with an apply() method are functions.
In particular, an object with an apply() method can maintain state that’s important to the apply() method, as is obviously the case for a Map. This is Scala’s answer to Norman Adams’ old observation from the Scheme community that “Objects are a poor man’s closures.” Norm’s right, but because Scala runs on the JVM and CLR and interoperates with other languages on those platforms, its objects are native, and it’s the apply() method that gives these objects their functional nature. To the extent idiomatic Scala code is functional as much as, or more than, object-oriented, this will also pose a learning curve for Java developers.
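A short sketch of points 3 and 4 together. The Greeter class is a hypothetical example of mine; the Map lines are the foo example from above.

```scala
object ApplyDemo {
  val foo = Map("bread" -> "butter")

  // foo("bread") is shorthand for a call to foo's apply method.
  val sugared: String = foo("bread")
  val desugared: String = foo.apply("bread")

  // Hypothetical class: an object carrying state that its apply method uses.
  class Greeter(greeting: String) {
    def apply(name: String): String = s"$greeting, $name"
  }
  val hello = new Greeter("Hello")
  val greeted: String = hello("Virginia")

  // Because a Map is a function, it can be passed wherever a function is expected.
  val lookedUp: List[String] = List("bread").map(foo)
}
```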
5. Almost anything can be defined almost anywhere.
Java developers might be surprised to find a class definition inside a def, or an import inside an object. This gives the developer maximal expressive power on one hand, but on the other hand it also helps control the scope within which things like methods-as-operators work. Like all modern languages, Scala is lexically scoped: you can read from the “inside out,” as it were, to find where unfamiliar things in the code come from. In the expression http(api <<? params <> identity), you can trust that <<? is defined relative to api, otherwise the code wouldn’t compile. Similarly, the entire expression api <<? params <> identity must match the type that the http object/function (instantiated by new Http and applied with ( and )) expects. Learning to follow code inside-out is an important skill when you’re faced with code using an unfamiliar API, even if you actually know the Scala language, per se, quite well.
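For a feel of what "almost anywhere" means, here is a small sketch with an import, a class, and a helper method all defined inside a def. Every name in it is made up for illustration, and none of it is visible outside the enclosing method.

```scala
object ScopeDemo {
  def decades(years: List[Int]): List[String] = {
    // An import that is only in effect inside this method.
    import scala.collection.mutable.ListBuffer

    // A class that only exists inside this method.
    case class Era(start: Int) { def label: String = s"${start}s" }

    // A helper method nested inside the method.
    def decade(year: Int): Era = Era(year / 10 * 10)

    val out = ListBuffer.empty[String]
    for (y <- years) out += decade(y).label
    out.toList
  }
}
```

Reading from the inside out, everything decades needs is found within its own braces before you ever have to look at the enclosing object.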
Gotchas
Does this mean that all is peaches and cream in Scala, and all Java developers need to do is learn a few simple rules to understand any and all Scala code? Not at all. Here are a few things that come immediately to mind that are, in my opinion, likely to be ongoing stumbling blocks to Java developers learning Scala.
Implicit resolution
Scala supports implicit objects and implicit defs. They lend amazing power to the language, but mastering their use is one of Scala’s bigger challenges. To fully understand what a sophisticated piece of Scala code does, you’ll need to understand the implicit resolution rules. Like .- and parenthesis-inference, it’s not that the rules are especially complex; it’s just that you either know them or you don’t, and Java developers, again by definition, don’t. For what it’s worth, I don’t either.
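To show the two constructs at their simplest, here is a sketch with one implicit def and one implicit object. The Shouter and Separator types are hypothetical, and this deliberately avoids the harder question of where the compiler looks when several candidates are in play.

```scala
import scala.language.implicitConversions

object ImplicitDemo {
  // Hypothetical wrapper that adds a method String does not have.
  class Shouter(s: String) { def shout: String = s.toUpperCase + "!" }

  // implicit def: the compiler inserts this call when a String is asked for .shout.
  implicit def toShouter(s: String): Shouter = new Shouter(s)
  val viaConversion: String = "scala".shout

  // implicit object: a value the compiler can supply for an implicit parameter.
  trait Separator { def value: String }
  implicit object Comma extends Separator { val value = ", " }

  def join(items: List[String])(implicit sep: Separator): String =
    items.mkString(sep.value)

  val implicitly1: String = join(List("a", "b"))        // Comma is filled in by the compiler
  val explicitly1: String = join(List("a", "b"))(Comma) // the same call, spelled out
}
```

Nothing at the call sites "scala".shout or join(List("a", "b")) tells you an implicit is involved, which is exactly why the resolution rules matter when reading someone else's code.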
Methods bind to the left… except when they don’t.
When written without the dot, methods whose names don’t end with : are defined on the object to their left. Methods whose names do end with : are defined on the object to their right. Yeah, I know.
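The standard library’s :: on List is the usual first encounter with this rule. The sketch below shows it next to a hypothetical Stack class of my own with a +: method, to show that the rule applies to any method name ending in a colon.

```scala
object RightAssocDemo {
  val tail = List(2, 3)

  // :: ends with a colon, so it is a method on the List to its right...
  val infix: List[Int] = 1 :: tail
  // ...which is clearer when written with the dot.
  val dotted: List[Int] = tail.::(1)

  // Hypothetical class with one colon-terminated method and one ordinary method.
  final case class Stack(items: List[String]) {
    def +:(item: String): Stack = Stack(item :: items)
    def push(item: String): Stack = Stack(item :: items)
  }

  val viaColon: Stack = "top" +: Stack(List("bottom")) // calls Stack's +: with "top"
  val viaPush: Stack = Stack(List("bottom")) push "top" // ordinary left-bound method
}
```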
Executive Summary
While it’s true you can, for the most part, write Scala as if it were Java, idiomatic Scala tends not to be very much like Java in the end. Worse, some intuitions brought from Java, such as those regarding operators, can actively lead you astray. I’ve given five basic observations about Scala in an effort to help focus Java developers’ intuitions in a constructive direction, and noted in passing a couple of things that, even after over two years of Scala, continue to be thorns in my side.
My one-sentence summary, though, would be: there’s no substitute for actually learning the language, and yes, Virginia, Scala is learnable.