Quick iterations with Scala, JRebel, and Maven

As a developer who’s followed a path from Java through Ruby to Scala, I miss the quick turnaround time an interpreted language like Ruby enables: change your code, hit the up arrow on your command line, and hit enter. Despite all the debugging time and headaches type safety saves, adding a compile step still slows iterations.  (It slows me down even more because I always try to jam some other activity, like writing this blog entry, into my compile times.)

One thing I don’t miss about Ruby is the console, but only because Scala has one. For you Java/PHP developers, a console is nice to have because it lets you quickly test out little snippets of code, both to verify some syntax you forgot and to test your objects.

Using the maven-scala-plugin, the free JRebel plug-in, and the Scala console, I’ve been able to get pretty close to script-like iteration speed.

  1. mvn scala:cc – The Scala continuous compilation command is part of the maven-scala-plugin.  It saves the Maven startup, compiler startup, and human command-line time.  When you save a source file, the process detects the change and starts compiling almost instantly.

    Unfortunately, scala:cc exits when there is a compilation error, which is annoying in practice.  I’d prefer it either waited until a file changes to start compiling again, or at least waited for a keypress.  To work around this, I use the following command:

    • (while true; do mvn scala:cc; sleep 10; done)

    The loop restarts scala:cc ten seconds after it exits.  That keeps your CPU a bit busier than necessary, but it saves you from restarting the compiler by hand.

  2. Use JRebel.  The JRebel plug-in, a costly but probably well-worth-it library for Java developers, is available for free for Scala developers in what I assume is a random act of kindness.  JRebel detects changes to class files and reloads them without restarting the runtime, and it generally just works.
  3. Use the Scala console to run your code. The console is easily launched using maven-scala-plugin with the command: mvn scala:console

Now every time you save a .scala file, it will be compiled.  Wait the few seconds for compilation, then run your test command in the Scala console, for instance a command-line simulation: liivid.MyClass.main(Array("--test")). JRebel will reload any changed .class files. The console has a convenient command-line history (try the up arrow), which it preserves even after restarts.
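The restart loop from step 1 can be written a little more explicitly. This is only a sketch, assuming a POSIX shell: the rerun function, its name, and its arguments are my own and not part of maven-scala-plugin. It reruns a command each time it exits, with a pause in between, and accepts a run limit so you can try it out on something harmless first.

```shell
#!/bin/sh
# Rerun a command every time it exits, pausing between runs.
# Usage: rerun PAUSE_SECONDS MAX_RUNS COMMAND [ARGS...]
# MAX_RUNS=0 means "loop until interrupted with Ctrl-C".
# (rerun is a hypothetical helper, not something Maven provides.)
rerun() {
  rerun_pause=$1
  rerun_max=$2
  shift 2
  rerun_count=0
  while [ "$rerun_max" -eq 0 ] || [ "$rerun_count" -lt "$rerun_max" ]; do
    "$@"                               # run the command; ignore its exit status
    rerun_count=$((rerun_count + 1))
    sleep "$rerun_pause"               # pause before restarting
  done
}

# Terminal 1, from the project directory (restart scala:cc 10s after it exits):
#   rerun 10 0 mvn scala:cc
# Terminal 2:
#   mvn scala:console
```

With a pause of 10 and no run limit this behaves like the one-liner above. The pause is what keeps a persistent compile error from restarting Maven in a tight loop.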

I still can’t iterate as fast as I can with Ruby or PHP, but this is a huge improvement.

As a side note, I am playing around with the Play framework, a Java and soon-to-be Scala framework which uses some clever magic to allow interpreted-language behavior in a Java/Scala web framework. I’ve been playing with it in Java, and really like it. Just change your source file and reload the browser – no need to wait for compilation. Very convenient. The Play framework is also stateless, and contains a fully integrated stack with Hibernate and a built-in, production-ready (supposedly) web server – no piecing together configs for Spring, Maven, and whatever else. Check it out.

Advantages of EC2

I have been using Amazon EC2 for a number of months now to host CribQ.

In a few words, EC2 allows you to spawn virtual servers whenever you need them and pay for them by the hour.

I thought it would be helpful to post a list of pros and cons based on my experience.  This is in response to a question about shared hosting, so the response is somewhat in that context.

Pros:

  1. No long-term commitment.  You’re paying by the hour ($0.10 to $0.40 per hour).  Your balance sheet will thank you.
  2. Internal expertise.  As opposed to using a shared host, you will build the internal expertise to set up your system from scratch, whether you choose to start with a LAMP image or a base Linux install.  There are many free public images that will get you started, and they may require little modification to run your application.
  3. Play space.  You can create additional instances of your application for load testing (client and/or server), testing new architectures, rewrites, versions, etc.  I especially like the ability to create load testing clients, something that is very hard to do cheaply any other way.  How else can you pay $1.20 for 3 hours with 4 CPUs and free bandwidth to load test your application?  (Make sure to use the internal IP address to get the free bandwidth.)
  4. Scaling.  You can easily scale vertically (upgrade to a larger 2 or 4 CPU instance) or horizontally (add instances).  Rather than trying to predict your needs, if you understand how to scale on EC2, you can scale as your demand picks up, and even scale dynamically from hour to hour.
  5. You get tons of RAM.  A small instance has 1.7 GB.  A large instance has 7.5 GB.  Compare that to what you get with other virtual hosts.
  6. You can more easily and cheaply leverage S3 for backup, storage, and serving of large files, and even SimpleDB for persistent storage.
  7. They have excellent bandwidth.
  8. Less worry about hardware failure.  Failures do occur, although they should occur less often than with a dedicated server, and recovery is much easier.
  9. No CPU throttling or other usage limitations.  At a shared host, it is common practice to kill long running scripts that are using significant CPU.
  10. Dedicated IP address.  It’s yours and yours alone, as long as you keep your instance running.

Cons:

  1. No static IP addresses.  You’ll need to look around and decide on how you want to manage this.  Hopefully Amazon will address this problem soon.  Basically if you change to a new instance, you will probably also get a new IP.
  2. No international presence.  S3 storage can be located in Europe, but everything else is in the US (all in Seattle, I think).  If you’re running a site targeted at Hong Kong, EC2 is not your best choice.
  3. Virtualization does have a performance impact.  You are not getting something as fast as the specs would indicate.  The difference depends on what you’re doing, and you’ll have to read around about this.
  4. Lack of “persistent” primary storage.  This is a bit of a red herring, but if you shut your instance down you will lose the main storage where your database most likely resides.  You must explicitly back up your data.  I view this as a positive because it forces you to have a good backup/restore process, and S3 is knocking on your door.  Equate the extremely low chance of instance corruption with primary hard drive failure and you would be in the same situation.
  5. Not the cheapest option.  The price starts at $72/month for a single-CPU instance.  Shared hosting can be as low as $6/month.
  6. Lack of support and management tools.  You’re not buying into a full-service shared hosting solution with frequently updated install scripts and 24-hour support.
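The prices quoted in the two lists fit together with simple arithmetic. Here is a rough sketch, with two assumptions of mine: the $72/month figure is the $0.10 hourly rate over a 30-day month, and the 4-CPU load test instance is billed at the $0.40 end of the range. The helper names cost_cents and dollars are made up for this example. Amounts are kept in cents because shell arithmetic is integer-only.

```shell
#!/bin/sh
# All amounts are in cents so plain integer arithmetic is enough.

# cost_cents RATE_CENTS_PER_HOUR HOURS -> total cost in cents
cost_cents() {
  echo $(( $1 * $2 ))
}

# dollars CENTS -> prints the amount as dollars with two decimals
dollars() {
  printf '$%d.%02d\n' $(( $1 / 100 )) $(( $1 % 100 ))
}

small_rate=10   # $0.10/hour, low end of the quoted range
large_rate=40   # $0.40/hour, high end (assumed to be the 4-CPU instance)

# One small instance running around the clock for a 30-day month
dollars "$(cost_cents "$small_rate" $((24 * 30)))"   # prints $72.00

# A three-hour load test on the larger instance
dollars "$(cost_cents "$large_rate" 3)"              # prints $1.20
```

The same two functions make it easy to price out a scaling experiment before you start it, for example a handful of load-test clients for an afternoon.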